Category: Blog

New! SCP Preparation Courses Offered by Loop1

Loop1 Systems is proud to be one of the select SolarWinds partners chosen to provide training for the SolarWinds Certified Professional (SCP).

The SCP was recently updated and is now product focused (NPM and SAM), as well as being offered as the industry’s first subscription-based certification program. Earning the prestigious SCP certification indicates that a SolarWinds administrator has mastered effective use of SolarWinds networking and systems management products. Read more about the SolarWinds announcement of SCP updates here.

As a premier SolarWinds partner, Loop1 Systems is debuting an SCP Preparation course this year for both NPM and SAM. Loop1 Systems engineers were involved in the beta testing of the updated SCP and have mapped the training to the new exam. Our team boasts the most qualified group of engineers in the market: in 2017 alone, Loop1 team members logged a combined 15.5 years of work on SolarWinds.

Over the course of a two-day workshop, Loop1 engineers will prepare SCP candidates for the exam with an instructor-led curriculum, practice questions, and a voucher to take the test.

Stay tuned for the full schedule of SCP preparation courses offered by Loop1 Systems.

New Owner at Loop1 Systems Prepares for Prosperous Future

Co-Owner William Fitzpatrick Buys Out Partner, Assumes 100% Ownership

AUSTIN, Texas (Dec. 6, 2017) – Loop1 Systems, an IT infrastructure solutions provider and SolarWinds Authorized Partner, announced today that it has new ownership. William Fitzpatrick, who previously owned 42 percent of Loop1 Systems, now has full ownership.

Fitzpatrick, taking the title of President and CEO, became full owner this June after buying out long-time partner and co-founder Don Kinnett. With deep engineering industry knowledge, Fitzpatrick brings a unique mix of technical expertise, application development, and entrepreneurship to take Loop1 Systems to its next level of growth.

Fitzpatrick’s background helped catapult the Austin-based company to success. The company brought in $22 million in business last year and counts more than 200 of the Fortune 500 corporations as clients. Loop1 Systems has been named several times to the coveted Inc. 500|5000 list and has earned several other accolades for its fast growth since its founding in 2009. Fitzpatrick plans to build upon what has made Loop1 Systems successful, while also adding new outbound and culture-building initiatives to its operations.

“We’ve had so many accomplishments, but we’ve only scratched the surface of what we’re capable of,” Fitzpatrick said. “We have built a reputation we’re proud of: Loop1 Systems makes SolarWinds even better. We will continue to hold to that standard of excellence, while better serving our clients’ enterprise needs and growing existing client relationships.”

In order to best prepare for Loop1 Systems’ future growth, Fitzpatrick made several changes to his leadership team, including several promotions of internal talent and the addition of a few key consultants. Fitzpatrick has also launched a Loop1 Cares program focused on helping employees set and reach professional and personal goals, orchestrating team-building activities, and supporting philanthropic initiatives as a company.

ABOUT LOOP1 SYSTEMS

Loop1 Systems, Inc. provides IT infrastructure solutions for clients of all sizes and verticals. Loop1 Systems makes SolarWinds even better, specializing in offering the most comprehensive training and professional services for SolarWinds customers across North America. With corporate headquarters in Austin, Texas, Loop1 Systems is able to deliver technical solutions worldwide, both remotely and onsite. Visit www.loop1.com.

# # #

Technical Jargon Decrypted – Part Two – Networking Basics

In Part One of this series we looked at some of the basic terminology used in the tech world. Going forward in the series we will take a deeper dive by focusing on a specific topic.

Since it seems to be common practice to prematurely blame the network for most issues and unexpected outages, I thought this would be a great place to start.

Yes, I must admit that I am a little biased because of my background as a network engineer.

OSI Model overview:

To quickly isolate and troubleshoot infrastructure issues, a clear understanding of the OSI Model is imperative. There are many mnemonics for memorizing the seven layers of the OSI model, but I want to share the one I used (this one starts at Layer 1 and works up to Layer 7):

Please Do Not Throw Sausage Pizza Away!

Pretty simple, right? We’ll break this thing down backwards, from Layer 7 to Layer 1, which is the approach you would use when troubleshooting with a user.

Layer 7 (Application) – Where users interact with the application and/or operating system; this is where web browsing, file transfers, and email take place; e.g., HTTP, SMTP, FTP.

Layer 6 (Presentation) – Responsible for converting data from the application into a form the underlying layers can understand; e.g., ASCII, GIF, JPEG.

Layer 5 (Session) – Responsible for establishing and maintaining communication sessions with the end device; e.g., NetBIOS, AppleTalk, PPTP.

Layer 4 (Transport) – At this layer, flow control and error checking are established; e.g., TCP and UDP.

Layer 3 (Network) – At this layer, routing and logical addressing are applied to the data; e.g., Internet Protocol (IP), ICMP (ping), IPsec, ARP.

Layer 2 (Data Link) – A physical protocol is defined at this level; e.g., Ethernet, ATM, PPP, MAC addresses.

Layer 1 (Physical) – The hardware level; defines physical connections such as cabling and connectors; e.g., CAT 5, fiber, RJ45, RJ11.
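As a quick reference, the seven layers and example protocols above can be captured in a small lookup table. This is just an illustrative Python sketch (the names `OSI_LAYERS` and `layer_of` are mine, not any standard library):

```python
# Map each OSI layer number to its name and some example protocols/technologies,
# mirroring the list above.
OSI_LAYERS = {
    7: ("Application", ["HTTP", "SMTP", "FTP"]),
    6: ("Presentation", ["ASCII", "GIF", "JPEG"]),
    5: ("Session", ["NetBIOS", "AppleTalk", "PPTP"]),
    4: ("Transport", ["TCP", "UDP"]),
    3: ("Network", ["IP", "ICMP", "IPsec", "ARP"]),
    2: ("Data Link", ["Ethernet", "ATM", "PPP"]),
    1: ("Physical", ["CAT 5", "fiber", "RJ45", "RJ11"]),
}

def layer_of(protocol: str) -> int:
    """Return the OSI layer number where a given example protocol lives."""
    for number, (_, protocols) in OSI_LAYERS.items():
        if protocol.upper() in (p.upper() for p in protocols):
            return number
    raise KeyError(protocol)
```

So `layer_of("TCP")` lands on Layer 4 (Transport), which is exactly the kind of mental lookup you do when working out where a problem sits.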

In part three of this series we are going to take a more in-depth look at layers 7-4, and in part four we will complete our tour of the OSI model by finishing with layers 3-1.

Wes Johns
Sales Engineer

Decrypting Technical Jargon

Technical Jargon Decrypted – Part 1

Technology and its use of jargon can be confusing and frustrating, especially for new users (newbies). Understanding the terminology early on will help lessen these feelings and improve the overall user experience from the start. This series starts at the most basic level to build a foundation, and each post will advance as we progress.

The basics:

Bit – A basic unit of information that can hold only one of two values: off (zero) or on (one).

CPU (Central Processing Unit) – The brains of a computer, responsible for interpreting and executing commands from other hardware and/or software.

Byte – A unit of information containing 8 bits or the equivalent of one character.

WWW (World Wide Web) – Invented in 1989 by Tim Berners-Lee; a global space where documents, pictures, movies, applications, etc., can be accessed via the internet using a URL.

KB (Kilobyte) – One kilobyte is the equivalent of one thousand bytes.

RAM (Random Access Memory) – Stores frequently accessed information for quick retrieval by the CPU. All data stored in RAM is lost if the device loses power.

IP (Internet Protocol) – The protocol that allows network communication; an IP address is the identifier assigned to a device or node.

URL (Uniform Resource Locator) – Identifies the location of a resource (document, picture, movie, etc.) on the web; resources are interlinked by hypertext.

HDD (Hard Disk Drive) – A fixed disk that uses magnetic technology to store and retrieve data.

HTTP (HyperText Transfer Protocol) – The foundation for communication on the World Wide Web. Uses logical links allowing navigation between resources and nodes.

GUI (Graphical User Interface) – A visual interface allowing users to interact with the underlying software and hardware.

HTML (HyperText Markup Language) – Used to create webpages and is the building block of the World Wide Web.

CLI (Command Line Interface) – A text based interface allowing interaction with the underlying hardware by issuing commands.

ISP (Internet Service Provider) – A company that provides a service for accessing the internet; for example; AT&T, Spectrum, Comcast, etc.

ROM (Read-Only Memory) – Like RAM, except that data is not lost when the device loses power; as the name implies, its contents generally cannot be modified.

SSL (Secure Sockets Layer) – A standard used to establish secure links between hosts. For example: a web server and a client web browser.
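As a quick illustration of how the bit, byte, and kilobyte definitions above relate, here is a minimal sketch (the constant and function names are mine, for illustration only):

```python
BITS_PER_BYTE = 8     # one byte stores 8 bits, roughly one character of text
BYTES_PER_KB = 1000   # one kilobyte is one thousand bytes (decimal convention)

def kilobytes_to_bits(kb: float) -> float:
    """Convert kilobytes to bits using the decimal (SI) definition above."""
    return kb * BYTES_PER_KB * BITS_PER_BYTE
```

For example, one kilobyte works out to 8,000 bits, or roughly one thousand characters of plain text.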

We have only scratched the surface by defining some of the commonplace terminology used in tech speak today. There are hundreds of great resources available for free on the web, and best of all, they are only a mouse click or two away.

Wes Johns
Sales Engineer

IT-Project-Help

Basics of Setting Up a Network Monitoring System – Part 2

Part 2 of 2
(Read part 1 here)

I have all this data coming in, now what?

Dashboard building

Once you have all your important systems in the monitoring tool, logical polling intervals set, and everything categorized and labeled, you probably have a default summary page with a ton of red and green Christmas-tree lights and severe-looking red words all over it.

Your boss walks past the screen, sees red dots and error messages, and starts asking why so many things are broken. Now you must explain that this is probably normal and they shouldn’t worry. But what is the point of the monitoring tool if you are supposed to ignore half of what shows up on the screen?

How do you know which half is the important stuff and which you can ignore? This is where you start to tailor the tool to your environment and make it helpful. The tags we set up earlier will be critical in this regard.

Often the initial summary page should be a basic snapshot of the current availability of the key services that impact nearly everyone in the organization. How do the domain controllers look? Can we still send emails? Are the main business offices and datacenters okay? Is the company website up?

Depending on which monitoring tools you have, different methods will be available to validate all these things, but generally you want a simple way to display that a given service is available or unavailable, and perhaps an indicator for when things are degraded in some way but not completely offline.
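The available/unavailable/degraded idea can be sketched generically. This hypothetical rollup (names are illustrative, not tied to any particular monitoring tool) collapses the up/down states of a service's member nodes into one indicator:

```python
def service_status(member_states):
    """Roll member node states ("up"/"down") into one service-level indicator."""
    if not member_states:
        return "unknown"
    if all(state == "up" for state in member_states):
        return "green"   # fully available
    if all(state == "down" for state in member_states):
        return "red"     # completely offline
    return "yellow"      # partially available, i.e. degraded
```

A service with some members down but others still answering shows yellow rather than red, which matches the degraded-but-not-offline case described above.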

Keep things simple and high level; if an issue is directly relevant to someone, they can drill in deeper.

In SolarWinds I will often build out lists of critical services as groups, and then display the statuses of those groups with a simple map made using the Network Atlas tool. You can get really elaborate with customizing icons and such here but big green, yellow, and red indicators do a perfectly fine job.

Going beyond that high-level service indicator, you might also want to include information about upcoming maintenance windows or changes. A simple custom HTML box with the messages you want to get out there would do the job, or a custom table listing every device scheduled to be unmanaged this week.

Are there any significantly congested points on the network that might have wide ranging impacts such as the WAN interfaces? You can add in a filtered resource that just shows the current utilization of these circuits, or if there are many circuits just show any of them that are above their thresholds.

It is probably useful to have a search box to help people jump to the specific device they are interested in if they logged in with a mission in mind. I would shy away from resources that list every event happening in the environment here, as these produce constant streams of noise and are likely too scattered to be helpful without some filtering.

If your environment is larger, you may also want to add tabs to this view, or links to other dashboards, where you split things up based on the support teams or the types of monitored objects involved. You will find that the layout that makes sense to one team is often not particularly relevant to another.

While the network team could want the environment grouped by Site names, the DBA team might not be as concerned with physical locations if their workloads are all run in the central datacenter. Maybe they would benefit more from sorting their objects out by the type of database on the server, Oracle vs MSSQL, or the environment, Prod vs Dev.

On these more detailed pages you will likely also want to get into displaying more charts to show how things change over time in the environment.

A Network team could benefit from a chart indicating average response time for network devices grouped by Site name over the last 24 hours.

An Applications team might want to see things like the average CPU and memory use of their servers, but over a rolling week in order to see day-over-day changes in the trends; alongside that, they may want a chart of the application’s active user sessions or data throughput.

Spending time talking to the consumers of this data can give you a lot of insight into what metrics they care about and what format is most helpful for presenting them. Putting a table where you need a chart makes it hard to spot changes over time, and charts are unnecessary if all you need is a current status.

Thresholds and Responses

A key element in monitoring is setting your thresholds. How much CPU load is enough to get your admin involved, and at what point does slow response to pings warrant investigating?

You will find out-of-the-box thresholds built into whatever tool you are using, but you will need to tweak them to the reality of your environment. I typically find that the most effective way to use thresholds is to set my critical threshold to the value where I would expect someone to try to address the issue immediately.

If you know that a server normally uses a high amount of memory, then leaving it with the default threshold of 90 percent is not efficient, since that metric will always show as critical. This makes the dashboards look like there are more problems than there are and gets users into the habit of ignoring the red signs.

Similarly, if you have monitors set up on something like SQL performance counters and your DBA tells you that they do not generally worry about the number of connected sessions, then don’t set a critical value for this metric. If something is just nice to know, or gives clues about what is going on but isn’t a main indicator, then I don’t want to get messages about it in my inbox.

Alert fatigue is a very common problem, so I set my alerts to notify me via email only when we have crossed the critical threshold, and many metrics must stay above the threshold for a specified amount of time before triggering. This way I know that if something shows up in my inbox, it is probably important, instead of getting so many messages that I route them to a folder I never check.

I won’t need to address a short CPU spike, but a server that has been maxed out for 30 minutes is potentially worth considering.
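The spike-versus-sustained distinction can be sketched as a small state machine. This is a hypothetical illustration of the idea, not any tool's actual alert engine; all names are mine:

```python
from datetime import datetime, timedelta

class SustainedThresholdAlert:
    """Fire only after a metric has stayed above its threshold for a full window."""

    def __init__(self, threshold: float, window: timedelta):
        self.threshold = threshold
        self.window = window
        self.breach_started = None  # when the current breach began, if any

    def observe(self, value: float, now: datetime) -> bool:
        """Record one sample; return True when the alert should fire."""
        if value <= self.threshold:
            self.breach_started = None  # a short spike resets the clock
            return False
        if self.breach_started is None:
            self.breach_started = now   # breach just began
        return now - self.breach_started >= self.window
```

With a 30-minute window, a brief CPU spike never fires, but a server that stays pegged above the threshold for the full half hour does.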

When it comes to warning thresholds, I will reference these in my reports and on the dashboards so I have some opportunity to see how often devices are in that zone without getting numbed by a constant stream of emails.

Going further into the topic of email alerts and thresholds: you will typically start off with simple global thresholds like “Notify me when memory utilization goes above 90%,” but people eventually find that these rules are too generic.

It turns out that their database servers always use a high percentage of memory, or they don’t care when the dev machines max out, or they have a rarely used utility server with only 1 GB of RAM that they don’t use often enough to justify upgrading its resources.

As the use of the monitoring tools matures, people find more and more exceptions to these global rules and one-off edge cases. Instead of carving out all kinds of exclusions from the standard memory alert (“Don’t email me if the server is a database, and not if it is in dev, and not if today is Tuesday”), it is more efficient to get to a place where you have individual thresholds per device.

In SolarWinds you would do this by changing the trigger conditions to a Double Value Comparison: instead of saying Memory Percent is greater than 90, you set it to Memory Percent is greater than Memory Critical Threshold. It will then check each device against the thresholds you have set in that node’s properties.

It would seem like doing it individually would be less scalable, but in practice it is much easier to have granular threshold capabilities than to maintain several variations on the same alert, each tweaked for an individual edge case. The duplicate-alert approach eventually ends up with unintentional gaps or duplicate alerts, because years from now people will have forgotten the edge cases already in the system and won’t want to go back and check them all.

You can set these thresholds in bulk from the Manage Nodes screen and there are methods to automatically set them based on custom properties so you don’t have to manage them actively.
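Outside of any specific tool, the per-device comparison described above boils down to something like this sketch (the field names are hypothetical, chosen to mirror the Memory Percent / Memory Critical Threshold example):

```python
def nodes_over_threshold(nodes, default_threshold=90):
    """Yield names of nodes whose memory use exceeds their own per-device threshold.

    Each node is a dict with 'name', 'memory_percent', and an optional
    'memory_critical_threshold'; nodes without one fall back to the global default.
    """
    for node in nodes:
        threshold = node.get("memory_critical_threshold", default_threshold)
        if node["memory_percent"] > threshold:
            yield node["name"]
```

A database server with its own threshold of 98 stays quiet at 95 percent memory use, while a node still on the global default of 90 would alert at the same reading.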

I also mentioned reports. I try not to send any email alerts based on predictions of usage trends, because ultimately a prediction is always a guess. This kind of information makes more sense in a report you run periodically, gathering all the dangerous-looking trends into one place rather than separately investigating each disk volume that looks like it might fill up in three weeks.

When building reports and email messages, I always try to think about the additional information I have that might be useful to include in the message. If I get an email indicating that CPU load on a server has been high, it can be useful to include information like what OS it runs, how many CPU cores it has, what the threshold on this server is, and what application it is associated with.

As a senior admin, you may already have all this information in your head, but in a big environment there might be so many servers that no single person knows them all. Including as much contextual information as you can in the alert will help the people who end up dealing with it remember how things are connected, and will get problems resolved faster.

If there are common troubleshooting steps or issues associated with a particular server, including a comment about them in the server’s tags helps get your institutional knowledge documented and available at the times when it is most needed.

So, as you can see, there is a lot to keep in mind when setting up a monitoring system that effectively tracks the health of your environment, but a little planning and strategy can dramatically improve the results you get from yours.

Marc Netterfield
Field Systems Engineer

IT-Project-Help