Server/Device Monitoring 101

A tremendous amount of attention over the past few years has gone to the critical importance of IT monitoring and APM. Company after company has sprouted up to offer all sorts of services. At first the focus was heavily on website monitoring, with dozens of companies telling us that if your website went down you would lose huge amounts of revenue, and this is of course true. Then the focus shifted to how quickly your pages load and how slow load times would hurt your brand and your revenues, and once again this is obviously true. From there, monitoring grew to include cloud-based monitoring, monitoring of the cloud itself, overall IT infrastructure monitoring and ultimately APM. As Big Data and IoT push the industry further and the demands on our monitoring systems keep growing in criticality, we need to stay equally focused on the basics, the underlying layer that makes everything run at its best. It is like buying a Jaguar and being so focused on the inner workings of the engine and its speed that you forget the basics.


Server/device monitoring is the foundation that we can never forget about, and we should always be on the lookout for any trouble or deterioration. It is amazing how many times I have seen companies neglect this and end up with egg on their face when things go wrong. So for the moment let’s focus on the basics. As I looked through the industry to see who could do what, taking into consideration the total package beyond just server/device monitoring, I homed in on one company that caught my attention: Monitis. As I reviewed their capabilities, and then trialed the service myself, I was impressed with the totality of their offering. They have been cloud-based since the start, long before it was fashionable, and offer a full suite of monitoring services: website, server, network, cloud, application, RUM and a series of custom monitors that are easily set up via a very easy-to-use API. Plus, what I really liked was their all-inclusive dashboard, which was both informative and intuitive.


But like I said, let’s get back to the underlying basics and stay focused on server/device monitoring. When selecting the monitoring platform you are going to employ, make sure it has all of the capabilities I highlighted above, plus:

  • CPU Monitoring
  • Memory Monitoring
  • Drive Monitoring
  • Linux Load Monitoring
  • Disk I/O Monitoring
  • Bandwidth Monitoring
  • Process Monitoring
  • Windows Service Monitoring
  • System Events Monitoring
  • SNMP Monitoring
  • PING Monitoring
  • HTTP Monitoring
  • HTTPS Monitoring
  • TCP Monitoring
  • EC2 Monitoring

The important thing to keep in mind is that the information these monitors provide is what allows you to deliver the highest level of service to your customers, both external and internal. Now let’s take a quick look at what each of these is and why it matters.

CPU monitoring lets you set CPU thresholds so that you get alerted if your machine’s CPU utilization reaches a critical level that you define. Catching CPU overload early helps you keep processing speed and performance where they should be.
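
To make the threshold idea concrete, here is a minimal Python sketch of how a CPU check could work on a Linux machine. It is not how Monitis implements it: the function names and the example threshold are my own assumptions, and it simply compares two readings of the aggregate "cpu" line in /proc/stat.

```python
import time


def parse_cpu_line(line):
    """Return the jiffy counters from the aggregate 'cpu' line of /proc/stat."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(p) for p in parts[1:]]


def cpu_utilization(prev, curr):
    """Percent of CPU time spent non-idle between two samples."""
    total_delta = sum(curr[:8]) - sum(prev[:8])
    # Counters 3 and 4 are idle and iowait; both count as "not busy" here.
    idle_delta = (curr[3] + curr[4]) - (prev[3] + prev[4])
    if total_delta <= 0:
        return 0.0
    return 100.0 * (total_delta - idle_delta) / total_delta


def read_cpu_sample(path="/proc/stat"):
    with open(path) as f:
        return parse_cpu_line(f.readline())


def cpu_alert(threshold_percent, interval=1.0):
    """Hypothetical check: (alert?, usage) after sampling twice."""
    first = read_cpu_sample()
    time.sleep(interval)
    second = read_cpu_sample()
    usage = cpu_utilization(first, second)
    return usage >= threshold_percent, usage
```

Calling cpu_alert(90.0) would flag the machine once it spends 90% or more of the sampling interval busy.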

Memory monitoring allows you to set RAM thresholds so that you get alerted if your machine’s RAM utilization reaches a critical level. This will again help you prevent any overloads that might slow down your processing performance and is especially important in production servers.
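
A RAM check can be sketched the same way. The snippet below is an illustration only, with hypothetical names. It assumes a Linux kernel recent enough to report MemAvailable in /proc/meminfo, and it treats everything that is not available as "used".

```python
def parse_meminfo(text):
    """Turn the text of /proc/meminfo into a dict of integer values (kB)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields:
            values[key.strip()] = int(fields[0])
    return values


def memory_used_percent(meminfo):
    """Share of RAM in use, based on MemTotal and MemAvailable."""
    total = meminfo["MemTotal"]
    available = meminfo["MemAvailable"]
    return 100.0 * (total - available) / total


def memory_alert(threshold_percent, path="/proc/meminfo"):
    """Hypothetical check: (alert?, used percent) for this machine."""
    with open(path) as f:
        used = memory_used_percent(parse_meminfo(f.read()))
    return used >= threshold_percent, used
```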

Drive monitoring lets you set hard drive space thresholds so that you get alerted if your machine’s hard drive utilization reaches a critical level.
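
Disk space is the easiest of these to check yourself, because Python’s standard library exposes it directly through shutil.disk_usage. A minimal sketch, where the function name and threshold are my own assumptions:

```python
import shutil


def drive_alert(path, threshold_percent):
    """Hypothetical check: (alert?, used percent) for the drive holding `path`."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    if usage.total == 0:
        return False, 0.0
    used_percent = 100.0 * usage.used / usage.total
    return used_percent >= threshold_percent, used_percent
```

For example, drive_alert("/", 85.0) would raise the flag once the root drive is 85% full.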

Linux Load monitoring lets you set load average thresholds so that you get alerted if your Linux machine’s load reaches a critical level that you define.
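
Load averages are easier to judge when you relate them to the number of CPU cores. The sketch below does that with os.getloadavg; dividing by the core count and using 1.0 per core as the limit are rules of thumb I am assuming here, not something the monitor prescribes.

```python
import os

PERIODS = ("1min", "5min", "15min")


def load_breaches(loads, cores, threshold_per_core=1.0):
    """Return the averaging periods whose per-core load exceeds the threshold."""
    return [name for name, load in zip(PERIODS, loads)
            if load / cores > threshold_per_core]


def check_load(threshold_per_core=1.0):
    """Hypothetical check against this machine's current load averages (Unix only)."""
    cores = os.cpu_count() or 1
    return load_breaches(os.getloadavg(), cores, threshold_per_core)
```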

Disk I/O monitoring allows you to monitor read and write operations on the logical disks of your machine and set thresholds so that you get alerted if any of the metrics below reaches a critical level that you define:

  • Reads/sec – the rate of read operations on the disk.
  • Writes/sec – the rate of write operations on the disk.
  • Queue length – the number of requests outstanding on the disk at the time the performance data is collected.
  • Busy time – the percentage of elapsed time that the selected disk drive was busy servicing read or write requests.
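
The four metrics above can all be derived from two snapshots of a disk’s counters. Here is a rough Linux-only sketch that reads them from a /proc/diskstats line; the function names are hypothetical, and the column positions follow the kernel’s documented layout for that file.

```python
def parse_diskstats_line(line):
    """Pick the counters we need from one line of /proc/diskstats."""
    f = line.split()
    return {
        "name": f[2],
        "reads": int(f[3]),       # reads completed
        "writes": int(f[7]),      # writes completed
        "in_flight": int(f[11]),  # I/Os currently in progress
        "busy_ms": int(f[12]),    # milliseconds spent doing I/O
    }


def disk_metrics(prev, curr, interval_s):
    """Rates between two snapshots taken interval_s seconds apart."""
    return {
        "reads_per_sec": (curr["reads"] - prev["reads"]) / interval_s,
        "writes_per_sec": (curr["writes"] - prev["writes"]) / interval_s,
        "queue_length": curr["in_flight"],
        "busy_percent": 100.0 * (curr["busy_ms"] - prev["busy_ms"])
                        / (interval_s * 1000.0),
    }
```

Each value in the returned dict can then be compared against its own threshold.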

Bandwidth monitoring allows you to monitor your network interface and set thresholds so that you get alerted if any of the metrics below reaches a critical level:

  • Input/output traffic speed (B/sec)
  • Number of Sent/Received error packets
  • Number of Sent/Received dropped packets
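
On Linux these interface counters live in /proc/net/dev, so the traffic speed is just the change in the byte counters divided by the time between two readings. A small sketch with hypothetical function names:

```python
def parse_net_dev_line(line):
    """Parse one interface line of /proc/net/dev (e.g. '  eth0: 1000 10 ...')."""
    name, _, rest = line.partition(":")
    f = [int(x) for x in rest.split()]
    return {
        "name": name.strip(),
        "rx_bytes": f[0], "rx_errs": f[2], "rx_drop": f[3],
        "tx_bytes": f[8], "tx_errs": f[10], "tx_drop": f[11],
    }


def bandwidth_metrics(prev, curr, interval_s):
    """Traffic speed (B/sec) and new error/drop counts between two snapshots."""
    return {
        "in_bytes_per_sec": (curr["rx_bytes"] - prev["rx_bytes"]) / interval_s,
        "out_bytes_per_sec": (curr["tx_bytes"] - prev["tx_bytes"]) / interval_s,
        "rx_errors": curr["rx_errs"] - prev["rx_errs"],
        "tx_errors": curr["tx_errs"] - prev["tx_errs"],
        "rx_dropped": curr["rx_drop"] - prev["rx_drop"],
        "tx_dropped": curr["tx_drop"] - prev["tx_drop"],
    }
```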

Process monitoring lets you set CPU, RAM and virtual memory thresholds for individual processes running on your Windows or Linux machine so that you get alerted if a process’s use of any of these resources reaches a critical level.
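
For a single process, a simplified Linux-only sketch could read the resident (RAM) and virtual memory sizes from /proc/<pid>/status and compare each against its own limit. The names and limits below are assumptions for illustration, and per-process CPU is left out to keep it short.

```python
def parse_status(text):
    """Extract VmRSS and VmSize (in kB) from the text of /proc/<pid>/status."""
    sizes = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        if key in ("VmRSS", "VmSize"):
            sizes[key] = int(rest.split()[0])
    return sizes


def process_breaches(sizes, limits_kb):
    """Return the metrics that have reached their limit, in sorted order."""
    return sorted(name for name, limit in limits_kb.items()
                  if sizes.get(name, 0) >= limit)


def check_process(pid, limits_kb):
    """Hypothetical check for one running process."""
    with open(f"/proc/{pid}/status") as f:
        return process_breaches(parse_status(f.read()), limits_kb)
```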

Windows Service Monitor checks the status of Windows services running on your machine.

System Events monitoring lets you watch for certain Windows system events, so that you get alerted if any of these events occur.

SNMP, short for Simple Network Management Protocol, is the most common protocol for checking network-attached devices, such as routers and switches, for conditions that warrant administrative attention.

PING monitoring allows you to test the accessibility of your server over an IP network, not only externally from multiple locations around the world but also from your local network, for example when your website is deployed within your intranet and is not accessible from outside it.

You can also use this to monitor your website externally from additional locations. Failure status is returned by your PING monitor if:

  • No response from the server within the set timeout
  • Count of lost packets has crossed your set threshold
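
Those two failure rules are simple enough to sketch in a few lines of Python. The example shells out to the system ping command and assumes the Linux iputils flags (-c for count, -W for the reply timeout in seconds); the function names and defaults are hypothetical.

```python
import re
import subprocess

SUMMARY = re.compile(r"(\d+) packets transmitted, (\d+) (?:packets )?received")


def parse_ping_summary(output):
    """Return (sent, received) from the summary line printed by ping."""
    match = SUMMARY.search(output)
    if match is None:
        raise ValueError("no ping summary line found")
    return int(match.group(1)), int(match.group(2))


def ping_failed(sent, received, max_lost):
    """Failure if nothing came back, or if more than max_lost packets were lost."""
    lost = sent - received
    return received == 0 or lost > max_lost


def run_ping(host, count=4, timeout_s=2):
    """Run ping once and return (sent, received). Assumes Linux iputils flags."""
    result = subprocess.run(
        ["ping", "-c", str(count), "-W", str(timeout_s), host],
        capture_output=True, text=True,
    )
    return parse_ping_summary(result.stdout)
```

If the host name cannot be resolved, ping prints no summary and parse_ping_summary raises ValueError, which a caller would also treat as a failure.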

HTTP monitoring allows you to test the availability and response time of your website, not only externally from multiple locations around the world but also from your local network, for example when your website is deployed within your intranet and is not accessible from outside it.

You can also use this to monitor your website externally from additional locations, by simply installing Monitis Smart Agent and adding the Server/Device HTTP Monitor on a machine in any location you want to monitor your website from.

Failure status is returned by your HTTP monitor if:

  • No response from the server within the set timeout
  • DNS resolving error
  • HTTP error
  • Network or connection error
  • Connection closed by server
  • Basic authentication failed
  • Content matching failure
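
To show how these failure reasons might be told apart in practice, here is a rough sketch built on Python’s urllib. The mapping from exceptions to reasons is my own assumption of a reasonable classification, not the monitor’s actual logic, and the function names are hypothetical.

```python
import socket
import urllib.error
import urllib.request


def classify_failure(exc):
    """Map an exception raised during the request to one of the reasons above."""
    if isinstance(exc, urllib.error.HTTPError):
        return "basic authentication failed" if exc.code == 401 else "HTTP error"
    # URLError wraps the underlying socket error in its .reason attribute.
    reason = exc.reason if isinstance(exc, urllib.error.URLError) else exc
    if isinstance(reason, socket.gaierror):
        return "DNS resolving error"
    if isinstance(reason, (socket.timeout, TimeoutError)):
        return "timeout"
    if isinstance(reason, (ConnectionResetError, ConnectionAbortedError)):
        return "connection closed by server"
    return "network or connection error"


def check_http(url, timeout=10.0, must_contain=None):
    """Hypothetical check: returns 'ok' or a failure reason."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", errors="replace")
    except (urllib.error.URLError, OSError) as exc:
        return classify_failure(exc)
    if must_contain is not None and must_contain not in body:
        return "content matching failure"
    return "ok"
```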

HTTPS monitoring allows you to test the availability and response time of your website, not only externally from multiple locations around the world (see Uptime Monitoring -> HTTPS) but also from your local network, for example when your website is deployed within your intranet and is not accessible from outside it.

You should use HTTPS rather than HTTP monitoring if your website uses the HTTPS protocol for secure communication.

You can also use this to monitor your website externally from additional locations, by simply installing Monitis Smart Agent and adding the Server/Device HTTPS Monitor on a machine in any location you want to monitor your website from.

Failure status is returned by your HTTPS monitor if:

  • No response from the server within the set timeout
  • DNS resolving error
  • HTTP error
  • Network or connection error
  • Connection closed by server
  • Basic authentication failed
  • Content matching failure
  • SSL version not supported by user’s server
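
The last item is the one that sets HTTPS apart from HTTP, so here is a small sketch of what a protocol version check could look like with Python’s ssl module. Requiring TLS 1.2 as the minimum is my own assumption for the example, and the function names are hypothetical.

```python
import socket
import ssl


def build_context():
    """Client context that verifies certificates and refuses anything older than TLS 1.2."""
    ctx = ssl.create_default_context()
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


def check_tls(host, port=443, timeout=10.0):
    """Hypothetical check: returns (status, detail) for the TLS handshake."""
    ctx = build_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with ctx.wrap_socket(raw, server_hostname=host) as tls:
                return "ok", tls.version()
    except ssl.SSLError as exc:
        # Covers unsupported protocol versions and certificate problems.
        return "TLS handshake failed", str(exc)
    except OSError as exc:
        return "network or connection error", str(exc)
```

A server that only speaks an older protocol version would fail the handshake here, which corresponds roughly to the "SSL version not supported" case in the list.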

TCP monitoring allows you to test the availability and response time of your TCP server from your local network, for example when your TCP server is deployed within your intranet and is not accessible from outside it.

You can also use this to monitor your TCP server externally from additional locations, by simply installing Monitis Smart Agent and adding the Server/Device TCP Monitor on a machine in any location you want to monitor your TCP server from.

Failure status is returned by your Server/Device TCP monitor if:

  • No response from the server within the set timeout
  • Network or connection error
  • DNS resolving error
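
A bare-bones version of such a TCP check fits in a dozen lines of Python. This is only a sketch with a hypothetical function name: it opens a connection, times it, and maps the three failure cases above to the exceptions the socket module raises.

```python
import socket
import time


def check_tcp(host, port, timeout=5.0):
    """Hypothetical check: returns (status, connect time in seconds or None)."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            pass  # connection established; close it straight away
    except socket.gaierror:
        return "DNS resolving error", None
    except socket.timeout:
        return "timeout", None
    except OSError:
        return "network or connection error", None
    return "ok", time.monotonic() - start
```

The order of the except clauses matters, because the DNS and timeout errors are both subclasses of OSError.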

The bottom line is that by combining the high-level monitoring capabilities provided by Monitis with their server/device monitoring, you will have a clear, concise and current view of what your customers and users are experiencing, and the information to deal with any trouble that might come up. Plus, you can try it for free! So go ahead and test it out now: stress it, kick the tires and take it for a ride. I am sure you will be as pleased as I was.
