Nagios Log Monitoring: Monitor Unix Log Files Efficiently

Nagios Log File Monitoring – Monitoring log files with Nagios can be just as difficult as it is with any other monitoring application. However, once you have a log monitoring script or tool that checks a specific log file the way you want, Nagios can be trusted to handle the rest. This versatility is what makes Nagios one of the most popular and easy-to-use monitoring applications out there. It can monitor almost anything a script can check. Personally, I love it. It has no equal!

My name is Jacob Bowman and I work as a Nagios monitoring specialist. Judging by the number of requests I get at my job, monitoring log files is a big deal. IT departments have a continual need to monitor their UNIX log files so that application or system problems can be caught early. When problems are caught early, unplanned outages can often be avoided.

But the common question that many ask is, what monitoring application is available that can effectively monitor a log file? The simple answer to this question is NONE! The log monitoring apps out there require too much configuration, which in effect makes them unworthy of consideration.

Log monitoring should allow pluggable arguments on the command line (rather than in separate configuration files) and should be very easy for the average UNIX user to understand and use. Most log monitoring tools are not like this. They are often complex and take time to become familiar with (by reading endless pages of installation and configuration instructions). In my opinion, this is an unnecessary problem that can and should be avoided.

Again, I strongly believe that to be efficient, one must be able to run a program directly from the command line without having to go elsewhere to edit configuration files.

So the best solution, in most cases, is to either write a log monitoring tool for your particular needs, or download a log monitoring program that has already been written for your type of UNIX environment.

Once you have that log monitoring tool, you can hand it to Nagios, and Nagios will schedule it to run at regular intervals. If one of those runs finds the issues, patterns, or strings you told it to watch for, Nagios will raise an alert and send notifications to whomever you choose.

But then you wonder, what kind of log monitoring tool should you write or download for your environment?

The log monitoring program you should get to monitor your production log files should be as simple as the following, but it should still be powerfully versatile:

Example: logrobot /var/log/messages 60 'error' 'panic' 5 10 -found

Output: 2—1380—352—ATWF—(Tues/1)-(16:15)—(Tues/1)-(17:15:00)

Explanation:

The “-found” option searches /var/log/messages for the strings “error” and “panic”. After the search, the tool exits with 0 (for OK), 1 (for WARNING), or 2 (for CRITICAL). Each time you run the command, it prints a one-line statistical report similar to the output above. Fields are delimited by “—”.

The first field is 2 = the status code; 2 means CRITICAL.

The second field is 1380 = number of seconds since the strings you specified last occurred in the log.

The third field is 352 = the strings “error” and “panic” occurred 352 times in the log in the last 60 minutes.

The fourth field is ATWF = Don’t worry about this for now. Irrelevant.

Meaning of fields 5 and 6 = The log file was searched from (Tues/1)-(16:15) to (Tues/1)-(17:15:00). In the data collected from that time period, 352 occurrences of “error” and “panic” were found.
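If you want to feed that one-line report into another script, the field layout described above is easy to split apart. The following is only a sketch: the names `LogStatus` and `parse_status_line` are made up for illustration, and the delimiter is a parameter because the character shown in this article may differ from what your copy of the tool actually prints.

```python
from typing import NamedTuple

# Status codes as listed in the explanation above.
STATUS_NAMES = {0: "OK", 1: "WARNING", 2: "CRITICAL"}


class LogStatus(NamedTuple):
    """Hypothetical container for the six fields of the report line."""
    status: int
    seconds_since_last_match: int
    match_count: int
    flags: str            # the fourth field (ATWF); meaning not covered here
    window_start: str
    window_end: str


def parse_status_line(line: str, delimiter: str = "\u2014") -> LogStatus:
    """Split a report line into its six fields.

    The default delimiter is the dash character shown in the article;
    pass a different one if your tool prints something else.
    """
    fields = line.strip().split(delimiter)
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}: {line!r}")
    status, age, count, flags, start, end = fields
    return LogStatus(int(status), int(age), int(count), flags, start, end)


sample = "2\u20141380\u2014352\u2014ATWF\u2014(Tues/1)-(16:15)\u2014(Tues/1)-(17:15:00)"
report = parse_status_line(sample)
print(STATUS_NAMES[report.status], report.match_count)
```

The time-window fields are kept as plain strings here, since their exact format is not specified beyond the sample.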

If you really want to see all 352 occurrences, you can run the following command and pass the “-show” option to the logrobot tool. This will display on the screen all the matching lines in the log that contain the strings you specified and that were written to the log in the last 60 minutes.

Example: logrobot /var/log/messages 60 'error' 'panic' 5 10 -show

The “-show” option prints every line in the log file that contains the “error” and “panic” strings and was written within the 60-minute window you specified. Of course, you can always change the parameters to suit your particular needs.
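For readers who would rather write their own tool, here is a rough sketch of the same idea: count the matching lines inside a time window, map the count to a status code, and keep the matching lines so they can be shown. This is not how logrobot is implemented. It assumes each log line starts with an ISO timestamp (real syslog files usually use a different format), that a line matches if it contains any of the patterns, and that the two numbers are warning and critical match counts. The function name `check_log` is hypothetical.

```python
from datetime import datetime, timedelta


def check_log(lines, patterns, minutes, warn, crit, now):
    """Return (status, matching_lines) for the last `minutes` of the log.

    Assumptions for this sketch: lines look like
    '2024-01-02T17:00:00 message...', a line matches when it contains ANY
    pattern, and warn/crit are minimum match counts for status 1 and 2.
    """
    cutoff = now - timedelta(minutes=minutes)
    matches = []
    for line in lines:
        stamp, _, message = line.partition(" ")
        try:
            when = datetime.fromisoformat(stamp)
        except ValueError:
            continue  # skip lines without a parseable timestamp
        if cutoff <= when <= now and any(p in message for p in patterns):
            matches.append(line)

    if len(matches) >= crit:
        status = 2  # CRITICAL
    elif len(matches) >= warn:
        status = 1  # WARNING
    else:
        status = 0  # OK
    return status, matches


log = [
    "2024-01-02T15:00:00 kernel: panic in old entry",
    "2024-01-02T16:30:00 app: error opening socket",
    "2024-01-02T16:45:00 app: started worker",
    "2024-01-02T17:00:00 kernel: panic detected",
]
now = datetime(2024, 1, 2, 17, 15)
status, matches = check_log(log, ["error", "panic"], 60, warn=2, crit=10, now=now)

print(status, len(matches))   # the summary, similar in spirit to "-found"
for line in matches:          # the matching lines, similar in spirit to "-show"
    print(line)
```

Passing `now` in as an argument keeps the function easy to test; a real script would use the current time and read the lines from the log file.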

With this Nagios log monitoring tool (logrobot), you can perform the magic that the famous big-name monitoring applications cannot.

Once you write or download a log monitoring script or tool like the one above, you can have it run regularly by Nagios or cron, which in turn gives you a bird’s eye view of all your servers’ logged activities.

Do you have to use Nagios to run it regularly? Absolutely not. You can use whatever you want.
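Whatever scheduler you pick, the contract stays the same: run the check, read its exit code, and look at the first line it prints. A small wrapper along these lines could sit between your scheduler and the check. It is a sketch only; `run_check` is a hypothetical helper, and the stand-in command below is a tiny child Python process so the example runs on any machine without logrobot installed.

```python
import subprocess
import sys

# Status codes as described earlier in this article.
STATUS_NAMES = {0: "OK", 1: "WARNING", 2: "CRITICAL"}


def run_check(command, timeout=60):
    """Run a check command; return (status name, first line of its output)."""
    result = subprocess.run(
        command, capture_output=True, text=True, timeout=timeout
    )
    lines = result.stdout.splitlines()
    summary = lines[0] if lines else ""
    # Any exit code outside 0-2 is reported as UNKNOWN in this sketch.
    return STATUS_NAMES.get(result.returncode, "UNKNOWN"), summary


# Stand-in for a real check: prints one line and exits with status 2.
fake_check = [
    sys.executable,
    "-c",
    "print('2 matches found'); raise SystemExit(2)",
]
name, summary = run_check(fake_check)
print(name, summary)
```

To use it with the tool from this article, you would replace `fake_check` with an argument list such as `["logrobot", "/var/log/messages", "60", "error", "panic", "5", "10", "-found"]`, assuming logrobot is installed and on your PATH.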
