Tuesday, February 10, 2015

Real-time Log File Monitoring

Source: Astiostech's SYSMON, Nagios GURU: http://onelegkickingit.blogspot.com/
 

System, transaction, and application log files are critical places to look for problems affecting your business, systems, or applications. Most administrators monitor log files via cron or an NMS that scans them for certain keywords every 5 minutes. Once a keyword is detected, it triggers a notification to the administrator.

This works well for most administrators, but is it good enough?

As we all know, the drawback of monitoring at intervals is that there will always be a delay between checks. For example, if a log file is checked every 5 minutes, a problem can go undetected for up to 5 minutes. You could reduce the check interval to 1 minute (or less), which works much better, but this comes at the expense of higher processing load (due to the increased monitoring frequency). Furthermore, if these log files sit on a CRITICAL server, any delay in detection can be disastrous for the business or its applications.
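To make the interval-based approach concrete, here is a minimal sketch of a keyword check that could be run from cron. The function name, the state file, and the keyword are all assumptions made for illustration; it is not taken from any particular NMS.

```shell
#!/bin/sh
# Hypothetical helper: print the lines added to a log since the previous run
# that contain a keyword. The byte offset reached by the last run is kept in
# a state file so that old lines are not reported twice.
check_new_lines() {
    logfile=$1
    statefile=$2
    keyword=$3

    # Offset reached by the previous run (0 on the first run).
    last=0
    if [ -f "$statefile" ]; then
        last=$(cat "$statefile")
    fi

    # Current size of the log in bytes.
    size=$(wc -c < "$logfile" | tr -d ' ')

    # If the log shrank, assume it was rotated and start from the beginning.
    if [ "$size" -lt "$last" ]; then
        last=0
    fi

    # Scan only the bytes appended between the two runs.
    tail -c +"$((last + 1))" "$logfile" |
        head -c "$((size - last))" |
        grep -F -- "$keyword" || true

    # Remember where this run stopped.
    echo "$size" > "$statefile"
}
```

Run every 5 minutes, a check like this only sees a new line at the next run, which is exactly the delay described above.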

Well, it isn't looking good, is it? So there is a need to monitor log files in real time or, to be precise, almost real time ;D.

Introducing SEC (a.k.a. Simple Event Correlator). SEC is an event correlation tool for advanced event processing which can be harnessed for event log monitoring, for network and security management, for fraud detection, and for any other task which involves event correlation. Event correlation is a procedure where a stream of events is processed in order to detect (and act on) certain event groups that occur within predefined time windows. Unlike many other event correlation products, which are heavyweight solutions, SEC is a lightweight and platform-independent event correlator which runs as a single process. The user can start it as a daemon, employ it in shell pipelines, execute it interactively in a terminal, run many SEC processes simultaneously for different tasks, and use it in a wide variety of other ways.
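The idea of detecting a group of events inside a time window can be illustrated with a few lines of shell and awk. This is only a toy sketch and not how SEC works internally; the function name and the input format ("<epoch_seconds> <message>" per line) are assumptions made for the example.

```shell
#!/bin/sh
# Hypothetical helper: read "<epoch_seconds> <message>" lines on stdin and
# print an alert each time `threshold` messages containing `keyword`
# occur within `window` seconds.
correlate() {
    keyword=$1
    threshold=$2
    window=$3
    awk -v kw="$keyword" -v n="$threshold" -v w="$window" '
        index($0, kw) {
            times[++count] = $1          # remember when this event occurred
            first = count - n + 1        # oldest event of the candidate group
            if (first >= 1 && $1 - times[first] <= w) {
                print "ALERT: " n " events within " w "s, ending at " $1
                count = 0                # start counting afresh after an alert
            }
        }'
}
```

For instance, `correlate FAILURE 3 120` raises an alert once three lines containing FAILURE arrive within two minutes of each other, and stays silent for isolated failures.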


Now let us get started.

1) Download the installation file from this URL: http://sourceforge.net/projects/simple-evcorr/files/latest/download?source=typ_redirect
# cd /usr/src
# wget -O sec.tar.gz "http://sourceforge.net/projects/simple-evcorr/files/latest/download?source=typ_redirect"



2) Untar the downloaded tar.gz file and copy the extracted directory to '/usr/local/sec'.
# tar zxvf sec.tar.gz
# cp -r sec-* /usr/local/sec

For example, suppose you want to monitor the login failures on your system (the LOGIN FAILURE lines in the sample log below) and get notified via email.

#Logs involving logins, change of UID and privilege escalations (USERACT)
#-------------------------------------------------------------------------
#Nov 14 12:14:58 foohost sshd[3388]: fatal: Timeout before authentication for 192.168.1.1
#Nov 14 19:58:34 foohost sshd[6597]: Bad protocol version identification '^B^S^D^Q^L' from 192.168.1.100
#Oct 18 06:16:53 foohost sshd[131]: Accepted keyboard-interactive/pam for jpb from 192.168.1.1 port 1077 ssh2
#Nov 14 12:55:29 foohost sshd[3425]: Accepted keyboard-interactive/pam for jpb from fe80::2c0:4fff:fe18:13fd%ep0 port 27492 ssh2
#Nov 15 04:02:24 foohost login: 1 LOGIN FAILURE ON ttyp2
#Nov 15 04:02:24 foohost login: 1 LOGIN FAILURE ON ttyp2, mysql
#Oct 18 03:20:46 foohost login: 2 LOGIN FAILURES ON ttyv0
#Oct 18 02:52:04 foohost login: ROOT LOGIN (root) ON ttyv1
#Oct 18 06:11:11 foohost login: login on ttyv0 as root
#Nov 10 19:40:03 foohost su: jpb to root on /dev/ttyp0
#Nov 18 09:37:38 foohost su: BAD SU jpb to root on /dev/ttyp3
#Nov 22 12:26:44 foohost su: BAD SU badboy to root on /dev/ttyp0



3) Create a new configuration file (sshd-failures.conf) in the directory '/usr/local/nagios/etc/sec' with your favorite editor, and copy in the following text:

# Example sshd-failures
# Recognize a pattern and log it.
#
# login FAILURES
# ---------------
#
type=Single
ptype=RegExp
pattern=\S+\s+\d+\s+\S+\s+(\S+)\s+login: (.*?FAILURE.)(.*?ON) (.*)
desc=$0
action=write - USERACT: $1 login $2 on $4 at %t /usr/bin/mailx -s "LoginFailures" @

Note: The value of the pattern field is a Perl regular expression.

To test this configuration, execute the following command. This makes sec read events from standard input, i.e. whatever you type in the terminal.
#sec -conf=sshd-failures.conf -input=-
Reading configuration from sshd-failures.conf

Type in the following after this (without the '#', of course):
#Nov 15 04:02:24 foohost login: 1 LOGIN FAILURE ON ttyp2

It will match the rule and will trigger a notification to the administrator's email as shown below.
1 rules loaded from sshd-failures.conf
Writing event 'Nov 15 04:02:24 foohost login: 1 LOGIN FAILURE ON ttyp2' to file -
USERACT: foohost login 1 LOGIN FAILURE on ttyp2 at

To run it as a daemon process, just change -input=- to a log file name and add the -detach flag.
e.g.:
#sec -conf=sshd-failures.conf -input=/var/log/auth -detach



So yes, things are running now and emails are being sent out. Great! Now let's hook it up to a Network Monitoring System (NMS). Our NMS of choice is Nagios. This is event-based monitoring, and in Nagios such checks are handled as Passive Checks (please see the Nagios documentation for more info).

4) Create the following shell script (submit_check_result) and save it in the '/usr/local/nagios/libexec/eventhandlers' directory.

#!/bin/sh
# SUBMIT_CHECK_RESULT
# Written by Ethan Galstad (egalstad@nagios.org)
# Last Modified: 02-18-2002
#
# This script will write a command to the Nagios command
# file to cause Nagios to process a passive service check
# result.  Note: This script is intended to be run on the
# same host that is running Nagios.  If you want to
# submit passive check results from a remote machine, look
# at using the nsca addon.
#
# Arguments:
#  $1 = host_name (Short name of host that the service is
#       associated with)
#  $2 = svc_description (Description of the service)
#  $3 = return_code (An integer that determines the state
#       of the service check, 0=OK, 1=WARNING, 2=CRITICAL,
#       3=UNKNOWN).
#  $4 = plugin_output (A text string that should be used
#       as the plugin output for the service check)
#
echocmd="/bin/echo"
CommandFile="/usr/local/nagios/var/rw/nagios.cmd"
# get the current date/time in seconds since UNIX epoch
datetime=`date +%s`
# create the command line to add to the command file
cmdline="[$datetime] PROCESS_SERVICE_CHECK_RESULT;$1;$2;$3;$4"
# append the command to the end of the command file
# (quoted so that spaces and wildcard characters in the plugin output survive)
$echocmd "$cmdline" >> "$CommandFile"
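Before pointing anything at the real command file, it can help to look at the exact line that would be submitted. The following is a sketch of a hypothetical helper (not part of Nagios) that validates the return code and prints the command line instead of writing it; the optional fifth argument for the timestamp is an assumption added to make it easy to test.

```shell
#!/bin/sh
# Hypothetical helper: build a PROCESS_SERVICE_CHECK_RESULT line and print it
# on stdout instead of appending it to the Nagios command file.
build_passive_result() {
    host=$1
    service=$2
    code=$3
    output=$4
    # Optional fifth argument: timestamp in epoch seconds (defaults to now).
    now=${5:-$(date +%s)}

    # Only the four return codes listed in the script above are accepted.
    case $code in
        0|1|2|3) ;;
        *) echo "return_code must be 0, 1, 2 or 3" >&2; return 1 ;;
    esac

    printf '[%s] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%s;%s\n' \
        "$now" "$host" "$service" "$code" "$output"
}
```

Once the printed line looks right, the same output can be appended to /usr/local/nagios/var/rw/nagios.cmd, which is what submit_check_result does.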


Then modify the sshd-failures.conf file and edit the following line.
From:
action=write - USERACT: $1 login $2 on $4 at %t /usr/bin/mailx -s "LoginFailures" @

To:
action=shellcmd /usr/local/nagios/libexec/eventhandlers/submit_check_result localhost "SSHD Failures" 2 "USERACT: $1 login $2 on $4 at %t"

Note: In the above example, "localhost" refers to the host as it is defined in Nagios (here, the Nagios server itself). If you are monitoring a different server, use that host's name instead.



5) Optional: If you are planning to monitor a log file on a remote host, there are two ways to achieve this.

i) Set up the whole SEC configuration on the remote server and use the send_nsca program to submit results to the Nagios server (please see the Nagios documentation for more info).

ii) Export the directory that contains the log file over NFS and mount it on the Nagios server. Then change the SEC '-input' parameter to point to the NFS-mounted log file. Make sure the log file is readable by the SEC daemon process.
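For option (i), send_nsca reads check results from standard input as tab-separated fields: host name, service description, return code, and plugin output. A minimal sketch of a formatter follows; the function name, the host names, and the config file path in the commented usage are placeholders, not values from this setup.

```shell
#!/bin/sh
# Hypothetical helper: format one passive service check result as a
# tab-separated record, one result per line.
nsca_record() {
    printf '%s\t%s\t%s\t%s\n' "$1" "$2" "$3" "$4"
}

# Example use on the remote host (host names and config path are placeholders):
# nsca_record remotehost "SSHD Failures" 2 "USERACT: ..." |
#     send_nsca -H nagios.example.com -c /usr/local/nagios/etc/send_nsca.cfg
```

On the remote server, the SEC shellcmd action would then call a small script built around this instead of submit_check_result.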

Test it out and do let me know the outcome! Screenshots will come later! ;D
