check_logfiles Examples
Posted on September 28th, 2009 by lausser
This page contains example configuration files.
Example 1: Error messages from FCAL devices
Used as a Nagios plugin to monitor FCAL devices on a Solaris system. This is a simple setup that merely searches /var/adm/messages for a few patterns.
@searches = (
{
tag => 'san',
logfile => '/var/adm/messages',
rotation => 'SOLARIS',
criticalpatterns => [
'Link Down Event received',
'Loop OFFLINE',
'fctl:.*disappeared from fabric',
'.*Lun.*disappeared.*'
],
});
Example 2: The same again, this time as a passive service with send_nsca
With the following config file, check_logfiles can be run as a stand-alone script (e.g. from a cron job). If the listed error messages show up in the messages file, check_logfiles sends a summary message to an NSCA server via send_nsca at the end of its run.
$scriptpath = '/usr/bin/nagios/libexec:/usr/local/nagios/contrib';
$MACROS = {
NAGIOS_HOSTNAME => 'orschgeign.muc',
CL_NSCA_HOST_ADDRESS => 'nagios1.muc',
CL_NSCA_PORT => 5778
};
$postscript = 'send_nsca';
$postscriptparams = '-H $CL_NSCA_HOST_ADDRESS$ -p $CL_NSCA_PORT$ -to $CL_NSCA_TO_SEC$ -c $CL_NSCA_CONFIG_FILE$';
$postscriptstdin = '$CL_HOSTNAME$\t$CL_SERVICEDESC$\t$CL_SERVICESTATEID$\t$CL_SERVICEOUTPUT$\n';
@searches = (
{
tag => 'san',
logfile => '/var/adm/messages',
criticalpatterns => [
'Link Down Event received',
'Loop OFFLINE',
'fctl:.*disappeared from fabric',
'.*Lun.*disappeared.*'
],
},
);
Example 3: The same again, this time sending each individual hit
If a separate message should be sent via NSCA every time a line with critical content is detected, use the following modifications. Note, however, that hundreds of error messages may have been written to the messages file, which would then cause a corresponding event storm.
$scriptpath = '/usr/bin/nagios/libexec:/usr/local/nagios/contrib';
$MACROS = {
NAGIOS_HOSTNAME => 'orschgeign.muc',
CL_NSCA_HOST_ADDRESS => 'nagios1.muc',
CL_NSCA_PORT => 5778
};
@searches = (
{
tag => 'san',
logfile => '/var/adm/messages',
criticalpatterns => [
'Link Down Event received',
'Loop OFFLINE',
'fctl:.*disappeared from fabric',
'.*Lun.*disappeared.*'
],
options => 'script',
script => 'send_nsca',
scriptparams => '-H $CL_NSCA_HOST_ADDRESS$ -p $CL_NSCA_PORT$ -to $CL_NSCA_TO_SEC$ -c $CL_NSCA_CONFIG_FILE$',
scriptstdin => '$CL_HOSTNAME$\t$CL_SERVICEDESC$\t$CL_SERVICESTATEID$\t$CL_SERVICEOUTPUT$\n',
},
);
Example 4: Checking that the syslog service works
In the following example, the plugin sends a message to the syslog daemon with the logger command when it starts. After a short delay of 5 seconds (which gives syslogd enough time to write the message to the log file), the plugin searches for this message and raises an alarm if it cannot be found.
$scriptpath = '/usr/bin';
$prescript = 'logger';
$prescriptparams = '-t nagios';
$prescriptstdin = 'braver syslog ($CL_DATE_YYYY$-$CL_DATE_MM$-$CL_DATE_DD$ $CL_DATE_HH$:$CL_DATE_MI$:$CL_DATE_SS$)';
$prescriptsleep = 5;
@searches = (
{
tag => 'syslogworks',
logfile => '/var/adm/syslog/syslog.log',
rotation => 'bmwhpux',
criticalpatterns => ['!nagios:\s+braver\s+syslog'],
options => 'count',
},
);
Example 5: Monitoring HP ServiceGuard
This search looks for typical cluster error messages. The value HPUX for the rotation parameter ensures that syslog.log and, if present, OLDsyslog.log are searched.
$seekfilesdir = '/lfs/opt/nagios/var/tmp';
$protocolsdir = '/lfs/opt/nagios/var/tmp';
$scriptpath = '/lfs/opt/nagios/nrpe/locallibexec';
@searches = (
{
tag => 'mcsg',
logfile => '/var/adm/syslog/syslog.log',
rotation => 'HPUX',
criticalpatterns => [
'.*cmcld: Inbound connection from unconfigured address.*',
'.*cmclconfd.*Unable to activate keep alive option on incomming connection.*',
'.*inetd.*hacl-cfg/udp: Server failing \(looping\), service terminated.*',
'.*inetd.*hacl-probe/tcp: accept: Bad file number.*',
'.*cmcld: Inbound.*message from unconfigured address.*',
'.*cmcld: Unable to connect to quorum server .* It may be down.*',
'.*cmcld: Failed to receive from quorum server.*',
'.*cmcld: Connection failure to quorum server.*'
],
warningpatterns => [
'Cluster Files not in Sync',
],
options => 'protocol,count'
},
);
Example 6: Monitoring LVM on HP-UX
The following example searches for typical LVM error messages.
@searches = (
{
tag => 'lvm',
logfile => '/var/adm/syslog/syslog.log',
rotation => 'HPUX',
criticalpatterns => [
'.*vmunix: LVM: vg\[[0-9]*\]: pvnum=.*is POWERFAILED',
'.*vmunix: SCSI: Read error.*dev:.*errno:.*resid:.*',
'.*vmunix: LVM:.*PVLink.* Failed! The PV is still accessible.*',
'.*vmunix: LVM: Restored PV.*',
'.*vmunix: LVM: Performed a switch for Lun ID.*',
'.*vmunix: LVM:.*PVLink.*Recovered.*',
'.*vmunix:.*vxfs:.*vx_metaioerr.*file system meta data read error',
],
},
);
Example 7: Simple hardware monitoring of SUN servers
On Solaris, running the prtdiag command with the -l option sends any messages about defective hardware to syslog. This example first runs prtdiag as described. If a corresponding message is subsequently found in the messages file, a defect has been detected.
#
# This config file implements a simple method to monitor the
# hardware health of a solaris machine.
# From the prtdiag(1M) manpage:
# -l Log output. If failures or errors exist in the system,
# output this information to syslogd(1M) only.
# This means, if you run prtdiag and you find something
# prtdiag-related in the messages file, then there must be
# an error somewhere in the system.
#
$scriptpath = '/usr/platform/sun4u/sbin';
$prescript = 'prtdiag';
$prescriptparams = '-l';
@searches = (
{
tag => 'prtdiag',
logfile => '/var/adm/messages',
rotation => 'SOLARIS',
criticalpatterns => 'prtdiag:',
},
);
Example 8: Hardware monitoring of SUN servers with SNMP traps
In the following example, the log file /var/adm/kern.log is searched for messages that indicate memory errors. In this scenario check_logfiles does not run as a Nagios plugin but as a stand-alone program that sends an SNMP trap when an error occurs. To do so it calls the script send_snmptrap.pl, which receives the necessary information as environment variables. Here only a single trap is sent on error, namely at the end of the program. If you want an individual trap for each error message found, specify a “script” in the search definition instead of “$postscript”.
$MACROS = {
SNMP_TRAP_SINK_HOST => 'nagios.dierichs.de',
SNMP_TRAP_SINK_VERSION => 'snmpv1',
SNMP_TRAP_SINK_COMMUNITY => 'public',
SNMP_TRAP_SINK_PORT => 162,
SNMP_TRAP_ENTERPRISE_OID => '1.3.6.1.4.1.20006.1.5.1',
};
$seekfilesdir = '/lfs/opt/nagios/var/tmp';
$protocolsdir = '/lfs/opt/nagios/var/tmp';
$scriptpath = '/lfs/opt/nagios/nrpe/locallibexec';
@searches = (
{
tag => 'hwmsgs',
logfile => '/var/adm/kern.log',
rotation => 'kern\d{4}-\d{2}-\d{2}',
criticalpatterns => [
# bit error that the scrubber cannot repair.
# a crash is imminent.
'.*Sticky Softerror encountered.*',
],
warningpatterns => [
# memory is crumbling
'NOTICE: Previously reported error on page \w+\.\w+ cleared',
# network cable was unplugged
'WARNING: \w+: fault detected external to device; service degraded',
],
options => 'noprotocol',
},
);
$postscript = 'send_snmptrap.pl';
Jörg Linge kindly provided the following script for this purpose:
#! /usr/bin/perl
#
# send_snmptrap.pl
#
use strict;
use Net::SNMP;
my $hostname = $ENV{CHECK_LOGFILES_SNMP_TRAP_SINK_HOST}
|| 'nagios.dierichs.de';
my $version = $ENV{CHECK_LOGFILES_SNMP_TRAP_SINK_VERSION}
|| 'snmpv1';
my $community = $ENV{CHECK_LOGFILES_SNMP_TRAP_SINK_COMMUNITY}
|| 'public';
my $port = $ENV{CHECK_LOGFILES_SNMP_TRAP_SINK_PORT}
|| 162;
my $oid = $ENV{CHECK_LOGFILES_SNMP_TRAP_ENTERPRISE_OID}
|| '1.3.6.1.4.1.20006.1.5.1';
my ($session, $error) = Net::SNMP->session(
-hostname => $hostname,
-version => $version,
-community => $community,
-port => $port # Need to use port 162
);
if (!defined($session)) {
printf("ERROR: %s.\n", $error);
exit 1;
}
my @varbind = ($oid, OCTET_STRING, $ENV{CHECK_LOGFILES_SERVICEOUTPUT});
my $result = $session->trap(
-enterprise => $oid,
-specifictrap => $ENV{CHECK_LOGFILES_SERVICESTATEID},
-varbindlist => \@varbind);
$session->close;
exit 0;
Example 9: Hardware monitoring of SUN servers with alerting via NSCA
Instead of SNMP traps, error messages can also be reported to the Nagios server via send_nsca. Here, too, check_logfiles runs as a stand-alone program.
$scriptpath = '/usr/local/nagios/bin';
$MACROS = {
NAGIOS_HOSTNAME => 'orschgeign.muc',
CL_NSCA_HOST_ADDRESS => 'nagios1.muc',
CL_NSCA_PORT => 5778,
CL_NSCA_CONFIG_FILE => '/usr/local/etc/send_nsca.cfg',
};
@searches = (
{
tag => 'hwmsgs',
logfile => '/var/adm/kern.log',
rotation => 'kern\d{4}-\d{2}-\d{2}',
criticalpatterns => [
# bit error that the scrubber cannot repair.
# a crash is imminent.
'.*Sticky Softerror encountered.*',
],
warningpatterns => [
# memory is crumbling
'NOTICE: Previously reported error on page \w+\.\w+ cleared',
# network cable was unplugged
'WARNING: \w+: fault detected external to device; service degraded',
],
options => 'noprotocol',
},
);
$postscript = 'send_nsca';
$postscriptparams = '-H $CL_NSCA_HOST_ADDRESS$ -p $CL_NSCA_PORT$ -to $CL_NSCA_TO_SEC$ -c $CL_NSCA_CONFIG_FILE$';
$postscriptstdin = '$CL_HOSTNAME$\t$CL_SERVICEDESC$\t$CL_SERVICESTATEID$\t$CL_SERVICEOUTPUT$\n';
Example 10: Reading Linux log files as an unprivileged user
When check_logfiles starts, the permissions of the log files are extended so that the Nagios user may read them. This also requires an entry in /etc/sudoers:
qqnagio ALL = (root) NOPASSWD: /usr/bin/setfacl
If the sudo command fails, its exit code of 1, together with the supersmartprescript option, causes check_logfiles to abort with a warning.
If /etc/sudoers contains the line
Defaults requiretty
it must be commented out.
$scriptpath = '/usr/bin';
$prescript = 'sudo';
$prescriptparams = 'setfacl -m u:$CL_USERNAME$:r-- /var/log/messages*';
$options = 'supersmartprescript';
@searches = ({
tag => 'reiserfs',
logfile => '/var/log/messages',
rotation => 'SUSE',
criticalpatterns => [
'vs-5150: search_by_key:',
'is_tree_node: node level \d+ does not match to the expected one',
'vs-500: unknown uniqueness -1',
'vs-5657: reiserfs_do_truncate: i/o failure',
'green-16006: Invalid item type observed, run fsck ASAP'],
...
});
....
Example 11: Monitoring Apache on Windows for traces of break-ins
When using check_logfiles on Windows, make sure that paths are enclosed in single quotes because of the ‘\’ characters.
$MACROS = {
APACHEDIR => 'C:\Programme\Apache Software Foundation\Apache2.2'
};
@searches = ({
tag => 'apachebreakin',
logfile => '$APACHEDIR$\logs\access.log',
criticalpatterns => [
'GET.*cmd\.exe.*',
'SEARCH /\\x90\\x02\\xb1\\x02\\xb1' ]
});
Example 12: Undoing hits with the help of a script
Scripts with the “supersmart” property can help to examine hits in the log file more closely and, if necessary, change them after the fact.
@searches =(
{
tag => 'heiss',
logfile => '/var/log/messages',
criticalpatterns => '.*Thermometer: \d+ Grad.*',
options => 'supersmartscript',
script => sub {
my $grad = 0;
$ENV{CHECK_LOGFILES_SERVICEOUTPUT} =~ /: (\d+) Grad/;
$grad = $1;
if ($grad > 30) {
if (($ENV{CHECK_LOGFILES_DATE_MM} >= 6) &&
($ENV{CHECK_LOGFILES_DATE_MM} <= 8)) {
printf "OK - it is summer, after all\n";
return 0; # This hit therefore never existed.
} elsif (($ENV{CHECK_LOGFILES_DATE_MM} >= 11) ||
($ENV{CHECK_LOGFILES_DATE_MM} <= 2)) {
printf "CRITICAL - fire!\n";
return 2;
} else {
printf "WARNING - a bit warm in here\n";
return 1;
}
} else {
printf "OK - below 30 degrees\n";
return 0;
}
}
}
);
Example 13: Monitoring Fibre Channel links
The “virtual” type makes it possible to monitor files in the /proc or /sys directory. In the following example, the cable is unplugged from an Emulex LPe1150 adapter.
nagios@ibmsrv05:/> cat /sys/class/scsi_host/host0/model
ServeRAID 8i
nagios@ibmsrv05:/> cat /sys/class/scsi_host/host1/modeldesc
Emulex LPe1150-F4 4Gb 1port FC: PCIe SFF HBA
nagios@ibmsrv05:/> cat /sys/class/scsi_host/host2/modeldesc
Emulex LPe1150-F4 4Gb 1port FC: PCIe SFF HBA
.
.
.
nagios@ibmsrv05:/> cat /sys/class/scsi_host/host0/state
running
nagios@ibmsrv05:/> cat /sys/class/scsi_host/host1/state
Link Up - Ready:
Fabric
nagios@ibmsrv05:/> cat /sys/class/scsi_host/host2/state
Link Up - Ready:
Fabric
.
.
.
@searches = (
{
tag => 'host0',
logfile => '/sys/class/scsi_host/host0/state',
type => 'virtual',
criticalpatterns => [
'^[^running]+'
],
options => 'nologfilenocry,noprotocol',
},
{
tag => 'host1',
logfile => '/sys/class/scsi_host/host1/state',
type => 'virtual',
criticalpatterns => [
'Link [^Up]+'
],
options => 'nologfilenocry,noprotocol',
},
{
tag => 'host2',
logfile => '/sys/class/scsi_host/host2/state',
type => 'virtual',
criticalpatterns => [
'Link [^Up]+'
],
options => 'nologfilenocry,noprotocol',
},
);
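A note on the patterns above: [^running] and [^Up] are negated character classes, so they exclude single characters rather than the words "running" or "Up". They behave as intended for the states shown in this example, but a state that begins with one of the excluded letters would go unreported (any state starting with "r" for host0, or "Link U" followed by something other than "p" for host1). The following is a minimal sketch of an alternative, not the author's configuration. It assumes that check_logfiles hands the patterns to Perl's regex engine unchanged, so that negative lookaheads are available.

```perl
# Sketch only: same searches as above, with negative lookaheads
# instead of negated character classes.
@searches = (
  {
    tag => 'host0',
    logfile => '/sys/class/scsi_host/host0/state',
    type => 'virtual',
    criticalpatterns => [
      '^(?!running).+',   # any non-empty line that does not start with "running"
    ],
    options => 'nologfilenocry,noprotocol',
  },
  {
    tag => 'host1',
    logfile => '/sys/class/scsi_host/host1/state',
    type => 'virtual',
    criticalpatterns => [
      'Link (?!Up)',      # "Link " followed by anything other than "Up"
    ],
    options => 'nologfilenocry,noprotocol',
  },
);
```

The host2 search would change in the same way as host1.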
.
.
.
nagios@ibmsrv05:/> check_logfiles -f linux_fs_check_fcal.cfg
OK - no errors or warnings |host0=1;0;0;0 host1=2;0;0;0 host2=2;0;0;0
.
.
.
nagios@ibmsrv05:/> cat /sys/class/scsi_host/host2/state
Link Down
.
.
.
nagios@ibmsrv05:/> check_logfiles -f linux_fs_check_fcal.cfg
CRITICAL - (1 errors) - Link Down |host0_lines=1 host0_warnings=0 host0_criticals=0 host0_unknowns=0 host1_lines=2 host1_warnings=0 host1_criticals=0 host1_unknowns=0 host2_lines=1 host2_warnings=0 host2_criticals=1 host2_unknowns=0
Example 14: Forwarding the event log of Windows servers to a Unix syslog server
If the syslog messages contain entries from many servers because the event logs of all Windows servers are forwarded to one syslog server, the syslogclient option can be used to search specifically for the messages of a particular Windows server.
@searches = ({
tag => 'exchange1.dom',
logfile => '/var/log/messages',
rotation => 'SUSE',
criticalpatterns => [
'An MTA database server error was encountered',
],
options => 'syslogclient=exchange1.dom'
},
{
tag => 'exchange2.dom',
logfile => '/var/log/messages',
rotation => 'SUSE',
criticalpatterns => [
'An MTA database server error was encountered',
],
options => 'syslogclient=$CL_TAG$'
});
....
Example 15: Searching the AIX errpt
AIX writes many error messages to the so-called error report, which can be read with the errpt command. With type=errpt you can instruct check_logfiles to search the output of this command instead of a log file.
@searches = (
{
tag => 'minor_errors',
type => 'errpt',
criticalpatterns => ['ADAPTER ERROR',
'The largest dump device is too small.',
'The copy directory is too small.',
'Kernel heap use exceeds allocation count',
'Kernel heap use exceeds percentage thres',
'LINK ERROR',
'Permanent fatal error',
'SCSI BUS OR DEVICE ERROR',
'SCSI DEVICE OR MEDIA ERROR',
'Possible malfunction on local adapter',
'ETHERNET DOWN',
'UNABLE TO ALLOCATE SPACE IN KERNEL HEAP'
],
}
);
Example 16: Forwarding event logs with templates
If messages from many different syslog clients end up in one log file, a single run of check_logfiles can be restricted to the messages originating from one client. This is done with the syslogclient option. Its value is the hostname that is used to pre-filter the log file.
define command {
command_name check_client_logs
command_line $USER2$/check_logfiles --tag=$HOSTNAME$ \
--logfile='/var/log/messages' \
--criticalpattern='$ARG1$' --syslogclient='$CL_TAG$'
}
define service {
service_description dr_watson
host_name pc0815.muc
check_command check_client_logs!4097.*generated an application error
}
Templates make it possible to define several searches in one configuration file and to select individual ones depending on the host type. Otherwise you would have to write a separate search definition for every client.
@searches = (
{
template => 'drwatson',
logfile => '/var/log/messages',
criticalpatterns => '4097.*generated an application error',
options => 'syslogclient=$CL_TAG$'
},
{
template => 'virus',
logfile => '/var/log/messages',
criticalpatterns => 'a virus was found',
options => 'syslogclient=$CL_TAG$'
},
{
template => 'cluster',
logfile => '/var/log/messages',
criticalpatterns => ['5029.*The cluster log is corrupt',
'5038.*A cluster resource failed', ],
options => 'syslogclient=$CL_TAG$'
});
For “normal” Windows clients you would then call:
check_logfiles -f <config file> --tag='pc0815' \
--selectedsearches='drwatson,virus'
And for cluster servers:
check_logfiles -f <config file> --tag='clustsrv1.muc'
Example 17: Oracle alert log
Oracle databases write error messages to an alert log. Paying attention to these messages helps to detect serious problems early. (See also type => “oraclealertlog”)
@searches = ({
tag => 'oraalerts',
logfile => '......../alert.log',
criticalpatterns => [
'ORA\-0*204[^\d]', # error in reading control file
'ORA\-0*206[^\d]', # error in writing control file
'ORA\-0*210[^\d]', # cannot open control file
'ORA\-0*257[^\d]', # archiver is stuck
'ORA\-0*333[^\d]', # redo log read error
'ORA\-0*345[^\d]', # redo log write error
'ORA\-0*4[4-7][0-9][^\d]',# ORA-0440 - ORA-0485 background process failure
'ORA\-0*48[0-5][^\d]',
'ORA\-0*6[0-3][0-9][^\d]',# ORA-0600 - ORA-0639 internal errors
'ORA\-0*1114[^\d]', # datafile I/O write error
'ORA\-0*1115[^\d]', # datafile I/O read error
'ORA\-0*1116[^\d]', # cannot open datafile
'ORA\-0*1118[^\d]', # cannot add a data file
'ORA\-0*1122[^\d]', # database file 16 failed verification check
'ORA\-0*1171[^\d]', # datafile 16 going offline due to error advancing checkpoint
'ORA\-0*1201[^\d]', # file 16 header failed to write correctly
'ORA\-0*1208[^\d]', # data file is an old version - not accessing current version
'ORA\-0*1578[^\d]', # data block corruption
'ORA\-0*1135[^\d]', # file accessed for query is offline
'ORA\-0*1547[^\d]', # tablespace is full
'ORA\-0*1555[^\d]', # snapshot too old
'ORA\-0*1562[^\d]', # failed to extend rollback segment
'ORA\-0*162[89][^\d]', # ORA-1628 - ORA-1632 maximum extents exceeded
'ORA\-0*163[0-2][^\d]',
'ORA\-0*165[0-6][^\d]', # ORA-1650 - ORA-1656 tablespace is full
'ORA\-16014[^\d]', # log cannot be archived, no available destinations
'ORA\-16038[^\d]', # log cannot be archived
'ORA\-19502[^\d]', # write error on datafile
'ORA\-27063[^\d]', # number of bytes read/written is incorrect
'ORA\-0*4031[^\d]', # out of shared memory.
'No space left on device',
'Archival Error',
],
warningpatterns => [
'ORA\-0*3113[^\d]', # end of file on communication channel
'ORA\-0*6501[^\d]', # PL/SQL internal error
'ORA\-0*1140[^\d]', # follows WARNING: datafile #20 was not in online backup mode
'Archival stopped, error occurred. Will continue retrying',
]
});
Example 17a: Oracle RAC Clusterware alert log
This example for monitoring the alert log of Oracle Clusterware was contributed by Daniel Graef. Many thanks!
@searches = (
{
tag => 'racnode01-clusterware',
logfile => '/oracle/app/crs/product/111_1/log/racnode01/alertracnode01.log',
criticalpatterns => [
'CRS\-1006[^\d]', # The OCR location %s is inaccessible. Details in %s.
'CRS\-1008[^\d]', # Node %s is not responding to OCR requests. Details in %s.
'CRS\-1009[^\d]', # The OCR configuration is invalid. Details in %s.
'CRS\-1011[^\d]', # OCR cannot determine that the OCR content contains the latest updates. Details in %s.
'CRS\-1202[^\d]', # CRSD aborted on node %s. Error [%s]. Details in %s.
'CRS\-1203[^\d]', # Failover failed for the CRS resource %s. Details in %s.
'CRS\-1205[^\d]', # Auto-start failed for the CRS resource %s. Details in %s.
'CRS\-1206[^\d]', # Resource %s went into an UNKNOWN state. Force stop the resource using the crs_stop -f command and restart %s.
'CRS\-1207[^\d]', # There are no more restart attempts left for resource %s. Restart the resource manually using the crs_start command.
'CRS\-1402[^\d]', # EVMD aborted on node %s. Error [%s]. Details in %s.
'CRS\-1602[^\d]', # CSSD aborted on node %s. Error [%s]. Details in %s.
'CRS\-1606[^\d]', # CSSD Insufficient voting files available [%s of %s]. Details in %s.
'CRS\-1608[^\d]', # CSSD Evicted by node %s. Details in %s. [local node evicted, critical for the node itself]
'CRS\-1609[^\d]', # CSSD detected a network split. Details in %s.
],
warningpatterns => [
'CRS\-1010[^\d]', # The OCR mirror location %s was removed.
'CRS\-1604[^\d]', # CSSD voting file is offline: %s. Details in %s.
'CRS\-1607[^\d]', # CSSD evicting node %s. Details in %s. [local node evicted another node, warning for cluster state]
'CRS\-2001[^\d]', # memory allocation error when initiating the connection failed to allocate memory for the connection with the target process
'CRS\-2003[^\d]', # error %d encountered when connecting to %s
'CRS\-2004[^\d]', # error %d encountered when sending messages to %s
'CRS\-2005[^\d]', # timed out when waiting for response from %d
'CRS\-2006[^\d]', # failed to get response from %d
],
options => 'sticky=86400'
});
Example 18: IPMI System Event Log
This example searches for power supply problems. (ipmitool sdr may not show that a power cable has been pulled, which is why the SEL is searched instead.)
@searches = (
{
tag => 'powercable',
type => 'ipmitool',
ipmitool => { # you don't need this if you are root
path => 'sudo /usr/bin/ipmitool',
},
criticalpatterns => [
'Power Supply.*Failure detected',
'Power Supply AC lost',
],
});
nagios@ibmsrv05:/> check_logfiles -f ibm_power.cfg
CRITICAL - (6 errors in test.protocol-2008-02-12-14-19-36) - 190 ; 02/07/2008 ; 14:28:13 ; Power Supply #0x39 ; Failure detected ...|powercable_lines=17 powercable_warnings=0 powercable_criticals=6 powercable_unknowns=0
Example 19: Passive check results that cannot be assigned
Passive check results that cannot be assigned to any host or service (e.g. because of a typo) are silently discarded, apart from an entry in nagios.log. With this method, Nagios can report such errors. The idea comes from Augustinus.
$MACROS = {
NAGIOS_LOGFILES => '/var/nagios'
};
@searches = ({
tag => 'nagios_unmatched_passive_check_results',
logfile => '$NAGIOS_LOGFILES$/nagios.log',
archivedir => '$NAGIOS_LOGFILES$/archives',
rotation => 'nagios-\d{2}-\d{2}-\d{4}-\d{2}.log',
criticalpatterns => [
'^\[\d+\] Warning: Passive check result was received for service .* on host .* but the service could not be found',
'^\[\d+\] Warning: Passive check result was received for service .* on host .* but the host could not be found',
],
});
51 Responses to “check_logfiles Examples”
-
selcuk Says:
December 28th, 2009 at 16:37
Hi,
Example 10: Scan Linux logfiles as an unprivileged user
The above example can be a security hole. Instead, the following can be done (on Red Hat systems):
[root@nagios logrotate.d]# cat /etc/logrotate.d/syslog
/var/log/messages /var/log/secure /var/log/maillog /var/log/spooler /var/log/boot.log /var/log/cron {
…
/usr/bin/setfacl -m u:nagios:rx /var/log/messages
endscript
}
This configuration has to be done only once; after that it sets the nagios ACL correctly.
You can check the file permissions with getfacl /var/log/messages. The nagios user must appear in the output with rx rights.
By the way, thanks for the script.
-
Ovidiu Says:
January 8th, 2010 at 9:01
The examples are really great! Thank you.
Would it be possible to have Example 10 with all the critical and warning patterns? Thanks in advance.
lausser Reply:
January 8th, 2010 at 11:32
I’m sorry, the patterns in the example are all I know. The “…” does not mean that I was just too lazy to write down a complete list.
-
Mihael Says:
March 15th, 2010 at 21:47
Hi, how should service.cfg be configured on the Nagios server if I want to use send_nsca (Example 2) on a remote machine?
-
Jim Says:
May 7th, 2010 at 10:43
Hi, do you have an example config to check MS SQL Server logs, in particular the rotation settings? I have not been able to get this working on a Windows 2003 server running SQL Server 2005. Thanks
lausser Reply:
May 10th, 2010 at 0:21
I don’t know how your logfiles are organised. Maybe this will work?
@searches = ({
logfile => 'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\LOG\ERRORLOG',
rotation => 'loglog0log1', # equal to 'ERRORLOG\.\d+'
-
Jim Says:
May 7th, 2010 at 13:43
Hi,
I have resolved my issues: the SQL log file is Unicode, so setting the option = 'encoding=ucs-3' fixed my problem.
lausser Reply:
May 10th, 2010 at 0:26
Ah, ok. Are you sure it’s ucs-3? Not ucs-2?
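Putting the two replies in this thread together, a search for such a file might look like the sketch below. This is an illustration rather than a tested configuration: the tag and the critical pattern are hypothetical placeholders, the encoding value assumes the file is UCS-2 as lausser suspects, and it assumes the encoding option belongs in the options string of the search.

```perl
@searches = ({
  tag => 'mssql_errorlog',            # hypothetical tag
  logfile => 'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\LOG\ERRORLOG',
  rotation => 'loglog0log1',          # taken from lausser's reply above
  criticalpatterns => [
    'Severity: 2\d',                  # hypothetical pattern, adjust to your messages
  ],
  options => 'encoding=ucs-2',        # assumption: the file is UCS-2 encoded
});
```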
-
Ryan Says:
May 13th, 2010 at 20:35
Does anyone know the correct format for send_nsca to include the $CL_SERVICEPERFDATA$ in a manner that allows something like PNP4Nagios to automatically grab it?
Not sure that I want to use it, but I am curious how useful the graphs would be over time.
-
Ryan Says:
May 13th, 2010 at 21:13
Sorry to spam… Is there any way to make a global critical/warning exception within the config file? I am looking at crit.log. We have hundreds of explicit ignores (exceptions) that I need to account for in my various @search conditions. I really don’t want to have to put all the suppress conditions in each condition. I have to do this because our logic requires the last condition to be a “catch all” which lets anything not explicitly suppressed through as a ticket. Is the best solution just to include the exceptions on that last “catch all” condition?
lausser Reply:
May 14th, 2010 at 21:52
Maybe like this?
$globalexceptions = [
'except1',
'except2',
.....
];
...
@searches = ({
criticalexceptions => $globalexceptions,
}, {
criticalexceptions => $globalexceptions,
}, {
...
-
Ryan Says:
May 18th, 2010 at 21:28
Two technical questions regarding the check_logfiles cfg files. I am working to migrate away from a vendor product.
We have conditions in our current tool that dictate I need X number of matches over Y time. I see I can use the options “count” and “savethresholdcount”. What I don’t see is a way to implement a sliding window of time to compare that “count” to. Any ideas?
Some of our existing conditions take pieces from the matched text to be included in our alert text (very much like the $1 .. $n in regex). Is there any way to capture those values during the match and make them available with a predefined macro?
Thanks so much for your time.
lausser Reply:
May 19th, 2010 at 16:53
There is no such thing as a sliding window in check_logfiles. Maybe you can implement it by using the “supersmartscript” and “supersmartpostscript” options. Rewriting the message text can be done with a supersmartscript, where the handler calls the matching operator and adds brackets to the regular expression.
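The message rewriting described in this reply could be sketched as follows, modelled on Example 12. This is a minimal illustration, not a tested configuration: the tag, the log message format and the output text are hypothetical. It relies on the mechanism shown in Example 12, where the handler reads the matched line from CHECK_LOGFILES_SERVICEOUTPUT, prints the replacement text and returns the new status.

```perl
@searches = (
  {
    tag => 'diskfull',                                   # hypothetical tag
    logfile => '/var/log/messages',
    criticalpatterns => 'filesystem \S+ is \d+% full',   # hypothetical message format
    options => 'supersmartscript',
    script => sub {
      my $line = $ENV{CHECK_LOGFILES_SERVICEOUTPUT};
      # Apply the same expression again, this time with brackets
      # (capture groups) around the parts wanted in the alert text.
      if ($line =~ /filesystem (\S+) is (\d+)% full/) {
        printf "CRITICAL - %s is at %d%%\n", $1, $2;
        return 2;
      }
      # Fallback: pass the original line through unchanged.
      print "$line\n";
      return 2;
    },
  },
);
```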
-
M. Bloch Says:
May 26th, 2010 at 16:12
Hi,
first of all, many thanks for the plugin :).
I use Nagios 3 with Fedora 12 and I am trying to compile the plugin with
./configure --prefix=/usr/lib/nagios/plugins --with-nagios-user=nagios --with-nagios-group=nagcmd --with-perl=/usr/bin --with-gzip=/usr/bin --with-trusted-path=/sbin:/usr/sbin:/bin:/usr/bin --with-seekfiles-dir=/tmp --with-protocols-dir=/tmp
make
make install
I get no error message during compiling. But when I execute the plugin with
./check_logfiles
I get the message:
-bash: ./check_logfiles: /usr/bin/: bad interpreter: Keine Berechtigung
Does anybody have an idea why I get this message?
thx a lot
M. Bloch Reply:
May 26th, 2010 at 16:44
I got it :)
The first line in check_logfiles was wrong. It read #! /usr/bin/ -w
I just added perl after bin/ and now it works :)
lausser Reply:
May 26th, 2010 at 20:49
--with-perl=/usr/bin
This means: use the Perl interpreter /usr/bin, which of course does not exist. It must be
--with-perl=/usr/bin/perl
But as long as you want to use the default perl (/usr/bin/perl) it is not necessary to use --with-perl. The configure script will find the right one automatically.
-
crankytexan Says:
August 4th, 2010 at 0:56
How do I have the output show more than just the last row? I have a config file that looks for the right patterns, but although there are 15 more lines after the alert, it only prints the line where the alert is. Can I concatenate that line with, say, the 8 lines below where the error occurs?
lausser Reply:
August 4th, 2010 at 1:36
With criticalpatterns => '.*' and supersmartscripts you can implement this. Look at the examples to see how you can code your own logic.
-
crankytexan Says:
August 9th, 2010 at 21:20
I am new at this so please excuse my multiple questions; I greatly appreciate the help.
Another issue: I want each line matching one of the patterns under criticalpatterns to be output when an alert comes through. In other words, if I have under critical patterns:
criticalpatterns => [
'No messages have been received from the remote host for the specified time period.',
'Error retrieving portfolio',
'Object reference not set to'
],
I need to see each one of those. Currently I am only seeing the last pattern.
Also, I have multiple logs in the config file and it is adding those together and outputting the last pattern. There is a count of how many criticals for each log, but I need to know each error for each log:
@searches = (
{
tag => 'Log A',
logfile => 'c:\Temp\server.net.log',
criticalpatterns => [
'No messages have been received from the remote host for the specified time period.',
'Error retrieving portfolio',
'Object reference not set to'
],
},
{
tag => 'Log B',
logfile => 'c:\Temp\server.netb.log',
criticalpatterns => [
'No messages have been received from the remote host for the specified time period.',
'Error retrieving portfolio',
'Object reference not set to'
],
},
);
Example return when running manually in a cmd prompt:
CRITICAL - (6 errors in logconfig.protocol-2010-08-09-14-49-12) - Object reference not set to ...|Log A_Lines=4 Log A_warnings=0 Log A_criticals=3 Log A_unknowns=0 Log B_lines=4 Log B_warnings=0 Log B_criticals=3 Log B_unknowns=0
crankytexan Reply:
August 9th, 2010 at 21:23
Please excuse the above formatting. I thought I had put it in correctly, but the @searches came across weird. It should look like this (if it comes out right):
@searches = (
{
tag => 'Log A',
logfile => 'c:\Temp\server.net.log',
criticalpatterns => [
'No messages have been received from the remote host for the specified time period.',
'Error retrieving portfolio',
'Object reference not set to'
],
},
{
tag => 'Log B',
logfile => 'c:\Temp\server.netb.log',
criticalpatterns => [
'No messages have been received from the remote host for the specified time period.',
'Error retrieving portfolio',
'Object reference not set to'
],
},
);
lausser Reply:
August 9th, 2010 at 21:42
$options = 'report=long';
@searches = (
{
tag => 'Log A',
logfile => 'c:\\Temp\\server.net.log',
criticalpatterns => [
'No messages have been received from the remote host for the specified time period.',
'Error retrieving portfolio',
'Object reference not set to'
],
},
{
tag => 'Log B',
logfile => 'c:\\Temp\\server.netb.log',
criticalpatterns => [
'No messages have been received from the remote host for the specified time period.',
'Error retrieving portfolio',
'Object reference not set to'
],
},
);
looks okay
(except the double backslash. It must be just a single one, but my editor is bad)
lausser Reply:
August 9th, 2010 at 21:24
Use the global option ‘report’
$options = 'report=long';
It will output multiple lines with one line per hit.
crankytexan Reply:
August 9th, 2010 at 21:55
Unfortunately that didn’t work. I still get the same output. Would running this on a Windows box make a difference?
-
crankytexan Says:
August 12th, 2010 at 17:40
So when I use the above config file I get this return. I know that all of the above alerts are in the test logfile because I put them in there, but it is only returning one of the alerts and is telling me how many are in each tag. How do I get it to print each alert and show the tags separately? I am so close to completing our checks, but I have no reference here because no one has used this plugin at my company. Below is the return when running from a command line:
H:\>perl "c:\nsclient++\PlugIns\check_logfiles" -f "c:\NSClient++\PlugIns\dearconfig.cfg"
CRITICAL - (8 errors in dearconfig.protocol-2010-08-12-11-26-22) - No messages have been received from the remote host for the specified time period. ...|Sydney Dear Logs_lines=6 Sydney Dear Logs_warnings=0 Sydney Dear Logs_criticals=5 Sydney Dear Logs_unknowns=0 Dear Logs EMEA OF_lines=5 Dear Logs EMEA OF_warnings=0 Dear Logs EMEA OF_criticals=3 Dear Logs EMEA OF_unknowns=0
I added what you stated above to the config file (please excuse the formatting if it comes across wrong):
$options = 'report=long';
@searches = (
{
tag => 'Sydney Dear Logs',
logfile => 'c:\Temp\dear.server.net.log',
criticalpatterns => [
'No messages have been received from the remote host for the specified time period.',
'Error retrieving portfolio',
'Object reference not set to'
],
},
{
tag => 'Dear Logs EMEA OF',
logfile => 'c:\Temp\dear.server.netb.log',
criticalpatterns => [
'No messages have been received from the remote host for the specified time period.',
'Error retrieving portfolio',
'Object reference not set to'
],
},
);
-
Stefan Says:
October 12th, 2010 at 10:57
Hello, I don’t understand why the postscript is not executed. Everything should be correct, shouldn’t it?
Regards, Stefan
$scriptpath = 'C:\temp\check_logfiles\send_nsca';
$MACROS = {
  CL_NSCA_DELIMITER => ',',
  CL_NSCA_HOST_ADDRESS => '192.168.138.88'
};
$postscript = 'send_nsca.exe';
$postscriptparams = '-H $CL_NSCA_HOST_ADDRESS$ -d $CL_NSCA_DELIMITER$ -c $CL_NSCA_CONFIG_FILE$';
$postscriptstdin = '$CL_HOSTNAME$,$CL_SERVICEDESC$,$CL_SERVICESTATEID$,$CL_SERVICEOUTPUT$';
@searches = (
  {
    tag => 'test1',
    logfile => 'C:\temp\check_logfiles\log.txt',
    criticalpatterns => [
      '222',
      '555'
    ],
  },
);
lausser Reply:
October 12th, 2010 at 11:03
You are still missing
$options = 'postscript';
The presence of $postscript alone is not enough; you have to explicitly enable its execution via an option.
Stefan Reply:
October 12th, 2010 at 12:19
Thanks for the quick reply. I added it, but unfortunately the postscript is not executed…. Is there still an error somewhere?
Regards
$scriptpath = 'C:\temp\check_logfiles\send_nsca';
$postscript = 'send_nsca.exe';
$postscriptparams = '-H $CL_NSCA_HOST_ADDRESS$ -d $CL_NSCA_DELIMITER$ -c $CL_NSCA_CONFIG_FILE$';
$postscriptstdin = '$CL_HOSTNAME$,$CL_SERVICEDESC$,$CL_SERVICESTATEID$,$CL_SERVICEOUTPUT$';
$options = 'postscript';
$MACROS = {
  CL_NSCA_DELIMITER => ',',
  CL_NSCA_HOST_ADDRESS => '192.168.138.88'
};
@searches = (
  {
    tag => 'test1',
    logfile => 'C:\temp\check_logfiles\log.txt',
    criticalpatterns => [
      '222',
      '555'
    ],
  },
);
Stefan Reply:
October 12th, 2010 at 13:20
@Stefan, Now I have tried this config file, unfortunately without success here either. send_nsca works when run by hand. I am stuck.
$scriptpath = 'C:\temp\check_logfiles\send_nsca';
$options = 'postscript';
$MACROS = {
  CL_NSCA_DELIMITER => ',',
  CL_NSCA_HOST_ADDRESS => '192.168.138.88'
};
@searches = (
  {
    tag => 'test1',
    logfile => 'C:\temp\check_logfiles\log.txt',
    criticalpatterns => [
      '222',
      '555'
    ],
    options => 'postscript',
    postscript => 'send_nsca.exe',
    postscriptparams => '-H $CL_NSCA_HOST_ADDRESS$ -d $CL_NSCA_DELIMITER$ -c $CL_NSCA_CONFIG_FILE$',
    postscriptstdin => '$CL_HOSTNAME$,$CL_SERVICEDESC$,$CL_SERVICESTATEID$,$CL_SERVICEOUTPUT$',
  },
);
lausser Reply:
October 12th, 2010 at 13:28
That doesn’t work. $postscript etc. must be placed outside the searches definitions. The config posted with the text “Habe es hinzugefügt, aber leider….” (“I added it, but unfortunately….”) looks good. You should have an environment variable %TEMP% that points to a temporary directory. In that directory, create an empty file named check_logfiles.trace. When you then run check_logfiles again, a lot of debugging information is written to it, including the (attempted) call of send_nsca.exe.
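The trace file described above could be created with a few lines of Perl. This is only a sketch: the file name check_logfiles.trace and the %TEMP% location come from the reply above, while the fallback to the system temp directory when TEMP is unset is an assumption added for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Spec;

# Directory named by %TEMP%. The fallback to File::Spec->tmpdir is an
# assumption for systems where TEMP is not set.
my $dir = defined $ENV{TEMP} ? $ENV{TEMP} : File::Spec->tmpdir;
my $trace = File::Spec->catfile($dir, 'check_logfiles.trace');

# Append mode creates an empty file if none exists and leaves an
# existing trace untouched.
open(my $fh, '>>', $trace) or die "cannot create $trace: $!";
close($fh);

print "trace file ready: $trace\n";
```

Deleting the file again afterwards should stop the tracing, so it is best kept only for the duration of a debugging session.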
Stefan Reply:
October 12th, 2010 at 13:58
Very good, here is the error: the send_nsca.cfg path points to a Unix directory, but this is a Windows box.
Tue Oct 12 13:45:53 2010: execute C:\temp\check_logfiles\send_nsca\send_nsca.exe -H 192.168.138.88 -d , -c /usr/local/nagios/etc/send_nsca.cfg
So I set the matching macro explicitly: CL_NSCA_CONFIG_FILE => 'C:\temp\check_logfiles\send_nsca\send_nsca.cfg'
-> It works. Many thanks, Gerhard! Btw: the postscript also runs without “$options = 'postscript';”. Is that intended?
lausser Reply:
October 12th, 2010 at 14:03
I had to look that up again myself just now. It is indeed sufficient for $postscript to be present. You only need $options if you specify ‘smartpostscript’ (the return/exit code of the postscript is counted as an additional critical/warning) or ‘supersmartpostscript’ (the return/exit code of the postscript determines the final result of check_logfiles, regardless of whether error messages were already found in the logfile or not).
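As an illustration of the two option values just described, a config fragment might look like the following. It is a sketch under assumptions: the script name my_summary_check, the paths, the tag and the pattern are hypothetical placeholders, not taken from the plugin.

```perl
# Hypothetical script name and path, for illustration only.
$scriptpath = '/usr/local/nagios/contrib';
$postscript = 'my_summary_check';

# With no option set, the postscript simply runs after all searches.
# 'smartpostscript':      its exit code counts as one more warning/critical.
# 'supersmartpostscript': its exit code becomes the final result.
$options = 'supersmartpostscript';

@searches = (
  {
    tag => 'applog',                 # hypothetical tag
    logfile => '/var/log/app.log',   # hypothetical logfile
    criticalpatterns => [ 'ERROR' ], # hypothetical pattern
  },
);
```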
Stefan Reply:
October 12th, 2010 at 13:59
@lausser, the complete config file:
$scriptpath = 'C:\temp\check_logfiles\send_nsca';
$options = 'postscript';
$postscript = 'send_nsca.exe';
$postscriptparams = '-H $CL_NSCA_HOST_ADDRESS$ -c $CL_NSCA_CONFIG_FILE$';
$postscriptstdin = '$CL_HOSTNAME$\t$CL_SERVICEDESC$\t$CL_SERVICESTATEID$\t$CL_SERVICEOUTPUT$\n';
$MACROS = {
  CL_NSCA_HOST_ADDRESS => '192.168.138.88',
  CL_NSCA_CONFIG_FILE => 'C:\temp\check_logfiles\send_nsca\send_nsca.cfg'
};
@searches = (
  {
    tag => 'test1',
    logfile => 'C:\temp\check_logfiles\log.txt',
    criticalpatterns => [
      '222',
      '555'
    ],
  },
);
-
Stefan Says:
October 12th, 2010 at 11:49
Hello! The pattern is found, but no postscript is executed. Please help me find the problem in this config file. Regards, Stefan
$scriptpath = 'C:\temp\check_logfiles\send_nsca';
$MACROS = {
  CL_NSCA_DELIMITER => ',',
  CL_NSCA_HOST_ADDRESS => '192.168.138.88'
};
$postscript = 'send_nsca.exe';
$postscriptparams = '-H $CL_NSCA_HOST_ADDRESS$ -d $CL_NSCA_DELIMITER$ -c $CL_NSCA_CONFIG_FILE$';
$postscriptstdin = '$CL_HOSTNAME$,$CL_SERVICEDESC$,$CL_SERVICESTATEID$,$CL_SERVICEOUTPUT$';
@searches = (
  {
    tag => 'test1',
    logfile => 'C:\temp\check_logfiles\log.txt',
    criticalpatterns => [
      '222',
      '555'
    ],
  },
);
-
Paul Kilgour Says:
November 23rd, 2010 at 13:02
Hi, thanks very much for the script. It works great for me using one server, but now I have 2 servers monitoring the same log files for failover reasons. Is there a way for each server to keep track of new lines in the log on its own, rather than both sharing one position? I need them both to report the same errors. Currently one server checks the file and sees an error, then the 2nd server checks the log and reports OK, because it only checks from the line that the other server had already read up to. Sorry for my bad explanation, but hopefully you understand.
Many thanks,
Paul
lausser Reply:
November 24th, 2010 at 12:29
check_logfiles uses a so-called seekfile to keep track of its actions. There it saves, for example, the position in the logfile where the last scan ended. At the beginning of the next run, check_logfiles reads that position from the seekfile and starts scanning the logfile from there. I wonder why two check_logfiles instances on two separate machines do not work completely independently of each other, as each one should have its own private seekfile. By default the seekfile is saved in /var/tmp/check_logfiles/ (this can be changed by setting the variable $seekfilesdir in the config file). So there should be two seekfiles, one on each of the two machines. The only explanation I have for the behaviour you’re observing is a seekfile in a shared location. Did you set $seekfilesdir to point to a directory in a cluster filesystem?
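To make the point about $seekfilesdir concrete, here is a minimal config sketch. The variable name comes from the reply above; the directory, tag, logfile path and pattern are hypothetical and only meant to show a seekfile location that is local to each machine while the logfile itself is shared.

```perl
# Hypothetical local directory. It must NOT live on storage shared by
# both machines, otherwise they overwrite each other's position.
$seekfilesdir = '/var/local/check_logfiles';

@searches = (
  {
    tag => 'applog',                     # hypothetical tag
    logfile => '/shared/logs/app.log',   # hypothetical shared logfile
    criticalpatterns => [ 'ERROR' ],     # hypothetical pattern
  },
);
```

With the same file deployed on both servers, each one would keep its own seekfile and therefore its own read position.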
-
mr.h Says:
January 15th, 2011 at 20:34
I especially like the $NAGIOS_HOSTNAME! :)
-
sdouce Says:
February 14th, 2011 at 17:57
Hello, first of all, thank you for your plugin! My question: I would like to use the check_logfiles command to do something like this:
./check_logfiles --logfile="/path/to/file.log" --config=ALERTLOG.cfg
This way I could check many different logfiles with an identical config file. Do you know if that is possible?
-
outremont Says:
February 23rd, 2011 at 19:49
Hello, I have 2 logfiles which I have to check for an entry. Unfortunately the logs are in different folders!
Also, it is enough for me if only one file contains the entry, i.e. it does not have to appear in both files.
Is there a way to check for this?
Regards, Outremont
I have the following script, which only returns OK when both files contain the entry:
# logfile monitoring
#
# DBExport logfile
$seekfilesdir = 'C:\\groundwork\\temp';
@searches = (
  {
    tag => 'DBExport Logfile',
    logfile => 'D:\\backup\\oracle\\full\\posdb\\monday\\log\\exp_monday.log',
    criticalpatterns => [
      '!Export erfolgreich ohne Warnungen beendet.'
    ],
    options => 'protocol'
  },
  {
    tag => 'DBExport Logfile',
    logfile => 'D:\\backup\\oracle\\full\\posdb\\tuesday\\log\\exp_tuesday.log',
    criticalpatterns => [
      '!Export erfolgreich ohne Warnungen beendet.'
    ],
    options => 'protocol'
  },
);
-
Matt Hawkins Says:
March 11th, 2011 at 22:20
Lausser,
I’m trying to figure out how the timeout works. I’m trying to get the check_logfiles script to die after 60 seconds. I’ve used the "-t 60" option but it doesn’t seem to work. I’m sure I’m doing something wrong or I misunderstand what it is used for.
Thanks in advance as always.
-
Hernan Fonseca Says:
March 31st, 2011 at 18:04
Lausser, I’m trying this to catch some event IDs in the application event log on Windows:
@searches = (
  {
    options => 'eventlogformat="%w src:%s id:%i %m"',
    tag => 'evt_app',
    type => 'eventlog',
    eventlog => {
      eventlog => 'application',
      include => {
        source => 'BizTalk Server 2006',
        eventid => ['5429','5410','6912','6913','5753','10034','7184','5439','7221','5649','5773','5888','5777','5697','5652','5743','5740'],
      },
    },
    criticalpatterns => '.*',
  },
);
The problem is that I want to catch all events with ID 5429, or all events with ID 5410, or all events with ID 6912, and so on.
What would be the best way to do that, without splitting everything into separate files of course?
thanks in advance
-
Rahul Says:
May 23rd, 2011 at 22:01
Hi Lausser,
I have been using this plugin for a while now and am facing a problem. Here is what my configuration file looks like:
@searches = (
#
# Pattern1
#
  {
    tag => 'Pattern1',
    logfile => '/var/tmp/logfile1-log4j.log',
    criticalpatterns => [
      'FATAL',
      'Item limit per service exceeded',
      'java.lang.OutOfMemoryError: Java heap space',
      'ParserException',
    ],
    criticalexceptions => [
      'RESCODE=IDNOTFOUND',
    ],
    options => 'sticky=900,noprotocol,nologfilenocry,supersmartscript',
    script => sub {
      ( my $line = $ENV{CHECK_LOGFILES_SERVICEOUTPUT}) =~ s/\|/\;/g;
      print "$line found in $ENV{CHECK_LOGFILES_LOGFILE}";
      return $ENV{CHECK_LOGFILES_SERVICESTATEID};
    }
  },
#
# Pattern2
#
  {
    tag => 'Pattern2',
    logfile => '/var/tmp/logfile2-log4j.log',
    criticalpatterns => [
      'FATAL',
      'ParserException',
    ],
    options => 'sticky=900,noprotocol,nologfilenocry,supersmartscript',
    script => sub {
      ( my $line = $ENV{CHECK_LOGFILES_SERVICEOUTPUT}) =~ s/\|/\;/g;
      print "$line found in $ENV{CHECK_LOGFILES_LOGFILE}";
      return $ENV{CHECK_LOGFILES_SERVICESTATEID};
    }
  },
);
I have 20-30 such entries in the file. The problem is that, no matter which logfile the pattern match occurs in, the logfile name displayed is always that of the last entry.
E.g. in the above configuration, if there is one line matching the ‘FATAL’ pattern in the Pattern1 section, the alert text still shows the logfile name as /var/tmp/logfile2-log4j.log instead of /var/tmp/logfile1-log4j.log.
Do you see any issues with the way I am setting this up?
Thanks!
-
Erik Johansson Says:
July 11th, 2011 at 17:07
Hi. I am testing this, and when running a check from Linux against the Windows computer (NSClient++) I get this:
erik@ubuntu:/etc/icinga$ /usr/lib/nagios/plugins/check_nrpe -H 192.168.1.170 -c check_logfiles
Unrecognized character \x90; marked by <-- HERE after MZ<-- HERE near column 3 at C:\Program Files\NSClient++\scripts\check_logfiles.exe line 1.
Is something wrong with the .exe, like the error message says? I downloaded it today from this site. Or is it Perl? I installed ActiveState Perl64, 5.14.1.1401. I have this in NSC.ini under [External Scripts]:
check_logfiles=C:\Perl64\bin\perl.exe "C:\Program Files\NSClient++\scripts\check_logfiles.exe" -f "C:\Program Files\NSClient++\scripts\check_logfiles.cfg"
Any ideas? Thanks.
Erik Johansson Reply:
August 4th, 2011 at 14:36
I went back to this and it works now. I use relative paths for the command instead, and no Perl:
check_logfiles=scripts\check_logfiles.exe -f scripts\check_logfiles.cfg
I can’t recall whether I changed anything else.
-
David Fisher Says:
November 7th, 2011 at 21:32
I am building a monitoring configuration for a Windows application that writes log files. I found your plugin and downloaded Check_logfiles.zip, which looks like it contains the Windows executable. When I run the program against a very simple test log file (looking for ‘Critical’ in a line), I always get the result that everything is OK and nothing is found. Do I need to compile the program from the tar download, or is there another Windows executable? Thank you.
lausser Reply:
November 11th, 2011 at 2:14
As soon as a new error message appears in your logfile, the status will be CRITICAL.



lausser Reply:
January 7th, 2010 at 1:59
Thanks for that. I always found the prescript/setfacl method a bit ugly, but I had some enterprises in mind where the admins prefer to grant sudo privileges (which is done frequently) rather than edit files in /etc (which is practically never done, because they install from an image and want to keep their servers identical). But integrating your method into an installation image is definitely a good idea.
progre55 Reply:
August 16th, 2010 at 4:59
@lausser, First off, thanks for the plugin =) I need to monitor Tomcat logs on a Linux machine (Ubuntu), but I am having problems with permissions. The setfacl solution didn’t work, as it says “Operation not supported”. Any other, better or more secure ideas, please?
lausser Reply:
August 18th, 2010 at 14:15
Seems your filesystem does not allow acl operations. Either because of the filesystem type or because it’s not allowed. Look at the mount-options and remount the filesystem. (something like mount -oacl )
progre55 Reply:
August 19th, 2010 at 1:54
@lausser, Well, it’s an Ubuntu server on Amazon AWS, and I wouldn’t want to remount it. Are there any other ways than ACLs?