check_mysql_health
Posted on July 6th, 2009 by admin
Description
check_mysql_health is a plugin to check various parameters of a MySQL database.
Command line parameters
- --hostname <hostname>
The database server to be monitored. This parameter can be omitted for "localhost".
- --username <username>
The database user.
- --password <password>
Password of the database user.
- --mode <mode>
Tells the plugin what it should do. See the list of possible values further down.
- --name <objectname>
Limits the check to a single object. (Currently this parameter is only used for mode=sql.)
- --name2 <string>
If you use --mode=sql, the SQL statement appears in the output and in the performance data. With --name2 you can specify a string to be used in its place.
- --warning <range>
Measured values outside this range trigger a WARNING.
- --critical <range>
Measured values outside this range trigger a CRITICAL.
- --environment <variable>=<value>
Passes environment variables to the script. Can be specified multiple times.
- --method <connectmethod>
Tells the plugin how it should connect to the database (dbi for DBD::mysql, the default; mysql for the mysql command-line tool).
- --units <%|KB|MB|GB>
Adds units to the output of mode=sql to make it more readable.
Use the option --mode with one of the following keywords to tell the plugin which values it should determine and check.
| Keyword | Description | Range (default warning, critical) |
| connection-time | Determines how long connection establishment and login take | 0..n Seconds (1, 5) |
| uptime | Time since start of the database server (recognizes DB-Crash+Restart) | 0..n Seconds (10:, 5: Minutes) |
| threads-connected | Number of open connections | 1..n (10, 20) |
| threadcache-hitrate | Hitrate in the Thread-Cache | 0%..100% (90:, 80:) |
| q[uery]cache-hitrate | Hitrate in the Query Cache | 0%..100% (90:, 80:) |
| q[uery]cache-lowmem-prunes | Entries evicted from the Query Cache due to memory shortage | n/sec (1, 10) |
| [myisam-]keycache-hitrate | Hitrate in the Myisam Key Cache | 0%..100% (99:, 95:) |
| [innodb-]bufferpool-hitrate | Hitrate in the InnoDB Buffer Pool | 0%..100% (99:, 95:) |
| [innodb-]bufferpool-wait-free | Rate of the InnoDB Buffer Pool Waits | 0..n/sec (1, 10) |
| [innodb-]log-waits | Rate of the InnoDB Log Waits | 0..n/sec (1, 10) |
| tablecache-hitrate | Hitrate in the Table-Cache | 0%..100% (99:, 95:) |
| table-lock-contention | Rate of failed table locks | 0%..100% (1, 2) |
| index-usage | Overall index usage (as opposed to full table scans) | 0%..100% (90:, 80:) |
| tmp-disk-tables | Percentage of temporary tables that were created on disk instead of in memory | 0%..100% (25, 50) |
| slow-queries | Rate of queries that were detected as "slow" | 0..n/sec (0.1, 1) |
| long-running-procs | Number of processes that have been running longer than 1 minute | 0..n (10, 20) |
| slave-lag | Delay between Master and Slave | 0..n Seconds |
| slave-io-running | Checks if the IO-Thread of the Slave-DB is running | |
| slave-sql-running | Checks if the SQL-Thread of the Slave-DB is running | |
| sql | Result of any SQL statement that returns a number. The statement itself is passed with the parameter --name. A label for the performance data output can be passed with the parameter --name2. The parameter --units can add units to the output (%, c, s, MB, GB, ...). If the SQL statement includes special characters or spaces, it can first be encoded with the mode encode. | 0..n |
| encode | Reads standard input (STDIN) and outputs an encoded string. | |
| cluster-ndb-running | Checks if all cluster nodes are running. | |
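The encoded form that the encode mode produces looks like ordinary percent-encoding (the documented example output is select%20111%20from%20dual). The following Python sketch reproduces that output under this assumption. It is not the plugin's own code, and encode_statement is a hypothetical helper name.

```python
from urllib.parse import quote, unquote


def encode_statement(statement):
    """Percent-encode an SQL statement for use with --name.

    Assumption: the trailing newline added by echo is dropped and
    every reserved character (including spaces) is encoded.
    """
    return quote(statement.rstrip("\n"), safe="")


encoded = encode_statement("select 111 from dual\n")
print(encoded)           # the encoded statement
print(unquote(encoded))  # decoding gives back the original statement
```

Encoding the statement this way avoids shell quoting problems when the command line is built inside a Nagios command definition.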
Depending on the chosen mode, two labels can appear in the performance data output:
<label>= and <label_now>=
The first value covers the complete runtime of the database, the second covers the time since the last run of check_mysql_health.
Example: qcache_hitrate=71.63%;90:;80: qcache_hitrate_now=8.25%
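If you post-process this output yourself, the two labels can be separated with a few lines of Python. This is only a sketch of the generic label=value;warning;critical layout seen in the example line. parse_perfdata is a hypothetical helper, and it does not handle quoted labels that contain spaces (as produced by mode=sql).

```python
def parse_perfdata(perfdata):
    """Split a performance data string into its labels and fields."""
    metrics = {}
    for item in perfdata.split():
        label, _, rest = item.partition("=")
        fields = rest.split(";")
        metrics[label] = {
            "value": fields[0],
            # thresholds are optional, e.g. the _now value carries none
            "warning": fields[1] if len(fields) > 1 else None,
            "critical": fields[2] if len(fields) > 2 else None,
        }
    return metrics


metrics = parse_perfdata("qcache_hitrate=71.63%;90:;80: qcache_hitrate_now=8.25%")
for label, fields in metrics.items():
    print(label, fields)
```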
The hit rate of the Query Cache is calculated as Qcache_hits / ( Qcache_hits + Com_select ). Both values increase continuously from the start of the database, so even a serious change in access behaviour affects the hit rate only slowly. To make temporary fluctuations in the hit rate visible and, for example, attribute them to an application update, the value qcache_hitrate_now is printed as well. This value is calculated from the deltas of Qcache_hits and Com_select (the current value of each variable minus its value at the last run of check_mysql_health).
This is where the command line parameter --lookback comes in.
- If it is missing, then qcache_hitrate_now is calculated from the deltas of Qcache_hits and Com_select since the last run of check_mysql_health. The exit code of the plugin is determined by the long-term result qcache_hitrate (since database start).
- If --lookback is specified with an argument n, then qcache_hitrate_now is calculated from the deltas of Qcache_hits and Com_select over the last n seconds.
For example: with --lookback 3600 you get the average hit rate of the last hour, calculated back from the last plugin execution. The exit code then depends on this short-term result as well.
It’s recommended to use --lookback but to specify at least half an hour (--lookback 1800), because the now-value fluctuates heavily, which would otherwise lead to frequent alarms.
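The difference between the long-term and the short-term value can be illustrated with a small Python sketch. The counter values are invented for illustration, hitrate is a hypothetical helper, and returning 0.0 when there was no activity is an assumption, not necessarily what the plugin does.

```python
def hitrate(hits, selects):
    """Qcache_hits / (Qcache_hits + Com_select), as a percentage."""
    total = hits + selects
    # Assumption: report 0.0 when there was no activity at all
    return 100.0 * hits / total if total else 0.0


# Hypothetical cumulative counters (they only ever grow)
previous = {"Qcache_hits": 7000, "Com_select": 3000}  # earlier sample
current = {"Qcache_hits": 7100, "Com_select": 3900}   # this run

# Long-term value: based on the counters since database start
qcache_hitrate = hitrate(current["Qcache_hits"], current["Com_select"])

# Short-term value: based on the deltas between the two samples
delta_hits = current["Qcache_hits"] - previous["Qcache_hits"]
delta_selects = current["Com_select"] - previous["Com_select"]
qcache_hitrate_now = hitrate(delta_hits, delta_selects)

print(f"qcache_hitrate={qcache_hitrate:.2f}% "
      f"qcache_hitrate_now={qcache_hitrate_now:.2f}%")
```

In this made-up case the cache has almost stopped working between the two samples, yet the long-term value has barely moved. That is the situation the _now value is meant to reveal.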
Please note that the thresholds must be specified according to the Nagios plug-in development guidelines.
"10" means "Alarm, if > 10" and
"90:" means "Alarm, if < 90"
Connect to the database
Creating a database user
In order to be able to collect the needed information from the database a database user with specific privileges is required:
GRANT usage ON *.* TO 'nagios'@'nagiosserver' IDENTIFIED BY 'nagiospassword'
Connection string
To connect to the database, use the parameters --username and --password. The database server to connect to can be specified more precisely with --hostname and --socket or --port.
Use of environment variables
It’s possible to omit --hostname, --username and --password as well as --socket and --port completely if you provide the corresponding values in environment variables. Since version 3.x, Nagios allows service definitions to be extended with your own attributes (custom object variables). These appear in the environment when the check command is executed.
The environment variables are:
- NAGIOS__SERVICEMYSQL_HOST (_mysql_host in the service definition)
- NAGIOS__SERVICEMYSQL_USER (_mysql_user in the service definition)
- NAGIOS__SERVICEMYSQL_PASS (_mysql_pass in the service definition)
- NAGIOS__SERVICEMYSQL_PORT (_mysql_port in the service definition)
- NAGIOS__SERVICEMYSQL_SOCK (_mysql_sock in the service definition)
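The fallback from command line parameters to these environment variables can be pictured roughly as in the Python sketch below. It is an illustration only: resolve_connection is a hypothetical helper, and the precedence shown (a command line value wins over the environment) is an assumption that this page does not confirm.

```python
import os

# Environment variable names as listed above
ENV_NAMES = {
    "hostname": "NAGIOS__SERVICEMYSQL_HOST",
    "username": "NAGIOS__SERVICEMYSQL_USER",
    "password": "NAGIOS__SERVICEMYSQL_PASS",
    "port": "NAGIOS__SERVICEMYSQL_PORT",
    "socket": "NAGIOS__SERVICEMYSQL_SOCK",
}


def resolve_connection(cli_args, environ=None):
    """Fill in missing connection parameters from the environment."""
    environ = os.environ if environ is None else environ
    params = {}
    for key, env_name in ENV_NAMES.items():
        value = cli_args.get(key)
        if value is None:
            # Assumption: the environment is only a fallback
            value = environ.get(env_name)
        params[key] = value
    return params


params = resolve_connection(
    {"hostname": "mydb3"},
    {"NAGIOS__SERVICEMYSQL_USER": "nagios", "NAGIOS__SERVICEMYSQL_PASS": "nagios"},
)
print(params)
```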
Examples
nagios$ check_mysql_health --hostname mydb3 --username nagios --password nagios \
    --mode connection-time
OK - 0.03 seconds to connect as nagios | connection_time=0.0337s;1;5

nagios$ check_oracle_health --mode=connection-time
OK - 0.17 seconds to connect | connection_time=0.1740;1;5

nagios$ check_mysql_health --mode querycache-hitrate
CRITICAL - query cache hitrate 70.97% | qcache_hitrate=70.97%;90:;80: qcache_hitrate_now=72.25% selects_per_sec=270.00

nagios$ check_mysql_health --mode querycache-hitrate \
    --warning 80: --critical 70:
WARNING - query cache hitrate 70.82% | qcache_hitrate=70.82%;80:;70: qcache_hitrate_now=62.82% selects_per_sec=420.17

nagios$ check_mysql_health --mode sql \
    --name 'select 111 from dual'
CRITICAL - select 111 from dual: 111 | 'select 111 from dual'=111;1;5

nagios$ echo 'select 111 from dual' | \
    check_mysql_health --mode encode
select%20111%20from%20dual

nagios$ check_mysql_health --mode sql \
    --name select%20111%20from%20dual
CRITICAL - select 111 from dual: 111 | 'select 111 from dual'=111;1;5

nagios$ check_mysql_health --mode sql \
    --name select%20111%20from%20dual --name2 myval
CRITICAL - myval: 111 | 'myval'=111;1;5

nagios$ check_mysql_health --mode sql \
    --name select%20111%20from%20dual --name2 myval --units GB
CRITICAL - myval: 111GB | 'myval'=111GB;1;5

nagios$ check_mysql_health --mode sql \
    --name select%20111%20from%20dual --name2 myval --units GB \
    --warning 100 --critical 110
CRITICAL - myval: 111GB | 'myval'=111GB;100;110
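The states reported in these examples follow from the threshold notation. The Python sketch below reproduces them for the two simple forms used on this page ("10" and "90:"). It is a simplified illustration, not the plugin's implementation: violates and status are hypothetical helpers, and the other range forms from the Nagios guidelines (such as ranges with two bounds) are left out.

```python
def violates(value, threshold):
    """Return True if value should raise an alert.

    Only two forms are handled:
      "10"  -> alert if value > 10
      "90:" -> alert if value < 90
    """
    if threshold.endswith(":"):
        return value < float(threshold[:-1])
    # Simplification: negative values are not treated as violations
    return value > float(threshold)


def status(value, warning, critical):
    """Map a measured value to a Nagios state name."""
    if violates(value, critical):
        return "CRITICAL"
    if violates(value, warning):
        return "WARNING"
    return "OK"


print(status(0.0337, "1", "5"))     # connection time in seconds
print(status(70.97, "90:", "80:"))  # hit rate with default thresholds
print(status(70.82, "80:", "70:"))  # hit rate with relaxed thresholds
print(status(111, "100", "110"))    # result of the sql mode
```

Note that a hit rate threshold written without the colon (for example --warning 60 --critical 40) raises an alert for high values, which is rarely what is wanted.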
Installation
The plugin requires a mysql-client package to be installed. Installing the Perl modules DBI and DBD::mysql is desirable, but not mandatory.
After unpacking the archive, run ./configure. Running ./configure --help prints the available options together with their default values.
- --prefix=BASEDIRECTORY
Specify a directory in which check_mysql_health should be stored. (default: /usr/local/nagios)
- --with-nagios-user=SOMEUSER
This user will be the owner of the check_mysql_health file. (default: nagios)
- --with-nagios-group=SOMEGROUP
The group of the check_mysql_health plugin. (default: nagios)
- --with-perl=PATHTOPERL
Specify the path to the perl interpreter you wish to use. (default: perl in PATH)
Download
check_mysql_health-2.1.2.tar.gz
check_mysql_health-2.1.2.shar.gz
Some versions of tar have problems with the long filenames. In this case, please unpack the shar archive with
cat check_mysql_health-xxx.shar.gz | gzip -d | sh
Changelog
- 2010-06-10 2.1.2 Changed some statements for better 4.x compatibility. (Thanks Florian)
- 2010-03-30 2.1.1 More tracing (touch /tmp/check_mysql_health.trace to watch), fixed a bug in master-slave modes, so it outputs a more meaningful error message (Thanks Will Oberman), fixed a typo (Thanks Larsen)
- 2009-10-02 2.1 New parameter --lookback
- 2009-09-20 2.0.5 Bugfix in master-slave modes. (Thanks Thomas Mueller). Bugfix in bufferpool-wait-free. (Thanks Matthias Flacke). Bugfix in PNP template. (Thanks Matthias Flacke). Mode slave-lag recognizes failed io threads. (Thanks Greg)
- 2009-04-02 2.0.4 Bugfix in mode cluster-ndb-running, Bugfix in Master/Slave-Code. (Thanks Arkadiusz Miskiewicz)
- 2009-03-18 2.0.3 Bugfix because of warning=0, Bugfix in long-running-procs (affects MySQL < 5.1) (Thanks Bodo Schulz)
- 2009-03-11 2.0.1 Removed annoying Uninitialized-Messages (Thanks John Alberts & Thomas Borger). Passwordless login on localhost is now possible.
- 2009-03-06 2.0 first public version
Copyright
Gerhard Laußer
check_mysql_health is published under the GNU General Public License (GPL).
Author
Gerhard Laußer (gerhard.lausser@consol.de) gladly answers questions about this plugin.
Translation
Thanks to Christian Lauf there is finally an English translation of this page :-)
43 Responses to “check_mysql_health”
-
Jozef Fulop Says:
November 12th, 2009 at 18:35
To check replication-related parameters (slave-lag, slave-io-running, etc.) you also need these privileges: grant super,replication client on *.* to nagios@foo
-
Larsen Says:
December 3rd, 2009 at 14:02
Hi,
checking MySQL 4 servers doesn’t work with check_mysql_health: “CRITICAL - cannot connect to information_schema”. It used to work with check_mysql_perf 1.3.
Is there a workaround?
lausser Reply:
December 3rd, 2009 at 15:41
Please try it with this test script:
use DBI;
# please change these settings if necessary
$self->{hostname} = 'mydb3';
$self->{username} = 'nagios';
$self->{password} = 'nagios';
$self->{port} = 3306;
$self->{database} = 'information_schema';
#
$self->{dsn} = "DBI:mysql:";
$self->{dsn} .= sprintf "database=%s", $self->{database};
$self->{dsn} .= sprintf ";host=%s", $self->{hostname};
$self->{dsn} .= sprintf ";port=%s", $self->{port};
eval {
  if ($self->{handle} = DBI->connect(
      $self->{dsn},
      $self->{username},
      $self->{password},
      { RaiseError => 1, AutoCommit => 0, PrintError => 1 })) {
    printf "connected\n";
    $self->{handle}->disconnect();
  } else {
    printf "%s\n", DBI::errstr();
  }
};
if ($@) {
  printf "%s\n%s\n", $@, DBI::errstr();
}
Jens Rantil Reply:
January 26th, 2010 at 16:54
I am experiencing the same issue as Larsen. However, your script above works perfectly. Any hint?
Jens
Jens Rantil Reply:
January 26th, 2010 at 18:01
It seems the error might be related to the fact that I am connecting with a user that has no password. Is anyone able to recreate this?
Jens Rantil Reply:
January 26th, 2010 at 18:48
Finally, I was able to fix it by applying a small patch: http://pastebin.com/m4067bf7e
The script required a password. This should not be a requirement (unless the user has a password, of course).
-
Andreas Says:
December 16th, 2009 at 12:03
Hello. First of all: great plugin!!!
I have a question about tmp-disk-tables.
This is the value I get: “WARNING - 32.07% of 1059 tables were created on disk”
Called with: --method mysql --mode tmp-disk-tables
But the DB (show status) tells me this: | Created_tmp_disk_tables 0 | Created_tmp_files 5 | Created_tmp_tables 1
How can this value come about?
Best regards and thanks!
Andreas
lausser Reply:
December 16th, 2009 at 12:48
With SHOW STATUS you only get the values of your current session. The plugin, however, calls SHOW GLOBAL STATUS. That gives you the sum over all sessions, and then things look quite different. Gerhard
-
Mandy Says:
December 24th, 2009 at 13:37
Hey,
Thanks for such a cool plugin.
One thing that I couldn’t find here was to check for queries_per_second.
Is it something that’s easily available through --mode ?
Please let me know.
-Mandy.
lausser Reply:
December 27th, 2009 at 18:27
queries (selects) per second are part of --mode querycache-hitrate
-
Will Oberman Says:
January 15th, 2010 at 16:32
The “slave-lag” test was failing with the following error:
Can't locate object method "errstr" via package "DBD::MySQL::Server::Connection::Dbi" at /usr/local/nagios/libexec/check_mysql_health line 368.
I’m 90% sure the bug is trying to use $self->{handle}->errstr() versus DBI::errstr(). I changed line 368 to:
$self->add_nagios_critical(sprintf "unable to get replication info%s", DBI::errstr());
And now I get a meaningful error:
you need the SUPER,REPLICATION CLIENT privilege for this operation
Which lines up with the first comment above (though, it’s easier to figure that out with a good error message) ;-)
lausser Reply:
January 15th, 2010 at 19:00
Good catch! errstr is an attribute, not a method here. Can you please try $self->{handle}->{errstr} instead of DBI::errstr()?
-
John McLear Says:
January 19th, 2010 at 0:20
Great plugin, using it on a few boxes :) good work!
-
John McLear Says:
January 19th, 2010 at 19:11
./check_mysql_health --hostname myhostname --username myuser --password mypass --warning=10 --critical=20 --mode slave-lag
Can't locate object method "errstr" via package "DBD::MySQL::Server::Connection::Dbi" at ./check_mysql_health line 368.
Any idea on this? This check was working fine, it appears that maybe it’s returning NULL because there is no slave-lag?
Plugin is working, see below:
./check_mysql_health --hostname myhostname --username myuser --password mypass --warning=10 --critical=20 --mode qcache-hitrate
OK - query cache hitrate 9.45% | qcache_hitrate=9.45%;10;20 qcache_hitrate_now=9.45% selects_per_sec=0.00
-
John McLear Says:
January 19th, 2010 at 19:12
oh I just read above, heh. weird how google didn’t pick up this page when i googled the error :/
-
Mark Reynolds Says:
January 28th, 2010 at 12:35
Hi,
Thanks for a great plugin! I am monitoring thread cache hit rate and I would have thought having a high hit rate is a good thing, however the warning and critical values appear to be treated as a greater-than threshold rather than a less-than, e.g.
./check_mysql_health --hostname server --username nagios --password "#BK79o6&" --mode threadcache-hitrate --warning 60 --critical 40
Gives
CRITICAL - thread cache hitrate 95.52% | thread_cache_hitrate=95.52%;60;40 thread_cache_hitrate_now=100.00% connections_per_sec=0.03
I would have thought that should return OK not CRITICAL.
Is there any way to invert this?
-
Mark Reynolds Says:
January 28th, 2010 at 13:32
Sorry to reply to my own post – I have just found the documentation that states adding a : to my value means it will do a less than rather than greater than.
-
Tarak Ranjan Says:
February 12th, 2010 at 8:49
Hi,
I have used check_mysql_health on my Nagios server. When I run the checks from the command line, it works fine.
But when the output is displayed in the Nagios frontend, I get the error below:
**ePN failed to compile /usr/local/nagios/libexec/check_mysql_health: "Missing right curly or square bracket at (eval 23) line 3116, at end of line"
Please help
Christian Reply:
July 23rd, 2010 at 12:25
@Tarak Ranjan, Hi Tarak, it’s a bit late to answer this but maybe it’ll help someone else.
Your error message seems to indicate that ePN has a problem executing the plugin. Try to disable ePN in nagios.cfg (change enable_embedded_perl=1 to enable_embedded_perl=0), restart Nagios and try again.
-
Felipe Ferreira Says:
March 15th, 2010 at 16:29
Great Plugin! Thanks for the good work. Is there a simple way to check a table or a DB size? Also when deadlocks are happening?
thanks
lausser Reply:
March 18th, 2010 at 23:23
I’ll put deadlocks on my list. It surely can’t be done with a few lines of code, so it may take a while. The db/table size is something you should be able to implement yourself, if you can manage to get the size with a single SQL statement (--mode sql). You can also look into the contrib subdirectory to see how more complicated statements can be used to add functionality.
-
Larsen Says:
March 30th, 2010 at 16:06
Hi, the errstr problem also exists in other lines and there is a small typo in line 366 “…io thead is not…”
lausser Reply:
March 30th, 2010 at 17:07
Thanks, i forgot to publish the fixed version. You can now download 2.1.1 with the corrected errstr and typo.
-
Pat Bastien Says:
April 26th, 2010 at 18:20
I am having the same issue with the Nagios embedded perl ePN compiler running the check_mysql_health plugin that Tarak Ranjan is having. [ **ePN failed to compile /usr/lib/nagios/plugins/check_mysql_health: Missing right curly or square bracket at (eval 12) line 3190, at end of line] I have to run the plugin using the external perl to get it to work. I’d like to use several of the measures in your great plugin but can’t if I have to load perl every measure/server. I tried determining why ePN is complaining by running the plugin with perl -c and using strict mode (which you already use) as recommended by Nagios group, but it doesn’t return any complaints. Note that I am running Nagios 2.7. Could that be the issue? Thanks.
lausser Reply:
April 26th, 2010 at 19:35
Hi, i don’t think check_mysql_health is running with ePN at all.
-
kim Says:
May 23rd, 2010 at 11:16
thks a lot
-
seteqsystems Says:
June 3rd, 2010 at 13:32
Hi,
On line 864 you should replace SHOW VARIABLES WHERE Variable_name = 'version'; with SHOW VARIABLES LIKE 'version';
Because MySQL4 does not support WHERE with SHOW VARIABLES :)
best regards Florian
lausser Reply:
June 8th, 2010 at 11:47
Ah, thx. Can you please confirm that this patch also works correctly on MySQL 5.x?
-
wangjun Says:
June 6th, 2010 at 12:43
Could you offer an English version of the explanation for your plugin? It is all in German.
Thank you
lausser Reply:
June 8th, 2010 at 11:43
Sorry, i didn’t have the time yet. In the meantime you might try google translate. The output is not too bad imho.
-
Ed Says:
June 18th, 2010 at 15:12
Great plugin. One of the best I’ve seen for Nagios so far.
-
Dennis Says:
June 22nd, 2010 at 10:15
Hi,
I wrote a mysql-performance plugin in the past and queries/second has been a really useful indicator. I think, even if it is a part of the querycache-hitrate output, the data itself has its value.
Far more important is the observation that in mode “index-usage” as well as “querycache-hitrate” the critical value is assumed to be bigger than the warning value, which may not be correct in these two modes.
I would appreciate any help.
lausser Reply:
June 22nd, 2010 at 10:43
Hi, did you notice the “:”? (According to the plugin developer guidelines, ‘number:’ means ‘less than number’.) Default: 90: = less than 90, 80: = less than 80. So when you specify your own thresholds, you must use the ‘:’ too.
nagios$ check_mysql_health --mode querycache-hitrate
CRITICAL - query cache hitrate 70.97% | qcache_hitrate=70.97%;90:;80:
nagios$ check_mysql_health --mode querycache-hitrate \
    --warning 80: --critical 70:
WARNING - query cache hitrate 70.82% | qcache_hitrate=70.82%;80:;70:
Dennis Reply:
June 22nd, 2010 at 11:30
Oh Sorry, actually i didn’t. Sorry about that and thanks for your quick help.
-
Tobias Says:
July 5th, 2010 at 16:29
Hello!
I have to join in: the plugin is great, nice work, thanks for releasing it!
In lines 1244, 1937 and 2512 of version 2.1.2 the open function is used without checking whether it succeeded. If something went wrong with writing, the error message you get is “Could not write to closed filehandle” instead of a possible “Could not write to file xyz”. In my opinion, operations on the file system and the like should always be checked. That’s a matter of taste, not a bug, of course.
Best regards to Munich, Tobias
lausser Reply:
July 8th, 2010 at 13:03
Yes, that could be improved. Usually, though, it only crashes when some specialist has once again run the plugin as root (shame, shame, and shame again), so that the temporary file it created belongs to root. When the Nagios process later takes over running the plugin, the file can no longer be overwritten and the failed open call brings down the whole process. As punishment it really ought to wreck the entire machine :-) The annoying thing is that, from a programming point of view, it isn’t easy to pass an error in open “upwards”. I really don’t like bailing out in the middle of the program with a die or exit, but there is probably no other option here. I’m on vacation right now and will look at it next week.
-
edvard.pohl Says:
July 8th, 2010 at 11:26
For everyone with MySQL 4.x: I also had a problem getting it to work with MySQL 4.x, and it’s because this program uses “information_schema” as the default db, which doesn’t exist in MySQL 4.x. There is a --database option which you could use.
lausser Reply:
July 8th, 2010 at 13:14
I don’t have a 4.x db at hand. Is there an equivalent to information_schema in 4.x?
-
uvdevnull Says:
July 15th, 2010 at 0:53
Hi lausser, The English page (http://labs.consol.de/lang/en/nagios/check_mysql_health/) doesn’t actually show in English, still German. And for some reason, google is also unable to translate it. Any clues?


lausser Reply:
November 12th, 2009 at 18:53
Yes, that’s true. Thank you for pointing this out. And another common pitfall…you need to run check_mysql_health against the slave when checking replication.