check_oracle_health
Posted on July 3rd, 2009 by admin
Description
check_oracle_health is a plugin to check various parameters of an Oracle database.
Documentation
Command line parameters
- --connect=
The database name.
- --user=
The database user.
- --password=
Password of the database user.
- --connect=
Alternative to the parameters above, combining user, password and database name in one string (for example --connect=nagios/dbmoni@BBA).
- --connect=sysdba@
Log in with / as sysdba (if the user that executes the plugin is privileged to do so).
- --connect=/@token
Log in with the help of the Password Store (assumes --method=sqlplus).
- --mode=
With the mode parameter you tell the plugin what it should do. See the list of possible values further down.
- --tablespace=
Limits the check to a single tablespace. If this parameter is omitted, all tablespaces are checked.
- --datafile=
Limits the check to a single datafile. If this parameter is omitted, all datafiles are checked.
- --name=
Limits the check to a single object (Latch, Enqueue, Tablespace, Datafile). If this parameter is omitted, all objects are checked. (This parameter can and should be used instead of --tablespace or --datafile; it exists to standardize the command line interface.)
- --name2=
If you use --mode=sql, the SQL statement itself appears in the output and in the performance data. With --name2 you can specify a string to be shown instead.
- --regexp
With this switch the value of the --name parameter is interpreted as a regular expression.
- --warning=
Determined values outside of this range trigger a WARNING.
- --critical=
Determined values outside of this range trigger a CRITICAL.
- --absolute
Without --absolute, values that increase over time are reported as the increase per second; with --absolute, the difference between the current and the last run is reported.
- --runas=
With this parameter it is possible to run the script under a different user. (Calls sudo -u internally.)
- --environment
With this you can pass environment variables to the script in the form name=value. For example: --environment ORACLE_HOME=/u01/oracle. Multiple declarations are possible.
- --method=
With this parameter you tell the plugin how it should connect to the database (dbi for using DBD::Oracle (default), sqlplus for using the sqlplus tool).
- --units=<%|KB|MB|GB>
Specifying units "beautifies" the output of mode=sql and simplifies the threshold values when using mode=tablespace-free.
Use the option –mode with various keywords to tell the Plugin which values it should determine and check.
| Keyword | Description | Range |
| tnsping | Listener | |
| connection-time | Determines how long connection establishment and login take | 0..n Seconds (1, 5) |
| connected-users | The sum of logged in users at the database | 0..n (50, 100) |
| sga-data-buffer-hit-ratio | Hitrate in the Data Buffer Cache | 0%..100% (98:, 95:) |
| sga-library-cache-hit-ratio | Hitrate in the Library Cache | 0%..100% (98:, 95:) |
| sga-dictionary-cache-hit-ratio | Hitrate in the Dictionary Cache | 0%..100% (95:, 90:) |
| sga-latches-hit-ratio | Hitrate of the Latches | 0%..100% (98:, 95:) |
| sga-shared-pool-reloads | Reload-Rate in the Shared Pool | 0%..100% (1, 10) |
| sga-shared-pool-free | Free Memory in the Shared Pool | 0%..100% (10:, 5:) |
| pga-in-memory-sort-ratio | Percentage of sorts in the memory. | 0%..100% (99:, 90:) |
| invalid-objects | Sum of faulty Objects, Indices, Partitions | |
| stale-statistics | Sum of objects with obsolete optimizer statistics | n (10, 100) |
| tablespace-usage | Used diskspace in the tablespace | 0%..100% (90, 98) |
| tablespace-free | Free diskspace in the tablespace | 0%..100% (5:, 2:) |
| tablespace-fragmentation | Free Space Fragmentation Index | 100..1 (30:, 20:) |
| tablespace-io-balance | IO distribution across the datafiles of a tablespace | n (1.0, 2.0) |
| tablespace-remaining-time | Number of days remaining until a tablespace is 100% full. The rate of increase is calculated from the values of the last 30 days. (A different period can be specified with the parameter --lookback) | Days (90:, 30:) |
| tablespace-can-allocate-next | Checks whether there is enough free space in the tablespace for the next extent. | |
| datafile-io-traffic | Number of IO operations on datafiles per second | n/sec (1000, 5000) |
| soft-parse-ratio | Percentage of soft-parse-ratio | 0%..100% |
| switch-interval | Interval between RedoLog File Switches | 0..n Seconds (600:, 60:) |
| retry-ratio | Retry-Rate in the RedoLog Buffer | 0%..100% (1, 10) |
| redo-io-traffic | Redolog IO in MB/sec | n/sec (199,200) |
| roll-header-contention | Rollback Segment Header Contention | 0%..100% (1, 2) |
| roll-block-contention | Rollback Segment Block Contention | 0%..100% (1, 2) |
| roll-hit-ratio | Rollback Segment gets/waits Ratio | 0%..100% (99:, 98:) |
| roll-extends | Rollback Segment Extends | n, n/sec (1, 100) |
| roll-wraps | Rollback Segment Wraps | n, n/sec (1, 100) |
| seg-top10-logical-reads | Sum of the userprocesses under the top 10 logical reads | n (1, 9) |
| seg-top10-physical-reads | Sum of the userprocesses under the top 10 physical reads | n (1, 9) |
| seg-top10-buffer-busy-waits | Sum of the userprocesses under the top 10 buffer busy waits | n (1, 9) |
| seg-top10-row-lock-waits | Sum of the userprocesses under the top 10 row lock waits | n (1, 9) |
| event-waits | Waits/sec from system events | n/sec (10,100) |
| event-waiting | Percentage of the elapsed time that an event spent waiting | 0%..100% (0.1,0.5) |
| enqueue-contention | Enqueue wait/request ratio | 0%..100% (1, 10) |
| enqueue-waiting | Percentage of the elapsed time since the last run that an enqueue spent waiting | 0%..100% (0.00033,0.0033) |
| latch-contention | Latch misses/gets ratio. A latch name or latch number can be passed with --name. (See list-latches) | 0%..100% (1,2) |
| latch-waiting | Percentage of the elapsed time since the last run that a latch spent waiting | 0%..100% (0.1,1) |
| sysstat | Changes/sec for any value from v$sysstat | n/sec (10,10) |
| sql | Result of any SQL-Statement that returns a number. The statement itself is passed over with the parameter –name. A Label for the performance data output can be passed over with the parameter –name2. | n (1,5) |
| list-tablespaces | Prints a list of tablespaces | |
| list-datafiles | Prints a list of datafiles | |
| list-latches | Prints a list with latchnames and latchnumbers | |
| list-enqueues | Prints a list with the Enqueue-Names | |
| list-events | Prints a list of the events from v$system_event. Besides event_number/event_id, a shortened form of the event name is printed. This can be used as a Nagios service description. Example: lo_fi_sw_co = log file switch completion | |
| list-background-events | Prints a list with the Background-Events | |
| list-sysstats | Prints a list with system-wide statistics | |
Measurements that depend on a time interval can be evaluated in different ways. To calculate the end result, three things are needed: a start value, an end value and the time elapsed between the two. Without further options, the initial value is the value from the last plugin run. The elapsed time is normally the normal_check_interval of the corresponding service.
If the check result should depend not on the increase per second but on the difference between two measured values, use the option --absolute. This is useful for Rollback Segment Wraps, which happen so rarely that their rate is nearly 0/sec. Nevertheless, you want to be alerted if the number of these events grows.
The threshold values should be chosen so that they can be reached during a retry_check_interval. If not, the service will change back to the OK state after each SOFT;1.
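The difference between the two evaluation styles can be sketched in a few lines of Python. This is only an illustration of the arithmetic described above; the `Sample` structure and the function name are hypothetical and say nothing about how the plugin stores its state between runs.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One counter reading (hypothetical structure, not the plugin's state format)."""
    value: float      # cumulative counter, e.g. rollback segment wraps
    timestamp: float  # seconds since some fixed point in time


def evaluate(previous: Sample, current: Sample, absolute: bool = False) -> float:
    """Return the increase per second, or the plain difference if absolute is set."""
    difference = current.value - previous.value
    if absolute:
        return difference
    elapsed = current.timestamp - previous.timestamp
    if elapsed <= 0:
        # Assumption: without elapsed time no meaningful rate exists.
        return 0.0
    return difference / elapsed


last_run = Sample(value=40, timestamp=0)
this_run = Sample(value=43, timestamp=300)  # five minutes later

print(f"rate: {evaluate(last_run, this_run):.2f}/sec")
print(f"absolute: {evaluate(last_run, this_run, absolute=True):g}")
```

Three wraps in five minutes give a rate of 0.01/sec, which would never reach a threshold of 1, while the absolute difference of 3 would.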
Please note that the thresholds must be specified according to the Nagios plug-in development guidelines.
"10" means "Alarm, if > 10" and
"90:" means "Alarm, if < 90"
Preparation of the database
In order to be able to collect the needed information from the database a database user with specific privileges is required:
CREATE user nagios IDENTIFIED BY oradbmon;
GRANT CREATE session TO nagios;
GRANT SELECT any dictionary TO nagios;
GRANT SELECT ON V_$SYSSTAT TO nagios;
GRANT SELECT ON V_$INSTANCE TO nagios;
GRANT SELECT ON V_$LOG TO nagios;
GRANT SELECT ON SYS.DBA_DATA_FILES TO nagios;
GRANT SELECT ON SYS.DBA_FREE_SPACE TO nagios;
--
-- if somebody still uses Oracle 8.1.7...
GRANT SELECT ON sys.dba_tablespaces TO nagios;
GRANT SELECT ON dba_temp_files TO nagios;
GRANT SELECT ON sys.v_$Temp_extent_pool TO nagios;
GRANT SELECT ON sys.v_$TEMP_SPACE_HEADER TO nagios;
GRANT SELECT ON sys.v_$session TO nagios;
Examples
nagios$ check_oracle_health --connect bba --mode tnsping
OK - connection established to bba.
nagios$ check_oracle_health --mode connection-time
OK - 0.17 seconds to connect | connection_time=0.1740;1;5
nagios$ check_oracle_health --mode sga-data-buffer-hit-ratio
CRITICAL - SGA data buffer hit ratio 0.99% | sga_data_buffer_hit_ratio=0.99%;98:;95:
nagios$ check_oracle_health --mode sga-library-cache-hit-ratio
OK - SGA library cache hit ratio 98.75% | sga_library_cache_hit_ratio=98.75%;98:;95:
nagios$ check_oracle_health --mode sga-latches-hit-ratio
OK - SGA latches hit ratio 100.00% | sga_latches_hit_ratio=100.00%;98:;95:
nagios$ check_oracle_health --mode sga-shared-pool-reloads
OK - SGA shared pool reloads 0.28% | sga_shared_pool_reloads=0.28%;1;10
nagios$ check_oracle_health --mode sga-shared-pool-free
WARNING - SGA shared pool free 8.91% | sga_shared_pool_free=8.91%;10:;5:
nagios$ check_oracle_health --mode pga-in-memory-sort-ratio
OK - PGA in-memory sort ratio 100.00% | pga_in_memory_sort_ratio=100.00;99:;90:
nagios$ check_oracle_health --mode invalid-objects
OK - no invalid objects found | invalid_ind_partitions=0 invalid_indexes=0 invalid_objects=0 unrecoverable_datafiles=0
nagios$ check_oracle_health --mode switch-interval
OK - Last redo log file switch interval was 18 minutes | redo_log_file_switch_interval=1090s;600:;60:
nagios$ check_oracle_health --mode switch-interval --connect rac1
OK - Last redo log file switch interval was 32 minutes (thread 1)| redo_log_file_switch_interval=1938s;600:;60:
nagios$ check_oracle_health --mode tablespace-usage
CRITICAL - tbs SYSTEM usage is 99.33% tbs SYSAUX usage is 93.73% tbs USERS usage is 8.75% tbs UNDOTBS1 usage is 6.65% | 'tbs_users_usage_pct'=8%;90;98 'tbs_users_usage'=0MB;4;4;0;5 'tbs_undotbs1_usage_pct'=6%;90;98 'tbs_undotbs1_usage'=11MB;153;166;0;170 'tbs_system_usage_pct'=99%;90;98 'tbs_system_usage'=695MB;630;686;0;700 'tbs_sysaux_usage_pct'=93%;90;98 'tbs_sysaux_usage'=802MB;770;839;0;856
nagios$ check_oracle_health --mode tablespace-usage --tablespace USERS
OK - tbs USERS usage is 8.75% | 'tbs_users_usage_pct'=8%;90;98 'tbs_users_usage'=0MB;4;4;0;5
nagios$ check_oracle_health --mode tablespace-usage --name USERS
OK - tbs USERS usage is 8.75% | 'tbs_users_usage_pct'=8%;90;98 'tbs_users_usage'=0MB;4;4;0;5
nagios$ check_oracle_health --mode tablespace-free --name TEST
OK - tbs TEST has 97.91% free space left | 'tbs_test_free_pct'=97.91%;5:;2: 'tbs_test_free'=32083MB;1638.40:;655.36:;0.00;32767.98
nagios$ check_oracle_health --mode tablespace-free --name TEST --units MB --warning 100: --critical 50:
OK - tbs TEST has 32083.61MB free space left | 'tbs_test_free_pct'=97.91%;0.31:;0.15: 'tbs_test_free'=32083.61MB;100.00:;50.00:;0;32767.98
nagios$ check_oracle_health --mode tablespace-free --name TEST --warning 10: --critical 5:
OK - tbs TEST has 97.91% free space left | 'tbs_test_free_pct'=97.91%;10:;5: 'tbs_test_free'=32083MB;3276.80:;1638.40:;0.00;32767.98
nagios$ check_oracle_health --mode tablespace-remaining-time --tablespace ARUSERS --lookback 7
WARNING - tablespace ARUSERS will be full in 78 days | 'tbs_arusers_days_until_full'=78;90:;30:
nagios$ check_oracle_health --mode datafile-io-traffic --datafile users01.dbf
WARNING - users01.dbf: 1049.83 IO Operations per Second | 'dbf_users01.dbf_io_total_per_sec'=1049.83;1000;5000
nagios$ check_oracle_health --mode latch-contention --name 214
OK - SGA latch library cache (214) contention 0.08% | 'latch_214_contention'=0.08%;1;2 'latch_214_sleep_share'=0.00% 'latch_214_gets'=49995
nagios$ check_oracle_health --mode latch-contention --name 'library cache'
OK - SGA latch library cache (214) contention 0.08% | 'latch_214_contention'=0.08%;1;2 'latch_214_sleep_share'=0.00% 'latch_214_gets'=49937
nagios$ check_oracle_health --mode enqueue-contention --name TC
CRITICAL - enqueue TC: 19.90% of the requests must wait | 'TC_contention'=19.90%;1;10 'TC_requests'=2015 'TC_waits'=401
nagios$ check_oracle_health --mode latch-contention --name 'messages'
OK - SGA latch messages (17) contention 0.02% | 'latch_17_contention'=0.02%;1;2 'latch_17_gets'=4867
nagios$ check_oracle_health --mode latch-waiting --name 'user lock'
OK - SGA latch user lock (205) sleeping 0.000841% of the time | 'latch_205_sleep_share'=0.000841%
nagios$ check_oracle_health --mode event-waits --name 'log file sync'
OK - log file sync : 1.839511 waits/sec | 'log file sync_waits_per_sec'=1.839511;10;100
nagios$ check_oracle_health --mode event-waiting --name 'Log file parallel write'
OK - log file parallel write waits 0.045843% of the time | 'log file parallel write_percent_waited'=0.045843%;0.1;0.5
nagios$ check_oracle_health --mode sysstat --name 'transaction rollbacks'
OK - 0.000003 transaction rollbacks/sec | 'transaction rollbacks_per_sec'=0.000003;10;100 'transaction rollbacks'=4
nagios$ check_oracle_health --mode sql --name 'select count(*) from v$session' --name2 sessions
CRITICAL - sessions: 21 | 'sessions'=21;1;5
nagios$ check_oracle_health --mode sql --name 'select 12 from dual' --name2 twelve --units MB
CRITICAL - twelfe: 12MB | 'twelfe'=12MB;1;5
nagios$ check_oracle_health --mode sql --name 'select 200,300,1000 from dual' --name2 'kaspar melchior balthasar' --warning 180 --critical 500
WARNING - kaspar melchior balthasar: 200 300 1000 | 'kaspar'=200;180;500 'melchior'=300;; 'balthasar'=1000;;
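The performance data of the three tablespace-free examples suggests that thresholds are converted between percent and megabytes using the total size of the tablespace (32767.98 MB for TEST). The following Python sketch reproduces those numbers; it is inferred from the sample output above, not taken from the plugin's source, and the function names are hypothetical.

```python
TOTAL_MB = 32767.98  # size of tablespace TEST, taken from the perfdata above


def mb_to_percent(threshold_mb: float, total_mb: float) -> float:
    """Express a threshold given in MB as a percentage of the tablespace size."""
    return threshold_mb / total_mb * 100


def percent_to_mb(threshold_pct: float, total_mb: float) -> float:
    """Express a percentage threshold in MB of the tablespace size."""
    return total_mb * threshold_pct / 100


# --units MB --warning 100: --critical 50:
print(f"{mb_to_percent(100, TOTAL_MB):.2f}:", f"{mb_to_percent(50, TOTAL_MB):.2f}:")

# default thresholds 5: and 2: (percent)
print(f"{percent_to_mb(5, TOTAL_MB):.2f}:", f"{percent_to_mb(2, TOTAL_MB):.2f}:")
```

This is why --units MB simplifies things: a fixed amount such as 100 MB would otherwise have to be recalculated as a percentage for every tablespace size.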
Authentication
Example with –runas and an "external user"
There are two users in the database:
- OPS$DBNAGIO IDENTIFIED EXTERNALLY
- NAGIOS IDENTIFIED BY 'DBMONI'
There are two unix users:
- qqnagio with normal access.
- dbnagio with /bin/false as login shell.
qqnagio$ check_oracle_health --mode=connection-time --connect=nagios/dbmoni@BBA
OK - 0.21 seconds to connect as NAGIOS
dbnagio$ check_oracle_health --mode=connection-time --connect=BBA --runas=dbnagio --environment ORACLE_HOME=$ORACLE_HOME
OK - 0.17 seconds to connect as OPS$DBNAGIO
The background for this example is the following scenario with a SAP-Server:
Only local connections to the database are allowed. The database isn’t reachable over the network. Logging in with username and password is not possible.
Only database-users that are authenticated through the operating system (OPS$-User) are allowed to connect.
These users are not allowed to connect via SSH. (Therefore /bin/false).
Because the Nagios user qqnagio is allowed to connect via SSH, it cannot be used as the database user. However, NRPE, which executes the plugin, runs under the qqnagio account.
Use of environment variables
It is possible to omit --connect (and, if not needed, --user and --password) completely if you provide the corresponding values in environment variables. Since version 3.x, Nagios lets you extend service definitions with your own attributes (custom object variables). These appear in the environment during the execution of the check command.
The environment variables are:
- NAGIOS__SERVICEORACLE_SID (_oracle_sid in the service definition)
- NAGIOS__SERVICEORACLE_USER (_oracle_user in the service definition)
- NAGIOS__SERVICEORACLE_PASS (_oracle_pass in the service definition)
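A fallback of this kind can be sketched as follows in Python. The three variable names come from the list above; the function, the option keys and the rule that command line values win over the environment are assumptions made for illustration only.

```python
import os
from typing import Mapping, Optional

# Environment variable names as listed above.
ENV_NAMES = {
    "connect": "NAGIOS__SERVICEORACLE_SID",
    "user": "NAGIOS__SERVICEORACLE_USER",
    "password": "NAGIOS__SERVICEORACLE_PASS",
}


def resolve_login(cli_args: Mapping[str, Optional[str]],
                  env: Mapping[str, str] = os.environ) -> dict:
    """Fill in missing login options from the Nagios custom variables.

    Assumption: a value given on the command line takes precedence.
    """
    resolved = {}
    for option, env_name in ENV_NAMES.items():
        value = cli_args.get(option) or env.get(env_name)
        if value is not None:
            resolved[option] = value
    return resolved


# Simulated environment of a check command whose service definition
# contains _oracle_sid and _oracle_user.
fake_env = {"NAGIOS__SERVICEORACLE_SID": "BBA",
            "NAGIOS__SERVICEORACLE_USER": "nagios"}
print(resolve_login({}, fake_env))
```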
Installation
The installation of the perl-modules DBI and DBD::Oracle is required.
After unpacking the archive, ./configure is called. ./configure --help prints the available options together with their default values.
- --prefix=BASEDIRECTORY Specify the directory in which check_oracle_health should be installed. (default: /usr/local/nagios)
- --with-nagios-user=SOMEUSER This user will be the owner of the check_oracle_health file. (default: nagios)
- --with-nagios-group=SOMEGROUP The group of the check_oracle_health plugin. (default: nagios)
- --with-perl=PATHTOPERL Specify the path to the perl interpreter you wish to use. (default: perl in PATH)
Download
check_oracle_health-1.6.4.tar.gz
check_oracle_health-1.6.4.shar.gz
Some versions of tar have problems with the long filenames. In this case please unpack the shar archive with cat check_oracle_health-xxx.shar.gz | gzip -d | sh
Changelog
- 2010-06-10 1.6.4
added checking of dba_registry to mode invalid-objects. Thanks Ovidiu Marcu
speedup of tablespace-remaining-time. Thanks Steffen Poulsen
switch-interval detects redo log timestamps in the future and reports critical
method sqlplus now works with "(DESCRIPTION =(ADDRESS = (PROTOCOL = TCP"-like connectstrings
new parameter --ident to show instance and database names in the output
bugfix in tablespace-usage (temp tbs with multiple datafiles). Thanks Philipp Lemke
- 2009-09-09 1.6.3
tablespace-can-allocate-next was optimized. Illegal statefile names were fixed. Thanks Franky van Liedekerke.
Bugfix in tablespace-usage under Oracle 8.1.x
switch-interval is now more precise. Thanks Naquada.
Passwords no longer show up in error messages. Thanks Jens Seiffert.
Bugfix in mode sql. (Decimal values with .5 led to errors). Thanks Shane Jordan.
Bugfix in sga-latches-hit-ratio: thresholds were ignored. Thanks Yannik Charton.
The parameter --user is now --username (--user still works)
- 2009-04-05 1.6.2 Bugfix in tablespace-usage/free due to non-autoextensible TEMP-Tablespaces. (Thanks Daniel Graef)
- 2009-03-27 1.6.1 --mode=tablespace-usage|free now recognizes offline tablespaces. (Thanks Daniel Graef)
- 2009-03-11 1.6 Support for DBD::SQLRelay. Mode sql can print out multiple values (Thanks Juergen Lesny). Login as "sys" possible (Thanks Joerg Horchler). Bugfix when using warning/critical=0 (Thanks Danijel Tasov)
- 2008-10-28 1.5.0.1 Bugfix due to , instead of . in decimal values. mode=sql output will be rounded to 2 places after the decimal point. Bugfix in mode=sga-shared-pool-free. (Thanks Birk Bohne)
- 2008-10-15 1.5 New authentication methods password store and as sysdba. New mode tablespace-free. New parameter --units when using mode=sql and mode=tablespace-free. Mode switch-interval considers RAC (Thanks Harald Zahn).
- 2008-09-19 1.4.2.1 New parameter --regexp supplements --name. Bugfix in tablespace-usage (>100% when using resize datafile)
- 2008-09-10 1.4.1 New mode tablespace-can-allocate-next, handling of locked accounts, timeout bugfix, encode, expired extents in UNDO tablespaces are considered, bugfix for mode=sql and NULL values (Thanks Viktor Käfer), mode=top10* optimized.
- 2008-07-09 1.4.0.1 Bugfixes (--name=0, --method=sqlplus), invalid-objects and stale-statistics now consider thresholds (Thanks Konrad Barck)
- 2008-07-03 1.4 Statesdir is now /var/tmp/check_oracle_health, Bugfixes in latch-contention, sysstat and roll-extends. Performance improvements.
- 2008-07-01 1.3.1.1 Bugfix in method=sqlplus and os$user, Bugfix in tablespace-usage when using Temp-Tablespaces, better performance values for pga-in-memory-sort-ratio
- 2008-06-26 1.3.1 Code cleanup, Bugfix in connected-users Thresholds
- 2008-06-24 1.3 data-buffer/library/dictionary-cache-hitratio are now more precise, tablespace-usage considers autoextents, sqlplus, code cleaned up
- 2008-06-20 1.2.7 bugfixes in top10-x and pga-in-memory-sort. New Mode sql. Unrecoverable datafiles removed from invalid-objects (will get its own mode later)
- 2008-06-16 1.2.6.1 New modes sysstat list-sysstats
- 2008-06-14 1.2.6 New modes event-waited event-waits list-events
- 2008-06-11 1.2.5.1 internal changes
- 2008-06-03 1.2.5 New modes latch-contention enqueue-contention enqueue-waiting connected-users list-latches list-enqueues
- 2008-05-27 1.2.4.1 New modes list-tablespaces and list-datafiles (no Monitoringfunction)
- 2008-05-27 1.2.4 New modes datafile-io-traffic and redo-io-traffic
- 2008-05-25 1.2.3.1 stale-statistics now run under Oracle 9.x
- 2008-05-25 1.2.3 New modes roll-block-contention, roll-hit-ratio, Bugfix in switch-interval
- 2008-05-23 1.2.2.1 Modes, that require Oracle 10.x are disabled with Oracle 9.x/8.x
- 2008-05-21 1.2.2 Bugfix in --environment
- 2008-05-19 1.2.1 sga-buffer-cache-hit-ratio now shows percent (thx Maik Ihde), new parameters --runas --environment, support for externally authenticated users, Bugfix in tablespace-remaining-time
- 2008-05-06 1.2 connection timeout handling, stale-statistics
- 2008-05-02 1.1 tablespace-remaining-time, tablespace-io-balance
- 2008-04-16 1.0 first public version
Copyright
2008 Gerhard Laußer
Check_oracle_health is published under the GNU General Public License. GPL
Author
Gerhard Laußer (gerhard.lausser@consol.de) gladly answers questions to this plugin.
Translation
Thanks to Christian Lauf there is finally an English translation of this page :-)
123 Responses to “check_oracle_health”
-
Marco Says:
October 7th, 2009 at 12:58
Hello,
I am currently testing your tool. I am very impressed with it, because it saves me a lot of work. Unfortunately I have a small problem with the mode sql:
./check_oracle_health --connect '(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=172.16.102.103)(PORT=1521))(CONNECT_DATA=(SID=XYZ)))' --user nagios --password oradbmon09 --mode sql --name 'select count(*) from v$session' --name2 sessions --warning 100 --critical 150
Result: WARNING - sessions: 21 | 'sessions'=21;20;30
When integrated into Nagios, the only status information I get is "OK - sessions:". The warning should appear here as well.
Script:
define command{
command_name check_oracle_per_sql
command_line $USER1$/check_oracle_health --connect '(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=$HOSTADDRESS$)(PORT=1521))(CONNECT_DATA=(SID=$ARG3$)))' --user $ARG1$ --password $ARG2$ --mode sql --name 'select count(*) from v\$session' --name2 sessions --warning 5 --critical 10
}
define service{
use local-service
host_name test2_ch
service_description Count Open Sessions2
check_command check_oracle_per_sql!nagios!abc!xyz
}
No performance data is written either.
The tablespace mode works for me.
Regards, Marco
-
lausser Says:
October 8th, 2009 at 10:24
Hello, this happens because Nagios is sensitive to the dollar sign in its configuration files. With --name 'select count(*) from v$$session' in the command definition it should work. An alternative is to encode SQL statements that contain special characters beforehand. This comes at the expense of readability, but you no longer have to worry about single/double quotes, escaping, etc.
echo 'select count(*) from v$session' | check_oracle_health --mode encode
select%20count%28%2A%29%20from%20v%24session
command_line $USER1$/check_oracle_health ..... --name select%20count%28%2A%29%20from%20v%24session ....
Gerhard
-
Marco Says:
October 9th, 2009 at 8:50
Wonderful, it works.
I have one more question. Could it be that no performance data is written for some checks, e.g. for sga-shared-pool-reloads?
Marco
-
Marco Says:
October 9th, 2009 at 9:29
If I do not pass any thresholds, it writes nothing. If I specify some, performance data is written as well.
-
Marco Says:
October 9th, 2009 at 12:25
I have the same version. In my case it did not write any performance data to "/usr/local/nagios/share/perfdata". After I specified the thresholds, it worked. Maybe it was my fault.
I do have another question after all: I want to output "sga-data-buffer-hit-ratio". Result: CRITICAL - SGA data buffer hit ratio 43.73% | sga_data_buffer_hit_ratio=43.73%;98:;95:
In SQL*Plus I get the following:
SELECT ROUND((1-(phy.value / (cur.value + con.value)))*100,2) "Cache Hit Ratio"
FROM v$sysstat cur, v$sysstat con, v$sysstat phy
WHERE cur.name = 'db block gets' AND con.name = 'consistent gets' AND phy.name = 'physical reads';
Cache Hit Ratio
86.36
Am I misinterpreting something?
lausser Reply:
October 9th, 2009 at 16:07
"db block gets" and the other values are counters that only ever increase. With your SQL statement you calculate the hit ratio over the entire uptime of the instance. That value becomes very imprecise at some point, or rather changes only very slowly. check_oracle_health uses the deltas of these counters (between the current and the last run of the plugin) for the calculation. This way you always get a current value (the average over the check_interval).
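The effect can be shown with a small Python sketch. It applies the formula from the SQL statement quoted in this thread once to the cumulative counters and once to the deltas between two runs. The counter values are invented for illustration, and the handling of an interval without logical reads is an assumption, not the plugin's documented behaviour.

```python
def hit_ratio(db_block_gets: float, consistent_gets: float,
              physical_reads: float) -> float:
    """(1 - physical reads / logical reads) * 100, as in the SQL above."""
    logical_reads = db_block_gets + consistent_gets
    if logical_reads == 0:
        # Assumption: with no logical reads nothing was missed.
        return 100.0
    return (1 - physical_reads / logical_reads) * 100


def interval_hit_ratio(previous: dict, current: dict) -> float:
    """Hit ratio computed only from what happened between two plugin runs."""
    deltas = {name: current[name] - previous[name] for name in current}
    return hit_ratio(deltas["db block gets"], deltas["consistent gets"],
                     deltas["physical reads"])


# Invented v$sysstat readings from two consecutive runs.
last_run = {"db block gets": 1000, "consistent gets": 9000, "physical reads": 1000}
this_run = {"db block gets": 1100, "consistent gets": 9900, "physical reads": 1500}

lifetime = hit_ratio(this_run["db block gets"], this_run["consistent gets"],
                     this_run["physical reads"])
print(f"since instance start: {lifetime:.2f}%")
print(f"since last run:       {interval_hit_ratio(last_run, this_run):.2f}%")
```

With these numbers the lifetime ratio still looks healthy while the ratio for the last interval has dropped to half, which is the kind of discrepancy described in the question.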
-
Marco Says:
October 15th, 2009 at 15:04
I have yet another question: can I exclude some tablespaces with "tablespace-usage"? I would like to monitor all tablespaces except, for example, sysaux and system. Is that possible?
lausser Reply:
October 15th, 2009 at 15:17
The parameter --regexp causes the parameter --name (which is normally used to query one specific tablespace) to be interpreted as a regular expression. So if you phrase --name such that the pattern matches all names except SYSTEM and SYSAUX, those two are not shown.
... --name='^(?!(SYSTEM|SYSAUX))' --regexp
Gerhard
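How the negative lookahead in this pattern behaves can be tried out in Python, whose `re` module supports the same `(?!...)` syntax as Perl. The tablespace names below are made up, and whether the plugin matches case-sensitively is not stated here, so treat this purely as a demonstration of the regular expression.

```python
import re

tablespaces = ["SYSTEM", "SYSAUX", "USERS", "UNDOTBS1", "SYSTEM_OLD"]

# Pattern from the reply above: match at the start of the name unless
# the name begins with SYSTEM or SYSAUX.
prefix_exclude = re.compile(r"^(?!(SYSTEM|SYSAUX))")

# Variant with "$" inside the lookahead: only the exact names are excluded.
exact_exclude = re.compile(r"^(?!(SYSTEM|SYSAUX)$)")

print([name for name in tablespaces if prefix_exclude.search(name)])
print([name for name in tablespaces if exact_exclude.search(name)])
```

The first pattern also drops names that merely start with SYSTEM or SYSAUX (here the invented SYSTEM_OLD); the second keeps them.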
-
Manuel Says:
October 18th, 2009 at 3:22
Hello from Spain, where can I download check_oracle_health?
Thanks
lausser Reply:
October 18th, 2009 at 16:11
Scroll up about 80 cm to the heading "Download".
-
guzik Says:
October 28th, 2009 at 11:35
Hi, I've got a problem with the check_oracle_health plugin. Status information in my Nagios: **ePN /usr/lib64/nagios/plugins/check_oracle_health: "Can't exec "/usr/sbin/p1.pl": Permission denied at (eval 1) line 5254,". A few of the checks work fine, the rest have this problem. From the console the script executes without problems. What can I do to fix the check services?
lausser Reply:
October 29th, 2009 at 20:34
Hi, please add one line in the first 10 lines of the plugin:
# nagios: -epn
This prevents Nagios from executing check_oracle_health with the embedded Perl interpreter. I will add this to the next release of the plugin as the default.
Gerhard
-
Steffen Poulsen Says:
November 9th, 2009 at 17:29
I tried to run check_oracle_health on an installation using perl, version 5.005_03, and it barked at me:
Use of reserved word "our" is deprecated at check_oracle_health.pl line 9.
Bareword "our" not allowed while "strict subs" in use at check_oracle_health.pl line 9.
Unquoted string "our" may clash with future reserved word at check_oracle_health.pl line 9.
Array found where operator expected at check_oracle_health.pl line 9, at end of line
(Do you need to predeclare our?)
syntax error at check_oracle_health.pl line 9, near "our @ISA "
Global symbol "@ISA" requires explicit package name at check_oracle_health.pl line 9.
BEGIN not safe after errors--compilation aborted at check_oracle_health.pl line 80.
We are very fond of your plugin and would like to use it on this installation too. Is there by any chance a drop-in replacement for the "our @ISA" construct that would allow the check to run on this old installation as well?
Best regards, Steffen Poulsen
lausser Reply:
November 11th, 2009 at 22:35
I think this would require a major rewrite of the plugin. Can't you run it on the Nagios server and check the database with a remote connection?
Gerhard
-
SweetBiene91 Says:
November 11th, 2009 at 0:57
hey ho, I'm so lonely, anyone want to chat or something
lausser Reply:
November 11th, 2009 at 13:50
Try this: irc server: irc.irclink.net port: 6667 channel: #nagios
-
Manfred Says:
November 12th, 2009 at 18:17
Is there an option (e.g. quiet) that outputs only the values that trigger a warning or critical? With more than 30 tablespaces (with --mode=tablespace-free) it is almost impossible to find the one that triggered the warning. In addition, the output in Nagios becomes very cluttered and far too long. I already tried to build an "if" into the source to suppress the output, but failed, because then the warnings themselves no longer appear. The original Nagios filesystem check, for example, also outputs only the filesystems that are in warning or critical state.
lausser Reply:
November 12th, 2009 at 19:01
In that case I would recommend using check_multi. It has the additional advantage that thresholds can be changed in the check_multi configuration file without restarting Nagios. If you specify the tablespace names as labels, you get a concise output:
30 plugins checked, 1 critical (TBS_1), 1 warning (TBS_25), 0 unknown, 28 ok
-
fsom Says:
November 20th, 2009 at 15:54
Great script! Everything works so far, only with mode=sql I am stuck (v1.6.3):
./check_oracle_health --connect=DB --user=xxxxxx --password=yyyyyy --mode=sql --name="select count(*) from v$session where status = 'ACTIVE'"
Use of uninitialized value in numeric gt (>) at /usr/lib/nagios/plugins/check_oracle_health line 3615.
Use of uninitialized value in numeric gt (>) at /usr/lib/nagios/plugins/check_oracle_health line 3616.
OK - select count(*) from v where status = 'active':
I get nothing back from the SQL statement. Am I doing something wrong? Thanks, fsom
lausser Reply:
November 20th, 2009 at 16:42
You have to escape the dollar sign. Your statement: from v$session where... Output: from v where... To the shell, $session looks like a variable, and since it does not exist, the shell turns it into an empty string. Write ...from v\$session... instead. If the SQL statement is more complicated and contains many such special characters, you can also encode it. To do so, call check_oracle_health with the parameter "--mode encode". It then reads from standard input. You type your statement (without having to worry about escaping special characters) and finish it with RETURN.
$ check_oracle_health --mode encode
select count(*) from v$session where status = 'ACTIVE'
select%20count%28%2A%29%20from%20v%24session%20where%20status%20%3D%20%27ACTIVE%27
As output you get the statement in encoded form, which you can now pass to --name without having to worry about dollar signs or any kind of quotes.
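The encoded strings shown in this thread look like ordinary percent-encoding. As a hedged illustration, Python's standard `urllib.parse.quote` with an empty `safe` set reproduces them exactly; whether the plugin accepts every string produced this way is an assumption, so the plugin's own "--mode encode" remains the authoritative tool.

```python
from urllib.parse import quote, unquote


def encode_statement(sql: str) -> str:
    """Percent-encode every character except letters, digits and _ . - ~"""
    return quote(sql, safe="")


statement = "select count(*) from v$session where status = 'ACTIVE'"
encoded = encode_statement(statement)

print(encoded)
print(unquote(encoded) == statement)  # decoding restores the original text
```

Because the result contains no spaces, quotes or dollar signs, it passes through the shell and the Nagios configuration unchanged.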
-
Bas de Klerk Says:
November 28th, 2009 at 17:55
Hi,
thanks for your great plugin. It saves me a lot of time!
One small problem I'm having in version 1.6.3 is that the sga-data-buffer-hit-ratio sometimes drops to 0%. I have no clue why, but sometimes it does. If I calculate it by hand using the statement below, the values are fine. If you need any additional information, please let me know. For now I've made a workaround using mode=sql.
Regards Bas
SELECT ((P1.value + P2.value - P3.value) / (P1.value + P2.value))*100 ratio
FROM v$sysstat P1, v$sysstat P2, v$sysstat P3
WHERE P1.name = 'db block gets' AND P2.name = 'consistent gets' AND P3.name = 'physical reads';
-
lausser Says:
November 28th, 2009 at 20:48I use the deltas (the difference to the counter value when check_oracle_health was run last time) for the calculation. E.g. the “physical reads” i use for the calculation is “value of physical reads now – value of physical reads approx. 5 minutes ago.” This way the hitrate reflects the current state of the buffer cache. In your formula you use the counters which increased since the database was started, so it’s an average hitrate over the whole lifetime. But isn’t it more interesting to get the current hitrate? When you get 0% sometimes, it actually means a hitrate of 0% (at least during the last check_interval).This is some kind of a “negative spike”. But i understand the problem. I will introduce a parameter “–lookback” which takes a number of minutes as argument. This way, you can for example measure the hitrate during the last 30 minutes, which is pretty up to date, but gives you much smoother results.
[Reply]
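To make the difference between the two calculations concrete, here is a small sketch with invented counter values. It applies the formula from the SQL statement above once to the cumulative counters and once to the deltas between two runs. The function names and the choice to return None when there were no logical reads are assumptions for illustration, not the plugin's actual behaviour.

```python
def hit_ratio(block_gets, consistent_gets, physical_reads):
    """Buffer cache hit ratio in percent, using the formula quoted above."""
    logical_reads = block_gets + consistent_gets
    if logical_reads == 0:
        return None  # assumption: no activity means no meaningful ratio
    return (logical_reads - physical_reads) / logical_reads * 100.0


def delta(now, before):
    """Counter increase between two runs of the check."""
    return {name: now[name] - before[name] for name in now}


# Invented v$sysstat values from two consecutive runs.
before = {"db block gets": 1_000_000, "consistent gets": 9_000_000, "physical reads": 500_000}
now = {"db block gets": 1_000_100, "consistent gets": 9_000_900, "physical reads": 501_000}

lifetime = hit_ratio(now["db block gets"], now["consistent gets"], now["physical reads"])
d = delta(now, before)
interval = hit_ratio(d["db block gets"], d["consistent gets"], d["physical reads"])

print(f"since startup: {lifetime:.2f}%")
print(f"last interval: {interval:.2f}%")
```

With these numbers the lifetime average stays near 95% while the interval value is 0%, because every one of the 1000 logical reads in the interval needed a physical read.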
-
Andreas Says:
December 12th, 2009 at 8:44
Hello, only about 50% of the checks work for me, e.g. TNSPING, CON.-TIME, CON.-USERS, invalid-objects. But for some checks, e.g. sga-data-buffer-hit-ratio, I get the following error message in Nagios: **ePN /usr/lib/nagios/plugins/check_oracle_health: printf() on closed filehandle STATE at (eval 1) line 3841,. I have already added "-epn". On the command line, however, the check works. Thanks!
[Reply]
lausser Reply:
December 12th, 2009 at 14:45
Could it be that you ran check_oracle_health on the command line as root? The plugin stores intermediate results in the directory /var/tmp/check_oracle_health, which is created automatically. If the directory belongs to root, a check_oracle_health process running under the Nagios account can no longer write into it. The error message points to this. A "chown -R nagios:nagios /var/tmp/check_oracle_health" should solve the problem.
[Reply]
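A quick way to confirm this diagnosis is to test, as the Nagios user, whether the state directory is writable. The following is only a sketch: the helper name state_dir_problem is hypothetical, and the only fact taken from the reply above is the directory path.

```python
import os


def state_dir_problem(path):
    """Return a short description of the problem, or None if the directory is usable."""
    if not os.path.isdir(path):
        return f"{path} does not exist"
    # Creating files in a directory needs both write and search permission.
    if not os.access(path, os.W_OK | os.X_OK):
        return f"{path} is not writable by uid {os.geteuid()}"
    return None


problem = state_dir_problem("/var/tmp/check_oracle_health")
print(problem or "state directory looks fine")
```

Run it under the same account that Nagios uses; running it as root would hide the problem, since root can write almost anywhere.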
-
Erlon Says:
December 14th, 2009 at 14:41
Where can I find the download link?
[Reply]
-
Erlon Says:
December 14th, 2009 at 16:54
But the Download topic does not exist!
[Reply]
-
Erlon Says:
December 14th, 2009 at 20:41
OK, I can see it now. I didn't see it before because I was viewing the page in English, and in the English version this link doesn't exist.
[Reply]
-
Don Seiler Says:
January 8th, 2010 at 1:14
Are there plans for ASM checks, such as disk group free space (v$asm_diskgroup.usable_file_mb)?
[Reply]
lausser Reply:
January 8th, 2010 at 11:40
No, I currently have no plans (mostly because I'm too occupied with other things). But if you look in the contribs subdirectory, you'll find a description of how you can extend check_oracle_health with your own custom modes. You simply put the code (mostly the SQL statements) in a separate file which is sourced at runtime. Perhaps you want to play around with this and post the result. If it works, I will gladly add it to the core plugin.
[Reply]
Don Seiler Reply:
January 14th, 2010 at 0:16
@lausser, I'd love to do this if I have some time later. Thanks.
[Reply]
-
Millet JC Says:
January 13th, 2010 at 11:29
Hello all,
I have a small compilation error on a Solaris system. I'm no expert, but I think it's linked to my environment:
./configure runs successfully.
make gives me this error:
Making all in plugins-scripts
make: Fatal error: Don't know how to make target `Nagios/DBD/Oracle/Server/Instance/SGA/SharedPool/DictionaryCache.pm'
Current working directory /tmp/check_oracle_health-1.6.3/plugins-scripts
*** Error code 1
The following command caused the error:
failcom='exit 1'; \
for f in x $MAKEFLAGS; do \
case $f in \
*=* | --[!k]*);; \
*k*) failcom='fail=yes';; \
esac; \
done; \
dot_seen=no; \
target=`echo all-recursive | sed s/-recursive//`; \
list='plugins-scripts t'; for subdir in $list; do \
echo "Making $target in $subdir"; \
if test "$subdir" = "."; then \
dot_seen=yes; \
local_target="$target-am"; \
else \
local_target="$target"; \
fi; \
(cd $subdir && make $local_target) \
|| eval $failcom; \
done; \
if test "$dot_seen" = "no"; then \
make "$target-am" || exit 1; \
fi; test -z "$fail"
make: Fatal error: Command failed for target `all-recursive'
[Reply]
lausser Reply:
January 13th, 2010 at 16:01
It looks like your tar command does not support filenames longer than 100 characters (I think SuSE has such a tar). Instead of the tar.gz, please download the shar.gz and unpack it with:
cat check_oracle_health-xxx.shar.gz | gzip -d | sh
[Reply]
-
Rascal Says:
January 25th, 2010 at 20:44
Hello, I am not a database person, just a "monitor", hence my question: Is there a way to keep the database connection open through the plugin? The constant connecting and disconnecting makes the log files on the DB swell. Or does something need to be changed in the database config?
[Reply]
-
lausser Says:
January 26th, 2010 at 13:13
With http://sqlrelay.sourceforge.net/ you can run a proxy that keeps the connection open. This eliminates the login messages in the log file.
check_oracle_health --method sqlrelay --connect <proxy-ip>:<proxy-port> --username <proxy-user> --password <proxy-password> ...
[Reply]
-
Frank Says:
February 15th, 2010 at 18:15
Hello, on the command line the check works as user nagios. In Nagios itself I get the error message: ePN failed to compile /usr/lib/nagios/plugins/check_oracle_health "Missing right curly or square bracket at (eval 18) line 4193, at end of line syntax error at (eval 18) line 4200, at EOF at /usr/lib/nagios/p1.pl line 155"
The line "# nagios: -epn" is already in the script.
Could this be because Nagios is still v.2.9? Is there a way to get this running under that version?
[Reply]
lausser Reply:
February 16th, 2010 at 2:20
Selectively disabling ePN with -epn is only available from version 3 onwards. Unfortunately, that leaves only an upgrade to 3.x or doing without ePN entirely.
[Reply]
-
John Tomawski Says:
February 16th, 2010 at 23:30
Be sure to set the --environment flag when required. The flag can be used to set things such as TNS_ADMIN, etc.
Hopefully this comment saves someone 2 hours… sigh
ex. --environment TNS_ADMIN='/usr/lib/oracle/bleh'
[Reply]
-
Aldo Says:
February 19th, 2010 at 12:52
When running the following command:
./check_oracle_health --connect REMOTE --username $ORAUSER --password $ORAPWD --mode tablespace-usage --tablespace USERS
I get the following error message:
Use of uninitialized value in split at /usr/lib/nagios/plugins/check_oracle_health line 3924.
bumm Can't call method "execute" on an undefined value at /usr/lib/nagios/plugins/check_oracle_health line 4230.
Can't use an undefined value as an ARRAY reference at /usr/lib/nagios/plugins/check_oracle_health line 4242.
Now I'm clueless about what to do. Can you assist me with this one?
Thanks in advance
[Reply]
lausser Reply:
February 20th, 2010 at 0:12
Did you give the necessary privileges to your ORAUSER?
CREATE user nagios IDENTIFIED BY oradbmon;
GRANT CREATE session TO nagios;
GRANT SELECT any dictionary TO nagios;
GRANT SELECT ON V_$SYSSTAT TO nagios;
GRANT SELECT ON V_$INSTANCE TO nagios;
GRANT SELECT ON V_$LOG TO nagios;
GRANT SELECT ON SYS.DBA_DATA_FILES TO nagios;
GRANT SELECT ON SYS.DBA_FREE_SPACE TO nagios;
You can also create an empty file /tmp/check_oracle_health.trace with the touch command. As long as this file exists, check_oracle_health will write debugging messages into it. You should see the SQL statements sent to the database server and the responses. Maybe this gives you an idea of what's wrong.
[Reply]
-
Hans-Jürgen Says:
February 22nd, 2010 at 11:03
Hello,
we have been using check_oracle_health for quite some time and are very happy with it. Many thanks for that. For the tablespaces that have autoextend enabled, we would like to change the monitoring from tablespace-usage to tablespace-can-allocate-next. Does this check both whether there is still enough space and whether MAX_EXTENT has already been reached?
[Reply]
lausser Reply:
February 22nd, 2010 at 12:51
Hello, as far as I know, max_extent is not looked at. If you create an empty file with touch /tmp/check_oracle_health.trace (writable by the Nagios user), the SQL statements that are issued and their results are logged there.
[Reply]
Günter Reply:
April 13th, 2010 at 14:14
@lausser, will there be a way in the future to exclude autoextend tablespaces from tablespace-usage?
[Reply]
Günter Reply:
April 13th, 2010 at 15:17
@Günter, never mind. I just saw in the trace file that autoextend tablespaces are taken into account, i.e. the maximum size is used.
[Reply]
lausser Reply:
April 13th, 2010 at 19:05
You can also exclude certain tablespaces with a regular expression:
--name='^(?!(TABLESPACE1$)|(TABLESPACE2$)|(TABLESPACE3$))' --regexp
This means: everything except TABLESPACE1, TABLESPACE2, TABLESPACE3.
[Reply]
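The expression relies on a negative lookahead anchored at the start of the name. The plugin evaluates it in Perl; the sketch below uses Python's re module, which handles this particular pattern the same way, and a made-up list of tablespace names to show which ones would still be checked.

```python
import re

# Same pattern as in the --name argument above.
EXCLUDE = re.compile(r"^(?!(TABLESPACE1$)|(TABLESPACE2$)|(TABLESPACE3$))")

# Hypothetical tablespace names, for illustration only.
names = ["SYSTEM", "USERS", "TABLESPACE1", "TABLESPACE2", "TABLESPACE3", "TABLESPACE10"]

# A name is kept when the lookahead does NOT find one of the excluded names.
checked = [name for name in names if EXCLUDE.search(name)]
print(checked)
```

Because each alternative ends in $, a name such as TABLESPACE10 is not excluded; only exact matches are skipped.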
-
angry_admin Says:
February 24th, 2010 at 13:49
http://ideas.nagios.org/a/dtd/22035-3955
[Reply]
-
Rik Says:
February 26th, 2010 at 16:21
Thank you, Gerhard, for an excellent plugin. Here is a tiny correction to the documentation on this page: --method accepts two arguments, dbi (not tns) or sqlplus. Or am I misinterpreting things?
[Reply]
lausser Reply:
February 26th, 2010 at 19:19
Thanks! "tns" was what I called it in a very early phase. Later it was replaced by the less misleading "dbi".
[Reply]
-
Thomas Says:
March 4th, 2010 at 11:57
Hello,
this all sounds very nice. I would love to try it out, but unfortunately I can't find a download link anywhere (not even what is by now 1.20 m further up the page). Have I overlooked something?
Thanks, Thomas
[Reply]
lausser Reply:
March 4th, 2010 at 15:16
You have probably landed on the English page, which does not exist (although the comments are displayed there). The download link is on this page: http://labs.consol.de/lang/de/nagios/check_oracle_health/
[Reply]
-
Thomas Says:
March 4th, 2010 at 17:27
Hello,
unfortunately I am having trouble installing the Oracle Instant Client. Does anyone know of a page that deals with this topic?
Many thanks, Thomas
[Reply]
Max Reply:
March 12th, 2010 at 17:05
@Thomas, hello Thomas, have a look here: http://samushka.blogspot.com/2009/04/installing-oracle-sqlplus-in-ubuntu.html
[Reply]
-
Steffen Poulsen Says:
March 25th, 2010 at 18:09
When using --mode=tablespace-remaining-time, our experience is that it is somewhat slow on some machines. For example, on the machine below it takes more than 60 seconds to process 34 tablespaces.
Apparently each status file takes two seconds to process on this particular machine (some trace output is pasted below), and as this machine has a new tablespace automatically added each week, this is not going to get any better by itself any time soon :-)
We are aware that we could split the tablespace checking into separate checks and do each tablespace individually. But if you happen to have an idea for making this mode run a bit faster, so that all tablespaces could be checked within a timeframe of, say, 60 seconds, that would be a clear number 1? :-)
Best regards, Steffen Poulsen
$ uname -a
SunOS 5.10 Generic_141414-07 sun4v sparc SUNW,SPARC-Enterprise-T5220
./check_oracle_health --mode=tablespace-remaining-time --lookback=15 --warning=10: --critical=2: …
Thu Mar 25 15:10:52 2010: loaded 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:10:52 2010 Thu Mar 25 15:10:52 2010: trimmed to 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:10:52 2010 Thu Mar 25 15:10:54 2010: loaded 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:10:54 2010 Thu Mar 25 15:10:54 2010: trimmed to 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:10:54 2010 Thu Mar 25 15:10:56 2010: loaded 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:10:56 2010 Thu Mar 25 15:10:56 2010: trimmed to 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:10:56 2010 Thu Mar 25 15:10:58 2010: loaded 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:10:58 2010 Thu Mar 25 15:10:58 2010: trimmed to 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:10:58 2010 Thu Mar 25 15:11:00 2010: loaded 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:00 2010 Thu Mar 25 15:11:00 2010: trimmed to 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:00 2010 Thu Mar 25 15:11:02 2010: loaded 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:02 2010 Thu Mar 25 15:11:02 2010: trimmed to 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:02 2010 Thu Mar 25 15:11:04 2010: loaded 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:04 2010 Thu Mar 25 15:11:04 2010: trimmed to 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:04 2010 Thu Mar 25 15:11:06 2010: loaded 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:06 2010 Thu Mar 25 15:11:06 2010: trimmed to 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:06 2010 Thu Mar 25 15:11:08 2010: loaded 5822 data sets from Thu Feb 25 11:00:46 2010 – Thu Mar 25 15:11:08 2010 Thu Mar 25 15:11:08 2010: trimmed to 5822 data sets from Thu Feb 25 11:00:46 2010 – Thu Mar 25 15:11:08 2010 Thu Mar 25 15:11:09 2010: loaded 3806 data sets from Thu Mar 4 11:00:53 
2010 – Thu Mar 25 15:11:09 2010 Thu Mar 25 15:11:09 2010: trimmed to 3806 data sets from Thu Mar 4 11:00:53 2010 – Thu Mar 25 15:11:09 2010 Thu Mar 25 15:11:10 2010: loaded 1790 data sets from Thu Mar 11 11:00:59 2010 – Thu Mar 25 15:11:10 2010 Thu Mar 25 15:11:10 2010: trimmed to 1790 data sets from Thu Mar 11 11:00:59 2010 – Thu Mar 25 15:11:10 2010 Thu Mar 25 15:11:10 2010: loaded 5 data sets from Mon Mar 22 14:41:08 2010 – Thu Mar 25 15:11:10 2010 Thu Mar 25 15:11:10 2010: trimmed to 5 data sets from Mon Mar 22 14:41:08 2010 – Thu Mar 25 15:11:10 2010 Thu Mar 25 15:11:10 2010: no historical data found Thu Mar 25 15:11:11 2010: loaded 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:11 2010 Thu Mar 25 15:11:11 2010: trimmed to 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:11 2010 Thu Mar 25 15:11:13 2010: loaded 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:13 2010 Thu Mar 25 15:11:13 2010: trimmed to 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:13 2010 Thu Mar 25 15:11:15 2010: loaded 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:15 2010 Thu Mar 25 15:11:15 2010: trimmed to 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:15 2010 Thu Mar 25 15:11:17 2010: loaded 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:17 2010 Thu Mar 25 15:11:17 2010: trimmed to 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:17 2010 Thu Mar 25 15:11:19 2010: loaded 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:19 2010 Thu Mar 25 15:11:19 2010: trimmed to 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:19 2010 Thu Mar 25 15:11:21 2010: loaded 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:21 2010 Thu Mar 25 15:11:21 2010: trimmed to 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:21 2010 Thu Mar 25 15:11:23 2010: loaded 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:23 
2010 Thu Mar 25 15:11:23 2010: trimmed to 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:23 2010 Thu Mar 25 15:11:25 2010: loaded 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:25 2010 Thu Mar 25 15:11:25 2010: trimmed to 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:25 2010 Thu Mar 25 15:11:27 2010: loaded 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:27 2010 Thu Mar 25 15:11:27 2010: trimmed to 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:27 2010 Thu Mar 25 15:11:29 2010: loaded 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:29 2010 Thu Mar 25 15:11:29 2010: trimmed to 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:29 2010 Thu Mar 25 15:11:31 2010: loaded 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:31 2010 Thu Mar 25 15:11:31 2010: trimmed to 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:31 2010 Thu Mar 25 15:11:33 2010: loaded 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:33 2010 Thu Mar 25 15:11:33 2010: trimmed to 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:33 2010 Thu Mar 25 15:11:35 2010: loaded 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:35 2010 Thu Mar 25 15:11:35 2010: trimmed to 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:35 2010 Thu Mar 25 15:11:37 2010: loaded 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:37 2010 Thu Mar 25 15:11:37 2010: trimmed to 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:37 2010 Thu Mar 25 15:11:39 2010: loaded 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:39 2010 Thu Mar 25 15:11:39 2010: trimmed to 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:39 2010 Thu Mar 25 15:11:41 2010: loaded 6418 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:41 2010 Thu Mar 25 15:11:41 2010: trimmed to 6418 data sets from Tue Feb 23 
09:15:44 2010 – Thu Mar 25 15:11:41 2010 Thu Mar 25 15:11:42 2010: found 2027 usable data sets since Wed Mar 10 15:11:42 2010 Thu Mar 25 15:11:42 2010: found 2028 usable data sets since Wed Mar 10 15:11:42 2010 Thu Mar 25 15:11:42 2010: found 2028 usable data sets since Wed Mar 10 15:11:42 2010 Thu Mar 25 15:11:42 2010: found 2028 usable data sets since Wed Mar 10 15:11:42 2010 Thu Mar 25 15:11:42 2010: found 2028 usable data sets since Wed Mar 10 15:11:42 2010 Thu Mar 25 15:11:42 2010: found 2028 usable data sets since Wed Mar 10 15:11:42 2010 Thu Mar 25 15:11:42 2010: found 2028 usable data sets since Wed Mar 10 15:11:42 2010 Thu Mar 25 15:11:42 2010: found 2028 usable data sets since Wed Mar 10 15:11:42 2010 Thu Mar 25 15:11:42 2010: found 2028 usable data sets since Wed Mar 10 15:11:42 2010 Thu Mar 25 15:11:42 2010: found 2028 usable data sets since Wed Mar 10 15:11:42 2010 Thu Mar 25 15:11:42 2010: found 2028 usable data sets since Wed Mar 10 15:11:42 2010 Thu Mar 25 15:11:43 2010: found 2028 usable data sets since Wed Mar 10 15:11:43 2010 Thu Mar 25 15:11:43 2010: found 2028 usable data sets since Wed Mar 10 15:11:43 2010 Thu Mar 25 15:11:43 2010: found 2028 usable data sets since Wed Mar 10 15:11:43 2010 Thu Mar 25 15:11:43 2010: found 2028 usable data sets since Wed Mar 10 15:11:43 2010 Thu Mar 25 15:11:43 2010: found 2028 usable data sets since Wed Mar 10 15:11:43 2010 Thu Mar 25 15:11:43 2010: found 1 usable data sets since Wed Mar 10 15:11:43 2010 Thu Mar 25 15:11:43 2010: found 6 usable data sets since Wed Mar 10 15:11:43 2010 Thu Mar 25 15:11:43 2010: found 1791 usable data sets since Wed Mar 10 15:11:43 2010 Thu Mar 25 15:11:43 2010: found 2028 usable data sets since Wed Mar 10 15:11:43 2010 Thu Mar 25 15:11:43 2010: found 2028 usable data sets since Wed Mar 10 15:11:43 2010 Thu Mar 25 15:11:43 2010: found 2028 usable data sets since Wed Mar 10 15:11:43 2010 Thu Mar 25 15:11:43 2010: found 2028 usable data sets since Wed Mar 10 15:11:43 2010 Thu Mar 
25 15:11:43 2010: found 2028 usable data sets since Wed Mar 10 15:11:43 2010 Thu Mar 25 15:11:43 2010: found 2028 usable data sets since Wed Mar 10 15:11:43 2010 Thu Mar 25 15:11:43 2010: found 2028 usable data sets since Wed Mar 10 15:11:43 2010 Thu Mar 25 15:11:43 2010: found 2028 usable data sets since Wed Mar 10 15:11:43 2010 Thu Mar 25 15:11:43 2010: found 2028 usable data sets since Wed Mar 10 15:11:43 2010 Thu Mar 25 15:11:43 2010: found 2028 usable data sets since Wed Mar 10 15:11:43 2010 Thu Mar 25 15:11:44 2010: DESTROY DBD::Oracle::Server::Database::Tablespace with handle null null
[Reply]
lausser Reply:
March 26th, 2010 at 1:32
You're right. 2 seconds is quite long. In /var/tmp/check_oracle_health you should find several files named tablespace-remaining-time_*
Please mail me one of these files. I’ll have a look at it.
[Reply]
Steffen Poulsen Reply:
March 29th, 2010 at 12:39
Thank you very much for the patch you sent us. Run time is now down from 65 to 11 seconds on this particular host :-)
[Reply]
lausser Reply:
March 29th, 2010 at 17:25
You're welcome. In case anybody else stumbles upon the same problem: I'll release a version with this patch soon.
[Reply]
Khadija Reply:
April 13th, 2010 at 1:18
@Steffen Poulsen, can you please let me know which patch you are referring to, Steffen?
Regards, Khadija
[Reply]
lausser Reply:
April 13th, 2010 at 19:01
I forgot to release this update. A new version of check_oracle_health is coming asap.
[Reply]
-
Frank Says:
April 14th, 2010 at 16:03
Hello,
We have used this plugin (1.5) for some months now and everything worked fine, but since yesterday we have been receiving the message "CRITICAL - connection could not be established within 60 seconds". Nothing has changed in the plugin, on the network, on the machines or in nagios/centreon.
I don't have a clue where to look to resolve this problem. Does it sound familiar to anybody?
Regards, Frank
[Reply]
lausser Reply:
April 14th, 2010 at 18:37
Can you connect with the sqlplus command (executed on the Nagios server)?
[Reply]
-
Steffen B Says:
April 15th, 2010 at 9:41
Hello,
first of all, high praise to you; this is a great plugin you have created. We use it for all Oracle monitoring of our customer systems.
As of today, however, I have a problem that I can't get past. The situation is as follows:
One DB server with two databases on it, each with one schema, both at the same DB version 10.2.0.4 and with the same schema name.
I want to use the plugin to determine the usage of the tablespace: --mode=tablespace-usage. The syntax on the command line is the same; only the TNS alias for the database changes. With one DB it works without problems, and with the other DB it shows me the following error:
Use of uninitialized value in split at /usr/local/nagios/libexec/check_oracle_health line 3924.
bumm Can't call method "execute" on an undefined value at /usr/local/nagios/libexec/check_oracle_health line 4230.
Can't use an undefined value as an ARRAY reference at /usr/local/nagios/libexec/check_oracle_health line 4242.
As recommended earlier in this thread, I created the check_oracle_health.trace file, and the only thing I noticed is that a different SQL statement is issued. But why? The databases are installed identically and on the same server, so it can't have anything to do with the DB or the operating system, can it?
I would be grateful for any help.
[Reply]
lausser Reply:
April 17th, 2010 at 12:19
"...that a different SQL statement is issued." It would of course be helpful to be able to see these two different statements.
[Reply]
-
Frank Says:
April 15th, 2010 at 10:10
I think you're right, it seems to be a problem with sqlplus.
As root user I can connect with sqlplus; as Nagios user I cannot connect. I think we have to find out what has changed there…
[Reply]
-
Frank Says:
April 15th, 2010 at 13:20
We had to relocate the Nagios server today, so we had to restart the server. The problem is now solved.
[Reply]
-
Hans Wolters Says:
April 23rd, 2010 at 14:40
Dear all,
Great plugin. I started to configure it this week, and for some databases I already have nearly all of the checks that are possible with the default options.
One question remains for me. If I have overlooked this in the documentation (yes, I can read German), then please let me know.
Situation:
We have several machines with more than one database per service id. Would it be possible to return the SID and database name with the return string for Nagios (given by the plugin written in Perl)? This would enable me to use short service descriptions on those machines and set up service entries with multiple databases/SIDs on one machine. Maybe even with a parameter, so people using only one database can skip the options.
I could hack it into the source myself, but I can imagine I am not the only one who would like that feature.
Kind regards,
Hans Wolters
[Reply]
-
Geoff Sears Says:
April 30th, 2010 at 23:53
Hi. I'm having trouble making a connection as sysdba, though I understand this should be possible.
Would you post an example of how to make it work?
Thanks,
-geoff
[Reply]
lausser Reply:
May 10th, 2010 at 0:07
check_oracle_health --connect sysdba@ …
[Reply]
Geoff Sears Reply:
May 14th, 2010 at 2:44
That's what I cannot get to work. It works fine with connect=host:port//service or connect=//host:port/service
But if I use:
connect=sysdba@host:port/service or connect=sysdba@//host:port/service
it results in ORA-12154: TNS:could not resolve the connect identifier specified (DBD ERROR: OCIServerAttach)
I believe that getting a sysdba connection with DBI/DBD::Oracle requires setting a connection attribute ora_session_mode => ORA_SYSDBA; just passing the string "sysdba@host:port/service" as the data source won't do it.
[Reply]
Geoff Sears Reply:
May 15th, 2010 at 2:31
OK, I finally sat down and read through the code: sysdba@… is only supported for sqlplus connections. I hacked it so that tns connections are possible.
[Reply]
lausser Reply:
May 15th, 2010 at 15:21
Which version did you use? I looked into the source (in my git repository) and found (in the tns section)
my $connecthash = { RaiseError => 0, AutoCommit => 0, PrintError => 0 };
if ($self->{username} eq "sys" || $self->{username} eq "sysdba") {
  $connecthash = {
    RaiseError => 0, AutoCommit => 0, PrintError => 0,
    #ora_session_mode => DBD::Oracle::ORA_SYSDBA
    ora_session_mode => 0x0002
  };
  $dsn = sprintf "DBI:Oracle:";
}
Isn't that correct? What do your changes look like?
[Reply]
-
Thomas Says:
May 6th, 2010 at 9:09
I am running Nagios 1.2, and the service I have created with check_oracle_health won't start. The service remains in the pending state. When I run the check in the server's shell, it works perfectly. Might that be a problem with Nagios 1.x? Should I update to Nagios 2 or 3?
[Reply]
lausser Reply:
May 10th, 2010 at 0:13
I don't think this has to do with the Nagios version. Can't you force scheduling of the service through the service detail page? Upgrading to 3.x is a good idea anyway.
[Reply]
-
Björn Says:
May 17th, 2010 at 11:54
Hello,
we are using your health check for some databases. One of the installations uses a Dataguard environment. When we configure checks for the standby, we get a critical error, as no connection is allowed ("ORA-01033: ORACLE initialization or shutdown in progress").
For our other Oracle monitors we excluded ORA-01033 and return an OK state with a comment ("OK - Login Denied, the DB is in Standby Mode - this Check only works for Primary DB's").
Could you implement exception handling for ORA-01033 to allow the same Nagios config for the primary and the standby database?
[Reply]
lausser Reply:
May 18th, 2010 at 0:29
ORA-01033 can be a sign of serious problems, for example when a corrupted database was restarted (ORA-10567 et al. can be found in the alert log), hence ignoring this error message is not an option.
[Reply]
Michael Reply:
June 28th, 2010 at 11:00
@lausser, with Oracle Dataguard, tnsping is not working anymore. I'm getting the same error, "initialization or shutdown in progress". In version 1.6.2 the mode tnsping worked for closed databases.
[Reply]
lausser Reply:
June 28th, 2010 at 11:09
How do you call the plugin and what's the error message (in 1.6.2 and 1.6.4)?
[Reply]
-
Antonio Romero Says:
May 25th, 2010 at 17:07
Hi,
I have installed check_oracle_health on my Nagios system in order to monitor several Oracle DBs. Everything works fine, except one thing. When I query the DB for the space used by a tablespace, the value the plugin returns is different from the value I get from an Oracle query in the Oracle Manager. Can you give me some help with this issue?
Thank you in advance for your help!
Toni.
[Reply]
lausser Reply:
June 1st, 2010 at 21:42
Oracle tools usually relate two values to each other: used space and allocated space. Now if you use autoallocation, the latter value may grow. When used:allocated is near 100%, autoallocation happens, the allocated space suddenly grows and the used:allocated percentage drops. This means you could get an alert from Nagios because the critical threshold has been reached. Then, after the autoallocation, the usage drops below the threshold again. False alert. That's why check_oracle_health calculates the usage percentage from used:max_allocatable.
[Reply]
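A short worked example may help to see why the two percentages differ. The numbers below are invented, and the function is only a sketch of the two ratios described in the reply above, not the plugin's actual query.

```python
def usage_percent(used_mb, total_mb):
    """Used space as a percentage of the given total."""
    return used_mb / total_mb * 100.0


# Invented sizes in MB for one autoextensible tablespace.
used = 95
allocated_before = 100   # current size of the datafiles
allocated_after = 200    # size after one automatic extension
max_allocatable = 400    # upper limit the datafiles may grow to

print(usage_percent(used, allocated_before))  # 95.0, looks critical
print(usage_percent(used, allocated_after))   # 47.5, same data after the extension
print(usage_percent(used, max_allocatable))   # 23.75, unchanged by the extension
```

The used:allocated figure jumps from 95% to 47.5% without any data being deleted, while the used:max_allocatable figure stays at 23.75% and only rises as the tablespace really fills up.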
-
Thomas Says:
May 26th, 2010 at 11:32
Hello,
first of all, thanks for the helpful plugin. However, I still have problems getting it to cooperate with Nagios (3.2.0). I run the Nagios server on Ubuntu and have installed the Oracle Instant Client. The plugin works without problems from the console when started as user nagios. As a service in Nagios with the following command:
command_line $USER1$/check_oracle_health --connect rebmasc.world --user dbo --password xxx --mode tnsping
I always get the error message:
cannot connect to rebmasc.world. ORA-12154: TNS:could not resolve the connect identifier specified (DBD ERROR: OCIServerAttach)
I have set the variables ORACLE_HOME, TNS_ADMIN etc. correctly for all users in bash.bashrc, and the DB also appears in the tnsnames.ora located there. As I said, it works without problems in the console on the machine.
I have already tried various alternative commands (--environment; --method), but without success. I can't find any error. What am I doing wrong?
[Reply]
lausser Reply:
May 26th, 2010 at 20:45
The environment variables must be set in the Nagios init script. Files like .bashrc are not read at system startup.
[Reply]
-
Antonio Romero Says:
May 28th, 2010 at 21:56
Lausser, could you please answer my post above, number 40?
Thanks!
[Reply]
-
Dennis Says:
June 8th, 2010 at 16:07
Hello, is there a way to monitor the Flash Recovery Area, i.e. its fill level?
Regards, Dennis
[Reply]
lausser Reply:
June 9th, 2010 at 11:06
No, that is not built in. But perhaps something like this would help:
--mode sql --name 'select max(percent_space_used) from v$flash_recovery_area_usage' --warning 80 --critical 90
[Reply]
-
Hamza Says:
June 17th, 2010 at 12:15
Hi there
I love your check oracle plugin, it does everything I want.
Is there any way to specify multiple DB names using some sort of delimiter?
I currently have a setup as such.
- In .profile I have
export NAGIOS__SERVICEORACLE_SID=
/usr/lib/oracle/11.2/client/network/admin/tnsnames.sh
which basically does a cat of the tnsnames.ora and pulls out all the sids for me.
What I would like to do is be able to run the check_oracle_health in this way
check_oracle_health --connect $NAGIOS__SERVICEORACLE_SID:$NAGIOS__SERVICEORACLE_SID --username nagios --password nagios --mode tnsping
where the : is any delimiter with which I can specify multiple DB names.
Please help.
Thank you Hamza Maal
[Reply]
lausser Reply:
June 17th, 2010 at 12:59
That's not possible. You can only check databases one at a time. If you want multiple checks inside one single service, you might want to give check_multi a try. http://www.my-plugin.de/wiki/projects/check_multi/discussion
[Reply]
-
Tim Says:
June 17th, 2010 at 20:27
I have an odd problem with this plugin. It works fine, but Nagios reports any response as a warning. From the command line, I'll get:
OK - 0.22 seconds to connect as MONITOR | connection_time=0.2193;3;8
But Nagios shows the service in yellow and the log has:
SERVICE ALERT: myhost;Oracle mySID Connect;WARNING;HARD;3;OK - 0.14 seconds to connect as MONITOR
Why is it showing as an alert when the connect time is within the correct range?
[Reply]
lausser Reply:
June 21st, 2010 at 10:51
Strange… What about the thresholds? From your command line example I see you set --warning 3 --critical 8 (without these extra parameters it would be 1 and 5 by default). Did you also set thresholds in the service/command definition? When you get such a WARNING, please click on "Service Details" and look at the performance data. Which thresholds do you see there?
[Reply]
Tim Reply:
June 21st, 2010 at 19:10
I think I added the warning/critical params just in case that might affect the display. The performance data looks like this:
Current Status: WARNING (for 3d 22h 57m 46s)
Status Information: OK - 0.23 seconds to connect as MONITOR
Performance Data: connection_time=0.2339;3;8
Current Attempt: 3/3 (HARD state)
Last Check Time: 06-21-2010 13:06:33
Check Type: ACTIVE
Interestingly, I also set this up in Icinga and it does the same thing.
[Reply]
lausser Reply:
June 21st, 2010 at 19:49
Very strange… The last lines of the plugin are:
printf "%s - %s", $ERRORCODES{$nagios_level}, $nagios_message;
printf " | %s", $perfdata if $perfdata;
printf "\n";
exit $nagios_level;
So if $ERRORCODES{$nagios_level} is “OK” (which is in the output), then the exit code $nagios_level must be 0. Can you reset the service to OK with “submit passive check result”? Did you see a warning from the first moment when you configured this service, or has it been OK before?
[Reply]
Tim Reply:
June 22nd, 2010 at 21:05
I can send it an OK passive result and it will switch to “OK”, but it usually changes right back to a yellow warning.
I’ve tried enabling and disabling passive checks, event handling, but to no effect.
One thing I did notice is that it almost always shows:
Current Attempt: 3/3 (HARD state)
As if maybe it didn’t pass the first 2 checks. Running from the command line I can submit it repeatedly and I get OK results each time. It’s an odd thing. After all of this testing, I think the script works fine, it appears to be more of a Nagios problem.
[Reply]
lausser Reply:
June 22nd, 2010 at 21:13
Just to be absolutely sure, you can add an extra line at the end of the plugin:
printf "%s - %s", $ERRORCODES{$nagios_level}, $nagios_message;
printf " | %s", $perfdata if $perfdata;
printf "\n";
printf "i will definitively exit with %d\n", $nagios_level;
exit $nagios_level;
The level surely won’t change between the printf and the exit.
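Nagios takes the service state from the plugin's exit code, not from the word at the start of the output, so the two can disagree if something between Nagios and the plugin (a wrapper script, for instance) changes the exit status. The following is a minimal Python sketch for checking which exit code a command really returns. The helper `run_check` and the stand-in command are hypothetical and not part of check_oracle_health.

```python
import subprocess
import sys

# Standard Nagios plugin exit codes.
STATES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def run_check(argv):
    """Run a check command; return (state derived from exit code, first output line)."""
    proc = subprocess.run(argv, capture_output=True, text=True)
    lines = proc.stdout.splitlines()
    first_line = lines[0] if lines else ""
    return STATES.get(proc.returncode, "UNKNOWN"), first_line


if __name__ == "__main__":
    # Stand-in for a command whose text says OK while its exit code is 1.
    fake = [
        sys.executable,
        "-c",
        "print('OK - 0.22 seconds to connect'); raise SystemExit(1)",
    ]
    state, text = run_check(fake)
    print(f"exit-code state: {state}; plugin text: {text}")
```

Replacing the stand-in with the exact command line from the Nagios command definition, run as the Nagios user, would show whether the exit code seen by Nagios matches the text.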
-
IT-COW | Icinga: Oracle-Datenbanken abfragen Says:
June 19th, 2010 at 8:53
[...] There is a plugin for Icinga/Nagios that lets you query the status of Oracle databases over the network. The tool is called oracle_check_health and, like check_logfiles, was developed by Mr. Lausser of ConSol. This is the project's homepage: Link. [...]
-
Hamza Says:
June 24th, 2010 at 18:00
Hi
I seem to be having some trouble setting the warning and critical thresholds for checking tablespace free.
Could you please advise on the correct syntax for
check_oracle_health -t 480 --connect db1 --username nagios --password nagios --mode tablespace-free --warning 85 --critical 90
Please help.
[Reply]
lausser Reply:
June 24th, 2010 at 18:06
I assume you want a warning if less than 15% is free and a critical if less than 10% is free. Please use a trailing ‘:’, which is the correct syntax for ‘less than’ thresholds.
--mode tablespace-free --warning 15: --critical 10:
[Reply]
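The threshold notation used here comes from the Nagios plugin development guidelines: a threshold is a range, and a value outside the range raises the alert. A plain `90` means the range 0..90, while `90:` means 90..infinity, so only values below 90 alert. The following Python sketch illustrates these range rules; it is not taken from check_oracle_health, and whether the plugin supports every variant (`~`, `@`) is an assumption.

```python
import math


def parse_range(spec):
    """Parse a Nagios-style range string into (low, high, inside_alerts)."""
    inside_alerts = spec.startswith("@")  # "@10:20" alerts INSIDE the range
    if inside_alerts:
        spec = spec[1:]
    if ":" in spec:
        low_text, high_text = spec.split(":", 1)
    else:
        low_text, high_text = "0", spec  # plain "N" means the range 0..N
    low = -math.inf if low_text == "~" else float(low_text or 0)
    high = math.inf if high_text == "" else float(high_text)
    return low, high, inside_alerts


def alerts(value, spec):
    """Return True if the value triggers the threshold described by spec."""
    low, high, inside_alerts = parse_range(spec)
    inside = low <= value <= high
    return inside if inside_alerts else not inside


if __name__ == "__main__":
    # 12% free against "--warning 15:" and a 100% hit ratio against "--critical 90".
    print(alerts(12, "15:"), alerts(100, "90"), alerts(100, "90:"))
```

With `--critical 90` a hit ratio of 100% lies outside 0..90 and therefore alerts, whereas `--critical 90:` only alerts once the value drops below 90.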
-
Rija Says:
July 2nd, 2010 at 15:56
Hello, I have a problem when I run the plugin with tablespace-io-balance to check the datafiles under all tablespaces. The output is CRITICAL - unable to aquire tablespace info. Could you help me please?
[Reply]
lausser Reply:
July 2nd, 2010 at 16:09
Do you see this message only with mode tablespace-io-balance? What about --mode list-tablespaces?
Maybe you forgot to set the right privileges?
CREATE user nagios IDENTIFIED BY oradbmon;
GRANT CREATE session TO nagios;
GRANT SELECT any dictionary TO nagios;
GRANT SELECT ON V_$SYSSTAT TO nagios;
GRANT SELECT ON V_$INSTANCE TO nagios;
GRANT SELECT ON V_$LOG TO nagios;
GRANT SELECT ON SYS.DBA_DATA_FILES TO nagios;
GRANT SELECT ON SYS.DBA_FREE_SPACE TO nagios;
[Reply]
-
Rija Says:
July 2nd, 2010 at 16:38
I see this message with tablespace-io-balance only. I executed:
check_oracle_health --connect SID --user nagios --password oradbmon --mode tablespace-io-balance
list-tablespaces works: the output gives the list and the message “OK - have fun” at the end. All privileges are OK for user nagios. Thank you for your help!
[Reply]
lausser Reply:
July 2nd, 2010 at 17:02
Edit the plugin and search for “sub init_datafiles”, then search for “iobalance” and finally search for “datafileresults”. Now you have found the line
my @datafileresults = $params{handle}->fetchall_array($sql, $params{selectname}, $params{selectname});
Please change the $params{selectname} to $params{tablespace} (2 times) and try again.
[Reply]
Rija Reply:
July 2nd, 2010 at 17:32
I followed your tips and now everything works. Thank you very much for your help.
[Reply]
Rija Reply:
July 2nd, 2010 at 19:12
@lausser, Oops! Sorry, it doesn’t work for Oracle installed on a Windows machine; the same error message appears. Do you have another solution for that? Thank you.
[Reply]
lausser Reply:
July 2nd, 2010 at 21:18
Strange… Unfortunately I don’t have a Windows db-server. Please execute the following statement with sqlplus:
SELECT file_name, SUM(phyrds), SUM(phywrts)
FROM dba_data_files, v$filestat
WHERE tablespace_name = UPPER('USERS') AND file_id=file#
GROUP BY tablespace_name, file_name
[Reply]
Rija Reply:
July 10th, 2010 at 2:00
@lausser, Hello! I ran “GRANT SELECT ON V_$filestat TO nagios;” and it works. Thank you…
I have another problem. I’d like to change the default warning and critical levels for sga-data-buffer-hit-ratio, sga-library-cache-hit-ratio and sga-dictionary-cache-hit-ratio, but the result is still CRITICAL even when the value is 100%. I executed the following command:
check_oracle_health --connect SID --mode sga-data-buffer-hit-ratio --warning 80 --critical 90
CRITICAL - SGA data buffer hit ratio 100.00% | sga_data_buffer_hit_ratio=100.00%;80;90
[Reply]
Rija Reply:
July 10th, 2010 at 2:11
@Rija, Sorry! The command is:
check_oracle_health --connect SID --mode sga-data-buffer-hit-ratio --warning 90 --critical 80
The same error message appears…
lausser Reply:
July 10th, 2010 at 2:38
These are “less than”-thresholds. According to the plugin developer guidelines, you must add a “:”. So --warning <less than 90> is written as --warning 90:
-
roger Says:
July 2nd, 2010 at 18:38
Is it normal that the seg_top_10 metrics include the PERFSTAT user? This is my top 10 ……….
PERFSTAT 2154 row lock waits 1 ……..
PERFSTAT 1450 row lock waits 2 ………..
PERFSTAT 572 row lock waits 3 ………
PERFSTAT 466 row lock waits 4 ………
PERFSTAT 446 row lock waits 5 ………..
PERFSTAT 382 row lock waits 6 …………
PERFSTAT 350 row lock waits 7 …………
PERFSTAT 288 row lock waits 8 ………….
PERFSTAT 246 row lock waits 9 …………..
PERFSTAT 191 row lock waits 10
[Reply]
-
Hamza Maal Says:
July 14th, 2010 at 10:06
Hi
I am trying to run a SQL statement using --mode sql but it does not seem to work. I have tried using the encode mode but it still comes up with errors.
Original statement:
/usr/lib/nagios/plugins/check_oracle_health --connect mlc247 --username dbuser --password dbpass --mode sql SELECT TO_CHAR(NEXT_TIME, 'DD-MON-YYYY HH24:MI:SS') FROM V$ARCHIVED_LOG where sequence# = (select max(sequence#) from v$archived_log where applied = 'YES')
After encoding:
/usr/lib/nagios/plugins/check_oracle_health --connect mlc247 --username dbuser --password dbpass --mode sql SELECT%20TO%5FCHAR%28NEXT%5FTIME%2C%20%27DD%2DMON%2DYYYY%20HH24%3AMI%3ASS%27%29%20FROM%20V%24ARCHIVED%5FLOG%20where%20sequence%23%20%3D%20%28select%20max%28sequence%23%29%20from%20v%24archived%5Flog%20where%20applied%20%3D%20%27YES%27%29
This is the error I get using the encoded statement:
Use of uninitialized value $sql in sprintf at /usr/lib/nagios/plugins/check_oracle_health line 4194.
Use of uninitialized value in subroutine entry at /usr/local/lib/perl/5.10.0/DBD/Oracle.pm line 284.
Use of uninitialized value $value in numeric gt (>) at /usr/lib/nagios/plugins/check_oracle_health line 3615.
Use of uninitialized value $value in numeric gt (>) at /usr/lib/nagios/plugins/check_oracle_health line 3616.
Use of uninitialized value $params{"name2"} in split at /usr/lib/nagios/plugins/check_oracle_health line 3553.
OK - :
Any help would be much appreciated
[Reply]
lausser Reply:
July 14th, 2010 at 10:31
check_oracle_health .... --mode sql --name SELECT%20TO%5...
[Reply]
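The fix in the reply is that the statement must be passed as the value of --name; without it the plugin has no SQL to run, which is what the “uninitialized value $sql” warnings show. The following is a minimal Python sketch that reproduces the percent-encoding seen in the question (every byte that is not an ASCII letter or digit becomes %XX) and builds the argument list. `encode_statement` is a hypothetical helper; the plugin's own decoding rules are assumed from this thread, not verified.

```python
def encode_statement(sql):
    """Percent-encode every byte that is not an ASCII letter or digit."""
    return "".join(
        chr(b) if b < 128 and chr(b).isalnum() else f"%{b:02X}"
        for b in sql.encode("utf-8")
    )


def build_command(sql, connect, username, password):
    """Build an argument list that passes the encoded statement via --name."""
    return [
        "check_oracle_health",
        "--connect", connect,
        "--username", username,
        "--password", password,
        "--mode", "sql",
        "--name", encode_statement(sql),
    ]


if __name__ == "__main__":
    statement = "SELECT TO_CHAR(NEXT_TIME, 'DD-MON-YYYY HH24:MI:SS')"
    print(encode_statement(statement))
```

Passing the arguments as a list (for example to `subprocess.run`) also avoids the shell interpreting characters such as `$` and `#` in the statement.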
-
jhon Says:
July 15th, 2010 at 22:11
check_oracle_health --connect SID --mode sga-data-buffer-hit-ratio
OK - SGA data buffer hit ratio 105.55%
105.55%! Why?
[Reply]
lausser Reply:
July 15th, 2010 at 23:44
I don’t know. I need more information. Look into the code. Find the statement which is used to fetch the data used for the calculation of the hit ratio, execute the statement manually, get the values involved in the calculation manually, and post the result here.
[Reply]
jhon Reply:
July 16th, 2010 at 23:04
SUM(DECODE(NAME,'PHYSICALREADS',VALUE,0)): 33942
SUM(DECODE(NAME,'PHYSICALREADSDIRECT',VALUE,0)): 155
SUM(DECODE(NAME,'PHYSICALREADSDIRECT(LOB)',VALUE,0)): 319842
SUM(DECODE(NAME,'SESSIONLOGICALREADS',VALUE,0)): 5623223
=====
Using the query from @Marco, this is the result:
SELECT ROUND((1-(phy.value / (cur.value + con.value)))*100,2) "Cache Hit Ratio"
FROM v$sysstat cur, v$sysstat con, v$sysstat phy
WHERE cur.name = 'db block gets' AND con.name = 'consistent gets' AND phy.name = 'physical reads'
SQL> /
Cache Hit Ratio
99.4
[Reply]
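The four posted values suggest where a ratio above 100% can come from. The following Python sketch assumes the plugin subtracts both kinds of direct reads from the physical reads and divides by the logical reads. That formula is inferred from the column list above and is not confirmed in this thread; the pairing of values to columns is assumed to follow the posted order.

```python
def buffer_hit_ratio(physical_reads, direct_reads, direct_lob_reads, logical_reads):
    """Hit ratio in percent under the assumed formula:
    (1 - (physical - direct - direct_lob) / logical) * 100
    """
    buffered_physical = physical_reads - direct_reads - direct_lob_reads
    return (1 - buffered_physical / logical_reads) * 100


if __name__ == "__main__":
    # Values as posted above, in the order of the four columns.
    ratio = buffer_hit_ratio(33942, 155, 319842, 5623223)
    print(f"{ratio:.2f}%")
```

Because the direct (LOB) reads are larger than the total physical reads here, the subtraction goes negative and the result exceeds 100%, in the same range as the 105.55% reported the day before.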
-
JamesC Says:
July 16th, 2010 at 22:23
I’m having an odd issue, related to running the script as a non-root user. The output is correct, except there’s a printf() error included with the output.
[nagios@server0224 ~]$ /usr/local/nagios/libexec/check_oracle_health --connect krusta_srv --username USER --password PASS --mode sga-data-buffer-hit-ratio --warning 95: --critical 90:
printf() on closed filehandle STATE at /usr/local/nagios/libexec/check_oracle_health line 3828.
OK - SGA data buffer hit ratio 99.99% | sga_data_buffer_hit_ratio=99.99%;95:;90:
[Reply]
lausser Reply:
July 16th, 2010 at 22:48
You ran the plugin as root. This led to the creation of /var/tmp/check_oracle_health and probably some files below this directory (owner: root). These files are necessary to carry state information from one run to the next. Then you ran the plugin as non-root. Overwriting the state file(s) does not work, because they’re owned by root. That’s why you see the error message. Homework:
- chown -R nagios:nagios /var/tmp/check_oracle_health
- write 100 times “i must not run plugins as root”
[Reply]
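To see which state files are affected before or after the chown, the directory can be scanned for entries the current user cannot write. The following is a minimal Python sketch meant to be run as the nagios user. `unwritable_paths` is a hypothetical helper, and the directory path is the one named in the reply above.

```python
import os


def unwritable_paths(root):
    """Return paths under root (root included) the current user cannot write to."""
    if not os.path.exists(root):
        return []
    blocked = []
    if not os.access(root, os.W_OK):
        blocked.append(root)
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if not os.access(path, os.W_OK):
                blocked.append(path)
    return blocked


if __name__ == "__main__":
    for path in unwritable_paths("/var/tmp/check_oracle_health"):
        print("not writable:", path)
```

An empty result after the chown indicates that the plugin can update its state files again.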
-
sdouce Says:
July 20th, 2010 at 12:15
Hi, I receive the message below and I don’t understand it. I have many Nagios servers using the same distribution and they work fine, but on this one I have this problem. Can you help?
CRITICAL - cannot connect to ORACLE_TOTO. install_driver(Oracle) failed: Can’t load ‘/usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi/auto/DBD/Oracle/Oracle.so’ for module DBD::Oracle: /usr/lib/oracle/10.2.0.4/client/lib/libocci.so.10.1: cannot restore segment prot after reloc: Permission denied at /usr/lib/perl5/5.8.8/i386-linux-thread-multi/DynaLoader.pm line 230. at (eval 14) line 3 Compilation failed in require at (eval 14) line 3. Perhaps a required shared library or dll isn’t installed where expected at /usr/local/nagios/libexec/check_oracle_health line 4193
[Reply]


lausser Reply:
October 9th, 2009 at 9:39
I can’t reproduce that.
$ check_oracle_health --user nagios --password $ORAPW --connect NAPRAX --mode sga-shared-pool-reloads
OK - SGA shared pool reload ratio 0.85% | sga_shared_pool_reload_ratio=0.85%;1;10
$ check_oracle_health -V
check_oracle_health (1.6.3)
Are you using the latest version?
Gerhard
[Reply]