check_oracle_health
Posted on July 3rd, 2009 by admin
Description
check_oracle_health is a plugin to check various parameters of an Oracle database.
Documentation
Command line parameters
- --connect=
  The database name.
- --user=
  The database user.
- --password=
  Password of the database user.
- --connect=
  Alternative to the three parameters above: user, password and database name in a single connect string (for example --connect=nagios/dbmoni@BBA).
- --connect=sysdba@
  Log in with / as sysdba (if the user that executes the plugin is privileged to do this).
- --connect=/@token
  Log in with the help of the Password Store (assumes --method=sqlplus).
- --mode=
  With the mode parameter you tell the plugin what it should do. See the list of possible values further down.
- --tablespace=
  Limits the check to a single tablespace. If this parameter is omitted, all tablespaces are checked.
- --datafile=
  Limits the check to a single datafile. If this parameter is omitted, all datafiles are checked.
- --name=
  Limits the check to a single object (latch, enqueue, tablespace, datafile). If this parameter is omitted, all objects are checked. (This parameter can and should be used instead of --tablespace or --datafile. It serves to standardize the command line interface.)
- --name2=
  If you use --mode=sql, the SQL statement appears in the output and in the performance values. With --name2 you can specify a string to be shown in its place.
- --regexp
  With this switch the value of the --name parameter is interpreted as a regular expression.
- --warning=
  Determined values outside of this range trigger a WARNING.
- --critical=
  Determined values outside of this range trigger a CRITICAL.
- --absolute
  For values that increase over time, the plugin reports the increase per second by default. With --absolute it reports the difference between the current and the last run instead.
- --runas=
  With this parameter it is possible to run the script under a different user. (Calls sudo internally: sudo -u followed by that user.)
- --environment
  With this you can pass environment variables to the script, given as name=value. For example: --environment ORACLE_HOME=/u01/oracle. Multiple declarations are possible.
- --method=
  With this parameter you tell the plugin how it should connect to the database (dbi for using DBD::Oracle (default), sqlplus for using the sqlplus tool).
- --units=<%|KB|MB|GB>
  Specifying units serves to "beautify" the output of mode=sql and simplifies threshold values when using mode=tablespace-free.
- --dbthresholds
  With this parameter thresholds are read from the database table check_oracle_health_thresholds.
- --statefilesdir
  This parameter tells the plugin not to use the default directory for temporary files, but a user-specified one. It can be important in a clustered environment with shared filesystems.
Use the option --mode with various keywords to tell the plugin which values it should determine and check.
| Keyword | Description | Range |
| tnsping | Listener | |
| connection-time | Determines how long connection establishment and login take | 0..n Seconds (1, 5) |
| connected-users | The sum of logged in users at the database | 0..n (50, 100) |
| session-usage | Percentage of max possible sessions | 0%..100% (80, 90) |
| process-usage | Percentage of max possible processes | 0%..100% (80, 90) |
| rman-backup-problems | Number of RMAN-errors during the last three days | 0..n (1, 2) |
| sga-data-buffer-hit-ratio | Hitrate in the Data Buffer Cache | 0%..100% (98:, 95:) |
| sga-library-cache-gethit-ratio | Hitrate in the Library Cache (Gets) | 0%..100% (98:, 95:) |
| sga-library-cache-pinhit-ratio | Hitrate in the Library Cache (Pins) | 0%..100% (98:, 95:) |
| sga-library-cache-reloads | Reload-Rate in the Library Cache | n/sec (10,10) |
| sga-dictionary-cache-hit-ratio | Hitrate in the Dictionary Cache | 0%..100% (95:, 90:) |
| sga-latches-hit-ratio | Hitrate of the Latches | 0%..100% (98:, 95:) |
| sga-shared-pool-reloads | Reload-Rate in the Shared Pool | 0%..100% (1, 10) |
| sga-shared-pool-free | Free Memory in the Shared Pool | 0%..100% (10:, 5:) |
| pga-in-memory-sort-ratio | Percentage of sorts in the memory. | 0%..100% (99:, 90:) |
| invalid-objects | Sum of faulty Objects, Indices, Partitions | |
| stale-statistics | Sum of objects with obsolete optimizer statistics | n (10, 100) |
| tablespace-usage | Used diskspace in the tablespace | 0%..100% (90, 98) |
| tablespace-free | Free diskspace in the tablespace | 0%..100% (5:, 2:) |
| tablespace-fragmentation | Free Space Fragmentation Index | 100..1 (30:, 20:) |
| tablespace-io-balance | IO distribution across the datafiles of a tablespace | n (1.0, 2.0) |
| tablespace-remaining-time | Number of remaining days until a tablespace is 100% full. The growth rate is calculated from the values of the last 30 days. (A different period can be specified with the parameter --lookback.) | Days (90:, 30:) |
| tablespace-can-allocate-next | Checks if there is enough free tablespace for the next Extent. | |
| flash-recovery-area-usage | Used diskspace in the flash recovery area | 0%..100% (90, 98) |
| flash-recovery-area-free | Free diskspace in the flash recovery area | 0%..100% (5:, 2:) |
| datafile-io-traffic | Number of IO operations on datafiles per second | n/sec (1000, 5000) |
| datafiles-existing | Percentage of max possible datafiles | 0%..100% (80, 90) |
| soft-parse-ratio | Percentage of soft-parse-ratio | 0%..100% |
| switch-interval | Interval between RedoLog File Switches | 0..n Seconds (600:, 60:) |
| retry-ratio | Retry-Rate in the RedoLog Buffer | 0%..100% (1, 10) |
| redo-io-traffic | Redolog IO in MB/sec | n/sec (199,200) |
| roll-header-contention | Rollback Segment Header Contention | 0%..100% (1, 2) |
| roll-block-contention | Rollback Segment Block Contention | 0%..100% (1, 2) |
| roll-hit-ratio | Rollback Segment gets/waits Ratio | 0%..100% (99:, 98:) |
| roll-extends | Rollback Segment Extends | n, n/sec (1, 100) |
| roll-wraps | Rollback Segment Wraps | n, n/sec (1, 100) |
| seg-top10-logical-reads | Number of user processes among the top 10 logical reads | n (1, 9) |
| seg-top10-physical-reads | Number of user processes among the top 10 physical reads | n (1, 9) |
| seg-top10-buffer-busy-waits | Number of user processes among the top 10 buffer busy waits | n (1, 9) |
| seg-top10-row-lock-waits | Number of user processes among the top 10 row lock waits | n (1, 9) |
| event-waits | Waits/sec from system events | n/sec (10,100) |
| event-waiting | Percentage of the elapsed time that an event has spent waiting | 0%..100% (0.1,0.5) |
| enqueue-contention | Enqueue wait/request-Ratio | 0%..100% (1, 10) |
| enqueue-waiting | Percentage of the elapsed time since the last run that an enqueue has spent waiting | 0%..100% (0.00033,0.0033) |
| latch-contention | Latch misses/gets ratio. A latch name or latch number can be passed with --name. (See list-latches) | 0%..100% (1,2) |
| latch-waiting | Percentage of the elapsed time since the last run that a latch has spent waiting | 0%..100% (0.1,1) |
| sysstat | Changes/sec for any value from v$sysstat | n/sec (10,10) |
| sql | Result of any SQL statement that returns a number. The statement itself is passed with the parameter --name. A label for the performance data output can be passed with the parameter --name2. | n (1,5) |
| list-tablespaces | Prints a list of tablespaces | |
| list-datafiles | Prints a list of datafiles | |
| list-latches | Prints a list with latchnames and latchnumbers | |
| list-enqueues | Prints a list with the Enqueue-Names | |
| list-events | Prints a list of the events from v$system_event. Besides event_number/event_id, a shortened form of the event name is printed. The short forms can be used as Nagios service descriptions. Example: lo_fi_sw_co = log file switch completion | |
| list-background-events | Prints a list with the Background-Events | |
| list-sysstats | Prints a list with system-wide statistics | |
Measurements that depend on a time interval can be evaluated in different ways. Calculating the result requires a start value, an end value and the time elapsed between the two. Without further options, the initial value is the value from the last plugin run. The elapsed time is normally the normal_check_interval of the corresponding service.
If the check result should depend not on the increase per second but on the difference between two measured values, use the option --absolute. This is useful for Rollback Segment Wraps, which happen so rarely that their rate is nearly 0/sec. Nevertheless, you want to be alerted if the number of these events grows.
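The difference between the default per-second rate and --absolute can be illustrated with a small Python sketch. It is not the plugin's code; the function name and the sample readings are hypothetical and only mirror the description above.

```python
def counter_change(previous, current, elapsed_seconds, absolute=False):
    """Return the change of a counter that only ever increases.

    Default: increase per second since the previous run.
    absolute=True: plain difference between the two readings.
    """
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    delta = current - previous
    return delta if absolute else delta / elapsed_seconds


# Hypothetical readings: 3 rollback segment wraps within a 300 second interval.
rate = counter_change(previous=120, current=123, elapsed_seconds=300)
diff = counter_change(previous=120, current=123, elapsed_seconds=300, absolute=True)
print(f"rate={rate:.2f}/sec difference={diff}")
```

With a threshold of 1, the rate of 0.01/sec would stay unnoticed, while the absolute difference of 3 would exceed it.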
The threshold values should be chosen so that they can be reached during a retry_check_interval. Otherwise the service will return to the OK state after each SOFT;1.
Please note that the thresholds must be specified according to the Nagios plug-in development guidelines.
"10" means "Alarm, if > 10" and
"90:" means "Alarm, if < 90"
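A minimal Python sketch of these two threshold forms follows. It covers only the "10" and "90:" notations shown here, not the full range syntax of the Nagios guidelines, and the function names are hypothetical.

```python
def is_alarm(value, threshold):
    """Evaluate the two threshold forms described above.

    "10"  -> alarm if value > 10
    "90:" -> alarm if value < 90
    """
    if threshold.endswith(":"):
        return value < float(threshold[:-1])
    return value > float(threshold)


def check_state(value, warning, critical):
    """Map a measured value to OK, WARNING or CRITICAL."""
    if is_alarm(value, critical):
        return "CRITICAL"
    if is_alarm(value, warning):
        return "WARNING"
    return "OK"


print(check_state(8.91, "10:", "5:"))   # shared pool free, lower bounds
print(check_state(99.33, "90", "98"))   # tablespace usage, upper bounds
```

The critical threshold is tested first, so a value that violates both thresholds is reported as CRITICAL.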
Preparation of the database
In order to be able to collect the needed information from the database a database user with specific privileges is required:
CREATE USER nagios IDENTIFIED BY oradbmon;
GRANT CREATE SESSION TO nagios;
GRANT SELECT ANY DICTIONARY TO nagios;
GRANT SELECT ON V_$SYSSTAT TO nagios;
GRANT SELECT ON V_$INSTANCE TO nagios;
GRANT SELECT ON V_$LOG TO nagios;
GRANT SELECT ON SYS.DBA_DATA_FILES TO nagios;
GRANT SELECT ON SYS.DBA_FREE_SPACE TO nagios;
--
-- if somebody still uses Oracle 8.1.7...
GRANT SELECT ON sys.dba_tablespaces TO nagios;
GRANT SELECT ON dba_temp_files TO nagios;
GRANT SELECT ON sys.v_$Temp_extent_pool TO nagios;
GRANT SELECT ON sys.v_$TEMP_SPACE_HEADER TO nagios;
GRANT SELECT ON sys.v_$session TO nagios;
Examples
nagios$ check_oracle_health --connect bba --mode tnsping
OK - connection established to bba.
nagios$ check_oracle_health --mode connection-time
OK - 0.17 seconds to connect | connection_time=0.1740;1;5
nagios$ check_oracle_health --mode sga-data-buffer-hit-ratio
CRITICAL - SGA data buffer hit ratio 0.99% | sga_data_buffer_hit_ratio=0.99%;98:;95:
nagios$ check_oracle_health --mode sga-library-cache-hit-ratio
OK - SGA library cache hit ratio 98.75% | sga_library_cache_hit_ratio=98.75%;98:;95:
nagios$ check_oracle_health --mode sga-latches-hit-ratio
OK - SGA latches hit ratio 100.00% | sga_latches_hit_ratio=100.00%;98:;95:
nagios$ check_oracle_health --mode sga-shared-pool-reloads
OK - SGA shared pool reloads 0.28% | sga_shared_pool_reloads=0.28%;1;10
nagios$ check_oracle_health --mode sga-shared-pool-free
WARNING - SGA shared pool free 8.91% | sga_shared_pool_free=8.91%;10:;5:
nagios$ check_oracle_health --mode pga-in-memory-sort-ratio
OK - PGA in-memory sort ratio 100.00% | pga_in_memory_sort_ratio=100.00;99:;90:
nagios$ check_oracle_health --mode invalid-objects
OK - no invalid objects found | invalid_ind_partitions=0 invalid_indexes=0 invalid_objects=0 unrecoverable_datafiles=0
nagios$ check_oracle_health --mode switch-interval
OK - Last redo log file switch interval was 18 minutes | redo_log_file_switch_interval=1090s;600:;60:
nagios$ check_oracle_health --mode switch-interval --connect rac1
OK - Last redo log file switch interval was 32 minutes (thread 1)| redo_log_file_switch_interval=1938s;600:;60:
nagios$ check_oracle_health --mode tablespace-usage
CRITICAL - tbs SYSTEM usage is 99.33% tbs SYSAUX usage is 93.73% tbs USERS usage is 8.75% tbs UNDOTBS1 usage is 6.65% | 'tbs_users_usage_pct'=8%;90;98 'tbs_users_usage'=0MB;4;4;0;5 'tbs_undotbs1_usage_pct'=6%;90;98 'tbs_undotbs1_usage'=11MB;153;166;0;170 'tbs_system_usage_pct'=99%;90;98 'tbs_system_usage'=695MB;630;686;0;700 'tbs_sysaux_usage_pct'=93%;90;98 'tbs_sysaux_usage'=802MB;770;839;0;856
nagios$ check_oracle_health --mode tablespace-usage --tablespace USERS
OK - tbs USERS usage is 8.75% | 'tbs_users_usage_pct'=8%;90;98 'tbs_users_usage'=0MB;4;4;0;5
nagios$ check_oracle_health --mode tablespace-usage --name USERS
OK - tbs USERS usage is 8.75% | 'tbs_users_usage_pct'=8%;90;98 'tbs_users_usage'=0MB;4;4;0;5
nagios$ check_oracle_health --mode tablespace-free --name TEST
OK - tbs TEST has 97.91% free space left | 'tbs_test_free_pct'=97.91%;5:;2: 'tbs_test_free'=32083MB;1638.40:;655.36:;0.00;32767.98
nagios$ check_oracle_health --mode tablespace-free --name TEST --units MB --warning 100: --critical 50:
OK - tbs TEST has 32083.61MB free space left | 'tbs_test_free_pct'=97.91%;0.31:;0.15: 'tbs_test_free'=32083.61MB;100.00:;50.00:;0;32767.98
nagios$ check_oracle_health --mode tablespace-free --name TEST --warning 10: --critical 5:
OK - tbs TEST has 97.91% free space left | 'tbs_test_free_pct'=97.91%;10:;5: 'tbs_test_free'=32083MB;3276.80:;1638.40:;0.00;32767.98
nagios$ check_oracle_health --mode tablespace-remaining-time --tablespace ARUSERS --lookback 7
WARNING - tablespace ARUSERS will be full in 78 days | 'tbs_arusers_days_until_full'=78;90:;30:
nagios$ check_oracle_health --mode flash-recovery-area-free
OK - flra /u00/app/oracle/flash_recovery_area has 100.00% free space left | 'flra_free_pct'=100.00%;5:;2: 'flra_free'=2048MB;102.40:;40.96:;0;2048.00
nagios$ check_oracle_health --mode flash-recovery-area-free --units KB --warning 1000: --critical 500:
OK - flra /u00/app/oracle/flash_recovery_area has 2097152.00KB free space left | 'flra_free_pct'=100.00%;0.05:;0.02: 'flra_free'=2097152.00KB;1000.00:;500.00:;0;2097152.00
nagios$ check_oracle_health --mode datafile-io-traffic --datafile users01.dbf
WARNING - users01.dbf: 1049.83 IO Operations per Second | 'dbf_users01.dbf_io_total_per_sec'=1049.83;1000;5000
nagios$ check_oracle_health --mode latch-contention --name 214
OK - SGA latch library cache (214) contention 0.08% | 'latch_214_contention'=0.08%;1;2 'latch_214_sleep_share'=0.00% 'latch_214_gets'=49995
nagios$ check_oracle_health --mode latch-contention --name 'library cache'
OK - SGA latch library cache (214) contention 0.08% | 'latch_214_contention'=0.08%;1;2 'latch_214_sleep_share'=0.00% 'latch_214_gets'=49937
nagios$ check_oracle_health --mode enqueue-contention --name TC
CRITICAL - enqueue TC: 19.90% of the requests must wait | 'TC_contention'=19.90%;1;10 'TC_requests'=2015 'TC_waits'=401
nagios$ check_oracle_health --mode latch-contention --name 'messages'
OK - SGA latch messages (17) contention 0.02% | 'latch_17_contention'=0.02%;1;2 'latch_17_gets'=4867
nagios$ check_oracle_health --mode latch-waiting --name 'user lock'
OK - SGA latch user lock (205) sleeping 0.000841% of the time | 'latch_205_sleep_share'=0.000841%
nagios$ check_oracle_health --mode event-waits --name 'log file sync'
OK - log file sync : 1.839511 waits/sec | 'log file sync_waits_per_sec'=1.839511;10;100
nagios$ check_oracle_health --mode event-waiting --name 'Log file parallel write'
OK - log file parallel write waits 0.045843% of the time | 'log file parallel write_percent_waited'=0.045843%;0.1;0.5
nagios$ check_oracle_health --mode sysstat --name 'transaction rollbacks'
OK - 0.000003 transaction rollbacks/sec | 'transaction rollbacks_per_sec'=0.000003;10;100 'transaction rollbacks'=4
nagios$ check_oracle_health --mode sql --name 'select count(*) from v$session' --name2 sessions
CRITICAL - sessions: 21 | 'sessions'=21;1;5
nagios$ check_oracle_health --mode sql --name 'select 12 from dual' --name2 twelve --units MB
CRITICAL - twelve: 12MB | 'twelve'=12MB;1;5
nagios$ check_oracle_health --mode sql --name 'select 200,300,1000 from dual' --name2 'kaspar melchior balthasar' --warning 180 --critical 500
WARNING - kaspar melchior balthasar: 200 300 1000 | 'kaspar'=200;180;500 'melchior'=300;; 'balthasar'=1000;;
nagios$ check_oracle_health --mode sql --name "select 'abc123' from dual" --name2 \\d --regexp
OK - output abc123 matches pattern \d
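In the tablespace-free examples, each threshold shows up twice in the performance data: once as a percentage and once in MB. The following Python sketch reproduces that conversion for the TEST tablespace (32767.98 MB in total). The helper names and the rounding to two decimals are assumptions made for illustration, not the plugin's actual code.

```python
TOTAL_MB = 32767.98  # size of tablespace TEST, taken from the examples above


def mb_to_percent(threshold_mb, total_mb):
    """Express an absolute threshold in MB as a percentage of the total size."""
    return round(threshold_mb / total_mb * 100, 2)


def percent_to_mb(threshold_pct, total_mb):
    """Express a percentage threshold as an absolute size in MB."""
    return round(threshold_pct / 100 * total_mb, 2)


# --units MB --warning 100: --critical 50:
print(mb_to_percent(100, TOTAL_MB), mb_to_percent(50, TOTAL_MB))
# default thresholds 5: and 2: (percent)
print(percent_to_mb(5, TOTAL_MB), percent_to_mb(2, TOTAL_MB))
```

The results correspond to the values 0.31:/0.15: and 1638.40:/655.36: in the example output.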
Authentication
Example with --runas and an "external user"
There are two users in the database:
- OPS$DBNAGIO IDENTIFIED EXTERNALLY
- NAGIOS IDENTIFIED BY 'DBMONI'
There are two unix users:
- qqnagio with normal access.
- dbnagio with /bin/false as login shell.
qqnagio$ check_oracle_health --mode=connection-time --connect=nagios/dbmoni@BBA
OK - 0.21 seconds to connect as NAGIOS
dbnagio$ check_oracle_health --mode=connection-time --connect=BBA --runas=dbnagio --environment ORACLE_HOME=$ORACLE_HOME
OK - 0.17 seconds to connect as OPS$DBNAGIO
The background for this example is the following scenario with a SAP-Server:
Only local connections to the database are allowed. The database isn’t reachable over the network. Logging in with username and password is not possible.
Only database-users that are authenticated through the operating system (OPS$-User) are allowed to connect.
These users are not allowed to connect via SSH. (Therefore /bin/false).
Because the Nagios user qqnagio is allowed to connect via SSH, it cannot be used as the database user. However, NRPE, which executes the plugin, runs under the qqnagio account.
Use of environment variables
It is possible to omit --connect (and, if not needed, --user and --password) completely if you provide the corresponding values in environment variables. Since version 3.x, Nagios allows service definitions to be extended with your own attributes (custom object variables). These appear in the environment during the execution of the check command.
The environment variables are:
- NAGIOS__SERVICEORACLE_SID (_oracle_sid in the service definition)
- NAGIOS__SERVICEORACLE_USER (_oracle_user in the service definition)
- NAGIOS__SERVICEORACLE_PASS (_oracle_pass in the service definition)
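The naming pattern of the three variables can be sketched in Python. The helper functions are hypothetical and only reproduce the mapping listed above (custom variable name in upper case, prefixed with NAGIOS__SERVICE); how the plugin itself reads the values is not shown here.

```python
import os


def env_name_for_service_variable(custom_variable):
    """Map a custom service variable to its environment variable name.

    "_oracle_sid" -> "NAGIOS__SERVICEORACLE_SID"
    """
    if not custom_variable.startswith("_"):
        raise ValueError("custom object variables start with an underscore")
    return "NAGIOS__SERVICE" + custom_variable[1:].upper()


def connection_from_environment(environ=None):
    """Collect SID, user and password; missing values are returned as None."""
    environ = os.environ if environ is None else environ
    return {
        name: environ.get(env_name_for_service_variable(name))
        for name in ("_oracle_sid", "_oracle_user", "_oracle_pass")
    }


# Hypothetical environment, as Nagios might provide it to the check command.
sample = {"NAGIOS__SERVICEORACLE_SID": "BBA", "NAGIOS__SERVICEORACLE_USER": "nagios"}
print(connection_from_environment(sample))
```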
Installation
The installation of the perl-modules DBI and DBD::Oracle is required.
After unpacking the archive, run ./configure. ./configure --help prints the available options together with their default values.
- --prefix=BASEDIRECTORY Specify a directory in which check_oracle_health should be stored. (default: /usr/local/nagios)
- --with-nagios-user=SOMEUSER This user will be the owner of the check_oracle_health file. (default: nagios)
- --with-nagios-group=SOMEGROUP The group of the check_oracle_health plugin. (default: nagios)
- --with-perl=PATHTOPERL Specify the path to the perl interpreter you wish to use. (default: perl in PATH)
Download
check_oracle_health-1.7.3.tar.gz
check_oracle_health-1.7.3.shar.gz
Some versions of tar have problems with the long filenames. In this case please unpack the shar archive with cat check_oracle_health-xxx.shar.gz | gzip -d | sh
Changelog
- 2011-09-29 1.7.3
  mode sql now correctly handles dml sql errors like missing tables etc.
  single ticks around the --name argument under Windows CMD will be removed automatically
  add mode sga-library-cache-pinhit-ratio
  sga-library-cache-hit-ratio becomes sga-library-cache-gethit-ratio
  add mode sga-library-cache-reloads
- 2011-09-23 1.7.2
  better error handling for mode sql
- 2011-08-17 1.7.1
  add option --commit (Thanks Ovidiu)
- 2011-08-16 1.7.0
  add error handling for unwritable status files
  fix a bug with statefilesdir and capital letters
  enhance stale statistics
  enhance invalid objects (Thanks Yannick Charton)
  fix a bug in open cmdcmd (only affects method sqlplus)
- 2011-06-16 1.6.9
  add modes session-usage, process-usage, rman-backup-problems, corrupted-blocks, datafiles-existing (Thanks Ovidiu Marcu)
  sites in an OMD (http://omdistro.org) environment now have private statefile directories
- 2011-01-08 1.6.8.1
  fixed a bug which led to leftover temporary files under Windows. (Thanks Heiko)
- 2011-01-05 1.6.8
  massive speedup in modes seg-top10-* (Thanks Michael Nieberg http://kenntwas.de)
  bugfix in --mode sql (numeric vs. regexp result) (Thanks Michel Meelker)
- 2010-12-20 1.6.7
  mode sql can now have a non-numerical output which is compared to a string/regexp
  new mode report can be used to output only the bad news (short,long,html)
  better error message with method sqlplus when db is down
- 2010-10-01 1.6.6.1
  --dbthresholds can have an argument
  added a workaround for an oracle-bug in shared-pool-free. Thanks Yannick
- 2010-08-12 1.6.6
  new parameter --dbthresholds. thresholds can now also be stored in the table check_oracle_health_thresholds
  bugfix in connection-time. dbuser was uninitialized in rare cases
- 2010-08-09 1.6.5
  added modes flash-recovery-area-usage and flash-recovery-area-free
  plugin can now run on Windows
  new commandline parameter --with-mymodules-dyn-dir (precedes configure-option of the same name)
- 2010-06-10 1.6.4
  added checking of dba_registry to mode invalid-objects. Thanks Ovidiu Marcu
  speedup of tablespace-remaining-time. Thanks Steffen Poulsen
  switch-interval detects redo log timestamps in the future and reports critical
  method sqlplus now works with "(DESCRIPTION =(ADDRESS = (PROTOCOL = TCP"-like connectstrings
  new parameter --ident to show instance and database names in the output
  bugfix in tablespace-usage (temp tbs with multiple datafiles). Thanks Philipp Lemke
- 2009-09-09 1.6.3
  tablespace-can-allocate-next was optimized.
  Illegal statefile names were fixed. Thanks Franky van Liedekerke.
  Bugfix in tablespace-usage under Oracle 8.1.x
  switch-interval now works more precisely. Thanks Naquada.
  Passwords don't show up in error messages any more. Thanks Jens Seiffert.
  Bugfix in mode sql. (Decimal values with .5 led to errors). Thanks Shane Jordan.
  Bugfix in sga-latches-hitratio, thresholds were ignored. Thanks Yannik Charton.
  The parameter --user is now --username (user still works)
- 2009-04-05 1.6.2 Bugfix in tablespace-usage/free due to non-autoextensible TEMP-Tablespaces. (Thanks Daniel Graef)
- 2009-03-27 1.6.1 --mode=tablespace-usage|free now recognizes offline tablespaces. (Thanks Daniel Graef)
- 2009-03-11 1.6 Support for DBD::SQLRelay. Mode sql can print out multiple values (Thanks Juergen Lesny). Login as “sys” possible (Thanks Joerg Horchler). Bugfix when using warning/critical=0 (Thanks Danijel Tasov)
- 2008-10-28 1.5.0.1 Bugfix due to , instead of . in decimal values. mode=sql output will be rounded to 2 places after the decimal point. Bugfix in mode=sga-shared-pool-free. (Thanks Birk Bohne)
- 2008-10-15 1.5 New authentication methods password store and as sysdba. New mode tablespace-free. New parameter --units when using mode=sql and mode=tablespace-free. Mode switch-interval considers RAC (Thanks Harald Zahn).
- 2008-09-19 1.4.2.1 New parameter --regexp supplements --name. Bugfix in tablespace-usage (>100% when using resize datafile)
- 2008-09-10 1.4.1 New mode tablespace-can-allocate-next, handling of locked accounts, timeout bugfix, encode, expired extents in UNDO tablespace are considered, bugfix regarding mode=sql and NULL values (Thanks Viktor Käfer), mode=top10* optimized.
- 2008-07-09 1.4.0.1 Bugfixes (--name=0, --method=sqlplus), --invalid-objects and --stale-statistics now consider thresholds (Thanks Konrad Barck)
- 2008-07-03 1.4 Statesdir is now /var/tmp/check_oracle_health, Bugfixes in latch-contention, systats and roll-extends. Performance improvements.
- 2008-07-01 1.3.1.1 Bugfix in method=sqlplus and os$user, Bugfix in tablespace-usage when using Temp-Tablespaces, better performancevalues for pga-in-memory-sort-ratio
- 2008-06-26 1.3.1 Code cleanup, Bugfix in connected-users Thresholds
- 2008-06-24 1.3 data-buffer/library/dictionary-cache-hitratio are now more precise, tablespace-usage considers autoextents, sqlplus, code cleaned up
- 2008-06-20 1.2.7 Bugfixes in top10-x and pga-in-memory-sort. New mode sql. Unrecoverable datafiles removed from invalid-objects (will get their own mode later)
- 2008-06-16 1.2.6.1 New modes sysstat list-sysstats
- 2008-06-14 1.2.6 New modes event-waited event-waits list-events
- 2008-06-11 1.2.5.1 internal changes
- 2008-06-03 1.2.5 New modes latch-contention enqueue-contention enqueue-waiting connected-users list-latches list-enqueues
- 2008-05-27 1.2.4.1 New modes list-tablespaces and list-datafiles (no Monitoringfunction)
- 2008-05-27 1.2.4 New modes datafile-io-traffic and redo-io-traffic
- 2008-05-25 1.2.3.1 stale-statistics now run under Oracle 9.x
- 2008-05-25 1.2.3 New modes --roll-block-contention, --roll-hit-ratio, Bugfix in --switch-interval
- 2008-05-23 1.2.2.1 Modes, that require Oracle 10.x are disabled with Oracle 9.x/8.x
- 2008-05-21 1.2.2 Bugfix in --environment
- 2008-05-19 1.2.1 sga-buffer-cache-hit-ratio now shows percent (thx Maik Ihde), new parameters --runas --environment, support for externally authenticated users, Bugfix in tablespace-remaining-time
- 2008-05-06 1.2 connection timeout handling, stale-statistics
- 2008-05-02 1.1 tablespace-remaining-time, tablespace-io-balance
- 2008-04-16 1.0 first public version
Copyright
2008 Gerhard Laußer
Check_oracle_health is published under the GNU General Public License. GPL
Author
Gerhard Laußer (gerhard.lausser@consol.de) gladly answers questions about this plugin.
Translation
Thanks to Christian Lauf there is finally an English translation of this page :-)
275 Responses to “check_oracle_health”
-
Marco Says:
October 7th, 2009 at 12:58
Hello,
I am currently testing your tool. I am very impressed by it, because it saves me a lot of work. Unfortunately I have a small problem with mode sql:
./check_oracle_health --connect '(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=172.16.102.103)(PORT=1521))(CONNECT_DATA=(SID=XYZ)))' --user nagios --password oradbmon09 --mode sql --name 'select count(*) from v$session' --name2 sessions --warning 100 --critical 150
Result: WARNING - sessions: 21 | 'sessions'=21;20;30
Once integrated into Nagios, the only status information I get is OK - sessions: The warning should actually appear here as well.
Script:
define command{
    command_name check_oracle_per_sql
    command_line $USER1$/check_oracle_health --connect '(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=$HOSTADDRESS$)(PORT=1521))(CONNECT_DATA=(SID=$ARG3$)))' --user $ARG1$ --password $ARG2$ --mode sql --name 'select count(*) from v\$session' --name2 sessions --warning 5 --critical 10
}
define service{
    use local-service
    host_name test2_ch
    service_description Count Open Sessions2
    check_command check_oracle_per_sql!nagios!abc!xyz
}
Performance data is not written either.
The tablespace mode works for me.
Regards, Marco
-
lausser Says:
October 8th, 2009 at 10:24
Hello, this happens because Nagios is sensitive to the dollar sign in its configuration files. With --name 'select count(*) from v$$session' in the command definition it should work. An alternative is to encode SQL statements that contain special characters beforehand. This comes at the cost of readability, but you no longer have to worry about single/double quotes, escaping and so on.
echo 'select count(*) from v$session' | check_oracle_health --mode encode
select%20count%28%2A%29%20from%20v%24session
command_line $USER1$/check_oracle_health ..... --name select%20count%28%2A%29%20from%20v%24session ....
Gerhard
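The encoded string in this reply looks like standard URL percent-encoding. Assuming that is what --mode encode produces (an inference from the output shown, not a documented guarantee), the same string can be generated with Python's standard library:

```python
from urllib.parse import quote


def encode_statement(sql):
    """Percent-encode every character outside letters, digits and "_.-~"."""
    return quote(sql, safe="")


encoded = encode_statement("select count(*) from v$session")
print(encoded)  # select%20count%28%2A%29%20from%20v%24session
```

Because the encoded form contains neither quotes nor a dollar sign, it can be placed in a Nagios command_line without further escaping.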
-
Marco Says:
October 9th, 2009 at 8:50
Wonderful, it works.
I have one more question. Could it be that no performance data is written for some checks, e.g. for sga-shared-pool-reloads?
Marco
Chris Reply:
March 11th, 2011 at 15:57
I have a few questions:
Does ORACLE_CONNECTED-USERS measure number of connected sessions, or processes? Is there any warning or critical threshold, or no?
Does ORACLE_DATAFILE-IO-TRAFFIC have a warning or critical threshold associated with it? Or does it just report the traffic?
Does ORACLE_STALE_STATISTICS have a warning or critical threshold associated with it? If so, what is it?
Thanks in advance for your answers.
-
Marco Says:
October 9th, 2009 at 9:29
If I do not pass any thresholds, it writes nothing. If I specify some, performance data is written as well.
-
Marco Says:
October 9th, 2009 at 12:25
I have the same version. In my case no performance data was written to "/usr/local/nagios/share/perfdata". After I specified the thresholds, it worked. Maybe it was my fault.
Now I do have another question after all: I want to output "sga-data-buffer-hit-ratio". Result: CRITICAL - SGA data buffer hit ratio 43.73% | sga_data_buffer_hit_ratio=43.73%;98:;95:
In SQL*Plus I get the following:
SELECT ROUND((1-(phy.value / (cur.value + con.value)))*100,2) "Cache Hit Ratio" FROM v$sysstat cur, v$sysstat con, v$sysstat phy WHERE cur.name = 'db block gets' AND con.name = 'consistent gets' AND phy.name = 'physical reads';
Cache Hit Ratio
86.36
Am I misinterpreting something?
lausser Reply:
October 9th, 2009 at 16:07
"db block gets" and the other values are simply counted upwards. With your SQL statement you compute the hit ratio over the entire uptime of the instance. At some point that value becomes very imprecise, or changes only very slowly. check_oracle_health uses the deltas of these counters (between the current and the previous plugin run) for the calculation. That way you always get a current value (the average over the check_interval).
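The effect described in this reply can be reproduced with a short Python sketch. It applies the formula from the SQL statement in the question once to the cumulative counters and once to their deltas. The counter readings are hypothetical, chosen so that the results resemble the numbers in this thread, and the plugin's exact calculation may differ.

```python
def hit_ratio(db_block_gets, consistent_gets, physical_reads):
    """Buffer cache hit ratio in percent, as in the SQL statement above."""
    logical_reads = db_block_gets + consistent_gets
    if logical_reads == 0:
        return None  # no activity, the ratio is undefined
    return round((1 - physical_reads / logical_reads) * 100, 2)


def interval_hit_ratio(previous, current):
    """Hit ratio computed only from the counter increase between two runs."""
    delta = {name: current[name] - previous[name] for name in current}
    return hit_ratio(delta["db block gets"], delta["consistent gets"],
                     delta["physical reads"])


# Hypothetical v$sysstat readings from two consecutive plugin runs.
previous = {"db block gets": 1_000_000, "consistent gets": 9_000_000,
            "physical reads": 1_364_000}
current = {"db block gets": 1_010_000, "consistent gets": 9_090_000,
           "physical reads": 1_420_270}

lifetime = hit_ratio(current["db block gets"], current["consistent gets"],
                     current["physical reads"])
recent = interval_hit_ratio(previous, current)
print(f"since instance start: {lifetime}%  last interval: {recent}%")
```

The cumulative figure barely moves even though the hit ratio during the last interval is poor, which is why the two numbers in the question differ.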
-
Marco Says:
October 15th, 2009 at 15:04
I have yet another question: Can I also exclude some tablespaces with "tablespace-usage"? I want to monitor all tablespaces except, for example, sysaux and system. Is that possible?
lausser Reply:
October 15th, 2009 at 15:17
The parameter --regexp makes the plugin interpret the --name parameter (which is normally used to query one specific tablespace) as a regular expression. So if you write --name so that the pattern matches all names except SYSTEM and SYSAUX, those two will not be shown.
... --name='^(?!(SYSTEM|SYSAUX))' --regexp
Gerhard
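The negative lookahead in that pattern can be tried out in Python, whose re module treats this expression the same way as Perl. The list of tablespace names is partly hypothetical (SYSTEM_BACKUP is made up), and how the plugin applies the pattern internally is an assumption.

```python
import re

# Pattern from the reply: matches only names that do not start with SYSTEM or SYSAUX.
pattern = re.compile(r"^(?!(SYSTEM|SYSAUX))")

tablespaces = ["SYSTEM", "SYSAUX", "USERS", "UNDOTBS1", "SYSTEM_BACKUP"]
selected = [name for name in tablespaces if pattern.search(name)]
print(selected)  # ['USERS', 'UNDOTBS1']

# Variant that excludes only the exact names and keeps SYSTEM_BACKUP.
exact = re.compile(r"^(?!(SYSTEM|SYSAUX)$)")
selected_exact = [name for name in tablespaces if exact.search(name)]
print(selected_exact)  # ['USERS', 'UNDOTBS1', 'SYSTEM_BACKUP']
```

Note that the original pattern also drops any tablespace whose name merely begins with SYSTEM or SYSAUX.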
-
Manuel Says:
October 18th, 2009 at 3:22
Hello from Spain, where can I download check_oracle_health?
Thanks
lausser Reply:
October 18th, 2009 at 16:11
Scroll up about 80 cm to the heading "Download".
-
guzik Says:
October 28th, 2009 at 11:35
Hi, I've got a problem with the check_oracle_health plugin. Status information in my Nagios: **ePN /usr/lib64/nagios/plugins/check_oracle_health: "Can't exec "/usr/sbin/p1.pl": Permission denied at (eval 1) line 5254,". A few of the checks work fine, the rest have a problem. From the console there is no problem executing the script. What can I do to fix the check services?
lausser Reply:
October 29th, 2009 at 20:34
Hi, please add one line in the first 10 lines of the plugin:
# nagios: -epn
This prevents Nagios from executing check_oracle_health with the embedded Perl interpreter. I will add this to the next release of the plugin as the default.
Gerhard
Peter Reply:
August 4th, 2010 at 15:03
@lausser, Hello, I ran my first test with this script. Calling it from the shell works, but when I add it to the Nagios configuration and Nagios executes the check, I get the following error message:
CRITICAL - cannot connect to Instanz.WORLD. install_driver(Oracle) failed: Can't load '/usr/lib/perl5/site_perl/5.8.8/x86_64-linux-thread-multi/auto/DBD/Oracle/Oracle.so' for module DBD::Oracle: libnnz10.so: cannot open shared object file: No such file or directory at /usr/lib/perl5/5.8.8/x86_64-linux-thread-multi/DynaLoader.pm line 230. at (eval 16) line 3 Compilation failed in require at (eval 16) line 3. Perhaps a required shared library or dll isn't installed where expected at /usr/local/nagios/libexec/check_oracle_health line 3977
I interpret the error message to mean that the problem lies with the embedded Perl interpreter, because it knows nothing about the Perl modules.
I then inserted the entry # nagios: -epn in the 3rd line of the script, but the error still occurs.
Do I have to restart Nagios, or where could the problem be?
Best regards, Peter
lausser Reply:
August 4th, 2010 at 15:51
In the shell you have the Oracle environment (ORACLE_HOME, LD_LIBRARY_PATH etc.), but when Nagios is started via an init script, the nagios process knows nothing about it. The best approach is to add
export ORACLE_HOME=...
export LD_LIBRARY_PATH=${ORACLE_HOME}/lib:${LD_LIBRARY_PATH}
...
to /etc/init.d/nagios. Greetings to Zandt! (I come from Cham)
-
Steffen Poulsen Says:
November 9th, 2009 at 17:29
I tried to run check_oracle_health on an installation using perl, version 5.005_03, and it barked at me:
Use of reserved word "our" is deprecated at check_oracle_health.pl line 9.
Bareword "our" not allowed while "strict subs" in use at check_oracle_health.pl line 9.
Unquoted string "our" may clash with future reserved word at check_oracle_health.pl line 9.
Array found where operator expected at check_oracle_health.pl line 9, at end of line (Do you need to predeclare our?)
syntax error at check_oracle_health.pl line 9, near "our @ISA "
Global symbol "@ISA" requires explicit package name at check_oracle_health.pl line 9.
BEGIN not safe after errors--compilation aborted at check_oracle_health.pl line 80.
We are very fond of your plugin and would like to use it on this installation as well. Is there by any chance a drop-in replacement for the "our @ISA" construct that would allow the check to run on this old installation too?
Best regards, Steffen Poulsen
lausser Reply:
November 11th, 2009 at 22:35I think, this would require a major rewrite of the plugin. Can’t you run it on the Nagios server and check the database with a remote connection? Gerhard
-
SweetBiene91 Says:
November 11th, 2009 at 0:57
Hey there, I'm so lonely, anyone want to chat or something?
lausser Reply:
November 11th, 2009 at 13:50
Try this: irc server : irc.irclink.net port : 6667 channel : #nagios
-
Manfred Says:
November 12th, 2009 at 18:17
Is there an option (e.g. quiet) that prints only the values that trigger a warning or critical? With more than 30 tablespaces (with --mode=tablespace-free) it is almost impossible to find the one that triggered the warning. In addition, the output in Nagios becomes very cluttered and far too long. I already tried to add an "if" in the source to suppress the output, but failed, because then the warnings themselves no longer appear. For example, the original Nagios filesystem check also prints only those that are warning or critical.
lausser Reply:
November 12th, 2009 at 19:01
In that case I would recommend using check_multi. It has the additional advantage that thresholds can be changed in the check_multi config file without having to restart Nagios. If you specify the tablespace names as labels, you get a concise output:
30 plugins checked, 1 critical (TBS_1), 1 warning (TBS_25), 0 unknown, 28 ok
-
fsom Says:
November 20th, 2009 at 15:54
Great script! Everything works so far, except that I'm stuck with mode=sql (v1.6.3):
./check_oracle_health --connect=DB --user=xxxxxx --password=yyyyyy --mode=sql --name="select count(*) from v$session where status = 'ACTIVE'"
Use of uninitialized value in numeric gt (>) at /usr/lib/nagios/plugins/check_oracle_health line 3615.
Use of uninitialized value in numeric gt (>) at /usr/lib/nagios/plugins/check_oracle_health line 3616.
OK - select count(*) from v where status = 'active':
I get nothing back from the SQL statement. Am I doing something wrong? Thanks, fsom
lausser Reply:
November 20th, 2009 at 16:42
You have to escape the dollar sign. Your statement: from v$session where… Output: from v where… To the shell, $session looks like a variable, and since it does not exist, the shell turns it into an empty string. Write …from v\$session… instead. If the SQL statement is more complicated and contains many such special characters, you can also encode it. To do so, call check_oracle_health with the parameter "--mode encode". It then reads from standard input. You type your statement (without having to worry about escaping special characters) and finish it with RETURN.
$ check_oracle_health --mode encode
select count(*) from v$session where status = 'ACTIVE'
select%20count%28%2A%29%20from%20v%24session%20where%20status%20%3D%20%27ACTIVE%27
As output you get the statement in encoded form, which you can now pass to --name without having to worry about dollar signs or any quotation marks.
-
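The encoded string in the reply above looks like plain percent-encoding. The following is a minimal sketch of that idea in Python, under the assumption that every character outside the unreserved set is encoded; `encode_statement` is a hypothetical helper name, not part of check_oracle_health, and the plugin's own encoder may differ in detail.

```python
from urllib.parse import quote, unquote


def encode_statement(sql: str) -> str:
    """Percent-encode every character that is not a letter, digit or one of _.-~"""
    # safe="" makes quote() encode "/" as well, so nothing shell-relevant survives
    return quote(sql, safe="")


statement = "select count(*) from v$session where status = 'ACTIVE'"
encoded = encode_statement(statement)
print(encoded)

# Decoding gives the original statement back unchanged
assert unquote(encoded) == statement
```

The encoded form contains no `$`, quotes or spaces, so the shell has nothing left to expand or split.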
Bas de Klerk Says:
November 28th, 2009 at 17:55
Hi,
thanks for your great plugin. It saves me a lot of time!
One small problem I'm having in version 1.6.3 is that sga-data-buffer-hit-ratio sometimes drops to 0%. I have no clue why, but sometimes it does. If I calculate it by hand using the statement below, the values are fine. If you need any additional info, please let me know. For now I've made a workaround using --mode=sql.
Regards Bas
SELECT ((P1.value + P2.value - P3.value) / (P1.value + P2.value))*100 ratio FROM v$sysstat P1, v$sysstat P2, v$sysstat P3 WHERE P1.name = 'db block gets' AND P2.name = 'consistent gets' AND P3.name = 'physical reads';
-
lausser Says:
November 28th, 2009 at 20:48
I use the deltas (the difference to the counter value when check_oracle_health was last run) for the calculation. E.g. the "physical reads" I use for the calculation is "value of physical reads now - value of physical reads approx. 5 minutes ago". This way the hit rate reflects the current state of the buffer cache. In your formula you use the counters, which have been increasing since the database was started, so it's an average hit rate over the whole lifetime. But isn't it more interesting to get the current hit rate? When you get 0% sometimes, it actually means a hit rate of 0% (at least during the last check_interval). This is some kind of a "negative spike". But I understand the problem. I will introduce a parameter "--lookback" which takes a number of minutes as argument. This way, you can for example measure the hit rate during the last 30 minutes, which is pretty up to date, but gives you much smoother results.
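The difference between the lifetime ratio and the delta-based ratio can be illustrated with a small sketch. The counter values below are invented for illustration, and `hit_ratio` and `delta` are hypothetical helper names; the formula itself is the one from the SQL statement quoted earlier in this thread.

```python
def hit_ratio(block_gets, consistent_gets, physical_reads):
    """Buffer hit ratio in percent, or None when there were no logical reads."""
    logical_reads = block_gets + consistent_gets
    if logical_reads == 0:
        return None
    return (logical_reads - physical_reads) * 100 / logical_reads


def delta(now, before):
    """Per-counter increase between two samples of the same counters."""
    return {name: now[name] - before[name] for name in now}


# Invented cumulative counters at two consecutive check runs
before = {"db block gets": 1_000_000, "consistent gets": 9_000_000, "physical reads": 500_000}
now = {"db block gets": 1_000_100, "consistent gets": 9_000_900, "physical reads": 501_000}

lifetime = hit_ratio(now["db block gets"], now["consistent gets"], now["physical reads"])
d = delta(now, before)
current = hit_ratio(d["db block gets"], d["consistent gets"], d["physical reads"])

print(f"lifetime ratio: {lifetime:.2f}%")  # barely moves, dominated by history
print(f"current ratio:  {current:.2f}%")   # reflects only the last interval
```

In this made-up interval every logical read went to disk, so the delta-based ratio is 0% while the lifetime ratio stays close to 95%. A longer lookback window averages over more samples and smooths out such spikes.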
-
Andreas Says:
December 12th, 2009 at 8:44
Hello, only about 50% of my queries work, e.g. TNSPING, CON.-TIME, CON.-USERS, invalid-objects. But for some queries, e.g. sga-data-buffer-hit-ratio, I get the following error message in Nagios: **ePN /usr/lib/nagios/plugins/check_oracle_health: printf() on closed filehandle STATE at (eval 1) line 3841,. I have already added "-epn". On the command line, however, the query works. Thanks!
lausser Reply:
December 12th, 2009 at 14:45
Could it be that you ran check_oracle_health on the command line as root? The plugin stores intermediate results in the directory /var/tmp/check_oracle_health, which is created automatically. If the directory belongs to root, a check_oracle_health process running under the Nagios account can no longer write into it. The error message points to this. A "chown -R nagios:nagios /var/tmp/check_oracle_health" should solve the problem.
-
Erlon Says:
December 14th, 2009 at 14:41
Where can I find the download link?
lausser Reply:
December 14th, 2009 at 15:30Scroll up until you see the topic “Download”
-
Erlon Says:
December 14th, 2009 at 16:54
But there is no "Download" topic!
-
Erlon Says:
December 14th, 2009 at 20:41
OK, I can see it now. I didn't see it before because I was viewing the page in English, and in English this link doesn't exist.
-
Don Seiler Says:
January 8th, 2010 at 1:14Are there plans for ASM checks, such as disk group free space (v$asm_diskgroup.usable_file_mb)?
lausser Reply:
January 8th, 2010 at 11:40
No, I actually have no plans (mostly because I'm too occupied with other things). But if you look in the contribs subdirectory, you'll find a description of how you can extend check_oracle_health with your own custom modes. You simply put the code (mostly the SQL statements) in a separate file which is sourced at runtime. Perhaps you want to play around with this and post the result. If it works, I will gladly add it to the core plugin.
Don Seiler Reply:
January 14th, 2010 at 0:16@lausser, I’d love to do this if I have some time later. Thanks.
-
Millet JC Says:
January 13th, 2010 at 11:29Hello All
I have a small compilation error on a Solaris system. I'm not an expert, but I think it's linked to my environment:
./configure runs successfully.
make gives me this error:
Making all in plugins-scripts
make: Fatal error: Don't know how to make target `Nagios/DBD/Oracle/Server/Instance/SGA/SharedPool/DictionaryCache.pm'
Current working directory /tmp/check_oracle_health-1.6.3/plugins-scripts
*** Error code 1
The following command caused the error:
failcom='exit 1'; \
for f in x $MAKEFLAGS; do \
case $f in \
*=* | --[!k]*);; \
*k*) failcom='fail=yes';; \
esac; \
done; \
dot_seen=no; \
target=`echo all-recursive | sed s/-recursive//`; \
list='plugins-scripts t'; for subdir in $list; do \
echo "Making $target in $subdir"; \
if test "$subdir" = "."; then \
dot_seen=yes; \
local_target="$target-am"; \
else \
local_target="$target"; \
fi; \
(cd $subdir && make $local_target) \
|| eval $failcom; \
done; \
if test "$dot_seen" = "no"; then \
make "$target-am" || exit 1; \
fi; test -z "$fail"
make: Fatal error: Command failed for target `all-recursive'
lausser Reply:
January 13th, 2010 at 16:01
Looks like your tar command does not support filenames which exceed 100 characters (I think SuSE has such a tar). Instead of the tar.gz please download the shar.gz and unpack it with
[sourcecode]cat check_oracle_health-xxx.shar.gz | gzip -d | sh[/sourcecode]
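The 100-character limit mentioned above corresponds to the size of the name field in the classic tar header. A quick sketch of the check follows; the archive-internal path is an assumption pieced together from the working directory and the target name in the error output, and `exceeds_classic_tar_limit` is a hypothetical helper name.

```python
TAR_NAME_LIMIT = 100  # bytes available for the file name in a classic tar header


def exceeds_classic_tar_limit(path: str) -> bool:
    """True if an old tar without long-name support cannot store this path."""
    return len(path.encode("utf-8")) > TAR_NAME_LIMIT


# Assumed member path: top-level directory + subdirectory + the missing make target
member = (
    "check_oracle_health-1.6.3/plugins-scripts/"
    "Nagios/DBD/Oracle/Server/Instance/SGA/SharedPool/DictionaryCache.pm"
)

print(len(member), exceeds_classic_tar_limit(member))
```

Under that assumption the path is longer than the limit, which would explain why the file never made it onto disk and make cannot find it.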
-
Rascal Says:
January 25th, 2010 at 20:44
Hello, I am not a database person, only a "monitor", hence my question: Is there a way for the plugin to keep the database connection open? Because the connection is constantly being set up and torn down, the log files on the DB are swelling. Or does something need to be changed in the database config?
-
lausser Says:
January 26th, 2010 at 13:13
With http://sqlrelay.sourceforge.net/ you can run a proxy that keeps the connection open. This eliminates the login messages in the log file.
check_oracle_health --method sqlrelay --connect <proxy-ip>:<proxy-port> --username <proxy-user> --password <proxy-password> ...
-
Frank Says:
February 15th, 2010 at 18:15
Hello, on the command line the query works as user nagios. In Nagios itself the following error message appears: ePN failed to compile /usr/lib/nagios/plugins/check_oracle_health "Missing right curly or square bracket at (eval 18) line 4193, at end of line syntax error at (eval 18) line 4200, at EOF at /usr/lib/nagios/p1.pl line 155"
The line "# nagios: -epn" is already in the script.
Could it be because Nagios is still v2.9? Is there a way to get this running under that version?
lausser Reply:
February 16th, 2010 at 2:20
Selectively disabling ePN with -epn is only available as of version 3. Unfortunately, that leaves only an upgrade to 3.x or doing without ePN entirely.
-
John Tomawski Says:
February 16th, 2010 at 23:30
Be sure to set the --environment flag when required. The flag can be used to set things such as TNS_ADMIN, etc.
Hopefully this comment saves someone 2 hours… sigh
ex. --environment TNS_ADMIN='/usr/lib/oracle/bleh'
-
Aldo Says:
February 19th, 2010 at 12:52when running the following command:
./check_oracle_health --connect REMOTE --username $ORAUSER --password $ORAPWD --mode tablespace-usage --tablespace USERS
I get the following error message.
Use of uninitialized value in split at /usr/lib/nagios/plugins/check_oracle_health line 3924. bumm Can’t call method “execute” on an undefined value at /usr/lib/nagios/plugins/check_oracle_health line 4230.
Can’t use an undefined value as an ARRAY reference at /usr/lib/nagios/plugins/check_oracle_health line 4242.
and now I'm clueless about what to do. Can you assist me with this one?
thanks in advance
lausser Reply:
February 20th, 2010 at 0:12
Did you give the necessary privileges to your ORAUSER?
CREATE USER nagios IDENTIFIED BY oradbmon;
GRANT CREATE SESSION TO nagios;
GRANT SELECT any dictionary TO nagios;
GRANT SELECT ON V_$SYSSTAT TO nagios;
GRANT SELECT ON V_$INSTANCE TO nagios;
GRANT SELECT ON V_$LOG TO nagios;
GRANT SELECT ON SYS.DBA_DATA_FILES TO nagios;
GRANT SELECT ON SYS.DBA_FREE_SPACE TO nagios;
You can also create an empty file /tmp/check_oracle_health.trace with the touch command. As long as this file exists, check_oracle_health will write debugging messages into it. You should see the SQL statements sent to the database server and the responses. Maybe this gives you an idea of what's wrong.
-
Hans-Jürgen Says:
February 22nd, 2010 at 11:03
Hello,
we have been using check_oracle_health for quite a while and are very happy with it. Thank you for that. For the tablespaces that have auto-extent switched on, we would like to change the monitoring from tablespace-usage to tablespace-can-allocate-next. Does this check both whether there is still enough space and whether MAX_EXTENT has already been reached?
lausser Reply:
February 22nd, 2010 at 12:51
Hello, as far as I know, max_extent is not looked at. If you create an empty file with touch /tmp/check_oracle_health.trace (writable by the Nagios user), the SQL statements issued and their results are logged there.
Günter Reply:
April 13th, 2010 at 14:14
@lausser, will there be a way in the future to exclude autoextent tablespaces in tablespace-usage?
Günter Reply:
April 13th, 2010 at 15:17
@Günter, never mind. I just saw in the trace file that autoextent tablespaces are taken into account, i.e. the maximum size is used.
lausser Reply:
April 13th, 2010 at 19:05
You can also exclude certain tablespaces with a regular expression:
--name='^(?!(TABLESPACE1$)|(TABLESPACE2$)|(TABLESPACE3$))' --regexp
means: everything except TABLESPACE1, TABLESPACE2, TABLESPACE3
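The negative lookahead in that expression can be tried out in isolation. Below is a small sketch in Python, whose regex syntax agrees with Perl's for this pattern; the tablespace names are made up, `selected` is a hypothetical helper, and the assumption is that the plugin applies the pattern as a plain case-sensitive match against each name.

```python
import re

# Same pattern as in the --name argument above
EXCLUDE = re.compile(r"^(?!(TABLESPACE1$)|(TABLESPACE2$)|(TABLESPACE3$))")


def selected(names):
    """Keep only the names that the pattern accepts."""
    return [name for name in names if EXCLUDE.search(name)]


names = ["SYSTEM", "TABLESPACE1", "TABLESPACE10", "TABLESPACE2", "USERS"]
print(selected(names))
```

Because each alternative ends in `$`, only exact names are excluded: TABLESPACE10 is still selected even though it begins with TABLESPACE1.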
-
angry_admin Says:
February 24th, 2010 at 13:49 -
Rik Says:
February 26th, 2010 at 16:21
Thank you Gerhard for an excellent plugin. Here is a tiny correction to the documentation on this page: --method accepts two arguments: dbi (not tns) or sqlplus. Or am I misinterpreting things?
lausser Reply:
February 26th, 2010 at 19:19Thanks! “tns” was how i named it in a very early phase. Later it was replaced by the less misleading “dbi”.
-
Thomas Says:
March 4th, 2010 at 11:57
Hello,
this all sounds very nice. I would like to try it out, but unfortunately I cannot find a download link anywhere (not even what is by now 1.20 m further up). Have I overlooked something?
Thanks, Thomas
lausser Reply:
March 4th, 2010 at 15:16
You probably landed on the English page, which does not exist (but on which the comments are displayed). The download link is on this page: http://labs.consol.de/lang/de/nagios/check_oracle_health/
-
Thomas Says:
March 4th, 2010 at 17:27
Hello,
unfortunately I am having trouble installing the Oracle Instant Client. Does anyone know of a page that deals with this topic?
Many thanks, Thomas
Max Reply:
March 12th, 2010 at 17:05
@Thomas, Hello Thomas, have a look here: http://samushka.blogspot.com/2009/04/installing-oracle-sqlplus-in-ubuntu.html
-
Steffen Poulsen Says:
March 25th, 2010 at 18:09
When using --mode=tablespace-remaining-time, our experience is that it is somewhat slow on some machines. E.g. on the machine below it takes more than 60 seconds to process 34 tablespaces.
Apparently each status file takes two seconds to process on this particular machine (some trace output pasted below), and as this machine has a new tablespace automatically added each week, this is not going to get any better by itself any time soon :-)
We are aware that we could split the tablespace checking into separate checks and do each tablespace individually, but if you happen to have an idea for making this mode run a bit faster, so that all tablespaces could be checked within a timeframe of, say, 60 seconds, that would be a clear number 1 :-)
Best regards, Steffen Poulsen
$ uname -a SunOS 5.10 Generic_141414-07 sun4v sparc SUNW,SPARC-Enterprise-T5220
./check_oracle_health --mode=tablespace-remaining-time --lookback=15 --warning=10: --critical=2: …
Thu Mar 25 15:10:52 2010: loaded 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:10:52 2010 Thu Mar 25 15:10:52 2010: trimmed to 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:10:52 2010 Thu Mar 25 15:10:54 2010: loaded 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:10:54 2010 Thu Mar 25 15:10:54 2010: trimmed to 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:10:54 2010 Thu Mar 25 15:10:56 2010: loaded 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:10:56 2010 Thu Mar 25 15:10:56 2010: trimmed to 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:10:56 2010 Thu Mar 25 15:10:58 2010: loaded 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:10:58 2010 Thu Mar 25 15:10:58 2010: trimmed to 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:10:58 2010 Thu Mar 25 15:11:00 2010: loaded 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:00 2010 Thu Mar 25 15:11:00 2010: trimmed to 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:00 2010 Thu Mar 25 15:11:02 2010: loaded 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:02 2010 Thu Mar 25 15:11:02 2010: trimmed to 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:02 2010 Thu Mar 25 15:11:04 2010: loaded 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:04 2010 Thu Mar 25 15:11:04 2010: trimmed to 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:04 2010 Thu Mar 25 15:11:06 2010: loaded 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:06 2010 Thu Mar 25 15:11:06 2010: trimmed to 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:06 2010 Thu Mar 25 15:11:08 2010: loaded 5822 data sets from Thu Feb 25 11:00:46 2010 – Thu Mar 25 15:11:08 2010 Thu Mar 25 15:11:08 2010: trimmed to 5822 data sets from Thu Feb 25 11:00:46 2010 – Thu Mar 25 15:11:08 2010 Thu Mar 25 15:11:09 2010: loaded 3806 data sets from Thu Mar 4 11:00:53 
2010 – Thu Mar 25 15:11:09 2010 Thu Mar 25 15:11:09 2010: trimmed to 3806 data sets from Thu Mar 4 11:00:53 2010 – Thu Mar 25 15:11:09 2010 Thu Mar 25 15:11:10 2010: loaded 1790 data sets from Thu Mar 11 11:00:59 2010 – Thu Mar 25 15:11:10 2010 Thu Mar 25 15:11:10 2010: trimmed to 1790 data sets from Thu Mar 11 11:00:59 2010 – Thu Mar 25 15:11:10 2010 Thu Mar 25 15:11:10 2010: loaded 5 data sets from Mon Mar 22 14:41:08 2010 – Thu Mar 25 15:11:10 2010 Thu Mar 25 15:11:10 2010: trimmed to 5 data sets from Mon Mar 22 14:41:08 2010 – Thu Mar 25 15:11:10 2010 Thu Mar 25 15:11:10 2010: no historical data found Thu Mar 25 15:11:11 2010: loaded 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:11 2010 Thu Mar 25 15:11:11 2010: trimmed to 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:11 2010 Thu Mar 25 15:11:13 2010: loaded 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:13 2010 Thu Mar 25 15:11:13 2010: trimmed to 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:13 2010 Thu Mar 25 15:11:15 2010: loaded 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:15 2010 Thu Mar 25 15:11:15 2010: trimmed to 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:15 2010 Thu Mar 25 15:11:17 2010: loaded 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:17 2010 Thu Mar 25 15:11:17 2010: trimmed to 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:17 2010 Thu Mar 25 15:11:19 2010: loaded 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:19 2010 Thu Mar 25 15:11:19 2010: trimmed to 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:19 2010 Thu Mar 25 15:11:21 2010: loaded 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:21 2010 Thu Mar 25 15:11:21 2010: trimmed to 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:21 2010 Thu Mar 25 15:11:23 2010: loaded 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:23 
2010 Thu Mar 25 15:11:23 2010: trimmed to 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:23 2010 Thu Mar 25 15:11:25 2010: loaded 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:25 2010 Thu Mar 25 15:11:25 2010: trimmed to 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:25 2010 Thu Mar 25 15:11:27 2010: loaded 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:27 2010 Thu Mar 25 15:11:27 2010: trimmed to 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:27 2010 Thu Mar 25 15:11:29 2010: loaded 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:29 2010 Thu Mar 25 15:11:29 2010: trimmed to 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:29 2010 Thu Mar 25 15:11:31 2010: loaded 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:31 2010 Thu Mar 25 15:11:31 2010: trimmed to 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:31 2010 Thu Mar 25 15:11:33 2010: loaded 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:33 2010 Thu Mar 25 15:11:33 2010: trimmed to 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:33 2010 Thu Mar 25 15:11:35 2010: loaded 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:35 2010 Thu Mar 25 15:11:35 2010: trimmed to 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:35 2010 Thu Mar 25 15:11:37 2010: loaded 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:37 2010 Thu Mar 25 15:11:37 2010: trimmed to 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:37 2010 Thu Mar 25 15:11:39 2010: loaded 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:39 2010 Thu Mar 25 15:11:39 2010: trimmed to 6419 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:39 2010 Thu Mar 25 15:11:41 2010: loaded 6418 data sets from Tue Feb 23 09:15:44 2010 – Thu Mar 25 15:11:41 2010 Thu Mar 25 15:11:41 2010: trimmed to 6418 data sets from Tue Feb 23 
09:15:44 2010 – Thu Mar 25 15:11:41 2010 Thu Mar 25 15:11:42 2010: found 2027 usable data sets since Wed Mar 10 15:11:42 2010 Thu Mar 25 15:11:42 2010: found 2028 usable data sets since Wed Mar 10 15:11:42 2010 Thu Mar 25 15:11:42 2010: found 2028 usable data sets since Wed Mar 10 15:11:42 2010 Thu Mar 25 15:11:42 2010: found 2028 usable data sets since Wed Mar 10 15:11:42 2010 Thu Mar 25 15:11:42 2010: found 2028 usable data sets since Wed Mar 10 15:11:42 2010 Thu Mar 25 15:11:42 2010: found 2028 usable data sets since Wed Mar 10 15:11:42 2010 Thu Mar 25 15:11:42 2010: found 2028 usable data sets since Wed Mar 10 15:11:42 2010 Thu Mar 25 15:11:42 2010: found 2028 usable data sets since Wed Mar 10 15:11:42 2010 Thu Mar 25 15:11:42 2010: found 2028 usable data sets since Wed Mar 10 15:11:42 2010 Thu Mar 25 15:11:42 2010: found 2028 usable data sets since Wed Mar 10 15:11:42 2010 Thu Mar 25 15:11:42 2010: found 2028 usable data sets since Wed Mar 10 15:11:42 2010 Thu Mar 25 15:11:43 2010: found 2028 usable data sets since Wed Mar 10 15:11:43 2010 Thu Mar 25 15:11:43 2010: found 2028 usable data sets since Wed Mar 10 15:11:43 2010 Thu Mar 25 15:11:43 2010: found 2028 usable data sets since Wed Mar 10 15:11:43 2010 Thu Mar 25 15:11:43 2010: found 2028 usable data sets since Wed Mar 10 15:11:43 2010 Thu Mar 25 15:11:43 2010: found 2028 usable data sets since Wed Mar 10 15:11:43 2010 Thu Mar 25 15:11:43 2010: found 1 usable data sets since Wed Mar 10 15:11:43 2010 Thu Mar 25 15:11:43 2010: found 6 usable data sets since Wed Mar 10 15:11:43 2010 Thu Mar 25 15:11:43 2010: found 1791 usable data sets since Wed Mar 10 15:11:43 2010 Thu Mar 25 15:11:43 2010: found 2028 usable data sets since Wed Mar 10 15:11:43 2010 Thu Mar 25 15:11:43 2010: found 2028 usable data sets since Wed Mar 10 15:11:43 2010 Thu Mar 25 15:11:43 2010: found 2028 usable data sets since Wed Mar 10 15:11:43 2010 Thu Mar 25 15:11:43 2010: found 2028 usable data sets since Wed Mar 10 15:11:43 2010 Thu Mar 
25 15:11:43 2010: found 2028 usable data sets since Wed Mar 10 15:11:43 2010 Thu Mar 25 15:11:43 2010: found 2028 usable data sets since Wed Mar 10 15:11:43 2010 Thu Mar 25 15:11:43 2010: found 2028 usable data sets since Wed Mar 10 15:11:43 2010 Thu Mar 25 15:11:43 2010: found 2028 usable data sets since Wed Mar 10 15:11:43 2010 Thu Mar 25 15:11:43 2010: found 2028 usable data sets since Wed Mar 10 15:11:43 2010 Thu Mar 25 15:11:43 2010: found 2028 usable data sets since Wed Mar 10 15:11:43 2010 Thu Mar 25 15:11:44 2010: DESTROY DBD::Oracle::Server::Database::Tablespace with handle null null
lausser Reply:
March 26th, 2010 at 1:32You’re right. 2 seconds is quite long. In /var/tmp/check_oracle_health you should find several files named tablespace-remaining-time_*
Please mail me one of these files. I’ll have a look at it.
Steffen Poulsen Reply:
March 29th, 2010 at 12:39Thank you very much for the patch you sent us, run time is down from 65 to 11 seconds at this particular host now :-)
lausser Reply:
March 29th, 2010 at 17:25
You're welcome. In case anybody stumbles upon the same problem: I'll release a version with this patch soon.
Khadija Reply:
April 13th, 2010 at 1:18@Steffen Poulsen, Can you plz let me know the patch that you suggest Steffen?
Regards, Khadija
lausser Reply:
April 13th, 2010 at 19:01I forgot to release this update. New version of check_oracle_health is coming asap.
-
Frank Says:
April 14th, 2010 at 16:03Hello,
We used this plugin (1.5) for some months now and everything worked fine, but since yesterday we receive the message “CRITICAL – connection could not be established within 60 seconds”. Nothing has changed on the plugin, nothing has changed on the network, nothing has changed on the machines nor on nagios/centreon?
I don’t have a clue where to look to resolve this problem. Does it sound familiar to somebody?
Regards, Frank
lausser Reply:
April 14th, 2010 at 18:37can you connect with the sqlplus command? (executed on the Nagios server)
-
Steffen B Says:
April 15th, 2010 at 9:41
Hello,
first of all, high praise to you: this is a great plugin you have created. We use it for the complete Oracle monitoring of our customer systems.
As of today, however, I have a problem where I don't know how to proceed. The situation is as follows:
One DB server, with two databases on it, each with one schema, both at the same DB version 10.2.0.4 and with the same schema name.
I want to use the plugin to determine the "usage" of the tablespace: --mode=tablespace-usage. The syntax on the command line is the same; only the TNS alias for the database changes. With one DB it works without problems, and with the other DB it shows me the following error:
Use of uninitialized value in split at /usr/local/nagios/libexec/check_oracle_health line 3924. bumm Can't call method "execute" on an undefined value at /usr/local/nagios/libexec/check_oracle_health line 4230.
Can't use an undefined value as an ARRAY reference at /usr/local/nagios/libexec/check_oracle_health line 4242.
As recommended earlier here, I created the check_oracle_health.trace file, and the only thing I noticed is that a different SQL statement is issued. But why? The databases are installed identically and on the same server, so it cannot have anything to do with the DB or with the operating system, can it?
I would be grateful for any help.
lausser Reply:
April 17th, 2010 at 12:19
"..that a different SQL statement is issued." It would of course be helpful to be able to see these two different statements.
-
Frank Says:
April 15th, 2010 at 10:10I think you’re right, it seems to be a problem with sqlplus.
As root user I can connect with sqlplus, as Nagios user I cannot connect. I think we have to find out what is changed there…
-
Frank Says:
April 15th, 2010 at 13:20We had to relocate the nagios server today, so we had to restart the server. Problem is now solved.
-
Hans Wolters Says:
April 23rd, 2010 at 14:40Dear all,
Great Plugin. Started to configure it this week and currently for some databases I already have nearly all of the checks possible with the default options.
One question remains for me. If I have overlooked this in the documentation (yes, I can read German) then please let me know.
Situation:
We have several machines with more than one database per service id. Would it be possible to return the SID and database name with the return string of nagios (given by the plugin written in perl)? This would enable me to use short service descriptions on those machines and set up service entries with multiple databases/SIDs on one machine. Maybe even with a parameter so people using only one database can skip the options.
I could hack it into the source myself but I can imagine I am not the only one who would like that feature.
Kind regards,
Hans Wolters
lausser Reply:
April 27th, 2010 at 1:46Hi, i’ll have a look at it.
-
Geoff Sears Says:
April 30th, 2010 at 23:53Hi. I’m having trouble making a connection as sysdba, though I understand this should be possible.
Would you post an example of how to make it work?
Thanks,
-geoff
lausser Reply:
May 10th, 2010 at 0:07
check_oracle_health --connect sysdba@ …
Geoff Sears Reply:
May 14th, 2010 at 2:44That’s what I can not get to work. Works fine with connect=host:port//service or connect=//host:port/service
But, If I use:
connect=sysdba@host:port/service or connect=sysdba@//host:port/service
results in ORA-12154: TNS:could not resolve the connect identifier specified (DBD ERROR: OCIServerAttach)
I believe that getting a sysdba connection with DBI/DBD::Oracle requires setting a connection attribute ora_session_mode => ORA_SYSDBA ; just passing that string “sysdba@host:port/service” as the data source won’t do it.
Geoff Sears Reply:
May 15th, 2010 at 2:31ok, I finally sat down and read through the code: sysdba@… is only supported for sqlplus connections. I hacked it so tns connections are possible.
lausser Reply:
May 15th, 2010 at 15:21
Which version did you use? I looked into the source (in my git repository) and found (in the tns section)
my $connecthash = { RaiseError => 0, AutoCommit => 0, PrintError => 0 };
if ($self->{username} eq "sys" || $self->{username} eq "sysdba") {
  $connecthash = {
    RaiseError => 0, AutoCommit => 0, PrintError => 0,
    #ora_session_mode => DBD::Oracle::ORA_SYSDBA
    ora_session_mode => 0x0002
  };
  $dsn = sprintf "DBI:Oracle:";
}
Isn't that correct? What do your changes look like?
-
Thomas Says:
May 6th, 2010 at 9:09
I am running nagios 1.2 and the service I have created with check_oracle_health won't start. The service remains pending. When I run the check in the server's shell it works perfectly. Might that be a problem with nagios 1.x? Should I update to nagios 2 or 3?
lausser Reply:
May 10th, 2010 at 0:13I don’t think this has to do with the Nagios version. Can’t you force scheduling of the service through the service detail page? Upgrading to 3.x is a good idea anyway.
-
Björn Says:
May 17th, 2010 at 11:54Hello,
for some databases we are using your health check. One of the installations is using a Dataguard Environment. When we configure checks for the standby, we get a critical error, as no connection is allowed (“ORA-01033: ORACLE initialization or shutdown in progress”).
For our other oracle monitors we excluded the ORA-01033 and give an OK-State with a comment (“OK – Login Denied, the DB is in Standby Mode – this Check only works for Primary DB’s “).
Could you implement exception handling for ORA-01033 to allow the same Nagios config for primary and standby databases?
lausser Reply:
May 18th, 2010 at 0:29
ORA-01033 can be a sign of serious problems, for example when a corrupted database was restarted (ORA-10567 et al. can be found in the alert log), hence ignoring this error message is not an option.
Michael Reply:
June 28th, 2010 at 11:00
@lausser, with Oracle Dataguard, tnsping is not working anymore. I'm getting the same error, "initialization or shutdown in progress". In version 1.6.2 the mode tnsping was working for closed databases.
lausser Reply:
June 28th, 2010 at 11:09How do you call the plugin and what’s the error message (in 1.6.2 and 1.6.4)?
-
Antonio Romero Says:
May 25th, 2010 at 17:07Hi,
I have installed check_oracle_health on my Nagios system in order to monitor several Oracle DBs. Everything works fine, except one thing. When I ask the DB for the space used by a tablespace, the value the plugin returns is different from the value I get from an Oracle query in the Oracle Manager. Can you give me some help with this issue?
Thank you in advance for your help!
Toni.
lausser Reply:
June 1st, 2010 at 21:42
Oracle tools usually relate two values: used space and allocated space. If you use autoallocation, the latter value may grow. When used:allocated is near 100%, autoallocation happens, the allocated space suddenly grows and the used:allocated percentage drops. This means you could get an alert from Nagios because the critical threshold has been reached; then, after the autoallocation, the usage drops below the threshold again. A false alert. That’s why check_oracle_health calculates the usage percentage from used:max_allocatable.
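The effect described in this reply can be sketched with a small simulation. The sizes, the growth step and the extension increment below are made-up numbers chosen only for illustration; they are not taken from the plugin or from any real database.

```python
def usage_percentages(used_mb, allocated_mb, max_mb):
    """Return (used/allocated, used/max_allocatable) as percentages."""
    return (100.0 * used_mb / allocated_mb, 100.0 * used_mb / max_mb)

# Hypothetical tablespace: 500 MB allocated, may grow to 2000 MB.
# Data grows by 120 MB per step; when it no longer fits, the
# allocation is extended by 500 MB (a stand-in for autoallocation).
allocated, maximum = 500, 2000
history = []
for used in range(120, 1081, 120):
    if used > allocated:
        allocated = min(allocated + 500, maximum)
    history.append((used,) + usage_percentages(used, allocated, maximum))

for used, pct_alloc, pct_max in history:
    print(f"used={used:4d} MB  used/allocated={pct_alloc:5.1f}%  "
          f"used/max={pct_max:5.1f}%")
```

In this sketch used/allocated climbs to 96% and then falls back after each extension, while used/max rises steadily, which is the behaviour a threshold check needs.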
-
Thomas Says:
May 26th, 2010 at 11:32
Hello,
first of all, thanks for the helpful plugin. However, I still have trouble getting it to work with Nagios (3.2.0). I run the Nagios server on Ubuntu and have installed the Oracle Instant Client. The plugin works without problems from the console when started as user nagios. As a service in Nagios with the following command:
command_line $USER1$/check_oracle_health --connect rebmasc.world --user dbo --password xxx --mode tnsping
I always get the error message:
cannot connect to rebmasc.world. ORA-12154: TNS:could not resolve the connect identifier specified (DBD ERROR: OCIServerAttach)
I have set the variables ORACLE_HOME, TNS_ADMIN etc. correctly for everyone in bash.bashrc, and the DB is also listed in the tnsnames.ora located there. As I said, it works without problems on the machine’s console.
I have already tried various alternative commands (--environment; --method), but without success. I cannot find any mistake. What am I doing wrong?
lausser Reply:
May 26th, 2010 at 20:45
The environment variables must be set in the Nagios init script. Files like .bashrc are not read at system startup.
Thomas Reply:
May 27th, 2010 at 8:30
@lausser, I found this hint in the Nagios forum yesterday as well. When I set the variables in /etc/init.d/nagios everything works. Thanks for the hint.
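One way to see which environment a process really has, rather than what an interactive shell shows, is a small check like the following. It is only a sketch: the helper name is made up, and the list of variables is limited to the two named in this exchange, so a real installation may need others.

```python
import os

# Variables named in this thread; a real installation may need more.
REQUIRED_ORACLE_VARS = ("ORACLE_HOME", "TNS_ADMIN")

def missing_oracle_env(env=None, required=REQUIRED_ORACLE_VARS):
    """Return the names in `required` that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]

if __name__ == "__main__":
    missing = missing_oracle_env()
    print("missing:", ", ".join(missing) if missing else "none")
```

Run from a login shell, this reports what .bashrc provides; run from the daemon's context, it reports what the init script provides, and the two can differ.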
-
Antonio Romero Says:
May 28th, 2010 at 21:56Please Lausser, Can you answer my post above?, number 40.
Thanks!
-
Dennis Says:
June 8th, 2010 at 16:07
Hello, is there a way to monitor the Flash Recovery Area? I mean its fill level.
Regards, Dennis
lausser Reply:
June 9th, 2010 at 11:06
No, that is not built in. But perhaps something like this would help:
--mode sql --name 'select max(percent_space_used) from v$flash_recovery_area_usage' --warning 80 --critical 90
-
Hamza Says:
June 17th, 2010 at 12:15Hi there
I love your check oracle plugin, it does everything I want.
Is there any way to specify multiple DB names using some sort of delimiter.
I currently have a setup as such.
- In .profile I have
export NAGIOS__SERVICEORACLE_SID=
/usr/lib/oracle/11.2/client/network/admin/tnsnames.sh
which basically does a cat of the tnsnames.ora and pulls out all the sids for me.
What I would like to do is be able to run the check_oracle_health in this way
check_oracle_health --connect $NAGIOS__SERVICEORACLE_SID:$NAGIOS__SERVICEORACLE_SID --username nagios --password nagios --mode tnsping
where the : is any delimiter to which can specify multiple DB names.
Please help.
Thank you Hamza Maal
lausser Reply:
June 17th, 2010 at 12:59That’s not possible. You can only check databases one at a time. If you want multiple checks inside one single service, you might want to give check_multi a try. http://www.my-plugin.de/wiki/projects/check_multi/discussion
-
Tim Says:
June 17th, 2010 at 20:27I have an odd problem with this plugin. It works fine, but Nagios reports any response as a warning. From the command line, I’ll get:
OK – 0.22 seconds to connect as MONITOR | connection_time=0.2193;3;8
But Nagios shows the service in yellow and the log has:
SERVICE ALERT: myhost;Oracle mySID Connect;WARNING;HARD;3;OK – 0.14 seconds to connect as MONITOR
Why is it showing as an alert when the connect time is within the correct range?
lausser Reply:
June 21st, 2010 at 10:51Strange… What about the thresholds? From your command line example i see you set –warning 3 –critical 8 (without these extra parameters it would be 1 and 5 by default) Did you set thresholds also in the service/command definition? When you get such a WARNING, please click on “Service Details” and look at the performance data. Which thresholds do you see there?
Tim Reply:
June 21st, 2010 at 19:10I think I added the warning/critical params just in case that might affect the display. The performance data looks like this:
Current Status: WARNING (for 3d 22h 57m 46s) Status Information: OK – 0.23 seconds to connect as MONITOR Performance Data: connection_time=0.2339;3;8 Current Attempt: 3/3 (HARD state) Last Check Time: 06-21-2010 13:06:33 Check Type: ACTIVE
Interestingly, I also set this up in Icinga and it does the same thing.
lausser Reply:
June 21st, 2010 at 19:49
Very strange… the last lines of the plugin are:
printf "%s - %s", $ERRORCODES{$nagios_level}, $nagios_message;
printf " | %s", $perfdata if $perfdata;
printf "\n";
exit $nagios_level;
so if $ERRORCODES{$nagios_level} is “OK” (which is in the output), then the exit code $nagios_level must be 0. Can you reset the service to OK with “submit passive checkresult”? Did you see a warning from the first moment when you configured this service? Or has it been OK before?
Tim Reply:
June 22nd, 2010 at 21:05I can send it an OK passive result and it will switch to “OK”, but usually changes right back to a yellow warning.
I’ve tried enabling and disabling passive checks, event handling, but to no effect.
One thing I did notice is that it almost always shows:
Current Attempt: 3/3 (HARD state)
As if maybe it didn’t pass the first 2 checks. Running from the command line I can submit it repeatedly and I get OK results each time. It’s an odd thing. After all of this testing, I think the script works fine, it appears to be more of a Nagios problem.
lausser Reply:
June 22nd, 2010 at 21:13
Just to be absolutely sure, you can add an extra line at the end of the plugin:
printf "%s - %s", $ERRORCODES{$nagios_level}, $nagios_message;
printf " | %s", $perfdata if $perfdata;
printf "\n";
printf "i will definitively exit with %d\n", $nagios_level;
exit $nagios_level;
The level surely won’t change between the printf and the exit.
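The link between the printed status word and the exit code that this thread relies on can be sketched as follows. This is an illustration in Python, not the plugin's Perl code; the function names are made up, while the four states and their numeric codes are the standard Nagios plugin convention.

```python
import sys

# Standard Nagios plugin states and their exit codes.
ERRORCODES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}

def format_result(level, message, perfdata=""):
    """Build the one-line plugin output for a given exit level."""
    line = f"{ERRORCODES[level]} - {message}"
    if perfdata:
        line += f" | {perfdata}"
    return line

def finish(level, message, perfdata=""):
    """Print the status line, then exit with the matching code."""
    print(format_result(level, message, perfdata))
    sys.exit(level)
```

Because the label and the exit code come from the same value, an output starting with "OK" implies exit code 0, which is why a WARNING state shown next to an "OK" text points at something outside the plugin.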
-
IT-COW | Icinga: Oracle-Datenbanken abfragen Says:
June 19th, 2010 at 8:53
[...] There is a plugin for Icinga/Nagios that makes it possible to query the status of Oracle databases over the network. The tool is called oracle_check_health and, like check_logfiles, was developed by Mr. Lausser of the company ConSol; this is the project’s homepage: Link. [...]
-
Hamza Says:
June 24th, 2010 at 18:00Hi
I seem to be having some trouble setting the warning and critical thresholds for checking tablespace free.
Could you please advise on the correct syntax for
check_oracle_health -t 480 --connect db1 --username nagios --password nagios --mode tablespace-free --warning 85 --critical 90
Please help.
lausser Reply:
June 24th, 2010 at 18:06I assume you want a warning if less than 15% are free and a critical if less than 10% are free. Please use ‘:’ which is the correct syntax for ‘less than’-thresholds.
--mode tablespace-free --warning 15: --critical 10:
-
Rija Says:
July 2nd, 2010 at 15:55Hello, I have problem when I execute line command using tablespace-io-balance to check datafiles under all tablespaces. The output is CRITICAL – unable to aquire tablespace info. Can You help me please?
-
Rija Says:
July 2nd, 2010 at 15:56Hello, I have problem when I execute line command using tablespace-io-balance to check datafiles under all tablespaces. The output is CRITICAL – unable to aquire tablespace info. Could You help me please?
lausser Reply:
July 2nd, 2010 at 16:09Do you see this message only with mode tablespace-io-balance? What about –mode list-tablespaces ?
Maybe you forgot to set the right privileges?
CREATE USER nagios IDENTIFIED BY oradbmon;
GRANT CREATE SESSION TO nagios;
GRANT SELECT any dictionary TO nagios;
GRANT SELECT ON V_$SYSSTAT TO nagios;
GRANT SELECT ON V_$INSTANCE TO nagios;
GRANT SELECT ON V_$LOG TO nagios;
GRANT SELECT ON SYS.DBA_DATA_FILES TO nagios;
GRANT SELECT ON SYS.DBA_FREE_SPACE TO nagios;
-
Rija Says:
July 2nd, 2010 at 16:38
I see this message with tablespace-io-balance only. I executed:
check_oracle_health --connect SID --user nagios --password oradbmon --mode tablespace-io-balance
list-tablespaces works; the output gives the list and the message “OK - have fun” at the end. All privileges are OK for user nagios. Thank you for your help!
lausser Reply:
July 2nd, 2010 at 17:02
Edit the plugin and search for “sub init_datafiles”, then search for “iobalance” and finally search for “datafileresults”. Now you have found the line
my @datafileresults = $params{handle}->fetchall_array($sql, $params{selectname}, $params{selectname});
Please change the $params{selectname} to $params{tablespace} (2 times) and try again.
Rija Reply:
July 2nd, 2010 at 17:32I followed your tips and now everything works. Thank you very much for your help.
Rija Reply:
July 2nd, 2010 at 19:10
@Rija, Oops! Sorry, it doesn’t work for Oracle installed on a Windows machine; the same error message appears. Have you got another solution for that? Thank you.
Rija Reply:
July 2nd, 2010 at 19:12
@lausser, Oops! Sorry, it doesn’t work for Oracle installed on a Windows machine; the same error message appears. Have you got another solution for that? Thank you.
lausser Reply:
July 2nd, 2010 at 21:18Strange…unfortunately i don’t have a windows db-server. Please execute the following statement with sqlplus:
SELECT file_name, SUM(phyrds), SUM(phywrts)
FROM dba_data_files, v$filestat
WHERE tablespace_name = UPPER('USERS') AND file_id=file#
GROUP BY tablespace_name, file_name
Rija Reply:
July 10th, 2010 at 2:00
@lausser, Hello! I ran “GRANT SELECT ON V_$filestat TO nagios;” and it works. Thank you… I have another problem: I’d like to modify the default critical and warning levels when executing sga-data-buffer-hit-ratio, sga-library-cache-hit-ratio or sga-dictionary-cache-hit-ratio, but I still get a critical result even when the value is 100%. I’ve executed the following command:
check_oracle_health --connect SID --mode sga-data-buffer-hit-ratio --warning 80 --critical 90
CRITICAL - SGA data buffer hit ratio 100.00% | sga_data_buffer_hit_ratio=100.00%;80;90
Rija Reply:
July 10th, 2010 at 2:11
@Rija, Sorry! The command is:
check_oracle_health --connect SID --mode sga-data-buffer-hit-ratio --warning 90 --critical 80
The same error message appears…
lausser Reply:
July 10th, 2010 at 2:38
These are “less than” thresholds. According to the plugin developer guidelines, you must add a “:”. So --warning <less than 90> is written as --warning 90:
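The threshold syntax discussed here follows the range format of the Nagios plugin development guidelines. A minimal sketch of that rule, with made-up function names and no claim to mirror the plugin's internal implementation:

```python
def outside_range(value, spec):
    """Return True if `value` should alert for a Nagios-style range `spec`.

    "10"     alerts when value < 0 or value > 10
    "10:"    alerts when value < 10
    "~:10"   alerts when value > 10
    "10:20"  alerts when value < 10 or value > 20
    "@10:20" alerts when value is inside 10..20
    """
    invert = spec.startswith("@")
    if invert:
        spec = spec[1:]
    if ":" in spec:
        low_text, high_text = spec.split(":", 1)
    else:
        low_text, high_text = "0", spec
    low = float("-inf") if low_text == "~" else float(low_text or 0)
    high = float("inf") if high_text == "" else float(high_text)
    outside = value < low or value > high
    return not outside if invert else outside

def nagios_level(value, warning, critical):
    """Map a value to 0 (OK), 1 (WARNING) or 2 (CRITICAL)."""
    if outside_range(value, critical):
        return 2
    if outside_range(value, warning):
        return 1
    return 0

print(nagios_level(100.0, "90", "80"))    # plain numbers: upper limits
print(nagios_level(100.0, "90:", "80:"))  # trailing colon: lower limits
```

With plain 90 and 80, a hit ratio of 100% lies above both upper limits, which matches the CRITICAL result reported above; with 90: and 80: the same value is within range.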
-
roger Says:
July 2nd, 2010 at 18:38
Is it normal that the seg_top_10 metrics include the PERFSTAT user? This is my top 10:
… PERFSTAT 2154 row lock waits 1
… PERFSTAT 1450 row lock waits 2
… PERFSTAT 572 row lock waits 3
… PERFSTAT 466 row lock waits 4
… PERFSTAT 446 row lock waits 5
… PERFSTAT 382 row lock waits 6
… PERFSTAT 350 row lock waits 7
… PERFSTAT 288 row lock waits 8
… PERFSTAT 246 row lock waits 9
… PERFSTAT 191 row lock waits 10
-
Hamza Maal Says:
July 14th, 2010 at 10:06Hi
I am trying to run a sql statement using –mode sql but it does not seem to work. I have tried using the encode but it still comes up with errors
Original statement:
/usr/lib/nagios/plugins/check_oracle_health --connect mlc247 --username dbuser --password dbpass --mode sql SELECT TO_CHAR(NEXT_TIME, 'DD-MON-YYYY HH24:MI:SS') FROM V$ARCHIVED_LOG where sequence# = (select max(sequence#) from v$archived_log where applied = 'YES')
After encoding:
/usr/lib/nagios/plugins/check_oracle_health --connect mlc247 --username dbuser --password dbpass --mode sql SELECT%20TO%5FCHAR%28NEXT%5FTIME%2C%20%27DD%2DMON%2DYYYY%20HH24%3AMI%3ASS%27%29%20FROM%20V%24ARCHIVED%5FLOG%20where%20sequence%23%20%3D%20%28select%20max%28sequence%23%29%20from%20v%24archived%5Flog%20where%20applied%20%3D%20%27YES%27%29
This is the error I get using the encoded statement:
Use of uninitialized value $sql in sprintf at /usr/lib/nagios/plugins/check_oracle_health line 4194.
Use of uninitialized value in subroutine entry at /usr/local/lib/perl/5.10.0/DBD/Oracle.pm line 284.
Use of uninitialized value $value in numeric gt (>) at /usr/lib/nagios/plugins/check_oracle_health line 3615.
Use of uninitialized value $value in numeric gt (>) at /usr/lib/nagios/plugins/check_oracle_health line 3616.
Use of uninitialized value $params{"name2"} in split at /usr/lib/nagios/plugins/check_oracle_health line 3553.
OK - :
Any help would be much appreciated
lausser Reply:
July 14th, 2010 at 10:31
check_oracle_health .... --mode sql --name SELECT%20TO%5...
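The encoded statement is passed with --name, as the reply shows. Producing such a percent-encoded string can be sketched as follows, assuming the plugin decodes standard %XX sequences (which is how the encoded examples in this thread are used). Python's quote leaves letters, digits and the characters _ . - ~ untouched, so the result differs slightly from the string posted above but decodes to the same SQL.

```python
from urllib.parse import quote, unquote

def encode_sql(sql):
    """Percent-encode everything except letters, digits and _ . - ~"""
    return quote(sql, safe="")

sql = ("SELECT TO_CHAR(NEXT_TIME, 'DD-MON-YYYY HH24:MI:SS') "
       "FROM V$ARCHIVED_LOG where sequence# = "
       "(select max(sequence#) from v$archived_log where applied = 'YES')")

encoded = encode_sql(sql)
# The encoded text contains no spaces, quotes, $ or # characters,
# so the shell passes it through as a single argument.
print(f"--mode sql --name {encoded}")
```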
-
jhon Says:
July 15th, 2010 at 22:11
check_oracle_health --connect SID --mode sga-data-buffer-hit-ratio
OK - SGA data buffer hit ratio 105.55%
105.55!!! Why?
lausser Reply:
July 15th, 2010 at 23:44I don’t know. I need more information. Look into the code. Find the statement which is used to fetch the data used for the calculation of the hit ratio, execute the statement manually, get the values involved in the calculation manually, post the result here.
jhon Reply:
July 16th, 2010 at 23:04
SUM(DECODE(NAME,'PHYSICALREADS',VALUE,0))
SUM(DECODE(NAME,'PHYSICALREADSDIRECT',VALUE,0))
SUM(DECODE(NAME,'PHYSICALREADSDIRECT(LOB)',VALUE,0))
SUM(DECODE(NAME,'SESSIONLOGICALREADS',VALUE,0))
33942 155 319842 5623223
===== Using the query of @Marco, this is the result:
SELECT ROUND((1-(phy.value / (cur.value + con.value)))*100,2) "Cache Hit Ratio"
FROM v$sysstat cur, v$sysstat con, v$sysstat phy
WHERE cur.name = 'db block gets' AND con.name = 'consistent gets' AND phy.name = 'physical reads'
SQL> /
Cache Hit Ratio
99.4
-
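A ratio above 100% becomes plausible once direct reads are subtracted from physical reads. The first function below is an assumption: it uses a commonly quoted formulation with the four statistics jhon listed and has not been checked against the plugin source. The second function restates the query attributed to Marco above; the sample inputs for it are invented.

```python
def hit_ratio_assumed(physical, direct, direct_lob, logical):
    """ASSUMED formula, not confirmed from the plugin source:
    direct reads are excluded from the physical reads."""
    return round((1 - (physical - direct - direct_lob) / logical) * 100, 2)

def hit_ratio_marco(physical, db_block_gets, consistent_gets):
    """The formula from the query quoted in the thread."""
    return round((1 - physical / (db_block_gets + consistent_gets)) * 100, 2)

# Values as posted: physical reads, direct, direct (lob), logical reads.
print(hit_ratio_assumed(33942, 155, 319842, 5623223))

# Invented inputs, only to show the shape of the second formula.
print(hit_ratio_marco(6, 400, 600))
```

Under that assumption, the posted counters give a value above 100% because the direct (lob) reads exceed the total physical reads, so the numerator becomes negative. Whether this is what happened on jhon's system cannot be concluded from the thread.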
JamesC Says:
July 16th, 2010 at 22:23I’m having an odd issue, related to running the script as a non-root user. The output is correct, except there’s a printf() error included with the output.
[nagios@server0224 ~]$ /usr/local/nagios/libexec/check_oracle_health --connect krusta_srv --username USER --password PASS --mode sga-data-buffer-hit-ratio --warning 95: --critical 90:
printf() on closed filehandle STATE at /usr/local/nagios/libexec/check_oracle_health line 3828.
OK - SGA data buffer hit ratio 99.99% | sga_data_buffer_hit_ratio=99.99%;95:;90:
lausser Reply:
July 16th, 2010 at 22:48
You ran the plugin as root. This led to the creation of /var/tmp/check_oracle_health and probably some files below this directory (owner: root). These files are necessary to carry state information from one run to the next. Then you ran the plugin as non-root. Overwriting the state file(s) does not work, because they’re owned by root. That’s why you see the error message. Homework:
- chown -R nagios:nagios /var/tmp/check_oracle_health
- write 100 times “i must not run plugins as root”
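Before running the chown, it can help to list which state files have the wrong owner. The helper below is a hypothetical sketch (the function name is made up); it walks a directory and reports every entry not owned by the given numeric user id.

```python
import os

def not_owned_by(path, uid):
    """Return `path` and all entries below it whose owner is not `uid`."""
    wrong = []
    if os.stat(path).st_uid != uid:
        wrong.append(path)
    for root, dirs, files in os.walk(path):
        for name in dirs + files:
            full = os.path.join(root, name)
            # lstat: report a symlink itself, not the file it points to
            if os.lstat(full).st_uid != uid:
                wrong.append(full)
    return wrong
```

On a monitoring host this could be called as not_owned_by("/var/tmp/check_oracle_health", pwd.getpwnam("nagios").pw_uid), using the standard pwd module to look up the user id; an empty list means the plugin can update its state files.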
-
sdouce Says:
July 20th, 2010 at 12:15
Hi, I receive this kind of message and I don’t understand it. I have many Nagios servers using the same distribution and working fine, but here I have this problem. Can you help?
CRITICAL - cannot connect to ORACLE_TOTO. install_driver(Oracle) failed: Can't load '/usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi/auto/DBD/Oracle/Oracle.so' for module DBD::Oracle: /usr/lib/oracle/10.2.0.4/client/lib/libocci.so.10.1: ne peut restaurer le segment prot après reloc: Permission non accordée at /usr/lib/perl5/5.8.8/i386-linux-thread-multi/DynaLoader.pm line 230. at (eval 14) line 3 Compilation failed in require at (eval 14) line 3. Perhaps a required shared library or dll isn't installed where expected at /usr/local/nagios/libexec/check_oracle_health line 4193
(The French part of the message means “cannot restore segment prot after reloc: Permission denied”.)
lausser Reply:
July 20th, 2010 at 12:28Hi, looks like a broken installation of DBD::Oracle.
-
hha Says:
August 3rd, 2010 at 11:54Hi,
I am currently wondering whether the performance data can somehow be made to output only the percentage values, or only the absolute values. Background: I would like to visualize these things with the nagios-grapher. Alternatively: if anyone has a suitable config for it.. :)
Thanks!
lausser Reply:
August 3rd, 2010 at 14:08
The performance data was designed to conform to the “Plugin Developer Guidelines” and can therefore be processed by PNP without any effort. I did not take Nagios-Grapher into consideration, so there is no way to tweak the output. But as you write… someone will surely have built an NG config already.
-
Hix Says:
August 27th, 2010 at 13:42Hi lausser,
Firstly, many thanks for your great plugin! I have written a MyBind package (based on your documentation, thanks again!) for check_oracle_health where I can test PL/SQL execution with bind variables. I have two successful executions:
- without bind variables (only simple SQL or PL/SQL code is executed in my check_oracle_health package)
- with bind variables in a single Perl script file, independent of check_oracle_health (using the bind_param_inout method)
So these two versions of my code run fine. But when I try to put it into the MyBind package I receive the following error message:
Can't locate object method "prepare" via package "DBD::Oracle::Server::Connection::Dbi" at /usr/lib/nagios/plugins/pm/CheckOracleHealthExtMyBind.pm line 36.
OK, I know the reason: in my “without bind” version I can use your great method “$self->{handle}->fetchrow_array” and it works fine. But in my “with bind” version I cannot use this because I need to call the bind_param_inout method many times between the prepare and execute phases. Here is a small example:
$sql = "
begin
  xy.check_status(:P_PROGRAM_NAME, :P_PROGRAM_ID, :P_RUN_ID, :P_IS_ABORTED, :P_EXCEPTION_CODE, :P_EXCEPTION_NAME);
end;";
$sth = $self->{handle}->prepare($sql);
$sth->bind_param_inout(":P_PROGRAM_NAME", \$p_program_name, 100);
$sth->bind_param_inout(":P_PROGRAM_ID", \$p_program_id, 100);
...
$sth->bind_param_inout(":P_EXCEPTION_NAME", \$p_exception_name, 100);
$sth->execute();
I cannot use simple SQL because I have a lot of complex PL/SQL code to monitor applications. So my question is: could you suggest how I can resolve this issue? Thank you very much!
Hix Reply:
August 27th, 2010 at 13:49
@Hix, Oh… sorry for the bad code formatting! I tried putting the code between WordPress “code” tags, but it seems that isn’t working properly for me.
-
Hix Says:
August 27th, 2010 at 17:26
Hi Gerhard, I’m sure you are going to find a much better solution, but I wrote a temporary workaround: the following prepare method is inserted in the DBD::Oracle::Server::Connection::Dbi package, above sub fetchrow_array:
sub prepare {
  my $self = shift;
  my $sql = shift;
  my $sth = undef;
  $sth = $self->{handle}->prepare($sql);
  if ($@) {
    $self->debug(sprintf "bumm %s", $@);
  }
  return $sth;
}
I hope my description of the situation was not too hard to follow and that the code formatting works. :) Best regards
-
Robson Says:
September 10th, 2010 at 3:48
Hi, I am trying to create a check using mode sql, but it is not working. Can you help me?
check_oracle_health --connect=orcl2 --user=nagios --password=dbacomp2011 --mode=sql --name=select%20FNC%5FLOCK%20from%20dual --warning=1 --critical=2
returns:
WARNING - select fnc_lock from dual: 2 | 'select'=2;1;2
So I created a command:
define command {
  command_name check_check_orah_sql
  command_line $USER1$/check_oracle_health --connect $ARG1$ --username $ARG2$ --password $ARG3$ --mode $ARG4$ --name=$ARG5$ --warning=1 --critical=3
}
and a service:
define service{
  use generic-service ;
  host_name host_home_orcl2
  service_description Oracle Table Lock
  check_command check_check_orah_sql!orcl2!system!redhat!sql!select%20FNC%5FLOCK%20from%20dual
}
On the command line the warning is detected successfully, but in Nagios the status is always OK. My SQL returns 2 if there is a lock and 0 if there is no lock. Do you know what is wrong? Thanks
Robson Reply:
September 10th, 2010 at 3:54
@Robson, Another question: the status information shows “OK - select fnc_lock from dual:”. How can I change that? With the parameter name2 I can change it to a fixed value. Can I make this column show the users that are holding locks? I know the command to extract that; the problem is how to put that information there. Thanks for your help.
lausser Reply:
September 10th, 2010 at 10:57
That’s not possible. If you want this kind of customized output you need to implement it yourself. In the contrib directory of the check_oracle_health package you will find some help on how to write your own extensions.
lausser Reply:
September 10th, 2010 at 10:55
I have no idea why it shows a different result under Nagios. You might create a trace file with ‘touch /tmp/check_oracle_health.trace’. As long as this file exists, the plugin will write some debugging information to it. Then look for the differences between a command-line run and a Nagios-triggered run in the logs.
-
Robson Says:
September 10th, 2010 at 16:55
I created the file /tmp/check_oracle_health.trace, but it is only written to when I run the plugin from the command line; when I force the command to run from Nagios, the file doesn’t receive anything. Is there anything else I can do? Thanks
lausser Reply:
September 10th, 2010 at 17:02
I bet you ran the plugin on the command line as root, didn’t you? Then root is the owner of the trace file. When the plugin runs in a Nagios context (and under the nagios user account) it does not have permission to write to the file. Please never test plugins as root on the command line. (You also have to delete /var/tmp/check_oracle_health, because you’ll have the same problem there. This directory is important for correct results.)
-
Robson Says:
September 10th, 2010 at 17:36
Thanks a lot for your help. The error was that I was using a different user in Nagios. Thank you so much
-
rkelly Says:
September 13th, 2010 at 21:47Hi lausser,
I was hoping to use this plugin but I’m having some basic install/config issues. I’m fairly new to BSD-type OSes and Nagios. I have installed the latest Groundwork VM (CentOS based) for testing. I set up NSClient++ for monitoring remote Windows servers (which is working) and was hoping to monitor some basic parameters of Oracle. Are there any issues with the plugin monitoring remote Windows machines? I have created an account in Oracle using the defaults and set up the plugin in Nagios based on the instructions. It seems to install fine and the plugin shows up in Nagios, but I am getting “Error executing command (Permission denied)” when I test a new command. I’ve checked access to Oracle: I can ping it and log in remotely, so I don’t think the problem is Oracle access but access to the plugin. I checked the permissions on the plugin and they are nagios/nagios. I created check_oracle_health.trace with touch and then tailed the file, but I don’t see any debug output being written. I’m sure it is just me being a noob, but any insight would be appreciated.
thanks
lausser Reply:
September 13th, 2010 at 22:23How did you unpack/install the plugin? Can you run it from the command line?
rkelly Reply:
September 18th, 2010 at 0:03
Found out that the Groundwork Perl is actually in /usr/local/groundwork/perl/bin/perl. I deleted the old script and created a new one using the new path but kept getting the same error. I looked at the script and it still pointed to the old location, so I just modified the script, and now Perl is recognized when testing. I still have a few more issues but I should be able to sort them out now. Thanks
-
rkelly Says:
September 13th, 2010 at 23:50Hi -
I unpacked it to the desktop and ran the installer from a terminal as the groundwork user. If I run it from the command line I get:
/usr/local/groundwork/nagios/libexec/check_oracle_health: /usr/local/groundwork/perl: bad interpreter : permission denied
I wasn’t sure during the configuration whether to point the Perl path to /perl or /perl/bin. Should I reinstall Perl? The one I am pointing to is built into the Groundwork install (v 5.8.8 for i386 linux thread multi).
thanks again
-
Tontonitch Says:
September 28th, 2010 at 10:12Hi Lausser,
I faced a problem using the mode shared-pool-free, which reported 0% even though the shared pool free memory was about 80%. I have only seen that problem on Oracle 9i so far. Looking into it, it seems to be a bug where some statistics report, for example, 3.6E19 for the kzull component in the shared pool, which is completely false/impossible.
I’ve adapted the related query in check_oracle_health (in init_shared_pool_free) so that it ignores the values in v$sgastat which are greater than the free memory.
sub init_shared_pool_free {
  my $self = shift;
  my %params = @_;
  #$self->{free_percent} = $self->{handle}->fetchrow_array(q{
  #    SELECT ROUND((SUM(DECODE(name, 'free memory', bytes, 0)) /
  #    SUM(bytes)) * 100,2) FROM v$sgastat where pool = 'shared pool'
  #});
  # Fix for problems with abnormal values. Example: kzull = 3.9E19!
  $self->{free_percent} = $self->{handle}->fetchrow_array(q{
      select ROUND(a.bytes / b.sm * 100,2)
      from (select bytes from v$sgastat
            where name='free memory' AND pool='shared pool') a,
           (select sum(bytes) sm from v$sgastat
            where pool = 'shared pool'
            and bytes <= (select bytes from v$sgastat
                          where name='free memory' AND pool='shared pool')) b
  });
  # seems to make sense only up to ora9. >10.x returns 0
  #$self->{alloc} = $self->{handle}->fetchrow_array(q{
  #    SELECT value FROM v$parameter WHERE name = 'shared_pool_size'
  #});
  if (! defined $self->{free_percent}) {
    $self->add_nagios_critical("unable to get sga free");
    return undef;
  }
}
May that change help some people still using Oracle 9i to have correct shared pool usage statistics.
Cheers,
Yannick
Tontonitch Reply:
September 28th, 2010 at 10:14@Tontonitch, I will send you that little change via email as it’s a bit crap on that web page.
Cheers,
Yannick
lausser Reply:
September 28th, 2010 at 12:23Thanks Yannik!
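The effect of Yannick's change can be illustrated with plain numbers. The component sizes below are invented; only the kzull value of 3.6E19 and the filtering rule (ignore every component larger than the free memory) come from his comment.

```python
def free_percent_naive(components):
    """Free memory as a share of the sum of all components."""
    total = sum(components.values())
    return round(components["free memory"] / total * 100, 2)

def free_percent_filtered(components):
    """Same ratio, but components larger than the free memory are ignored."""
    free = components["free memory"]
    total = sum(size for size in components.values() if size <= free)
    return round(free / total * 100, 2)

# Hypothetical shared pool; kzull carries the bogus value.
shared_pool = {
    "free memory": 80_000_000,
    "library cache": 12_000_000,
    "sql area": 8_000_000,
    "kzull": 3.6e19,
}

print(free_percent_naive(shared_pool))
print(free_percent_filtered(shared_pool))
```

One consequence of the rule is worth keeping in mind: any genuine component that happens to be larger than the free memory is dropped as well, so the filtered percentage can overstate the free share when the pool is nearly full.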
-
lozair Says:
November 4th, 2010 at 18:30Hi Lausser,
I faced a problem using check_oracle_health. When I run the command on an Oracle server as a local user, all is good.
I want to launch the command remotely, using NRPE, from my Nagios server.
However, I always encounter the same error on the Nagios server:
+++++++++++ CRITICAL - cannot connect to TOTO. install_driver(Oracle) failed: Can't load '/usr/lib64/perl5/site_perl/5.8.8/x86_64-linux-thread-multi/auto/DBD/Oracle/Oracle.so' for module DBD::Oracle: libclntsh.so.10.1: Ne peut ouvrir le fichier d'objet partagé: Aucun fichier ou répertoire de ce type at /usr/lib64/perl5/5.8.8/x86_64-linux-thread-multi/DynaLoader.pm line 230. at (eval 13) line 3 Compilation failed in require at (eval 13) line 3. Perhaps a required shared library or dll isn't installed where expected at /usr/local/nagios/libexec/check_oracle_health line 4512 +++++++++++++++
(The French part of the message means “cannot open shared object file: No such file or directory”.)
After reading the forum and googling for a while, I found a post saying that we must set the ORACLE_HOME and LD_LIBRARY_PATH variables in the NRPE init script in order to have a correct environment for launching check_oracle_health.
I tried this solution, but it did not solve the problem.
I also tried to use the --environment option of check_oracle_health to set these variables, but I always get the same error.
Any help/idea would be great
regards
lausser Reply:
November 4th, 2010 at 18:34check_oracle_health needs (like every other oracle client software) a valid environment with ORACLE_HOME, LD_LIBRARY_PATH etc. So you need to start your nrpe with an oracle environment.
lozair Reply:
November 4th, 2010 at 22:19@lausser, ok thanks for your response i have found the problem, the env variables were correctly set in nrpe but there was bad file permissions on the ORACLE_HOME directory for the nrpe user
-
OMD Version 0.44 erschienen » klimmbimm Says:
November 15th, 2010 at 17:56
[...] are used to query parameters of the well-known database systems. More information at http://labs.consol.de/lang/de/nagios/check_oracle_health/ and [...]
-
michauko Says:
November 15th, 2010 at 20:15Hi,
Thanks for this amazing Nagios plugin. It rocks!! I’m setting it up and have just one configuration problem to fix. If you have any idea, that’d be great:
- My NRPE process owner is “nagios” on a Debian system, so the scripts are started with its environment
- I can run all the tests from the command line, as “nagios”. It works just fine.
- I already have some other NRPE tests running with my Nagios/NRPE configuration.
For this plugin, from Nagios, I get an ORA-24327 “need explicit attach before authenticating a user” error. I might have to define something in the environment, but what? I saw some remarks about sqlnet.ora, but I am not sure what I have to do.
I’ve defined my NRPE commands like this one:
command[check_oracle_health_connection-time]=/usr/local/nagios/libexec/check_oracle_health --connect user/pass@MYORA --mode connection-time
As I told, running this command directly as “nagios” just works.
Thanks, Regards
Jacques M
lausser Reply:
November 15th, 2010 at 20:19
In order to successfully execute Oracle client software (like check_oracle_health), your NRPE needs to run with some special environment variables. Have a look at ORACLE_HOME, TNS_ADMIN and LD_LIBRARY_PATH. You need to set them in the NRPE init script.
michauko Reply:
November 15th, 2010 at 20:37@lausser, MMM, I’ll try tomorrow. The thing is that I could run another Oracle-based plug-in using sqlplus from Nagios without any problem. I’ll check this tomorrow, anyway
Thank you
michauko Reply:
November 16th, 2010 at 12:19@lausser, You’re right, thank you.
On Debian, I added the following to /etc/init.d/nagios-nrpe-server, after the initial PATH, DAEMON… initialisation:
export ORACLE_HOME=/usr/lib/oracle/11.1/client
export PATH=$PATH:$ORACLE_HOME/bin
export LD_LIBRARY_PATH=$ORACLE_HOME/lib
Thank you
I wonder if there is a better place to put these. I saw I can give env vars in your check_oracle_health script, but didn’t try it correctly I guess.
regards
-
Nagios : supervision d’une base Oracle - Le blog de Michauko Says:
November 16th, 2010 at 16:52
[...] I found this: http://labs.consol.de/nagios/check_oracle_health/ spotted on exchange.nagios.org. There is also the counterpart for MySQL, MSSQL etc. To be looked at [...]
-
Sylvain Says:
November 18th, 2010 at 16:56Hello,
I have a problem when I run “make”. I’m on Solaris 10, and DBI and DBD::Oracle are installed:
Making all in plugins-scripts
make[1]: Entering directory `/downloads/Nagios/check_oracle_health-1.6.6.1/plugins-scripts'
make[1]: *** No rule to make target `Nagios/DBD/Oracle/Server/Instance/SGA/SharedPool/DictionaryCache.pm', needed by `check_oracle_health'. Stop.
make[1]: Leaving directory `/downloads/Nagios/check_oracle_health-1.6.6.1/plugins-scripts'
make: *** [all-recursive] Error 1
Please help me
Many thanks in advance.
lausser Reply:
November 18th, 2010 at 16:58
Looks like your tar cannot unpack files with very long names. Take the shar.gz package instead.
-
Thomas Wollner Says:
November 23rd, 2010 at 17:57
Hi, thanks for the great plugin. Currently it seems to me that the percent calculation does something strange. I have a tablespace which is 500MB in size with ~245MB used. The plugin output tells me that 0.75% of the TS is used. The performance data seems to have the right values.
Plugin output:
OK - tbs TS_MYDB usage is 0.75% | 'tbs_ts_mydb_usage_pct'=0.75%;90;98 'tbs_ts_mydb_usage'=245MB;29491;32112;0;32767 'tbs_ts_mydb_alloc'=500MB;;;0;32767
The Database is an oracle 10g XE version.
Any thoughts on this? thanks in advance,
Tom
lausser Reply:
November 24th, 2010 at 12:32
The 500MB is the allocated size. But this tablespace seems to be auto-allocatable, so it will grow as soon as the used size gets close to the allocated size. If the calculation were based on used/allocated, the percentage would ‘jump’: it would approach 100%, then (after the database automatically allocates more space) drop again. That’s why the calculation is based on used size/maximum allocatable size.
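The numbers in Tom's plugin output can be checked by hand. The sketch below parses two of the performance data items he posted, using the standard label=value[UOM];warn;crit;min;max layout, and recomputes both percentages. The parser is a simplified illustration with a made-up name, not the plugin's own code.

```python
import re

def parse_perfdata(item):
    """Split one 'label'=value[UOM];warn;crit;min;max item into a dict."""
    label, _, rest = item.partition("=")
    fields = rest.split(";")
    fields += [""] * (5 - len(fields))      # pad missing trailing fields
    match = re.match(r"^(-?[0-9.]+)(.*)$", fields[0])
    return {
        "label": label.strip("'"),
        "value": float(match.group(1)),
        "unit": match.group(2),
        "warn": fields[1] or None,           # kept as text, may be a range
        "crit": fields[2] or None,
        "min": float(fields[3]) if fields[3] else None,
        "max": float(fields[4]) if fields[4] else None,
    }

usage = parse_perfdata("'tbs_ts_mydb_usage'=245MB;29491;32112;0;32767")
alloc = parse_perfdata("'tbs_ts_mydb_alloc'=500MB;;;0;32767")

# used / maximum allocatable size, the basis described in the reply
print(round(100 * usage["value"] / usage["max"], 2))
# used / currently allocated size, what Tom expected to see
print(round(100 * usage["value"] / alloc["value"], 2))
```

The first figure reproduces the 0.75% from the plugin output (245 of 32767 MB); the second is the roughly 49% that relating 245 MB to the 500 MB currently allocated would give.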
-
Prezes Klubu Says:
November 24th, 2010 at 3:17
Hello!
check_oracle_health is awesome :) But… it works fine only for some time: hours, days. After a while the service stays in one state forever, and all newly added services which use the check_oracle_health plugin hang in “pending” status indefinitely. When I run the check_command from the command line everything works fine and the results are correct. Any ideas?
lausser Reply:
November 24th, 2010 at 11:16
If Nagios doesn’t schedule a service any more, then it’s not the fault of the underlying plugin. The nagios-users mailing list would be the best place to address this kind of problem.
-
Prezes Klubu Says:
November 24th, 2010 at 18:34
Hello,
When should something be created in system_vartmpdir? After the first checks, or when? I don’t have anything in this directory… Might that be a reason why all Nagios checks hang in one status?
lausser Reply:
November 24th, 2010 at 18:38
Files are created in the temp-dir as soon as the plugin is executed. No temp-files = plugin has never been run. Plugin has never been run + pending status = Nagios problem. Again: if Nagios doesn’t run a plugin because the service is pending, this has nothing to do with the plugin.
-
Prezes Klubu Says:
November 24th, 2010 at 18:48
If I start the check_oracle_health plugin from the command line (with the same command and parameters that are defined in the Nagios command), should some files be created in the temp-dir?
lausser Reply:
November 24th, 2010 at 18:53
It depends on the mode. For example, connection-time does not write a temp-file.
Prezes Klubu Reply:
November 24th, 2010 at 18:55
@lausser, could you tell me which modes create a temp-file?
miwu Reply:
December 2nd, 2010 at 20:37
Hi,
first of all, many thanks for the top plugin. Unfortunately I too have problems with the plugin in connection with nrpe. When I call the plugin locally on the DB server everything works fine. When I try it via nrpe from the Nagios host, unfortunately it does not. I get the following error:
CRITICAL – cannot connect to testdb. install_driver(Oracle) failed: Can’t load ‘/usr/lib/perl5/site_perl/5.10.0/x86_64-linux-thread-multi/auto/DBD/Oracle/Oracle.so’ for module DBD::Oracle: libclntsh.so.11.1: cannot open shared object file: No such file or directory at /usr/lib/perl5/5.10.0/x86_64-linux-thread-multi/DynaLoader.pm line 203. at (eval 14) line 3 Compilation failed in require at (eval 14) line 3. Perhaps a required shared library or dll isn’t installed where expected at /usr/local/nagios/libexec/check_oracle_health line 4512
In /etc/init.d/nrpe I have already set the following (and restarted nrpe as well):
PATH=/opt/oracle/app/oracle/product/11.2.0/dbhome_1/bin:/bin:/sbin:/usr/sbin:/usr/local/sbin:/root/bin:/usr/local/bin:/usr/bin:/bin:/usr/bin/X11:/usr/X11R6/bin:/usr/games:/usr/lib/mit/bin:/usr/lib/mit/sbin
ORACLE_HOME=/opt/oracle/app/oracle/product/11.2.0/dbhome_1
LD_LIBRARY_PATH=$ORACLE_HOME/lib
export PATH
export ORACLE_HOME
export LD_LIBRARY_PATH
I have also read something here about having to change file permissions in ORACLE_HOME, but I could not work out which ones should be changed.
Can you help me with this?
Many thanks and regards
Miwu
lausser Reply:
December 2nd, 2010 at 20:59
Put the Oracle environment variables into the nrpe start script. See further above…
miwu Reply:
December 3rd, 2010 at 8:36
Hello,
I had already put the Oracle environment variables into /etc/init.d/nrpe (ORACLE_HOME, LD_LIBRARY_PATH and the updated PATH). Is anything still missing?
Many thanks!
Miwu
miwu Reply:
December 3rd, 2010 at 10:39
I have found the cause:
The file libclntsh.so.11.1 in ORACLE_HOME was not readable for the nrpe user.
Thanks!
Miwu
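A quick way to spot this kind of permission problem is to run a small readability check as the same user that nrpe runs under. The following Python sketch is only an illustration: the helper `is_readable_file` and the fallback path are assumptions, and the library name is simply the one from the error message in this thread.

```python
import os


def is_readable_file(path: str) -> bool:
    """True if path is an existing regular file that the current user may read."""
    return os.path.isfile(path) and os.access(path, os.R_OK)


if __name__ == "__main__":
    # Fallback path is the ORACLE_HOME quoted in the thread; adjust for your system.
    oracle_home = os.environ.get(
        "ORACLE_HOME", "/opt/oracle/app/oracle/product/11.2.0/dbhome_1"
    )
    lib = os.path.join(oracle_home, "lib", "libclntsh.so.11.1")
    state = "readable" if is_readable_file(lib) else "NOT readable by this user"
    print(f"{lib}: {state}")
```

Running it as root says little, because root can read almost everything; it has to be started under the monitoring user's UID to be meaningful.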
-
Prezes Klubu Says:
November 24th, 2010 at 18:54
I think that Nagios schedules the service check properly… I ran Nagios in debug mode and this is what I found:
[1290617402.177408] [016.0] [pid=17995] Checking service 'IMP Connected Users' on host 'hostname'…
[1290617402.177431] [001.0] [pid=17995] get_raw_command_line()
[1290617402.177438] [2320.2] [pid=17995] Raw Command Input: /usr/lib/nagios/plugins/check_oracle_health –environment ORACLE_HOME=/usr/lib/oracle/10.2.0.4/client –connect IMP –username SYSTEM –password imp –mode connected-users
[1290617402.177447] [2320.2] [pid=17995] Expanded Command Output: /usr/lib/nagios/plugins/check_oracle_health –environment ORACLE_HOME=/usr/lib/oracle/10.2.0.4/client –connect IMP –username SYSTEM –password imp –mode connected-users
[1290617402.177453] [001.0] [pid=17995] process_macros()
[1290617402.177459] [2048.1] [pid=17995] * BEGIN MACRO PROCESSING ********
[1290617402.177465] [2048.1] [pid=17995] Processing: '/usr/lib/nagios/plugins/check_oracle_health –environment ORACLE_HOME=/usr/lib/oracle/10.2.0.4/client –connect IMP –username SYSTEM –password imp –mode connected-users'
[1290617402.177482] [2048.2] [pid=17995] Processing part: '/usr/lib/nagios/plugins/check_oracle_health –environment ORACLE_HOME=/usr/lib/oracle/10.2.0.4/client –connect IMP –username SYSTEM –password imp –mode connected-users'
[1290617402.177490] [2048.2] [pid=17995] Not currently in macro. Running output (175): '/usr/lib/nagios/plugins/check_oracle_health –environment ORACLE_HOME=/usr/lib/oracle/10.2.0.4/client –connect IMP –username SYSTEM –password imp –mode connected-users'
[1290617402.177496] [2048.1] [pid=17995] Done. Final output: '/usr/lib/nagios/plugins/check_oracle_health –environment ORACLE_HOME=/usr/lib/oracle/10.2.0.4/client –connect IMP –username SYSTEM –password imp –mode connected-users'
[1290617402.177502] [2048.1] [pid=17995] * END MACRO PROCESSING **********
[1290617402.177510] [064.1] [pid=17995] Making callbacks (type 13)…
[1290617402.177555] [016.1] [pid=17995] Check result output will be written to '/var/lib/nagios3/spool/checkresults/checkQ53yV6' (fd=7)
[1290617402.177630] [016.1] [pid=17995] ** Using Embedded Perl interpreter to run service check…
[1290617402.256972] [016.2] [pid=18501] Moving temp check result file '/var/lib/nagios3/spool/checkresults/checkwsH433' to queue file '/var/lib/nagios3/spool/checkresults/cWbnK76'…
[1290617402.324166] [016.1] [pid=17995] Embedded Perl successfully compiled /usr/lib/nagios/plugins/check_oracle_health and returned code ref to plugin handler
[1290617402.324363] [016.2] [pid=17995] Service check is executing in child process (pid=18503)
-
Prezes Klubu Says:
November 26th, 2010 at 16:22
After changing the use_embedded_perl_implicitly option in nagios.cfg everything started working… for now :)
Thanks!
-
miwu Says:
December 3rd, 2010 at 13:16
Hello,
I am hugely impressed by the plugin. Everything works great apart from the following problem: when checking tablespace usage I get wrong values.
My SYSTEM tablespace is 700M in size, of which 273M are used. My check, however, yields the following:
./check_oracle_health –connect testdb –username=user –password=pw –mode tablespace-usage –tablespace system -units % OK – tbs SYSTEM usage is 0.84% | ‘tbs_system_usage_pct’=0.84%;90;98 ‘tbs_system_usage’=273MB;29491;32112;0;32767 ‘tbs_system_alloc’=700MB;;;0;32767
The 0.84% tablespace usage is unfortunately not correct, although the measured values are right. The problem occurs with all tablespaces. What could be the reason?
Many thanks!
Miwu
lausser Reply:
December 3rd, 2010 at 17:05
The tablespace is not 700MB in size. 700MB is the allocated space. When the used space reaches the 700MB, the allocated space will grow too. The plugin bases its calculation on the maximum reachable allocated space.
-
i611287 Says:
December 10th, 2010 at 17:43
Hello Lausser, I need your help. I have a problem with the ‘tablespace-usage’ mode.
First: the tablespace in question has the following figures: Used%: 92.42, Size: 134 MB, Used: 124 MB, Free: 10.2 MB
Second: the query with check_oracle_health is:
/usr/local/nagios/libexec/check_oracle_health –connect ‘(DESCRIPTION = (ADDRESS = (PROTOCOL = TCP) (HOST = X.Y.Z.R) (PORT = 1521)) (CONNECT_DATA = (SID = sidname)))’ –user nagios –password nagios –mode tablespace-usage –tablespace tablespacename
OK – tbs tablespacename usage is 0.38% | ‘tbs_tablespacename_usage_pct’=0.38%;90;98 ‘tbs_tablespacename_usage’=124MB;29491;32112;0;32767 ‘tbs_tablespacename_alloc’=134MB;;;0;32767
As you can see, some of the data does not match:
tbs tablespacename usage is 0.38% (should be 92.42%): failed
tbs_tablespacename_usage_pct = 0.38% (should be 92.42%): failed
tbs_tablespacename_usage = 124MB: ok
tbs_tablespacename_alloc = 134MB: ok
The percentages do not match.
Can you help me with this? If I’m wrong about something, please let me know.
Thanks.
Regards.
lausser Reply:
December 10th, 2010 at 18:06
Used%: 92.42
92% of the currently allocated space is used. Now write to the database and see what happens. Oracle will auto-allocate more space and your Used% will drop. Not good. check_oracle_health calculates the usage against the maximum allocatable space.
-
svbr Says:
December 13th, 2010 at 9:20
Hello, when using the check_oracle_health plugin we consider granting the privilege “GRANT SELECT any dictionary TO nagios;” questionable for security reasons if the Nagios user is “only” supposed to query the size of the tablespaces in the DB. After revoking the any dictionary privilege, however, no values reach Nagios any more. Which privileges are explicitly required for tablespace monitoring without allowing further access to other DB information?
lausser Reply:
December 13th, 2010 at 11:45
Finding that out takes time. My spare time, of which I currently have none. If you urgently need an answer, you are welcome to request a formal quote at support@consol.de. My employer will then allow me to gather the information during my working hours.
-
svbr Says:
December 14th, 2010 at 9:48
Thanks for the offer. In the meantime, after nights of searching the net, I got the information:
GRANT CREATE SESSION TO "NAGIOS";
GRANT SELECT ON "SYS"."DBA_UNDO_EXTENTS" TO "NAGIOS";
GRANT SELECT ON "SYS"."DBA_TABLESPACES" TO "NAGIOS";
GRANT SELECT ON "SYS"."DBA_TEMP_FILES" TO "NAGIOS";
GRANT SELECT ON "SYS"."V_$TEMP_EXTENT_POOL" TO "NAGIOS";
GRANT SELECT ON "SYS"."V_$TEMP_SPACE_HEADER" TO "NAGIOS";
GRANT SELECT ON "SYS"."DBA_DATA_FILES" TO "NAGIOS";
GRANT SELECT ON "SYS"."DBA_FREE_SPACE" TO "NAGIOS";
GRANT "CONNECT" TO "NAGIOS";
It works with these. Thanks
-
John Alberts Says:
December 16th, 2010 at 23:29
Hi. We’ve been using this plugin for about a year now and I thought it was working great. It turns out that it wasn’t monitoring the tablespace usage properly. We are using this in a RAC environment, so I wonder if that might have something to do with it. Does this plugin check tablespace usage properly for Oracle 10g RAC? The numbers it returns almost seem random. For instance, the plugin says my USERS tablespace is at 0.8%, while the actual usage determined using an SQL statement is around 91%. Other tablespaces are just the opposite: real usage is at almost 99%, while the plugin returns usage of around 20 or 30%. I know I didn’t give much detail here, but I figured I would ask here real quick and give more details if needed. Thanks
lausser Reply:
December 16th, 2010 at 23:43
99% of what? Of the currently allocated space or of the maximum allocatable space? In contrast to other tools, check_oracle_health compares used space to maximum allocatable space. Other tools compare used space to allocated space, which means that you have 99% now and, after the db auto-allocates more space, less than 99% the next moment.
John Alberts Reply:
December 17th, 2010 at 0:35
@lausser, I’m actually a complete idiot when it comes to Oracle, but the dba gave me some SQL to run to check space, which he says is accurate. Can you tell me what the difference is? Here is the output of the SQL query, and then the output of the plugin.
SQL> select a.TABLESPACE_NAME, a.BYTES bytes_used, b.BYTES bytes_free, b.largest,
       round(((a.BYTES-b.BYTES)/a.BYTES)*100,2) percent_used
     from (select TABLESPACE_NAME, sum(BYTES) BYTES from dba_data_files group by TABLESPACE_NAME) a,
          (select TABLESPACE_NAME, sum(BYTES) BYTES, max(BYTES) largest from dba_free_space group by TABLESPACE_NAME) b
     where a.TABLESPACE_NAME=b.TABLESPACE_NAME
     order by ((a.BYTES-b.BYTES)/a.BYTES) desc;

TABLESPACE_NAME  BYTES_USED  BYTES_FREE  LARGEST   PERCENT_USED
USERS            292225024   13959168    13303808  95.22
REGISTRATIONDB   674430976   33161216    31653888  95.08
/usr/local/nagios/libexec/check_oracle_health –connect=’(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=secret)(PORT=1521)))(CONNECT_DATA=(SERVICE_NAME=secret)))’ –user=secret –password=secret –mode=tablespace-usage –warning=90 –critical=95 –regexp –name “REGISTRATIONDB|USERS” OK – tbs USERS usage is 0.81%, tbs REGISTRATIONDB usage is 1.87% | ‘tbs_users_usage_pct’=0.81%;90;95 ‘tbs_users_usage’=265MB;29491;31129;0;32767 ‘tbs_users_alloc’=278MB;;;0;32767 ‘tbs_registrationdb_usage_pct’=1.87%;90;95 ‘tbs_registrationdb_usage’=611MB;29491;31129;0;32767 ‘tbs_registrationdb_alloc’=643MB;;;0;32767
Thanks
John Alberts Reply:
December 17th, 2010 at 0:38
@John Alberts, Sorry, that didn’t format very well at all. Here’s the output from the sql query formatted a bit better. Hopefully…
TABLESPACE_NAME  BYTES_USED  BYTES_FREE  LARGEST   PERCENT_USED
USERS            292225024   13959168    13303808  95.22
REGISTRATIONDB   674430976   33161216    31653888  95.08
lausser Reply:
December 17th, 2010 at 1:36
Please ask your dba if he knows the difference between the allocated size and the max size of a tablespace (with auto-allocation configured). As I already wrote, 95.22% usage means: 95.22% of the allocated space is used. If you continue writing to the tablespace, Oracle will automatically allocate more space. This means that your percentage will drop. You’ll get an alarm at 3:00 in the morning because your sql-script reports 99%, but at 3:01 Oracle auto-allocates more space. So by the time you have turned on your pc at 3:05 you’ll see, for example, 80%.
John Alberts Reply:
December 17th, 2010 at 3:11
@lausser, That’s what I suspected was the problem; however, our disk actually ran out of space and we were never alerted because the plugin still showed we were below our critical threshold of 95%. I wonder… is it possible for the dba to allocate a larger max size for the tablespace than is actually available on the disk? Maybe that’s what happened? The plugin showed tablespace x is at 80% because it still has 20% to go before it hits the max size, but unfortunately, the disk space was 100% used.
In any case, I’ll need to do some more playing around and talking to the dba some more.
-
John Alberts Says:
December 17th, 2010 at 4:43
Can you explain the performance data information please? I couldn’t find any documentation about it. For instance, for the users tablespace, it shows this:
‘tbs_users_usage_pct’=0.81%;90;95 ‘tbs_users_usage’=265MB;29491;31129;0;32767 ‘tbs_users_alloc’=278MB;;;0;32767
tbs_user_usage_pct makes sense. 0.81% used, warning threshold is 90% and critical is 95%.
I’m really not sure about the actual usage data.
lausser Reply:
December 18th, 2010 at 17:53
used = used data, alloc = allocated data
John Alberts Reply:
December 18th, 2010 at 18:53
@lausser, :) Yeah, I figured that much. I actually meant the semi-colon separated values. So, ‘tbs_users_usage’=265MB;29491;31129;0;32767
This = 265MB used and I’m not sure what the other numbers are.
Thanks for spending so much time helping me understand this.
Regards, john
lausser Reply:
December 18th, 2010 at 21:10
You can find the meaning of the values here: http://nagiosplug.sourceforge.net/developer-guidelines.html#AEN201
John Alberts Reply:
December 21st, 2010 at 17:13
@lausser, Thank you. I didn’t realize it was standard.
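For readers who do not want to follow the link: in the Nagios plugin guidelines each performance data item has the form 'label'=value[UOM];warn;crit;min;max, where everything after the value is optional. A small Python sketch of a parser for that layout follows. The names `PerfValue` and `parse_perfdata` are invented for this example, the regular expression is a simplification that ignores escaped quotes inside labels, and the sample uses straight quotes instead of the typographic ones shown in the comments above.

```python
import re
from typing import List, NamedTuple, Optional


class PerfValue(NamedTuple):
    label: str
    value: float
    unit: str
    warn: Optional[str]
    crit: Optional[str]
    minimum: Optional[str]
    maximum: Optional[str]


# quoted or bare label, '=', number, optional unit, then up to four ';'-separated fields
_PERF = re.compile(r"(?:'([^']+)'|([^\s'=]+))=([-+]?[0-9.]+)([^;\s]*)((?:;[^;\s]*)*)")


def parse_perfdata(text: str) -> List[PerfValue]:
    items = []
    for quoted, bare, value, unit, rest in _PERF.findall(text):
        fields = rest.split(";")[1:]          # drop the empty piece before the first ';'
        fields += [""] * (4 - len(fields))    # pad missing trailing fields
        warn, crit, lo, hi = (f or None for f in fields[:4])
        items.append(PerfValue(quoted or bare, float(value), unit, warn, crit, lo, hi))
    return items


sample = ("'tbs_users_usage_pct'=0.81%;90;95 "
          "'tbs_users_usage'=265MB;29491;31129;0;32767 "
          "'tbs_users_alloc'=278MB;;;0;32767")
for item in parse_perfdata(sample):
    print(item)
```

Read this way, 'tbs_users_usage'=265MB;29491;31129;0;32767 means: value 265 MB, warning threshold 29491, critical threshold 31129, minimum 0, maximum 32767.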
-
17 Nagios-Fliegen mit einer Klappe: OMD 0.44 | KenntWas.de - Technische Tips Says:
December 22nd, 2010 at 0:08
[...] check_oracle_health from the labs of Consol checks Oracle databases. I will go into its use in more detail in another article. check_oracle_health is a Perl script. The databases are (mostly) queried remotely from the Nagios server. For this the server needs either an Oracle (full) client or the Oracle Instant Client. I described the installation of the Oracle Instant Client under Ubuntu in an earlier article. There you will also find a sample call with the Oracle Easy Connect string. The connect strings should be given directly in the command. Using tnsnames.ora is too error-prone (it is quickly forgotten). I personally prefer the EZCONNECT strings (when they are allowed), since they are considerably simpler than “(DESCRIPTION =(ADDRESS = (PROTOCOL = TCP…..” connect strings. [...]
-
Fabien Says:
December 27th, 2010 at 18:04
Hi,
I tried to use the check_oracle_health script, but I have some problems with the execution.
For example, a “simple” command works: check_oracle_health –connect release –mode tnsping ==> OK
But when I try to access the database:
check_oracle_health –mode tablespace-usage
==> I have an error : CRITICAL – cannot connect to release. ORA-01017: invalid username/password; logon denied (DBD ERROR: OCISessionBegin)
If I add -user and -password to my command line, I get another error ;)
check_oracle_health –mode tablespace-usage -user OR#USER -password toto
==> I have an error :
Use of uninitialized value in split at ./check_oracle_health line 4381. Use of uninitialized value in split at ./check_oracle_health line 4381. bumm Can’t call method “execute” on an undefined value at ./check_oracle_health line 4690.
Can’t use an undefined value as an ARRAY reference at ./check_oracle_health line 4704.
Can someone help me with these errors?
Regards,
lausser Reply:
December 27th, 2010 at 19:31
OR#USER is not valid the way you use it. Either choose another name which contains no shell special characters or put it in quotes.
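When such command lines are generated by a script rather than typed by hand, the quoting can be left to a library. The Python sketch below uses the standard `shlex.quote` to build a shell-safe command line; the double-hyphen option spelling is an assumption based on the parameter list at the top of this page, and the password is the placeholder from the comment above.

```python
import shlex

user = "OR#USER"  # contains '#', which the shell may treat specially
args = [
    "check_oracle_health",
    "--mode", "tablespace-usage",
    "--user", user,
    "--password", "toto",
]

# Quote every argument; harmless ones are left unchanged.
command_line = " ".join(shlex.quote(a) for a in args)
print(command_line)
```

The user name ends up wrapped in single quotes, while arguments without special characters stay as they are.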
-
check_oracle_health: seg-top10- Abfragen verbessern | KenntWas.de - Technische Tips Says:
December 29th, 2010 at 12:27
[...] Links on v$segstat. The Nagios/OMD plugin check_oracle_health by Gerhard Lausser can, among other things, also run four top10 queries on v$segstat. These queries [...]
-
M.Nieberg Says:
December 29th, 2010 at 12:51
After having performance problems with the –mode=seg-top10- queries in check_oracle_health, I use another statement. You can find it here: http://wp.me/p16yMU-hN (German page). Perhaps it also works for you.
-
Michel Says:
January 3rd, 2011 at 18:10
Hi,
I am trying your new version (1.6.7) but seem to be running into a small issue. When I use the sql mode with –name2=”something” I receive “CRITICAL – output 0 not found”. When I delete the –name2=”something” it produces an “OK” status (with the complete script in it, of course). Any ideas what is wrong?
Michel
-
Tontonitch Says:
January 5th, 2011 at 19:06
Hi Gerhard,
I’m facing some problems with the new version 1.6.7 of the check_oracle_health plugin and the modification made to the sql mode. Below are 2 examples of use taken from the examples on this page, with the Perl error returned:
Example 1:
icinga@monitor:/usr/local/icinga/libexec$ ./check_oracle_health –connect XX –username XXXX –password xxxxxxxxxxx –mode sql –name "select 'abc123' from dual" –name2 abc123
Use of uninitialized value $_ in pattern match (m//) at ./check_oracle_health line 3878.
CRITICAL – output abc123 not found
Example 2:
icinga@monitor:/usr/local/icinga/libexec$ ./check_oracle_health –connect XX –username XXXX –password xxxxxxxxxxx –mode sql –name 'select count(*) from v$session' –name2 sessions
Use of uninitialized value $_ in pattern match (m//) at ./check_oracle_health line 3878.
CRITICAL – output 26 not found
icinga@monitor:/usr/local/icinga/libexec$
Regards,
Yannick
lausser Reply:
January 5th, 2011 at 19:55
Release 1.6.8 is out which corrects this error.
-
Jens G. Says:
January 10th, 2011 at 12:44
Hello, I have a question: today we updated the check from 1.5.0.1 to 1.6.8. According to the changelog, version 1.6.4 added the check of the objects in the table “dba_registry”. Now I have a project here which uses Oracle 10 with the Standard Edition One licence. With this edition it seems to be the default that the following components are off:
Oracle Data Mining = OPTION OFF
Oracle OLAP API = OPTION OFF
Spatial = OPTION OFF
OLAP Analytic Workspace = OPTION OFF
So the question is how to work around this problem. Does it make sense to extend the query (status <> ‘VALID’;) with status <> ‘OPTION OFF’;?
Or are there other solutions?
Many thanks and best regards, Jens G.
-
Hector Roman Says:
January 14th, 2011 at 17:52
I need your guidance again. After installing the Oracle plugin, I am using the following statement from the command line:
./check_oracle_health –connect 172.20.10.180 –user monitoring –password bbbb –mode tablespace-usage
The result is:
CRITICAL – cannot connect to 172.20.10.180. install_driver(Oracle) failed: Can’t locate DBD/Oracle.pm in @INC (@INC contains: . /etc/perl /usr/local/lib/perl/5.10.1 /usr/local/share/perl/5.10.1 /usr/lib/perl5 /usr/share/perl5 /usr/lib/perl/5.10 /usr/share/perl/5.10 /usr/local/lib/site_perl) at (eval 18) line 3. Perhaps the DBD::Oracle perl module hasn’t been fully installed, or perhaps the capitalisation of ‘Oracle’ isn’t right. Available drivers: AnyDATA, CSV, DBM, ExampleP, File, Gofer, ODBC, Proxy, Sponge, Sybase, mysql. at ./check_oracle_health line 4596
I do not know if the statement I am using is correct, because I have seen others specify a tnsnames entry in the connection. I installed the Oracle Client without any problems and I was able to connect using SQLPLUS.
Thank you for your guidance.
Hector Roman Reply:
January 14th, 2011 at 17:55
I am using Ubuntu Server 10.4
perl -MCPAN -e shell
cpan> install DBI
lausser Reply:
January 14th, 2011 at 18:02
DBI is not enough. From the plugin’s error output: ……Perhaps the DBD::Oracle perl module hasn’t been fully installed…..
Either install DBD::Oracle or use the parameter –method sqlplus
-
Christoph Greiner Says:
January 17th, 2011 at 12:43
When I start nagios in daemon mode I get the following messages…
Illegal division by zero at /usr/local/nagios/libexec/check_oracle_health line 3350, <> line 11. Warning: Return code of 25 for check of service ‘ora-tablespace-io-balance’ on host ‘TSAPBWP02_LEN_LAN’ was out of bounds.
My command looks like this:
check_oracle_health -3 –method sqlplus –mode tablespace-io-balance –warning 900 –critical 950 –environment ORACLE_HOME=/usr/lib/oracle/11.2/client –environment LD_LIBRARY_PATH=/usr/lib/oracle/11.2/client/lib/ –connect SID –username=nagios –password=nagiospass
lausser Reply:
January 19th, 2011 at 19:52
Please create a file with touch /tmp/check_oracle_health.trace. As long as it exists, check_oracle_health writes the SQL statements it issues and their results into it. Please run the datafile-related statements by hand with sqlplus and check for empty results or errors.
While I’m looking at it… what does “call as a daemon” mean? Does that mean it works when you call it on the command line? In that case I would have to scold you soundly, because check_oracle_health was presumably called as root, and that is a no-no. A NO-NO! Nagios plugins are never called as root, but as the Nagios user. root = NO-NO!
The reason is that in some modes an intermediate file is created in /var/tmp/check_oracle_health. If it initially belongs to root, it can no longer be overwritten later when check_oracle_health runs under the Nagios UID, and then things break.
-
Hennie Says:
January 25th, 2011 at 1:03
Hi Lausser
Thanks for a great extension. I would like to exclude the undo tablespace from the fragmentation check. Where can I change the script to exclude it?
lausser Reply:
January 25th, 2011 at 4:06
You don’t have to modify the code. You simply have to read the manual or search Google.
-
Christoph Says:
January 27th, 2011 at 11:04
Hello Mr. Lausser,
can “tablespace-usage” be run without the autoextend option, so that the fill level of the tablespace matches the value in SAP (transaction DB02)?
Output of check_oracle_health:
‘tbs_psapp02_usage_pct’=40.89%;90;95 ‘tbs_psapp02_usage’=310752MB;684000;722000;0;760000 ‘tbs_psapp02_alloc’=460309MB;;;0;760000
Values in DB02:
PSAPP02 Size = 460.310,00 Free = 148.345,69
Used = 68 %
I would also like to exclude the UNDO tablespace. That works in the shell, but not when I configure it via Centreon.
./check_oracle_health -3 –method sqlplus –mode tablespace-usage –name=’^(?!(PSAPUNDO))’ –regexp –warning 90 –critical 95 –environment ORACLE_HOME= …usw…
Centreon cannot cope with the regexp there. Couldn’t you introduce a separate parameter? ;-)
Thanks and greetings from the Black Forest, Christoph
lausser Reply:
January 28th, 2011 at 19:00
That is not possible and was deliberately not done that way. The *_allocated value should correspond to the SAP fill level.
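The exclusion pattern from the question, ^(?!(PSAPUNDO)), is a negative lookahead: it matches every name that does not start with PSAPUNDO. The plugin itself is written in Perl, but Python's `re` module understands the same construct, so the behaviour can be tried out with the short sketch below. The list of tablespace names is made up apart from PSAPP02 and PSAPUNDO, and how the plugin applies the pattern internally is not shown here.

```python
import re

# Same pattern as in the command line from the question.
pattern = re.compile(r"^(?!(PSAPUNDO))")

# PSAPP02 and PSAPUNDO appear in the thread; the other names are invented.
names = ["PSAPP02", "PSAPUNDO", "SYSTEM", "PSAPUNDO2"]

selected = [name for name in names if pattern.search(name)]
print(selected)
```

Only PSAPP02 and SYSTEM are selected; both PSAPUNDO and PSAPUNDO2 are skipped because the lookahead rejects anything beginning with PSAPUNDO.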
-
Marco P Says:
February 1st, 2011 at 11:53
Hi, when I start /usr/local/nagios/libexec/check_oracle_health –connect ora_db1 –mode tablespace-usage
I get an error: ORA-24327: need explicit attach before authenticating a user (DBD ERROR: OCISessionBegin)
sqlnet.ora exists:
cat /usr/lib/oracle/11.2/client64/network/admin/sqlnet.ora
names.directory_path = (TNSNAMES,EZCONNECT)
env | grep TNS
TNS_ADMIN=/usr/lib/oracle/11.2/client64/network/admin
env | grep ORA
ORACLE_HOME=/usr/lib/oracle/11.2/client64
env | grep LD
LD_LIBRARY_PATH=/usr/lib/oracle/11.2/client64/lib
But the following script works:
#!/usr/bin/perl -w
use DBI;
# Connect with an Easy Connect string (host:port/service) plus user and password
my $dbh = DBI->connect('dbi:Oracle:192.168.1.1:1521/prod', 'scott', 'tiger')
    || die "Connect error: $DBI::errstr";
# Fetch one row: employee name and salary
my ($name, $sal) = $dbh->selectrow_array("SELECT ename,sal FROM scott.emp WHERE empno = 7369");
print "Name Gehalt\n";   # Gehalt = salary
print "---- ------\n";
print "$name $sal";
$dbh->disconnect;
Any Ideas? Thanks Marco
Marco P Reply:
February 1st, 2011 at 12:33
Found the solution: username and password were missing. Thanks Marco
-
Hector Roman Says:
February 3rd, 2011 at 17:08
Friends, thanks to your help and after much stress I managed to install the Perl Oracle driver. I created the corresponding tnsnames entry for the database I want to monitor, and a tnsping against it responds.
tnsping MPROD1
TNS Ping Utility for Linux: Version 10.2.0.1.0 – Production on 03-FEB-2011 11:50:07 Copyright (c) 1997, 2005, Oracle. All rights reserved. Used parameter files: Used TNSNAMES adapter to resolve the alias Attempting to contact (DESCRIPTION = (ADDRESS_LIST = (ADDRESS = (PROTOCOL = TCP)(HOST = 172.20.10.180)(PORT = 1521))) (CONNECT_DATA = (SID = mprod1) (SERVER = DEDICATED))) OK (10 msec)
But the plugin throws the following error:
./check_oracle_health –method sqlplus –connect 172.20.10.180 –user monitoreo –password monitoreo2011 –mode tablespace-usage
CRITICAL – cannot connect to 172.20.10.180. ORA-12504: TNS:listener was not given the SERVICE_NAME in CONNECT_DATA .
Any suggestions?
lausser Reply:
February 3rd, 2011 at 17:17
If there are fifty Oracle instances running on 172.20.10.180, how should the listener know which one you mean? You already set up a tnsnames.ora, so why do you connect to the db server’s IP address instead of the Oracle instance’s SID?
... --connect MPROD1 ...
Hector Roman Reply:
February 3rd, 2011 at 17:44
@lausser, Sorry for my stupidity, I had not noticed. It worked perfectly, my friend. Whenever you want to come to Chile, let me know: you have earned a stay here thanks to your constant support.
-
M.-M. Suchanek Says:
February 4th, 2011 at 10:50
Hello Mr. Lausser,
when I query “tablespace-can-allocate-next”, my DB takes longer than 60 sec to answer, and your plugin then runs into “timed out”. Is there a way to set the timeout higher?
lausser Reply:
February 4th, 2011 at 10:54
http://www.nagios-wiki.de/nagios/doku3/configmain#service_check_timeout
M.-M. Suchanek Reply:
February 4th, 2011 at 12:32
Hello Mr. Lausser,
the parameter “service_check_timeout” did not help (:-(:
/usr/lib/nagios/plugins/check_oracle_health –connect=db –user=nagios –password=”xxx” –mode=tablespace-can-allocate-next
“UNKNOWN – check_oracle_health timed out after 60 seconds”
lausser Reply:
February 4th, 2011 at 12:34
What about
check_oracle_health --timeout 300 .....
M.-M. Suchanek Reply:
February 4th, 2011 at 13:29
Thanks! Now it works.
-
Jens Says:
February 7th, 2011 at 11:55
I also have a small problem, which does not always occur but at irregular intervals. Unfortunately I do not know a solution; so far reinstalling the plugin had always helped, but not this time.
When Nagios runs the plugin, I get the message: CRITICAL – cannot connect to CPLANDB. ERROR OCIEnvNlsCreate. Check ORACLE_HOME (Linux) env var or PATH (Windows) and or NLS settings, permissions, etc.
When I run the same command from the command line, I get the response OK.
lausser Reply:
February 7th, 2011 at 11:57
The nagios process needs the Oracle environment (export ORACLE_HOME=…). Best to put it into the init script.
Jens Reply:
February 7th, 2011 at 11:57
Me again. The problem appeared after I had rebooted the Nagios server. I have just restarted only Nagios, and since then it works again…
lausser Reply:
February 7th, 2011 at 12:01
Just as I thought. In your shell you have set the Oracle environment variables. When you start Nagios from the shell, the Nagios process inherits the environment and everything works. When Nagios is started at boot time by the init script, it does not have this environment and the plugin does not work. That is why ORACLE_HOME, LD_LIBRARY_PATH, TNSNAMES or whatever else is needed must be entered separately in the init script.
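The inheritance effect is easy to reproduce outside of Nagios. The Python sketch below starts the same child process twice, once with an environment that lacks ORACLE_HOME (like a daemon started at boot) and once with the variable set (like a process started from a prepared shell). The helper name `child_sees` and the example path are assumptions for illustration only.

```python
import os
import subprocess
import sys

# The child simply reports what it sees for ORACLE_HOME.
CHILD = "import os; print(os.environ.get('ORACLE_HOME', 'unset'))"


def child_sees(env: dict) -> str:
    """Run a child Python process with the given environment and return its report."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        env=env, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


# Environment without ORACLE_HOME, as an init script would provide it by default.
bare_env = {k: v for k, v in os.environ.items() if k != "ORACLE_HOME"}
# Environment as exported in an interactive shell (example path).
shell_env = dict(bare_env, ORACLE_HOME="/usr/lib/oracle/11.2/client64")

print("started at boot:   ", child_sees(bare_env))
print("started from shell:", child_sees(shell_env))
```

A plugin launched by the parent only ever sees what the parent itself was given, which is why the variables have to be exported in the init script.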
-
Jens Says:
February 7th, 2011 at 12:25
All right, then I’ll see how to do that :) Many thanks
-
Jens Says:
February 7th, 2011 at 12:31
And once more… I have just checked: in the rc2.d startup script of Nagios I have the corresponding variables:
export ORACLE_HOME=$ORACLE_BASE/product/10.2.0/client_1
export LD_LIBRARY_PATH=$ORACLE_HOME/lib:$LD_LIBRARY_PATH
lausser Reply:
February 7th, 2011 at 12:38
Is there also an
export ORACLE_BASE=….?
Without that, ORACLE_HOME does not get a valid value.
-
Jens Says:
February 7th, 2011 at 12:55
Thanks -.- I was just blind… that seems to have been it. At least it still looks good after the reboot.
-
Joe Says:
February 7th, 2011 at 16:53
Hello Mr. Lausser,
I have just had a little time to look into the invalid registry components which a check reports since the update of our script version.
In the registry on the affected server some modules are marked with ‘OPTION OFF’, so they are not active. Unfortunately the check has no exception for this in the code.
lausser Reply:
February 7th, 2011 at 17:49
So that means, if instead of
SELECT COUNT(DISTINCT STATUS) FROM dba_registry WHERE status <> 'VALID'
one wrote in future
SELECT COUNT(DISTINCT STATUS) FROM dba_registry WHERE NOT (status = 'VALID' OR status = 'OPTION OFF')
then it would be perfect?
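The effect of the two statements can be compared without an Oracle instance. The Python sketch below uses an in-memory SQLite table as a stand-in for dba_registry; this is purely an assumption for demonstration, the real view has more columns, and the VALID row is invented. The component names with OPTION OFF are taken from the earlier comment in this thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified stand-in for Oracle's dba_registry view.
conn.execute("CREATE TABLE dba_registry (comp_name TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO dba_registry VALUES (?, ?)",
    [
        ("Oracle Database Catalog Views", "VALID"),  # invented example row
        ("Oracle Data Mining", "OPTION OFF"),
        ("Spatial", "OPTION OFF"),
    ],
)

old_sql = "SELECT COUNT(DISTINCT STATUS) FROM dba_registry WHERE status <> 'VALID'"
new_sql = (
    "SELECT COUNT(DISTINCT STATUS) FROM dba_registry "
    "WHERE NOT (status = 'VALID' OR status = 'OPTION OFF')"
)

old_count = conn.execute(old_sql).fetchone()[0]
new_count = conn.execute(new_sql).fetchone()[0]
print("old statement:", old_count)
print("new statement:", new_count)
```

With this data the old statement counts one non-VALID status (OPTION OFF), while the new statement counts none, so disabled options would no longer be reported as invalid.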
-
Conny Seifert Says:
February 7th, 2011 at 19:49
Hello,
there seems to be a bug in the mode tablespace-remaining-time (version 1.6.8). After adjusting line 473 in tablespace.pm this mode works for me.
my $lookback = (($params{lookback} || 30) + 1) * 24 * 3600;
In addition, from my point of view it would make sense to add the name of the tablespace to the output which is appended to the “output buffer” in lines 781, 792 and 795.
-
Mendes Says:
February 8th, 2011 at 15:22
Hello,
I’m trying to use check_oracle_health. It works with a few modes (tnsping, connection-time, sga-data-buffer-hit-ratio, tablespace-fragmentation).
With some other modes (tablespace-usage, tablespace-free, and others), I get the following error:
bumm Can’t call method “execute” on an undefined value at /opt/nagios/libexec/check_oracle_health line 4646.
Can’t use an undefined value as an ARRAY reference at /opt/nagios/libexec/check_oracle_health line 4660.
With other modes such as sga-library-cache-hit-ratio, sga-dictionary-cache-hit-ratio and some other modes, I get the following type of error :
CRITICAL – unable to get sga lc
Would you please help me ? Any advice would be appreciated…
Regards, S. Mendes
lausser Reply:
February 8th, 2011 at 15:26
Looks like your user doesn’t have the necessary privileges.
-
Hector Roman Says:
February 8th, 2011 at 15:47
Lausser,
When I run the plugin from the console it works perfectly, but when it is integrated with Nagios I get the following error:
CRITICAL – Can not connect to mprod1. ORA-12154: TNS: Could not resolve the connect identifier Specified
If I run tnsping, I get the correct answer:
tnsping mprod1
TNS Ping Utility for Linux: Version 10.2.0.1.0 – Production on 08-FEB-2011 10:43:17
Copyright (c) 1997, 2005, Oracle. All rights reserved.
Used parameter files:
Used TNSNAMES adapter to resolve the alias
Attempting to contact (DESCRIPTION = (ADDRESS_LIST = (ADDRESS = (PROTOCOL = TCP) (HOST = 172.20.10.180) (PORT = 1521))) (CONNECT_DATA = (SID = mprod1) (SERVER = DEDICATED) (SERVICE_NAME = mprod1)) )
OK (0 msec)
I think the problem is that the nagios user cannot access the file tnsnames.ora. Do you think that is the problem?
-
Mendes Says:
February 9th, 2011 at 11:41
You are right! I just checked and found that the user privileges were not set correctly. Thank you very much!
Hector Roman Reply:
February 9th, 2011 at 19:33
First I changed the permissions of tnsnames.ora to 777, which did not work. Then I ran chown nagios:nagios tnsnames.ora, which did not work either. The error message is the same. Any idea where to look?
CRITICAL – cannot connect to mprod1. ORA-12154: TNS:could not resolve the connect identifier specified
Hector Roman Reply:
February 9th, 2011 at 19:35
It is worth noting that the plugin works properly when I run it from a console as root.
Hector Roman Reply:
February 9th, 2011 at 21:12
@Hector Roman, solution:
chown nagios:dba tnsnames.ora
-rwxrwxrwx 1 nagios dba 661 2011-02-03 12:39 tnsnames.ora*
-
Chris Says:
February 16th, 2011 at 14:00
From Nagios:
define command {
command_name check_oracle_health
command_line /usr/lib/nagios/plugins/check_oracle_health --connect=nagios/passwd@$ARG1$ --mode=$ARG2$
}
define service{
use generic-service
host_name some_hostname
service_description sga-shared-pool-reloads
check_command check_oracle_health!db_name!sga-shared-pool-reloads
}
I receive the message **ePN /usr/lib/nagios/plugins/check_oracle_health: “printf() on closed filehandle STATE at (eval 1) line 4332,”. The same happens with the modes sga-library-cache-hit-ratio, sga-latches-hit-ratio and pga-in-memory-sort-ratio.
From the shell as root:
./check_oracle_health --connect=nagios/passwd@db_name --mode=sga-shared-pool-reloads
OK – SGA shared pool reload ratio 0.25% | sga_shared_pool_reload_ratio=0.25%;1;10
Looks good.
With these modes it works fine from Nagios: sga-shared-pool-free returns OK – SGA shared pool free 25.07%. Tablespace usage is also fine: OK – tbs SYSTEM has 32220.80MB free space left
I googled but I can’t find the answer. Thanks for helping.
-
Chris Says:
February 16th, 2011 at 14:21
The answer is: chown -R nagios:nagios /var/tmp/check_oracle_health, as mentioned above.
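Whether the account that runs the checks can actually use that directory is easy to verify before touching ownership. The snippet below is a small sketch, not part of the plugin: the function name can_use_state_dir is invented, and the path is simply the one from this comment. It has to be run as the user Nagios runs under (for example via sudo -u nagios) to be meaningful, since root passes the check for any existing directory.

```python
import os

# Path taken from the comment; other installations may use a different one.
STATE_DIR = "/var/tmp/check_oracle_health"


def can_use_state_dir(path=STATE_DIR):
    """True if `path` is a directory the current user can enter and write to."""
    return os.path.isdir(path) and os.access(path, os.W_OK | os.X_OK)


if __name__ == "__main__":
    if can_use_state_dir():
        print(f"{STATE_DIR} is usable for the current user")
    else:
        print(f"{STATE_DIR} is missing or not writable for the current user")
```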
-
@Christina Says:
February 23rd, 2011 at 16:24
Hello, I have set up check_oracle_health with tablespace-free. It works great, except for error messages that I get now and then, such as: **ePN /usr/local/nagios/libexec/check_oracle_health: “Can’t open /tmp/server::database::tablespace::freeVu1kJ.out: No such file or directory at (eval 18) line 4631, <> chunk 2.”. After a while the message disappears again and the tablespace is displayed correctly. Does anyone have an idea how to solve this problem? BR, Christina
lausser Reply:
February 23rd, 2011 at 16:31
Which version of check_oracle_health?
-
@Christina Says:
February 23rd, 2011 at 17:26
Hi, the version is 1.6.4.
lausser Reply:
February 23rd, 2011 at 17:28
Then please update to 1.6.8.1. The handling of the out files in particular has changed since then.
-
miwu Says:
February 28th, 2011 at 11:10
Hello,
since restarting my server, all my checks with --mode=tablespace-remaining-time fail with the error “Can’t use an undefined value as an ARRAY reference at ./check_oracle_health line 3145”. Before that I always got the error “No data avaiable”. There are various files for the tablespaces under /var/tmp/check_oracle_health. What could be the cause? Many thanks! Miwu
-
@Christina Says:
February 28th, 2011 at 11:53
Hi, thanks for the info. We will update to the new version 1.6.8.1. One more question, to make sure we don’t have permission problems on top of that. Now and then we also get the following error message: CRITICAL – cannot connect to nagios/***@GIS. Argument x86_64/Linux 2.4.xx isnt numeric in numeric eq (==) at (eval 18) line 4594, line 1. The plugin belongs to the nagios user, but I noticed that the Oracle client (tnsnames.ora) belongs to root. Regards, Christina
-
Sven Says:
March 4th, 2011 at 16:52
Hello Gerhard,
I have a rather mysterious error with your otherwise really brilliant plugin.
System: Red Hat Enterprise Linux v5.5, DBI v1.52, DBD v1.27, Nagios v3.2.3
The first of my checks in command.cfg was
define command{
command_name oracle_health_tnsping
command_line /usr/local/nagios/libexec/check_oracle_health \
--connect=$ARG1$ \
--mode=$ARG2$
}
My second entry then used the mode connection-time. I have already read that connection-time requires a working TNS listener, which makes the tnsping check essentially redundant. I still find the following interesting:
If you want to run BOTH commands for testing, neither works; both return rather cryptic error messages.
cannot connect to svenf. install_driver(Oracle) failed: Can’t load ‘/usr/lib64/perl5/site_perl/5.8.8/x86_64-linux-thread-multi/auto/DBD/Oracle/Oracle.so’ for module DBD::Oracle: libclntsh.so.11.1: cannot open shared object file: No such file or directory at /usr/lib64/perl5/5.8.8/x86_64-linux-thread-multi/DynaLoader.pm line 230.
If I then comment out the tnsping check, restart Nagios and hope that connection-time will work, I am disappointed.
connection-time only works once I change the service description, for example from “Connection Time” to “01. Connection Time”.
I cannot quite see the reason for this, but it is no big deal either. I just want to mention it in case someone who works with the plugin after me gets the same error and would otherwise burn a whole Friday on it ;)
Thanks and regards, Sven
lausser Reply:
March 6th, 2011 at 22:36
‘libclntsh.so.11.1: cannot open shared object file’ means that LD_LIBRARY_PATH is not set correctly. Presumably the same goes for PATH and ORACLE_HOME. They also have to be set in the Nagios init script.
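A quick way to see what the monitoring process actually inherits is to run a small check under the same account and init environment. The sketch below is an illustration only: the helper names missing_oracle_env and dirs_with_client_lib are invented, the variable list is just the three named in this reply, and matching on the file name prefix libclntsh.so is an assumption based on the error message quoted above.

```python
import os

# The three variables named in the reply.
REQUIRED_VARS = ("ORACLE_HOME", "LD_LIBRARY_PATH", "PATH")


def missing_oracle_env(env=None):
    """Return the required variables that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


def dirs_with_client_lib(env=None, prefix="libclntsh.so"):
    """Return LD_LIBRARY_PATH entries that contain a file starting with `prefix`."""
    env = os.environ if env is None else env
    hits = []
    for directory in env.get("LD_LIBRARY_PATH", "").split(os.pathsep):
        if not directory:
            continue
        try:
            names = os.listdir(directory)
        except OSError:
            # Missing or unreadable directory: it cannot provide the library.
            continue
        if any(name.startswith(prefix) for name in names):
            hits.append(directory)
    return hits


if __name__ == "__main__":
    print("missing variables:", missing_oracle_env())
    print("directories with client library:", dirs_with_client_lib())
```

If the first list is non-empty or the second is empty when run the way Nagios runs its plugins, the export statements are not reaching the process.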
-
Sven Says:
March 4th, 2011 at 16:54
I forgot to mention: check_oracle_health v1.6.8.1
-
Sven Says:
March 7th, 2011 at 9:34
Thanks Gerhard, that was the root of the problem ;)
-
Thorsten (Panik) Says:
March 10th, 2011 at 15:40
Version 1.6.8.1, and I am not an Oracle database person. Everything is still in the test phase, which is why I am running it as root.
When I run
root@lxthorsten:/usr/local/nagios/libexec# ./check_oracle_health --connect dexicon --mode connected-users --user --password=
I get the following:
Invalid conversion in printf: “%S” at /usr/lib/perl/5.10/IO/Handle.pm line 154.
Invalid conversion in printf: “%’” at /usr/lib/perl/5.10/IO/Handle.pm line 154.
Use of uninitialized value $value in numeric gt (>) at ./check_oracle_health line 4002.
Use of uninitialized value $value in numeric gt (>) at ./check_oracle_health line 4003.
Use of uninitialized value in sprintf at ./check_oracle_health line 1987.
Use of uninitialized value in sprintf at ./check_oracle_health line 1991.
OK – 0 connected users | connected_users=0;50;100
A cat /tmp/check_oracle_health.trace shows:
Thu Mar 10 14:33:40 2011: RESULT: $VAR1 = [];
Thu Mar 10 14:33:40 2011: fetchrow_array: SELECT version FROM product_component_version WHERE product LIKE '%Server%'
Thu Mar 10 14:33:40 2011: args: $VAR1 = [];
Thu Mar 10 14:33:40 2011: RESULT: $VAR1 = [];
Thu Mar 10 14:33:40 2011: fetchrow_array: SELECT COUNT(*) FROM v$session WHERE type = 'USER'
Thu Mar 10 14:33:40 2011: args: $VAR1 = [];
Thu Mar 10 14:33:40 2011: RESULT: $VAR1 = [];
Thu Mar 10 14:33:40 2011: disconnecting DBD with handle
Am I the problem, in my ignorance, because I am issuing the command incorrectly, or is this a ‘real’ problem?
lausser Reply:
March 10th, 2011 at 15:47
Strictly speaking, I shouldn’t answer here as long as I see ‘root’. That leads to problems at the latest when the plugin runs under the control of the Nagios process. But then that is not my problem. Presumably the DB user used here does not have the required privileges. See the documentation for details.
Thorsten (Panik) Reply:
March 16th, 2011 at 16:19
@lausser, I deliberately mentioned “[...] Everything is still in the test phase, which is why I am running it as root. [...]” to pre-empt any comments on this topic. I am aware that there can be permission problems when switching from root to the nagios user, and I will keep that in mind for the later steps.
According to my Oracle DBA, the Oracle user in use should (in theory) have been created as described in the documentation.
-
Theofanis Katsiaounis Says:
March 14th, 2011 at 11:06
Hello. I would like to ask you a question. When I run the command
check_oracle_health --connect AAA --user someuser --password somepassword --mode latch-waiting --name 'user lock'
from the shell it works fine, but when I try to use it through Centreon it gives me the following error:
CRITICAL – cannot connect to asp. ORA-24327: need explicit attach before authenticating a user (DBD ERROR: OCISessionBegin)
I have Oracle Instant Client installed, as well as DBI and DBD::Oracle. Do you have any idea why this might be happening? Thanks a lot in advance.
lausser Reply:
March 14th, 2011 at 11:16
The Centreon process probably needs a correct Oracle environment (ORACLE_HOME, LD_LIBRARY_PATH, etc.) too. You have this environment in your shell, which is why it works there. Put the export statements in the Centreon init script.
-
konrad Says:
March 14th, 2011 at 16:06
Hello,
I am using check_oracle_health (1.6.8.1) and get the following error message:
check_oracle_health --connect orcl11 --user nagios --password ****** --mode list-sysstats
bumm Can’t call method “execute” on an undefined value at /usr/local/nagios3/libexec/check_oracle_health line 4730.
Can’t use an undefined value as an ARRAY reference at /usr/local/nagios3/libexec/check_oracle_health line 4744.
What is the cause? Many thanks
lausser Reply:
March 14th, 2011 at 16:13
Does the user nagios have the required privileges?
-
konrad Says:
March 16th, 2011 at 11:52
It was a problem with the Oracle DB. Everything works now.
Many thanks
-
Niggo Says:
March 16th, 2011 at 13:39
Hi,
I have a problem with the tablespace-remaining-time mode. Calling it produces the following error: Can’t use an undefined value as an ARRAY reference at /usr/local/nagios/libexec/dist/check_oracle_health line 3145.
All other modes work flawlessly, the privileges in the DB are correct, and the Perl modules are, as far as I can see, all installed correctly.
-
Ashish Kumar Says:
March 17th, 2011 at 9:47
Thank you for donating this plugin to the community; it saves a lot of time for system administrators with no Oracle background.
I was wondering if there is a way to check multiple (more than one but not all) tablespaces.
One more quick question: is there a way to display only those tablespaces that have a problem, rather than all of them? It sometimes annoys DBAs when they have to read a long message to figure out which tablespace has a problem.
I am using version 1.6.3.
Thank you
lausser Reply:
March 17th, 2011 at 10:42
Tablespace monitoring for a selected subset of tablespaces: with the parameter --name you select a specific tablespace, like --name USERS. If you also add the parameter --regexp, the string in name is interpreted as a regular expression.
--name='^((TABLESPACE1$)|(TABLESPACE2$)|(TABLESPACE3$))' --regexp
Or the other way round, every tablespace except tablespace1/2/3:
--name='^(?!(TABLESPACE1$)|(TABLESPACE2$)|(TABLESPACE3$))' --regexp
How to shorten the output can be found in a blog article: http://labs.consol.de/lang/de/blog/nagios/verkrzen-der-ausgabe-von-check_oracle_health-und-konsorten It’s in German, but you should be able to follow it; it shows the command line parameters and the corresponding screenshots.
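The two patterns from this reply can be checked against a list of names before they go into a service definition. This is a sketch in Python rather than Perl; for these particular patterns the two regex dialects behave the same, but how the plugin applies the expression internally (for example case sensitivity) is not shown in this thread and is left as an assumption. The tablespace names and the helper select_tablespaces are made up for illustration.

```python
import re

# Patterns copied from the reply: an include list and its negation.
INCLUDE = r'^((TABLESPACE1$)|(TABLESPACE2$)|(TABLESPACE3$))'
EXCLUDE = r'^(?!(TABLESPACE1$)|(TABLESPACE2$)|(TABLESPACE3$))'

# Invented sample names; TABLESPACE10 shows why the `$` anchors matter.
NAMES = ["SYSTEM", "USERS", "TABLESPACE1", "TABLESPACE2",
         "TABLESPACE3", "TABLESPACE10"]


def select_tablespaces(pattern, names):
    """Return the names the pattern matches (hypothetical helper)."""
    regex = re.compile(pattern)
    return [name for name in names if regex.search(name)]


if __name__ == "__main__":
    print("include:", select_tablespaces(INCLUDE, NAMES))
    print("exclude:", select_tablespaces(EXCLUDE, NAMES))
```

The negative lookahead `(?!...)` in the second pattern rejects a name only when one of the anchored alternatives matches it completely, so TABLESPACE10 is still selected.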
Ashish Kumar Reply:
March 17th, 2011 at 11:24
@lausser, cool, the regexp part really worked!
Perhaps I need to upgrade my plugin to the latest version to get the short output part working.
I cannot thank you enough for the comprehensive documentation you write and for the active support you provide.
Cheers
-
Gerard Schurink Says:
March 24th, 2011 at 11:08
I have to install the check_oracle_health check on a CentOS 5.5 box, but I don’t know which additional Oracle tools I have to install to get this check working.
Can anybody point me in the right direction, please?
Greetings,
Gerard
-
Alex Says:
March 31st, 2011 at 9:10
Hello
Thanks for this plugin. I am using it in my project, but I have a big question that I cannot solve.
Can this plugin be used to execute an external Oracle SQL file and show the result of the query?
I need to execute some SQL files that check the database and return a result with the status of the Oracle database.
lausser Reply:
March 31st, 2011 at 21:53
Have a look in the contrib subdirectory of the plugin’s tar package. There you’ll find instructions on how to extend the plugin with your own SQL code.
-
Alex Says:
April 8th, 2011 at 8:23
Hi!
Thanks for the answer, but I am not able to find any documentation on extending the plugin in that folder (or anywhere else) that would let me execute an SQL file against the Oracle database.
Shouldn’t it be as easy as in check_mssql_health?
-
Peter Says:
April 15th, 2011 at 7:38
Hello Gerhard, I am busy with your check_oracle_health once again. We will shortly be rolling out Oracle Streams, and I have implemented a few SQL queries for that with check_oracle_health. Now I once again have the problem that it works in the shell but produces an error in the web environment.
Here are the details: Nagios 3.2.0, check_oracle_health 1.6.8.1
The select I want to run: select count(*) from dba_capture where status <> 'ENABLED';
Result in SQLPLUS:
COUNT(*)
0
I then converted the select with --encode.
Execution with check_oracle_health:
check_oracle_health --connect zztr01.strmtest --username nagios --password passwort --mode sql --name select%20count%28%2A%29%20from%20dba%5Fcapture%20where%20status%20%3C%3E%20%27ENABLED%27 --warning 1 --critical 1
OK – select count(*) from dba_capture where status <> 'enabled': 0 | 'select'=0;1;1
Error in the web interface: **ePN /usr/local/nagios/libexec/check_oracle_health: “Use of uninitialized value in numeric gt (>) at (eval 28) line 4016,”.
If I add “# nagios: -epn” to the script, I no longer get an error, but I don’t get any performance data back either.
What do you make of that?
Best regards, Peter
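The encoded statement in this comment looks like plain percent-encoding in which every character other than ASCII letters and digits is replaced by % and its hexadecimal byte value. The sketch below reproduces that form; it is inferred from the strings quoted in this thread, not from the plugin's source, and the function name encode_statement is invented. The standard library call urllib.parse.unquote is used only to show the way back.

```python
from urllib.parse import unquote


def encode_statement(sql):
    """Percent-encode every byte that is not an ASCII letter or digit.

    Assumption: this mirrors the encoded strings seen in the comments;
    the plugin's own --encode implementation may differ in detail.
    """
    encoded = []
    for byte in sql.encode("utf-8"):
        char = chr(byte)
        if char.isascii() and char.isalnum():
            encoded.append(char)
        else:
            encoded.append(f"%{byte:02X}")
    return "".join(encoded)


if __name__ == "__main__":
    statement = "select count(*) from dba_capture where status <> 'ENABLED'"
    encoded = encode_statement(statement)
    print(encoded)
    # Decoding restores the original statement.
    print(unquote(encoded))
```

Encoding the statement this way keeps spaces, quotes and angle brackets from being reinterpreted by the shell or by Nagios macro processing on the way to the plugin.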
-
Markus Says:
April 15th, 2011 at 13:05
Hello,
first of all, many thanks for the great plugin. I have not had any problems so far, but at the moment I am stuck on -mode sql. I always get the same error message, whether I encode the query or not.
./check_oracle_health -connect dba.test -user nagios -password nagios -mode sql name=select%20count%28%2A%29%20from%20TSDWMASTER%2EJOB%5FEMAIL -warning 1 -critical 2
Use of uninitialized value in numeric gt (>) at ./check_oracle_health line 4002.
Use of uninitialized value in numeric gt (>) at ./check_oracle_health line 4003.
OK – select count(*) from tsdwmaster.job_email:
In this query I have included the corresponding tablespace, but even if I leave it out and pass it separately with -tablespace, I get this error message. What am I doing wrong?
Many thanks in advance
Regards, Markus
-
OMD: State Retention der nagios-Plugins in /var/tmp !??? | KenntWas.de - Technische Tipps Says:
May 28th, 2011 at 12:34
[...] absolute path (/var/tmp/) for caching files or as statesdir. Examples of this are check_oracle_health and check_logfiles, which persist certain information between two invocations [...]
-
check_oracle_health unter Windows: zusätzliche Tipps | KenntWas.de - Technische Tipps Says:
August 16th, 2011 at 8:32
[...] PERL5LIB, pl2bat, perl2exe, SIGALRM under Windows32, workaround: --method=sqlplus. Conclusion: The Nagios plugin check_oracle_health was originally written for Linux. Since it is a Perl script, it should [...]
-
Niepoprawna (niepełna) obsługa zakresów przez check_oracle_health « guzik Says:
September 22nd, 2011 at 10:35
[...] a broader article about the check_oracle_health plugin from ConSol Labs, and during testing I noticed that the handling of ranges is incorrect, and in [...]
-
Monitoring sinusoidy czyli Nagios i dynamicznie zmieniające się progi « guzik Says:
September 24th, 2011 at 23:00
[...] deal with this problem if we monitor an Oracle database (data in the database) with the check_oracle_health plugin from ConSol Labs (or SQL Server with the check_mssql_health plugin; the other two – [...]
-
Happy new year ! – ConSol* Labs Says:
December 30th, 2011 at 13:49
[...] Nagios plugins got a major boost, with the rock-stars check_logfiles and check_oracle_health, which you probably can find on any larger Nagios installation [...]



lausser Reply:
October 9th, 2009 at 9:39
I can’t reproduce that.
$ check_oracle_health --user nagios --password $ORAPW --connect NAPRAX --mode sga-shared-pool-reloads
OK - SGA shared pool reload ratio 0.85% | sga_shared_pool_reload_ratio=0.85%;1;10
$ check_oracle_health -V
check_oracle_health (1.6.3)
Are you using the latest version? Gerhard