Description

check_hpasm is a Nagios plugin that checks the hardware health of Hewlett-Packard ProLiant servers. It requires the hpasm package to be installed on the server being checked. The plugin checks the health of

  • Processors
  • Power supplies
  • Memory modules
  • Fans
  • CPU and board temperatures
  • RAID arrays (IDE and SAS only when using SNMP)

and alerts you if one of these components is faulty or operates outside its normal parameters.

Documentation

The plugin can operate in two modes:

  • Local. The plugin runs on the server that is to be checked. The hpasmcli command (from the hpasm package) must be installed.
  • Remote. The plugin runs on the Nagios server and determines the status of the remote hardware by querying the remote server via SNMP. The hpasm package must be installed on the remote server.
nagios$ check_hpasm
OK - hardware working fine
nagios$ check_hpasm -H 10.0.73.30 -C public
OK - hardware working fine
nagios$ check_hpasm -H 10.0.73.30 -C public -P 1
OK - hardware working fine
nagios$ check_hpasm -H 10.0.73.30 -C public --snmpwalk /usr/bin/snmpwalk
OK - hardware working fine

Comparison of the two modes: local and remote.

nagios/check_hpasm/check_hpasm_modes.jpg

Verbosity

For debugging, call the plugin with the --verbose (or -v) option. It then prints the detailed status of each checked component:

nagios$ check_hpasm -v
CRITICAL - dimm module 0:5 (module 5 @ cartridge 0) needs attention (degraded), System: 'proliant dl360 g5', S/N: '3UH841N09K', ROM: 'P58 08/03/2008'
checking cpus
cpu 0 is ok
cpu 1 is ok
checking power supplies
powersupply 1 is ok
powersupply 2 is ok
checking fans
fan 1 is present, speed is normal, pctmax is 50%, location is powerSupply, redundance is redundant, partner is 2
fan 2 is present, speed is normal, pctmax is 50%, location is cpu, redundance is redundant, partner is 3
fan 3 is present, speed is normal, pctmax is 50%, location is cpu, redundance is redundant, partner is 1
checking temperatures
1 ioBoard temperature is 42C (65 max)
2 ambient temperature is 18C (40 max)
3 cpu temperature is 30C (95 max)
4 cpu temperature is 30C (95 max)
5 powerSupply temperature is 29C (60 max)
checking memory
dimm module 0:1 (module 1 @ cartridge 0) is ok
dimm module 0:2 (module 2 @ cartridge 0) is ok
dimm module 0:3 (module 3 @ cartridge 0) is ok
dimm module 0:4 (module 4 @ cartridge 0) is ok
dimm module 0:5 (module 5 @ cartridge 0) needs attention (degraded)
dimm module 0:6 (module 6 @ cartridge 0) is ok
dimm module 0:7 (module 7 @ cartridge 0) is ok
dimm module 0:8 (module 8 @ cartridge 0) is ok
checking disk subsystem
da controller 0 in slot 0 is ok
controller accelerator is ok
controller accelerator battery is ok
logical drive 0:1 is ok (distribDataGuard)
physical drive 0:0 is ok
physical drive 0:1 is ok
physical drive 0:2 is ok
physical drive 0:3 is ok
physical drive 0:4 is ok
physical drive 0:5 is ok | fan_1=50% fan_2=50% fan_3=50% temp_1_ioBoard=42;65;65 temp_2_ambient=18;40;40 temp_3_cpu=30;95;95 temp_4_cpu=30;95;95 temp_5_powerSupply=29;60;60

--verbose (or -v) can be repeated several times or given a numerical argument. The maximum level is -vvv. At this level you see a complete dump of all detected hardware components with all their details.

nagios$ check_hpasm -vvv
...
[CPU_0]
cpqSeCpuSlot: 0
cpqSeCpuUnitIndex: 0
cpqSeCpuName: Intel Xeon
cpqSeCpuStatus: ok
info: cpu 0 is ok

[PS_1]
cpqHeFltTolPowerSupplyBay: 1
cpqHeFltTolPowerSupplyChassis: 0
cpqHeFltTolPowerSupplyPresent: present
cpqHeFltTolPowerSupplyCondition: ok
cpqHeFltTolPowerSupplyRedundant: redundant
info: powersupply 1 is ok
...
[FAN_1]
cpqHeFltTolFanChassis: 1
cpqHeFltTolFanIndex: 1
cpqHeFltTolFanLocale: powerSupply
cpqHeFltTolFanPresent: present
cpqHeFltTolFanType: spinDetect
cpqHeFltTolFanSpeed: normal
cpqHeFltTolFanRedundant: redundant
cpqHeFltTolFanRedundantPartner: 2
cpqHeFltTolFanCondition: ok
cpqHeFltTolFanHotPlug: nonHotPluggable
info: fan 1 is present, speed is normal, pctmax is 50%, location is powerSupply, redundance is redundant, partner is 2
...
[PHYSICAL_DRIVE]
cpqDaPhyDrvCntlrIndex: 0
cpqDaPhyDrvIndex: 4
cpqDaPhyDrvBay: 5
cpqDaPhyDrvBusNumber: 1
cpqDaPhyDrvSize: 1864
cpqDaPhyDrvStatus: ok
cpqDaPhyDrvCondition: ok
...

Blacklisting

If you want the checks of failed or missing components to be skipped, so that the alerts they cause are suppressed, use the --blacklist option. With this option you give the plugin a list of items separated by /, in the following format:

<type>:<nr>[,<nr>…][/<type>:<nr>[,<nr>…]]…

where <type> can take one of the following values:

component                               <type>
cpu                                     c
powersupply                             p
fan                                     f
overall fan status                      ofs
temperature                             t
dimm                                    d
da controller                           daco
da controller accelerator               daac
da controller accelerator battery       daacb
da logical drive                        dald
da physical drive                       dapd
scsi controller                         scco
scsi logical drive                      scld
scsi physical drive                     scpd
fcal controller                         fcaco
fcal accelerator                        fcaac
fcal host controller                    fcahc
fcal host controller overall condition  fcahco
fcal logical drive                      fcald
fcal physical drive                     fcapd
fuse                                    fu
enclosure manager                       em
iml-event                               evt

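The <nr> of a component can be found in the output of check_hpasm -v. In the listing below, the matching blacklist entry is appended to each component line of that output: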

checking cpus  
cpu 0 is ok c:0
cpu 1 is ok c:1
checking power supplies  
powersupply 1 is ok p:1
powersupply 2 is ok p:2
checking fans  
fan 1 is present, speed is normal, …. f:1
fan 2 is present, speed is normal, …. f:2
fan 3 is present, speed is normal, …. f:3
overall fan status: fan=ok, cpu=ok  
checking temperatures  
1 ioBoard temperature is 42C (65 max) t:1
2 ambient temperature is 18C (40 max) t:2
3 cpu temperature is 30C (95 max) t:3
4 cpu temperature is 30C (95 max) t:4
5 powerSupply temperature is 29C (60 max) t:5
checking memory  
dimm module 0:1 (module 1 @ cartridge 0) is ok d:0:1
dimm module 0:2 (module 2 @ cartridge 0) is ok d:0:2
dimm module 0:3 (module 3 @ cartridge 0) is ok d:0:3
dimm module 0:4 (module 4 @ cartridge 0) is ok d:0:4
dimm module 0:5 (module 5 @ cartridge 0) needs attention (degraded) d:0:5
dimm module 0:6 (module 6 @ cartridge 0) is ok d:0:6
dimm module 0:7 (module 7 @ cartridge 0) is ok d:0:7
dimm module 0:8 (module 8 @ cartridge 0) is ok d:0:8
checking disk subsystem  
da controller 3 in slot 0 is ok daco:3
controller accelerator is ok daac:3
controller accelerator battery is ok daacb:3
logical drive 3:1 is ok (mirroring) dald:3:1
logical drive 3:2 is ok (mirroring) dald:3:2
physical drive 3:0 is ok dapd:3:0
physical drive 3:1 is ok dapd:3:1
physical drive 3:2 is ok dapd:3:2
physical drive 3:3 is ok dapd:3:3
ide controller 0 in slot -1 is ok and unused ideco:0
fcal controller 1:0 in box 1/slot 0 needs attention (degraded) fcaco:1:0
fcal accelerator in box 1/slot 0 is temp disabled fcac:1:0
logical drive 1:1 is failed (advancedDataGuard) fcald:1:1
physical drive 1:128 is failed fcapd:1:128
physical drive 1:129 is ok fcapd:1:129
physical drive 1:130 is failed fcapd:1:130
physical drive 1:131 is ok fcapd:1:131
physical drive 1:132 is failed fcapd:1:132
physical drive 1:133 is ok fcapd:1:133
physical drive 1:134 is ok fcapd:1:134
physical drive 1:135 is ok fcapd:1:135
physical drive 1:144 is ok fcapd:1:144
physical drive 1:145 is ok fcapd:1:145
physical drive 1:147 is unconfigured fcapd:1:147
fcal host controller 0 in slot 1 is ok fcahc:0
fcal host controller 1 in slot 1 is ok fcahc:1

Assuming you want to blacklist the failed memory module and the three failed hard disks (including the logical drive they belong to), you would write

d:0:5/fcapd:1:128,1:130,1:132/fcald:1:1

As an alternative, you can write this string into the first line of a file and pass the filename as the argument to --blacklist.
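The blacklist format can be assembled mechanically from a list of components. The following is a minimal sketch, not part of check_hpasm: the helper name build_blacklist and its input format (one "<type> <nr>" pair per line on standard input) are assumptions made for illustration. Entries of the same type are joined with commas and the groups are joined with /.

```shell
# build_blacklist (hypothetical helper): reads "<type> <nr>" pairs from
# stdin and prints one string in the <type>:<nr>[,<nr>...]/... format.
build_blacklist() {
  awk '
    NF < 2 { next }                      # skip blank or incomplete lines
    !($1 in seen) {                      # first entry of this type
      order[++n] = $1; seen[$1] = 1; list[$1] = $2; next
    }
    { list[$1] = list[$1] "," $2 }       # further entries of a known type
    END {
      for (i = 1; i <= n; i++)
        out = out (i > 1 ? "/" : "") order[i] ":" list[order[i]]
      print out
    }'
}

# Prints d:0:5/fcapd:1:128,1:130,1:132/fcald:1:1
printf '%s\n' 'd 0:5' 'fcapd 1:128' 'fcapd 1:130' 'fcapd 1:132' 'fcald 1:1' |
  build_blacklist
```

The result can be passed directly to --blacklist or written to the first line of a file whose name is then given to --blacklist.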

Custom temperature thresholds

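To override the system-default temperature thresholds, use the --customthresholds option. As the example below shows, its argument is a /-separated list of <nr>:<max> pairs, where <nr> is the number of the temperature sensor.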

nagios$ check_hpasm
...
1 cpu temperature is 45C (62 max)
2 cpu temperature is 56C (80 max)
3 ioBoard temperature is 38C (60 max)
4 cpu temperature is 59C (80 max)
5 powerSupply temperature is 31C (53 max)
...

nagios$ check_hpasm --customthresholds 1:70/5:65
...
1 cpu temperature is 45C (70 max)
2 cpu temperature is 56C (80 max)
3 ioBoard temperature is 38C (60 max)
4 cpu temperature is 59C (80 max)
5 powerSupply temperature is 31C (65 max)
...
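When deciding which sensors might need a custom threshold, it can help to see how close each reading is to its current maximum. The following is a rough sketch under stated assumptions: the helper name temp_headroom is hypothetical, and it only understands temperature lines of the form shown above ("<nr> <location> temperature is <value>C (<max> max)"). It is not part of the plugin.

```shell
# temp_headroom (hypothetical helper): reads check_hpasm output from stdin
# and prints "<nr> <location> <max minus current>" for each temperature line.
temp_headroom() {
  awk '$3 == "temperature" && $4 == "is" && $7 == "max)" {
         cur = $5; max = $6
         sub(/C$/, "", cur)      # "45C" -> "45"
         sub(/^\(/, "", max)     # "(62" -> "62"
         print $1, $2, max - cur
       }'
}

# Possible usage (not executed here):
#   check_hpasm -v | temp_headroom
```

A sensor number reported with little headroom is the <nr> that would go into a --customthresholds pair, should you decide that the default maximum is too strict for your environment.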

Performance data

With the option --perfdata you can switch on the output of performance data if it was not already enabled by default during installation. If the perfdata string becomes too long, use --perfdata=short, which outputs a short form of the temperature labels (the location part is omitted).

nagios$ check_hpasm
OK - hardware working fine| fan_1=8%;0;0 fan_2=8%;0;0  fan_3=15%;0;0 fan_4=15%;0;0 fan_5=8%;0;0 fan_6=8%;0;0 fan_7=20%;0;0 fan_8=20%;0;0 'temp_1_processor_zone'=38;62;62 'temp_2_cpu#1'=37;73;73 'temp_3_i/o_zone'=49;68;68 'temp_4_cpu#2'=40;73;73 'temp_5_power_supply_bay'=36;44;44

nagios$ check_hpasm --perfdata short
OK - hardware working fine| fan_1=8%;0;0 fan_2=8%;0;0  fan_3=15%;0;0 fan_4=15%;0;0 fan_5=8%;0;0 fan_6=8%;0;0 fan_7=20%;0;0 fan_8=20%;0;0 'temp_1'=38;62;62 'temp_2'=37;73;73 'temp_3'=49;68;68 'temp_4'=40;73;73 'temp_5'=36;44;44
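The performance data follows the | character as space-separated label=value;warn;crit items. As an illustration of that layout, the sketch below extracts the temperature items from a line of plugin output. The helper name perfdata_temps is an assumption, and the sketch assumes that labels contain no spaces, which holds for the samples above.

```shell
# perfdata_temps (hypothetical helper): reads plugin output from stdin and
# prints "<label> <value> <warn> <crit>" for every temp_* perfdata item.
perfdata_temps() {
  sed -n 's/^[^|]*|//p' |        # keep only the text after the first "|"
    tr -s ' ' '\n' |             # one perfdata item per line
    tr -d "'" |                  # drop the quotes around labels
    awk -F'[=;]' '$1 ~ /^temp_/ { print $1, $2, $3, $4 }'
}

# Possible usage (not executed here):
#   check_hpasm --perfdata short | perfdata_temps
```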

Unknown memory status

With some BIOS releases hpasmcli doesn’t display the memory modules correctly. The command SHOW DIMM shows only a list of modules with status n/a, which is counted as a warning. With the --ignore-dimms option you can skip memory checking, and so avoid this warning, without using a blacklist.

Non-redundant fans

If you see a warning because the fans are not redundant, the server may have been built with single fans instead of fan pairs on purpose. With --ignore-fan-redundancy you can suppress this warning (see the README).

Unfortunately it is not possible to show fan speed (or percent of maximum speed) in SNMP mode. A placeholder value of 50% is shown instead.

Installation

  • After unpacking the archive, run the ./configure command. Pay attention to the --with-noinst-level option, which defines the exit code of the plugin if no hpasm rpm is installed. With the option --with-degrees you tell the plugin whether temperature values are displayed in Celsius or Fahrenheit. With the option --enable-perfdata you tell check_hpasm to add performance data to its output by default. If you don’t want to see the type, serial number and BIOS release in the output, you can switch this off with --disable-hwinfo. With --enable-hpacucli you activate checking of RAID controllers.
  • Grab the hpasm package suitable for your Linux distribution and install it. See the list of links below where to find it.
  • If you run check_hpasm (in local mode) as a non-root user, you will need sudo privileges that allow you to call /sbin/hpasmcli as root without providing a password.
  • Note: if you want to run check_hpasm under Debian with SNMP v3, you must install some additional packages: aptitude install libtie-encryptedhash-perl libdigest-hmac-perl (Thanks Tony Wolf)
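
The sudo requirement for local mode could be met with a sudoers fragment along the following lines. This is a sketch and not taken from the plugin's own documentation: the file name and the user name nagios are assumptions (use whichever account runs the plugin), and the requiretty line only matters on systems where sudo is configured to require a tty, a situation the changelog mentions as a source of errors.

```
# /etc/sudoers.d/check_hpasm (hypothetical file name; edit with visudo -f)
# Assumption: the plugin runs as user "nagios".
nagios ALL=(root) NOPASSWD: /sbin/hpasmcli
Defaults:nagios !requiretty
```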

Examples

More examples for different error conditions:

memory module failed:

nagios$ check_hpasm
CRITICAL - dimm module 2 @ cartridge 2 needs attention (dimm is degraded)

nagios$ check_hpasm -v
checking hpasmd process
System        :proliant dl580 g3
Serial No.    :GB8632FB7V
ROM version   :P38 04/28/2006
checking cpus
 cpu 0 is ok
 cpu 1 is ok
 cpu 2 is ok
 cpu 3 is ok
checking power supplies
 powersupply 1 is ok
 powersupply 2 is ok
checking fans
checking temperatures
 1 cpu#1 temparature is 36 (80 max)
 2 cpu#2 temparature is 34 (80 max)
 3 cpu#3 temparature is 33 (80 max)
 4 cpu#4 temparature is 37 (80 max)
 5 i/o_zone temparature is 32 (60 max)
 6 ambient temparature is 23 (40 max)
 7 system_bd temparature is 34 (60 max)
checking memory modules
 dimm 1@1 is ok
 dimm 2@1 is ok
 dimm 3@1 is ok
 dimm 4@1 is ok
 dimm 1@2 is ok
 dimm 2@2 is dimm is degraded
 dimm 3@2 is ok
 dimm 4@2 is ok
CRITICAL - dimm module 2 @ cartridge 2 needs attention (dimm is degraded)

power supply module failed:

nagios$ ./check_hpasm
CRITICAL - powersuply #2 needs attention (failed), powersuply #1 is not redundant
nagios$ ./check_hpasm -v
checking hpasmd process
System        :proliant dl580 g4
Serial No.    :GB8637M8TH
ROM version   :P59 09/08/2006
checking cpus
 cpu 0 is ok
 cpu 1 is ok
 cpu 2 is ok
 cpu 3 is ok
checking power supplies
 powersupply 1 is ok
 powersupply 2 is failed
checking fans
checking temperatures
 1 cpu#1 temparature is 42 (85 max)
 2 cpu#2 temparature is 46 (85 max)
 3 cpu#3 temparature is 44 (85 max)
 4 cpu#4 temparature is 44 (85 max)
 5 i/o_zone temparature is 39 (60 max)
 6 ambient temparature is 27 (40 max)
 7 system_bd temparature is 41 (60 max)
checking memory modules
 dimm 1@1 is ok
 dimm 2@1 is ok
 dimm 3@1 is ok
 dimm 4@1 is ok
 dimm 1@2 is ok
 dimm 2@2 is ok
 dimm 3@2 is ok
 dimm 4@2 is ok
 dimm 1@3 is ok
 dimm 2@3 is ok
 dimm 3@3 is ok
 dimm 4@3 is ok
 dimm 1@4 is ok
 dimm 2@4 is ok
CRITICAL - powersuply #2 needs attention (failed),  powersuply #1 is not redundant

power supply module pulled:

nagios$ ./check_hpasm
CRITICAL - powersuply #2 is missing, powersuply #1 is not redundant
nagios$ ./check_hpasm -v
checking hpasmd process
System        :proliant dl580 g4
Serial No.    :GB8637M8TH
ROM version   :P59 09/08/2006
checking cpus
 cpu 0 is ok
 cpu 1 is ok
 cpu 2 is ok
 cpu 3 is ok
checking power supplies
 powersupply 1 is ok
 powersupply 2 is n/a
checking fans
checking temperatures
 1 cpu#1 temparature is 42 (85 max)
 2 cpu#2 temparature is 46 (85 max)
 3 cpu#3 temparature is 44 (85 max)
 4 cpu#4 temparature is 44 (85 max)
 5 i/o_zone temparature is 39 (60 max)
 6 ambient temparature is 27 (40 max)
 7 system_bd temparature is 41 (60 max)
checking memory modules
 dimm 1@1 is ok
 dimm 2@1 is ok
 dimm 3@1 is ok
 dimm 4@1 is ok
 dimm 1@2 is ok
 dimm 2@2 is ok
 dimm 3@2 is ok
 dimm 4@2 is ok
 dimm 1@3 is ok
 dimm 2@3 is ok
 dimm 3@3 is ok
 dimm 4@3 is ok
 dimm 1@4 is ok
 dimm 2@4 is ok
CRITICAL - powersuply #2 is missing, powersuply #1 is not redundant

Hpasm daemon is not running:

nagios$ check_hpasm
CRITICAL - hpasmd needs to be started

Hpasm software is not installed:

nagios$ check_hpasm
OK - hardware working fine, at least i hope so because hpasm is not installed
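
The status word at the start of each result line above corresponds to the plugin's exit code. Assuming check_hpasm follows the usual Nagios plugin convention (0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN), a wrapper script could translate the exit code as sketched below. The helper name state_name is hypothetical.

```shell
# state_name (hypothetical helper): maps a Nagios plugin exit code to the
# conventional state name.
state_name() {
  case "$1" in
    0) echo OK ;;
    1) echo WARNING ;;
    2) echo CRITICAL ;;
    3) echo UNKNOWN ;;
    *) echo "UNKNOWN (unexpected exit code $1)" ;;
  esac
}

# Possible usage (not executed here):
#   check_hpasm -H 10.0.73.30 -C public; state_name $?
```

For the case where hpasm is not installed, the exit code is whatever was chosen with --with-noinst-level at configure time.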

Call to participate

Please run check_hpasm -v on as many different platforms as possible. Chances are you have a rare ProLiant model whose components are not completely detected. In that case you will see instructions on how to report this to the author.

The following line appears frequently but can be considered harmless:

#0 SYSTEM_BD - -

I am always interested in test data. If you want to do me a favour, send me the output of

snmpwalk ... <ip-address> 1.3.6.1.4.1.232

or, if you are using the local variant, the output of the following script:

# Locate the HP command-line tools.
hpasmcli=$(which hpasmcli)
hpacucli=$(which hpacucli)
# Dump every hpasmcli section, prefixing each line with the section name.
for i in server powersupply fans temp dimm iml
do
  "$hpasmcli" -s "show $i" | while read -r line
  do
    printf '%s %s\n' "$i" "$line"
  done
done
# Dump the RAID controller data if hpacucli is available.
if [ -x "$hpacucli" ]; then
  for i in config status
  do
    "$hpacucli" ctrl all show "$i" | while read -r line
    do
      printf '%s %s\n' "$i" "$line"
    done
  done
fi

Download

check_hpasm-4.9.tar.gz

Changelog

  • 4.9 2024-01-03
    Merge pull request #29 from fragfutter/storage_gen11
    Merge pull request #21 from matsimon/additional-PSU-msg
  • 4.8.0.2 2021-01-31
    Merge pull request #25 from peternewman/patch-1. Add more fan locations
  • 4.8.0.1 2020-09-18
    fix cpqSeSysRomVer pattern for Gen10
  • 4.8 2018-09-24
    check proliant cpqHeSysBatteryTable
  • 4.7.5.5 2018-04-19
    add HPE Synergy
  • 4.7.5.4 2017-02-01
    allow snmpv3 in a flat community-string
  • 4.7.5.3 2016-12-05
    reorder cpqHoMibStatusArray
  • 4.7.5.2 2016-12-02
    Detect more MSA devices (P2000)
  • 4.7.5.1 2016-11-18
    Merge pull request #15 Use HP::StorageWorks for HP MSA systems
  • 4.7.5 2016-09-19
    better error message for hpasmcli on dl160 (Thanks Matthias Bethke)
    better error detection for da accelerators
    Merge pull request #6 from fredricj/master (Add support for proliant DA subsystem Disk enclosures)
  • 4.7.4 2016-06-16
    allow tcp connects. state other for fans is like ok. (Thanks fredericve)
  • 4.7.3.1 2016-05-01
    add blacklisting for ide. (Thanks Tommi)
  • 4.7.3 2016-02-15
    add hp superdome 2
  • 4.7.2 2016-02-01
    search for hpssacli if hpacucli was not found
  • 4.7.1.1 2015-06-08
    bugfix for gen9 with broken SysRomVer string
  • 4.7.1 2015-03-23
    interpret other status for fcal as ok
  • 4.7.0.2 2014-03-18
    add another storageworks detection
    add StoreEasy detection (thanks Alexander Laimer)
  • 4.7.0.1 2014-03-04
    bugfix in blacklisting (Thanks Ingvar Hagelund)
  • 4.7 2014-02-21
    add StorageWorks
  • 4.6.3.4 2013-05-15
    fix a bug in fan perfdata (absent fans were shown with 0%)
  • 4.6.3.3 2013-04-10
    fix a bug in snmp overall nic condition
    sort events by id numerically
  • 4.6.3.2 2013-03-19
    fix a bug in proliant/gen8/ilo temperature thresholds (Thanks Kai Benninghoff and Stephane Loeuillet)
  • 4.6.3.1 2013-01-10
    fix a bug in da disk in local mode
    fix a bug in overall_init proliant nics (Thanks Fanming Jen)
  • 4.6.3 2012-11-25
    gen8 should work now
    fix the problem with -99 degrees
    fix the problem with binary zero EventUpdateTime
  • 4.6.2.1 2012-11-09
    some bugfixes in bladecenter temperatures (Thanks Thomas Reichel)
  • 4.6.2 2012-08-20
    fix some bugs in snmpget where the system responded with undef values
  • 4.6.1 2012-08-14
    fix a small bug in boottime
    skip pagination in long “show iml” lists
    make bulk requests if possible
  • 4.6 2012-06-07
    output power consumption as performance data (only newer proliant models)
    support older <=7 versions of hpacucli
    add another error log: Uncorrectable Memory Error
    raise the default timeout from 15 to 60 seconds
  • 4.5.3.1 2012-04-19
    change the way --snmpwalk reads oids from a file
  • 4.5.3 2012-03-26
    fix a bug in snmp-eventlogs
  • 4.5.2 2012-03-06
    add another error log: Main Memory - Corrected Memory Error threshold exceeded
  • 4.5.1 2012-02
    add another error log: 210 - Quick Path Interconnect (QPI) Link Degradation
    remove watt percent for blade center power supply
    make the snmp oid collection phase shorter for blade center
  • 4.5 2012-01-26
    output power consumption perfdata for BladeCenters
    correctly identify dl388g7 (Thanks lilei8)
  • 4.4 2011-12-16
    add checks for power converters
    add checks for nic teaming (experimental!!, must be enabled with --eval-nics)
    fix a bug with invalid date/time from iml
    fix a bug in blade enclosure manager verbose output
    add msa2xxx storage sensors
  • 4.3 2011-10-14
    add monitoring of IML events (Thanks Klaus)
    esp. Memory initialization error… The OS may not have access to all of the memory installed in the system
  • 4.2.5
    G2 series of X1660 storage systems are now correctly detected. (Thanks Andre Zaborowski)
    blacklisting for SAS controller & disks was added (Thanks Jewi)
  • 4.2.4.1 2011-08-09
    dimm output of G7 hpasmcli (under Solaris) is now handled (Thanks Ron Waffle)
  • 4.2.4 2011-07-21
    add a check for asr (Thanks Ingmar Verheij http://www.ingmarverheij.com/)
  • 4.2.3 2011-07-21
    add a global temperature check when no temperature sensors are found
    check power converters if no fault tolerant power supplies are found
  • 4.2.2.1 2011-04-17
    fix a bug when a wrong --hostname was used (Thanks Wim Savenberg)
  • 4.2.2 2011-01-21
    add support for msa500 and hpasmcli (Thanks Kalle Andersson)
  • 4.2.1.1
    added support for x1** nas storage, which was detected as storage but in fact is like a proliant (Thanks Maik Schulz)
  • 4.2.1
    added timeout handling
    better hpacucli da controller handling
    fix a bug in memory detection (0 dimms were shown) (Thanks Anthony Cano)
    better handling for failed and disabled controller batteries. warning only.
  • 4.2 2010-03-20
    added temperatures for bladesystems (although not implemented by HP)
    added fuses for bladesystems
    added enclosure managers for bladesystems
    added blacklisting for scsi devices (scco,scld,scpd) (Thanks Marco Hill)
    added blacklisting for overall fan status (ofs) (Thanks Thomas Jampen)
  • 4.1.2.1 2010-03-03
    fixed a harmless bug in BladeCenter::Powersupply output
  • 4.1.2 2010-02-09
    fixed a severe bug in detecting multiple logical drives with hpacucli (Thanks Trond Hasle)
  • 4.1.1 2010-01-07
    detect more smart array types when run in local mode (Thanks Trond Hasle)
  • 4.1 2009-12-07
    added more details for bladecenters (power suppl., server blades)
    fixed a bug in powersupply checks with hpasmcli (Thanks Guillaume)
  • 4.0.1 2009-12-02
    added the missing output for --help
    non-redundant fans are now tolerated if the global fan status says “ok”
    added detection for servers with a hidden model description
    fixed a bug in celsius-fahrenheit-conversion
  • 4.0 2009-11-30
    added support for the new g6-models
    complete rewrite of the code
    autodetection for proliant, bladecenter and storage
    detailed dump of the hardware with -vvv
    new format for blacklist
  • 3.5.1 2009-04-22
    fixed a bug where the server didn’t reveal serial no. and rom rev. (thanks Daniel Rich)
    fixed a bug in the snmpv3 code.
  • 3.5 2009-03-20
    added support for SNMPv3
    added new parameter --port
  • 3.2.1 2009-02-26
    fixed a bug which showed degraded dimms as missing. (thanks matt at adicio.com)
  • 3.2 2009-02-20
    added support for external disk arrays. (M. M. has a MSA20)
  • 3.1.1.1 2009-02-13
    added an error message when sudo was configured with requiretty=yes. (thanks Jeff The Riffer)
  • 3.1.1 2009-02-06
    fixed a bug which caused ugly perl warnings. (thanks Martin Hofmann and Bill Katz)
  • 3.1 2009-01-21
    added support for sas and ide controllers/disks (only with snmp)
  • 3.0.7.2 2009-01-16
    minor bugfix for dl320g5+hpasmcli+fan+n/a. (thanks Bruce Jackson)
  • 3.0.7.1 2008-12-05
    minor bugfix. snmpwalk now uses -On
  • 3.0.7 2008-11-29
    bugfix in controller blacklists (thanks Maurice Moric)
    no need for Net::SNMP with --snmpwalk /usr/bin/snmpwalk
  • 3.0.6 2008-10-30
    bugfix in ignore-dimms (thanks tumtliw)
  • 3.0.5 2008-10-23
    higher speed through decreased amount of transferred oids (thanks Yannick Gravel)
    new switch --ignore-fan-redundancy for old boxes without double fans
  • 3.0.4 2008-09-18
    rewrote snmp memory checking for better handling of missing health info
    new configure option --enable-extendedinfo (outputs lots of crap)
  • 3.0.3.2 2008-09-11
    --protocol is now optional (this was a bug)
  • 3.0.3.1 2008-09-10
    Only accept 1, 2 or 2c as SNMP protocol
    Try both bulk walk and get-next
  • 3.0.3 2008-08-11
    cpqSiMem instead of cpqHeResMem
    new parameter --protocol (default: 2c)
    cpqHeComponents are fetched with get-next instead of get-bulk (Net::SNMP grr)
  • 3.0.2 2008-08-01
    skip memory checking if snmp returns garbage
    bugfix in numbering of snmp table indexes
  • 3.0.1 2008-07-31
    bugfix in customthresholds&snmp (thanks TheCry)
    broke up the snmpwalk into smaller pieces.
  • 3.0 2008-07-20
    first release with snmp support for remote checks (thanks Matthias Flacke)
    simulation is possible with --snmpwalk or --hpasmcli
  • 2.0.3.3 - 2008-05-22 Brangerdog
    support fan partner# 0 with proliant support pack 8.0 (thanks Mark Wagner)
  • 2.0.3.2 - 2008-05-03
    fixed a typo in README
  • 2.0.3.1 - 2008-04-16
    fixed a bug in path to perl binary
    fixed a bug in --enable-perfdata (thanks Birk Bohne)
  • 2.0.3 - 2008-04-09
    fixed a bug in dimm code
    added blacklisting for raid controllers (thanks Andreas Schrogl)
    added blacklisting for cache&battery (thanks Harrold Nabben)
  • 2.0.2 - 2008-02-11
    empty cpu&fan sockets are now properly handled
  • 2.0.1 - 2008-02-08
    multiline output for nagios 3.x
  • 2.0 - 2008-02-08
    complete code redesign
    integrated raid checking with hpacucli
    (thanks Kelly Kristiaan van Vliet who was the first to propose this feature)
    (thanks Mess for calling me “FAULE SAU!!!”)
  • 1.6.2.2 - 2008-01-18
    added debian 3.1 to the osses where multiple hpasmd are considered normal.
  • 1.6.2.1 - 2007-12-12
    fixed a bug which caused overlooked fans. Thanks Michael Krebs.
    such unknown patterns which might be important will be reported now.
  • 1.6.2 - 2007-11-16
    Marcus Fleige contributed the -i and a more meaningful ok output
  • 1.6.1 - 2007-11-07
    fixed a bug which caused overlooked failed fans
  • 1.6 - 2007-07-27
    added performance data for fan speed and temperatures
  • 1.5.1 - 2007-07-11
    hpasmcli can also be a link
    fixed a bug, so more fan locations can be found
  • 1.5 - 2007-06-14
    added support for userdefined temperature thresholds (Kelly Kristiaan van Vliet)
  • 1.4 - 2007-05-22
    added support for hpasmxld and hpasmlited
  • 1.3 - 2007-04-17
    added --with-degree to configure (celsius or fahrenheit output)
    added -b/--blacklist
    added trustix 2.2 to the osses where multiple hpasmd are considered normal.
  • 1.2 - 2007-04-16
    added --with-noinst-level
  • 1.1 - 2007-04-14
    First public release

Gerhard Lausser

Check_hpasm is released under the GNU General Public License.

Author

Gerhard Lausser (gerhard.lausser@consol.de) will gladly answer your questions.