AIX / Linux agent

The OS agent is a solution for those of you who wish to get additional metrics that can be obtained only at the operating system level.
[Screenshots: OS agent CPU, CPU Queue, Memory, LAN, SAN, SAN IOPS, SAN response time (latency)]

OS agent metrics and features

  • OS CPU utilization of user/sys/IO wait/idle in %
  • CPU queue: load average, blocked processes / raw / direct IO
  • Memory utilization of used/FS cache/free memory in MB
  • Paging rate in MB/sec
  • Paging space utilization in %
  • SAN (FC & vSCSI) throughput per adapter
    • data in MB/sec
    • IO/sec
    • response time (latency)
    • error
  • LAN (ethernet) throughput per adapter
    • data in MB/sec
    • packet count
    • error
  • Total IO throughput (Linux)
    • IOPS
    • Data in MB/sec
    • response time (latency)
  • Filesystem capacity utilization
  • AIX SEA (Shared Ethernet Adapter) throughput per adapter in MB/sec (IBM Power only)
  • SAN multipath monitoring
  • JOB TOP: graphical CPU and memory tracking of running processes over time

Operating systems

  • AIX 5.1+
  • Linux on Power
  • Linux x86

Implementation

It is implemented as a simple client/server application.
A XorMon NG daemon listens on port 8162 on the host where the XorMon NG server is running.
Each LPAR has a simple Perl-based agent installed, which is started every minute from the crontab and saves memory and paging statistics into a temporary file.
The agent contacts the server every 5-20 minutes and sends all locally stored data for that period.
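
The collect-then-batch pattern described above can be sketched in shell. This is only an illustration of the idea, not the agent's actual code (the real agent is a Perl script): the function names, the spool file path, and the record format are hypothetical, and the transport to the server is left as a placeholder command.

```shell
#!/bin/sh
# Illustrative sketch only. The spool path, function names, and record
# format are assumptions made for this example.

SPOOL="${SPOOL:-/tmp/os-agent-sketch.spool}"

# Append one timestamped sample to the local spool file (run once per minute).
store_sample() {
    printf '%s %s\n' "$(date +%s)" "$1" >> "$SPOOL"
}

# Return 0 when at least $3 seconds have passed between $1 (last send) and $2 (now).
send_due() {
    [ $(( $2 - $1 )) -ge "$3" ]
}

# Hand all buffered samples to a sender command, then empty the spool.
# "$@" is a placeholder for whatever transports the data to the server.
flush_spool() {
    [ -s "$SPOOL" ] || return 0
    "$@" < "$SPOOL" && : > "$SPOOL"
}
```

The spool is emptied only if the sender command succeeds, so samples survive a failed transfer and go out with the next batch.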

Agent prerequisites

  • Perl interpreter. All Unix/Linux systems include Perl in the basic installation.
  • The agent can run under any user account; it does not need any special privileges in the OS.
  • Open TCP communication from each LPAR to the XorMon NG server on port 8162.
  • Connections are initiated from the monitored AIX / Linux hosts only.
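
Before installing the agent, it can help to confirm that the server port is reachable from the monitored host. The following is a minimal sketch that uses only Perl's core IO::Socket::INET module, since Perl is already a prerequisite. The helper name check_tcp_port and the host name in the example call are hypothetical.

```shell
#!/bin/sh
# Sketch: test whether a TCP port accepts connections.
# check_tcp_port is a hypothetical helper name, not part of the agent.
check_tcp_port() {
    host="$1"
    port="${2:-8162}"   # 8162 is the XorMon NG port named in this document
    perl -MIO::Socket::INET -e '
        my $s = IO::Socket::INET->new(
            PeerAddr => $ARGV[0], PeerPort => $ARGV[1],
            Proto => "tcp", Timeout => 5,
        );
        exit($s ? 0 : 1);
    ' "$host" "$port"
}

# Example call (xormon.example.com is a placeholder host name):
# check_tcp_port xormon.example.com 8162 && echo "port reachable"
```

A non-zero exit status means the connection could not be established, which usually points at a firewall rule or a daemon that is not running.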

Usage

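perl lpar2rrd-agent.pl [-s <step in seconds>] [-t <max send time in seconds>] [-d] [-c] [-m] [-n <NMON_DIR>] [-b <path to Hitachi HvmSh API>] [-i <IP address of HVM>] <XorMon NG server hostname/IP>[:<PORT>]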

 -d  forces sending out data immediately to check communication channel (DEBUG purposes)
 -c  agent collects & sends only internal HMC data
 -n  agent sends only NMON data from NMON directory <NMON_DIR>
 -b  path to Hitachi HvmSh API
 -i  IP address of HVM (Hitachi Virtualization Manager)
 -t  <max send time in seconds>
 -s  <step in seconds>; do not set it below 60, and update the crontab line accordingly, e.g. -s 300 means */5 in the crontab minute field
 -m  use sudo for multipath (only root can run it): "sudo multipath -ll"; put this into sudoers: lpar2rrd  ALL = (root) NOPASSWD: /usr/sbin/multipath -ll

 options -c and -n are mutually exclusive
 options -b and -i are both required for Hitachi agent
 no option - agent collects & sends standard OS agent data
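
The relation between the -s step and the crontab minute field can be sketched as a small helper. This is an illustration under stated assumptions: cron_minute_field is a hypothetical name, and the restriction to whole minutes that divide an hour evenly is a choice made here so that the cron schedule stays regular, not a documented agent requirement.

```shell
#!/bin/sh
# Sketch: derive the crontab minute field that matches a given -s step.
# cron_minute_field is a hypothetical helper, not part of the agent.
cron_minute_field() {
    step="$1"
    # The usage text says not to set the step below 60 seconds.
    if [ "$step" -lt 60 ] || [ $(( step % 60 )) -ne 0 ]; then
        echo "step must be a multiple of 60 and at least 60" >&2
        return 1
    fi
    minutes=$(( step / 60 ))
    # Assumption: accept only values that divide an hour evenly.
    if [ "$minutes" -ge 60 ] || [ $(( 60 % minutes )) -ne 0 ]; then
        echo "step does not divide an hour evenly" >&2
        return 1
    fi
    if [ "$minutes" -eq 1 ]; then
        echo '*'
    else
        echo "*/$minutes"
    fi
}
```

For example, `cron_minute_field 300` prints `*/5`, matching the -s 300 case in the usage text, and `cron_minute_field 60` prints `*`.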
Crontab entry for scheduling; preferably use a non-admin account:
* * * * * /usr/bin/perl /opt/lpar2rrd-agent/lpar2rrd-agent.pl <XorMon NG server hostname/IP> > /var/tmp/lpar2rrd-agent.out 2>&1
The agent collects data and sends it to the XorMon NG server every 5-20 minutes.
If you use a port other than the standard XorMon NG port, place it after the server name, separated by ':':
* * * * * /usr/bin/perl /opt/lpar2rrd-agent/lpar2rrd-agent.pl <XorMon NG server hostname/IP>:<PORT> > /var/tmp/lpar2rrd-agent.out 2>&1
To send data to several XorMon NG server instances (the number is not restricted), list all of them:
* * * * * /usr/bin/perl /opt/lpar2rrd-agent/lpar2rrd-agent.pl <XorMon NG server 1 hostname/IP> <XorMon NG server 2 hostname/IP> > /var/tmp/lpar2rrd-agent.out 2>&1
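
When the crontab line is rolled out by a script, it is convenient to add it only if it is not already there. The sketch below shows one way to do that. The helper name ensure_cron_line and the host name in the example are hypothetical; the function only filters text, so nothing is installed until its output is piped into crontab.

```shell
#!/bin/sh
# Sketch: add a crontab line only if it is not already present.
# ensure_cron_line is a hypothetical helper. It reads an existing crontab on
# stdin and writes the resulting crontab to stdout; it edits nothing itself.
ensure_cron_line() {
    line="$1"
    existing="$(cat)"
    if [ -n "$existing" ]; then
        printf '%s\n' "$existing"
    fi
    # -F: fixed string, -x: whole-line match, -q: quiet
    if ! printf '%s\n' "$existing" | grep -Fxq -e "$line"; then
        printf '%s\n' "$line"
    fi
}

# Example (xormon.example.com is a placeholder host name):
# crontab -l 2>/dev/null | ensure_cron_line \
#   '* * * * * /usr/bin/perl /opt/lpar2rrd-agent/lpar2rrd-agent.pl xormon.example.com > /var/tmp/lpar2rrd-agent.out 2>&1' \
#   | crontab -
```

Running the example twice leaves a single entry. It is worth reviewing the generated output once before piping it into `crontab -`.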

Enhanced setting

  • By default, the agent sends data to the XorMon NG server at a random interval of 5-20 minutes.
    You can specify the maximum time before data is sent; the minimum is 5 minutes.
    * * * * * /usr/bin/perl /opt/lpar2rrd-agent/lpar2rrd-agent.pl -t <max send time in seconds> <XorMon NG server hostname/IP>
    
  • To skip SAN checks via fcstat (they might cause problems, although this should not happen in v4.50+):
    * * * * * FCSTAT=/bin/true /usr/bin/perl /opt/lpar2rrd-agent/lpar2rrd-agent.pl <XorMon NG server hostname/IP> > /var/tmp/lpar2rrd-agent.out 2>&1
    
  • By default, only interfaces that have an IP address assigned are reported. Setting the LPAR2RRD_LAN_INT environment variable bypasses this check and selects interfaces by name instead; regular expressions are supported on Linux only. Be careful not to stack interfaces from different virtualization levels in one graph, because some traffic would be counted more than once and the presented totals would be inflated.
    * * * * * LPAR2RRD_LAN_INT="eth.*0$,bond.*,rhevm,9.*" /usr/bin/perl /opt/lpar2rrd-agent/lpar2rrd-agent.pl <XorMon NG server hostname/IP>  > /var/tmp/lpar2rrd-agent.out 2>&1
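
Before putting an interface pattern list into the crontab, it can be previewed against real interface names. The sketch below is an approximation only: it treats each comma-separated pattern as an unanchored extended regular expression, which may differ from how the agent applies the list, and match_lan_interfaces is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: preview which interface names a comma-separated pattern list selects.
# Assumption: each pattern is an unanchored extended regex (grep -E); the
# agent's own matching rules may differ.
match_lan_interfaces() {
    patterns="$1"
    shift
    for ifname in "$@"; do
        old_ifs="$IFS"; IFS=','
        set -f                      # keep the shell from expanding * in patterns
        for pat in $patterns; do
            if printf '%s\n' "$ifname" | grep -Eq -e "$pat"; then
                echo "$ifname"
                break
            fi
        done
        set +f
        IFS="$old_ifs"
    done
}

# Example: check a few names against the pattern list from the crontab line.
# match_lan_interfaces 'eth.*0$,bond.*,rhevm,9.*' eth0 eth1 bond0 lo
```

Names that appear in the output would be selected under the stated assumption; checking the list this way makes it easier to spot a pattern that also catches interfaces from another virtualization level.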
    

Debug

  • option -d forces sending out data immediately to check communication channel
      /usr/bin/perl /opt/lpar2rrd-agent/lpar2rrd-agent.pl -d <XorMon NG server hostname/IP>
    
  • error log: /var/tmp/lpar2rrd-agent-*.err
  • output log, last run: /var/tmp/lpar2rrd-agent-*.out
  • collected data waiting for sending: /var/tmp/lpar2rrd-agent--.txt
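
A quick way to review the error logs listed above is to print only those that are not empty. This is a minimal sketch; show_agent_errors is a hypothetical helper name, and the 20-line tail is an arbitrary choice.

```shell
#!/bin/sh
# Sketch: print the tail of every non-empty agent error log in a directory.
# show_agent_errors is a hypothetical helper; the file name pattern follows
# the error log location listed above.
show_agent_errors() {
    dir="${1:-/var/tmp}"
    for f in "$dir"/lpar2rrd-agent-*.err; do
        # -s is true only for existing files with size > 0; this also skips
        # the unexpanded pattern when no file matches.
        [ -s "$f" ] || continue
        echo "== $f"
        tail -n 20 "$f"
    done
}

# Example call:
# show_agent_errors /var/tmp
```

No output means that no error log contains any data.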