Friday, January 10, 2014

UCDavis MIB for Monitoring Linux Memory

I discovered quite some time ago that the Net-SNMP agent on my Raspberry Pi doesn't report memory utilization in the hrStorage table of the Host Resources MIB the way I would expect it to.  The value isn't wrong, but it doesn't match what most people think of as the memory in use on the device.  The reason is that Linux can use the system RAM for several things: processes, shared memory for processes, buffers, and disk cache.  When most people ask how much memory is in use, they are asking how much memory is in use by the processes.  That's the value you get on the second row under the 'used' column of the output of the free command:

However, the OIDs in the hrStorage table actually return the value from the first row of the 'used' column.  The problem is that both of these numbers represent memory utilization.  The first row shows the total of the processes, shared memory, buffers, and cache.  That's the total amount of memory in use on the system, but it isn't the value most people associate with memory utilization.

In order to get the correct value, there are two options.  The first doesn't work with NetVoyant, but it doesn't use additional MIBs or OIDs to get the data.

Since the shared, buffer, and cache memory is reported in the hrStorage table, you can simply take hrStorageUsed of the Physical Memory row (hrStorageType==1.3.6.1.2.1.25.2.1.2, hrStorageRam) and subtract the hrStorageUsed of the shared, buffer, and cache rows (hrStorageType==1.3.6.1.2.1.25.2.1.1, hrStorageOther).  Since NetVoyant can't use values from other poll instances in an expression, it won't work in NV.

Side Note: This may be possible by creating a single expression that results in the positive value of hrStorageUsed when the hrStorageType is .2 and the negative value of hrStorageUsed when the hrStorageType is .1.  The sum of that expression for all the .1 and .2 poll instances should give you the memory used by processes.  However, since the sum could only be done in a view in the web GUI, it would only work for reporting and not thresholding/alarming.
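
As a rough illustration of that signed-sum idea outside of NetVoyant, here is a minimal Python sketch.  It assumes the hrStorage rows have already been polled into plain dictionaries; the function name and the sample numbers are made up for the example.  It starts from hrStorageUsed of the hrStorageRam row (since the goal is memory used by processes) and multiplies by hrStorageAllocationUnits to get bytes.

```python
# hrStorageType values from the Host Resources MIB
HR_STORAGE_OTHER = "1.3.6.1.2.1.25.2.1.1"  # hrStorageOther (shared, buffers, cache)
HR_STORAGE_RAM = "1.3.6.1.2.1.25.2.1.2"    # hrStorageRam (physical memory)


def process_memory_bytes(rows):
    """Signed sum over hrStorage rows: +used for RAM, -used for 'other' rows.

    Rows of any other type (swap, disks, ...) are ignored.
    """
    total = 0
    for row in rows:
        used_bytes = row["hrStorageUsed"] * row["hrStorageAllocationUnits"]
        if row["hrStorageType"] == HR_STORAGE_RAM:
            total += used_bytes
        elif row["hrStorageType"] == HR_STORAGE_OTHER:
            total -= used_bytes
    return total


# Hypothetical poll results, not taken from a real device
sample_rows = [
    {"hrStorageDescr": "Physical memory", "hrStorageType": HR_STORAGE_RAM,
     "hrStorageAllocationUnits": 1024, "hrStorageUsed": 300000},
    {"hrStorageDescr": "Memory buffers", "hrStorageType": HR_STORAGE_OTHER,
     "hrStorageAllocationUnits": 1024, "hrStorageUsed": 20000},
    {"hrStorageDescr": "Cached memory", "hrStorageType": HR_STORAGE_OTHER,
     "hrStorageAllocationUnits": 1024, "hrStorageUsed": 150000},
    {"hrStorageDescr": "Shared memory", "hrStorageType": HR_STORAGE_OTHER,
     "hrStorageAllocationUnits": 1024, "hrStorageUsed": 0},
    {"hrStorageDescr": "Swap space", "hrStorageType": "1.3.6.1.2.1.25.2.1.3",
     "hrStorageAllocationUnits": 1024, "hrStorageUsed": 5000},
]

print(process_memory_bytes(sample_rows))
```

If the agent exposes other hrStorageOther rows that aren't memory related, they would need to be filtered out (for example by hrStorageDescr) before summing.
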
The second option is to use the UCDavis MIB (UCD-SNMP-MIB).  The Net-SNMP agent does populate the UCDavis tables, so any of the values there can be polled.  The problem is that there's no clear documentation on which OIDs give you which values when compared to the output of the free command.  Here's the mapping:

Given the output above, here are the OIDs or combinations you need to calculate the values:
  1. memTotalReal
  2. memTotalReal - memAvailReal
  3. memAvailReal
  4. memShared
  5. memBuffer
  6. memCached
  7. memTotalReal - memAvailReal - memShared - memBuffer - memCached
  8. memAvailReal + memShared + memBuffer + memCached
  9. memTotalSwap
  10. memTotalSwap - memAvailSwap
  11. memAvailSwap
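
The real-memory part of this mapping (items 1 through 8) can be sketched in Python as follows.  This assumes the UCDavis memory values have already been polled into a dictionary keyed by object name; the function name and the sample numbers are hypothetical, and the labels in the comments are my reading of which free column each item corresponds to.

```python
def free_values_from_ucd(mem):
    """Return items 1-8 of the list above, keyed by item number (values in KB)."""
    total = mem["memTotalReal"]
    avail = mem["memAvailReal"]
    # Memory that is in use but not by processes
    non_process = mem["memShared"] + mem["memBuffer"] + mem["memCached"]
    return {
        1: total,                        # total
        2: total - avail,                # used (first row)
        3: avail,                        # free (first row)
        4: mem["memShared"],             # shared
        5: mem["memBuffer"],             # buffers
        6: mem["memCached"],             # cached
        7: total - avail - non_process,  # used by processes
        8: avail + non_process,          # free plus reclaimable memory
    }


# Hypothetical poll results in KB, not taken from a real device
sample = {
    "memTotalReal": 448000,
    "memAvailReal": 148000,
    "memShared": 0,
    "memBuffer": 20000,
    "memCached": 150000,
}

for item, value_kb in free_values_from_ucd(sample).items():
    print(item, value_kb)
```

Note that items 7 and 8 always add up to item 1, which is a handy sanity check when building the dataset.
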
Given this, it should be pretty easy to create a dataset to poll memory.  Just remember, these OIDs are in units of KB, so if you want the values in bytes so that NV automatically scales them (to KB, MB, GB, TB, etc.), you'll need to multiply each one by 1024.  Obviously, if you're calculating % utilization, you don't need to multiply the numerator and the denominator by 1024, since the factor cancels out.  You will still need to multiply by 100 to get the ratio to a scale of 0-100%.
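
To make the unit handling concrete, here is a small Python sketch of the two conversions.  The helper names and the sample values are made up for illustration; the only assumption carried over from above is that the polled values are in KB.

```python
import math

BYTES_PER_KB = 1024


def kb_to_bytes(value_kb):
    """Convert a polled KB value to bytes so a reporting tool can auto-scale it."""
    return value_kb * BYTES_PER_KB


def percent_used(used, total):
    """Utilization on a 0-100 scale; used and total must share the same unit."""
    return 100.0 * used / total


# Hypothetical values in KB: memory used by processes and total real memory
used_kb = 130000
total_kb = 448000

print(kb_to_bytes(total_kb))              # total real memory in bytes
print(percent_used(used_kb, total_kb))    # % utilization from KB values

# The 1024 factor cancels, so bytes and KB give the same percentage
same = math.isclose(
    percent_used(used_kb, total_kb),
    percent_used(kb_to_bytes(used_kb), kb_to_bytes(total_kb)),
)
print(same)
```
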