Tuesday, December 10, 2019

Sending sysLog via Python

I had someone ask me the other day if we could ingest alarms into LogicMonitor from their app. For example, if the app had an exception raised, they'd like to send a message to LogicMonitor to open an alarm. The easiest way I could think of doing this was to have a standalone Python script that sends data via Syslog to LM. This is one way this could be done. I haven't tested this, but it's simple enough to see that it should work:

import logging
import logging.handlers
import sys
my_logger = logging.getLogger('MyLogger')
my_logger.setLevel(logging.INFO)
destIpAddress = sys.argv[1]
destPort = sys.argv[2]
handler = logging.handlers.SysLogHandler(address = (ipAddress,514))
my_logger.addHandler(handler)
my_logger.info(sys.argv[3:].join(" "))

You'd call it like this:
> send_syslog.py 192.168.25.64 514 My application had an error in the widget creator service.

Tuesday, September 3, 2019

Groovy SNMPwalk Helper Functions

tl;dr - Go here.

I've been doing a lot of SNMP polling through Groovy lately. One of the methods I use most often is Snmp.walkAsMap(), which returns a map that looks like this:
[8.6:2, 4.10:1500, 8.7:1, 8.8:1, 4.12:1500, 4.14:1500, 4.16:1500, 3.44:6, 22.5:0.0, 22.4:0.0, 
22.3:0.0, 22.2:0.0, 22.1:0.0, 22.8:0.0, 22.7:0.0, 22.6:0.0, 17.16:164136, 16.44:6639007, 17.12:8634086, 
18.8:0, 17.14:8745523, 18.7:0, 17.10:21076, 10.6:0, 10.5:0, 10.4:0, 10.3:0, 10.2:3968863511, 10.1:8980402, 
22.44:0.0, 18.6:0, 18.5:0, 7.1:1, 18.4:0, 7.2:1, 18.3:0, 7.3:1, 18.2:0, 7.4:1, 18.1:0, 7.5:1, 10.8:1684285, 
7.6:2, 10.7:4181064938, 7.7:1, 7.8:1, 15.44:0, 16.12:2114735865, 16.10:4077101, 16.16:16467729, 
16.14:2361323502, 22.14:0.0, 22.16:0.0, 9.44:2 days, 14:54:42.04, 21.6:0, 21.5:0, 21.4:0, 21.3:0, 
21.2:0, 21.1:0, 22.10:0.0, 22.12:0.0, 21.44:0, 21.8:0, 21.7:0, 5.10:4294967295, 4.44:1500, 5.12:4294967295, 
5.14:4294967295, 11.10:0, 5.16:4294967295, 10.44:1620014, 11.12:12487957, 17.8:30812, 11.14:12673362, 
17.7:17502386, 6.1:, 17.6:0, 6.2:90:e6:ba:59:1b:18, 11.16:136053, 17.5:0, 6.3:00:50:56:c0:00:01, 
17.4:20226, 6.4:00:50:56:c0:00:08, 17.3:20229, 6.5:52:54:00:9b:4b:e0, 17.2:37200831, 6.6:52:54:00:9b:4b:e0, 
17.1:44466, 6.7:02:42:39:1b:cc:e3, 6.8:02:42:55:44:3f:06, 10.10:0, 20.7:0, 20.6:0, 20.5:0, 20.4:0, 20.3:0, 
20.2:0, 20.1:0, 10.12:30849041, 10.14:131456967, 10.16:77954842, 20.8:0, 5.1:10000000, 16.8:17045376, 
5.2:100000000, 16.7:189459674, 5.3:0, 16.6:0, 5.4:0, 16.5:0, 5.5:0, 16.4:0, 5.6:10000000, 16.3:0, 5.7:0, 
16.2:1009509757, 5.8:0, 16.1:8980402, 6.12:4e:c1:4e:e6:07:90, 5.44:4294967295, 6.14:0a:2c:4f:3a:eb:5c, 
6.16:12:db:e0:82:ca:a2, 19.44:0, 6.10:06:31:50:31:a5:fb, 15.12:0, 14.44:0, 15.10:0, 21.16:0, 15.16:0, 
21.14:0, 15.14:0, 20.44:0, 21.12:0, 1.12:12, 1.10:10, 15.1:0, 4.1:65536, 4.2:1500, 4.3:1500, 15.8:0, 
21.10:0, 4.4:1500, 15.7:0, 4.5:1500, 15.6:0, 4.6:1500, 15.5:0, 4.7:1500, 15.4:0, 4.8:1500, 15.3:0, 
15.2:0, 14.10:0, 20.16:0, 14.14:0, 20.14:0, 13.44:0, 14.12:0, 20.12:0, 1.16:16, 1.14:14, 20.10:0, 14.16:0, 
6.44:62:d0:3f:14:ef:85, 7.12:1, 7.14:1, 7.16:1, 7.10:1, 14.2:0, 14.1:0, 3.1:24, 3.2:6, 3.3:6, 3.4:6, 3.5:6, 
14.8:0, 3.6:6, 14.7:0, 3.7:6, 14.6:0, 3.8:6, 14.5:0, 14.4:0, 14.3:0, 2.14:veth9a64fa1, 2.12:veth2bc8fbd, 
1.44:44, 2.10:veth92ca0b0, 19.14:0, 19.16:0, 19.10:0, 19.12:0, 18.44:0, 13.3:0, 13.2:0, 13.1:0, 2.1:lo, 
2.2:NVIDIA Corporation MCP77 Ethernet, 2.16:veth497df7f, 2.3:vmnet1, 2.4:vmnet8, 2.5:virbr0, 2.6:virbr0-nic, 
2.7:br-6a2604a91ac1, 13.8:0, 2.8:docker0, 13.7:0, 13.6:0, 13.5:0, 13.4:0, 18.16:0, 8.16:1, 8.14:1, 
17.44:18857, 18.12:0, 18.14:0, 18.10:0, 8.12:1, 7.44:1, 8.10:1, 13.10:0, 12.44:0, 13.14:0, 13.12:0, 
3.14:6, 2.44:vethd608288, 3.12:6, 3.10:6, 12.4:0, 12.3:0, 12.2:2662, 12.1:0, 1.1:1, 1.2:2, 1.3:3, 1.4:4, 
1.5:5, 1.6:6, 1.7:7, 1.8:8, 13.16:0, 9.1:0:00:00.00, 12.8:0, 9.2:0:00:00.00, 12.7:0, 9.3:0:00:06.31, 12.6:0, 
9.4:0:00:06.31, 12.5:0, 9.5:0:00:09.32, 9.6:0:00:12.32, 9.7:3:03:43.67, 9.8:0:00:18.32, 12.10:0, 11.44:4506, 
12.12:0, 3.16:6, 12.14:0, 12.16:0, 9.16:3:03:43.67, 9.14:3:03:43.67, 19.8:0, 19.7:0, 19.6:0, 8.44:1, 
9.12:3:03:43.67, 9.10:0:00:18.32, 11.5:0, 11.4:0, 11.3:0, 11.2:28692938, 11.1:44466, 19.5:0, 19.4:0, 
19.3:0, 8.1:1, 19.2:0, 8.2:1, 19.1:0, 8.3:1, 11.8:6882, 8.4:1, 11.7:25297372, 8.5:2, 11.6:0]



This great and usable, but I found myself constantly wanting to display the data easier. I also wanted the data to be structured a little more hierarchically, grouping by row in the SNMP table.  I also wanted to make it easier to address individual pieces of the data. I built a couple helper functions that transform the data and make it easier to address. They can be found here.




The output of the snmpMapToTable() function looks like this:
[6:[8:2, 22:0.0, 10:0, 18:0, 7:2, 21:0, 17:0, 6:52:54:00:9b:4b:e0, 20:0, 16:0, 5:10000000, 15:0, 4:1500, 
3:6, 14:0, 2:virbr0-nic, 13:0, 1:6, 12:0, 9:0:00:12.32, 19:0, 11:0], 10:[4:1500, 17:21076, 16:4077101, 22:0.0, 
5:4294967295, 11:0, 10:0, 6:06:31:50:31:a5:fb, 15:0, 1:10, 21:0, 14:0, 20:0, 7:1, 2:veth92ca0b0, 19:0, 18:0, 
8:1, 13:0, 3:6, 12:0, 9:0:00:18.32], 7:[8:1, 22:0.0, 18:0, 10:4181064938, 7:1, 21:0, 17:17502386, 
6:02:42:39:1b:cc:e3, 20:0, 16:189459674, 5:0, 15:0, 4:1500, 14:0, 3:6, 2:br-6a2604a91ac1, 13:0, 1:7, 
12:0, 9:3:03:43.67, 19:0, 11:25297372], 8:[8:1, 22:0.0, 18:0, 10:1684285, 7:1, 21:0, 17:30812, 
6:02:42:55:44:3f:06, 20:0, 16:17045376, 5:0, 15:0, 4:1500, 14:0, 3:6, 13:0, 2:docker0, 1:8, 12:0, 
9:0:00:18.32, 19:0, 11:6882], 12:[4:1500, 17:8634086, 16:2114735865, 22:0.0, 5:4294967295, 
11:12487957, 10:30849041, 6:4e:c1:4e:e6:07:90, 15:0, 21:0, 1:12, 14:0, 20:0, 7:1, 2:veth2bc8fbd, 
19:0, 18:0, 8:1, 13:0, 3:6, 12:0, 9:3:03:43.67], 14:[4:1500, 17:8745523, 16:2361323502, 22:0.0, 
5:4294967295, 11:12673362, 10:131456967, 6:0a:2c:4f:3a:eb:5c, 21:0, 15:0, 14:0, 20:0, 1:14, 7:1, 
2:veth9a64fa1, 19:0, 8:1, 18:0, 13:0, 3:6, 12:0, 9:3:03:43.67], 16:[4:1500, 17:164136, 16:16467729, 
22:0.0, 5:4294967295, 11:136053, 10:77954842, 6:12:db:e0:82:ca:a2, 21:0, 15:0, 20:0, 1:16, 14:0, 
7:1, 19:0, 2:veth497df7f, 18:0, 8:1, 13:0, 3:6, 12:0, 9:3:03:43.67], 44:[3:6, 16:6639007, 22:0.0, 
15:0, 9:2 days, 14:54:42.04, 21:0, 4:1500, 10:1620014, 5:4294967295, 19:0, 14:0, 20:0, 
13:0, 6:62:d0:3f:14:ef:85, 1:44, 18:0, 17:18857, 7:1, 12:0, 2:vethd608288, 11:4506, 8:1], 5:[22:0.0, 
10:0, 18:0, 7:1, 21:0, 17:0, 6:52:54:00:9b:4b:e0, 20:0, 16:0, 5:0, 4:1500, 15:0, 3:6, 14:0, 2:virbr0, 
13:0, 1:5, 12:0, 9:0:00:09.32, 11:0, 19:0, 8:2], 4:[22:0.0, 10:0, 18:0, 7:1, 21:0, 17:20226, 
6:00:50:56:c0:00:08, 20:0, 5:0, 16:0, 4:1500, 15:0, 3:6, 14:0, 2:vmnet8, 13:0, 12:0, 1:4, 9:0:00:06.31, 
11:0, 19:0, 8:1], 3:[22:0.0, 10:0, 18:0, 7:1, 21:0, 6:00:50:56:c0:00:01, 17:20229, 20:0, 5:0, 16:0, 
4:1500, 15:0, 3:6, 14:0, 13:0, 2:vmnet1, 12:0, 1:3, 9:0:00:06.31, 11:0, 19:0, 8:1], 2:[22:0.0, 
10:3968863511, 7:1, 18:0, 21:0, 6:90:e6:ba:59:1b:18, 17:37200831, 20:0, 5:100000000, 16:1009509757, 
4:1500, 15:0, 14:0, 3:6, 13:0, 2:NVIDIA Corporation MCP77 Ethernet, 12:2662, 1:2, 9:0:00:00.00, 
11:28692938, 19:0, 8:1], 1:[22:0.0, 10:8980402, 7:1, 18:0, 21:0, 6:, 17:44466, 20:0, 5:10000000, 
16:8980402, 15:0, 4:65536, 14:0, 3:24, 13:0, 2:lo, 12:0, 1:1, 9:0:00:00.00, 11:44466, 8:1, 19:0]]





It may not look much better, but if you add some carriage returns and tabs you get this:
[
6:[
  8:2, 22:0.0, 10:0, 18:0, 7:2, 21:0, 17:0, 6:52:54:00:9b:4b:e0, 
  20:0, 16:0, 5:10000000, 15:0, 4:1500, 3:6, 14:0, 2:virbr0-nic, 
  13:0, 1:6, 12:0, 9:0:00:12.32, 19:0, 11:0], 
10:[
  4:1500, 17:21076, 16:4077101, 22:0.0, 5:4294967295, 11:0, 10:0, 
  6:06:31:50:31:a5:fb, 15:0, 1:10, 21:0, 14:0, 20:0, 7:1, 2:veth92ca0b0, 
  19:0, 18:0, 8:1, 13:0, 3:6, 12:0, 9:0:00:18.32], 
7:[
  8:1, 22:0.0, 18:0, 10:4181064938, 7:1, 21:0, 17:17502386, 6:02:42:39:1b:cc:e3, 
  20:0, 16:189459674, 5:0, 15:0, 4:1500, 14:0, 3:6, 2:br-6a2604a91ac1, 13:0, 1:7, 
  12:0, 9:3:03:43.67, 19:0, 11:25297372], 
8:[
  8:1, 22:0.0, 18:0, 10:1684285, 7:1, 21:0, 17:30812, 6:02:42:55:44:3f:06, 20:0, 
  16:17045376, 5:0, 15:0, 4:1500, 14:0, 3:6, 13:0, 2:docker0, 1:8, 12:0, 9:0:00:18.32, 
  19:0, 11:6882], 
12:[
  4:1500, 17:8634086, 16:2114735865, 22:0.0, 5:4294967295, 11:12487957, 10:30849041, 
  6:4e:c1:4e:e6:07:90, 15:0, 21:0, 1:12, 14:0, 20:0, 7:1, 2:veth2bc8fbd, 19:0, 18:0, 8:1, 
  13:0, 3:6, 12:0, 9:3:03:43.67], 
14:[
  4:1500, 17:8745523, 16:2361323502, 22:0.0, 5:4294967295, 11:12673362, 10:131456967, 
  6:0a:2c:4f:3a:eb:5c, 21:0, 15:0, 14:0, 20:0, 1:14, 7:1, 2:veth9a64fa1, 19:0, 8:1, 18:0, 
  13:0, 3:6, 12:0, 9:3:03:43.67], 
16:[
  4:1500, 17:164136, 16:16467729, 22:0.0, 5:4294967295, 11:136053, 10:77954842, 
  6:12:db:e0:82:ca:a2, 21:0, 15:0, 20:0, 1:16, 14:0, 7:1, 19:0, 2:veth497df7f, 18:0, 
  8:1, 13:0, 3:6, 12:0, 9:3:03:43.67], 
44:[
  3:6, 16:6639007, 22:0.0, 15:0, 9:2 days, 14:54:42.04, 21:0, 4:1500, 10:1620014, 
  5:4294967295, 19:0, 14:0, 20:0, 13:0, 6:62:d0:3f:14:ef:85, 1:44, 18:0, 17:18857, 
  7:1, 12:0, 2:vethd608288, 11:4506, 8:1], 
5:[
  22:0.0, 10:0, 18:0, 7:1, 21:0, 17:0, 6:52:54:00:9b:4b:e0, 20:0, 16:0, 5:0, 4:1500, 
  15:0, 3:6, 14:0, 2:virbr0, 13:0, 1:5, 12:0, 9:0:00:09.32, 11:0, 19:0, 8:2], 
4:[
  22:0.0, 10:0, 18:0, 7:1, 21:0, 17:20226, 6:00:50:56:c0:00:08, 20:0, 5:0, 16:0, 
  4:1500, 15:0, 3:6, 14:0, 2:vmnet8, 13:0, 12:0, 1:4, 9:0:00:06.31, 11:0, 19:0, 8:1], 
3:[
  22:0.0, 10:0, 18:0, 7:1, 21:0, 6:00:50:56:c0:00:01, 17:20229, 20:0, 5:0, 16:0, 4:1500, 
  15:0, 3:6, 14:0, 13:0, 2:vmnet1, 12:0, 1:3, 9:0:00:06.31, 11:0, 19:0, 8:1], 
2:[
  22:0.0, 10:3968863511, 7:1, 18:0, 21:0, 6:90:e6:ba:59:1b:18, 17:37200831, 20:0, 
  5:100000000, 16:1009509757, 4:1500, 15:0, 14:0, 3:6, 13:0, 
  2:NVIDIA Corporation MCP77 Ethernet, 12:2662, 1:2, 9:0:00:00.00, 11:28692938, 19:0, 8:1], 
1:[
  22:0.0, 10:8980402, 7:1, 18:0, 21:0, 6:, 17:44466, 20:0, 5:10000000, 16:8980402, 15:0, 
  4:65536, 14:0, 3:24, 13:0, 2:lo, 12:0, 1:1, 9:0:00:00.00, 11:44466, 8:1, 19:0]
]




As you can see, it's getting easier to see what's going on. At this point, it'd be nice to order by instance ID and also order by metric ID.  One of the helper functions I built does just that, pprintSnmpWalkTable:
Wildvalue: 1:
Data sorted by column ID
  1.##WILDVALUE##: 1
  2.##WILDVALUE##: lo
  3.##WILDVALUE##: 24
  4.##WILDVALUE##: 65536
  5.##WILDVALUE##: 10000000
  6.##WILDVALUE##: 
  7.##WILDVALUE##: 1
  8.##WILDVALUE##: 1
  9.##WILDVALUE##: 0:00:00.00
  10.##WILDVALUE##: 8980402
  11.##WILDVALUE##: 44466
  12.##WILDVALUE##: 0
  13.##WILDVALUE##: 0
  14.##WILDVALUE##: 0
  15.##WILDVALUE##: 0
  16.##WILDVALUE##: 8980402
  17.##WILDVALUE##: 44466
  18.##WILDVALUE##: 0
  19.##WILDVALUE##: 0
  20.##WILDVALUE##: 0
  21.##WILDVALUE##: 0
  22.##WILDVALUE##: 0.0
Wildvalue: 2:
Data sorted by column ID
  1.##WILDVALUE##: 2
  2.##WILDVALUE##: NVIDIA Corporation MCP77 Ethernet
  3.##WILDVALUE##: 6
  4.##WILDVALUE##: 1500
  5.##WILDVALUE##: 100000000
  6.##WILDVALUE##: 90:e6:ba:59:1b:18
  7.##WILDVALUE##: 1
  8.##WILDVALUE##: 1
  9.##WILDVALUE##: 0:00:00.00
  10.##WILDVALUE##: 3968863511
  11.##WILDVALUE##: 28692938
  12.##WILDVALUE##: 2662
  13.##WILDVALUE##: 0
  14.##WILDVALUE##: 0
  15.##WILDVALUE##: 0
  16.##WILDVALUE##: 1009509757
  17.##WILDVALUE##: 37200831
  18.##WILDVALUE##: 0
  19.##WILDVALUE##: 0
  20.##WILDVALUE##: 0
  21.##WILDVALUE##: 0
  22.##WILDVALUE##: 0.0
Wildvalue: 3:
Data sorted by column ID
  1.##WILDVALUE##: 3
  2.##WILDVALUE##: vmnet1
  3.##WILDVALUE##: 6
  4.##WILDVALUE##: 1500
  5.##WILDVALUE##: 0
  6.##WILDVALUE##: 00:50:56:c0:00:01
  7.##WILDVALUE##: 1
  8.##WILDVALUE##: 1
  9.##WILDVALUE##: 0:00:06.31
  10.##WILDVALUE##: 0
  11.##WILDVALUE##: 0
  12.##WILDVALUE##: 0
  13.##WILDVALUE##: 0
  14.##WILDVALUE##: 0
  15.##WILDVALUE##: 0
  16.##WILDVALUE##: 0
  17.##WILDVALUE##: 20229
  18.##WILDVALUE##: 0
  19.##WILDVALUE##: 0
  20.##WILDVALUE##: 0
  21.##WILDVALUE##: 0
  22.##WILDVALUE##: 0.0
Wildvalue: 4:
Data sorted by column ID
  1.##WILDVALUE##: 4
  2.##WILDVALUE##: vmnet8
  3.##WILDVALUE##: 6
  4.##WILDVALUE##: 1500
  5.##WILDVALUE##: 0
  6.##WILDVALUE##: 00:50:56:c0:00:08
  7.##WILDVALUE##: 1
  8.##WILDVALUE##: 1
  9.##WILDVALUE##: 0:00:06.31
  10.##WILDVALUE##: 0
  11.##WILDVALUE##: 0
  12.##WILDVALUE##: 0
  13.##WILDVALUE##: 0
  14.##WILDVALUE##: 0
  15.##WILDVALUE##: 0
  16.##WILDVALUE##: 0
  17.##WILDVALUE##: 20226
  18.##WILDVALUE##: 0
  19.##WILDVALUE##: 0
  20.##WILDVALUE##: 0
  21.##WILDVALUE##: 0
  22.##WILDVALUE##: 0.0
Wildvalue: 5:
Data sorted by column ID
  1.##WILDVALUE##: 5
  2.##WILDVALUE##: virbr0
  3.##WILDVALUE##: 6
  4.##WILDVALUE##: 1500
  5.##WILDVALUE##: 0
  6.##WILDVALUE##: 52:54:00:9b:4b:e0
  7.##WILDVALUE##: 1
  8.##WILDVALUE##: 2
  9.##WILDVALUE##: 0:00:09.32
  10.##WILDVALUE##: 0
  11.##WILDVALUE##: 0
  12.##WILDVALUE##: 0
  13.##WILDVALUE##: 0
  14.##WILDVALUE##: 0
  15.##WILDVALUE##: 0
  16.##WILDVALUE##: 0
  17.##WILDVALUE##: 0
  18.##WILDVALUE##: 0
  19.##WILDVALUE##: 0
  20.##WILDVALUE##: 0
  21.##WILDVALUE##: 0
  22.##WILDVALUE##: 0.0
Wildvalue: 6:
Data sorted by column ID
  1.##WILDVALUE##: 6
  2.##WILDVALUE##: virbr0-nic
  3.##WILDVALUE##: 6
  4.##WILDVALUE##: 1500
  5.##WILDVALUE##: 10000000
  6.##WILDVALUE##: 52:54:00:9b:4b:e0
  7.##WILDVALUE##: 2
  8.##WILDVALUE##: 2
  9.##WILDVALUE##: 0:00:12.32
  10.##WILDVALUE##: 0
  11.##WILDVALUE##: 0
  12.##WILDVALUE##: 0
  13.##WILDVALUE##: 0
  14.##WILDVALUE##: 0
  15.##WILDVALUE##: 0
  16.##WILDVALUE##: 0
  17.##WILDVALUE##: 0
  18.##WILDVALUE##: 0
  19.##WILDVALUE##: 0
  20.##WILDVALUE##: 0
  21.##WILDVALUE##: 0
  22.##WILDVALUE##: 0.0
Wildvalue: 7:
Data sorted by column ID
  1.##WILDVALUE##: 7
  2.##WILDVALUE##: br-6a2604a91ac1
  3.##WILDVALUE##: 6
  4.##WILDVALUE##: 1500
  5.##WILDVALUE##: 0
  6.##WILDVALUE##: 02:42:39:1b:cc:e3
  7.##WILDVALUE##: 1
  8.##WILDVALUE##: 1
  9.##WILDVALUE##: 3:03:43.67
  10.##WILDVALUE##: 4181064938
  11.##WILDVALUE##: 25297372
  12.##WILDVALUE##: 0
  13.##WILDVALUE##: 0
  14.##WILDVALUE##: 0
  15.##WILDVALUE##: 0
  16.##WILDVALUE##: 189459674
  17.##WILDVALUE##: 17502386
  18.##WILDVALUE##: 0
  19.##WILDVALUE##: 0
  20.##WILDVALUE##: 0
  21.##WILDVALUE##: 0
  22.##WILDVALUE##: 0.0
Wildvalue: 8:
Data sorted by column ID
  1.##WILDVALUE##: 8
  2.##WILDVALUE##: docker0
  3.##WILDVALUE##: 6
  4.##WILDVALUE##: 1500
  5.##WILDVALUE##: 0
  6.##WILDVALUE##: 02:42:55:44:3f:06
  7.##WILDVALUE##: 1
  8.##WILDVALUE##: 1
  9.##WILDVALUE##: 0:00:18.32
  10.##WILDVALUE##: 1684285
  11.##WILDVALUE##: 6882
  12.##WILDVALUE##: 0
  13.##WILDVALUE##: 0
  14.##WILDVALUE##: 0
  15.##WILDVALUE##: 0
  16.##WILDVALUE##: 17045376
  17.##WILDVALUE##: 30812
  18.##WILDVALUE##: 0
  19.##WILDVALUE##: 0
  20.##WILDVALUE##: 0
  21.##WILDVALUE##: 0
  22.##WILDVALUE##: 0.0
Wildvalue: 10:
Data sorted by column ID
  1.##WILDVALUE##: 10
  2.##WILDVALUE##: veth92ca0b0
  3.##WILDVALUE##: 6
  4.##WILDVALUE##: 1500
  5.##WILDVALUE##: 4294967295
  6.##WILDVALUE##: 06:31:50:31:a5:fb
  7.##WILDVALUE##: 1
  8.##WILDVALUE##: 1
  9.##WILDVALUE##: 0:00:18.32
  10.##WILDVALUE##: 0
  11.##WILDVALUE##: 0
  12.##WILDVALUE##: 0
  13.##WILDVALUE##: 0
  14.##WILDVALUE##: 0
  15.##WILDVALUE##: 0
  16.##WILDVALUE##: 4077101
  17.##WILDVALUE##: 21076
  18.##WILDVALUE##: 0
  19.##WILDVALUE##: 0
  20.##WILDVALUE##: 0
  21.##WILDVALUE##: 0
  22.##WILDVALUE##: 0.0
Wildvalue: 12:
Data sorted by column ID
  1.##WILDVALUE##: 12
  2.##WILDVALUE##: veth2bc8fbd
  3.##WILDVALUE##: 6
  4.##WILDVALUE##: 1500
  5.##WILDVALUE##: 4294967295
  6.##WILDVALUE##: 4e:c1:4e:e6:07:90
  7.##WILDVALUE##: 1
  8.##WILDVALUE##: 1
  9.##WILDVALUE##: 3:03:43.67
  10.##WILDVALUE##: 30849041
  11.##WILDVALUE##: 12487957
  12.##WILDVALUE##: 0
  13.##WILDVALUE##: 0
  14.##WILDVALUE##: 0
  15.##WILDVALUE##: 0
  16.##WILDVALUE##: 2114735865
  17.##WILDVALUE##: 8634086
  18.##WILDVALUE##: 0
  19.##WILDVALUE##: 0
  20.##WILDVALUE##: 0
  21.##WILDVALUE##: 0
  22.##WILDVALUE##: 0.0
Wildvalue: 14:
Data sorted by column ID
  1.##WILDVALUE##: 14
  2.##WILDVALUE##: veth9a64fa1
  3.##WILDVALUE##: 6
  4.##WILDVALUE##: 1500
  5.##WILDVALUE##: 4294967295
  6.##WILDVALUE##: 0a:2c:4f:3a:eb:5c
  7.##WILDVALUE##: 1
  8.##WILDVALUE##: 1
  9.##WILDVALUE##: 3:03:43.67
  10.##WILDVALUE##: 131456967
  11.##WILDVALUE##: 12673362
  12.##WILDVALUE##: 0
  13.##WILDVALUE##: 0
  14.##WILDVALUE##: 0
  15.##WILDVALUE##: 0
  16.##WILDVALUE##: 2361323502
  17.##WILDVALUE##: 8745523
  18.##WILDVALUE##: 0
  19.##WILDVALUE##: 0
  20.##WILDVALUE##: 0
  21.##WILDVALUE##: 0
  22.##WILDVALUE##: 0.0
Wildvalue: 16:
Data sorted by column ID
  1.##WILDVALUE##: 16
  2.##WILDVALUE##: veth497df7f
  3.##WILDVALUE##: 6
  4.##WILDVALUE##: 1500
  5.##WILDVALUE##: 4294967295
  6.##WILDVALUE##: 12:db:e0:82:ca:a2
  7.##WILDVALUE##: 1
  8.##WILDVALUE##: 1
  9.##WILDVALUE##: 3:03:43.67
  10.##WILDVALUE##: 77954842
  11.##WILDVALUE##: 136053
  12.##WILDVALUE##: 0
  13.##WILDVALUE##: 0
  14.##WILDVALUE##: 0
  15.##WILDVALUE##: 0
  16.##WILDVALUE##: 16467729
  17.##WILDVALUE##: 164136
  18.##WILDVALUE##: 0
  19.##WILDVALUE##: 0
  20.##WILDVALUE##: 0
  21.##WILDVALUE##: 0
  22.##WILDVALUE##: 0.0
Wildvalue: 44:
Data sorted by column ID
  1.##WILDVALUE##: 44
  2.##WILDVALUE##: vethd608288
  3.##WILDVALUE##: 6
  4.##WILDVALUE##: 1500
  5.##WILDVALUE##: 4294967295
  6.##WILDVALUE##: 62:d0:3f:14:ef:85
  7.##WILDVALUE##: 1
  8.##WILDVALUE##: 1
  9.##WILDVALUE##: 2 days, 14:54:42.04
  10.##WILDVALUE##: 1620014
  11.##WILDVALUE##: 4506
  12.##WILDVALUE##: 0
  13.##WILDVALUE##: 0
  14.##WILDVALUE##: 0
  15.##WILDVALUE##: 0
  16.##WILDVALUE##: 6639007
  17.##WILDVALUE##: 18857
  18.##WILDVALUE##: 0
  19.##WILDVALUE##: 0
  20.##WILDVALUE##: 0
  21.##WILDVALUE##: 0
  22.##WILDVALUE##: 0.0



For me, this is pretty easy to read and find individual values that I'm looking for. The three lines that resulted in the above output were these:
walkResult = Snmp.walkAsMap(host, Oid, props, timeout)
entryRaw = snmpMapToTable(walkResult)
pprintSnmpWalkTable(entryRaw)



Accessing the data is pretty easy now:
ifEntryRaw.each {wildvalue, data ->
 println("""Interface ${wildvalue}:
 ifAlias: ${ifXEntryRaw[wildvalue]["1"]}
 ifDescr: ${data["2"]}
 ifType: ${data["3"]}
    """)
}



This gives us the following output:
Interface 6:
 ifAlias: virbr0-nic
 ifDescr: virbr0-nic
 ifType: 6

Interface 10:
 ifAlias: veth92ca0b0
 ifDescr: veth92ca0b0
 ifType: 6

Interface 7:
 ifAlias: br-6a2604a91ac1
 ifDescr: br-6a2604a91ac1
 ifType: 6

Interface 8:
 ifAlias: docker0
 ifDescr: docker0
 ifType: 6

Interface 12:
 ifAlias: veth2bc8fbd
 ifDescr: veth2bc8fbd
 ifType: 6

Interface 14:
 ifAlias: veth9a64fa1
 ifDescr: veth9a64fa1
 ifType: 6

Interface 16:
 ifAlias: veth497df7f
 ifDescr: veth497df7f
 ifType: 6

Interface 44:
 ifAlias: vethd608288
 ifDescr: vethd608288
 ifType: 6

Interface 5:
 ifAlias: virbr0
 ifDescr: virbr0
 ifType: 6

Interface 4:
 ifAlias: vmnet8
 ifDescr: vmnet8
 ifType: 6

Interface 3:
 ifAlias: vmnet1
 ifDescr: vmnet1
 ifType: 6

Interface 2:
 ifAlias: enp0s10
 ifDescr: NVIDIA Corporation MCP77 Ethernet
 ifType: 6

Interface 1:
 ifAlias: lo
 ifDescr: lo
 ifType: 24

Friday, July 5, 2019

Discovering Enumerated Properties

tl;dr - the scripts are here and here.

In the same vein as the previous post, this post will talk about SNMP enumeration. While the previous post talked about polling enumerated values and how to display them as intuitively as possible, this post will talk about polling more static values and using them as properties.

There are two levels of properties that can be obtained via SNMP: device level properties (that pertain to the whole device) and instance level properties (that pertain to a particular thing on the device, of which there may be more than one).  Polling those properties is easy; this post will go over how to improve the quality of the data stored so that the data can be intuitively used.

Polling properties vs. data points

Polling properties is easy. Knowing which properties to poll as properties and which to poll as data points is a separate discussion entirely. Suffice it to say that things that represent a characteristic about an object should be properties and things that represent a behavior should be data points. Depending on the tool, only data points can be alerted upon, so that may influence a decision to make something that would normally be a property also a data point.  In other cases, sometimes data points can't be used to influence enablement of monitoring; so sometimes data points need to be properties too.
Assuming you're past the point where you've looked at the MIB and figured out which OIDs should be properties and which should be data points, the next thing to look at is whether or not the OIDs you will be storing as properties have enumerations.

Enumerations

An enumeration is used in SNMP to keep the simple in Simple Network Management Protocol. Instead of passing back a string comprised of ASCII characters, a map is created that connects a meaningful string to a single integer, which is passed back to the NMS. Passing back a single integer is much simpler than passing back a string of characters of varying length.
For example, the admin status of an interface is found at 1.3.6.1.2.1.2.2.1.7. It's not a particularly good example of a property since it lends itself more to a data point, but having it as a property can allow for advanced filtering which may only be available using properties.
ifAdminStatus OBJECT-TYPE
    SYNTAX  INTEGER {
                up(1),       -- ready to pass packets
                down(2),
                testing(3)   -- in some test mode
            }
    MAX-ACCESS  read-write
    STATUS      current
    DESCRIPTION
            "The desired state of the interface.  The testing(3) state
            indicates that no operational packets can be passed.  When a
            managed system initializes, all interfaces start with
            ifAdminStatus in the down(2) state.  As a result of either
            explicit management action or per configuration information
            retained by the managed system, ifAdminStatus is then
            changed to either the up(1) or testing(3) states (or remains
            in the down(2) state)."
    ::= { ifEntry 7 }
You can see in the definition of the OID that it has a custom syntax that allows for one of three values. When this OID is polled, either a 1, a 2, or a 3 is returned. It's up to the NMS to interpret the different values to understand the meaning. If the NMS tool has MIB compilation, this may have a certain level of automation. If it doesn't, interpretation is up to you. Obviously, storing a 1, 2, or 3 as a property could be sufficient, but it would be better to store the interpreted meaning making use by humans easier and more intuitive.

Interpretation of an instance property enumeration

So how do we interpret this? It's actually pretty easy, but it involves some scripting to process the returned data. It also involves some research into the MIB to find out all the meanings. Let's start with a simple case of interfaces. In most cases a MIB will contain a "Table" OID with an "Entry" child OID containing all the instances with whatever OIDs go along with the instances. In the case of interfaces (ignoring the ifXTable), the table is called "ifTable" at 1.3.6.1.2.1.2.2. You'll see that there is a child OID called "ifEntry" at 1.3.6.1.2.1.2.2.1.  This kind of table typically has an index along with perhaps some properties and data points as columns. Each row is an instance. We're going to ignore the data points for now and focus on the OIDs that would do well stored as properties. MTU(4), speed(5), and MAC address(6) are good items to store as properties. They require no interpretation. However, ifType(3) and ifAdminStatus(7) require some interpretation to be useful.
LogicMonitor's multi-instance datasource can be configured to use a groovy script (yes, it's a real thing) to do auto-discovery, which is the mechanism that discovers poll instances and sets properties per poll instance. The concept is pretty simple, use the groovy SNMP libraries to retrieve the data, use groovy to interpret the data, then just print the data to standard output, one line per poll instance.
I recently wrote a script to do this. It's all self documented with explanations and sample output and everything. Some points to consider at the following lines:
  1. This line defines the address of the Entry table. Everything we will be polling happens to live under this branch of the OID tree.
  2. This is where we define which column of the ifEntry table contains the name that we should be using for each of our instances.
  3. This line shows the information from the MIB added to the script so that the script can interpret the meaning from the returned value for ifAdminStatus
  4. This line shows the information from the MIB added to the script so that the script can interpret the meaning from the returned value for ifOperStatus
  5. This line shows that we're still polling ifMtu as a property, but there is no interpretation available for the value. We could alternatively put ["1500":"Default (1500)"] instead of [:] to tell the script to add some meaning to the most common value of MTU
  6. ifPhysAddress is the MAC address and needs no interpretation
  7. This line shows an interpretation that isn't defined in the MIB. Instead of just passing the raw value through, we can provide our own interpretation of the speed to give some more intuitive values for common speeds. If the speed of the interface isn't in our map, the speed itself will be stored as the value. If it does happen to match on eof
  8. This is where we start to define the enumeration for ifType. Turns out ifType has over 200 different enumerated values. Each of these is defined in the script so that the proper type name can be stored as a property.
  9. Here's where you can see some sample output against a device here in my house. Notice that there's one line for each port (both physical and logical).
  10. This line is a good example showing the interpreted values of ifAdminStatus, ifOperStatus, ifSpeed, and ifType.

Interpretation of device level properties

Polling and storing device level properties is a bit simpler mainly because looping through instances is not required. We do have to provide each OID, the name we want the property stored under, and the interpretation, if any. All of this comes from the MIB. The script to do this is here. This example comes from the mGuard MIB, which is from some work I did recently. However, the OIDs can be replaced with any OID from any MIB (notice there's not a baseOID), as long as the OID returns a single value because the script does an SNMP get on that OID.

Conclusion

That's about it. These two scripts can be used to add real meaning to device and instance level properties. The only things that have to change are the data that come from the MIB itself. Perhaps one of these days I'll get around to writing a MIB parser that will output this information for all OIDs in the MIB. Yeah, when I have time.

Tuesday, July 2, 2019

Visualizing Status Codes

If you follow me on LinkedIn, you will have noticed that I changed jobs and moved from Houston to Austin. I'm now working for LogicMonitor as a Sales/Monitoring Engineer. It's a great job and I'm enjoying it a lot more than previous jobs. One of the advantages of this job is that I should have much more opportunity, desire, and content for blog posts.

If this is your first time here, know that this blog is not written for you. It's written for me. I increasingly need more and more reminders of how to do things. That goes especially for things that I devise since no one else knows it unless I tell them. This blog is primarily a place for me to keep those things written down.

Anyway, on to this blog post. LogicMonitor monitors IT infrastructure. After collecting data through various mechanisms, it stores the data in a big database in the cloud and then provides a cloud hosted front end website to display the data. Part of the display is graphs. Many times, the metrics being graphed lend themselves to being plotted on a Cartesian coordinated graph. However, sometimes the metric being polled is a status code. A good example of this is license status on Sophos' XG Firewall. This metric is found at .1.3.6.1.4.1.21067.2.1.3.4.1.
asSubStatus OBJECT-TYPE
    SYNTAX          SubscriptionStatusType
    MAX-ACCESS      read-only
    STATUS          current
    DESCRIPTION     " "
    ::= { liAntispam 1 }
The syntax is "SubscriptionStatusType", which is an enumerated type meaning that only a number is returned, but that number has a meaning depending on the different values returned. Looking at the syntax definition in the MIB will help illustrate:
SubscriptionStatusType ::= TEXTUAL-CONVENTION
        STATUS            current
        DESCRIPTION       "enumerated type for subscription status"
        SYNTAX INTEGER {
                 trial          ( 1 ),
                 unsubscribed   ( 2 ),
                 subscribed     ( 3 ),
                 expired        ( 4 )
        }
So each different value returned indicates a particular state of the license subscription. It's not like a percentage where 100% is good and 0% is bad and there might be values in between. It's not like a rate, where a high number is fast and a low number is slow. It only has discreet values and values in between don't actually have any meaning.
Normally, without putting in much effort, someone might easily create a graph that just plots this number, putting time on the x-axis and the value retuned on the y-axis. This results in what you see here:
As you can see, it's not very helpful. It's a flat line because the status has been the same for the entire time range. That's ok. However, there's no real indicator of meaning. Some effort was made to add the meaning to the data point description (which appears in the tooltip). However, the enumeration is so long that it doesn't really fit in the tooltip. What does a 3 mean? Also, it's not illustrated here, but what happens if the value changes from 3 to 4? There would be two flat lines, one at 3 before the change and one at 4 after the change. But what would be shown at the change? Would it be a vertical line from 3 to 4? Would it be slightly slanted? Also not illustrated here, but what happens when larger timeframes are chosen and values are aggregated together (most often using an average)? Imagine that line at 3 that transitions to 4. What if that happened in middle of the quarter and you viewed it at the end of the quarter? If this status was polled every hour, that would mean 2190 data points to display! That's too many. Almost every graphing solution would attempt to decrease the data points by grouping points and averaging every group. In the case of a quarterly timeframe, it might simplify by averaging all data points for a single day together. This could be fine for most days, except for the one where there was a change. That would show an average of 3's and 4's, yielding a value of 3.5. WTH does 3.5 mean? It gets worse if you have a 2 that transitions to a 3 which then later transitions to a 4. You could end up with an average of 3, indicating no problem at all!?!?  It's not intuitive; and graphs need to be intuitive.

So, what do we do?

Well, we might be tempted to normalize the data. This is actually a very good idea. Let me explain: normalizing the data transforms it into a scale that is more intuitive. For example, we might say that we will normalize the data using the following rules:
  1. A value of 3 is good, so we'll call that 1
  2. Any other value is bad, so we'll call that 0
Pretty cool. That's a pretty good one. Any time everything is ok, we would plot a 1. Any other status is undesirable and we would plot a 0. Transitions are still ugly and potentially troublesome. If we make one tweak, it could allow us to put some context around the resulting values. What if we changed it to 100% instead of 1 and 0% instead of 0? If we did that, we could actually put some additional meaning behind the resulting values. Thing about it, if everything is good, you're plotting 100%. What does that mean? It means that for 100% of the timeframe displayed, the status was good. If the status is 4, we'd see a line down at 0% meaning that for 100% of the timeframe displayed, the status was not good, or conversely: the status was good 0% of the timeframe. We could also create another data point which is the inverse of our normalized data:
  1. If value is 3, plot 100, else plot 0
  2. Plot (100 - the value from above)
Doing this would also let us use a more intuitive graph called a stacked area graph. We could plot our normalized data using a pleasant color like blue or green and plot the inverse data using a warning color like red or orange. This would give us two series of data that compliment each other and when plotted as a stacked graph would look like the second graph here (the first graph is a status code plot for reference):
See how much more intuitive that is? The first graph in the above picture is a plot of the raw status code. How easy is it to know when things aren't good (without reading the axis labels)? Doable but not instantly intuitive. Imagine you had 12 of these kinds of graphs on a single dashboard on a wall. Would you want to take the time and effort to read the axis labels of each one to know how things are going? No. Now look at the second graph. It's pretty easy to tell that there were two different problems between 8pm and 11am and 1pm and 5pm.  We could even remove the y-axis labels and you'd still probably be able to tell with a glance how things are doing. Imagine 12 different graphs like this. How easy is it to see if there's a problem? Just look for the red!

There are only two drawbacks. Have you noticed? We see that there are two problems, but are they the same problem? Actually, they're not. Also, are they the same severity of problem? The morning problem is that status goes from "subscribed" to "expired". The afternoon problem is that the status goes from "subscribed" (did you notice that it returned to a good status at noon?) to "trial". The morning problem is worse than the afternoon problem.  There's a way to visualize this so that it all looks good. Let me explain:

Essentially we want to normalize the data but still keep as much detail as possible. We'll need to have four series, each with its own color. For any one data point, we'll only have a value of 100% in one of the series. All the others will be 0%. Meaning that for the timeframe that data point represents, whichever series has a value of 100% indicates the status for that moment. Let's look at the normalization rules:
  1. If the status code is 1, return 100, else return nothing (100 here means Trial status)
  2. If the status code is 2, return 100, else return nothing (100 here means Unsubscribed status)
  3. If the status code is 3, return 100, else return nothing (100 here means Subscribed status)
  4. If the status code is 4, return 100, else return nothing (100 here means Expired status)
This is what it would look like (graph on the right, first two shown for reference):
Notice how the problem in the morning is highlighted with a red and the problem in the afternoon is highlighted with a yellow? Easy to tell that there are two problems, that they are different, and that the morning problem is the more severe. 

This is what the final version would look like in the LogicMonitor web gui (not interesting I know since the status code didn't change the whole time I was building this):



Here's how it's built in the GUI:
Notes on the screenshot:

  • it shows line types of "Area" but they should be "Stacked" to display properly
  • the formulas should be `if(in(StatusCode,3),100,unkn())`

Monday, June 3, 2019

How to share a ton of stuff with someone else


If you have the stuff to share:

  1. Download and install Resilio Sync
  2. After opening the app, click the plus sign in the top left corner and select "Standard Folder"
  3. Browse to the folder you want to share with someone else and click "Open"
  4. A new entry will appear in your list of folders. At the right end of this entry will appear three dots (when you mouseover the row). Click the three dots and select "Copy Read Only key" or "Copy Read & Write key" depending on whether or not you want the sharer to be able to change what's in the shared folder. They key is now in your clipboard.
  5. Send the key to the person you want to share with.

If you have received a key:


  1. Download and install Resilio Sync
  2. After opening the app, click the plus sign in the top left corner and select "Enter key or link"
  3. Paste in the key that was sent to you
  4. Browse to the folder you want to synchronize and select Open.
  5. Go grab a coffee and chips and wait until the synchronization finishes.


Disclaimer: using any protocol to transmit/receive data that you are not legally allowed to transmit/receive is obviously illegal. I'm not responsible if you use this to do something illegal.

Tuesday, April 16, 2019

One Trailer for Each Piece of the Saga

Saw an article bringing together all the trailers for all the Star Wars productions. They did the episodes 1-9 first, then the ancillary productions. I decided to put them into a YouTube playlist. You're welcome.

Don't forget the one that you can't find on YouTube, it's after Episode 6, before 7.

Friday, March 22, 2019

Web GL Globe

I have spent some time now working in the Oil Industry. I've been working in technology, so I haven't had much to do with actual oil. However, I did come across this really cool visualization of oil imports and exports when I was searching for a way to visualize some network performance data. It's based on a bit of code by Google called the WebGL Globe. WebGL Globe is built on a technology called WebGL, which is like OpenGL except that it runs natively in the browser. It allows for interaction like what you would expect from a 3D graphics game but right in the browser with web code. Some of the examples are pretty cool and load extremely quickly because they don't really use the normal document structure. (Some other examples of really neat uses of WebGL and/or three.js, a library to facilitate using WebGL: Galactic Neighbors Internet Safety game for Kids by Google Lubricious)

The concept is pretty expansive when you think about the kinds of data that can be shown. WebGL Globe combines data that has several dimensions of information encoded and visualized:

  1. Node Location - latitude and longitude
  2. Node connections - showing that two nodes are connected in some way (shown as an arc when you click on a country in the World of Oil visualization).
  3. Connection intensity - this one can have multiple dimensions depending on how creative you get. For example:
    1. the line color itself can indicate some sort of status (red/green/yellow/orange). It's even possible that you could show percentages of the arc length as certain colors to indicate distribution of the status (i.e. 10% is bad while 90% is good might mean a line that is mostly green with a segment of length 10% that is red).
    2. the thickness of the arc can be another dimension, indicating something like volume
    3. the maximum altitude of the arc can be another, indicating sample size
All this is neat, but what good does it do anyone. Well, if you've watched my Analyzing TCP Application Performance video, you know that monitoring of response time is the most important part of any infrastructure monitoring. If you're doing application response time monitoring, you should have data for each transaction showing how each transaction performed. That's a huge amount of data (big data anyone?). 

Visualizing that data requires a few steps of summarization and grouping. First of all, every transaction's metrics should be retained individually so that you can dive into the details if needed. Second, you can start grouping by user then by summarizing by time buckets (1 minute or 5 minute). Another level of grouping that can be done is to group by network location. This can legitimately be done because most users at a particular location are going to share a very high percentage of the network path that gets them to the services they are consuming. 

Let me rephrase: You should have data that describes sources, destinations, and the performance between them. Sound familiar? What I'm envisioning is taking this data and plotting it out using WebGL Globe. Each network location and each service hosting location is a node (hm, could cloud services actually be represented as clouds?!?). They'd be connected with arcs. Thickness of the arcs could represent the number of transactions between those two nodes. Height of the arc could represent the recent negative variability in performance or a way to highlight the selected arc. Color of the arc could show a measure of the deviance in performance. 

Click on a node and you'd see the arcs from all the user locations consuming services from that site (if any) and the arcs to all the service hosting locations consumed by that site (if any) pop up in altitude (normally they'd be displayed at sea level or hidden based on a GUI setting). This is similar to the action you see in the Global Oil visualization when you click on a country (you see the countries connected to it). 

From there, if you saw that all the arcs were showing problems, you would know (from problem domain isolation) that there is a problem common to all users at that site. You could click on the node again and get into performance stats for the infrastructure at that location (from a tool like LogicMonitor). Following the colors should get you to the root of the problem pretty quickly.

Alternatively, if you saw a problem in only one of the arcs (or a small selection) follow problem domain isolation tactics and click on the arc having the problem. That should dive you into the infrastructure connecting those two nodes so you can find the problem.

Nodes themselves can have problems that don't evidence themselves outside the node. If that's the case, you should be going into the node to look at why there are performance issues. You'd know to go to there because the node icon itself would be showing some color indicator that there's a problem.

Friday, March 8, 2019

Using Docker

Just a couple articles that have been camping out in my browser since I found them. They were very useful in helping me get docker doing useful stuff, like Ansible.

https://blog.docker.com/2016/09/build-your-first-docker-windows-server-container/

https://www.digitalocean.com/community/tutorials/how-to-install-and-use-docker-on-ubuntu-18-04

alpine/git          latest    a1d22e4b51ad      10 days ago     27.5MB
ansible-docker      latest    25b39c3ffd15      2 weeks ago     153MB
ubuntu              latest    47b19964fb50      4 weeks ago     88.1MB

Ansible-docker is a container I modified to suit my needs. You can use it by either cloning the git repository and building from source or just pull it from the hub:
docker pull sweenig/ansible-docker

I used this to know how to push images to docker: https://ropenscilabs.github.io/r-docker-tutorial/04-Dockerhub.html

alpine/git is the easiest way I've found to run git on Windows (hint, it's not actually running in Windows but in Linux inside the container).

Thursday, March 7, 2019

Object Oriented Programming

Today's blog post may have been posted before, but it's a really good one. If you are looking to get into object oriented programming, you should give this a look: https://inventwithpython.com/blog/2014/12/02/why-is-object-oriented-programming-useful-with-a-role-playing-game-example/.

Tuesday, May 8, 2018

HTML Maps, Continued

Continuing a previous post, I decided to add some CSS to make it obvious that you can click on the mapped links.  Here's the CSS:

<style>

  a:hover {border:1px dotted gray;}

  a {position:absolute;}

</style>


Obviously, you may want to use CSS selectors to make sure that only your mapped links on images get styled this way.

Wednesday, December 13, 2017

Online circuit schematic design and simulation

This is pretty cool, although I haven't actually had a chance to use it since most of my circuits don't involve much signal processing and are quite simple DC circuits. However, LushProjects has this circuit simulator that is completely online and free to use (unlike SPICE). The neat thing is that you should be able to embed your own circuit onto any web page using this tool and iframe.

Tuesday, December 12, 2017

Hard puzzle with an easy solution

I've been keeping this tab open on my browser because I wanted to work out the solution myself. I finally figured it out (2 years later). Don't give in and look at the solution without giving it a good try. Hint, you can do it without any trigonometric functions and without Pythagoras' help.

This is known as Langley's Adventitious Angles. And a good visual solution can be seen here (warning Flash required).

Monday, December 11, 2017

Boolean Arithmetic

I had to explain Boolean arithmetic the other day to non-makers. These guys didn't have any real experience with electric logic circuits, but the pictures here seemed to help.

Friday, December 8, 2017

Prusa's ColorPrint tool

When 3D printing, there's usually a jump in cost and complexity for printers that can print multiple colors. As a workaround, you can pause the printer, switch the filament to a different color, print a few layers with that different color, then pause, switch filament, then resume printing. This can be a very tricky thing to do manually, so obviously, there is a tool to do it.  Here are some prints by a buddy of mine that utilized this tool for these prints. He prints a black surface with a cutout for the image, prints a few layers of a different color (glow in the dark in this case), then resumes printing in black. He didn't have to design the model with any major modifications, just the first few layers cut out so that the glow in the dark layer can shine through.

Thursday, December 7, 2017

PiVPN

Rolling your on VPN can have various benefits. The biggest of which is that when you're on an unsecured network (i.e. any wifi network that you don't own yourself) your traffic is encrypted back to your home and then goes out to the internet. This means that you don't have to trust that the WiFi owner (think Starbucks or McDonald's) isn't snooping on your packets. It doesn't matter if they do snoop it because you're packets are encrypted and nobody can understand it unless they are you or your RaspberryPi at home.

Before you contest, yes, I know that any form of encryption can eventually be beaten. If you're that paranoid about someone decrypting your packets (which would take years by the way) you should be off the grid.

That said, I looked into setting up a VPN option for myself and eventually found PiVPN. This little one line installer sets everything up on your RPi so that it becomes a VPN endpoint. Use it to generate a certificate which you can load on your device (I've tested on iOS and Windows 10) into the freely available OpenVPN client.

I have since found, but not installed/tried a web GUI that should let me manage PiVPN through a browser. I hope to try this eventually after I have some free time. So, probably next year!

Tuesday, December 5, 2017

PressPi

I wanted to figure out the best way to put WordPress on a RaspberryPi. Turns out the best way is an image called PressPi. Load this up, lock down all the security, load CertBot so it's all running over https and you're good to go.

In case you're interested, here's a good set of instructions for installing Wordpress on Ubuntu 16.04. So many instructions! You might need phpMyAdmin as well.

Monday, December 4, 2017

HTML Maps

I recently needed to overlay a bunch of links on top of an image. This is done one of two ways, CSS being the more modern way. Essentially, you create a div with a bunch of elements inside it, which elements are all positioned absolutely.

<div style="position:relative; height:786px; width:537px; background:url(myimage.png) 0 0 no-repeat;">
     <a style="position:absolute; top:393px; left:147px; width:87px; height:69px;" title="asdf" alt="asdf" href="asdf" target="_self"></a>
</div>

Instead of mapping out all the positions manually, there's a really neat tool that will let you do it right on top of your own image and then generate both the HTML map code as well as the more simple CSS code to render it on a webpage.


Friday, December 1, 2017

W3Schools and their incredible CSS Library

Anyone who has built a website in recent history knows the importance of good CSS. I myself have been compiling a master CSS sheet that I use on most of the web development projects I'm involved with. I realized that I was trying to accomplish the same set of outcomes over and over, so a standard library of CSS styles was a natural shortcut to a good end.

I know some have been critical of W3Schools.com and their no nonsense way of explaining web development concepts, citing technical inaccuracies and nuances. I've found that those perceived inadequacies either can't be discerned by "normal" people or don't have a discernible impact on the end product. As such, I've been a fan for a couple years now. I've built their site into my Google searches so that I know I'll end up going to the answer that I'm sure they've provided straightaway.

I've used a few of their tools from the CSS section over the last few years, particularly pleased with their tooltip implementation. That's when I discovered that the CSS sheet that they use for their own site, which has all of the CSS needed to implement all of the cool, modern utilities is free to use. They even encourage it!

There are a couple things I like about it:

  1. All their examples use this single sheet. I don't have to understand a concept, then look up a different place to find out how to use the W3.CSS framework to implement it. 
  2. It uses pure CSS. I only include one CSS reference and I'm good to go. There's no need to import a jQuery/javascript library as well to make it all work.
  3. It treats responsiveness and mobile first as the highest priorities. This is what makes simple websites look like websites developed my multi-billion dollar corporations.
  4. Templates!
I used one of the templates here. It's one page. I don't host any javascript nor CSS files. Even the icons are used from frameworks referenced and explained by W3Schools.

Thursday, November 30, 2017

Installing LAMP in one step

Go educate yourself on LAMP.

Installing LAMP has been getting easier over the years. Now you can install it with a single command line:

sudo apt-get install lamp-server^

More information here.

Encryption Everywhere

Anybody who has stood up a web server knows the importance of securing that connection. Watch this video:


While I don't yet use the HTTPS Everywhere add-on, I do make use of Certbot. You can see an example here. This website runs on a LAMP server on AWS (the free tier). From beginning to end, except the coding of the site itself, I had the secured site running in about 15 minutes. Several cool things happen when using Certbot:

  1. It's aware of the multiple hosts you may have configured in your web server and lets you run for specific hosts.
  2. It automatically configures http redirect. This means that even if a user accidentally left of the https:// from the address to your site, they'll get redirected to the https version automatically. When I first did this manually, it took me several days to get it working right. 
  3. The certificates are free because they have a short life span. So, Certbot has to be run regularly to get a new certificate. You don't have to pay attention to that cycle though because you can run the checker daily or weekly and it won't do anything unless the existing certificate is close to expiration.

Mini blog posts

Since I don't really have the time anymore to do long, in-depth blog posts, I've decided that I'll start doing mini posts with tidbits of information. When I started this blog, it started out as a place for me to post stuff I needed to remember and/or have a place to write stuff down that I could recall easily. This is a continuation of those efforts. I'll be picking suspended Chrome tabs and detailing why I've kept that particular tab around.

Tuesday, September 12, 2017

Hurricane Harvey - Our Story

Hurricane Harvey started affecting us Friday, 25 AUG 2017.  It was my Friday off, and we were preparing for the next Monday when the twins would enter kindergarten. As such, Friday was "Meet the Teacher" at their school. The sky was dark and there was light rain. It felt like a good day to cuddle up and watch a movie. Friday evening, I was in touch with our ward leadership as we coordinated our response teams.
Our band is to perform on 9 SEP, so the next morning (Sat 26 AUG), Christy and I got together with the rest of the band and had practice over near Black Horse Ranch. We were finishing up with the first set as the bottom fell out so we decided to break. We were across Cypress creek from our kids and we wanted to make sure we got back to them before any flooding started. We had identified some friends of ours who were in a neighborhood near us that was likely to flood. They were preparing to move out the following Thursday (31 AUG) so they had most of their stuff in boxes already. She (Kelia) is almost 9 months pregnant so it was agreed that they would preemptively evacuate to our home. On the way home, we went over and helped him (Kris) lift some of their furniture onto blocks and 2x4's. They had a few other things to finish and he has a large pickup truck, so we left them to finish expecting them to come over later in the day. They could get out even if the flooding started. It's important to note that their neighborhood usually floods before anything else in the area and when the flooding is really bad, their neighborhood dumps out into our neighborhood. As long as they are not flooding into us, we don't have a problem draining our neighborhood.
They came over later that afternoon and the kids started playing. We got them setup in our spare bedroom with bunk bed cots for the kids (4 total people in that family). Kris and I went back over to his neighborhood. He went back over to the house to get a few things they had forgotten and I went to help another friend attempt to waterproof his garage door. We jammed some tarps into the hinges of the garage door and weighted it down with landscaping bricks. This turned out to be a pretty good barrier against the water that eventually rose a couple feet above the bottom of the garage door. When we returned, there was only water in the street gutters. A few hours later, water had risen to cover the street. We started keeping an eye on things. The reservoir we drain into was well below us, so I wasn't worried that flooding would get to dangerous levels for us. The last two major floods had not produced enough water quickly enough to have it come up more than halfway up the driveway.

Sunday morning (27 AUG) dawned with some water covering the streets, but less than the highest point overnight. The other friend in the same neighborhood that always floods first had not yet evacuated. The overnight rise of water had not receded so they were looking at evacuation options. Kris and I rode in his big truck and started to prepare their house for flooding and to convince them to evacuate. It became evident that the water was going to keep rising. Last year, this family had waited until it was too late to evacuate during the 2016 tax day flood and had to be evacuated by canoe. We emphasized how important it was to avoid getting to that point again. The father wanted to wait it out, so we took the mother and kids to another ward member's two story home which was serving as a dispatch location for the emergency crews. The father was left to his own devices to get out (which he eventually did on his own).
A large number of ward members had congregated at that two story home for a previously scheduled baptism. Since extended family had flown in for the event, it was decided that the baptism would be performed not at the presently closed Eldridge building, but in the pool in the rain (pretty memorable!). Since there was a large group and the Bishop was present, it was decided that the sacrament would be administered since all other church meetings were cancelled. Shortly afterward, the rain lightened up and most of our street drained.
Sunday afternoon, a rescue request came in for a family in Enchanted Valley. It was outside our area of responsibility, so our dispatch tried to find resources in the area. Unfruitful, he dispatched Scott with his big Yukon. They made it to the family and loaded everyone up. There wasn’t room for two of the four rescuers, so they stayed behind to be retrieved after dropping off the family at a safe location. Upon returning, Scott decided to splash around a little and stalled his Yukon. They pushed it up onto dry land and notified our dispatch. I saw that call come in and reached out to a few Jeeper groups who had been offering help and making rescues since the rain started. Allan and Z responded and we headed out toward Telge and 290. I made it past the Sheriff station before the water started getting deeper. I gave Allan and Z my tow strap and they continued on (they have a few inches more clearance than I do). Their two door jeep would only hold two of the four rescuers, so Scott and his nephew stayed with his truck while the two who had originally stayed behind were brought back to me. I had backtracked and waited under the 290 bridge at Telge. Upon returning, the water had risen. Allan commented that he could probably make it back in, but he was worried about getting out while the water was still rising. We decided to attempt to rescue via Huffmeister. It was dark by this time. In my lower Jeep, I led the charge. The streets were clear of water until about a half mile north of Cypress North Houston. I was cruising at about 40mph when we hit the water. Needless to say, it was a pucker moment. We were all fine, but it was one of those moments where everything went into slow motion. The water started getting deeper, so again Allan in his swamp thing went ahead to see what they could see. The water ended up being too deep (reportedly about 6’) so, we weren’t getting to Scott and his nephew any time soon. They would have to ride it out. The rain started coming down harder and we had just received news that the Addicks and Barker reservoirs would be opened up at 2am. Not yet knowing how this would affect the current water levels, we decided to break for the night.  We went to bed late Sunday night as we watched the waters begin to slowly creep over the street in front of our house.

Monday morning (28 AUG) when Christy got up, Kelia told her the water outside, which was up to the sidewalk, was no longer draining away.  Christy insisted that we start making plans to raise our important possessions and make an evacuation plan. I was hesitant because of past experience with extreme flooding had never given us any problems. I reluctantly conceded though and we figured out what we would do if we decided to evacuate.
We found out that the neighborhood that always floods first had breached the main road and was spilling into our neighborhood. It wasn't going to get any worse for them, but it was coming in quickly enough that our drainage system wouldn't be able to keep up for long. I went for a hike in my chest waders and saw our main drainage creek rising. This meant that what we were draining into was full and it was only going to get worse from here. This had never happened before. It turns out that the Addicks reservoir had filled up. <a href="https://www.usatoday.com/story/news/nation-now/2017/08/28/controlled-release-water-houston-reservoirs/607594001/">The Army corps of engineers had already opened it up to drain it</a> (which would send the water south toward the ocean) but the water leaving was less than what was coming in. I broadcasted my hike live over Facebook (https://youtu.be/-XQDWJn-1Tw). Upon returning home, I had decided it was time to get out while the water was low enough for my Jeep and Kris' truck. Our neighbors (4 adults and one infant) also needed to evacuate. We rallied everyone into motion and started implementing our plan. We got everything that we could think of up a as high as we could. Kris and I loaded our vehicles with the essentials we would be taking with us. I got my family loaded up in my Jeep and Kris got his and the neighbors loaded up in his truck. My neighbor got 3 videos of our escape (part 1, part 2, part 3) from the back of Kris’ truck.
My Jeep dove into the water and got us to a high point right before the exit of the neighborhood onto Barker Cypress (which was the spillover point for the first neighborhood that flooded). I parked there and we made sure the boys had their life jackets on and seatbelts off (in case we had to ditch the Jeep). About 8 minutes later (which seemed like an eternity) Kris' truck caught up with us and we pushed forward into the deeper water right before the exit onto Barker Cypress. It's at this point that I think I got water in my differential, more on that later. With no option but to push forward, we got water up to top of the Jeep tires before making it out onto the shallows of Barker Cypress. We turned south away from Cypress Creek and away from 290 where the water was coming from. We made it down to Tuckerton without any real issues except for some water up to the middle of the Jeep tires. It was dry from there on out. My plan was to head Southwest until we found a place to land. While en route, James, a fellow Cub Master, texted me offering to let us come to his place indefinitely, which we did. His neighborhood was wet but didn't have any water on the streets. I realized later that we were living out the story of the three little pigs, Kris' family fleeing the straw house from the big bad Harvey to our house of sticks, which we eventually fled to James' house of brick.
I spent most of Monday afternoon coordinating rescues, surveying potentially flooded/closed streets, and making various runs to the Longenbaugh Mormon church, which had been turned into a shelter. There was a ton of food and other donations to be received and sorted as well as families to take care of. I got a call later in the day asking to help with evacuation of a family of 13 near the intersection of Queenston and Tuckerton. I made it to the Shell station there, which had turned into a staging point for various high clearance vehicles that were going in to make rescues. I arranged for an ATV with a flatbed trailer to make the run into the house where the family was. They would bring them out to me and I would take them to a shelter. After the ATV was dispatched, they family called and cancelled. I still feel bad for the driver of the ATV. I got signed up to do a shift on Tuesday from 4-8pm at the Longenbaugh building. The shelter required two Elders or High Priests present at all times.

Tuesday morning (29 AUG), Christy and I ventured out in the Jeep to try to make it back to the house. Several reports on our neighborhood Facebook page indicated that the waters were receding. We found high water on Red Rugosa, but not too much covering the street in front of our house. We discovered that the water seems to have entered the garage and gotten to the front porch, but didn't come into the house. We also found a telephone/electrical pole which had been parked at the end of the street waiting to be installed that had floated into our front yard. This was surprising, but not unreasonable. It was wood and the water was high and apparently thrashing towards the three storm drains in front of our house. Some neighbors saw it churning around and lashed it to one of my trees. Christy and I elevated a few more things in case the water came up some more. I built a couple of impromptu sandbags out of wet towels, garbage bags, and landscaping bricks to put at the front and back doors. I spent the afternoon at the Longenbaugh building. The clouds moved off, the sun shone. It felt like a good sign that the storm was over. There was a rumor about a kicked in door across Barker Cypress, so I decided that I would spend the night at the house with my shotgun. It also gave me a chance to watch Guardians of the Galaxy 2.

Wednesday morning (30 AUG), our roads were dry and we had decided that we could probably come back to our house from James'. The sun had started to shine and only the lowest intersections still had water. Harvey had moved on to east Houston, so while we were technically still in the storm, we were now on the dry side. We came back home and started to put things back down on the ground. I spent the afternoon loading up small items from Kris’ house and being a shuttle driver for the ward team that was gutting a home. I eventually got some food brought in for the teams in both locations.

Thursday (1 SEP) we spent most of the day moving Kris and his family into their new home. The roads were dry, so it was a simple matter of loading up the U-Haul twice. Their new home is less than a mile away, but on our side of Barker Cypress (less chance of flooding).

Friday (2 SEP) I spent the morning getting some things back in place around the house until my brother, John, came in from Dallas. When he got here, we got together with the ward team to work on removing some wood flooring from a flooded house. That took the rest of the night and we only got the main living room (150/1200 sq. ft.).

Saturday (3 SEP) I dropped my brother off with the team that would continue for the next six hours working on that wood floor. I had arranged to attend a differential fluid changing party hosted by a shop owner on Clay road just inside the beltway. They were changing fluids for free, so it was a good opportunity to make sure everything was in working order and also make sure I got the water out of my gears. They also gave me some pointers which made installing the wiring harness for my trailer hitch dead simple. I got done with that around 1pm and went to act as a coordinator for the team that was finishing the wood floor removal and the other team that had begun gutting another house.

Sunday (4 SEP) began with gutting a few houses in our neighborhood. We had abbreviated church meetings at 1pm during which a new Bishopric was called and we were notified that we would be meeting back in the West road building for the foreseeable future.

Monday (5 SEP) was Labor day and our crew chief had advised that those of us who had been working for several days straight take some time off to recover. I heeded that advice and played with the kids. I brought out the slot car track and we raced. In the evening, grandpa invited us over to go fishing. He had just bought three new kids fishing poles (Star Wars themed, of course). Luke caught a baby brim and a baby bass. I caught a turtle and Grandpa caught a brim and another turtle. We let the first one go, but decided to relocate the second one since the turtles have a tendency to kill the ducklings. Cole became an expert caster, sometimes throwing his practice weight 25 feet from the shore.

Tuesday (6 SEP) meant a return to work; the Chevron offices had been closed since the storm. It appears the tunnels were flooded since the demo work had already been done and there were dozens of fans and dehumidifiers.

Wednesday, March 2, 2016

Rate, Volume, Utilization, and Parsecs

But wait, the parsec is not a unit of time, but a unit of distance! Wait, what? All arguments aside about how the Millennium Falcon could make the Kessel run in a shorter distance through enormous gravitational shears, knowing your unit is extremely important.
I work in network monitoring and one of the main reports my tools provide measures how much an interface is used. Because the tool is better than poo, it presents the utilization in several different units. First, let's review the units. Each of these units are SI units, so standard SI prefixes apply when talking about larger bases of the base unit:
  • Bytes - measures the total number of octets that were transmitted (or received depending on p.o.v.)
  • Bits per second - measures the number of 0's and 1's that were transmitted (or received depending on p.o.v.) in a single second.
  • Percent utilization (%) - measures the percentage of a period of time that the interface was transmitting (or receiving).
Let's break it down.

Bytes

This one is pretty simple and is referred to as VOLUME. It's simply the total number of Bytes transmitted (or received) during the measurement window. An SNMP polling station would poll the octet counter at a regular interval. Every time the octet counter is polled, the delta between the previous poll results and the current poll results represents the total number of Bytes during the measurement interval.
V = B1 - B0
Polling too frequently will result in small values. Whenever rollups happen, the individual data points should be summed (integrate over the rollup interval). As long as rollups are done that way, the poll rate is less consequential.
Rollover is accounted for by assuming that a lower number than the previous measurement is caused by rollover and the new measurement (measured from 0) is added to whatever remained between the previous measurement and the max limit of the counter.

Layman's example

This is similar to tracking how many miles a car travels. You simply take a reading of the odometer before beginning a trip and another at the end of the trip. The difference is the total miles the trip entailed. You could take readings more often. You'd just need to add up all your measurements at the end of the trip to get the total for the trip.

Bits per second

Bits per second is a simple count measured over a unit of time, making it a RATE. It counts the number of bits that went through the interface, then normalizes the count over a standard unit of time, the second. It is calculated like this:
R = (Δ bits) / (Δ time)
That is, you take the total number of bits and divide it over the total time of the measurement. This is usually done through SNMP by looking at the octet counters. The NMS will poll the sysUpTime and the octet counters at a certain time (T0 and B0). It will then poll the sysUpTime and octet counters at some other time in the future (T1 and B1). The RATE is calculated by dividing the difference between these two measurements (and adjusting the octet counters to get it into bits instead of bytes 8 bits = 1 Byte):
R = 8 (B1 - B0) / T1 - T0
The resulting unit is bits/second and represents an average of the number of bits transmitted per second over the measurement interval (T1-T0). When doing the rollup, average is the most common descriptor. In addition, min, max, standard deviation, variance, and 50th, 75th, and 90th percentiles would be useful.
If you're already gathering VOLUME, you'll notice that B1 - B0 used in the RATE calculation comes from the volume calculation. That's on purpose and is why it is said that RATE is derived from the VOLUME measurement. In fact, if the polling interval is fairly regular, the rate can be said to be approximately linearly proportional to the volume.

Layman's example

This is not really any different than measuring the speed of your car while on a trip. You take a reading of the odometer and the clock at the beginning of the trip and again at the end of the trip. The difference in miles, divided by the total time of the trip (in hours in this case) will give you an average speed in mph. You could increase the resolution of your measurements by taking a reading and performing the calculation every 5 minutes. This would give you a data point describing the average speed for every 5 minutes of your trip.

Percent Utilization

Percent UTILIZATION measures how much capacity is used and is reported in percentage of the total capacity available. This is calculated by dividing the current RATE by the total rate capable by the interface. Alternatively, it could be calculated by dividing the VOLUME by the total volume capability of the interface. The latter requires a bit more derivation, so most use the former.
This metric requires knowledge about the interface's capabilities. This is usually obtained by polling the bandwidth statement (ifspeed) of the interface (1.3.6.1.2.1.2.2.1.5), which is in bits per second (bps). Once obtained, the percent UTILIZATION can be calculated like this:
U = 8 (B1 - B0) / T1 - T0 / ifSpeed
You may notice that a part of this formula looks the same as the RATE calculation. It is. Simplifying the formulas:
U = 8 (B1 - B0) / T1 - T0 / ifSpeed * 100
R = 8 (B1 - B0) / T1 - T0
U = R / ifSpeed * 100
Since the UTILIZATION formula involves dividing a rate (in bps) by a speed (in bps), the result is unitless. This means that the unit can be thought of as % (percentage). Rollups of UTILIZATION should be treated the same way as rollups for RATE. You should also notice that the percent utilization should be linearly proportional to the rate, given a constant bandwidth capacity of the interface.

Layman's example

This calculation is similar to calculating how close a driver is to the speed limit. By dividing the current speed (derived using the formulas above for speed) by the total allowable speed, you can calculate what percentage of the limit the car is currently traveling. When driving, moving at 100% of the speed limit is actually good. You are actually making the most of the available resource. The only time 100% utilization is a problem is when you need to do something else with that speed (i.e. other cars on the road not travelling at the same speed). The same actually holds true for networking. Utilization of 100% is not bad until you need some percentage of those resources for another task.

Wednesday, February 3, 2016

SharePoint Kanban

I was recently asked to help reproduce a Kanban board I had built in SharePoint for one of my projects. Having built it only once previously, I learned a few things and the resulting reproduction had a few improvements over the original.
First, I start with a custom list. I never use the templates in SharePoint because they are constantly trying to make things more complicated in an effort to make things more simple.

The Easy Stuff


  • Rename the [Title] field to 'Task' or something more representative of the items you'll have on your Kanban board.
  • Create a [Person or Group] field to contain the person responsible for the item. It's important that this not be a simple text field. I'll explain why when we build the views.
  • Create any other meta data columns that you want for your items (notes, description, priority, estimated effort, etc.)
  • Create a [Date and Time] field to contain the due date. Call it [Due Date] if you want to use the formulas here without modification.
  • Create a [Date and Time] field for every phase. For example, if my phases were:
    Deploy Launchpad, Igniter Primed, Mount Rocket, Connect Detonator, Remove Safety Cap, Detonate
    Then I would create the following fields as [Date and Time] fields:
    [Launchpad], [Igniter], [Mount], [Connect], [Safety], [Detonate].
    Essentially, each field will contain the date and time that that phase was completed. If the date is blank, that stage hasn't been completed. If there is a value in the field, then that phase has been completed (and was completed at that date/time). 
  • Go to List Settings >> Advanced Settings and disable attachments for the list (this was a dumb feature for what we're using the list for). 

The Next Phase Calculation

Create a [Calcuated] column called "Next Phase". This column should evaluate the phase fields to determine which phase is the current phase being worked on. Continuing with my example, if I had already deployed the launchpad, primed the igniter, and mounted the rocket, the "Next Phase" would be to connect the detonator.

This is done by evaluating the last phase to see if it is complete. If the last phase has a date/time value, it is completed and the next stage is "Done". If the last phase does not have a value, we need to figure out if the "next phase" is this phase or the previous one. Here's the formula (using my example field names, you should be using yours):

=IF(NOT(ISBLANK([Detonate])), "Done",
IF(NOT(ISBLANK([Safety])), "Detonate",
IF(NOT(ISBLANK([Connect])), "Safety",
IF(NOT(ISBLANK([Mount])), "Connect",
IF(NOT(ISBLANK([Igniter])), "Mount",
IF(NOT(ISBLANK([Launchpad])),"Igniter",
"Launchpad"))))))

Technically, the end of each line above can say anything you want. Since it would be nice to be able to sort the [Next Phase] column to show tasks in order, it would be nice if they were in some sort of sortable order. Unfortunately, alphabetical order won't work. We can easily fix this by prefixing each resulting string with a number to indicate the order, like this:

=IF(NOT(ISBLANK([Detonate])), "6 Done",

IF(NOT(ISBLANK([Safety])), "5 Detonate",
IF(NOT(ISBLANK([Connect])), "4 Safety",
IF(NOT(ISBLANK([Mount])), "3 Connect",
IF(NOT(ISBLANK([Igniter])), "2 Mount",
IF(NOT(ISBLANK([Launchpad])),"1 Igniter",
"0 Launchpad"))))))

The IF(NOT(ISBLANK( logically means, if the previous phase has a value but the current phase didn't, the current phase is the next phase.

The Status Column

This column is designed to figure out the status of each item as compared to the due date. Four possible states exist:
  1. If there's no [Due Date] (e.g. [Due Date] is blank), the the status is "No Due Date".
  2. If the item has not been completed (e.g. the last phase field is blank) and the item is not yet due (e.g. the [Due Date] is greater than today), its status would be "On Time for Completion".
  3. If the item has not been completed (e.g. the last phase field is blank) and the item is due (e.g. the [Due Date] is less than today), its status would be "Overdue".
  4. If the item has been completed (e.g. the last phase field is not blank) and the item was completed before the due date (e.g. the last phase field is less than the [Due Date]), its status would be "Completed On Time".
  5. If the item has been completed (e.g. the last phase field is not blank) and the item was completed after the due date (e.g. the last phase field is greater than the [Due Date]), its status would be "Completed Late".
Here's the formula:

=IF(ISBLANK([Due Date]),
      "No Due Date",
      IF(ISBLANK([Detonate]),
             IF([Due Date]>=Now(),"On Time for Completion","Overdue"),
             IF([Detonate]<=[Due Date],"Completed on Time","Completed Late")
      )
)

The Views

I recommend 4 types of views:
  • Datasheet View - This should be a datasheet view of all items. Usually sorted by [Due Date].
  • All Items - This is a standard version view of the Datasheet View. You can alternatively add groupings based on Next Stage.
  • My Items - This is either a standard or datasheet view (your preference), also sorted by [Due Date], however, filtered by the person the item is assigned to (remember above where I said we'd use this later?). SharePoint has a session variable called [Me], which contains the username of the current user. By putting a filter where the [Assignee] field is equal to [Me], we create a view that only shows the items assigned to the current logged in user. This means that anyone on the team can log in and look at this view and see only their items. This won't work if you made the assignee field a simple text string; it needs to be a [Person or Group] field.
  • Phase specific views - these views aren't required but are often requested. You basically build a copy of the Datasheet View or the All Items view but filter it where [Next Phase] field equals a particular phase. You would repeat this for every phase. I find this tedious when those who want this type of breakdown could just look at the Datasheet or All Items views and just filter for a particular value in the [Next Phase] field. However, some people can't handle that level of sophistication, so statically defining views is the only way to please them.