Monitoring and Other Mayhem

April 18, 2016

Data Graphs

The graphing support is, within reason, flexible. Most of the graphs that you’ll use at first are already provided by the templates you apply, and those are reasonable. Adding new ones is also fairly trivial, though you may have to create a calculated item here or there as a graph source to get what you want. There are also Screens, which appear to aggregate graphs for a given host or other object. I’ve not played with these yet, but I’m thinking they’re probably a good answer to a number of different issues. I don’t know whether it’s possible to create additional global dashboards, however; the Screens appear at first glance to be tied to individual hosts.

The graphing engine supports pie charts, stacked graphs, line graphs, and so on. You can perform simple min/max/avg calculations. You can set the colors. Graphing is pretty much instant so long as the data is available. There’s not really much more I can ask for out of this — yet.

Future Directions

Overall, I’m happy with it so far. We’ll see over the coming months whether or not it truly handles the job in the long term; my needs tend to change with my mood, and I’m sure I’ll find something it doesn’t do. There are also a lot of things to explore in there.

One of my big questions is whether or not there’s some kind of API that I can use to get at the trend data. Zabbix is MySQL-based here, so I know I’ll be able to get it somehow, but a REST API or something similar would be very useful. It would allow for much more complex and meaningful use of the data the thing is collecting.

And maybe the creation of interesting dashboards outside of Zabbix.
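For what it’s worth, Zabbix does ship a JSON-RPC API served from api_jsonrpc.php, so the trend question may be answerable without touching MySQL at all. Here is a minimal sketch of what a trend request could look like, assuming the installed version exposes the trend.get method (older releases may only offer history.get) and that the token from user.login goes in the request body as "auth". The function names, URL, token, and item ID are placeholders of mine, not anything from this setup.

```python
import json
import urllib.request


def build_trend_request(auth_token, item_ids, time_from, time_till, request_id=1):
    """Build a JSON-RPC 2.0 payload for the Zabbix API's trend.get method.

    Assumption: the token is passed in the body as "auth", which is how
    older Zabbix releases expect it.
    """
    return {
        "jsonrpc": "2.0",
        "method": "trend.get",
        "params": {
            # Hourly trend rows carry min/avg/max plus a sample count.
            "output": ["itemid", "clock", "num",
                       "value_min", "value_avg", "value_max"],
            "itemids": [str(i) for i in item_ids],
            "time_from": int(time_from),  # Unix timestamps
            "time_till": int(time_till),
        },
        "auth": auth_token,
        "id": request_id,
    }


def post_json_rpc(url, payload, timeout=10):
    """POST the payload to the API endpoint and return the decoded reply."""
    body = json.dumps(payload).encode("utf-8")
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json-rpc"}
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Placeholder values; nothing is sent unless post_json_rpc is called.
    payload = build_trend_request("my-token", [23296], 1460937600, 1461024000)
    print(json.dumps(payload, indent=2))
    # reply = post_json_rpc("http://zabbix.example/api_jsonrpc.php", payload)
```

Keeping the payload builder separate from the HTTP call means the request can be inspected before anything goes over the wire, which is handy while still figuring out which item IDs matter.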

There are also mapping (?) and Inventory features that I haven’t even touched yet. Those might be interesting. And the agent offers countless possibilities for the future. The question will be whether or not it’s easy to integrate at an application level, as that could be very cool.

I like the idea of my Apache server spewing response code counts at the monitoring system without having to parse the logs to get them… Those kinds of metrics are exceedingly useful for troubleshooting in production environments. Not that this one will ever be big enough to matter, but I can dream, right?

My next trick will be to set up the requisite items so that I can trend the overall cache hit rate for Varnish. I haven’t done it yet, but I don’t expect it to be hard. It’s going to be fun.
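The arithmetic behind that is simple enough to sketch ahead of time. Assuming the two inputs are Varnish’s cumulative hit and miss counters (varnishstat reports them as cache_hit and cache_miss, with a MAIN. prefix on newer versions), the hit rate over an interval is hits divided by hits plus misses, computed on the deltas between two samples. In Zabbix this would presumably end up as a calculated item; the function and dictionary keys below are just my own illustration of the math.

```python
def cache_hit_rate(prev, curr):
    """Return the cache hit rate (percent) between two counter samples.

    prev and curr are dicts holding the cumulative 'cache_hit' and
    'cache_miss' counters. Returns None when the interval saw no lookups
    or a counter went backwards (for example after a Varnish restart).
    """
    hits = curr["cache_hit"] - prev["cache_hit"]
    misses = curr["cache_miss"] - prev["cache_miss"]
    if hits < 0 or misses < 0:
        return None  # counters were reset between the samples
    total = hits + misses
    if total == 0:
        return None  # no cache lookups, so the rate is undefined
    return 100.0 * hits / total


if __name__ == "__main__":
    earlier = {"cache_hit": 100, "cache_miss": 50}
    later = {"cache_hit": 190, "cache_miss": 60}
    print(cache_hit_rate(earlier, later))  # 90 hits, 10 misses -> 90.0
```

Working from deltas rather than the raw totals gives the rate for the sampling interval; dividing the raw totals instead would give the lifetime average since Varnish started, which moves too slowly to be useful on a graph.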

If it turns out to be interesting, maybe I’ll write it up.

Oh, and I need to compile the agent for the Raspberry Pi… That’s definitely on my list.