What's in my podcast roll

I’ve always been interested in learning and developing my skills, and over the past couple of years i’ve been increasingly targeting my professional development.  On a long commute, a great way to explore different areas of interest is listening to podcasts.  Here are a few of the things i’m listening to at the moment:

  • SANS Internet Storm Centre daily podcast [feed] – If there is only one podcast that should be on every IT professional’s podcast roll, this is it.  Johannes Ullrich’s daily recap of security & Internet news is my “must listen” podcast.  Pros: Fast-paced, no fluff, edits out blank spots; very balanced coverage – i value Johannes’ opinion on almost every security topic; his enthusiasm is infectious – you can even hear his voice pitch getting higher and more excited as the podcast goes on.  Cons: none unless you like being uninformed or don’t like German accents.  Released every weekday (Johannes never seems to get sick; my son thinks he might be a cyborg 🙂 ) and usually around 5 minutes long.  I would pay for this podcast if i had to.
  • SANS ISC Monthly Threat Update [feed] – This is a long-format security update from Johannes Ullrich and sometimes a sponsor representative.  Pros: more in-depth coverage of certain topics, e.g. IPv6, DNSSEC.  Cons: covers many of the same news topics that have already been covered in the daily podcast; based on a webcast, so sometimes you have to go back and look at the slides to make sense of it.  Often a fair amount of time is spent on the monthly Microsoft Patch Tuesday update, which may be good or bad depending on your interests.
  • Risky Business [feeds] – This is an Australian podcast focused on security.  Pros: in-depth interviews with world-class guests who offer varied and fascinating insights; considerable time devoted to opinion and analysis of news; Patrick Gray is quite accessible on Twitter, and actually replies to messages meaningfully; quirky and sometimes outstanding full music tracks as outro.  Cons: Patrick and Adam/Mark laugh at their own jokes a bit too much and are a little obsessed with lulz; the electronic fart track for theme music makes me want to stab my eardrums out (tip: skip the first 30-40 seconds).  But the biggest con is that it’s not suitable for work – the podcast contains frequent coarse language and not all of it is beeped out.  I would love to share some of their material with my wife and kids, but references to anal rape in prisons go beyond edginess to just plain poor judgement.  (Have you ever actually been into a prison where this was a risk, Patrick?  It’s really not a laughing matter.)  This isn’t enough to make me stop listening, but it’s a definite audience-limiter.  Make sure you subscribe to the RB2 feed as well – it covers AusCERT conferences and other similar events.
  • Cisco TAC Security Podcast [feed] – Four to six Cisco security TAC engineers talk news, tips, and products.  Pros: efficiently edited; engaging hosts, not boring; useful practical information (i’ve listened to their IPsec troubleshooting podcast more than once whilst working on a VPN problem).  Cons: feels a little too scripted sometimes; very Cisco-centric (understandably), but this means a lot of focus on ASA, which to my mind is not a compelling firewall platform.
  • Packet Pushers [feeds] – This podcast revived my interest in networking, and is a major contributor to my renewed interest in certification and professional development.  If you’re new to the podcast and like it, it’s well worth subscribing to the all-audio feed, starting from the beginning and listening to all the episodes.  Until very recently (with the Cisco Nexus series and the various SDN-related episodes) there was very little repetition of topics.  Pros: experienced presenters; world-class guests; never boring.  Cons: Greg Ferro dominates a bit too much, often talking over the top of his co-hosts; a little Cisco-centric; sometimes too many presenters, which means not everyone gets a chance to share their expertise.  Greg, Ethan, and most of the guys with a Cisco background are somewhat uninformed about Linux/Free Software/Open Source licensing, values, and culture.  They can also be uncritically fanboyish when it comes to Fruit Company laptops and tablets.
  • The Linux Action Show [feed] – A desktop/mobile/gaming-centric podcast in both video and audio formats.  Pros: lots of different news from the Linux world; CC-BY-SA licensed (o/); assumes Linux on the desktop is normal (!); metal-styled theme music is easy on the ears.  Cons: Bryan Lunduke seems to think he’s a cross between James Hetfield and Patrick Warburton, and only occasionally pulls it off; could be edited more compactly without losing its feel; a little too much material released.  I find it a little hard to keep up; a weekly podcast should be targeting around 30 minutes, in my opinion.
  • Tuxradar [feed] – a UK-based Linux news & opinion show from the creators of Linux Format magazine.  Pros: good format which incorporates listener feedback; strongly committed to Free Software from a socio-political perspective as well as a technical perspective; as expected, doesn’t assume you’re a freak if you run Linux on the desktop.  Cons: audio often muffled, slurred, and crowded with random chit-chat & laughter – this is as much a result of poor diction and too many presenters as it is of low budget or technical issues.  Released fortnightly, which is about right for a long-format show.
  • DevOps Cafe [feed] – DevOps news, interviews, and opinion.  Pros: good interviews with people from various perspectives, creative commons licensed.  Cons: a little too concerned with what’s cool & trendy in the industry (which one might argue is inevitable with DevOps).
  • The Cloudcast [feed] – Cloud industry news and opinion.  Pros: helps to de-cloud (bad pun fully intended) some of the vagueness around many *aaS (… as a Service) terms.  Cons: assumes cloud is good, doesn’t get into much technical or philosophical discussion about why we should use cloud.
  • Geeks and God [feed] – God meets nerds.  Pros: provides a unique perspective on technology that is more aware of “soft” issues (and not just spiritual ones).  Cons: content is a bit Drupal-centric (not a con for me, but if you don’t use it you might want to skip the drupal spotlight section); podcast has gone a bit quiet lately (the old team of Rob Feature and Matt Farina was prolific and particularly well-practiced at keeping discussion flowing and making a podcast interesting).
  • Andy Stanley – In my opinion, the best Bible communicator of our generation.  Even if you don’t consider yourself a Christian or “religious”, he’s worth a listen to moderate the fringe elements of Christianity who seem to dominate popular media.  His content comes in several different podcasts:
    • Your Move – a “best of” series; tends to be shorter than the weekly podcast.
    • North Point Weekly Podcast – I’ve often found the topics in this one more interesting than the “best of”.  Not all of the messages are from Andy Stanley, and i find the ones from other speakers rather bland and traditional by comparison.
    • Leadership Podcast – good insights and some interesting interviews on various leadership topics, not just church leadership.  Released monthly, usually around 30 minutes in length.

Some shows i’ve previously had in my podcast roll:

  • No Strings Attached – Wireless networking podcast.  I found this too slow and too scripted and got bored quickly.  I probably could have given them more of a chance to impress me, but i didn’t even make it through one episode.  No doubt as it went on it got better, but i’m less involved with wireless networking than i was previously, so it hasn’t made it back onto my radar.
  • FLOSS Weekly – Randal Schwartz’s Free Software podcast; felt too slow and scripted to me.
  • Linux Weekly News – verbatim reading of the LWN.net news articles; a bit dry.

What’s missing?  One thing that i haven’t found is a podcast that deals with system engineering in the enterprise from a Linux/networking perspective.

Source: libertysys.com.au

Initial thoughts on the HP A5500-EI switch


I first came across HP’s A5500 switches when i started looking at configuring VRRP and distributed trunking to provide routing redundancy for the core of a client’s campus network using two ProCurve 5400 switches. I found that i could get a new pair of A5500s for around the same price as the software license upgrade on the 5400s (multicast routing, distributed trunking, and OSPF cost extra), and they came with a much broader range of features. HP’s IRF is far preferable to VRRP and distributed trunking from the perspective of administrative complexity, since the IRF stack appears as a single switch to hosts and other switches.

The 5400 is a chassis-based switch, and replacing it with the stackable 5500 would not normally be ideal for a core switch setup, but we had very low port density and bandwidth requirements in our core, so two 24-port 5500s were sufficient. The 5400s were moved to the distribution layer along with a pair of 2824s used primarily as copper-to-fibre converters. Each distribution switch has four 1 Gbps links to the core (two to each switch) as an LACP trunk, as does each of our four VMware ESX servers.

I’ve been using ProCurve networking for about 9 years, working with it directly as a network admin for the past 5. Most of my networking experience has been with E series switches, so i will write mostly from the perspective of a ProCurve user. I’ve learned about these switches mostly from reading the manuals and testing features as we migrated our core from standalone 5400s to an IRF stack of 5500s, but Wayne Jewell at Layer127 also provided some helpful assistance with ACLs and OSPF.

This review will probably degenerate into note form before too long, since i’ve been sitting on a draft of this post for far too long already, and i want to get it out the door. 🙂

Packaging and hardware

The information in this review is based on A5500-EI switches running Comware 5.2, revision 2208. The switches were shipped on firmware 2202; finding, downloading, and upgrading the firmware to 2208 (via Xmodem over the console cable) was an easy and straightforward process. (Cisco, take note!) After i set up the IRF stack, the second switch automatically downloaded the firmware from the first switch, installed it, and rebooted as an IRF stack member.

The switches are branded H3C rather than HP. I assume this will change as they do new manufacturing runs. Hardware quality of the units themselves seems excellent – they continue HP’s excellent track record with the ProCurve platform, and have inherited its lifetime warranty.

Perplexingly, port numbering on the switches starts from the bottom left rather than the top left, which is rather disconcerting to those of us expecting a traditional layout. I assume this is due to the switch’s Chinese heritage. I’ve intentionally designed the port configuration on our stack to minimise the risk of errors related to this, but i’m pretty sure it will cause operational problems sooner or later. Let’s hope that HP produces region-specific versions of these switches which allow customers to choose top-to-bottom port numbering.

Main features

Features i’ve tried (in alphabetical order):

  • ACLs
  • DHCP relay
  • IRF
  • LACP (bridge aggregation)
  • NTP
  • OSPF
  • PIM (dense mode)
  • RIP
  • UDP helpers

All of the above work as expected and without hiccup. When i first connected the IRF stack as the network core, it was a disappointingly quiet maintenance window; there were no major problems to deal with.

One perplexing issue which hasn’t been resolved yet on our 5500s is that 20% of ping packets to any IP interface on the switches are lost. There have been no issues with performance of the switching and routing, so it seems to be just a management plane issue. (See further comments on CLI responsiveness below.)

User interface

All of my access to the 5500 has been at the CLI via console cable or ssh session. I haven’t bothered configuring the web interface, nor looking at HP’s IMC management software (which i hear is excellent). Most of the rest of this review will relate to CLI features.

Command syntax

Much of the Comware command syntax seems to use alternative vocabulary from the command sets on most other switches and routers as a deliberate technique to make it seem different from Cisco. However, this difference seems to be only skin-deep, and most of the commands follow an almost identical structure to Cisco. (This is particularly evident in interface context.) I don’t have enough experience with 3Com equipment to comment on whether this was the case with earlier Comware releases or is a feature of only the post-Huawei switches.

‘display this’ is an extremely useful command when you’re configuring anything which has its own context (e.g. OSPF). It gives you immediate feedback on what you’ve just configured, without introducing noise or requiring pipes.

‘hotkey’ is a way to create shortcuts for common commands. I’ve defined:

  • Ctrl-O as ‘display ip rout | exclude InLoop0’ (mnemonic: ‘rOutes’)
  • Ctrl-T as ‘display this’ (mnemonic: ‘This’)
  • Ctrl-U as ‘return’ (mnemonic: ‘Up one level’)

‘command-alias’ is great for making things a little more familiar. Examples of common commands which might be aliased are:

  • show – display
  • no – undo
  • exit – quit
  • end – return

The output of many diagnostic commands (e.g. ‘display vlan’) is unnecessarily verbose by default. I much prefer the ProVision approach, which gives brief tabular data by default and requires additional parameters to extract details.

Interacting with the CLI

Having access to basic pipe facilities (‘| begin/include/exclude’) seems like a minor issue, but after using it for a few days, going back to switches without this feature becomes increasingly frustrating. (I understand this has also been added on recent 5400 firmware versions.) I long for a real ‘less’ command which can page forwards and backwards in command output. (I hear Cisco has a real ‘grep’ command on their latest IOS; hopefully this is a sign of better things to come.)

The 5500 doesn’t seem to detect terminal length like the 5400 does. The 5400 detects changes in terminal size even during the same session, whereas the 5500 seems fixed at 25 lines.

Ctrl-C doesn’t cancel command input, only running commands (well, some of them – it doesn’t work with ‘display ntp-service trace’). One has to substitute Ctrl-E followed by Ctrl-X or Ctrl-A followed by Ctrl-Y instead. Ctrl-A is a pain for me to use because i’m usually connected either through minicom or GNU screen. I long for a real bash shell (which supports vi editing mode) on a switch.

Note to self: investigate Vyatta and Arista on this point.

The 5500 is somewhat more sluggish to respond to commands at CLI than the 5400, both when connected via the console cable, and when using ssh over gigabit Ethernet. This may be due to a relatively low-powered CPU being used for management, but that would explain poor ssh keystroke latency without explaining the same experience over a 115,200 baud console cable. Most of HP’s low-end ProCurve switches (e.g. 2824, 2610) are more responsive than the 5500.

The 5500 CLI also requires that the backspace character be set to Ctrl-H instead of the usual delete, and i could find no way to change it in the CLI (i.e. there’s no equivalent to stty on Linux/Unix). This is more troublesome for those of us who routinely switch between Comware and ProVision switches. A facility in the CLIs to allow a consistent backspace setting to be used would be highly desirable. (Alternatively, they could just allow both Ctrl-H and DEL on both platforms and then everyone could forget about it. I suspect this is what ProVision already does.)

VLAN setup

Configuration of ports and VLANs is the most Cisco-styled part of the CLI and is much harder with Comware than ProVision. I personally think that dealing with port modes is the most cumbersome part of configuration on all of the non-ProCurve switches i’ve worked with, and i very much hope HP retrofits the VLAN parts of the ProVision CLI to these switches.

Changing a hybrid port to a trunk port (or vice versa) requires setting it to an access port first. I can’t conceive of any technical reason for this being exposed in the CLI; if it is a technical requirement, it should be done automatically by the underlying OS.

On that note, is anyone out there using hybrid ports in a production environment? I can think of only one reason for using them (protocol-based VLANs), and i can’t see why anyone would want to use that for anything more than a fun experiment.

ProVision has ‘show run vlan’ on recent firmwares, which shows only the VLAN definition parts of the config. There seems to be no equivalent on Comware, even though there are similar parameters for other subsystems (e.g. ‘display current-configuration configuration acl-adv’).

VLAN descriptions can only be 32 characters long – this is not enough to give a useful description. Note that this is not the (short) display name, which is set separately.

ACLs

ACLs use negated (wildcard) netmasks rather than conventional ones – this seems rather counter-intuitive to those of us who have worked mostly with host-based firewalls, but once you are familiar with it, it isn’t too hard. Anyone who has worked a lot with OSPF on Cisco devices should find it no problem.

Non-contiguous netmasks may be used with ACLs, which is a useful feature if you use a standard numbering system for gateway addresses, server IP ranges, and the like.
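
To make the negated-mask idea concrete, here is a small Python sketch (this is not Comware syntax; the helper name wildcard_match and all of the addresses are made up for illustration). It uses the standard ipaddress module to show the wildcard form of an ordinary /24, then a non-contiguous wildcard that matches the .1 address in every /24 under 10.20.0.0/16, assuming a numbering standard where gateways always get .1.

```python
import ipaddress


def wildcard_match(address: str, base: str, wildcard: str) -> bool:
    """Return True if address matches base under a negated (wildcard) mask.

    A 1 bit in the wildcard means "don't care"; a 0 bit must match the base.
    """
    addr = int(ipaddress.IPv4Address(address))
    ref = int(ipaddress.IPv4Address(base))
    care = ~int(ipaddress.IPv4Address(wildcard)) & 0xFFFFFFFF
    return (addr ^ ref) & care == 0


# The wildcard for a conventional /24 netmask is simply its bitwise inverse.
net = ipaddress.ip_network("192.168.10.0/24")
print(net.netmask, net.hostmask)

# Non-contiguous wildcard: ignore the third octet only, so one rule covers
# the (hypothetical) .1 gateway address in every /24 under 10.20.0.0/16.
print(wildcard_match("10.20.7.1", "10.20.0.1", "0.0.255.0"))
print(wildcard_match("10.20.7.2", "10.20.0.1", "0.0.255.0"))
```

The first print shows the netmask and its negated form side by side; the last two show the gateway address matching and its neighbour not matching.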

The ‘hardware-count’ feature is disabled by default on all ACL entries. Since the switch automatically reverts to software counting for any ACLs which do not support hardware counting (e.g. SNMP access to the management plane), i cannot conceive of a reason why it should not be enabled in all cases.

Documentation

The H3C documentation will be unfamiliar to people who are expecting ProCurve’s rigidly-defined format, but it is mostly of a high standard. It wastes a lot of space repeating how to get to the required view for most commands and shows some signs of backtranslation into English (also present in the CLI), but i have encountered only one error so far (the documentation of ‘display interface brief’ as ‘display brief interface’), which was corrected in a subsequent version of the document. HP networking’s documentation site seems to lag behind the current firmware version; i found the documents updated for 2208 on H3C.com months before they appeared on HP.com.

HP networking have produced a resource which has made my conversion from ProVision to Comware much more straightforward: the HP Networking and Cisco CLI Reference Guide, which can be found at HP Networking’s training resources page. This document compares the Comware, ProVision, and IOS versions of many common commands, focusing on practical day-to-day differences and providing lots of examples. It is an indispensable reference for network engineers switching between these platforms, and has even been a useful resource in helping me become more familiar with Cisco’s CLI.

The Future

After working with Comware for a few months, i have come to appreciate its feature-richness compared with ProVision. I have really enjoyed learning the Comware platform, and it is understandable that HP have been strongly focusing on it since the 3Com acquisition. I’ve only really begun to scratch the surface of their security features, which would be really useful at the access layer. I expect that i’ll be recommending these switches to clients a lot more than ProVision-based ones in the future.

IRF supports a ring topology with up to 9 switches on the 5500s, so before long we’ll probably aim to upgrade our campus backbone fibre ring to 10 Gbps with 5500 switches at each point on the ring, providing complete switching and routing redundancy.

NAT is evil, but not bad

2011-09-20: Edited to add section about IPv6 options; minor cleanup; references added.

This is kind of a follow-on from my post about the subnet addressing design differences between IPv4 and IPv6. Recently, Tom Hollingsworth started a little Twitter conversation about NAT where i mentioned that i liked NAT for the purpose of decoupling my internal and external address spaces; 140-character limits got in my way there, and i realised i needed to clarify my logic more, so this is my attempt to do that.  I’m very interested in feedback – have i missed something important?

A bit of context

I’ve never worked for a service provider and i don’t work in large data centres at the moment.  So i don’t have in mind huge, publicly-addressed networks.  I have in mind “corporate” or “enterprise” networks, which might include campus networks on one site with a few thousand ports, or organisations spread across 40 or 50 sites. In such organisations, the “data centre” might comprise something like 4 or 5 racks, usually on one or two sites, with maybe 100-200 gigabit ports or so.

Exposing only what is necessary 

If i have a network of, say, 2000 devices, including desktops, servers, printers, tablets, mobiles, etc. there are a variety of different access requirements.  The servers which largely serve clients on the LAN or internal WAN have limited web access requirements.  Some clients might talk to local servers for most of their applications.  For other clients (especially mobile devices), accessing the web (and perhaps email) is the only thing they need to do.  Another whole range of devices (printers, security cameras, etc.) have no need for inbound or outbound Internet traffic at all – if they need updates or configuration changes, that usually happens through a local management server.

For performance, bandwidth control, security, and auditing purposes, web browsing on most of these devices is forced through a local proxy server. Doing this eliminates most reasons for client devices to directly contact any system in the outside world. This significantly changes the security posture of the devices in question (cf. Greg Ferro’s comments in Packet Pushers #47 about inline load balancers allowing the web servers they balance to have no default route).  Of course, that’s not perfect security, and we still have to be careful that we’re doing the right checks in the proxy server, but it cuts out a whole range of possible attack vectors, with the result that only a tiny portion of a corporate network actually needs to be addressable globally.  This is not in itself justification for NAT, but rather justification for exposure of only a small external address range.

Internal addressing plans

I haven’t yet seen a corporate IP addressing plan that didn’t use the organisational unit, or the geographical location, or both.  In many cases, they are the only real world entity represented by the 2nd or 3rd IPv4 octet, even if there are not 256 organisational units or locations.  This is a little inefficient, and I’m sure that if everyone thought in binary, we could pack things in there and save 3 or 4 bits in many cases, but for the most part it’s a good practice because it saves support costs by allowing everyone to use 8-bit boundaries.  (I suspect when we go to IPv6 people will work on 16-bit boundaries, and burn even more bits on internal subnet addressing.)
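
As a rough illustration of that kind of plan, the Python sketch below maps a site number and an organisational unit number straight onto the 2nd and 3rd octets of a 10.0.0.0/8 address. The 10.<site>.<unit>.0/24 layout, the function names, and the figure of 12 sites are all assumptions made up for the example, not a description of any particular network.

```python
import ipaddress


def subnet_for(site: int, unit: int) -> ipaddress.IPv4Network:
    """Map a (site, unit) pair onto 10.<site>.<unit>.0/24 (hypothetical plan)."""
    if not (0 <= site <= 255 and 0 <= unit <= 255):
        raise ValueError("site and unit must each fit in one octet")
    return ipaddress.ip_network(f"10.{site}.{unit}.0/24")


def bits_needed(count: int) -> int:
    """Smallest number of bits that can number `count` distinct items."""
    return max(1, (count - 1).bit_length())


# Site 12, organisational unit 3.
print(subnet_for(12, 3))

# With only 12 sites, this many bits of the site octet are never used.
print(8 - bits_needed(12))
```

The second figure is the inefficiency mentioned above: the bits are wasted, but anyone can read the site and unit off the address without doing binary arithmetic.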

The relevance of this to the NAT question is that most corporate networks would prefer that the internal structure of the network is not disclosed when client PCs contact outside addresses during day-to-day tasks, and NAT achieves this rather nicely. Of course, any determined attacker can learn lots about clients by passively watching their traffic, but funneling client traffic through a NAT gateway is one component of the solution.

NAT not a security mechanism?

It’s almost a truism in the networking industry that “NAT is not a security mechanism”.  This is at least somewhat true: a great deal can still be discovered about a host behind a NAT gateway using passive packet sniffing, and if a vulnerable service is exposed through a port forward, then all bets are off.  But in one sense, saying that NAT is not a security mechanism is a misrepresentation, because NAT provides a significant level of protection against active attacks.

For example, if a Windows PC’s file sharing service is open on the internal network but it’s behind a NAT gateway, it cannot be compromised by external hosts through a buffer overrun vulnerability in its SMB protocol handler.  Similarly, if a server has an ssh daemon which allows password-based access, it cannot be compromised by the (very common) ssh password brute-forcing worms that infest the Internet if it’s behind a NAT gateway which does not port-forward to that ssh daemon.  So whilst NAT is not a tool designed to provide security, the address space conservation that it’s designed for also provides some security against common types of attack as a useful by-product.

Most of the discussion about hating on NAT in Packet Pushers episode #61 (starting at about the 40 minute mark) was set in the context of a web hosting or large data centre environment (to which the issue of public vs. private address space does not apply), and assumed that those who deploy NAT do so along with thoughtless port forwarding and without suitable DMZ design. [1]  But NAT and poor network security design need not go hand-in-hand.

NAT fails closed, not open

One aspect of NAT makes it desirable from a security perspective, and this is why the majority of SOHO routers in the world are deployed with NAT enabled by default: NAT is closed to outside access by default.  That is, unless you take active steps to open up outside access to ports and/or hosts behind a NAT gateway, their normal TCP and UDP ports cannot be accessed.  I don’t dispute the possibility of attacks which could exploit weaknesses in the packet forwarding algorithms used by NAT gateways in order to attack the hosts behind them, nor suggest that spear phishing or drive-by downloads are not a significant risk to those hosts, nor suggest that the security of the gateway itself is not essential.  But these risks apply equally to hosts behind routed firewalls.
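
The “closed by default” behaviour can be shown with a deliberately tiny model. The Python sketch below is a toy, not how any real NAT implementation works: the class name, the starting port, and the addresses are invented, and it ignores protocols, timeouts, and the remote endpoint that a real gateway would also check. It only demonstrates that an inbound packet gets nowhere unless an outbound flow or an explicit port forward has created a mapping first.

```python
from typing import Dict, Optional, Tuple

Endpoint = Tuple[str, int]  # (IP address, port)


class ToyNat:
    """Toy port-translating NAT: inbound traffic needs an existing mapping."""

    def __init__(self, external_ip: str) -> None:
        self.external_ip = external_ip
        self.next_port = 40000  # arbitrary starting port for this sketch
        self.by_external: Dict[int, Endpoint] = {}

    def outbound(self, src: Endpoint) -> Endpoint:
        """An inside host opens a connection; allocate an external port."""
        port = self.next_port
        self.next_port += 1
        self.by_external[port] = src
        return (self.external_ip, port)

    def port_forward(self, external_port: int, dst: Endpoint) -> None:
        """An administrator deliberately exposes an inside service."""
        self.by_external[external_port] = dst

    def inbound(self, external_port: int) -> Optional[Endpoint]:
        """Return the inside endpoint, or None if the packet is dropped."""
        return self.by_external.get(external_port)


nat = ToyNat("203.0.113.1")
print(nat.inbound(445))                       # no mapping, so dropped
mapped = nat.outbound(("192.168.1.10", 51515))
print(nat.inbound(mapped[1]))                 # reply to an outbound flow
nat.port_forward(22, ("192.168.1.20", 22))
print(nat.inbound(22))                        # explicitly forwarded
```

Only the last case exposes an inside service to unsolicited traffic, and it takes a deliberate configuration step to get there.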

Designing for things to fail is part of good network design, and in many (most?) corporate networks, it’s preferable to fail closed rather than open.  On a NAT gateway, if there is a failure in the routing or firewalling engine, only one host remains open to external attack: the gateway itself.  On the other hand, if a routed firewall’s ACLs fail to be applied for any reason – say, during a system restart after a software update – the default scenario for many operating systems is that their routing functions remain functional even if their firewall does not.  So in a failure scenario, NAT’s security posture is more desirable than that of a similarly-configured non-NATed network.

Similarly, if i make a mistake in specifying a netmask on an ACL in a routed network (as a colleague recently did on a client’s network), i might accidentally allow outside access to double the number of systems i intended to.  Using NAT means that i’m less likely to do this, because such ACLs usually only apply in an outbound direction.
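
The arithmetic behind that kind of slip is easy to check with Python’s standard ipaddress module. The two networks below are documentation addresses chosen purely for illustration; the point is only that a netmask one bit too short covers twice as many addresses as intended.

```python
import ipaddress

# Hypothetical example: the ACL was meant to cover a /25 but was typed as a /24.
intended = ipaddress.ip_network("198.51.100.0/25")
mistyped = ipaddress.ip_network("198.51.100.0/24")

# One bit less in the netmask doubles the number of addresses matched.
print(intended.num_addresses, mistyped.num_addresses)

# The intended range still sits inside the mistyped one, so everything that
# was supposed to work does work, which makes the mistake easy to miss.
print(intended.subnet_of(mistyped))
```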

NAT simplifies problems where scale overwhelms the administrator

This is the part where the networking high-flyers are going to start laughing at me.  But please, read and understand first.  There are factors in many organisations (usually at layer 8 or 9 of the OSI network model) that mean that we don’t always have access to the best people.  People with a deep understanding of how all the components of a network hang together are actually hard to come by in many places.

For those of us who are left, NAT is a helpful tool in cutting down the size of a network design or management problem from immense to manageable.  If we can provide Internet access to a large number of systems using a much smaller number of external addresses, we will have a much greater chance of understanding the configuration and producing a good result for our employers and/or clients.

But the naysayers are still right…

In many cases, NAT is only an obscurity mechanism which is fundamentally a waste of time in terms of security.  It adds complexity to the troubleshooting process, often for no additional value. But NAT can and in many cases should be part of a network administrator’s toolkit, when applied rightly.

Thinking IPv6

How this applies to IPv6 is where i start to get uneasy.  The internal-external decoupling that NAT provides seems not to be on the radar for IPv6.  The suggestions i’ve seen so far are either to use unique local addressing internally and do one-to-one translation between these and provider independent addresses at the border router (which seems to me to provide no benefit at all over straight routed firewalling), or to use only unique local addresses and not bother with providing external addresses for corporate end-user PCs at all [2] (which will cease being practical as soon as the sales manager decides he or she needs Skype).

[1] When listening to that episode, one could be forgiven for thinking that connection tracking of FTP had never been invented…

[2] At about the 9:00 mark in the video.

A strange rrdtool error; Linux conntrack documentation

Last week i made some fairly significant changes on a client’s production firewall/routing cluster during our maintenance window.  The next morning there were reports of file server drives not connecting correctly and inaccessible web sites.  Because all wireless-to-wired and Internet traffic goes through this cluster, the firewall changes were the obvious culprit.  Looking at the logs it turned out we had run out of space in the connection tracking table:

May 26 08:55:05 corella1 kernel: ip_conntrack: table full, dropping packet.
May 26 08:55:13 corella1 kernel: ip_conntrack: table full, dropping packet.
May 26 08:55:15 corella1 kernel: ip_conntrack: table full, dropping packet.

I checked the counters in /proc/sys/net/ipv4/netfilter/, upped the limit for net.ipv4.netfilter.ip_conntrack_max in /etc/sysctl.conf to 4 times its previous value, and loaded the new value into /proc.
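
For anyone wanting to do the same check from a script, here is a minimal Python sketch that reads the two counters and reports how full the table is. It is not the code from the scripts linked later in this post; the function names are invented, and it assumes the /proc path and file names mentioned above, which differ on newer kernels.

```python
from pathlib import Path

# Directory named in the post; treat this path as an assumption, since newer
# kernels expose similar counters under different names.
PROC_DIR = Path("/proc/sys/net/ipv4/netfilter")


def read_counter(name: str, proc_dir: Path = PROC_DIR) -> int:
    """Read one integer counter such as ip_conntrack_count."""
    return int((proc_dir / name).read_text().strip())


def utilisation(count: int, maximum: int) -> float:
    """Fraction of the connection tracking table currently in use."""
    if maximum <= 0:
        raise ValueError("maximum must be positive")
    return count / maximum


if __name__ == "__main__":
    try:
        count = read_counter("ip_conntrack_count")
        maximum = read_counter("ip_conntrack_max")
    except OSError as err:
        print(f"conntrack counters not available: {err}")
    else:
        print(f"{count}/{maximum} ({utilisation(count, maximum):.1%})")
```

Keeping the calculation separate from the file reading makes it easy to feed the same numbers to a grapher or an alerting threshold later.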

Then i started to hack up a few little scripts to monitor and graph ip_conntrack_count against ip_conntrack_max using rrdtool. I’ve used rrdtool a little before, so i thought it would be pretty straightforward.  I created my RRD file and started updating it every minute with the latest counters from netfilter.  However, as soon as i tried to graph it i got the error

ERROR: parameter ‘cnt’ does not represent a number in line AREA:cnt#00FF00:countn

A search of Google brought up a lot of hits which contained the same text but were not relevant – most of them were errors in not specifying the variable correctly.  However, i came across one very similar problem: https://lists.oetiker.ch/pipermail/rrd-users/2007-November/013277.html

Unfortunately, this post on the rrdtool users mailing list had no responses, so i was down to solving it myself.  It took me some time before i realised that both the original poster of that message and myself had made exactly the same elementary mistake: forgetting to include a filename for the graph output.  This rudimentary error is not picked up by rrdtool’s command line parser (at least not as at version 1.2.12 on SUSE Linux Enterprise Server), resulting in a very confusing error message.
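
One way to avoid repeating the mistake is to build the command in a wrapper that refuses to run without a plausible output filename. The Python sketch below only assembles the argument list for rrdtool graph (output file first, then options, then DEF and drawing elements); the data source names count and max, the file names, and the colours are assumptions for illustration rather than the values used in the real scripts.

```python
from typing import List

# Prefixes of rrdtool graph elements; an output filename should not look like one.
GRAPH_ELEMENTS = ("DEF:", "CDEF:", "VDEF:", "LINE", "AREA:", "GPRINT:", "COMMENT:")


def graph_command(output: str, rrd_file: str) -> List[str]:
    """Build an `rrdtool graph` argument list with the output file first."""
    if not output or output.startswith("-") or output.startswith(GRAPH_ELEMENTS):
        raise ValueError(f"{output!r} does not look like an output filename")
    return [
        "rrdtool", "graph", output,
        "--start", "-1d",
        f"DEF:cnt={rrd_file}:count:AVERAGE",   # assumed data source name
        f"DEF:max={rrd_file}:max:AVERAGE",     # assumed data source name
        "AREA:cnt#00FF00:count",
        "LINE1:max#FF0000:max",
    ]


print(" ".join(graph_command("conntrack-day.png", "conntrack.rrd")))
```

The resulting list can be passed to subprocess.run() as-is; passing a graph element where the filename belongs raises an error immediately instead of producing a puzzling message from rrdtool.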

So then i had a working rrd graph on my firewall, which seems to have settled down nicely.  You can find the current (very rough) state of the scripts at https://github.com/paulgear/puppet/tree/2b5363a3fbc1e73d5d88158e93ab5d879910173b/modules/netfilter/files.

At the moment i’m only graphing the connection tracking count vs. its maximum (see the graph below).  Note the interesting minor variation on the graph from the max value that isn’t actually changing.  This seems to be due to rrdtool’s consolidation of data points – the change to a solid line was effected by truncating the date to an exact multiple of the step interval that the rrd was set up with (in this case, 60 seconds).
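
The truncation itself is a one-liner. The Python sketch below shows it, along with formatting a time:value:value string of the kind rrdtool update accepts; the function names and sample numbers are invented, and the order of the two values assumes the RRD was created with the count data source before the max one.

```python
import time

STEP = 60  # seconds; the step interval the RRD was set up with


def align_to_step(timestamp: float, step: int = STEP) -> int:
    """Round a Unix timestamp down to an exact multiple of step."""
    ts = int(timestamp)
    return ts - ts % step


def update_argument(timestamp: float, count: int, maximum: int) -> str:
    """Format a 'time:value:value' string for rrdtool update."""
    return f"{align_to_step(timestamp)}:{count}:{maximum}"


# A fixed sample timestamp, then the current time to show the remainder is zero.
print(update_argument(1306364105, 1200, 65536))
print(align_to_step(time.time()) % STEP)
```

Because every update then lands exactly on a step boundary, rrdtool has no partial interval to interpolate across, which fits the explanation above for the constant max value showing up as a solid line.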

Sample conntrack rrd graph

After getting this working, i wondered whether there were other conntrack values i should be checking (the ip_conntrack_tcp_be_liberal and ip_conntrack_tcp_loose sounded particularly interesting) so i started going looking for documentation on the files in /proc/sys/net/ipv4/netfilter/. Initial searches came up with very little. The best description i could find of them was at http://netfilter.linux-kernel.at/documentation/pomlist/pom-extra.html#tcp-window-tracking, but i must admit that i crave more detail.  If anyone can point me to a better reference, or suggest which conntrack items really need monitoring, please drop me a line.

(Incidentally, i’ve discovered that collectd has a netfilter conntrack plugin, so i will probably not develop the scripts i created any further, but will try to adapt that plugin to my needs.)

Another HP product added to my "do not buy" list: LaserJet P2035n


I tweeted about the HP LaserJet P2035n a while back, and things have only gotten worse for me since.  To summarise: it has no SSL support for administration, its SNMP response is patchy (see graph below), and it isn’t supported by JetAdmin.  This last point was underscored to me yesterday: i realised that the particular printer i’ve been monitoring is running an older firmware version (from 2008), so i went looking for an updated one.  I found it on HP’s web site (eventually – that remains a rant for another day), downloaded it to my JetAdmin VM, and promptly found that it insists on being locally connected via USB.

Fails like this are unfortunately becoming increasingly common with HP’s product range as they try to compete on price with everyone and get products to market quickly.  (See my rant about the ProCurve 1810 switch.)  My open letter to HP follows:

Dear HP,

Please stop trying to compete with Dell on price.  Purchase price is not absolutely everything, nor is time to market.  Concentrate on making yourselves useful and manageable in the medium-large enterprise, and we’ll keep buying your products.

Yours sincerely,
A (mostly) happy, long-term ProCurve and LaserJet customer.

Graph from my SNMP monitoring package showing this printer going up & down like a yo-yo: