The other day I got a bug report about check_ntpmon, which was reporting UNKNOWN status back to Nagios even though everything seemed to be working fine. A bit of debugging revealed that it was receiving the following message from ntpq on standard error:
ntpq: write to ::1 failed: Operation not permitted
This was a bit strange, because various links I found indicated that this message is usually due to firewalls:
- http://www.k336.org/2015/03/ntpq-and-ipv6.html
- https://bugs.launchpad.net/ubuntu/+source/ntp/+bug/596492
- http://h30499.www3.hp.com/t5/Networking/Help-ntpq-write-to-localhost-failed-Operation-not-permitted/td-p/2949822
But this host was not blocking anything, and check_ntpmon only ever runs ntpq against the loopback interface, which firewalls rarely touch. A bit of further digging showed that the cause was indeed not the firewall but a full conntrack table, with dmesg showing:
Aug 4 03:04:19 hostname kernel: [5226949.016837] nf_conntrack: table full, dropping packet
Increasing the conntrack limit fixed the problem.
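For anyone who wants to check for this condition before it bites, here is a minimal sketch in Python. It assumes a Linux host with the nf_conntrack module loaded, where the kernel exposes the current entry count and the limit as nf_conntrack_count and nf_conntrack_max under /proc/sys/net/netfilter. The function name conntrack_usage is my own invention for illustration and is not part of check_ntpmon.

```python
from pathlib import Path

# Standard procfs location on Linux when the nf_conntrack module is loaded.
PROC_NETFILTER = Path("/proc/sys/net/netfilter")


def conntrack_usage(base=PROC_NETFILTER):
    """Return (count, limit, fraction_used) for the conntrack table.

    `base` is a parameter only so the function can be pointed at a
    different directory (for example, in a test).
    """
    base = Path(base)
    count = int((base / "nf_conntrack_count").read_text().strip())
    limit = int((base / "nf_conntrack_max").read_text().strip())
    # Guard against a zero limit so we never divide by zero.
    used = count / limit if limit else 0.0
    return count, limit, used


if __name__ == "__main__":
    try:
        count, limit, used = conntrack_usage()
    except OSError:
        # The files are absent when connection tracking is not in use.
        print("conntrack counters not available (module not loaded?)")
    else:
        print(f"conntrack: {count}/{limit} entries ({used:.0%} used)")
```

If the table turns out to be close to full, one common way to raise the limit at runtime is the net.netfilter.nf_conntrack_max sysctl (for example via sysctl -w, plus an entry in /etc/sysctl.conf or /etc/sysctl.d to make it persistent). The right value depends on the host's memory and traffic, so I won't suggest a number here.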
(Just thought I'd document this here for posterity, since none of the links I found suggested this issue.)