Filtering tshark and tcpdump with packet size as a capture filter

I recently wanted to look at some packet captures on my NTP pool servers to find out whether any NTP clients hitting my servers use extension fields or legacy MACs.  Because the overall number of NTP packets is quite large, I didn’t want to spool all NTP packets to disk and filter later with a Wireshark display filter – I wanted to filter at the capture stage.

I started searching and found that not many quick guides exist to do this in the capture filter.  However, the capability is there in both tcpdump and tshark, using either indexing into the UDP header, or using the overall captured frame length.  Here’s an example of tcpdump doing the former (displaying it to the terminal), and tshark doing the latter (writing it to a file):

tcpdump -i eth0 -n -s 0 -vv 'udp port 123 and udp[4:2] > 56'
tshark -i eth0 -n -f 'udp port 123 and greater 91' -w file.pcap

Both of the above filters are designed to capture NTP packets larger than the most common 48-byte UDP payload.  In the case of udp[4:2], we’re using the UDP header’s 16-bit length field, which includes the header itself.  The greater keyword, by contrast, tests the overall captured frame length, and actually means greater-than-or-equal-to (i.e. the same as the >= operator); see the pcap-filter(7) man page for more details.
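As a sketch of where those two magic numbers come from, assuming IPv4 over Ethernet with no IP options or VLAN tags:

```shell
# Where the 56 and 91 come from, assuming IPv4 over Ethernet with no
# IP options or VLAN tags:
ntp_base=48                           # standard NTP payload, no extension fields
udp_len=$((8 + ntp_base))             # UDP header + payload = 56; udp[4:2] must exceed this
frame_len=$((14 + 20 + udp_len))      # Ethernet + IP + UDP = 90; "greater 91" matches frames >= 91
echo "udp_len=$udp_len frame_len=$frame_len"
```

For NTP over IPv6, or packets with IP options, the frame-length threshold would differ, which is one reason the udp[4:2] form is the more precise of the two.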

Setting up Zimbra for strong ciphers only

Tonight i was working on getting a client’s Zimbra SSL configuration up to scratch, and found it somewhat difficult to get our server to make Qualys’ SSL Labs scanner happy.  I was working from the following Zimbra wiki pages:

It seems that as of Zimbra 8 (possibly before that?) there is no longer any need to configure jetty – everything seems to go through nginx as an SSL reverse proxy.  I tried several different combinations and still kept getting insecure ciphers in the Qualys scan results until i stumbled across this nginx forum post and these certificate installation instructions.  Between them i managed to glean that:

So the commands i ended up with for Zimbra were:

zmprov modifyConfig zimbraReverseProxySSLCiphers '!ADH:!eNULL:!aNULL:!DHE-RSA-AES256-SHA:!SSLv2:!MD5:RC4:HIGH'
zmmailboxdctl restart

This was enough to get us an “A” rating in Qualys’ eyes.
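One way to sanity-check a cipher string like that is to expand it with openssl ciphers and see what it actually permits.  This sketch uses a simplified subset of the string above (modern OpenSSL builds no longer ship RC4, so it’s omitted here), and the output will vary with your OpenSSL version:

```shell
# Expand a cipher string to see which ciphers it actually allows.
# Simplified subset of the Zimbra string above; output varies by
# OpenSSL version.
openssl ciphers -v '!aNULL:!eNULL:!MD5:HIGH' | head
```

If the expansion lists anything you don’t want offered, adjust the string and re-run before touching the server configuration.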


Quick tip on using Vyatta vbash

I’ve been doing a lot of work with Vyatta of late, and one of the great things about having a router based on Debian is that you can combine Cisco-/Juniper-style CLI with Linux goodies.  One of my favourite Linux utilities is watch, which just runs a command repeatedly and shows you its output.  So to watch a mail queue empty after you’ve fixed a downstream mail server, one might run:

watch "mailq | tail"

On Vyatta, this comes in handy if one wants to watch a route table or something similar.  However, by default, the CLI will not allow you to run “show” commands directly, because they’re implemented internally by vbash, the Vyatta version of the well-known Linux/Unix shell, bash.  So, for example, the following will not work:

watch "show ip ospf database"

nor will

watch "vbash -c 'show ip ospf database'"

The trick is to use the -i flag to vbash, which tells it to assume that it’s an interactive shell, like so:

watch "vbash -ic 'show ip ospf database'"

I’m not sure why Vyatta felt it necessary to require this, since the only conceivable reason one would run vbash instead of bash is to get access to the Vyatta extensions, but this is an easy and painless workaround.  (I’ve also documented this at the Vyatta forum thread that talks about it, since Google still points there for a number of searches – hopefully they’ll update the links soon.)

Note also that the double quotes are necessary to tell watch to send the entire command to its own internal shell.  If you have lots of $ variables and the like, this will quickly turn into quote and backslash hell, so keep it simple, or put your commands in a script file.
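If you do go the script-file route, the usual Vyatta pattern (as far as i know – treat this as a sketch, and note the /tmp path and filename here are arbitrary) is a vbash script that sources the script template so that operational commands like show are available:

```shell
# Sketch of a vbash wrapper script; path and filename are arbitrary.
# Sourcing the script-template makes operational "show" commands
# available inside a script.
cat > /tmp/ospfdb <<'EOF'
#!/bin/vbash
source /opt/vyatta/etc/functions/script-template
show ip ospf database
EOF
chmod +x /tmp/ospfdb
# then simply: watch /tmp/ospfdb
```

This sidesteps the quoting problem entirely, since watch only ever sees a single word.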


Fun with Linux server migrations, part 2


Seeing progress of a long-running Linux process

During the server migration mentioned in part 1, i wanted to see what a long-running rsync process was doing.  Because we had done several presyncs of the data before the outage window, there was not a lot of progress for rsync to report; it was simply churning through files checking to see which ones had changed.

The usual tool for this is strace, which shows all system calls made by a process.  You can attach it to a running process with strace -p PID, where PID is the numeric process id.  I ran strace briefly to find out what system calls rsync was making, and found that it calls lstat64 for each file.  But because it had to look through so many files, i couldn’t very well run strace -p PID 2>&1 | grep lstat64[1] because even that was too much data.  (I was connected to the system via my home ADSL connection, and with hundreds of thousands of files to copy, it would never have kept up with the trace output.)

So i started looking around for the right tool to sample the data without overwhelming my slow connection.  I considered writing a quick awk script, but it turns out that it’s even easier than that: sed has a built-in function for operating on the Nth line of any input file.  The general form is sed -n 'M~Np' file, which prints every Nth line starting with the Mth.  (In my case, i was reading from a pipe from strace, so there was no file.)  I tried a few different combinations and settled on strace -p PID 2>&1 | grep lstat64 | sed -n '1~10000p', which samples one in every ten thousand files that rsync processes.
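To see the first~step syntax in action on something harmless, here’s a tiny sketch (GNU sed only – the ~ operator is a GNU extension):

```shell
# Print every third line starting with the first (GNU sed's M~N
# address form: start at line 1, step by 3).
seq 1 10 | sed -n '1~3p'
# prints 1, 4, 7 and 10, one per line
```

Swap in a larger step like 1~10000 and a pipe from strace, and you have the sampling trick above.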

I need to do this quite often on running processes, so sed -n 'M~Np' is going straight to my pool room for helpful little Linux recipes.

[1] The 2>&1 is necessary because strace sends trace output to standard error rather than standard output.


Fun with Linux server migrations, part 1

Server migrations with file system structure changes

Last night i completed a P2V migration of a 2 TB Linux file server.  It was running on an old IBM x306 server with cheap SATA disks, and we were migrating it to a VMware environment with a SAS-connected disk array.  This server is going to be rebuilt in the near future, so we didn’t want to use the same amount of disk space (it was only about 60% full).  Also, it was running Linux software RAID, which is not necessary under the new environment – the disk array handles RAID.

So i needed to rebuild the file systems and copy at the file level in order to migrate the server.  Preserving the old personality but allowing for a new disk layout and a VM environment requires some care.  I wanted to maximise my options in the case of something going wrong, so i made sure the system was plugged into a managed switch which i control.  Here’s the process i followed:

  1. Create a new VM with the appropriate settings, including CPU, RAM, disk, and network.  On ESXi 5, i prefer to use LSI Logic SAS emulation for disk controllers, and Intel E1000 emulation for NICs, because:
    • both of these drivers are in the mainline Linux kernel, therefore
      • you don’t end up with unmountable root file systems or unreachable networks when you first start up the VM, and
      • you don’t have to run proprietary VMware drivers at all if you don’t want
    • they seem (anecdotally) to offer improved performance over the other emulated driver choices
  2. Do a minimal install of the OS in the new VM; use a different IP address from the source server.
  3. Set up file systems as desired.  In this case, all non-system data is in /home, so i made that a separate virtual disk and created a file system on it.
  4. From the target server, pre-sync the data in /home.  I used the command
    rsync -avx sourceserver:/home/ /home/ --delete

    The initial sync was the largest, but i ran it again several times over a week to ensure that the final sync was as short as possible.

  5. Create an out-of-band network connection to the source server.  You might already have this.  In this case, the source server had a spare NIC which i put on our network management VLAN.  Start an ssh session on the new network connection to ensure that the old system is still reachable while you’re testing the new VM.
  6. If the system runs a Red Hat-based distribution (this system uses CentOS 5), ensure that any MAC addresses are commented out in /etc/sysconfig/network-scripts/ifcfg-eth*.  This ensures that when services are cut over, the new virtual NIC is not considered a new device, but takes on the settings of the old NIC.
  7. Create an exclude file for the system data.  I used these resources from OpenVZ and Slicehost to help me come up with an appropriate list of files to exclude.  Here’s what i ended up with:

    Some of the entries in the list above are not necessary due to the -x flag on rsync, which prevents it from crossing file system boundaries, but i wanted a fairly generic list that could be reused.  This list should be a good start for CentOS 5 systems, but may need tweaking for other distros.  The exclude file lists itself because i ran the rsync from the target and did not want to lose it when copying the root file system.

  8. Ensure that an independent backup of the source server exists.  Run it just before the outage window.
  9. When the outage window arrives, shut down all services on the source and target which are not essential for the purposes of the copy.  Here’s a list of the ones i used for my system – your list will likely be different:
    service acpid stop
    service anacron stop
    service apmd stop
    service atd stop
    service autofs stop
    service bluetooth stop
    service crond stop
    service gpm stop
    service hidd stop
    service iscsid stop
    service iscsi stop
    service isdn stop
    service netfs stop
    service nfslock stop
    service nfs stop
    service pcscd stop
    service portmap stop
    service radiusd stop
    service rawdevices stop
    service rpcgssd stop
    service rpcidmapd stop
    service sendmail stop
    service smartd stop
    service smb stop
    service syslog stop
    service xfs stop
    service ypbind stop
    service yum-updatesd stop

    Some of these might seem essential (e.g. syslog), but they’re only necessary for normal running of the system, not for copying its personality to a new server.  The basic idea is to minimise the amount of churn (especially logging) in the file systems being copied, while leaving networking and sshd running.

  10. From the target server, run rsync with the delete flag for any non-root system partitions/LVs on the system drive.  In my case, there was a separate /var partition.  Note that the exclude file entries need to be relative to the partition being copied, so to copy /var, you might use an exclude file like this:

    and a command like this:

    rsync -avx sourceserver:/var/ /var/ --exclude-from=/root/exclude.var --delete

    Be sure to run it with --dry-run first to make sure you’re not trashing something you don’t expect.

  11. Copy the root partition/LV in a similar fashion:
    rsync -avx sourceserver:/ / --exclude-from=/root/exclude.root --delete

    The exclude file has the contents as shown in the main exclude list above.  Again, don’t forget --dry-run to test first.

  12. Now the target VM has all the settings of the original server and is ready for the changeover.  From the managed switch, disable the frontend port(s) leading to the source server, leaving the out-of-band port active.  This prevents client traffic from going to the server.
  13. After the rsyncs are finished, reboot the target VM, watching its startup with the VMware console.  There will probably be a few services that will not be applicable under VMware (e.g. lm_sensors) – you can disable and/or remove these when convenient.  The new VM should now have all the personality of the old server, including services, IP address, and data.
  14. Once you’ve tested the target server and ensured that it is performing the source server’s job appropriately, shut down the source server from the ssh session you started on the out-of-band port earlier, then shut down the out-of-band port.  This ensures that even if you’re remote from the server and it is powered up again (either by mistake, or due to mains power loss and recovery), it won’t be able to interfere with the operation of the new system.
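Step 6 above (commenting out MAC addresses) can be sketched with sed.  This demonstrates on a throwaway temporary file rather than the live config; on the real system the target would be the /etc/sysconfig/network-scripts/ifcfg-eth* files:

```shell
# Comment out HWADDR lines so the new virtual NIC is not treated as a
# new device.  Demonstrated on a temporary file, not the real config;
# on the live system you would point sed at ifcfg-eth* instead.
cfg=$(mktemp)
printf 'DEVICE=eth0\nHWADDR=00:11:22:33:44:55\nONBOOT=yes\n' > "$cfg"
sed -i 's/^HWADDR=/#HWADDR=/' "$cfg"
cat "$cfg"
```

As with the rsync steps, it’s worth eyeballing the result (or running sed without -i first) before committing the change on the source of truth.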

This process went very smoothly for me last night.  So smoothly, in fact, that i was a bit worried and ran a lot of extra tests afterwards to ensure that it really was successful.  Fortunately, my fears were unfounded.  😉