Background
In my last post I described the setup I use to provide time synchronisation to the hosts I maintain in the NTP pool. I only recently learned about the PTP Hardware Clock (PHC) device driver available in KVM, and started testing it in earnest earlier this year.
The main reason to use the PHC is to more closely track the hypervisor host. This is possible because reading from the local PHC is designed to be very efficient and it incurs much less overhead than performing an NTP request and response over the network.
Obviously this requires the host to have a clock worth tracking. My previous post explained the setup I use, but the general guidelines for a VM host would be the same as for any quality NTP setup:
- Use at least four clock sources, with a diversity of reference clocks.
- Peers should be selected on the basis of reliable network connectivity and reliable timekeeping performance. If you're using a host from the public pool, checking its pool score page is highly recommended. Here's an example: 150.101.186.48.
- All other things being equal, closer hosts (in terms of network delay) are better than further away hosts, because they allow NTP to constrain its error estimations to a narrower range.
Performance improvements - chrony
So how much difference does it make using the PHC? The first system I enabled (after validation and playbook development on a test VM) was my chronyd pool server. It is using 2 vCPUs and 1 GB RAM on a host with a 6-core AMD Ryzen 5 Pro 3600 CPU, with a maximum clock speed of 3.6 GHz.
I was already quite happy with the time sync performance of this VM - it was reporting a system offset within ± 50 µs:
The system frequency error ranged between about -19.44 and -19.64 ppm:
And the root dispersion ranged between 80 and 280 µs:
After enabling the PHC device, the same VM actually reported a larger range of system frequency error, now ranging between -19.1 and -19.8 ppm:
But system offset reduced to ± 4 µs:
And maximum root dispersion less than 2 µs:
Here are the graphs showing a few days on either side of this configuration change.
System offset:
Frequency error:
Root dispersion:
One point to note about this is that by default chrony sees the PHC device as stratum 0, rather than stratum 2 equivalent. More importantly, its figures for root dispersion and root delay are misleadingly low because they only account for the time taken to read the PHC device.
To address this, chrony allows the stratum and root delay of the PHC reference clock to be set manually in the config file. I set the root delay by looking at the host's root delay over a few days prior to the change, and picking a slightly higher value than the minimum (in my case I used 400 µs). Unfortunately, there's no way to obtain the host's root dispersion from the PHC device (which is why the AWS Nitro PHC driver reports clock error bound via a separate sysfs interface), and no mechanism that I'm aware of in chrony to adjust it (although it is influenced by the configured root delay).
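The "minimum plus a little" choice of root delay can be sketched as a pair of shell helpers. This is only an illustration: the function names and the 50 µs default margin are my own assumptions, and the first helper assumes the "Root delay : N seconds" line that chronyc tracking prints.

```shell
# Print the "Root delay" value (in seconds) from `chronyc tracking` output on stdin.
root_delay_from_tracking() {
    awk -F': *' '/^Root delay/ { split($2, v, " "); print v[1] }'
}

# Read one root delay (in seconds) per line on stdin and print the minimum
# plus a margin given in microseconds. The 50 us default is an arbitrary
# assumption, not a recommendation.
delay_floor_with_margin() {
    awk -v margin_us="${1:-50}" '
        NF && (min == "" || $1 + 0 < min + 0) { min = $1 }
        END { if (min != "") printf "%.6f\n", min + margin_us / 1e6 }'
}
```

On the hypervisor host, something like chronyc tracking | root_delay_from_tracking could be sampled periodically into a file over a few days, and that file then fed to delay_floor_with_margin to get a candidate value for the delay option.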
Setting the stratum to an accurate value is also potentially problematic, because chrony uses stratum as part of its sync peer selection algorithm. I eventually settled on setting the PHC to stratum 1, so that it doesn't appear to be lower stratum than the stratum 1 servers on my local network, but is still likely to be selected as the sync peer under most circumstances.
Common prerequisite
Before setting up an NTP service to use the PHC device, there's one prerequisite: loading the ptp_kvm driver. It takes no options and should be available in all mainstream Linux kernel builds. Activating it is as simple as:
# modprobe ptp_kvm
On my systems this does not produce any kernel log message, because the core PTP driver is compiled into the default kernel. So the only indication that the module is loaded is the presence of a /dev/ptp0 device file, along with a symlink to it indicating that it belongs to KVM:
# ls -la /dev/ptp*
crw------- 1 root root 248, 0 Apr 21 12:24 /dev/ptp0
lrwxrwxrwx 1 root root      4 Apr 21 12:24 /dev/ptp_kvm -> ptp0
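On a VM that also has a PTP-capable NIC, the KVM clock may not be ptp0. A minimal sketch for locating it via sysfs follows. It assumes each PTP clock exposes a clock_name file under /sys/class/ptp/ and that the KVM driver's name contains "KVM" (I believe it reports "KVM virtual PTP", but treat that as an assumption). The function name is hypothetical.

```shell
# Print the /dev path of the first PTP clock whose sysfs clock_name mentions KVM.
# The sysfs root is a parameter so the function can be pointed at a test tree.
find_kvm_phc() {
    sysfs_root="${1:-/sys/class/ptp}"
    for name_file in "$sysfs_root"/ptp*/clock_name; do
        [ -r "$name_file" ] || continue
        if grep -qi 'kvm' "$name_file"; then
            echo "/dev/$(basename "$(dirname "$name_file")")"
            return 0
        fi
    done
    return 1
}
```

Calling find_kvm_phc with no arguments checks the real sysfs tree and returns non-zero if no KVM clock is found. The /dev/ptp_kvm symlink shown above serves the same purpose where it exists.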
To ensure that this is always loaded on boot, I add the driver name to /etc/modules on my Debian and Ubuntu systems:
ptp_kvm
Other Linux distributions may use a slightly different mechanism for this, e.g. a file in /etc/modules-load.d/ (which also works on Debian-based distros).
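For the /etc/modules-load.d/ route, a drop-in file containing just the module name should be enough. Here is a small sketch; the function name and the file name ptp_kvm.conf are arbitrary choices of mine, and the directory is a parameter so it can be tried against a scratch directory before running it as root.

```shell
# Write a modules-load.d drop-in that asks for ptp_kvm to be loaded at boot.
# With no argument it targets /etc/modules-load.d, which requires root.
persist_ptp_kvm() {
    conf_dir="${1:-/etc/modules-load.d}"
    printf '%s\n' 'ptp_kvm' > "$conf_dir/ptp_kvm.conf"
}
```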
Chronyd configuration
Chrony has built-in support for the PHC device as a reference clock. To enable it, I use the following configuration line:
refclock PHC /dev/ptp0 poll 0 delay 0.0004 stratum 1
As mentioned above, the delay and stratum options are set so that these values are not reported as artificially low. The only other options are the name of the device file, and the poll interval (in powers of 2, so poll 0 means the reference clock is polled once every second). Chrony supports various other options for this and other reference clocks which can be found in man 5 chrony.conf.
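If the line is generated from a playbook or script, the unit conversions are the easy part to get wrong: delay is given in seconds and poll is an exponent. A hedged sketch with hypothetical helper names, taking the delay in microseconds:

```shell
# Build a chrony refclock line for a PHC device.
# Arguments: device path, poll exponent, root delay in microseconds, stratum.
phc_refclock_line() {
    delay_s=$(awk -v us="$3" 'BEGIN { printf "%.6f", us / 1e6 }')
    printf 'refclock PHC %s poll %s delay %s stratum %s\n' "$1" "$2" "$delay_s" "$4"
}

# Polling interval in seconds for a non-negative poll exponent (2^poll).
poll_interval_seconds() {
    echo $((1 << $1))
}
```

For example, phc_refclock_line /dev/ptp0 0 400 1 produces the same settings as the line above, with the delay written as 0.000400.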
After restarting chronyd, the PHC device shows up as a reference clock source:
# chronyc -n sources
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
#* PHC0                          1   0   377     1  -8135ns[-8131ns] +/-  200us
^+ 2001:44b8:2100:3f11::7b:6     1   8   377   185    +30us[  +34us] +/-  264us
^- 2001:44b8:2100:3f00::7b:102   2   8   377    36   -201us[ -200us] +/- 2192us
...
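To confirm from a script that the PHC was actually chosen as the sync source, the state column of this output can be checked. A minimal sketch, assuming the layout shown above where the second character of the first column is * for the selected source (the function name is hypothetical):

```shell
# Read `chronyc -n sources` output on stdin and print the name of the
# currently selected source (the row whose state character is '*').
selected_chrony_source() {
    awk 'substr($1, 2, 1) == "*" { print $2 }'
}
```

Running chronyc -n sources | selected_chrony_source on the VM above would be expected to print PHC0.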
Because they're not NTP sources, chrony does not report the statistics for reference clocks in its measurements log. Instead, they are reported in the statistics and tracking logs. This means that at the moment I don't have measurements for them available via NTPmon. I may add support for the tracking log in future, but for the time being the changes in system offset and root dispersion are the main metrics I use to evaluate the effects of enabling the PHC device.
Ntpd configuration
The configuration of traditional ntpd for PHC support is slightly more complicated, because it doesn't have a reference clock driver for PTP devices. Instead, the phc2sys utility from the linuxptp package is used to provide a bridge between PTP devices and ntpd's shared memory driver. So a prerequisite for ntpd to use the PHC is to install this package:
# apt install linuxptp
By default phc2sys assumes that its time is coming via PTP from a supported NIC or similar, so the default systemd configuration for it is inappropriate. Instead, I created a local systemd service file:
# cat /etc/systemd/system/phc2sys.service
[Unit]
Description=Synchronize PTP hardware clock (PHC) to NTP SHM driver
Documentation=man:phc2sys
[Service]
CapabilityBoundingSet=cap_sys_time
EnvironmentFile=-/etc/default/phc2sys
ExecStart=/usr/sbin/phc2sys $PHC2SYS_OPTIONS
Restart=always
RestartSec=12s
Type=simple
[Install]
WantedBy=ntp.service
And a matching defaults file:
# cat /etc/default/phc2sys
PHC2SYS_OPTIONS=-E ntpshm -s /dev/ptp0 -O 0 -l 5
These options instruct phc2sys to use /dev/ptp0 as its time source and the NTP SHM driver as its destination, to use an offset of 0 between slave and master clocks (which is only relevant if you're using a PTP source which uses TAI rather than UTC), and to log at level 5 (LOG_NOTICE) rather than the default of level 6 (LOG_INFO). The log level adjustment is optional, but if it's not used, phc2sys will log a message like this every second:
Apr 20 22:05:06 ntp102 phc2sys[1132]: [213.454] CLOCK_REALTIME phc offset -311328 s0 freq +0 delay 0
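If the default log level is kept, these per-second lines can be mined for the measured offset. A small sketch, assuming the value after the word "offset" is in nanoseconds as in other linuxptp output (the function name is hypothetical):

```shell
# Extract the offset value that follows the word "offset" from phc2sys
# log lines on stdin, one value per line.
phc2sys_offsets_ns() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "offset") { print $(i + 1); break } }'
}
```

Something like journalctl -u phc2sys | phc2sys_offsets_ns would then give a column of offsets suitable for graphing.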
Then to configure ntpd to use the SHM driver, I added these lines to /etc/ntp.conf:
server 127.127.28.0
fudge 127.127.28.0 stratum 1
As mentioned above, fudging the stratum is optional, but it means that the source is treated more like other sources on the network when the sync peer is selected.
After restarting ntpd the SHM source shows up in our list of sources:
# ntpq -np
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
...
*127.127.28.0    .SHM.            1 l   49   64  377    0.000   +0.005   0.001
-2001:44b8:2100: 80.72.67.48      2 u   32   64  377    1.296   -0.148   0.283
+2001:44b8:2100: .PPS.            1 u    1   64  377    1.350   -0.018   0.168
+2001:44b8:2100: 80.72.67.48      2 u   22   64  377    1.180   +0.006   0.139
-2001:44b8:2100: 80.72.67.48      2 u   12   64  377    0.921   -0.018   0.136
...
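The leading * marks the source ntpd has selected as its system peer. A hedged sketch for checking this from a script, assuming the tally code is the first character of each row as in the output above (the function name is hypothetical):

```shell
# Read `ntpq -np` output on stdin and print the remote column of the
# system peer (the row whose tally code is '*').
ntpd_sys_peer() {
    awk 'substr($1, 1, 1) == "*" { print substr($1, 2) }'
}
```

If the SHM source has been selected, ntpq -np | ntpd_sys_peer should print 127.127.28.0.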
Performance improvements - ntpd
Unlike chronyd, ntpd treats the SHM driver as a virtual NTP source, recording the same statistics in /var/log/ntpstats/peerstats as it does for NTP sources, and enabling the measurements to be directly compared.
Here's a Grafana dashboard snapshot for a one week period on either side of the point where I enabled the PHC device on my first pool server running ntpd, 150.101.186.48: https://snapshots.raintank.io/dashboard/snapshot/jQUiLFkHKNKJJSJZLz6ZiYpaKA3X2EH4
This VM uses 2 vCPUs and 512 MB RAM on a host with an older dual-core Intel Celeron 1037U CPU with a maximum clock speed of 1.8 GHz.
The individual peer values are a little hard to see in that dashboard and apparently can't be singled out, so here's a snapshot of the chronyd pool server mentioned above (which was already using the PHC on its own host), graphed alongside the PHC device:
Even though they aren't on the same VM hosts, and their PHCs therefore aren't tracking the same source clocks, their offset from each other still dropped once the PHC was in use.
Here are a few of the other highlights: system offset went from ± 225 µs to ± 50 µs:
As with chrony, frequency error didn't change much:
Root dispersion went from a maximum of 40 ms to 2 ms:
Jitter (an ntpd-specific metric) went from a maximum of 0.00034 to 0.00015:
Next steps
In my next post I'll explore how using the ptp_kvm driver on AWS compares with using their Nitro-based microsecond-accurate time service.
Thanks
Special thanks to Dan Drown and Miroslav Lichvar for their advice and pointers as I learned about using the KVM PHC (although any inaccuracies in this post are mine!).