I decided that the long dyndns URLs were a bit daggy, so it’s back to the old site name. Please let me know if you notice any issues with the changeover.
Depending on how you install Linux, you might already have a pretty workable NTP client, but for the tools we’ll be working with, we’ll need the NTP reference implementation server. On Ubuntu (these examples use 16.04 “Xenial Xerus”, but the instructions should work on all current Debian-derived distributions), you can install this with the usual packaging tools:
ubuntu@machine-5:~$ sudo apt-get install ntp
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following additional packages will be installed:
  libopts25
Suggested packages:
  ntp-doc
The following NEW packages will be installed:
  libopts25 ntp
0 upgraded, 2 newly installed, 0 to remove and 0 not upgraded.
Need to get 577 kB of archives.
After this operation, 1,791 kB of additional disk space will be used.
Do you want to continue? [Y/n] y
Get:1 http://nova.clouds.archive.ubuntu.com/ubuntu xenial/main amd64 libopts25 amd64 1:5.18.7-3 [57.8 kB]
Get:2 http://nova.clouds.archive.ubuntu.com/ubuntu xenial-updates/main amd64 ntp amd64 1:4.2.8p4+dfsg-3ubuntu5.3 [520 kB]
Fetched 577 kB in 0s (9,198 kB/s)
Selecting previously unselected package libopts25:amd64.
(Reading database ... 87077 files and directories currently installed.)
Preparing to unpack .../libopts25_1%3a5.18.7-3_amd64.deb ...
Unpacking libopts25:amd64 (1:5.18.7-3) ...
Selecting previously unselected package ntp.
Preparing to unpack .../ntp_1%3a4.2.8p4+dfsg-3ubuntu5.3_amd64.deb ...
Unpacking ntp (1:4.2.8p4+dfsg-3ubuntu5.3) ...
Processing triggers for libc-bin (2.23-0ubuntu3) ...
Processing triggers for systemd (229-4ubuntu10) ...
Processing triggers for ureadahead (0.100.0-19) ...
Processing triggers for man-db (2.7.5-1) ...
Setting up libopts25:amd64 (1:5.18.7-3) ...
Setting up ntp (1:4.2.8p4+dfsg-3ubuntu5.3) ...
Processing triggers for libc-bin (2.23-0ubuntu3) ...
Processing triggers for systemd (229-4ubuntu10) ...
Processing triggers for ureadahead (0.100.0-19) ...
Here are some of the components of the ntp package with which you might want to be familiar (use dpkg -L ntp on Ubuntu and other Debian-based distributions to see all the contents of the package; on Red Hat-based distributions, use rpm -ql ntp):
- the main configuration file; if you make changes to this file, you’ll need to restart ntpd with sudo service ntp restart to activate them
- statistics logging, if enabled, writes its logs here
- the main NTP server process; you won’t usually need to use this directly
- prints the exact time according to NTP
- NTP query – this is the most important tool for troubleshooting
- traces from your local server to stratum 1; sometimes helpful if you run a local reference clock, but if you’re using publicly available NTP servers, you’ll probably end up with a rather dull result like this:
ubuntu@machine-5:~$ ntptrace
localhost: stratum 3, offset -0.000445, synch distance 0.011506
188.8.131.52: timed out, nothing received
***Request timed out
This is because of the default security restrictions on querying NTP – more on this later.
- On Ubuntu, NTP ships with a default NTP AppArmor profile which limits its capabilities; distributions which use SELinux by default probably ship with similar restrictions.
- a hook which allows clients to receive their NTP server list via DHCP; we won’t be using this, but be aware that in the default Ubuntu configuration, if your network supplies NTP servers via DHCP, its settings will override the settings in /etc/ntp.conf.
Straight after installation you should have a pretty usable NTP server, assuming you have Internet access which allows outbound traffic to UDP port 123 (some ISPs block this). Here’s the default configuration on Ubuntu 16.04 LTS (with comments removed):
ubuntu@machine-5:~$ grep '^[^#]' /etc/ntp.conf
driftfile /var/lib/ntp/ntp.drift
statistics loopstats peerstats clockstats
filegen loopstats file loopstats type day enable
filegen peerstats file peerstats type day enable
filegen clockstats file clockstats type day enable
pool 0.ubuntu.pool.ntp.org iburst
pool 1.ubuntu.pool.ntp.org iburst
pool 2.ubuntu.pool.ntp.org iburst
pool 3.ubuntu.pool.ntp.org iburst
pool ntp.ubuntu.com
restrict -4 default kod notrap nomodify nopeer noquery limited
restrict -6 default kod notrap nomodify nopeer noquery limited
restrict 127.0.0.1
restrict ::1
restrict source notrap nomodify noquery
Let’s break down the configuration:
- driftfile – The drift (a.k.a. frequency) is the estimated error rate (in parts-per-million) of the local clock. The NTP daemon saves its estimate every hour to the file named here so that it doesn’t have to recalculate this error rate on startup. If the file doesn’t exist, ntpd will assume that the error rate is zero and recalculate it from scratch, so it’s harmless to delete the file. But leaving it there helps get your clock headed the right way faster on startup.
- statistics & filegen – these direct ntpd to create various types of statistics, if the statsdir directive is enabled (which it’s not here). You probably don’t need the stats files unless you are running a stratum 1 server, or are especially interested in the quality of your individual time sources.
- pool – this defines the sources of time for your NTP server; traditionally, “server” was used here, but the pool directive introduced in recent versions offers much improved behaviour, and should nearly always be used in preference to server. The source may be a hostname or IP address, but hostnames are preferred, since recent versions of ntpd will retry DNS lookups and switch to a new pool server if connectivity is lost. The “iburst” option at the end of the line causes ntpd to use 6 polls (as opposed to just 1) when initially contacting the source. Using this is nearly always recommended, as it helps get your clock synced sooner. The time sources used here are the public servers available in the NTP pool, plus the public Ubuntu NTP servers run by Canonical (my employer).
- restrict – the default set of restrictions allows use of pool servers, and permits querying of associations (the records about time sources) from the local host, but does not allow querying from elsewhere. Clients can still use this server as a time source (firewall rules permitting). Querying of the association list is something which was historically used in reflective DDoS attacks, so the default restrictions should be kept in place unless you have a good reason to do otherwise. [citation required]
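To put the driftfile’s parts-per-million figure in perspective, here is a minimal sketch that converts a frequency error in ppm into seconds gained or lost per day if nothing corrects it. The function name and the sample values are purely illustrative; they are not taken from a real drift file.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

def drift_seconds_per_day(ppm: float) -> float:
    """Seconds of error accumulated per day by an uncorrected clock
    whose frequency is off by `ppm` parts per million."""
    return ppm * 1e-6 * SECONDS_PER_DAY

# Hypothetical drift values, chosen only for illustration.
for ppm in (1.0, 10.0, 100.0):
    print(f"{ppm:6.1f} ppm -> {drift_seconds_per_day(ppm):.4f} s/day")
```

So even a modest 10 ppm error adds up to most of a second per day, which is why it helps for ntpd to remember its estimate across restarts.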
If you have average time synchronisation needs, the default configuration will probably work just fine and you won’t have to change anything, with the possible exception of the time sources. A common configuration change you might make is changing [0-3].ubuntu.pool.ntp.org to the pool servers for your country or region. For example, my home network’s NTP servers use [0-3].au.pool.ntp.org. For a full list of the available pools, check the global list, then drill down by continent and country.
Don’t necessarily just assume that your region/country servers are close, though. Recently I was working on a customer’s system located in South America and found that the US pool servers were much closer in terms of network latency than the South American ones! You should watch carefully which sources are supplied to you from the NTP pool and change them if the pool returns servers which are not physically nearby. (Where “nearby” generally means within a few thousand km; a rough rule of thumb for “average” use is: time sources should be less than 100 ms away.)
In part 4 of this series we’ll look at monitoring and troubleshooting your NTP service.
What is NTP?
NTP (Network Time Protocol) is an Internet standard for time synchronisation covered by multiple RFCs. “NTP is [arguably] the longest running, continuously operating, ubiquitously available protocol in the Internet” [Mills]. It has been operating since 1985, which is several years before Tim Berners-Lee invented the WWW. The current version is NTPv4, described in RFC5905, which also covers SNTP (Simple NTP), a more limited version designed mostly for clients.
Whilst there are multiple different implementations of NTP, I’ll be focusing on the reference implementation, from the Network Time Foundation, because that’s what I’m most familiar with, and because it has the most online reference material available.
How Linux keeps time
Linux and other Unix-like kernels maintain a system clock which is set at system boot time from a hardware real time clock (RTC), and is maintained by regular interrupts from a timing circuit, usually a crystal oscillator.
The kernel clock is maintained in UTC; the base unit of time is the number of seconds since midnight 1 January 1970 UTC. Applications can read the system clock via time(2), gettimeofday(2), and clock_gettime(2), the last two of which offer microsecond and nanosecond resolution respectively.
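As a rough illustration, Python’s standard time module exposes the same system clock on Linux. This is a sketch only: which underlying system call Python uses is an implementation detail, and the function name read_system_clock is made up for this example.

```python
import time

def read_system_clock():
    """Read the system (wall-clock) time three ways.

    Returns seconds since the epoch as a float (twice) and
    nanoseconds since the epoch as an int.
    """
    as_float = time.time()                                 # float seconds
    realtime = time.clock_gettime(time.CLOCK_REALTIME)     # float seconds
    realtime_ns = time.clock_gettime_ns(time.CLOCK_REALTIME)  # integer ns
    return as_float, realtime, realtime_ns

as_float, realtime, realtime_ns = read_system_clock()
print(f"time.time():           {as_float:.6f}")
print(f"CLOCK_REALTIME:        {realtime:.9f}")
print(f"CLOCK_REALTIME (ns):   {realtime_ns}")
```

All three values come from the same kernel clock, so they should agree to within the time it takes to make the calls.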
System calls are available to set the time if it needs to change (called “stepping” the clock), but the more commonly-used technique is to ask the kernel to adjust the system clock gradually via the adjtime(3) library function or adjtimex(2) system call (called “slewing” the clock). Slewing ensures that the clock counter continues to increase rather than jumping suddenly (even if the clock needs to be adjusted backwards), by making slight changes in the length of seconds on the system clock. If the clock needs to go forwards, the seconds are shortened (sped up) slightly until true time is reached; if the clock needs to go backwards, the seconds are lengthened (slowed down) slightly until true time catches up. (There are other interesting timing functions supported by the Linux kernel; see the documentation for more.)
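To get a feel for how gradual slewing is, this sketch estimates how long it takes to slew away a given offset. It assumes a maximum slew rate of 500 ppm, which is the limit commonly quoted for ntpd; that figure is my assumption here rather than something stated above, and the function name is illustrative.

```python
def slew_duration(offset_seconds: float, max_slew_ppm: float = 500.0) -> float:
    """Seconds needed to correct `offset_seconds` by slewing at
    `max_slew_ppm` parts per million (assumed limit, see lead-in)."""
    return abs(offset_seconds) / (max_slew_ppm * 1e-6)

for offset in (0.010, 0.100, 1.0):
    seconds = slew_duration(offset)
    print(f"offset {offset:5.3f} s -> about {seconds:7.0f} s of slewing")
```

Under that assumption a 100 ms error takes a few minutes to slew away and a full second takes over half an hour, which helps explain why large errors are stepped instead.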
Because oscillators are imperfect, system time is always out from UTC by some amount. Better quality hardware is accurate to within very small variance from the true time (unnoticeable by humans), while cheap hardware can be out by quite significant amounts. Clock accuracy is also affected by other factors such as temperature, humidity, and even system load. NTP is designed to receive timing information from external sources and use clock slewing (or stepping, where necessary) to keep the system clock as close as possible to true UTC time.
How NTP works
The notion of one true time is central to how NTP operates, and it has numerous checks and balances in it which are designed to keep your system zeroing in on the one true time. (For a more detailed and authoritative explanation of this, see Mills’ “Notes on setting up a NTP subnet“.)
The primary means which NTP uses for determining the correct time is just to ask for it! An NTP server simply polls other NTP servers (on UDP port 123) or other time sources (more on this below) for their current time, measures how long it takes the request to get there and back, and analyses the results to determine which sources represent the true time. The polling process is very efficient and can support huge numbers of clients with a minimum of bandwidth.
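The “there and back” measurement can be sketched with the four timestamps of a single exchange, using the standard offset and delay formulas from RFC 5905. The timestamps below are invented for illustration: a client whose clock is 0.5 s behind the server, with 20 ms of network delay each way and 1 ms of server processing.

```python
def offset_and_delay(t1: float, t2: float, t3: float, t4: float):
    """Compute clock offset and round-trip delay for one NTP exchange.

    t1: client transmit time (client clock)
    t2: server receive time  (server clock)
    t3: server transmit time (server clock)
    t4: client receive time  (client clock)
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2   # how far the client is behind the server
    delay = (t4 - t1) - (t3 - t2)          # time on the wire, excluding server processing
    return offset, delay

# Invented example values (seconds).
t1 = 100.000   # client sends
t2 = 100.520   # server receives: 100.000 + 0.5 offset + 0.020 travel
t3 = 100.521   # server replies after 1 ms
t4 = 100.041   # client receives: 100.000 + 0.020 + 0.001 + 0.020

offset, delay = offset_and_delay(t1, t2, t3, t4)
print(f"offset = {offset:+.3f} s, delay = {delay:.3f} s")
```

The offset calculation assumes the outbound and return paths take equal time; asymmetric paths show up as an error in the offset, which is one reason nearby sources are preferable.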
An NTP poll happens at intervals ranging from 8 seconds to 36 hours (going up in powers of two), with 64 seconds to 1024 seconds being the default range. The NTP daemon will automatically adjust its polling interval for each source based on the previous responses it has received. On most systems with a reliable clock and reliable time sources, poll times will settle on the maximum within a few hours of the NTP daemon being started. Here’s an example from one of my systems:
$ ntpq -pn
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
+172.22.254.1    172.22.254.53    2 u  255 1024  177    0.527    0.082   2.488
*172.22.254.53   .NMEA.           1 u   37   64  376    0.598    0.150   2.196
-184.108.40.206  220.127.116.11   2 u 1067 1024  377   44.964   -1.948   0.764
+18.104.22.168   22.214.171.124   2 u  101 1024  377   32.703   -1.666   8.223
+126.96.36.199   188.8.131.52     2 u  953 1024  377   55.609   -0.120   6.276
-2001:4478:fe00: 184.108.40.206   2 u   76 1024  377   35.971    4.814   1.848
-2001:67c:1560:8 220.127.116.11   2 u 1017 1024  377  376.041   -3.303   4.412
+18.104.22.168   22.214.171.124   2 u 1004 1024  377  325.680    1.469  38.157
The 6th column is the poll time, which is 1024 seconds for all but one of its peers. (More on how to interpret the output of ntpq will come in a later post.)
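The powers-of-two poll intervals are easy to enumerate. This sketch lists the permitted range (exponents 3 to 17) and the default range (exponents 6 to 10), then pulls the poll value out of one row of the ntpq output shown above. The helper names are my own, and the parsing is deliberately naive: it simply splits on whitespace and takes the 6th field.

```python
MIN_POLL_EXP, MAX_POLL_EXP = 3, 17                 # 8 s ... 131072 s (about 36 hours)
DEFAULT_MINPOLL_EXP, DEFAULT_MAXPOLL_EXP = 6, 10   # 64 s ... 1024 s

def poll_intervals(lo_exp: int = MIN_POLL_EXP, hi_exp: int = MAX_POLL_EXP):
    """Return the poll intervals in seconds for exponents lo_exp..hi_exp."""
    return [2 ** e for e in range(lo_exp, hi_exp + 1)]

def poll_column(ntpq_row: str) -> int:
    """Return the poll interval (6th whitespace-separated field) of an ntpq -pn row."""
    return int(ntpq_row.split()[5])

print(poll_intervals(DEFAULT_MINPOLL_EXP, DEFAULT_MAXPOLL_EXP))
print(f"longest interval: {poll_intervals()[-1] / 3600:.1f} hours")

row = "*172.22.254.53 .NMEA. 1 u 37 64 376 0.598 0.150 2.196"
print(poll_column(row), poll_column(row) in poll_intervals())
```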
So if your system gets time from another system on the network, from where does that system get its time? NTP time is ultimately sourced from accurate external sources like atomic clocks, some of which use the ultimate source of the standard second, the Caesium atom, as their reference. Such time sources are expensive, so other sources are used as well, such as radio clocks, stable oscillators, or (perhaps most commonly) the GPS satellite system (which itself uses atomic clocks). These sources are collectively referred to as reference clocks.
In the NTP network, a reference clock is stratum 0 – that is, an authoritative source of time. An NTP server which uses a stratum 0 clock as its time source is stratum 1. Stratum 2 servers get their time from stratum 1 servers; stratum 3 servers get their time from stratum 2 servers, and so on. In practice it’s rare to see servers higher than stratum 4 or 5 on the Internet [Mills] [Minar].
Stratum 1 servers are connected to their stratum 0 sources via local hardware such as a serial port or expansion card slot. The reason we have additional strata after stratum 1 is to ensure that there are enough servers to cope with the load from all the clients. As far as possible, network delay (latency) between strata should be kept to a minimum.
NTP uses a number of different algorithms to ensure that the time it receives is accurate. [Mills] Knowing how these algorithms work at a basic level can help us avoid configuration mistakes later, so we’ll look at them here briefly:
- filtering – The poll results from each time source are filtered in order to produce the most accurate results. [Mills]
- selection (a.k.a. intersection) – The results from all sources are compared to determine which ones can potentially represent the true time, and those which cannot (called falsetickers or falsechimers) are discarded from further calculations. [Mills]
- clustering – The surviving time sources from the selection algorithm are combined using statistical techniques. [Mills]
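To make the selection step a little more concrete, here is a heavily simplified sketch of the idea behind it: treat each source as an interval (its offset plus or minus an error bound), find the region where the most intervals overlap, and call any source that doesn’t cover that region a falseticker. This is not ntpd’s actual algorithm, which is considerably more involved; the source names, offsets, and error bounds are all invented for illustration.

```python
def select_truechimers(sources):
    """Split sources into (truechimers, falsetickers).

    sources: dict mapping name -> (offset, error_bound), both in seconds.
    A source is a truechimer here if its interval
    [offset - bound, offset + bound] covers the region where the
    largest number of intervals overlap.
    """
    edges = []
    for offset, bound in sources.values():
        edges.append((offset - bound, 0))   # 0 = interval start (sorts before an end)
        edges.append((offset + bound, 1))   # 1 = interval end
    edges.sort()

    best = count = 0
    best_lo = best_hi = None
    for i, (value, kind) in enumerate(edges):
        if kind == 0:
            count += 1
            if count > best:
                # The overlap region runs from this start to the next edge.
                best, best_lo, best_hi = count, value, edges[i + 1][0]
        else:
            count -= 1

    truechimers, falsetickers = [], []
    for name, (offset, bound) in sorted(sources.items()):
        covers = offset - bound <= best_lo and offset + bound >= best_hi
        (truechimers if covers else falsetickers).append(name)
    return truechimers, falsetickers

# Invented example: three sources roughly agree, one is a quarter of a second out.
sources = {
    "a": (0.001, 0.010),
    "b": (-0.002, 0.008),
    "c": (0.003, 0.012),
    "d": (0.250, 0.015),
}
print(select_truechimers(sources))
```

One practical takeaway from even this toy version is that with only two sources which disagree there is no way to tell which one is wrong, so configuring several sources gives the selection step something to work with.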
In the next part of this series we’ll explore how to install and configure NTP on an Ubuntu Linux 16.04 system.
(With apologies to Derek Zoolander and Justin Steven. And to whoever had to touch the HP-UX NTP setup at Queensland Police after I left. And to anyone who prefers the American spelling “synchronization”.)
(This is the first of a series on NTP. Part 2 is an overview of how NTP works.)
The problem with NTP
In my experience, Network Time Protocol (NTP) is one of the least well-understood of the fundamental Internet application-layer protocols, and very few IT professionals operate it effectively. Part of the reason for this is that the documentation for NTP is highly technical and assumes a certain level of background knowledge.
I first encountered NTP more than 20 years ago, and my first efforts with it were an unmitigated disaster due to my ignorance of how the protocol was designed to function. Since then virtually every IT environment I’ve encountered has had a less-than-optimal NTP setup.
I am still far from an expert on NTP, but I’ve learned quite a lot about operating it since my early days. I hope this series of posts will help you develop a working knowledge of NTP faster and get the basics of NTP configuration right in your environment.
Why learn NTP?
Why bother learning this rather obscure corner of Internet lore? I mean, the Internet mostly works, despite this alleged widespread lack of expertise in time sync, right?
Here are some of the reasons you might want to learn more about NTP:
- You run Ceph, MongoDB, Kerberos, or a similar distributed system, and you want it to actually work.
- You want your logs to match up across multiple systems, potentially on multiple continents.
- You like learning about new things and tinkering with embedded systems.
- You think bandwidth-efficient, high-precision time synchronisation is just a fun, nerdy problem.
- You think this is cool:
A scenario where the latter behavior [the PPS driver disciplining the local clock in the absence of external sources] can be most useful is a planetary orbiter fleet, for instance in the vicinity of Mars, where contact between orbiters and Earth only one or two times per Sol (Mars day). These orbiters have a precise timing reference based on an Ultra Stable Oscillator (USO) with accuracy in the order of a Cesium oscillator. A PPS signal is derived from the USO and can be disciplined from Earth on rare occasion or from another orbiter via NTP. In the above scenario the PPS signal disciplines the spacecraft clock between NTP updates.
(Personally, they had me at “planetary orbiter fleet”. 🙂 )
In this series, I’ll describe a few best practices for setting up NTP in a standard 64-bit Ubuntu Linux 16.04 LTS environment. Bear in mind this quite limited scope; this advice will not apply in all circumstances and intentionally ignores the less common use cases. Further caveats:
- I have no looks.
- I am not an expert. My descriptions of the algorithms are based on the documentation and operational experience. I’m not a member of the NTP project; I’ve never submitted a patch; I’ve never compiled ntpd from source (I hate reading & writing C/C++).
- I’ve only worked with the reference implementation of NTP, and only on Linux, with only one reference clock driver (NMEA), and a limited range of configuration options.
- I will be glossing over a lot of detail. Sometimes it’s because I don’t think it’s necessary in order to work with NTP successfully; sometimes it’s because I haven’t looked into that particular corner and so I don’t understand it; sometimes it’s because I have looked into that particular corner and I still don’t understand it. 🙂 But mostly it’s because I’m attempting to keep this series accessible for those who are newcomers. If you’re an experienced NTP operator, you probably won’t find much of interest (if anything) until later in the series.
- We won’t cover much history or theory of time sync in this series. If you’d like to know a little more about that, check out Julien Goodwin‘s previous LCA & SLUG talks:
(You can read part 1 first, if you want.)
When I first started running, I naturally fell into what I considered to be a proper running pace, but I found that I was not able to run as far as I wanted. It wasn’t until I made a conscious effort to slow down and concentrate on distance that I was able to achieve 5 km without stopping (the goal of Couch to 5K). The programme’s advice is to start by running as slowly as possible: the slowest thing you can actually call running. After a while I realised this was really good advice for me.
I haven’t really concentrated on my technique very much. The main thing I tried was swinging my arms a little more than I thought was natural. This seems to give the lungs more room and make diaphragm cramping less likely. A lot of people I see running keep their elbows bent sharply with their forearms held high, but that hasn’t worked for me so far.
I sometimes find that my lungs start to protest as I get to about the 1-2 km mark in a run. My brain immediately starts to make plans to stop and thinks up appropriate excuses – “You’ll probably feel sick if you keep going”, “you may not make it to 5 km if you don’t stop for a walk now”, and similar things. I found that if I just ignore those excuses and keep running (and maybe ease off the pace a bit), after a little while the tight feeling in the lungs passes and I finish my 5 km very successfully.
I run in a place I love: at the beach, around low tide, in the early morning or late afternoon. Our beaches here aren’t usually crowded, and early or late in the day is the time when many native birds are active, especially raptors. (My favourite to watch is the white-bellied sea eagle – sometimes they’re amazing enough to make me stop running and just watch as they glide along.)
I run on the moist part of the sand – between the thick, dry sand above the high tide mark, and the hard, wet part that the water has recently been on. This seems to give the right combination of stability of surface and underfoot cushioning. I often have a swim or surf before (in the hotter part of the year) or after (in the cooler part of the year).
I run with as few accessories as possible: sunscreen (when necessary), a swimming rash vest, running skins (compression shorts), my Garmin fitness tracker and GPS (I currently use older models that don’t have the functions combined), and my car key on a shoelace around my neck. In winter I have worn a light fitness jacket, but in this climate it’s only really necessary for a few weeks of the year, and causes me to overheat at other times.
I don’t wear shoes. This means I need to keep a watch out for any sharp shells on the beach, but I haven’t found that particularly tricky, and I find I heat up less. Until I started training on the road, I hadn’t really experienced any pain in my feet or legs due to running in bare feet (more on this later).
I don’t carry a phone or MP3 player. I love music, but I don’t like things being in my ears when I’m sweating. I prefer to listen to the sound of the waves and the birds. Sometimes I sing (or some rough facsimile thereof), or recite music in my head, especially some of Neal Morse‘s longer pieces.