Validation…

Following on from my post about the new key being added to the zone, the required 30 days have passed and, if your resolver is RFC5011 compliant, it should now trust the key.

You can check this as follows:

BIND

$ cat /var/named/managed-keys.bind
$ORIGIN .
$TTL 0  ; 0 seconds
@                       IN SOA  . . (
                                1904       ; serial
                                0          ; refresh (0 seconds)
                                0          ; retry (0 seconds)
                                0          ; expire (0 seconds)
                                0          ; minimum (0 seconds)
                                )
                        KEYDATA 20170425142612 20170210095625 19700101000000 257 3 8 (
                                AwEAAagAIKlVZrpC6Ia7gEzahOR+9W29euxhJhVVLOyQ
                                bSEW0O8gcCjFFVQUTf6v58fLjwBd0YI0EzrAcQqBGCzh
                                /RStIoO8g0NfnfL2MTJRkxoXbfDaUeVPQuYEhg37NZWA
                                JQ9VnMVDxP/VHL496M/QZxkjf5/Efucp2gaDX6RS6CXp
                                oY68LsvPVjR0ZSwzz1apAzvN9dlzEheX7ICJBBtuA6G3
                                LQpzW5hOA2hzCTMjJPJ8LbqF6dsV6DoBQzgul0sGIcGO
                                Yl7OyQdXfZ57relSQageu+ipAdTTJ25AsRTAoub8ONGc
                                LmqrAmRLKBP1dfwhYB4N7knNnulqQxA+Uk1ihz0=
                                ) ; KSK; alg = RSASHA256; key id = 19036
                                ; next refresh: Tue, 25 Apr 2017 14:26:12 GMT
                                ; trusted since: Fri, 10 Feb 2017 09:56:25 GMT
2017-03-12.automated-ksk-test.research.icann.org KEYDATA 20170424152612 20170317172529 19700101000000 257 3 8 (
                                AwEAAa9qsSLDI+H0keqE3Yzdr6XuhqhBQVWw5xdgNoWL
                                hE4VxSEIBz9IuCA4w4ssSrClZ59seNc76ltDFcKJv3X9
                                jDjzRtBLjenIgV4n/3GpKrAAnRlYbUtpBEdlk4mxoL3B
                                lX8pfLg7RQfTlWaxOUga1+CChcVieFF/si/eePc9HpZb
                                WxHZRLCAE8dlDa0aa0tfVAZWOnaifpmbTvhDK3tdvMU0
                                tfG2YfsOYcFB9z2KWmCDYwCONNKtls3p6wMwolun1h8I
                                Yo0PF98vqjAp3NVRZvKKdgyF/bZ/iJtAZFytXvXU6Gwa
                                5tOm1wgP6wuKupscP8KHBluZyOSKw4RMTk6YBdE=
                                ) ; KSK; alg = RSASHA256; key id = 3934
                                ; next refresh: Mon, 24 Apr 2017 15:26:12 GMT
                                ; trusted since: Fri, 17 Mar 2017 17:25:29 GMT
                        KEYDATA 20170424152612 20170418002534 19700101000000 257 3 8 (
                                AwEAAfUtjasCuLysD4MbjG3v4Kyu0vvVJ/0cIreP6flt
                                MeZmwQ5SRta/mB+eFVjau+6YKra2UeTKxojBovHH2lZr
                                w7NNejL44/Xps4gR3LSVMnCdwras+yvj4en64ghRGWYO
                                uB+Icb0AqrCUhLFWR8yx41UkfaA2vzFnM2xTx0N0+o6R
                                6UciWuwJResomQupOjNUy2ZAi81Y3pb0x3Lw4POjpcSJ
                                zrK4aZ/5UPymplqhLEU2DsoQmyFlM5RNTt0YXR8XM4Yw
                                su/scxg0u00IF1GC8xcyZUTMc1Rz98AY1VUo5QqUp9Vb
                                Aed5Aw1nNYfjLTj+zOykedgmjms1iNgh9EY111c=
                                ) ; KSK; alg = RSASHA256; key id = 19741
                                ; next refresh: Mon, 24 Apr 2017 15:26:12 GMT
                                ; trusted since: Tue, 18 Apr 2017 00:25:34 GMT

We can see in the output above that the new key, keytag 19741, is now trusted.
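If you'd rather not read the file by eye, the check can be scripted. The following is a minimal sketch, assuming your BIND writes the "key id" and "trusted since" / "trust pending" comments seen above (older versions may not); the function name trust_states and the shortened sample text are mine, not part of BIND.

```python
import re

# Matches the comment BIND writes after each key, e.g.
#   ) ; KSK; alg = RSASHA256; key id = 19741
KEY_ID = re.compile(r"key id = (\d+)")
# Matches the trust comment that follows it, e.g.
#   ; trusted since: Tue, 18 Apr 2017 00:25:34 GMT
TRUST = re.compile(r";\s*(trusted since|trust pending):\s*(.+)$")


def trust_states(text):
    """Return {key_id: (state, date)} from managed-keys.bind text."""
    states = {}
    current = None
    for line in text.splitlines():
        m = KEY_ID.search(line)
        if m:
            current = int(m.group(1))
            continue
        m = TRUST.search(line)
        if m and current is not None:
            states[current] = (m.group(1), m.group(2).strip())
            current = None
    return states


# Shortened sample: only the comment lines from the output above.
sample = """\
        ) ; KSK; alg = RSASHA256; key id = 3934
        ; next refresh: Mon, 24 Apr 2017 15:26:12 GMT
        ; trusted since: Fri, 17 Mar 2017 17:25:29 GMT
        ) ; KSK; alg = RSASHA256; key id = 19741
        ; next refresh: Mon, 24 Apr 2017 15:26:12 GMT
        ; trusted since: Tue, 18 Apr 2017 00:25:34 GMT
"""

for key_id, (state, when) in sorted(trust_states(sample).items()):
    print(key_id, state, when)
```

In practice you would pass in the contents of your own managed-keys.bind rather than the sample string.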

Unbound

$ cat /var/lib/unbound/2017-03-12.automated-ksk-test.research.icann.org.ds
; autotrust trust anchor file
;;id: 2017-03-12.automated-ksk-test.research.icann.org. 1
;;last_queried: 1493044058 ;;Mon Apr 24 14:27:38 2017
;;last_success: 1493044058 ;;Mon Apr 24 14:27:38 2017
;;next_probe_time: 1493047519 ;;Mon Apr 24 15:25:19 2017
;;query_failed: 0
;;query_interval: 3600
;;retry_time: 3600
2017-03-12.automated-ksk-test.research.icann.org.       60      IN      DNSKEY  257 3 8 AwEAAa9qsSLDI+H0keqE3Yzdr6XuhqhBQVWw5xdgNoWLhE4VxSEIBz9IuCA4w4ssSrClZ59seNc76ltDFcKJv3X9jDjzRtBLjenIgV4n/3GpKrAAnRlYbUtpBEdlk4mxoL3BlX8pfLg7RQfTlWaxOUga1+CChcVieFF/si/eePc9HpZbWxHZRLCAE8dlDa0aa0tfVAZWOnaifpmbTvhDK3tdvMU0tfG2YfsOYcFB9z2KWmCDYwCONNKtls3p6wMwolun1h8IYo0PF98vqjAp3NVRZvKKdgyF/bZ/iJtAZFytXvXU6Gwa5tOm1wgP6wuKupscP8KHBluZyOSKw4RMTk6YBdE= ;{id = 3934 (ksk), size = 2048b} ;;state=2 [  VALID  ] ;;count=0 ;;lastchange=1489997718 ;;Mon Mar 20 08:15:18 2017
2017-03-12.automated-ksk-test.research.icann.org.       60      IN      DNSKEY  257 3 8 AwEAAfUtjasCuLysD4MbjG3v4Kyu0vvVJ/0cIreP6fltMeZmwQ5SRta/mB+eFVjau+6YKra2UeTKxojBovHH2lZrw7NNejL44/Xps4gR3LSVMnCdwras+yvj4en64ghRGWYOuB+Icb0AqrCUhLFWR8yx41UkfaA2vzFnM2xTx0N0+o6R6UciWuwJResomQupOjNUy2ZAi81Y3pb0x3Lw4POjpcSJzrK4aZ/5UPymplqhLEU2DsoQmyFlM5RNTt0YXR8XM4Ywsu/scxg0u00IF1GC8xcyZUTMc1Rz98AY1VUo5QqUp9VbAed5Aw1nNYfjLTj+zOykedgmjms1iNgh9EY111c= ;{id = 19741 (ksk), size = 2048b} ;;state=2 [  VALID  ] ;;count=0 ;;lastchange=1492590342 ;;Wed Apr 19 08:25:42 2017

Similarly, for Unbound, we can see above that the state of key 19741 is now VALID.
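The same check can be sketched for Unbound's file. This assumes the trailing comment layout shown above (";{id = …} ;;state=N [ LABEL ]"); the function name anchor_states is hypothetical and the key material in the sample is deliberately cut short.

```python
import re

# Pulls the key id and the bracketed state label out of the trailing
# comment Unbound writes after each DNSKEY in its autotrust file.
ANCHOR = re.compile(r"\{id = (\d+) \(\w+\).*?;;state=(\d+) \[\s*(\w+)\s*\]")


def anchor_states(text):
    """Return {key_id: state_label} for every anchor found in text."""
    return {int(m.group(1)): m.group(3) for m in ANCHOR.finditer(text)}


# Sample with the base64 key data shortened for readability.
sample = (
    "zone. 60 IN DNSKEY 257 3 8 AwEAAa9q ;{id = 3934 (ksk), size = 2048b} "
    ";;state=2 [  VALID  ] ;;count=0 ;;lastchange=1489997718\n"
    "zone. 60 IN DNSKEY 257 3 8 AwEAAfUt ;{id = 19741 (ksk), size = 2048b} "
    ";;state=2 [  VALID  ] ;;count=0 ;;lastchange=1492590342\n"
)

for key_id, label in sorted(anchor_states(sample).items()):
    print(key_id, label)
```

A key that is still in its hold-down period would show up here with the label ADDPEND instead of VALID.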

A New Key…

Further to my post on ICANN’s automated KSK testlab, ICANN generated a new key on the 19th, and added it to the test zone that we’re using, and we can see it below:

$ dig +multiline @::1 2017-03-12.automated-ksk-test.research.icann.org dnskey

; <<>> DiG 9.9.5-9+deb8u6-Debian <<>> +multiline @::1 2017-03-12.automated-ksk-test.research.icann.org dnskey
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 36605
;; flags: qr rd ra ad; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;2017-03-12.automated-ksk-test.research.icann.org. IN DNSKEY

;; ANSWER SECTION:
2017-03-12.automated-ksk-test.research.icann.org. 60 IN	DNSKEY 257 3 8 (
				AwEAAa9qsSLDI+H0keqE3Yzdr6XuhqhBQVWw5xdgNoWL
				hE4VxSEIBz9IuCA4w4ssSrClZ59seNc76ltDFcKJv3X9
				jDjzRtBLjenIgV4n/3GpKrAAnRlYbUtpBEdlk4mxoL3B
				lX8pfLg7RQfTlWaxOUga1+CChcVieFF/si/eePc9HpZb
				WxHZRLCAE8dlDa0aa0tfVAZWOnaifpmbTvhDK3tdvMU0
				tfG2YfsOYcFB9z2KWmCDYwCONNKtls3p6wMwolun1h8I
				Yo0PF98vqjAp3NVRZvKKdgyF/bZ/iJtAZFytXvXU6Gwa
				5tOm1wgP6wuKupscP8KHBluZyOSKw4RMTk6YBdE=
				) ; KSK; alg = RSASHA256; key id = 3934
2017-03-12.automated-ksk-test.research.icann.org. 60 IN	DNSKEY 256 3 8 (
				AwEAAbqImB5UsfE5J/sx3L3uQxjSY5HIPjrlTFKA+cxE
				R8SmU1wWGo21nrNBm3pOIYoC3zhiaCq1Jo6XrTcg+In+
				62g7PeXBO+2QBoHzCBxqbFMPoGpHph7D/OebWOvw5Akz
				MFqus2/JxZtvJOgkBws1EbzOw/lKbJUZVStUiCOZ8wFP
				Xd3X7nQMjVTOu6Cb2uGAVrgBRsARo+2CdcXNEtzNTHU1
				c+VxH9G/t/2VCrueDmr/epUP1adkyNUmXoYaG3eMrdGr
				ml8Dr7OMrt40vlWFp6i3TxltDXG/navXdEmL/w6f+pA6
				Dt9KVw/iEUxB08+4VY6jMkxfWJAD6t5XwCVcKH8=
				) ; ZSK; alg = RSASHA256; key id = 19401
2017-03-12.automated-ksk-test.research.icann.org. 60 IN	DNSKEY 257 3 8 (
				AwEAAfUtjasCuLysD4MbjG3v4Kyu0vvVJ/0cIreP6flt
				MeZmwQ5SRta/mB+eFVjau+6YKra2UeTKxojBovHH2lZr
				w7NNejL44/Xps4gR3LSVMnCdwras+yvj4en64ghRGWYO
				uB+Icb0AqrCUhLFWR8yx41UkfaA2vzFnM2xTx0N0+o6R
				6UciWuwJResomQupOjNUy2ZAi81Y3pb0x3Lw4POjpcSJ
				zrK4aZ/5UPymplqhLEU2DsoQmyFlM5RNTt0YXR8XM4Yw
				su/scxg0u00IF1GC8xcyZUTMc1Rz98AY1VUo5QqUp9Vb
				Aed5Aw1nNYfjLTj+zOykedgmjms1iNgh9EY111c=
				) ; KSK; alg = RSASHA256; key id = 19741

;; Query time: 285 msec
;; SERVER: ::1#53(::1)
;; WHEN: Tue Mar 21 20:17:12 GMT 2017
;; MSG SIZE  rcvd: 905

Key 19741 is a new KSK in the zone.

If you look in managed-keys.bind (I’m running Debian, so that’s in /var/cache/bind/), you’ll see that the new key is now listed while BIND observes it. RFC5011 defines the period for which the resolver must observe the new key as either at least two times the TTL of the keyset containing the new key, or 30 days, whichever is the longer.

I’m cheating, slightly, and taking a look at managed-keys.bind from a different server, because my Debian box is running BIND 9.9.5, whereas I have access to a 9.11 box; you’ll see why below:

$ cat /var/named/managed-keys.bind
$ORIGIN .
$TTL 0	; 0 seconds
@			IN SOA	. . (
				284        ; serial
				0          ; refresh (0 seconds)
				0          ; retry (0 seconds)
				0          ; expire (0 seconds)
				0          ; minimum (0 seconds)
				)
			KEYDATA	20170322222551 20170210095625 19700101000000 257 3 8 (
				AwEAAagAIKlVZrpC6Ia7gEzahOR+9W29euxhJhVVLOyQ
				bSEW0O8gcCjFFVQUTf6v58fLjwBd0YI0EzrAcQqBGCzh
				/RStIoO8g0NfnfL2MTJRkxoXbfDaUeVPQuYEhg37NZWA
				JQ9VnMVDxP/VHL496M/QZxkjf5/Efucp2gaDX6RS6CXp
				oY68LsvPVjR0ZSwzz1apAzvN9dlzEheX7ICJBBtuA6G3
				LQpzW5hOA2hzCTMjJPJ8LbqF6dsV6DoBQzgul0sGIcGO
				Yl7OyQdXfZ57relSQageu+ipAdTTJ25AsRTAoub8ONGc
				LmqrAmRLKBP1dfwhYB4N7knNnulqQxA+Uk1ihz0=
				) ; KSK; alg = RSASHA256; key id = 19036
				; next refresh: Wed, 22 Mar 2017 22:25:51 GMT
				; trusted since: Fri, 10 Feb 2017 09:56:25 GMT
2017-03-12.automated-ksk-test.research.icann.org KEYDATA 20170321232551 20170317172529 19700101000000 257 3 8 (
				AwEAAa9qsSLDI+H0keqE3Yzdr6XuhqhBQVWw5xdgNoWL
				hE4VxSEIBz9IuCA4w4ssSrClZ59seNc76ltDFcKJv3X9
				jDjzRtBLjenIgV4n/3GpKrAAnRlYbUtpBEdlk4mxoL3B
				lX8pfLg7RQfTlWaxOUga1+CChcVieFF/si/eePc9HpZb
				WxHZRLCAE8dlDa0aa0tfVAZWOnaifpmbTvhDK3tdvMU0
				tfG2YfsOYcFB9z2KWmCDYwCONNKtls3p6wMwolun1h8I
				Yo0PF98vqjAp3NVRZvKKdgyF/bZ/iJtAZFytXvXU6Gwa
				5tOm1wgP6wuKupscP8KHBluZyOSKw4RMTk6YBdE=
				) ; KSK; alg = RSASHA256; key id = 3934
				; next refresh: Tue, 21 Mar 2017 23:25:51 GMT
				; trusted since: Fri, 17 Mar 2017 17:25:29 GMT
			KEYDATA	20170321232551 20170418002534 19700101000000 257 3 8 (
				AwEAAfUtjasCuLysD4MbjG3v4Kyu0vvVJ/0cIreP6flt
				MeZmwQ5SRta/mB+eFVjau+6YKra2UeTKxojBovHH2lZr
				w7NNejL44/Xps4gR3LSVMnCdwras+yvj4en64ghRGWYO
				uB+Icb0AqrCUhLFWR8yx41UkfaA2vzFnM2xTx0N0+o6R
				6UciWuwJResomQupOjNUy2ZAi81Y3pb0x3Lw4POjpcSJ
				zrK4aZ/5UPymplqhLEU2DsoQmyFlM5RNTt0YXR8XM4Yw
				su/scxg0u00IF1GC8xcyZUTMc1Rz98AY1VUo5QqUp9Vb
				Aed5Aw1nNYfjLTj+zOykedgmjms1iNgh9EY111c=
				) ; KSK; alg = RSASHA256; key id = 19741
				; next refresh: Tue, 21 Mar 2017 23:25:51 GMT
				; trust pending: Tue, 18 Apr 2017 00:25:34 GMT

On my 9.9.5 server, I don’t have the helpful comments. Here we can see that the root key (19036) and our original testlab key (3934) are trusted. We can also see that the server is still observing key 19741, because instead of trusted since it shows trust pending.
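The trust pending date fits the hold-down rule as I stated it earlier in this post. Here is a small sketch of the arithmetic, assuming the rule is "the longer of 30 days or twice the keyset TTL"; the function name trust_after is mine, and the first-seen timestamp is an assumption, inferred by working back 30 days from the trust pending date.

```python
from datetime import datetime, timedelta


def trust_after(first_seen, keyset_ttl_seconds):
    """Earliest time a newly seen key could become trusted."""
    hold_down = max(
        timedelta(days=30),
        timedelta(seconds=2 * keyset_ttl_seconds),
    )
    return first_seen + hold_down


# Assumed first sighting of key 19741; the DNSKEY TTL in the test zone is 60s,
# so the 30 day term is the one that applies.
first_seen = datetime(2017, 3, 19, 0, 25, 34)
print(trust_after(first_seen, 60))
```

With a 60 second TTL the doubling never matters; it only takes over for keysets with a TTL of more than 15 days.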

If you remember from the original post, whereas BIND keeps track in managed-keys.bind, Unbound tracks the metadata in the external file we specified with auto-trust-anchor-file:. The file has been updated in a similar way to BIND’s:

$ cat /var/lib/unbound/2017-03-12.automated-ksk-test.research.icann.org.ds
; autotrust trust anchor file
;;id: 2017-03-12.automated-ksk-test.research.icann.org. 1
;;last_queried: 1490135144 ;;Tue Mar 21 22:25:44 2017
;;last_success: 1490135144 ;;Tue Mar 21 22:25:44 2017
;;next_probe_time: 1490138421 ;;Tue Mar 21 23:20:21 2017
;;query_failed: 0
;;query_interval: 3600
;;retry_time: 3600
2017-03-12.automated-ksk-test.research.icann.org.	60	IN	DNSKEY	257 3 8
AwEAAa9qsSLDI+H0keqE3Yzdr6XuhqhBQVWw5xdgNoWLhE4VxSEIBz9IuCA4w4ssSrClZ59seNc76ltDFcKJv3X
9jDjzRtBLjenIgV4n/3GpKrAAnRlYbUtpBEdlk4mxoL3BlX8pfLg7RQfTlWaxOUga1+CChcVieFF/si/eePc9Hp
ZbWxHZRLCAE8dlDa0aa0tfVAZWOnaifpmbTvhDK3tdvMU0tfG2YfsOYcFB9z2KWmCDYwCONNKtls3p6wMwolun1
h8IYo0PF98vqjAp3NVRZvKKdgyF/bZ/iJtAZFytXvXU6Gwa5tOm1wgP6wuKupscP8KHBluZyOSKw4RMTk6YBdE=
;{id = 3934 (ksk), size = 2048b} ;;state=2 [  VALID  ] ;;count=0 ;;lastchange=1489997718 ;;Mon Mar 20 08:15:18 2017

2017-03-12.automated-ksk-test.research.icann.org.	60	IN	DNSKEY	257 3 8
AwEAAfUtjasCuLysD4MbjG3v4Kyu0vvVJ/0cIreP6fltMeZmwQ5SRta/mB+eFVjau+6YKra2UeTKxojBovHH2lZ
rw7NNejL44/Xps4gR3LSVMnCdwras+yvj4en64ghRGWYOuB+Icb0AqrCUhLFWR8yx41UkfaA2vzFnM2xTx0N0+o
6R6UciWuwJResomQupOjNUy2ZAi81Y3pb0x3Lw4POjpcSJzrK4aZ/5UPymplqhLEU2DsoQmyFlM5RNTt0YXR8XM
4Ywsu/scxg0u00IF1GC8xcyZUTMc1Rz98AY1VUo5QqUp9VbAed5Aw1nNYfjLTj+zOykedgmjms1iNgh9EY111c=
;{id = 19741 (ksk), size = 2048b} ;;state=1 [ ADDPEND ] ;;count=34 ;;lastchange=1489997718 ;;Mon Mar 20 08:15:18 2017

In the first DNSKEY record, we see the original key (3934) with a state of VALID, whereas in the second we see that the newly spotted key, 19741, is ADDPEND.

What’s next…?

Now we wait 30 days; as long as the key is observed throughout, it should become trusted at the end of that period…

Rolling, rolling, rolling…

Introduction

In October 2017, ICANN are going to roll the key signing key in the root of the DNS.

If you’re not technical and don’t know what I just said, this post isn’t for you.

If, however, you run a validating recursive resolver, read on…

In October (the 11th to be exact), the key will roll and you’ll need to have done one of two things…

  1. Update your root trust anchor manually
  2. Check your resolver is RFC5011 compliant.

But first, a little…

Background…

So you know how DNSSEC works…

…you sign a zone. More specifically, you generate two keys, a key to sign the zone (ZSK), and a key to sign the keys (KSK). The zone gets bigger because for each record set, a signature is generated and added (RRSIG records). The public part of the keyset is also added to the zone (DNSKEY records). Some form of proof of non-existence is added (NSEC or NSEC3).

Next, once the keys and signatures have made it to all of the nameservers for the zone, you generate a delegated signer record (DS) from the KSK, and you publish that in the parent. The parent then signs the DS record, and hey presto, your chain of trust is made.
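To make the relationship between a KSK and its DS record concrete, here is a sketch that derives the key tag (using the algorithm from RFC 4034, Appendix B) and a SHA-256 DS digest from a DNSKEY. The helper names are mine, and the key is the testlab KSK quoted elsewhere in this post, retyped by hand; if it has been transcribed correctly the output should agree with key id 3934 and the type 2 DS record shown for the test zone, but treat that as something to check rather than a promise.

```python
import base64
import hashlib
import struct


def name_to_wire(name):
    """Encode a domain name in lowercase DNS wire format."""
    wire = b""
    for label in name.lower().rstrip(".").split("."):
        if label:
            wire += bytes([len(label)]) + label.encode("ascii")
    return wire + b"\x00"


def dnskey_rdata(flags, protocol, algorithm, public_key):
    """Build DNSKEY RDATA: flags, protocol, algorithm, then the key bytes."""
    return struct.pack("!HBB", flags, protocol, algorithm) + public_key


def key_tag(rdata):
    """Key tag over the DNSKEY RDATA (RFC 4034, Appendix B)."""
    total = 0
    for i, byte in enumerate(rdata):
        total += byte if i & 1 else byte << 8
    total += (total >> 16) & 0xFFFF
    return total & 0xFFFF


def ds_sha256(owner, rdata):
    """DS digest type 2: SHA-256 over the owner name plus DNSKEY RDATA."""
    return hashlib.sha256(name_to_wire(owner) + rdata).hexdigest().upper()


OWNER = "2017-03-12.automated-ksk-test.research.icann.org."
KSK_B64 = (
    "AwEAAa9qsSLDI+H0keqE3Yzdr6XuhqhBQVWw5xdgNoWLhE4VxSEIBz9I"
    "uCA4w4ssSrClZ59seNc76ltDFcKJv3X9jDjzRtBLjenIgV4n/3GpKrAA"
    "nRlYbUtpBEdlk4mxoL3BlX8pfLg7RQfTlWaxOUga1+CChcVieFF/si/e"
    "ePc9HpZbWxHZRLCAE8dlDa0aa0tfVAZWOnaifpmbTvhDK3tdvMU0tfG2"
    "YfsOYcFB9z2KWmCDYwCONNKtls3p6wMwolun1h8IYo0PF98vqjAp3NVR"
    "ZvKKdgyF/bZ/iJtAZFytXvXU6Gwa5tOm1wgP6wuKupscP8KHBluZyOSK"
    "w4RMTk6YBdE="
)

rdata = dnskey_rdata(257, 3, 8, base64.b64decode(KSK_B64))
print("key tag:", key_tag(rdata))
print("DS (SHA-256):", ds_sha256(OWNER, rdata))
```

The parent publishes the key tag, algorithm, digest type and digest as the DS record, which is why a DS that doesn't match the KSK breaks validation.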

So, where’s the DS record for the root? There isn’t one, because the root has no parent to publish it in. To make this chain of trust work, resolvers that want to validate the DNSSEC chain of trust need a starting point in the root…

Your resolver has a trust anchor for the root. Depending on what you’re using for a resolver, this will either be the DS of the root KSK, or the public part of the KSK.

Your resolver will have this built in and, if configured correctly, will use an automatic mechanism to keep that key up to date and roll it when required.

RFC5011

RFC5011 defines how a resolver can automatically update a trust anchor for a zone.

So that you can check whether your resolver will follow this process, ICANN have an automated testbed for the KSK roll, which I encourage you to look at.

ICANN’s Automated Test

Each week, they create a new zone, and they sign it with a set of newly generated keys. Purposefully broken DS records are published in the parent zone, so that a normal validating resolver will SERVFAIL (because validation fails).

By adding a trust anchor to your resolver, the zone will validate.

If correctly configured, your resolver will now look for new key signing keys, and will observe them, and use them as per RFC5011.

So, let’s take a look at this. Before I add a trust anchor, I can check that the zone doesn’t validate:


$ dig @::1 2017-03-12.automated-ksk-test.research.icann.org soa

; <<>> DiG 9.9.5-9+deb8u6-Debian <<>> @::1 2017-03-12.automated-ksk-test.research.icann.org soa
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 39100
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;2017-03-12.automated-ksk-test.research.icann.org. IN SOA

;; Query time: 1908 msec
;; SERVER: ::1#53(::1)
;; WHEN: Mon Mar 20 13:22:57 GMT 2017
;; MSG SIZE  rcvd: 77

We can see in the header that we have a SERVFAIL response.

This server is running BIND. So, first we check that the server is configured to manage keys using RFC5011:

options {
    ...
    dnssec-validation auto;
    ...
};

If you’re just adding this, don’t forget to rndc reconfig

Trust Anchor

Now, we need to add a trust anchor:

BIND

managed-keys {
  2017-03-12.automated-ksk-test.research.icann.org initial-key 257 3 8
  "AwEAAa9qsSLDI+H0keqE3Yzdr6XuhqhBQVWw5xdgNoWLhE4VxSEIBz9I
  uCA4w4ssSrClZ59seNc76ltDFcKJv3X9jDjzRtBLjenIgV4n/3GpKrAA
  nRlYbUtpBEdlk4mxoL3BlX8pfLg7RQfTlWaxOUga1+CChcVieFF/si/e
  ePc9HpZbWxHZRLCAE8dlDa0aa0tfVAZWOnaifpmbTvhDK3tdvMU0tfG2
  YfsOYcFB9z2KWmCDYwCONNKtls3p6wMwolun1h8IYo0PF98vqjAp3NVR
  ZvKKdgyF/bZ/iJtAZFytXvXU6Gwa5tOm1wgP6wuKupscP8KHBluZyOSK
  w4RMTk6YBdE=";
};

This is added in your named.conf file.

Once again, don’t forget to rndc reconfig

Unbound

If you’re running Unbound, then you can add the DNSKEY or DS records to a file in a location that Unbound can read and write to (so, somewhere like /var/lib/unbound/) and then add an auto-trust-anchor-file line in the server: section of your unbound.conf file.

$ cat /var/lib/unbound/2017-03-12.automated-ksk-test.research.icann.org.ds
2017-03-12.automated-ksk-test.research.icann.org. IN DS 3934 8 1 47AA8AAF4D75B3D9C58448F241F793EBC4977821
2017-03-12.automated-ksk-test.research.icann.org. IN DS 3934 8 2 0D27F2E6EA9CA548F1896A71FB07CED86074D3462F2A720D6177F3C5CEC15F0D

Note: the file won’t look like this once you’ve told Unbound about it, as Unbound uses the file to store metadata related to the RFC5011 process.

server:
    ...
    auto-trust-anchor-file: "/var/lib/unbound/2017-03-12.automated-ksk-test.research.icann.org.ds"
    ...

After adding those, you’ll want to unbound-control reload to pick up the changes.

Testing

$ dig @::1 2017-03-12.automated-ksk-test.research.icann.org soa

; <<>> DiG 9.9.5-9+deb8u6-Debian <<>> @::1 2017-03-12.automated-ksk-test.research.icann.org soa
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 30413
;; flags: qr rd ra ad; QUERY: 1, ANSWER: 1, AUTHORITY: 2, ADDITIONAL: 3

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;2017-03-12.automated-ksk-test.research.icann.org. IN SOA

;; ANSWER SECTION:
2017-03-12.automated-ksk-test.research.icann.org. 60 IN	SOA ns1.research.icann.org. automated-ksk-test.research.icann.org. 1489968062 3600 600 86400 60

;; AUTHORITY SECTION:
2017-03-12.automated-ksk-test.research.icann.org. 60 IN	NS ns2.research.icann.org.
2017-03-12.automated-ksk-test.research.icann.org. 60 IN	NS ns1.research.icann.org.

;; ADDITIONAL SECTION:
ns1.research.icann.org.	3600	IN	A	192.0.34.56
ns2.research.icann.org.	3600	IN	A	192.0.45.56

;; Query time: 428 msec
;; SERVER: ::1#53(::1)
;; WHEN: Mon Mar 20 13:44:24 GMT 2017
;; MSG SIZE  rcvd: 181

This time, we can see in the header that we have a NOERROR response, and on the flags line below it, we can see that we have ad in the flags.
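If you want to repeat this check from a script, the two things to look for are the status and the ad flag. Below is a rough sketch that pulls both out of dig's text output; the function names are mine, and it assumes the header layout shown in the listings above. Bear in mind that the ad flag is only as trustworthy as the path to the resolver, which is why I'm querying ::1.

```python
import re


def dig_summary(output):
    """Return (status, [flags]) from the text output of dig."""
    status = re.search(r"status: (\w+)", output)
    flags = re.search(r"^;; flags: ([a-z ]*);", output, re.MULTILINE)
    return (
        status.group(1) if status else None,
        flags.group(1).split() if flags else [],
    )


def is_validated(output):
    """True when the answer is NOERROR and the resolver set the ad flag."""
    status, flags = dig_summary(output)
    return status == "NOERROR" and "ad" in flags


# Header lines copied from the two dig runs in this post.
before = (
    ";; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 39100\n"
    ";; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1\n"
)
after = (
    ";; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 30413\n"
    ";; flags: qr rd ra ad; QUERY: 1, ANSWER: 1, AUTHORITY: 2, ADDITIONAL: 3\n"
)

print(dig_summary(before), is_validated(before))
print(dig_summary(after), is_validated(after))
```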

What’s next…

Now, we wait. The next step is that ICANN’s automated test lab will generate and publish a new KSK into the zone on the 19th.

Anycasting DNS

Introduction…

I wanted to have a tinker with anycasting, and DNS seemed a sensible place to start, and easy to test and muck about with. So, I spun up a couple of DNS resolvers, and decided what my anycasted IP addresses would be. They need to be outside of the subnets I’m using on the rest of my network, as I want to route traffic to them. I’ve put the underlying machine’s unicast addresses in this subnet too, but you wouldn’t have to, depending on your set up.

Servers…

The nameservers are, essentially, identical to servers that’d deal with unicast traffic, except for the following changes. I’m using BIND, but it really doesn’t matter what you use.

We need to bind up the anycast addresses so that the O/S will deal with their traffic…

In my case, my anycasted addresses will be 10.1.53.1 and 10.1.53.2, and I’m using Debian, so my additions to /etc/network/interfaces are:

auto lo:1
iface lo:1 inet static
address 10.1.53.1
netmask 255.255.255.255

auto lo:2
iface lo:2 inet static
address 10.1.53.2
netmask 255.255.255.255

We need to stop the machine responding to ARP for these addresses on its physical interface. More precisely, we tell it to answer ARP requests only when the requested IP address is configured on the interface the request arrived on. Because we’ve bound the anycast addresses to the loopback, the machine won’t answer for them via eth0, for example. I added the following to /etc/sysctl.conf:

net.ipv4.conf.eth0.arp_ignore = 1
net.ipv4.conf.eth0.arp_announce = 2

BGP & Load Balancing…

Now we need to advertise the anycast addresses to our router. In this case, we’ll use BGP to do this, via ExaBGP. Grab that and install it on the server. My router is 10.1.53.254, and my two nameservers live in 10.1.53.0/24, so the config looks something like this:

neighbor 10.1.53.254 {
  router-id 10.1.53.11;
  local-address 10.1.53.11;
  local-as 64601;
  peer-as 64601;
  hold-time 10;

  process watch-nameserver {
    run /usr/local/bin/nameserver_watchdog;
  }

  static {
    route 10.1.53.1/32 next-hop 10.1.53.11 watchdog anycastdns withdraw;
    route 10.1.53.2/32 next-hop 10.1.53.11 watchdog anycastdns withdraw;
    route xxxx.xxxx.xxxx:53::1/128 next-hop xxxx.xxxx.xxxx:53::11 watchdog anycastdns withdraw;
    route xxxx.xxxx.xxxx:53::2/128 next-hop xxxx.xxxx.xxxx:53::11 watchdog anycastdns withdraw;
  }
}

The routes start out withdrawn, so that they are only announced once the watchdog’s checks succeed.

The router’s BGP config looks like this (it’s JunOS):

# show protocols bgp group dns-anycast
local-address 10.1.53.254;
hold-time 10;
family inet {
    unicast;
}
family inet6 {
    unicast;
}
peer-as 64601;
local-as 64601;
multipath;
neighbor 10.1.53.11;
neighbor 10.1.53.12;

I’m going to equally load balance between the two servers, but you could set a localpref on each server, for example, and have server1 handle .1 primarily with server2 taking over in the event of failure, and vice versa.

Don’t fall for JunOS’ misleading ‘per packet’ configuration item; this will, despite appearances, load balance per flow based on a hashing algorithm.

# show routing-options forwarding-table
export dns-anycast-loadbalance;

# show policy-options policy-statement dns-anycast-loadbalance
then {
    load-balance per-packet;
}
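To show what per-flow balancing means in practice, here is a toy sketch. It is not Junos’ hashing algorithm, just an illustration of the idea: hash the fields that identify a flow and use the result to pick a next hop, so every packet in the same flow lands on the same server. The client address 192.0.2.10 and the function name pick_next_hop are made up for the example.

```python
import hashlib


def pick_next_hop(flow, next_hops):
    """Pick a next hop from a hash of the flow's identifying fields."""
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    return next_hops[int.from_bytes(digest[:4], "big") % len(next_hops)]


servers = ["10.1.53.11", "10.1.53.12"]

# (source address, source port, destination address, destination port, protocol)
flow = ("192.0.2.10", 40000, "10.1.53.1", 53, "udp")

# The same flow always maps to the same server...
print(pick_next_hop(flow, servers), pick_next_hop(flow, servers))

# ...while different source ports spread across both.
used = {
    pick_next_hop(("192.0.2.10", port, "10.1.53.1", 53, "udp"), servers)
    for port in range(40000, 40100)
}
print(sorted(used))
```

This is why a single long-lived TCP session stays on one nameserver, while a busy client making many UDP queries from different ports gets spread across both.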

Monitoring and Health…

We’ve included a watchdog in the ExaBGP config. If the server fails entirely, then the BGP session will be torn down, and the traffic directed to the other host, with or without a watchdog. However, if only the nameserver daemon fails, then the BGP session will remain up, and traffic will be disrupted. Therefore, there’s a watchdog that’ll check that the nameserver daemon is listening, and will perform a lookup against it, announcing the anycast address(es) while it’s up, and withdrawing them in the event of failure. The watchdog looks like this:

#!/usr/bin/perl

use strict;

my $debug = 0;

unless($debug) {
	$SIG{'INT'} = sub {};
}
select STDOUT;
$| = 1;

use IO::Socket;
use Net::DNS;

my $state = 'init';

my $ip;
my $domain;
if(open(C,"/etc/nameserver_watchdog.conf")) {
	chomp(($ip, $domain) = split /:/, <C>);
	close C;
} else {
	$ip = '127.0.0.1';
	$domain = 'localdomain';
}
print "checking $ip for $domain\n" if $debug;

while(1) {
	eval {
		local $SIG{ALRM} = sub { die 'Timed Out'; };
		alarm 2;
		print "attempting connect... state is [$state]\n" if $debug;
		my $socket = IO::Socket::INET->new(Proto=>'tcp', PeerAddr=>$ip, PeerPort=>53, Timeout=>2);
		if($socket && $socket->connected() && do_lookup($ip, $domain)) {
			print "announce watchdog anycastdns\n" if $state ne 'up';
			$socket->close();
			alarm 0;
			$state = 'up';
			print "state set to up\n" if $debug;
		} else {
			print "withdraw watchdog anycastdns\n" if $state ne 'down';
			$state = 'down';
			print "state set to down\n" if $debug;
		}
	};
	if($@) {
		print "state is [$state]\n" if $debug;
		print "withdraw watchdog anycastdns\n" if $state ne 'down';
		$state = 'down';
		print "state set to down in barf\n" if $debug;
	}
	alarm 0;
	sleep 10;
}

sub do_lookup {
	my $ip = shift;
	my $domain = shift;
	my $r = Net::DNS::Resolver->new;
	$r->nameservers($ip);
	$r->tcp_timeout(5);
	$r->udp_timeout(5);
	# query() returns undef on failure, so check it before using the answer
	my $q = $r->query($domain,'SOA');
	if($q) {
		# pick out the SOA record; the first answer could be a CNAME
		my ($soa) = grep { $_->type eq 'SOA' } $q->answer;
		print "Answer: ".$soa->serial."\n" if $debug && $soa;
		if($debug > 1) {
			require Data::Dumper;
			print Data::Dumper::Dumper($q)."\n\n";
		}
		return 1 if $soa && $soa->serial =~ m/^\d+$/;
	}
	print "Error:\n" if $debug;
	print $r->errorstring if $debug;
	print "\n===\n" if $debug;
	return 0;
}

/etc/nameserver_watchdog.conf contains a single line of the format ip.ad.dr.ess:domain.com (only the first line is read).

It’ll announce the addresses when both a TCP connection to port 53 succeeds and a DNS lookup works; pick a domain that you’d expect the server to answer for, or be permitted to recurse for you. If the DNS daemon stops responding, the watchdog will withdraw the routes; if the server fails, the BGP session will fail, and the routes will be withdrawn anyway.
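The part of the watchdog that matters to ExaBGP is that it only speaks when the state changes. That logic can be pulled out on its own, which makes it easy to test without a nameserver. This is a sketch in Python rather than a replacement for the Perl above; the function name watchdog_step is mine, and the command strings are the ones the Perl script prints.

```python
def watchdog_step(state, healthy):
    """Return (new_state, command); command is None when nothing changes."""
    new_state = "up" if healthy else "down"
    if new_state == state:
        return new_state, None
    verb = "announce" if healthy else "withdraw"
    return new_state, f"{verb} watchdog anycastdns"


# Replay a sequence of health check results, starting from 'init'.
state = "init"
commands = []
for healthy in [False, True, True, False, False, True]:
    state, command = watchdog_step(state, healthy)
    commands.append(command)

for command in commands:
    print(command)
```

Starting from init means the first check always produces a command, so a server that comes up unhealthy explicitly withdraws rather than staying silent.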