DNS in a Nutshell

The DNS (Domain Name System) translates a domain name (e.g. www.grokit.ca) into an IP address (e.g. 74.125.28.121). This is necessary because data is routed on the Internet using IP addresses, not domain names.

Even though you type a human-readable domain name in your browser, that name cannot be used directly to route data on the Internet; IP addresses serve that purpose. Domain names exist because humans remember names far more easily than long series of numbers. So when you type a domain name in a browser, the Domain Name System resolves it into an IP address, which is then used to communicate on the Internet.
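As a minimal sketch of that translation step, the Python snippet below asks the operating system's resolver for the IPv4 addresses of a name, using the standard library's socket.getaddrinfo. The function name resolve_ipv4 is my own choice for illustration, and the addresses returned for a real domain depend on when and where you run it.

```python
import socket


def resolve_ipv4(name):
    """Return the IPv4 addresses the system resolver reports for `name`."""
    infos = socket.getaddrinfo(
        name, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, (address, port)).
    return sorted({info[4][0] for info in infos})


# Example (needs network access): resolve_ipv4("www.grokit.ca")
# An IP literal needs no DNS lookup and comes back unchanged.
print(resolve_ipv4("127.0.0.1"))
```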

DNS in Detail

A DNS request is typically sent using UDP (User Datagram Protocol) on port 53.

In order to observe the mechanics of DNS, let's capture network traffic and see what happens when a program needs to resolve a domain name. To do that, I'll start a Wireshark capture and issue a ping to www.grokit.ca (ping uses the ICMP protocol, which will be covered in a separate article. The thing to remember here is that ping needs to know the IP address associated with the domain name, so it triggers a DNS lookup). Here it is:

43  1.161347000 192.168.1.8 192.168.1.1 DNS 73  Standard query 0x338a  A www.grokit.ca
52  1.259355000 192.168.1.1 192.168.1.8 DNS 123 Standard query response 0x338a  CNAME ghs.googlehosted.com A 74.125.28.121

You probably noticed that the query goes to address 192.168.1.1, which is my local router. DNS is a hierarchical system with caches at several levels: the router, the ISP's resolver, and so on, all the way up to the root servers. This way, if many people connected to my router issue a DNS request for the same name, the request only has to travel up the hierarchy as far as the nearest cache that holds the answer. Since DNS entries do not change very quickly, caching works remarkably well.
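To make the caching idea concrete, here is a small sketch of two chained caches, a "router" in front of an "ISP" in front of a stand-in authoritative source. Everything in it (the CachingResolver class, the authoritative function, the 214-second TTL) is hypothetical and only illustrates the principle; real resolvers are considerably more involved.

```python
import time


class CachingResolver:
    """Answers from a local cache and asks `upstream` only on a miss."""

    def __init__(self, upstream, clock=time.monotonic):
        self.upstream = upstream  # callable: name -> (ip, ttl_seconds)
        self.clock = clock
        self.cache = {}  # name -> (ip, expiry time)
        self.misses = 0

    def lookup(self, name):
        now = self.clock()
        cached = self.cache.get(name)
        if cached is not None and cached[1] > now:
            ip, expires = cached
            return ip, expires - now  # hand the remaining TTL downstream
        self.misses += 1
        ip, ttl = self.upstream(name)
        self.cache[name] = (ip, now + ttl)
        return ip, ttl


def authoritative(name):
    # Stand-in for the top of the hierarchy; the record is illustrative.
    return {"www.grokit.ca": ("74.125.28.121", 214)}[name]


now = [0.0]  # fake clock so the example is deterministic
isp = CachingResolver(authoritative, clock=lambda: now[0])
router = CachingResolver(isp.lookup, clock=lambda: now[0])

router.lookup("www.grokit.ca")  # miss at the router and at the ISP
router.lookup("www.grokit.ca")  # answered from the router's cache
print(router.misses, isp.misses)  # 1 1
```

The second lookup never leaves the router, which is the whole point: the ISP and everything above it see one query instead of two.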

If you are curious to see all the bytes, here is the request:

43  1.161347000 192.168.1.8 192.168.1.1 DNS 73  Standard query 0x338a  A www.grokit.ca
==>
0000   00 26 62 ae 71 b4 a0 88 b4 e4 31 64 08 00 45 00  .&b.q.....1d..E.
0010   00 3b 10 a9 00 00 80 11 a6 af c0 a8 01 08 c0 a8  .;..............
0020   01 01 c1 de 00 35 00 27 4d 15 33 8a 01 00 00 01  .....5.'M.3.....
0030   00 00 00 00 00 00 03 77 77 77 06 67 72 6f 6b 69  .......www.groki
0040   74 02 63 61 00 00 01 00 01                       t.ca.....

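Everything before "33 8a ..." is just the standard Ethernet, IP and UDP headers of a message from 192.168.1.8 (my computer) to 192.168.1.1 (the local DNS cache, which is my router). The DNS query itself starts at "33 8a", which is the query ID (0x338a) shown in the capture summary.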
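The DNS part of that request is simple enough to rebuild by hand. The sketch below packs a 12-byte header (ID, flags with "recursion desired" set, one question) followed by the name as length-prefixed labels, then the type (A) and class (IN). The build_query helper is my own illustration, not a library function; its output is compared against the bytes from the capture above.

```python
import struct


def build_query(name, query_id, qtype=1):
    """Build a DNS query message for `name` (qtype 1 = A record)."""
    # Header: ID, flags (0x0100 = recursion desired), QDCOUNT=1, other counts 0.
    header = struct.pack("!6H", query_id, 0x0100, 1, 0, 0, 0)
    # Name: each label is prefixed by its length, ending with a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # Question trailer: QTYPE, then QCLASS 1 (IN).
    return header + qname + struct.pack("!HH", qtype, 1)


# DNS payload of the captured request, starting at "33 8a".
CAPTURED = bytes.fromhex(
    "33 8a 01 00 00 01 00 00 00 00 00 00 03 77 77 77"
    "06 67 72 6f 6b 69 74 02 63 61 00 00 01 00 01"
)

print(build_query("www.grokit.ca", 0x338A) == CAPTURED)  # True
```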

Response (DNS payload only, without the Ethernet, IP and UDP headers):

00000000  33 8a 81 80 00 01 00 02  00 00 00 00 03 77 77 77 3....... .....www
00000010  06 67 72 6f 6b 69 74 02  63 61 00 00 01 00 01 c0 .grokit. ca......
00000020  0c 00 05 00 01 00 00 07  08 00 16 03 67 68 73 0c ........ ....ghs.
00000030  67 6f 6f 67 6c 65 68 6f  73 74 65 64 03 63 6f 6d googleho sted.com
00000040  00 c0 2b 00 01 00 01 00  00 00 44 00 04 4a 7d 1c ..+..... ..D..J}.
00000050  79

Look at the last four hexadecimal values: 4a 7d 1c 79. Translated to decimal (using the Python shell), this is:

>>> 0x4a, 0x7d, 0x1c, 0x79
(74, 125, 28, 121)

That is 74.125.28.121, the IP address corresponding to www.grokit.ca.
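For the curious, the whole response can be decoded the same way. The sketch below is a deliberately small parser that only understands what appears in this capture: the header, the question, and A and CNAME answers, including the "c0 xx" compression pointers that refer back to a name earlier in the message. The helper names read_name and parse_response are mine, and a real parser would need bounds checks and protection against pointer loops.

```python
import socket
import struct

RESPONSE = bytes.fromhex(
    "33 8a 81 80 00 01 00 02 00 00 00 00 03 77 77 77"
    "06 67 72 6f 6b 69 74 02 63 61 00 00 01 00 01 c0"
    "0c 00 05 00 01 00 00 07 08 00 16 03 67 68 73 0c"
    "67 6f 6f 67 6c 65 68 6f 73 74 65 64 03 63 6f 6d"
    "00 c0 2b 00 01 00 01 00 00 00 44 00 04 4a 7d 1c"
    "79"
)


def read_name(msg, offset):
    """Decode a (possibly compressed) name; return (name, offset after it)."""
    labels, end = [], None
    while True:
        length = msg[offset]
        if length & 0xC0 == 0xC0:  # pointer: low 14 bits are an offset
            if end is None:
                end = offset + 2  # parsing resumes after the first pointer
            offset = struct.unpack_from("!H", msg, offset)[0] & 0x3FFF
        elif length == 0:  # root label terminates the name
            offset += 1
            break
        else:
            labels.append(msg[offset + 1:offset + 1 + length].decode("ascii"))
            offset += 1 + length
    return ".".join(labels), (end if end is not None else offset)


def parse_response(msg):
    """Return (query id, [(name, type, ttl, value), ...]) for A/CNAME answers."""
    ident, _flags, qdcount, ancount, _ns, _ar = struct.unpack_from("!6H", msg, 0)
    offset = 12
    for _ in range(qdcount):  # skip the echoed question(s)
        _, offset = read_name(msg, offset)
        offset += 4  # QTYPE + QCLASS
    answers = []
    for _ in range(ancount):
        name, offset = read_name(msg, offset)
        rtype, _rclass, ttl, rdlength = struct.unpack_from("!HHIH", msg, offset)
        offset += 10
        rdata = msg[offset:offset + rdlength]
        if rtype == 1:  # A: four raw address bytes
            value = socket.inet_ntoa(rdata)
        elif rtype == 5:  # CNAME: another encoded name
            value, _ = read_name(msg, offset)
        else:
            value = rdata.hex()
        answers.append((name, rtype, ttl, value))
        offset += rdlength
    return ident, answers


for record in parse_response(RESPONSE)[1]:
    print(record)
# ('www.grokit.ca', 5, 1800, 'ghs.googlehosted.com')
# ('ghs.googlehosted.com', 1, 68, '74.125.28.121')
```

This matches the Wireshark summary line: a CNAME to ghs.googlehosted.com, followed by the A record for that name.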

You could dig into the DNS RFC for the details, but for the scope of this article I think it's sufficient to show that you needed to know the IP address for www.grokit.ca, and you got 74.125.28.121. Nice, now you can issue an HTTP request and communicate with the website.

DNS and Virtual Hosting

The astute observer may have noticed that visiting http://74.125.28.121 brings you to a generic Google page, which is different from what you get if you point your browser to http://www.grokit.ca. Is DNS lying? No, there is one small detail remaining to complete our understanding of DNS in the context of visiting a website.

Although in some cases the IP address belongs directly to the computer that serves the website, in other cases it points to a server that hosts many websites. For example, my website is hosted on Google App Engine. Like most large web services, it uses Virtual Hosting. In simple terms, this means that the same infrastructure / VIP / computer / IP address serves many websites, which makes a lot of sense if you are a large company with beefy computers and a limited set of IP addresses.

In order to disambiguate which website that beefy computer should serve, it simply leverages the HTTP Host header. The browser includes the Host header in every HTTP request it sends to the server (74.125.28.121). A tool like mitmproxy (use Fiddler if on Windows) lets you capture the HTTP request headers:

Host:             www.grokit.ca
User-Agent:       Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:42.0) Gecko/20100101 Firefox/42.0
Accept:           text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language:  en-US,en;q=0.5
Accept-Encoding:  gzip, deflate
Cookie:           __utma=137660006.823405121.1434689563.1434694212.1440612362.3;
                  __utmz=137660006.1234689563.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none)
Connection:       keep-alive

So that machine sees www.grokit.ca, and probably routes the traffic internally to whichever VM happens to currently be running my service. Just for fun, trigger an HTTP GET to ghs.googlehosted.com with and without the proper Host header, and you will see that without the header it replies 404 Not Found.

We can use curl to issue a simple GET without setting the Host header ourselves (curl then sends Host: ghs.googlehosted.com, taken from the URL):

$ curl ghs.googlehosted.com

<!DOCTYPE html>
<html lang=en>
  <meta charset=utf-8>
  <meta name=viewport content="initial-scale=1, minimum-scale=1, width=device-width">
  <title>Error 404 (Not Found)!!1</title>
  [...]

... and with the Host header set to my domain:

$ curl -H 'Host: www.grokit.ca' ghs.googlehosted.com

<!DOCTYPE html>
<html lang="en">
<head>
    <link rel="stylesheet" href="./default.css" type="text/css">
    <title>List of All Content</title>
    [...]

Success, it returns my website!
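What the server does with that header can be sketched in a few lines. The snippet below is only a guess at the general idea, not how Google's front end actually works: the SITES table, host_header and serve are hypothetical names, and the parsing ignores details such as a port number in the Host value.

```python
# Hypothetical table of the sites hosted behind one IP address.
SITES = {
    "www.grokit.ca": "<title>List of All Content</title>",
}


def host_header(raw_request):
    """Extract the Host header value from a raw HTTP/1.1 request."""
    head = raw_request.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    for line in head.split("\r\n")[1:]:  # skip the request line
        name, _, value = line.partition(":")
        if name.strip().lower() == "host":
            return value.strip().lower()
    return None


def serve(raw_request, sites=SITES):
    """Pick the site named by the Host header, or answer 404."""
    body = sites.get(host_header(raw_request))
    if body is None:
        return "404 Not Found", ""
    return "200 OK", body


print(serve(b"GET / HTTP/1.1\r\nHost: ghs.googlehosted.com\r\n\r\n")[0])
print(serve(b"GET / HTTP/1.1\r\nHost: www.grokit.ca\r\n\r\n")[0])
```

The first request names a host the table does not know and gets a 404; the second names www.grokit.ca and gets the page, mirroring the two curl calls.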

DNS Tools

Dig

The tool dig lets you get information about DNS servers. Run without arguments, it queries for the root name servers:

$ dig

[...]
ADDITIONAL SECTION:
a.root-servers.net.     493617  IN      A       198.41.0.4
a.root-servers.net.     493633  IN      AAAA    2001:503:ba3e::2:30
b.root-servers.net.     493804  IN      A       192.228.79.201
b.root-servers.net.     544839  IN      AAAA    2001:500:84::b
[...]

It also resolves a domain name to an IP address:

$ dig www.grokit.ca

; <<>> DiG 9.9.6 <<>> www.grokit.ca
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 34283
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4000
;; QUESTION SECTION:
;www.grokit.ca.                 IN      A

;; ANSWER SECTION:
www.grokit.ca.          865     IN      CNAME   ghs.googlehosted.com.
ghs.googlehosted.com.   214     IN      A       74.125.28.121

;; Query time: 55 msec
;; SERVER: 10.221.228.12#53(10.221.228.12)
;; WHEN: Fri Dec 11 17:53:03 PST 2015
;; MSG SIZE  rcvd: 92

You can also use dig for a reverse lookup (IP -> domain name):

$ dig -x 74.125.28.121 +short
pc-in-f121.1e100.net.

NSLookup

nslookup also allows domain name -> IP and IP -> domain name translation. For domain name -> IP:

$ nslookup -query=any -debug www.yahoo.com
Server:         127.0.1.1
Address:        127.0.1.1#53

------------
    QUESTIONS:
    www.yahoo.com, type = ANY, class = IN
    ANSWERS:
    ->  www.yahoo.com
    canonical name = fd-fp3.wg1.b.yahoo.com.
    ttl = 293
    AUTHORITY RECORDS:
    ADDITIONAL RECORDS:
------------
Non-authoritative answer:
www.yahoo.com   canonical name = fd-fp3.wg1.b.yahoo.com.

A fun thing to do with nslookup is to point to a specific DNS server: nslookup www.domainname.com dns_server.

IP -> domain name:

$ nslookup 207.241.224.2
Server:         127.0.1.1
Address:        127.0.1.1#53

Non-authoritative answer:
2.224.241.207.in-addr.arpa      name = www.archive.org.
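The odd-looking name in that answer is how reverse lookups work: the octets of the address are reversed and placed under the special in-addr.arpa domain, and that name is then queried like any other. A short sketch using Python's standard ipaddress module shows the construction (the reverse_name wrapper is just for illustration):

```python
import ipaddress


def reverse_name(ip):
    """Return the domain name queried for a reverse lookup of `ip`."""
    return ipaddress.ip_address(ip).reverse_pointer


print(reverse_name("207.241.224.2"))  # 2.224.241.207.in-addr.arpa
print(reverse_name("74.125.28.121"))  # 121.28.125.74.in-addr.arpa
```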

What if Someone Tries to DDoS the DNS?

If there are only 13 root servers, it must be trivial to DDoS the DNS service, right? Actually, no. Because:

Things not Covered

Links