Domain Troubleshooting

When Google Public DNS cannot resolve a domain, it is often due to a problem with that domain or its authoritative name servers. The following steps can help determine what causes the problem, so that domain administrators can resolve it themselves.

Before starting these steps, check the domain at dns.google as described on the general troubleshooting page, which may refer you to a particular diagnostic step below. Otherwise, try all the following steps until you find the cause.

Step 1: Check for DNSSEC validation problems

If dns.google web lookups for the domain show "Status": 2 (SERVFAIL) and queries without DNSSEC succeed, there may be a DNSSEC problem with the domain's name servers or its top-level domain (TLD) registry (which publishes DS records for DNSSEC validation of registered domains).
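The same comparison can be sketched from a command line. The example below assumes dig is installed and queries 8.8.8.8 (a Google Public DNS address); the function names dig_status and dnssec_verdict are illustrative only. The +cd flag sets the Checking Disabled bit, which asks the resolver to skip DNSSEC validation.

```shell
# Extract the response status (NOERROR, SERVFAIL, ...) from dig output on stdin.
dig_status() {
  sed -n 's/.*status: \([A-Z]*\),.*/\1/p' | head -n 1
}

# Compare the status of a validating query ($1) with the status of the same
# query sent with checking disabled ($2).
dnssec_verdict() {
  if [ -z "$1" ] || [ -z "$2" ]; then
    echo "inconclusive"
  elif [ "$1" = "SERVFAIL" ] && [ "$2" = "NOERROR" ]; then
    echo "likely DNSSEC validation failure"
  elif [ "$1" = "$2" ]; then
    echo "no DNSSEC-specific difference"
  else
    echo "inconclusive"
  fi
}

# Usage (performs network queries; replace example.com with your domain):
#   validating=$(dig @8.8.8.8 example.com A | dig_status)
#   unchecked=$(dig +cd @8.8.8.8 example.com A | dig_status)
#   dnssec_verdict "$validating" "$unchecked"
```

A "likely DNSSEC validation failure" result is only a hint; the tools in step 2 give the actual cause.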

Changes in registrar or DNS service

DNSSEC problems can occur after a domain switches from a registrar or DNS service that supports DNSSEC to one that doesn't. If the previous service leaves stale DS records in the TLD registry, and the new service does not create new DNSKEY records with matching DS records in the TLD registry, validating resolvers like Google Public DNS cannot resolve the domain.

If this happens, ask your domain registrar to remove stale DS records from the TLD registry.
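Before contacting the registrar, you can look for stale DS records yourself. The sketch below is one possible approach, not an official procedure: it compares the key tags in the DS records with the "key id" comments that dig prints for DNSKEY records when +multi is used. The function name unmatched_ds_tags is hypothetical, and matching on key tags alone is a simplification (it does not verify the digests).

```shell
# Print DS key tags that have no DNSKEY with the same key id.
# $1: output of "dig +short DOMAIN DS" (key tag is the first field)
# $2: output of "dig +multi +noall +answer DOMAIN DNSKEY"
unmatched_ds_tags() {
  key_ids=$(printf '%s\n' "$2" | sed -n 's/.*key id = \([0-9][0-9]*\).*/\1/p')
  printf '%s\n' "$1" | awk 'NF { print $1 }' | sort -u | while read -r tag; do
    printf '%s\n' "$key_ids" | grep -qx "$tag" || echo "$tag"
  done
}

# Usage (performs network queries; replace example.com with your domain).
# +cd lets the DNSKEY query through even when validation is failing:
#   ds=$(dig +short example.com DS)
#   keys=$(dig +cd +multi +noall +answer example.com DNSKEY)
#   unmatched_ds_tags "$ds" "$keys"
```

Any tag printed belongs to a DS record with no corresponding DNSKEY in the zone, which is consistent with the stale-DS situation described above.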

DNSSEC responses are too large

Another cause of DNSSEC problems can be DNSSEC responses that are too large to fit in one IP packet, creating fragmented responses that may be dropped. If DNSViz shows "no response received until UDP payload size was decreased" errors, DNSSEC failures may be caused by very large responses. Response sizes can be reduced by one or more of the following actions:

  • Configure "minimal responses" for authoritative name servers
  • Reduce the number of active DNSKEY records to two or three
  • Use 1280 or 2048 bit DNSKEY records (RFC 6781, StackExchange)
  • Switch from RSA signatures to smaller ECDSA signatures (RFC 8624)

Also check for any other DNSSEC problems reported by the tools in step 2. Examples include bad NSEC or NSEC3 denial-of-existence records proving there are no subdomains (PowerDNS with zones stored in external databases may have these) and expired RRSIG signatures (which can result from a broken, manually configured signing process).

Step 2: Check the authoritative name servers

Archived DNSViz page

If Google Public DNS (or any open resolver) has a problem resolving a domain, DNSViz shows the domain and name server issues that cause it. Go to the DNSViz web page and enter the problematic domain name. If DNSViz has no historical data, or only has data that is more than a day old (such as shown on the page here), click the large Analyze button to reveal a smaller Analyze button below (if it's not already visible) and click that too. When the analysis completes, click "Continue" to show results. Click the red errors and yellow warnings on the left sidebar to reveal details, or hold the pointer over objects in the diagram to pop up that information in context.

If earlier diagnostics indicated possible DNSSEC problems with the domain, go to the DNSSEC Analyzer web page and enter the domain name. If this analyzer reports DNSSEC errors or warnings, hold the pointer over the red or yellow ⚠︎ icons for suggestions on how to fix them.

The intoDNS web page reports on non-DNSSEC problems with the domain entered on the main page and also shows suggestions for fixing them.

Domain administrators should fix most of the errors these tools report, since they can cause problems not just for Google Public DNS but also other resolvers.

Step 3: Check for delegation problems

Google Public DNS is a "parent-centric" resolver, which only uses the name servers returned in referrals from the parent domain. If the name server names and glue addresses in the TLD are stale or incorrect, this can cause delegation problems.

If either DNSViz or intoDNS reports warnings about inconsistencies between the name servers delegated in the TLD and those present in the child domain itself, those may need to be addressed before Google Public DNS can resolve the domain. If these tools report that the registered domain does not exist (NXDOMAIN), check that the domain is not expired or on registration hold for any reason.

Delegation problems can also be caused by a failure to resolve the names of the name servers for a domain. Check the A and AAAA records for the name servers on dns.google to see if there are problems with the name servers' domains.
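A parent/child comparison can also be sketched by hand. The example below assumes a domain under .com and uses a.gtld-servers.net as the TLD server; substitute a server for your own TLD. The function names normalize_ns and compare_ns are illustrative only.

```shell
# Normalize a list of name server names: lowercase, drop trailing dots, sort.
normalize_ns() {
  tr 'A-Z' 'a-z' | sed 's/\.$//' | awk 'NF' | sort -u
}

# Report names that appear in only one of the two NS lists.
# $1: NS names from the parent (TLD) referral
# $2: NS names from the child zone itself
compare_ns() {
  parent=$(printf '%s\n' "$1" | normalize_ns)
  child=$(printf '%s\n' "$2" | normalize_ns)
  if [ "$parent" = "$child" ]; then
    echo "consistent"
    return
  fi
  for ns in $parent; do
    printf '%s\n' "$child" | grep -qxF "$ns" || echo "only in parent: $ns"
  done
  for ns in $child; do
    printf '%s\n' "$parent" | grep -qxF "$ns" || echo "only in child: $ns"
  done
}

# Usage (performs network queries; a referral lists NS in the authority section):
#   parent=$(dig +norecurse +noall +authority @a.gtld-servers.net example.com NS | awk '{ print $5 }')
#   child=$(dig +short @ns1.example.com example.com NS)
#   compare_ns "$parent" "$child"
```

Because Google Public DNS follows the parent's referral, names reported as "only in parent" deserve attention first.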

Step 4: Check for large responses

DNS relies upon UDP to carry the majority of its traffic. Large UDP datagrams are subject to fragmentation, and fragmented UDP suffers from unreliable delivery. This was the focus of DNS Flag Day 2020, an effort to improve reliability of DNS globally. Google Public DNS has participated in this effort, and limits the size of responses it will accept over UDP. Try queries like those below, from your own command prompt or a Google Cloud Shell:

$ dig +short example.com NS
ns1.example.com
ns2.example.com
$ dig +dnssec +nocrypto +bufsize=1400 +timeout=1 @ns1.example.com example.com A
...
$ dig +dnssec +nocrypto +bufsize=1400 +timeout=1 @ns1.example.com example.com TXT
...
$ dig +dnssec +nocrypto +bufsize=1400 +timeout=1 @ns1.example.com example.com DNSKEY
...
$ dig +dnssec +nocrypto +bufsize=1400 +timeout=1 @ns1.example.com com DNSKEY
...

These queries for various record types specify:

+dnssec
Enable DNSSEC, especially returning the required records for DNSSEC validation when they are available. These can expand the size of the result significantly. This emulates Google Public DNS's behavior.
+bufsize=1400
Limit the allowed UDP buffer size. This emulates Google Public DNS's behavior, as of the DNS Flag Day 2020 effort.
+timeout=1
Set the timeout to one second. This emulates Google Public DNS's behavior.
@ns1.example.com
Which authoritative server to query -- keep the @ sign but otherwise replace with your own domain's authoritative server, as shown by the first command.

Observe the output; do you see a line like:

;; Truncated, retrying in TCP mode.
This indicates that the response was larger than the requested UDP buffer size, so it was truncated and in response the client switched to TCP. Your authoritative servers should be capable of handling DNS traffic on TCP port 53. (See RFC 7766 which requires that "implementations MUST support both UDP and TCP transport".)
;; MSG SIZE rcvd: 2198
For any number above 1400? This again indicates a large response.
;; Query time: 727 msec
For any number above 500? Slow responses (especially those near or above 1 second) may be discarded by Google Public DNS. This is especially likely if some time was spent on a UDP attempt which was then followed by a TCP attempt. Geographical location of server and client can greatly affect latency.
;; connection timed out; no servers could be reached
Especially when it occurs for only some queries, this indicates that your server cannot answer DNS queries in a timely fashion.
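The four symptoms above can be checked mechanically. The following is a minimal sketch that scans dig output for them, using the 1400-byte and 500-millisecond thresholds from this step; the function name check_dig_output is hypothetical.

```shell
# Read dig output on stdin and print one line per symptom found.
check_dig_output() {
  awk '
    /Truncated, retrying in TCP mode/ { print "truncated: retried over TCP" }
    /MSG SIZE +rcvd:/ { if ($NF + 0 > 1400) print "large: " $NF " bytes" }
    /Query time:/ { if ($(NF-1) + 0 > 500) print "slow: " $(NF-1) " msec" }
    /connection timed out/ { print "timeout: no answer" }
  '
}

# Usage (performs a network query; replace the server and domain with your own):
#   dig +dnssec +nocrypto +bufsize=1400 +timeout=1 @ns1.example.com example.com DNSKEY | check_dig_output
```

No output means none of the four symptoms appeared in that response.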

You can try the following query variations:

Adding a +tcp parameter.
This forces dig to use TCP immediately, so you can check directly whether your authoritative server handles TCP queries.
Removing the +bufsize=1400 parameter.
This restores dig's default behavior (a bufsize of 4096). If your queries fail with +bufsize=1400 but work without it, this is a hint that your server does not handle TCP failover well. Relying on UDP to carry large responses only works sometimes. The best course of action is to support TCP transport for DNS.
Repeating at each name server.
The example above has two authoritative name servers (ns1 and ns2). Some problems are caused by different servers returning different answers. Check that they all answer consistently by repeating the same queries at all authoritative servers.
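Repeating the queries by hand at every server is tedious, so a small loop can help. This sketch reuses the dig options shown earlier in this step; the function name query_all_servers and the choice of record types are assumptions made for illustration.

```shell
# Run the same queries against every authoritative server of a domain.
# $1: domain name. Prints a header line before the output of each query.
query_all_servers() {
  for ns in $(dig +short "$1" NS); do
    for type in A TXT DNSKEY; do
      echo "== $ns $type =="
      dig +dnssec +nocrypto +bufsize=1400 +timeout=1 "@$ns" "$1" "$type"
    done
  done
}

# Usage (performs network queries; replace example.com with your domain):
#   query_all_servers example.com
```

Compare the answer sections and the reported sizes and times between servers; they should agree.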

If all queries' responses were small (1400 bytes or fewer), fast (preferably 500 milliseconds or faster), and reliable (work consistently over TCP and UDP), then response size is not your concern; read the other troubleshooting sections. Even if your responses are fast, queries from geographically far away might be slower.

If any of these checks failed (large? slow? unreliable?), the primary course of action is to make sure that your server A) responds with UDP truncation when its response exceeds the requested UDP buffer size and B) can handle the TCP query retry that follows. Tools such as DNSViz (see step 2) can help you diagnose DNS reliability issues.

If any errors or warnings are revealed by these tools, be sure to address them. Also be sure to read all the other troubleshooting instructions on this site.

Step 5: Check whether other public resolvers resolve the domain

If you did not find any cause of the problem after following the steps above, run the following commands at a command prompt, replacing example.test. with the domain in question (and preserving the trailing dots):

Windows

nslookup example.test. resolver1.opendns.com.
nslookup example.test. dns.quad9.net.
nslookup example.test. one.one.one.one.

macOS or Linux

dig example.test. '@resolver1.opendns.com.'
dig example.test. '@dns.quad9.net.'
dig example.test. '@one.one.one.one.'

These commands use the DNS resolvers of OpenDNS, Quad9, and Cloudflare 1.1.1.1. If you get resolution failures from at least two of these as well as Google Public DNS, the problem is likely with the domain or its name servers.

If you get a successful result from more than one other public resolver, there may be a problem with Google Public DNS. If no similar problems have been reported for the domain (or its TLD) on the issue tracker, you should report the issue to us, including command output and diagnostic page text or screenshots in your report.
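The decision rule in this step can be written down as a short function. This is only a sketch of the reasoning above: the function name resolver_verdict is hypothetical, and you decide for each resolver whether its answer counts as "ok" or "fail".

```shell
# Decide where the problem likely lies from per-resolver results.
# $1: "ok" or "fail" for Google Public DNS
# $2...: "ok" or "fail" for each other public resolver
resolver_verdict() {
  google=$1
  shift
  ok=0
  fail=0
  for r in "$@"; do
    if [ "$r" = "ok" ]; then ok=$((ok + 1)); else fail=$((fail + 1)); fi
  done
  if [ "$google" = "fail" ] && [ "$fail" -ge 2 ]; then
    echo "likely a problem with the domain or its name servers"
  elif [ "$google" = "fail" ] && [ "$ok" -ge 2 ]; then
    echo "possibly a problem with Google Public DNS"
  else
    echo "inconclusive"
  fi
}

# Usage, with results for Google Public DNS, OpenDNS, Quad9, and Cloudflare:
#   resolver_verdict fail ok ok fail
```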