Googlebot and the 15 MB thing

Tuesday, June 28, 2022

Over the last few days we've received a great many questions about a recent update to our documentation about Googlebot. Namely, we've documented that Googlebot only ever "sees" the first 15 megabytes (MB) when fetching certain file types. This threshold is not new; it's been around for many years. We just added it to our documentation because it might be helpful for some folks when debugging, and because it rarely ever changes.

This limit only applies to the bytes (content) received for the initial request Googlebot makes, not the referenced resources within the page.

For example, when you open https://example.com/puppies.html, your browser will initially download the bytes of the HTML file, and based on those bytes it might make further requests for external JavaScript, images, or whatever else is referenced with a URL in the HTML. Googlebot does the same thing.
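To make the distinction between the initial request and the follow-up requests concrete, here is a minimal Python sketch that lists the external URLs a piece of HTML refers to. It is only an illustration, not how Googlebot or any browser is implemented: the `ReferencedResources` class, the sample markup, and the choice of tags to inspect are all assumptions made for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ReferencedResources(HTMLParser):
    """Collects URLs of external resources referenced by an HTML document.

    Hypothetical helper for illustration; it only looks at <script src>,
    <img src>, and <link href>, which is far from a complete list.
    """

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag in ("script", "img"):
            reference = attributes.get("src")
        elif tag == "link":
            reference = attributes.get("href")
        else:
            reference = None
        # data: URIs are inline content, not separate requests, so skip them.
        if reference and not reference.startswith("data:"):
            self.urls.append(urljoin(self.base_url, reference))


# Made-up sample page; only the bytes of this string count as the initial fetch.
sample_html = """
<html>
  <head>
    <script src="/js/app.js"></script>
    <link rel="stylesheet" href="style.css">
  </head>
  <body>
    <img src="https://example.com/images/puppy.jpg" alt="cute puppy" />
  </body>
</html>
"""

parser = ReferencedResources("https://example.com/puppies.html")
parser.feed(sample_html)
for url in parser.urls:
    print(url)  # each of these would be its own, separate fetch
```

Each URL printed here corresponds to a separate request, so the size of those files is not part of the size of the HTML file itself.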

What does this 15 MB limit mean to me?
Most likely nothing. There are very few pages on the internet that are bigger in size. You, dear reader, are unlikely to be the owner of one, since the median size of an HTML file is about 500 times smaller: 30 kilobytes (kB). However, if you are the owner of an HTML page that's over 15 MB, perhaps you could at least move some inline scripts and CSS dust to external files, pretty please.

What happens to the content after 15 MB?
The content after the first 15 MB is dropped by Googlebot, and only the first 15 MB gets forwarded to indexing.
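As a rough illustration of what "dropped" means, the following Python sketch splits a response body at the limit. It is a sketch under stated assumptions, not Googlebot's code: `split_at_limit` is a hypothetical helper, and reading "15 MB" as 15 * 1024 * 1024 bytes is an assumption, since the exact byte count is not spelled out here.

```python
from typing import Tuple

# Assumption: "15 MB" is taken as 15 * 1024 * 1024 bytes. The exact byte
# count is not specified in the text, so treat this constant as illustrative.
LIMIT_BYTES = 15 * 1024 * 1024


def split_at_limit(body: bytes, limit: int = LIMIT_BYTES) -> Tuple[bytes, bytes]:
    """Return (kept, dropped): the bytes within the limit and the bytes past it."""
    return body[:limit], body[limit:]


# A small body fits entirely, so nothing is dropped.
kept, dropped = split_at_limit(b"<html>tiny page</html>")
print(len(kept), len(dropped))
```

Anything that ends up in `dropped`, for example markup or text near the bottom of a very large file, would not be part of what gets forwarded to indexing.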

What content types does the 15 MB limit apply to?
The 15 MB limit applies to fetches made by Googlebot (Googlebot Smartphone and Googlebot Desktop) when fetching file types supported by Google Search.

Does this mean Googlebot doesn't see my image or video?
No. Googlebot fetches videos and images that are referenced in the HTML with a URL (for example, <img src="https://example.com/images/puppy.jpg" alt="cute puppy looking very disappointed" />) separately with consecutive fetches.

Do data URIs add to the HTML file size?
Yes. Using data URIs will contribute to the HTML file size since they are in the HTML file.
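To get a feel for how much an inlined image adds, here is a small Python sketch comparing an `<img>` tag that embeds its data with one that references a URL. The `data_uri` helper and the 3,000-byte stand-in image are assumptions for illustration. Base64 encoding turns every three bytes of input into four characters, so inlined binary data takes up roughly a third more space in the HTML than the original file.

```python
import base64


def data_uri(payload: bytes, mime_type: str = "image/png") -> str:
    """Build a base64 data URI for the given bytes (hypothetical helper)."""
    encoded = base64.b64encode(payload).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


# Stand-in for a 3,000-byte image file; the content does not matter here.
image_bytes = bytes(3000)

inline_tag = f'<img src="{data_uri(image_bytes)}" alt="puppy" />'
external_tag = '<img src="https://example.com/images/puppy.jpg" alt="puppy" />'

# The inline tag carries the whole encoded image inside the HTML file,
# while the external tag only carries the URL.
print("inline tag bytes:  ", len(inline_tag.encode("utf-8")))
print("external tag bytes:", len(external_tag.encode("utf-8")))
```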

How can I look up the size of a page?
There are a number of ways, but the easiest is probably using your own browser and its Developer Tools. Load the page as you normally would, then launch the Developer Tools and switch to the Network tab. Reload the page, and you should see all the requests your browser had to make to render the page. The top request is what you're looking for, with the byte size of the page in the Size column.

For example, in Chrome Developer Tools it might look something like this, with 150 kB in the Size column:

The Network tab in Chrome Developer Tools

If you're more adventurous, you can use cURL from a command line:

curl \
-A "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36" \
-so /dev/null https://example.com/puppies.html -w '%{size_download}'
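If you would rather script the check, the same measurement can be sketched in Python with only the standard library. This is a rough equivalent of the cURL command, not an official tool: `downloaded_size`, `within_limit`, and the `LIMIT_BYTES` constant (which assumes 15 MB means 15 * 1024 * 1024 bytes) are names and choices made up for this example.

```python
import urllib.request

# Same browser-like User-Agent string as in the cURL command.
USER_AGENT = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36"
)

# Assumption: 15 MB read as 15 * 1024 * 1024 bytes.
LIMIT_BYTES = 15 * 1024 * 1024


def downloaded_size(url: str, timeout: float = 30.0) -> int:
    """Fetch a single URL and return the number of body bytes received."""
    request = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return len(response.read())


def within_limit(size: int, limit: int = LIMIT_BYTES) -> bool:
    """True if a body of `size` bytes fits within the limit."""
    return size <= limit
```

For example, `within_limit(downloaded_size("https://example.com/puppies.html"))` fetches only the HTML file, not the resources it references, and reports whether it fits.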

If you have more questions, you can find us on Twitter and in the Search Central Forums, and if you need more clarification about our documentation, leave us feedback on the pages themselves.