Overview of crawling and indexing topics
The topics in this section describe how you can control Google's ability to find and parse your content in order to show it in Search and other Google properties, as well as how to prevent Google from crawling specific content on your site.
Here's a brief description of each page. To get an overview of crawling and indexing, read our How Search works guide.
| Topic | Description |
|---|---|
| File types indexable by Google | Google can index the content of most types of pages and files. Explore a list of the most common file types that Google Search can index. | 
| URL structure | Consider organizing your content so that URLs are constructed logically and in a manner that is most intelligible to humans. | 
| Sitemaps | Tell Google about pages on your site that are new or updated. | 
| Crawler management | |
| robots.txt | A robots.txt file tells search engine crawlers which pages or files they can or can't request from your site. | 
| Canonicalization | Learn what URL canonicalization is and how to tell Google about any duplicate pages on your site in order to avoid excessive crawling. Learn how Google auto-detects duplicate content, how it treats duplicate content, and how it assigns a canonical URL to any duplicate page groups found. | 
| Mobile sites | Learn how you can optimize your site for mobile devices and ensure that it's crawled and indexed properly. | 
| AMP | If you have AMP pages, learn how AMP works in Google Search. | 
| JavaScript | There are some differences and limitations that you need to account for when designing your pages and applications to accommodate how crawlers access and render your content. | 
| Page and content metadata | |
| Removals | |
| Site moves and changes | |
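
To make the robots.txt entry above more concrete, here is a minimal sketch of how a crawler might decide whether it can request a URL. It uses Python's standard-library `urllib.robotparser`. The rules in `ROBOTS_TXT`, the `can_crawl` helper, and the `www.example.com` URLs are hypothetical and chosen only for illustration. Python's parser is not Google's, so its matching may differ from Googlebot's in edge cases.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for an imaginary site (illustration only).
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Allow: /
"""


def can_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given rules let user_agent request url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for path in ("/index.html", "/private/report.html"):
        url = "https://www.example.com" + path
        print(path, can_crawl(ROBOTS_TXT, "Googlebot", url))
```

With these assumed rules, anything under `/private/` is disallowed for all crawlers and everything else can be requested. Note that robots.txt controls crawling only. Whether a page is indexed is a separate question.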