Tuesday, April 20, 2010
Welcome to the third episode of our URL removals series! In episodes one and two, we talked about expediting the removal of content that's under your control and requesting expedited cache removals. Today, we're covering how to use Google's public URL removal tool to request removal of content from Google's search results when the content originates on a website not under your control.
Google offers two tools that provide a way to request expedited removal of content:
- Verified URL removal tool: for requesting to remove content from Google's search results when it's published on a site of which you're a verified owner in Webmaster Tools (like your blog or your company's site)
- Public URL removal tool: for requesting to remove content from Google's search results when it's published on a site for which you can't verify ownership (like your friend's blog)
Sometimes a situation arises where the information you want to remove originates from a site that you don't own or can't control. Since each individual webmaster controls their site and their site's content, the best way to update or remove results from Google is for the site owner (where the content is published) to either block crawling of the URL, modify the content source, or remove the page altogether. If the content isn't changed, it will simply reappear in our search results the next time we crawl it. So the first step to remove content that's hosted on a site you don't own is to contact the owner of the website and request that they remove or block the content in question.
Removed or blocked content
If the website owner removes a page, requests for the removed page should return a 404 Not Found response or a 410 Gone response. If they choose to block the page from search engines, then the page should either be disallowed in the site's robots.txt file or contain a noindex meta tag. Once one of these requirements is met, you can submit a removal request using the "Webmaster has already blocked the page" option.
Sometimes a website owner will claim that they've blocked or removed a page but they haven't technically done so. If they claim a page has been blocked you can double check by looking at the site's robots.txt file to see if the page is listed there as disallowed.
User-agent: *
Disallow: /blocked-page/
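If you'd rather not read the robots.txt file by eye, the same check can be sketched in Python with the standard library's `urllib.robotparser`. This is only an illustration: the function name `is_disallowed` and the example.com URL are made up for this sketch, and Python's parser does not necessarily interpret every rule exactly the way Googlebot does.

```python
from urllib.robotparser import RobotFileParser


def is_disallowed(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt text disallows `url` for `user_agent`.

    `robots_txt` is the raw text of the site's robots.txt file, for example
    copied from http://www.example.com/robots.txt (hypothetical site).
    """
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    rules = "User-agent: *\nDisallow: /blocked-page/"
    print(is_disallowed(rules, "http://www.example.com/blocked-page/"))
    print(is_disallowed(rules, "http://www.example.com/some-other-page/"))
```

With the rules shown above, the first URL is reported as disallowed and the second is not.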
Another place to check if a page has been blocked is within the page's HTML source code itself. You can visit the page and choose "View Page Source" from your browser. Is there a meta noindex tag in the HTML head section?
<html>
  <head>
    <title>blocked page</title>
    <meta name="robots" content="noindex">
  </head>
  ...
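The "View Page Source" check can also be approximated with a few lines of Python using the standard library's `html.parser`. This is a minimal sketch, not a full reimplementation of how search engines read the tag: the names `RobotsMetaFinder` and `has_noindex` are invented for this example, and it only looks at `<meta name="robots">` tags whose content lists `noindex`.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Records whether a <meta name="robots"> tag with a noindex directive was seen."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute names arrive lowercased; values may be None if omitted.
        attributes = {name: (value or "") for name, value in attrs}
        if attributes.get("name", "").lower() != "robots":
            return
        directives = [d.strip().lower() for d in attributes.get("content", "").split(",")]
        if "noindex" in directives:
            self.noindex = True


def has_noindex(html: str) -> bool:
    """Return True if the HTML source contains a robots meta tag with noindex."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return finder.noindex


if __name__ == "__main__":
    source = '<html><head><title>blocked page</title><meta name="robots" content="noindex"></head>'
    print(has_noindex(source))
```

Paste the page's source into `has_noindex` to see whether the tag is actually there.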
If they inform you that the page has been removed, you can confirm this by using an HTTP response testing tool like the Live HTTP Headers add-on for the Firefox browser. With this add-on enabled, you can request any URL in Firefox to test that the HTTP response is actually 404 Not Found or 410 Gone.
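If you don't have a browser add-on handy, a short script can report the same status code. The sketch below uses Python's standard `urllib` instead of Live HTTP Headers; the function names `fetch_status` and `looks_removed` and the example.com URL are hypothetical. Note that it sends a HEAD request, which some servers may answer differently from a normal page view, and that `urlopen` follows redirects, so the status you see is that of the final URL.

```python
import urllib.error
import urllib.request

# Status codes that indicate the page has been removed.
REMOVED_STATUSES = {404, 410}


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Request `url` and return the HTTP status code of the response."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises for 4xx/5xx responses; the code is what we want.
        return error.code


def looks_removed(status: int) -> bool:
    """Return True if the status is 404 Not Found or 410 Gone."""
    return status in REMOVED_STATUSES


if __name__ == "__main__":
    # Hypothetical URL; replace it with the page you were told was removed.
    status = fetch_status("http://www.example.com/removed-page/")
    print(status, looks_removed(status))
```

A page that returns 200 OK with a "this page no longer exists" message has not been removed in the sense described here, which is why the check looks at the status code rather than the page text.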
Content removed from the page
Once you've confirmed that the content you're seeking to remove is no longer present on the page, you can request a cache removal using the 'Content has been removed from the page' option. This type of removal—usually called a "cache" removal—ensures that Google's search results will not include the cached copy or version of the old page, or any snippets of text from the old version of the page. Only the current updated page (without the content that's been removed) will be accessible from Google's search results. However, the current updated page can potentially still rank for terms related to the old content as a result of inbound links that still exist from external sites. For cache removal requests you'll be asked to enter a "term that has been removed from the page." Be sure to enter a word that is not found on the current live page, so that our automated process can confirm the page has changed—otherwise the request will be denied. Cache removals are covered in more detail in part two of the "URL removal explained" series.
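Before submitting a cache removal request, you can do a rough check of your own that the term you plan to enter really is gone from the live page. The Python sketch below is only an approximation of that idea and makes no claim about how our automated process works: `term_absent` and `TextExtractor` are names made up for this example, and the comparison is a simple case-insensitive substring search over the page's text.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects the text content of an HTML document, ignoring the tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def term_absent(html: str, term: str) -> bool:
    """Return True if `term` does not appear in the text of the given HTML."""
    extractor = TextExtractor()
    extractor.feed(html)
    extractor.close()
    text = " ".join(extractor.chunks).casefold()
    return term.casefold() not in text


if __name__ == "__main__":
    live_page = "<html><body><p>Welcome to my updated page.</p></body></html>"
    print(term_absent(live_page, "secret"))
```

If the function returns False, the term is still on the live page and is not a good choice for the request.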
Removing inappropriate webpages or images that appear in our SafeSearch filtered results
Google introduced the SafeSearch filter with the goal of providing search results that exclude potentially offensive content. For situations where you find content that you feel should have been filtered out by SafeSearch, you can request that this content be excluded from SafeSearch filtered results in the future. Submit a removal request using the 'Inappropriate content appears in our SafeSearch filtered results' option.
If you encounter any issues with the public URL removal tool or have questions not addressed here, please post them to the Webmaster Help Forum or consult the more detailed removal instructions in our Help Center. If you do post to the forum, remember to use a URL shortening service to share any links to content you want removed.
Other posts of this series
- Part I: Removing URLs and directories
- Part II: Removing and updating cached content
- Part III: Removing content you don't own
- Part IV: Tracking requests, what not to remove
Finally, you might also be interested in reading about managing what information is available about you online.