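Google Search is a powerful tool for finding useful information on the open web. Unfortunately, not all web pages are created with good intent. Many are built explicitly to deceive people, and fighting them is part of our everyday work. To keep you safe and to protect your search experience from disruptive content and malicious behavior, Search invested in many innovations in 2020.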
Fighting spam smarter
We have been fighting spam since the early days of Search, and recent advances in artificial intelligence (AI) offer unprecedented potential to revolutionize our approach.
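By combining our deep knowledge of spam with AI, last year we built our own spam-fighting AI, which is highly effective at catching both known and new spam trends. For example, we have reduced sites with auto-generated and scraped content by more than 80% compared to a couple of years ago.
Hacked spam was still rampant in 2020 because the number of vulnerable websites remained quite large, although we improved our detection capability by more than 50% and removed most of the hacked spam from search results.
This is a problem we cannot solve alone. Even if we could detect and protect against all spam, hackers would not stop exploiting loopholes until every one of them is closed. Website owners can protect their sites by practicing good security hygiene: it is easier to prevent a site from getting hacked than to recover from a hack. Google offers resources to help you understand the most common ways websites get hacked and how to use Search Console to check whether your site has been hacked. Please take a look, and let's keep the web safer together.
Last year brought major events, including a global pandemic, and we devoted significant effort to extending protection to the billions of searches we received on such important topics. If you're looking for a COVID testing site near you, you shouldn't have to worry about landing on gibberish spam that may redirect you to phishing sites. Besides eliminating spam content, we worked with several other Search teams to make sure you receive the most up-to-date, highest-quality information when and where it matters most.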
Preventing spam from reaching you
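Before we deliver a set of search results on Google, a lot happens behind the scenes. Every day, we discover, crawl, and index billions of web pages. Among those pages is a lot of spam: every day, we discover 40 billion spammy pages. Here's how we work to keep that spam from getting in the way of your search for helpful information.
First, we have systems that can detect spam when we crawl pages or other content. Crawling is when our automatic systems visit content and consider it for inclusion in the index we use to provide search results. Some content detected as spam isn't added to the index.
These systems also work for content we discover through sitemaps and Search Console. For example, Search Console has a Request Indexing feature that creators can use to let us know about new pages that should be added quickly. We observed spammers hacking into vulnerable sites, posing as the owners of those sites, verifying themselves in Search Console, and then using the tool to ask Google to crawl and index the many spammy pages they had created. Using AI, we were able to pinpoint suspicious verifications and prevent spam URLs from getting into our index this way.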
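Next, we have systems that analyze the content that is included in our index. When you issue a search, they double-check whether the matching content might be spam. If it is, that content won't appear in the top search results. We also use this information to improve our systems so that such spam is not included in the index in the first place.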
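As a result, thanks to our AI-aided automated systems, very little spam makes it into the top results anyone sees for a search. We estimate that these automated systems keep more than 99% of visits from Search completely free of spam. For the tiny percentage that remains, our teams take manual action and use what they learn to further improve the automated systems.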
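The layered defense described in this section (filter at crawl time, screen indexing requests, then re-check matching content at serving time) can be pictured with a toy sketch. This is purely illustrative and is not a description of Google's actual systems: the `Page` class, the `trusted_submission` flag, the word-repetition score, and both thresholds are hypothetical stand-ins chosen only to show how successive layers each remove a different slice of spam.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Page:
    url: str
    text: str
    # Hypothetical flag: False when an indexing request came from a
    # site-ownership verification that looked suspicious.
    trusted_submission: bool = True


def toy_spam_score(page: Page) -> float:
    """Crude stand-in for a real classifier: share of the most repeated word."""
    words = page.text.lower().split()
    if not words:
        return 1.0  # treat empty pages as spam in this toy model
    return max(words.count(w) for w in set(words)) / len(words)


CRAWL_THRESHOLD = 0.8  # assumed cutoff: at or above this, never indexed
SERVE_THRESHOLD = 0.5  # assumed, stricter cutoff applied per query


def crawl_stage(pages: List[Page]) -> List[Page]:
    """Layers 1 and 2: drop obvious spam and untrusted indexing requests."""
    return [
        p for p in pages
        if p.trusted_submission and toy_spam_score(p) < CRAWL_THRESHOLD
    ]


def serve_stage(index: List[Page], query: str) -> List[Page]:
    """Layer 3: re-check matching pages before they reach the results."""
    term = query.lower()
    return [
        p for p in index
        if term in p.text.lower().split()
        and toy_spam_score(p) < SERVE_THRESHOLD
    ]


# Made-up example pages, one per outcome.
DEMO_PAGES = [
    Page("https://example.com/guide",
         "how to find a covid testing site near you today"),
    Page("https://example.com/stuffed",
         "cheap cheap cheap cheap pills"),
    Page("https://example.com/borderline",
         "testing testing testing site covid near"),
    Page("https://example.com/hacked",
         "covid testing site locations and opening hours",
         trusted_submission=False),
]

index = crawl_stage(DEMO_PAGES)
results = serve_stage(index, "covid")
print([p.url for p in results])
```

In this toy run the stuffed page and the untrusted submission never enter the index, the borderline page is indexed but held back at serving time, and only the guide page is returned for the query. A real system would replace the repetition score with learned classifiers and many more signals.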
Protecting you beyond spam
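Beyond spam, we expanded our efforts in 2020 to protect you against other types of abuse. Many of these can cause significant financial and personal harm.
In 2020, we made significant progress in improving our coverage and protecting more users against online scams and fraud. Online scams take many shapes, and they can harm you in more ways than traditional webspam. For example, many scammers pretend to offer customer support phone numbers for popular services and products, only to trick the users who call into paying them via bank transfers or gift cards. Commonly known as a "customer support scam" or "tech support scam", this type of scam has been reported by hundreds of thousands of users, who may lose hundreds of dollars to scammers in each case.
Since 2018, our systems have been able to protect hundreds of millions of searches a year by detecting potentially scammy sites. On the web, scammers attempted to create many low-quality websites with keyword stuffing, logos of the brands they imitate, and a phone number they want you to call. Our algorithmic solutions made sure that scams and fraud are very unlikely to show up in your search results. This is just one of several types of protection we launched last year to ensure the quality of search results and your safety. Our mission is to get ahead of these challenges and provide you with the most trustworthy results. At the same time, you can better protect yourself by staying informed and learning about scams.
Another area where advances in AI helped tremendously was in understanding the content of sites. One example is how we improved the way we rank product review, informational, and shopping sites. Google Search is a great way to research and find products before you make a purchase, and we wanted to make sure you get the most useful information for your next purchase by rewarding content that has more in-depth research and useful information.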
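Despite the significant advances we made in our spam-fighting efforts, spammers remain highly motivated to develop new techniques that evade our detection. We're always working to get better and to protect people from new types of abuse, and external reports can help. Have you had any recent experiences with Search where you felt misled, scammed, or spammed, and do you think we can do a better job of preventing them? If so, please share feedback using the spam report, along with the query and any other information that might be useful.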
Posted by Cody Kwok, Principal Engineer. Published Thursday, April 29, 2021, as "How we fought Search spam on Google in 2020".