This document applies to the following methods:
About caching
To reduce client bandwidth usage and to protect Google from traffic spikes, clients of both the
Lookup API and the Update API are required to create and maintain a local cache of threat data.
For the Lookup API, the cache is used to reduce the number of threatMatches requests that clients
send to Google. For the Update API, the cache is used to reduce the number of fullHashes requests
that clients send to Google. The caching protocol for each API is outlined below.
Lookup API
Clients of the Lookup API should cache each returned ThreatMatch item for the duration defined by
its cacheDuration field. Clients then need to consult the cache before making a subsequent
threatMatches request to the server. If the cache duration for a previously returned ThreatMatch
has not yet expired, the client should assume the item is still unsafe. Caching ThreatMatch items
may reduce the number of API requests made by the client.
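As a minimal sketch of this behavior, the following Python keeps returned ThreatMatch items in a dictionary keyed by URL and drops them once their cacheDuration has elapsed. The class name `LookupCache`, the helper `parse_duration`, and the injectable `clock` parameter are illustrative assumptions rather than part of the API; the only fields taken from the protocol are `threat.url` and `cacheDuration`.

```python
import time


def parse_duration(value: str) -> float:
    """Convert a duration string such as "300.000s" into seconds."""
    if not value.endswith("s"):
        raise ValueError(f"unexpected duration format: {value!r}")
    return float(value[:-1])


class LookupCache:
    """Hypothetical client-side cache of ThreatMatch items, keyed by URL."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock          # injectable so expiry can be tested
        self._entries = {}           # url -> (expiry_time, match)

    def store(self, match: dict) -> None:
        """Remember a returned ThreatMatch until its cacheDuration elapses."""
        expiry = self._clock() + parse_duration(match["cacheDuration"])
        self._entries[match["threat"]["url"]] = (expiry, match)

    def get(self, url: str):
        """Return the cached match if still valid, otherwise None."""
        entry = self._entries.get(url)
        if entry is None:
            return None
        expiry, match = entry
        if self._clock() >= expiry:
            del self._entries[url]   # expired: the client may query again
            return None
        return match
```

A client would call `get` before building a threatMatches request and leave out any URL for which a match is still cached, treating that URL as unsafe.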
Example: threatMatches.find
URL check (threatMatches request) | URL match (threatMatches response) | Caching behavior |
---|---|---|
"threatEntries": [ {"url": "http://www.urltocheck.org/"} ] | "matches": [{ "threat": {"url": "http://www.urltocheck.org/"}, "cacheDuration": "300.000s" }] | Match. The client must wait 5 minutes before sending a new threatMatches request that includes URL http://www.urltocheck.org/. |
Update API
To reduce the overall number of fullHashes requests sent to Google using the Update API, clients
are required to maintain a local cache. The API establishes two types of caching, positive and negative.
Positive caching
To prevent clients from repeatedly asking about the state of a particular unsafe full hash,
each returned ThreatMatch contains a positive cache duration (defined by the cacheDuration field),
which indicates how long the full hash is to be considered unsafe.
Negative caching
To prevent clients from repeatedly asking about the state of a particular safe full hash,
each fullHashes response defines a negative cache duration for the requested prefix (defined by the
negativeCacheDuration field). This duration indicates how long all full hashes with the requested
prefix are to be considered safe for the requested lists, except for those returned by the server as
unsafe. This caching is particularly important as it prevents traffic overload that could be caused
by a hash prefix collision with a safe URL that receives a lot of traffic.
Consulting the cache
When the client wants to check the state of a URL, it first computes its full hash. If the full
hash’s prefix is present in the local database, the client should then consult its cache before
making a fullHashes request to the server.
First, clients should check for a positive cache hit. If there exists an unexpired positive cache
entry for the full hash of interest, it should be considered unsafe. If the positive cache entry
has expired, the client must send a fullHashes request for the associated local prefix. Per the
protocol, if the server returns the full hash, it is considered unsafe; otherwise, it’s considered
safe.
If there are no positive cache entries for the full hash, the client should check for a negative
cache hit. If there exists an unexpired negative cache entry for the associated local prefix, the
full hash is considered safe. If the negative cache entry has expired, or it doesn’t exist, the
client must send a fullHashes request for the associated local prefix and interpret the response
as normal.
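The lookup order described above can be sketched as a small decision function. This is an illustration under stated assumptions, not a reference implementation: the `Verdict` enum, the function name `consult_cache`, and the representation of the two caches as plain dictionaries mapping a key to an expiry timestamp are all hypothetical choices.

```python
from enum import Enum


class Verdict(Enum):
    UNSAFE = "unsafe"          # unexpired positive cache entry
    SAFE = "safe"              # unexpired negative cache entry
    ASK_SERVER = "ask_server"  # a fullHashes request is required


def consult_cache(full_hash, prefix, positive, negative, now):
    """Decide how to treat a full hash whose prefix is in the local database.

    positive maps full hash -> expiry time; negative maps prefix -> expiry
    time. Both layouts are assumptions made for this sketch.
    """
    # Step 1: positive cache. An entry for this exact full hash wins.
    positive_expiry = positive.get(full_hash)
    if positive_expiry is not None:
        if now < positive_expiry:
            return Verdict.UNSAFE
        # Expired positive entry: the client must re-query the prefix,
        # regardless of any negative cache entry.
        return Verdict.ASK_SERVER

    # Step 2: negative cache, consulted only when no positive entry exists.
    negative_expiry = negative.get(prefix)
    if negative_expiry is not None and now < negative_expiry:
        return Verdict.SAFE

    # Missing or expired negative entry.
    return Verdict.ASK_SERVER
```

Note that an expired positive entry leads straight to a server request even when the prefix still has a valid negative entry, which matches the ordering in the text.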
Updating the cache
The client cache should be updated whenever a fullHashes response is received. A positive cache
entry should be created or updated for the full hash per the cacheDuration field. The hash prefix’s
negative cache duration should also be created or updated per the response’s negativeCacheDuration
field.
If a subsequent fullHashes response does not contain a full hash that is currently positively
cached, the client is not required to remove the positive cache entry. This is not cause for concern
in practice, since positive cache durations are typically short (a few minutes) to allow for quick
correction of false positives.
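The update rules might look like the following sketch. It treats hashes as opaque strings, as the examples in this document do, and stores absolute expiry times in two dictionaries. The function name `update_cache`, the `requested_prefixes` argument, and the dictionary layout are assumptions for illustration; only the field names `matches`, `threat.hash`, `cacheDuration`, and `negativeCacheDuration` come from the response format.

```python
def parse_duration(value: str) -> float:
    """Convert a duration string such as "600.000s" into seconds."""
    if not value.endswith("s"):
        raise ValueError(f"unexpected duration format: {value!r}")
    return float(value[:-1])


def update_cache(response, requested_prefixes, positive, negative, now):
    """Apply one fullHashes response to the positive and negative caches.

    positive maps full hash -> expiry time; negative maps prefix -> expiry
    time. Existing positive entries that are absent from the response are
    left in place, which the protocol permits.
    """
    # Create or refresh a positive entry for every returned full hash.
    for match in response.get("matches", []):
        full_hash = match["threat"]["hash"]
        positive[full_hash] = now + parse_duration(match["cacheDuration"])

    # Create or refresh the negative entry for every requested prefix.
    negative_duration = response.get("negativeCacheDuration")
    if negative_duration is not None:
        expiry = now + parse_duration(negative_duration)
        for prefix in requested_prefixes:
            negative[prefix] = expiry
```

Stale positive entries are never deleted here; they simply stop counting once their expiry time has passed.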
Example scenario
In the following example, assume h(url) is the hash prefix of the URL and H(url) is the full-length hash of the URL. That is, H(url) = SHA256(url), and h(url) is the first 4 bytes of H(url).
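The two functions can be sketched with Python's `hashlib`, reading the prefix as the first 4 bytes of the SHA-256 digest. The function names and the `PREFIX_LENGTH` constant are illustrative. The sketch hashes the string exactly as given; a real client canonicalizes the URL and hashes the derived host-suffix and path-prefix expressions, which is out of scope here.

```python
import hashlib

PREFIX_LENGTH = 4  # bytes; the prefix length assumed in this example


def full_hash(expression: str) -> bytes:
    """H(url): the full 32-byte SHA-256 digest of the expression."""
    return hashlib.sha256(expression.encode("utf-8")).digest()


def hash_prefix(expression: str) -> bytes:
    """h(url): the leading PREFIX_LENGTH bytes of the full hash."""
    return full_hash(expression)[:PREFIX_LENGTH]
```

For instance, `hash_prefix("example.com/")` is always the first four bytes of `full_hash("example.com/")`, so two different URLs can share a prefix while having different full hashes.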
Now, assume a client (with an empty cache) visits example.com/ and sees that h(example.com/) is in the local database. The client requests the full-length hashes for hash prefix h(example.com/) and receives back the full-length hash H(example.com/) together with a positive cache duration of 5 minutes and a negative cache duration of 1 hour.
The positive cache duration of 5 minutes tells the client how long the full-length hash
H(example.com/) must be considered unsafe without sending another fullHashes request. After 5
minutes the client must issue another fullHashes request for that prefix h(example.com/) if the
client visits example.com/ again. The client should reset the hash prefix’s negative cache duration
per the new response.
The negative cache duration of 1 hour tells the client how long all the other full-length hashes
besides H(example.com/) that share the same prefix of h(example.com/) must be considered safe. For
the duration of 1 hour, every URL such that h(URL) = h(example.com/) must be considered safe, and
therefore not result in a fullHashes request (assuming that H(URL) != H(example.com/)).
If the fullHashes response contains zero matches and a negative cache duration is set, then the
client must not issue any fullHashes requests for any of the requested prefixes for the given
negative cache duration.
If the fullHashes response contains one or more matches, a negative cache duration is still set
for the entire response. In that case, the cache duration of a single full hash indicates how long
that particular full-length hash must be assumed unsafe by the client. After the ThreatMatch cache
duration elapses, the client must refresh the full-length hash by issuing a fullHashes request for
that hash prefix if the requested URL matches the existing full-length hash in the cache. In that
case the negative cache duration does not apply. The response’s negative cache duration only applies
to full-length hashes that were not present in the fullHashes response. For full-length hashes that
are not present in the response, the client must refrain from issuing any fullHashes requests
until the negative cache duration has elapsed.
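To make the timeline concrete, the sketch below replays the example.com/ scenario with a positive cache duration of 5 minutes and a negative cache duration of 1 hour, both measured from time zero. The `state` function, the string labels, and the placeholder keys such as `"H(other)"` are hypothetical stand-ins used only for this illustration.

```python
MINUTE = 60.0


def state(full_hash, prefix, positive, negative, now):
    """Return "unsafe", "safe", or "request" for a full hash at time now."""
    if full_hash in positive:
        # A positive entry exists: unsafe until it expires, then re-query.
        return "unsafe" if now < positive[full_hash] else "request"
    if now < negative.get(prefix, 0.0):
        # Other hashes under the prefix are safe while the negative entry lasts.
        return "safe"
    return "request"


# Cache contents right after the first fullHashes response at t = 0.
positive = {"H(example.com/)": 5 * MINUTE}
negative = {"h(example.com/)": 60 * MINUTE}
prefix = "h(example.com/)"

timeline = [
    ("H(example.com/)", 1 * MINUTE),   # within the positive duration
    ("H(example.com/)", 6 * MINUTE),   # positive entry expired
    ("H(other)", 6 * MINUTE),          # same prefix, different full hash
    ("H(other)", 61 * MINUTE),         # negative entry expired
]
results = [state(h, prefix, positive, negative, t) for h, t in timeline]
print(results)  # ['unsafe', 'request', 'safe', 'request']
```

The second and third entries show the asymmetry described above: at minute 6 the cached full hash needs a refresh, while any other hash under the same prefix is still covered by the negative cache.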
Example: fullHashes.find
Hash prefixes (fullHashes request) | Full-length hash matches (fullHashes response) | Caching behavior |
---|---|---|
"threatEntries": [ {"hash": "0xaaaaaaaa"} ] | "matches": [], "negativeCacheDuration": "3600.000s" | No match. The client must not send any fullHashes requests for hash prefix 0xaaaaaaaa for at least one hour. Any hash with prefix 0xaaaaaaaa is considered safe for one hour. |
"threatEntries": [ {"hash": "0xbbbbbbbb"} ] | "matches": [{ "threat": {"hash": "0xbbbbbbbb0000..."}, "cacheDuration": "600.000s" }], "negativeCacheDuration": "300.000s" | Possible matches. The client should consider the URL with the full hash 0xbbbbbbbb0000... unsafe for 10 minutes. The client should consider all other URLs with hash prefix 0xbbbbbbbb safe for 5 minutes. After 5 minutes, the hash prefix’s negative cache entry expires. Since the positive cache entry for 0xbbbbbbbb0000... has not yet expired, the client should send fullHashes requests for all hashes except that one. |
"threatEntries": [ {"hash": "0xcccccccc"} ] | "matches": [{ "threat": {"hash": "0xccccccccdddd..."}, "cacheDuration": "600.000s" }], "negativeCacheDuration": "3600.000s" | Possible matches. The client must not send any fullHashes request for hash prefix 0xcccccccc for at least 1 hour and should assume that prefix to be safe, unless the full hash of the URL matches the cached full hash 0xccccccccdddd.... In that case the client should consider that URL to be unsafe for 10 minutes. After 10 minutes the full-length hash expires. Any subsequent lookups for that full hash should trigger a new fullHashes request. |