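The Google Ads API buckets requests for rate limiting by queries per second (QPS) per client customer ID (CID) and developer token, meaning that metering is enforced independently on both CIDs and developer tokens. The Google Ads API uses a Token Bucket algorithm to meter requests and determine an appropriate QPS limit, so the exact limit varies with the overall server load at any given time.

Rate limits exist to prevent one user from disrupting service for other users by overwhelming the Google Ads API servers with a high volume of requests, whether intentionally or unintentionally.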
Requests that violate rate limits are rejected with the error
[`RESOURCE_TEMPORARILY_EXHAUSTED`](/google-ads/api/reference/rpc/v21/QuotaErrorEnum.QuotaError#resource_temporarily_exhausted).

You can take control of your app and mitigate rate limits by both actively
reducing the number of requests and throttling QPS from the client side.

There are a number of ways to reduce the chances of exceeding the rate limit.
Becoming familiar with [Enterprise Integration Patterns](//www.enterpriseintegrationpatterns.com/patterns/messaging/) (EIP) concepts
such as Messaging, Redelivery, and Throttling can help you build a more robust
client app.

The following recommended practices are ordered by complexity, with simpler
strategies first and more robust but more sophisticated architectures after:

- [Limit concurrent tasks](#limit)
- [Batching requests](#batch)
- [Throttling and rate limiters](#throttle)
- [Queueing](#queue)

Limit concurrent tasks
----------------------

One root cause of exceeding rate limits is that the client app spawns an
excessive number of parallel tasks. While we don't limit the number of parallel
requests a client app can have, a large number of them can easily exceed the
Requests Per Second limit at the developer token level.

We recommend setting a reasonable upper bound for the total number of
concurrent tasks that make requests (across all processes and machines), then
adjusting it upward to optimize your throughput without exceeding the rate
limit.

You can also consider throttling QPS from the client side (see
[Throttling and rate limiters](#throttle)).

Batching requests
-----------------

Consider batching multiple operations into a single request. This is most
applicable to `MutateFoo` calls. For example, if you're updating the status of
multiple instances of [`AdGroupAd`](/google-ads/api/reference/rpc/v21/AdGroupAd), instead of
calling [`MutateAdGroupAds`](/google-ads/api/reference/rpc/v21/AdGroupAdService/MutateAdGroupAds)
once for each [`AdGroupAd`](/google-ads/api/reference/rpc/v21/AdGroupAd), you can call
[`MutateAdGroupAds`](/google-ads/api/reference/rpc/v21/AdGroupAdService/MutateAdGroupAds) once and
pass in multiple `operations`. Refer to our [batch operations guidance](/google-ads/api/docs/best-practices/overview#batch_operations)
for additional examples.

While batching requests reduces the total number of requests and mitigates the
Requests Per Minute rate limit, it may trigger the Operations Per Minute rate
limit if you perform a large number of operations against a single account.

Throttling and rate limiters
----------------------------

In addition to limiting the total number of threads in your client application,
you can implement rate limiters on the client side. This ensures that all the
threads across your processes and clusters are governed by a specific QPS
limit from the client side.

You can check out [Guava Rate Limiter](//github.com/google/guava/blob/master/guava/src/com/google/common/util/concurrent/RateLimiter.java), or implement your own [Token
Bucket](//en.wikipedia.org/wiki/Token_bucket) based algorithm for a clustered environment. For example, you
could generate tokens and store them in shared transactional storage such as a
database, and each client would have to acquire and consume a token before it
processes a request. If the tokens are used up, the client has to wait
until the next batch of tokens is generated.

Queueing
--------

A message queue is a solution for distributing operation load while also
controlling request and consumer rates. There are a number of message queue
options available, some open source and some proprietary, and many of
them work with different languages.

When using message queues, you can have multiple producers pushing messages to
the queue and multiple consumers processing those messages. Throttles can be
implemented on the consumer side by limiting the number of concurrent consumers,
or by adding rate limiters or throttlers to either the producers or the
consumers.

For example, if a message consumer encounters a rate limit error, that consumer
can return the request to the queue to be retried. At the same time, that
consumer can notify all other consumers to pause processing for a number of
seconds to recover from the error.

Last updated 2025-08-27 UTC.
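
As an illustration of the client-side throttling described under "Throttling and rate limiters", here is a minimal single-process Token Bucket sketch in Python. The class name `TokenBucket`, its parameters (`rate`, `capacity`), and the injectable `clock` and `sleep` hooks are all hypothetical names chosen for this example; they are not part of the Google Ads API or any client library. The rate and burst size you would pass in are assumptions you need to tune yourself, since the server-side limit is not a fixed number.

```python
import threading
import time


class TokenBucket:
    """Thread-safe token bucket for client-side throttling (illustrative only)."""

    def __init__(self, rate, capacity, clock=time.monotonic, sleep=time.sleep):
        if rate <= 0 or capacity <= 0:
            raise ValueError("rate and capacity must be positive")
        self.rate = float(rate)          # tokens added per second (target QPS)
        self.capacity = float(capacity)  # maximum burst size
        self._tokens = float(capacity)   # start with a full bucket
        self._clock = clock              # injectable so tests can fake time
        self._sleep = sleep
        self._last = clock()
        self._lock = threading.Lock()

    def _refill(self):
        # Add tokens for the time elapsed since the last refill, capped at capacity.
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)

    def try_acquire(self):
        """Take one token if available; return True on success, False otherwise."""
        with self._lock:
            self._refill()
            if self._tokens >= 1.0:
                self._tokens -= 1.0
                return True
            return False

    def acquire(self):
        """Block until one token is available, then take it."""
        while True:
            with self._lock:
                self._refill()
                if self._tokens >= 1.0:
                    self._tokens -= 1.0
                    return
                # Time until the bucket holds one full token again.
                wait = (1.0 - self._tokens) / self.rate
            # Sleep outside the lock so other threads are not blocked.
            self._sleep(wait)
```

Each worker thread would call `bucket.acquire()` immediately before sending a request, so all threads in the process share one QPS budget. This sketch only covers a single process. For the clustered case described above, the token count would need to live in shared transactional storage instead of in memory.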