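The Google Ads API buckets requests for rate limiting by queries per second (QPS) per client customer ID (CID) and per developer token, so metering is enforced independently on both CIDs and developer tokens. The API uses a Token Bucket algorithm to meter requests and determine an appropriate QPS limit, so the exact limit varies with the overall server load at any given time.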
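The purpose of rate limits is to prevent one user from disrupting service for other users by overwhelming the Google Ads API servers with a high volume of requests, whether intentionally or unintentionally.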
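Batching requests reduces the total number of requests and mitigates the Requests Per Minute rate limit, but it may trigger the Operations Per Minute rate limit if you perform a large number of operations against a single account.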
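The trade-off above can be sketched by splitting a long list of operations into fixed-size batches, so that each batch becomes one request. This is only an illustration: the `chunked` helper, the placeholder operation strings, and the batch size of 1000 are assumptions, not values or calls taken from the Google Ads API.

```python
from typing import Iterator, Sequence, TypeVar

T = TypeVar("T")


def chunked(operations: Sequence[T], batch_size: int) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of at most batch_size operations."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(operations), batch_size):
        yield operations[start:start + batch_size]


# Hypothetical placeholder operations; a real client would build mutate
# operations with its Google Ads client library instead.
operations = [f"operation-{i}" for i in range(2500)]

# Hypothetical batch size, chosen only for illustration.
batches = list(chunked(operations, batch_size=1000))

# One request per batch instead of one request per operation.
print(len(batches), [len(batch) for batch in batches])
```

Sending 3 requests instead of 2500 eases request-based limits, but the operation count against the account is unchanged, which is why operation-based limits can still apply.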
Throttling and rate limiters
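In addition to limiting the total number of threads in your client application, you can implement rate limiters on the client side. This ensures that all threads across your processes and/or clusters are governed by a specific client-side QPS limit.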
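As a rough sketch of a client-side limiter shared by all threads in one process, the class below hands out evenly spaced time slots. The `RateLimiter` class, its parameters, and the limit of 20 QPS are hypothetical and chosen for illustration; this is not the Guava implementation and it does not coordinate across processes or machines.

```python
import threading
import time


class RateLimiter:
    """Allows roughly `qps` acquisitions per second across sharing threads."""

    def __init__(self, qps, clock=time.monotonic, sleep=time.sleep):
        if qps <= 0:
            raise ValueError("qps must be positive")
        self._interval = 1.0 / qps
        self._clock = clock
        self._sleep = sleep
        self._lock = threading.Lock()
        self._next_slot = clock()

    def acquire(self):
        """Block until the caller's slot arrives; return the seconds waited."""
        with self._lock:
            # Reserve the next free slot while holding the lock.
            now = self._clock()
            slot = max(self._next_slot, now)
            self._next_slot = slot + self._interval
        wait = slot - now
        if wait > 0:
            self._sleep(wait)  # sleep outside the lock so others can reserve
        return wait


limiter = RateLimiter(qps=20)  # hypothetical client-side limit
sent_at = []


def worker():
    limiter.acquire()
    sent_at.append(time.monotonic())  # stand-in for sending one API request


threads = [threading.Thread(target=worker) for _ in range(5)]
started = time.monotonic()
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
print(f"{len(sent_at)} requests in {time.monotonic() - started:.2f}s")
```

Every thread that is about to send a request calls `limiter.acquire()` first, so the combined rate stays near the configured value no matter how many threads are running.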
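You can check out Guava Rate Limiter, or implement your own Token Bucket based algorithm for a clustered environment. For example, you could generate tokens and store them in shared transactional storage such as a database, and each client would have to acquire and consume a token before it processes a request. If the tokens are used up, the client has to wait until the next batch of tokens is generated.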
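The shared-storage idea can be sketched with SQLite standing in for the shared transactional database. Everything here is an assumption made for illustration: the `SharedTokenStore` class, the `bucket` table layout, and the batch size are hypothetical, and tokens are regenerated as a whole batch once per interval rather than refilled continuously.

```python
import os
import sqlite3
import tempfile
import time


class SharedTokenStore:
    """Token pool kept in a SQLite table so several clients can share it."""

    def __init__(self, path, tokens_per_batch, batch_seconds, now=None):
        # isolation_level=None: transactions are managed explicitly below.
        self._conn = sqlite3.connect(path, isolation_level=None)
        self._tokens_per_batch = tokens_per_batch
        self._batch_seconds = batch_seconds
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS bucket ("
            "id INTEGER PRIMARY KEY CHECK (id = 1), "
            "tokens INTEGER, refilled_at REAL)"
        )
        # Only the first client to arrive creates the initial batch.
        self._conn.execute(
            "INSERT OR IGNORE INTO bucket VALUES (1, ?, ?)",
            (tokens_per_batch, time.time() if now is None else now),
        )

    def try_acquire(self, now=None):
        """Consume one token atomically; return False if none are left."""
        now = time.time() if now is None else now
        self._conn.execute("BEGIN IMMEDIATE")  # take the write lock up front
        try:
            tokens, refilled_at = self._conn.execute(
                "SELECT tokens, refilled_at FROM bucket WHERE id = 1"
            ).fetchone()
            if now - refilled_at >= self._batch_seconds:
                # The interval has elapsed: generate the next batch.
                tokens, refilled_at = self._tokens_per_batch, now
            granted = tokens > 0
            if granted:
                tokens -= 1
            self._conn.execute(
                "UPDATE bucket SET tokens = ?, refilled_at = ? WHERE id = 1",
                (tokens, refilled_at),
            )
            self._conn.execute("COMMIT")
            return granted
        except Exception:
            self._conn.execute("ROLLBACK")
            raise


# Two clients sharing one database file; timestamps are passed explicitly
# so the example is deterministic.
path = os.path.join(tempfile.mkdtemp(), "tokens.db")
client_a = SharedTokenStore(path, tokens_per_batch=2, batch_seconds=1.0, now=0.0)
client_b = SharedTokenStore(path, tokens_per_batch=2, batch_seconds=1.0, now=0.0)
results = [
    client_a.try_acquire(now=0.1),  # token granted
    client_b.try_acquire(now=0.2),  # token granted
    client_a.try_acquire(now=0.3),  # batch used up
    client_b.try_acquire(now=1.5),  # next batch has been generated
]
print(results)
```

A client that receives `False` would sleep briefly and try again before sending its request. In a real cluster the store would be a database that all machines can reach, which a local SQLite file is not.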
Queueing
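A message queue is a solution for distributing operation load while also controlling request and consumer rates. A number of message queue options are available, some open source and some proprietary, and many of them work with different languages.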
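When using message queues, you can have multiple producers pushing messages to the queue and multiple consumers processing those messages. Throttling can be implemented on the consumer side by limiting the number of concurrent consumers, or by implementing rate limiters or throttlers for either the producers or the consumers.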
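Limiting the number of concurrent consumers can be sketched with an in-process queue and a fixed pool of consumer threads. This is a simplified stand-in for a real message queue product: `run_consumers`, `handle`, and the cap of 3 consumers are hypothetical, and all messages are enqueued before the consumers start.

```python
import queue
import threading
import time


def run_consumers(messages, handle, max_consumers):
    """Process all messages using at most max_consumers concurrent threads."""
    work = queue.Queue()
    for message in messages:
        work.put(message)  # in a real system, producers would push here

    def consume():
        while True:
            try:
                message = work.get_nowait()
            except queue.Empty:
                return  # queue drained; this simplified consumer exits
            handle(message)

    threads = [threading.Thread(target=consume) for _ in range(max_consumers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()


lock = threading.Lock()
stats = {"active": 0, "peak": 0}
processed = []


def handle(message):
    """Stand-in for sending one API request; records peak concurrency."""
    with lock:
        stats["active"] += 1
        stats["peak"] = max(stats["peak"], stats["active"])
    time.sleep(0.01)  # simulate request latency
    with lock:
        stats["active"] -= 1
        processed.append(message)


run_consumers(range(12), handle, max_consumers=3)
print(f"processed={len(processed)} peak_concurrency={stats['peak']}")
```

Because only three consumer threads exist, no more than three requests are ever in flight, regardless of how many messages the producers enqueue.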
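For example, if a message consumer encounters a rate limit error, that consumer can return the request to the queue to be retried. At the same time, that consumer can also notify all other consumers to pause processing for a number of seconds to recover from the error.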
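The requeue-and-pause behavior can be sketched as follows. The names are assumptions for illustration: `RateLimitError` is a hypothetical stand-in for the API's rate limit error, `PauseGate` is a hypothetical object shared by all consumers, and the demo uses a fake clock so it runs instantly. The sketch retries without an upper bound, which a production consumer would likely cap.

```python
import queue
import threading
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for the API's rate limit error."""


class PauseGate:
    """Shared by all consumers; lets any one of them pause everyone."""

    def __init__(self, clock=time.monotonic, sleep=time.sleep):
        self._clock = clock
        self._sleep = sleep
        self._lock = threading.Lock()
        self._resume_at = 0.0

    def pause_for(self, seconds):
        with self._lock:
            self._resume_at = max(self._resume_at, self._clock() + seconds)

    def wait_if_paused(self):
        while True:
            with self._lock:
                remaining = self._resume_at - self._clock()
            if remaining <= 0:
                return
            self._sleep(remaining)


def consume(work, gate, send, pause_seconds):
    """Drain the queue, requeueing requests that hit a rate limit error."""
    while True:
        try:
            request = work.get_nowait()
        except queue.Empty:
            return
        gate.wait_if_paused()  # honor a pause requested by any consumer
        try:
            send(request)
        except RateLimitError:
            work.put(request)  # return the request to the queue for retry
            gate.pause_for(pause_seconds)  # tell every consumer to back off


# Deterministic demo: the fake sleep advances the fake clock.
clock = [0.0]


def fake_sleep(seconds):
    clock[0] += seconds


attempts = []


def send(request):
    """Fails the first attempt at request 'b' to simulate a rate limit."""
    attempts.append(request)
    if request == "b" and attempts.count("b") == 1:
        raise RateLimitError()


gate = PauseGate(clock=lambda: clock[0], sleep=fake_sleep)
work = queue.Queue()
for request in ["a", "b", "c"]:
    work.put(request)
consume(work, gate, send, pause_seconds=5.0)
print(attempts, clock[0])
```

With several consumer threads, each would call `consume` with the same `work` queue and the same `gate`, so a rate limit error seen by one consumer delays the next request of every consumer.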
Last updated: 2025-09-05 (UTC)