Added instructions for using throttled-py to smooth API calls #1886


Open · wants to merge 1 commit into main from dev
Conversation

@ZhuoZhuoCrayon commented Jun 7, 2025

Summary

Added instructions for using throttled-py to smooth API calls.

Motivation

Simply backing off and retrying will waste part of the request budget on unnecessary retries.

Using the GCRA rate-limiting strategy provided by throttled-py in Wait & Retry mode, API calls can smoothly adhere to the OpenAI API rate limits.

For example, per_sec(2, burst=2) allows 2 requests per second with a burst capacity of 2 (the 🪣 bucket's capacity), so the burst is exhausted after the first 2 requests. With timeout=0.5 set, the example below completes 5 requests in 1.5 seconds: the burst absorbs the first 2 immediately, and the remaining 3 are admitted one by one as tokens refill over the next 1.5s:

import time
from throttled import RateLimiterType, Throttled, rate_limiter

@Throttled(
    key="chat.completions",
    using=RateLimiterType.GCRA.value,
    quota=rate_limiter.per_sec(2, burst=2),
    timeout=0.5,  # ⏳ Set timeout=0.5 to enable wait-and-retry (max wait 0.5 seconds).
)
def call_chat_completions(**kwargs):
    pass

def test_throttled():
    # Make 5 sequential requests
    start_time = time.time()
    for i in range(5):
        call_chat_completions()
        print(f"Request {i+1} completed at {time.time() - start_time:.2f}s")

    total_time = time.time() - start_time
    print(f"\nTotal time for 5 requests at 2/sec: {total_time:.2f}s")

if __name__ == "__main__":
    test_throttled()
Testing Throttled rate limiter...
------------------ Burst ------------------
Request 1 completed at 0.00s
Request 2 completed at 0.00s
-------- Refill: 1 token every 0.5s -------
Request 3 completed at 0.50s
Request 4 completed at 1.00s
Request 5 completed at 1.50s
-------------------------------------------

Total time for 5 requests at 2/sec: 1.50s
Expected time: ~1.5s
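The timing above can be reproduced with a small standalone GCRA simulation. This is an illustrative sketch of the algorithm's scheduling math, not throttled-py's actual implementation; the function name and structure are my own, and it advances a simulated clock instead of sleeping:

```python
# Illustrative GCRA (Generic Cell Rate Algorithm) simulation, NOT the
# throttled-py implementation: computes when each back-to-back request
# is admitted under per_sec(2, burst=2) in Wait & Retry mode.

def gcra_schedule(n_requests, rate_per_sec=2, burst=2):
    T = 1.0 / rate_per_sec   # emission interval: one token every T seconds
    tau = T * (burst - 1)    # burst tolerance beyond the first token
    tat = 0.0                # theoretical arrival time of the next token
    now = 0.0                # simulated clock; requests fire back-to-back
    admitted = []
    for _ in range(n_requests):
        tat = max(tat, now)
        # Wait & Retry mode: block until the request conforms.
        wait = max(0.0, (tat - tau) - now)
        now += wait
        admitted.append(now)
        tat += T
    return admitted

print(gcra_schedule(5))  # [0.0, 0.0, 0.5, 1.0, 1.5]
```

The admission times match the output shown above: the first two requests consume the burst at t=0, and each subsequent request waits one 0.5s emission interval.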

For new content

When contributing new content, read through our contribution guidelines, and mark the following action items as completed:

  • I have added a new entry in registry.yaml (and, optionally, in authors.yaml) so that my content renders on the cookbook website.
  • I have conducted a self-review of my content based on the contribution guidelines:
    • Relevance: This content is related to building with OpenAI technologies and is useful to others.
    • Uniqueness: I have searched for related examples in the OpenAI Cookbook, and verified that my content offers new insights or unique information compared to existing documentation.
    • Spelling and Grammar: I have checked for spelling or grammatical mistakes.
    • Clarity: I have done a final read-through and verified that my submission is well-organized and easy to understand.
    • Correctness: The information I include is correct and all of my code executes successfully.
    • Completeness: I have explained everything fully, including all necessary references and citations.

We will rate each of these areas on a scale from 1 to 4, and will only accept contributions that score 3 or higher on all areas. Refer to our contribution guidelines for more details.

@ZhuoZhuoCrayon ZhuoZhuoCrayon force-pushed the dev branch 2 times, most recently from 7780f43 to eb79f25 Compare June 7, 2025 05:09
@ZhuoZhuoCrayon (Author) commented Jun 7, 2025

@shyamal-anadkat @shikhar-cyber @josiah-openai Looking forward to some reviews.
