429 Rate Limited: Fixing "Too Many Requests" Errors
Resolve MCP 429 rate limit errors with retry strategies, batching solutions, and exponential backoff. Learn rate limit tiers and best practices to avoid hitting limits.
Understanding Rate Limits
Rate limits prevent abuse and ensure fair usage of the MCP API. When you exceed the limit, you receive a 429 error and must wait before making more requests.
What Triggers Rate Limits
- Making too many requests in a short time window
- Exceeding requests per minute or per hour quotas
- Hitting concurrent request limits
- Making expensive queries (large result sets) repeatedly
Rate Limit Tiers
Free Tier
- Requests per minute: 60
- Requests per hour: 1,000
- Concurrent requests: 5
- Best for: Personal use, testing, small workflows
Standard Tier
- Requests per minute: 300
- Requests per hour: 10,000
- Concurrent requests: 20
- Best for: Teams, regular automation, moderate usage
Enterprise Tier
- Requests per minute: 1,000+
- Requests per hour: Custom limits
- Concurrent requests: 100+
- Best for: Large teams, heavy automation, integrations
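The tier limits above can be encoded as a small lookup table so scripts can pace themselves automatically. This is a sketch based only on the numbers documented here; the dict name and helper are illustrative, and enterprise values are treated as minimums:

```python
# Assumed encoding of the documented tiers; enterprise limits are custom/minimums.
TIER_LIMITS = {
    "free":       {"per_minute": 60,    "per_hour": 1_000,  "concurrent": 5},
    "standard":   {"per_minute": 300,   "per_hour": 10_000, "concurrent": 20},
    "enterprise": {"per_minute": 1_000, "per_hour": None,   "concurrent": 100},
}

def min_request_spacing(tier):
    """Minimum delay (seconds) between requests to stay under the per-minute quota."""
    return 60 / TIER_LIMITS[tier]["per_minute"]
```

For example, the free tier works out to one request per second, while the standard tier allows one every 200 ms.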
Common Causes
1. Tight Loops Without Delays
Calling MCP tools in a loop without any pause between requests:
Problem Code
# ❌ BAD: No delay between requests
for task_id in task_ids:
    update_task(task_id, status="done")
# Immediately hits the rate limit after 60 requests
Solution: Add Delays
# ✅ GOOD: Add a delay between requests
import time
for task_id in task_ids:
    update_task(task_id, status="done")
    time.sleep(0.1)  # 100 ms delay = max 10 req/sec
2. Parallel Requests Exceeding Limit
Running too many concurrent requests at once:
Problem
Starting 50 parallel requests when limit is 5 concurrent.
Solution: Limit Concurrency
- Process in batches (5 at a time for free tier, 20 for standard)
- Use a queue with max workers
- Implement semaphore or rate limiter in your code
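The semaphore approach above can be sketched in a few lines: a `threading.Semaphore` caps how many calls are in flight at once, regardless of how many worker threads you start. Here `update_task` is a hypothetical stand-in for your real MCP tool call:

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

def update_task(task_id, status):
    # Hypothetical stand-in for an MCP tool call; simulates network latency.
    time.sleep(0.01)
    return {"id": task_id, "status": status}

MAX_CONCURRENT = 5  # free-tier concurrent request limit
semaphore = threading.Semaphore(MAX_CONCURRENT)

def rate_limited_update(task_id):
    # The semaphore guarantees at most MAX_CONCURRENT calls are in flight.
    with semaphore:
        return update_task(task_id, status="done")

with ThreadPoolExecutor(max_workers=20) as pool:
    results = list(pool.map(rate_limited_update, range(50)))
```

Even with 20 worker threads and 50 tasks, only 5 requests ever run concurrently; the rest queue on the semaphore.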
3. No Retry Logic
When you hit the rate limit, retrying immediately only makes things worse:
Problem
Immediate retry after 429 error causes more errors.
Solution: Exponential Backoff
Wait progressively longer between retries: 1s, 2s, 4s, 8s, 16s...
Solutions and Best Practices
1. Implement Exponential Backoff
Retry Strategy
import time

def call_with_retry(func, max_retries=5):
    retry_count = 0
    base_delay = 1  # start with 1 second
    while retry_count < max_retries:
        try:
            return func()
        except RateLimitError:
            delay = base_delay * (2 ** retry_count)  # 1s, 2s, 4s, 8s, 16s
            retry_count += 1
            print(f"Rate limited. Waiting {delay}s...")
            time.sleep(delay)
    raise Exception("Max retries exceeded")
2. Batch Operations
Batching Pattern
Group multiple updates into fewer API calls:
- Collect all changes you want to make
- Group into batches of 10-20 items
- Process each batch with delay between batches
- Add exponential backoff for retries
See Batching Guide for detailed examples.
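The collect-and-group steps above can be sketched as follows. `batch_update_tasks` is a hypothetical bulk endpoint (one API call applying a list of changes); substitute whatever bulk operation your MCP server exposes:

```python
import time

def batch_update_tasks(batch):
    # Hypothetical bulk endpoint: one API call applies a whole list of changes.
    return len(batch)

def chunked(items, size):
    """Yield consecutive batches of at most `size` items."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

# Step 1: collect all changes up front.
changes = [{"task_id": n, "status": "done"} for n in range(45)]

# Steps 2-3: group into batches and process with a pause between batches.
batch_calls = 0
applied = 0
for batch in chunked(changes, 20):
    applied += batch_update_tasks(batch)
    batch_calls += 1
    time.sleep(0.01)  # shortened for the demo; use ~1s against the real API
```

Forty-five individual updates collapse into three API calls, which is the main lever for staying under per-minute quotas.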
3. Use Retry-After Header
Smart Retry
The 429 response includes a Retry-After header telling you how long to wait:
import time

response = make_request()
if response.status == 429:
    # Header values arrive as strings; default to 60s if the header is missing.
    retry_after = int(response.headers.get('Retry-After', 60))
    print(f"Rate limited. Waiting {retry_after}s...")
    time.sleep(retry_after)
    # Retry the request
4. Optimize Query Patterns
Reduce Request Count
- Use filters: Get only the tasks you need, not all tasks
- Pagination: Fetch smaller pages instead of huge result sets
- Caching: Cache read results for a few minutes when appropriate
- Aggregate queries: Fetch once and process locally instead of repeated queries
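The caching point above can be sketched with a simple TTL cache: repeated reads within the freshness window are served locally and never touch the API. `fetch_tasks` is a hypothetical read call standing in for your MCP client's query tool, and the 120-second TTL is an assumed value to tune per workload:

```python
import time

_cache = {}
TTL_SECONDS = 120  # assumed freshness window; tune to how stale your data may be
api_calls = 0

def fetch_tasks(filters):
    # Hypothetical read call; replace with your MCP client's query tool.
    global api_calls
    api_calls += 1
    return [{"id": 1, "status": "open"}]

def cached_fetch_tasks(filters):
    key = tuple(sorted(filters.items()))
    now = time.monotonic()
    hit = _cache.get(key)
    if hit and now - hit[0] < TTL_SECONDS:
        return hit[1]  # served from cache: zero API requests
    result = fetch_tasks(filters)
    _cache[key] = (now, result)
    return result

cached_fetch_tasks({"status": "open"})
cached_fetch_tasks({"status": "open"})  # second call hits the cache
```

Only the first call counts against your quota; every identical read within the TTL is free.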
Example: Rate-Limit-Safe Bulk Update
Complete Implementation
import time

def bulk_update_tasks(task_ids, status):
    BATCH_SIZE = 10
    DELAY_BETWEEN_BATCHES = 1  # 1 second
    MAX_RETRIES = 5
    for i in range(0, len(task_ids), BATCH_SIZE):
        batch = task_ids[i:i + BATCH_SIZE]
        retry_count = 0
        while retry_count < MAX_RETRIES:
            try:
                for task_id in batch:
                    update_task(task_id, status=status)
                    time.sleep(0.1)  # small delay between items
                break  # success, exit retry loop
            except RateLimitError:
                retry_count += 1
                delay = DELAY_BETWEEN_BATCHES * (2 ** retry_count)
                print(f"Rate limited. Retry {retry_count}/{MAX_RETRIES} after {delay}s")
                time.sleep(delay)
        else:
            # The while loop exhausted its retries without a successful break.
            raise Exception(f"Batch starting at index {i} failed after {MAX_RETRIES} retries")
        # Delay between batches
        time.sleep(DELAY_BETWEEN_BATCHES)
Monitoring and Prevention
Best Practices
- Track request counts: Log how many requests you're making
- Add delays proactively: Don't wait for 429 errors before slowing down
- Use rate limiters: Implement client-side rate limiting
- Monitor response headers: Check the X-RateLimit-Remaining header
- Upgrade tier if needed: If you regularly hit limits, consider upgrading
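A common way to implement the client-side rate limiting recommended above is a token bucket: tokens refill at your target rate, and each request spends one, sleeping only when the bucket is empty. This is a minimal sketch; the class name and parameters are illustrative:

```python
import time

class TokenBucket:
    """Client-side limiter: at most `rate` requests/second, bursts up to `capacity`."""

    def __init__(self, rate, capacity):
        self.rate = float(rate)
        self.capacity = float(capacity)
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def acquire(self):
        # Refill tokens for the time elapsed since the last call, capped at capacity.
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens < 1:
            # Sleep just long enough for one token to accumulate.
            time.sleep((1 - self.tokens) / self.rate)
            self.tokens = 1
            self.last = time.monotonic()
        self.tokens -= 1

# One request per second stays comfortably under a 60-requests/minute quota.
bucket = TokenBucket(rate=1.0, capacity=1)
```

Call `bucket.acquire()` immediately before every MCP request; the bucket then paces your whole script without any 429 handling firing in the common case.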
Quick Fix Checklist
- ✅ Added delays between requests (100-200ms minimum)
- ✅ Implemented exponential backoff for retries
- ✅ Limiting concurrent requests to your tier's limit
- ✅ Using the Retry-After header from 429 responses
- ✅ Batching operations instead of individual requests
- ✅ Using filters and pagination to reduce request count
- ✅ Monitoring rate limit headers in responses
When to Upgrade Tier
Consider Upgrading If:
- Regularly hitting rate limits despite optimizations
- Your workflows require more than 60 requests per minute
- Running automated systems that need higher throughput
- Team size growing and multiple people using MCP
- Building integrations that query frequently
Contact sales for enterprise tier pricing and custom limits.
