429 Rate Limited: Fixing "Too Many Requests" Errors
Resolve MCP 429 rate limit errors with retry strategies, batching solutions, and exponential backoff. Learn rate limit tiers and best practices to avoid hitting limits.
Understanding Rate Limits
Rate limits prevent abuse and ensure fair usage of the MCP API. When you exceed the limit, you receive a 429 error and must wait before making more requests.
What Triggers Rate Limits
- Making too many requests in a short time window
- Exceeding requests per minute or per hour quotas
- Hitting concurrent request limits
- Making expensive queries (large result sets) repeatedly
Rate Limit Tiers
Free Tier
- Requests per minute: 60
- Requests per hour: 1,000
- Concurrent requests: 5
- Best for: Personal use, testing, small workflows
Standard Tier
- Requests per minute: 300
- Requests per hour: 10,000
- Concurrent requests: 20
- Best for: Teams, regular automation, moderate usage
Enterprise Tier
- Requests per minute: 1,000+
- Requests per hour: Custom limits
- Concurrent requests: 100+
- Best for: Large teams, heavy automation, integrations
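The tier limits above can be encoded as a small lookup table so scripts can pace themselves automatically. This is a sketch based only on the numbers documented here; the dict name and helper are illustrative, and enterprise values are treated as minimums:

```python
# Assumed encoding of the documented tiers; enterprise limits are custom/minimums.
TIER_LIMITS = {
    "free":       {"per_minute": 60,    "per_hour": 1_000,  "concurrent": 5},
    "standard":   {"per_minute": 300,   "per_hour": 10_000, "concurrent": 20},
    "enterprise": {"per_minute": 1_000, "per_hour": None,   "concurrent": 100},
}

def min_request_spacing(tier):
    """Minimum delay (seconds) between requests to stay under the per-minute quota."""
    return 60 / TIER_LIMITS[tier]["per_minute"]
```

For example, the free tier works out to one request per second, while the standard tier allows one every 200 ms.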
Common Causes
1. Tight Loops Without Delays
Calling MCP tools in a loop without any pause between requests:
Problem Code
# ❌ BAD: No delay between requests
for task_id in task_ids:
    update_task(task_id, status="done")
# Immediately hits the rate limit after 60 requests
Solution: Add Delays
# ✅ GOOD: Add a delay between requests
import time
for task_id in task_ids:
    update_task(task_id, status="done")
    time.sleep(0.1)  # 100 ms delay = max 10 req/sec
2. Parallel Requests Exceeding Limit
Running too many concurrent requests at once:
Problem
Starting 50 parallel requests when limit is 5 concurrent.
Solution: Limit Concurrency
- Process in batches (5 at a time for free tier, 20 for standard)
- Use a queue with max workers
- Implement semaphore or rate limiter in your code
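The semaphore approach above can be sketched in a few lines: a `threading.Semaphore` caps how many calls are in flight at once, regardless of how many worker threads you start. Here `update_task` is a hypothetical stand-in for your real MCP tool call:

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

def update_task(task_id, status):
    # Hypothetical stand-in for an MCP tool call; simulates network latency.
    time.sleep(0.01)
    return {"id": task_id, "status": status}

MAX_CONCURRENT = 5  # free-tier concurrent request limit
semaphore = threading.Semaphore(MAX_CONCURRENT)

def rate_limited_update(task_id):
    # The semaphore guarantees at most MAX_CONCURRENT calls are in flight.
    with semaphore:
        return update_task(task_id, status="done")

with ThreadPoolExecutor(max_workers=20) as pool:
    results = list(pool.map(rate_limited_update, range(50)))
```

Even with 20 worker threads and 50 tasks, only 5 requests ever run concurrently; the rest queue on the semaphore.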
3. No Retry Logic
When you hit the rate limit, retrying immediately only makes things worse:
Problem
Immediate retry after 429 error causes more errors.
Solution: Exponential Backoff
Wait progressively longer between retries: 1s, 2s, 4s, 8s, 16s...
Solutions and Best Practices
1. Implement Exponential Backoff
Retry Strategy
import time

def call_with_retry(func, max_retries=5):
    retry_count = 0
    base_delay = 1  # start with 1 second
    while retry_count < max_retries:
        try:
            return func()
        except RateLimitError:
            delay = base_delay * (2 ** retry_count)  # 1s, 2s, 4s, 8s, 16s
            retry_count += 1
            print(f"Rate limited. Waiting {delay}s...")
            time.sleep(delay)
    raise Exception("Max retries exceeded")
2. Batch Operations
Batching Pattern
Group multiple updates into fewer API calls:
- Collect all changes you want to make
- Group into batches of 10-20 items
- Process each batch with delay between batches
- Add exponential backoff for retries
See Batching Guide for detailed examples.
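The collect-and-group steps above can be sketched as follows. `batch_update_tasks` is a hypothetical bulk endpoint (one API call applying a list of changes); substitute whatever bulk operation your MCP server exposes:

```python
import time

def batch_update_tasks(batch):
    # Hypothetical bulk endpoint: one API call applies a whole list of changes.
    return len(batch)

def chunked(items, size):
    """Yield consecutive batches of at most `size` items."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

# Step 1: collect all changes up front.
changes = [{"task_id": n, "status": "done"} for n in range(45)]

# Steps 2-3: group into batches and process with a pause between batches.
batch_calls = 0
applied = 0
for batch in chunked(changes, 20):
    applied += batch_update_tasks(batch)
    batch_calls += 1
    time.sleep(0.01)  # shortened for the demo; use ~1s against the real API
```

Forty-five individual updates collapse into three API calls, which is the main lever for staying under per-minute quotas.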
3. Use Retry-After Header
Smart Retry
The 429 response includes a Retry-After header telling you how long to wait:
import time

response = make_request()
if response.status == 429:
    # Header values arrive as strings; default to 60s if the header is missing.
    retry_after = int(response.headers.get('Retry-After', 60))
    print(f"Rate limited. Waiting {retry_after}s...")
    time.sleep(retry_after)
    # Retry the request
4. Optimize Query Patterns
Reduce Request Count
- Use filters: Get only the tasks you need, not all tasks
- Pagination: Fetch smaller pages instead of huge result sets
- Caching: Cache read results for a few minutes when appropriate
- Aggregate queries: Fetch once and process locally instead of repeated queries
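The caching point above can be sketched with a simple TTL cache: repeated reads within the freshness window are served locally and never touch the API. `fetch_tasks` is a hypothetical read call standing in for your MCP client's query tool, and the 120-second TTL is an assumed value to tune per workload:

```python
import time

_cache = {}
TTL_SECONDS = 120  # assumed freshness window; tune to how stale your data may be
api_calls = 0

def fetch_tasks(filters):
    # Hypothetical read call; replace with your MCP client's query tool.
    global api_calls
    api_calls += 1
    return [{"id": 1, "status": "open"}]

def cached_fetch_tasks(filters):
    key = tuple(sorted(filters.items()))
    now = time.monotonic()
    hit = _cache.get(key)
    if hit and now - hit[0] < TTL_SECONDS:
        return hit[1]  # served from cache: zero API requests
    result = fetch_tasks(filters)
    _cache[key] = (now, result)
    return result

cached_fetch_tasks({"status": "open"})
cached_fetch_tasks({"status": "open"})  # second call hits the cache
```

Only the first call counts against your quota; every identical read within the TTL is free.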
Example: Rate-Limit-Safe Bulk Update
Complete Implementation
import time

def bulk_update_tasks(task_ids, status):
    BATCH_SIZE = 10
    DELAY_BETWEEN_BATCHES = 1  # 1 second
    MAX_RETRIES = 5
    for i in range(0, len(task_ids), BATCH_SIZE):
        batch = task_ids[i:i + BATCH_SIZE]
        retry_count = 0
        while retry_count < MAX_RETRIES:
            try:
                for task_id in batch:
                    update_task(task_id, status=status)
                    time.sleep(0.1)  # small delay between items
                break  # success, exit retry loop
            except RateLimitError:
                retry_count += 1
                delay = DELAY_BETWEEN_BATCHES * (2 ** retry_count)
                print(f"Rate limited. Retry {retry_count}/{MAX_RETRIES} after {delay}s")
                time.sleep(delay)
        else:
            # The while loop exhausted its retries without a successful break.
            raise Exception(f"Batch starting at index {i} failed after {MAX_RETRIES} retries")
        # Delay between batches
        time.sleep(DELAY_BETWEEN_BATCHES)
Monitoring and Prevention
Best Practices
- Track request counts: Log how many requests you're making
- Add delays proactively: Don't wait for 429 errors before slowing down
- Use rate limiters: Implement client-side rate limiting
- Monitor response headers: Check the X-RateLimit-Remaining header
- Upgrade tier if needed: If you regularly hit limits, consider upgrading
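A common way to implement the client-side rate limiting recommended above is a token bucket: tokens refill at your target rate, and each request spends one, sleeping only when the bucket is empty. This is a minimal sketch; the class name and parameters are illustrative:

```python
import time

class TokenBucket:
    """Client-side limiter: at most `rate` requests/second, bursts up to `capacity`."""

    def __init__(self, rate, capacity):
        self.rate = float(rate)
        self.capacity = float(capacity)
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def acquire(self):
        # Refill tokens for the time elapsed since the last call, capped at capacity.
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens < 1:
            # Sleep just long enough for one token to accumulate.
            time.sleep((1 - self.tokens) / self.rate)
            self.tokens = 1
            self.last = time.monotonic()
        self.tokens -= 1

# One request per second stays comfortably under a 60-requests/minute quota.
bucket = TokenBucket(rate=1.0, capacity=1)
```

Call `bucket.acquire()` immediately before every MCP request; the bucket then paces your whole script without any 429 handling firing in the common case.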
Quick Fix Checklist
- ✅ Added delays between requests (100-200ms minimum)
- ✅ Implemented exponential backoff for retries
- ✅ Limiting concurrent requests to your tier's limit
- ✅ Using the Retry-After header from 429 responses
- ✅ Batching operations instead of individual requests
- ✅ Using filters and pagination to reduce request count
- ✅ Monitoring rate limit headers in responses
When to Upgrade Tier
Consider Upgrading If:
- Regularly hitting rate limits despite optimizations
- Your workflows require more than 60 requests per minute
- Running automated systems that need higher throughput
- Team size growing and multiple people using MCP
- Building integrations that query frequently
Contact sales for enterprise tier pricing and custom limits.
