Rate Limiting¶

The Sherpai Partner API implements rate limiting to ensure fair usage and maintain service quality for all users.

Overview¶

Rate limiting is applied globally across all API endpoints.

Rate Limits¶

Global Application Limits¶

Limit: 60 requests per minute per user, 1000 requests per hour per uer
Scope: Applied globally to all endpoints
Reset Period: Rolling window (1 minute or 1 hour)

Default User Quota¶

Limit: 1000 requests per hour per user
Period: 1 hour (3,600 seconds)

Rate Limit Headers¶

All API responses include rate limit information in the response headers:

Header	Description	Example
`x-ratelimit-limit-hourly`	Maximum number of requests allowed in the period	`1000`
`x-ratelimit-remaining-hourly`	Number of requests remaining in the current period	`995`
`x-ratelimit-reset-hourly`	Unix timestamp when the rate limit resets	`1704067200`
`X-Usage-Total`	Total API calls made in the current period	`5`
`X-Usage-Period`	Period type (hourly, daily, etc.)	`hourly`
`x-ratelimit-limit-minutely`	Maximum number of requests allowed in the period	`60`
`x-ratelimit-remaining-minutely`	Number of requests remaining in the current period	`33`
`x-ratelimit-reset-minutely`	Unix timestamp when the rate limit resets	`1704067205`

Example Response Headers¶

HTTP/1.1 200 OK
x-ratelimit-limit-hourly: 1000
x-ratelimit-remaining-hourly: 995
x-ratelimit-reset-hourly: 1704067200
X-Usage-Total: 50
X-Usage-Period: hourly
x-ratelimit-limit-minutely: 60
x-ratelimit-remaining-minutely: 55
x-ratelimit-reset-minutely: 1704067205

Rate Limit Exceeded Response¶

When the rate limit is exceeded, the API returns a 429 Too Many Requests status code with detailed error information:

Response Format¶

{
  "detail": "Rate limit exceeded. Please wait 2 minutes and 30 seconds before making another request.",
  "error_code": "RATE_LIMIT_EXCEEDED",
  "status_code": 429,
  "status": "error",
  "timestamp": "2025-01-01T12:00:00Z",
  "request_id": "abc123-def456-ghi789",
  "retry_after": 150
}

Response Fields¶

Field	Type	Description
`detail`	string	Human-readable error message with retry time
`error_code`	string	Error code: `RATE_LIMIT_EXCEEDED`
`status_code`	integer	HTTP status code: `429`
`status`	string	Status: `error`
`timestamp`	string	ISO 8601 timestamp of the error
`request_id`	string	Unique request identifier for support
`retry_after`	integer	Seconds to wait before retrying (optional)

Handling Rate Limits¶

Best Practices¶

Monitor Rate Limit Headers: Always check X-RateLimit-Remaining to track your usage
Implement Exponential Backoff: When you receive a 429 response, wait before retrying
Cache Responses: Cache API responses when possible to reduce API calls
Batch Requests: Combine multiple operations into single requests when possible
Handle Errors Gracefully: Implement proper error handling for 429 responses

Exponential Backoff Strategy¶

When you receive a 429 response, implement exponential backoff:

async function makeRequestWithRetry(url, options, maxRetries = 3) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const response = await fetch(url, options);

    if (response.status === 429) {
      const retryAfter = response.headers.get('Retry-After') ||
                        response.headers.get('X-RateLimit-Reset');
      const waitTime = retryAfter
        ? parseInt(retryAfter)
        : Math.pow(2, attempt) * 1000; // Exponential backoff

      console.log(`Rate limited. Waiting ${waitTime}ms before retry...`);
      await new Promise(resolve => setTimeout(resolve, waitTime));
      continue;
    }

    return response;
  }
  throw new Error('Max retries exceeded');
}

Monitoring Rate Limit Headers¶

Always check rate limit headers to avoid hitting limits:

import requests

def make_api_request(url, headers):
    response = requests.get(url, headers=headers)

    # Check rate limit headers
    limit = int(response.headers.get('X-RateLimit-Limit', 0))
    remaining = int(response.headers.get('X-RateLimit-Remaining', 0))
    reset = int(response.headers.get('X-RateLimit-Reset', 0))

    print(f"Rate Limit: {remaining}/{limit} remaining")
    print(f"Resets at: {reset}")

    # Warn if approaching limit
    if remaining < limit * 0.1:  # Less than 10% remaining
        print("⚠️ Warning: Approaching rate limit!")

    return response

Caching Strategy¶

Implement caching to reduce API calls:

import time
from functools import lru_cache

# Simple in-memory cache with TTL
cache = {}

def cached_request(url, headers, ttl=300):
    cache_key = f"{url}:{headers.get('Authorization', '')}"

    if cache_key in cache:
        data, timestamp = cache[cache_key]
        if time.time() - timestamp < ttl:
            return data

    response = requests.get(url, headers=headers)
    if response.status_code == 200:
        cache[cache_key] = (response.json(), time.time())

    return response

Code Examples¶

JavaScript/Node.js¶

class APIClient {
  constructor(accessToken) {
    this.accessToken = accessToken;
    this.baseURL = 'https://partner-api.sherp.ai';
  }

  async request(endpoint, options = {}) {
    const url = `${this.baseURL}${endpoint}`;
    const headers = {
      'Authorization': `Bearer ${this.accessToken}`,
      'Content-Type': 'application/json',
      ...options.headers
    };

    let retries = 3;
    while (retries > 0) {
      const response = await fetch(url, { ...options, headers });

      // Check rate limit headers
      const remaining = parseInt(response.headers.get('X-RateLimit-Remaining') || '0');
      const reset = parseInt(response.headers.get('X-RateLimit-Reset') || '0');

      if (remaining < 100) {
        console.warn(`⚠️ Rate limit low: ${remaining} requests remaining`);
      }

      // Handle rate limit exceeded
      if (response.status === 429) {
        const retryAfter = response.headers.get('Retry-After') ||
                          Math.ceil((reset * 1000 - Date.now()) / 1000);

        console.log(`Rate limited. Retrying after ${retryAfter} seconds...`);
        await new Promise(resolve => setTimeout(resolve, retryAfter * 1000));
        retries--;
        continue;
      }

      if (!response.ok) {
        throw new Error(`HTTP ${response.status}: ${await response.text()}`);
      }

      return await response.json();
    }

    throw new Error('Max retries exceeded');
  }
}

// Usage
const client = new APIClient('your_access_token');
const data = await client.request('/posts/');

Python¶

import requests
import time
from typing import Optional, Dict, Any

class APIClient:
    def __init__(self, access_token: str):
        self.access_token = access_token
        self.base_url = 'https://partner-api.sherp.ai'
        self.session = requests.Session()
        self.session.headers.update({
            'Authorization': f'Bearer {access_token}',
            'Content-Type': 'application/json'
        })

    def request(self, endpoint: str, max_retries: int = 3) -> Dict[str, Any]:
        url = f'{self.base_url}{endpoint}'

        for attempt in range(max_retries):
            response = self.session.get(url)

            # Check rate limit headers
            remaining = int(response.headers.get('X-RateLimit-Remaining', 0))
            limit = int(response.headers.get('X-RateLimit-Limit', 0))
            reset = int(response.headers.get('X-RateLimit-Reset', 0))

            if remaining < limit * 0.1:
                print(f"⚠️ Warning: Only {remaining}/{limit} requests remaining")

            # Handle rate limit exceeded
            if response.status_code == 429:
                error_data = response.json()
                retry_after = error_data.get('retry_after',
                                           reset - int(time.time()))

                if retry_after > 0:
                    print(f"Rate limited. Waiting {retry_after} seconds...")
                    time.sleep(retry_after)
                    continue

            response.raise_for_status()
            return response.json()

        raise Exception('Max retries exceeded')

# Usage
client = APIClient('your_access_token')
data = client.request('/posts/')

cURL¶

#!/bin/bash

# Make request and check rate limit headers
response=$(curl -s -i -H "Authorization: Bearer YOUR_TOKEN" \
  "https://partner-api.sherp.ai/posts/")

# Extract headers
remaining=$(echo "$response" | grep -i "X-RateLimit-Remaining" | cut -d' ' -f2 | tr -d '\r')
limit=$(echo "$response" | grep -i "X-RateLimit-Limit" | cut -d' ' -f2 | tr -d '\r')

echo "Rate Limit: $remaining/$limit requests remaining"

# Check if rate limited
if echo "$response" | grep -q "429"; then
    echo "Rate limit exceeded!"
    retry_after=$(echo "$response" | grep -i "Retry-After" | cut -d' ' -f2 | tr -d '\r')
    echo "Retry after: $retry_after seconds"
fi

Storage and Infrastructure¶

Error Handling - Learn about error responses and handling
Authentication - Understand authentication requirements

For questions about rate limiting, please contact our support team.