Rate Limiting

The Sherpai Partner API implements rate limiting to ensure fair usage and maintain service quality for all users.

Overview

Rate limiting is applied globally across all API endpoints.

Rate Limits

Global Application Limits

  • Limit: 60 requests per minute per user, 1000 requests per hour per user
  • Scope: Applied globally to all endpoints
  • Reset Period: Rolling window (1 minute or 1 hour)

Default User Quota

  • Limit: 1000 requests per hour per user
  • Period: 1 hour (3,600 seconds)
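
A client can avoid most 429 responses by throttling itself to the limits listed above. The following is a minimal sketch of a client-side sliding-window guard, assuming the documented 60-per-minute and 1000-per-hour limits. The class name `SlidingWindowThrottle` and its methods are illustrative and not part of the Sherpai API; the server's own accounting may differ slightly from this local estimate.

```python
import time
from collections import deque


class SlidingWindowThrottle:
    """Local estimate of the per-minute and per-hour request windows (hypothetical helper)."""

    def __init__(self, limits=((60, 60.0), (1000, 3600.0)), clock=time.monotonic):
        self._limits = limits          # (max_requests, period_seconds) pairs
        self._clock = clock            # injectable clock, useful for testing
        self._longest = max(period for _, period in limits)
        self._sent = deque()           # timestamps of requests in the longest window

    def wait_time(self) -> float:
        """Seconds to wait until one more request fits inside every window."""
        now = self._clock()
        # Forget requests that have left the longest window
        while self._sent and now - self._sent[0] >= self._longest:
            self._sent.popleft()
        wait = 0.0
        for max_requests, period in self._limits:
            recent = [t for t in self._sent if now - t < period]
            if len(recent) >= max_requests:
                # This request must age out before a new one is allowed
                blocking = recent[-max_requests]
                wait = max(wait, blocking + period - now)
        return wait

    def record(self) -> None:
        """Call once for every request actually sent."""
        self._sent.append(self._clock())
```

Before each call, sleep for `throttle.wait_time()` seconds and then call `throttle.record()`. The rate limit headers returned by the API remain the authoritative source; this guard only reduces how often the limit is reached.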

Rate Limit Headers

All API responses include rate limit information in the response headers:

Header                          Description                                         Example
x-ratelimit-limit-hourly        Maximum number of requests allowed in the period    1000
x-ratelimit-remaining-hourly    Number of requests remaining in the current period  995
x-ratelimit-reset-hourly        Unix timestamp when the rate limit resets           1704067200
X-Usage-Total                   Total API calls made in the current period          5
X-Usage-Period                  Period type (hourly, daily, etc.)                   hourly
x-ratelimit-limit-minutely      Maximum number of requests allowed in the period    60
x-ratelimit-remaining-minutely  Number of requests remaining in the current period  33
x-ratelimit-reset-minutely      Unix timestamp when the rate limit resets           1704067205

Example Response Headers

HTTP/1.1 200 OK
x-ratelimit-limit-hourly: 1000
x-ratelimit-remaining-hourly: 995
x-ratelimit-reset-hourly: 1704067200
X-Usage-Total: 5
X-Usage-Period: hourly
x-ratelimit-limit-minutely: 60
x-ratelimit-remaining-minutely: 55
x-ratelimit-reset-minutely: 1704067205
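
The headers above come in two parallel sets, one per window. As a sketch of how a client might read them, the helper below collects the three headers for a given window into a small record. The names `WindowStatus` and `parse_window` are illustrative assumptions, not part of the API; only the header names are taken from the table.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class WindowStatus:
    limit: int
    remaining: int
    reset_at: int  # Unix timestamp, in seconds

    def seconds_until_reset(self, now: float) -> float:
        """Time left until the window resets, never negative."""
        return max(0.0, self.reset_at - now)


def parse_window(headers: Mapping[str, str], window: str) -> Optional[WindowStatus]:
    """Read the limit, remaining, and reset headers for 'hourly' or 'minutely'."""
    # HTTP header names are case-insensitive, so normalise before lookup
    lowered = {name.lower(): value for name, value in headers.items()}
    try:
        return WindowStatus(
            limit=int(lowered[f"x-ratelimit-limit-{window}"]),
            remaining=int(lowered[f"x-ratelimit-remaining-{window}"]),
            reset_at=int(lowered[f"x-ratelimit-reset-{window}"]),
        )
    except (KeyError, ValueError):
        return None  # headers missing or not numeric
```

With the example response above, `parse_window(headers, "hourly")` would yield a limit of 1000, 995 remaining, and a reset timestamp of 1704067200.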

Rate Limit Exceeded Response

When the rate limit is exceeded, the API returns a 429 Too Many Requests status code with detailed error information:

Response Format

{
  "detail": "Rate limit exceeded. Please wait 2 minutes and 30 seconds before making another request.",
  "error_code": "RATE_LIMIT_EXCEEDED",
  "status_code": 429,
  "status": "error",
  "timestamp": "2025-01-01T12:00:00Z",
  "request_id": "abc123-def456-ghi789",
  "retry_after": 150
}

Response Fields

Field         Type     Description
detail        string   Human-readable error message with retry time
error_code    string   Error code: RATE_LIMIT_EXCEEDED
status_code   integer  HTTP status code: 429
status        string   Status: error
timestamp     string   ISO 8601 timestamp of the error
request_id    string   Unique request identifier for support
retry_after   integer  Seconds to wait before retrying (optional)
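
Because `retry_after` is optional, a client needs a fallback when it is absent. The sketch below shows one way to turn the error body into a wait time. The function name `retry_delay_seconds` and the one-second default are assumptions chosen for illustration; the field names come from the table above.

```python
import json
from typing import Optional


def retry_delay_seconds(body_text: str, default: float = 1.0) -> Optional[float]:
    """Return how long to wait for a rate-limit error body.

    Returns None when the body is not a RATE_LIMIT_EXCEEDED error.
    """
    try:
        body = json.loads(body_text)
    except json.JSONDecodeError:
        return None
    if not isinstance(body, dict) or body.get("error_code") != "RATE_LIMIT_EXCEEDED":
        return None
    retry_after = body.get("retry_after")
    # retry_after is documented as optional, so fall back to the default
    if isinstance(retry_after, (int, float)) and retry_after >= 0:
        return float(retry_after)
    return default
```

For the example response shown earlier, this returns 150.0, matching the "2 minutes and 30 seconds" in the `detail` message.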

Handling Rate Limits

Best Practices

  1. Monitor Rate Limit Headers: Always check x-ratelimit-remaining-minutely and x-ratelimit-remaining-hourly to track your usage
  2. Implement Exponential Backoff: When you receive a 429 response, wait before retrying
  3. Cache Responses: Cache API responses when possible to reduce API calls
  4. Batch Requests: Combine multiple operations into single requests when possible
  5. Handle Errors Gracefully: Implement proper error handling for 429 responses

Exponential Backoff Strategy

When you receive a 429 response, implement exponential backoff:

async function makeRequestWithRetry(url, options, maxRetries = 3) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const response = await fetch(url, options);

    if (response.status === 429) {
      // Prefer the server's hint, in seconds: a Retry-After header if one is
      // sent, otherwise the retry_after field of the JSON error body.
      const body = await response.json().catch(() => ({}));
      const retryAfter = parseInt(response.headers.get('Retry-After') ?? body?.retry_after, 10);
      const waitTime = Number.isNaN(retryAfter)
        ? Math.pow(2, attempt) * 1000 // Exponential backoff: 1s, 2s, 4s
        : retryAfter * 1000;          // Convert seconds to milliseconds

      console.log(`Rate limited. Waiting ${waitTime}ms before retry...`);
      await new Promise(resolve => setTimeout(resolve, waitTime));
      continue;
    }

    return response;
  }
  throw new Error('Max retries exceeded');
}
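
When many clients are rate limited at the same moment, a pure doubling schedule makes them all retry together. A common refinement is to cap the delay and add random jitter. The following is a small sketch of that idea, not something the API requires; `backoff_delay`, `base`, and `cap` are illustrative names and values.

```python
import random


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0,
                  rng=random.random) -> float:
    """Full-jitter backoff: a random delay below min(cap, base * 2**attempt) seconds."""
    ceiling = min(cap, base * (2 ** attempt))
    # rng() returns a float in [0, 1), spreading retries across the interval
    return rng() * ceiling
```

A server-provided `retry_after` value, when present, is still the better choice; a jittered delay like this is only a fallback for responses that carry no hint.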

Monitoring Rate Limit Headers

Always check rate limit headers to avoid hitting limits:

import requests

def make_api_request(url, headers):
    response = requests.get(url, headers=headers)

    # Check rate limit headers for both the per-minute and per-hour windows
    for window in ('minutely', 'hourly'):
        limit = int(response.headers.get(f'x-ratelimit-limit-{window}', 0))
        remaining = int(response.headers.get(f'x-ratelimit-remaining-{window}', 0))
        reset = int(response.headers.get(f'x-ratelimit-reset-{window}', 0))

        print(f"Rate limit ({window}): {remaining}/{limit} remaining")
        print(f"Resets at: {reset} (Unix timestamp)")

        # Warn if approaching limit
        if remaining < limit * 0.1:  # Less than 10% remaining
            print(f"⚠️ Warning: Approaching {window} rate limit!")

    return response

Caching Strategy

Implement caching to reduce API calls:

import time
import requests

# Simple in-memory cache with TTL
cache = {}

def cached_request(url, headers, ttl=300):
    """Return the parsed JSON body, served from the cache while it is fresh."""
    cache_key = f"{url}:{headers.get('Authorization', '')}"

    if cache_key in cache:
        data, timestamp = cache[cache_key]
        if time.time() - timestamp < ttl:
            return data

    response = requests.get(url, headers=headers)
    response.raise_for_status()  # Do not cache error responses

    data = response.json()
    cache[cache_key] = (data, time.time())
    return data

Code Examples

JavaScript/Node.js

class APIClient {
  constructor(accessToken) {
    this.accessToken = accessToken;
    this.baseURL = 'https://partner-api.sherp.ai';
  }

  async request(endpoint, options = {}) {
    const url = `${this.baseURL}${endpoint}`;
    const headers = {
      'Authorization': `Bearer ${this.accessToken}`,
      'Content-Type': 'application/json',
      ...options.headers
    };

    let retries = 3;
    while (retries > 0) {
      const response = await fetch(url, { ...options, headers });

      // Check rate limit headers (hourly window)
      const remainingHeader = response.headers.get('x-ratelimit-remaining-hourly');
      if (remainingHeader !== null && parseInt(remainingHeader, 10) < 100) {
        console.warn(`⚠️ Rate limit low: ${remainingHeader} requests remaining this hour`);
      }

      // Handle rate limit exceeded
      if (response.status === 429) {
        // Prefer a Retry-After header if one is sent, then the retry_after
        // field of the JSON error body (both in seconds); default to 1 second.
        const body = await response.json().catch(() => ({}));
        const retryAfter =
          Number(response.headers.get('Retry-After') ?? body?.retry_after ?? 1) || 1;

        console.log(`Rate limited. Retrying after ${retryAfter} seconds...`);
        await new Promise(resolve => setTimeout(resolve, retryAfter * 1000));
        retries--;
        continue;
      }

      if (!response.ok) {
        throw new Error(`HTTP ${response.status}: ${await response.text()}`);
      }

      return await response.json();
    }

    throw new Error('Max retries exceeded');
  }
}

// Usage
const client = new APIClient('your_access_token');
const data = await client.request('/posts/');

Python

import requests
import time
from typing import Dict, Any

class APIClient:
    def __init__(self, access_token: str):
        self.access_token = access_token
        self.base_url = 'https://partner-api.sherp.ai'
        self.session = requests.Session()
        self.session.headers.update({
            'Authorization': f'Bearer {access_token}',
            'Content-Type': 'application/json'
        })

    def request(self, endpoint: str, max_retries: int = 3) -> Dict[str, Any]:
        url = f'{self.base_url}{endpoint}'

        for attempt in range(max_retries):
            response = self.session.get(url)

            # Check rate limit headers (hourly window)
            remaining = int(response.headers.get('x-ratelimit-remaining-hourly', 0))
            limit = int(response.headers.get('x-ratelimit-limit-hourly', 0))

            if remaining < limit * 0.1:
                print(f"⚠️ Warning: Only {remaining}/{limit} requests remaining")

            # Handle rate limit exceeded
            if response.status_code == 429:
                # retry_after is optional in the error body; fall back to
                # exponential backoff (1s, 2s, 4s) when it is missing.
                retry_after = response.json().get('retry_after') or 2 ** attempt

                print(f"Rate limited. Waiting {retry_after} seconds...")
                time.sleep(retry_after)
                continue

            response.raise_for_status()
            return response.json()

        raise Exception('Max retries exceeded')

# Usage
client = APIClient('your_access_token')
data = client.request('/posts/')

cURL

#!/bin/bash

# Make request and capture the status line, headers, and body
response=$(curl -s -i -H "Authorization: Bearer YOUR_TOKEN" \
  "https://partner-api.sherp.ai/posts/")

# Extract the hourly headers (anchored so only one header line matches)
remaining=$(echo "$response" | grep -i "^x-ratelimit-remaining-hourly:" | cut -d' ' -f2 | tr -d '\r')
limit=$(echo "$response" | grep -i "^x-ratelimit-limit-hourly:" | cut -d' ' -f2 | tr -d '\r')

echo "Rate Limit: $remaining/$limit requests remaining this hour"

# Check if rate limited by reading the status code from the status line only
status=$(echo "$response" | head -n 1 | cut -d' ' -f2 | tr -d '\r')
if [ "$status" = "429" ]; then
    echo "Rate limit exceeded!"
    # retry_after is a field of the JSON error body
    retry_after=$(echo "$response" | grep -o '"retry_after": *[0-9][0-9]*' | grep -o '[0-9][0-9]*$')
    echo "Retry after: $retry_after seconds"
fi

Storage and Infrastructure


For questions about rate limiting, please contact our support team.