
API Rate Limiting


Overview

The API uses a sliding window rate limiting approach with multiple tiers to prevent abuse while ensuring legitimate users have smooth access. Rate limits are stored in Firestore and enforced using atomic transactions to handle concurrent requests correctly.

Rate Limit Tiers

Four tiers provide flexible rate limiting based on authentication and subscription status (a sketch of the tier resolution logic follows the list):

1. Default Tier (Unauthenticated)

  • Limit: 60 requests per minute
  • Burst Allowance: 10 additional requests
  • Applied to: Requests without authentication
  • Key: Based on IP address

2. Authenticated Tier

  • Limit: 300 requests per minute
  • Burst Allowance: 30 additional requests
  • Applied to: Requests with valid API keys
  • Key: Based on user ID

3. Premium Tier

  • Limit: 1,000 requests per minute
  • Burst Allowance: 100 additional requests
  • Applied to: Authenticated users with active subscriptions
  • Key: Based on user ID

4. Admin Tier

  • Limit: 10,000 requests per minute
  • Burst Allowance: 500 additional requests
  • Applied to: Whitelisted IPs or user IDs
  • Key: Based on user ID or IP
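
The tier for a given request is resolved from its authentication and subscription context. A minimal sketch of that resolution, assuming req.auth is populated by the authentication middleware; the resolveTier helper and TIERS table are illustrative, not the shipped implementation:

// Illustrative tier resolution; the whitelist is assumed to come from RATE_LIMIT_WHITELIST.
const TIERS = {
  default: { max: 60, burst: 10 },
  authenticated: { max: 300, burst: 30 },
  premium: { max: 1000, burst: 100 },
  admin: { max: 10000, burst: 500 },
};

function resolveTier(req, whitelist) {
  if (whitelist.includes(req.auth?.userId) || whitelist.includes(req.ip)) {
    return "admin";
  }
  if (req.auth?.hasActiveSubscription) return "premium";
  if (req.auth?.userId) return "authenticated";
  return "default";
}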

Configuration

Rate limits are configurable via environment variables in apps/functions/.env:

# Default tier
RATE_LIMIT_DEFAULT_MAX=60
RATE_LIMIT_DEFAULT_WINDOW=60000

# Authenticated tier
RATE_LIMIT_AUTH_MAX=300
RATE_LIMIT_AUTH_WINDOW=60000

# Premium tier
RATE_LIMIT_PREMIUM_MAX=1000
RATE_LIMIT_PREMIUM_WINDOW=60000

# Admin tier
RATE_LIMIT_ADMIN_MAX=10000
RATE_LIMIT_ADMIN_WINDOW=60000

# Whitelist (comma-separated)
RATE_LIMIT_WHITELIST=user123,192.168.1.1
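
A hedged sketch of how these variables might be loaded into a tier configuration at startup (the variable names match the .env file above; the loadRateLimitConfig helper itself is illustrative):

// Illustrative config loader; falls back to the documented defaults when a variable is unset.
function loadRateLimitConfig(env = process.env) {
  const num = (value, fallback) => Number.parseInt(value ?? "", 10) || fallback;
  return {
    default: { max: num(env.RATE_LIMIT_DEFAULT_MAX, 60), windowMs: num(env.RATE_LIMIT_DEFAULT_WINDOW, 60000) },
    authenticated: { max: num(env.RATE_LIMIT_AUTH_MAX, 300), windowMs: num(env.RATE_LIMIT_AUTH_WINDOW, 60000) },
    premium: { max: num(env.RATE_LIMIT_PREMIUM_MAX, 1000), windowMs: num(env.RATE_LIMIT_PREMIUM_WINDOW, 60000) },
    admin: { max: num(env.RATE_LIMIT_ADMIN_MAX, 10000), windowMs: num(env.RATE_LIMIT_ADMIN_WINDOW, 60000) },
    whitelist: (env.RATE_LIMIT_WHITELIST ?? "").split(",").filter(Boolean),
  };
}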

Response Headers

All API responses include rate limit headers:

X-RateLimit-Limit: 300            # Maximum requests allowed
X-RateLimit-Remaining: 287        # Requests remaining in window
X-RateLimit-Reset: 2025-10-09T... # When the limit resets
X-RateLimit-Tier: authenticated   # Current tier applied
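
A minimal sketch of how middleware could attach these headers before handing off to the route handler; the result object (limit, remaining, resetAt, tier) is assumed to come from the rate limit check described under "How It Works":

// Illustrative header attachment; the real middleware may name things differently.
function setRateLimitHeaders(res, result) {
  res.set({
    "X-RateLimit-Limit": String(result.limit),
    "X-RateLimit-Remaining": String(result.remaining),
    "X-RateLimit-Reset": result.resetAt.toISOString(),
    "X-RateLimit-Tier": result.tier,
  });
}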

Rate Limit Exceeded (HTTP 429)

When rate limits are exceeded, the API returns:

{
  "error": {
    "code": "rate_limit_exceeded",
    "message": "Rate limit exceeded. Try again in 30 seconds.",
    "timestamp": "2025-10-09T12:34:56.789Z"
  }
}

Additional headers:

Retry-After: 30  # Seconds until you can retry
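
Putting the body and header together, a hedged sketch of how the middleware might send this response (retryAfterSeconds is assumed to be derived from the oldest timestamp still inside the sliding window):

// Illustrative 429 response; the shipped middleware may differ in detail.
function sendRateLimitExceeded(res, retryAfterSeconds) {
  res.set("Retry-After", String(retryAfterSeconds));
  res.status(429).json({
    error: {
      code: "rate_limit_exceeded",
      message: `Rate limit exceeded. Try again in ${retryAfterSeconds} seconds.`,
      timestamp: new Date().toISOString(),
    },
  });
}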

How It Works

Architecture

  • Storage: Firestore collection rateLimits with per-key request timestamps
  • Algorithm: Sliding window counter approach with burst allowance
  • Atomicity: Firestore transactions prevent race conditions on concurrent requests
  • Graceful Degradation: Configurable fail-open or fail-closed behavior when the rate limiting backend is unavailable (see the sketch after this list)
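
For the graceful degradation point, a sketch of a fail-open wrapper; the checkRateLimit function and skipFailureOpen option mirror the middleware options shown later, but the wrapper itself is illustrative:

// Illustrative fail-open wrapper: if the Firestore-backed check throws and
// skipFailureOpen is true, the request is allowed rather than rejected.
async function checkWithDegradation(checkRateLimit, key, options) {
  try {
    return await checkRateLimit(key, options);
  } catch (err) {
    console.error("rate-limit backend unavailable", err);
    if (options.skipFailureOpen) {
      return { allowed: true, degraded: true };
    }
    throw err; // Fail closed: caller rejects the request instead.
  }
}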

Sliding Window Algorithm

The system maintains an array of request timestamps for each key (a code sketch follows these steps):

  1. On each request, filter out timestamps outside the current window
  2. Count remaining requests in the window
  3. Check if count exceeds limit + burst allowance
  4. If within limit, add current timestamp and allow request
  5. If exceeded, return 429 with retry-after information
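
A condensed sketch of those steps using a Firestore transaction; the document shape (a timestamps array plus updatedAt) and collection name follow the description above, but the exact code in middleware/rate-limit.js may differ:

import { getFirestore } from "firebase-admin/firestore";

// Illustrative sliding window check with burst allowance.
async function checkSlidingWindow(key, { maxRequests, burst, windowMs }) {
  const db = getFirestore();
  const ref = db.collection("rateLimits").doc(key);

  return db.runTransaction(async (tx) => {
    const now = Date.now();
    const snap = await tx.get(ref);
    const previous = snap.exists ? snap.data().timestamps ?? [] : [];

    // Steps 1-2: keep only timestamps inside the current window and count them.
    const recent = previous.filter((ts) => now - ts < windowMs);

    // Step 3: compare against limit + burst allowance.
    if (recent.length >= maxRequests + burst) {
      const retryAfterMs = windowMs - (now - Math.min(...recent));
      return { allowed: false, retryAfterSeconds: Math.ceil(retryAfterMs / 1000) };
    }

    // Steps 4-5: record this request atomically and allow it.
    recent.push(now);
    tx.set(ref, { timestamps: recent, updatedAt: now });
    return { allowed: true, remaining: maxRequests + burst - recent.length };
  });
}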

Key Generation

  • Authenticated: api:user:{userId}
  • Unauthenticated: api:ip:{ipAddress}
  • Endpoint-specific: endpoint:{path}:user:{userId} or endpoint:{path}:ip:{ipAddress}
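
A small helper reflecting those formats might look like this (buildRateLimitKey is an illustrative name, not the exported function):

// Builds keys in the formats listed above.
function buildRateLimitKey({ prefix = "api", path, userId, ip }) {
  const subject = userId ? `user:${userId}` : `ip:${ip}`;
  return path ? `endpoint:${path}:${subject}` : `${prefix}:${subject}`;
}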

Implementation

Global Rate Limiting

Rate limiting is automatically applied to all Public API routes via middleware in apps/functions/api.js:

// apps/functions/api.js
import { rateLimiter, checkSubscription } from "./middleware/rate-limit.js";

app.use(checkSubscription); // Check subscription status first
app.use(rateLimiter({ keyPrefix: "api", skipFailureOpen: true }));

Endpoint-Specific Rate Limiting

For additional rate limiting on specific endpoints:

import { endpointRateLimiter } from "../middleware/rate-limit.js";

router.post(
  "/expensive-operation",
  authenticateApiKey,
  endpointRateLimiter({
    maxRequests: 10,
    windowMs: 60000,
    keyPrefix: "expensive-op"
  }),
  asyncHandler(async (req, res) => {
    // Your handler
  })
);

Custom Rate Limiting

For custom rate limiting logic:

import { rateLimiter } from "../middleware/rate-limit.js";

const customLimiter = rateLimiter({
  keyPrefix: "webhook",
  skipFailureOpen: false, // Reject if rate limiting fails
});

app.use("/webhooks", customLimiter);

Scheduled Cleanup

A scheduled Cloud Function runs daily at 2:00 AM UTC to remove stale rate limit data (a sketch follows the list):

  • Function: scheduledRateLimitCleanup in apps/functions/scheduled/rate-limit-cleanup.js
  • Cleanup threshold: Documents older than 24 hours
  • Batch size: 500 documents per execution
  • Automatic retry: 3 retries on failure
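
A hedged sketch of what that job could look like with the Firebase Functions v2 scheduler; the query on updatedAt and the exact contents of rate-limit-cleanup.js are assumptions based on the bullets above:

import { onSchedule } from "firebase-functions/v2/scheduler";
import { getFirestore } from "firebase-admin/firestore";

// Illustrative cleanup job: delete rate limit documents untouched for 24 hours,
// in batches of up to 500, retrying up to 3 times on failure.
export const scheduledRateLimitCleanup = onSchedule(
  { schedule: "0 2 * * *", timeZone: "UTC", retryCount: 3 },
  async () => {
    const db = getFirestore();
    const cutoff = Date.now() - 24 * 60 * 60 * 1000;

    const stale = await db
      .collection("rateLimits")
      .where("updatedAt", "<", cutoff)
      .limit(500)
      .get();

    const batch = db.batch();
    stale.docs.forEach((doc) => batch.delete(doc.ref));
    await batch.commit();
  }
);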

Best Practices

1. Monitor Rate Limit Usage

Track rate limit headers in your application:

const response = await fetch('/api/v1/posts', {
  headers: { 'soku-api-key': apiKey }
});

const remaining = Number(response.headers.get('X-RateLimit-Remaining'));
const reset = response.headers.get('X-RateLimit-Reset');

if (remaining < 10) {
  console.warn('Approaching rate limit!');
}

2. Implement Exponential Backoff

When receiving 429 responses:

async function fetchWithRetry(url, options, maxRetries = 3) {
  for (let i = 0; i < maxRetries; i++) {
    const response = await fetch(url, options);

    if (response.status === 429) {
      // Honor Retry-After when present; otherwise back off exponentially (1s, 2s, 4s, ...).
      const retryAfter = parseInt(response.headers.get('Retry-After') || '0', 10);
      const delayMs = retryAfter > 0 ? retryAfter * 1000 : 2 ** i * 1000;
      await new Promise(resolve => setTimeout(resolve, delayMs));
      continue;
    }

    return response;
  }

  throw new Error('Max retries exceeded');
}

3. Batch Requests When Possible

Instead of making multiple individual requests, batch them:

// Bad: Multiple requests
for (const post of posts) {
  await api.post('/v1/posts', post);
}

// Good: Single batch request
await api.post('/v1/posts/batch', { posts });

4. Cache Responses

Reduce API calls by caching responses:

const cache = new Map();

async function getCachedData(key) {
  if (cache.has(key)) {
    return cache.get(key);
  }

  const response = await api.get(`/v1/data/${key}`);
  cache.set(key, response.data);

  return response.data;
}
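
The cache above never expires entries, so long-running clients can serve stale data. A variant with a simple time-to-live (the 60-second TTL is an arbitrary illustration, and api is the same client object used in the examples above):

const TTL_MS = 60_000; // Arbitrary example TTL
const cacheWithTtl = new Map();

async function getCachedDataWithTtl(key) {
  const entry = cacheWithTtl.get(key);
  if (entry && Date.now() - entry.cachedAt < TTL_MS) {
    return entry.data;
  }

  const response = await api.get(`/v1/data/${key}`);
  cacheWithTtl.set(key, { data: response.data, cachedAt: Date.now() });
  return response.data;
}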

Monitoring & Testing

Testing

Run the comprehensive test suite:

cd apps/functions
npm test -- rate-limit.test.js

Test coverage includes:

  • Rate limit enforcement across all tiers
  • Sliding window accuracy
  • Concurrent request handling
  • Error scenarios and graceful degradation
  • Cleanup operations

Monitoring

Track rate limiting metrics through multiple channels:

  1. Firestore Console: Query rateLimits collection for real-time usage
  2. Cloud Logging: Filter logs by rate-limit logger for violations and errors
  3. Response Headers: Monitor X-RateLimit-Remaining in API responses
  4. Error Tracking: Review 429 responses in errorLogs collection

Recommended alerts:

  • Spike in 429 responses (indicates potential abuse or misconfigured limits)
  • Rate limiting service failures (check graceful degradation metrics)
  • Cleanup job failures (prevents unbounded growth of rateLimits collection)

Troubleshooting

Rate limits not enforced

Cause: Middleware order incorrect

Solution: Verify rate limiting middleware loads before route handlers in apps/functions/api.js:

app.use(checkSubscription);
app.use(rateLimiter({ keyPrefix: "api" }));
app.use('/v1/posts', postsRouter); // Routes must come after

Limits too strict for legitimate traffic

Cause: Default limits don't match usage patterns

Solution: Adjust tier limits via environment variables:

# Increase authenticated tier from default 300 req/min
RATE_LIMIT_AUTH_MAX=500
RATE_LIMIT_AUTH_WINDOW=60000

Or whitelist specific users/IPs for admin tier access:

RATE_LIMIT_WHITELIST=user-abc123,192.168.1.100

Firestore transaction errors

Cause: Cleanup batch size exceeds Firestore transaction limits

Solution: Firestore transactions have constraints:

  • Maximum 500 operations per transaction
  • Maximum 10 MB transaction size

If encountering these limits, reduce the batch size in apps/functions/scheduled/rate-limit-cleanup.js:

.limit(500) // Reduce to 100 or 250

High latency on API requests

Cause: Rate limit checks adding overhead

Solution:

  1. Enable graceful degradation (already default):

    rateLimiter({ skipFailureOpen: true })
  2. Optimize Firestore queries: Create composite index on rateLimits collection:

    • Fields: updatedAt (Ascending), __name__ (Ascending)
  3. Consider Redis: For very high-traffic scenarios (>10K req/min), integrate Redis for faster rate limit storage:

    // Future enhancement - see "Future Enhancements" section

Future Enhancements

Planned improvements for production scale:

  1. Redis Integration: Replace Firestore with Redis for sub-millisecond rate limit checks in high-traffic scenarios (>10K req/min)
  2. Distributed Rate Limiting: Coordinate rate limits across multiple Cloud Run regions for global deployments
  3. Dynamic Rate Limits: Auto-adjust limits based on system load, time of day, or user reputation scores
  4. User-Specific Overrides: Admin UI for setting custom rate limits per user without environment variable changes
  5. Rate Limit Dashboard: Real-time analytics showing top consumers, tier distribution, and violation patterns
  6. Progressive Enforcement: Graduated response system (warning headers → soft limits → hard limits) instead of immediate 429s

Security Considerations

IP Spoofing

Risk: req.ip can be spoofed via proxy headers

Mitigation:

  • Express only uses X-Forwarded-For for req.ip when the trust proxy setting is enabled; restrict it to known proxy hops (see the snippet below)
  • For production, validate proxy headers match expected sources (Cloud Load Balancer, Cloud Run)
  • Consider additional client fingerprinting beyond IP addresses
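
A hedged example of constraining trust proxy for an Express app behind Cloud Run or a Google load balancer (the single trusted hop is an assumption about the deployment topology; adjust it to match the actual proxy chain):

// Trust exactly one proxy hop so req.ip is taken from the right-most
// trusted entry of X-Forwarded-For rather than a client-supplied value.
app.set("trust proxy", 1);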

Distributed Attacks

Risk: IP-based rate limiting ineffective against DDoS from many sources

Mitigation:

  • Deploy Cloud Armor for network-layer DDoS protection
  • Implement CAPTCHA challenges for suspicious traffic patterns
  • Use Cloud CDN to absorb and cache static content

Rate Limit Bypass

Risk: Attackers might attempt tier elevation by manipulating request context

Mitigation:

  • Authentication middleware (authenticateApiKey) runs before rate limiting
  • Subscription check (checkSubscription) verifies active status from trusted Firestore data
  • Tier assignment logic uses server-side req.auth (not client-provided data)