
API Rate Limiting


Overview

The API uses a sliding window rate limiting approach with multiple tiers to prevent abuse while ensuring legitimate users have smooth access. Rate limits are stored in Firestore and enforced using atomic transactions to handle concurrent requests correctly.

Rate Limit Tiers

Four tiers provide flexible rate limiting based on authentication and subscription status (a sketch of the tier resolution logic follows the list):

1. Default Tier (Unauthenticated)

  • Limit: 60 requests per minute
  • Burst Allowance: 10 additional requests
  • Applied to: Requests without authentication
  • Key: Based on IP address

2. Authenticated Tier

  • Limit: 300 requests per minute
  • Burst Allowance: 30 additional requests
  • Applied to: Requests with valid API keys
  • Key: Based on user ID

3. Premium Tier

  • Limit: 1,000 requests per minute
  • Burst Allowance: 100 additional requests
  • Applied to: Authenticated users with active subscriptions
  • Key: Based on user ID

4. Admin Tier

  • Limit: 10,000 requests per minute
  • Burst Allowance: 500 additional requests
  • Applied to: Whitelisted IPs or user IDs
  • Key: Based on user ID or IP
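
The tier for a given request is resolved from its authentication and subscription context. A minimal sketch of that resolution, assuming req.auth is populated by the authentication middleware; the resolveTier helper and TIERS table are illustrative, not the shipped implementation:

// Illustrative tier resolution; the whitelist is assumed to come from RATE_LIMIT_WHITELIST.
const TIERS = {
  default: { max: 60, burst: 10 },
  authenticated: { max: 300, burst: 30 },
  premium: { max: 1000, burst: 100 },
  admin: { max: 10000, burst: 500 },
};

function resolveTier(req, whitelist) {
  if (whitelist.includes(req.auth?.userId) || whitelist.includes(req.ip)) {
    return "admin";
  }
  if (req.auth?.hasActiveSubscription) return "premium";
  if (req.auth?.userId) return "authenticated";
  return "default";
}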

Configuration

Rate limits are configurable via environment variables in apps/functions/.env:

# Default tier
RATE_LIMIT_DEFAULT_MAX=60
RATE_LIMIT_DEFAULT_WINDOW=60000

# Authenticated tier
RATE_LIMIT_AUTH_MAX=300
RATE_LIMIT_AUTH_WINDOW=60000

# Premium tier
RATE_LIMIT_PREMIUM_MAX=1000
RATE_LIMIT_PREMIUM_WINDOW=60000

# Admin tier
RATE_LIMIT_ADMIN_MAX=10000
RATE_LIMIT_ADMIN_WINDOW=60000

# Whitelist (comma-separated)
RATE_LIMIT_WHITELIST=user123,192.168.1.1
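
A hedged sketch of how these variables might be loaded into a tier configuration at startup (the variable names match the .env file above; the loadRateLimitConfig helper itself is illustrative):

// Illustrative config loader; falls back to the documented defaults when a variable is unset.
function loadRateLimitConfig(env = process.env) {
  const num = (value, fallback) => Number.parseInt(value ?? "", 10) || fallback;
  return {
    default: { max: num(env.RATE_LIMIT_DEFAULT_MAX, 60), windowMs: num(env.RATE_LIMIT_DEFAULT_WINDOW, 60000) },
    authenticated: { max: num(env.RATE_LIMIT_AUTH_MAX, 300), windowMs: num(env.RATE_LIMIT_AUTH_WINDOW, 60000) },
    premium: { max: num(env.RATE_LIMIT_PREMIUM_MAX, 1000), windowMs: num(env.RATE_LIMIT_PREMIUM_WINDOW, 60000) },
    admin: { max: num(env.RATE_LIMIT_ADMIN_MAX, 10000), windowMs: num(env.RATE_LIMIT_ADMIN_WINDOW, 60000) },
    whitelist: (env.RATE_LIMIT_WHITELIST ?? "").split(",").filter(Boolean),
  };
}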

Response Headers

All API responses include rate limit headers:

X-RateLimit-Limit: 300            # Maximum requests allowed
X-RateLimit-Remaining: 287        # Requests remaining in window
X-RateLimit-Reset: 2025-10-09T... # When the limit resets
X-RateLimit-Tier: authenticated   # Current tier applied
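
A minimal sketch of how middleware could attach these headers before handing off to the route handler; the result object (limit, remaining, resetAt, tier) is assumed to come from the rate limit check described under "How It Works":

// Illustrative header attachment; the real middleware may name things differently.
function setRateLimitHeaders(res, result) {
  res.set({
    "X-RateLimit-Limit": String(result.limit),
    "X-RateLimit-Remaining": String(result.remaining),
    "X-RateLimit-Reset": result.resetAt.toISOString(),
    "X-RateLimit-Tier": result.tier,
  });
}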

Rate Limit Exceeded (HTTP 429)

When rate limits are exceeded, the API returns:

{
  "error": {
    "code": "rate_limit_exceeded",
    "message": "Rate limit exceeded. Try again in 30 seconds.",
    "timestamp": "2025-10-09T12:34:56.789Z"
  }
}

Additional headers:

Retry-After: 30  # Seconds until you can retry
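
Putting the body and header together, a hedged sketch of how the middleware might send this response (retryAfterSeconds is assumed to be derived from the oldest timestamp still inside the sliding window):

// Illustrative 429 response; the shipped middleware may differ in detail.
function sendRateLimitExceeded(res, retryAfterSeconds) {
  res.set("Retry-After", String(retryAfterSeconds));
  res.status(429).json({
    error: {
      code: "rate_limit_exceeded",
      message: `Rate limit exceeded. Try again in ${retryAfterSeconds} seconds.`,
      timestamp: new Date().toISOString(),
    },
  });
}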

How It Works

Architecture

  • Storage: Firestore collection rateLimits with per-key request timestamps
  • Algorithm: Sliding window counter approach with burst allowance
  • Atomicity: Firestore transactions prevent race conditions on concurrent requests
  • Graceful Degradation: Configurable fail-open or fail-closed behavior when the rate limiting backend is unavailable (see the sketch after this list)
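
For the graceful degradation point, a sketch of a fail-open wrapper; the checkRateLimit function and skipFailureOpen option mirror the middleware options shown later, but the wrapper itself is illustrative:

// Illustrative fail-open wrapper: if the Firestore-backed check throws and
// skipFailureOpen is true, the request is allowed rather than rejected.
async function checkWithDegradation(checkRateLimit, key, options) {
  try {
    return await checkRateLimit(key, options);
  } catch (err) {
    console.error("rate-limit backend unavailable", err);
    if (options.skipFailureOpen) {
      return { allowed: true, degraded: true };
    }
    throw err; // Fail closed: caller rejects the request instead.
  }
}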

Sliding Window Algorithm

The system maintains an array of request timestamps for each key (a code sketch follows these steps):

  1. On each request, filter out timestamps outside the current window
  2. Count remaining requests in the window
  3. Check if count exceeds limit + burst allowance
  4. If within limit, add current timestamp and allow request
  5. If exceeded, return 429 with retry-after information
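
A condensed sketch of those steps using a Firestore transaction; the document shape (a timestamps array plus updatedAt) and collection name follow the description above, but the exact code in middleware/rate-limit.js may differ:

import { getFirestore } from "firebase-admin/firestore";

// Illustrative sliding window check with burst allowance.
async function checkSlidingWindow(key, { maxRequests, burst, windowMs }) {
  const db = getFirestore();
  const ref = db.collection("rateLimits").doc(key);

  return db.runTransaction(async (tx) => {
    const now = Date.now();
    const snap = await tx.get(ref);
    const previous = snap.exists ? snap.data().timestamps ?? [] : [];

    // Steps 1-2: keep only timestamps inside the current window and count them.
    const recent = previous.filter((ts) => now - ts < windowMs);

    // Step 3: compare against limit + burst allowance.
    if (recent.length >= maxRequests + burst) {
      const retryAfterMs = windowMs - (now - Math.min(...recent));
      return { allowed: false, retryAfterSeconds: Math.ceil(retryAfterMs / 1000) };
    }

    // Steps 4-5: record this request atomically and allow it.
    recent.push(now);
    tx.set(ref, { timestamps: recent, updatedAt: now });
    return { allowed: true, remaining: maxRequests + burst - recent.length };
  });
}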

Key Generation

  • Authenticated: api:user:{userId}
  • Unauthenticated: api:ip:{ipAddress}
  • Endpoint-specific: endpoint:{path}:user:{userId} or endpoint:{path}:ip:{ipAddress}
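
A small helper reflecting those formats might look like this (buildRateLimitKey is an illustrative name, not the exported function):

// Builds keys in the formats listed above.
function buildRateLimitKey({ prefix = "api", path, userId, ip }) {
  const subject = userId ? `user:${userId}` : `ip:${ip}`;
  return path ? `endpoint:${path}:${subject}` : `${prefix}:${subject}`;
}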

Implementation

Global Rate Limiting

Rate limiting is automatically applied to all Public API routes via middleware in apps/functions/api.js:

// apps/functions/api.js
import { rateLimiter, checkSubscription } from "./middleware/rate-limit.js";

app.use(checkSubscription); // Check subscription status first
app.use(rateLimiter({ keyPrefix: "api", skipFailureOpen: true }));

Endpoint-Specific Rate Limiting

For additional rate limiting on specific endpoints:

import { endpointRateLimiter } from "../middleware/rate-limit.js";

router.post(
  "/expensive-operation",
  authenticateApiKey,
  endpointRateLimiter({
    maxRequests: 10,
    windowMs: 60000,
    keyPrefix: "expensive-op"
  }),
  asyncHandler(async (req, res) => {
    // Your handler
  })
);

Custom Rate Limiting

For custom rate limiting logic:

import { rateLimiter } from "../middleware/rate-limit.js";

const customLimiter = rateLimiter({
  keyPrefix: "webhook",
  skipFailureOpen: false, // Reject if rate limiting fails
});

app.use("/webhooks", customLimiter);

Scheduled Cleanup

A scheduled Cloud Function runs daily at 2:00 AM UTC to remove stale rate limit data (a sketch follows the list):

  • Function: scheduledRateLimitCleanup in apps/functions/scheduled/rate-limit-cleanup.js
  • Cleanup threshold: Documents older than 24 hours
  • Batch size: 500 documents per execution
  • Automatic retry: 3 retries on failure
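
A hedged sketch of what that job could look like with the Firebase Functions v2 scheduler; the query on updatedAt and the exact contents of rate-limit-cleanup.js are assumptions based on the bullets above:

import { onSchedule } from "firebase-functions/v2/scheduler";
import { getFirestore } from "firebase-admin/firestore";

// Illustrative cleanup job: delete rate limit documents untouched for 24 hours,
// in batches of up to 500, retrying up to 3 times on failure.
export const scheduledRateLimitCleanup = onSchedule(
  { schedule: "0 2 * * *", timeZone: "UTC", retryCount: 3 },
  async () => {
    const db = getFirestore();
    const cutoff = Date.now() - 24 * 60 * 60 * 1000;

    const stale = await db
      .collection("rateLimits")
      .where("updatedAt", "<", cutoff)
      .limit(500)
      .get();

    const batch = db.batch();
    stale.docs.forEach((doc) => batch.delete(doc.ref));
    await batch.commit();
  }
);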

Best Practices

1. Monitor Rate Limit Usage

Track rate limit headers in your application:

const response = await fetch('/api/v1/posts', {
  headers: { 'soku-api-key': apiKey }
});

const remaining = Number(response.headers.get('X-RateLimit-Remaining'));
const reset = response.headers.get('X-RateLimit-Reset');

if (remaining < 10) {
  console.warn('Approaching rate limit!');
}

2. Implement Exponential Backoff

When receiving 429 responses:

async function fetchWithRetry(url, options, maxRetries = 3) {
  for (let i = 0; i < maxRetries; i++) {
    const response = await fetch(url, options);

    if (response.status === 429) {
      // Honor Retry-After when present; otherwise back off exponentially (1s, 2s, 4s, ...).
      const retryAfter = parseInt(response.headers.get('Retry-After') || '0', 10);
      const delayMs = retryAfter > 0 ? retryAfter * 1000 : 2 ** i * 1000;
      await new Promise(resolve => setTimeout(resolve, delayMs));
      continue;
    }

    return response;
  }

  throw new Error('Max retries exceeded');
}

3. Batch Requests When Possible

Instead of making multiple individual requests, batch them:

// Bad: Multiple requests
for (const post of posts) {
  await api.post('/v1/posts', post);
}

// Good: Single batch request
await api.post('/v1/posts/batch', { posts });

4. Cache Responses

Reduce API calls by caching responses:

const cache = new Map();

async function getCachedData(key) {
  if (cache.has(key)) {
    return cache.get(key);
  }

  const response = await api.get(`/v1/data/${key}`);
  cache.set(key, response.data);

  return response.data;
}
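
The cache above never expires entries, so long-running clients can serve stale data. A variant with a simple time-to-live (the 60-second TTL is an arbitrary illustration, and api is the same client object used in the examples above):

const TTL_MS = 60_000; // Arbitrary example TTL
const cacheWithTtl = new Map();

async function getCachedDataWithTtl(key) {
  const entry = cacheWithTtl.get(key);
  if (entry && Date.now() - entry.cachedAt < TTL_MS) {
    return entry.data;
  }

  const response = await api.get(`/v1/data/${key}`);
  cacheWithTtl.set(key, { data: response.data, cachedAt: Date.now() });
  return response.data;
}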

Monitoring & Testing

Testing

Run the comprehensive test suite:

cd apps/functions
npm test -- rate-limit.test.js

Test coverage includes:

  • Rate limit enforcement across all tiers
  • Sliding window accuracy
  • Concurrent request handling
  • Error scenarios and graceful degradation
  • Cleanup operations

Monitoring

Track rate limiting metrics through multiple channels:

  1. Firestore Console: Query rateLimits collection for real-time usage
  2. Cloud Logging: Filter logs by rate-limit logger for violations and errors
  3. Response Headers: Monitor X-RateLimit-Remaining in API responses
  4. Error Tracking: Review 429 responses in errorLogs collection

Recommended alerts:

  • Spike in 429 responses (indicates potential abuse or misconfigured limits)
  • Rate limiting service failures (check graceful degradation metrics)
  • Cleanup job failures (prevents unbounded growth of rateLimits collection)

Troubleshooting

Rate limits not enforced

Cause: Middleware order incorrect

Solution: Verify rate limiting middleware loads before route handlers in apps/functions/api.js:

app.use(checkSubscription);
app.use(rateLimiter({ keyPrefix: "api" }));
app.use('/v1/posts', postsRouter); // Routes must come after

Limits too strict for legitimate traffic

Cause: Default limits don't match usage patterns

Solution: Adjust tier limits via environment variables:

# Increase authenticated tier from default 300 req/min
RATE_LIMIT_AUTH_MAX=500
RATE_LIMIT_AUTH_WINDOW=60000

Or whitelist specific users/IPs for admin tier access:

RATE_LIMIT_WHITELIST=user-abc123,192.168.1.100

Firestore transaction errors

Cause: Cleanup batch size exceeds Firestore transaction limits

Solution: Firestore transactions have constraints:

  • Maximum 500 operations per transaction
  • Maximum 10 MB transaction size

If encountering these limits, reduce the batch size in apps/functions/scheduled/rate-limit-cleanup.js:

.limit(500) // Reduce to 100 or 250

High latency on API requests

Cause: Rate limit checks adding overhead

Solution:

  1. Enable graceful degradation (already default):

    rateLimiter({ skipFailureOpen: true })
  2. Optimize Firestore queries: Create composite index on rateLimits collection:

    • Fields: updatedAt (Ascending), __name__ (Ascending)
  3. Consider Redis: For very high-traffic scenarios (>10K req/min), integrate Redis for faster rate limit storage:

    // Future enhancement - see "Future Enhancements" section

Future Enhancements

Planned improvements for production scale:

  1. Redis Integration: Replace Firestore with Redis for sub-millisecond rate limit checks in high-traffic scenarios (>10K req/min)
  2. Distributed Rate Limiting: Coordinate rate limits across multiple Cloud Run regions for global deployments
  3. Dynamic Rate Limits: Auto-adjust limits based on system load, time of day, or user reputation scores
  4. User-Specific Overrides: Admin UI for setting custom rate limits per user without environment variable changes
  5. Rate Limit Dashboard: Real-time analytics showing top consumers, tier distribution, and violation patterns
  6. Progressive Enforcement: Graduated response system (warning headers → soft limits → hard limits) instead of immediate 429s

Security Considerations

IP Spoofing

Risk: req.ip can be spoofed via proxy headers

Mitigation:

  • Express only uses X-Forwarded-For for req.ip when the trust proxy setting is enabled; restrict it to known proxy hops (see the snippet below)
  • For production, validate proxy headers match expected sources (Cloud Load Balancer, Cloud Run)
  • Consider additional client fingerprinting beyond IP addresses
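
A hedged example of constraining trust proxy for an Express app behind Cloud Run or a Google load balancer (the single trusted hop is an assumption about the deployment topology; adjust it to match the actual proxy chain):

// Trust exactly one proxy hop so req.ip is taken from the right-most
// trusted entry of X-Forwarded-For rather than a client-supplied value.
app.set("trust proxy", 1);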

Distributed Attacks

Risk: IP-based rate limiting ineffective against DDoS from many sources

Mitigation:

  • Deploy Cloud Armor for network-layer DDoS protection
  • Implement CAPTCHA challenges for suspicious traffic patterns
  • Use Cloud CDN to absorb and cache static content

Rate Limit Bypass

Risk: Attackers might attempt tier elevation by manipulating request context

Mitigation:

  • Authentication middleware (authenticateApiKey) runs before rate limiting
  • Subscription check (checkSubscription) verifies active status from trusted Firestore data
  • Tier assignment logic uses server-side req.auth (not client-provided data)