API Rate Limiting
Quick links: Tiers • Headers • Configuration • Best Practices • Troubleshooting
Overview
The API uses a sliding window rate limiting approach with multiple tiers to prevent abuse while ensuring legitimate users have smooth access. Rate limits are stored in Firestore and enforced using atomic transactions to handle concurrent requests correctly.
Rate Limit Tiers
Four tiers provide flexible rate limiting based on authentication and subscription status:
1. Default Tier (Unauthenticated)
- Limit: 60 requests per minute
- Burst Allowance: 10 additional requests
- Applied to: Requests without authentication
- Key: Based on IP address
2. Authenticated Tier
- Limit: 300 requests per minute
- Burst Allowance: 30 additional requests
- Applied to: Requests with valid API keys
- Key: Based on user ID
3. Premium Tier
- Limit: 1,000 requests per minute
- Burst Allowance: 100 additional requests
- Applied to: Authenticated users with active subscriptions
- Key: Based on user ID
4. Admin Tier
- Limit: 10,000 requests per minute
- Burst Allowance: 500 additional requests
- Applied to: Whitelisted IPs or user IDs
- Key: Based on user ID or IP
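The tier rules above can be read as a simple precedence check: whitelist first, then subscription, then authentication. The following is a minimal sketch of that ordering, not the actual middleware. `resolveTier`, its parameter names, and the tier names other than `authenticated` (which appears in the `X-RateLimit-Tier` header example) are assumptions made for illustration.

```javascript
// Tier table mirroring the documented per-minute limits and burst allowances.
const TIERS = {
  default: { maxRequests: 60, burst: 10 },
  authenticated: { maxRequests: 300, burst: 30 },
  premium: { maxRequests: 1000, burst: 100 },
  admin: { maxRequests: 10000, burst: 500 },
};

// Hypothetical helper: picks a tier name from server-side request context.
function resolveTier({ userId = null, ip = null, hasActiveSubscription = false, whitelist = [] }) {
  // Whitelisted user IDs or IPs take precedence over every other rule.
  if ((userId && whitelist.includes(userId)) || (ip && whitelist.includes(ip))) {
    return "admin";
  }
  if (userId && hasActiveSubscription) return "premium";
  if (userId) return "authenticated";
  return "default";
}

// Example: an authenticated user without a subscription.
const tierName = resolveTier({ userId: "user-abc123", ip: "203.0.113.5" });
console.log(tierName, TIERS[tierName]);
```

Checking the whitelist first means a whitelisted IP receives admin limits even when the request carries no API key, which matches the "user ID or IP" keying described for the admin tier.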
Configuration
Rate limits are configurable via environment variables in apps/functions/.env:
# Default tier
RATE_LIMIT_DEFAULT_MAX=60
RATE_LIMIT_DEFAULT_WINDOW=60000
# Authenticated tier
RATE_LIMIT_AUTH_MAX=300
RATE_LIMIT_AUTH_WINDOW=60000
# Premium tier
RATE_LIMIT_PREMIUM_MAX=1000
RATE_LIMIT_PREMIUM_WINDOW=60000
# Admin tier
RATE_LIMIT_ADMIN_MAX=10000
RATE_LIMIT_ADMIN_WINDOW=60000
# Whitelist (comma-separated)
RATE_LIMIT_WHITELIST=user123,192.168.1.1
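As an illustration of how these variables might be read at startup, the sketch below parses them into a config object and falls back to the documented defaults when a value is missing or invalid. `loadRateLimitConfig` and the shape of the returned object are hypothetical; the real loader may differ.

```javascript
// Hypothetical loader: reads tier limits from an env-like object,
// falling back to the documented defaults when a variable is unset or invalid.
function loadRateLimitConfig(env = process.env) {
  const readInt = (name, fallback) => {
    const value = Number.parseInt(env[name] ?? "", 10);
    return Number.isFinite(value) && value > 0 ? value : fallback;
  };
  const tier = (prefix, defaultMax) => ({
    maxRequests: readInt(`RATE_LIMIT_${prefix}_MAX`, defaultMax),
    windowMs: readInt(`RATE_LIMIT_${prefix}_WINDOW`, 60000),
  });
  return {
    default: tier("DEFAULT", 60),
    authenticated: tier("AUTH", 300),
    premium: tier("PREMIUM", 1000),
    admin: tier("ADMIN", 10000),
    // Comma-separated list of user IDs and IPs; blanks are dropped.
    whitelist: (env.RATE_LIMIT_WHITELIST ?? "")
      .split(",")
      .map((entry) => entry.trim())
      .filter(Boolean),
  };
}

// Example with an explicit env object instead of process.env.
console.log(loadRateLimitConfig({ RATE_LIMIT_AUTH_MAX: "500" }).authenticated);
```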
Response Headers
All API responses include rate limit headers:
X-RateLimit-Limit: 300 # Maximum requests allowed
X-RateLimit-Remaining: 287 # Requests remaining in window
X-RateLimit-Reset: 2025-10-09T... # When the limit resets
X-RateLimit-Tier: authenticated # Current tier applied
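To make the relationship between limiter state and these headers concrete, here is a small sketch that derives the four headers from a request count. `buildRateLimitHeaders` and its parameters are hypothetical, and whether the real implementation counts burst allowance toward `X-RateLimit-Remaining` is not stated in this document, so this sketch ignores it.

```javascript
// Hypothetical helper: derives the four documented headers from limiter state.
function buildRateLimitHeaders({ limit, used, resetAtMs, tier }) {
  return {
    "X-RateLimit-Limit": String(limit),
    // Never report a negative remaining count once the limit is passed.
    "X-RateLimit-Remaining": String(Math.max(0, limit - used)),
    "X-RateLimit-Reset": new Date(resetAtMs).toISOString(),
    "X-RateLimit-Tier": tier,
  };
}

// Example: 13 of 300 requests used in the current window.
console.log(
  buildRateLimitHeaders({
    limit: 300,
    used: 13,
    resetAtMs: Date.UTC(2025, 9, 9, 12, 35, 0),
    tier: "authenticated",
  })
);
```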
Rate Limit Exceeded (HTTP 429)
When rate limits are exceeded, the API returns:
{
"error": {
"code": "rate_limit_exceeded",
"message": "Rate limit exceeded. Try again in 30 seconds.",
"timestamp": "2025-10-09T12:34:56.789Z"
}
}
Additional headers:
Retry-After: 30 # Seconds until you can retry
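For reference, the 429 payload and its `Retry-After` header can be assembled from a single retry delay. This is only a sketch of the response shape shown above; `buildRateLimitExceeded` is a hypothetical name and the real middleware may construct the response differently.

```javascript
// Hypothetical helper: builds the status, headers, and body of a 429 response.
function buildRateLimitExceeded(retryAfterSeconds, now = new Date()) {
  return {
    status: 429,
    headers: { "Retry-After": String(retryAfterSeconds) },
    body: {
      error: {
        code: "rate_limit_exceeded",
        message: `Rate limit exceeded. Try again in ${retryAfterSeconds} seconds.`,
        timestamp: now.toISOString(),
      },
    },
  };
}

// Example: in an Express handler this could be sent with
// res.status(r.status).set(r.headers).json(r.body).
const r = buildRateLimitExceeded(30);
console.log(JSON.stringify(r.body, null, 2));
```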
How It Works
Architecture
- Storage: Firestore collection rateLimits with per-key request timestamps
- Algorithm: Sliding window counter approach with burst allowance
- Atomicity: Firestore transactions prevent race conditions on concurrent requests
- Graceful Degradation: Configurable failover behavior when rate limiting service is unavailable
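The atomicity point is easiest to see in code: the read of the stored timestamps and the write of the updated list happen inside one transaction, so two concurrent requests cannot both claim the last free slot. The sketch below assumes `db` is a Firestore instance from the Admin SDK and that each document stores a `timestamps` array of millisecond values. The function name and that field name are assumptions, not the project's actual schema.

```javascript
// Sketch: read-modify-write of one key's timestamps inside a Firestore transaction.
// `db` is assumed to be a Firestore instance from the Admin SDK.
async function recordRequestAtomically(db, key, { limit, windowMs }, now = Date.now()) {
  const ref = db.collection("rateLimits").doc(key);
  return db.runTransaction(async (tx) => {
    const snapshot = await tx.get(ref);
    // Assumed document shape: { timestamps: number[], updatedAt: number }.
    const stored = snapshot.exists ? (snapshot.data().timestamps ?? []) : [];
    const recent = stored.filter((t) => t > now - windowMs);
    if (recent.length >= limit) {
      // Over the limit: return without writing anything.
      return { allowed: false, count: recent.length };
    }
    tx.set(ref, { timestamps: [...recent, now], updatedAt: now });
    return { allowed: true, count: recent.length + 1 };
  });
}
```

If another request modifies the same document before the transaction commits, Firestore retries the callback with fresh data, which is what keeps the count accurate under concurrency.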
Sliding Window Algorithm
The system maintains an array of request timestamps for each key:
- On each request, filter out timestamps outside the current window
- Count remaining requests in the window
- Check if count exceeds limit + burst allowance
- If within limit, add current timestamp and allow request
- If exceeded, return 429 with retry-after information
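The steps above can be sketched as a pure function over an in-memory array, independent of Firestore. `checkSlidingWindow` and its return shape are hypothetical, and the retry calculation assumes timestamps are stored in ascending order (appended as requests arrive).

```javascript
// Sketch of the sliding window check over an array of millisecond timestamps.
function checkSlidingWindow(timestamps, now, { maxRequests, burst = 0, windowMs }) {
  // Step 1: drop timestamps that have fallen out of the window.
  const windowStart = now - windowMs;
  const recent = timestamps.filter((t) => t > windowStart);

  // Steps 2-3: compare the count against limit + burst allowance.
  const ceiling = maxRequests + burst;
  if (recent.length >= ceiling) {
    // Step 5: the oldest in-window request decides when a slot frees up
    // (assumes ascending order, so recent[0] is the oldest).
    const retryAfterMs = recent[0] + windowMs - now;
    return {
      allowed: false,
      timestamps: recent,
      retryAfterSeconds: Math.ceil(retryAfterMs / 1000),
    };
  }

  // Step 4: record the current request and allow it.
  return { allowed: true, timestamps: [...recent, now], retryAfterSeconds: 0 };
}

// Example: limit 2 + burst 1, three requests already in the window.
const result = checkSlidingWindow([0, 1000, 2000], 30000, {
  maxRequests: 2,
  burst: 1,
  windowMs: 60000,
});
console.log(result.allowed, result.retryAfterSeconds);
```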
Key Generation
- Authenticated: api:user:{userId}
- Unauthenticated: api:ip:{ipAddress}
- Endpoint-specific: endpoint:{path}:user:{userId} or endpoint:{path}:ip:{ipAddress}
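A key builder following these patterns might look like the sketch below. `buildRateLimitKey` and its parameters are hypothetical. The sketch prefers the user ID when one is present and falls back to the IP address otherwise.

```javascript
// Hypothetical helper: builds a rate limit key following the documented patterns.
function buildRateLimitKey({ userId = null, ip = null, endpoint = null }) {
  // Authenticated requests are keyed by user ID, all others by IP address.
  const subject = userId ? `user:${userId}` : `ip:${ip}`;
  return endpoint ? `endpoint:${endpoint}:${subject}` : `api:${subject}`;
}

console.log(buildRateLimitKey({ userId: "u1" }));
console.log(buildRateLimitKey({ ip: "203.0.113.5", endpoint: "expensive-op" }));
```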
Implementation
Global Rate Limiting
Rate limiting is automatically applied to all Public API routes via middleware in apps/functions/api.js:
// apps/functions/api.js
import { rateLimiter, checkSubscription } from "./middleware/rate-limit.js";
app.use(checkSubscription); // Check subscription status first
app.use(rateLimiter({ keyPrefix: "api", skipFailureOpen: true }));
Endpoint-Specific Rate Limiting
For additional rate limiting on specific endpoints:
import { endpointRateLimiter } from "../middleware/rate-limit.js";
router.post(
"/expensive-operation",
authenticateApiKey,
endpointRateLimiter({
maxRequests: 10,
windowMs: 60000,
keyPrefix: "expensive-op"
}),
asyncHandler(async (req, res) => {
// Your handler
})
);
Custom Rate Limiting
For custom rate limiting logic:
import { rateLimiter } from "../middleware/rate-limit.js";
const customLimiter = rateLimiter({
keyPrefix: "webhook",
skipFailureOpen: false, // Reject if rate limiting fails
});
app.use("/webhooks", customLimiter);
Scheduled Cleanup
A scheduled Cloud Function runs daily at 2:00 AM UTC to remove stale rate limit data:
- Function: scheduledRateLimitCleanup in apps/functions/scheduled/rate-limit-cleanup.js
- Cleanup threshold: Documents older than 24 hours
- Batch size: 500 documents per execution
- Automatic retry: 3 retries on failure
Best Practices
1. Monitor Rate Limit Usage
Track rate limit headers in your application:
const response = await fetch('/api/v1/posts', {
headers: { 'soku-api-key': apiKey }
});
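const remainingHeader = response.headers.get('X-RateLimit-Remaining');
const reset = response.headers.get('X-RateLimit-Reset');
// Header values are strings (or null when absent), so convert before comparing.
if (remainingHeader !== null && Number(remainingHeader) < 10) {
  console.warn(`Approaching rate limit; window resets at ${reset}`);
}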
2. Implement Exponential Backoff
When receiving 429 responses:
async function fetchWithRetry(url, options, maxRetries = 3) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const response = await fetch(url, options);
    if (response.status !== 429) {
      return response;
    }
    // Prefer the server's Retry-After (seconds); otherwise back off exponentially: 1s, 2s, 4s, ...
    const retryAfter = Number.parseInt(response.headers.get('Retry-After') ?? '', 10);
    const delayMs = Number.isFinite(retryAfter) ? retryAfter * 1000 : 1000 * 2 ** attempt;
    await new Promise(resolve => setTimeout(resolve, delayMs));
  }
  throw new Error('Max retries exceeded');
}
3. Batch Requests When Possible
Instead of making multiple individual requests, batch them:
// Bad: Multiple requests
for (const post of posts) {
await api.post('/v1/posts', post);
}
// Good: Single batch request
await api.post('/v1/posts/batch', { posts });
4. Cache Responses
Reduce API calls by caching responses:
const cache = new Map();
async function getCachedData(key) {
if (cache.has(key)) {
return cache.get(key);
}
const response = await api.get(`/v1/data/${key}`);
cache.set(key, response.data);
return response.data;
}
Monitoring & Testing
Testing
Run the comprehensive test suite:
cd apps/functions
npm test -- rate-limit.test.js
Test coverage includes:
- Rate limit enforcement across all tiers
- Sliding window accuracy
- Concurrent request handling
- Error scenarios and graceful degradation
- Cleanup operations
Monitoring
Track rate limiting metrics through multiple channels:
- Firestore Console: Query rateLimits collection for real-time usage
- Cloud Logging: Filter logs by rate-limit logger for violations and errors
- Response Headers: Monitor X-RateLimit-Remaining in API responses
- Error Tracking: Review 429 responses in errorLogs collection
Recommended alerts:
- Spike in 429 responses (indicates potential abuse or misconfigured limits)
- Rate limiting service failures (check graceful degradation metrics)
- Cleanup job failures (left unaddressed, the rateLimits collection grows without bound)
Troubleshooting
Rate limits not enforced
Cause: Middleware order incorrect
Solution: Verify rate limiting middleware loads before route handlers in apps/functions/api.js:
app.use(checkSubscription);
app.use(rateLimiter({ keyPrefix: "api" }));
app.use('/v1/posts', postsRouter); // Routes must come after
Limits too strict for legitimate traffic
Cause: Default limits don't match usage patterns
Solution: Adjust tier limits via environment variables:
# Increase authenticated tier from default 300 req/min
RATE_LIMIT_AUTH_MAX=500
RATE_LIMIT_AUTH_WINDOW=60000
Or whitelist specific users/IPs for admin tier access:
RATE_LIMIT_WHITELIST=user-abc123,192.168.1.100
Firestore transaction errors
Cause: Cleanup batch size exceeds Firestore transaction limits
Solution: Firestore transactions have constraints:
- Maximum 500 operations per transaction
- Maximum 10 MB transaction size
If encountering these limits, reduce the batch size in apps/functions/scheduled/rate-limit-cleanup.js:
.limit(500) // Reduce to 100 or 250
High latency on API requests
Cause: Rate limit checks adding overhead
Solution:
- Enable graceful degradation (already default): rateLimiter({ skipFailureOpen: true })
- Optimize Firestore queries: Create a composite index on the rateLimits collection:
  - Fields: updatedAt (Ascending), __name__ (Ascending)
- Consider Redis: For very high-traffic scenarios (>10K req/min), integrate Redis for faster rate limit storage (a future enhancement; see the "Future Enhancements" section)
Future Enhancements
Planned improvements for production scale:
- Redis Integration: Replace Firestore with Redis for sub-millisecond rate limit checks in high-traffic scenarios (>10K req/min)
- Distributed Rate Limiting: Coordinate rate limits across multiple Cloud Run regions for global deployments
- Dynamic Rate Limits: Auto-adjust limits based on system load, time of day, or user reputation scores
- User-Specific Overrides: Admin UI for setting custom rate limits per user without environment variable changes
- Rate Limit Dashboard: Real-time analytics showing top consumers, tier distribution, and violation patterns
- Progressive Enforcement: Graduated response system (warning headers → soft limits → hard limits) instead of immediate 429s
Security Considerations
IP Spoofing
Risk: req.ip can be spoofed via proxy headers
Mitigation:
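- Express derives req.ip from X-Forwarded-For only when the trust proxy setting is enabled; with the default (disabled), req.ip is the address of the direct connection
- For production, configure trust proxy so that only the expected sources (Cloud Load Balancer, Cloud Run) are trusted
- Consider additional client fingerprinting beyond IP addresses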
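As a concrete example of restricting proxy trust, Express accepts a hop count for its `trust proxy` setting, in which case `req.ip` is taken from the `X-Forwarded-For` entry that many hops from the server. The sketch below wraps that call. `configureProxyTrust` is a hypothetical helper, and the right hop count depends on how many proxies actually sit in front of the service, so the default of 1 is an assumption.

```javascript
// Hypothetical helper: trust only a fixed number of proxy hops in front of the app.
// `app` is assumed to be an Express application.
function configureProxyTrust(app, trustedHops = 1) {
  // With a numeric value, Express ignores X-Forwarded-For entries added
  // beyond the trusted hops, so a client cannot inject its own address.
  app.set("trust proxy", trustedHops);
  return app;
}
```

Setting `trust proxy` to `true` would instead trust every entry in the header, which is the configuration that makes `req.ip` spoofable.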
Distributed Attacks
Risk: IP-based rate limiting ineffective against DDoS from many sources
Mitigation:
- Deploy Cloud Armor for network-layer DDoS protection
- Implement CAPTCHA challenges for suspicious traffic patterns
- Use Cloud CDN to absorb and cache static content
Rate Limit Bypass
Risk: Attackers might attempt tier elevation by manipulating request context
Mitigation:
- Authentication middleware (authenticateApiKey) runs before rate limiting
- Subscription check (checkSubscription) verifies active status from trusted Firestore data
- Tier assignment logic uses server-side req.auth (not client-provided data)
Related
- APIs → Public API
- APIs → API Keys Lifecycle