Rate Limiting System
Overview
The rate limiting system protects the API from abuse while providing fair usage across subscription tiers. It implements a sliding window algorithm with tiered limits based on user subscriptions.
Implementation: apps/functions/middleware/rate-limit.js
Algorithm: Sliding Window
The system uses a sliding window algorithm that tracks individual request timestamps (often called a sliding window log). This provides smooth rate limiting without the "burst at window edge" problem of fixed windows.
How It Works
- Window tracking: For each request, maintain a sorted list of timestamps within the time window
- Old entry cleanup: Remove timestamps older than the window duration
- Count check: If the number of remaining timestamps has already reached the limit, reject the request
- New entry: Add current timestamp to the window
- Persistence: Store the window state in Firestore for distributed enforcement
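The steps above can be sketched in memory, without Firestore. This is a minimal illustration rather than the middleware itself: `createLimiter`, its option names, and the `Map` used in place of Firestore are assumptions made for the example.

```javascript
// In-memory sketch of the sliding window steps (no Firestore involved).
// `createLimiter` and its option names are hypothetical.
function createLimiter({ limit, windowMs }) {
  const windows = new Map(); // key -> ascending array of request timestamps (ms)

  return function check(key, now = Date.now()) {
    // Old entry cleanup: keep only timestamps still inside the window
    const cutoff = now - windowMs;
    const window = (windows.get(key) || []).filter((ts) => ts > cutoff);

    // Count check: reject once the window already holds `limit` requests
    const allowed = window.length < limit;

    // New entry: record the request only when it is allowed
    if (allowed) window.push(now);

    // Persistence: a Map here; the system described above uses Firestore
    windows.set(key, window);

    return { allowed, remaining: limit - window.length };
  };
}

// Usage: a limit of 2 requests per 1000 ms
const check = createLimiter({ limit: 2, windowMs: 1000 });
const results = [0, 100, 200, 1001].map((t) => check('user_abc123', t).allowed);
console.log(results); // [ true, true, false, true ]
```

The fourth request succeeds because the timestamp at 0 ms has left the window by 1001 ms, which frees one slot without waiting for a fixed boundary.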
Benefits
- Smooth limiting: No artificial boundaries where bursts can happen
- Fair distribution: Users can spread requests evenly across time
- Accurate counting: Precise tracking of requests in any time period
- Distributed: Works across multiple Cloud Functions instances via Firestore
Rate Limit Tiers
Limits are enforced based on the user's subscription tier, stored in users/{uid}.tier:
| Tier | Window | Limit | Description |
|---|---|---|---|
| free | 15 minutes | 5 requests | Free tier, minimal API access |
| pro | 1 minute | 10 requests | Pro subscription, higher limits |
| basic | 1 minute | 10 requests | Basic subscription (same as pro) |
| Staff | 1 minute | 1000 requests | Staff members (isStaff: true) get effectively unlimited access |
Tier Resolution
// Priority order:
1. Check if user has `isStaff: true` → Staff tier (1000 req/min)
2. Read `tier` field from user document → Apply tier limits
3. Default to `free` tier if no subscription
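The priority order can be expressed as a small pure function. This is a sketch under assumptions: `resolveTier`, `TIERS`, and `STAFF` are hypothetical names, the limits are copied from the tier table, and treating an unrecognized `tier` value as `free` is a guess about the intended behavior.

```javascript
// Limits copied from the tier table; windows are expressed in milliseconds.
const STAFF = { name: 'staff', limit: 1000, windowMs: 60 * 1000 };
const TIERS = {
  free: { limit: 5, windowMs: 15 * 60 * 1000 },
  basic: { limit: 10, windowMs: 60 * 1000 },
  pro: { limit: 10, windowMs: 60 * 1000 },
};

// Hypothetical helper: `user` is the data of users/{uid},
// or undefined when the document does not exist.
function resolveTier(user) {
  // 1. Staff flag wins over any subscription tier
  if (user?.isStaff === true) return STAFF;

  // 2. Use the tier field when it names a known tier
  // 3. Otherwise fall back to free (assumption for unknown values)
  const known = Object.prototype.hasOwnProperty.call(TIERS, user?.tier);
  const name = known ? user.tier : 'free';
  return { name, ...TIERS[name] };
}

console.log(resolveTier({ isStaff: true, tier: 'free' }).limit); // 1000
console.log(resolveTier({ tier: 'pro' }).limit); // 10
console.log(resolveTier(undefined).name); // free
```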
HTTP Headers
Every API response includes rate limit headers:
X-RateLimit-Limit: 10
X-RateLimit-Remaining: 7
X-RateLimit-Reset: 1640000000
Header Meanings
- X-RateLimit-Limit: Maximum requests allowed in the window
- X-RateLimit-Remaining: Requests remaining before hitting the limit
- X-RateLimit-Reset: Unix timestamp (seconds) when the oldest request expires from the window
Rate Limit Exceeded Response
When a user exceeds their rate limit:
Status: 429 Too Many Requests
Response body:
{
  "error": {
    "code": "rate_limit_exceeded",
    "message": "Rate limit exceeded. Try again in 45 seconds.",
    "timestamp": "2025-01-15T10:30:00.000Z",
    "requestId": "req_abc123"
  }
}
Headers:
X-RateLimit-Limit: 10
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1640000045
Retry-After: 45
The Retry-After header tells clients how many seconds to wait before retrying.
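One way the header values could be derived from the stored window is sketched below. `rateLimitHeaders` and its parameter names are hypothetical, and the sketch assumes timestamps are stored as milliseconds, oldest first; the actual middleware may compute these differently.

```javascript
// Hypothetical helper: derive header values from the pruned window.
// `window` holds request timestamps in ms, oldest first.
function rateLimitHeaders({ window, limit, windowMs, now, allowed }) {
  const remaining = Math.max(0, limit - window.length);

  // The oldest request leaves the window at oldest + windowMs
  const resetMs = window.length > 0 ? window[0] + windowMs : now;

  const headers = {
    'X-RateLimit-Limit': limit,
    'X-RateLimit-Remaining': remaining,
    'X-RateLimit-Reset': Math.ceil(resetMs / 1000), // Unix seconds
  };

  // Retry-After only accompanies a rejected (429) response
  if (!allowed) {
    headers['Retry-After'] = Math.max(0, Math.ceil((resetMs - now) / 1000));
  }
  return headers;
}

// Usage: a full 10-request window whose oldest entry is 15 s old
const now = 1640000000000;
const window = Array.from({ length: 10 }, (_, i) => now - 15000 + i * 1000);
const headers = rateLimitHeaders({ window, limit: 10, windowMs: 60000, now, allowed: false });
console.log(headers['X-RateLimit-Reset'], headers['Retry-After']); // 1640000045 45
```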
Firestore Storage
Rate limit state is stored in Firestore for distributed enforcement:
Collection: rateLimits/{key}
Document structure:
{
  key: string,        // Format: "rateLimit:{userId}:{endpoint}"
  userId: string,     // User ID for querying
  endpoint: string,   // API endpoint (e.g., "/v1/posts")
  window: timestamp[], // Sorted array of request timestamps
  lastRequestAt: Timestamp,
  createdAt: Timestamp
}
Storage Key Format
rateLimit:{userId}:{endpoint}
Examples:
rateLimit:user_abc123:/v1/posts
rateLimit:user_xyz789:/v1/media/upload
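Firestore document IDs cannot contain a forward slash, so a key that embeds a path such as /v1/posts presumably needs encoding before it is used as the ID under rateLimits/. How the middleware handles this is not shown here; the sketch below is one possible approach, and both helper names are hypothetical.

```javascript
// Hypothetical helpers: build the logical key, then derive a
// Firestore-safe document ID from it.
function buildRateLimitKey(userId, endpoint) {
  return `rateLimit:${userId}:${endpoint}`;
}

function toDocumentId(key) {
  // encodeURIComponent turns "/" into "%2F" (and ":" into "%3A")
  return encodeURIComponent(key);
}

const key = buildRateLimitKey('user_abc123', '/v1/posts');
console.log(key); // rateLimit:user_abc123:/v1/posts
console.log(toDocumentId(key)); // rateLimit%3Auser_abc123%3A%2Fv1%2Fposts
```

The readable key can still be stored in the document's `key` field, so only the document ID is encoded.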
Cleanup Strategy
Old entries in the window array are automatically cleaned up on each request:
// Remove timestamps older than the window duration
const cutoff = now - windowDuration;
window = window.filter(timestamp => timestamp > cutoff);
No background job is needed because cleanup happens inline with request processing.
Implementation Details
Middleware Flow
// 1. Extract user from authenticated request
const { uid } = req.user;
// 2. Determine rate limit tier
const tier = await determineUserTier(uid);
// 3. Build rate limit key
const key = `rateLimit:${uid}:${req.path}`;
// 4. Check and update window
const { allowed, remaining, resetTime } = await checkRateLimit(key, tier);
// 5. Set response headers
res.setHeader('X-RateLimit-Limit', tier.limit);
res.setHeader('X-RateLimit-Remaining', remaining);
res.setHeader('X-RateLimit-Reset', resetTime);
// 6. Reject or allow
if (!allowed) {
  return res.status(429).json({ error: { code: 'rate_limit_exceeded', ... } });
}
next();
Atomic Updates
Firestore transactions ensure atomic read-modify-write operations:
await db.runTransaction(async (transaction) => {
  const doc = await transaction.get(rateLimitRef);
  const data = doc.data() || { window: [] };
  // Clean old entries
  const cutoff = now - windowDuration;
  const window = data.window.filter(ts => ts > cutoff);
  // Check limit
  if (window.length >= limit) {
    throw new Error('Rate limit exceeded');
  }
  // Add new timestamp
  window.push(now);
  transaction.set(rateLimitRef, {
    key,
    userId: uid,
    endpoint: req.path,
    window,
    lastRequestAt: now,
    createdAt: data.createdAt || now
  });
});
Performance Considerations
Write Amplification
Every API request writes to Firestore to update the rate limit window. This is acceptable because:
- Write costs: Firestore writes are cheap (~$0.18 per 100K writes)
- Small documents: Rate limit docs are tiny (typically less than 1KB)
- TTL-aware: Old timestamps are pruned, keeping documents small
- Critical feature: Rate limiting prevents abuse, worth the cost
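As a rough worked example of the write cost, the arithmetic can be sketched as follows. The price is the approximate figure quoted above and varies by region; the traffic volume is purely illustrative, and `dailyWriteCostUsd` is a hypothetical helper.

```javascript
// Approximate price quoted above; actual pricing depends on region.
const USD_PER_100K_WRITES = 0.18;

// Hypothetical helper: assumes at most one rate limit write per API request.
function dailyWriteCostUsd(requestsPerDay) {
  return (requestsPerDay / 100000) * USD_PER_100K_WRITES;
}

// Illustrative volume: 1,000,000 requests per day
console.log(dailyWriteCostUsd(1000000).toFixed(2)); // 1.80
```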
Read Optimization
The rate limit check requires one Firestore read per request. To minimize latency:
- Single document read: All rate limit data is in one document
- Index-free: No composite indexes needed for rate limiting
- Co-located: Rate limit data is in the same region as Cloud Functions
Alternatives Considered
| Approach | Pros | Cons | Decision |
|---|---|---|---|
| Redis/Memorystore | Faster, in-memory | Additional service, cost, complexity | ❌ Rejected: Firestore is "good enough" |
| Cloud Armor | GCP-native, edge enforcement | Limited to IP-based, no user context | ❌ Rejected: Need per-user limits |
| Fixed windows | Simpler logic | Burst at window edges | ❌ Rejected: Sliding window is better UX |
| Firestore sliding window | No additional services, distributed | Slightly higher latency | ✅ Chosen: Best balance |
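The "burst at window edges" drawback of fixed windows can be shown with a small comparison. Both functions are hypothetical illustrations written for this example, not code from the middleware: with a limit of 10 per minute, a client that sends 10 requests just before a minute boundary and 10 just after gets all 20 through a fixed window, but only 10 through a sliding window.

```javascript
// Fixed window: count requests per aligned bucket of windowMs.
function fixedWindowAccepted(timestamps, limit, windowMs) {
  const counts = new Map();
  let accepted = 0;
  for (const ts of timestamps) {
    const bucket = Math.floor(ts / windowMs);
    const count = counts.get(bucket) || 0;
    if (count < limit) {
      counts.set(bucket, count + 1);
      accepted += 1;
    }
  }
  return accepted;
}

// Sliding window: count accepted requests in the trailing windowMs.
function slidingWindowAccepted(timestamps, limit, windowMs) {
  let window = [];
  let accepted = 0;
  for (const ts of timestamps) {
    window = window.filter((t) => t > ts - windowMs);
    if (window.length < limit) {
      window.push(ts);
      accepted += 1;
    }
  }
  return accepted;
}

// 10 requests just before the 60 s boundary, 10 just after (times in ms)
const burst = [];
for (let i = 0; i < 10; i++) burst.push(59000 + i * 10);
for (let i = 0; i < 10; i++) burst.push(60000 + i * 10);

console.log(fixedWindowAccepted(burst, 10, 60000)); // 20
console.log(slidingWindowAccepted(burst, 10, 60000)); // 10
```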
Testing Rate Limits
Manual Testing
# Get your API key
API_KEY="your_api_key_here"
# Test free tier (5 req / 15 min)
for i in {1..6}; do
  echo "Request $i:"
  curl -H "X-API-Key: $API_KEY" \
    -H "Content-Type: application/json" \
    -d '{"post":{"content":{"text":"test"},"platforms":["threads"]}}' \
    https://YOUR_FUNCTION_URL/v1/posts
  echo -e "\n---"
done
# The 6th request should return 429
Observing Headers
# Watch rate limit headers
curl -i -H "X-API-Key: $API_KEY" \
https://YOUR_FUNCTION_URL/v1/posts | grep -i ratelimit
# Output:
# X-RateLimit-Limit: 10
# X-RateLimit-Remaining: 7
# X-RateLimit-Reset: 1705320045
Upgrading Tiers
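# Upgrade the user to the pro tier (as admin): set tier to "pro" on users/USER_ID,
# for example in the Firebase console or with the Admin SDK.
# (The Firebase CLI has no command for updating a single document.)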
# Verify new limits
curl -i -H "X-API-Key: $API_KEY" https://YOUR_FUNCTION_URL/v1/posts | grep X-RateLimit-Limit
# Should now show: X-RateLimit-Limit: 10
Monitoring & Alerts
Key Metrics
Track these metrics for rate limiting health:
- Rate limit hit rate: Percentage of requests that hit 429
  - Query errorLogs collection for code: 'rate_limit_exceeded'
  - Alert if >5% of requests are rate limited (indicates abuse or wrong tier assignment)
- Per-user hit patterns: Users consistently hitting limits
  - May need tier upgrade or are attempting abuse
  - Query: rateLimits where window.length >= limit
- Tier distribution: How many users in each tier
  - Query: Group users by tier field
  - Helps understand revenue vs. usage
Example Queries
// Count rate limit errors in last hour
db.collection('errorLogs')
  .where('code', '==', 'rate_limit_exceeded')
  .where('timestamp', '>=', hourAgo)
  .get()
// Find users hitting limits frequently.
// Firestore cannot filter on array length, so compare window.length
// against the user's tier limit (`limit`) after fetching.
db.collection('rateLimits')
  .where('lastRequestAt', '>=', hourAgo)
  .get()
  .then(snapshot => snapshot.docs.filter(doc => doc.data().window.length >= limit))
// Tier distribution
db.collection('users')
  .get()
  .then(snapshot => {
    const tiers = {};
    snapshot.forEach(doc => {
      const tier = doc.data().tier || 'free';
      tiers[tier] = (tiers[tier] || 0) + 1;
    });
    return tiers;
  })
Alerting Thresholds
| Metric | Threshold | Action |
|---|---|---|
| Rate limit hit rate | >5% | Investigate abuse or tier misconfigurations |
| Free tier users hitting limit | >100/hour | Consider tier upgrade campaigns or adjusting free limits |
| Staff accounts rate limited | Any occurrence | Bug in tier detection, investigate immediately |
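The thresholds in the table could be checked with a small function once the counts are available. This is a sketch only: `evaluateAlerts`, its input field names, and the alert identifiers are hypothetical, and the counts are assumed to come from monitoring queries that are not shown here.

```javascript
// Hypothetical evaluation of the alerting thresholds in the table.
function evaluateAlerts({ totalRequests, limitedRequests, freeUsersLimitedPerHour, staffLimited }) {
  const alerts = [];

  // Rate limit hit rate above 5%
  const hitRate = totalRequests > 0 ? limitedRequests / totalRequests : 0;
  if (hitRate > 0.05) alerts.push('rate_limit_hit_rate');

  // More than 100 free tier users hitting the limit per hour
  if (freeUsersLimitedPerHour > 100) alerts.push('free_tier_pressure');

  // Any staff account being rate limited points at a tier detection bug
  if (staffLimited > 0) alerts.push('staff_rate_limited');

  return alerts;
}

console.log(evaluateAlerts({
  totalRequests: 1000,
  limitedRequests: 60,
  freeUsersLimitedPerHour: 20,
  staffLimited: 0,
})); // [ 'rate_limit_hit_rate' ]
```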
Security Considerations
Bypassing Rate Limits
The system enforces limits at the middleware layer before business logic. Users cannot bypass by:
- ❌ Changing API key: Rate limits are per-user (via req.user.uid), not per key
- ❌ Multiple keys: Same uid → same rate limit bucket
- ❌ Rotating IPs: Not IP-based, user-based
Staff Access
Staff members (isStaff: true) have elevated limits (1000 req/min) for:
- Admin operations
- Testing and debugging
- Support tasks
Security: The staff flag can only be set through trusted paths:
- Firebase Admin SDK (not exposed to client)
- Manual Firestore writes by administrators
Firestore Security Rules prevent clients from writing the flag themselves.
Future Enhancements
Potential improvements to consider:
- Dynamic tier limits: Allow per-user custom limits in user document
- Burst allowance: Allow short bursts above limit with token bucket
- Endpoint-specific limits: Different limits for different endpoints (e.g., read vs. write)
- Redis caching: Cache tier lookups to reduce Firestore reads
- Grace period: Warning headers before hard limit (e.g., at 80% usage)
- Usage dashboard: Show users their current usage and limits
Related Documentation
- Public API - API reference with rate limit info
- Credit System - AI credit tracking (separate from rate limits)
- Tracing & Logging - Observability and debugging