
Rate Limiting System

Quick links: Public API, Orchestration

Overview

The rate limiting system protects the API from abuse while providing fair usage across subscription tiers. It implements a sliding window algorithm with tiered limits based on user subscriptions.

Implementation: apps/functions/middleware/rate-limit.js

Algorithm: Sliding Window

The system uses a sliding window algorithm that keeps a log of request timestamps per user and endpoint. This gives smooth rate limiting without the "burst at window edge" problem of fixed windows.

How It Works

  1. Window tracking: For each user and endpoint, maintain a sorted list of request timestamps within the time window
  2. Old entry cleanup: Remove timestamps older than the window duration
  3. Count check: If the number of remaining timestamps has reached the limit, reject the request
  4. New entry: Otherwise, add the current timestamp to the window
  5. Persistence: Store the window state in Firestore for distributed enforcement
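
Steps 1 to 4 can be sketched in memory as follows. This is a minimal illustration, not the code in rate-limit.js: the function name `checkSlidingWindow` and its parameters are hypothetical, timestamps are plain millisecond numbers, and the Firestore persistence of step 5 is left out.

```javascript
// Hypothetical in-memory version of steps 1-4 (no Firestore persistence).
// timestamps: ascending array of request times in ms for one user + endpoint.
function checkSlidingWindow(timestamps, nowMs, windowMs, limit) {
  // Step 2: drop entries that have fallen out of the window.
  const cutoff = nowMs - windowMs;
  const active = timestamps.filter((ts) => ts > cutoff);

  // Step 3: reject once the window already holds `limit` requests.
  if (active.length >= limit) {
    return { allowed: false, timestamps: active };
  }

  // Step 4: record the current request.
  return { allowed: true, timestamps: [...active, nowMs] };
}

// Free tier: 5 requests per 15 minutes.
const WINDOW_MS = 15 * 60 * 1000;
let state = [];
const results = [];
for (let i = 0; i < 6; i++) {
  const r = checkSlidingWindow(state, 1000 + i, WINDOW_MS, 5);
  state = r.timestamps;
  results.push(r.allowed);
}
console.log(results); // [ true, true, true, true, true, false ]

// Once the first request ages out of the window, a slot frees up again.
console.log(checkSlidingWindow(state, 1000 + WINDOW_MS, WINDOW_MS, 5).allowed); // true
```

Returning a new array instead of mutating the input keeps the function easy to test and easy to call from inside a transaction callback.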

Benefits

  • Smooth limiting: No artificial boundaries where bursts can happen
  • Fair distribution: Users can spread requests evenly across time
  • Accurate counting: Precise tracking of requests in any time period
  • Distributed: Works across multiple Cloud Functions instances via Firestore

Rate Limit Tiers

Limits are enforced based on the user's subscription tier, stored in users/{uid}.tier:

| Tier | Window | Limit | Description |
|------|--------|-------|-------------|
| free | 15 minutes | 5 requests | Free tier, minimal API access |
| pro | 1 minute | 10 requests | Pro subscription, higher limits |
| basic | 1 minute | 10 requests | Basic subscription (same as pro) |
| Staff | 1 minute | 1000 requests | Staff members (isStaff: true) get effectively unlimited access |

Tier Resolution

// Priority order:
1. Check if user has `isStaff: true` → Staff tier (1000 req/min)
2. Read `tier` field from user document → Apply tier limits
3. Default to `free` tier if no subscription
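
The priority order could be expressed as a small pure function. This is a sketch under assumptions: `resolveTier`, `SUBSCRIPTION_TIERS`, and `STAFF_TIER` are hypothetical names, the limits are copied from the tier table, and treating an unrecognized `tier` value as `free` is a guess rather than documented behavior.

```javascript
// Limits copied from the tier table; the object shape is an assumption.
const SUBSCRIPTION_TIERS = {
  free: { windowMs: 15 * 60 * 1000, limit: 5 },
  basic: { windowMs: 60 * 1000, limit: 10 },
  pro: { windowMs: 60 * 1000, limit: 10 },
};
const STAFF_TIER = { windowMs: 60 * 1000, limit: 1000 };

// userDoc: plain data read from users/{uid}, or undefined if the document is missing.
function resolveTier(userDoc) {
  const data = userDoc || {};

  // 1. The staff flag wins over any subscription tier.
  if (data.isStaff === true) {
    return { name: 'staff', ...STAFF_TIER };
  }

  // 2. Use the stored tier if it is one we know about.
  // 3. Otherwise fall back to the free tier.
  const known = Object.prototype.hasOwnProperty.call(SUBSCRIPTION_TIERS, data.tier);
  const name = known ? data.tier : 'free';
  return { name, ...SUBSCRIPTION_TIERS[name] };
}

console.log(resolveTier({ isStaff: true, tier: 'free' }).limit); // 1000
console.log(resolveTier({ tier: 'pro' }).limit);                 // 10
console.log(resolveTier(undefined).name);                        // free
```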

HTTP Headers

Every API response includes rate limit headers:

X-RateLimit-Limit: 10
X-RateLimit-Remaining: 7
X-RateLimit-Reset: 1640000000

Header Meanings

  • X-RateLimit-Limit: Maximum requests allowed in the window
  • X-RateLimit-Remaining: Requests remaining before hitting the limit
  • X-RateLimit-Reset: Unix timestamp (seconds) when the oldest request expires from the window
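
As an illustration of how the three values relate to the window, the sketch below derives them from a list of timestamps. The function name `rateLimitHeaders` is hypothetical, timestamps are assumed to be millisecond numbers, and the behavior for an empty window (reporting the current time as the reset) is an assumption, not something the implementation is known to do.

```javascript
// Derive header values from the current window (hypothetical helper).
function rateLimitHeaders(timestamps, nowMs, windowMs, limit) {
  const cutoff = nowMs - windowMs;
  const active = timestamps.filter((ts) => ts > cutoff);

  // The oldest active request leaves the window windowMs after it was made.
  // With an empty window this sketch reports the current time (an assumption).
  const resetMs = active.length > 0 ? Math.min(...active) + windowMs : nowMs;

  return {
    'X-RateLimit-Limit': limit,
    'X-RateLimit-Remaining': Math.max(0, limit - active.length),
    'X-RateLimit-Reset': Math.ceil(resetMs / 1000), // Unix seconds
  };
}

// Three requests in the last minute, oldest at 1639999940 s, limit 10 per minute.
const headers = rateLimitHeaders(
  [1639999940000, 1639999950000, 1639999960000],
  1639999970000,
  60 * 1000,
  10
);
console.log(headers['X-RateLimit-Limit']);     // 10
console.log(headers['X-RateLimit-Remaining']); // 7
console.log(headers['X-RateLimit-Reset']);     // 1640000000
```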

Rate Limit Exceeded Response

When a user exceeds their rate limit:

Status: 429 Too Many Requests

Response body:

{
  "error": {
    "code": "rate_limit_exceeded",
    "message": "Rate limit exceeded. Try again in 45 seconds.",
    "timestamp": "2025-01-15T10:30:00.000Z",
    "requestId": "req_abc123"
  }
}

Headers:

X-RateLimit-Limit: 10
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1736937045
Retry-After: 45

The Retry-After header tells clients how many seconds to wait before retrying.
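
On the client side, the wait time can be read from these headers. The helper below is a sketch, not part of the API: `retryDelayMs` is a hypothetical name, headers are assumed to arrive as a plain object with lower-cased names, and the one-second fallback is an arbitrary choice.

```javascript
// Decide how long a client should wait before retrying (hypothetical helper).
// headers: plain object with lower-cased header names (assumed shape).
function retryDelayMs(status, headers, nowMs) {
  if (status !== 429) return 0; // not rate limited, no wait needed

  // Prefer Retry-After, which is already expressed in seconds.
  const retryAfter = Number(headers['retry-after']);
  if (Number.isFinite(retryAfter) && retryAfter >= 0) {
    return retryAfter * 1000;
  }

  // Fall back to X-RateLimit-Reset, a Unix timestamp in seconds.
  const reset = Number(headers['x-ratelimit-reset']);
  if (Number.isFinite(reset)) {
    return Math.max(0, reset * 1000 - nowMs);
  }

  return 1000; // arbitrary fallback when neither header is present
}

console.log(retryDelayMs(429, { 'retry-after': '45' }, Date.now()));                // 45000
console.log(retryDelayMs(429, { 'x-ratelimit-reset': '1736937045' }, 1736937000000)); // 45000
console.log(retryDelayMs(200, {}, Date.now()));                                     // 0
```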

Firestore Storage

Rate limit state is stored in Firestore for distributed enforcement:

Collection: rateLimits/{key}

Document structure:

{
  key: string,           // Format: "rateLimit:{userId}:{endpoint}"
  userId: string,        // User ID for querying
  endpoint: string,      // API endpoint (e.g., "/v1/posts")
  window: timestamp[],   // Sorted array of request timestamps
  lastRequestAt: Timestamp,
  createdAt: Timestamp
}

Storage Key Format

rateLimit:{userId}:{endpoint}

Examples:

  • rateLimit:user_abc123:/v1/posts
  • rateLimit:user_xyz789:/v1/media/upload
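
One detail to keep in mind: Firestore document IDs cannot contain a forward slash, so a key that embeds an endpoint path such as /v1/posts cannot be used verbatim as the ID in rateLimits/{key}. How rate-limit.js handles this is not shown here. The sketch below uses URL-encoding as one possible approach; `buildRateLimitKey` and `toDocumentId` are hypothetical names.

```javascript
// Build the logical key in the documented format.
function buildRateLimitKey(userId, endpoint) {
  return `rateLimit:${userId}:${endpoint}`;
}

// Assumption: encode the key so it contains no "/" before using it as a document ID.
function toDocumentId(key) {
  return encodeURIComponent(key);
}

const key = buildRateLimitKey('user_abc123', '/v1/posts');
console.log(key);               // rateLimit:user_abc123:/v1/posts
console.log(toDocumentId(key)); // rateLimit%3Auser_abc123%3A%2Fv1%2Fposts
```

The encoding is reversible with `decodeURIComponent`, so the original key can still be recovered from a document ID when inspecting the collection.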

Cleanup Strategy

Old entries in the window array are automatically cleaned up on each request:

// Remove timestamps older than the window duration
const cutoff = now - windowDuration;
window = window.filter(timestamp => timestamp > cutoff);

No background job is needed because cleanup happens inline with request processing.

Implementation Details

Middleware Flow

// 1. Extract user from authenticated request
const { uid } = req.user;

// 2. Determine rate limit tier
const tier = await determineUserTier(uid);

// 3. Build rate limit key
const key = `rateLimit:${uid}:${req.path}`;

// 4. Check and update window
const { allowed, remaining, resetTime } = await checkRateLimit(key, tier);

// 5. Set response headers
res.setHeader('X-RateLimit-Limit', tier.limit);
res.setHeader('X-RateLimit-Remaining', remaining);
res.setHeader('X-RateLimit-Reset', resetTime);

// 6. Reject or allow
if (!allowed) {
  return res.status(429).json({ error: { code: 'rate_limit_exceeded', ... } });
}

next();

Atomic Updates

Firestore transactions ensure atomic read-modify-write operations:

await db.runTransaction(async (transaction) => {
  const doc = await transaction.get(rateLimitRef);
  const data = doc.data() || { window: [] };

  // Clean old entries
  const cutoff = now - windowDuration;
  const window = data.window.filter(ts => ts > cutoff);

  // Check limit
  if (window.length >= limit) {
    throw new Error('Rate limit exceeded');
  }

  // Add new timestamp
  window.push(now);

  transaction.set(rateLimitRef, {
    key,
    userId: uid,
    endpoint: req.path,
    window,
    lastRequestAt: now,
    createdAt: data.createdAt || now
  });
});

Performance Considerations

Write Amplification

Every API request writes to Firestore to update the rate limit window. This is acceptable because:

  1. Write costs: Firestore writes are cheap (roughly $0.18 per 100K writes; the exact price depends on region)
  2. Small documents: Rate limit docs are tiny (typically less than 1KB)
  3. Inline pruning: Old timestamps are pruned on every request, keeping documents small
  4. Critical feature: Rate limiting prevents abuse, worth the cost
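
To put the write cost in perspective, here is a rough estimate based on the price quoted above. It assumes one Firestore write per request and a 30-day month; `monthlyWriteCostUsd` is a hypothetical helper and the traffic figures are made up for illustration.

```javascript
// Price quoted above; actual Firestore pricing varies by region.
const PRICE_PER_100K_WRITES_USD = 0.18;

// Estimate the monthly cost of rate limit writes, assuming one write per request.
function monthlyWriteCostUsd(requestsPerDay, daysPerMonth = 30) {
  const writes = requestsPerDay * daysPerMonth;
  const cost = (writes / 100000) * PRICE_PER_100K_WRITES_USD;
  return Math.round(cost * 100) / 100; // round to cents
}

console.log(monthlyWriteCostUsd(100000));  // 5.4  (3M writes per month)
console.log(monthlyWriteCostUsd(1000000)); // 54   (30M writes per month)
```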

Read Optimization

The rate limit check requires one Firestore read per request. To minimize latency:

  1. Single document read: All rate limit data is in one document
  2. Index-free: No composite indexes needed for rate limiting
  3. Co-located: Rate limit data is in the same region as Cloud Functions

Alternatives Considered

| Approach | Pros | Cons | Decision |
|----------|------|------|----------|
| Redis/Memorystore | Faster, in-memory | Additional service, cost, complexity | ❌ Rejected: Firestore is "good enough" |
| Cloud Armor | GCP-native, edge enforcement | Limited to IP-based, no user context | ❌ Rejected: Need per-user limits |
| Fixed windows | Simpler logic | Burst at window edges | ❌ Rejected: Sliding window is better UX |
| Firestore sliding window | No additional services, distributed | Slightly higher latency | Chosen: Best balance |
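
The "burst at window edges" drawback of fixed windows can be shown with a small simulation. Both counters below are simplified in-memory sketches with hypothetical names. They use the 10 requests per minute limit from the tier table and are not the production code.

```javascript
const LIMIT = 10;
const WINDOW_MS = 60 * 1000;

// Fixed window: one counter per aligned 60 s bucket.
function countAllowedFixed(requestTimes) {
  const counts = new Map();
  let allowed = 0;
  for (const t of requestTimes) {
    const bucket = Math.floor(t / WINDOW_MS);
    const used = counts.get(bucket) || 0;
    if (used < LIMIT) {
      counts.set(bucket, used + 1);
      allowed++;
    }
  }
  return allowed;
}

// Sliding window: timestamps of accepted requests in the trailing 60 s.
function countAllowedSliding(requestTimes) {
  let active = [];
  let allowed = 0;
  for (const t of requestTimes) {
    active = active.filter((ts) => ts > t - WINDOW_MS);
    if (active.length < LIMIT) {
      active.push(t);
      allowed++;
    }
  }
  return allowed;
}

// 10 requests just before the minute boundary and 10 just after:
// 20 requests within about 2 seconds.
const burst = [];
for (let i = 0; i < 10; i++) burst.push(59000 + i * 100); // 59.0 s to 59.9 s
for (let i = 0; i < 10; i++) burst.push(60000 + i * 100); // 60.0 s to 60.9 s

console.log(countAllowedFixed(burst));   // 20
console.log(countAllowedSliding(burst)); // 10
```

The fixed window lets through twice the nominal limit because the counter resets at the boundary, while the sliding window still sees the first ten requests as recent.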

Testing Rate Limits

Manual Testing

# Get your API key
API_KEY="your_api_key_here"

# Test free tier (5 req / 15 min)
for i in {1..6}; do
  echo "Request $i:"
  curl -H "X-API-Key: $API_KEY" \
    -H "Content-Type: application/json" \
    -d '{"post":{"content":{"text":"test"},"platforms":["threads"]}}' \
    https://YOUR_FUNCTION_URL/v1/posts
  echo -e "\n---"
done

# The 6th request should return 429

Observing Headers

# Watch rate limit headers
curl -s -i -H "X-API-Key: $API_KEY" \
  https://YOUR_FUNCTION_URL/v1/posts | grep -i ratelimit

# Output:
# X-RateLimit-Limit: 10
# X-RateLimit-Remaining: 7
# X-RateLimit-Reset: 1705320045

Upgrading Tiers

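# Upgrade user to pro tier (as admin).
# The Firebase CLI has no command for updating a document, so set the tier
# field on users/USER_ID to "pro" in the Firebase console or through the
# Admin SDK, for example:
#   admin.firestore().doc('users/USER_ID').update({ tier: 'pro' })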

# Verify new limits
curl -i -H "X-API-Key: $API_KEY" https://YOUR_FUNCTION_URL/v1/posts | grep X-RateLimit-Limit
# Should now show: X-RateLimit-Limit: 10

Monitoring & Alerts

Key Metrics

Track these metrics for rate limiting health:

  1. Rate limit hit rate: Percentage of requests that return 429
    • Query the errorLogs collection for code: 'rate_limit_exceeded'
    • Alert if >5% of requests are rate limited (indicates abuse or wrong tier assignment)
  2. Per-user hit patterns: Users consistently hitting limits
    • These users may need a tier upgrade or may be attempting abuse
    • Query: fetch recent rateLimits documents, then compare window.length with the limit in application code (Firestore cannot filter on array length)
  3. Tier distribution: How many users are in each tier
    • Query: Group users by tier field
    • Helps understand revenue vs. usage

Example Queries

// Count rate limit errors in last hour
db.collection('errorLogs')
  .where('code', '==', 'rate_limit_exceeded')
  .where('timestamp', '>=', hourAgo)
  .get()

// Find users hitting limits frequently.
// Firestore cannot filter on array length, so fetch recently active documents
// and compare window.length with the applicable limit in application code.
db.collection('rateLimits')
  .where('lastRequestAt', '>=', hourAgo)
  .get()
  .then(snapshot => snapshot.docs.filter(doc => doc.data().window.length >= limit))

// Tier distribution
db.collection('users')
  .get()
  .then(snapshot => {
    const tiers = {};
    snapshot.forEach(doc => {
      const tier = doc.data().tier || 'free';
      tiers[tier] = (tiers[tier] || 0) + 1;
    });
    return tiers;
  })

Alerting Thresholds

| Metric | Threshold | Action |
|--------|-----------|--------|
| Rate limit hit rate | >5% | Investigate abuse or tier misconfigurations |
| Free tier users hitting limit | >100/hour | Consider tier upgrade campaigns or adjusting free limits |
| Staff accounts rate limited | Any occurrence | Bug in tier detection, investigate immediately |
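
These thresholds could be checked by a small function fed with counts from the queries above. This is only a sketch: `evaluateAlerts`, the shape of the metrics object, and the alert names are all hypothetical.

```javascript
// Compare collected counts against the alerting thresholds (hypothetical helper).
function evaluateAlerts(metrics) {
  const {
    totalRequests,
    rateLimitedRequests,
    freeUsersLimitedLastHour,
    staffRateLimitedCount,
  } = metrics;
  const alerts = [];

  // Rate limit hit rate above 5%.
  const hitRate = totalRequests > 0 ? rateLimitedRequests / totalRequests : 0;
  if (hitRate > 0.05) alerts.push('hit_rate_above_5_percent');

  // More than 100 free tier limit hits per hour.
  if (freeUsersLimitedLastHour > 100) alerts.push('free_tier_hits_above_100_per_hour');

  // Any staff account being rate limited points to a tier detection bug.
  if (staffRateLimitedCount > 0) alerts.push('staff_account_rate_limited');

  return alerts;
}

// 150 of 2000 requests were rate limited: 7.5%, above the 5% threshold.
const alerts = evaluateAlerts({
  totalRequests: 2000,
  rateLimitedRequests: 150,
  freeUsersLimitedLastHour: 40,
  staffRateLimitedCount: 0,
});
console.log(alerts); // [ 'hit_rate_above_5_percent' ]
```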

Security Considerations

Bypassing Rate Limits

The system enforces limits at the middleware layer, before any business logic runs. Users cannot bypass limits by:

  • Changing API key: Rate limits are per-user (via req.user.uid), not per key
  • Multiple keys: Same uid → same rate limit bucket
  • Rotating IPs: Limits are keyed by user, not by IP address

Staff Access

Staff members (isStaff: true) have elevated limits (1000 req/min) for:

  • Admin operations
  • Testing and debugging
  • Support tasks

Security: The staff flag can only be set through trusted paths:

  1. Firebase Admin SDK (not exposed to clients)
  2. Manual Firestore writes by administrators

Firestore Security Rules prevent clients from writing the flag themselves.

Future Enhancements

Potential improvements to consider:

  1. Dynamic tier limits: Allow per-user custom limits in user document
  2. Burst allowance: Allow short bursts above limit with token bucket
  3. Endpoint-specific limits: Different limits for different endpoints (e.g., read vs. write)
  4. Redis caching: Cache tier lookups to reduce Firestore reads
  5. Grace period: Warning headers before hard limit (e.g., at 80% usage)
  6. Usage dashboard: Show users their current usage and limits