Rate Limiting System

Overview

The rate limiting system protects the API from abuse while providing fair usage across subscription tiers. It implements a sliding window algorithm with tiered limits based on user subscriptions.

Implementation: apps/functions/middleware/rate-limit.js

Algorithm: Sliding Window

The system uses a sliding window log: it stores the timestamp of each recent request and counts the ones that fall inside the window. This provides smooth rate limiting without the "burst at window edge" problem of fixed windows.

How It Works

Window tracking: For each request, maintain a sorted list of timestamps within the time window
Old entry cleanup: Remove timestamps older than the window duration
Count check: If the number of remaining timestamps has reached the limit, reject the request
New entry: Add current timestamp to the window
Persistence: Store the window state in Firestore for distributed enforcement
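
The steps above can be sketched in memory, leaving out the Firestore persistence step. This is a minimal illustration rather than the code in rate-limit.js; the function name `checkSlidingWindow` and the millisecond units are assumptions made for the example.

```javascript
// Hypothetical in-memory sketch of the sliding window steps (persistence omitted).
// Timestamps and windowMs are in milliseconds (an assumption for this example).
function checkSlidingWindow(timestamps, now, windowMs, limit) {
  // Old entry cleanup: keep only timestamps inside the window
  const cutoff = now - windowMs;
  const window = timestamps.filter((ts) => ts > cutoff);

  // Count check: reject once the window already holds `limit` requests
  if (window.length >= limit) {
    return { allowed: false, window };
  }

  // New entry: record the current request
  window.push(now);
  return { allowed: true, window };
}

// Usage: a limit of 2 requests per 60 seconds
const first = checkSlidingWindow([], 0, 60000, 2);
const second = checkSlidingWindow(first.window, 1000, 60000, 2);
const third = checkSlidingWindow(second.window, 2000, 60000, 2); // window is full
const later = checkSlidingWindow(third.window, 60500, 60000, 2); // request at t=0 has expired
```

The third call is rejected because two requests are already inside the window; the call at 60.5 seconds is allowed again because the request made at time 0 has aged out.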

Benefits

Smooth limiting: No artificial boundaries where bursts can happen
Fair distribution: Users can spread requests evenly across time
Accurate counting: Precise tracking of requests in any time period
Distributed: Works across multiple Cloud Functions instances via Firestore

Rate Limit Tiers

Limits are enforced based on the user's subscription tier, stored in users/{uid}.tier:

Tier	Window	Limit	Description
`free`	15 minutes	5 requests	Free tier, minimal API access
`pro`	1 minute	10 requests	Pro subscription, higher limits
`basic`	1 minute	10 requests	Basic subscription (same as pro)
Staff	1 minute	1000 requests	Staff members (`isStaff: true`) get effectively unlimited access

Tier Resolution

Priority order (first match wins):
Check if user has `isStaff: true` → Staff tier (1000 req/min)
Read `tier` field from user document → Apply tier limits
Default to `free` tier if no `tier` field is set
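
As a rough sketch of this priority order, the function below resolves limits from a plain user document object. The names `TIER_LIMITS`, `STAFF_LIMITS` and `resolveTier` are hypothetical, the limits are copied from the tier table, and treating an unrecognized `tier` value as `free` is an assumption.

```javascript
// Limits copied from the tier table; all names here are hypothetical.
const TIER_LIMITS = {
  free: { windowMs: 15 * 60 * 1000, limit: 5 },
  basic: { windowMs: 60 * 1000, limit: 10 },
  pro: { windowMs: 60 * 1000, limit: 10 },
};
const STAFF_LIMITS = { windowMs: 60 * 1000, limit: 1000 };

function resolveTier(userData) {
  const user = userData || {};

  // 1. The staff flag wins over any subscription tier
  if (user.isStaff === true) {
    return { name: 'staff', ...STAFF_LIMITS };
  }

  // 2. Use the tier field when it names a known tier
  if (Object.prototype.hasOwnProperty.call(TIER_LIMITS, user.tier)) {
    return { name: user.tier, ...TIER_LIMITS[user.tier] };
  }

  // 3. Fall back to free (also used for unknown tier values, an assumption)
  return { name: 'free', ...TIER_LIMITS.free };
}

const staffTier = resolveTier({ isStaff: true, tier: 'free' }); // staff limits apply
const proTier = resolveTier({ tier: 'pro' });
const defaultTier = resolveTier({}); // no tier field, so free
```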

HTTP Headers

Every API response includes rate limit headers:

X-RateLimit-Limit: 10
X-RateLimit-Remaining: 7
X-RateLimit-Reset: 1640000000

Header Meanings

X-RateLimit-Limit: Maximum requests allowed in the window
X-RateLimit-Remaining: Requests remaining before hitting the limit
X-RateLimit-Reset: Unix timestamp (seconds) when the oldest request expires from the window
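
One way these values could be derived from the stored window is sketched below. The helper name `rateLimitHeaders`, the millisecond timestamps, and the choice to report the current time when the window is empty are all assumptions for illustration.

```javascript
// Hypothetical helper: derive header values from the pruned window.
// `window` holds request timestamps in milliseconds, oldest first (assumed units).
function rateLimitHeaders(window, limit, windowMs, nowMs) {
  const remaining = Math.max(0, limit - window.length);

  // The oldest request leaves the window at oldest + windowMs.
  // With an empty window, report the current time (an arbitrary choice for this sketch).
  const resetMs = window.length > 0 ? window[0] + windowMs : nowMs;

  return {
    'X-RateLimit-Limit': limit,
    'X-RateLimit-Remaining': remaining,
    'X-RateLimit-Reset': Math.ceil(resetMs / 1000), // Unix seconds
  };
}

// Three requests in the last minute against a limit of 10 per minute
const exampleHeaders = rateLimitHeaders(
  [1639999940000, 1639999950000, 1639999955000],
  10,
  60000,
  1639999960000
);
```

With these inputs the result matches the sample headers above: a limit of 10, 7 requests remaining, and a reset time of 1640000000.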

Rate Limit Exceeded Response

When a user exceeds their rate limit:

Status: 429 Too Many Requests

Response body:

{
  "error": {
    "code": "rate_limit_exceeded",
    "message": "Rate limit exceeded. Try again in 45 seconds.",
    "timestamp": "2025-01-15T10:30:00.000Z",
    "requestId": "req_abc123"
  }
}

Headers:

X-RateLimit-Limit: 10
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1640000045
Retry-After: 45

The Retry-After header tells clients how many seconds to wait before retrying.
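
On the client side, a caller might turn these headers into a wait time along the following lines. This is a sketch, not part of the API: `retryDelayMs` is a hypothetical helper, and it assumes the headers arrive as a plain object with lower-cased names and string values.

```javascript
// Hypothetical client-side helper: how long to wait before retrying.
// `headers` is a plain object of lower-cased header names to string values (an assumption).
function retryDelayMs(status, headers, nowMs) {
  if (status !== 429) return 0;

  // Prefer Retry-After (seconds to wait)
  const retryAfter = Number(headers['retry-after']);
  if (Number.isFinite(retryAfter) && retryAfter >= 0) {
    return retryAfter * 1000;
  }

  // Fall back to X-RateLimit-Reset (Unix seconds)
  const reset = Number(headers['x-ratelimit-reset']);
  if (Number.isFinite(reset)) {
    return Math.max(0, reset * 1000 - nowMs);
  }

  return 0;
}

const waitFromRetryAfter = retryDelayMs(429, { 'retry-after': '45' }, Date.now());
const waitFromReset = retryDelayMs(429, { 'x-ratelimit-reset': '1640000045' }, 1640000000000);
```

Both calls yield 45000 milliseconds, matching the 45 second wait in the sample response.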

Firestore Storage

Rate limit state is stored in Firestore for distributed enforcement:

Collection: rateLimits/{key}

Document structure:

{
  key: string,           // Format: "rateLimit:{userId}:{endpoint}"
  userId: string,        // User ID for querying
  endpoint: string,      // API endpoint (e.g., "/v1/posts")
  window: timestamp[],   // Sorted array of request timestamps
  lastRequestAt: Timestamp,
  createdAt: Timestamp
}

Storage Key Format

rateLimit:{userId}:{endpoint}

Examples:

rateLimit:user_abc123:/v1/posts
rateLimit:user_xyz789:/v1/media/upload
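
Note that the endpoint part of the key contains forward slashes, which Firestore does not allow in document IDs. How rate-limit.js handles this is not shown here; one possible approach, sketched below with hypothetical helper names, is to percent-encode the key before using it as the document ID under rateLimits/.

```javascript
// Hypothetical helpers; the encoding step is an assumption, not confirmed by the source.
function buildRateLimitKey(userId, endpoint) {
  return `rateLimit:${userId}:${endpoint}`;
}

// Firestore document IDs cannot contain "/", so percent-encode the key
// before using it as the ID of a document in the rateLimits collection.
function toDocumentId(key) {
  return encodeURIComponent(key);
}

const exampleKey = buildRateLimitKey('user_abc123', '/v1/posts');
const exampleDocId = toDocumentId(exampleKey);
// exampleKey   -> "rateLimit:user_abc123:/v1/posts"
// exampleDocId -> "rateLimit%3Auser_abc123%3A%2Fv1%2Fposts"
```

The original key can still be stored unencoded in the document's `key` field, and `decodeURIComponent` recovers it from the ID if needed.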

Cleanup Strategy

Old entries in the window array are automatically cleaned up on each request:

// Remove timestamps older than the window duration
const cutoff = now - windowDuration;
window = window.filter(timestamp => timestamp > cutoff);

No background job is needed because cleanup happens inline with request processing.

Implementation Details

Middleware Flow

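// Express-style middleware (the function name is illustrative)
async function rateLimitMiddleware(req, res, next) {
  // 1. Extract user from authenticated request
  const { uid } = req.user;

  // 2. Determine rate limit tier
  const tier = await determineUserTier(uid);

  // 3. Build rate limit key
  const key = `rateLimit:${uid}:${req.path}`;

  // 4. Check and update window
  const { allowed, remaining, resetTime } = await checkRateLimit(key, tier);

  // 5. Set response headers
  res.setHeader('X-RateLimit-Limit', tier.limit);
  res.setHeader('X-RateLimit-Remaining', remaining);
  res.setHeader('X-RateLimit-Reset', resetTime);

  // 6. Reject or allow
  if (!allowed) {
    // resetTime is a Unix timestamp in seconds, so Retry-After is the time left until then
    const retryAfter = Math.max(0, resetTime - Math.floor(Date.now() / 1000));
    res.setHeader('Retry-After', retryAfter);
    return res.status(429).json({
      error: { code: 'rate_limit_exceeded' /* plus message, timestamp, requestId */ },
    });
  }

  next();
}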

Atomic Updates

Firestore transactions ensure atomic read-modify-write operations:

await db.runTransaction(async (transaction) => {
  // Read the current state; a missing document means no requests are recorded yet
  const doc = await transaction.get(rateLimitRef);
  const data = doc.data() || { window: [] };

  // Clean old entries
  const cutoff = now - windowDuration;
  const window = data.window.filter(ts => ts > cutoff);

  // Check limit. Throwing aborts the transaction, so a rejected request writes nothing.
  if (window.length >= limit) {
    throw new Error('Rate limit exceeded');
  }

  // Add new timestamp
  window.push(now);

  // Write the updated window back as part of the same transaction
  transaction.set(rateLimitRef, {
    key,
    userId: uid,
    endpoint: req.path,
    window,
    lastRequestAt: now,
    createdAt: data.createdAt || now
  });
});

Performance Considerations

Write Amplification

Every allowed API request writes to Firestore to update the rate limit window (a rejected request only reads). This is acceptable because:

Write costs: Firestore writes are cheap (~$0.18 per 100K writes)
Small documents: Rate limit docs are tiny (typically less than 1KB)
TTL-aware: Old timestamps are pruned, keeping documents small
Critical feature: Rate limiting prevents abuse, worth the cost
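
To put the write cost in perspective, here is a small back-of-the-envelope calculation at the quoted price. The request volume is a made-up example and one write per request is assumed; actual Firestore pricing depends on region and plan.

```javascript
// Rough cost estimate at the quoted price of $0.18 per 100K writes.
// One write per API request is assumed; the request volume is a made-up example.
const PRICE_PER_100K_WRITES_USD = 0.18;

function estimateWriteCostUsd(requests) {
  return (requests / 100000) * PRICE_PER_100K_WRITES_USD;
}

const monthlyRequests = 5000000; // hypothetical: 5 million requests per month
const monthlyCostUsd = estimateWriteCostUsd(monthlyRequests); // about 9 USD
```

At this hypothetical volume, the rate limit writes come to roughly 9 USD per month.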

Read Optimization

The rate limit check requires one Firestore read per request. To minimize latency:

Single document read: All rate limit data is in one document
Index-free: No composite indexes needed for rate limiting
Co-located: Rate limit data is in the same region as Cloud Functions

Alternatives Considered

Approach	Pros	Cons	Decision
Redis/Memorystore	Faster, in-memory	Additional service, cost, complexity	❌ Rejected: Firestore is "good enough"
Cloud Armor	GCP-native, edge enforcement	Limited to IP-based, no user context	❌ Rejected: Need per-user limits
Fixed windows	Simpler logic	Burst at window edges	❌ Rejected: Sliding window is better UX
Firestore sliding window	No additional services, distributed	Slightly higher latency	✅ Chosen: Best balance
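
The "burst at window edges" drawback of fixed windows can be seen in a small simulation. Both functions below are hypothetical in-memory illustrations, not the production code: they count how many requests from the same burst each strategy would admit with a limit of 10 per minute.

```javascript
// Illustration only: count how many requests of a burst each strategy admits.
// Both helpers are hypothetical and keep state in memory; times are in milliseconds.
function fixedWindowAllowed(times, windowMs, limit) {
  const counts = new Map(); // window index -> accepted count
  let allowed = 0;
  for (const t of times) {
    const bucket = Math.floor(t / windowMs);
    const used = counts.get(bucket) || 0;
    if (used < limit) {
      counts.set(bucket, used + 1);
      allowed += 1;
    }
  }
  return allowed;
}

function slidingWindowAllowed(times, windowMs, limit) {
  let window = [];
  let allowed = 0;
  for (const t of times) {
    // Drop timestamps older than the window, then check the count
    window = window.filter((ts) => ts > t - windowMs);
    if (window.length < limit) {
      window.push(t);
      allowed += 1;
    }
  }
  return allowed;
}

// 10 requests just before the one-minute boundary and 10 just after it
const burst = [];
for (let i = 0; i < 10; i++) burst.push(59000 + i * 10);
for (let i = 0; i < 10; i++) burst.push(60000 + i * 10);

const fixedCount = fixedWindowAllowed(burst, 60000, 10); // 20
const slidingCount = slidingWindowAllowed(burst, 60000, 10); // 10
```

The fixed window admits all 20 requests within about one second because the counter resets at the boundary, while the sliding window admits only the first 10.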

Testing Rate Limits

Manual Testing

# Get your API key
API_KEY="your_api_key_here"

# Test free tier (5 req / 15 min)
for i in {1..6}; do
  echo "Request $i:"
  curl -H "X-API-Key: $API_KEY" \
       -H "Content-Type: application/json" \
       -d '{"post":{"content":{"text":"test"},"platforms":["threads"]}}' \
       https://YOUR_FUNCTION_URL/v1/posts
  echo -e "\n---"
done

# The 6th request should return 429

Observing Headers

# Watch rate limit headers
curl -si -H "X-API-Key: $API_KEY" \
     https://YOUR_FUNCTION_URL/v1/posts | grep -i ratelimit

# Output:
# X-RateLimit-Limit: 10
# X-RateLimit-Remaining: 7
# X-RateLimit-Reset: 1705320045

Upgrading Tiers

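# Upgrade user to pro tier (as admin, using the Firebase Admin SDK with application default credentials)
node -e "const admin = require('firebase-admin'); admin.initializeApp(); admin.firestore().doc('users/USER_ID').update({ tier: 'pro' }).then(() => process.exit(0))"

# Verify new limits
curl -si -H "X-API-Key: $API_KEY" https://YOUR_FUNCTION_URL/v1/posts | grep X-RateLimit-Limit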
# Should now show: X-RateLimit-Limit: 10

Monitoring & Alerts

Key Metrics

Track these metrics for rate limiting health:

Rate limit hit rate: Percentage of requests that hit 429
- Query errorLogs collection for code: 'rate_limit_exceeded'
- Alert if >5% of requests are rate limited (indicates abuse or wrong tier assignment)
Per-user hit patterns: Users consistently hitting limits
- May need tier upgrade or are attempting abuse
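- Query: fetch recently active rateLimits documents and compare window.length against the tier limit in application code (Firestore cannot filter on array length)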
Tier distribution: How many users in each tier
- Query: Group users by tier field
- Helps understand revenue vs. usage

Example Queries

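// Cutoff for "last hour" (assumes both fields below are stored as Firestore Timestamps)
const hourAgo = new Date(Date.now() - 60 * 60 * 1000);

// Count rate limit errors in last hour
db.collection('errorLogs')
  .where('code', '==', 'rate_limit_exceeded')
  .where('timestamp', '>=', hourAgo)
  .get()
  .then(snapshot => snapshot.size)

// Find users hitting limits frequently
// Firestore cannot filter on array length, so compare window.length in application code
// (`limit` is the limit for the user's tier)
db.collection('rateLimits')
  .where('lastRequestAt', '>=', hourAgo)
  .get()
  .then(snapshot => snapshot.docs.filter(doc => doc.data().window.length >= limit))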

// Tier distribution
db.collection('users')
  .get()
  .then(snapshot => {
    const tiers = {};
    snapshot.forEach(doc => {
      const tier = doc.data().tier || 'free';
      tiers[tier] = (tiers[tier] || 0) + 1;
    });
    return tiers;
  })

Alerting Thresholds

Metric	Threshold	Action
Rate limit hit rate	>5%	Investigate abuse or tier misconfigurations
Free tier users hitting limit	>100/hour	Consider tier upgrade campaigns or adjusting free limits
Staff accounts rate limited	Any occurrence	Bug in tier detection, investigate immediately
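
The thresholds in this table could be checked with a small function like the one below. It is only a sketch: the function name, the alert identifiers and the shape of the `metrics` object are assumptions, and collecting the metrics themselves is out of scope here.

```javascript
// Hypothetical evaluator for the thresholds in the table above.
// The shape of `metrics` is an assumption made for this example.
function evaluateAlerts(metrics) {
  const alerts = [];
  const hitRate = metrics.totalRequests > 0
    ? metrics.rateLimitedRequests / metrics.totalRequests
    : 0;

  // More than 5% of requests rejected with 429
  if (hitRate > 0.05) alerts.push('rate_limit_hit_rate');

  // More than 100 free tier users hit their limit in the last hour
  if (metrics.freeUsersLimitedLastHour > 100) alerts.push('free_tier_users_hitting_limit');

  // Any staff account being rate limited points to a tier detection bug
  if (metrics.staffRateLimitedCount > 0) alerts.push('staff_rate_limited');

  return alerts;
}

const alerts = evaluateAlerts({
  totalRequests: 1000,
  rateLimitedRequests: 60, // 6% hit rate
  freeUsersLimitedLastHour: 10,
  staffRateLimitedCount: 0,
});
```

Here only the hit rate alert fires, since 6% of requests were rate limited and the other two metrics are below their thresholds.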

Security Considerations

Bypassing Rate Limits

The system enforces limits at the middleware layer, before any business logic runs. Users cannot bypass them by:

❌ Changing API key: Rate limits are per-user (via req.user.uid), not per key
❌ Multiple keys: Same uid → same rate limit bucket
❌ Rotating IPs: Not IP-based, user-based

Staff Access

Staff members (isStaff: true) have elevated limits (1000 req/min) for:

Admin operations
Testing and debugging
Support tasks

Security: The staff flag can only be set through:

Firebase Admin SDK (not exposed to client)
Manual Firestore writes by administrators

Firestore Security Rules prevent clients from writing the flag themselves.

Future Enhancements

Potential improvements to consider:

Dynamic tier limits: Allow per-user custom limits in user document
Burst allowance: Allow short bursts above limit with token bucket
Endpoint-specific limits: Different limits for different endpoints (e.g., read vs. write)
Redis caching: Cache tier lookups to reduce Firestore reads
Grace period: Warning headers before hard limit (e.g., at 80% usage)
Usage dashboard: Show users their current usage and limits

Related Documentation

Public API - API reference with rate limit info
Credit System - AI credit tracking (separate from rate limits)
Tracing & Logging - Observability and debugging
