Rate Limiting
Redis-based distributed rate limiting with sliding window counter algorithm for API protection and fair resource usage.
Overview
Based on pkg/v2/infrastructure/ratelimit/, the rate limiting implementation provides:
- Sliding Window Counter: Accurate rate limiting across time windows
- Distributed: Redis-based for consistent limits across AppServer instances
- Hierarchical: Route-level and per-user limits
- Configurable: Per-route RPM (requests per minute) and burst capacity
- Middleware: HTTP middleware for automatic enforcement
Architecture
Rate Limiting Strategy
Sliding Window Counter Algorithm:
```
Current Window (Minute N):
├─ Key: ratelimit:{key}:{timestamp/60}
├─ Counter: Incremented on each request
├─ Expiration: 2 minutes (covers edge cases)
└─ Limit Check: count <= max(rpm, burst)
```
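As a concrete illustration, the windowed counter can be sketched in plain Go with an in-memory map standing in for Redis. All names here are illustrative, not the actual pkg/v2 API; in the real implementation each counter lives in Redis and auto-expires.

```go
package main

import (
	"fmt"
	"sync"
)

// windowCounter mimics the ratelimit:{key}:{window} Redis keys with an
// in-memory map (no expiration; Redis handles that in the real code).
type windowCounter struct {
	mu     sync.Mutex
	counts map[string]int
}

func newWindowCounter() *windowCounter {
	return &windowCounter{counts: make(map[string]int)}
}

// allow increments the counter for the current minute window and checks it
// against max(rpm, burst), mirroring the limit check described above.
func (w *windowCounter) allow(key string, rpm, burst int, nowUnix int64) bool {
	if rpm <= 0 {
		return true // unlimited
	}
	window := nowUnix / 60 // minute window, as in ratelimit:{key}:{timestamp/60}
	windowKey := fmt.Sprintf("ratelimit:%s:%d", key, window)

	w.mu.Lock()
	defer w.mu.Unlock()
	w.counts[windowKey]++
	limit := rpm
	if burst > limit {
		limit = burst
	}
	return w.counts[windowKey] <= limit
}

func main() {
	wc := newWindowCounter()
	allowed := 0
	for i := 0; i < 15; i++ {
		if wc.allow("route:demo", 10, 12, 0) {
			allowed++
		}
	}
	fmt.Println(allowed) // 12: max(RPM=10, Burst=12) requests pass
}
```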
Hierarchical Enforcement
```
1. Route-Level Limit
   ├─ Key: route:{routeID}
   ├─ Applies to ALL users
   └─ Protects backend from overload
2. Per-User Limit
   ├─ Key: {subject}:route:{routeID}
   ├─ Applies to authenticated users only
   └─ Ensures fair usage per user
```
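For illustration, the two key shapes above might be built with small helpers like these (function names are hypothetical, not taken from the package):

```go
package main

import "fmt"

// routeKey builds the route-level key, shared by all users of a route.
func routeKey(routeID string) string {
	return fmt.Sprintf("route:%s", routeID)
}

// userRouteKey builds the per-user key, scoped to one authenticated subject.
func userRouteKey(subject, routeID string) string {
	return fmt.Sprintf("%s:route:%s", subject, routeID)
}

func main() {
	fmt.Println(routeKey("550e8400"))                 // route:550e8400
	fmt.Println(userRouteKey("user:123", "550e8400")) // user:123:route:550e8400
}
```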
Implementation
Based on redis_limiter.go:
RedisLimiter Structure
```go
type RedisLimiter struct {
	client *redis.Client
}
```
Configuration Storage
Rate limits stored as Redis hashes:
```
Key: ratelimit:config:{key}
Fields:
- rpm: 60      // Requests per minute
- burst: 10    // Burst capacity
```
Rate Limit Configuration
LimitConfig Structure
```go
type LimitConfig struct {
	RPM   int // Requests per minute
	Burst int // Burst capacity
}
```
Configure Limits
```go
limiter := ratelimit.NewRedisLimiter(redisClient)

// Configure route-level limit
config := ratelimit.LimitConfig{
	RPM:   100,
	Burst: 20,
}
err := limiter.Configure(ctx, "route:550e8400", config)

// Configure per-user limit
userConfig := ratelimit.LimitConfig{
	RPM:   60,
	Burst: 10,
}
err = limiter.Configure(ctx, "user:123:route:550e8400", userConfig)
```
Rate Limiting Operations
Allow
Check if request is allowed:
```go
allowed, err := limiter.Allow(ctx, "route:550e8400")
if err != nil {
	// Handle Redis error
}
if !allowed {
	// Return 429 Too Many Requests
}
```
Allow Flow
```
1. Check if config exists
   └─ If no config → Allow (no limit configured)
2. Get RPM and Burst from config
   └─ If RPM <= 0 → Allow (unlimited)
3. Generate window key
   ├─ Current timestamp / 60 (minute window)
   └─ Key: ratelimit:{key}:{window}
4. Increment counter atomically
   ├─ INCR ratelimit:{key}:{window}
   └─ EXPIRE ratelimit:{key}:{window} 120  # 2 minutes
5. Check limit
   ├─ Effective Limit = max(RPM, Burst)
   ├─ Allowed = count <= Effective Limit
   └─ Return allowed
```
Reset
Reset rate limit counters:

```go
err := limiter.Reset(ctx, "route:550e8400")
```

Reset operations:
- SCAN for all keys matching ratelimit:{key}:*
- DELETE each matching key
- DELETE the config key ratelimit:config:{key}
HTTP Middleware
Based on middleware.go:
Middleware Structure
```go
type RateLimitMiddleware struct {
	limiter       Limiter
	routeRegistry route.Registry
}
```
Middleware Application
```go
// Create middleware
middleware := ratelimit.NewMiddleware(limiter, routeRegistry)

// Apply to HTTP handler
handler := middleware.AsHTTPHandler(proxyHandler)
```
Middleware Flow
```
1. Match Route
   └─ Get RegisteredRoute and RouteSpec from registry
2. Determine Effective Rate Limit
   ├─ Priority 1: RouteSpec.RateLimit
   ├─ Priority 2: RegisteredRoute.RateLimit
   └─ No limit if neither configured → Skip
3. Route-Level Check
   ├─ Key: route:{routeID}
   ├─ limiter.Allow(ctx, key)
   └─ If exceeded → 429 with X-RateLimit-Scope: route
4. Per-User Check (if authenticated)
   ├─ Extract subject from auth context
   ├─ Key: {subject}:route:{routeID}
   ├─ limiter.Allow(ctx, userKey)
   └─ If exceeded → 429 with X-RateLimit-Scope: user
5. Allow Request
   └─ Continue to next handler
```
Response Headers
Rate Limit Exceeded (429)
```
HTTP/1.1 429 Too Many Requests
Content-Type: application/json
X-RateLimit-Limit: 60
X-RateLimit-Scope: route

{
  "error": {
    "code": "RATE_LIMIT_EXCEEDED",
    "message": "Route rate limit exceeded"
  }
}
```
Scopes:
- route: Route-level limit exceeded (applies to all users)
- user: Per-user limit exceeded (specific to the authenticated user)
Examples
Route-Level Rate Limiting
Manifest Declaration:
```jsonc
{
  api: {
    base_path: "/api/apps/todos",
    default_rate_limit: {
      rpm: 60,
      burst: 10
    },
    routes: [
      {
        pattern: "/items/**",
        methods: ["GET", "POST"],
        rate_limit: {
          rpm: 100, // Override default for this route
          burst: 20
        }
      }
    ]
  }
}
```
Effective Limits:
- /api/apps/todos/items: 100 RPM, burst 20
- /api/apps/todos/other: 60 RPM, burst 10 (default)
Per-User Rate Limiting
Configuration:
// Route allows 100 RPM total
routeConfig := LimitConfig{RPM: 100, Burst: 20}
limiter.Configure(ctx, "route:123", routeConfig)
// Each user limited to 60 RPM
userConfig := LimitConfig{RPM: 60, Burst: 10}
limiter.Configure(ctx, "user:alice:route:123", userConfig)
limiter.Configure(ctx, "user:bob:route:123", userConfig)
Result:
- Route total: 100 RPM (shared across all users)
- Alice: 60 RPM (her individual limit)
- Bob: 60 RPM (his individual limit)
- If Alice uses 60 RPM and Bob uses 40 RPM, route limit (100) is reached
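That scenario can be simulated with in-memory counters within a single minute window (hypothetical types, not the package's real API). A request passes only if both the shared route check and the caller's per-user check allow it, route first, matching the middleware order:

```go
package main

import "fmt"

// hierLimiter simulates one minute window of the hierarchical check.
type hierLimiter struct {
	counts map[string]int
	limits map[string]int
}

// allow increments one counter and checks it against its configured limit.
func (h *hierLimiter) allow(key string) bool {
	h.counts[key]++
	return h.counts[key] <= h.limits[key]
}

// request passes only if both the shared route check and the caller's
// per-user check allow it (route first, as in the middleware flow).
func (h *hierLimiter) request(user, route string) bool {
	return h.allow("route:"+route) && h.allow("user:"+user+":route:"+route)
}

func main() {
	h := &hierLimiter{
		counts: map[string]int{},
		limits: map[string]int{
			"route:123":            100, // shared route limit
			"user:alice:route:123": 60,  // per-user limits
			"user:bob:route:123":   60,
		},
	}

	aliceOK, bobOK := 0, 0
	for i := 0; i < 60; i++ { // Alice sends 60 requests
		if h.request("alice", "123") {
			aliceOK++
		}
	}
	for i := 0; i < 50; i++ { // Bob sends 50; the last 10 hit the route limit
		if h.request("bob", "123") {
			bobOK++
		}
	}
	fmt.Println(aliceOK, bobOK) // 60 40
}
```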
Rate Limit Algorithms
Sliding Window Counter
Current Implementation:
```
Minute 0 (00:00-00:59):
├─ Key: ratelimit:route:123:0
├─ Count: 50 requests
└─ Expires at 00:02

Minute 1 (01:00-01:59):
├─ Key: ratelimit:route:123:1
├─ Count: 30 requests
└─ Expires at 01:02

Request at 01:30:
├─ Window: Minute 1
├─ Count: 31 (after increment)
├─ Limit: 60 RPM
└─ Allowed: true (31 <= 60)
```
Advantages:
- Accurate per-minute limits
- Distributed (works across AppServer instances)
- Low memory usage (auto-expiring keys)
- Atomic operations (Redis INCR)
Trade-offs:
- Window edges: A user could make 60 requests at 00:59 and 60 more at 01:00, briefly doubling the effective rate across the boundary
- Mitigation: Burst capacity absorbs legitimate spikes, and the per-minute accounting still bounds sustained throughput
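The edge effect is easy to quantify: with a fixed per-minute counter, a client can land up to 2x RPM inside a couple of seconds straddling a window boundary. A small sketch (illustrative only):

```go
package main

import "fmt"

// fixedWindowBurst counts how many of a run of back-to-back requests pass a
// fixed per-minute counter, given the Unix timestamp of each request.
func fixedWindowBurst(rpm int, timestamps []int64) int {
	counters := map[int64]int{} // minute window -> count
	allowed := 0
	for _, ts := range timestamps {
		w := ts / 60
		counters[w]++
		if counters[w] <= rpm {
			allowed++
		}
	}
	return allowed
}

func main() {
	const rpm = 60
	// 60 requests at 00:59 (window 0) and 60 more at 01:00 (window 1):
	// nearly 2x RPM passes within about one second of wall time.
	var ts []int64
	for i := 0; i < 60; i++ {
		ts = append(ts, 59)
	}
	for i := 0; i < 60; i++ {
		ts = append(ts, 60)
	}
	fmt.Println(fixedWindowBurst(rpm, ts)) // 120
}
```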
Best Practices
Configuring Limits
✅ DO:
- Set burst capacity 15-25% higher than RPM
- Use route-level limits to protect backends
- Use per-user limits for fair usage
- Monitor rate limit hit rates
❌ DON'T:
- Set burst lower than RPM (the effective limit is max(RPM, Burst), so a lower burst has no effect)
- Use very low limits without warning users
- Apply same limits to all routes (customize per workload)
Burst Capacity
Purpose: Handle legitimate traffic spikes
Recommendations:
- Read-Heavy: Burst = RPM * 1.25 (25% over)
- Write-Heavy: Burst = RPM * 1.15 (15% over)
- Mixed: Burst = RPM * 1.20 (20% over)
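These ratios can be captured in a small helper (hypothetical, not part of the package; integer math rounds down, so write-heavy 60 RPM yields 69, close to the 70 used in the example below):

```go
package main

import "fmt"

// recommendedBurst returns a burst capacity for the given RPM based on the
// workload guidance above: reads +25%, writes +15%, mixed +20%.
func recommendedBurst(rpm int, workload string) int {
	switch workload {
	case "read":
		return rpm * 125 / 100
	case "write":
		return rpm * 115 / 100
	default: // mixed
		return rpm * 120 / 100
	}
}

func main() {
	fmt.Println(recommendedBurst(100, "read")) // 125
	fmt.Println(recommendedBurst(60, "write")) // 69
	fmt.Println(recommendedBurst(50, "mixed")) // 60
}
```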
Example:
```go
// Read-heavy route (searching, listing)
LimitConfig{RPM: 100, Burst: 125}

// Write-heavy route (creating, updating)
LimitConfig{RPM: 60, Burst: 70}
```
Error Handling
✅ DO:
- Return 429 with descriptive error messages
- Include the X-RateLimit-Limit header
- Include X-RateLimit-Scope to indicate which limit was exceeded
- Log rate limit violations for monitoring
❌ DON'T:
- Return 500 on rate limit exceeded
- Silently drop requests
- Block indefinitely
Monitoring
Metrics to Track:
- Rate limit hit rate per route
- Percentage of requests blocked
- Distribution of request counts per window
- Redis operation latency
Alerts:
- High rate limit hit rate (> 5% of requests)
- Redis connection failures
- Unusual traffic patterns
Code References
| Component | File | Purpose |
|---|---|---|
| RedisLimiter | redis_limiter.go | Core rate limiting implementation |
| Middleware | middleware.go | HTTP middleware integration |
| Limiter Interface | limiter.go | Interface definition |
Related Topics
- HTTP Proxy & Routing - Middleware integration
- Manifest Reference - Rate limit configuration in manifests
- Platform Architecture - Infrastructure layer