Distributed Rate Limiting: From Local State to Atomic Lua
In a distributed system, a rate limiter isn't just a counter—it's a critical reliability component. If it fails, your services are vulnerable to DDoS; if it’s slow, you break your SLA latency targets.
The Bottleneck: Why Local State Fails
My first approach used a local Python deque. It was fast (amortized O(1) per request), but it suffered from state fragmentation. In a load-balanced environment, a single user's requests are routed to different pods. Each pod only sees a fraction of the traffic, making it impossible to enforce a global limit per IP.
The Pivot: Server-Side Atomicity
To achieve global consistency, we move the state to Redis. However, a naive implementation (Read -> Calculate -> Write) creates a race condition: two concurrent requests could both read the count as "4" and both decide to allow the request, letting 6 requests through instead of 5.
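The race is easy to reproduce deterministically. This toy interleaving (pure Python, no Redis) models two workers that both read the counter before either writes back:

```python
# Simulate the Read -> Calculate -> Write race with a limit of 5.
LIMIT = 5
count = 4                       # four requests already recorded

snap_a = count                  # worker A reads the count
snap_b = count                  # worker B reads before A has written
allowed_a = snap_a < LIMIT      # True: A sees 4 < 5
allowed_b = snap_b < LIMIT      # True: B also sees the stale 4 < 5

# Both write back, so the counter overshoots the limit.
count += int(allowed_a) + int(allowed_b)
print(count)  # 6 -> one request too many slipped through
```

With a real network in between, the read-to-write gap is milliseconds wide, which is plenty of room for this interleaving under load.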
We solve this using Lua scripting. By sending the logic to the Redis server, we execute the check and the update as a single atomic operation: Redis runs scripts single-threaded, so no other command can interleave between the read and the write.
-- KEYS[1] = rate-limit key (e.g. per-IP); ARGV = now (ms), window (ms), limit
local now    = tonumber(ARGV[1])
local window = tonumber(ARGV[2])
local limit  = tonumber(ARGV[3])
-- Atomic check and increment
redis.call('ZREMRANGEBYSCORE', KEYS[1], 0, now - window)
local count = redis.call('ZCARD', KEYS[1])
if count < limit then
    redis.call('ZADD', KEYS[1], now, now)
    redis.call('PEXPIRE', KEYS[1], window)  -- let idle keys expire on their own
    return 1
end
return 0
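To unit-test the window arithmetic without a Redis server, the script's evict-count-add sequence can be modeled in plain Python (a hypothetical helper, using a sorted list in place of the ZSET; the actual atomicity guarantee comes only from Redis executing the script single-threaded):

```python
def sliding_window_allow(zset: list[float], now: float,
                         window: float, limit: int) -> bool:
    """Pure-Python model of the Lua script. `zset` is a sorted list of
    hit timestamps standing in for the Redis sorted set."""
    # ZREMRANGEBYSCORE key 0 (now - window): drop expired entries
    zset[:] = [t for t in zset if t > now - window]
    # ZCARD, then conditional ZADD
    if len(zset) < limit:
        zset.append(now)
        return True
    return False
```

Because the real script runs server-side as one unit, the "both workers read 4" interleaving from the naive version cannot occur.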
Performance at Scale
By using Redis Sorted Sets (ZSET), we keep the eviction logic efficient. The ZREMRANGEBYSCORE command prunes expired data in O(log N + M) time, where M is the number of entries removed, ensuring that even under a massive credential-stuffing attack, our latency remains well under a 50ms threshold.
This pattern is the foundation for how we manage risk at scale: push the logic to the data, keep the database lean, and always design for atomicity.