Rate limiting protects your Rails application by stopping any single client from sending more requests than your system can absorb. Think of it like a bouncer at a club, only letting in a certain number of people per minute to keep things running smoothly. I want to show you several effective ways to build this protection, going beyond simple counters.
Let’s start with a sliding window. A fixed window might say “100 requests per hour,” but it can be unfair if you make 100 requests at 1:59 PM and another 100 at 2:01 PM. A sliding window is smoother. It looks at your requests over the last hour from any given moment.
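To see why the fixed window is unfair, here is a minimal in-memory sketch (the `FixedWindowCounter` class and its injectable timestamp are inventions for illustration only; a real limiter would use shared storage like Redis):

```ruby
# Minimal in-memory fixed-window counter, for illustration only.
class FixedWindowCounter
  def initialize(limit:, window:)
    @limit = limit
    @window = window
    @counts = Hash.new(0) # requests seen per window bucket
  end

  # `now` is an injected timestamp (in seconds) so the example is deterministic
  def allow?(now)
    bucket = now / @window # all timestamps in the same hour share one bucket
    return false if @counts[bucket] >= @limit

    @counts[bucket] += 1
    true
  end
end

counter = FixedWindowCounter.new(limit: 100, window: 3600)
100.times { counter.allow?(3599) }  # 100 requests just before the boundary
counter.allow?(3599)                # => false, this window is full
counter.allow?(3661)                # => true, the next window starts fresh
```

All 200 requests land within two minutes of real time, yet the fixed window accepts them, because they fall in different buckets. The sliding window below closes that gap.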
Here is how you can build one using Redis. Redis is a fast key-value store, perfect for this job.
class SlidingWindowLimiter
  def initialize(key, limit:, window:)
    @key = "rate_limit:#{key}"
    @limit = limit
    @window = window
    @redis = Redis.new(url: ENV['REDIS_URL'])
  end

  def exceeded?
    now = Time.current.to_f
    window_start = now - @window
    # Clean up old requests outside the current window
    @redis.zremrangebyscore(@key, 0, window_start)
    # Count how many requests are left in the window
    request_count = @redis.zcard(@key)
    if request_count >= @limit
      true
    else
      # Add the current request timestamp. The random suffix keeps two
      # requests with an identical timestamp from collapsing into one member.
      @redis.zadd(@key, now, "#{now}:#{SecureRandom.hex(4)}")
      # Make sure our Redis key expires to avoid clutter
      @redis.expire(@key, @window)
      false
    end
  end

  def remaining
    now = Time.current.to_f
    window_start = now - @window
    @redis.zremrangebyscore(@key, 0, window_start)
    @limit - @redis.zcard(@key)
  end
end
You can use this in a Rails controller to protect your endpoints. The identifier can be a user ID or an IP address.
class ApiController < ApplicationController
  before_action :check_rate_limit

  private

  def check_rate_limit
    identifier = current_user&.id || request.remote_ip
    limiter = SlidingWindowLimiter.new(
      identifier,
      limit: 100,
      window: 3600 # One hour in seconds
    )
    if limiter.exceeded?
      render json: {
        error: 'Rate limit exceeded'
      }, status: :too_many_requests
    else
      # Tell the client how many requests they have left
      response.headers['X-RateLimit-Remaining'] = limiter.remaining.to_s
    end
  end
end
This pattern is precise. It tracks individual request timestamps. It also cleans up after itself, which is important for long-running applications.
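If you want to watch the windowing logic in isolation, here is a pure-Ruby mirror of the sorted-set approach (the `InMemorySlidingWindow` class is an illustrative stand-in: an array of timestamps plays the role of the sorted set, pruning replaces `ZREMRANGEBYSCORE`, and the length check replaces `ZCARD`):

```ruby
# In-memory mirror of the Redis sorted-set logic, for illustration only.
class InMemorySlidingWindow
  def initialize(limit:, window:)
    @limit = limit
    @window = window
    @timestamps = []
  end

  # `now` is injected so the behavior is deterministic and easy to trace
  def exceeded?(now)
    @timestamps.reject! { |t| t <= now - @window } # prune entries outside the window
    return true if @timestamps.size >= @limit

    @timestamps << now
    false
  end
end

limiter = InMemorySlidingWindow.new(limit: 100, window: 3600)
100.times { |i| limiter.exceeded?(i) } # fill the window at t = 0..99
limiter.exceeded?(200)  # => true, all 100 requests are still inside the window
limiter.exceeded?(3700) # => false, the early timestamps have aged out
```

Unlike the fixed window, there is no boundary to exploit: at every moment the count covers exactly the trailing hour.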
Sometimes, you want to allow short bursts of activity. A user might be idle and then suddenly need to make several requests in a row. A token bucket is great for this. Imagine a bucket that fills with tokens at a steady rate. Each request costs a token. If the bucket has tokens, the request can proceed, even if it’s been a while since the last one.
class TokenBucket
  def initialize(key, capacity:, refill_rate:)
    @key = "token_bucket:#{key}"
    @capacity = capacity
    @refill_rate = refill_rate # How many tokens are added per second
    @redis = Redis.new(url: ENV['REDIS_URL'])
  end

  # Note: this read-modify-write sequence is not atomic. Under heavy
  # concurrency, move the whole thing into a Lua script, as the
  # distributed example later in this article does.
  def consume(tokens = 1)
    now = Time.current.to_f
    # Get the current state of the bucket from Redis
    bucket = @redis.hgetall(@key)
    if bucket.empty?
      # First time? Start with a full bucket.
      bucket = { 'tokens' => @capacity.to_s, 'last_refill' => now.to_s }
    else
      # Calculate how much time has passed since we last checked
      elapsed = now - bucket['last_refill'].to_f
      # Add new tokens based on that elapsed time
      refill_amount = elapsed * @refill_rate
      # Update the token count, but don't overflow the bucket
      current_tokens = bucket['tokens'].to_f + refill_amount
      bucket['tokens'] = [current_tokens, @capacity].min.to_s
      bucket['last_refill'] = now.to_s
    end
    # Do we have enough tokens for this request?
    if bucket['tokens'].to_f >= tokens
      # Take the tokens
      bucket['tokens'] = (bucket['tokens'].to_f - tokens).to_s
      # Save the new state back to Redis as a hash
      @redis.mapped_hmset(@key, bucket)
      @redis.expire(@key, 86400) # Keep it for a day
      true
    else
      false
    end
  end
end
You can use different buckets for different types of actions. Reading data is usually cheaper than writing it.
class ApiRateLimiter
  def initialize(identifier)
    @identifier = identifier
  end

  def limit_read_operation
    bucket = TokenBucket.new(
      "#{@identifier}:reads",
      capacity: 100,  # They can burst up to 100 reads
      refill_rate: 10 # But they get 10 new tokens every second
    )
    bucket.consume(1)
  end

  def limit_write_operation
    bucket = TokenBucket.new(
      "#{@identifier}:writes",
      capacity: 20,  # Smaller burst capacity for writes
      refill_rate: 2 # Slower refill rate
    )
    bucket.consume(1)
  end
end
This gives you fine-grained control. An expensive report might cost 5 tokens, while a simple data lookup costs 1.
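The refill arithmetic is easiest to see with deterministic timestamps, so here is an in-memory sketch of the same math (the `InMemoryTokenBucket` class and its injected `now:` parameter are inventions for illustration; the Redis version above is what you would deploy):

```ruby
# In-memory version of the token bucket's refill arithmetic, for illustration.
class InMemoryTokenBucket
  def initialize(capacity:, refill_rate:)
    @capacity = capacity
    @refill_rate = refill_rate
    @tokens = capacity.to_f
    @last_refill = 0.0
  end

  def consume(tokens, now:)
    # Refill based on elapsed time, capped at capacity
    @tokens = [@tokens + (now - @last_refill) * @refill_rate, @capacity].min
    @last_refill = now
    return false if @tokens < tokens

    @tokens -= tokens
    true
  end
end

bucket = InMemoryTokenBucket.new(capacity: 10, refill_rate: 1)
bucket.consume(5, now: 0) # expensive report costs 5 tokens, 5 remain
bucket.consume(1, now: 0) # cheap lookup costs 1 token, 4 remain
bucket.consume(5, now: 0) # another report? refused, only 4 tokens left
bucket.consume(5, now: 1) # one second later a token has refilled: allowed
```

The burst behavior falls out naturally: an idle client accumulates tokens up to capacity, then spends them in a rush without ever exceeding the long-run refill rate.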
When your application grows, you might run it on multiple servers. A rate limit needs to work across all of them. If each server tracks limits separately, a user could send 100 requests to server A and 100 more to server B, breaking your global limit. We need a shared state, like Redis, and we need to be careful about timing.
Using a Lua script in Redis helps. It makes multiple operations happen as a single, atomic step. This prevents race conditions where two servers check the limit at the same time and both let a request through.
class DistributedLimiter
  def initialize(key, limit:, window:, redis_pool:)
    @key = key
    @limit = limit
    @window = window
    @redis_pool = redis_pool
  end

  # Returns the number of requests remaining, or -1 if the limit is hit.
  # (Returning 0 for a denial would be ambiguous: a client using its very
  # last slot also has 0 remaining.)
  def acquire
    lua_script = <<-LUA
      local key = KEYS[1]
      local limit = tonumber(ARGV[1])
      local window = tonumber(ARGV[2])
      local now = tonumber(ARGV[3])
      -- Remove timestamps older than the window
      redis.call('zremrangebyscore', key, 0, now - window)
      -- Count what's left
      local current = redis.call('zcard', key)
      if current >= limit then
        return -1
      end
      -- Add the new request and set an expiry
      redis.call('zadd', key, now, now)
      redis.call('expire', key, window)
      return limit - current - 1
    LUA
    remaining = @redis_pool.with do |redis|
      redis.eval(
        lua_script,
        keys: [@key],
        argv: [@limit, @window, Time.current.to_f]
      )
    end
    remaining.to_i
  end
end
You can integrate this into your application using Rack middleware. Middleware runs before your controller code, making it a clean place for rate limiting logic.
class ClusterRateLimit
  def initialize(app, limiter_factory)
    @app = app
    @limiter_factory = limiter_factory
  end

  def call(env)
    request = Rack::Request.new(env)
    identifier = extract_identifier(request)
    limiter = @limiter_factory.build(identifier)
    remaining = limiter.acquire
    if remaining >= 0
      # Add info to the environment for the controller to use
      env['rate_limit.remaining'] = remaining
      @app.call(env)
    else
      [429, { 'Content-Type' => 'application/json' }, [
        { error: 'Rate limit exceeded' }.to_json
      ]]
    end
  end

  private

  # Identify clients by IP here; swap in a user id or API key as needed
  def extract_identifier(request)
    request.ip
  end
end
This setup ensures consistency. No matter which server handles the request, it checks against the same central count in Redis.
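Mounting the middleware in a Rails app might look like the following sketch. It assumes the `connection_pool` gem; the pool size, the limits, and the little hand-rolled `limiter_factory` are all illustrative choices, not part of any library:

```ruby
# config/application.rb (sketch) -- pool size and limits are assumptions
require 'connection_pool'

redis_pool = ConnectionPool.new(size: 10) { Redis.new(url: ENV['REDIS_URL']) }

# A tiny factory so the middleware can build a limiter per client
limiter_factory = Struct.new(:pool) do
  def build(identifier)
    DistributedLimiter.new(
      "rate_limit:#{identifier}",
      limit: 100,
      window: 3600,
      redis_pool: pool
    )
  end
end.new(redis_pool)

Rails.application.config.middleware.use ClusterRateLimit, limiter_factory
```

Sharing one connection pool across the app keeps the number of Redis connections bounded even when every request consults the limiter.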
Your servers have limited resources. When they are under heavy load, you might want to be stricter with limits to keep the system stable. When things are quiet, you could be more generous. This is adaptive rate limiting.
First, you need a way to measure load. A simple method is to check the system’s CPU load average.
require 'etc'

class AdaptiveLimiter
  def initialize(key, base_limit:, min_limit:, max_limit:)
    @key = key
    @base_limit = base_limit
    @min_limit = min_limit
    @max_limit = max_limit
    @redis = Redis.new(url: ENV['REDIS_URL'])
  end

  def current_limit
    load_factor = system_load_factor
    if load_factor > 0.8
      # System is stressed, cut the limit in half
      (@base_limit * 0.5).to_i.clamp(@min_limit, @max_limit)
    elsif load_factor < 0.3
      # System is idle, increase the limit by 50%
      (@base_limit * 1.5).to_i.clamp(@min_limit, @max_limit)
    else
      @base_limit
    end
  end

  # Nudge the base limit down or up in response to observed performance
  def adjust_downward
    @base_limit = (@base_limit * 0.9).to_i.clamp(@min_limit, @max_limit)
  end

  def adjust_upward
    @base_limit = (@base_limit * 1.1).to_i.clamp(@min_limit, @max_limit)
  end

  private

  def system_load_factor
    # Get the 1-minute load average and divide by CPU core count
    load_avg = `uptime`.split(':').last.split(',').first.to_f
    cores = Etc.nprocessors
    load_avg / cores
  rescue StandardError
    0.5 # A safe default if the command fails
  end
end
You can also hook into Rails’ built-in instrumentation. If your average response time is getting slow, it’s a sign you should tighten the limits.
class LoadAwareLimiter
  def initialize(identifier)
    @identifier = identifier
    @limiter = AdaptiveLimiter.new(
      identifier,
      base_limit: 100,
      min_limit: 20,
      max_limit: 200
    )
    # Listen to every request completion. Create this object once per
    # process; each new instance would stack up another subscription.
    ActiveSupport::Notifications.subscribe('process_action.action_controller') do |*args|
      event = ActiveSupport::Notifications::Event.new(*args)
      update_limits_based_on_performance(event)
    end
  end

  def update_limits_based_on_performance(event)
    response_time = event.duration # Duration in milliseconds
    if response_time > 1000 # Slower than 1 second
      @limiter.adjust_downward
    elsif response_time < 100 # Faster than 100ms
      @limiter.adjust_upward
    end
  end
end
This pattern makes your system resilient. It protects itself automatically when things get busy.
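Stripped of Redis and the `uptime` call, the adjustment rule is just arithmetic. Here it is as a pure function, using the same thresholds as above (the `adaptive_limit` helper is an invention for illustration):

```ruby
# Pure version of the load-based adjustment, for illustration only
def adaptive_limit(base_limit, min_limit, max_limit, load_factor)
  scaled =
    if load_factor > 0.8
      (base_limit * 0.5).to_i # stressed: halve the limit
    elsif load_factor < 0.3
      (base_limit * 1.5).to_i # idle: grow it by 50%
    else
      base_limit
    end
  scaled.clamp(min_limit, max_limit)
end

adaptive_limit(100, 20, 200, 0.9) # => 50
adaptive_limit(100, 20, 200, 0.1) # => 150
adaptive_limit(100, 20, 200, 0.5) # => 100
adaptive_limit(40, 30, 200, 0.9)  # => 30, the floor kicks in
```

The clamp matters: without `min_limit`, a sustained load spike could squeeze the limit toward zero and lock out legitimate traffic entirely.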
Not all API requests are equal. A simple GET request uses fewer resources than a complex search or a data export. Cost-based rate limiting assigns a “cost” to each type of request. A client might have a daily “budget” of 1000 points.
class CostBasedLimiter
  COST_MAP = {
    'GET' => 1,
    'POST' => 3,
    'PUT' => 2,
    'DELETE' => 2,
    'PATCH' => 2
  }.freeze

  ENDPOINT_COSTS = {
    '/api/search' => 2,
    '/api/export' => 10, # Very expensive
    '/api/bulk_create' => 5
  }.freeze

  def initialize(identifier, daily_budget: 1000)
    @identifier = identifier
    @daily_budget = daily_budget
    @redis = Redis.new(url: ENV['REDIS_URL'])
  end

  def cost_for_request(request)
    method_cost = COST_MAP[request.request_method] || 1
    endpoint_cost = ENDPOINT_COSTS[request.path] || 1
    method_cost * endpoint_cost
  end

  def can_serve?(request)
    cost = cost_for_request(request)
    today = Date.current.to_s
    spent_key = "rate_limit:#{@identifier}:spent:#{today}"
    current_spent = @redis.get(spent_key).to_i
    if current_spent + cost > @daily_budget
      false
    else
      @redis.incrby(spent_key, cost)
      @redis.expire(spent_key, 86400)
      true
    end
  end
end
This is fair. A client can make many cheap requests or a few expensive ones, staying within their resource allowance.
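It helps to work the arithmetic through by hand. This standalone sketch replicates the `cost_for_request` math (the `request_cost` helper is an invention for illustration; the class above does the same lookup against a real Rack request):

```ruby
# The cost arithmetic from cost_for_request, worked through by hand.
# Method cost multiplies endpoint cost, so a write to an expensive
# endpoint is priced much higher than a simple read.
COST_MAP = { 'GET' => 1, 'POST' => 3, 'PUT' => 2, 'DELETE' => 2, 'PATCH' => 2 }.freeze
ENDPOINT_COSTS = { '/api/search' => 2, '/api/export' => 10, '/api/bulk_create' => 5 }.freeze

def request_cost(method, path)
  (COST_MAP[method] || 1) * (ENDPOINT_COSTS[path] || 1)
end

request_cost('GET', '/api/users')        # => 1, the cheapest possible call
request_cost('GET', '/api/export')       # => 10
request_cost('POST', '/api/bulk_create') # => 15
```

With a 1000-point daily budget, a client gets 1000 cheap reads or roughly 66 bulk creates per day, but not both.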
You can also apply different rules to different API endpoints. Your login endpoint needs very strict limits to prevent brute-force attacks. Your data feed endpoint can be more generous.
class EndpointSpecificLimiter
  def initialize(identifier)
    @identifier = identifier
    @limiters = {}
  end

  def for_endpoint(endpoint)
    @limiters[endpoint] ||= begin
      config = endpoint_config(endpoint)
      SlidingWindowLimiter.new(
        "#{@identifier}:#{endpoint}",
        limit: config[:limit],
        window: config[:window]
      )
    end
  end

  private

  def endpoint_config(endpoint)
    case endpoint
    when /\/api\/v1\/login/
      { limit: 5, window: 300 } # Only 5 attempts every 5 minutes
    when /\/api\/v1\/search/
      { limit: 30, window: 60 }
    when /\/api\/v1\/admin/
      { limit: 100, window: 3600 }
    else
      { limit: 60, window: 60 }
    end
  end
end
This gives you surgical control. You protect sensitive areas tightly while allowing reasonable access elsewhere.
Different users have different needs. An anonymous visitor should have a very low limit. A paying customer should have a higher one. A large enterprise partner might need the highest limit of all.
class TieredLimiter
  TIERS = {
    anonymous: { limit: 10, window: 60 },
    free: { limit: 100, window: 3600 },
    basic: { limit: 1000, window: 3600 },
    premium: { limit: 10_000, window: 3600 },
    enterprise: { limit: 100_000, window: 3600 }
  }.freeze

  # Pass the user in explicitly; a plain Ruby object has no access to
  # controller helpers like current_user.
  def initialize(client_identifier, user: nil)
    @client_identifier = client_identifier
    @user = user
  end

  def tier_for_client
    if @user.nil?
      :anonymous
    elsif @user.enterprise_account?
      :enterprise
    elsif @user.subscription_level == 'premium'
      :premium
    elsif @user.subscription_level == 'basic'
      :basic
    else
      :free
    end
  end

  def check_limit
    tier = tier_for_client
    config = TIERS[tier]
    limiter = SlidingWindowLimiter.new(
      "#{tier}:#{@client_identifier}",
      limit: config[:limit],
      window: config[:window]
    )
    !limiter.exceeded?
  end
end
For API access, you often use API keys. Each key can have its own daily request limit stored in your database.
class ApiKeyLimiter
  def initialize(api_key)
    @api_key = api_key
    # Redis.current was removed in redis-rb 5, so hold an explicit connection
    @redis = Redis.new(url: ENV['REDIS_URL'])
  end

  def check_and_increment
    key_data = ApiKey.find_by(key: @api_key)
    return false unless key_data

    today = Date.current.to_s
    usage_key = "api_key:#{@api_key}:usage:#{today}"
    current_usage = @redis.get(usage_key).to_i
    if current_usage >= key_data.daily_limit
      false
    else
      @redis.incr(usage_key)
      @redis.expire(usage_key, 86400)
      true
    end
  end
end
This ties business logic directly into your rate limiting. You can offer different plans with different limits.
It’s not enough to just block requests. You need to understand why limits are being hit. Is it a misbehaving script, a new feature launch, or an attack? Logging and analytics are crucial.
class RateLimitAnalytics
  def initialize
    @redis = Redis.new(url: ENV['REDIS_URL'])
  end

  def track_violation(identifier, endpoint, limit_type)
    data = {
      identifier: identifier,
      endpoint: endpoint,
      limit_type: limit_type,
      timestamp: Time.current.iso8601,
      ip_address: Current.request&.remote_ip
    }
    # Log it
    Rails.logger.info("Rate limit violation: #{data.to_json}")
    # Send to an analytics service
    Analytics.track_event('rate_limit_violation', data)
    # Increment a counter for monitoring
    violation_key = "rate_limit:violations:#{Date.current}"
    @redis.incr(violation_key)
    @redis.expire(violation_key, 86400)
    check_alert_thresholds
  end

  def check_alert_thresholds
    violation_key = "rate_limit:violations:#{Date.current}"
    violations = @redis.get(violation_key).to_i
    # Alert your team if violations spike
    if violations > 1000
      AlertService.notify("High rate limit violations: #{violations} today")
    end
  end
end
With this data, you can see which endpoints are most frequently hit. You can identify problematic clients. You might even adjust limits dynamically based on behavior. A well-behaved client who rarely hits limits could be granted a small increase. A client who constantly brushes against the limit might be a candidate for a higher-tier plan or might need their limits temporarily reduced to protect the system.
class IntelligentLimiter
  def initialize(identifier, base_config)
    @identifier = identifier
    @base_config = base_config
    @violation_count = 0
    @redis = Redis.new(url: ENV['REDIS_URL'])
  end

  def current_limit
    if frequent_violations?
      # Reduce limit for problematic clients
      (@base_config[:limit] * 0.5).to_i
    elsif good_citizen?
      # Reward good behavior
      (@base_config[:limit] * 1.2).to_i
    else
      @base_config[:limit]
    end
  end

  def record_violation
    @violation_count += 1
    if @violation_count > 10
      # Temporary ban for extreme cases
      block_key = "blocked:#{@identifier}"
      @redis.setex(block_key, 3600, '1') # Block for 1 hour
    end
  end

  private

  def frequent_violations?
    # Check for many violations in the last hour
    violations_last_hour = RateLimitViolation
      .where(identifier: @identifier)
      .where('created_at > ?', 1.hour.ago)
      .count
    violations_last_hour > 5
  end

  def good_citizen?
    # No recorded violations in the last week counts as good behavior
    RateLimitViolation
      .where(identifier: @identifier)
      .where('created_at > ?', 1.week.ago)
      .none?
  end
end
This turns rate limiting from a static wall into a dynamic, learning part of your system.
These patterns can be mixed and matched. You might use a sliding window for anonymous users, a token bucket for your main API, and adaptive limits on your search endpoint. The goal is to keep your application available, responsive, and fair to all users. Start simple, monitor the results, and add complexity only when you need it. The best rate limiting strategy is the one that protects your service without getting in the way of legitimate use.