Building robust Ruby applications requires more than just writing code that works under ideal conditions. It demands a thoughtful approach to handling the inevitable failures and edge cases that occur in production environments. Over the years, I’ve learned that sophisticated error handling separates maintainable applications from fragile ones.
Error management isn’t about preventing all failures—that’s impossible. Instead, it’s about creating systems that degrade gracefully, provide meaningful feedback, and maintain stability when things go wrong. The patterns I’ll share have proven invaluable in my work with production Ruby systems.
Let’s start with the Circuit Breaker pattern. This approach prevents cascading failures by temporarily blocking calls to failing services. I’ve implemented this pattern when working with external APIs that occasionally become unresponsive.
The Circuit Breaker operates in three states. When closed, everything works normally. If failures exceed a threshold, it opens and immediately rejects requests. After a timeout period, it moves to half-open state to test if the service has recovered.
class CircuitBreaker
  class CircuitOpenError < StandardError; end

  def initialize(threshold: 5, timeout: 30)
    @threshold = threshold
    @timeout = timeout
    @failures = 0
    @state = :closed
    @last_failure_time = nil
  end

  def call(&block)
    case @state
    when :open
      # Fail fast until the timeout has elapsed, then allow a single probe.
      raise CircuitOpenError if Time.now - @last_failure_time < @timeout
      @state = :half_open
      call(&block)
    when :half_open
      begin
        result = yield
        # The probe succeeded, so close the circuit again.
        @state = :closed
        @failures = 0
        result
      rescue => e
        record_failure
        @state = :open
        raise e
      end
    when :closed
      begin
        result = yield
        @failures = 0
        result
      rescue => e
        record_failure
        @state = :open if @failures >= @threshold
        raise e
      end
    end
  end

  private

  def record_failure
    @failures += 1
    @last_failure_time = Time.now
  end
end
I use this pattern for any external service integration. It prevents one failing service from bringing down the entire application. The timeout period gives the remote service time to recover before we attempt another connection.
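In practice I keep one breaker per external dependency and route every call to that dependency through it. Here is a minimal usage sketch; the PaymentGateway client, the charge_customer helper, and the specific threshold and timeout values are hypothetical, not part of the pattern itself.

PAYMENT_BREAKER = CircuitBreaker.new(threshold: 3, timeout: 60)

def charge_customer(order)
  PAYMENT_BREAKER.call do
    # Any exception raised in here counts toward the failure threshold.
    PaymentGateway.charge(amount: order.total, customer_id: order.customer_id)
  end
rescue CircuitBreaker::CircuitOpenError
  # The breaker is open: fail fast and queue the charge for later
  # instead of waiting on a service we already know is struggling.
  nil
end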
Contextual error tracking has transformed how I debug production issues. Without proper context, error messages often lack the information needed to identify root causes. I’ve spent countless hours trying to reproduce bugs that could have been solved with better context.
module ErrorContext
  def with_error_context(context = {}, &block)
    original_context = Thread.current[:error_context] || {}
    Thread.current[:error_context] = original_context.merge(context)
    yield
  ensure
    Thread.current[:error_context] = original_context
  end

  def capture_error(error, additional_context = {})
    full_context = (Thread.current[:error_context] || {}).merge(additional_context)
    ErrorTracker.capture(error, full_context)
  end
end
class PaymentProcessor
  include ErrorContext  # include, not extend: the helpers are called from instance methods

  def process(payment)
    with_error_context(payment_id: payment.id, amount: payment.amount) do
      # Payment processing logic
      raise PaymentError if something_fails
    end
  rescue => e
    capture_error(e, step: :processing)
    raise
  end
end
This approach attaches relevant information to exceptions without cluttering method signatures. The thread-local storage maintains context across method calls. When an error occurs, I get a complete picture of what was happening in the system.
I’ve found this particularly valuable in complex business processes. Knowing which user was involved, what data they were working with, and which step failed makes debugging significantly faster.
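Because each with_error_context call merges into whatever context is already on the thread, nested calls compose naturally. A small sketch, where ImportService and import_row are hypothetical names:

class ImportService
  include ErrorContext

  def run(user, rows)
    with_error_context(user_id: user.id) do
      rows.each_with_index do |row, index|
        with_error_context(row_number: index + 1) do
          import_row(row)  # capture_error in here sees both user_id and row_number
        end
      end
    end
  end
end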
Retry mechanisms with exponential backoff have saved many integrations from temporary network issues. Early in my career, I would implement simple retries that could actually make problems worse by overwhelming struggling services.
class RetryWithBackoff
  def initialize(max_attempts: 3, base_delay: 1)
    @max_attempts = max_attempts
    @base_delay = base_delay
  end

  def call(&block)
    attempts = 0
    begin
      yield
    rescue => e
      attempts += 1
      if attempts < @max_attempts && retryable_error?(e)
        sleep(@base_delay * (2 ** (attempts - 1)))
        retry
      end
      raise
    end
  end

  def retryable_error?(error)
    [NetworkError, TimeoutError, TemporaryServiceError].any? { |klass| error.is_a?(klass) }
  end
end
retrier = RetryWithBackoff.new
retrier.call { unreliable_operation }
The exponential increase in wait time between retries reduces load on the struggling service. The retryable error check prevents infinite loops on permanent failures. I always include this pattern when working with network operations or external APIs.
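With the defaults above, the delays work out to base_delay * 2^(attempt - 1): one second after the first failure, two after the second, and the third failure is raised to the caller. The error classes in retryable_error? are application-defined; here is a rough sketch of how they might be declared and used around an HTTP call (the inventory endpoint is made up):

class NetworkError < StandardError; end
class TemporaryServiceError < StandardError; end

require "net/http"

def fetch_inventory(sku)
  RetryWithBackoff.new(max_attempts: 4, base_delay: 0.5).call do
    response = Net::HTTP.get_response(URI("https://inventory.example.com/skus/#{sku}"))
    # Treat 5xx responses as temporary so they are retried with backoff.
    raise TemporaryServiceError, response.code if response.is_a?(Net::HTTPServerError)
    response.body
  end
end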
Web applications benefit greatly from centralized error handling. Before adopting this pattern, I would find error handling logic scattered throughout controllers, making maintenance difficult and inconsistent.
class ErrorHandlingMiddleware
  def initialize(app)
    @app = app
  end

  def call(env)
    @app.call(env)
  rescue ActiveRecord::RecordNotFound
    [404, { 'Content-Type' => 'application/json' }, [{ error: 'Resource not found' }.to_json]]
  rescue AuthenticationError
    [401, { 'Content-Type' => 'application/json' }, [{ error: 'Authentication required' }.to_json]]
  rescue AuthorizationError
    [403, { 'Content-Type' => 'application/json' }, [{ error: 'Access denied' }.to_json]]
  rescue => e
    ErrorTracker.capture(e, request: env)
    [500, { 'Content-Type' => 'application/json' }, [{ error: 'Internal server error' }.to_json]]
  end
end
This middleware catches exceptions before they propagate through the entire stack. Each rescue block maps specific exception types to appropriate HTTP responses. Users get clear error messages while the application maintains stability.
I’ve found this approach particularly useful for API development. It ensures consistent error responses across all endpoints while keeping the error handling logic in one place.
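Wiring the middleware in is a single line. In Rails it goes in config/application.rb (or an initializer); in a plain Rack app it goes in config.ru. MyApi below is a stand-in for whatever the application constant is:

# Rails: config/application.rb
config.middleware.use ErrorHandlingMiddleware

# Plain Rack: config.ru
use ErrorHandlingMiddleware
run MyApi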
The Result monad pattern has changed how I think about error propagation. Instead of using exceptions for control flow, this pattern encapsulates success and failure states explicitly.
class Result
  attr_reader :value, :error

  def self.success(value = nil)
    new(success: true, value: value)
  end

  def self.failure(error = nil)
    new(success: false, error: error)
  end

  def initialize(success:, value: nil, error: nil)
    @success = success
    @value = value
    @error = error
  end

  def success?
    @success
  end

  def and_then(&block)
    return self unless success?
    begin
      Result.success(yield(@value))
    rescue => e
      Result.failure(e)
    end
  end

  def or_else(&block)
    return self if success?
    begin
      Result.success(yield(@error))
    rescue => e
      Result.failure(e)
    end
  end
end
def process_data(input)
  validate_input(input)
    .and_then { |data| transform_data(data) }
    .and_then { |result| persist_data(result) }
end
This pattern enables clean error propagation through processing pipelines. Method chaining replaces deep nesting of begin-rescue blocks. The code becomes more readable and maintainable while preserving error context.
I use this pattern extensively in service objects and business logic. It makes error handling explicit and prevents unexpected exception propagation.
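At the call site the only branching is on success?. The controller action below is a hypothetical example of consuming process_data; it uses the value and error readers on Result and assumes failures wrap exceptions:

def create
  result = process_data(params[:payload])

  if result.success?
    render json: { data: result.value }, status: :created
  else
    render json: { error: result.error.message }, status: :unprocessable_entity
  end
end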
Structured error notification ensures that errors reach the right people through the appropriate channels. I’ve seen too many critical errors get lost in noisy logs because they weren’t properly categorized.
class ErrorNotifier
  def self.notify(error, severity: :error, context: {})
    case severity
    when :debug
      Rails.logger.debug(format_error(error, context))
    when :info
      Rails.logger.info(format_error(error, context))
    when :warn
      Rails.logger.warn(format_error(error, context))
    when :error
      Rails.logger.error(format_error(error, context))
      send_to_external_service(error, context) if production?
    when :fatal
      Rails.logger.fatal(format_error(error, context))
      send_alert_to_ops(error, context)
    end
  end

  def self.production?
    Rails.env.production?
  end

  def self.format_error(error, context)
    {
      message: error.message,
      backtrace: error.backtrace.first(10),
      context: context
    }.to_json
  end
end
The severity-based routing separates operational noise from critical problems. Development errors stay in logs while production errors reach monitoring systems. The structured format includes all necessary debugging information.
I configure this based on the application’s needs. Non-critical services might use less aggressive notification settings than mission-critical systems.
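A typical call site chooses the severity based on who needs to act on it. The job name and the StaleDataError class below are made up for illustration:

begin
  sync_inventory!
rescue StaleDataError => e
  # Expected from time to time: keep it in the logs, no paging.
  ErrorNotifier.notify(e, severity: :warn, context: { job: "inventory_sync" })
rescue => e
  # Unexpected: log as fatal, alert whoever is on call, then re-raise.
  ErrorNotifier.notify(e, severity: :fatal, context: { job: "inventory_sync" })
  raise
end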
The SafeExecution pattern provides a concise way to handle expected failures gracefully. I use this for operations where failure is an expected outcome rather than an exceptional condition.
module SafeExecution
  include ErrorContext  # safely_with_context relies on with_error_context from ErrorContext

  def safely(default: nil, capture: true, &block)
    yield
  rescue => e
    ErrorNotifier.notify(e, severity: :error) if capture
    default
  end

  def safely_with_context(context = {}, &block)
    with_error_context(context) do
      safely(&block)
    end
  end
end
class DataImportJob
  include SafeExecution  # include, not extend: perform is an instance method

  def perform
    safely_with_context(job_id: job_id) do
      import_records
    end
  end

  def import_records
    # Risky import operation
  end
end
The default return value allows the application to continue when non-critical operations fail. Context integration ensures even caught errors include relevant debugging information. I use this pattern for background jobs, file processing, and other operations where partial failure is acceptable.
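The default: argument is what keeps callers simple; they get an empty value instead of a nil check. A brief sketch, where ReportService and the Warehouse query are hypothetical:

class ReportService
  include SafeExecution

  def recent_signups
    # If the warehouse query fails, the dashboard renders an empty list
    # instead of a 500; the error is still reported through ErrorNotifier.
    safely(default: []) { Warehouse.query("SELECT * FROM signups WHERE ...") }
  end
end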
These patterns work best when combined into a comprehensive error handling strategy. I typically start with centralized error handling for web requests, add contextual tracking for better debugging, and use circuit breakers for external integrations.
The Result monad handles business logic errors explicitly while SafeExecution manages expected failures. Retry mechanisms handle temporary issues, and structured notification ensures the right people know about important problems.
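Put together, a client for an external service usually layers several of these at once: retries inside a circuit breaker, with error context around the whole call and a Result handed back to the caller. A condensed sketch of how the pieces might compose (ShippingApi is hypothetical):

class ShippingClient
  include ErrorContext

  BREAKER = CircuitBreaker.new(threshold: 5, timeout: 30)
  RETRIER = RetryWithBackoff.new(max_attempts: 3, base_delay: 1)

  def quote(parcel)
    with_error_context(parcel_id: parcel.id) do
      result = BREAKER.call { RETRIER.call { ShippingApi.quote(parcel) } }
      Result.success(result)
    end
  rescue => e
    capture_error(e, operation: :quote)
    Result.failure(e)
  end
end

With the retrier inside the breaker, one exhausted retry cycle counts as a single failure toward the breaker's threshold, which keeps transient blips from tripping it prematurely.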
Implementation considerations include setting appropriate thresholds for circuit breakers, defining retry policies that match service characteristics, and configuring notification levels based on operational impact.
User communication is equally important. Technical errors should be logged with full context, but users need clear, actionable messages. The middleware pattern helps maintain this separation while ensuring consistency.
Monitoring and metrics complete the picture. I track error rates, success percentages for external services, and recovery times for circuit breakers. This data helps identify patterns and prioritize improvements.
Error handling is an ongoing process rather than a one-time implementation. I regularly review error reports, adjust thresholds, and refine patterns based on operational experience. The goal is continuous improvement in system resilience and reliability.
These patterns have served me well across various Ruby applications. They provide a foundation for building systems that handle failure gracefully while maintaining operational visibility. The investment in robust error handling pays dividends in reduced debugging time and improved system stability.
Remember that no single pattern fits all situations. The art lies in choosing the right combination for your specific application needs and operational environment. Start with the patterns that address your most pressing pain points and evolve your approach as you learn from production experience.