7 Background Job Patterns Every Rails Developer Needs to Stop Production Fires

Learn 7 proven Rails background job patterns to prevent data loss, duplicate charges, and queue failures. Real code, real solutions from a decade of production experience.

I have been building Rails applications for over a decade. The one moment that changed everything for me was when a simple user signup form took seven seconds to respond. The user sat there, staring at a blank screen. I was sending two emails, processing an image, and calling an external API, all inside the controller. I learned that day that the request-response cycle is sacred. It must be fast. Everything else needs to happen somewhere else.

That somewhere else is background jobs. They are the most important tool in a Rails developer's toolbox. Yet most beginners get them wrong. They throw every slow operation into a job without thinking about failure, ordering, or dependencies. I have made every mistake you can make. I have lost data. I have caused production outages. I have woken up at 3 AM to fix a job queue that was chewing through memory like candy.

Let me save you those sleepless nights. Here are the seven patterns for background job workflows that I use in every application I build today. Each pattern solved a real problem I faced, and I will show you exactly how to implement them.

The Single Responsibility Job

The first mistake I made was putting everything into one job. I created a ProcessOrderJob that sent emails, updated inventory, generated PDFs, and notified shipping. When that job failed, I could not tell what part broke. The error message was useless. The retry logic was wrong because each step had different requirements.

Here is how I fixed it. Every job should do one thing only. If you need to process an order, you need three separate jobs. One for the email, one for the PDF, one for the shipping. They run independently. If the PDF fails, the email still goes out. This is simple but it changes everything.

# Bad - one job does everything
class ProcessOrderJob < ApplicationJob
  queue_as :default

  def perform(order_id)
    order = Order.find(order_id)
    send_invoice_email(order)    # fails here
    generate_pdf_receipt(order)  # never runs
    update_shipping_status(order) # never runs
  end
end

# Good - each job has one purpose
class InvoiceEmailJob < ApplicationJob
  queue_as :default

  def perform(order_id)
    order = Order.find(order_id)
    OrderMailer.invoice(order).deliver_now
  end
end

class PdfReceiptJob < ApplicationJob
  queue_as :default

  def perform(order_id)
    order = Order.find(order_id)
    GenerateReceiptPdf.call(order)
  end
end

class ShippingNotificationJob < ApplicationJob
  queue_as :default

  def perform(order_id)
    order = Order.find(order_id)
    ShippingService.notify(order)
  end
end

I learned this pattern from a production incident where a third-party email API was down for two hours. The old ProcessOrderJob kept failing for every new order. The PDFs and shipping updates never happened for those two hours. After splitting them into separate jobs, only the email failed. Everything else worked fine. The queue kept moving.
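
To make the isolation concrete, here is a minimal plain-Ruby sketch with no Rails involved. The lambdas and the run_independently helper are hypothetical stand-ins for the three jobs and the queue, and the email lambda simulates the API outage. It only illustrates that one failure no longer blocks the other two steps.

```ruby
# Stand-ins for the three jobs; the email job simulates the API outage.
JOBS = {
  invoice_email: ->(_order_id) { raise "email API down" },
  pdf_receipt: ->(order_id) { "receipt-#{order_id}.pdf" },
  shipping_notification: ->(order_id) { "shipping notified for #{order_id}" }
}

# Runs every job on its own and records which ones would need a retry.
def run_independently(jobs, order_id)
  jobs.each_with_object({ done: [], retry: [] }) do |(name, job), result|
    job.call(order_id)
    result[:done] << name
  rescue StandardError
    result[:retry] << name
  end
end

result = run_independently(JOBS, 42)
puts result[:done].inspect  # the PDF and shipping jobs finished
puts result[:retry].inspect # only the email job is left to retry
```

In a real application the queue adapter does this bookkeeping for you. The point is only that each job succeeds or fails on its own.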

The Idempotency Pattern

Here is a truth that took me years to accept. Background jobs will run more than once. Redis crashes. Sidekiq restarts. A worker dies mid-job and the retry mechanism kicks in. If your job is not safe to run twice, you will have duplicate emails, duplicate charges, duplicate everything.

I once built a billing system that charged credit cards. The job would call the payment gateway, then update the local database. One day the worker died right after the payment succeeded but before the database update. On retry, it charged the card again. I had to issue refunds to seventeen customers and write apology emails.

The fix is idempotency. A job is idempotent if running it twice produces the same result as running it once. The simplest way to achieve this is to check the state before doing the work.

class ChargeCustomerJob < ApplicationJob
  queue_as :billing

  def perform(invoice_id)
    invoice = Invoice.find(invoice_id)

    # Check and claim the invoice under a row lock, so two workers
    # cannot both pass the "already attempted?" check at the same time
    claimed = invoice.with_lock do
      next false if invoice.paid? || invoice.payment_attempted_at.present?

      # Mark as attempted first
      invoice.update!(payment_attempted_at: Time.current)
      true
    end

    # Early return: another run already did (or is doing) this work
    return unless claimed

    # Then process the payment
    result = PaymentGateway.charge(
      amount: invoice.amount,
      token: invoice.payment_token
    )

    # Update based on result, storing the gateway's transaction ID in the same write
    if result.success?
      invoice.update!(paid: true, paid_at: Time.current, gateway_transaction_id: result.transaction_id)
    else
      invoice.update!(payment_failed: true, gateway_transaction_id: result.transaction_id)
    end
  end
end

The key line is the early return. Check if the work is done before doing it. For more complex workflows, I use a unique job ID stored in the database. When the job starts, it tries to acquire a lock using that ID. If the lock already exists, the job exits.

class UniqueJob < ApplicationJob
  queue_as :default

  def perform(job_id, *args)
    # Try to create a unique execution record. This assumes a unique
    # database index on unique_id, so two concurrent inserts cannot both succeed.
    begin
      JobExecution.create!(unique_id: job_id)
    rescue ActiveRecord::RecordNotUnique
      Rails.logger.info("Job #{job_id} already executed, skipping")
      return
    end

    # Now do the actual work
    do_work(*args)
  end
end
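
A related technique is to send an idempotency key along with the external call, so that the other side can also recognise a retry. The sketch below is plain Ruby under loud assumptions: FakeGateway and idempotency_key_for are hypothetical names, and whether and how a real gateway accepts such a key depends on your provider.

```ruby
require "digest"

# Hypothetical stand-in for a payment gateway that deduplicates on an
# idempotency key. Real gateways differ; check your provider's documentation.
class FakeGateway
  attr_reader :charges

  def initialize
    @charges = {} # idempotency key => amount charged
  end

  # Returns the stored charge when the key was seen before, instead of
  # charging a second time.
  def charge(amount:, idempotency_key:)
    @charges[idempotency_key] ||= amount
  end
end

# Derive the key from data that stays the same across retries of one job.
def idempotency_key_for(invoice_id)
  Digest::SHA256.hexdigest("charge-invoice-#{invoice_id}")
end

gateway = FakeGateway.new
key = idempotency_key_for(42)

gateway.charge(amount: 1999, idempotency_key: key) # first attempt
gateway.charge(amount: 1999, idempotency_key: key) # retry after a crash

puts gateway.charges.size # one charge recorded, not two
```

The key has to be derived from the invoice rather than generated randomly inside the job. A random key would be different on every retry and would protect nothing.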

The Chained Workflow Pattern

Some jobs must run in sequence. You cannot send the shipping notification until the PDF is generated. You cannot generate the PDF until the order is processed. This is where chaining comes in. But chaining is dangerous. If one job calls the next job's work directly inside its own perform method, you are back to a single large job and you lose the benefits of separate jobs. If one step fails, the whole chain breaks in confusing ways.

I found a solution that works. Each job finishes and then enqueues the next job. The first job runs, completes, and schedules the second job. The second job runs, completes, and schedules the third job. If the second job fails, the third job never runs. The queue handles retries naturally.

class ProcessOrderJob < ApplicationJob
  queue_as :default
  
  def perform(order_id)
    order = Order.find(order_id)
    
    # Step one: validate and mark as processing
    order.update!(status: :processing)
    
    # Enqueue next step
    GenerateInvoicePdfJob.perform_later(order_id)
  end
end

class GenerateInvoicePdfJob < ApplicationJob
  queue_as :documents
  
  def perform(order_id)
    order = Order.find(order_id)
    
    # Step two: generate the PDF
    pdf = InvoicePdfGenerator.new(order).render
    
    # Store it
    order.invoice_pdf.attach(
      io: StringIO.new(pdf),
      filename: "invoice-#{order.id}.pdf"
    )
    
    # Enqueue next step
    SendInvoiceEmailJob.perform_later(order_id)
  end
end

class SendInvoiceEmailJob < ApplicationJob
  queue_as :mailers
  
  def perform(order_id)
    order = Order.find(order_id)
    
    # Step three: send the email
    OrderMailer.invoice(order).deliver_now
    
    # Final step - mark complete
    order.update!(status: :completed)
  end
end

I use this pattern for user onboarding flows. When a user signs up, I need to create their account, send a welcome email, set up their default preferences, and create their first project. Each step depends on the previous one. The chain keeps the workflow organized and the retry logic simple. If the email fails, it retries. The account creation is already done.
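
The retry behaviour of a chain can be simulated in a few lines of plain Ruby. This is a sketch, not Active Job: QUEUE, STEPS, and drain are hypothetical stand-ins for the queue, the three jobs, and a worker. Each step returns the name of the step to enqueue next, and a failed step stays at the head of the queue.

```ruby
# In-memory stand-ins for the job queue and an execution log.
QUEUE = []
LOG = []

# Each step does its work and returns the next step to enqueue (or nil).
STEPS = {
  process: ->(order) { order[:status] = :processing; :generate_pdf },
  generate_pdf: lambda do |order|
    raise "PDF service unavailable" if order[:pdf_down]
    order[:pdf] = "invoice-#{order[:id]}.pdf"
    :send_email
  end,
  send_email: ->(order) { order[:status] = :completed; nil }
}

# Runs queued steps until the queue is empty or a step fails.
def drain(order)
  until QUEUE.empty?
    step = QUEUE.first
    next_step = STEPS.fetch(step).call(order) # may raise
    QUEUE.shift                               # only dequeue after success
    LOG << step
    QUEUE << next_step if next_step
  end
rescue RuntimeError => e
  LOG << "#{QUEUE.first} failed: #{e.message}"
end

order = { id: 7, pdf_down: true }
QUEUE << :process
drain(order)              # stops at generate_pdf; send_email was never enqueued

order[:pdf_down] = false  # the PDF service recovers
drain(order)              # resumes at generate_pdf, then sends the email
puts order[:status]       # completed
```

The first step is not repeated on the second run. Only the step that failed is retried, and the steps after it run once it succeeds.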

The Batch Processing Pattern

Most web applications eventually need operations that run on thousands of records at once. Sending a newsletter to ten thousand subscribers. Updating prices for a thousand products. Deleting old data from a million rows.

The wrong approach is to loop through all records in one job. That job will take hours, block the queue, and fail halfway through with no easy way to resume. I learned this the hard way when a data migration job ran for six hours, failed at 85%, and I had to start over.

The right approach is to process in batches. Each batch is its own job. If one batch fails, only that batch needs to retry. The progress is saved between batches.

class ProcessBatchJob < ApplicationJob
  queue_as :batch_processing
  
  BATCH_SIZE = 100
  
  def perform(model_name, ids)
    model = model_name.constantize
    records = model.where(id: ids)
    
    records.find_each do |record|
      process_record(record)
    end
    
    Rails.logger.info("Processed batch of #{ids.length} #{model_name}")
  end
  
  private
  
  def process_record(record)
    # Your processing logic here
    record.update!(processed_at: Time.current)
  end
end

class ScheduleBatchJob < ApplicationJob
  queue_as :batch_processing
  
  BATCH_SIZE = 100
  
  def perform(model_name)
    model = model_name.constantize
    unprocessed = model.where(processed_at: nil)
    
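    # Load the IDs once and reuse them for both slicing and logging,
    # so the logged count matches what was actually scheduled
    ids = unprocessed.pluck(:id)

    ids.each_slice(BATCH_SIZE) do |batch_ids|
      ProcessBatchJob.perform_later(model_name, batch_ids)
    end

    Rails.logger.info("Scheduled #{ids.size} records for processing")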
  end
end

I use this pattern for monthly subscription billing. I have a scheduler that runs once per day. It finds all customers who need to be billed today. It splits them into batches of 100. Each batch runs independently. If the payment gateway has a hiccup for ten minutes, only a few batches fail. They retry. The rest keep going.
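
The slicing arithmetic is worth seeing once with concrete numbers. In this plain-Ruby sketch the range of IDs is a stand-in for the result of pluck(:id), and 1,050 is an arbitrary example count.

```ruby
BATCH_SIZE = 100

# Stand-in for model.where(processed_at: nil).pluck(:id)
ids = (1..1_050).to_a

batches = ids.each_slice(BATCH_SIZE).to_a

puts batches.size       # 11 jobs would be enqueued
puts batches.first.size # 100 IDs in a full batch
puts batches.last.size  # 50 IDs in the final, partial batch
```

If the seventh batch fails, only its 100 IDs are retried. The other 950 records are untouched by the failure.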

The Scheduled Job Pattern

Not all jobs run immediately. Some need to run later. Send a reminder email in 24 hours. Cancel an unpaid order after 30 minutes. Retry a failed webhook delivery in increasing intervals.

Rails gives you set(wait:), but I found that using it directly in controllers leads to bugs. The controller is supposed to handle requests, not schedule future work. I separate the scheduling into its own service object.

class OrderReminderScheduler
  def self.schedule(order)
    return if order.reminder_scheduled?
    
    # Schedule three reminders
    SendOrderReminderJob
      .set(wait: 24.hours)
      .perform_later(order.id, :first_reminder)
    
    SendOrderReminderJob
      .set(wait: 48.hours)
      .perform_later(order.id, :second_reminder)
    
    SendOrderReminderJob
      .set(wait: 72.hours)
      .perform_later(order.id, :final_reminder)
    
    order.update!(reminder_scheduled: true)
  end
end

class SendOrderReminderJob < ApplicationJob
  queue_as :reminders
  
  def perform(order_id, reminder_type)
    order = Order.find_by(id: order_id)
    return unless order
    
    case reminder_type
    when :first_reminder
      return if order.paid?
      OrderMailer.first_reminder(order).deliver_now
      
    when :second_reminder
      return if order.paid?
      OrderMailer.second_reminder(order).deliver_now
      
    when :final_reminder
      return if order.paid?
      OrderMailer.final_reminder(order).deliver_now
      order.update!(status: :expired)
    end
  end
end

This pattern saves me from the bug where a job runs after the order is already paid. The early return (return if order.paid?) is critical. The world changes between when you schedule a job and when it runs. Always recheck the state.
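
For the webhook case mentioned at the start of this section, the increasing intervals are usually computed rather than hard-coded. Here is a minimal sketch of capped exponential backoff in plain Ruby. The backoff_delay name, the 30-second base, and the one-hour cap are assumptions for illustration, not values from a real system.

```ruby
# Delay in seconds before retry number `attempt` (1-based).
# Doubles on every attempt and never exceeds `cap`.
def backoff_delay(attempt, base: 30, cap: 3600)
  [base * (2**(attempt - 1)), cap].min
end

delays = (1..8).map { |attempt| backoff_delay(attempt) }
puts delays.inspect # [30, 60, 120, 240, 480, 960, 1920, 3600]
```

The result can be passed to set(wait:) as a number of seconds when the job re-enqueues itself. The cap matters: without it, the eighth attempt would wait 3,840 seconds and later attempts would wait for days.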

For cron-style recurring jobs, I use the whenever gem in combination with the scheduled job pattern. The cron entry triggers a job, and the job triggers the work. This keeps everything in the app layer.

# config/schedule.rb
every :day, at: '2:00 am' do
  runner "DailyCleanupJob.perform_later"
end

# app/jobs/daily_cleanup_job.rb
class DailyCleanupJob < ApplicationJob
  queue_as :maintenance
  
  def perform
    # Delete old sessions
    Session.where('expires_at < ?', 30.days.ago).delete_all
    
    # Archive old orders
    Order.where('created_at < ?', 90.days.ago).find_each do |order|
      ArchiveOrderJob.perform_later(order.id)
    end
    
    # Send daily summary to admins
    AdminMailer.daily_summary.deliver_later
  end
end

The Circuit Breaker Pattern

Here is a pattern I wish I had known earlier. External APIs fail. They fail often. They fail at the worst times. A job that calls a third-party service will retry on failure. But if the service is down for an hour, your queue fills up with failed attempts. Each retry wastes time. The queue backs up. Everything slows down.

The circuit breaker pattern fixes this. Track failures. After a threshold, stop trying for a while. Let the queue drain. Try again later.

class CircuitBreaker
  attr_reader :service_name, :failure_count, :failure_threshold, :reset_timeout
  
  def initialize(service_name, failure_threshold: 5, reset_timeout: 60)
    @service_name = service_name
    @failure_count = 0
    @failure_threshold = failure_threshold
    @reset_timeout = reset_timeout
    @last_failure_time = nil
    @open = false
  end
  
  def open?
    if @open
      if Time.current - @last_failure_time > @reset_timeout
        # Try to close the circuit
        @open = false
        @failure_count = 0
        Rails.logger.info("Circuit breaker #{@service_name} reset to closed")
        return false
      end
      return true
    end
    false
  end
  
  def record_failure
    @failure_count += 1
    @last_failure_time = Time.current
    
    if @failure_count >= @failure_threshold
      @open = true
      Rails.logger.warn("Circuit breaker #{@service_name} OPEN due to #{@failure_count} failures")
    end
  end
  
  def record_success
    @failure_count = 0
    @open = false
  end
end

class ExternalApiJob < ApplicationJob
  queue_as :external_apis
  
  # In-memory and per-process: each worker process keeps its own breaker state
  CIRCUIT_BREAKERS = {}

  def perform(endpoint, payload)
    circuit = circuit_breaker_for(endpoint)

    if circuit.open?
      # Fail fast without calling the API; the queue's retry
      # mechanism will run this job again later
      raise "Circuit breaker is open for #{endpoint}"
    end

    begin
      response = HTTParty.post(endpoint, body: payload)
      raise "API returned #{response.code}" unless response.success?
    rescue StandardError
      # Count each failed call exactly once, then re-raise for the retry
      circuit.record_failure
      raise
    end

    # Only reached when the call succeeded
    circuit.record_success
    process_response(response)
  end
  
  private
  
  def circuit_breaker_for(endpoint)
    self.class::CIRCUIT_BREAKERS[endpoint] ||= CircuitBreaker.new(endpoint)
  end
end

I put this into practice after a payment gateway outage. Every billing job failed. The queue grew to fifty thousand jobs. When the gateway came back, it took two hours to clear the backlog. With a circuit breaker, the jobs would have stopped calling the gateway after five failures. They would have failed fast and waited for the reset timeout. The queue would have stayed manageable.
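
The timing behaviour of a breaker is easier to reason about when the clock can be controlled. The following plain-Ruby sketch is a simplified variant of the class above, not a replacement for it. SimpleBreaker and its clock parameter are hypothetical names introduced here so the open and reset transitions can be exercised without waiting.

```ruby
# Minimal breaker with an injectable clock so the timing can be tested.
class SimpleBreaker
  def initialize(threshold: 5, reset_timeout: 60, clock: -> { Time.now })
    @threshold = threshold
    @reset_timeout = reset_timeout
    @clock = clock
    @failures = 0
    @opened_at = nil
  end

  def open?
    return false unless @opened_at
    return true if @clock.call - @opened_at < @reset_timeout

    # Timeout elapsed: close again and let the next call through.
    @opened_at = nil
    @failures = 0
    false
  end

  def record_failure
    @failures += 1
    @opened_at = @clock.call if @failures >= @threshold
  end

  def record_success
    @failures = 0
    @opened_at = nil
  end
end

now = Time.at(0)
breaker = SimpleBreaker.new(threshold: 3, reset_timeout: 60, clock: -> { now })

3.times { breaker.record_failure }
puts breaker.open? # true: three failures reached the threshold

now += 61          # simulate 61 seconds passing
puts breaker.open? # false: the reset timeout has elapsed
```

Keep in mind that a breaker held in memory is private to one worker process. With several processes, each one has to hit the threshold on its own unless the counters live in shared storage such as Redis or the database.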

The Transactional Workflow Pattern

The hardest pattern to master is the transactional workflow. This is when a job affects multiple systems and you need consistency. You cannot charge the card and fail to create the subscription. You cannot update the inventory and fail to create the shipment.

The pattern is simple in theory, hard in practice. Do all the database work first. Then do the external work. If the external work fails, run a compensation step that marks the database records as failed.

class CreateSubscriptionJob < ApplicationJob
  queue_as :billing
  
  def perform(user_id, plan_id)
    user = User.find(user_id)
    plan = Plan.find(plan_id)
    
    # Step 1: Create database records first
    subscription = nil
    payment = nil
    
    ActiveRecord::Base.transaction do
      # Create the subscription record
      subscription = user.subscriptions.create!(
        plan: plan,
        status: :pending,
        started_at: Time.current
      )
      
      # Create the payment record
      payment = subscription.payments.create!(
        amount: plan.price,
        status: :pending
      )
    end
    
    # Step 2: Call external services
    begin
      # Charge the card
      result = PaymentGateway.charge(
        amount: plan.price,
        token: user.payment_token
      )
      # Treat a declined charge like any other failure
      raise "Payment was not successful" unless result.success?
    rescue StandardError => e
      # Step 3: Compensation - mark everything as failed
      ActiveRecord::Base.transaction do
        payment.update!(status: :failed, failure_reason: e.message)
        subscription.update!(status: :failed)
      end

      # Notify the user
      SubscriptionMailer.payment_failed(user, subscription).deliver_later

      # Re-raise to trigger the Sidekiq retry. Note that a retry runs the
      # whole job again and creates fresh subscription and payment records.
      raise
    end

    # The charge succeeded. Record it outside the rescue block, so a
    # database error here is never mistaken for a failed payment.
    ActiveRecord::Base.transaction do
      payment.update!(
        status: :completed,
        transaction_id: result.transaction_id
      )
      subscription.update!(status: :active)
    end

    # Send confirmation
    SubscriptionMailer.confirmation(user, subscription).deliver_later
  end
end

The critical insight is that the transaction wraps only the database changes. Not the external call. You cannot roll back a credit card charge with a database rollback. The database transaction is for the local state only. The external call is made after the transaction commits. If it fails, you have a compensation path.

I use this pattern for all financial operations. It is not perfect. There is still a window between the database insert and the external call. But it reduces the risk dramatically compared to doing things in the wrong order.
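
One way to narrow that window is a periodic sweep that looks for records stuck in the pending state. This is a plain-Ruby sketch of the selection logic only. The Payment struct, the stale_pending helper, and the 15-minute threshold are assumptions for illustration. In a Rails app the same filter would be a database query, and each hit would be checked against the gateway before anything is changed.

```ruby
Payment = Struct.new(:id, :status, :created_at)

# Payments still pending after `older_than` seconds probably belong to a
# job that died between the database insert and the gateway call.
def stale_pending(payments, now:, older_than: 15 * 60)
  payments.select do |payment|
    payment.status == :pending && (now - payment.created_at) > older_than
  end
end

now = Time.at(10_000)
payments = [
  Payment.new(1, :pending,   now - 3_600), # stuck for an hour
  Payment.new(2, :pending,   now - 60),    # probably still in flight
  Payment.new(3, :completed, now - 3_600)  # finished normally
]

stale_ids = stale_pending(payments, now: now).map(&:id)
puts stale_ids.inspect # [1]
```

The sweep does not decide whether the customer was charged. It only produces the list of payments that need to be looked up at the gateway.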

Bringing It All Together

These seven patterns cover almost every background job scenario I have encountered. Start with single responsibility jobs. Add idempotency to prevent duplicates. Chain them for workflows. Batch them for large operations. Schedule them with state checks. Protect them with circuit breakers. Wrap them in transactions for consistency.

Do not try to implement all seven at once. Pick the one that solves your current pain point. If you are losing data, start with the transactional pattern. If your queue is backing up, implement the circuit breaker. If you have duplicate charges, add idempotency.

The goal is not to have perfect code. The goal is to have code that fails gracefully. Background jobs will fail. Your database will go down. Redis will restart. Your payment gateway will have an outage. The patterns I showed you here are not about preventing failure. They are about making sure your application survives failure and recovers cleanly.

I still make mistakes. Last month I scheduled a job without the idempotency check. It ran twice and sent duplicate emails to a thousand customers. I got complaints for three days. Then I added the check. It has not happened since.

That is the thing about background jobs. They are invisible when they work. When they fail, everyone notices. Spend the time now to get them right. Your future self, the one who gets woken up at 3 AM, will thank you.

