Memory optimization in Ruby applications often feels like walking a tightrope. On one side, you have the need for rapid development and clean code. On the other, the harsh reality of production environments where memory bloat can slow everything to a crawl. I have spent countless hours profiling applications, and the patterns I share here come from hard-won experience in the field. They are not theoretical ideals but practical tools I use regularly to keep systems responsive and efficient.
Object pooling stands out as one of the most effective strategies I employ. Creating and destroying objects repeatedly, especially expensive ones like database connections, puts unnecessary strain on the garbage collector. By maintaining a reusable pool, you sidestep this overhead entirely. I remember a project where connection pooling reduced GC pauses by over 30%, making the application feel noticeably snappier for users.
class DatabaseConnectionPool
  def initialize(max_connections: 10)
    @max_connections = max_connections
    @available = Queue.new # idle connections, ready to borrow
    @in_use = {}           # checked-out connections, keyed by object_id
    @created = 0
    @lock = Mutex.new      # guards the creation counter
  end

  # Borrow a connection for the duration of the block and return it
  # to the pool afterwards, even if the block raises.
  def with_connection
    conn = acquire
    yield conn
  ensure
    release(conn) if conn
  end

  private

  def acquire
    conn = checkout_or_create
    @in_use[conn.object_id] = conn
    conn
  end

  # Create connections lazily while under the limit; once the limit
  # is reached, block until another thread releases one.
  def checkout_or_create
    @lock.synchronize do
      if @available.empty? && @created < @max_connections
        @created += 1
        return Database.new_connection
      end
    end
    @available.pop # blocks when the queue is empty
  end

  def release(connection)
    @in_use.delete(connection.object_id)
    @available.push(connection)
  end
end
# In practice, this simplifies resource management
pool = DatabaseConnectionPool.new(max_connections: 5)
pool.with_connection do |conn|
  conn.execute("SELECT * FROM users")
end
The beauty of this approach lies in its simplicity. You borrow a connection, use it, and return it. The pool handles the lifecycle, ensuring that no more than the specified number of connections exist at any time. This prevents memory leaks and keeps resource usage predictable.
Lazy loading transforms how applications handle data. Too often, I see code that eagerly loads everything upfront, only to use a fraction of it. By deferring object creation until the moment of need, you conserve memory and improve initial load times. One of my client projects saw a 40% reduction in memory usage during startup after we implemented lazy loading across their report generation system.
class LazyDataLoader
  def initialize(data_source)
    @data_source = data_source
    @loaded = false
    @data = nil
  end

  def data
    load_data unless @loaded
    @data
  end

  def load_data
    puts "Loading data from #{@data_source}..."
    # Simulate an expensive operation
    sleep(2)
    @data = Array.new(1000) { |i| { id: i, value: "item_#{i}" } }
    @loaded = true
  end

  def process
    data.map { |item| expensive_operation(item) }
  end

  private

  def expensive_operation(item)
    # Some CPU-intensive work
    item[:value].upcase
  end
end
loader = LazyDataLoader.new("database")
# Data remains unloaded until accessed
puts "Loader created, no data loaded yet"
results = loader.process # Triggers load and process
This pattern shines in scenarios where data usage is conditional. Why pay the memory cost if you might not need the data? It encourages a more thoughtful approach to resource management.
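When a full loader class is overkill, plain memoization gives you the same deferral in one line. A minimal sketch, where Report and fetch_rows are hypothetical stand-ins for whatever expensive call your code makes:

class Report
  # The query runs on first access only; later calls reuse @rows.
  # Caveat: ||= re-runs the call if it legitimately returns nil or false.
  def rows
    @rows ||= fetch_rows
  end

  private

  def fetch_rows
    Array.new(1_000) { |i| { id: i } } # stand-in for an expensive query
  end
end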
Custom data structures can dramatically reduce memory footprint. Ruby’s built-in collections are convenient but not always optimal for specific use cases. I once replaced a simple hash with a tailored structure and cut memory usage by half while improving lookup speeds.
require "objspace" # provides ObjectSpace.memsize_of

class OptimizedUserRegistry
  def initialize
    @users = []
    @email_index = {}    # email => position in @users
    @activity_index = [] # [last_active, position] pairs, kept sorted
  end

  def add_user(user)
    # Freezing prevents accidental mutation and lets Ruby apply
    # internal optimizations to these objects.
    frozen_user = {
      id: user[:id],
      email: user[:email].freeze,
      last_active: user[:last_active],
      metadata: user[:metadata].freeze # shallow freeze; nested hashes stay mutable
    }.freeze
    index = @users.size
    @users << frozen_user
    @email_index[frozen_user[:email]] = index
    @activity_index << [frozen_user[:last_active], index]
    @activity_index.sort!
  end

  def find_by_email(email)
    index = @email_index[email]
    @users[index] if index
  end

  def recently_active(limit = 10)
    @activity_index.last(limit).map { |_, idx| @users[idx] }
  end

  def memory_usage
    # Shallow sizes of the three containers, not of their contents
    ObjectSpace.memsize_of(@users) +
      ObjectSpace.memsize_of(@email_index) +
      ObjectSpace.memsize_of(@activity_index)
  end
end
registry = OptimizedUserRegistry.new
1000.times do |i|
  registry.add_user({
    id: i,
    email: "user#{i}@example.com",
    last_active: Time.now - rand(100000),
    metadata: { preferences: { theme: "dark" } }
  })
end
puts "Memory used: #{registry.memory_usage} bytes"
Freezing objects prevents modifications and allows Ruby to make internal optimizations. Combined with efficient indexing, this approach handles large datasets gracefully.
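One concrete example of such an optimization is the frozen string literal pragma: with the magic comment at the top of a file, every string literal in that file is frozen and deduplicated through the interpreter's internal string table. A minimal sketch, assuming the pragma is the first line of its file:

# frozen_string_literal: true
# With the pragma above, identical literals in this file share one
# frozen object instead of allocating a fresh string each time.
a = "status"
b = "status"
puts a.equal?(b) # => true: same object, not just equal content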
String handling frequently causes memory issues in Ruby applications. Each string literal creates a new object, leading to duplication. I often use interning techniques to ensure identical strings share the same memory location.
class StringInterner
  def initialize
    @pool = {}
  end

  def intern(str)
    @pool[str] ||= str.dup.freeze
  end

  def intern_all(strings)
    strings.map { |s| intern(s) }
  end

  def pool_size
    @pool.size
  end

  def estimated_savings(original_count)
    # Average string overhead in bytes
    overhead_per_string = 40
    (original_count - pool_size) * overhead_per_string
  end
end
interner = StringInterner.new
words = %w[hello world hello ruby world hello]
unique_words = interner.intern_all(words)
puts "Original strings: #{words.map(&:object_id).uniq.size}"
puts "Interned strings: #{unique_words.map(&:object_id).uniq.size}"
puts "Memory saved: ~#{interner.estimated_savings(words.size)} bytes"
This technique works wonders for applications processing repetitive text data, like log files or user-generated content. The memory savings compound quickly as data volume grows.
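Before reaching for a custom interner, note that Ruby ships one: the unary minus operator (String#-@) returns a frozen copy deduplicated through the interpreter's internal string table. A short sketch:

# Both calls build the same string dynamically; unary minus hands back
# one shared frozen object from Ruby's deduplication table (Ruby 2.5+).
a = -"user_#{1}"
b = -"user_#{1}"
puts a.equal?(b) # => true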
Processing large collections requires careful memory management. Loading everything at once can overwhelm available RAM. I prefer batching approaches that work with data in manageable chunks.
class BatchProcessor
  def process_in_batches(collection, batch_size: 500, &block)
    total = 0
    collection.each_slice(batch_size) do |batch|
      process_batch(batch, &block)
      total += batch.size
      manage_memory if total % 5000 == 0
    end
  end

  def process_batch(batch, &block)
    batch.each(&block)
  end

  def manage_memory
    # Nudge the collector once resident memory crosses the threshold;
    # GC.start runs synchronously, so no extra waiting is needed.
    GC.start if memory_usage > 400_000 # RSS in KB, roughly 400 MB
  end

  def memory_usage
    `ps -o rss= -p #{Process.pid}`.to_i # resident set size in kilobytes
  end
end
processor = BatchProcessor.new
large_dataset = (1..10000).lazy.map { |i| "item_#{i}" }
processor.process_in_batches(large_dataset) do |item|
  # Process each item
  puts item.upcase
end
The lazy enumeration combined with batching ensures that only a small portion of the dataset resides in memory at any time. This approach has saved me from out-of-memory errors more times than I can count.
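The same processor handles any Enumerable, so streaming sources plug straight in. A sketch assuming a hypothetical production.log; File.foreach yields one line at a time, so only the current batch ever lives in memory:

processor = BatchProcessor.new

# File.foreach returns a line-by-line enumerator; combined with
# each_slice inside process_in_batches, only one batch of lines
# is materialized at a time.
processor.process_in_batches(File.foreach("production.log"), batch_size: 1_000) do |line|
  # parse or aggregate each line here
end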
Caching improves performance but can lead to memory exhaustion if unchecked. I implement bounded caches that enforce strict size limits and intelligent eviction policies.
class SmartCache
  def initialize(max_entries: 1000, ttl: 3600)
    @max_entries = max_entries
    @ttl = ttl
    @data = {}
    @access_times = {}
    @creation_times = {}
  end

  def [](key)
    if @data.key?(key) && !expired?(key)
      @access_times[key] = Time.now
      @data[key]
    end
  end

  def []=(key, value)
    cleanup if @data.size >= @max_entries
    @data[key] = value
    @access_times[key] = Time.now
    @creation_times[key] = Time.now
  end

  def size
    @data.size
  end

  private

  def expired?(key)
    Time.now - @creation_times[key] > @ttl
  end

  # Remove a key from all three bookkeeping hashes so none of them
  # accumulates stale entries
  def evict(key)
    @data.delete(key)
    @access_times.delete(key)
    @creation_times.delete(key)
  end

  def cleanup
    # Remove expired entries first
    @data.keys.each { |k| evict(k) if expired?(k) }
    # If still at the limit, remove least recently used entries until
    # the pending insert will fit within max_entries
    while @data.size >= @max_entries
      lru_key = @access_times.min_by { |_, time| time }.first
      evict(lru_key)
    end
  end
end
cache = SmartCache.new(max_entries: 100, ttl: 1800)
cache[:user_1] = fetch_user_data(1)   # stand-ins for app-specific loaders
cache[:config] = load_configuration
# Access pattern affects what gets evicted
100.times { |i| cache["key_#{i}"] = "value_#{i}" }
puts "Cache size: #{cache.size}" # Will be at most 100
This cache implementation considers both time-based expiration and usage patterns. It prevents the cache from growing indefinitely while keeping frequently accessed data available.
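A useful companion trick: Ruby hashes preserve insertion order, so re-inserting a key on every read makes the hash itself the recency list. A minimal LRU sketch built on that property (TinyLRU is an illustrative name, not a library class):

class TinyLRU
  def initialize(max_entries: 100)
    @max = max_entries
    @data = {} # insertion order doubles as recency order
  end

  def [](key)
    return unless @data.key?(key)
    @data[key] = @data.delete(key)   # re-insert: move key to the back
  end

  def []=(key, value)
    @data.delete(key)                # refresh position if key exists
    @data[key] = value
    @data.shift if @data.size > @max # evict the oldest (front) entry
  end
end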
Tracking object allocations helps identify memory hotspots. I regularly use allocation profiling to understand where memory pressure originates and focus optimization efforts accordingly.
class AllocationProfiler
  def profile(description)
    GC.disable # pause collection so counts reflect raw allocations
    initial_allocations = ObjectSpace.count_objects
    initial_memory = memory_usage
    result = yield
    final_allocations = ObjectSpace.count_objects
    final_memory = memory_usage
    report_allocation_changes(description, initial_allocations, final_allocations,
                              initial_memory, final_memory)
    result
  ensure
    GC.enable # re-enable even if the block raises
  end

  def memory_usage
    `ps -o rss= -p #{Process.pid}`.to_i # resident set size in kilobytes
  end

  def report_allocation_changes(description, initial, final, start_mem, end_mem)
    changes = {}
    initial.each do |type, count|
      changes[type] = final[type] - count
    end
    puts "=== #{description} ==="
    puts "Memory delta: #{end_mem - start_mem} KB"
    changes.each do |type, delta|
      puts "#{type}: #{delta}" if delta != 0
    end
    puts
  end
end
profiler = AllocationProfiler.new
# Profile a memory-intensive operation
result = profiler.profile("User serialization") do
  users = User.all.to_a
  users.map { |u| u.attributes.merge(associations: u.associations_data) }
end
This profiling technique revealed surprising insights in my projects. Sometimes, innocent-looking method calls generated thousands of temporary objects. Addressing these hotspots often yields the biggest performance improvements.
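The standard library can go one step further and attribute each allocation to a source location. A minimal sketch using the objspace extension (the memory_profiler gem wraps the same machinery with friendlier reports):

require "objspace"

# Record the file and line of every allocation inside the block.
ObjectSpace.trace_object_allocations do
  obj = "item_#{42}"
  puts "allocated at " \
       "#{ObjectSpace.allocation_sourcefile(obj)}:" \
       "#{ObjectSpace.allocation_sourceline(obj)}"
end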
Memory optimization requires ongoing attention. As applications evolve, new patterns emerge, and old optimizations may become obsolete. I make it a habit to regularly profile memory usage and watch for trends. The patterns I described form a toolkit that adapts to different scenarios. They work best when combined thoughtfully rather than applied indiscriminately.
The key is understanding your application’s specific memory characteristics. What works for a data-processing service might not suit a web API. Through careful measurement and incremental changes, you can achieve both performance and maintainability. These strategies have served me well across diverse Ruby projects, from high-traffic web applications to long-running background processors.
Every application has unique challenges, but the fundamental principles remain consistent. Monitor, measure, and optimize based on evidence rather than intuition. The patterns I shared provide a starting point for building memory-efficient Ruby applications that scale gracefully with growing demands.