7 Ruby Enumerator Patterns That Process Millions of Rows Without Crashing Your App
Learn 7 Ruby enumerator patterns that cut memory usage and speed up data processing. From lazy evaluation to custom iterators — see real code examples and use them today.
When I first started writing Ruby, I treated arrays like shopping lists. I would load everything into memory, loop through it with each, and hope for the best. That worked for small data sets, but when I faced files with millions of rows or API streams that never ended, my code crumbled. That is when I discovered enumerators. They are not magic, but they feel like it. Enumerators let you process data one piece at a time, so memory use stays low and you can stop as soon as you have what you need. Let me show you seven patterns I use every day.
Pattern 1: Use each_cons to Look at Sliding Windows
Imagine you have a list of daily temperatures and you want to find days where the temperature dropped from the previous day. You could write a loop with an index, but that gets messy. Instead, use each_cons. This method gives you consecutive slices of a collection, sliding one step at a time.
temperatures = [30, 32, 28, 29, 31, 27]
drops = []
temperatures.each_cons(2) do |prev, current|
  drops << current if current < prev
end
puts drops.inspect # => [28, 27]
I remember using this to detect sudden changes in stock prices. The code is short and clear. You read it and immediately see the pattern. No index variables, no off-by-one errors. The block yields pairs in order, and you handle each pair separately.
You can change the window size. Want to look at three days at once? Use each_cons(3). It yields arrays of three consecutive elements. This pattern works on any Enumerable, including arrays, ranges, and the enumerators returned by File.foreach or String#each_char (a String is not itself Enumerable). It is one of the most underused tools in Ruby.
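As a small sketch of the wider window, here is a three-day moving average over the same temperatures. The variable name moving_averages and the rounding to one decimal are my own choices for illustration.

```ruby
temperatures = [30, 32, 28, 29, 31, 27]

# Each window is an array of three consecutive readings.
moving_averages = temperatures.each_cons(3).map do |window|
  (window.sum / 3.0).round(1)
end

puts moving_averages.inspect # => [30.0, 29.7, 29.3, 29.0]
```

Calling each_cons without a block returns an enumerator, which is why map can be chained onto it.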
Pattern 2: Slice Data into Chunks with each_slice
Sending a thousand records to an API in one request is a bad idea. Most services have limits. You need to send them in batches. each_slice splits an enumerable into groups of a given size.
users = (1..100).to_a
users.each_slice(10) do |batch|
  send_to_api(batch)
end
I use this pattern all the time when importing CSV files. Instead of loading the entire file into memory, I read it line by line and group rows into batches. Each batch gets processed and discarded. My memory usage stays flat.
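A minimal sketch of that import loop might look like the following. It uses a StringIO as a stand-in for a real file so it runs anywhere, and it splits on commas naively, which is an assumption that breaks on quoted fields; a real import would more likely use the csv library.

```ruby
require "stringio"

# Stand-in for a large CSV file; File.foreach(path) could be used the same way.
io = StringIO.new("id,name\n1,Ann\n2,Bob\n3,Cy\n4,Di\n5,Ed\n")

io.gets # skip the header row

batch_sizes = []
io.each_line.each_slice(2) do |batch|
  # Naive parsing for illustration only: no support for quoted commas.
  rows = batch.map { |line| line.chomp.split(",") }
  batch_sizes << rows.size
end

puts batch_sizes.inspect # => [2, 2, 1]
```

Only one batch of lines is held at a time, so the size of the input does not affect peak memory much.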
You can also use each_slice to transform data. For example, to group a flat list of coordinates into pairs.
coordinates = [10.0, 20.0, 30.0, 40.0]
points = []
coordinates.each_slice(2) { |pair| points << { x: pair[0], y: pair[1] } }
puts points.inspect # => [{:x=>10.0, :y=>20.0}, {:x=>30.0, :y=>40.0}] (Ruby 3.4+ prints {x: 10.0, y: 20.0})
If the collection length is not divisible by the slice size, the last batch is smaller. That is usually fine, but if the receiving end expects fixed-size batches, you can pad the last one after slicing, for example with Array#fill.
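Here is one way padding could look. I use plain array concatenation with nil as the filler; both the filler value and the names are assumptions for the example.

```ruby
batch_size = 4
items = (1..10).to_a

padded = items.each_slice(batch_size).map do |batch|
  # Append nils until the batch is full; a full batch gets zero nils added.
  batch + [nil] * (batch_size - batch.size)
end

puts padded.inspect
# => [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, nil, nil]]
```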
Pattern 3: Build Lazy Enumerators for Infinite Data
The word “lazy” sounds bad, but in Ruby it means efficient. When you call lazy on an enumerable, it delays execution until you actually need a result. This is essential for infinite sequences or very large files.
Consider generating prime numbers. You cannot build an array of every integer from 2 to infinity, because it would never finish. Instead, use a lazy enumerator.
def primes
  (2..Float::INFINITY).lazy.select do |n|
    (2...n).none? { |d| n % d == 0 }
  end
end
first_five = primes.first(5)
puts first_five.inspect # => [2, 3, 5, 7, 11]
Without lazy, select would keep testing numbers forever, trying to build an array that never completes, and the program would hang. With lazy, Ruby evaluates only as many numbers as it takes to produce the first five primes. The rest are never computed.
I use lazy enumerators when reading log files. I can filter lines without loading the whole file.
File.foreach("log.txt").lazy.select { |line| line.include?("ERROR") }.each do |line|
  puts line
end
The file is read line by line. If I only need the first ten errors, I replace the each call with .first(10). The reading stops after finding ten matches. No wasted I/O.
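To see the early stop in action, here is a small sketch that counts how many lines are actually read. The StringIO and its contents are made up for the demonstration; File.foreach should behave the same way.

```ruby
require "stringio"

# Stand-in for a large log file.
log = StringIO.new(<<~LOG)
  INFO boot
  ERROR disk full
  INFO retry
  ERROR timeout
  INFO done
  ERROR never reached
LOG

lines_read = 0
first_two = log.each_line
               .lazy
               .map { |line| lines_read += 1; line.chomp }
               .select { |line| line.include?("ERROR") }
               .first(2)

puts first_two.inspect # => ["ERROR disk full", "ERROR timeout"]
puts lines_read        # => 4
```

Only four of the six lines are touched, because first(2) stops the chain as soon as the second match arrives.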
Pattern 4: Keep Track of Indexes with with_index
Ruby’s each gives you the element, but sometimes you need its position. You could use each_with_index, but it always counts from zero and only covers plain iteration. I prefer each.with_index because it accepts a starting offset and chains onto any enumerator, such as the one returned by map or select.
fruits = ["apple", "banana", "cherry"]
fruits.each.with_index do |fruit, i|
  puts "#{i}: #{fruit}"
end
But you can also offset the index. Suppose your data starts at year 2000, and you want the year along with each record.
data = [10, 20, 30]
data.each.with_index(2000) do |value, year|
  puts "Year #{year}: #{value}"
end
# Year 2000: 10
# Year 2001: 20
# Year 2002: 30
I once had to process a list of timestamps and output a relative number. Using with_index with a starting offset made the code self-documenting.
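Because with_index is a method on Enumerator, it is not limited to each. A short sketch with map, where the 1-based numbering is my own choice for the example:

```ruby
fruits = ["apple", "banana", "cherry"]

# map without a block returns an enumerator, so with_index can attach to it.
numbered = fruits.map.with_index(1) { |fruit, position| "#{position}. #{fruit}" }

puts numbered.inspect # => ["1. apple", "2. banana", "3. cherry"]
```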
Pattern 5: Create Custom Enumerators with Enumerator.new
Sometimes the built-in enumeration methods are not enough. You need to define your own iteration logic. That is when you write a custom enumerator using Enumerator.new.
fibonacci = Enumerator.new do |yielder|
  a = 0
  b = 1
  loop do
    yielder << a
    a, b = b, a + b
  end
end
puts fibonacci.first(10).inspect # => [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
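The yielder object receives values one at a time. The loop never ends on its own, but that is safe because the block only runs as far as the consumer asks: first(10) stops it after ten values. When you ask for the next element, the enumerator yields the next Fibonacci number. This pattern gives you full control over state.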
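A custom enumerator can be consumed in two ways, sketched here: pulling values manually with next and peek, or chaining lazy operations onto it. The even-number filter is just an illustration.

```ruby
fibonacci = Enumerator.new do |yielder|
  a, b = 0, 1
  loop do
    yielder << a
    a, b = b, a + b
  end
end

# External iteration: pull values one at a time.
first_value  = fibonacci.next # => 0
second_value = fibonacci.next # => 1
upcoming     = fibonacci.peek # => 1 (peek does not advance)

# Internal iteration starts from the beginning, independent of next.
even_fibs = fibonacci.lazy.select(&:even?).first(5)
puts even_fibs.inspect # => [0, 2, 8, 34, 144]
```

Without lazy, select would never return here, since the underlying sequence is infinite.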
I use custom enumerators for paginating through API responses. The enumerator wraps the HTTP requests and yields each page.
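require "http" # assumes the http gem (http.rb), which provides HTTP.get

def api_paginator(url)
  Enumerator.new do |yielder|
    page = 1
    loop do
      response = HTTP.get(url, params: { page: page })
      body = response.body.to_s # read the body once, as a String
      break if body.empty?      # an empty body marks the end of the pages
      yielder << body
      page += 1
    end
  end
end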
api_paginator("https://api.example.com/items").each do |page|
  process(page)
end
The enumerator hides the pagination logic. The consumer just iterates.
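To try the idea without a network, here is a sketch with a hypothetical in-memory fetcher in place of the HTTP call. The names paginate, pages, and fetcher are invented for the example, and this version yields individual items rather than whole pages.

```ruby
# Hypothetical paginator: fetch_page is any callable that returns an array of items.
def paginate(fetch_page)
  Enumerator.new do |yielder|
    page = 1
    loop do
      items = fetch_page.call(page)
      break if items.empty?                 # no items means no more pages
      items.each { |item| yielder << item } # yield items, not pages
      page += 1
    end
  end
end

# Fake "API": page number => items. A real fetcher would make an HTTP request.
pages = { 1 => %w[a b], 2 => %w[c d], 3 => %w[e] }
requests = []
fetcher = lambda do |page|
  requests << page
  pages.fetch(page, [])
end

first_three = paginate(fetcher).first(3)
puts first_three.inspect # => ["a", "b", "c"]
puts requests.inspect    # => [1, 2]
```

Page 3 is never requested, because first(3) stops the enumerator as soon as it has enough items.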
Pattern 6: Merge Multiple Enumerators with zip
You often have two or more lists that correspond by index. For example, names and ages. You want to process them together. zip combines enumerables into a series of arrays.
names = ["Alice", "Bob", "Carol"]
ages = [30, 25, 28]
names.zip(ages) do |name, age|
  puts "#{name} is #{age} years old."
end
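zip always produces one entry per element of the receiver. If an argument is shorter, the missing values are filled with nil; if it is longer, the extra values are dropped. Keep that in mind when aligning time series data that may have missing points.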
zip also accepts a block. Without a block, it returns an array of arrays. With a block, it yields each combined group and returns nil.
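A quick sketch of how zip treats lists of unequal length, using made-up ages:

```ruby
names = ["Alice", "Bob", "Carol"]

# A shorter argument is padded with nil.
short = names.zip([30, 25])
puts short.inspect # => [["Alice", 30], ["Bob", 25], ["Carol", nil]]

# A longer argument is truncated to the receiver's length.
long = names.zip([30, 25, 28, 41])
puts long.inspect # => [["Alice", 30], ["Bob", 25], ["Carol", 28]]

# With a block, zip yields each group and returns nil.
result = names.zip([30, 25, 28]) { |name, age| name.length + age }
puts result.inspect # => nil
```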
I once had to merge two CSV files by row number. zip saved me from writing nested loops.
file1 = File.foreach("data1.csv")
file2 = File.foreach("data2.csv")
file1.lazy.zip(file2).each do |line1, line2|
  # line2 is nil once data2.csv runs out of rows, so to_s guards the chomp
  merged = line1.chomp + "," + line2.to_s.chomp
  puts merged
end
Lazy loading keeps memory low. Each line pair is processed and forgotten.
Pattern 7: Filter and Extract with grep and grep_v
Ruby’s grep uses the === operator (case equality) to filter elements. This is incredibly powerful because you can use regex, ranges, classes, or even lambdas.
codes = ["abc123", "def456", "ghi789", "ABC999"]
codes.grep(/abc/i) { |match| puts match }
# abc123
# ABC999
grep returns a new array of matches. If you give it a block, each match is passed to the block and the block's results are collected instead. To get the non-matches, use grep_v.
codes.grep_v(/abc/i) { |non_match| puts non_match }
# def456
# ghi789
I use grep to filter log lines by severity. Each call scans the file once and returns an array of the matching lines.
warnings = File.foreach("server.log").grep(/WARN/)
errors = File.foreach("server.log").grep(/ERROR/)
You can also use grep with a range to find numbers within a band.
scores = [45, 67, 89, 12, 99]
high_scores = scores.grep(80..100)
puts high_scores.inspect # => [89, 99]
The range 80..100 uses === to test membership. It is cleaner than writing a block.
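The same trick works with classes and lambdas, since both define ===. A small sketch with a made-up mixed list:

```ruby
mixed = [1, "two", 3.0, :four, 5, nil]

# Class#=== checks whether the element is an instance of the class.
integers = mixed.grep(Integer)
puts integers.inspect # => [1, 5]

# Proc#=== calls the lambda with the element.
big = ->(n) { n.is_a?(Numeric) && n > 2 }
big_numbers = mixed.grep(big)
puts big_numbers.inspect # => [3.0, 5]

# grep_v keeps everything that does not match.
non_numbers = mixed.grep_v(Numeric)
puts non_numbers.inspect # => ["two", :four, nil]
```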
Bringing It All Together
These seven patterns are the building blocks of efficient data processing in Ruby. I use them every week, sometimes every day. The key is to think about data as a stream, not a pile. When you stop loading everything into memory and start processing one element at a time, your programs become leaner and faster.
Start with each_slice for batch operations. Add lazy when dealing with large files or infinite sequences. Use with_index to keep track of positions without extra bookkeeping. Build custom enumerators when you need to wrap external resources. Merge streams with zip to combine related data. And use grep to filter without writing explicit conditionals.
The beauty of Ruby enumerators is that they compose. You can chain them. For example, you can read a file lazily, grep for errors, slice the results into batches, and process each batch with a custom enumerator. That is a pipeline, and it runs on a tiny memory footprint.
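A rough sketch of such a pipeline, again with a StringIO standing in for the file and a batch size of two picked for the example:

```ruby
require "stringio"

# Stand-in for a big log file; File.foreach("server.log") would slot in the same way.
log = StringIO.new(<<~LOG)
  INFO start
  ERROR a
  ERROR b
  WARN slow
  ERROR c
  INFO end
LOG

batches = []
log.each_line
   .lazy
   .grep(/ERROR/)
   .map(&:chomp)
   .each_slice(2) { |batch| batches << batch }

puts batches.inspect # => [["ERROR a", "ERROR b"], ["ERROR c"]]
```

Here the batches are collected into an array only to show the result. In a real pipeline each batch would be handed off and dropped.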
Do not be afraid to write code that feels a little different at first. The first time I used Enumerator.new I thought I broke something. But after I saw my program handle a 2GB CSV file without crashing, I was convinced. Try one pattern on your next project. Pick the one that matches your problem. Start simple, then add more as you get comfortable. You will wonder how you ever lived without them.