reduce vs each_with_object: Pick the Right Accumulator
Here are two methods that build the same hash:
# Count word frequencies
words.reduce(Hash.new(0)) { |counts, word| counts[word] += 1; counts }
words.each_with_object(Hash.new(0)) { |word, counts| counts[word] += 1 }
They produce identical output. They read almost identically. And one of them has a trailing ; counts that looks like a typo but is the entire reason the method works. That dangling word is the whole story of when to reach for reduce and when to reach for each_with_object.
What reduce actually does
reduce (also spelled inject; the two names are aliases for the same method) folds a collection into one value by threading an accumulator through every iteration. The block returns the next accumulator:
[1, 2, 3, 4].reduce(0) { |sum, n| sum + n }
# => 10
Read the data flow carefully. On each iteration, reduce hands the block the current accumulator and the current element. Whatever the block returns becomes the accumulator for the next iteration. The return value is the contract. sum + n produces a new number, that number gets passed in as sum next time, and the final block return value is what reduce gives back.
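To make that data flow concrete, here is a minimal sketch of the contract in plain Ruby. my_reduce and trace_sum are hypothetical helpers written for illustration, not how Ruby implements reduce internally, and the sketch only covers the form with an explicit seed.

```ruby
# Hypothetical helper: a simplified reduce that requires a seed.
def my_reduce(collection, seed)
  acc = seed
  collection.each do |element|
    # Whatever the block returns replaces the accumulator.
    acc = yield(acc, element)
  end
  acc
end

# Record each (accumulator, element) pair the block receives.
def trace_sum(numbers)
  steps = []
  total = my_reduce(numbers, 0) do |sum, n|
    steps << [sum, n]
    sum + n
  end
  [total, steps]
end

total, steps = trace_sum([1, 2, 3, 4])
p total  # 10
p steps  # [[0, 1], [1, 2], [3, 3], [6, 4]]
```

Each pair in steps starts with the value the previous block call returned, which is the whole contract in one picture.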
For numbers this is effortless, because sum + n is an expression that naturally evaluates to the thing you want to carry forward. The trouble starts the moment your accumulator is a mutable object you are filling in place.
The sharp edge: mutating accumulators
Go back to the word counter written with reduce:
words.reduce(Hash.new(0)) { |counts, word| counts[word] += 1 }
This looks right. It is wrong. counts[word] += 1 is an assignment expression, and in Ruby an assignment evaluates to the value assigned, not the container. So the block returns the new count, an integer, and reduce faithfully passes that integer in as counts on the next iteration. The second iteration then calls [] on an Integer. Integer does define [], but as a bit reference that expects an integer index, so handing it a string raises:
words = %w[a b a]
words.reduce(Hash.new(0)) { |counts, word| counts[word] += 1 }
# raises TypeError: Integer#[] is a bit reference and rejects the String "b"
The fix is to make the block return the hash:
words.reduce(Hash.new(0)) { |counts, word| counts[word] += 1; counts }
That trailing counts is doing real work. It throws away the integer the assignment produced and returns the accumulator instead. Forget it and the method explodes; remember it and you have a line that ends in a bare variable for reasons that are invisible to anyone reading quickly.
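This is the single most common reduce bug. It shows up whenever the mutating expression evaluates to something other than the object being built. Element assignment always does. << happens to return its receiver, so the array and string versions would survive without the trailing accumulator, but only until someone swaps in a call that returns something else:
# []= evaluates to the assigned value, so this needs the trailing accumulator
keys.reduce({}) { |h, k| h[k] = fetch(k); h }
# << returns its receiver; the trailing accumulator here is insurance, not a requirement
items.reduce([]) { |list, x| list << transform(x); list }
lines.reduce("") { |buf, line| buf << line.strip; buf }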
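Here is a small sketch of the failure and the fix side by side. The method names are made up for this example, and the rescue catches StandardError broadly because the exact exception class depends on what the accumulator has turned into.

```ruby
# Hypothetical names, for illustration only.

# Buggy: the block returns the Integer from `+= 1`, so the hash is lost
# after the first iteration.
def count_words_buggy(words)
  words.reduce(Hash.new(0)) { |counts, word| counts[word] += 1 }
end

# Fixed: the block's last expression is the hash itself.
def count_words_fixed(words)
  words.reduce(Hash.new(0)) do |counts, word|
    counts[word] += 1
    counts
  end
end

words = %w[a b a]

begin
  count_words_buggy(words)
rescue StandardError => e
  puts "buggy version raised #{e.class}"
end

p count_words_fixed(words) == { "a" => 2, "b" => 1 }  # true

# With a single element there is no second iteration to blow up,
# so the buggy version quietly returns an Integer instead of a Hash.
p count_words_buggy(%w[a])  # 1
```

The single-element case is the nastier one: nothing raises, and the wrong type escapes into the rest of the program.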
each_with_object flips the contract
each_with_object exists precisely for this case. It threads a mutable object through the iteration too, but it ignores the block’s return value and always carries the same object forward:
words.each_with_object(Hash.new(0)) { |word, counts| counts[word] += 1 }
# => {"a"=>2, "b"=>1}
No trailing counts. The block can return whatever it likes, the integer from += 1 included, and each_with_object does not care. It owns the accumulator and hands you back the same one every time. When the loop ends it returns that object.
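The same contract can be sketched in a few lines. my_each_with_object is a hypothetical helper for illustration, not Ruby's actual implementation; the point is that the block's return value is never assigned to anything.

```ruby
# Hypothetical helper that mimics each_with_object's contract.
def my_each_with_object(collection, memo)
  collection.each do |element|
    # The block's return value is discarded on purpose.
    yield(element, memo)
  end
  memo
end

seed = Hash.new(0)
result = %w[a b a].each_with_object(seed) do |word, counts|
  counts[word] += 1
  nil # returning nil is harmless; the return value is ignored
end

p result.equal?(seed)               # true: the very same object comes back
p result == { "a" => 2, "b" => 1 }  # true

sketch = my_each_with_object(%w[a b a], Hash.new(0)) { |word, counts| counts[word] += 1 }
p sketch == result                  # true
```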
Two more things are different, and they trip people up when switching between the methods:
- The block arguments are reversed. reduce yields (accumulator, element). each_with_object yields (element, accumulator). The object you are building comes second in each_with_object. Swap them by habit and you get confusing errors.
- The seed is mandatory, and it is passed in as an argument rather than produced by the block. each_with_object([]) passes the empty array in; there is no zero-argument form, because without an object to thread there is nothing to do.
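Both differences are easy to check by capturing what each method yields. This sketch uses throwaway symbols as data, and yielded_pairs is a hypothetical helper name.

```ruby
# Hypothetical helper: capture the argument pairs each method yields.
def yielded_pairs
  from_reduce = []
  [:x, :y].reduce(:acc) do |acc, element|
    from_reduce << [acc, element]
    acc # keep the accumulator unchanged so the pairs are easy to compare
  end

  from_each_with_object = []
  [:x, :y].each_with_object(:acc) do |element, acc|
    from_each_with_object << [element, acc]
  end

  [from_reduce, from_each_with_object]
end

from_reduce, from_each_with_object = yielded_pairs
p from_reduce            # [[:acc, :x], [:acc, :y]]
p from_each_with_object  # [[:x, :acc], [:y, :acc]]

# Omitting the seed is an error, not a fallback to the first element.
begin
  [1, 2].each_with_object { |n, memo| memo << n }
rescue ArgumentError => e
  puts "ArgumentError: #{e.message}"
end
```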
A rule that actually decides
The two methods split cleanly along one question: does each step produce a new value, or mutate an existing one?
- New value each step → reduce. Sums, products, boolean folds, building up an immutable result with merge or +. The block’s return value is the next state, which is exactly what reduce wants.
configs.reduce({}) { |merged, c| merged.merge(c) } # merge returns a new hash
flags.reduce(true) { |all, f| all && f.valid? }
- Mutate one object across steps → each_with_object. Filling a hash, pushing onto an array, appending to a string buffer. You want the same object every time and you do not want to remember to return it.
records.each_with_object({}) { |r, by_id| by_id[r.id] = r }
Put plainly: if you find yourself writing ; acc at the end of a reduce block to return the accumulator, that is the language telling you the job belongs to each_with_object. The trailing variable is a code smell with a specific cure.
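One way to see the split is to check object identity. In this sketch, with made-up config hashes, the merge version produces a fresh hash on every step while the each_with_object version hands back the seed it was given.

```ruby
# Made-up data, for illustration only.
configs = [{ host: "localhost" }, { port: 5432 }, { host: "db.internal" }]

# New value each step: merge returns a new hash, so reduce fits.
seed_a = {}
merged = configs.reduce(seed_a) { |acc, c| acc.merge(c) }

# Mutate one object: merge! changes the hash in place.
seed_b = {}
updated = configs.each_with_object(seed_b) { |c, acc| acc.merge!(c) }

p merged == updated       # true: same contents
p merged.equal?(seed_a)   # false: reduce returned a hash built by merge
p seed_a.empty?           # true: the seed was never touched
p updated.equal?(seed_b)  # true: the seed itself was filled in
```

Neither version is wrong. The merge version allocates an intermediate hash per step, which rarely matters for small inputs; the choice is mostly about which contract matches what the block naturally returns.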
Where group_by and friends fit
Before reaching for either, check whether a more specific Enumerable method already does the fold. A surprising amount of reduce/each_with_object code is reinventing something that ships with Ruby:
# each_with_object, the long way
people.each_with_object(Hash.new { |h, k| h[k] = [] }) { |p, h| h[p.team] << p }
# the same thing
people.group_by(&:team)
# building a lookup table
users.each_with_object({}) { |u, h| h[u.id] = u }
# Ruby 2.6+ (to_h with a block)
users.to_h { |u| [u.id, u] }
# or, when keyed by one attribute
users.index_by(&:id) # Rails, ActiveSupport
# summing a field
orders.reduce(0) { |sum, o| sum + o.total }
# clearer
orders.sum(&:total)
group_by, to_h, tally, sum, partition, min_by, and max_by are all specialized folds. They read better than a hand-rolled accumulator and they are harder to get wrong because there is no accumulator to mismanage. tally (Ruby 2.7+) in particular makes the word counter a one-liner:
words.tally
# => {"a"=>2, "b"=>1}
Reach for reduce or each_with_object when the built-in folds do not fit, not before.
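As a sanity check, the hand-rolled folds and the built-ins can be compared directly. The Person and Order structs and their data are invented for this sketch; to_h with a block needs Ruby 2.6 or later, and tally needs Ruby 2.7 or later.

```ruby
# Hypothetical records, for illustration only.
Person = Struct.new(:id, :team)
Order  = Struct.new(:total)

PEOPLE = [Person.new(1, :red), Person.new(2, :blue), Person.new(3, :red)]
ORDERS = [Order.new(10), Order.new(25)]
WORDS  = %w[a b a]

# Hand-rolled versions of four common folds.
by_team_manual = PEOPLE.each_with_object(Hash.new { |h, k| h[k] = [] }) do |person, h|
  h[person.team] << person
end
by_id_manual  = PEOPLE.each_with_object({}) { |person, h| h[person.id] = person }
total_manual  = ORDERS.reduce(0) { |sum, order| sum + order.total }
counts_manual = WORDS.each_with_object(Hash.new(0)) { |word, c| c[word] += 1 }

# Each built-in produces the same result with no accumulator to manage.
p by_team_manual == PEOPLE.group_by(&:team)                  # true
p by_id_manual == PEOPLE.to_h { |person| [person.id, person] }  # true
p total_manual == ORDERS.sum(&:total)                        # true
p counts_manual == WORDS.tally                               # true
```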
The takeaway
reduce and each_with_object are the same idea with opposite ergonomics. reduce takes the block’s return value as the next accumulator, which is perfect when each step computes a fresh value and a trap when each step mutates a shared one. each_with_object ignores the return value and threads one object through, which is perfect for the mutate-in-place case and pointless for a running total.
Decide by asking whether the step returns or mutates. If you are reaching for a trailing ; acc, you picked the wrong tool. And before either one, check that group_by, to_h, tally, or sum is not already the answer you are building by hand.