Hash Defaults: The Underused Feature That Replaces Half Your Boilerplate

Here’s a counter that almost every Ruby developer writes the same way:

counts = {}

words.each do |word|
  counts[word] ||= 0
  counts[word] += 1
end

It works. It also leaks an implementation detail, the fact that missing keys return nil, into the business logic. The ||= 0 is noise. It has nothing to do with counting words; it only defends against nil + 1 blowing up.

Ruby has a built-in fix that most code never uses:

counts = Hash.new(0)

words.each do |word|
  counts[word] += 1
end

Hash.new(0) creates a hash where every missing key returns 0 instead of nil. The ||= disappears. The intent of the code, count occurrences, is now the only thing on screen.

This is the simplest example of a feature that runs much deeper than most people realize.

Three ways to construct a hash

Most Ruby code uses the literal form:

h = {}
h = { name: "Alice", age: 30 }

A literal hash has no default. Missing keys return nil. That’s fine for most cases, but it’s not the only option.

Hash.new accepts either a default value or a default block:

# Default value: the SAME object is returned for every missing key
zeros = Hash.new(0)
zeros[:anything]   # => 0
zeros[:else]       # => 0

# Default block: called every time a missing key is accessed
arrays = Hash.new { |hash, key| hash[key] = [] }
arrays[:a] << 1
arrays[:a] << 2
arrays[:b] << 3
arrays  # => { a: [1, 2], b: [3] }

The distinction between these two forms is subtle and matters a lot. We’ll get to it.

The default value form

Hash.new(default) returns default for any missing key without modifying the hash:

h = Hash.new("missing")
h[:a]            # => "missing"
h.key?(:a)       # => false
h                # => {} (still empty)

The hash is unchanged. The default is just the value [] returns for absent keys. This makes it a good fit for patterns that read a fallback and then overwrite it:

# Word frequencies
counts = Hash.new(0)
words.each { |w| counts[w] += 1 }
# counts[w] += 1 expands to counts[w] = counts[w] + 1
# The READ returns 0 for missing keys, the WRITE actually stores the result

Note what’s happening in that loop. counts[w] += 1 is shorthand for counts[w] = counts[w] + 1. The right-hand side reads from the hash and gets 0 (the default). The left-hand side writes the new value. The default value is never stored under a key; it is only returned when reading.

This is the right form when:

The default is an immutable value (numbers, symbols, frozen strings, true/false/nil)
You want missing-key reads to return something sensible without mutating the hash
You’re going to overwrite the value on write anyway (like with +=)
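
As a runnable sketch of the counting pattern, the snippet below wraps it in a hypothetical word_counts helper (the method name and the sample sentence are assumptions made for illustration) and checks that reading a missing key leaves the hash untouched:

```ruby
# Hypothetical helper, not part of any library: counts occurrences
# using a numeric default value.
def word_counts(words)
  counts = Hash.new(0)
  words.each { |w| counts[w] += 1 }  # the read yields 0 for new words, the write stores the sum
  counts
end

counts = word_counts(%w[to be or not to be])

counts["to"]          # => 2
counts["or"]          # => 1
counts["maybe"]       # => 0 (served by the default)
counts.key?("maybe")  # => false (the read stored nothing)
counts.size           # => 4
```

Only the four words that were actually counted end up as keys; the lookup of "maybe" is answered by the default and leaves no trace.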

The default value trap

Here’s where people get burned:

groups = Hash.new([])

groups[:fruits] << "apple"
groups[:fruits] << "banana"
groups[:vegetables] << "carrot"

groups[:fruits]      # => ["apple", "banana", "carrot"]
groups[:vegetables]  # => ["apple", "banana", "carrot"]
groups               # => {} (still empty!)

Every key returned the SAME array. There’s only one default object, and << mutates it in place. Worse, the hash itself is empty, because << doesn’t trigger an assignment. The default array was never stored under any key; it just kept getting mutated.

The rule: never use a mutable object as a default value. If you want each missing key to get its own fresh container, use a block.

The default block form

Hash.new { |hash, key| ... } calls the block every time a missing key is accessed. The block receives the hash and the key, and you decide what happens:

groups = Hash.new { |hash, key| hash[key] = [] }

groups[:fruits] << "apple"
groups[:fruits] << "banana"
groups[:vegetables] << "carrot"

groups
# => { fruits: ["apple", "banana"], vegetables: ["carrot"] }

Now each key gets its own array, stored in the hash on first access. This pattern, sometimes called “auto-vivification”, replaces a remarkable amount of initialization code.
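
To see the two forms side by side, here is a small sketch that groups words by their first letter. The helper names group_shared and group_fresh and the sample words are assumptions made for this example:

```ruby
# Buggy variant: every missing key hands back the one array passed to Hash.new.
def group_shared(words)
  groups = Hash.new([])
  words.each { |w| groups[w[0]] << w }
  groups
end

# Working variant: the block builds and stores a new array per missing key.
def group_fresh(words)
  groups = Hash.new { |hash, key| hash[key] = [] }
  words.each { |w| groups[w[0]] << w }
  groups
end

words = %w[apple avocado banana]

buggy = group_shared(words)
buggy.keys     # => [] (nothing was ever stored under a key)
buggy.default  # => ["apple", "avocado", "banana"] (the single shared array)

fixed = group_fresh(words)
fixed["a"]     # => ["apple", "avocado"]
fixed["b"]     # => ["banana"]
fixed.keys     # => ["a", "b"]
```

The only difference between the two methods is the argument to Hash.new, which is why the bug is easy to miss in review.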

Compare the alternatives:

# Without a default block
groups = {}
groups[:fruits] ||= []
groups[:fruits] << "apple"
groups[:fruits] ||= []
groups[:fruits] << "banana"

# With a parenthesized ||=
groups = {}
(groups[:fruits] ||= []) << "apple"
(groups[:fruits] ||= []) << "banana"

# With a default block
groups = Hash.new { |h, k| h[k] = [] }
groups[:fruits] << "apple"
groups[:fruits] << "banana"

The default block is the cleanest of the three because it pushes the initialization logic into the hash itself. Any code that reads the hash through [] gets the behavior for free.

Recursive defaults

The default block is just Ruby code, so it can do anything, including create another hash with the same default:

def deep_hash
  Hash.new { |h, k| h[k] = deep_hash }
end

tree = deep_hash
tree[:users][:alice][:role] = "admin"
tree[:users][:bob][:role] = "viewer"

tree
# => {
#   users: {
#     alice: { role: "admin" },
#     bob: { role: "viewer" }
#   }
# }

No initialization. No nil checks. Just assign down any path and the intermediate hashes appear automatically.

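This is great for building up nested structures, but be careful when reading. tree[:nonexistent][:key] will create empty hashes on every access, slowly populating your hash with junk. dig does not help here: it looks keys up the same way [] does, so it runs the default block too. For reads, use fetch, which ignores defaults:

tree.fetch(:users, {}).fetch(:alice, {}).fetch(:role, nil)    # => "admin"
tree.fetch(:users, {}).fetch(:charlie, {}).fetch(:role, nil)  # => nil (no junk created)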
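
A read that is guaranteed not to run the default block can be sketched with key? and fetch, neither of which consults the default. The peek helper below is hypothetical (the name and signature are assumptions for this example), and deep_hash is repeated so the snippet runs on its own:

```ruby
def deep_hash
  Hash.new { |h, k| h[k] = deep_hash }
end

# Hypothetical helper: walks the path using key? and fetch only,
# so no default block runs and nothing is created along the way.
def peek(hash, *path)
  path.reduce(hash) do |node, key|
    return nil unless node.is_a?(Hash) && node.key?(key)
    node.fetch(key)
  end
end

tree = deep_hash
tree[:users][:alice][:role] = "admin"

peek(tree, :users, :alice, :role)    # => "admin"
peek(tree, :users, :charlie, :role)  # => nil
tree[:users].key?(:charlie)          # => false (nothing was created)

tree[:users][:charlie][:role]        # => {} (a plain [] read vivifies the path)
tree[:users].key?(:charlie)          # => true
```

The last two lines show the contrast: a single exploratory read through [] is enough to leave a :charlie entry behind.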

Default blocks that compute, not store

The block doesn’t have to store an empty container. It can store a computed value (memoization) or return a fallback without storing anything:

# Fibonacci with memoization, in three lines
fib = Hash.new do |h, n|
  h[n] = n < 2 ? n : h[n - 1] + h[n - 2]
end

fib[50]  # => 12586269025 (instant, computed once per index)

# Configuration with environment fallback
config = Hash.new do |_, key|
  ENV.fetch("APP_#{key.to_s.upcase}", nil)
end

config[:database_url] = "postgres://localhost/dev"

config[:database_url]  # => "postgres://localhost/dev"
config[:redis_url]     # => ENV["APP_REDIS_URL"], or nil

The block is just a function from (hash, key) to a value. It can do whatever you want.
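
The difference between a block that only computes and one that stores its result shows up in how often the block runs. In this sketch, the two builder methods and the log array they record into are assumptions made purely to make the calls visible:

```ruby
# Computes on every miss and stores nothing.
def computing_squares(log)
  Hash.new do |_hash, n|
    log << n  # record that the block ran for this key
    n * n
  end
end

# Computes once per key, then serves the stored value.
def memoized_squares(log)
  Hash.new do |hash, n|
    log << n
    hash[n] = n * n
  end
end

computed_log = []
squares = computing_squares(computed_log)
squares[4]        # => 16
squares[4]        # => 16
computed_log      # => [4, 4] (the block ran on both reads)
squares.key?(4)   # => false

memo_log = []
cached = memoized_squares(memo_log)
cached[4]         # => 16
cached[4]         # => 16
memo_log          # => [4] (the second read hit the stored value)
cached.key?(4)    # => true
```

Leaving out the assignment suits cheap or changing fallbacks such as ENV lookups; adding it suits expensive computations whose result won’t change.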

What `default` and `default_proc` actually are

Every hash has two attributes that control missing-key behavior: default and default_proc. You can read or change them at runtime:

h = {}
h.default          # => nil
h.default_proc     # => nil

h.default = 0
h[:missing]        # => 0

h.default_proc = ->(hash, key) { hash[key] = [] }
h[:list] << "x"    # => ["x"]
h                  # => { list: ["x"] }

A hash carries one or the other, never both: assigning default= clears any default_proc, and assigning default_proc= clears any default value. Because both are writable, you can:

Add a default to an existing hash you didn’t create
Strip a default by setting it back to nil
Inspect what default a hash has, useful for debugging weird “where did this empty array come from” bugs

def safely(hash)
  copy = hash.dup
  copy.default = nil      # remove any default value
  copy.default_proc = nil # remove any default block
  copy
end

Useful when you’re handed a hash from somewhere else and want predictable nil-on-missing behavior.
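
For the debugging case, a small inspection helper can report which kind of default a hash currently carries. The default_kind method below is hypothetical (the name and the returned symbols are assumptions for this sketch); it also shows what each attribute reports after the other one is assigned:

```ruby
# Hypothetical helper: reports which kind of default a hash currently carries.
def default_kind(hash)
  if hash.default_proc
    :block
  elsif hash.default.nil?
    :none
  else
    :value
  end
end

h = Hash.new(0)
default_kind(h)   # => :value

h.default_proc = ->(hash, key) { hash[key] = [] }
default_kind(h)   # => :block
h.default         # => nil (the 0 was replaced by the block)

h.default = 0
default_kind(h)   # => :value
h.default_proc    # => nil (the block was replaced by the value)

default_kind({})  # => :none
```

Note that a hash built with Hash.new(nil) is reported as :none, since it behaves like a plain literal.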

Defaults don’t survive serialization

This is the gotcha that bites people in production:

require "json"

counts = Hash.new(0)
counts[:a] = 5

counts[:nonexistent]  # => 0

# Round-trip through JSON
restored = JSON.parse(counts.to_json, symbolize_names: true)
restored[:nonexistent]  # => nil (default is gone!)

to_json and YAML.dump write out only the key-value pairs, so the restored hash is a plain hash with no default behavior. Marshal is a different trap: it keeps a default value, but Marshal.dump raises a TypeError for a hash that has a default block.

This matters when:

Hashes get cached in Redis or memcached
Hashes are sent over the wire as JSON
Hashes are passed across Sidekiq job boundaries
Hashes are stored in serialize columns in ActiveRecord

If your code relies on a default, set it again after deserialization. Or, better, don’t rely on a default at boundaries. Use fetch with an explicit fallback:

counts.fetch(:key, 0)
counts.fetch(:key) { 0 }

The fetch form is portable. It doesn’t care whether the hash has a default or not.
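
If the code on the receiving side really does want the default back, one option is to reinstate it right where the data is parsed. This is a minimal sketch, assuming string keys and a hypothetical load_counts helper:

```ruby
require "json"

# Hypothetical helper: parses JSON counts and reinstates the numeric default.
def load_counts(json)
  counts = JSON.parse(json)  # a plain Hash with string keys and no default
  counts.default = 0
  counts
end

original = Hash.new(0)
original["a"] = 5

restored = load_counts(original.to_json)
restored["a"]         # => 5
restored["missing"]   # => 0 (the default is back)
restored["b"] += 1
restored["b"]         # => 1
```

Keeping the default next to the parsing code means callers never see the intermediate hash that lacks it.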

When to reach for a default

Use a default value when:

You’re counting, summing, or accumulating into a numeric default
You want missing-key reads to return a sentinel like false or an empty frozen string
You’re building a read-mostly lookup with a sensible “not found” answer

Use a default block when:

You’re grouping into arrays, sets, or other mutable containers
You’re memoizing the result of an expensive computation
You’re building nested structures and want them to vivify on access
You want missing-key reads to fall back to ENV, config, a database, anything dynamic

Don’t use a default when:

The hash will cross a serialization boundary (JSON, YAML, a cache, a job queue) and the code on the other side expects the default to still be there
You’d rather know about missing keys (use fetch instead)
The default would be a mutable object passed as a value (use the block form, unless you really do want one object shared across all keys)
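
For the case where a missing key should be loud rather than silent, fetch works even on a hash that has a default, because it never consults it. A minimal sketch, with a hypothetical require_key helper and made-up limit names:

```ruby
# Hypothetical helper: returns the stored value, or raises with a clearer message.
# The block passed to fetch runs only when the key is absent, default or not.
def require_key(hash, key)
  hash.fetch(key) { raise KeyError, "no value stored for #{key.inspect}" }
end

limits = Hash.new(10)
limits[:uploads] = 3

limits[:downloads]             # => 10 (the default hides the missing key)
require_key(limits, :uploads)  # => 3

begin
  require_key(limits, :downloads)
rescue KeyError => e
  e.message                    # => "no value stored for :downloads"
end
```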

The takeaway

Hash defaults aren’t a niche feature. They’re a small piece of Ruby that, used correctly, removes a lot of the boilerplate that accumulates around hash access. The ||= [] pattern, the counts[w] = (counts[w] || 0) + 1 pattern, the manual nested-hash initialization, all of it can be replaced by setting the right default once at construction time.

Most Ruby code doesn’t use them. That’s a habit, not a rule. Next time you write ||= against a hash, ask whether the hash itself could carry that default for you. The answer is usually yes.