tap, then, and yield_self: Ruby's Pipeline Methods

Ruby has three small methods that don’t get nearly enough attention: tap, then, and yield_self. They’re each one line of implementation, they’ve been in the language for years, and they solve a common problem: how do you build a clean chain of operations without littering your code with temporary variables?

tap: do something, return the original

tap yields the object to a block, then returns that same object. Whatever the block returns is ignored. The block can still mutate the object, but it cannot replace it with something else.

user = User.new(name: "Alice")
  .tap { |u| puts "Created user: #{u.name}" }
  .tap { |u| Rails.logger.info("New user: #{u.name}") }

The user object flows through both tap calls untouched. The blocks execute their side effects (logging, printing), but the return value of each block is thrown away. You get the original object back every time.
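A minimal sketch of that contract, using a plain string so it runs without the User class (the variable names here are just for illustration):

```ruby
original = String.new("alice")

# The block returns an Integer, but tap discards it.
result = original.tap { |s| s.length }

result                  # => "alice"
result.equal?(original) # => true, the very same object

# tap hands over the object itself, not a copy, so mutation
# inside the block is still visible afterwards.
original.tap { |s| s << "!" }
original # => "alice!"
```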

This makes tap perfect for debugging method chains. When something in a chain produces unexpected results, insert a tap to inspect the intermediate value:

orders
  .select { |o| o.status == "completed" }
  .tap { |result| pp result }  # what does this look like?
  .group_by(&:customer_id)
  .tap { |grouped| pp grouped }  # and this?
  .transform_values { |v| v.sum(&:total) }

No temporary variables. No restructuring the chain. Just .tap { |x| pp x } wherever you need to peek. When you’re done debugging, delete the tap. The chain’s behavior doesn’t change at all.
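To check that claim, here is a self-contained sketch of the same chain. DebugOrder and its sample data are made up for the example; the point is that the result is identical with and without the tap calls.

```ruby
# Hypothetical stand-in for the order records used above.
DebugOrder = Struct.new(:customer_id, :status, :total)

orders = [
  DebugOrder.new(1, "completed", 20),
  DebugOrder.new(1, "completed", 5),
  DebugOrder.new(2, "pending", 99),
  DebugOrder.new(2, "completed", 10)
]

with_taps = orders
  .select { |o| o.status == "completed" }
  .tap { |result| pp result.size }
  .group_by(&:customer_id)
  .tap { |grouped| pp grouped.keys }
  .transform_values { |v| v.sum(&:total) }

without_taps = orders
  .select { |o| o.status == "completed" }
  .group_by(&:customer_id)
  .transform_values { |v| v.sum(&:total) }

with_taps == without_taps # => true
```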

tap for object construction

tap is also great for building up objects that don’t support chained setters:

# Without tap
config = Config.new
config.database = "postgres"
config.pool_size = 5
config.timeout = 30
config

# With tap
Config.new.tap do |c|
  c.database = "postgres"
  c.pool_size = 5
  c.timeout = 30
end

The tap version is a single expression that creates, configures, and returns the object. It reads as one coherent thought instead of four separate statements.
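The pattern pays off most as the body of a method, where the trailing config line would otherwise be needed to return the object. A runnable sketch, with AppConfig as a hypothetical Struct standing in for Config:

```ruby
# Hypothetical stand-in for the Config class used above.
AppConfig = Struct.new(:database, :pool_size, :timeout)

def default_config
  # tap returns the AppConfig, not the value of the last assignment.
  AppConfig.new.tap do |c|
    c.database = "postgres"
    c.pool_size = 5
    c.timeout = 30
  end
end

config = default_config
config.database  # => "postgres"
config.pool_size # => 5

# For contrast: then returns the block's value, which here is the
# result of the assignment rather than the config object.
AppConfig.new.then { |c| c.timeout = 30 } # => 30
```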

then: transform and return the result

then (an alias for yield_self) yields the object to a block and returns whatever the block returns. It’s the opposite of tap: the block’s return value is the whole point.

"Hello, World"
  .then { |s| s.downcase }
  .then { |s| s.gsub(/[^a-z0-9\s]/, "") }
  .then { |s| s.tr(" ", "-") }
# => "hello-world"

Each then takes the previous result, transforms it, and passes the new value along. This is a pipeline: data flows through a series of transformations, each one self-contained.
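The difference between the two methods is small enough to sketch in plain Ruby. The names sketch_tap and sketch_then are invented for this illustration; the real methods are built into Ruby, and reopening Object like this is only for demonstration.

```ruby
# Illustrative re-implementations, not Ruby's actual source.
class Object
  def sketch_tap
    yield self # run the block for its side effects
    self       # ...but always hand back the receiver
  end

  def sketch_then
    yield self # the block's value is the return value
  end
end

"hello".sketch_tap { |s| s.upcase }  # => "hello"
"hello".sketch_then { |s| s.upcase } # => "HELLO"
```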

then for conditional wrapping

One of the most practical uses of then is wrapping a value conditionally:

def api_url(path)
  path
    .then { |p| p.start_with?("/") ? p : "/#{p}" }
    .then { |p| "https://api.example.com#{p}" }
end

api_url("users")     # => "https://api.example.com/users"
api_url("/users")    # => "https://api.example.com/users"

Without then, you’d need an intermediate variable or a nested method call. With it, the transformations read top to bottom.
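For comparison, here is roughly what those two alternatives look like. The method names are made up for the example, and both return the same URLs as api_url:

```ruby
# Alternative 1: an intermediate variable.
def api_url_with_variable(path)
  normalized = path.start_with?("/") ? path : "/#{path}"
  "https://api.example.com#{normalized}"
end

# Alternative 2: everything nested into one expression.
def api_url_nested(path)
  "https://api.example.com#{path.start_with?('/') ? path : "/#{path}"}"
end

api_url_with_variable("users") # => "https://api.example.com/users"
api_url_nested("/users")       # => "https://api.example.com/users"
```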

then for nil-safe pipelines

then pairs well with Ruby’s safe navigation operator:

params[:user]
  &.then { |u| User.find_by(email: u[:email]) }
  &.then { |user| user.orders.recent }
  &.then { |orders| orders.sum(&:total) }

If any step returns nil, the remaining steps are skipped and the whole expression evaluates to nil. No nested ifs, no guard clauses, just a pipeline that gracefully handles absence.
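Here is a version of that pipeline that runs without a database. USERS_BY_EMAIL and order_total_for are hypothetical stand-ins for the ActiveRecord calls above.

```ruby
# Hypothetical in-memory stand-in for a users table.
USERS_BY_EMAIL = {
  "alice@example.com" => { name: "Alice", order_totals: [20, 5] }
}.freeze

def order_total_for(params)
  params[:user]
    &.then { |u| USERS_BY_EMAIL[u[:email]] }
    &.then { |user| user[:order_totals] }
    &.then { |totals| totals.sum }
end

order_total_for(user: { email: "alice@example.com" })  # => 25
order_total_for(user: { email: "nobody@example.com" }) # => nil (lookup failed)
order_total_for({})                                    # => nil (no :user key)
```

Note that &. only short-circuits on nil. A step that returns false is still passed on to the next block.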

yield_self: the original name

yield_self was introduced in Ruby 2.5. then was added in Ruby 2.6 as an alias because yield_self is frankly too verbose for something you want to use frequently. They’re identical in behavior.

# These do the same thing
42.yield_self { |n| n * 2 }  # => 84
42.then { |n| n * 2 }        # => 84

Use then. It’s shorter, reads better, and is the community standard at this point. yield_self still works and you’ll see it in older codebases, but there’s no reason to prefer it in new code.

The Elixir comparison

If you’ve used Elixir, then will feel familiar. Elixir’s pipe operator (|>) passes the value on its left as the first argument to the function call on its right:

# Elixir
"Hello, World"
|> String.downcase()
|> String.replace(~r/[^a-z0-9\s]/, "")
|> String.replace(" ", "-")

Ruby’s then isn’t quite as clean syntactically, since you need the block, but the idea is the same: data flows through a sequence of transformations. The pipe operator has been proposed for Ruby multiple times, and then is the closest thing we have. It works well enough.

When to use each

The decision tree is simple:

Use tap when you want to do something with the object but return the object itself. That covers side effects: logging, debugging, and in-place mutation that leaves the reference unchanged.

Use then when you want to transform the object into something else. The block’s return value becomes the new value in the chain.

Use neither when a simple method chain already reads clearly. Don’t rewrite string.downcase as string.then { |s| s.downcase }; that’s just noise.

Here’s both in the same chain:

Order.new(items: cart.items)
  .tap { |o| o.apply_discount(promo_code) }
  .tap { |o| o.calculate_tax }
  .tap { |o| o.save! }
  .then { |o| OrderConfirmation.new(o) }
  .tap { |c| Mailer.send_confirmation(c) }

The tap calls modify the order in place (side effects). The then call transforms it into an OrderConfirmation (new value). The final tap sends the email (side effect). Reading top to bottom, you can follow exactly what happens and in what order.
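The chain above depends on application classes, so here is a self-contained sketch of the same shape. DemoOrder, DemoConfirmation, and the sent array (standing in for the mailer) are all invented for the example, and the 10% tax is an arbitrary assumption.

```ruby
# Hypothetical stand-ins for Order and OrderConfirmation.
DemoOrder = Struct.new(:subtotal, :discount, :tax, :saved) do
  def apply_discount(amount)
    self.discount = amount
  end

  def calculate_tax
    self.tax = (subtotal - discount) / 10 # assumed flat 10% rate
  end

  def save!
    self.saved = true
  end

  def total
    subtotal - discount + tax
  end
end

DemoConfirmation = Struct.new(:order)
sent = [] # stands in for the mailer

confirmation = DemoOrder.new(100, 0, 0, false)
  .tap { |o| o.apply_discount(20) }     # returns 20, which tap ignores
  .tap { |o| o.calculate_tax }          # returns 8, also ignored
  .tap { |o| o.save! }
  .then { |o| DemoConfirmation.new(o) } # the value in the chain changes here
  .tap { |c| sent << c }

confirmation.class       # => DemoConfirmation
confirmation.order.total # => 88
sent.size                # => 1
```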

The debugging trick worth remembering

If you take one thing from this post, let it be this: .tap { |x| pp x } is the fastest way to debug a method chain. You don’t need to break the chain apart, assign intermediate variables, or add binding.irb calls. Just tap, inspect, and move on.

users
  .select(&:active?)
  .tap { |x| pp x.count }  # how many active users?
  .map(&:email)
  .tap { |x| pp x }        # what emails?
  .uniq

It’s a small technique, but it comes up constantly. The methods themselves are simple. The value is in knowing when to reach for each one.
