Noel Rappin Writes Here

Better Know A Ruby Thing #5: Block Arguments

Posted on March 17, 2024


Previously in what I guess is now “The Argument Trilogy”, we talked about:

And now the trilogy comes to its inevitable conclusion with “Return of The Jedi” block arguments.

In the interest of keeping this thing within the plausible word count of a newsletter, we’re not going to talk about what blocks are or the way blocks behave here. That’ll be a future Better Know. We do need to talk about block syntax, though.


Before we continue, a brief commercial announcement:

If you like this and want to see more of it in your mailbox, you can sign up at http://buttondown.email/noelrap. If you really like this and would like to support it financially, thanks, and you can sign up to financially support this:

Subscription fees go toward covering Buttondown’s costs. Once we pass that, they’ll go toward covering audio and video software to create extras for subscribers – I expect to have a subscriber-only video available in a couple of weeks.

Also, you can buy Programming Ruby 3.3 in ebook from Pragmatic or in print from Amazon.

Thanks!

We now return to our post, already in progress…


Syntactically, Ruby allows you to define a block at the end of any method call. That is the only place you can define a block. Ruby has two different syntaxes for delimiting a block: do/end and curly braces { }. In both cases you can define arguments to the block and local variables by putting them between pipe characters | var | at the beginning of the block.

[1, 2, 3, 4].map { |x| x ** 2 }

[1, 2, 3, 4].map do |x| x ** 2 end

The syntactic difference between do/end and the curly braces is precedence. The do/end version has lower precedence than the braces. This rarely affects normal Ruby code, but if you play fast and loose with parentheses, you can get into a situation where there is a difference.

def test(a, b)
  yield a, b
end

test 1, 2 do |x, y| x + y end
> 3

test 1, 2 { |x, y| x + y }
> Syntax error

The second example is a syntax error because the block is associated with the 2, whereas in the first example, the method call has higher precedence and the block associates with the call test 1, 2.

Using parentheses around method arguments makes this a non-issue.
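Here is a minimal sketch of the parenthesized version, reusing the test method from above (the variable names are only there for illustration). With explicit parentheses, both delimiters attach to the call to test:

```ruby
def test(a, b)
  yield a, b
end

# With explicit parentheses, both block styles bind to the call to `test`.
with_braces = test(1, 2) { |x, y| x + y }

with_do_end = test(1, 2) do |x, y|
  x + y
end

p with_braces # prints 3
p with_do_end # prints 3
```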

There are two different sets of advice from style guides on when to use braces vs do/end. The one you will see most often in practice says to use curly braces for one-line blocks and do/end for multiline blocks. The main exception – called “Weirich Style” after long-time Ruby leader and inventor of rake Jim Weirich – says to use curly braces for multiline blocks if you are chaining the end of the block to another method call. The theory here is that }.uniq looks better than end.uniq.

I’m mostly neutral on Weirich style. Honestly, if I’m in a situation like that, my inclination is to extract the multi-line block to a method so that my chained calls are all single-line blocks.
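To make the comparison concrete, here is a small sketch of both options. The word list and the initial_of helper are made up for this example:

```ruby
words = ["apple", "Avocado", "banana", "Blueberry"]

# Weirich style: braces on a multiline block because the result is chained.
initials = words.map { |word|
  first_letter = word[0]
  first_letter.downcase
}.uniq

# The alternative: extract the block body into a (hypothetical) method,
# so the chained call only needs a single-line block.
def initial_of(word)
  first_letter = word[0]
  first_letter.downcase
end

extracted_initials = words.map { |word| initial_of(word) }.uniq

p initials           # prints ["a", "b"]
p extracted_initials # prints ["a", "b"]
```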

Defining Block Arguments Implicitly

In order for a method to declare a block argument, or to be able to use a block passed to a method, the method needs to do… nothing. The block argument is just there, and if a block was passed to the method you can pass control to the block by using the yield keyword.

def method_using_block(a, b)
  yield(a, b) if block_given?
end

method_using_block(4, 5) { |x, y| x * y }
=> 20

You can tell if a block was passed by using block_given?. If you try to use yield and there’s no block, Ruby raises a LocalJumpError, so block_given? is frequently used as a guard, as in yield x if block_given?. (Among the more minor oddities here is that block_given? is a method of Kernel whereas yield is a keyword.)
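A quick sketch of the guarded and unguarded cases. The method names and the :no_block_given return value are invented for this example; the error Ruby raises for a yield with no block is LocalJumpError:

```ruby
# Yields unconditionally, so calling it without a block raises.
def unguarded
  yield
end

# Checks for a block first and falls back to a placeholder value.
def guarded
  return :no_block_given unless block_given?

  yield
end

error_class =
  begin
    unguarded
  rescue LocalJumpError => e
    e.class
  end

p error_class        # prints LocalJumpError
p guarded            # prints :no_block_given
p(guarded { 42 })    # prints 42
```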

When I first learned Ruby, this was the syntactic decision I had the hardest time with, and I still think it’s genuinely a little surprising, so let’s sit with it for a while.

The questions this syntax raised for me included:

  • “How can you tell from reading a method if it expects a block?” You look for the use of yield and block_given?. Ideally the method is short enough that you can figure this out quickly.

  • “How can you tell when you are calling a method if a block is expected?” I don’t have a great answer here. The answer is something like “you learn what kind of methods use blocks”, I guess? “Look at the docs?”, “Tool support?”. What I’d say here is that it’s not usually a big deal in practice, but it sure felt like it would be when I was first learning Ruby.

  • “Why is Ruby like this?”

With the disclaimer that I don’t know for sure…

Smalltalk has “blocks” with the same name and slightly different syntax (it uses square brackets, but also uses pipe characters for variables). I presume that’s where the idea came from; it’s the only example I can think of similar to Ruby blocks that would have been a thing in the early 1990s.

The big difference is that in Smalltalk, blocks are treated like any other argument, and are automatically fully callable objects. In Ruby, blocks are a special argument, and have to be explicitly converted to full objects.

One way this plays out in practice is that Smalltalk methods can take multiple blocks as arguments. For example, boolean logic in Smalltalk is in the library, and a normal “if” statement is handled by passing blocks:

(x == 3) ifTrue: [ self doAThing ]
         ifFalse: [ self doAnotherThing ]

You could imagine a case where Ruby automatically turned block syntax into procs and therefore you could do something like:

def if_then_else(condition, if_true, if_false)
  condition ? if_true.call : if_false.call
end

## THIS IS NOT VALID RUBY
x.if_then_else(b == c, { puts "case 1" }, { puts "case 2" })

## TO MAKE IT VALID RUBY, WE NEED TO CREATE LAMBDAS
## OR PROCS
x.if_then_else(b == c, -> { puts "case 1" }, -> { puts "case 2" })

Ruby is not exactly shy about syntactic shortcuts, so it’s a fair question why Ruby would force the explicit -> to pass multiple blocks to a method. (In older versions of Ruby it’d be even longer – you’d need proc or lambda instead of ->).
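For reference, here is a runnable sketch of the three spellings. It calls if_then_else without a receiver and uses literal conditions, since x, b, and c in the snippet above are placeholders:

```ruby
def if_then_else(condition, if_true, if_false)
  condition ? if_true.call : if_false.call
end

# The stabby lambda is the shortest way to pass more than one "block".
with_stabby = if_then_else(1 + 1 == 2, -> { "case 1" }, -> { "case 2" })

# The longer spellings: lambda and proc.
with_lambda = if_then_else(1 + 1 == 3, lambda { "case 1" }, lambda { "case 2" })
with_proc   = if_then_else(true, proc { "case 1" }, proc { "case 2" })

p with_stabby # prints "case 1"
p with_lambda # prints "case 2"
p with_proc   # prints "case 1"
```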

Again, I don’t know for sure, but I have a couple of guesses. My go-to guesses for anything in Ruby that is surprising are one of the following:

  • Parser issues. Ruby’s parser is notoriously complicated. Sometimes syntax is the way it is to make the parser not be even more notoriously complicated.
  • “Programmer Happiness”, which for Ruby often means “take this object-oriented thing and present it in a way that simplifies/obscures the object-oriented semantics”.

(To some extent, the second issue causes the first one…)

Anyway, my guess here is that it’s the first one – looking at that fake Ruby x.a_method(b == c, { puts "case 1" }, { puts "case 2" }) it seems like it’d be tricky to parse the difference between a block and a Hash literal, and anything that would make it easier to parse, like a different delimiter, isn’t any easier for the developer than x.a_method(b == c, -> { puts "case 1" }, -> { puts "case 2" }).

The parser isn’t an issue in Smalltalk because Smalltalk has fewer literals and therefore less contention for delimiters.

Ruby gets around the ambiguity between block and hash literals by making it so that block literals only go where you can’t have a Hash literal and vice versa.
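You can see that position-based rule with a pair of empty braces. In this sketch, describe is a made-up method that reports whether it received a block or an argument:

```ruby
# Hypothetical helper: reports a block if one was given,
# otherwise returns whatever argument it received.
def describe(arg = :no_argument)
  block_given? ? :got_a_block : arg
end

# Braces directly after the method name are parsed as an (empty) block.
braces_after_call = describe {}

# The same braces inside the argument parentheses are an empty Hash.
braces_in_parens = describe({})

p braces_after_call # prints :got_a_block
p braces_in_parens  # prints {}
```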

There are advantages to the Ruby way – because blocks are such a basic part of Ruby syntax, there’s more consistency in Ruby’s block usage than the way Smalltalk uses blocks. (My experience with Smalltalk was that block syntax always looked weird when it came in the middle of a method call.)

Defining Block Arguments Explicitly

Most of the time, your methods will define block arguments implicitly, but if you really want to, you can explicitly declare a block argument. For this to work, the last argument to the method is prefixed with an &. The & argument must come last, after all the positional arguments and all the keyword arguments. There can only be one & argument.

def method_with_proc(a, b, &proc)
  proc.call(a, b)
end

If you have an & argument, and the method is called with a block, then the block is converted to a Proc object and can be used in the method (without the &).

method_with_proc(4, 5) { |x, y| x * y }
=> 20

Typically the use case for this is just to pass the proc forward to another method, but you can also call the proc using Proc#call or Proc#[] if you are feeling particularly weird.

If you don’t pass a block to the method, the & argument is set to nil, so that doesn’t cause an error on pass-through, but does cause an error if you try to use the block.
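Here is a sketch of those three situations. All of the method names are invented for the example. Passing &block along when block is nil is the same as passing no block at all, which is why map hands back an Enumerator in that case:

```ruby
# Returns whatever ended up in the & argument.
def block_or_nil(&block)
  block
end

# Pass-through: forwards the block (or the lack of one) to map.
def forward(list, &block)
  list.map(&block)
end

# Uses the block directly, so a missing block is a problem.
def call_it(&block)
  block.call
end

p block_or_nil                      # prints nil
p((block_or_nil { :anything }).class) # prints Proc
p forward([1, 2, 3]) { |x| x * 2 }  # prints [2, 4, 6]
p forward([1, 2, 3]).class          # prints Enumerator

error_class =
  begin
    call_it
  rescue NoMethodError => e
    e.class
  end
p error_class # prints NoMethodError
```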

But, to a first approximation, nobody uses the & version unless it’s a passthrough (even though I just did, in the example). The official Ruby docs even mention this: the version “without an explicit block parameter is preferred”.

If you are just using a passthrough, you can use just a plain & as an anonymous block:

def passthrough(&)
  other_method(&)
end
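A runnable sketch of the anonymous version, assuming Ruby 3.1 or later (where the bare & was added). The other_method body and the list parameter are made up so there is something for the block to do:

```ruby
# Hypothetical target method: yields each item to whatever block it was given.
def other_method(list)
  list.map { |item| yield item }
end

# The bare & accepts a block and hands it on without ever naming it.
def passthrough(list, &)
  other_method(list, &)
end

result = passthrough([1, 2, 3]) { |x| x + 1 }
p result # prints [2, 3, 4]
```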

Using Block Arguments

Another wild feature about Ruby block arguments is that a block argument is always syntactically legal at the end of a method call, even if the method being called couldn’t care less.

It’s perfectly legal to do this. It’s confusing, but legal.

"string".upcase { |x| p x }
=> "STRING"

The block just gets ignored.

You can use block syntax to implicitly add a block to any method call and that method can implicitly use the block with yield.

What if you already have a Proc object and don’t want to use block syntax? You can get Ruby to treat that Proc object as a block with the use of the &.

mapping_function = Proc.new { |x| x + 10 }
[1, 2, 3, 4].map(&mapping_function)

The & may seem magical here, but it’s performing the same basic function that * and ** do – converting a data object into method arguments.

In the same way that * and ** have conversion methods that are used if the value attached to them isn’t the expected type, the & uses the method to_proc.

This is to say that in the above code, where it says &mapping_function, what Ruby actually does is call mapping_function.to_proc, and then passes the resulting proc as the block argument to the call. Since mapping_function is already a Proc, Proc#to_proc returns itself…
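To see that to_proc call in action, here is a sketch with a made-up Adder class. Nothing about the class is special to Ruby; it only needs to respond to to_proc:

```ruby
# Hypothetical class: any object that responds to to_proc can be used with &.
class Adder
  def initialize(amount)
    @amount = amount
  end

  def to_proc
    amount = @amount
    proc { |x| x + amount }
  end
end

added = [1, 2, 3, 4].map(&Adder.new(10))
p added # prints [11, 12, 13, 14]

# For an object that is already a Proc, to_proc returns the same object.
mapping_function = Proc.new { |x| x + 10 }
p mapping_function.to_proc.equal?(mapping_function) # prints true
```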

Proc is not the only Ruby class that implements to_proc. Somewhat notoriously, so does Symbol:

[1, 2, 3, 4, 5].map(&:pred)
=> [0, 1, 2, 3, 4]

Ruby interprets this as :pred.to_proc, which you can do on your own:

> x = :pred.to_proc
=> #<Proc:0x00000001204a8708(&:pred) (lambda)>
> x.call(3)
=> 2

The defined block is basically { |receiver| receiver.send(self) }, so it calls a method on the object with the same name as the symbol.
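A short sketch comparing the shortcut with the spelled-out block. The last few lines show that any extra arguments given to the proc are passed on to the method, which is why inject(&:+) works:

```ruby
numbers = [1, 2, 3, 4, 5]

with_symbol = numbers.map(&:pred)
with_block  = numbers.map { |receiver| receiver.send(:pred) }

p with_symbol == with_block # prints true

# The first argument is the receiver; the rest become method arguments.
plus = :+.to_proc
p plus.call(1, 2)      # prints 3, same as 1.+(2)
p numbers.inject(&:+)  # prints 15
```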

Symbol#to_proc is useful, but I find it hard to teach, in part because it depends on not-obvious Ruby internals, and in part because &: looks like one sigil, when in fact the & and the : are unrelated.

As it happens, Symbol is not the only core class that implements to_proc. The Method class does too, but that’s not interesting: the returned proc is just a wrapper for the method.
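Uninteresting, but occasionally handy. A minimal sketch, where double is a made-up method:

```ruby
# Hypothetical method to wrap.
def double(x)
  x * 2
end

# method(:double) returns a Method object; & calls to_proc on it.
doubler = method(:double)
doubled = [1, 2, 3].map(&doubler)

p doubled               # prints [2, 4, 6]
p doubler.to_proc.class # prints Proc
```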

The interesting core class that implements to_proc is Hash. And no, I’ve never actually used it for real:

> mapping = {a: 1, b: 2, c: 3}
> [:a, :b, :c].map(&mapping)
=> [1, 2, 3]

Hash#to_proc is a shortcut for { |x| self[x] }.

I can kind of see a use for it, if I squint. You can think of a hash as a kind of function that takes in an object (the key) and converts it to another object (the value). If you are using a hash that way, the to_proc version is a slight shortcut over the explicit hash lookup. And you can hide the implementation detail of whether you have an actual function or a hash. (I think that since the intent is to mimic a functional mapping, that’s why Hash declares to_proc, and, say, Struct and Data don’t – Struct and Data are meant to structure data, not ever mimic a function.)
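Here is a sketch of what hiding that implementation detail might look like. The convert method and both mappings are invented for the example; the point is only that the caller can hand over a hash or a lambda and convert doesn’t need to care which:

```ruby
# Hypothetical method: accepts anything that responds to to_proc.
def convert(keys, mapping)
  keys.map(&mapping)
end

as_hash   = {a: 1, b: 2, c: 3}
# Computes the letter's position in the alphabet ("a".ord is 97).
as_lambda = ->(key) { key.to_s.ord - 96 }

p convert([:a, :b, :c], as_hash)   # prints [1, 2, 3]
p convert([:a, :b, :c], as_lambda) # prints [1, 2, 3]

# A key the hash doesn't have behaves like a normal lookup and gives nil.
p convert([:z], as_hash)           # prints [nil]
```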

Blocks Taking Block Arguments

A question that it never occurred to me to ask is “can blocks take blocks”. The most basic answer is no – this doesn’t work:

# this is a syntax error
def block_yielder
  yield { |x| x ** 2 }
end

Which makes sense – blocks can only be defined after method calls and yield isn’t a method.

You can pass a Proc to a block – this works:

def proc_yielder
  yield proc { |x| x ** 2 }
end
> proc_yielder { |y| y.call(3) }
=> 9

And it looks like you can syntactically do this with an ampersand argument, but it blows up on calling it claiming that y is nil:

> proc_yielder { |&y| y.call(3) }

Which again, I guess makes sense because the & causes Ruby to look for a block argument that doesn’t exist.

Whether there’s a way to make that work, and actually place a block in that slot… I’m not sure. I don’t think so, but somebody with a deeper knowledge of the parser might know differently.
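For what it’s worth, one approach that does seem to fill that slot is to skip yield entirely. Proc#call is an ordinary method call, so it can take a block of its own. This is a sketch, and block_caller is a made-up name:

```ruby
def block_caller(&block)
  # Unlike yield, Proc#call is a regular method, so a block can follow it.
  block.call(3) { |x| x ** 2 }
end

# The outer block declares &inner and receives the block passed to call.
result = block_caller { |n, &inner| inner.call(n) }
p result # prints 9
```

This only works with an explicit & argument on the method, which is the opposite of the usual advice to prefer yield.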

Hot Takes

My hottest take about block arguments is that I probably should use them more in my own code.

I love using blocks in the Ruby standard library. I think the Enumerable methods are great, and File.open and the like are very useful, but I never seem to find places in my own code where that kind of generic structure would be helpful. I think – not 100% sure, but I suspect – that doing some work to find places in my code that could be blockified would be worth the time, long term.

My other hot take, which I think we might have covered somewhere, is that I’m trying to reduce usage of Symbol#to_proc in favor of the _1 positional arguments, so replacing [1, 2, 3].map(&:abs) with [1, 2, 3].map { _1.abs }. I’m finding that preserving the block structure makes it easier to read the code later on, and I think it’s much easier to explain to new Ruby devs.
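Side by side, assuming Ruby 2.7 or later for the numbered parameter (I’ve added a negative number so abs has something to do):

```ruby
numbers = [1, -2, 3]

with_symbol   = numbers.map(&:abs)
with_numbered = numbers.map { _1.abs }

p with_symbol   # prints [1, 2, 3]
p with_numbered # prints [1, 2, 3]
```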

If you’ve made it here, you’ve gotten through a ton of stuff on Ruby arguments! Thanks!






Copyright 2024 Noel Rappin

All opinions and thoughts expressed or shared in this article or post are my own and are independent of and should not be attributed to my current employer, Chime Financial, Inc., or its subsidiaries.