Noel Rappin Writes Here

Redundancy, Terseness, and Code

Posted on December 28, 2021


Most human communication, spoken or written, is wordier and more redundant than it needs to be, strictly speaking.

That previous sentence, for example, would still communicate its point in about a third of the words with “Most human communication is too wordy”.

You’d likely still get the idea if I used about half the characters and wrote: “hmn comms too wrdy”.

There are certainly reasons why you might include words when speaking or writing that are technically not needed:

  • Redundancy provides clarity: the more times you say something, the more likely the receiver is to pick up on context clues, and the less likely the speaker or writer is to be misunderstood.
  • Extra words or their lack can provide tone or subtle shades of meaning — how you say something is as meaningful as what you say. Of the three examples at the start, the third sentence is likely to sound sarcastic because of its unusual terseness.

What might appear as redundancy in communication actually does provide important information. From one angle, the first and third sample sentences at the beginning say the same thing, and from another angle they very much do not.

Which brings us to Ruby and to programming languages in general.

There’s always been a tension in programming languages between terseness and readability. Some developers really try to write code with as little typing as possible, and different languages have different levels of comfort with redundant text for readability.

A simple example is the difference between ERB and Haml, where ERB might look like this:

<div class="bold" id="birthday">
  Happy birthday, <%= name %>
</div>

The equivalent Haml is much shorter:

.bold#birthday= "Happy birthday, #{name}"

The Haml uses a number of context tricks to shorten its code: assuming that an unspecified tag is a div, shortening class="bold" to .bold and so on.

Those shortcuts make the Haml easier to type, but they also make it more specialized — you have to know Haml’s specific syntax quirks to understand the second line, whereas just knowing basic HTML takes you most of the way to understanding the first example.

When I code, I try to use terseness to my advantage, by making the fact that I’m using a more succinct option carry meaning by itself — usually the meaning is something along the lines of “this part really is as simple as it looks”.

For example, while I usually don’t use abbreviations in variable names, I will use them for block or loop variables that are short lived and whose purpose is simple. In that case, the shortness of the variable name is a message about the importance of the variable.
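
As a rough sketch of that naming habit (the Task struct and its data are hypothetical, invented only for illustration), the long-lived value gets a full name while the block variable gets a one-letter one:

```ruby
# Hypothetical data, invented for illustration only.
Task = Struct.new(:name, :estimate_in_hours)

tasks = [
  Task.new("Write post", 3),
  Task.new("Edit post", 1)
]

# The long-lived value gets a full, descriptive name...
total_estimate_in_hours = tasks.sum { |t| t.estimate_in_hours }

# ...while `t` lives for a single expression, and its short name says so.
task_names = tasks.map { |t| t.name }
```

The short name is doing a small amount of signalling here: nothing interesting happens to t beyond the one call.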

One reason why I eventually moved away from Haml is that I found that the terseness of Haml just made things harder to read, without making them easier to understand beyond the lack of extra typing.

Ruby has always had what I feel is a nice compromise between terseness and readability. It has always had places where there are multiple options to do the same things with different amounts of typing.

For example, end of line clauses. These two examples are identical:

def complicated_thing(argument)
  if argument.nil?
    return
  end
  # do more stuff
end

def complicated_thing(argument)
  return if argument.nil?
  # do more stuff
end

You don’t actually type that many fewer characters in the second one, but you do save two vertical lines.

The second one is probably a little harder to read if you are not used to Ruby (it takes a little getting used to parsing the order in which things happen). Importantly, the second one has an additional piece of meaning based on common community usage. Specifically, it says “this is a simple guard clause protecting the rest of the method”. This is using terseness to convey information.

On the other hand, I don’t think the Ruby community has ever consistently come up with a meaningful difference for when to use:

x = if condition? then 3 else 4 end

Versus

x = condition? ? 3 : 4

The second one with the ternary operator is terser, I think harder to read, and seems to be the preferred choice in most Ruby style guides. I prefer the first form most of the time. (Once upon a time, I was team-teaching a course on Ruby, we presented both these options, and the students naturally asked which one to use. The two of us simultaneously, and with confidence, gave opposite answers.)

My point here is that I don’t think the Ruby community has ever come up with a secondary shade of meaning for one or the other version, and as a result it stands as a question of personal taste.

Anyway, over the past few versions of Ruby, I count four different syntax changes that allow Ruby code to be more terse, and I want to look at whether they allow for a secondary meaning to be carried, or if they are just shorter and harder to read.

Yes, I did just spend 800 words setting up the post I really wanted to write. Sorry?

Half-infinite ranges

Ruby now allows you to leave off the beginning or ending of a range if the range goes to infinity on that end, so ..10 and 0.. are both valid. While there are some legitimate reasons why you’d want an infinite stream, in practice these are mostly used for array or string access, so [..10] meaning everything up through index 10 (the first 11 entries, since the range is inclusive) and [10..] meaning from index 10 until the end of the array. Both of these have method shortcuts, first(11) and… I was going to say last(10) but last does something slightly different, so the closest easy method shortcut for [10..] on an array is probably drop(10).

Anyway, you used to have to do this with [0..10] or [10..-1] so it’s not like you are saving a lot of characters. Honestly, I think [..10] is a clearer statement of “from the beginning of the array” than [0..10] is and I’m completely sure that [10..] is a clearer statement of “to the end of the array” than [10..-1] is. So it’s modestly more terse and gives at least as much information.
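
A small sketch of the equivalence, using a hypothetical 15-element array of letters. Note that ..10 is inclusive, so it covers indexes 0 through 10, which is 11 entries:

```ruby
# Hypothetical data: 15 entries, indexes 0 through 14.
letters = ("a".."o").to_a

# Beginless range: everything up to and including index 10.
head_new = letters[..10]
head_old = letters[0..10]

# Endless range: index 10 through the end of the array.
tail_new = letters[10..]
tail_old = letters[10..-1]

# Method equivalents for arrays, given the inclusive upper bound.
head_by_method = letters.first(11)
tail_by_method = letters.drop(10)
```

In each pair the old and new spellings return the same slice; the only difference is how much the reader has to decode.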

Endless method

The feature I like that I have a feeling nobody else does is Ruby 3.0’s endless method feature, which is the one-line method definition syntax.

Instead of

def thing_squared
  thing ** 2
end

You can now do

def thing_squared = thing ** 2

The space before the equals sign is important; otherwise it parses as a setter method.

This removes two vertical lines and a very small number of characters. I find it pretty readable, as long as the method body is simple, so I don’t think it’s giving away any clarity.

I feel like this can be used to convey tone, with a secondary meaning that “this is a simple computed attribute, don’t worry about it”. I use it for cases with:

  • No arguments
  • One expression
  • No Boolean logic
  • Total length is less than 80 characters
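
Those criteria might play out something like this. Rectangle is a hypothetical class made up for the sketch (it is not from Elmer), and it needs Ruby 3.0 or later:

```ruby
# Hypothetical class, invented for illustration; not taken from Elmer.
class Rectangle
  attr_reader :width, :height

  def initialize(width, height)
    @width = width
    @height = height
  end

  # No arguments, one expression, no Boolean logic, well under 80 characters.
  def area = width * height

  def perimeter = 2 * (width + height)

  # Takes an argument and uses Boolean logic, so it keeps the full form.
  def fits_inside?(other)
    width <= other.width && height <= other.height
  end
end
```

The two endless methods read as computed attributes, while the method with an argument and a condition keeps the longer form that signals “look more closely here”.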

Some quick examples from Elmer, my task-tracking tool. These are actually all from the same class:

def status_order = status_value.order

def slug_object = Slug.find_by(slug: slug, scope: self)

def sort_name = "#{status_order}_#{name.parameterize}"

def person_id = person&.id

Seems to work fine, I have no trouble reading it. Honestly, I’d stack them on top of each other without a blank line between them, but Standard complains.

Hash shortcuts

This one I feel like everybody else is going to like more than I do.

Ruby 3.1 adds what Ruby calls Hash Shortcut Syntax and what the JavaScript world sometimes calls Punning.

If the key and value have the same symbol in a Hash literal or method call, you don’t need to include both, so instead of:

{card: card, person: person}
do_a_thing(card: card, person: person)

You can now write

{card:, person:}
do_a_thing(card:, person:)

I upgraded my Elmer project to Ruby 3.1 and switched a bunch of places where this syntax was now feasible, and… I don’t know? (Interestingly, most of my clean up here was in partial view files passing local variables to other partial view files via render, as in render("statuses/new_card", status:, project:))

First impression was that it looks weird, especially if the shortcuts are mixed in with regular key value pairs. My brain wants to see the syntax as the argument list of a method definition, not a method call, though I suspect I’ll get used to that with a little time.

In theory, this syntax is terser but not really losing any meaning because the key and the value are spelled the same way. That said, I’m not sure it’s adding any secondary meaning here, it’s just purely clearing redundancy. Which is fine, but in practice, I do find it a little hard to read, I kind of hiccup and need to remind myself what an empty value means. I think I’ll get used to it, I’m willing to try.
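
To make the “mixed in with regular key value pairs” case concrete, here is a small sketch. The describe method and its values are hypothetical, and the shorthand needs Ruby 3.1 or later:

```ruby
# Hypothetical method and values, invented for illustration.
def describe(card:, person:, urgent: false)
  "#{person} owns #{card}#{' (urgent)' if urgent}"
end

card = "Write post"
person = "Pat"

# Hash literals: the shorthand picks up the local variable of the same name.
long_hash = {card: card, person: person}
short_hash = {card:, person:}

# Method calls: shorthand keys mixed with a regular key/value pair.
long_call = describe(card: card, person: person, urgent: true)
mixed_call = describe(card:, person:, urgent: true)
```

Both spellings build the same hash and make the same call. The mixed call is the one where the eye has to switch between “key with implied value” and “key with explicit value” on a single line.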

Implicit block arguments

This one I suspect nobody likes, but I’m going to try to rehabilitate…

Ruby 2.7 added a shortcut for referring to the arguments in blocks, where you previously would have had to explicitly name each argument to a block:

cards.map { |card| card.name }

You can use a numerical shortcut to refer to the arguments to the block in order:

cards.map { _1.name }

The number after the underscore represents the positional order of the arguments to the block.

This is definitely shorter, and it’s definitely removing a piece of information: the name of the temporary variable. Naming is important, and giving up card as the name of the block variable is potentially losing information. On the other hand, if the temporary variable is coming from a list called cards, is calling the temporary variable card really adding that much useful information?

I made a push to try this in Elmer, and… I didn’t hate it? I might kind of like it? It still feels weird, but it’s arguably more straightforward than cards.map(&:name), which I like, but is an absolute bear to explain to somebody seeing it for the first time.

The useful circumstances seem to be something like this:

  • The block is an argument to an enumeration method
  • The receiver of the argument is named with a plural based on its type
  • The block is a single line and one expression

If all those things are true, then I can take in the whole line at once and understand what the _1 means, along with the secondary meaning that “this is a simple block, and the loop control is exactly what you think it is”.
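
A quick sketch of the three spellings side by side, using a hypothetical Card struct rather than anything from Elmer (the numbered form needs Ruby 2.7 or later):

```ruby
# Hypothetical data, invented for illustration.
Card = Struct.new(:name, :size)
cards = [Card.new("Write", 3), Card.new("Edit", 1)]

# Explicitly named block argument.
named = cards.map { |card| card.name }

# Numbered parameter: _1 is the first argument passed to the block.
numbered = cards.map { _1.name }

# Symbol-to-proc shorthand, for comparison.
symbol_proc = cards.map(&:name)

# _2 is the second positional argument, here the value of each hash pair.
labels = {"todo" => 2, "done" => 5}.map { "#{_1}: #{_2}" }
```

All three map calls return the same array. The last line is where the positional numbering matters, and probably also where a named pair like |status, count| would start to earn its extra characters.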

A couple of real examples from Elmer:

snapshot_lists.map { snapshot_project_size(_1) }.sum

projects.select { _1.people.include?(person) }

siblings.select { _1.snapshot_date < snapshot_date }

attrs.map { for_one(_1, from:, to:) }.compact

card.card_snapshots.each { _1.save! }

It’s possible I’m a little too close to it, but these actually seem kind of readable to me? I’m probably too close to it.

A thing I can’t do yet

What I really want to be able to do in Ruby is replace this:

def initialize(foo, bar, bas)
  @foo = foo
  @bar = bar
  @bas = bas
end

With this:

def initialize(@foo, @bar, @bas)
end

CoffeeScript had something like this and I loved it, but I do wonder if it might be too complicated for the Ruby parser, and what it might mean to either limit the feature to initialize methods or allow it on all methods.

After like 1800 words on terseness, I’d sum this up by saying that the goal of programming is to have the code communicate intent, and that sometimes you can encode intent in the very fact of using a more abbreviated construct.




Copyright 2021 Noel Rappin