How To Manage Duplicate Test Setup, or Can I Interest You In Weird RSpec?

You have a series of test cases. They cover the same logic with different inputs. In order to get to that logic, there’s some overhead: objects have to be created first. Then there’s more logic needed to evaluate the result.

What’s the best way to manage these tests?

You want it to be easy to add new tests. You also want it to be clear what part of the test is different in each round and what part is just the common logistics. That makes the tests easier to understand.

There are lots of ways to handle duplicate tests – we’re going to cover five of them, ranging from “please never do this, I mean never” to “this is a good choice in some circumstances”. For me, this is a case where there isn’t a single right answer, but there are some answers that are frequently wrong.

My normal bias on repeat logic in tests is to just be okay with duplication there. Tests are different than code, and for small amounts of setup, I’d rather duplicate the setup than risk having bugs in the test setup itself – bugs in tests are annoying. Sometimes, though, the setup is complicated enough or the test repeats itself often enough that managing the duplication is called for.

Recently, I wrote a gem. It’s not a big thing, it just sorts and normalizes a gemfile. I had a definite use case for it, which is, I suppose, not important.

I built this gem incrementally using test-first and golden master testing. I created a simple mini-gemfile that demonstrated the feature that I wanted, then I hand-crafted the output I wanted it to produce. The test then runs the sorter and compares the generated output to the golden master.

The first example was just three gem declarations that were not sorted:

gem "zeitwerk"
gem "rails"
gem "awesome_print"

Once that passed, I added version strings to the gem lines, then duplicates, eventually building up to group blocks and so on, and topped it off with the real Gemfile from one of my Rails apps.
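
To make the golden master comparison concrete, here’s a minimal sketch in plain Ruby for that first three-line example. It is not the gem’s code: naive_sort is a hypothetical stand-in that only sorts lines alphabetically, and the golden master string is an assumption about what the hand-crafted output for that input would look like.

```ruby
# Hypothetical stand-in for the real sorter: sorts non-blank lines alphabetically.
# The actual gem also handles versions, duplicates, and groups; this does not.
def naive_sort(gemfile_text)
  gemfile_text.lines.map(&:chomp).reject(&:empty?).sort.join("\n") + "\n"
end

unsorted = <<~GEMFILE
  gem "zeitwerk"
  gem "rails"
  gem "awesome_print"
GEMFILE

# The golden master: hand-written output we expect for this input (assumed).
golden_master = <<~GEMFILE
  gem "awesome_print"
  gem "rails"
  gem "zeitwerk"
GEMFILE

actual = naive_sort(unsorted)
puts(actual == golden_master ? "matches golden master" : "differs from golden master")
```

Each new feature then gets its own unsorted input and its own hand-written expected output, and the comparison stays the same.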

The Gemfiles are simple enough that I can come pretty close to a TDD loop just by incrementally adding a new file with each new feature that I want to cover. The tests are also fast enough to be run very frequently. (I did write some unit tests to exercise specific parts of the parser, but the bulk of the testing is these acceptance tests.)

Each acceptance test needs to do the following:

Create the sorter object and give it the unsorted file
Run the sorter
Compare the output against the given sorted version

One thing I did was tie the name of the unsorted sample to the name of the sorted sample, so the sample name is basically the only thing that changes from test to test. Written out, the first test looks like this:

RSpec.describe "acceptance tests" do
  it "correctly sorts a basic gem file" do
    actual = GemfileSorter.sort_gemfile(
      "spec/unsorted_samples/basic_gem"
    )
    expected = File.read("spec/sorted_samples/basic_gem")
    expect(actual).to eq(expected)
  end
end

That’s a perfectly reasonable test. The problem is that my acceptance test suite wound up with, like, 18 of them and that’s a little unwieldy.

Hi – we’ve gotten some comments that the code snippets don’t look good on Apple Mail in dark mode. Buttondown is working on this, but if you need to, you can also find this newsletter on the web at https://noelrappin.com/blog/2023/12/how-to-manage-duplicate-test-setup-or-can-i-interest-you-in-weird-rspec/

Option 1: Live With The Duplication

The easiest option for managing the duplication is to just brazen through it:

RSpec.describe "acceptance tests" do
  it "correctly sorts a basic gem file" do
    actual = GemfileSorter.sort_gemfile(
      "spec/unsorted_samples/basic_gem"
    )
    expected = File.read("spec/sorted_samples/basic_gem")
    expect(actual).to eq(expected)
  end

  it "correctly sorts a gem file with versions" do
    actual = GemfileSorter.sort_gemfile(
      "spec/unsorted_samples/gems_with_versions"
    )
    expected = File.read("spec/sorted_samples/gems_with_versions")
    expect(actual).to eq(expected)
  end

  # and so on 16 more times
end

Look, I’ve done this in test suites, and it was more-or-less fine.

The problems with just writing each test out whole are:

It’s verbose
It doesn’t do a good job of separating the important part of the test from the logistics of the test. The important part of each test is the directory name, and that’s hard to pick out just from glancing at the test, especially in this case, where the name is repeated within each test.
Adding a new test is more difficult than I’d like.
Copy-pasting the new tests could lead to subtle problems – not necessarily here, but it’s generally not a great practice.
If the setup changes, changing all the tests is a pain – in particular, if there are other features you want to test at the same time, adding them to each test becomes prohibitive. Or if you want something special in the output message, that’s also difficult.

On the plus side:

It’s explicit
Probably easy to follow if you are a new person hitting this codebase
This is the kind of thing that GitHub Copilot is actually kind of not-terrible at, so maybe writing 18 of these tests won’t be as bad as you think. Nah, it’s still pretty bad.

I thought keeping the duplication didn’t work for me in this case. I’m generally team “duplication in tests is fine, actually” and even I was ready to fix this after one or two copies.

Option 2: Loop. Don’t Do This. Just Don’t.

There are a couple of dynamic ways to write these tests so as to clear the boilerplate.

One I do not recommend is a loop.

This is legal in RSpec:

# Please don't do this:
RSpec.describe "acceptance tests" do
  test_cases = %w[basic_gem gems_with_versions] # and so on, 18 entries in all
  test_cases.each do |test_case|
    it "correctly sorts a #{test_case} file" do
      actual = GemfileSorter.sort_gemfile(
        "spec/unsorted_samples/#{test_case}"
      )
      expected = File.read("spec/sorted_samples/#{test_case}")
      expect(actual).to eq(expected)
    end
  end
end

This will work – unless I’ve made a typo or something. RSpec will generate a separate it block for each entry in the test_cases list, and will run all of them at run time.

Don’t do this – using loops to build RSpec tests generally ends in pain. It is nigh-impossible to run only one of these tests at a time from the RSpec CLI, and it can also mess up CLI-based things like rspec --bisect and RSpec’s “run only failed tests” feature. If one of these tests fails, it can be difficult to tell which iteration of the loop caused the failure, though we do mitigate that here by putting the test case name in the name of the it block.

Looping does solve the duplication problem – it’s easier to add new tests here. It kind of solves the separation issue, since the part of the tests that differ is now pulled out into the list that drives the loop. That said, my experience with this is that nearly 100% of the time I’ve been on a team that tried looping tests they’ve regretted it because of the CLI and debugging issues.

Option 3: `let` it be

There’s a more RSpec-ish way to handle duplicate logic using nested let calls and a one assertion per test style.

RSpec.describe "acceptance tests" do
  let(:actual) do
    GemfileSorter.sort_gemfile(
      "spec/unsorted_samples/#{test_case}"
    )
  end
  let(:expected) { File.read("spec/sorted_samples/#{test_case}") }

  describe "with a basic gem file" do
    let(:test_case) { "basic_gem" }

    it "sorts" do
      expect(actual).to eq(expected)
    end
  end

  describe "with a gem file with versions" do
    let(:test_case) { "gems_with_versions" }

    it "sorts" do
      expect(actual).to eq(expected)
    end
  end

  # 16 more times
end

This works, and I’ve certainly written lots of tests that look like this. My feeling is that this has kind of the worst parts of both previous examples. You still have boilerplate, and this version is significantly more opaque to a code reader than the first version because you have to keep going up and down the file to see what’s going on. You sort of have separated out the important bit for each block, but the test is so spread out across the file that it’s hard to follow.

I’ve kind of soured on the “one assertion per spec” rule over time, because it’s verbose and bouncing up and down the file to track let statements is not easy. Also, with RSpec’s :aggregate_failures metadata you can get the same effect of seeing all your failures at once and still keep the tests readable.

Option 4: Shared Examples

Moving into even more obscure corners of RSpec syntax, we get to shared examples. Solving this issue with the shared example syntax looks like this:

RSpec.describe "Acceptance Tests" do
  shared_examples "a sorted gemfile" do |directory_name|
    it "correctly sorts a file" do
      actual = GemfileSorter.sort_gemfile(
        "spec/unsorted_samples/#{directory_name}"
      )
      expected = File.read(
        "spec/sorted_samples/#{directory_name}"
      )
      expect(actual).to eq(expected)
    end
  end

  describe "sorted tests" do
    it_behaves_like "a sorted gemfile", "basic_gems"
    it_behaves_like "a sorted gemfile", "gems_with_versions"
    it_behaves_like "a sorted gemfile", "duplicate_gems"

    # and so on
  end
end

This works (this one I ran to make sure it worked).

I have to admit, I don’t usually use shared examples, and I don’t usually recommend them – they can easily be overused and get too complicated. They can be nested, for example, which I really don’t recommend.

I was genuinely surprised as I typed this out that it kind of… works here? The syntax is a little weird, but we do have each test with its own unique line, and we’re separating the variable part from the consistent part. You do get weird results from the RSpec CLI on failure: the failure will look like rspec './spec/shared_acceptance_spec.rb[1:1:3:1]' rather than something with the line number like rspec './spec/shared_acceptance_spec.rb:3'. The description output is a little weird, too.

As written, the shared example is in the same file as the test, making it relatively easy to find the logic of the test (it could be in a support file, though even if it’s in a different file, the string is searchable). Shared examples are also quite flexible: we can have an arbitrary number of tests inside the shared_examples block.

I have a practical downside and an aesthetic downside. The practical downside is that it’s easy to make shared examples quite complex and they can become hard to follow in their own way. My normal recommendation is not to use shared examples just to group related tests, but I think there might be a stronger case to use them to share complicated setup.

The aesthetic downside is that I don’t love it_behaves_like as syntax. I don’t think it carries a whole lot of meaning.

Option 5: RSpec Custom Matchers

The way that RSpec is designed to handle this problem is with a custom matcher. I realize that nobody actually writes RSpec custom matchers, and what I’m saying is, you should consider writing custom matchers.

Here’s what my actual acceptance tests look like:

RSpec.describe "Acceptance Tests" do
  specify { expect("basic_gems").to be_correctly_sorted }
  specify { expect("gems_with_versions").to be_correctly_sorted }
  specify { expect("duplicate_gems").to be_correctly_sorted }
  specify { expect("gems_with_comments").to be_correctly_sorted }
  # and so on
end

I think this is clear and intention-revealing, if a bit minimalist.

On the testing side, this does all the things I want it to do:

It’s really easy to add new tests
It clearly denotes what the important part of the test is – the name of the test case.
This is a personal opinion but I think the syntax here more clearly expresses my intent in writing the test. The idea here is that you don’t need to read the code for the matcher in order to understand the basics of what’s going on (you will have to read the matcher for some things…)
A failed test refers to the exact line of the test
The description output is “is expected to correctly sort basic_source”.

The downside is that the actual test logic is more separated from the test (at least that’s true as written, you could include the matcher in the same file, I just didn’t). The hope here is that for many cases, the file as written is clear enough that you don’t need the details of the test logic, the same way you don’t need to look up what be_truthy does. (I’ll grant that you do need to know the details of how to add the data for a new test case.)

Like the shared example, you have to search for the string :be_correctly_sorted to find the logic of the matcher, which could be in another file.

The matcher DSL is a little more specialized than the shared example. Here’s the matcher. The actual matcher in the gem is more complex, because I added some additional features and refactored; this is the part that’s consistent with the previous examples:

RSpec::Matchers.define :be_correctly_sorted do
  match do |directory_name|
    actual = GemfileSorter.sort_gemfile(
      "spec/unsorted_samples/#{directory_name}"
    )
    expected = File.read("spec/sorted_samples/#{directory_name}")
    actual == expected
  end
end

This is using the RSpec matcher DSL to create a custom matcher that RSpec applies when be_correctly_sorted is used as an argument to to or not_to.

To see how this works, we need to talk a little about how RSpec works. Here’s one test:

specify { expect("basic_gems").to be_correctly_sorted }

The lack of parentheses obscures it somewhat, but the end phrase is to(be_correctly_sorted), meaning that be_correctly_sorted is an argument to the method to.

In RSpec, the argument to to is meant to be an object called a matcher – typically, the argument is a method that returns an instance of a matcher class. RSpec manages the details here, using the matcher that we wrote using the DSL so that the method be_correctly_sorted returns a matcher instance.

Our matcher code uses RSpec’s DSL to keep us from having to create a class (whether that’s a good thing or not… open question). When RSpec hits that to(be_correctly_sorted), it finds the custom matcher I’ve defined, and invokes the block attached to match.
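
For comparison, here’s roughly what a hand-written matcher class could look like. This is a sketch, not the gem’s code. RSpec needs the object to respond to matches? and failure_message (description is optional). The sorter and golden_masters arguments are hypothetical stand-ins so the example runs without the gem or any sample files.

```ruby
# A hand-written matcher class: roughly what the DSL saves us from writing.
class BeCorrectlySorted
  # sorter:         callable taking a sample name and returning sorted text
  # golden_masters: hash of sample name => expected text
  # Both are hypothetical injection points for this sketch.
  def initialize(sorter, golden_masters)
    @sorter = sorter
    @golden_masters = golden_masters
  end

  # RSpec calls this with the value that was passed to expect(...).
  def matches?(directory_name)
    @directory_name = directory_name
    @actual = @sorter.call(directory_name)
    @expected = @golden_masters.fetch(directory_name)
    @actual == @expected
  end

  def failure_message
    "expected #{@directory_name} to be correctly sorted"
  end

  def description
    "be correctly sorted"
  end
end

sorted_text = <<~GEMFILE
  gem "rails"
  gem "zeitwerk"
GEMFILE

# Stand-ins for the real sorter and the sorted sample file.
matcher = BeCorrectlySorted.new(
  ->(_name) { sorted_text },
  { "basic_gems" => sorted_text }
)
puts matcher.matches?("basic_gems")
```

In a spec you would also need a helper method such as be_correctly_sorted that returns a new instance of this class, which is one more piece the DSL handles for you.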

The way that the values from the test get placed in the matcher has always been a little confusing to me. What happens is that any argument passed to the matcher itself comes in to the DSL as an argument to the define block at the top. We don’t have such an argument in this custom matcher, but if the matcher were something like expect(name).to eq("Noel"), then "Noel" would be an argument to the outer define block.

The argument to expect gets passed as the argument to the match block. For be_correctly_sorted, this argument is the directory name.

RSpec then executes the match block. If the block returns a true value then the matcher passes, otherwise it fails (like other RSpec matchers, custom matchers can be reversed by calling not_to). In this case, the match block does all the setup – to generate the actual value, it finds the test file, calls the sorter and sorts the file. To generate the expected value, it reads in the file from the known location.

There are other hooks I didn’t show here: the gem’s matcher also uses description to customize the output and failure_message to print a detailed diff of the two files on failure. It’s also pretty easy to chain method calls, as in expect(actual).to be_on_team("Cubs").with_position("1B").

Here’s what I see as the positives of using the custom matcher.

It’s the easiest one for adding new tests. The shared example is admittedly close.
It separates the parts of each test that are different from the part of each test that is the same, which is important in trying to understand an entire test suite.
Subjectively, I think the syntax allows me to express intent better than the other ways.
It allows hooks into RSpec to do things like adjust the failure message that are useful and would be more challenging in the other styles.

The main negatives, to my mind, are:

Using the matcher requires deeper knowledge of RSpec than the other options.
The matcher itself can be a little verbose.
In some cases, the shared example structure of having multiple shared tests is cleaner than having a single match block.

Takeaways

Did I write 2000+ words to try to convince you to write custom RSpec matchers? Yes. Do I think we now tend to underrate how useful RSpec’s DSL features are? Again, yes.

If you take anything from this post, though, don’t write loops around tests in RSpec. Just please.

And try an RSpec matcher once. Maybe you’ll hate it, but the odds are that if you are reading this far, you use RSpec on a regular basis. It’s worth the time to at least try some of its most powerful features.

Noel Rappin Writes Here