Noel Rappin Writes Here

An Object-Oriented Example

Posted on July 28, 2021


If you’ve been reading this newsletter for a while, you’ve may have noticed there are two patterns I talk about all the time:

  • We’re doing something because everybody does it but we don’t know why.
  • We’re doing something, but we don’t really believe in it so we’re doing it halfway and not getting any benefit.

Object-Oriented Design often hits both those patterns – teams use OO languages for various reasons without thinking about what OO can do for them, and then teams don’t use OO design or programming tools in a way that can help them achieve their goals.

On my tracker side project (code-named Elmer until I think of something better), I’m trying to be explicit about my design goals. I want to:

  • Use OO to allow me to change the code and experiment with how to represent projects and calculate forecasts. I expect a lot of change here and I want the code to make that change easy.
  • Avoid defaulting to the same service/action/workflow layer I’ve been doing in small Rails projects for years, instead I want to try to let the code design be more emergent.
  • Take advantage of not having to justify myself to other people and push a little harder on fancy OO patterns than I might otherwise. (I got a Null Object in there already, which makes me happy).

Here’s a case study:

The main unit of work in Elmer is a Card. Each card has a current status. It’s a design principle of Elmer that other tools have too many status buckets – Elmer has four buckets: “unstarted”, “started”, “done”, and “attic”, plus an inbox for Cards that haven’t been triaged yet. (In contrast, my current work tool has seven, plus the inbox.)

The initial implementation of the status is simple, it’s an Active Record enum (with the Postgres extension that lets the enum values be strings).

As I started this refactoring project, the status was used in three places:

  • Size calculations
  • Display
  • Moving cards through the statuses

The system calculates the total size of cards in the project, and uses the status field to also calculate cards completed and cards yet to do.

The code for that calculation was a little bit like this (slightly simplified for publication):

def in_project
  unstarted? || started? || done?
end

def project_size
  in_project? ? theoretical_size : 0
end

def remaining_size
  case status
  when "started"
    [size - business_duration, 1].max
  when "unstarted"
    size
  else
    0
  end
end

def completed_size
  project_size - remaining_size
end

I had code in the display that sorts the cards by status, and defaults to start showing only certain statuses. In the view, that code is:

<% Card.sorted_statuses.each do |status| %>
  <% hidden = status.in?(["done", "inbox", "attic"]) ? "true" : "false" %>
  # DISPLAY CODE
<% end %>

Since I don’t have an individual card here, I don’t have a structured way to ask about statuses, so it feels like I’m asking about the statuses in a convoluted way that I don’t like.

Finally, I have some logic that puts date stamps on the card as the code changes state, the code keeps track of the date the card was added to the project, the date it was started, and the the date it ended.

I wrote a bunch of tests, covering advancing a card through the project and retreating it backward through various steps, winding up with the following “quick to green” code that passes the tests:

def advance!
  done! if started?
  started! if unstarted?
  unstarted! if inbox? || attic?
  self.date_started ||= Date.current if started?
  self.date_added ||= Date.current if unstarted?
  self.date_ended ||= Date.current if done?
  save!
end

def retreat!
  attic! if inbox?
  inbox! if unstarted?
  unstarted! if started?
  started! if done?
  self.date_added = nil if inbox? || attic?
  self.date_started = nil if unstarted? || inbox? || attic?
  self.date_ended = nil unless done?
  save!
end

(Note the heavy use of Rails enum ! and ? methods to query and change the status – I like them, they feel silly and Rails-y.)

I actually don’t think this is terrible – like a lot of quick-to-green code, it’s admirably direct. However, there are some potential issues. The order of the if statements at the top of each method actually matters. Because each if statement is independent, the complexity of this measured by cyclomatic complexity is quite high. The logic around the date stamp is sort of confusing.

There’s a straightforward refactor that lowers the metric complexity of the code and removes the internal dependencies:

def advance!
  case status
  when "attic"
    unstarted!
    self.date_added ||= Date.current
  when "inbox"
    unstarted!
    self.date_added ||= Date.current
  when "unstarted"
    started!
    self.date_started ||= Date.current
  when "started"
    done!
    self.date_ended ||= Date.current
  end
end

def retreat!
  case status
  when "inbox"
    attic!
    self.date_added = nil
  when "unstarted"
    attic!
    self.date_added = nil
  when "started"
    unstarted!
    self.date_started = nil
  when "done"
    started!
    self.date_ended = nil
  end
end

This version is better, I think. The cyclomatic complexity of it is much lower, and I think it’s easier to see an overview of how the system behaves.

Overall, this code is… fine? Under some circumstances, I would definitely leave it here. (Those circumstances being working on a team that would be uncomfortable with pushing the OO further.)

That said:

  • We now have a case statement switching on the status in three different places.
  • We’ve also got a separate use of the status list that doesn’t go through a card at all.

It does not seem unlikely or weird that we might wind up with more places where both of those things would happen.

I think Status is an abstraction waiting to be extracted from the code.

“Abstraction” is one of those words that we use all the time without really defining, and because we don’t really define it, it’s hard to talk about what makes an abstraction useful.

Here’s my definition: in code, an abstraction is a structure that represents a concept and hides internal details such that the rest of the code can interact with concept via the abstraction without having to worry about the implementation details.

In an object-oriented language like Ruby, the main unit of abstraction is the class. Putting code and data in a class allows the rest of the application to deal with the class via its public interface rather than its internal details.

An abstraction is useful if it is easier for the rest of the code to interact with it rather than interacting with its internals. An abstraction is “leaky” if you need to understand its internals in order to use it.

I often find that when I try to guess what the useful abstractions are in a system before I write the code, I’m wrong, and the abstractions I build are not useful. It can work better to find the abstractions as you build the code piece by piece.

That’s a vague statement. How do you find an abstraction? And why is finding them better than imposing them?

You can find abstractions in the code by looking for places in the code that are nearly abstractions already. A partial abstraction is code that is frequently used together, or changed together, or already acts as a barrier between parts of the code.

Specifically, in Ruby, I look for features like these:

  • Branching logic that uses the same attribute to branch in multiple cases. In OO languages, this is often a tell that you have subclasses or value classes or something.
  • Multiple methods in a class that all get passed the same parameters. This often means you can create a class with those parameters as instance values.
  • Multiple methods in a class that have the same prefix or suffix in their name. The naming may be trying to tell you that those methods are related.
  • Private methods. In Ruby, I think private methods often mean you want a separate class collaborating with the main class. Other people think this is a nonsense opinion, so take that for what its worth.

Once I’ve identified a potential abstraction, there are a couple of questions that I think about before I try and abstract it.

  • Can I give it a name? Naming is important, and having a name for something means that I can articulate what the abstraction is as part of the code.
  • Do I think the code will change in the future? If not, why bother?
  • Do I think there is additional complexity that will be attracted to this new abstraction? Often, once you identify a new abstraction, you’ll start to see places where it can be used and extended.

For the card status value, I have multiple methods branching using the value of the status attribute, and the calls are coming from different parts of the code.

Going through the other questions, “Status” seems like a perfectly good name. The code will definitely change in the future, and there’s going to be large amounts of new complexity going in.

I decide I want an abstraction, which in this case means Status classes.

The goal here is that there is exactly one place where we assign a Status object based on the value, and then nobody outside the status object has to care about the specific value, they just query the object for behavior.

Here’s my code – there are a number of different ways to draw the line between Card and Status, so I’m not claiming this is the only way, or even the best. (Note that because Card and Status are close collaborators, I’m not very concerned about the classes knowing about each other.)

First we need a way to convert the string attribute value to our new object – here’s our one case statement:

class Status
	def self.at(value)
    return value if value.is_a?(Status)
    case value
    when "done" then DoneStatus.new
    when "started" then StartedStatus.new
    when "unstarted" then UnstartedStatus.new
    when "inbox" then InboxStatus.new
    when "attic" then AtticStatus.new
    else
      throw InvalidValueError.new(value)
    end
	end

Then we need all the subclasses – I’ll just show one, you get the point.

class DoneStatus < Status
  def order
    1
  end

  def to_s
    "done"
  end
end

Note that the subclasses have no changeable instance data, they are immutable value objects.

We need a way for the Card to access that value – inside the Card class

def status_value
  Status.at(status)
end

def status_value=(new_value)
  self.status = new_value.to_s
end

The second one lets us transparently assign the status value object and have the underlying string value change.

Now, nearly anything we want to do can be accomplished by getting the status_value and putting single line data methods in each subclass of Status.

In other words, the Card method for remaining size, which previously had an if statement about status, becomes this:

def remaining_size
  status_value.remaining_size(self)
end

We have to define remaining_size in all the Status classes. For most of the statuses, the remaining_size is always 0, so the parent Status class gets this:

def remaining_size(card)
  0
end

Then the two classes that care about calculating the remaining size get methods:

class UnstartedStatus < Status
  def remaining_size(card)
    card.size
  end
end
class StartedStatus < Status
  def remaining_size(card)
    [card.size - card.business_duration, 1].max
  end
end

The new version of the advance/retreat logic looks like this in Card:

def advance!
  self.status_value = status_value.next
  status_value.date_stamp_on_advance_to(self)
  save
end

def retreat!
  self.status_value = status_value.previous
  status_value.date_stamp_on_retreat_to(self)
  save
end

Each individual Status subclass defines its next and previous states and what happens when you advancing and retreating to dates.

class StartedStatus < Status
	def next
    Status.at("done")
  end

	def previous
    Status.at("unstarted")
  end

	def date_stamp_on_advance_to(card)
    card.date_started ||= Date.current
  end

  def date_stamp_on_retreat_to(card)
    card.date_ended = nil
  end
end

Is the new solution better?

I wrote a couple of drafts of this that got weirdly defensive but this version is better for me. This version is much better at achieving my goals for this code than the multiple case statements.

And I like it.

It’s not shorter, that’s for sure. It’s got more methods, but the methods are simpler. I think it’s easier to see what each status code means and its role in the system.

I know that some people will look at it and think its goofy or over-engineered. If you aren’t used to polymorphism, it’s possibly hard to trace where things happen.

So far, the code has adapted well, there’s been some more complexity added to the status classes and further calculations and it’s been handled smoothly.

Thanks for sticking around to the end of this one.

The permanent location of this post is https://noelrappin.com/blog/2021/07/an-object-oriented-example – you can go there and add a comment.

Or you can subscribe at https://buttondown.email/noelrap and get these posts in your mailbox.



Comments

comments powered by Disqus

Copyright 2021 Noel Rappin