Ruby And Its Neighbors: Perl
Ruby takes a large part of its inspiration from two older languages:
- Perl for general syntax and design philosophy
- Smalltalk for Object-Oriented structure
I’ve been in kind of a writer’s block, for all kinds of reasons, personal and professional. I started to think about an article I could write that would get my fingers typing, and drifted into what I could do for another “Better Know A Ruby Thing”. I started thinking about String literals, and wondered how I would answer the question of why Ruby has so many ways to write them.
The answer to that is that most of them were inherited from Perl.
In fact, a lot of Ruby’s more unusual syntax was taken from Perl.
And then I realized that Perl has vanished so completely that there’s probably a large group of Ruby developers that don’t know much about it. I’m looking at the documentation for the newish Zed editor and I note that it does not mention Perl in languages supported…
Oh wait, a quick story. About 10 years ago, when Take My Money came out, I was doing user groups in the Chicago area in the forlorn hope that somebody might buy a book. For some reason I did the Perl Mongers user group, which still existed even in 2017. It was really interesting: it was very rare, even then, that I was below the median age. But almost all of them had banking experience, so they didn’t need me to tell them how to do money stuff. But they were fascinated by Ruby…
What Is Perl?
Perl was once the scripting language in a way that no individual language can manage these days. Its first public release was in 1987, and its bible, Programming Perl by Larry Wall, was first printed in 1991 and is the model for the big phone-book overview of a programming language. It’s always been known as “The Camel Book”. (This isn’t exactly something you can easily look up, but I think the first programming book to be known by its cover image is “The Dragon Book”, which is the big book on compilers.)
The Camel Book is probably the best-selling book about a single programming language, or at least it was; it may have been overtaken by JavaScript: The Good Parts or something.
Perl was a couple of things. It was the first really popular open source language and language community. It was also the language of the very early web, as the first dynamic web pages were overwhelmingly written in Perl.
There are two things that help Perl make sense:
- It was designed to make string manipulation super-easy, especially with regular expressions. And in this case “super-easy” means: with a minimum of typing.
- Larry Wall had a linguistics background, and wanted Perl to have the flexibility of natural language, and as a result, there are many different ways to do most things.
In practice these two features made for a language that was extremely flexible and also could be very hard to read.
Another huge advantage of Perl was CPAN, a package manager that enabled third-party packages to easily be found and installed locally. It’s genuinely weird to explain how hard it was to install most open source stuff before about 2005. But CPAN was so far ahead of its time in 1999 that I’m pretty sure it had features that RubyGems still doesn’t have. It felt like a magic trick.
Perl Syntax for Rubyists
I got weirdly nostalgic looking back over Perl code, even though I never really wrote much Perl. In fact, I left a job in part because I wanted to do a project in Python and they wanted it in Perl. (Okay, that was a symptom of deeper problems, it’s funnier to say I quit a job because they asked me to write Perl.)
Even though I didn’t write much Perl, it was ubiquitous in the late 90s. You could go on any programmer forum and use s/this/that/ to describe a text change and people would know what you meant.
Here are some of the Perl features that struck me as I was looking back over some Perl references.
Variable Sigils
Ruby gets the use of @ and $ as variable prefixes from Perl, but Perl uses prefixes differently. In Perl, all scalar variable names start with $ and all array (list) variable names start with @, so $user and @user are two different variables. There’s also a prefix (the term in Perl is sigil) for hashes (%) and subroutines (&). At the time, I remember finding these vaguely annoying, but now I can see how important they are, because the sigils mean the parser doesn’t have to guess about what a variable name is, and Perl uses them for its somewhat idiosyncratic type management.
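For contrast, here is a minimal Ruby sketch of where the borrowed sigils ended up: in Ruby they mark a variable's scope rather than the kind of value it holds. The Account class, the $app_name and $users globals, and the helper method are all made up for illustration.

```ruby
# Hypothetical names throughout; only the sigils themselves are the point.
$app_name = "billing"            # $ marks a global, whatever it holds
$users    = ["noel", "larry"]    # an array can live in a $ variable too

class Account
  def initialize(user, roles)
    @user  = user                # @ marks an instance variable: a string here...
    @roles = roles               # ...and an array here, same sigil
  end

  def describe
    "#{$app_name}: #{@user} (#{@roles.length} roles)"
  end
end

def describe_sample_account
  Account.new($users.first, ["admin", "author"]).describe
end
```

So a Perl reader would be misled by @roles looking like an array and @user looking like one too; in Ruby the sigil says nothing about type.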
Automatic Conversion
Perl is generally quite permissive about types and tries to do something with most expressions. In particular, Perl will automatically convert between numbers and strings, so "3" + 3 and 3 + "3" are both legal. What makes this possible is that Perl uses + for addition and . for string concatenation, so there’s no confusing "3" + 3 and "3" . 3; in both cases Perl converts the value that isn’t the correct type. (Perl also distinguishes between * for multiplication and x for string repetition.)
Perl will also automatically convert between lists and scalars based on context, and it determines the context from the surrounding expression: the lefthand side of an assignment statement, say, or an operator. So $count = @users + @companies is equivalent to Ruby’s count = users.length + companies.length, because numeric addition puts both arrays in scalar context, and in scalar context Perl converts an array to the length of the array.
All this type conversion is kind of handy in quick scripts – we just saved 11 whole characters on that count example. It’s also a great way to have subtle bugs in large codebases.
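To make the contrast concrete, here is a small Ruby sketch of the same operations. Ruby did not inherit the implicit conversions, so each one has to be spelled out. The method names are invented for this example.

```ruby
# Explicit conversion stands in for what Perl's + does implicitly.
def add_like_perl(left, right)
  Integer(left) + Integer(right)
end

# Perl's "." operator: both sides are treated as strings.
def concat_like_perl(left, right)
  left.to_s + right.to_s
end

# Ruby refuses to guess when a String and an Integer meet at +.
def mixed_addition_error
  "3" + 3
rescue TypeError => e
  e.class
end

# Perl: $count = @users + @companies
def total_count(users, companies)
  users.length + companies.length
end
```

Integer() is stricter than Perl's conversion (it raises on input that is not a clean number), so this is an approximation of the behavior rather than a port of it.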
Hashes
Hashes in Perl have a % sigil, but a value looked up in a hash is a scalar, so it’s %user for the whole hash, but $user{'name'} for a lookup value. Hashes can be made from lists and there’s no distinction between keys and values, they just alternate: %user = ('name', 'Noel', 'username', 'noelrap') – Ruby had this syntax originally but eventually removed it.
But – Perl considers => to be essentially equivalent to a comma, so you can write this as %user = ('name' => 'Noel', 'username' => 'noelrap'). Also, for maximum confusion, you could write it as %user = ('name' => 'Noel' => 'username' => 'noelrap'). Also you can omit quotes to the left of the arrow, so %user = (name => 'Noel', username => 'noelrap'). This is a good example of how Perl is both great and exhausting.
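Modern Ruby can still build a hash from a flat, alternating list, just not with literal syntax. A minimal sketch, with method names made up for the example:

```ruby
# Hash[] accepts an even number of arguments as key1, value1, key2, value2...
def hash_from_flat_list(list)
  Hash[*list]
end

# The same result, grouping the list into explicit pairs first.
def hash_from_pairs(list)
  list.each_slice(2).to_h
end
```

Calling either one with ["name", "Noel", "username", "noelrap"] gives a hash with "name" and "username" as keys.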
Conditionals
Perl uses 0, an empty string, and the string "0" for logical false (plus the undefined value undef and the empty list); it doesn’t otherwise have a boolean type. Perl is also (presumably) the source for Ruby’s use of elsif in compound if statements. Why Wall used this in Perl I’m not sure: Bash uses elif (and so does Python). I can see why having a dedicated else if keyword makes parsing easier, but it’s funny to me that in a language often dedicated to minimizing typing, Wall added the extra character there.
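Ruby kept the elsif keyword but not Perl's rules for falseness: in Ruby only nil and false are falsy. A short sketch, with both method names invented for illustration:

```ruby
# Reports how Ruby treats a value in a conditional.
def truthy?(value)
  value ? true : false
end

# elsif in a compound if statement, spelled the Perl way.
def size_label(count)
  if count == 0
    "none"
  elsif count == 1
    "one"
  else
    "many"
  end
end
```

All three of Perl's classic false values, 0, "" and "0", come back as truthy here.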
Strings
Perl has a lot of features that Ruby pulled in – single and double quoted strings, with interpolation in the double quoted ones. In Perl, because of the variable sigils, you can just include the variable name, "Hello $name", rather than having to enclose it in other syntax. Ruby actually also lets you do that, sort of, with variables with sigils: "Hello #@name" is valid, if rarely used, Ruby.
Perl has a shortcut for a list of strings, qw(one two three), and also has the same arbitrary delimiter behavior.
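Here is a small Ruby sketch of the string features just mentioned. The Greeter class and the helper methods are hypothetical and exist only to hold the literals.

```ruby
class Greeter
  def initialize(name)
    @name = name
  end

  def short_form
    "Hello #@name"             # braces can be dropped before a sigiled variable
  end

  def long_form
    "Hello #{@name}"           # the usual spelling
  end

  def literal_form
    'Hello #@name'             # single quotes: no interpolation
  end
end

def word_list
  %w(one two three)            # like Perl's qw(one two three)
end

def bracket_delimited
  %q[it's "quoted" freely]     # arbitrary delimiter, single-quote rules
end
```

The bracket-delimited form is the one that earns its keep: it lets a string contain both kinds of quote without escaping.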
Global Default
One of Perl’s most common shortcuts in practice is the global default variable, $_, which is set automatically by reading from a file or console in a loop, or to the current element of a loop if the loop variable is not otherwise specified. Where this is a shortcut is that many Perl functions will use $_ as the argument if no argument is specified, and some operations, such as chomp and substitution, also modify it in place. This means in practice you can leave off arguments in many places and things will just work, meaning that you can have a series of calls that appear to take no arguments, but are actually all reading $_ and sometimes updating it for the next call. Like a lot of Perl, this is both convenient and hard to read.
Perl is also the source of the “methods return the last evaluated value” thing. But Perl also defaults to not giving subroutine arguments names. Instead, by default, Perl takes any arguments you pass and throws them all in a magic array, @_, which you can either use directly or unpack into local variables. Local variables in Perl are declared with the keyword my, as in my $name = "Noel";
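The closest Ruby analogue to Perl's argument array is probably a splat parameter. This is a rough sketch of the two styles side by side, not a claim that anyone writes Ruby the first way; both method names are invented.

```ruby
# Perl style: every argument lands in one array, which is then unpacked.
def full_name(*args)
  first, last = args           # similar in spirit to unpacking Perl's argument array
  "#{first} #{last}"           # the last evaluated expression is the return value
end

# Ruby style: the same method with named parameters.
def full_name_named(first, last)
  "#{first} #{last}"
end
```

Both return "Noel Rappin" when called with "Noel" and "Rappin", and neither needs an explicit return.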
Regular Expressions
Regular expressions are the big star of the syntax. Perl has the same /pattern/ literal syntax that Ruby has, but Perl considers the slash to be an operator, not a literal creating an object.
This means you can do all kinds of shortcuts.
For example, a raw pattern by default matches against $_, so
$_ = "Noel Rappin";
if (/N.*/) {
  print "this is a match";
}
Here the if is matching the pattern against the default variable and returning a truthy value if the pattern matches, and a falsey value (an empty string, which also counts as 0 in numeric context) if it doesn’t. I’m explicitly setting $_ there, but you don’t have to; you could use one of the many ways that $_ is implicitly set.
Perl also has =~ but only for string on the left, pattern on the right, and it’s where Ruby got $1, $2 and so on for match variables – much more commonly used in Perl than in modern Ruby. Also $& for the entire match, $` for the part of the string before the match and $' for the part of the string after the match.
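All of these match variables still work in Ruby. A minimal sketch, where the dissect method and its pattern are made up for the example:

```ruby
# Returns the Perl-style match variables for a simple word@word pattern.
def dissect(text)
  return nil unless text =~ /(\w+)@(\w+)/   # string on the left, pattern on the right
  {
    whole:  $&,    # the entire match
    first:  $1,    # first capture group
    second: $2,    # second capture group
    before: $`,    # text before the match
    after:  $'     # text after the match
  }
end
```

Modern Ruby code would more often call match and read the pieces off the returned MatchData object, which is part of why these variables look unfamiliar.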
Perl also has a special substitution operator: s/// – the first part of the string that matches the pattern between the first two slashes is replaced by whatever is between the last two slashes. If you don’t bind it to a variable with =~, it operates on $_ and changes it in place. To replace every match rather than just the first, you add the g flag after the final slash: s///g.
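Ruby splits single and global substitution into two methods, sub and gsub, each with a mutating variant. A short sketch with invented method names:

```ruby
# Replace only the first match.
def replace_first(text)
  text.sub(/-/, "+")
end

# Replace every match.
def replace_all(text)
  text.gsub(/-/, "+")
end

# sub! changes the receiver itself, which is closer to how Perl's operator behaves.
def replace_in_place(text)
  text.sub!(/-/, "+")
  text
end
```

The non-bang versions return a new string and leave the original alone, which is the more common Ruby style.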
The upshot of all this is a language that can be quite compact, especially if you are in a situation where you are using $_ a lot. But there is also a lot of room for personal styles, to the point where it can be very hard to read somebody else’s Perl.
Object Oriented Perl
Objects in Perl are worth a few words. Basically, what Perl does is let you create packages that contain a bunch of methods, similar to a Ruby Module. You can then use the built-in function bless to turn a reference to an arbitrary data structure – usually a hash – into an “instance” of that class for the purposes of method lookup.
So if you had an existing file called Company.pm that declared package Company; and defined a method called get_continent, you could then do this:
use Company;
my $company = bless {
  name => "Viridian Dynamics",
  country => "USA"
}, 'Company';
$company->get_continent();
The bless function associates that hash with the Company package. After that, using -> followed by the name of a subroutine in that package causes Perl to call that subroutine with the hash object as the first argument. In practice, most Perl packages designed to be used as classes provided a new subroutine that did the bless call inside it.
The Ruby equivalent would be… I guess taking a hash and including a module inside its singleton class? There isn’t really a Ruby equivalent because Ruby has real classes.
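As a rough sketch of that idea, here is a plain hash extended with a module, which puts the module's methods in the hash's singleton class. The CONTINENTS table and the body of get_continent are assumptions invented to make the example run; nothing above says what the Perl version does.

```ruby
# A module of methods that expects to be mixed into a Hash.
module Company
  # Hypothetical lookup table, just enough for the example.
  CONTINENTS = { "USA" => "North America" }.freeze

  def get_continent
    CONTINENTS.fetch(self[:country], "unknown")
  end
end

def build_company
  data = { name: "Viridian Dynamics", country: "USA" }
  data.extend(Company)         # roughly the role bless plays in Perl
end
```

The result is still a Hash, so the underlying data stays as open to callers as it is in the Perl version.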
Also, nothing in Perl is private, so you can always just get at the original data in the hash. To quote Larry Wall:
“Perl doesn’t have an infatuation with enforced privacy. It would prefer that you stayed out of its living room because you weren’t invited, not because it has a shotgun”
Having genuinely tried to use this part of Perl, albeit briefly and 20 years ago, it’s generally okay at “here’s some fancy methods specific to this data shape”. But, the idea of encapsulation is sort of fundamentally opposite to the Perl aesthetic, so it always felt a little weird.
I do remember being pretty heavily influenced by the book Perl Best Practices, which functioned as “Perl: The Good Parts”, and I think it advised some specific OO practices to make using it manageable.
What did Ruby Take?
It’s always been acknowledged that Ruby comes from Perl. The name “Ruby” is allegedly because pearl is the birthstone for June and ruby is the birthstone for July, the month that follows.
Very broadly, what Ruby took from Perl is syntax and to some extent design inspiration, and what Ruby added to Perl is genuine object-orientation.
A tremendous amount of Ruby’s syntax is adapted from Perl – just scratching the surface, there are string literals, regex literals, the if statement, here docs, the ability to have custom delimiters, underscore in number literals, the use of sigils at the start of variables, using if or unless as modifiers after an expression, not requiring return in a method, and vs. &&, and on and on.
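A few of the items on that list fit in one small Ruby sketch. The method names are made up; the point is the precedence difference between the two "and" operators, plus trailing modifiers, an underscore in a number literal, and implicit return.

```ruby
def with_double_ampersand
  result = true && false       # parsed as result = (true && false)
  result
end

def with_word_and
  result = true and false      # parsed as (result = true) and false
  result
end

def shout(text)
  return "" unless text                  # unless as a trailing modifier
  text.upcase if text.length < 1_000     # if modifier, underscore in a number literal
end
```

The first method returns false and the second returns true, which is the usual argument for keeping the word form out of assignments.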
More than that, Ruby adopted Perl’s design aesthetic that there should be multiple ways of doing things, and that code should be sort of natural language like (though I think that idea expresses itself very differently in the two languages).
What happened to Perl?
Lots of things but there are two big ones:
- Although it was basically the most common language to create dynamic webpages in like, 2000, a higher-level framework never emerged, and Perl therefore lost share to Python and Ruby. Python and Ruby were also, in different ways, easier to use than Perl. I sometimes idly wonder why nobody came up with “Perl on Planes” or something, given that copying Rails was a big deal in 2007, and Perl had a longstanding association with the web.
- Starting in the early 2000s the Perl community got fixated on a Perl version 6 that was going to be very not-backward compatible, and development on existing Perl languished for years while version 6 was in development. Perl 6 was eventually released in 2015, and was renamed Raku in 2019.
Should you try it?
Y’know, when I started this, I would have said no, but brushing back up on the syntax makes me think it’s probably worth a go. It’s extremely well designed for its core task – writing a short script that does something fancy with text – and I don’t know that there’s anything quicker if you are a Perl expert.
I also can’t leave this without at least mentioning my favorite piece of random Perl lore – Perligata, or Perl in Latin. Not only did its author, Damian Conway, translate Perl’s functions into Latin, he also translated the grammar – in Latin, word order in a sentence doesn’t matter; grammatical place is determined by a word ending. He made a version of Perl that works like that – the variable next would be named nexto as the receiver of an assignment but nextum as the value, so $last = $next could be written as lasto da nextum or nextum da lasto or da lasto nextum. Just an amazing amount of commitment to the bit.
Next up, we’ll look at Smalltalk.