Enumerable#inject, that is.
I know that there are a lot of Budding Rubyists out there. Are you a former (or current) Java developer? That was me, but I saw the light and have moved on to greener pastures.
A big step in the transition from Java to Ruby was the adoption of inject as a first-class weapon in my iteration arsenal. So, what’s the big deal, and how does it work? Let’s break it down.
You’re going to use inject to iterate through some values and use each of them to manipulate some other object. Let’s say that we have an array of two-element arrays (à la [['a', 1], ['b', 2], ['c', 3]]) and want to construct a hash with the first element as the key and the second element as the value ({ 'a' => 1, 'b' => 2, 'c' => 3 }). Someone just transitioning to Ruby might do this:
hash = {}
array.each do |current|
  hash[current[0]] = current[1]
end
You can do this with inject like so:
hash = array.inject({}) do |results, current|
  results[current[0]] = current[1]
  results
end
Ok, to be fair…this code is not simpler. Not as pretty, imho. More Ruby-esque? Sure. Patience, Grasshopper.
You can think of inject as a replacement for each that takes an argument representing the starting state of the object you want to manipulate. We want to construct a hash, so this starting state is {}. The block, then, receives another argument (I call it ‘results’ above): the value of your object passed in from the previous iteration.
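To watch the object being handed from one iteration to the next, here is a small trace of the example above. The steps array is a hypothetical helper added only for illustration; it is not part of the original code.

```ruby
array = [['a', 1], ['b', 2], ['c', 3]]
steps = []  # hypothetical helper: records what the block receives each time

hash = array.inject({}) do |results, current|
  steps << [results.dup, current]    # snapshot of the accumulator on entry
  results[current[0]] = current[1]
  results                            # handed to the next iteration
end

# Print each step: the hash as it arrived, and the pair being added to it.
steps.each { |seen, pair| puts "#{seen.inspect} then #{pair.inspect}" }
```

The first step sees the empty starting hash, and each later step sees whatever the previous block call returned.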
One key thing to remember when using inject: you must return the ‘results’ from the iteration at the end of the block. Not doing so is a very common mistake that will totally break you.
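Here is a rough sketch of what that mistake looks like in practice. The variable names are made up for the demonstration. Without the trailing results line, the block returns the value of the assignment (the integer 1), and the next iteration tries to treat that integer as the hash.

```ruby
pairs = [['a', 1], ['b', 2], ['c', 3]]

error = begin
  pairs.inject({}) do |results, current|
    # Last expression is the assignment, so the block returns 1, not the hash.
    results[current[0]] = current[1]
  end
rescue NoMethodError => e
  e  # the second iteration calls []= on an Integer and blows up
end

puts error.class
```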
Let’s pick a different example: we want to sum the values in an array. Old way:
total = 0
array.each do |current|
  total += current
end
total
With inject:
total = array.inject(0) do |total, current|
  total + current
end
Again, you mentally translate the above to:
- start with 0
- iterate through each value, adding the current value to the total, and
- pass the total on to the next iteration
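The three steps above, written out against a concrete array (the numbers are made up for illustration). The second form assumes Ruby 1.8.7 or later, where inject also accepts a method name as a symbol.

```ruby
numbers = [3, 5, 7]

# Block form: start with 0, add each value, pass the running total along.
with_block = numbers.inject(0) do |total, current|
  total + current
end

# Symbol form: the same fold, naming the method to apply between elements.
with_symbol = numbers.inject(0, :+)

puts with_block   # 15
puts with_symbol  # 15
```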
Easy peasy. (What, you just want to use ActiveSupport’s Enumerable#sum? Bleh. Fair enough.) Ok, one more. This one actually shows it used in a realistic scenario.
Array.class_eval do
  # Construct a hash of objects, keyed by some object attribute.
  def hash_by(attribute)
    inject({}) do |results, obj|
      results[obj.send(attribute)] = obj
      results
    end
  end
end
Here, I have reopened the Array class to add a new method, hash_by(). I use it to construct a cache of objects so that I can quickly look up a specific object using a unique object attribute. Here it is in action:
class User
  attr_accessor :name, :age, :email
  def initialize(name, age, email)
    @name = name
    @age = age
    @email = email
  end
end
users = [User.new('Brian', 32, 'brian@foo.bar.com'),
         User.new('Jim', 46, 'jim@foo.bar.com'),
         User.new('Scott', 33, 'scott@foo.bar.com'),
         User.new('Kenton', 32, 'kenton@foo.bar.com'),
         User.new('Chris', 34, 'chris@foo.bar.com')]
user_cache = users.hash_by(:name)
puts user_cache['Brian'].inspect
# => #<User:0x28514 @email="brian@foo.bar.com", @age=32, @name="Brian">
puts user_cache['Scott'].inspect
# => #<User:0x2844c @email="scott@foo.bar.com", @age=33, @name="Scott">
With the constructed User cache, we can now get to the object of our choice in constant time. Yay, inject!
February 5, 2008 at 2:33 pm
In fact, you can clean up your first inject example just a little more by using Ruby’s nice array expansion behavior:
hash = array.inject({}) do |results, key, value|
  results[key] = value
  results
end
February 5, 2008 at 3:27 pm
Indeed. Good point, Mike.
February 6, 2008 at 8:07 am
In a follow-up to Mike’s assignment through array expansion: remember when using inject on hashes, it’s more complex:
hash = some_other_hash.inject({}) do |results, (key, value)|
  results[key] = value
  results
end
The second argument passed to the block is an array, so the key and value need to be pattern matched out.
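A quick sketch of what the block actually receives when you inject over a hash. The stock hash is made up for illustration. Without the parentheses the second argument is a [key, value] pair; with them, the pair is unpacked.

```ruby
stock = { 'apples' => 3, 'pears' => 5 }

# No parentheses: each element arrives as a two-element array.
pairs = stock.inject([]) { |seen, pair| seen << pair }

# Parentheses unpack the pair into key and value.
doubled = stock.inject({}) do |results, (key, value)|
  results[key] = value * 2
  results
end

puts pairs.inspect
puts doubled.inspect
```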
February 6, 2008 at 8:10 am
Also, the direct approach to solve your first example (a hash from an array of key/value pairs) is:
Hash[*array.flatten]
This is commonly used.
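One caveat worth a sketch: flatten with no argument flattens every level, so this idiom scrambles pairs whose values are themselves arrays. The data below is made up for illustration, and the alternatives assume Ruby 1.8.7 or later, where flatten takes a depth and Hash[] accepts an array of pairs.

```ruby
simple = [['a', 1], ['b', 2], ['c', 3]]
nested = [['a', [1, 2]], ['b', [3, 4]]]

from_simple = Hash[*simple.flatten]     # fine: values are not arrays

scrambled   = Hash[*nested.flatten]     # flattens the values too
one_level   = Hash[*nested.flatten(1)]  # flatten only the outer pairs
from_pairs  = Hash[nested]              # or hand the pairs over directly

puts scrambled.inspect
puts one_level.inspect
```

In the scrambled case the six flattened elements are paired off in order, so 2 ends up as a key and 'b' as its value.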
February 6, 2008 at 8:18 am
Actually, you need to pattern match in Mike’s inject example as well:
array = [[1,2],[3,4],[5,6]]
array.inject({}) do |results, key, value|
  results[key] = value
  results
end
# FAIL => {[1, 2]=>nil, [3, 4]=>nil, [5, 6]=>nil}
array.inject({}) do |results, (key, value)|
  results[key] = value
  results
end
# PASS => {5=>6, 1=>2, 3=>4}
I think Mike was thinking of #each, which does do expansion:
array.each do |key, value|
  # do something with key, value
end
February 6, 2008 at 5:12 pm
Good stuff, Bruce. I definitely need to adopt the Hash[*array.flatten] bit.
February 6, 2008 at 7:58 pm
To “refactor” some more, you can write multiple assignments in one line:
def initialize(name, age, email)
  @name, @age, @email = name, age, email
end
February 7, 2008 at 6:38 am
To anyone more familiar with functional languages, inject is reduce, or foldl. I certainly find the name changes hard to grok in Ruby.
February 7, 2008 at 8:05 am
Also, you can clean up the hash example (dropping the dangling return of the hash itself) by using a method that returns an updated hash, like merge:
array = [[:a, 1], [:b, 2], [:c, 3]]
hash = array.inject({}) do |results, current|
  results.merge current[0] => current[1]
end
Now things are starting to look more succinct. Combine that with Bruce’s comment, and you get what I think is the “proper rubyist’s” approach:
array = [[:a, 1], [:b, 2], [:c, 3]]
hash = array.inject({}) do |results, key, value|
  results.merge key => value
end
That is starting to look better, and more succinct, even. In fact, after my first step it can be one-lined (without cheating via
:
[[:a, 1], [:b, 2], [:c, 3]].inject({}) do |results, key, value| { results.merge key => value }
February 7, 2008 at 8:14 am
Err I miscopied and edited that last bit (I hate not having previews on comments). Obviously it should be:
[[:a, 1], [:b, 2], [:c, 3]].inject({}) {|results, (key, value)| results.merge key => value }
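A small sketch of why the merge form needs no trailing line: Hash#merge leaves its receiver alone and returns a new hash, so the value of the block is already the updated accumulator. The seed variable is introduced here only to make that visible. The trade-off is one new hash per element, which is worth keeping in mind for large arrays.

```ruby
array = [[:a, 1], [:b, 2], [:c, 3]]
seed  = {}  # hypothetical name, kept around so we can inspect it afterwards

merged = array.inject(seed) { |results, (key, value)| results.merge(key => value) }

# The assignment form mutates one hash and must hand it back explicitly.
assigned = array.inject({}) do |results, (key, value)|
  results[key] = value
  results
end

puts merged == assigned  # true
puts seed.empty?         # true: merge never touched the starting hash
```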
February 7, 2008 at 11:16 am
Thanks, Tim, for establishing the true truth.
February 7, 2008 at 3:09 pm
I’m beginning to think I should’ve posted this to http://www.refactormycode.com.
That being said, any budding rubyists reading this are getting a lot out of it, I am sure. Thanks for the great comments, all.
February 7, 2008 at 3:49 pm
Damn! Both the article and comments are good. (Except for this comment, which is meant simply as some tasty ‘words of encouragement’. I hope they are satisfying.)
Subscribed.
February 7, 2008 at 3:57 pm
I think there is a danger (a subtle one) in making Ruby code so elegant that it’s hard to understand. Yes, you can do that function in one line instead of two. Yes, you can now go order that $5 coffee and think yourself pretty cool about how you saved one line of code by being clever.
However, someone else may have to read your code later. Someone who doesn’t realize the difference between merge and reverse_merge. Or, someone who doesn’t quite understand Ruby as well as you do. Code is written to be read. Otherwise, we would all be busting out our ASM files and doing it old school in 1’s and 0’s. (forgive my extremism… I’m trying to hammer in a point).
You can write cool Ruby one-liners all day long that are subtle and powerful and can flip a guy using only his momentum and whatever finger happens to be closest to you. However, it doesn’t help the guy who has to maintain your code. It does not help the guy trying to learn Ruby, when straightforward idiomatic Ruby would best serve him.
Forgive my rant.
February 7, 2008 at 6:21 pm
In general, maybe, but sorry, Randito, in this case you’re wrong.
I’m not a fan of one-linerizing everything, and actually I probably wouldn’t write that all on one line, but it’s a pretty common idiom to use merge instead of having a (more confusing, imo) extra meaningless returned line. Anyone who isn’t used to following the basic idiomatic ruby enumerator methods is in for a long slog.
February 12, 2008 at 3:36 am
Tim, I agree. I use the merge version fairly frequently. It’s a nice, idiomatic pattern.
February 12, 2008 at 7:10 am
Indeed.
In retrospect, despite what I said, I might use the one-liner here (with a variable for the array, though). I mean, really, it’s just one statement and it’s not that bad to follow. As Steve Yegge recently blogged, your tolerance for code compression increases [as you get more experienced]. I definitely think that applies within a language as well as in general, as he meant it.
I do wish I proofread my comments better, though (left the parens off the second non-one-liner form, as well).
February 18, 2008 at 1:17 pm
Personally, whenever I do inject, I usually use “next” instead of just having a line with a value “falling out.”
So,
hash = array.inject({}) do |results, current|
  results[current[0]] = current[1]
  results
end
… becomes …
hash = array.inject({}) do |results, current|
  results[current[0]] = current[1]
  next results
end
In my opinion, it makes things make a little more sense.
February 18, 2008 at 1:26 pm
@Ben, I haven’t seen that before but I *really* like it as a convention for inject. It just makes the passing of the results on to the next iteration easier to recognize.
June 13, 2008 at 1:06 am
On the topic of next, if you wish to skip an item while doing a total, do
next (total) if item.shouldnt_be_counted?
otherwise total gets set to nil for the next iteration.
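A minimal sketch of that pitfall, using negative numbers as a stand-in for items that should not be counted (the data and the condition are assumptions for illustration). A bare next makes the block return nil, and the following iteration then tries to add to nil.

```ruby
items = [5, -1, 10, -3]

# Skipping correctly: pass the running total along untouched.
total = items.inject(0) do |sum, item|
  next sum if item < 0
  sum + item
end

# Skipping with a bare next: the accumulator becomes nil.
error = begin
  items.inject(0) do |sum, item|
    next if item < 0
    sum + item        # raises once sum is nil
  end
rescue NoMethodError => e
  e
end

puts total        # 15
puts error.class  # NoMethodError
```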