A Deep Dive into the Ruby Map Method

Feb 22nd, 2015 3:03 am ruby

I recently took a look at how the Ruby puts method works. Continuing with the same theme, let’s take a look at the map method from the Enumerable mixin.

The Enumerable mixin is one of the greatest parts of the Ruby programming language. Anything that mixes in Enumerable gets access to 51 commonly used methods (at the time of writing; the exact count varies by Ruby version) for working with, well, things that can be enumerated. The mixin is used in common data structures: both Array and Hash include Enumerable. Many other libraries implement some subset of Enumerable; for example, ActiveRecord::Relation, which is used by Rails.
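To see what mixing in Enumerable looks like, here is a minimal sketch. The Playlist class and its songs are made up for illustration; the only real requirement is that the class defines each, and Enumerable builds map and the rest on top of it.

```ruby
# Hypothetical collection class, used only for illustration.
class Playlist
  include Enumerable

  def initialize(*songs)
    @songs = songs
  end

  # Enumerable only asks for each; map, select and friends build on it.
  def each
    return enum_for(:each) unless block_given?
    @songs.each { |song| yield song }
  end
end

playlist = Playlist.new("intro", "verse", "chorus")
playlist.map { |song| song.upcase }  #=> ["INTRO", "VERSE", "CHORUS"]
playlist.include?("verse")           #=> true
```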

The Method

What does the map method do? The method signature, map {|obj| block } -> array, tells us that map takes a block and returns an array. The documentation tells us that the returned array is constructed by running the block on each element of the object that receives map, and adding the return value of the block to the array. The receiver itself is left unchanged. Hence:

[1, 2, 3, 4, 5].map{|x| x * 2} #=> [2, 4, 6, 8, 10]
{:one => 1, :two => 2}.map{|k, v| k.to_s + " " + v.to_s} #=> ["one 1", "two 2"]

A common pattern is to call a method on each object in the collection and use the return value of that method. As a shortcut, you can do this by passing “&:method_name” instead of an explicit block.

[1, 2, 3].map(&:to_s) #=> ["1", "2", "3"]
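Under the hood, the & asks Ruby to call to_proc on the symbol and use the result as the block. A small sketch of that equivalence (the variable names are just for illustration):

```ruby
# &:to_s makes Ruby call :to_s.to_proc and pass the result as the block.
to_s_proc = :to_s.to_proc
to_s_proc.call(42)                  #=> "42"

with_block  = [1, 2, 3].map { |x| x.to_s }
with_symbol = [1, 2, 3].map(&:to_s)
with_block == with_symbol           #=> true
```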

Though it is most common to pass a block to map using the curly brace ({}) syntax, you can pass a Ruby block in any of the other normal ways. For example, do...end works just as well:

["jim", "KIM", "rOb"].map do |name|
  name.downcase.capitalize
end
#=> ["Jim", "Kim", "Rob"]

You can also pass the block using a Proc object:

# Named to_string rather than proc, to avoid shadowing Kernel#proc
to_string = Proc.new { |arg| arg.to_s }
[1, 2, 3].map(&to_string) #=> ["1", "2", "3"]
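The same & trick works for other callable objects too. As a rough sketch, here is map with a lambda and with a Method object; the names below are chosen for illustration.

```ruby
# A lambda can be passed with & just like a Proc.new object.
double = ->(x) { x * 2 }
[1, 2, 3].map(&double)                  #=> [2, 4, 6]

# method(:Integer) wraps Kernel#Integer in a Method object;
# & converts it into a block for map.
["1", "2", "3"].map(&method(:Integer))  #=> [1, 2, 3]
```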

Examples

How can we use the map method in practice?

One of the best ways to use map is to refactor your code in a more functional style. If you are used to programming in a more imperative language, say Java, you might be used to writing code in this pattern:

new_array = []
old_array.each do |element|
  new_array.push(element.do_something)
end

This code can be made much more concise with map:

new_array = old_array.map{|x| x.do_something}
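Here is the same refactoring with concrete data, so you can run both versions and compare. The word list is made up for the example.

```ruby
words = ["apple", "banana", "cherry"]

# Imperative version: build the result by hand.
lengths_with_each = []
words.each do |word|
  lengths_with_each.push(word.length)
end

# map version: the result is simply the return value.
lengths_with_map = words.map { |word| word.length }

lengths_with_each  #=> [5, 6, 6]
lengths_with_map   #=> [5, 6, 6]
```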

A good rule of thumb is to remember that .each returns the collection it was called on, not anything computed by the block, so it is best used for cases where you want your code to cause side effects, like saving to a database or printing to the screen. If your .each block isn’t doing either of those things, it can probably be replaced with an alternative method.
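To make the rule of thumb concrete: each hands back its receiver, so whatever the block computes is thrown away. The sketch below shows that, plus a few alternatives for loops that are really computing something.

```ruby
numbers = [1, 2, 3, 4]

# each returns the receiver, whatever the block returns.
result = numbers.each { |n| n * 2 }
result.equal?(numbers)                  #=> true

# Alternatives when the loop is really computing a value:
numbers.map    { |n| n * 2 }            #=> [2, 4, 6, 8]
numbers.select { |n| n.even? }          #=> [2, 4]
numbers.inject(0) { |sum, n| sum + n }  #=> 10
```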

The map method really shines when combined with a couple more Enumerable methods. Suppose we have a Person object that has an age. Here, we calculate the variance of the ages in two short lines:

# to_f avoids integer division when the ages are Integers
mean = people.map(&:age).inject(:+) / people.count.to_f
people.map{|x| (x.age - mean) ** 2}.inject(:+) / people.count

This isn’t good style, but it shows the power that map gives you to write concise, functional code. Note that here we use map twice: the first time, it is part of people.map(&:age).inject(:+), and is used to extract the ages from the collection of people before they are summed. The second map is used to get the squared deviation of each age from the mean, which is required to calculate the variance.
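For comparison, here is one way the same calculation might be split into small helpers. This is only a sketch: the Person struct, the mean and variance helpers, and the sample ages are all made up for illustration, and it computes the population variance (dividing by the count, as above).

```ruby
# Person is a hypothetical stand-in for the Person object mentioned above.
Person = Struct.new(:name, :age)

def mean(values)
  values.inject(:+) / values.size.to_f
end

# Population variance: the mean of the squared deviations from the mean.
def variance(values)
  m = mean(values)
  mean(values.map { |v| (v - m)**2 })
end

people = [
  Person.new("Ann", 20), Person.new("Bob", 30),
  Person.new("Cy", 40),  Person.new("Dee", 50)
]
ages = people.map(&:age)

mean(ages)      #=> 35.0
variance(ages)  #=> 125.0
```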

How is Map Defined?

Like most of the core Ruby libraries, map is actually defined directly in C.

On line 2025 of enumerator.c, we see a map that is defined in exactly the same way as collect, which is a synonym. It refers to the C function lazy_map, which in turn delegates to an internal function lazy_set_method and calls lazy_map_func. lazy_set_method takes care of associating the code block you pass with the lazy enumerator; lazy_map_func actually calls the block on each element¹. One caveat: enumerator.c is the home of Enumerator and Enumerator::Lazy, so this is the map you get after calling .lazy on a collection. The everyday Enumerable#map is defined in enum.c, and Array has its own implementation in array.c; both are eager and build the whole array up front, as the method signature above promises.

The key insight here is that Ruby keeps a lazy enumeration lazy once you have asked for one. That makes the underlying C code much harder to understand, but it provides some great benefits for the end user: after calling .lazy, you can code with confidence knowing that map will keep your lazy sequence lazy!

To get the hang of using map, look at some of your each loops and see if you can replace them with a call to map.
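Going back to the lazy map for a moment, here is a minimal sketch of the difference in practice, assuming a Ruby version that ships Enumerator::Lazy (2.0 or later). Plain map returns an Array right away, while map on a lazy enumerator does no work until elements are requested, which is why it can handle an infinite range.

```ruby
# Plain map is eager: it walks the whole collection and returns an Array.
eager = (1..5).map { |x| x * 2 }
eager.class     #=> Array

# After .lazy, map returns an Enumerator::Lazy and computes nothing yet.
lazy = (1..Float::INFINITY).lazy.map { |x| x * 2 }
lazy.class      #=> Enumerator::Lazy

# Elements are computed only as they are requested.
lazy.first(3)   #=> [2, 4, 6]
```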

Happy coding!

¹ I think. Still learning this C stuff.

SecondForge

james c gibson