Wednesday, June 5, 2013

Cooking with For Comprehension

I have been repeatedly trying to understand Iteratees. There are some excellent examples in Scala (Josh Suereth and RunĂ¡r). However everytime I try to follow them, Monads started showing up, and soon I would be neck deep in Scalaz. Monad is a pattern I’m still struggling with. Reading Learn You a Haskell for the Greater Good helped a lot, but I still don’t feel comfortable around monads.

However this was not enough to dissuade me from getting Iteratees. I turned to Haskell to see if I’d get some insight. Reading some articles on iteratees in Haskell I stumbled upon this one this post by Magnus, and the last line of the blog was great “That’s pretty much it. Not that much to it.” So this made me wonder if I was not complicating things too much.

All iteratee examples I saw used for comprehension (or the Haskell equivalent). Maybe my problem was not in the monads. I decided to take baby steps and see if I could understand how for comprehension works.

So I started with the following problem, let’s say I have a Cook defined as follows:
trait Cook[T] {
  def cookFor(guestCount: Int): T
}

This cook will get a number of guests and produce something to feed them.

Now I want to get several cooks to work together and deliver a meal. Based on the examples I found in the post above, this should be something like:
val mealCook = for {
  appetizer <- appetizerCook
  mainCourse <- mainCook
  dessert <- dessertCook
} yield (appetizer, mainCourse, dessert)

The mealCook should have the cookFor method on it and deliver whatever types the meal is composed of.  But how exactly does this work? How will the compiler figure out the types, how will a single cookFor call weave all these cooks together. This was the crux of my lack of understanding. I had used these for-comprehensions before in Martin Odersky’s excellent “Functional Programming in Scala” (available at Coursera). But I really didn’t understand how it worked.

In order to get the example above to compile we need to add couple of methods on our Cook trait. From the compiler error and previous reading I knew it needed the map and flatMap methods implemented. And to make matter easier to investigate, I added type manifests and a self type:
trait Cook[T] {
  self =>
  def cookFor(guestCount: Int): T
  def flatMap[S](f: T => Cook[S])(implicit manifest: Manifest[S]): Cook[S] = ???
  def map[S](f: T => S)(implicit manifest: Manifest[S]): Cook[S] = ???
}

The self type is used to allow us to get the current cook when implementing the methods, as you’ll see below. I will not have the type manifest in all examples for brevity, but they did help when trying to look at types.

Now all we need are some cooks that do real work:
val appetizerCook = new Cook[List[Int]] {
  def cookFor(guestCount: Int): List[Int] = (1 to guestCount).map(_ + guestCount).toList
}

val dessertCook = new Cook[String] {
  def cookFor(guestCount: Int): String = "(> " * guestCount
}

val mainCourseCook = new Cook[List[String]] {
  def cookFor(guestCount: Int) = List("Meat", "Fish", "Chicken", "Shrimp", "Vegan").take(guestCount)
}

The code now compiles but will not execute because of the incomplete implementations of the map and flatMap methods. However even at this stage we can look at the type of mealCook, which is:
val mealCook: Cook[(List[Int], List[String], String)] = ...

Notice that each cook produces its own type, and these types are captured in the composed cook. This however does not shed light on how the cooks work together. And looking at the type signatures for both map and flatMap I could not explain how the compiler derived the types or how a for-comprehension will get them to work together.

So let’s proceed in our investigation. Next obvious step is to implement the missing methods. This should shed some light on how the combined cook works.

Staring with map, defined as follows:
def map[S](f: T => S): Cook[S]

The type signature requires us to return a Cook[S], but we don’t know anything about S. The only thing we have is a function from T to S. We could do something like:
  def map[S](f: T => S): Cook[S] = {
    new Cook[S] {
      def cookFor(guestCount: Int): S = f(???)
    }
  }

But f needs a value of T, and we don’t have any at hand. However we can get one by calling cookFor of the current cook (represented by self):
  def map[S](f: T => S): Cook[S] = {
    new Cook[S] {
      def cookFor(guestCount: Int): S = f(self.cookFor(guestCount))
    }
  }
We now have something that given a guest count, create a T and then uses f to transform it to an S. But what is the type of S? If we print out the manifest we see it is the type of the value yielded by the for-comprehension. In this source here version we output a lot of information as we execute. For example when flatMap and map are applied we see:
Appetizer: flatMapping to scala.Tuple3[scala.collection.immutable.List[Int], scala.collection.immutable.List[java.lang.String], java.lang.String]
MainCourse: flatMapping to scala.Tuple3[scala.collection.immutable.List[Int], scala.collection.immutable.List[java.lang.String], java.lang.String]
Dessert: mapping to scala.Tuple3[scala.collection.immutable.List[Int], scala.collection.immutable.List[java.lang.String], java.lang.String]

Notice both map and flatMap have the same type manifest for S. My take on this is that f is injecting our value of T into the result of mealCook.    

Now let’s take a crack at flatMap, defined thusly:
  def flatMap[S](f: T => Cook[S]): Cook[S]

Again from it’s signature we can have no clue of what S is. And in this case we don’t have a function that returns an S, instead it returns a Cook of S.

Once again we need to start by return some Cook. Since no information on S is available we need to call f like we did for map. But this returns a Cook and not S; in order to get an S we need to call cookFor on whatever f returned.
  def flatMap[S](f: T => Cook[S]): Cook[S] = {
    new Cook[S] {
      def cookFor(guestCount: Int): S = f(self.cookFor(guestCount)).cookFor(guestCount)
    }
  }
So we see how flatMap links the two cooks, and the invocation of cookFor on both. And again if we print out the manifest of S we would get the type yielded by the for-comprehension. So flatMap gets the food cooked by the current cook and placed it in a cook that returns the type of the comprehension (passing it to f). When we invoke the Cook returned by f we get our value of type S.

But how does the compiler extract the types of each cook? This must be linked to the map and flatMap function.  To prove this we can do a simple change like this (full example can be found here):
  def flatMap[S](f: Int => Cook[S]): Cook[S] = ???
  def map[S](f: Float => S): Cook[S] = ???

Our mealCook type changes to Tuple3[Int, Int, Float]. What this shows is that the type of the function input argument of both flatMap and map defines the types of the generator (which is what each <- in the for-comprehension is called). In our original cooking example this argument is always of type T each cook. The compiler knows that each field in the output tuple will have the same value as the T in the cook used as a generator.

We now know that map’s and flatMap’s f function’s input parameter is the type of the generator. And f’s output is the type of the for-comprehension. The implementation of these two methods is what defines how the Cooks are coordinated to work together creating a meal.

I also did a version that outputs messages while executing the code (source here). This version allows us to see how the execution flows through each cook. The main function is the following:
  def main(args: Array[String]) {
    println("---- Each cook ----")
    println(appetizerCook.cookFor(4))
    println(dessertCook.cookFor(4))
    println(mainCourseCook.cookFor(4))
    println("---- Will start cooking ----")
    println(mealCook.cookFor(4))
  }

During execution each cook will output it's name and what it's doing. The output that it produces is as follows:
Appetizer: flatMapping to scala.Tuple3[scala.collection.immutable.List[Int], scala.collection.immutable.List[java.lang.String], java.lang.String]
---- Each cook ----
Appetizer: execute cook
List(5, 6, 7, 8)
Dessert: execute cook
(> (> (> (> 
MainCourse: execute cook
List(Meat, Fish, Chicken, Shrimp)
---- Will start cooking ----
Appetizer: cookFor in flatMap
Appetizer: execute cook
MainCourse: flatMapping to scala.Tuple3[scala.collection.immutable.List[Int], scala.collection.immutable.List[java.lang.String], java.lang.String]
MainCourse: cookFor in flatMap
MainCourse: execute cook
Dessert: mapping to scala.Tuple3[scala.collection.immutable.List[Int], scala.collection.immutable.List[java.lang.String], java.lang.String]
Dessert: cookFor in map
Dessert: execute cook
(List(5, 6, 7, 8),List(Meat, Fish, Chicken, Shrimp),(> (> (> (> )

The first line is actually called during the initialization, before the main function. Then we see the output of each cook. When we ask for the mealCook's food we see the chaining of all the flatMaps and a map to produce the final output. Each cook is called once, and all of them map to the same S output type.

Some interesting things I learned in this experiment:
  1. Because we have no information on what S is, there are few options on what we can do in the map and flatMap functions. We must use the supplied function just to get the types to align.
  2. The first generator in the for-comprehension will dictate the type of the next generators. Each generator may have its own type parameters, but they all will be of the same class. This is the type constructor we see in some of the Monad examples.
  3. When we look at the individual Cook implementations we see they are oblivious to all the logic and control to connect the cooks. We only focus on what they cook, and leave the connection to the for-comprehension.
This last point was somewhat a revelation to me. This is one of the reasons functional programming is so interesting. I’m telling what each cooks does and how I’d like them connected, but not how to execute the connected cooks. Of course the designer of the Cook trait has to know what happens, but not someone extending a Cook. This is even more impressive when we get to things like Iteratees, which I hope to cover in another post.