The burden of FP, part 2

Do you have anything to declare?

Part 1 looked critically an imperative-style solution to the first problem from Project Euler. This part of the discussion introduces a more functional style, using higher-order functions. The goal is to think more declaratively, familiar territory to any programmer who has used SQL.

In the solution from last time…

  def sumi(n: Int) = {
    var s = 0
    for (i <- 1 until n) {
      if (i % 3 == 0 || i % 5 == 0)
        s += i
    }
    s
  }

…it’s useful to think of the for expression as a query, instead of an imperative control structure. The subexpression 1 until n isn’t hardwired syntax of the for statement, but an expression for a set of values. Just as a program written against a database can be made simpler by enhancing the query, we can simplify the body of the for expression by moving the filter, as follows:

  def sumiq(n: Int) = {
    var s = 0
    for {
      i <- 1 until n
      if i % 3 == 0 || i % 5 == 0
    } s += i
    s
  }

That change is bigger than it might seem at first glance; it applies a filter to a set of values instead of applying a control structure to a single value. We can see this even more clearly by going to the Scala interpreter.

  scala> 1 until 10
  res0: Range = Range(1, 2, 3, 4, 5, 6, 7, 8, 9)

The input expression evaluates to an instance of Range; its values are subsequently pulled out and displayed by the interpreter. A sequence of values can be further filtered by a criterion, either written in-line (anonymously)…

  scala> (1 until 10).filter(i => i % 3 == 0 || i % 5 == 0)
  res1: Seq.Projection[Int] = RangeF(3, 5, 6, 9)

…or defined explicitly:

  scala> val divisibleBy3or5 = (i: Int) => i % 3 == 0 || i % 5 == 0
  divisibleBy3or5: (Int) => Boolean = <function>
  scala> (1 until 10).filter(divisibleBy3or5)
  res2: Seq.Projection[Int] = RangeF(3, 5, 6, 9)

Scala allows us to write single-argument methods as if they were operators, so the last input above can be simplified to:

  scala> 1 until 10 filter divisibleBy3or5
  res3: Seq.Projection[Int] = RangeF(3, 5, 6, 9)

So we now have a variety of ways to get the sequence of “interesting” values; for example, we could expose the “multiples of” concept as curried parameters to get something resembling this…

  def sumMultiples(a: Int, b: Int)(n: Int) = {
    val multiples = (i: Int) => i % a == 0 || i % b == 0
    val candidates = 1 until n filter multiples
    var s = 0
    for (i <- candidates) s += i
    s
  }

…but we still have in-line, imperative-looking code accumulating the sum.

Folding the result

We can extract the sum calculation by defining our own sum-of-sequence-of-integers function…

  def sumSeqInt(ns: Seq[Int]) = {
    var s = 0
    for (n <- ns) s += n
    s
  }

…but why reinvent the wheel? The function sumSeqInt is just a specific example of a more general pattern. Calculating a product or List[Int] has exactly the same structure; we only need to provide the appropriate initial value and operation.

Required
result

Parameters
Initial value Operation
Sum 0 +
Product 1 *
List Nil ::

That higher-order pattern is a fold, and comes in two flavors, depending on whether the accumulation begins at the left- or right-hand end of the sequence. As shown below, the symbolic notations (/: for foldLeft and :\ for foldRight) are visual mnemonics for the expression trees they represent, with the initial value of z (for “zero”) at the bottom left or the bottom right of the tree.

foldLfoldR.jpg

Using a fold, we can now express a complete solution, including parameters for the multipliers, in the form below…

  def sumfl(a: Int, b: Int)(n: Int) =
    (0 /: (1 until n filter(i => i % a == 0 || i % b == 0))) {_ + _}

…which is fairly close to the specification: “the sum of positive integers below n which are divisible by a or b“.

“The soul of wit”

That last version certainly looks very different from typical imperative code written in C or Java. The point is not to be cryptic, but to be concise. Improving the “signal-to-boilerplate ratio” brings two benefits:

  1. We lower the risk of error (typos or otherwise), which are usually proportional to the quantity of code, and
  2. If (when!) the specification changes, we can be more efficient about finding and making the required code changes.

From one perspective, higher-order functions allow us to eliminate explicit control structures, simplifying the organization of the code. From a different angle, they allow us to think in terms of whole data structure, rather than individual data elements (avoiding the functional analogue of primitive obsession.)

In the final post in this series, we’ll come at the problem from a different angle, which allows us to address performance.

Advertisements
Trackbacks are closed, but you can post a comment.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: