Category Archives: programming languages

Intermission: Google Squared

Google Squared is an impressive (but still beta) offering from The Google that seems to have some real potential for students and others who want to do quick-and-maybe-not-so-dirty research.

This post is also a bit of an experiment. I want to find out whether I can capture screen shots that are large enough to be (at least partly) readable without completely blowing up the page layout I’m currently using for this blog.

Given the subject matter of this blog, it’s no surprise that I thought of this query:

Google squared query for programming languages

which gave me the following result:

First query result

I admit to being surprised and impressed with the collection of values presented with no additional guidance. However, I want to refine the content a little. First, I’ll click on the [X] column to the left to remove Pascal, Fortran, Cobol, and Forth (none of which I’m currently using ;-); then I’ll use the “Add items” input field to include Java, Scala, Erlang, and Haskell. Some fairly interesting look-ahead kicked in as I began typing:

Typing J provided a hint for Jython

With all my new rows in place, I have this:

Updated query results with new rows

It’s interesting that “Appeared In” was not found for Java, but Google Squared is a beta. I’m impressed that it chose to put that column in, and did find values for the other languages!

The next hint of the underlying sophistication in G2 came when I decided to modify the columns. After removing “Influenced” and “Appeared In”, I started to add replacement columns, and was offered an interesting set of options.

List of proposed columns to add

Curiosity took over, and I picked “Typing Discipline” from the list. The resulting column confirmed that the term was being used correctly (e.g. with values of “duck,dynamic,strong” for Python and “static,strong,inferred” for Haskell).

Replacing that column with new ones for “tutorial” and “blogs” gave me some additional leads to follow:

Updated results with new columns

When I clicked on the tutorial column in the Scala row, G2 provided a pop-up with a link to Bill Venners’ presentation on Scala at last year’s QCon.

Pop-up with link to Bill's tutorial

I won’t burden this page (or you, kind reader) with any more screen shots from G2; perhaps by now you’ve seen enough that you’ve already left to try it out yourself or have decided it’s not (yet) for you.

There’s an obvious comparison begging to be done here; Wolfram Alpha wasn’t immediately successful for this particular search:

Wolfram|Alpha isn't sure what to do with your input.

…but let’s remember that both of these new tools are work in progress.

My overall impression is that it’s a very impressive start, but has room for improvement. Several of my attempts to add other columns relevant to this blog (such as “DSL” or “parsing”) yielded “No value found” in most or all rows. Even the proposal from G2 of “Major Implementations” was only partially successful. There were multiple values for all languages except Java, Scala, and Erlang, all three of which got “No value found”.

It would be really interesting to be able to derive new columns from the contents of others. For example, counting the number of values in a list or doing arithmetic with numeric values would both be handy.

You may have noticed in the full-page screenshots the link at the top-right-hand corner that invited me to “Sign in to save your Square”. I did so, and plan to come back later to see if the results change over time.

I’m very interested in seeing how G2 grows from this not-so-humble beginning.

BuilderBuilder: The Agenda

The “BuilderBuilder” posts will be my exploration of one question: “What does Functional Programming mean to me as a practicing OO software developer?” Despite contemporary buzz about how FP will save the world from bugs and single-threaded code, I don’t want to read (or write) another clever Fibonacci number generator, or another theoretical explanation of category theory and monads. I want to see what FP is like when it sits down the aisle from me in work clothes.

To pursue that goal, I’m going to explore a programming problem that meets these criteria:

  1. Its purpose would be recognizable to a working programmer.
  2. It is large enough that a clever one-liner won’t suffice.
  3. It isn’t (consciously) chosen to cater to either OO or FP.
  4. It will require me to try things outside my daily comfort zone.

That last point means that I’ll make mistakes and do stupid things. I hope the record, warts and all, helps someone else (including me, later on) avoid them.

The problem I’ve chosen is a specific code generation task, which I’ll describe in the next post. There is a wide range of opinions on source code generation, but source code is one thing that any programmer understands, regardless of other application domain specialization.

Because my goal is exploration, and I’m starting with unequal familiarity with the languages I’ll use, the code almost certainly won’t be idiomatic or make best use of all available libraries. In fact, I’ll probably do more “by hand” than necessary, to avoid bias toward more familiar tools. I may be able to address some of that along the way, based on accumulated experience (or helpful feedback, hint, hint… ;-).

OK, enough meta-talk. Next up, the task.

On closed range boundaries

Programmers often need to talk about ranges of values (whether or not the language at hand supports the concept explicitly). For example, given the Java array

String[] streetLines = new String[3];

an index i must satisfy

0 ≤ i < 3 (in normal Mathematical notation)

or

0 <= i && i < 3 (in Java notation)

to be a valid index on streetLines. But how does one read those expressions aloud?

I learned from Dijkstra to read <= as “at most” and >= as “at least”. Those readings have at least two benefits:

  1. They are quite natural. For example, the assertion

    there are at least six eggs left in the carton

    is at least as obvious in normal speech as the equivalent phrase

    there are six or more eggs in the carton

    and is probably much more obvious than other equivalent phrases, such as

    the number of eggs left in the carton is greater than or equal to six

    or

    there are more than five eggs left in the carton

    even though we’ve seen the equivalent of those in code many times.
     

  2. They avoid introducing extraneous logical operators. The phrasing

    less than or equal to

    seems more complex than

    at most

    because it implies the use of disjunction (logical “or”) in its evaluation. The other equivalent phrase

    not greater than

    introduces negation into a simple comparison.

As a tiny “easter egg”, look for uses of “at least” en passant in the preceding few sentences! If they went by without being obvious, perhaps that serves as evidence of point 1 above. 😉

“Linguistics” vs. Mathematics?

I happened across an interesting post on Chris Okasaki’s blog, titled Less than vs Greater than. Let me suggest that you read it before continuing here.

I would paraphrase his point about errors he observed in students’ programs as follows:

A student who writes an expression such as

expL < expR

often appears to lock on the concept of “something being smaller” and then become mentally stuck on “dealing with smaller stuff”, even if that is not appropriate for the meaning of expL and expR at that point in the program.

He illustrates his point very nicely with a flawed binary search.

For quite some time I’ve tended to write expressions of the form

lowerBound <= someVar && someVar < upperBound

to express that someVar is within the half-open range

[lowerBound..upperBound)

e.g. array subscripts that must be at least zero, but less than the length of the array. The visual hint of placing the variable textually between the limiting values seemed nicely mnemonic to me. (From there, it has also seemed natural to experiment with preferring left-to-right, lesser-to-greater ordering consistently in comparison expressions, although I’m not suggesting that as a universal coding convention).
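To make the convention concrete, here is a minimal Scala sketch; the helper name inHalfOpenRange is mine, purely for illustration:

```scala
// Membership in the half-open range [lowerBound, upperBound),
// with the tested value written textually between the two limits.
def inHalfOpenRange(lowerBound: Int, someVar: Int, upperBound: Int): Boolean =
  lowerBound <= someVar && someVar < upperBound

// Valid subscripts for a three-element array:
inHalfOpenRange(0, 2, 3)  // true
inHalfOpenRange(0, 3, 3)  // false
```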

However, I was quite surprised by the first comment, which described as “linguistically wrong” the common C-language-based idiom

if (5 == i)

(intended to catch typos involving only a single equal sign). The commenter said

Linguistically that’s wrong. You’re not testing 5 to see if it’s equal to i, you’re testing i.

which implies an asymmetrical interpretation of equality as being a “test” on its left-hand value!

By ignoring the fact that == is simply an assertion that its operands are equal, that interpretation fails on many completely valid cases, such as

(a + 1 == b - 1)

and the first clause

lowerBound <= someVar

of the “within bounds” cliché above.

That misinterpretation of == seems consistent with a variety of incorrect or awkward uses of boolean expressions widely seen. Many of us have probably seen code (or read blog posts about code) resembling:

boolean fooIsEmpty;
...
if (foo == null) {
    fooIsEmpty = true;
} else {
    if (foo.length() == 0) {
        fooIsEmpty = true;
    } else {
        fooIsEmpty = false;
    }
}

I suspect two culprits behind this kind of abuse:

  1. Failure to include the right kind and amount of Mathematics in the education of a programmer, and
  2. The C/FORTRAN use of = for assignment.

With respect to point 1, I believe that Boolean Algebra is simply a fundamental skill for programming. The ability to manipulate and understand boolean expressions is at least as important as the ability to deal with numeric expressions for most programming of the kind I see daily. It is understandable that a programmer whose training never addressed boolean expressions except in the context of if() or while() would be uncomfortable with other uses, but that’s a problem to be solved, not a permanent condition to be endured.
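As a small illustration of treating boolean expressions as ordinary values, the if/else cascade shown earlier collapses into a single expression. This is a sketch in Scala rather than Java, and the helper name is mine:

```scala
// The earlier null/length test as one boolean expression.
// Short-circuit || also protects the length test from a null reference.
def fooIsEmpty(foo: String): Boolean =
  foo == null || foo.length == 0
```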

Regarding point 2, much of what I’ve read or experienced supports the idea that FORTRAN—and later, C—were more commonly used in the US than Algol or other alternatives due to political, cultural, and commercial issues rather than technical ones. (I recommend Dijkstra’s “A new science, from birth to maturity” and Gabriel’s “Worse is Better” as good starting points.)

Whether that conclusion is valid or flawed, the decision to use = for the (asymmetrical!) “assignment” operation immediately creates two new problems:

  1. The need to express the “equality test” in a way that won’t be confused with assignment; and
  2. The risk that users of the language become confused about the symmetry of equality, due to guilt by association with the “assignment” operator.

Algol—and its descendants, including Pascal—avoided those problems by using the asymmetrical := for assignment. The ! character is used as an operator or suffix in other languages to express destructive value-setting. But Java, C#, and others have followed the FORTRAN/C convention, with the predictable effects.

Given such far-reaching consequences from a single character in the source code, I’m reminded again how important it is to make good choices in coding style, API design, and other naming contexts.


Postscript:

The left-to-right ordering of the number line has some interesting cultural and even individual baggage. I recall reading about a mathematician whose personal mental model was of negative values being behind him and positive values receding into the distance in front of him.

I believe it was in connection with the Dutch National Flag Problem that Edsger Dijkstra wrote about a student whose native language was written right-to-left. While other students had designed programs with indices increasing from zero, that student had produced an equally-valid program with an index that decreased from the maximal value.

Scala and Programming 2.0

I commented elsewhere on how the “Architecture of Participation” idea may be percolating into the field of programming languages. I am especially interested in seeing whether the adoption of Scala provides evidence of this phenomenon.

Scala is a strongly, statically typed language implemented on the JVM—all characteristics that raise eyebrows (if not noses) in some circles. However, Scala’s ultralight approach to syntax is very much in line with the current taste for highly flexible notation and internal DSLs as a tool of expression.

Type inference is a very attractive compiler feature. And it’s great to get performance improvements “for free” every time the JVM team makes HotSpot smarter about JIT compilation, method in-lining, etc. But every time I revisit the ideas in Martin Odersky’s Scala talk at Javapolis 2007, I’m impressed with the design of Scala as a notation that invites participation.

Time will tell.

Even though it was published on April 1st…

…there’s evidence that some folks still haven’t completely bought into the idea of pure functional programming (even those who are somewhat sympathetic… 😉

Scala currying without lab rats

What is it?

In normal function evaluation, you supply all of the arguments of a function and get back a result. Currying is a technique for specializing a function by providing some of the arguments, in which case you get back a function with fewer arguments. This technique is widely used in functional programming, and is named after logician and mathematician Haskell Curry (who is also honored by a programming language using his first name).

As the well-known joke goes, “I’m glad we don’t have to call it Schoenfinkeling!”

A practical example

Some US states compute sales tax by a scheme similar to the following: there are two rates applied, one to the entire taxable value, the other to some part of the value up to a limit. (And it obviously gets more complicated than that, but this is enough real-world complexity for our example.)

For example, here are the sales tax policies for three hypothetical states.

Abbr.  State       City        Tax computation
TF     Terrafirma  n/a         2% of the entire value
TC     Terracotta  n/a         6% on the first $1500
CF     Confusion   Gotham      5% on the entire value plus 4.5% on the first $1000
CF     Confusion   Ocean City  5% on the entire value plus 3% on the first $1000
CF     Confusion   Hometown    5% on the entire value plus 1.5% on the first $1000

Without currying

With a helper to compute a rounded percentage…

    def pct(rate: Double, amt: Long) =
      (rate * amt / 100.0D + 0.5D).toLong

… we could fit all these cases into the single method below. (To keep things simple, we’re representing money amounts as whole cents.)

    def tax(rateA: Double, limit: Long, rateB: Double, amt: Long) =
      pct (rateA, amt) + pct (rateB, limit min amt)

With that general method in hand, we can define functions for the various locations.

    def taxTF         (amt: Long) = tax(2.0D,      0, 0.0D, amt)
    def taxTC         (amt: Long) = tax(0.0D, 150000, 6.0D, amt)
    def taxGothamCF   (amt: Long) = tax(5.0D, 100000, 4.5D, amt)
    def taxOceanCityCF(amt: Long) = tax(5.0D, 100000, 3.0D, amt)
    def taxHometownCF (amt: Long) = tax(5.0D, 100000, 1.5D, amt)

The significant changes are hidden in the mass of boilerplate code, which means a large amount of redundant typing (and reading). Let’s look at the difference when we revise the implementation to use currying.

Now with a dash of curry

First, we replace the method above with a curried function. (The extra line-breaks aren’t significant; they’re just fitting the slightly longer line into a narrow blog page layout.)

    def tax =
      (rateA: Double) =>
        (limit: Long  ) =>
          (rateB: Double) =>
            (amt:   Long  ) =>
              pct (rateA, amt) + pct (rateB, limit min amt)

The first improvement shows up when we define the sales tax functions for Terrafirma and Terracotta:

    def taxTF = tax( 2.0D)(     0)(0.0D)
    def taxTC = tax( 0.0D)(150000)(6.0D)

Currying lets us define the specialized functions directly and simply, without the need for all the boilerplate. Providing only the first three arguments to tax, we get back functions that take the one remaining argument, the amount of the sale.

It gets even nicer when we start to deal with the great state of Confusion. We begin by defining a state-level function…

    def taxCF = tax( 5.0D)(100000)

… which we then use in the definitions of the individual locales.

    def taxGothamCF    = taxCF( 4.5D)
    def taxOceanCityCF = taxCF( 3.0D)
    def taxHometownCF  = taxCF( 1.5D)

Could we do this without currying? Of course! But the point is that the curried version eliminates almost all of the typing except for the part that’s significant: the arguments that change for each specialized version. By supporting currying and the use of functions as first-class values (e.g., as arguments of other functions, as returned values from functions, etc.), Scala and other functional languages encourage us to think of composing programs at a much finer-grained level.

As a small example, we can create a function such as:

    def report(taxf: (Long => Long), amts: List[Long]) = {
      println("Tax on " + amts + " is " + taxf(amts.foldRight(0L)(_ + _)))
    }

Given a list of item prices…

    val items = List[Long](20000, 40000, 60000, 80000)

…we can apply report with a defined tax function:

    report(taxHometownCF, items)

We can also apply it to an inline function literal based on our curried tax functions:

    report(taxCF(0.5D), items)

And that reflects the larger goal of making the definition, customization, and application of functions as painless as possible.
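For anyone who wants to run the example end to end, here is the material above gathered into one self-contained sketch. Note a deliberate tweak: this version of report returns the total instead of printing it, purely to make the result easy to check; the rates are the ones from the table.

```scala
object CurriedTax {
  // Rounded percentage, on amounts held as whole cents.
  def pct(rate: Double, amt: Long): Long =
    (rate * amt / 100.0D + 0.5D).toLong

  // Curried tax: supply the rates and limit now, the sale amount later.
  def tax: Double => Long => Double => Long => Long =
    rateA => limit => rateB => amt =>
      pct(rateA, amt) + pct(rateB, limit min amt)

  val taxTF       = tax(2.0D)(0L)(0.0D)  // Terrafirma: 2% of everything
  val taxCF       = tax(5.0D)(100000L)   // state of Confusion, partially applied
  val taxGothamCF = taxCF(4.5D)          // Gotham: plus 4.5% on the first $1000

  // Total tax over a list of item prices (returns rather than prints).
  def report(taxf: Long => Long, amts: List[Long]): Long =
    taxf(amts.foldRight(0L)(_ + _))
}
```

For example, CurriedTax.taxTF(100000L) yields 2000 cents of tax on a $1000.00 sale, and taxGothamCF applied to a $2000.00 sale adds the 4.5% surcharge only on the first $1000.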

What about those lab rats?

There was a recent discussion on Artima of an article on currying. The original author used illustrations based on simple numerical calculations, something I’ve done in many conversations or classes.

My rationale has always been this: I want to illustrate a technique without requiring that the reader/audience spend time learning the problem domain. Presumably anyone who would read an article about a programming technique would already know arithmetic, and therefore could concentrate on the technique instead of the example. Biased by my own background, I tend to think of the Euclidean algorithm, factorials, and Fibonacci numbers as the standard lab rats of computing. But I’ve slowly begun to appreciate that people whose learning/thinking style is more inductive than deductive may prefer concrete examples that resemble their day-to-day programming tasks.

Language(s) for teaching programming

I’ve seen (and participated in) a number of discussions recently about the selection of a first programming language. I don’t think the choice is a trivial matter, nor do I think there’s necessarily a single right answer; it depends on the overall goals of a curriculum. Here are some choice strategies I’ve seen discussed recently:

  • Commercially-popular language
    Pros:

    • students see a connection to the job market,
    • prospective employers may like it,
    • students may have had prior exposure (especially non-traditional students), and
    • a wealth of related materials (books, web pages, etc.) is available.

    Cons (especially if the curriculum doesn’t provide multi-language experience):

    • risks biasing the student’s perception of programming,
    • risks putting the student into the “knowledge half-life” problem, and
    • risks the nasty choice, as language trends change, between churn in the curriculum or becoming less “relevant”.
  • Academically-popular research language
    Pros:

    • professors and grad students may like it, and
    • some of these languages explore ideas that may be important by the time the students graduate.

    Cons:

    • similar to the “cons” list above (no pun intended!).
  • Teaching-oriented language (remember Pascal?)
    Pros:

    Cons:

    • can risk separation of “theory” and “practice”,
    • teaching-oriented aspects may shield students from the “pain” of production programming, and
    • prospective employers may strongly prefer “practical” language experience.

Following the line of thinking from the previous post, I believe that a language can serve as a bridge from one style of thinking and problem solving to another, if it is properly designed for that purpose. Going further, the book Concepts, Techniques, and Models of Computer Programming makes use of a multi-paradigm approach in which a kernel language is extended in various ways to expose the student/reader to a variety of styles of programming. I’m currently focusing my spare time on Scala, but I’m intrigued by Van Roy and Haridi’s use of Oz to explore the style “spokes” out from a simple “hub” of basic computation concepts.

Maybe I can get back to Mozart/Oz after I’ve mastered Scala. Of course, I’d have to make friends with emacs

Scala tutorial on IBM developerWorks

IBM developerWorks has the first two parts of an introduction to Scala by Ted Neward. The series looks good so far, and I’m eager to see more.

I was particularly interested to see in the second article that Neward described Scala as “a better Java”, a theme that Dianne Marsh and I explored in the Scala workshops we did at Java Posse Roundup 2008. I’m tidying up the material we used in the first presentation to make it more blog-friendly, and will post it soon.

For me, one key point of that description is the way Scala provides a graceful transition from Java to functional programming, just as Java offered a graceful transition from C to object-oriented programming.

ScalaBridge.jpg

Java’s similarity to C allowed a developer to quickly begin writing procedural Java “with a C accent”, and then transition to an object-oriented style. In the same sense, a developer can quickly begin writing Scala “with a Java accent”, and then begin to shift to a more functional style.

Scala makes this an attractive proposition by offering some immediate benefits. The canonical example is type inference, which allows the programmer to skip writing redundant type specifications. But more about that as I get the workshop material posted.
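As a tiny, generic illustration of that benefit (not taken from the workshop material), the two declarations below are equivalent, but the second lets the compiler do the work:

```scala
object InferenceDemo {
  val langs1: List[String] = List("Java", "Scala")  // explicit, Java-accented
  val langs2 = List("Java", "Scala")                // type inferred as List[String]
}
```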

Concision, Intuition, and Parsimony

Paul Graham seems to have reservations about Paul Prescod’s views on contrasting programming language design aspects:

Python’s goal is regularity and readability, not succinctness.

There’s also been a bit of a flame-war in the Scala community over the use of symbolic versus verbal notation, as in /: instead of foldLeft. Unfortunately that debate has seen many uses of the “I” word (intuition). The Wiktionary definition begins “Immediate cognition without the use of conscious or rational processes.” I become skeptical as soon as that word enters a conversation about programming, because “intuitive” often strikes me as a synonym for “familiar”. Too often some variation of the phrase “that’s non-intuitive” seems to be a judgmental variant of “I’ve never seen that before”.

This also seems to connect with the principle of parsimony (often conflated with Zipf’s Law, but not always): frequency and cost of expression have an inverse relationship. Or, in clear text, “use short words oftenest”. Is it any surprise that when a person moves from one domain of discourse to another, the shift in subject matter implies a shift in what must be expressible economically? To apply the question to programming, should we assume that all programming languages need to talk about the same concepts? Should things that are easy (economical and/or symbolic) to express in one language be similarly easy in all other languages?

Should experience with one language dominate our expectations about notation in another, if the two really address different application domains or conceptual models of programming? I don’t think so.