Novice Lambda Use in Java

Java provides two ways to easily provide a reference to a function:

button.setOnAction(e -> showAlert("pressed"));
stream2 = stream.map(Object::toString);

The first one is generally referred to as a lambda, and the second as a method reference, but I’m going to refer to them both as lambdas for this post.

In line with the previous post on enum use, this post looks at lambda use in our Blackbox data set (collected from users of the BlueJ beginners’ Java IDE). All data is from the beginning of the project in mid-2013 up until the end of February 2016, a few days ago.

Lambdas are a very recent addition to Java. Java stems from 1996, but lambdas were only introduced in Java 8 in March 2014. Thus it is not surprising that they will not occur very frequently in the data set, as instructors are likely not up to speed on lambdas yet, won’t have had much chance to adjust their course designs, and often treat lambdas as an advanced topic, in contrast to BlueJ’s novice focus. But let’s take a look anyway.

In the data we have 11,666,331 source files which have been successfully compiled at least once. I looked at the most recent successful compilation of each of those source files, to see if they contained a lambda. 6,669 (0.05%) of those source files contained a lambda, with 20,698 lambdas overall. So although lambdas are very rare, the average number used once they are used is three per file. That suggests to me that people find them quite useful, but that they have not gained any traction in education yet.

Number of parameters

For the lambda arrow syntax (i.e. params -> code), I counted the number of parameters on the left-hand side of the arrow:

lambda-params

The reason that 3 and 4 are on there are that there is one instance of each, from 20k lambdas, with no lambdas having more than four parameters. So if the language designers had restricted lambdas to two parameters max, our users would not have noticed.

Haskell programmers, among others, may be a bit confused by the zero-parameter lambda. This is Java’s equivalent of the use of the () -> code pattern in Ocaml/F# to refer to some code but delay its evaluation until it is called later without any real parameters. Haskell would use a monadic computation of a type like “IO ()” for this purpose.

Styles of use

I categorise lambda style into three mutually exclusive categories:

list.forEach(x -> print(x.toString())); // Expression RHS
list.forEach(x -> {print(x.toString());...}); // Block RHS
list.stream().map(Object::toString)... // Method reference

Of our 20,698 lambdas, 14,014 (67.7%) used the expression form, 6,161 (29.8%) used the block form, and 523 (2.5%) used the method reference form. I wonder if a bit of this is advertising: I wasn’t aware of the method reference form in Java for a little while after I learnt about lambdas in the language. I use method references whenever possible in my code because I’m used to point-free style in Haskell. However, I suspect it’s more difficult for novices to understand than the lambda form, so I’m not surprised it’s used much less than the arrow form.

Destinations of Lambdas

I’m doing a syntactic analysis here, so I didn’t attempt the complex semantic task of working out which types the lambdas were being compiled into. However, I did look at which methods the lambdas were being passed to. That is, if you have code like:

myList.map(e -> e.getX())

Then I recorded “map” as the lambda destination. The graph below gives the most popular destination methods for lambdas. The y-axis is purely cosmetic to separate out items with similar frequencies, and I only show frequency 200 upwards because it gets congested below that:

lambda-dest

Since there are low numbers of lambdas, some of this is definitely influenced by individual courses appearing in the data. For example, I tracked the addButton method back to a utility class from a particular university, and I think accumulate has a similar story. But the general pattern is pretty apparent: lambdas are used much more for GUI event handlers than for streams. This is not surprising in one sense: existing code is more likely to use event handlers (which pre-date lambdas) than streams (which were introduced alongside lambdas), so GUIs may be the easiest way to add lambdas to existing courses. I’m still surprised at how large the difference appears to be, though. It will be interesting to check back in a year or two to see if this pattern continues to hold.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s