Category Archives: Uncategorized

What Do Novice Programmers Write Literally?

Recently I was asked about use of hex and floating point literals (especially “E” notation) in the Blackbox data set: do beginners use them? I was intrigued enough to knock up a simple program to find out. My method is quite straightforward: I take the latest version of each source file which successfully compiled, run it through a Java lexer and pick out the literals. This gives us about 40 million source files to look at.
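
The "pick out the literals" step can be pictured with a rough sketch. This is not the program I actually ran: the class name `LiteralKinds` and the category labels are made up for illustration, and it assumes a lexer has already handed over each numeric literal as a string. Octal and binary literals are deliberately not handled.

```java
import java.util.List;

class LiteralKinds {
    // Rough classification of one numeric literal token (hypothetical labels).
    // Octal and binary literals are not distinguished in this sketch.
    static String classify(String literal) {
        String s = literal.toLowerCase();
        if (s.startsWith("0x")) {
            // In hex, 'p' introduces the exponent of a hex floating point literal
            return (s.contains("p") || s.contains(".")) ? "hex-float" : "hex-int";
        }
        boolean floating = s.contains(".") || s.contains("e")
                || s.endsWith("f") || s.endsWith("d");
        if (floating) {
            return s.contains("e") ? "decimal-float-e" : "decimal-float";
        }
        return "decimal-int";
    }

    // Underscores are legal in both integer and floating point literals.
    static boolean hasUnderscore(String literal) {
        return literal.contains("_");
    }

    public static void main(String[] args) {
        for (String lit : List.of("0xFF", "1_000_000", "1e-6", ".5", "0x1.fe2p5", "255")) {
            String note = hasUnderscore(lit) ? " (underscores)" : "";
            System.out.println(lit + " -> " + classify(lit) + note);
        }
    }
}
```

Counting how often each label and each literal value occurs across all files is then just a matter of incrementing a map entry per token.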

Before we get into the results, here’s some predictions I made beforehand about our data (where most users are assumed to be programming novices):

  • Very few users use hex literals
  • Most hex literals are 0xFF or similar bitmasks
  • Almost no-one uses underscores (Java lets you write numbers with underscores, e.g. 1_000_000)
  • Almost no-one uses E notation (and when they do, mainly 1e-6 for epsilon values in floating point comparison)
  • Most floating point values are between 0 and 1

Hexadecimal Integers

Let’s start with hexadecimal integer literals. There were 814,920 hex integer literals, compared to 29,044,559 decimal integer literals. So 2.7% of hex/decimal integer literals were in hex. That is a bit higher than I was expecting, admittedly. (I didn’t bother going into octal, but there were a handful of uses. I suspect many of these were an accident.) In terms of their value, here’s the top five:

  • 0xFF: frequency 89,663
  • 0x0: frequency 52,732
  • 0x30: frequency 16,742
  • 0xF: frequency 16,009
  • 0x1: frequency 13,799

There are two F bitmask values there as predicted. I was a bit surprised by how many zeroes and ones were in there: why write them as hex (0x0) and not just decimal (0)? My guess is that they are working with bitmasks nearby, and out of habit/consistency write the values as hex.

Decimal Integers

There’s not too much to say about decimal integer literals, but I will mention the most frequent items. It’s a sequence that runs as you might expect (zero being most frequent, and increasing numbers being less frequent), punctuated by some numbers which testify to computing’s love of powers of two. Most frequent first:

0, 1, 2, 255, 128, 3, 4, 5, 256, 8, 10, 7, 6, 100, 127, 16, 20, 1000, 9, 50

The frequency is a decreasing power law (1 is half that of 0, 2 is half that of 1, then the tail begins to flatten out).


Underscores are a relatively recent addition to Java (added in Java 7) and little-known. Indeed, only 692 decimal literals had underscores: 0.002% of all decimal literals. Oddly, 737 hex literals had underscores, which as a proportion is much higher: 0.09%. I suspect this is because underscores and hex literals are both used by more advanced users. Generally though, our users are clearly not making much use of this underscore feature.

Decimal Floating Point

There were 1,791,915 floating point decimal literals. Of these, only 3,002 used the “E” notation (e.g. 1.15E12): 0.17%. Clearly not a heavily used feature. As for their values, the top five were: 1e-3, 1e-8, 1e-6, 1e6, 1e-20. I’d say my prediction about the use for epsilon values was borne out.

Regardless of notation, across all floating point decimal literals, the most frequent values were: 1.0, 0.0, 100.0, 3.0, 2.0. Technically, my prediction that most values were between zero and one was almost correct: 47% of values were between zero and one. But really, this is only because 23% of them were zero or one. As a last side note on these literals: 7,130 (0.40%) started with a dot (e.g. “.5”), something we disallowed in Stride due to the awkwardness of parsing that in expressions. But actually we could have banned E notation (also a pain) with less immediate impact.
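
For anyone who has not met the epsilon idiom, here is a minimal sketch of why a literal like 1e-6 shows up in floating point comparisons. The class and method names are my own invention for illustration, and the tolerance is just the example value from above, not a recommendation.

```java
class FloatLiterals {
    // Compare two doubles within a fixed tolerance, the typical home of a 1e-6 literal.
    static boolean nearlyEqual(double a, double b) {
        final double EPSILON = 1e-6; // E notation: same value as 0.000001
        return Math.abs(a - b) < EPSILON;
    }

    public static void main(String[] args) {
        double sum = 0.1 + 0.2;
        System.out.println(sum == 0.3);            // exact comparison is false due to rounding
        System.out.println(nearlyEqual(sum, 0.3)); // tolerance comparison is true

        double half = .5;     // leading-dot form, same value as 0.5
        double million = 1e6; // same value as 1000000.0
        System.out.println(half == 0.5 && million == 1000000.0);
    }
}
```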

Hexadecimal Floating Point

If you even knew that hexadecimal floating point notation was a thing in Java, then give yourself a pat on the back. Added in Java 5, they look like “0x1.fe2p5”, where p takes the place of the usual “E” notation because E is of course a valid hex character. I only know about this because we have a parser in BlueJ, which does accept these. I found precisely four uses of this notation, which is probably more than expected.
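
If the notation is new to you, this small sketch decodes the example literal by hand. The class name is hypothetical. The digits after the point are hexadecimal fractions, and p5 means "times two to the power five".

```java
class HexFloatDemo {
    static double example() {
        // 0x1.fe2 is 1 + 0xfe2/0x1000 in hex; p5 scales the result by 2^5 = 32
        return 0x1.fe2p5;
    }

    public static void main(String[] args) {
        double byHand = (1 + 0xfe2 / 4096.0) * 32;
        System.out.println(example() + " vs " + byHand);
        // Double.toHexString produces the same style of notation from a double
        System.out.println(Double.toHexString(example()));
    }
}
```

The point of the notation is that every hex float literal names an exactly representable binary value, with none of the rounding that decimal literals involve.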


This is a pretty cursory look at literals with a fairly crude methodology. Note that although we only looked at the latest version of each source file, source files in Blackbox are not independent of each other (e.g. if a teacher gives out a project with a floating point literal, that will show up identically in each student’s copy). For example, the four hex floating point literals were the same value, suggesting they are not independent. And on a related note, I’ve looked at source files regardless of whether they come from the same user or not, so we’re only measuring source occurrences here, not the number of users who use a particular notation. But I think our N is high enough that individual users cannot tilt the statistics.

Filed under Uncategorized

Naming abstractions

Naming things is important: names permit more precise and concise communication. We don’t say “the movable clickable pointer controller box”, we say “mouse”. In computing we face the challenge of naming a lot of things, both physical and virtual. Some names are intended to be useful analogies: files or documents are like their physical equivalent. This can quickly get quite tenuous: scrolling is named after the action needed to read a historical scroll despite most people having never scrolled a physical roll of paper, and a mouse was named due to its tail.

Computing has been surprisingly effective at digging up and re-using words which pre-date computing, and vastly increasing their use:

(What was delete used for before computing, I wonder? Ledgers?)

Naming difficulties

Sometimes we can find useful words to refer to computing concepts, even as the concepts become more abstract. A sequence of items is a list. A group without duplicates is a set. A list/set is a collection. That’s not too bad. It gets harder when you need a name for a data structure which associates one value with another. We usually talk about keys and values, and variously call this collection an “associative array”, “dictionary” or “map”. It’s clear that there is no useful existing word to borrow which carries the right meaning, so we must just pick one and collectively learn what it means.

A list is a type, as is a string or integer. But a list can have different inner types, so we want a different name for that: a polymorphic type. But a list (which has one inner type) is different from a map (which has two), and so we need a name to talk about different types of types. For this, type theory refers to kinds: the arity of types, where arity is the name for the number of parameters that something takes. If your brain is creaking at this point: is that because type, kind and arity are poorly chosen words, or just because the concepts themselves are abstract and difficult? Not to mention all the other abstractly-named programming terms: class, interface, protocol, metaclass, polymorphism — the list goes on.
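
To make the vocabulary concrete, here is a small Java sketch. The class name and the `arity` helper are invented for illustration. It uses reflection to count how many type parameters a generic type declares, which is the closest everyday Java gets to asking about a type's kind.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class NamingDemo {
    // "Arity" of a generic type: how many type parameters it declares.
    static int arity(Class<?> type) {
        return type.getTypeParameters().length;
    }

    public static void main(String[] args) {
        List<String> list = List.of("ant", "bee", "ant"); // a sequence, duplicates allowed
        Set<String> set = new LinkedHashSet<>(list);       // a group without duplicates
        Map<String, Integer> lengths = new HashMap<>();    // keys associated with values
        for (String s : set) {
            lengths.put(s, s.length());
        }
        System.out.println(list + " " + set + " " + lengths);

        // String takes no type parameters, List takes one, Map takes two
        System.out.println(arity(String.class) + " " + arity(List.class) + " " + arity(Map.class));
    }
}
```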

I think people struggling to learn something with a non-obvious name (i.e. most programming concepts!) often make a false assumption: that the concept is difficult to learn because the name is unhelpful. (And thus if it was just named better, it would be easier to understand.)

Names can be helpful if they are descriptive (e.g. a network star topology) or transfer intuition (e.g. memory in computers). But intuition is not always available for abstract concepts, such as classes in object-oriented programming. As we get progressively more abstract, names aren’t going to save you from the difficulty of understanding an abstraction: the name isn’t the problem, the concept is.

Living Computing

SIGCSE this year was held in Seattle. It turns out there is a computer museum in Seattle, not too far south of the downtown area, called the Living Computer Museum. The key distinguishing feature is that almost all of the computers there are actually running, and usable by visitors to the museum. Sue Sentance, Ian Utting and I headed off to take a look.

The ground floor is filled with modern gadgets and sensors which were all too familiar after spending the week at a computing education conference: a real busman’s holiday. But on the first floor were all the vintage computers, which were much more interesting. There is an oscilloscope which is running Tennis for Two, considered to be one of the first computer games:

This typifies what is great about the museum: it uses original hardware, not emulation, and is usable by visitors: I was playing the game with Ian while recording the video.

Historic Computers

If there’s one lesson I’ve learned from computing museums, it’s that you either die by age 25, or live long enough to see your childhood computers in a museum. I was amused to find an Amstrad in the museum, given that it’s a British manufacturer. The Amstrad was a fairly cheap machine, and the company’s head Alan Sugar has always had a bit of a reputation as a producer of low-quality goods. Hence my amusement at this:

After a moment or two I worked out how to reset the machine:

And then I found Sim City in that box of 5.25″ floppy disks. I’m not sure I expected to ever again use a floppy disk! I first accidentally ejected the floppy with the OS on it, but unlike USB sticks which complain immediately if you yank them out, older PCs didn’t even notice as long as you didn’t do anything to access the disk while it was out. So I managed to load Sim City:

To explain: older games used to prevent copying by having special codes in the manual which you had to enter on load. To avoid being defeated by photocopies, they were often printed in yellow-on-white or black-on-black, and/or spread throughout the manual so you’d have to copy the whole manual. I did google on my phone, but at the time I couldn’t find the relevant info (now found, for the curious).

Interface Evolution

There is a card-punching machine in the museum on which you can punch your own cards. I’m a bit too young to have done any card-punching myself, but it’s quite an experience. You can’t see the last 1-2 characters you typed so it’s easy to get lost punching even simple sequences. There’s no backspace/undo, of course. And the keyboard itself is a marvel of bizarre design:

Punch cards are quaint curiosities now, but what a bloody awful way to program. 10x programmers may not exist, but we are all 100x programmers compared to the punch card era.

I also found a computer running Windows 1.0, which I’d never used before (3.1 is where I came in). The basic ideas of GUIs are there, but it’s horribly slow and clunky:


If you’re ever in Seattle, I highly recommend visiting the Living Computer Museum. It’s not a huge place, but it is my favourite of the computing museums I’ve been to, for the simple fact of being able to use the computers. It’s something I’ve mentioned before when discussing computing research: I don’t think you can truly understand an interface without using it. You can’t appreciate how clunky Windows 1.0 was until you drag the horrible ball mouse across a mouse mat and have to hold and drag the mouse to access menus and mess around with monochrome Paint.

Finally, I’ll end with this, as a thought on how computing has evolved (the scale isn’t apparent here, but it’s about a metre tall and a couple of metres wide):

Stride and Git arrive in BlueJ preview

We’ve just released a preview version of the next major BlueJ release, 4.0.0. It’s version 4.0.0preview2, available for download from the main website. There are several features available in this preview release:

Stride is our blocks-like structured code editor which we added to Greenfoot 3, and we’ve now added it to BlueJ as well. There are lots of details elsewhere on Stride so I won’t reproduce them all here. There is our guide with a simple text+pictures overview of the editor. We recently noticed that a conference talk we gave at last year’s JavaOne made its way online, so you can watch a video of that on YouTube. And Michael has been making a few short videos about Stride over on his blog, which you can watch. We’ve now included two-way conversion from Stride to Java and Java to Stride, so it’s easy to take your existing Java projects and convert them to Stride to get a good feel for the editor.


We’ve also added Git to BlueJ. People have been asking for Git support for years, but previously we only had Subversion and CVS(!) support. We’ve now added Git, with a fairly simple interface that makes it easy to get started with using Git to version control and share BlueJ projects. We’ve got a draft tutorial online for Git in BlueJ. (And CVS support has now been removed.)

Another major change is in the error highlighting. Previously, BlueJ would only show an error when you hit the compile button, and then only one. That’s now changed to match the behaviour of most Java IDEs: errors are shown as a red underline as you write the code, and if there are several errors, they are all highlighted. It will be interesting, once the full release is out, to look at our Blackbox data and see what effect this change has (if any) on programming behaviour in BlueJ.


The last change is one of the most time-consuming but least catchy. We’ve rewritten large parts of BlueJ’s interface (from using Swing to using JavaFX, the newer Java GUI toolkit). Along the way we’ve improved various features, which I’ll talk more about another time. Probably the most noticeable change is that we now support tabbed editors (you can see the tabs in the pictures above), with multiple editor tabs in one window, rather than always having a new window for each editor (another oft-requested feature).

It’s called a preview release because we know it’s not quite finished: there’s still a bit of GUI to improve, some small bugs to iron out and so on. But we think it’s close enough to let everyone have a play with it. If you spot any issues, let us know: in the comments here, by email, or on the Blueroom teachers site.

A case for publishing research software

A major part of research is acquiring and sharing knowledge. This manages not to be as straightforward as it should be for political/business reasons (see: journal publishers, paywalls and open access), but technically it is at least simple. You write a paper consisting of words and pictures, other people download it and read it. Knowledge has been transmitted. Where life gets much more difficult is a newer but fast-growing part of research: sharing software.

One use of software in research is as a tool for doing analysis. This affects all the natural sciences, and there are issues with how to gain credit for producing software (see the new proposed policy on software citation, and the new journal of open source software). But within computer science there is an additional research role for software: sometimes the software is part of the research output. Nowhere is this more apparent than in areas like human-computer interaction research.

Publishing software interface research

The typical form of a modern paper about a new software interface is to provide a description of the interface, followed by an evaluation of the interface with human participants. Thus the research output is two-fold: design and science. Putting these both into a paper might seem sufficient.

The classic software research process. The researchers follow the process on the left, but everyone else only sees the output on the right. The researchers must have the software, but they are not required to share it, only to describe it in the paper.

However, accurately describing an interface design in text is a difficult task — the medium is just ill-suited. (Much like writing about music being compared to dancing about architecture.) It is difficult to describe the function of all interactions with the system: you’d write an endless series of “when the user presses left here they return to the home screen”; something almost akin to the original program code. You’d also need to describe not only the intended interactions but what happens when the user does something wrong. You also can’t use pure text: images are surely necessary to portray an interface. And that’s not to mention emergent properties which affect software’s usability, like the speed of the interface. Ultimately, if you want to understand the design of a software interface, there’s very little substitute for just using the interface.

Research Software Archaeology

Recently, I needed to write a detailed related work section for our work on frame-based editing. One of the challenges of publishing this work is that it is similar to work on the structured editors of the 1980s, which have largely failed to catch on. And additionally, it seems every reviewer knows a different editor, so each one seems to come back with “how is this different to structured editor X that I used in the 1980s?” [1]

So I end up searching for details about the design of 1980s structured editors. If there’s no paper and no software, there’s not really any way to find out about the design. If there is a paper, I hope that it has a reasonably detailed description of the editor (for example, the write-up of the Cornell Program Synthesizer). Regardless, I also try to search for a runnable version of the software. Ha!

There are few editors from that period which are available to run on a modern machine. Some were simply never released, partly because pre-Internet, sharing software was awkward. Some are unavailable due to their age: many of the structured editors were designed for processors or operating systems which are no longer available. So some editors seem to be totally lost — I can’t find any leads on downloading a copy of the Cornell Program Synthesizer, for example. Some other editors have a tantalising binary distribution which often cannot be run: for example, Boxer’s Mac binary.

The ACSE/GENIE editor, alive and running.

I did have one or two successes, such as getting a version of the GENIE editor running in an emulator. And it was a revelation that greatly pushed forward my understanding of old structured editors. By modern standards, they were awful. The papers’ descriptions didn’t make clear how tedious and fiddly the navigation was, how unhelpful the editor was, how awkward it was to deal with errors. Running the software was an absolutely crucial step to comparing our work to theirs. It allowed me to understand the design and critique the editor’s operation for myself, rather than relying on the authors’ incomplete descriptions of their own software.

For all the other editors which I couldn’t run, there are these reviewers asking the perfectly valid question in research: “How does your work relate to previous work X?” And the honest answer is: I don’t know. Perhaps nobody can know any more — the paper wasn’t very detailed and the software is lost in time. This is no way to do research.

The Solution

The solution to all of this is readily apparent: if your software is part of the research output, you must publish the software. And a binary is insufficient; binaries too easily bit rot, refusing to run on modern systems with no way to fix them. Source code is what is needed.

This week Andy Ko made available his Citrus/Barista structured editor from the 2000s. I downloaded and ran it: the binary did run, but it spat out repeated exceptions and I wasn’t sure if that was impairing the functioning of the software. Thankfully, Ko didn’t just publish a binary: he published the source code. For this, I salute him. I went to modify the source code and it turned out not to compile with a modern Java compiler. After some tweaks I got it compiling, and then fixed the exception. Because the code was on GitHub, one accepted pull request later and the software in his repository will now compile and run on a modern machine. This, this is how software research should be.

Published source code for software is crucial to allow later researchers to use, evaluate and compare the software. I fully understand that everyone feels antsy about publishing source code. If I’m honest, the Citrus source code is a bit confusing, somewhat lacking in documentation and the software seems a little rickety. But that’s how research software usually is; my research code for our Blackbox work is the same. I’m not particularly proud of that code, but my recognition that sharing the source is important marginally outstripped my embarrassment at its quality. Research code will almost always be shaky and iffy [2]. It’s usually written by a single person (often not a professional software developer) for a single purpose, so it’s likely to be hacky and not well documented. Let’s all accept that research code is bad, and agree to share anyway.


[1] It’s interesting to note that when the reviewers ask how our work compares, they are implicitly asking about the design, not the science. Given that almost all venues will only accept science or design+science, it’s curious that most of the comparison to related work is about comparing the design. This is at least partly because the science quickly outdates in software interfaces. Even if the older editor papers had performed rigorous evaluations (which they almost exclusively did not), the results don’t necessarily persist. If someone told you that editor X had been evaluated as easy to use and as good as text editors, tested on a 25-line text terminal on a 1980s thin-client Unix machine, would you say that was useful in evaluating editor X against a modern editor? Would it even be worth comparing the usability of our editor directly against a 1980s editor? I doubt it; the usefulness of the previous work is more in comparing our design to theirs, not so much our scientific evaluation against theirs.

[2] Given that we make software — BlueJ and Greenfoot — which we encourage people to use, I should point out that they are actually stable, reliable, and fairly well engineered! And open source, to boot. The setup of our research group and funding allows us to do this, making us blessed compared to other researchers. Quality software in research is of course possible, and the preferred option, but we must recognise that it is a rarity, and not let that get in the way of sharing.


When do students program?

We store enough information in our Blackbox data set to look at when most programming activity in BlueJ occurs. Most BlueJ users are students, so this should give us an idea of when most student programming occurs. Methodology notes below, but what you really want is the graph, so here it is, for the USA:


It’s a heatmap: time of day on the X axis, days on the Y axis, red is highest amount of activity, down through orange to white being zero activity (e.g. overnight). A few thoughts:

  • Lunchtime is clearly visible in the data. Most programming takes place during the working day, partly because of scheduled classes. Despite stereotypes of night owl programmers, on average, people don’t program late at night.
  • Not much programming on Mondays. One reason for this is that US federal public holidays mostly fall on Monday, which reduces the amount of activity (including in the evening), but I’m not sure if that completely explains it.
  • No-one programs Friday night or Saturday… but check out that Sunday night my-assignment-is-due panic! At least, I’m guessing that’s the explanation.

Methodology Notes

We could just look at time of day, but that loses a bit too much detail if you average across all days in the week. So it is better to look at times of day across the week, Monday to Sunday. Weekends are different across the world, so I’ve chosen to narrow by country. Rather than try and pick a list of all the Monday-to-Friday-workweek countries, I just looked at the USA, which is by far and away the country from which we get most data. We store enough information to know the user’s timezone, so I am adjusting properly for the multiple timezones and daylight saving time.

The number for activity is a count of IDE events recorded in that hour. That is primarily source code edits, but also includes other things like debugger interactions. I think it’s a good proxy for programming activity, though.
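
The bucketing described above can be sketched roughly as follows. This is not the actual analysis code: the `Event` shape (a UTC timestamp plus the user's zone) and all names are assumptions made for illustration, since the real Blackbox schema is not shown here.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.util.List;

class ActivityHeatmap {
    // Hypothetical event shape: a UTC timestamp plus the user's time zone.
    static class Event {
        final Instant when;
        final ZoneId zone;
        Event(Instant when, ZoneId zone) { this.when = when; this.zone = zone; }
    }

    // Returns counts[day][hour], with day 0 = Monday and hour in the user's local time.
    static int[][] bucket(List<Event> events) {
        int[][] counts = new int[7][24];
        for (Event e : events) {
            // Converting through the zone applies both the UTC offset and daylight saving rules
            ZonedDateTime local = e.when.atZone(e.zone);
            counts[local.getDayOfWeek().getValue() - 1][local.getHour()]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Event> events = List.of(
            new Event(Instant.parse("2016-03-07T03:00:00Z"), ZoneId.of("America/New_York")),
            new Event(Instant.parse("2016-03-07T17:30:00Z"), ZoneId.of("America/Los_Angeles")));
        int[][] counts = bucket(events);
        System.out.println("Sunday 22:00 count: " + counts[6][22]);
        System.out.println("Monday 09:00 count: " + counts[0][9]);
    }
}
```

Note how the first event is early Monday morning in UTC but late Sunday evening for a user on the US east coast, which is exactly why the per-user timezone adjustment matters for the Sunday night pattern.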

Other Countries

Two of the next most frequent countries in the data are Germany and the UK. Here’s Germany:


and here’s the UK:


These frequencies are much lower (note the scale adjustments) so the data is noisier, and I suspect the frequencies are low enough that some patterns in the data are caused by individual institutions using BlueJ at particular times (remember that the students in our data are not totally independent: 100 UK students all programming at the same time at a university on Friday morning could noticeably affect the data). Germany has an odd pattern: more programming on Sunday than almost any other day of the week. This might be because there’s not much to do in Germany on a Sunday, but it also hints that maybe more work is done outside classes. Feel free to add your own speculation for any of the patterns above.

Novice Lambda Use in Java

Java provides two ways to easily provide a reference to a function:

button.setOnAction(e -> showAlert("pressed"));
stream2 =;

The first one is generally referred to as a lambda, and the second as a method reference, but I’m going to refer to them both as lambdas for this post.

In line with the previous post on enum use, this post looks at lambda use in our Blackbox data set (collected from users of the BlueJ beginners’ Java IDE). All data is from the beginning of the project in mid-2013 up until the end of February 2016, a few days ago.

Lambdas are a very recent addition to Java. Java stems from 1996, but lambdas were only introduced in Java 8 in March 2014. Thus it is not surprising that they will not occur very frequently in the data set, as instructors are likely not up to speed on lambdas yet, won’t have had much chance to adjust their course designs, and often treat lambdas as an advanced topic, in contrast to BlueJ’s novice focus. But let’s take a look anyway.

In the data we have 11,666,331 source files which have been successfully compiled at least once. I looked at the most recent successful compilation of each of those source files, to see if they contained a lambda. 6,669 (0.06%) of those source files contained a lambda, with 20,698 lambdas overall. So although lambdas are very rare, files that use them at all contain about three each on average. That suggests to me that people find them quite useful, but that they have not gained any traction in education yet.

Number of parameters

For the lambda arrow syntax (i.e. params -> code), I counted the number of parameters on the left-hand side of the arrow:


The reason that 3 and 4 are on there is that there is one instance of each, from 20k lambdas, with no lambdas having more than four parameters. So if the language designers had restricted lambdas to two parameters max, our users would not have noticed.

Haskell programmers, among others, may be a bit confused by the zero-parameter lambda. This is Java’s equivalent of the use of the () -> code pattern in OCaml/F# to refer to some code but delay its evaluation until it is called later without any real parameters. Haskell would use a monadic computation of a type like “IO ()” for this purpose.
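
A minimal sketch of that delayed evaluation in Java, with made-up names: the body of the zero-parameter lambda only runs if and when the receiving code calls it.

```java
import java.util.function.Supplier;

class DelayedEvaluation {
    static int calls = 0;

    // Stand-in for some costly computation; counts how often it actually runs.
    static int expensive() {
        calls++;
        return 42;
    }

    // The supplier's body only runs if get() is called.
    static int valueOrDefault(boolean needed, Supplier<Integer> compute) {
        return needed ? compute.get() : -1;
    }

    public static void main(String[] args) {
        System.out.println(valueOrDefault(false, () -> expensive()) + ", calls = " + calls);
        System.out.println(valueOrDefault(true, () -> expensive()) + ", calls = " + calls);
    }
}
```

Passing `expensive()` directly as an int argument would run it every time, which is the difference the zero-parameter lambda buys you.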

Styles of use

I categorise lambda style into three mutually exclusive categories:

list.forEach(x -> print(x.toString())); // Expression RHS
list.forEach(x -> {print(x.toString());...}); // Block RHS
// Method reference

Of our 20,698 lambdas, 14,014 (67.7%) used the expression form, 6,161 (29.8%) used the block form, and 523 (2.5%) used the method reference form. I wonder if a bit of this is advertising: I wasn’t aware of the method reference form in Java for a little while after I learnt about lambdas in the language. I use method references whenever possible in my code because I’m used to point-free style in Haskell. However, I suspect it’s more difficult for novices to understand than the lambda form, so I’m not surprised it’s used much less than the arrow form.
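
For reference, here are the three styles side by side in a runnable form. This is an illustrative sketch with invented names rather than code from the data set, and the method reference line in particular is my own example.

```java
import java.util.ArrayList;
import java.util.List;

class LambdaStyles {
    // Converts each number to a string three times over, once per lambda style.
    static List<String> collect(List<Integer> numbers) {
        List<String> out = new ArrayList<>();
        numbers.forEach(x -> out.add(x.toString()));                    // expression RHS
        numbers.forEach(x -> { String s = x.toString(); out.add(s); }); // block RHS
        numbers.stream().map(Object::toString).forEach(out::add);       // method references
        return out;
    }

    public static void main(String[] args) {
        System.out.println(collect(List.of(1, 2)));
    }
}
```

All three do the same work. The method reference form simply names the function instead of spelling out a parameter and passing it along.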

Destinations of Lambdas

I’m doing a syntactic analysis here, so I didn’t attempt the complex semantic task of working out which types the lambdas were being compiled into. However, I did look at which methods the lambdas were being passed to. That is, if you have code like: ….map(e -> e.getX())

Then I recorded “map” as the lambda destination. The graph below gives the most popular destination methods for lambdas. The y-axis is purely cosmetic to separate out items with similar frequencies, and I only show frequency 200 upwards because it gets congested below that:


Since there are low numbers of lambdas, some of this is definitely influenced by individual courses appearing in the data. For example, I tracked the addButton method back to a utility class from a particular university, and I think accumulate has a similar story. But the general pattern is pretty apparent: lambdas are used much more for GUI event handlers than for streams. This is not surprising in one sense: existing code is more likely to use event handlers (which pre-date lambdas) than streams (which were introduced alongside lambdas), so GUIs may be the easiest way to add lambdas to existing courses. I’m still surprised at how large the difference appears to be, though. It will be interesting to check back in a year or two to see if this pattern continues to hold.
