Greenfoot 3, at SIGCSE

Most of the BlueJ/Greenfoot team will be headed to Kansas City next week, for the SIGCSE 2015 conference. We’ll be doing a Raspberry Pi demo at 3pm on Thursday in the exhibit hall, and we’ll be presenting a new paper on our Blackbox data on Saturday morning at 9am in room 2502A (more on that in a future post). But the event we are most excited about is our Greenfoot event, 5:30pm on Friday night in room 2502B, where we plan to demo and launch a public preview of Greenfoot version 3.0.0.

What’s New In Greenfoot 3?

We’ve continued to refine the existing parts of Greenfoot: we’ve added generics to the right places in our Greenfoot API, and we’ve switched to automatic compilation, so the Compile button will be a thing of the past. By far the biggest development in Greenfoot 3 is that we have added a totally new editor (which will sit alongside the existing editor), with a new way to edit programs. We’re calling it frame-based editing: roughly speaking, it is a hybrid of block-based editors like Scratch, and text-based editors like Greenfoot’s Java editor. The intention is that it takes the best parts of block-based editing (easy manipulation by dragging, avoidance of syntax errors) but marries them with the best parts of text-based editing (keyboard control, less tedious dragging for program creation and expression manipulation, easier management of longer programs).

The new Greenfoot 3 editor. The look is not too far different from Greenfoot 2, but how you use it is quite different.

I’ll be posting a lot more details about the new editor after SIGCSE (for now, we’re busy focusing on testing the release ahead of the conference), but if you’re coming to SIGCSE: do come join us on Friday night to take a look. There’s even free food and drinks!

How to spell program

The word program has various meanings. The common meaning is this one, from the Oxford English Dictionary (OED):

An advance notice describing any formal proceedings, as an entertainment, a course of study, etc.

This word is spelled “program” in the US, and “programme” in the UK: one OED example usage is “The dance programme featured four works”. Fine. However, the word program also has a special meaning in our domain: a computer program, which is programmed by programmers. Over to OED again:

Noun: A series of coded instructions and definitions which when fed into a computer automatically directs its operation in performing a particular task.

Verb: To write a computer program.

Let’s be clear: this sense of the word is now spelt program, not programme, even in the UK. Even the OED admits it (in its definition of the verb above, and the note on the noun: “Now usu. in form program”). But that doesn’t stop various UK organisations from trying the British spelling:

  • Telegraph, Nov 2013: “This has discouraged software developers from writing programmes for Android”.
  • UK government, this week (Dec 2014): “In schools, a new GCSE in computer science [will cover] the most up-to-date issues including writing code, designing programmes…”.
  • The Guardian have recently got the hang of it — using programme in 2005 (“one programme can infringe many different patents at once”), but updating to program by 2012 (“Coderdojo inspires kids to program”).

Why does this matter? It bothers me because I’m a nitpicking pedant, but I think it’s also a culture signifier. Using phrases like “to programme a computer” or “writing a computer programme” shows that the person has never actually been involved with programming, or else they would realise that all programmers have adopted the US spelling (the US being quite big in computing, apparently). It always suggests someone writing about something they don’t know much about. So, if you want to avoid this, update your style guides: program, not programme.

Expressive Whitespace

Do you know the operator precedence rules in the programming languages that you use? Given an expression like this one in Java:

x/6+5&8>>2-4!=8

Can you say how it will be parsed? (Never mind that the semantics may be meaningless; the parser doesn’t care about semantics.) Generally, programming environments give very little help in improving readability of expressions; the expression will be displayed exactly as above. You do get bracket-match highlighting if you have brackets (and some would argue that you should always bracket expressions for extra clarity). But maybe the display of expressions can be further improved in other ways.
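To make the parse visible, here is a minimal sketch of a precedence-climbing parser that adds the brackets the Java parser would effectively infer. It is only an illustration under stated assumptions: it covers a subset of Java's binary operators (all left-associative), treats any run of letters, digits, dots and underscores as an operand, and does not handle brackets, unary operators or method calls. The class name ShowPrecedence and the method bracket are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class ShowPrecedence {
    // Subset of Java's binary operators; a higher number binds more tightly.
    // The numbers are arbitrary, only their order follows Java's precedence table.
    private static final Map<String, Integer> PREC = Map.ofEntries(
        Map.entry("*", 12), Map.entry("/", 12), Map.entry("%", 12),
        Map.entry("+", 11), Map.entry("-", 11),
        Map.entry("<<", 10), Map.entry(">>", 10),
        Map.entry("<", 9), Map.entry(">", 9), Map.entry("<=", 9), Map.entry(">=", 9),
        Map.entry("==", 8), Map.entry("!=", 8),
        Map.entry("&", 7), Map.entry("^", 6), Map.entry("|", 5),
        Map.entry("&&", 4), Map.entry("||", 3));

    private final List<String> tokens = new ArrayList<>();
    private int pos = 0;

    // Split the input into operand and operator tokens, ignoring whitespace.
    private ShowPrecedence(String expr) {
        int i = 0;
        while (i < expr.length()) {
            char c = expr.charAt(i);
            if (Character.isWhitespace(c)) { i++; continue; }
            int start = i;
            if (isOperandChar(c)) {
                while (i < expr.length() && isOperandChar(expr.charAt(i))) i++;
            } else if (i + 1 < expr.length() && PREC.containsKey(expr.substring(i, i + 2))) {
                i += 2; // two-character operator such as >> or !=
            } else if (PREC.containsKey(expr.substring(i, i + 1))) {
                i += 1;
            } else {
                throw new IllegalArgumentException("Unsupported character: " + c);
            }
            tokens.add(expr.substring(start, i));
        }
    }

    private static boolean isOperandChar(char c) {
        return Character.isLetterOrDigit(c) || c == '_' || c == '.';
    }

    private String operand() {
        if (pos >= tokens.size() || PREC.containsKey(tokens.get(pos))) {
            throw new IllegalArgumentException("Expected an operand at token " + pos);
        }
        return tokens.get(pos++);
    }

    // Precedence climbing: keep absorbing operators that bind at least as tightly as minPrec.
    private String parse(int minPrec) {
        String left = operand();
        while (pos < tokens.size()) {
            String op = tokens.get(pos);
            Integer prec = PREC.get(op);
            if (prec == null) throw new IllegalArgumentException("Expected an operator at token " + pos);
            if (prec < minPrec) break;
            pos++;
            String right = parse(prec + 1); // prec + 1 makes the operator left-associative
            left = "(" + left + " " + op + " " + right + ")";
        }
        return left;
    }

    static String bracket(String expr) {
        return new ShowPrecedence(expr).parse(0);
    }

    public static void main(String[] args) {
        System.out.println(bracket("x/6+5&8>>2-4!=8"));
    }
}
```

For the expression above, the sketch prints “(((x / 6) + 5) & ((8 >> (2 - 4)) != 8))”: the & ends up at the root because it binds more loosely than everything else in the expression.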

Designers tend to use whitespace for grouping items — for example, grouping related columns in tables. And in fact, a lot of programmers do tend to omit the whitespace around high precedence (tightly bound) operators while putting it in around lower precedence operators. You are much more likely to see this:

dist * Math.sqrt(x*x + y*y)

Than this equally spaced version:

dist * Math . sqrt (x * x + y * y)

Or this devilishly spaced version:

dist*Math . sqrt(x * x+y * y)

If the rule that most people follow is simply to have whitespace inversely proportional to the precedence of the operator, surely we could automate this in a programming editor.

Dynamically varying whitespace

Consider this expression, without spaces:

[Image: the expression with no spacing]

Using the parse tree of this expression, we can assign smaller whitespace to operators nearer the leaves (higher precedence operators which bind more tightly) than operators nearer the root (lower precedence operators):

[Image: the same expression, spaced according to its parse tree]

It is now more obvious at a glance how the expression should be read, compared to the unspaced original. Not only is the display clearer, but if the editor takes charge of putting the whitespace in expressions, the user can save keystrokes by never having to insert spaces in expressions in the first place. They can enter the unspaced version above, and the editor displays it as the second version automatically.
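As a rough sketch of how an editor might derive the spacing from the parse tree, the code below parses an unspaced expression one precedence level at a time and pads each operator according to the height of its subtree: operators sitting directly above operands get no space, and each level further from the leaves gets one more. This is an illustration under assumptions, not the Greenfoot implementation: it handles only a subset of Java's binary operators, no brackets or unary operators, and it uses whole space characters where a real editor would presumably vary the pixel width. The names TreeSpacing and space are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class TreeSpacing {
    // Subset of Java's binary operators; a higher number binds more tightly.
    private static final Map<String, Integer> PREC = Map.ofEntries(
        Map.entry("*", 12), Map.entry("/", 12), Map.entry("%", 12),
        Map.entry("+", 11), Map.entry("-", 11),
        Map.entry("<<", 10), Map.entry(">>", 10),
        Map.entry("<", 9), Map.entry(">", 9), Map.entry("<=", 9), Map.entry(">=", 9),
        Map.entry("==", 8), Map.entry("!=", 8),
        Map.entry("&", 7), Map.entry("^", 6), Map.entry("|", 5),
        Map.entry("&&", 4), Map.entry("||", 3));
    private static final int LOOSEST = 3, TIGHTEST = 12;

    // Rendered text of a subtree plus its height (0 for a bare operand).
    private static final class Piece {
        final String text;
        final int height;
        Piece(String text, int height) { this.text = text; this.height = height; }
    }

    private final List<String> tokens = new ArrayList<>();
    private int pos = 0;

    // Split the input into operand and operator tokens, ignoring whitespace.
    private TreeSpacing(String expr) {
        int i = 0;
        while (i < expr.length()) {
            char c = expr.charAt(i);
            if (Character.isWhitespace(c)) { i++; continue; }
            int start = i;
            if (isOperandChar(c)) {
                while (i < expr.length() && isOperandChar(expr.charAt(i))) i++;
            } else if (i + 1 < expr.length() && PREC.containsKey(expr.substring(i, i + 2))) {
                i += 2;
            } else if (PREC.containsKey(expr.substring(i, i + 1))) {
                i += 1;
            } else {
                throw new IllegalArgumentException("Unsupported character: " + c);
            }
            tokens.add(expr.substring(start, i));
        }
    }

    private static boolean isOperandChar(char c) {
        return Character.isLetterOrDigit(c) || c == '_' || c == '.';
    }

    // Parse every operator at this precedence level into one flat node,
    // so that a chain such as a+b+c is spaced uniformly.
    private Piece parseLevel(int prec) {
        if (prec > TIGHTEST) {
            if (pos >= tokens.size() || PREC.containsKey(tokens.get(pos))) {
                throw new IllegalArgumentException("Expected an operand at token " + pos);
            }
            return new Piece(tokens.get(pos++), 0);
        }
        List<Piece> operands = new ArrayList<>();
        List<String> ops = new ArrayList<>();
        operands.add(parseLevel(prec + 1));
        while (pos < tokens.size() && PREC.getOrDefault(tokens.get(pos), -1) == prec) {
            ops.add(tokens.get(pos++));
            operands.add(parseLevel(prec + 1));
        }
        if (ops.isEmpty()) return operands.get(0); // no operator at this level
        int height = 0;
        for (Piece p : operands) height = Math.max(height, p.height + 1);
        String gap = " ".repeat(height - 1); // nearer the root means a wider gap
        StringBuilder sb = new StringBuilder(operands.get(0).text);
        for (int i = 0; i < ops.size(); i++) {
            sb.append(gap).append(ops.get(i)).append(gap).append(operands.get(i + 1).text);
        }
        return new Piece(sb.toString(), height);
    }

    static String space(String expr) {
        TreeSpacing parser = new TreeSpacing(expr);
        Piece whole = parser.parseLevel(LOOSEST);
        if (parser.pos != parser.tokens.size()) {
            throw new IllegalArgumentException("Unexpected token at " + parser.pos);
        }
        return whole.text;
    }

    public static void main(String[] args) {
        System.out.println(space("d+a+b*c"));
        System.out.println(space("x/6+5&8>>2-4!=8"));
    }
}
```

Because the gap depends on the subtree height rather than on the operator itself, the same operator can be spaced differently in different expressions, which is the “relative” behaviour discussed in the next section.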

Choosing the amount of space

One design question is whether the width of whitespace is solely determined by absolute operator precedence (i.e. plus always has the same amount of whitespace around it) or whether it is determined by the relative precedence of the operator in the chosen expression. In the absolute case, the complete expression “2 * 3” will be spaced differently to “2 + 3”, which to me seems odd. In the dynamic case, you get spacing readjustment: if you take “2+3” and add “*4” on the end, the + will get more space added to reflect that its dynamic precedence has changed. That is:

[Image: the expression 2+3]

Becomes:

[Image: the expression 2+3*4, with more space around the +]

While re-spacing as you edit is visually disturbing, I think this is the right thing to do: the addition of *4 has changed the semantics of the existing expression (it is not “(2+3)*4” but rather “2+(3*4)”), so re-spacing it to reflect the altered semantics seems correct.
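The difference between the two schemes can be shown with a small sketch. It assumes the expression has already been split into tokens, contains no brackets, and uses only two precedence levels (multiplicative and additive), so the gap is simply the distance in levels from a reference point: the tightest level that exists at all (absolute) or the tightest level present in this expression (relative). The class and method names are hypothetical.

```java
import java.util.Map;

class SpacingSchemes {
    // Two precedence levels only: multiplicative (12) binds tighter than additive (11).
    private static final Map<String, Integer> PREC =
        Map.of("*", 12, "/", 12, "%", 12, "+", 11, "-", 11);
    private static final int TIGHTEST_POSSIBLE = 12;

    // Absolute scheme: an operator gets the same gap in every expression.
    static String absolute(String... tokens) {
        return render(tokens, TIGHTEST_POSSIBLE);
    }

    // Relative scheme: the gap is measured from the tightest operator in this expression.
    static String relative(String... tokens) {
        int tightestPresent = Integer.MIN_VALUE;
        for (String t : tokens) {
            Integer p = PREC.get(t);
            if (p != null) tightestPresent = Math.max(tightestPresent, p);
        }
        return render(tokens, tightestPresent);
    }

    private static String render(String[] tokens, int base) {
        StringBuilder sb = new StringBuilder();
        for (String t : tokens) {
            Integer p = PREC.get(t);
            // Operands get no padding; operators get one space per level below the base.
            String gap = (p == null) ? "" : " ".repeat(base - p);
            sb.append(gap).append(t).append(gap);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(absolute("2", "*", "3"));           // 2*3
        System.out.println(absolute("2", "+", "3"));           // 2 + 3
        System.out.println(relative("2", "+", "3"));           // 2+3
        System.out.println(relative("2", "+", "3", "*", "4")); // 2 + 3*4
    }
}
```

In the relative scheme the + in “2+3” starts out tight and only gains space once “*4” is appended, which is exactly the re-spacing described above.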

Longer expressions

There is a limit to how long an expression this technique can make readable — our earlier terrible expression:

[Image: the long expression from earlier, unspaced]

Becomes a bit better when space is varied:

[Image: the same long expression with varied spacing]

But I think in that case, there’s only so much that spacing can do — you really should add some brackets.

Summary

I’ve discussed a way that programming editors could be smarter about whitespace when displaying expressions: dynamically varying the whitespace around operators based on their relative precedence in the expression. This is likely to be included in a new editor we’re currently working on for our Greenfoot system. I note that this scheme does vary the width of spaces in your editor, which may upset some users. But while fixed-width spacing is useful for aligning the left-hand edges of lines of code, I’m less convinced that it matters within the line of code.

Addendum: a colleague points me to this work which mentions a similar system for mathematical equations (page 6). It’s interesting that this idea has been implemented in mathematics but not yet caught on in programming.

The dark at the end of the funnel

In a Q&A session last week, Facebook founder Mark Zuckerberg talked about the problem of getting more women into Computer Science (CS). He referred to the vicious circle of trying to encourage more female participation in CS:

You need to start earlier in the funnel so that girls don’t self-select out of doing computer science education, but at the same time one of the big reasons why today we have this issue is that there aren’t a lot of women in the field today.

The funnel or pipeline is this idea that you only get trained developers by educating them; if you want more graduate developers you need to get them in an educational pipeline at an earlier age so that they will take computing degrees. This September, England began its Computing adventure, with boys and girls required to study Computing (which includes CS and programming) from ages 5–14. We’ll let you know how “filling the funnel” turns out. Some of the problems of attracting and retaining more women in CS definitely originate in education; Mark Guzdial has a good blog post about this that I won’t repeat here.

However, this is not solely an issue with the education system (though that would be a familiar narrative: workforce not as we would like it? Must be the fault of schools and universities). The pipeline or funnel doesn’t just need filling by shoving lots of 5-year-old girls in one end and waiting for the hordes of female developers to swim out of the other end into an idyllic tech industry pool. Zuckerberg mentions that the lack of women in the industry forms a vicious cycle. This is not a problem at the education end of the funnel.

As this Fortune article describes, the industry is not welcoming to women. The Anita Borg Institute found that women’s quit rates were double those of men. Not to mention issues like maternity leave. The pool at the end of the pipeline is leaking, and for good reason. So the vicious cycle is not simply an accident of history; the women that are in the industry tend to leave. There are several reasons for this, some of which are identity and culture in the industry.

Gamer Identity and Culture

You may well have seen press coverage of the recent “#gamergate” mess. Despite their cover story about ethics in games journalism, #gamergate was started as a way to deliberately target women in the games industry and hound and harass them until they quit or worse:

The 4channers express their hatred and disgust towards [Quinn]; they express their glee at the thought of ruining her career; they fantasize about her being raped and killed. They wonder if all the harassment will drive her to suicide, and only the thought of 4chan getting bad publicity convinces some of them that this isn’t something they should hope for.

How do you explain that to the young women you are inviting to join the pipeline? Come learn to program and how to make games — and try to ignore the fact that a terrorist movement was begun in order to hound your gender out of the industry. It’ll be fun!

Some games writers have focused on the gamergate idiocy as being related to identity and gaming going mainstream. Back in the 80s and 90s, a group of people, mainly young white men, who felt excluded from a masculine sports-centred culture, found solace in making gaming their identity. (And during that period, programming was well linked to gaming, as home PCs like the Spectrum, Commodore, etc were amenable to games and to programming.) As more and more people got into games, the original gamers slowly redefined their identity. Sure, lots of people played games, but no true gamer played Candy Crush. Subcultures formed, each looking down on someone else. Call of Duty players sneered at casual gamers. Older gamers scoffed at Call of Duty players. And so on.

(This is not that unusual in media fandom: music lovers have been the same way for generations (a classic example being mods vs rockers). This article by Arthur Chu directly compares the anti-disco movement to #gamergate, describing how a perception of losing majority status can lead to reactionary rage.)

Programmer Identity and Culture

Now, the games industry isn’t the same thing as the tech industry, but it does clearly overlap: programmers work in both industries. And the identity issues are paralleled in the tech industry. This article by Carlos Bueno nicely sums up how programmer identity is important in the Silicon Valley tech industry. One tale of non-conformism:

[The interview candidate] was dressed impeccably in a suit… I stole a glance to a few of the people from my team who had looked up when he walked in. I could sense the disappointment. It’s not that we’re so petty or strict about the dress code that we are going to disqualify him for not following an unwritten rule, but we know empirically that people who come in dressed in suits rarely work out well for our team. He was failing the go-out-for-a-beer test and he didn’t even know it…

And another:

Again Max Levchin: “PayPal once rejected a candidate who aced all the engineering tests because for fun, the guy said that he liked to play basketball. That single sentence lost him the job.”

These are not issues of job performance, and not a simple gender issue. These are issues of identity and culture (see also the rise of the “brogrammer”). I think that programmers mirror gamers in this aspect. We built a culture where we subtly redefined what characteristics are important until it fitted only the people who we thought it should. As Bueno puts it:

We’ve created a make-believe cult of objective meritocracy, a pseudo-scientific mythos to obscure and reinforce the belief that only people who look and talk like us are worth noticing.

I’m sure many people have worked in programming offices (or sat in programming classes) where they felt excluded if they did not have anything to say about the latest sci-fi series or talk about last night’s DOTA 2 game or whatever. (This problem occurs in several industries, but that’s no justification not to fix it in your own.) It’s not usually a malicious thing, but as with the “go for a beer test” in the earlier quote, companies often assume that new hires must bend to the culture, rather than bending the culture to fit new hires. It’s not exclusively a gender issue, but women and minorities tend to be hit harder by it.

These problems get buried under the idea that programmers have created this wonderful meritocracy, where if you can code well, you will succeed. Programming skill is what really matters. (Despite evidence to the contrary: is the highest-paid or highest-status person in a tech company the best programmer? Does it actually help that much in your career?) And thus programmers tend to believe the reverse, too: if you didn’t succeed, it’s because you couldn’t code well. When the #ghcmanwatch participants suggested that women should just “be better”, that surely arose from this meritocratic world view.

Summary

Computing education is currently making moves to put more women into “the pipeline” (aka “the funnel”) so that we might get more computing graduates. But it’s a tough sell when the end of the pipeline is not a desirable destination.

The only people who can alter that are those who are already in the tech industry, by making sure that the work environment is more welcoming and nurturing to all. That’s a day-to-day, office-by-office battle. A two-fold approach is needed: making the workplace more inviting for women, and getting more women into CS during school and university. Of course culture is just one gender issue (Microsoft’s CEO made headlines and then backtracked over his equal pay comments) but it’s one that everyone in the industry can help to address.

ICER 2014 Roundup

In August I was at the International Computing Education Research (ICER) 2014 conference in Glasgow. This post is a round-up of my notes from the conference (for our own paper, see the previous post). Unfortunately, not all papers are publicly available, but if there is one that interests you, I suggest contacting the author directly for a copy.

Simon gave an interesting talk about academic integrity, pointing out that most university ethics policies on plagiarism, etc, are written in the context of essays, and don’t always apply well to computing. You are generally not allowed to take paragraphs you wrote for one essay, and re-use them in a subsequent essay. However, should you similarly be barred from taking a procedure you wrote in a previous programming assessment and re-using it in another assessment? And what about group work? Some interesting questions, and they gathered opinions from students and educators, but there is no single right answer to many of these issues.

Colleen Lewis pointed out the CS Teaching Tips website that she is involved in running, which attempts to collect small suggestions for teaching computer science. I suggest any computing teachers reading this should take a look and contribute a tip or two — she has one or two from me, which is a clear sign that she needs some higher quality tips.

I enjoyed Michael Hewner’s ICER talk last year and this year, on students’ perceptions of Computer Science, and how they go about choosing modules in their degree. I recommend having a read of the paper: the interview snippets scattered throughout make it a more approachable read than many academic papers (last year’s paper, too). You may nod your head throughout, but it’s one of those pieces of research where almost any outcome will seem obvious in hindsight. But this does not mean it’s all as you would predict. For example: I would have predicted that what their friends took (or had taken) would be an influence, but that’s not the case. They also do not necessarily shy away from difficult courses or those where it is known to be harder to get marks.

I was pleased to learn about Parsons problems from Barbara Ericson and others (see an example here). These are programming problems where students are given some lines of code, and are asked to put the lines into order, including the indentation, to form a program that solves a given task (e.g. drawing a given shape in a turtle language). This seems like quite a nice way to provide a structured introduction to block-based programming.

Speaking of which, there were a couple of posters from PhD students Alistair Stead and Mark Sherman, who plan to look at issues surrounding the transition from block-based programming to text-based programming. It’s clearly going to be a hot topic of research for the next 2–3 years, whether it is researchers investigating the difficulties of the transition or building new tools to try to bridge the gap. I believe John Maloney (designer of Scratch) is working on one such tool with Alan Kay; there is another group doing the same whose name slips my mind; and our team is also building a new editor for Greenfoot to try to bridge this gap. It will be interesting to see what we all come up with! (Addendum: Shuchi Grover pointed to this recent paper on her work in the area.)

Educator Beliefs vs Student Data

Sitting in a talk a year ago, at ITiCSE 2013, I heard the speaker make a familiar statement: “We all know that [Java] students mix up assignment and equality.” My colleagues and I have heard many such claims about what the most common Java student mistakes are (messing up the semi-colons, getting string comparison wrong, etc). With the launch of our Blackbox project, we had the opportunity to investigate these claims at a large scale. And not just investigate what mistakes students made, but also what mistakes the educators thought the students were making. I’d just been reading the Sadler paper which suggested educators’ knowledge of student mistakes was important, and it seemed interesting to investigate this issue in the context of learning to program in Java.

The results are published at this week’s ICER 2014 conference, in a paper with my colleague Amjad Altadmri. The paper is freely available; this blog post attempts to informally summarise the results. We got 76 complete responses from educators, and we had available data from 100,000 users via Blackbox. We tweaked a pre-existing classification of beginners’ Java mistakes into a set of 18. The plan was fairly straightforward: we asked the educators to assign frequency ratings to the 18 mistake categories (which we then transformed into a set of ranks). Then we looked at how often the students actually made the mistakes in those 18 categories. For our analysis, we compared educators to educators (do they agree with each other?) and educators to students (do their beliefs generalise to all students?).
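The rating-to-rank step can be sketched as follows. This is only an assumption about one reasonable way to do it (the paper describes the procedure we actually used): the category rated most frequent gets rank 1, and tied ratings share the average of the positions they occupy. The class name, method name and example ratings are invented for illustration.

```java
import java.util.Arrays;
import java.util.Comparator;

class RatingRanks {
    // Convert frequency ratings into ranks: highest rating gets rank 1,
    // and tied ratings share the average of the ranks they would span.
    static double[] toRanks(double[] ratings) {
        int n = ratings.length;
        Integer[] order = new Integer[n];
        for (int i = 0; i < n; i++) order[i] = i;
        // Sort category indices by rating, highest first.
        Arrays.sort(order, Comparator.comparingDouble((Integer i) -> -ratings[i]));

        double[] ranks = new double[n];
        int start = 0;
        while (start < n) {
            // Find the end of the run of categories with the same rating.
            int end = start;
            while (end + 1 < n && ratings[order[end + 1]] == ratings[order[start]]) end++;
            // Average of the 1-based positions start+1 .. end+1.
            double shared = (start + end) / 2.0 + 1;
            for (int k = start; k <= end; k++) ranks[order[k]] = shared;
            start = end + 1;
        }
        return ranks;
    }

    public static void main(String[] args) {
        // Hypothetical ratings for four mistake categories.
        System.out.println(Arrays.toString(toRanks(new double[] {5, 3, 5, 1})));
    }
}
```

With these made-up ratings the two categories rated 5 share rank 1.5, and the others get ranks 3 and 4.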

Student Mistakes

Let’s start with the student data. The top five mistake categories actually committed by students were, in descending order:

  1. Mismatched brackets/parentheses.
  2. Calling a method with the wrong types.
  3. Missing return statement.
  4. Discarding the return value of a method. (Note: this is contentious because it is not always an error.)
  5. Confusing = with == (assignment with equality).

Everything is obvious in hindsight; so how did the educators fare with their predictions of frequency?

Educator Beliefs

Now on to the educator data. Our first surprise was that the educators did not agree with each other very strongly (equivalent Spearman’s correlation: 0.400). There were a few mistakes where the educators agreed “no-one makes that mistake”, but these “gimmes” actually boosted the educators’ agreement and accuracy. At the top end, there was little consensus on which couple of mistake categories were the most frequent.

Given this lack of agreement between educators, it is not a big surprise that the educators did not agree strongly with the student data (if all the educators predict fairly different rankings then, automatically, only a small number can be close to the student data, whatever it turns out to be). The average Spearman’s correlation between each educator and the student data was 0.514. This graph is an attempt to show what that looks like:

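[Figure: educator rank (Y axis) against Blackbox frequency rank (X axis) for the best, median and worst educators]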

The diagonal line represents perfect agreement between frequency rank in Blackbox (X axis) and by the educator (Y axis). You can see that the best educator, shown as empty circles, comes pretty close to this ideal. But the educator with the median agreement, shown as black squares, was not really very close — for example, they assigned the 12th rank (of 18) to the mistake that was actually the most frequent. The red pluses are the worst educator, who had a negative correlation to the actual results.
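For readers who have not met it, Spearman’s correlation is just the ordinary (Pearson) correlation computed on ranks rather than raw values: 1 means identical rankings, -1 means exactly reversed rankings. A minimal sketch of the calculation, with a made-up class name and made-up example ranks (not data from the study):

```java
class SpearmanSketch {
    // Pearson correlation of two equal-length arrays.
    // When both arrays hold ranks, the result is Spearman's correlation.
    static double correlation(double[] a, double[] b) {
        if (a.length != b.length || a.length < 2) {
            throw new IllegalArgumentException("Need two arrays of equal length, at least 2");
        }
        int n = a.length;
        double meanA = 0, meanB = 0;
        for (int i = 0; i < n; i++) {
            meanA += a[i];
            meanB += b[i];
        }
        meanA /= n;
        meanB /= n;

        double cov = 0, varA = 0, varB = 0;
        for (int i = 0; i < n; i++) {
            double da = a[i] - meanA;
            double db = b[i] - meanB;
            cov += da * db;
            varA += da * da;
            varB += db * db;
        }
        if (varA == 0 || varB == 0) {
            throw new IllegalArgumentException("Correlation is undefined when all ranks are equal");
        }
        return cov / Math.sqrt(varA * varB);
    }

    public static void main(String[] args) {
        double[] observedRanks = {1, 2, 3, 4, 5};  // hypothetical ranks from the data
        double[] predictedRanks = {2, 1, 4, 3, 5}; // hypothetical ranks from one educator
        System.out.println(correlation(observedRanks, predictedRanks));
    }
}
```

In this toy example the educator swaps two neighbouring pairs and gets the last one right, which works out at a correlation of 0.8.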

So, educators in general did not predict the ranks very well. But it could be that the less experienced educators in our sample (e.g. around half had fewer than 5 years’ experience teaching introductory Java) are dragging down the scores of more experienced teachers. We checked for an effect of (a) experience in teaching, (b) teaching intro programming, or (c) teaching intro programming in Java specifically. And we found… nothing. Experience (measured as a, b or c) had no significant effect on educators’ ability to predict the Blackbox mistake frequencies. In graphical form, here is an example match between experience teaching introductory programming in Java (X axis) and prediction of Blackbox data (Y axis):

A great example of a lack of correlation! If you want more details on the results or exact methodology we used, please read the paper.

Conclusions

So, what can we take from this study? Well, we could take the results and head off on a jolly crusade: “Teachers know very little about the mistakes their students make, and experience is worthless!” But we have explicitly not pushed this interpretation. There are several more plausible reasons for our results. Firstly, I think ranking student mistakes is a difficult task, and one that does not necessarily align well with teacher efficacy. Sadler’s paper looked at whether teachers could predict the likely wrong answer to a specific question; we ask here about most frequent mistakes across a whole term/semester of learning, which is a different challenge. It seems likely that this task is too alien to educators, and thus they all struggle with it, regardless of experience.

Another factor that came up in comments after the paper presentation (I think by Angelo Kyrilov, but maybe someone else too?) was that it may matter which environment teachers are using to program — IDEs like Eclipse may minimise the incidence of bracket errors when smart bracketing is used, in contrast to BlueJ (which all the students in our data set used). I don’t think this would explain the whole pattern, but in hindsight I wish we had captured this information when we surveyed the educators.

We can draw two fairly definite conclusions from our study. Firstly: beware any educator who tells you what the most common student mistakes are. You’ll get a different answer from their colleague and they are unlikely to be accurate on a wider scale. (They may, of course, be spot on about their own class, but unless you teach their class, that is not really relevant to you.) Secondly: our results for the frequency of different student mistakes, available in the paper, must be surprising. After all, none of our participants predicted them!

Computing in UK Schools, Chapter Two

In spring of 2013 my co-authors and I published a paper on Computing At School in the UK. Due to the lead time in publications, the majority of the paper was written in summer 2012, and tweaked at the end of 2012. After that paper was published, the change from ICT to Computing in the English National Curriculum was proposed and confirmed, and CAS’s Network of Excellence project started, to try to fill the massive training gap that was created by these developments.

We have now written an updated account of what is happening with Computing in the UK (especially England, where many of the biggest developments occurred in the past year). Entitled “Restart: The Resurgence of Computing in UK Schools”, it has just been published in ACM’s Transactions on Computing Education journal, as part of a timely special issue on computing in schools worldwide. You can download the paper for free.

The special issue includes several other interesting papers on computing in schools in other countries. Alas, by default these papers are not publicly available. (As a starter, Mark Guzdial and Barb Ericson’s is linked in the comments on his blog post.) However, if any interest you, contact the authors to ask them for a copy, or even better, to make a copy publicly available.
