## Blackbox at SIGCSE

This is a brief announcement, for those who will, or may, attend the SIGCSE (Special Interest Group on Computer Science Education) conference in Atlanta in March next year. The programme has now been published and registration is open. I will attend and present two items about our Blackbox data collection project. Firstly, we have a paper accepted, which I’ll be presenting on Thursday afternoon. (I’ll blog again about this nearer the time.)

Secondly, we will be running a workshop on Saturday afternoon showing how to get started with analysing data from the Blackbox project. This should be of interest to those who already have access to the data but haven’t yet gotten started with their analysis, or those who want a more detailed look at what the project has to offer. If you want to attend, make sure you don’t fly back too early on the Saturday: the workshop runs 3–6pm. You can sign up during registration, or on site. (Technically, the program is still subject to change, but I presume this wouldn’t involve moving the workshop.)

Filed under Uncategorized

## Tracking down a parser problem

In our Blackbox project, we are collecting a lot of data from programming novices as they learn, including the Java source code they are writing. We now have a reasonable amount of data: 5 million successful compilations (and 5 million unsuccessful). I’ve been doing some simple analysis on the data we have collected; in this post I will describe a bug in the analysis that took me a little while to track down.

The analysis is very straightforward. I pull all of the successfully-compiled files out of a disk cache and then process each one (by parsing it and doing my analysis on the resulting syntax tree), then finally pull all of the results together to produce a summary. The analysis runs on a multi-core machine so I run several worker threads in parallel, each running one task at a time until the work queue is empty. I encountered a problem where sometimes the analysis was very slow to finish (or didn’t finish at all). Tracking down a problem that occurs at the end of your hours-long analysis is definitely a frustrating and time-consuming business.

Early on I looked at time spent in garbage collection, and the possibility of a livelock in my worker framework. But eventually I realised that the problem did not lie in the framework or environment; it lay in the processing tasks themselves. Some of my processing tasks were simply taking an incredibly long time to finish. This sounds like an obvious cause, but it was quite surprising. The task was very small: it simply parsed a source file and scanned the resulting tree. There should be no possibility for an infinite loop in there, given that the input is a finite source file. Why would most of these tasks finish near-instantly, while a small handful take near-forever?

Once I realised the problem was in the processing task, I also realised that it was certain inputs that were tying up the worker threads. With enough bad inputs, each worker would eventually be occupied trying to process a difficult input, until my program pretty much ground to a halt. Looking at the difficult inputs gave a clue to the problem. Here is the gist of one problematic input, which was implementing something like a game of twenty questions:

boolean q1 = nextAnswer();
if (q1) {
    return "...";
}
if (!q1) {
    if (q2) {
        return "...";
    }
    if (!q2) {
        if (q3) {
            return "...";
        }
        if (!q3) {
            ... // Lots more nested ifs for q4 onwards
        }
    }
}


And so on, with all the if statements nested twenty levels deep. (Like I said: the data comes from programming novices.) So clearly the problem resided with if statements. Now, parsing if statements in many C-like languages has a problem known as the dangling-else problem — if you have this:

if (a)
    if (b)
        blah();
else
    bleh();


Does that else attach to the first if or the second? The usual answer is that it attaches to the innermost if. However, if you use a classic declarative BNF approach to writing the grammar, you will write an ambiguous grammar. So C and Java rewrite their grammar to be unambiguous. Here’s Java’s rewrite, from the Java Language Specification:

StatementNoShortIf:
    StatementWithoutTrailingSubstatement
    LabeledStatementNoShortIf
    IfThenElseStatementNoShortIf
    WhileStatementNoShortIf
    ForStatementNoShortIf

IfThenStatement:
    if ( Expression ) Statement

IfThenElseStatement:
    if ( Expression ) StatementNoShortIf else Statement

IfThenElseStatementNoShortIf:
    if ( Expression ) StatementNoShortIf else StatementNoShortIf


This rewrite essentially ensures that a StatementNoShortIf will never end in a (non-brace-enclosed) if-without-else, thus removing the ambiguity. And it turns out that the Java parsing library I was using follows this grammar exactly. Effectively, they had copied the Java Language Specification directly into a parsing library, which doesn’t sound unreasonable as a way to ensure that you parse all of Java correctly. Here’s the relevant snippet of code that uses the Parsec parsing combinator library (comments added by me):

ifStmt = do
  tok KW_If         -- Consume if keyword
  e <- parens exp   -- then parenthesised expression
  -- Then try the following two alternatives in turn.
  -- First, statement with else:
  (try $ do th <- stmtNSI   -- Stmt with no short if
            tok KW_Else     -- Consume else keyword
            el <- stmt      -- Parse another statement
            return $ IfThenElse e th el) <|>
    -- Second alternative: if-without-else:
    (do th <- stmt
        return $ IfThen e th)


The practical consequence of this code is that every if-without-else has two parse attempts; one that looks for an else afterwards (but must parse the body of the if before it finds out there isn’t one) and one that re-parses the same body, but doesn’t look for an else. By itself, this double parse is not that big an issue — sure, it takes time to parse the body twice, especially if it’s large, but computers are fast enough that it’s not a massive issue.

That is, until you nest multiple if statements. Our novice programmer nested 24 deep. So we first attempt to parse the outermost if looking for an else. But we also make the same attempt with each of the 23 nested ifs inside it. The innermost one will fail (our novice isn’t using “else”s), and re-parse. Then the second innermost one will fail and reparse. Suddenly our parsing will make something like 2^24 attempts to parse the innermost if body. It was this exponential blow-up that left our parsing tasks running for an extremely long time. One small fix later to avoid parsing the body twice, and the parsing was running in reasonable time on the problematic inputs — effectively going from O(2^N) to O(N) complexity. Sanity restored. (You can now argue about the superiority of LL and LR parsers and so on, if you’re so inclined.)
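The shape of the fix can be sketched with a toy Parsec grammar (the micro-syntax, data type and parser names here are invented for illustration, not the actual library’s code): parse the condition and the body once, then look for an optional else, rather than wrapping the whole body in a backtracking try.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

data Stmt = IfThen String Stmt
          | IfThenElse String Stmt Stmt
          | Call String
          deriving (Eq, Show)

-- Toy statement grammar: if(cond)stmt [else stmt], or name();
stmt :: Parser Stmt
stmt = ifStmt <|> callStmt

callStmt :: Parser Stmt
callStmt = do
  name <- many1 letter
  _ <- string "();"
  return (Call name)

-- Parse the body exactly once; only then check for an optional else.
-- No backtracking over the body, so deep nesting stays linear-time.
ifStmt :: Parser Stmt
ifStmt = do
  _ <- string "if("
  c <- many1 letter
  _ <- string ")"
  th <- stmt
  mEl <- optionMaybe (string "else" >> stmt)
  return (maybe (IfThen c th) (IfThenElse c th) mEl)

main :: IO ()
main = do
  -- The dangling else attaches to the innermost if:
  print (parse stmt "" "if(a)if(b)x();else y();")
  -- Deep if-without-else nesting now parses instantly:
  print (either (const False) (const True)
           (parse stmt "" (concat (replicate 30 "if(c)") ++ "x();")))
```

Because the then-branch is never re-parsed, the same factoring that the grammar rewrite achieves declaratively is here achieved operationally.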

### Conclusion

These sorts of articles are usually meant to end with a lesson learned. Take your pick: don’t blindly copy BNF into a parsing library; beware of potential problems in the libraries you are using; have analytics on how long your processing tasks are taking, and flag up or kill off outliers; and finally, be better at narrowing down the problem than I was (a few print statements at the right points pinpointed the problem, but I went down a few blind alleys first). I am certainly learning that when you have a large dataset like ours, the odd corner cases in your code will be found out. This is both a blessing and a curse: you have to write better code, but at least you have a massive source of input data to test your algorithm on.

Filed under Uncategorized

## Interface Design

All programs exist within a larger environment, and typically have an interface to interact with the environment. That may be a hardware interface for doing low-level I/O, a “standard library” in languages like C or Java, or a set of built-in commands in an educational environment like Logo, Scratch or Greenfoot. Broadly, there’s two different approaches to take when designing an interface: I’ll call them minimalism and convenience. I’ll describe each in turn, with examples.

One way to design an interface is for minimalism: to find the smallest set of orthogonal commands that allow you to do everything the environment supports. For example, in many environments (e.g. POSIX/C), the file-access interface consists of: open, close, seek, tell, read and write. This allows you to accomplish anything you want with files, and it’s hard to provide a simpler interface that can still accomplish all possible tasks. As a non-programming example, some calculators have a power operation (x^y) but no square root button, as the former subsumes the latter (x^0.5 = √x) — and they do not offer a conversion between radians and degrees, just multiplication, division and a button for π.
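For a feel of the minimal style, here is how overwriting a file looks through Haskell’s handle-based System.IO interface (its analogue of the POSIX-style interface above); the file name is of course arbitrary:

```haskell
import System.IO

main :: IO ()
main = do
  h <- openFile "out.txt" WriteMode   -- open (WriteMode truncates old contents)
  hPutStr h "hello"                   -- write
  pos <- hTell h                      -- tell: current position, here 5
  hClose h                            -- close
  putStrLn ("wrote " ++ show pos ++ " characters")
```

Every step is explicit, and the primitives compose to cover any file task, at the cost of ceremony for the common ones.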

The other way to design an interface is for convenience. The classic file interface is nice and orthogonal, but the most common file-writing task is simply to replace a file’s entire contents with a given string. Some languages, like Haskell, have a function that does exactly that — you give a file path and a string, and it just writes that to the file (effectively doing open-for-overwrite, seek-to-begin, write, close). It overlaps with the classic operations, which are still provided, but it is easier to get done what you want. As another example, Excel provides an AVERAGE function for getting the mean of a range, even though it’s trivial to do given the SUM and COUNT functions.
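Haskell’s standard writeFile (with readFile as its counterpart) is exactly this convenience function:

```haskell
main :: IO ()
main = do
  writeFile "greeting.txt" "hello, world\n"  -- open, write, close in one call
  contents <- readFile "greeting.txt"        -- the matching convenience reader
  putStr contents
```

One call replaces the whole open/write/close dance of the minimal interface.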

These two different approaches are different ends of a spectrum, and each interface designer must make design decisions to veer towards one or the other. Some orthogonal interfaces can make it very hard to implement common cases (e.g. Java’s file API), while some convenience interfaces pack in so many similar methods and synonyms (e.g. Ruby’s array API) that it becomes a tangled mess (especially if, in Ruby’s case, you want to duck-type something to act like an array). I’m particularly interested in these decisions in the context of educational environments, such as in Greenfoot, where I have a hand in the design.

### Education Example: Greenfoot

The original Greenfoot 1.* had these methods for movement: setLocation(x, y), getX() and getY(). It also had setRotation(r) and getRotation(), which rotated your image. This was design by minimalism: these methods are enough to implement any kind of movement you want: up/down/left/right Pacman-style movement, Asteroids style drifting, directional movement, and so on.

So there you are on your first day of teaching. You load up Greenfoot and want to show some novices how to get your character running across the screen in the direction in which it is pointed. Here’s the code, if you just used Greenfoot’s original built-in movement methods:

setLocation((int)(getX() + 5 * Math.cos(Math.toRadians(getRotation()))),
            (int)(getY() + 5 * Math.sin(Math.toRadians(getRotation()))));


Ouch! Let’s see how many concepts are in that one line:

• Method calls with no parameters or multiple parameters
• Method calls with and without return values
• The general concept of accessors and mutators (aka getters and setters) and their combination
• Trigonometry
• Nested method calls
• Casting between numeric types

That’s a lot to take in. Compare this to our move method which we added more recently:

move(5);

Now we’re down to just one method call, with one parameter and no return. (We also debated building in a no-parameter version, move(), that advanced a default amount, but that seemed too little gain over the one-parameter version.)

In fact, since version 2.0 of Greenfoot, we’ve slowly added more methods in the convenience category: move, turn, turnTowards, isTouching/removeTouching and a few more. Looking back, I think it has mainly been me arguing to add these methods, based on making our beginners’ workshop smoother. Now, designing the software differently purely to make a single workshop easier is a bit bizarre, but my thought has always been that the workshop mirrors most users’ journey into Greenfoot. Generally, allowing people to easily do what they commonly ask for when they start with Greenfoot seems like a good idea.

Everyone wants collision detection, so in 2.3.0 we made it shorter and simpler. In the beginners’ workshop, everyone always asks how to add a score, and my answer has always been that unfortunately it’s not as easy as you’d think. So in Greenfoot 2.4.0, we’re adding new methods so that it can be as easy as you’d think. There’s a limit to how easy we want to make things: at some point users need to learn more complicated concepts to perform more complicated tasks. But I think the right time for that should be a bit later than it was in version 1.*.

With the new methods, a learner’s journey can be better structured. In the upcoming version, if you want a score, you can add it and display it easily. If you want to eat things you run into, that’s also easier. Then, if you want to be able to shoot bullets that destroy asteroids and increase the player’s score, you already have an easy way to do the score and the destroying, but you must use object references (the bullet holds a reference to the rocket that fired it) to add to the score at the appropriate time. But at least it’s a more manageable pathway.

(Early versions of Greenfoot tried to have it both ways, by providing a minimal core interface, and then introducing convenience helper classes into beginners’ scenarios, so that the move() method would be in a per-scenario helper class. But ultimately this could cause confusion when beginners moved on and found these convenience methods were “missing” from their next scenario. So gradually, we have steered away from this approach to putting convenience methods into the core of Greenfoot.)

### Education Example: Scratch

I find Scratch 1.* an interesting example because it had a couple of significant complications in its interface design. Scratch 1.* did not support user-defined methods (a restriction later remedied by Scratch 2.0 and BYOB/Snap). So users were not able to combine several simple commands into a larger command for re-use. This probably biased Scratch somewhat towards convenience, as the users could not add a convenience layer themselves. But at the same time, in the Scratch interface, all blocks are accessible in a visual menu down the side of the interface. Thus it is harder to hide a large set of methods compared to something like Greenfoot (where you could have a large list available through the documentation). So this might steer Scratch away from having too many blocks available.

You can run a critical eye over Scratch’s interface yourself, but I would say that Scratch mainly ended up with convenience methods. If you look at Scratch’s list of motion blocks, technically it could be boiled down to a small number of orthogonal methods, but instead they have provided many similar variants. For example, turn anti-clockwise could have been left out (it is just turning clockwise by a negative amount), “set x to” and “set y to” are just special cases of “go to x: y:”, and so on. However, having all these blocks means beginners may need to think less about motion: you just search for a block that does almost exactly what you want, rather than having to think how to combine several smaller operations. (On the other hand, you could argue that having all these blocks makes it more overwhelming for the beginner.)

I can’t finish discussing Scratch without pointing out my favourite convenience block: the “say” block. This block displays a speech bubble coming from a character with the given text inside it. This single block enables entire classes of Scratch scenarios: stories, interactive text adventures and so on. I think it was probably the minor masterpiece in Scratch.

### Summary

Interfaces can be designed for minimalism (with fewer, orthogonal methods) or convenience (with more methods to easily handle the common cases). In general software design, there’s no single right answer. However, I wonder if educational software design has a different set of constraints that mean convenience is more often the right answer, allowing learners to get satisfying results earlier on with less cognitive demand. Of course, the meat of programming comes in the combination of methods, but via interface design, the creators of systems like Scratch and Greenfoot can choose at which point novices need to begin to combine methods, and how much can be done with built-in convenient primitive methods.

(The inspiration for this blog post came from John Maloney, one of the Scratch designers, who shared his observation about these two different ways of approaching interface design.)

Filed under Uncategorized

## Unprofessional

In this post, Andrew Old and an accountant friend begin to explore what a profession is. Broadly, three requirements are listed: a professional qualification, body, and code of ethics, such that “all members of a profession will belong to a professional body and the professional qualification is awarded by this body”. One additional implied point that I have heard elsewhere is the idea that the professional body can expel members when appropriate (e.g. if the code of ethics is violated).

I’m reasonably happy with that as a definition of a profession. It intrigues me because my job is to be a researcher and software developer, and many or most practitioners in these areas are not professionals. If you’re a UK teacher reading this, you aren’t a professional either. Is this lack of professionalism a problem that needs fixing?

### Professional

The point of being a profession is to uphold minimum standards (to prevent damage to the public from bad doctors or bad engineers), to uphold the reputation of the occupation, to provide a voice for its members and to collectively advance practice. Let’s explore some of those issues and also look at the costs of being a profession.

Upholding minimum standards is clearly a good idea. Bad software can cause real harm (famous example: Therac-25) or irritation on a massive scale (e.g. the general hatred for Microsoft in the late ’90s when Windows was quite unstable). Bad research can cause harm if acted upon, and generally harms the reputation of the occupation. Every researcher that conducts a bad study without a control group, uses bad statistics or fishes for a significant result harms the cause of research. However, regulation of practitioners is typically no guarantee against bad practice, just against malpractice.

As for advancing the profession: researchers and software developers tend to be very open about their practice. The web is full of software development blogs, question and answer sites like Stack Overflow (and in days gone by, newsgroups and similar). Developers invented the web, and colonised it in the name of helping each other to program — even if that often means berating each other about how not to program. One of researchers’ primary functions is to publish their work and attend conferences. Both of these fields seem to advance their practice without needing any professional body involved.

Sometimes, even being a profession by our definition is insufficient to achieve total change. There is a UK professional body for software developers: the BCS. The BCS accredits degree programmes, runs a chartership system and has a professional code of ethics. So technically, UK software developers are a profession. It’s just that very few clients or employers pay attention. You wouldn’t hire a lawyer who wasn’t professionally accredited. However, people do hire software developers all the time who are not chartered or accredited (e.g. programmers who are self-taught, or transferred from other disciplines, and hold no qualification in computing at all), which in turn makes people wonder if it’s worthwhile being accredited or chartered, and hence not all developers are members of the BCS. Drawing a parallel to teachers: if someone sets up a professional body for UK teachers tomorrow, but in ten years’ time hardly anyone requires or notices whether teachers are members, will it hold much weight?

### Unprofessional

Being a profession has upsides, like being able to bar those who commit malpractice. But a profession does not come for free. A profession costs money, usually taken from individuals or their employers via membership fees. Certification and chartership costs time for those who apply. (Not to mention that an occupation is not necessarily worthy just because it is a profession. Did you know that homeopaths are a profession?)

I also wonder if professions can sometimes be an unnecessary barrier. One of the interesting movements in software development over the last 20 or 30 years is the growth in open source software. You are reading this via WordPress (open source) which uses the PHP language (open source), probably on an Apache web server (open source) on a Linux server (open source), and viewing it in your browser (which is likely also open source). Some open source software is developed by full-time programmers on the clock, but a significant amount is also developed either by complete amateurs (meaning they have no job as a developer at all) or by programmers in their spare time. If open source was only developed by accredited professionals, its health would be diminished, but I doubt that its quality would be noticeably improved.

### Summary

Ultimately, when it comes down to questions of professionalism, I wonder: is professionalism something worth aiming for? Do the gains outweigh the costs? Is it hypocrisy if I were to answer yes for doctors (where I am a client) but no for an occupation where I am a provider? Thoughts are welcome below.

Filed under Uncategorized

## The Importance of Types

This is a post of two halves. I will start by explaining why I think types are so useful in professional programming, and then later discuss their place in learning to program.

### I ♥ Types

From a software engineering viewpoint, I am a strong proponent of types. Unfortunately, most non-functional programming languages have included types in a fairly lacklustre way. When I say that types are useful, people may think about the distinction between integers, floats and strings (classic question: what type is a telephone number?), or the distinction between different record/class types. These are very useful distinctions but they are relatively basic uses of types. Let’s explore some more powerful uses of types.

NASA’s Mars Climate Orbiter project famously crashed due to a mixup in the units being used. The compiler did not complain because all the numbers involved were typed as floating point numbers, and you are permitted to manipulate them as you please. However, the Orbiter failure should be seen as a typing failure. The F# language has a units feature that can prevent such mixups. One number can be typed as a float<meters>. Another can be typed as a float<seconds>. Divide the former by the latter and you have a float<meters/seconds>. If you try to add that to an acceleration number (typed as float<meters/seconds^2>) then you will get a compile error. This cleverer, more thorough use of types begins to illustrate the benefits.

The NASA Mars Climate Orbiter, which crashed due to a mixup between pounds and Newtons.
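F#’s units of measure are a built-in compiler feature, but a rough flavour of unit-typed numbers can be sketched in Haskell with phantom types. This is a hand-rolled approximation, with invented type names: the unit arithmetic for division is encoded per function here, rather than derived automatically by the compiler as F# does.

```haskell
-- A quantity whose unit exists only at the type level.
newtype Quantity unit = Q Double deriving Show

data Meters
data Seconds
data MetersPerSecond

-- Unlike F#, each unit-changing operation must be written out by hand:
speed :: Quantity Meters -> Quantity Seconds -> Quantity MetersPerSecond
speed (Q d) (Q t) = Q (d / t)

-- Addition only type-checks when the units match:
add :: Quantity u -> Quantity u -> Quantity u
add (Q a) (Q b) = Q (a + b)

main :: IO ()
main = do
  let distance = Q 100.0 :: Quantity Meters
      time     = Q 10.0  :: Quantity Seconds
  print (speed distance time)
  -- print (add distance time)  -- rejected: Meters does not match Seconds
```

Even this crude version catches the Orbiter-style mistake of combining two numbers whose units disagree.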

Many problems in programming arise from letting too many variables in your program have a plain string type. A URL, a file path, a file’s contents, a MySQL query string, a GUI label and someone’s name are all text. Letting them all have the same type and be manipulated in the same ways leads to all sorts of accidental errors. Concatenating the contents of two files might make sense, but concatenating two absolute directory names or two URLs does not. Many, if not all, SQL injection bugs can be seen as the concatenation of two incompatible types: a string originating from user or external input (which could be typed as such) and a query string, resulting in a query string. To avoid SQL injection, you should not allow any non-escaped user input into a query string. Similar logic applies to injecting user-originated Javascript content into webpages. In fact, several dynamic languages have tried to cut this off by attaching a taint flag to strings that originate externally; they then prevent the use of (non-escaped) tainted strings in the wrong places, such as HTML generation.
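A sketch of that idea in Haskell, with hypothetical wrapper types (not any real library’s API): distinct types for user input and for query text, where escaping is the only bridge between them.

```haskell
-- Hypothetical wrapper types; in a real module the constructors would be
-- kept private, so escaping is the only way from UserInput to SqlQuery.
newtype UserInput = UserInput String
newtype SqlQuery  = SqlQuery String deriving Show

-- A toy escape function (real escaping rules are database-specific):
escaped :: UserInput -> SqlQuery
escaped (UserInput s) = SqlQuery (concatMap esc s)
  where esc '\'' = "''"
        esc c    = [c]

-- Queries can only be concatenated with other queries:
(+++) :: SqlQuery -> SqlQuery -> SqlQuery
SqlQuery a +++ SqlQuery b = SqlQuery (a ++ b)

main :: IO ()
main = print (SqlQuery "SELECT * FROM users WHERE name = '"
              +++ escaped (UserInput "O'Brien") +++ SqlQuery "'")
```

Trying to concatenate a raw UserInput into a SqlQuery simply does not type-check, which is the whole point.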

Just as not all strings are the same, not all integers are either. Recently, I have been developing some software that reads from a database. Each table has a 64-bit integer primary key named id (NB: I didn’t design the schema). You might be tempted to read that field into a plain 64-bit integer type. But, as well as permitting nonsensical addition and multiplication of ids, this means that you might accidentally read the id field from the users table and use it to find an entry in the posts table. A classic way this can happen is that you have a function that takes two ids, one for users and one for posts, and you pass the parameters in the wrong order. The better way to program this system is to use a different type for the id of the users table (e.g. id<users>) and for the posts table (id<posts>). That way your method can have two parameters with different types, and the compiler can issue an error if you get the order wrong.
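A minimal sketch of this in Haskell, using a phantom type parameter (the table and function names are hypothetical):

```haskell
import Data.Int (Int64)

-- The table parameter exists only in the type, not at runtime:
newtype Id table = Id Int64 deriving (Eq, Show)

data Users
data Posts

-- Swapping the two arguments is now a compile error, not a silent bug:
describe :: Id Users -> Id Posts -> String
describe (Id u) (Id p) = "user " ++ show u ++ " viewing post " ++ show p

main :: IO ()
main = do
  let userId = Id 17 :: Id Users
      postId = Id 42 :: Id Posts
  putStrLn (describe userId postId)
  -- putStrLn (describe postId userId)  -- rejected by the compiler
```

At runtime both ids are still plain 64-bit integers; the distinction costs nothing once compiled.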

Types prevent errors, and static types prevent errors very early in the development process: at compile time. Static type declarations also serve as a form of documentation. What’s more, I know that the types must be kept up to date, unlike documentation: out-of-date documentation requires a programmer to notice, whereas out-of-date types cause a compiler error. And if you want some really interesting type-related programming: for some types there is only a single possible implementation of a function with that type, and thus writing the type is enough for a programmer or a tool to work out the function’s implementation. The djinn tool does exactly this, deriving an implementation from a Haskell type.

### And Yet…

You get the picture: I’m a big fan of using types. However, there is one case where I’m not as certain about how prominently types should feature: programming education.

I accept that a good strategy for teaching is to pare down what the students are exposed to. You start by teaching the students variables, or method calls, and you leave out other concepts (like loops) until students have got the hang of the first concept. So should types be one of the later concepts, omitted until students are ready for them? Should we start in languages where the types are relatively hidden and then move to more explicitly typed languages later on?

Early on, this is the distinction between writing:

x = 5
y = "hello"


and:

int x = 5
string y = "hello"


Obviously, the former involves fewer concepts to start with. But does the latter help your understanding in the long run? We get into further differences with the question of whether you can write:

x = 5
x = "hello"


Is a variable a place for storing anything, or does a variable have a type? If you start with the former, is it difficult to later teach the latter? Similarly, can you have heterogeneously-typed lists with all sorts of different elements:

x = [ 5, "hello", [3.0, 5.6] ]


Broadly, what I’m wondering is: are dynamically/flexibly typed systems a benefit to learners by hiding complexity, or are they a hindrance because they hide the types that are there underneath? (Aside from the lambda calculus and basic assembly language, I can’t immediately think of any programming languages that are truly untyped. Python, Javascript et al do have types; they are just less apparent and more flexible.) Oddly, I haven’t found any research into these specific issues, which I suspect is because these variations tend to be per-language, and there are too many other confounds in comparing, say, Python and Java — they have many more differences than their type system. I’m interested to hear anyone’s thoughts on this issue.

Filed under Uncategorized

## Peer Instruction and Computing

Lectures in higher education correspond to that old Churchill chestnut about democracy: they are the worst of all mass teaching methods, except all the others that have been tried. The criticisms are simple: they are too passive, too monotonous, and students don’t gain much from being there. Various solutions have been proposed, from closing all the universities and switching to MOOCs (with, erm, online lectures?) to flipping the classroom and more besides. One early flipped-classroom-style proposal (in the 90s) was Peer Instruction, put forward by Eric Mazur, which I will describe in this post.

### Peer Instruction

Mazur was teaching Physics at Harvard. He saw research showing that students’ wrong preconceptions about how the physical world works (along the lines of heavier things fall faster) were persisting despite studying high school physics and university physics. The solution was centred around drawing out and challenging the students’ misconceptions. I’ve previously discussed how to do this in online videos, but Mazur came up with changes to the lecture format to enable this to be done effectively in a lecture theatre.

The basics of Peer Instruction are:

• Set reading before the class
• Discuss a topic briefly (say, for ten minutes) in the lecture
• Propose a conceptual question on the topic, with a right answer, and several answers corresponding to common misconceptions
• Get the students to answer the question by themselves (e.g. with “clicker” devices or equivalent smartphone apps)
• Get the students to engage in small group discussions with their neighbours about the answer they gave
• Get the students to answer the question again via their clicker
• The lecturer then takes questions from the audience, or clears up misconceptions he/she overheard during the small group discussions

There are presumably various forces at work here. The question followed by discussion helps students realise when they are wrong (the first step in the battle!) and then students can convince each other of the right answer. Mazur makes the point that students can often provide a well-tailored explanation to each other because they share a novice viewpoint, although presumably you also run the danger of the blind leading the blind. Advocates of Peer Instruction are keen to point out that it’s not just about adding clickers to ordinary lectures. It’s not the technology by itself that makes a difference, it’s the whole protocol.

### Concept not calculations

One central argument in the Peer Instruction book is about solidifying concepts, not calculations. Mazur shows that conceptual understanding is relatively orthogonal to the ability to calculate the correct answer on standard textbook/exam questions. This chimed a little with me: I recall drawing lots of force diagrams in mechanics (the physics end of maths) involving a normal force. I was never happy with the concept: the idea of the ground exerting an upwards force was baffling to me. Is the ground really pushing upwards? What would happen if you turned gravity off — would everything fly upwards with the power of the normal force? However, these questions never mattered to my progress: I knew how and where to add the force on the diagram such that the forces would balance and I could get the correct answer. I got through the course successfully despite not understanding many of the central concepts, and Mazur suggests that many physics students are the same. Based on the evidence he provides in his book, Mazur’s Peer Instruction seems to greatly increase conceptual understanding.

### The question is the questions

One aspect that seems to be key to peer instruction is picking the right questions. Peter Newbury discusses the issue in a post here — broadly, you need a question that illuminates misconceptions and provokes discussion. All of Mazur’s questions are in physics, so to apply peer instruction to computing, someone needs to develop a bank of computing questions. Enter Beth Simon at UC San Diego, who has been applying peer instruction in huge lectures, with sparkling results (see links to papers at the end of the post). You can get access to questions at the research group’s Peer Instruction 4 CS site.

Here’s an example question, about recursion:

Consider the following code.

```c
void recur (int i) {
    if (i == 0) {
        printf ("%d ", i);
        return;
    }
    for (int j = 0; j < 2; j++)
        recur (i - 1);
}
```


What is printed by the call `recur (1)`?
A. 0
B. 0 0
C. 0 0 0 … infinitely
D. 0 1
E. None of the above

The idea is that each answer to the question traps a different misunderstanding of the code (e.g. confusing i and j, not understanding that two recursive calls will occur, or that the recursion will end).

### Picking Groups

I recently had the chance to hear Quintin Cutts (who’s been applying these ideas at the University of Glasgow) and Beth Simon talking about the methodology during this year’s ICER conference. One question that came up was the effect of these small groups: computing tends to have a problem with a wide variance in prior knowledge (and a glut of show-off know-it-alls). Simon teaches primarily non-majors, so tends to have a more even distribution of ability; she assigns groups randomly. Cutts has a more traditional computing intake, and said that he deliberately groups people by ability level. If there is a wide disparity, he said, you lose the peer aspect and it just becomes one student forever teaching another. Equalising the ability level allows for better discussions.

I find Peer Instruction an interesting methodology and the results seem positive. It’s a teaching method that’s intended for university; I haven’t looked to see if there has been work on applying this pre-university in schools, although there have been questions developed for Alice. I am conscious of the stress that the proponents place on following the protocol closely and accurately. I’m reminded of Coe’s talk from ResearchEd at the weekend where he mentioned findings that worked well in research not necessarily transferring into widespread practice. Certainly there seem to have been several educators who took the message “use clickers” from this work, without achieving any of the benefits because they dispensed with the important other aspects of the protocol.

There are many papers about Peer Instruction in computing. Rather than list individual papers, here are some links to lists of relevant papers (almost all publicly available): one list here, and another list here.

Edit: I see that I forgot to mention explicitly that there is a Peer Instruction book written by Mazur. However, it’s only about 40 pages on Peer Instruction in general, and the rest is a vast collection of physics Peer Instruction questions. So if you’re interested in Peer Instruction for other subjects, you may find yourself a little disappointed with how slim the book effectively is.

1 Comment

Filed under Uncategorized

## ResearchEd 2013

ResearchEd 2013 took place today. It was a (UK) conference intended to bring together teachers and education academics, to try to better support collaboration between the two. One of the most pleasing things was the appetite for the conference: 500 attendees got themselves to London on a Saturday, with 400 more on the waiting list. And this for a conference that had never been run before, advertised mainly through Twitter and blogs. I went — here are some of my thoughts.

Dulwich College, host to ResearchEd 2013

### Coe on Evidence

Robert Coe’s talk was my favourite of the day (slides here, PPT). He mentioned the Education Endowment Foundation, and their work on trying to summarise useful educational research (a bit like the Cochrane Review?) — something I want to look into further.

Coe sounded some notes of caution on transferring research into practice: using his example, “Assessment for Learning” apparently comes out very well in trials, but this did not translate into a massive effect in practice when the government pushed it. Understanding why would be useful for future efforts to transfer research into practice.

Coe also pointed out that some practices continue to be used despite a lack of evidence for their effectiveness. Tom Bennett previously documented most of the obviously barmy ones (in his book Teacher Proof, my review here), but as a less obvious example, Coe questioned where the evidence is that classroom observation (teachers observing their peers) improves teaching.

### The Effect Size Debate

Coe also cropped up in another interesting session: a debate between Coe and Ollie Orange about whether effect size is a good measure. Coe was for effect size, Orange against. I think Coe’s argument boiled down to: it’s not ideal, but it is a useful, slightly crude, heuristic in several circumstances (comparing incompatible measures of the same outcome, performing meta-analyses).

Orange’s argument was not as convincing. A large part of his argument was that proponents/inventors of effect size do not have maths/statistics degrees (he actually listed them and their degrees out loud) and that mathematicians do not use effect sizes. Dealing with the first part: I agree that the lack of training could be a warning sign, but it is not itself an argument against effect size. Science and rationalism are about reasoned arguments, not who said what. In counter-argument to the second point, Coe asked: why would a pure mathematician use an effect size? It’s a pragmatic measure used by empirical researchers (in education, psychology, medicine and so forth). It seemed a shame that Orange did not dispense with all this and spend more time on critiquing the mathematical properties of effect size instead. (Not all the audience might have followed it, but it seems to me that debating effect size requires getting into the mathematics at least a little.)

In the comments afterwards, discussion inevitably moved to Hattie, who based his large meta-meta-analysis on effect size. Coe said that he thought Hattie’s work was “riddled with errors” (which sounds like it roughly agrees with my assessment of the book). I think it’s important not to use inappropriate uses of a statistic to argue against all uses of the statistic. The mean is a bad measure for skewed data (like salaries) but that does not imply that we should all stop using the mean as a statistical measure.

### Pick and Mix

A few leftover bits. Ben Goldacre’s keynote was good, and he did admirably well at surviving the nightmare hitch of not being able to display his slides. Useful point: if we start properly assessing the claims of new education initiatives and products, this will encourage their proponents to make smaller, more reasonable claims. Amanda Spielman in her talk mentioned having a “drawer of debunking papers” ready to hand out to people who suggested a known-to-be-ineffective initiative to her — I liked that notion. Tom Bennett: good science seeks to disprove itself, not to confirm existing beliefs.

Overall, I enjoyed the conference and wished I could have gone to a couple more sessions (the conference had six parallel sessions!). However, I gather many of the sessions were recorded, so I should shortly get my wish. A good sign from my perspective was meeting Sue Sentance there, who now works for Computing At School, and who is keen on encouraging more research collaboration between computing teachers and academics in the UK. It’s clearly some kind of zeitgeist.