Beyond NAPLAN and the datawall

Denis Fitzgerald

The good news is that Australia has an especially strong research base in outlining what does indeed lift student achievement in all school communities, especially those in greatest need. An exemplary case of this is the work of Professors Wayne Sawyer and Geoff Munns and their colleagues in The Fair Go Project that has closely analysed the teaching practice of highly successful schools in NSW. This project has identified the consistent themes and characteristics of successful teachers and teaching. It found that what truly works is a whole school approach that has a consistently high level of intellectual challenge and where actual teaching and learning are central to all classroom activities thereby forming the basis for academic success. 

This is more elaborate than that simple exposition suggests, however. The successful classrooms are those, as Geoff Munns describes them, where:

The schedule is demanding, with expectations that all learners will be involved with work that is cognitively challenging. The combinations of intellectual quality, active and authentic learning and consciously designed enjoyable experiences build engagement. 

Flowing from this, the classroom becomes a place of:

  • higher order thinking, problem solving and analysis;
  • research and experimentation where students are encouraged to question their conclusions;
  • student-centred discussion;
  • teacher modelling;
  • the foregrounding of goals, concepts and criteria;
  • linking existing student knowledge and new fields of enquiry;
  • the integrated use of resources, including ICT, with rich tasks to encounter; 
  • relentless questioning;
  • the embrace of intellectual risk;
  • valuing the process of learning as well as content knowledge;
  • building understanding and answering problems;
  • interpretation rather than reproduction of rote knowledge.

The Fair Go Project concentrated its work in low SES communities and was able to demonstrate the particular efficacy of these approaches in schools where teaching and learning occur in a complex and often demanding environment. Encouragingly, schools that have followed these positive pathways of thought are demonstrating clear success. Woonona High School, as one example, has a whole school approach of rich task design, high standards in all classrooms, a practice of drawing students into their own assessment and a culture of setting and achieving challenging learning goals. And it works.

It is of particular significance that what has not benefited low SES school communities has been the testing agenda of the past 25 years. Indeed, these communities are some of the biggest losers, globally, from the test obsession. As Linda Darling-Hammond reports from the USA: 

Poor districts offer stripped-down drill and practice approaches to reading and math learning, rather than teaching for higher order applications ... Critical thinking and problem-solving, collaboration, effective oral and written communication, accessing and analyzing information, curiosity and imagination ... have typically been rationed to the most advantaged students in the United States. 

What Darling-Hammond identifies here as higher order needs are the very skills that youngsters from linguistically struggling home environments need to have explicitly taught, valued and practised in their classrooms. 

Alan Luke describes a similar phenomenon in Queensland: 

Far from students in poorer communities lacking ‘basic skills’, they in fact receive more work on decoding at the expense of other critical aspects of reading and literacy. 

Teese and Lamb draw this conclusion from their national research: 

Since public perception of schools based on league tables particularly disadvantages low SES schools, the consequence is a focus on “performance”, rather than “achievement”.

Meanwhile, both performance and achievement are in decline. John Hattie observes that PISA results have steadily declined since the year 2000 and this has been principally because of the worsening relative outcomes of the top 40 per cent of students. And here we draw closer to the quintessential failures of the testing regime. It does not focus on deep performance. It does not enhance performance. It has corresponded to an inexorable decline in results across the board for Australian students in international measures. It correlates with a cross-curriculum decline in top end outcomes, without any commensurate lift in bottom-end achievement. It has narrowed the taught curriculum. It has sapped the joy from too many classrooms and programs. In fact, it has conspicuously and remorselessly failed by every criterion and goal that it has set out for itself. 

Perversely, our sustained national decline across stages, years, subjects and learning areas has led some key policy makers and politicians to prescribe the need for more testing as the cure for the very malady that a generation of testing has clearly contributed to. One might suggest therefore, that the wave of testing has nothing to do with direct student benefit or enhancing student achievement. It now borders on ideology.

Naturally, it is important for schools and systems to review the outcomes of the students in their care. Indeed, systems have an ethical obligation to use such data to apportion resources, as they would in any other area of public policy, in direct ratio to areas of identified need. This function, however, is very different from using data for the purposes of blood-sport media reporting or mindless yuppie chatter. To use the private academic performance of children in obligatory tests as the basis for public speculation, social division, community shaming and talkback prattle is shameful. Systems can collect all of the data required to inform education and resource policy, without reducing the process to the public spectacle it has become today.

NAPLAN online

It comes as no revelation to note that there is a huge impetus from education authorities to move this testing juggernaut online. Concomitant with this desire is a massive edu-business motivation to gain as much command as possible of this lucrative process, which is central to their multi-billion dollar business plans. While the early stages of moving the testing online fell at the first hurdle in 2017 (the capacity of school systems to technically comply), it borders on certainty that the push will be resumed. And while there is some potential in some elements of online assessment, including possible variegated and differentiated testing, shifting it away from a mass and simultaneous test, and disconnecting the test from the current reporting fever, there is much to be done before it is possible even in limited domains.

Until such testing does not advantage those students already starting with advanced keyboard and IT skills that often correspond with existing social privilege, and until all schools are equipped with the capacities and facilities to allow their students a fair and equal go in these assessments, the time is not right to move the testing behemoth online. In its apologia for robot marking, ACARA seeks to sidestep the question of students’ varying capacities and experience with computers and keyboard skills by claiming that: “There are variations in how fast and well a student can type, just as there are variations in how fast and well a student can write by hand.” The speciousness of this justification is exposed when one realises that handwriting speed does not correlate with social class. 

The related concern arrives with the desire of the edu-businesses to command not only the setting of these tests but also the marking of them in totality and exclusively by their corporation’s computers. ACARA, which has a close and worryingly fond relationship with some of these edu-businesses, has been energetically pushing plans to move all marking online, including those sections devoted to persuasive, creative and narrative writing. These vast global corporations, glorying in names such as Measurement Incorporated, Pearson, Pacific Metrics and MetaMetrics, claim they have the technical capacity to mark anything that is tested. 

Teachers are aware that simple multiple choice answers can be scanned and diced by computers but naturally this is because of the low level type of response elicited from students in such assessment items. Yet a dollar can make a person (or a corporation) say or think or believe the most extraordinary notions. What we will soon be confronted with is the proposition that these giant companies can develop algorithms that can evaluate creative forms of expression in extended prose form. The fact is: they cannot. Computers can identify from a pre-programmed list one or more correct answers from a scale. They can engage in reductive counting. At the more advanced level, they can identify patterns and locate some technical errors in spelling or punctuation. They can be programmed, and here we begin to enter dangerous territory, to reward as “correct” a certain answer or answers. Their machine capacities are somewhat proficient in low order parts of the curriculum.

In all of the above, these robot markers contain finite, humanly-programmed language to “mark” and identify finite forms and modes of expression. Quite naturally, as overseas experience reveals, they can therefore be gamed as teachers prepare students to develop responses to please the robot. If you know the underlying algorithms, you can thrive. Again, the underlying feature of computer-based marking is its encouragement of skills at the lower end of the spectrum. But a profit bottom line can be a dangerous thing and we should prepare ourselves for the march of the robots, cheered on by very well paid acolytes who will swear that these machines can mark tests as well as (if not better than) professional teachers, or to use ACARA’s Orwellian term, “human markers”.

But as Hamish Douglass, one of Australia’s leading IT venture capitalists, concedes, we are not even close to having computer capacity to appreciate or evaluate human language and expression. As even this gung-ho tech disciple admits:

We are within five to ten years away from software programs written to truly automate human knowledge and what is missing is computer understanding of natural language. 

Meanwhile, NAPLAN Online is being trialled in schools right now. In the USA robot marking has spread across school systems and into some of their colleges. Thankfully, there has been a backlash led by academics such as Les Perelman from MIT and colleagues at Harvard and other colleges. Perelman has demonstrated how these machines can be gamed and exposed. As he explains, computer markers “do not understand meaning, and they are not sentient. They do not react to language; they merely count it”. 

Perelman and colleagues point out that computers are unable to recognise or judge the elements most associated with good writing such as logic, flair, clarity, accuracy, relevance of ideas, irony, innovation, playful form, appeals to audience, humour, sequencing, truthfulness, variation of tone, modes of persuasion, complex organisation, appropriate use of evidence, devices such as repetition, the deliberate use of grammatical nonconformity and sheer joyful flights of language. 

The US National Council of Teachers of English concurs with the Perelman analysis. They found that computers can detect low-grade, technical errors. They can mark to a deficit system of tracking down and punishing some exam room blemishes. But little more. They are cheaper than humans, however, and they are coming.

Beyond the datawall

To begin constructing a replacement for the current failed assessment and testing mess, we need as a starting point an understanding of the diverse and complex nature of the broad curriculum that students experience and the different contexts in which teachers operate. As Dylan Wiliam argues:

When the assessment is distributed across the whole course, the negative effects of ‘teaching to the test’ are minimized, because the only way to maximize the students’ achievement is to teach all the students everything. When the assessment is cumulative, there is no incentive for students (and teachers) to adopt a shallow approach, because if material is forgotten, it has to be learned again, because it is going to be assessed again ... 

The important thing is that the assessment system, as far as possible, creates positive incentives for teachers to teach well, and for students to study well. Once this kind of assessment is in place, it should fade into the background and be unnoticeable and unremarkable. (Wiliam)

In the Australian context, this would mean that the uniform mass testing of decontextualised skills and knowledge would need to be replaced with testing that has a range of intellectual demands for different student capacities at different times and would be completely embedded in syllabuses and curriculum developed by the appropriate statutory bodies. Quality, equity, personalised learning and curriculum-assessment alignment would all be enhanced through this process. Such an approach would allow schools to progress beyond the current modish managerialist obsession with number harvesting, spreadsheets, datawalls and scattergrams.

As Wiliam’s research establishes: “There is absolutely no evidence that this kind of monitoring has any impact on students’ learning.” What it does do, however, is give the illusion that something constructive is being done, that a pseudo-scientific method is being applied to student need, that political accountability is being served and that the human mind and its progress can be simply mapped. It is scarcely the much-heralded twenty-first century approach to learning. In fact, it bears the hallmarks of a nineteenth century approach to schooling, rather akin to phrenology.

Pasi Sahlberg and Jonathan Hasak remind us that this latest datawall fad is yet another import from US schools where they adorn corridors and staffrooms. They conclude: 

These data sets, however, often don’t spark insight about teaching and learning in classrooms; they are based on analytics and statistics, not on emotions and relationships that drive learning in schools. They also report outputs and outcomes, not the impacts of learning on the lives and minds of learners.

This article has been extracted from: Denis Fitzgerald, Crossroads: A green paper on assessment policy, NSW Teachers Federation.


Linda Darling-Hammond (2010), The flat world and education, New York: Teachers College Press

G Munns, W Sawyer, B Cole and the Fair Go Team (2013), Exemplary Teachers of Students in Poverty, London: Routledge

National Council of Teachers of English, “NCTE Position Statement on Machine Scoring”; and

Rich Tasks Assessment

Pasi Sahlberg and Jonathan Hasak (2017), “Next Big Thing in Education: Small Data”, reprinted in The Journal of Professional Learning Semester 1, 2017 at

Dylan Wiliam (2011), “The bridge between teaching and learning”, Pedagogiska Magasinet [Sweden] September 16


Denis Fitzgerald has taught in a range of public schools as a classroom teacher and in a variety of promotion positions. He was the President of the NSW Teachers Federation and the Federal President of the Australian Education Union. He held the position of Director of Equity and Aboriginal Education in the NSW Department and is a member of the NSW Board of Studies and chairs the Board’s senior History Consultative Committee. He was the founding Director of the Centre for Professional Learning and was the founding editor of the Journal of Professional Learning.

This article appears in Professional Voice 12.2 The improvement factor.
