Putting NAPLAN to the test: Towards a new national assessment system

Rachel Wilson and Pasi Sahlberg

The famous American psychologist Abraham Maslow (1966) once said: “I suppose it is tempting, if the only tool you have is a hammer, to treat everything as if it were a nail.” Sufficient proof of this practical wisdom is right before us: give a hammer to young children and before too long they will find many things around them in need of hammering.

What does this have to do with education? Well, when important education policy decisions are made, and stockpiles of data from standardised student assessments like PISA and NAPLAN are available, those data will most likely be used somehow to justify the proposed decisions, whatever they are. But just as a hammer is not always the right tool, an assessment system can also hit the wrong target. Indeed, assessment systems have done so before and, unfortunately, they will again.

In this article we will outline the problems with the current NAPLAN tests and what an alternative national assessment system in Australia could look like. Before doing that, we must answer an intriguing question: what are appropriate, and inappropriate, uses of national student assessment tests? Surprisingly, despite the number of previous reviews of NAPLAN, this question hasn’t had the attention it deserves.

First, we believe that using standardised student assessments such as NAPLAN for one or more of the following purposes can be considered appropriate:

  1. Informing teachers and parents about their students’ relative achievement.
  2. Informing education system leaders about, and holding them accountable for, education system performance.
  3. Informing policy decision-makers about how to allocate supplemental resources to schools.
  4. Selecting students for special programs.

There is a strong shared understanding among assessment experts and researchers that “no one test can serve several purposes equally well” (Popham et al., 2014), so there needs to be a priority order for the purposes – and the assessment needs careful design to meet them. If national student assessment data are used for more than one of the above purposes, it is advisable to decide which purpose is the most important and invest most effort in serving it as well as possible. Otherwise, the quality and validity of serving any of these purposes is put in jeopardy.

Second, we believe that standardised student assessments such as NAPLAN should not be used for the following purposes, for the following reasons:

  1. Judging the quality of schools or teachers using data from standardised student assessment tests. Evidence suggests that schools and teachers have only limited influence on aggregate measures of students’ performance, such as the average test scores of a school or of groups of students within it (Haertel, 2013).
  2. Deciding the quality of curriculum, instructional programs, or teaching in schools. Standardised tests are not designed to be instructionally sensitive; they cannot reliably detect differences in teachers’ teaching skills or in schools’ instructional programs.
  3. Grading individual students. Standardised student assessments are not designed as end-of-year (or end-of-course) tests, and they are therefore unfair metrics for grading students.

In a recent report by the Gonski Institute for Education, together with our colleagues we recommended replacing the National Assessment Program – Literacy and Numeracy (NAPLAN) with a new National Assessment System that puts students’ interests first and prioritises supporting them, and their teachers and parents, so that teaching and learning can flourish (Wilson et al., 2021).

We are not suggesting that standardised testing be abolished altogether. Quite the opposite: we think that Australian students, their parents and teachers deserve much better ways of assessing learning and education, ways that are less harmful and more informative to all of them. Here is what we have suggested.

A radical rethink?

After evaluating NAPLAN against its original stated aims (a C grade) and against a series of more contemporary aims (an F grade), we drew on the now-substantial mountain of research evidence on NAPLAN, including surveys commissioned by ACARA, to go beyond critique and provide productive feedback on how our assessment system could be redesigned from the ground up.

To ensure we got our redesign priorities right, we used four questions to frame our thinking. First, we ask: what do students need? As a matter of priority, students need a system that is developmentally appropriate and does no harm. They need assessment that reflects the breadth of the learning they have undertaken and supports them to become ‘confident and creative learners’. Furthermore, students need an assessment system that is fair and accurate, and able to support their teachers, schools and the broader education system.

The second question is, what do teachers and schools need to support students? For schools and teachers to meet their purpose, they need a system that is oriented to supporting students and their learning. In practical terms this means that the national assessment system should be instructionally useful, and in balance with classroom assessment. The system should be trustworthy and informed by the profession and it should support teachers by providing resources that are quality assured and strengthen professional practice.

The system must also consider parents. What do parents need to support their children and schools? Parents need to be confident that the national assessment does no harm and promotes the holistic development of their children, and to trust that it contributes toward meeting national education goals. To be able to support their children at home, they need a system that reports across a broad curriculum, throughout the school year. For parents to understand their child's progress, reporting should describe what has been learnt, and may be linked to standards and benchmarks.

Finally, we ask: what does the system need for monitoring and public accountability? Again, there is a need for assessment designed to promote learning, as that is the foundational goal of any education system and it must be the highest priority. Secondary to this are system needs for fair and accurate assessment for monitoring and accountability purposes. And further to this, the system must be efficient, with strong utility and imposing no unnecessary burdens. To ensure both accountability and learning goals are met, the system also needs thorough analysis and reporting on assessment data that is linked through to other system data and an effective policy response cycle.

These four questions could be used, more generically, to address any education goal. The answers our group came up with in relation to assessment are based on research literature and international expertise; and they reflect ethical principles (e.g. first, do no harm) and our national educational goals (on “excellence and equity” in learning and student wellbeing for “confident and creative learners”). 

It doesn’t take much to conclude that NAPLAN cannot meet these needs adequately. Indeed, it was never designed around student and teacher needs, it gave only brief attention to parents’ need for reporting, and it was preoccupied with system monitoring and accountability. As we mentioned above, no one test can serve several purposes equally well.

Make no mistake, system requirements for monitoring and accountability are always important. Therefore, our proposed new national assessment system must meet those aims. We argue that a sample-based assessment system, alongside local evidence from a new, quality assured, classroom assessment system linked to national standards, can do that and provide more purposeful accountability than NAPLAN, with: 

  1. Assessment of the broader curriculum, with all key learning areas, skills and wellbeing assessed.
  2. Proper monitoring and reporting aligned to national goals (NAPLAN is currently not aligned) and a framework for policy action to meet them.
  3. Significantly more chance of effective implementation, stakeholder engagement and the lifting of system performance.

According to ACARA’s own research in 2017 (Colmar-Brunton, 2018), three out of five parents value NAPLAN for providing “a comparable snapshot of their child’s progress”. Some recent media commentary argues that our proposed new national assessment system would go soft on monitoring and accountability or provide inadequate reporting to parents. Parents can rest assured that it would not. NAPLAN suffered from poor articulation of its purpose, mixed messages about its capabilities, rushed implementation unaligned to other system elements, and poor design that has frustrated schools and teachers; the new system’s customised design would meet system goals and ensure accountability and performance.

With students and teachers as the priority, a focus on assessment for learning and improved system performance is more likely. The new system would provide parents with more than just a snapshot of their child, by providing richer and more diverse assessment across a wide range of learning areas pegged against national standards. Parents will have the added satisfaction of knowing that, unlike NAPLAN, the new system is designed to support and promote their child’s learning and development.

For teachers, the new national assessment system would ease the frustrations of the past decade and accord long-overdue respect to their professional classroom assessment work. It would provide a range of resources and supports, including a national Assessment Resource System (ARS) with a curated library of diverse, high-quality assessments mapped to the Australian Curriculum and pegged to national standards. Teachers would be co-designers, invited to submit their best assessment tools for inclusion in the library. A well-organised library would provide quality-assured and convenient assessment, enriching teaching and learning across the full curriculum.

The ARS would also include a test item bank that teachers could draw on to generate classroom tests for their students, with content and timing at teachers’ discretion, online automated marking, and reporting against the curriculum and national standards. Such a system would strengthen classroom assessment, drive learning and enable richer reporting. It would allow assessment to be deeply embedded within teaching and learning. This stands in stark contrast with NAPLAN, whose narrow focus on literacy and numeracy and poor timing have frustrated teachers’ efforts to link it to rich classroom practice.

Contrasting NAPLAN with a new assessment system

In meeting the needs of students, teachers, schools and the system, the proposed new national assessment system shows three key shifts from the current model.

Shift 1: From census to sample testing
First, there is a change from census-based to sample-based student testing for monitoring and accountability. Although there have been arguments against sample-based assessment for monitoring and accountability, many countries have long used this approach successfully (for example, through the National Assessment of Educational Progress in the U.S.). Others have recently changed to this approach, in tandem with the focus on classroom assessment that we also propose.

Shift 2: From measuring to learning
The second shift, toward learning through stronger classroom-based assessment, is a logical progression, based upon a mountain of research evidence telling us that classroom-based assessment holds the greatest potential for learning gains. We need a national assessment system that promotes a balance between summative and formative assessment, and between external and classroom-based assessment. The OECD, for example, sees this balance as key to world-class education.

Shift 3: From accountability to trust-based responsibility
The third shift is from NAPLAN’s focus on accountability toward a system based on professional trust and responsibility. This lines up well with our focus on students and their teachers. We propose a system which utilises teachers’ professional wisdom and collective expertise. Once again, this is supported by research evidence and a focus on utilising and strengthening professional expertise is a hallmark of recent international assessment system reforms.

Did NAPLAN pass the test? 

After reviewing research evidence, we assessed NAPLAN against its stated aims, identified in the McGaw, Louden and Wyatt-Smith review (2019). These included aims to develop or strengthen:

  1. national monitoring of programs and policies,
  2. system accountability and performance,
  3. school improvement,
  4. individual student learning, and
  5. information for parents on student and school performance.

In the Putting Students First report we provide a detailed analysis of how NAPLAN has performed against these aims. It is obvious, for example, that NAPLAN falls short in giving teachers and principals reliable and valid information about their school improvement efforts, due to its narrow focus, its timing, and its instructional insensitivity; in other words, the test’s inability to provide results that would allow judgement of how well students were taught at school.

More surprising is how NAPLAN does not perform well against system monitoring and accountability aims. Although the move to national assessment was a substantial achievement, NAPLAN fails to assess, or even report against, many national goals. Many learning areas in the Australian curriculum are not assessed, and current research evidence included in our report suggests NAPLAN has led to a narrowing of the taught curriculum. 

We can take this even further. There is no consideration of students as “confident and creative learners”, and the reporting and policy cycle has failed to monitor, let alone redress, issues of inequity (Thomson, 2021). Although ACARA constructs the Index of Community Socio-Educational Advantage (ICSEA), the annual national report on schooling fails to consider how NAPLAN performance is related to this important indicator, or to other indicators of socio-economic status. Nor does it provide adequate analysis of equity groupings, such as students in regional, rural and remote education, Aboriginal and Torres Strait Islander students, or students whose language background is other than English.

In the annual report on schooling, NAPLAN data are also mostly reported cross-sectionally, with limited transparency on trends over time. Those analyses are left to motivated individuals to tackle via ACARA’s online data portal or the 369-page NAPLAN national report. It is not surprising, then, that something has been amiss between NAPLAN reporting and the policy response. NAPLAN’s identification of students who do not meet national minimum benchmarks has labelled them but not resulted in their improvement (Adams et al., 2020). Clarity is everything in monitoring and accountability, and these omissions, along with the lack of alignment between goals and monitoring, speak volumes.

Our suggested new national assessment system would build a better connection between national goals and the assessment and evaluation of Australian school education. It would provide more accurate information about prevalent inequalities across the nation’s schools and communities to enhance policy-making and targeted interventions. Furthermore, unlike NAPLAN, which, despite numerous reviews, has never been officially evaluated against its stated aims, a new assessment system would have a plan for periodic review and evaluation, drawing on feedback from all stakeholders.

NAPLAN is outdated in light of other contemporary national assessment systems

A more contemporary evaluation of NAPLAN would assess it against a series of eight challenges for all educational assessment systems outlined by the OECD (2011). These are considered very briefly below; a comprehensive account is provided in the Putting Students First report.

Challenge 1: Aligning educational standards and student assessment
NAPLAN was created before the national Australian Curriculum and is pegged to a series of national standards. Poor early alignment between NAPLAN and the curriculum created ongoing confusion, particularly among teachers. Much of the Australian Curriculum is not assessed through NAPLAN, or through the other elements of our national assessment system. Instead of alignment with the broad curriculum, research suggests NAPLAN has narrowed the taught curriculum and increased teaching to the test, because its tests are privileged over the National Assessment Program’s other sample-based tests and over non-assessed curriculum areas.

Challenge 2: Balancing summative and formative assessment
NAPLAN was originally designed as a summative assessment; however, from the start there were mixed messages from political and educational authorities about the purposes of NAPLAN and its potential as a formative and diagnostic assessment. These mixed messages led to confusion and frustration among many stakeholders. There is no evidence that NAPLAN has produced positive outcomes as a formative assessment, nor is it balanced with other formative assessments. Some research suggests that, far from being formative, NAPLAN’s summative approach has had a negative impact on teaching and learning. The Measurement Framework for Australian Schooling does not mention formative assessment; it currently considers only summative data from NAPLAN, NAP and the international student assessments (PISA, TIMSS and PIRLS) (ACARA, 2020). A national assessment system without formative assessment misses a great opportunity to drive learning through assessment.

Challenge 3: Balancing external assessments and teacher-based assessments
NAPLAN was never presented as a testing system to be balanced with teacher-led assessment, and Australian policy makes scarce mention of teacher-based assessment, even though such a balance is the international recommendation. Furthermore, it could be argued that the primacy of NAPLAN within Australian schools has threatened teacher-based assessment.

Challenge 4: Developing fair assessments for all student groups
Research suggests NAPLAN tests are not suited, and are unfair, to substantial proportions of Australian students (Jorgensen, 2010; Macqueen et al., 2018; Wu, 2009; Thompson et al., 2018; Adams et al., 2020; Davies, 2012). Many students are unable to participate in the tests, and participation rates for NAPLAN tests have been declining, suggesting a lack of suitability and community engagement (Anderson & Boyle, 2015; Thompson et al., 2018). The appropriateness of high-stakes testing differs across age groups, but NAPLAN’s one-size-fits-all approach is used across Years 3 to 9. Furthermore, the commercial nature of preparation for NAPLAN tests also creates unfairness, with more affluent families able to purchase preparation materials, additional coaching and private tutoring to help their children do better in these tests.

Challenge 5: Designing large-scale assessments that are instructionally useful
Many teachers report feeling pressured to teach to the test because of NAPLAN’s role in accountability. Many also find student preparation difficult, and challenging to their professional integrity, because NAPLAN is not clearly linked to the curriculum, its timing limits its usefulness for broader classroom learning, and time spent on NAPLAN preparation detracts from other highly valued curriculum areas and activities. Outside schools, a wide range of commercial NAPLAN instructional material has evolved, including NAPLAN coaching centres, specialised private tutoring and home study programs. But there is little or no evidence to verify the instructional effectiveness of these programs. NAPLAN was not designed to be instructionally sensitive or to provide information about how well students are taught at school.

Challenge 6: Ensuring fairness in assessment and marking across schools
Some research, including that by Wu (2009, 2010) and Perelman (2018), raises serious concerns about NAPLAN’s reliability and validity. Annual reports by ACARA show limited disaggregation of results, so much of the question of fairness, and the potential for bias, goes unexamined and unreported. Some national assessment systems, such as the SAT in the USA, have been subject to extensive research on bias and fairness; NAPLAN has not received comparable scrutiny. Much more research and design work is needed to ensure NAPLAN is fair to all Australian students, especially Indigenous students and those from language backgrounds other than English.

Challenge 7: Securing informative reporting of student assessment results
The Louden review (2019) examines the issues relating to the reporting of NAPLAN and echoes widespread concerns that NAPLAN is producing unintended negative consequences and is not well designed to support teaching and learning. The major barrier to informative reporting of NAPLAN is the long wait between the tests and the reporting back to schools and parents. Unlike some innovative assessments, NAPLAN tests have no elements specifically designed to lift student learning through feedback and reporting to the students themselves. The focus on literacy and numeracy alone means that reporting is limited and cannot reflect the full breadth of students’ learning, nor many of the aspects laid out in our national goals.

Challenge 8: Ensuring the assessment is informed, valued and of optimal utility to the teaching profession
It seems self-evident that school assessment systems should be informed, valued and trusted by teachers. This is the most serious shortcoming of NAPLAN. Because of initial and ongoing confusion about the purpose of NAPLAN, much of the teaching profession has felt frustrated and disillusioned with it. This has been aggravated by the fact that teachers’ role in designing and using NAPLAN has been so limited. Frustrations among teachers and principals caused by NAPLAN have been loud and clear since the beginning. Still, each year, before the winter comes, most teachers put great effort into minimising its negative consequences and maximising teaching and learning opportunities for their students, despite the well-known limitations of NAPLAN.

Conclusion

We have suggested that a new national assessment system could be developed in partnership with the teaching profession, so that it supports teachers’ professional practice, helps, not hinders, student learning – and is valued and trusted by all.

The proposed new national assessment system is neither radical nor risky. Sure enough, the change is substantial, but we would not throw the accountability baby out with the bathwater; and the assessment-for-learning design, balancing formative and classroom assessment with summative sample testing, is built upon solid logic and research. The real danger is that without swift change, NAPLAN’s focus on system accountability at the expense of student learning will mean our educational assessment system continues to shoot itself in the foot.

References

Adams, E. K., Hancock, K. J., & Taylor, C. L. (2020). Student achievement against national minimum standards for reading and numeracy in Years 3, 5, 7 and 9: A regression discontinuity analysis. Australian Journal of Social Issues.

Australian Curriculum, Assessment and Reporting Authority (ACARA). (2020). Measurement framework for Australian schooling. https://www.acara.edu.au/docs/defaultsource/default-document-library/measurement-framework-2020-forpubafa92e404c94637ead88ff00003e0139.pdf?sfvrsn=1c3d4d07_0

Australian Curriculum, Assessment and Reporting Authority (ACARA). (2018). National report on schooling in Australia 2018. Sydney, AU. https://www.acara.edu.au/docs/defaultsource/default-document-library/national-report-on-schooling-in-australia2018.pdf?sfvrsn=e3cc7007_0

Davies, M. (2012). Accessibility to NAPLAN assessments for students with disabilities: a ‘fair go’. Australasian Journal of Special Education, 36(1), 62-78.

Haertel, E.H. (2013). Reliability and Validity of Inferences About Teachers Based on Student Test Scores. The 14th William H. Angoff Memorial Lecture. Princeton: Educational Testing Service.

Hetherington, D. (2018). What price the gap? Education and inequality in Australia. Sydney: Public Education Foundation. https://publiceducationfoundation.org.au/wp-content/uploads/2018/04/Issues-Paper_What-Price-The-Gap.pdf

Jorgensen, R. (2010). Structured failing: Reshaping a mathematical future for marginalized learners. In L. Sparrow, B. Kissane & C. Hurst (Eds.), Shaping the future of mathematics education: Proceedings of the 33rd annual conference of the Mathematics Education Research Group of Australasia (pp. 26–35). Fremantle: MERGA.

Louden, W. (2019). NAPLAN Reporting Review. Prepared for COAG Education Council. Education Services Australia, on behalf of the Education Council.

Macqueen, S., Knoch, U., Wigglesworth, G., Nordlinger, R., Singer, R., McNamara, T., & Brickle, R. (2018). The impact of national standardized literacy and numeracy testing on children and teaching staff in remote Australian Indigenous communities. Language Testing, 0265532218775758.

Maslow, A.H. (1966). The Psychology of Science, New York: Joanna Cotler Books.

McGaw, B., Louden, W. & Wyatt-Smith, C. (2019). NAPLAN Review Interim Report, November 2019. Review commissioned by the state and territory governments of New South Wales, Australian Capital Territory, Victoria and Queensland.

OECD (2011). OECD Reviews of Evaluation and Assessment in Education: Australia. Paris: OECD.

Perelman, L. (2018). Towards a new NAPLAN: Testing to the teaching. Report commissioned by the New South Wales Teachers’ Federation.

Popham, W.J., Berliner, D., Kingston, N., Fuhrman, S., Ladd, H., Charbonneau, J. & Chatterji, M. (2014). Can today’s standardized achievement tests yield instructionally useful data? Challenges, promises and the state of the art. Quality Assurance in Education, 22(4), 303-318.

Thompson, G., Adie, L., & Klenowski, V. (2018). Validity and participation: implications for school comparison of Australia’s National Assessment Program. Journal of Education Policy, 33(6), 759-777.

Thomson S. (2021). Australia: PISA Australia—Excellence and Equity? In: Crato, N. (ed.) Improving a Country’s Education. Springer, Cham. https://www.springer.com/gp/book/9783030590307 

Wilson, R., Piccoli, A., Hargreaves, A., Ng, P. T., & Sahlberg, P. (2021). Putting Students First: Moving on from NAPLAN to a new educational assessment system (The Gonski Institute Policy Paper #2-2021). Sydney: UNSW Gonski Institute

Wu, M. (2009). Interpreting NAPLAN results for the layperson. https://numeracy4schools.files.wordpress.com/2015/03/naplan_for_lay_person.pdf

Wu, M. (2010). The inappropriate use of NAPLAN data. Professional Voice, 8(1), 21-25.

Rachel Wilson is Associate Professor in Educational Assessment, Evaluation and Research Methods at the University of Sydney. Rachel’s background is in psychology, audiology and teaching, with particular interests in emotional development. She always endeavours to keep student interests at the centre of her work. She has technical expertise in relation to research analysis and educational evidence, and broad interests in policy, practice and system level reform and innovation. 

Pasi Sahlberg is an educator and author who has worked as a schoolteacher, teacher educator, researcher and policy advisor in Finland, and has studied education systems, analysed education policies and advised on education reforms around the world. He has written and spoken widely about these topics; his book “Finnish Lessons 2.0: What Can the World Learn from Educational Change in Finland” won the 2013 Grawemeyer Award for an idea that has the potential to change the world. He is a former senior education specialist at the World Bank, a lead education expert at the European Training Foundation, a director general at Finland’s Ministry of Education, and a visiting Professor of Practice at Harvard University. He is now Professor of Education Policy at the University of New South Wales.

This article appears in Professional Voice 14.1 Assessment, technology and the impact of social concerns.