Deep fakes, authenticity and authentication in the age of artificial intelligence
For a while now I’ve been pondering the concepts of authenticity and authentication of student work within the context of a new machine age where educators rely on software powered by artificial intelligence (AI) to detect plagiarism, and where AI is capable of producing ‘deep fakes’. Deep fakes are new content generated by AI, or existing content manipulated by AI, in the form of images, videos, audio and text. In this article I provide some observations on this topic, teasing out concepts such as originality, authenticity and authentication in education — both what they mean now for educators and what they might mean in the future. This article draws on research, industry insights, contemporary debate, and my own experience as a teacher-educator. As thoughts-in-progress, my hope is to prompt dialogue on the implications of AI for foundational ideas of originality and authenticity across education sectors.
Originality and authenticity
To begin, it is worth identifying a key assumption on which contemporary Western education rests. While there are philosophical, disciplinary, and ideological points of difference, one common assumption is that growth and transformation are evident when learners create their own work to demonstrate mastery of knowledge and skills. This is especially true for the humanities, social sciences and creative and performing arts as they are taught and learnt within educational institutional contexts. Even though there are literary and creative movements that have homage, bricolage, pastiche, parody and mash-up at their core, in formal Western education systems the idea of originality as a yardstick for (assessment of) intellectual and artistic accomplishment remains salient.
Originality is a hard-to-define concept; in different fields and contexts it can mean different things. The Merriam-Webster dictionary defines it as “the power of independent thought or constructive imagination.” There is, for example, a whole legal field devoted to protecting copyright and intellectual property (Margoni, 2016), and in PhD research originality entails the creation of new knowledge in relation to existing knowledge (although there is often fuzziness about its definition, as Clarke and Lunt (2014) demonstrate). In history and heritage studies, originality “is a temporal concept: it implies a relationship with a point of origin that, by definition, was in the past and preceded any later versions or copies” (Wain, 2011, p.496). The concept of originality is also inextricably linked to the way we understand divergent and creative thinking, with its association with both novelty and authenticity (Corazza, 2016).
The idea of originality is assumed in the enactment of influential educational frameworks such as Bloom’s taxonomy of educational objectives and its revised version (Krathwohl, 2002). Bloom’s revised taxonomy describes the cognitive processes by which thinkers encounter and work with knowledge along a continuum. Even if the initial categories of the taxonomy — remembering, understanding and applying knowledge — are not immediately evocative of original thought, pedagogically learners are often asked to demonstrate their thinking processes by paraphrasing (or putting the original ideas of others in their own words) or through transferring knowledge and understanding to novel problems (application and evaluation) which requires some inventive or divergent thought.
The concept of originality is associated with authenticity, which the Merriam-Webster dictionary defines as both “true to one’s own personality, spirit or character” and “not false or imitation”. In education, at a common-sense level this translates to students undertaking and submitting their own work for formative and summative assessment, even if it is a paraphrase, application, or analysis of an original artefact. It is worth noting that there is also a U.S. tradition known as authentic pedagogy (Newmann, Marks and Gamoran, 1996), which has spawned many pedagogical variants with its emphasis on higher order thinking, depth of knowledge, social support for student achievement, substantive communication, and connection beyond the classroom. This pedagogical model uses the term authentic to differentiate achievement that is meaningful from that which is trivial and of no use beyond the classroom. Authentic achievement occurs when: “(1) students construct meaning and produce knowledge, (2) students use disciplined inquiry to construct meaning, and (3) students aim their work toward production of discourse, products, and performances that have value or meaning beyond success in school” (Newmann and Wehlage, 1993, n.p.).
The next part of this article examines how older ideas about originality and authenticity in pedagogy and assessment are being both informed and challenged by the rise of AI.
A quick guide to artificial intelligence
The new machine age has well and truly arrived. It is powered by AI which permeates and profoundly influences our everyday lives. From smart phone assistants and chatbots, to online advertising suggestions and facial recognition tagging technology in social media platforms, to internet search engines that can sort millions of sources in seconds, AI is both user-facing and working behind-the-scenes in the applications and platforms that we depend on for communication, work and education. AI can be defined as:
a machine-based system that can, for a given set of human-defined objectives, make predictions, recommendations, or decisions influencing real or virtual environments. AI systems are designed to operate with varying levels of autonomy. (OECD, 2019).
I have written elsewhere about AI and education, especially its ethical implications (Southgate et al. 2018; Southgate, 2020a), and so the following is a very brief explanation of the technology and how it works. AI has been a field of study since the 1950s. Today AI usually needs ‘big data’, harvested from the internet, sensors, and the geolocation signals from devices, to train and develop statistical models. These models can predict or make forecasts about phenomena (including human behaviour), provide recommendations for future action, or adapt to personalise content. These functions are increasingly being integrated into educational applications (Berendt, Littlejohn and Blakemore, 2020). At present, we are in an era of narrow AI. This means that AI are only able to do the focused task they were designed to do, and that they can sometimes outperform humans at these tasks. This leads people to think that AI are smarter than they are, and both children and adults anthropomorphise AI (or project human characteristics onto it). It is worth noting that no AI has the general intelligence, characterised by a theory of mind, that humans possess. That is, AI are incapable of forming (and re-forming) representations of internal states of knowledge, thoughts, expectations, beliefs, motives and emotions, or of appreciating the internal states of others (Southgate et al., 2018).
An important subfield of AI is machine learning (ML). Maini and Sabri (2017) define this as:
(A) subfield of artificial intelligence. Its goal is to enable computers to learn on their own. A machine’s learning algorithm enables it to identify patterns in observed data, build models that explain the world, and predict things without having explicit pre-programmed rules and models (p.9).
The field of ML involves getting algorithms to learn through experience (an algorithm is a set of instructions that tells a computer or machine how to perform an operation). Computing systems with ML learn as they receive data but do not need to be explicitly programmed to do this. There are many types of ML, and you can find an explanation of these and AI in education more generally in my recent article (Southgate 2020b). To briefly summarise some types of ML, there is: (1) supervised learning, which involves humans initially labelling data to teach the algorithm to recognise phenomena; (2) unsupervised learning, where algorithms create their own structure (features) that can be used to detect patterns in and classify unlabelled data; (3) reinforcement learning, where the machine learns through trial and error within an environment; and, (4) deep learning, which is based on feeding data through algorithmic layers similar to neurons in the brain to create outputs. It is important to understand how machines learn because this demystifies AI and allows us as educators to understand where it is and is not appropriate to use the technology, and its benefits and limitations.
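The idea of learning from labelled data rather than from explicitly programmed rules can be made concrete with a toy sketch. The following is a minimal, illustrative one-nearest-neighbour classifier; the data points and labels are invented for the example and do not come from any real system.

```python
# A minimal sketch of supervised learning: a one-nearest-neighbour
# classifier "learns" from human-labelled examples rather than from
# hand-written rules. The training data below are invented for
# illustration only.

def nearest_neighbour_predict(labelled_examples, new_point):
    """Return the label of the training example closest to new_point."""
    def distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    closest = min(labelled_examples, key=lambda ex: distance(ex[0], new_point))
    return closest[1]

# Hypothetical (features, label) pairs supplied by a human labeller.
training_data = [
    ((1.0, 1.0), "cat"),
    ((1.2, 0.8), "cat"),
    ((5.0, 5.0), "dog"),
    ((4.8, 5.2), "dog"),
]

print(nearest_neighbour_predict(training_data, (1.1, 0.9)))  # prints "cat"
print(nearest_neighbour_predict(training_data, (5.1, 4.9)))  # prints "dog"
```

The point of the sketch is that nobody wrote a rule saying what a “cat” point looks like; the behaviour emerges entirely from the labelled examples, which is the defining feature of supervised learning.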
Originality and authenticity in the age of AI
So why talk about AI in relation to originality and authenticity? AI powers the online applications used to authenticate student work, and it presents the greatest future challenge to authentication and to commonly held notions of originality. Let’s unpack these statements in relation to three areas: (1) detection of plagiarism and contract cheating; (2) online proctoring services; and, (3) AI generation of original work.
1. Plagiarism and contract cheating: Plagiarism refers to taking someone else’s work or ideas and passing them off as your own. There are different types of plagiarism, from cutting-and-pasting text straight from an article or the internet without using quote marks, or paraphrasing without referencing, to the more sophisticated and time-consuming mosaic or patch writing, where other people’s text is used without referencing but words are replaced with synonyms or the structure of each sentence is rearranged. Contract cheating involves buying or commissioning academic work and passing it off as your own. This work is bought from an online ‘essay mill’ or a private contractor. Although contract cheating is illegal in Australia, it is still occurring, and there is excellent Australian research and there are pedagogical initiatives designed to address it in higher education.
Platforms designed to help educators detect plagiarism and contract cheating, such as Turnitin, are powered by AI. ML is used to detect patterns between a student’s work and other sources such as the platform’s database of student work, online research articles and books, and other internet sources. When taught to use Turnitin as a tool for review of work, students can check their “originality report” before a due date to correct referencing or fix inadvertent plagiarism. A quick search of the internet reveals many sites that suggest ways of “fooling the Turnitin robot”. While Turnitin disputes these suggestions, there are limitations to AI pattern detection, especially when students take chunks of text from different sources and painstakingly reconstruct sentence structure without changing meaning. Even if students use referencing correctly, you can still read essays that substantially consist of quotes from different sources strung together into paragraphs and that really do not demonstrate depth of understanding. Depending on the settings that instructors specify for each piece of assessment, the platform will not detect what is arguably a lack of original interpretation and synthesis because the essay uses correct quoting conventions and referencing. Turnitin also has an option that is said to detect contract cheating, using ML to create a probability score based on a comparison of writing style with prior student work (which needs to be wholly original) and the metadata of the document (although the latter is not without issues).
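The kind of text matching that underlies plagiarism detection can be illustrated with a toy sketch. The following computes overlap between the word trigrams of two texts; this is a deliberate simplification for illustration, not Turnitin’s actual (proprietary) algorithm, and the example sentences are invented.

```python
# A toy sketch of text matching of the kind plagiarism detectors build on:
# Jaccard similarity between the word-trigram sets of two documents.
# This is an illustrative simplification, not Turnitin's actual algorithm.

def word_ngrams(text, n=3):
    """Return the set of n-word sequences appearing in the text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def similarity(text_a, text_b, n=3):
    """Jaccard similarity between the two texts' word n-gram sets (0.0 to 1.0)."""
    a, b = word_ngrams(text_a, n), word_ngrams(text_b, n)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

source = "originality is the power of independent thought or constructive imagination"
copied = "originality is the power of independent thought or constructive imagination"
paraphrased = "originality means thinking independently and imagining constructively"

print(similarity(source, copied))       # prints 1.0 (identical texts)
print(similarity(source, paraphrased))  # prints 0.0 (no shared trigrams)
```

Notice how the heavy paraphrase scores zero even though its meaning is lifted wholesale; this is exactly the limitation noted above, where synonym replacement and sentence restructuring can defeat surface pattern matching without changing meaning.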
2. Online proctoring: The coronavirus pandemic has seen a massive rise in online proctoring services, where students can undertake examinations in their own home under the supervision of a human and monitored by software. In an opinion piece for MIT Technology Review, Swauger (2020) explains how online proctoring works and some objections to it:
When you begin (the exam), the software starts recording your computer’s camera, audio, and the websites you visit. It measures your body and watches you for the duration of the exam, tracking your movements to identify what it considers cheating behaviors. If you do anything that the software deems suspicious, it will alert your professor to view the recording and provide them a color-coded probability of your academic misconduct….(I)t will use some combination of machine learning, AI, and biometrics (including facial recognition, facial detection, or eye tracking) to do all of this. The problem is that facial recognition and detection have proven to be racist, sexist, and transphobic over, and over, and over again. (n.p.).
To elaborate on some of the issues, AI-powered biometric technology such as facial recognition still has difficulty recognising darker-skinned faces, especially women’s faces, because the ML models have historically been trained on photos of white faces, particularly male faces (Simonite, 2019). In the U.S. there have been reports of black students being asked to shine a light on their faces so the algorithm can recognise them. The algorithm can also flag a person’s behaviour as suspicious if they look away from the screen, do not look at the screen enough, or talk aloud, and this can have negative effects for women with caring duties who might respond to a situation at home, for neurodiverse students or those who are differently abled, and for students who do not have quiet living conditions in which to undertake an online exam. Surveillance, privacy and data security issues have also been raised (proctoring software may record and analyse not only the person but their surroundings). There is ongoing and warranted debate concerning the ethics of online proctoring, and this highlights the tension between institutional reliance on traditional assessment practices such as exams as a yardstick for authentic knowledge acquisition and the role of algorithms in flagging behaviours through the collection and automated integration of biometric and contextual data.
3. AI-generated student work: Since releasing new software in 2019/20, the company OpenAI has prompted considerable debate about the ethics of AI that can auto-generate extended text, music and images. Its GPT-3 (Generative Pre-trained Transformer 3) software uses unsupervised ML algorithms that create new texts from prompts, and it can be ‘programmed’ to respond by showing it a few examples of what you would like it to do, with varying degrees of success related to the complexity of the task (Open AI, 2020). As Simonite (2020) explains:
GPT-3 was built by directing machine-learning algorithms to study the statistical patterns in almost a trillion words collected from the web and digitized books. The system memorized the forms of countless genres and situations, from C++ tutorials to sports writing. It uses its digest of that immense corpus to respond to a text prompt by generating new text with similar statistical patterns.
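At a vastly smaller scale, the idea of generating new text with “similar statistical patterns” can be illustrated with a toy word-level Markov chain. GPT-3 itself uses deep neural networks trained on almost a trillion words; this sketch, with its invented ten-word corpus, captures only the basic principle of predicting the next word from patterns observed in training text.

```python
import random

# A toy illustration of "generating new text with similar statistical
# patterns": a word-level Markov chain. It records which words follow
# which in a training corpus, then samples from those observed patterns
# to emit a word sequence that did not appear verbatim in the corpus.

def build_model(corpus):
    """Map each word to the list of words observed to follow it."""
    words = corpus.split()
    model = {}
    for current, nxt in zip(words, words[1:]):
        model.setdefault(current, []).append(nxt)
    return model

def generate(model, start, length=8, seed=0):
    """Generate a word sequence by repeatedly sampling observed followers."""
    random.seed(seed)  # fixed seed so the sketch is reproducible
    word, output = start, [start]
    for _ in range(length):
        followers = model.get(word)
        if not followers:
            break  # dead end: the word was never observed with a follower
        word = random.choice(followers)
        output.append(word)
    return " ".join(output)

corpus = ("the machine learns the pattern and the pattern shapes "
          "the text and the text trains the machine")
model = build_model(corpus)
print(generate(model, "the"))  # a new sequence built from familiar patterns
```

Every word the generator emits was observed in the corpus, and every two-word transition follows an observed pattern, yet the output sequence as a whole may never have existed before; this is the same puzzle, in miniature, that the article raises about whether statistically derived “new” work counts as original.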
OpenAI has also created image and music generators. While subject to critique, GPT-3 and its companion software should prompt serious dialogue about the machine as ‘author’ and how to detect this when applications such as these become commercially available. Imagine a future where a student uses an AI application to produce an AI product (music, text, visual art) that could gain a pass or credit grade. AIs could produce work where no two responses would be the same, because they would learn to check against what they and other AI had already created: in other words, to check their original work against other AIs’ original work. No doubt an AI will be developed to detect or authenticate AI-generated work, but as machines continue to learn by themselves they may very well learn to avoid such detection. Although this sounds dystopian, there is a recent story of a computer science college student who created an entire credible blog using GPT-3 by working with topics, such as productivity and self-help, that did not require systematic logic.
Will current concerns about contract cheating be replaced by an AI panic in education? Rather than revert to an over-reliance on the great ‘sorting hat’ of the handwritten exam, educators will need to understand how to design assessment in authentic and rigorous ways that are not optimal for AI-generated responses and to create curriculum that cleverly engages with ML-generated products in critical and creative ways.
This article provides an initial exploration of the concepts of originality, authenticity and authentication of student work in the age of AI. The introduction of technology has challenged educators to think differently about the intersection of technology and teaching; for example, the introduction of calculators in mathematics, word processing programs and hand-writing, and reading in the multi-media age of the internet. It could be argued that AI represents a different type of challenge. To reflect on the Merriam-Webster dictionary definitions of originality and authenticity provided above, AI has neither the independent thought nor the imagination associated with conventional conceptions of originality, and yet machines can produce what would commonly be regarded as new work. Can some AI products be conceived of as less authentically original, even when the technology is used to produce imitations and false representations in the form of deep fakes? If machine algorithms can learn, through the massive harvesting of existing online artefacts, to produce a piece of work that has not previously existed, does that make it unoriginal or inauthentic? Do we need another language or ‘truth’ paradigm to distinguish and understand AI-generated artefacts, or would it be better to shift our conception of human learning towards one that is (to lesser or greater degrees) machine-augmented? While these seem like philosophical questions, education is a philosophical project with material foundations and effects. Education is about ethical conduct, ontology (being and reality) and epistemology (the nature of knowledge and knowledge production). AI both provides tools for supporting traditional ethical, ontological, and epistemological ideas of original, authentic student work and poses a potentially powerful challenge to them. As educators we need to talk about this state-of-play because the new machine age of AI is already upon us.
Buelow, G. J. (1990). Originality, genius, plagiarism in English criticism of the eighteenth century. International Review of the Aesthetics and Sociology of Music, 117-128.
Clarke, G., & Lunt, I. (2014). The concept of ‘originality’ in the Ph.D.: How is it interpreted by examiners?. Assessment & Evaluation in Higher Education, 39(7), 803-820.
Corazza, G. E. (2016). Potential originality and effectiveness: The dynamic definition of creativity. Creativity Research Journal, 28(3), 258-267.
Guetzkow, J., Lamont, M., & Mallard, G. (2004). What is Originality in the Humanities and the Social Sciences?. American Sociological Review, 69(2), 190-212.
Krathwohl, D. R. (2002). A revision of Bloom's taxonomy: An overview. Theory into practice, 41(4), 212-218.
Margoni, T. (2016). The harmonisation of EU copyright law: the originality standard. In Global Governance of Intellectual Property in the 21st Century (pp. 85-105). Springer, Cham.
Newmann, F. M., Marks, H. M., & Gamoran, A. (1996). Authentic pedagogy and student performance. American Journal of Education, 104(4), 280-312.
Newmann, F. M., & Wehlage, G. G. (1993). Five standards of authentic instruction. Educational Leadership, 50, 8-8. Retrieved https://www.ascd.org/
OECD (2019). Recommendation of the Council on Artificial Intelligence. OECD Legal Instruments. Retrieved https://legalinstruments.oecd.org/en/instruments/OECD-LEGAL-0449
Open AI (2020, June 11). OpenAI API. Retrieved https://openai.com/blog/openai-api/
Swauger, S. (2020, August 7). Software that monitors students during tests perpetuates inequality and violates their privacy. MIT Technology Review. Retrieved https://www.technologyreview.com/2020/08/07/1006132/software-algorithms-proctoring-online-tests-ai-ethics/
Simonite, T. (2019, July 22). The best algorithms struggle to recognize black faces equally. Wired. Retrieved https://www.wired.com/story/best-algorithms-struggle-recognize-black-faces-equally/
Simonite, T. (2020, July 22). Did a person write this headline, or a machine? Wired. Retrieved https://www.wired.com/story/ai-text-generator-gpt-3-learning-language-fitfully/
Southgate, E. (2020a). Artificial intelligence, ethics, equity and higher education: A ‘beginning-of-the-discussion’ paper. Perth: NCSEHE. Retrieved https://www.ncsehe.edu.au/wp-content/uploads/2020/07/Southgate_AI-Equity-Higher-Education_FINAL.pdf
Southgate, E. (2020b). Artificial intelligence, machine learning and why educators need to skill up now. Professional Voice, 13(2). Retrieved /professional-voice-1322
Southgate, E., Blackmore, K., Pieschl, S., Grimes, S., McGuire, J. & Smithers, K. (2018). Artificial Intelligence and Emerging Technologies (Virtual, Augmented and Mixed Reality) in Schools: A Research Report. Commissioned by the Australian Government.
Wain, A. (2011). A many-headed axe: originality in large technology heritage. Journal of Australian Studies, 35(4), 495-510.
For an exploration of the social construction of ‘originality’ in humanities and social science research, see Guetzkow et al. (2004).
Buelow (1990) provides an historical perspective on the relationship between the ideas of originality, genius and plagiarism.
Erica Southgate is Associate Professor of Emerging Technologies for Education at the University of Newcastle, Australia. She is Lead Researcher of the VR School Study and a maker of computer games for literacy learning. She was a 2016 National Equity Fellow and 2017 Innovation Award Winner for the Australasian Society for Computers in Learning in Tertiary Education (ASCILITE). She co-authored the research report, ‘Artificial Intelligence and Emerging Technologies in Schools’ and has free infographic classroom posters on AI for primary and secondary students available here. For more on Erica’s research see https://ericasouthgateonline.wordpress.com/projects/
This article appears in Professional Voice 13.3 The new basics.