I’ve been reading a lot of assessment research over the past year, and one paper keeps pulling me back. It is not about AI. It was written before ChatGPT changed how every teacher thinks about exams and essays. But the more I revisit it, the more I see it speaking directly to the moment we’re all in.
The paper is Tai, Ajjawi, Bearman, Boud, Dawson, and Jorre de St Jorre’s (2023) Assessment for Inclusion, published in Higher Education Research & Development. Their argument is simple and uncomfortable: the design of an assessment is never neutral. Every choice we make about what students do, how, where, and under what conditions, will include some learners and push others further from the work we say we are measuring.
I’m repurposing the paper here because most of the “AI-resistant” assessment moves teachers are reaching for in 2026 are silently excluding the wrong students, even when nobody intends them to. This post pulls Tai et al.’s lens into the AI conversation and gives you practical strategies you can try in your classroom this term.
Why “AI-resistant” Assessments Often Hurt the Wrong Students
When generative AI started writing essays for our students, most of us pulled the same lever. We moved assessments back into the room, brought back handwritten closed-book exams, and added single-session deadlines. Where exams stayed online, we installed remote proctoring with eye-tracking. Some of us added oral presentations because, surely, AI can’t fake those.
Here is what Tai et al. would say about that move. Every one of those choices assumes a particular kind of student: neurotypical, fluent in the language of instruction, anxiety-free under time pressure, without caring responsibilities, with stable health, with strong fine motor skills. If your student doesn’t fit that profile, the assessment isn’t measuring what you think it is. It is measuring how close they come to that default.
The authors put it plainly: “choices in assessment design are never neutral, as each may promote or constrain inclusion differently, and affect different people” (p. 493).
The teacher who switches to a timed handwritten exam to keep AI out is, without meaning to, sending a message to the student with a chronic pain condition, the carer who has to step away to pick up a child, and the student who freezes on test day: this isn’t really for you.
From Cheating to Design Validity
The deeper move in Tai et al. is the reframe. Seen through their lens, AI misuse is a design validity problem, not a cheating one.
The question to ask is simple. What is your assessment actually measuring? If the learning outcome is to analyze a primary source, the test should measure analysis of a primary source. If your closed-book, handwritten, ninety-minute exam is also measuring handwriting speed, native English fluency, recall under pressure, and composure in a quiet room full of strangers, then the test is judging students on things you never set out to teach.
The fix, in Tai et al.’s words, is to redesign the test, not the student.
That argument runs through more recent work too. Dawson and colleagues (2024) made the case that validity, not cheating, should drive assessment design. Nieminen and Eaton (2024) showed how accommodations themselves often get coded as cheating, which compounds the exclusion the assessment already created. The evidence is converging on the same point.
So how do you actually do this work in a classroom? That’s where the rest of this guide lives.
Practical Strategies You Can Try in Your Classroom
These aren’t theoretical moves. They are things that teachers I’ve worked with have done and that the assessment research supports.
1. Audit what your assessment is really measuring
Before you redesign anything, pull out one current assessment and draw up two columns. On the left, list the learning outcomes you say you are measuring. On the right, list every other thing the assessment actually requires. If the outcome is “students can construct a historical argument from primary sources,” and the test requires them to do that by hand, in fifty minutes, in their second language, you’ve just found three things the test is silently judging them on: handwriting, speed, and second-language fluency. None of them is something you said you’d teach.
2. Open up the mode
Give students multiple ways to demonstrate the same outcome. A student showing they can analyze a poem could write a short essay, record a five-minute spoken analysis, or build a slide deck with voice-over. The construct stays the same. The mode opens up. This one move addresses a wide range of access issues without you having to label any student as “needing an accommodation.”
3. Use authentic tasks
Tai et al. point to authentic assessment, tasks that mirror what professionals actually do, as one of the more inclusive designs. Real-world tasks tend to allow drafting, revising, consulting sources, and using tools (including AI) the way working adults already do. They also produce work that’s harder for AI to fake convincingly because it requires context only the student has lived.
4. Assess the process, not just the product
Ask students to submit drafts, annotated notes, a short reflection on their choices, or an audio explanation alongside the final piece. You are now assessing the thinking, not just the artifact. This works against AI shortcuts and against the kind of high-pressure, single-session exam that excludes anxious or chronically ill students. Two birds, one redesign.
5. Stretch the time window
Move away from single-session, single-day assessments wherever your curriculum allows. A week-long take-home with milestone check-ins gives carers, students with health conditions, and students who need quiet, unhurried focus a fair shot at the work. This is one of the most under-used inclusion levers in the room.
6. Try assessment for distinctiveness
Tai et al. describe an approach they call assessment for distinctiveness. It asks students to show what makes their thinking theirs: a personal connection, a local example, a unique angle. This is the rare design that is both AI-resistant (AI can mimic style, but it cannot authenticate a student’s particular experience) and inclusive (every student has a distinctive angle if you make space for it).
7. Reconsider the oral assessment
Public oral presentations work for some students. They are torture for others, especially those with anxiety, speech differences, or social communication differences. If you want oral assessment, consider one-on-one conversations, asynchronous video recordings, or small-group dialogues. You get the same construct, with much less of the public-performance penalty.
8. Take a hard look at surveillance tech
AI proctoring with eye-tracking, keystroke monitoring, and screen recording assumes neurotypical behavior. It creates real harm for autistic students, students with tics, students with anxiety, and students whose home environment isn’t a quiet private office. Before adopting any of this, ask Tai et al.’s question: who does this protect, and who does it exclude?
9. Think programmatically across the course
This is Tai et al.’s third design approach, and it’s the one most teachers overlook. The idea is simple. Not every learning outcome needs to be tested in every assessment. If you’re teaching across a semester or a year, you can spread your assessment of, say, research skills, critical analysis, and clear communication across different tasks at different times, each one designed for what it does best. A student who freezes on one mode gets other chances to show the same skill in another. This is harder to set up than a one-off redesign, but it pays you back every time a student would otherwise have fallen through a single test’s cracks.
10. Make the construct visible to students
Tell your students, out loud, what the assessment is measuring and what it isn’t. “This task is measuring how well you can build an argument from evidence. It is not measuring how fast you can type, how confidently you can present, or how perfect your grammar is.” That single sentence does inclusion work all on its own. It tells students what to focus on, reduces anxiety about the wrong things, and gives them grounds to speak up if the rubric silently contradicts the construct.

A Simple Audit You Can Run This Week
If you do nothing else after reading this, try one thing. Pick one assessment you’ll give in the next month. Print it out. Highlight in one color the learning outcomes you actually want to measure. Highlight in another color everything else the test silently requires. Then look at how much of the page is the wrong color. That’s your design validity problem, and it’s also your starting point.
Tips for Getting Started
1. Start with one assessment. Don’t try to overhaul everything at once. One thoughtful redesign is worth ten rushed ones.
2. Map outcomes against requirements. List what the assessment is supposed to measure, then list every other thing it actually requires. The gap is your work.
3. Open the mode wherever you can. Let students choose between written, audio, or visual responses when the outcome allows. The construct doesn’t change.
4. Build in process. Ask for drafts, notes, or a short reflection alongside the final product. You’ll assess thinking, not just output.
5. Stretch the timeline. Replace single-session tests with windowed assignments and milestone check-ins. Carers and students with health conditions notice immediately.
6. Anchor in authenticity. Use real-world tasks that benefit from context only the student has lived. AI struggles to fake that.
7. Co-build the rubric. Share what counts before students start the work, ideally with their input. Transparency can reduce anxiety and helps students aim at the right target.
8. Audit your tech. Before adopting any proctoring or surveillance tool, ask who it will harm. If the answer includes your most vulnerable students, find another way.
Final Thoughts
The AI panic isn’t going away, and the pressure to lock assessments down is real. But locking down rarely keeps AI out, and it routinely pushes vulnerable students further from the work. Tai et al.’s argument, written before any of this hit, gives us a better question to organize the conversation around: what am I actually measuring, and who am I including or excluding in the way I measure it?
You don’t need a perfect new system. You need to start with one assessment, one redesign, one inclusion move at a time. The work is worth it because the students are.
References
- Dawson, P., Bearman, M., Dollinger, M., & Boud, D. (2024). Validity matters more than cheating. Assessment & Evaluation in Higher Education.
- Nieminen, J. H., & Eaton, S. E. (2024). Are assessment accommodations cheating? A critical policy analysis. Assessment & Evaluation in Higher Education.
- Tai, J., Ajjawi, R., Bearman, M., Boud, D., Dawson, P., & Jorre de St Jorre, T. (2023). Assessment for inclusion: Rethinking contemporary strategies in assessment design. Higher Education Research & Development, 42(2), 483–497. https://doi.org/10.1080/07294360.2022.2057451



