A map of the research on AI and learning

How can AI enable real learning?

What the research actually says.

How can AI actually help someone learn, and not just finish faster?

For a few months I have been following that question, and I wanted to step back from the individual headlines to draw the bigger map. This is that attempt: the through-line across the research, and the difference between an AI that answers you and one that makes you better at what you do.

Two honest caveats. First, I am not an education researcher: this is my map of what the research currently shows, meant to provoke a better conversation, definitely not to close one. Second, it is a snapshot as of June 2026, and the field moves fast: the models, the tools and the studies will keep shifting, and some of this will date. The sources are listed at the end, and the original carousel is also available as a PDF.


The problem

AI can improve performance and reduce learning.

In a randomised study of about 1,000 high school maths students in Türkiye, students with access to a standard ChatGPT interface scored 48% higher than classmates without it, for as long as they had that access. When access was taken away, they scored 17% lower than students who never had it. A 2025 paper in Nature Reviews Psychology makes the underlying point: performance gains are not the same as learning.

Source: Bastani et al. (2025), PNAS. On performance vs learning: Yan, Greiff, Lodge & Gašević (2025), Nature Reviews Psychology.

+48% with AI → −17% once removed

Better output is not the same as learning.

Why it happens

When AI does the thinking, the learning does not happen.

Most AI tools were built for work, not to optimise learning. At work, the goal is to finish the task with the least effort. But in learning, that effort is the point: it is what builds the capability. Use a tool built for work to learn, and the task gets finished but the understanding never develops. Researchers call this “metacognitive laziness”: the learner stops planning, monitoring and self-evaluating, because the AI always has an answer.

Source: Khosravi et al. (2026), Building AI Companions that Prioritise Learning over Performance; metacognitive laziness: Fan et al. (2025), British Journal of Educational Technology.

If the AI carries the effort, it also carries off the learning.

Why effort is the point

The struggle is not an obstacle to learning. It is the learning.

Learning scientists Robert and Elizabeth Bjork call these “desirable difficulties”: things like recalling an answer from memory, spacing your practice out, or trying a problem before you see the solution. They make learning feel harder now, but they make it last. When AI removes that effort, it can quietly remove the learning the effort produced. A 2026 review of 67 studies names the same mechanism, epistemic friction: without it, “AI-generated fluency can bypass the reflective struggle central to deep learning.”

Source: Bjork & Bjork, desirable difficulties; Li, Cui & Hagedorn (2026), Computers and Education: AI.

67 studies reviewed

Protect the difficulty that does the teaching.

But design changes everything

The same technology can help or harm. The design decides which.

In a Harvard physics study, a purpose-built AI tutor designed with proper scaffolding beat in-person active learning by 0.73 to 1.3 standard deviations, two to three times the usual bar for a substantial effect in education research. In the Türkiye study, the standard ChatGPT left students worse off once it was removed, while a guardrailed tutor version largely avoided that loss. Same underlying technology, opposite outcomes, depending on how the tool was designed.

Source: Kestin et al. (2025), Scientific Reports; Bastani et al. (2025), PNAS.
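
For readers less used to effect sizes: a figure such as 0.73 SD is a standardised mean difference, the gap between two group averages expressed in units of their spread. The sketch below shows one common way to compute it (Cohen's d with a pooled standard deviation). It is an illustration only: the scores are invented, the function name is hypothetical, and the studies cited here may have computed their effect sizes differently.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(treatment, control):
    """Standardised mean difference, using the pooled sample standard deviation."""
    n_t, n_c = len(treatment), len(control)
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = (
        (n_t - 1) * variance(treatment) + (n_c - 1) * variance(control)
    ) / (n_t + n_c - 2)
    return (mean(treatment) - mean(control)) / sqrt(pooled_var)


# Hypothetical test scores, invented for illustration (not data from any study).
ai_tutor_scores = [72, 78, 81, 85, 90]
active_learning_scores = [65, 70, 74, 79, 82]

d = cohens_d(ai_tutor_scores, active_learning_scores)
print(f"d = {d:.2f}")
```

Read this way, an effect of about 1 SD means the average learner in one group scored roughly one standard deviation above the average learner in the other.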

0.73–1.3 SD, 2–3× the usual bar

The result is set by the design, not by the model.

And by the human behind it

Believing a human is paying attention changes how hard we try.

In a controlled study in a university creative-coding course, students received identical AI-generated feedback on their work. Those told it came from a human teaching assistant ran their code more, wrote more code, and spent more time on later work. Both groups rated the feedback as equally helpful. The effect on effort was large (d = 0.88 to 1.56). The content was the same.

Source: Morris & Maes (2026), Same Feedback, Different Source.

d = 0.88 to 1.56 (effect on effort)

Same words land differently when we believe a human wrote them.

What only a human does

AI can help with the content. It rarely touches the rest.

Education does three things at once: it builds knowledge and skills (qualification), it helps you find your place among others (socialization), and it helps you become someone who thinks independently and takes responsibility (subjectification). AI tools mostly reach the first. They rarely, if ever, address the other two, and those are where a teacher does their deepest work.

Source: Gert Biesta, three functions of education; Wayne Holmes (2026), Learning to think in the AI era.

3 functions: AI reaches the first

AI can teach the content. A human helps you become someone.

Three ways AI can show up

An LLM, a tutor, and a learning companion are not the same thing.

An LLM answers your question. Faster work, less learning.

An AI tutor asks questions back, no matter what you actually need. Often frustration and drop-out.

An AI learning companion (Dr Philippa Hardman calls it a “study mate”) remembers where you got stuck and pushes you towards the thinking you avoid. Capability that lasts.

Source: Khosravi et al. (2026), Building AI Companions that Prioritise Learning over Performance; “study mate” framing via Dr Philippa Hardman.

LLM · Tutor · Companion

Aim for a learning companion, not an answer machine.

What education is for now

AI does not shorten what there is to learn. It lengthens it.

Holmes (UNESCO) asks it directly: if generative AI is this powerful, do we still need to learn? His answer is yes, and the list grows: on top of what we wish to learn, we now need to learn AI’s profound limitations, its impacts on human rights, social justice and the environment, and “perhaps most importantly, [to] learn how to think… critically.” The World Economic Forum (WEF) keeps the balance: rote memorisation may matter less, but “the process of mastering knowledge continues to develop broader capabilities” such as grit, curiosity, communication and critical thinking, and assessment must evolve to capture them.

Source: Wayne Holmes (2026), UNESCO Courier; WEF (2026), Assessments readiness signal.

The emphasis moves from having answers to judging them.

So how do we make it help?

The challenge is also systemic. It runs through the tool, the classroom, and the system around them.

The tool: the guardrailed tutor and the learning companion are instructional design built into software: scaffolding, answers withheld, help that fades as you grow.

The classroom: co-design tools with teachers rather than deploying them on teachers, and set tasks that make learners compare, justify and revise what the AI produces. In the 67-study review, that scaffolding is what separated gains from cognitive offloading.

The system: AI only helps where the conditions are ready. As the WEF puts it, “learning outcomes will not be determined by technology itself, but by the conditions in which it is deployed”, and isolated fixes across policy, pedagogy and technology are unlikely to be sufficient.

Source: Bastani et al. (2025); Khosravi et al. (2026); OECD (2026), chapters 7 and 8; Li, Cui & Hagedorn (2026); WEF (2026), executive summary.
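
To make the tool-level ideas concrete (scaffolding, answers withheld, help that fades), here is a minimal sketch of what such a policy could look like in code. Everything in it is an assumption for illustration: the hint wording, the mastery thresholds, and names such as `LearnerState` and `next_hint` are hypothetical, and this is not the design used in any of the studies cited above.

```python
from dataclasses import dataclass

# Hypothetical hint ladder: each rung adds support, and none states the final answer.
HINT_LADDER = [
    "What is the question asking you to find?",
    "Which concept or formula applies here?",
    "Here is the first step worked out. Try the next one yourself.",
]


@dataclass
class LearnerState:
    mastery: float = 0.0  # assumed estimate from 0.0 (novice) to 1.0 (fluent)
    attempts: int = 0     # attempts made on the current problem


def max_hint_level(mastery: float) -> int:
    """Fade support: the higher the mastery, the fewer rungs are available."""
    if mastery < 0.4:   # arbitrary illustrative threshold
        return 3
    if mastery < 0.8:   # arbitrary illustrative threshold
        return 2
    return 1


def next_hint(state: LearnerState) -> str:
    """Ask for an attempt first, then release hints one rung at a time."""
    if state.attempts == 0:
        return "Have a go first, then tell me where you got stuck."
    level = min(state.attempts, max_hint_level(state.mastery))
    return HINT_LADDER[level - 1]


print(next_hint(LearnerState(mastery=0.1, attempts=0)))
print(next_hint(LearnerState(mastery=0.1, attempts=3)))
print(next_hint(LearnerState(mastery=0.9, attempts=5)))
```

The design choice worth noticing is that the learner's own attempt comes first, and the most supportive hint is only reachable while mastery is low, so the help recedes as capability grows.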

Tool · Classroom · System, together

Conditions at each level shape outcomes at every other. Real progress needs all three moving.

So, how can AI enable real learning?

Protect the difficulty that does the teaching.
The result is set by the design, not by the model.
AI can teach the content. A human helps you become someone.
Aim for a learning companion, not an answer machine.
Isolated fixes are unlikely to be enough. The levels have to move together.

One test cuts through it all, the one Inara Scott’s framework points to: ask who is doing the thinking, you or the AI. Keep yours alive, and use AI to learn, not just to finish.

Scott’s point is blunt: banning AI is the worst strategy. Ignore it or forbid it and you push learners straight to what her AI Cognitive Pyramid calls the Executant, the worst outcome for their long-term growth. The useful path is the opposite: design our learning contexts, and our work contexts, to help us climb the pyramid, building agency and skill over time through prompts, context engineering, learning design, and everyday habits with AI. Our own rules, set with the full 360° in view: ethical, societal, environmental.

The question I keep coming back to

Are you here for the output, or to get better at what you do?

Where this comes from

A map of the research, drawn over a few months and compiled with the help of AI.


Bastani et al. (2025), Generative AI without guardrails can harm learning, PNAS
Khosravi et al. (2026), Building AI Companions that Prioritise Learning over Performance, arXiv
Kestin et al. (2025), AI tutoring outperforms in-class active learning, Scientific Reports
Morris & Maes (2026), Same Feedback, Different Source, arXiv
Yan, Greiff, Lodge & Gašević (2025), Nature Reviews Psychology
Li, Cui & Hagedorn (2026), Computers and Education: AI
Fan et al. (2025), Beware of metacognitive laziness, British Journal of Educational Technology
Inara Scott (2026), The AI Cognitive Pyramid, SSRN
Robert & Elizabeth Bjork, desirable difficulties
Gert Biesta, three functions of education
OECD (2026), Digital Education Outlook 2026
Wayne Holmes (2026), Learning to think in the AI era, UNESCO Courier
World Economic Forum (2026), Shaping the Future of Learning
Dr Philippa Hardman, From AI Tutors to AI Study Mates

The three institutional reports (OECD, UNESCO, WEF) are sources I drew on, not endorsers of this synthesis.