Medical Students Reach Sustained Topic Mastery in a Median of 10 Encounters on Ora AI's World-First Spaced-Repetition QBank, Validated Across 1.46M Per-Student Topic Records.
Ora AI Research Team. Topic Mastery Trajectories at Scale.
Mean topic-encounter accuracy rose from 52.5% on first encounter to 59.4% by encounter 10 on Ora's clinical-vignette qbank substrate. The analysis starts from 1,459,436 per-user x topic stats rows, reconstructs 1.21 million anonymized response-topic events, and applies the trajectory model to 8,606 user x topic trajectories meeting the analytic encounter threshold. Among trajectories reaching the default sustained-mastery criterion, the median time-to-mastery was 10 encounters. To our knowledge, this is the first published per-user x topic mastery-trajectory characterization in medical education at this scale.
Key figures: mean accuracy 52.5% to 59.4%, encounter 1 to 10; 1,459,436 stats rows; 8,606 trajectories; median 10 encounters to sustained mastery.
95% cluster-bootstrap CI: encounter 1 = 51.4-53.5%; encounter 10 = 58.3-60.3%. Accuracy at encounter 20 remains above first-encounter accuracy, but the curve flattens.
Root-category aggregation avoids per-topic sparsity while preserving the directional test of mastery growth.
The finding is not that individual learners improve monotonically on every topic. They do not: topic sequences are noisy, scheduler-selected, and heterogeneous. The contribution is the aggregate empirical characterization: across the analytic sample, repeated topic encounters move accuracy upward in the direction predicted by mastery-learning and knowledge-tracing theory, with a stronger signal in tutor mode than in timed mode.
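As a rough illustration of how an aggregate curve point and its interval of this kind could be computed, the sketch below takes the mean accuracy at a given encounter index and builds a percentile interval by resampling whole user x topic trajectories with replacement. It is a minimal sketch under stated assumptions, not the production analysis: the function names, the list-of-0/1 trajectory representation, and the toy data are hypothetical.

```python
import random
from statistics import mean

def accuracy_at(trajectories, k):
    """Mean correctness at the k-th encounter (1-indexed).

    Each trajectory is a list of 0/1 outcomes in encounter order
    (a hypothetical representation). Trajectories shorter than k are skipped.
    """
    return mean(t[k - 1] for t in trajectories if len(t) >= k)

def cluster_bootstrap_ci(trajectories, k, n_iter=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI that resamples whole trajectories (clusters)."""
    rng = random.Random(seed)
    n = len(trajectories)
    stats = []
    for _ in range(n_iter):
        # Draw n trajectories with replacement; every encounter inside a
        # drawn trajectory travels with it, which preserves within-cluster
        # correlation.
        sample = [trajectories[rng.randrange(n)] for _ in range(n)]
        stats.append(accuracy_at(sample, k))
    stats.sort()
    lo = stats[int((alpha / 2) * n_iter)]
    hi = stats[int((1 - alpha / 2) * n_iter) - 1]
    return lo, hi

# Toy data only: four made-up trajectories of three encounters each.
toy = [[1, 0, 1], [0, 0, 1], [1, 1, 1], [1, 0, 0]]
point = accuracy_at(toy, 1)
lo, hi = cluster_bootstrap_ci(toy, 1, n_iter=200)
print(point, lo, hi)
```

Resampling at the trajectory level rather than the encounter level keeps repeated encounters from the same learner-topic pair from being treated as independent observations.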
What this adds
Mastery learning, item-response theory, Bayesian Knowledge Tracing, and Learning Factors Analysis already provide the theoretical vocabulary for skill acquisition and learner modeling [1-4]. What has been sparse in medical education is large-scale per-user x per-topic empirical trajectory data on clinical-knowledge questions. Ora's qbank substrate makes that characterization observable: every submitted clinical vignette response can be attributed to the topic graph, ordered within a learner-topic sequence, and summarized without exposing users, schools, or vignette text.
Method
- Tables. Vignette response, variant, and topic-attribution tables in Ora's production database.
- Scale. 201,205 submitted responses; 1.21M response-topic events after topic attribution.
- Privacy. No user IDs, schools, item text, or response IDs in public artifacts.
- Unit. User x depth-2 topic cluster with at least 10 encounters.
- Attribution. Concept-level vignette joins; duplicate leaf topics collapsed to one depth-2 cluster per response.
- Uncertainty. 1,000-iteration cluster bootstrap over user x topic trajectories.
- Pilot. Stratified 2,000-trajectory pilot matched the full-sample curve.
- Sensitivity. First-attempt-only, tutor/timed mode, time-window, and attribution checks were run.
- Mastery. Default criterion: 80% rolling-window accuracy, window 5, sustained for 3 windows.
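The default mastery criterion in the list above can be read in more than one way. The sketch below shows one possible reading, assuming overlapping rolling windows that advance one encounter at a time and defining time-to-mastery as the encounter at which the third consecutive qualifying window ends. Both choices, along with the function name and the example data, are assumptions made for illustration and are not confirmed by the analysis.

```python
from statistics import median
from typing import Optional, Sequence

def encounters_to_mastery(
    outcomes: Sequence[int],
    threshold: float = 0.8,
    window: int = 5,
    sustain: int = 3,
) -> Optional[int]:
    """Return the 1-indexed encounter at which sustained mastery is reached.

    Assumption: windows overlap and step by one encounter, and `sustain`
    consecutive windows must each meet the accuracy threshold.
    Returns None if the criterion is never met.
    """
    streak = 0
    for end in range(window, len(outcomes) + 1):
        acc = sum(outcomes[end - window:end]) / window
        streak = streak + 1 if acc >= threshold else 0
        if streak == sustain:
            return end
    return None

# Made-up trajectories of 0/1 outcomes, for illustration only.
examples = [
    [0, 1, 1, 1, 1, 1, 1, 0, 1, 1],        # mastery at encounter 7
    [1, 0, 1, 0, 1, 0, 1, 0, 1, 0],        # never reaches the criterion
    [1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1],  # mastery at encounter 12
]
times = [encounters_to_mastery(t) for t in examples]
reached = [t for t in times if t is not None]
print(times, median(reached))
```

Under this reading the earliest possible time-to-mastery is window + sustain - 1 = 7 encounters, and the median is taken only over trajectories that reach the criterion, matching the phrase "among trajectories reaching the default sustained-mastery criterion".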
This is descriptive, not causal: the empirical curve reflects both learner progress and scheduler-driven encounter selection. The analytic threshold excludes low-encounter user x topic starts, so the trajectory claim applies to the engaged analytic sample, not every topic exposure. Multiple-choice accuracy is a recognition measure, not the production-task mastery measure used in much of the classical literature. Multi-topic attribution matters because one vignette can map to several topic clusters; fractional attribution preserved the same directional pattern, but a strict single-cluster approximation was too small for the public headline. Named depth-2 topic clusters did not meet the public min-N rule, so this brief reports aggregate and root-category results only.
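To make the attribution options concrete, the following sketch contrasts full attribution, where a response counts once for each depth-2 cluster it touches after duplicate leaf topics are collapsed, with fractional attribution, where a response's weight is split evenly across its clusters. The tuple-based topic paths, the topic names, and the even 1/n split are hypothetical choices for illustration; the weighting scheme actually used is not specified here.

```python
def cluster_events(leaf_paths, fractional=False):
    """Map one response's leaf topics to weighted depth-2 cluster events.

    leaf_paths: iterable of topic paths such as (root, depth2, leaf).
    This path format is a hypothetical stand-in for the topic graph.
    """
    # Collapse leaves that share the same (root, depth2) prefix so each
    # cluster appears at most once per response.
    clusters = sorted({tuple(path[:2]) for path in leaf_paths})
    if not clusters:
        return []
    weight = 1.0 / len(clusters) if fractional else 1.0
    return [(cluster, weight) for cluster in clusters]

# One made-up response tagged with three leaf topics in two clusters.
response_topics = [
    ("Cardiology", "Arrhythmia", "Atrial fibrillation"),
    ("Cardiology", "Arrhythmia", "Atrial flutter"),
    ("Pharmacology", "Anticoagulants", "Warfarin"),
]
print(cluster_events(response_topics))
print(cluster_events(response_topics, fractional=True))
```

With full attribution the example response contributes two events of weight 1.0. With fractional attribution it contributes two events of weight 0.5, so each response carries the same total weight regardless of how many clusters it maps to.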
References
1. Bloom BS. The 2 Sigma Problem: The Search for Methods of Group Instruction as Effective as One-to-One Tutoring. Educational Researcher. 1984;13(6):4-16. doi:10.3102/0013189X013006004
2. Kulik C-LC, Kulik JA, Bangert-Drowns RL. Effectiveness of Mastery Learning Programs: A Meta-Analysis. Review of Educational Research. 1990;60(2):265-299. doi:10.3102/00346543060002265
3. Corbett AT, Anderson JR. Knowledge tracing: Modeling the acquisition of procedural knowledge. User Modeling and User-Adapted Interaction. 1994;4:253-278. doi:10.1007/BF01099821
4. Cen H, Koedinger KR, Junker B. Learning Factors Analysis: A general method for cognitive model evaluation and improvement. Lecture Notes in Computer Science. 2006;4053:164-175. doi:10.1007/11774303_17
5. Larsen DP, Butler AC, Roediger HL III. Test-enhanced learning in medical education. Medical Education. 2008;42(10):959-966. doi:10.1111/j.1365-2923.2008.03124.x