Abstract
Several recent studies have employed item response theory (IRT) to rank incorrect responses to commonly used research-based multiple-choice assessments. These studies use Bock’s nominal response model (NRM) for applying IRT to categorical (nondichotomous) data, but the response rankings only utilize half of the parameters estimated by the model. We present a mathematical argument for why this practice of using half of the NRM parameters when ranking responses is appropriate based on the primary question of multiple-choice tests: How can we use students’ responses to test items to estimate their overall knowledge levels? We provide additional motivation for this practice by recognizing the similarities between Bock’s NRM and the probability function of the canonical ensemble with degenerate energy states. As physicists often do, we exploit these mathematical similarities to gain new insights into the meaning of the IRT parameters and a richer understanding of the relationship between these parameters and student knowledge.
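The mathematical parallel the abstract alludes to can be sketched numerically: Bock's NRM category probabilities are a softmax over linear functions of ability, and the canonical ensemble with degenerate levels has the same functional form under the mapping $E_k \leftrightarrow -a_k$, $g_k \leftrightarrow e^{c_k}$, $1/kT \leftrightarrow \theta$. The parameter values below are illustrative only, not taken from the paper.

```python
import numpy as np

def nrm_probabilities(theta, a, c):
    """Bock's NRM: P_k(theta) = exp(a_k*theta + c_k) / sum_j exp(a_j*theta + c_j)."""
    z = a * theta + c
    z -= z.max()  # shift for numerical stability; does not change the ratio
    w = np.exp(z)
    return w / w.sum()

def boltzmann_probabilities(energies, degeneracies, kT):
    """Canonical ensemble with degeneracy: p_k = g_k exp(-E_k/kT) / Z."""
    w = degeneracies * np.exp(-np.asarray(energies) / kT)
    return w / w.sum()

# Hypothetical NRM parameters for a 4-option item
a = np.array([1.2, 0.3, -0.5, -1.0])   # slope (discrimination) parameters
c = np.array([0.0, 0.4, 0.2, -0.6])    # intercept parameters
theta = 1.0                            # example ability level

p_nrm = nrm_probabilities(theta, a, c)

# Apply the mapping: E_k = -a_k, g_k = exp(c_k), kT = 1/theta
p_boltzmann = boltzmann_probabilities(energies=-a,
                                      degeneracies=np.exp(c),
                                      kT=1.0 / theta)
```

Under this mapping the two distributions coincide exactly, which is the sense in which the slope parameters play the role of (negative) energies and the intercepts act like log degeneracies.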
- Received 8 October 2021
- Accepted 23 March 2022
DOI: https://doi.org/10.1103/PhysRevPhysEducRes.18.010133
Published by the American Physical Society under the terms of the Creative Commons Attribution 4.0 International license. Further distribution of this work must maintain attribution to the author(s) and the published article’s title, journal citation, and DOI.