Abstract: Simulation studies are extremely common in the item response theory (IRT) research literature. This
article presents a didactic discussion of “truth” and “error” in IRT-based simulation studies. We
ultimately recommend that future research focus less on the simple recovery of parameters from a
convenient generating IRT model, and more on practical comparative estimation studies in which the
data are intentionally generated to incorporate nuisance dimensionality and other sources of
nuanced contamination encountered with real data. A new framework is also presented for
conceptualizing and comparing various residuals in IRT studies. The new framework allows even
very different calibration and scoring IRT models to be compared on a common, convenient, and
highly interpretable number-correct metric. Some illustrative examples are included.
Keywords: estimation, item response theory, model fit, simulations