Welcome back to the AI Bayeslab Statistics series.
Today, we will explore two common theories used to evaluate item quality: classical test theory and item response theory.
1. What Is Classical Test Theory (CTT)?
Basic Theoretical Framework
Core formula: X = T + E (Observed score = True score + Error)
Key assumptions:
— The expected value of the error is 0: E(E)=0 (the outer E denotes expectation, the inner E the error term)
— Error is uncorrelated with the true score: \rho(T,E)=0
— Parallel tests have equal error variances
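A short simulation can make the core formula and its assumptions concrete. The sketch below is only an illustration: the score distribution (normal true scores with mean 50 and SD 10, normal errors with SD 5), the sample size, and the function names are all assumptions chosen for the example, not part of CTT itself. It checks that the simulated errors average out to roughly zero, that they are roughly uncorrelated with the true scores, and that the observed-score variance is then approximately the sum of the true-score and error variances.

```python
import random
import statistics


def simulate_ctt(n_examinees=20000, true_mean=50.0, true_sd=10.0,
                 error_sd=5.0, seed=0):
    """Draw true scores T and errors E independently, then form X = T + E.

    All distribution settings are illustrative assumptions.
    """
    rng = random.Random(seed)
    true_scores = [rng.gauss(true_mean, true_sd) for _ in range(n_examinees)]
    errors = [rng.gauss(0.0, error_sd) for _ in range(n_examinees)]
    observed = [t + e for t, e in zip(true_scores, errors)]
    return true_scores, errors, observed


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    return cov / (ss_x * ss_y) ** 0.5


T, E, X = simulate_ctt()

mean_error = statistics.fmean(E)   # should be close to 0
corr_te = pearson(T, E)            # should be close to 0
var_t = statistics.pvariance(T)
var_e = statistics.pvariance(E)
var_x = statistics.pvariance(X)    # should be close to var_t + var_e

print(f"mean error: {mean_error:.3f}")
print(f"corr(T, E): {corr_te:.3f}")
print(f"var(X): {var_x:.1f}  var(T) + var(E): {var_t + var_e:.1f}")
```

In real testing, T and E are never observed separately; the simulation only shows what the assumptions imply for the observed scores.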
Core Concepts and Metrics

Pros and Cons
Pros: Simple calculations, easy to understand, and suitable for small sample sizes.
Cons:
— Item parameters depend on the specific sample (e.g., difficulty is influenced by the ability distribution of the sample).
— Cannot provide individualized measurement error estimates; a single standard error of measurement applies to all examinees.
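The sample dependence of CTT difficulty is easy to see in a small simulation. In the sketch below, CTT difficulty is taken as the proportion of examinees answering correctly. The same item is given to two hypothetical groups, one weaker and one stronger. The response-generating rule (a simple logistic curve), the group means, and the function names are assumptions made up for the illustration.

```python
import math
import random


def proportion_correct(responses):
    """CTT item difficulty: the share of examinees who answer correctly."""
    return sum(responses) / len(responses)


def simulate_item(mean_ability, n=5000, item_location=0.0, seed=0):
    """Simulate 0/1 responses to one item for a group of examinees.

    Abilities are drawn from a normal distribution with the given mean
    and SD 1; the chance of a correct answer follows a logistic curve.
    Both choices are assumptions for this example.
    """
    rng = random.Random(seed)
    responses = []
    for _ in range(n):
        theta = rng.gauss(mean_ability, 1.0)
        p = 1.0 / (1.0 + math.exp(-(theta - item_location)))
        responses.append(1 if rng.random() < p else 0)
    return responses


# The same item, administered to a weaker and a stronger group.
p_low = proportion_correct(simulate_item(mean_ability=-1.0, seed=1))
p_high = proportion_correct(simulate_item(mean_ability=1.0, seed=2))

print(f"difficulty in the weaker group:   {p_low:.2f}")
print(f"difficulty in the stronger group: {p_high:.2f}")
```

The item itself never changes, yet it looks hard in the weaker group and easy in the stronger one, which is exactly the sample dependence described above.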
2. Item Response Theory (IRT)
Basic Theoretical Framework
Core model (e.g., 3-parameter logistic model, 3PL):
P(\theta) = c + \frac{1-c}{1+e^{-a(\theta-b)}}
where:
θ: Examinee's ability
a: Item discrimination
b: Item difficulty
c: Guessing parameter
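The 3PL formula translates directly into a few lines of code. The sketch below is a minimal illustration; the function name `p_correct_3pl` and the example item parameters (a = 1.2, b = 0, c = 0.2) are made up for the demonstration.

```python
import math


def p_correct_3pl(theta, a, b, c):
    """Probability of a correct response under the 3PL model.

    theta: examinee ability
    a: item discrimination
    b: item difficulty
    c: guessing parameter (lower asymptote)
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


# Hypothetical item, evaluated at a few ability levels.
for theta in (-3.0, -1.0, 0.0, 1.0, 3.0):
    p = p_correct_3pl(theta, a=1.2, b=0.0, c=0.2)
    print(f"theta = {theta:+.1f}  P(correct) = {p:.3f}")
```

Two properties are worth checking by hand: at θ = b the probability is (1 + c) / 2, halfway between the guessing floor and 1, and for very low θ it approaches c rather than 0.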
Core Concepts and Parameters

Model Types
Pros and Cons
Pros:
— Item parameter invariance (when the model fits, item parameters do not depend on the examinee group, up to a linear rescaling of the θ scale).
— Provides measurement error estimates that vary with ability level (via the information function).
Cons:
— Requires large samples (typically n > 500).
— High model complexity; calculations rely on specialized software.
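To show what "individualized measurement error" means in practice, the sketch below computes the 3PL item information function, I(θ) = a² · (Q/P) · ((P − c)/(1 − c))² with Q = 1 − P, sums it over a small test, and converts the total into an approximate standard error, SE(θ) ≈ 1/√I(θ). The three-item bank and the function names are hypothetical, chosen only for the illustration. Real IRT software also has to estimate the item parameters, which this sketch takes as given.

```python
import math


def p_3pl(theta, a, b, c):
    """3PL probability of a correct response."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def item_information_3pl(theta, a, b, c):
    """Item information for the 3PL model at ability theta."""
    p = p_3pl(theta, a, b, c)
    q = 1.0 - p
    return a ** 2 * (q / p) * ((p - c) / (1.0 - c)) ** 2


def standard_error(theta, items):
    """Approximate standard error of the ability estimate at theta.

    Test information is the sum of the item informations; the standard
    error is one over its square root.
    """
    test_information = sum(item_information_3pl(theta, *item) for item in items)
    return 1.0 / math.sqrt(test_information)


# Hypothetical item bank: (a, b, c) for each item.
items = [(1.2, -1.0, 0.2), (1.0, 0.0, 0.2), (1.5, 1.0, 0.2)]

for theta in (-4.0, 0.0, 4.0):
    print(f"theta = {theta:+.1f}  SE = {standard_error(theta, items):.2f}")
```

Because the items are clustered around θ = 0, the standard error is smallest there and grows for examinees far from the items' difficulty range. This is the ability-specific precision that a single CTT standard error of measurement cannot express.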
3. Key Comparisons Between CTT and IRT

Comparison Dimensions

Which One to Use?
Choose CTT if: ✓ Quick results needed ✓ Small sample ✓ Non-critical tests
Choose IRT if: ✓ Precise measurement required ✓ Item bank development ✓ Adaptive testing
Mnemonic:
"CTT is fast but rough, parameters shift with the sample;
IRT is precise and stable, a fixed scale with errors that vary by ability."
Stay tuned, subscribe to Bayeslab, and let everyone master the wisdom of statistics at a low cost with the AI Agent Online tool.