Item Response Theory vs Confirmatory Factor Analysis

I was wondering what the core, meaningful differences are between Item Response Theory and Confirmatory Factor Analysis.

I understand that there are differences in the calculations (a focus on individual items vs. covariances; a nonlinear logistic/probit model vs. a linear one).

However, I have no idea what this means from a higher-level perspective. Does this mean that IRT is better than CFA in some circumstances, or that the two serve slightly different end-purposes?

Any musings would be useful, as a scan of the research literature turned up more descriptions of IRT and CFA than any useful comparison of the core differences between them.


@Philchalmers' answer is on point, and if you want a reference from one of the leaders in the field, Bengt Muthén (creator of Mplus), here you go:
(Edited to include direct quote)

An Mplus user asks: I am trying to describe and illustrate current
similarities and differences between binary CFA and IRT for my thesis.
The default estimation method in Mplus for categorical CFA is WLSMV.
To run an IRT model, the example in your manual suggests using MLR as
the estimation method. When I use MLR, is the data input still the
tetrachoric correlation matrix, or is it the original response data matrix?

Bengt Muthen responds: I don’t think there is a difference between CFA
of categorical variables and IRT. It is sometimes claimed but I don’t
agree. Which estimator is typically used may differ, but that’s not
essential. MLR uses the raw data, not a sample tetrachoric correlation
matrix. … The ML(R) approach is the same as the “marginal ML (MML)”
approach described in e.g. Bock’s work. So using the raw data and
integrating over the factors using numerical integration. MML being
contrasted with “conditional ML” used e.g. with Rasch approaches.
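The marginal ML idea Muthén attributes to Bock can be sketched in a few lines: the latent factor is integrated out numerically, here with Gauss-Hermite quadrature over a standard-normal trait. This is a toy illustration of the technique, not Mplus's actual estimator; the 2PL item parameters and response patterns below are made up for demonstration.

```python
import numpy as np
from numpy.polynomial.hermite import hermgauss

def marginal_loglik(responses, a, b, n_quad=21):
    """Marginal log-likelihood of binary response patterns under a 2PL,
    integrating the latent trait out with Gauss-Hermite quadrature."""
    x, w = hermgauss(n_quad)            # nodes/weights for ∫ f(t) e^{-t²} dt
    theta = np.sqrt(2.0) * x            # change of variable so θ ~ N(0, 1)
    weights = w / np.sqrt(np.pi)
    # P(y_j = 1 | θ) for every item at every quadrature node: (nodes, items)
    p = 1.0 / (1.0 + np.exp(-a * (theta[:, None] - b)))
    ll = 0.0
    for y in responses:
        cond = np.prod(np.where(y, p, 1.0 - p), axis=1)  # P(y | θ) per node
        ll += np.log(cond @ weights)                     # ∫ P(y | θ) φ(θ) dθ
    return ll

# Hypothetical example: three items, two observed response patterns
a = np.array([1.0, 1.5, 0.8])
b = np.array([-0.5, 0.0, 0.5])
resp = np.array([[1, 0, 1], [0, 0, 1]], dtype=bool)
print(marginal_loglik(resp, a, b))
```

An estimator would maximize this quantity over a and b; conditional ML (the Rasch tradition) instead conditions on sufficient statistics and never integrates over θ.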

Assuming normal factors, probit (normal ogive) item-factor relations,
and conditional independence, the assumptions are the same for ML and
for WLSMV, where the latter uses tetrachorics. This is because those
assumptions correspond to assuming multivariate normal underlying
continuous latent response variables behind the categorical outcomes.
So WLSMV only uses 1st- and 2nd-order information, whereas ML goes all
the way up to the highest order. The loss of info appears small,
however. ML doesn’t fit the model to these sample tetrachorics, so
perhaps one can say that WLSMV marginalizes in a different way. It’s a
matter of estimator differences rather than model differences.
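The parameter-level equivalence behind Muthén's point can be made concrete. Under the probit (normal-ogive) link, a standardized categorical-CFA loading λ and threshold τ map one-to-one onto normal-ogive IRT discrimination and difficulty: a = λ/√(1−λ²), b = τ/λ. A minimal sketch, with made-up example values:

```python
import math

def irt_from_cfa(loading, threshold):
    """Map a standardized probit-CFA loading/threshold pair to
    normal-ogive IRT discrimination/difficulty parameters."""
    a = loading / math.sqrt(1.0 - loading ** 2)   # discrimination
    b = threshold / loading                       # difficulty
    return a, b

def cfa_from_irt(a, b):
    """Inverse map: IRT parameters back to loading/threshold."""
    loading = a / math.sqrt(1.0 + a ** 2)
    threshold = b * loading
    return loading, threshold

# Hypothetical item: loading 0.8, threshold 0.4
a, b = irt_from_cfa(0.8, 0.4)
print(a, b)  # ≈ 1.333, 0.5 — the same item in the IRT parameterization
```

The two parameterizations carry identical information, which is exactly the "estimator differences rather than model differences" point; logistic-metric IRT programs differ only by the usual 1.7 scaling constant that approximates the probit curve.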

We have an IRT note on our web site: …

… but again, the ML(R) approach is nothing different from what's used in …


Source: Link. Question author: SimonsSchus. Answer author: Jeremy Miles.
