I’ve been cleaning up my computer and going through old files, and I came across a slew of notes from the Categorical Data Analysis course I took in grad school at UNL. Judging by the plethora of question marks on the lecture notes from that particular class, I apparently had a difficult time discerning the difference between residual deviance and null deviance. In case you too are having trouble with these two deviants (:), here’s an explanation.

Today’s lesson: Residual deviance, Null deviance and Likelihood Ratio Tests (LRT).

First of all, only someone nerdy enough about logistic regression would still be reading this far, so I’m going to go ahead and make some assumptions about your background (e.g., you are at least vaguely familiar with generalized linear models).

For starters, let’s say you’ve got your response variable (Y) and explanatory variables (X, Z, etc.) and you want to find the best-fitting model. Naturally, the bestest (not a real word, I know) fitting model is the one with a parameter for each cell of the contingency table. We call this the “saturated” model. It reproduces the observed data exactly, but it is far too cumbersome to work with, so it becomes the baseline against which we compare other (shorter/simpler/easier) models.

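The Null Deviance assesses the goodness of fit of a model with only the intercept term relative to the saturated model; basically, it tells you how much fit you lose by ignoring all of your explanatory variables. A large null deviance means the intercept alone does not describe the data well. (To test whether at least one of the βs in a particular model is not equal to zero, you compare that model’s residual deviance to the null deviance, which is the likelihood ratio test described further down.)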
The hypotheses you’re testing are:

Ho: logit(π) = α
Ha: logit(πj) = γj (the saturated model, with a separate parameter γj for each cell j)
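
To make the null deviance concrete, here is a minimal sketch in Python (not R, so nothing here is glm() output) that computes it by hand for grouped binomial data. The counts are made up for illustration, and the names binomial_deviance, successes and trials are my own. It relies on two facts: the saturated model's fitted probabilities are just the observed proportions, and the intercept-only model's fitted probability is the pooled proportion.

```python
import math

# Hypothetical grouped data: 5 cells of the table, 20 trials in each.
successes = [2, 5, 9, 14, 18]
trials = [20, 20, 20, 20, 20]


def binomial_deviance(successes, trials, fitted):
    """Deviance of a set of fitted probabilities relative to the saturated model."""
    total = 0.0
    for y, n, p in zip(successes, trials, fitted):
        # Cells with y == 0 or y == n contribute nothing for that term
        # (the usual convention that 0 * log(0) = 0).
        if y > 0:
            total += y * math.log(y / (n * p))
        if y < n:
            total += (n - y) * math.log((n - y) / (n * (1 - p)))
    return 2 * total


# Saturated model: one parameter per cell, so fitted = observed proportions.
saturated_fit = [y / n for y, n in zip(successes, trials)]

# Intercept-only model: the same pooled proportion in every cell.
pooled = sum(successes) / sum(trials)
null_fit = [pooled] * len(trials)

saturated_deviance = binomial_deviance(successes, trials, saturated_fit)
null_deviance = binomial_deviance(successes, trials, null_fit)
null_df = len(trials) - 1  # number of cells minus the one intercept

print("saturated deviance:", round(abs(saturated_deviance), 6))
print("null deviance:", round(null_deviance, 3), "on", null_df, "df")
```

The saturated model's deviance against itself is zero (up to rounding), which is exactly why it works as the baseline.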

The Residual Deviance assesses the goodness of fit of a specified model with k βs relative to the saturated model; this tells you whether the model you arrived at through the model-building process fits adequately compared to the saturated model.
The hypotheses are:

Ho: logit(π) = α + β1x1 + β2x2 +...+ βkxk
Ha: logit(π) = the saturated model

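But what if you want to compare two simplified models to each other, not to the saturated model? As long as the models are nested (the simpler one is a special case of the more complex one), you set them up normally: the simpler model is the Ho, the more complex model is the Ha. Run glm() on each and simply subtract the residual deviance of the more complex model from that of the simpler one. Ta-da!! That difference is your Likelihood Ratio Test statistic!! Compare it to a chi-square distribution with degrees of freedom equal to the difference in the number of parameters between the two models; a large value is evidence against Ho, meaning the extra terms are worth keeping.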
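
Here is a rough sketch of that subtraction in Python, done by hand rather than with R's glm(). Everything in it is an assumption for illustration: the data are made up, the helper names (fit_logistic, binomial_deviance) are mine, and the fitting is a bare-bones Newton-Raphson for a single predictor. Ho is the intercept-only model and Ha adds one slope, so the test has 1 degree of freedom, and for 1 df the chi-square tail probability can be written with math.erfc.

```python
import math

# Hypothetical grouped data: predictor value, successes and trials per cell.
x = [0, 1, 2, 3, 4]
successes = [2, 5, 9, 14, 18]
trials = [20, 20, 20, 20, 20]


def binomial_deviance(successes, trials, fitted):
    """Deviance of a set of fitted probabilities relative to the saturated model."""
    total = 0.0
    for y, n, p in zip(successes, trials, fitted):
        if y > 0:
            total += y * math.log(y / (n * p))
        if y < n:
            total += (n - y) * math.log((n - y) / (n * (1 - p)))
    return 2 * total


def fit_logistic(x, successes, trials, iterations=50, tol=1e-10):
    """Newton-Raphson fit of logit(p) = a + b*x; returns (a, b)."""
    a, b = 0.0, 0.0
    for _ in range(iterations):
        p = [1 / (1 + math.exp(-(a + b * xi))) for xi in x]
        w = [n * pi * (1 - pi) for n, pi in zip(trials, p)]
        # Score vector (first derivatives of the log-likelihood).
        u_a = sum(y - n * pi for y, n, pi in zip(successes, trials, p))
        u_b = sum(xi * (y - n * pi) for xi, y, n, pi in zip(x, successes, trials, p))
        # Information matrix entries, then solve the 2x2 system directly.
        i_aa = sum(w)
        i_ab = sum(xi * wi for xi, wi in zip(x, w))
        i_bb = sum(xi * xi * wi for xi, wi in zip(x, w))
        det = i_aa * i_bb - i_ab * i_ab
        step_a = (i_bb * u_a - i_ab * u_b) / det
        step_b = (i_aa * u_b - i_ab * u_a) / det
        a, b = a + step_a, b + step_b
        if abs(step_a) < tol and abs(step_b) < tol:
            break
    return a, b


# Ho: intercept only (fitted probability is the pooled proportion).
pooled = sum(successes) / sum(trials)
deviance_h0 = binomial_deviance(successes, trials, [pooled] * len(x))

# Ha: intercept plus one slope.
a, b = fit_logistic(x, successes, trials)
fitted_ha = [1 / (1 + math.exp(-(a + b * xi))) for xi in x]
deviance_ha = binomial_deviance(successes, trials, fitted_ha)

# LRT statistic: difference in residual deviances, on 1 df here.
lrt_stat = deviance_h0 - deviance_ha
lrt_df = 1
p_value = math.erfc(math.sqrt(lrt_stat / 2))  # chi-square upper tail, valid for 1 df only

print("deviance under Ho:", round(deviance_h0, 3))
print("deviance under Ha:", round(deviance_ha, 3))
print("LRT statistic:", round(lrt_stat, 3), "on", lrt_df, "df, p =", p_value)
```

Notice that because Ho here is the intercept-only model, its residual deviance is the null deviance, so this particular test is the "is at least one β non-zero?" question. With more than one extra parameter the erfc shortcut no longer applies and you would need a general chi-square tail function instead.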


