We saw in the previous Chapter that, for in any EF with an unconstrained moment parameter and for a single observation, the MLE is
For a GLM with observations the largest number of parameters that we can have is , a parameter for each observation, corresponding to the saturated model. The largest likelihood value then occurs at
and equals . The model exactly fits all the responses. This sets a baseline for comparison with more interesting sub-models.
For a GLM with , the parameter estimates give corresponding fitted values . These fitted values are constrained to lie in a -dimensional subspace of .
The deviance, of the model is
The closer a GLM fits the data, the more similar is to and the closer is to so the smaller the deviance. Since is always greater than , the residual deviance is always non-negative.
 
 Exercise 4.40 
Find the deviance for a Poisson regression model.
Below is the output from the Poisson regression of the AIDS data:
Call:  glm(formula = number ~ 1 + time, family = poisson(link = "log"),
    data = aids)
Coefficients:
(Intercept)         time
     0.3396       0.2565
Degrees of Freedom: 13 Total (i.e. Null);  12 Residual
Null Deviance:      207.3
Residual Deviance: 29.65        AIC: 86.58
Here we see that the deviance for the proposed model has deviance 29.65. The deviance for the null model has deviance 207.3.
 
 Exercise 4.41 
Find the deviance for an Exponential regression model.
The null model
At the other extreme, the smallest number of parameters we can have is one, in which all responses have a common mean parameter . This simplest model is called the null or minimal model, determined by . The deviance of this model is the total deviance to explain.
For the EF it usually turns out that
The main exception is the binomial model where is the overall proportion of successes, and one gets a weighted mean. For the normal multiple regression model with , . We often write .
The residual deviance of a fitted GLM with fitted means maximised over is
Under certain conditions which require that the sample size is large and the assumptions that generate the data from the model hold, the distribution of the deviance is
where the .
Normal linear models
The normal pdf is
The log-likelihood from independent observations on is
 
 Exercise 4.42 
Find the scaled deviance for a normal linear regression model with known variance  based on a sample of size  and fitted values .
For the birthweight data, we know the sex of each baby, so we might want to assess the effect of this on birthweight as well as age. A possible model for expected birthweight, , is
where is 0 if female, and 1 if male, so that represents how much heavier male babies tend to be compared to female ones, after correcting for age.
The following code is used to fit this model in R.
birthweight = read.table("birthweight.dat")
MODEL <- glm( weight ~ 1 + age + sex, family=gaussian, data=birthweight)
MODEL
The output of the model gives:
Call:  glm(formula = weight ~ 1 + age + sex, family = gaussian,
   data = birthweight)
Coefficients:
(Intercept)          age         sexM
    -1773.3        120.9        163.0
Degrees of Freedom: 23 Total (i.e. Null);  21 Residual
Null Deviance:      1830000
Residual Deviance: 658800       AIC: 321.4
Note that in R, for the Gaussian family, the residual deviance output from the fitted glm reports the residual sum of squares rather than the scaled deviance calculated above. With an estimated residual variance of , it follows that the residual deviance of this model is . Based on 21 degrees of freedom, then the critical value from and so we can conclude that the model is adequately describing the variability seen in the data at the 5% significance level.
A plot of the fitted model is presented in the figure below. Why are there two fitted lines in the plot?
For a given mother’s age, say 38, the expected birthweight for a female child is -1773.3 + 120.9*38 = 2820.9 g, but expected birthweight for a male child is -1773.3 + 120.9*38 + 163.0 = 2983.9 g. Therefore, the intercept of the line of best fit corresponding to male babies increase by 163.0 g above the fitted line for females.
What happens if a sex effect is omitted?
MODEL2 <- glm(weight ~ 1 + age, family = gaussian, data = birthweight) MODEL2
This returns:
Call:  glm(formula = weight ~ 1 + age, family = gaussian,
    data = birthweight)
Coefficients:
(Intercept)          age
    -1485.0        115.5
Degrees of Freedom: 23 Total (i.e. Null);  22 Residual
Null Deviance:      1830000
Residual Deviance: 816100       AIC: 324.5
With an estimated residual variance of , it follow that the residual deviance of this model is . Based on 22 degrees of freedom, then the critical value from and so we can conclude that the model is adequately describing the variability seen in the data at the 5% significance level.
This obtains the same fit as the plot presented in Chapter 1. Note the MLE of the regression coefficient on age differs from above.