Understanding Nugget Variance in Spatial Models


What is the relationship between the units variance, the residual variance, and the average standard error of a difference as computed in ASREML?

I have fitted two models that differ only in whether the units variance is included; both model an AR1 x AR1 residual variance structure. In the model without the units variance, the residual variance = 1.6156. In the model including the units variance, the residual variance = 1.5251 and the units variance = 1.4386.

Based on the equations for V on pages 111 and 112 of the ASReml user manual, I would think that a diagonal element of V would be computed as 1.6156 in the case of no units and 1.5251 + 1.4386 = 2.9637 in the case of the inclusion of the units variance. If this were true, however, it would indicate that including the units variance increases the total residual variance, suggesting a poorer-fitting model. Nevertheless, including the units variance increases the model log likelihood from -10268.2 to -10163.2. Further, it slightly decreases the average standard error of a genetic prediction difference (in the PVS file) from 1.801 to 1.722. So it seems that including the units variance improves the model slightly, but I cannot understand how to reconcile that with the increased value of the diagonal elements of the V matrix.


You are comparing 2 Residual models.
The first is 1.62 C with LogL -10268
The second is 1.44 I + 1.52 C with LogL -10163 (an increase of 105, which is quite substantial).
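
To make the two residual structures concrete, here is a minimal Python sketch that builds V for a single line of four plots. It is an illustration only, not ASReml output: the helper names (ar1_corr, residual_v) are hypothetical, the one-dimensional layout is a simplification of AR1 x AR1, and the same rounded correlation of 0.96 is used for both models purely to show the structure (the fitted correlations need not be equal).

```python
def ar1_corr(n, rho):
    """n x n AR1 correlation matrix: entry (i, j) is rho ** |i - j|."""
    return [[rho ** abs(i - j) for j in range(n)] for i in range(n)]


def residual_v(n, rho, sigma2_ar, sigma2_units=0.0):
    """V = sigma2_units * I + sigma2_ar * C for n plots in one line.

    The units (nugget) variance is added to the diagonal only.
    """
    c = ar1_corr(n, rho)
    return [
        [sigma2_ar * c[i][j] + (sigma2_units if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]


# Rounded variance components from the reply; rho = 0.96 is illustrative.
v_no_units = residual_v(4, 0.96, 1.62)        # 1.62 C
v_units = residual_v(4, 0.96, 1.52, 1.44)     # 1.44 I + 1.52 C

for name, v in (("1.62 C", v_no_units), ("1.44 I + 1.52 C", v_units)):
    print(name)
    for row in v:
        print("  " + "  ".join(f"{x:6.3f}" for x in row))
```

In the second matrix the diagonal is 1.44 + 1.52 = 2.96, while the off-diagonal elements come from 1.52 C alone. The nugget raises the variance of each plot without raising the covariance between plots.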

I would not expect such a big change in the Error variance (1.62 cf 2.96) unless there was another random term which has low replication. Your .asr file shows there is such a term:

 entrynm.pop          9024   9024   7.01639       10.7007      41.82  -1 P
which is the other term that is soaking up the difference.

Let's think about the lag 1 and lag 2 covariances.
 Residual            AR=AutoR   110  0.966230      0.966230      71.95   0 U
 Residual            AR=AutoR    70  0.954875      0.954875      65.96   0 U
Under the second model, the lag 1 covariance is 1.52*0.96 (say) and the lag 2 covariance is 1.52*0.96*0.96.

Under the first model, I expect it would have a similar lag 1 covariance, suggesting a correlation parameter of about 0.90.
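
The arithmetic behind that estimate can be checked with a short Python sketch. It uses the rounded values from this reply (1.62, 1.52 and 0.96); the variable names are illustrative, and the lag 2 comparison at the end is an added consequence of the same assumption, not a fitted result.

```python
# Rounded values quoted in the reply.
sigma2_no_units = 1.62   # residual variance, model without units
sigma2_ar = 1.52         # correlated residual variance, model with units
rho_units = 0.96         # rounded autocorrelation, model with units

# Lag 1 and lag 2 covariances under the model with units.
lag1_units = sigma2_ar * rho_units
lag2_units = sigma2_ar * rho_units ** 2

# Assumption: the model without units reproduces the same lag 1 covariance.
# Its correlation parameter must then be lag1 / variance.
rho_implied = lag1_units / sigma2_no_units

# Lag 2 covariance that this implied correlation gives without units.
lag2_no_units = sigma2_no_units * rho_implied ** 2

print(f"lag 1 covariance with units:    {lag1_units:.4f}")
print(f"lag 2 covariance with units:    {lag2_units:.4f}")
print(f"implied rho without units:      {rho_implied:.4f}")
print(f"lag 2 covariance without units: {lag2_no_units:.4f}")
```

The implied correlation comes out at about 0.90. With that value the lag 2 covariance without units falls below the one from the model with units, so a single variance and correlation cannot match the variance, lag 1 and lag 2 all at once.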

So, definitely, the +units model is to be preferred.

This issue is important for foresters fitting a tree model and spatial variation. Failure to also include a nugget (units) term then inflates the 'genetic variance'.

REML is trying to model the correlation structure as best it can with the terms provided. When units is omitted, part goes into the correlated residual and part into the 'genetic' component.

1 November 2008
