Understanding Nugget Variance in Spatial Models
What is the relationship between the units variance, the residual variance,
and the average standard error of a difference as computed in ASREML?
I have fitted two models differing only in the inclusion or not of the
units variance, but both modeling an AR1 x AR1 residual variance structure.
In the model without the units variance, the residual variance = 1.6156.
In the model including the units variance, the residual variance = 1.5251 and the units variance = 1.4386.
Based on the equations for V on pages 111 and 112 of the ASReml user manual,
I would think that a diagonal element of V would be computed as 1.6156
in the case of no units and 1.5251 + 1.4386 = 2.9637 in the case of the
inclusion of the units variance. If this were true, however, it would
indicate that inclusion of the units variance increases the total residual
variance, suggesting that I have a poorer fitting model. Nevertheless, the
inclusion of the units increases the model log likelihood from -10268.2 to
-10163.2. Further, the inclusion of the units variance decreases slightly
the average standard error of a genetic prediction difference
(in the PVS file) from 1.801 to 1.722. So, it seems as if inclusion
of the units variance improves the model slightly, but I cannot understand
how to reconcile that with the increased value of the diagonal elements of the V matrix.
You are comparing 2 Residual models.
The first is 1.62 C with LogL -10268
The second is 1.44 I + 1.52 C with LogL -10163 (an increase of 105, which is quite substantial).
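To put the LogL increase on a familiar scale, here is a minimal sketch of a REML likelihood ratio test using the two log likelihoods quoted in the question. It assumes the two models have identical fixed effects (the question says they differ only in the units term), and it assumes the common convention of halving the chi-square(1) p-value because the units variance is tested on the boundary of its parameter space. The variable names are illustrative, not ASReml output.

```python
import math

# REML log likelihoods quoted in the question
logl_without_units = -10268.2
logl_with_units = -10163.2

# Likelihood ratio statistic for adding one variance parameter (units)
lrt = 2.0 * (logl_with_units - logl_without_units)

# Upper tail of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2))
p_chi2_1 = math.erfc(math.sqrt(lrt / 2.0))

# Assumption: 50:50 mixture of chi-square(0) and chi-square(1) for a
# variance component tested at zero, so the p-value is halved
p_boundary = 0.5 * p_chi2_1

print(f"LRT statistic: {lrt:.1f}")
print(f"p-value, chi-square 1 df: {p_chi2_1:.3g}")
print(f"p-value, 50:50 mixture:   {p_boundary:.3g}")
```

Either way of computing the p-value gives a vanishingly small number, consistent with calling the increase substantial.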
I would not expect such a big change in the Error variance (1.62 cf 2.96) unless there was another random term which has low replication.
Your .asr file shows there is such a term:
entrynm.pop 9024 9024 7.01639 10.7007 41.82 -1 P
which is the other term that is soaking up the difference.
Let's think about the lag 1 and lag 2 covariances.
Residual AR=AutoR 110 0.966230 0.966230 71.95 0 U
Residual AR=AutoR 70 0.954875 0.954875 65.96 0 U
Under the second model, the lag 1 covariance is 1.52*.96 (say) and the lag 2 covariance is 1.52*.96*.96.
Under the first model, I expect it would have a similar lag 1 covariance, suggesting a correlation parameter of about .90.
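The lag covariance argument can be checked with a few lines of arithmetic. The sketch below uses the rounded values from this reply (1.44, 1.52, 1.62 and a correlation of .96) and looks along one direction of the grid only. The function name and the idea of solving for the implied correlation by matching the lag 1 covariance are illustrative assumptions, not something ASReml reports.

```python
# Rounded variance parameters from the reply
sigma2_units = 1.44   # nugget (units) variance, second model
sigma2_ar = 1.52      # correlated residual variance, second model
rho = 0.96            # AR1 correlation (say), second model
sigma2_model1 = 1.62  # residual variance, first model (no units)


def ar1_nugget_cov(lag, nugget, sill, corr):
    """Covariance between two plots `lag` steps apart in one direction.

    The nugget contributes only at lag 0; the AR1 part decays as corr**lag.
    """
    cov = sill * corr ** lag
    if lag == 0:
        cov += nugget
    return cov


# Second model: nugget + AR1
model2 = [ar1_nugget_cov(k, sigma2_units, sigma2_ar, rho) for k in range(3)]

# First model: AR1 only. Assumption: pick the correlation that reproduces
# the second model's lag 1 covariance.
rho_implied = model2[1] / sigma2_model1
model1 = [ar1_nugget_cov(k, 0.0, sigma2_model1, rho_implied) for k in range(3)]

print(f"implied correlation without units: {rho_implied:.3f}")
for k in range(3):
    print(f"lag {k}: with units {model2[k]:.4f}, without units {model1[k]:.4f}")
```

Under these assumptions the implied correlation comes out close to .90, and the model without units then decays faster at lag 2 than the model with units, which is one way to see why a single AR1 term struggles to describe both the sharp drop at lag 0 and the slow decay afterwards.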
So, definitely, the +units model is to be preferred.
This issue is important for foresters fitting a tree model and spatial variation.
Failure to also include a nugget (units) term then inflates the 'genetic variance'.
REML is trying to model the correlation structure as best it can with the terms provided.
When units is omitted, part goes into the correlated residual and part into the 'genetic' component.
1 November 2008