S1 | = | ß1 | = | 1 | ß1 | + | 0 | ß2 | + | 0 | ß3 | + | 0 | ß4 |
S2 | = | ß1 | = | 1 | ß1 | + | 0 | ß2 | + | 0 | ß3 | + | 0 | ß4 |
S3 | = | ß2 | = | 0 | ß1 | + | 1 | ß2 | + | 0 | ß3 | + | 0 | ß4 |
S4 | = | ß2 | = | 0 | ß1 | + | 1 | ß2 | + | 0 | ß3 | + | 0 | ß4 |
r1 | = | ß3 | = | 0 | ß1 | + | 0 | ß2 | + | 1 | ß3 | + | 0 | ß4 |
r2 | = | ß4 | = | 0 | ß1 | + | 0 | ß2 | + | 0 | ß3 | + | 1 | ß4 |
r3 | = | ß4 | = | 0 | ß1 | + | 0 | ß2 | + | 0 | ß3 | + | 1 | ß4 |
r4 | = | ß4 | = | 0 | ß1 | + | 0 | ß2 | + | 0 | ß3 | + | 1 | ß4 |
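The table above can be read as a matrix product: each row of coefficients is one row of the design matrix, and multiplying it by the vector of ß values gives the real parameters. The following is a minimal sketch of that reading, ignoring any link function. The names `DESIGN_MATRIX`, `linear_predictor`, and the numeric ß values are hypothetical and chosen only for illustration.

```python
# Design matrix read off the table above: one row per real parameter,
# one column per beta parameter.
REAL_NAMES = ["S1", "S2", "S3", "S4", "r1", "r2", "r3", "r4"]
DESIGN_MATRIX = [
    [1, 0, 0, 0],  # S1 = beta1
    [1, 0, 0, 0],  # S2 = beta1
    [0, 1, 0, 0],  # S3 = beta2
    [0, 1, 0, 0],  # S4 = beta2
    [0, 0, 1, 0],  # r1 = beta3
    [0, 0, 0, 1],  # r2 = beta4
    [0, 0, 0, 1],  # r3 = beta4
    [0, 0, 0, 1],  # r4 = beta4
]


def linear_predictor(design_matrix, betas):
    """Return the matrix-vector product X * beta, one value per matrix row."""
    if any(len(row) != len(betas) for row in design_matrix):
        raise ValueError("each row needs one coefficient per beta parameter")
    return [sum(x * b for x, b in zip(row, betas)) for row in design_matrix]


# Hypothetical beta values, for illustration only (not estimates from any data).
example_betas = [0.8, 0.6, 0.3, 0.2]
real_values = dict(zip(REAL_NAMES, linear_predictor(DESIGN_MATRIX, example_betas)))

for name in REAL_NAMES:
    print(name, real_values[name])
```

Eight real parameters are driven by only four ß values here, which is the sense in which the design matrix constrains the model.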
A design matrix must have as many rows as there are parameters in the PIM. If you wish to impose a design matrix structure only on the survival rates, and not on the reporting rates, there will be a one-to-one correspondence between real and ß parameters for the reporting rates. Consequently, an identity matrix is used as a pass-through filter for the reporting component of the design matrix. Let us further contrive to have our model express our biological insight regarding the effect of a group covariate upon the real parameters of interest:
Three things happen with this use of a design matrix.
This invocation of what I called an ultrastructure yesterday is of considerable power in thorough data analysis. However, the power of this tool comes at a price. That price is the necessity of another transformation through which to pass our struggling real parameters. This hideous creature goes by the name of link function.
The link function is a mathematical contrivance. Particularly under the snow-depth scenario described above, we are taking an unbounded independent variable and applying it to the estimation of a bounded parameter (survival, which resides in the interval (0,1)). Have you ever tried to perform such a task? It requires special talents, namely some type of transformation, to enforce the boundedness constraint. The most prevalent form of this is logistic regression; hence a common link function is the logit link. The software we are using employs a sine (sin) link function, which in our experience has been adequate under all circumstances.
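To see how a link function enforces the bounds, here is a minimal sketch of the two back-transformations mentioned above, applied to a linear function of snow depth. The sine form `(sin(eta) + 1) / 2` is an assumption about how the sin link is defined, since these notes do not give its formula; the intercept, slope, and depth values are hypothetical.

```python
import math


def logit_inverse(eta):
    """Map an unbounded value eta into [0, 1] with the inverse logit."""
    # Two algebraically equal forms are used to avoid overflow in exp().
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    e = math.exp(eta)
    return e / (1.0 + e)


def sin_inverse(eta):
    """Map an unbounded value eta into [0, 1] with a sine back-transform.

    Assumed form: (sin(eta) + 1) / 2.
    """
    return (math.sin(eta) + 1.0) / 2.0


# Hypothetical coefficients: survival declining with snow depth.
intercept, slope = 1.5, -0.04
for depth in (0, 20, 40, 80):
    eta = intercept + slope * depth  # unbounded linear predictor
    print(depth, round(logit_inverse(eta), 3), round(sin_inverse(eta), 3))
```

Whatever value the linear predictor takes, both functions return a number between 0 and 1. Note that the sine form is periodic rather than monotone, so the two links need not agree over a wide covariate range.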
Maximum likelihood estimation actually takes place in this transformed parameter space. The software produces estimates of what it describes as ß-parameters, which are equal in number to the columns of the design matrix. These estimates are not constrained to the interval (0,1). However, the software also performs the back-transformation (the inverse of the link function) to report estimates and standard errors on the "biological" or real parameter scale.
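As an illustration of what such a back-transformation involves, the sketch below converts an estimate and its standard error from the ß scale to the real scale for a logit link, using the delta method (a first-order approximation). Whether the software computes its reported standard errors in exactly this way is an assumption; the function name and the input values are hypothetical.

```python
import math


def back_transform_logit(eta_hat, se_eta):
    """Return (estimate, standard error) on the real scale for a logit link.

    eta_hat: estimate on the transformed (beta) scale, unbounded.
    se_eta:  standard error of eta_hat on that same scale.
    """
    # Inverse logit, written in two forms to avoid overflow in exp().
    if eta_hat >= 0:
        theta = 1.0 / (1.0 + math.exp(-eta_hat))
    else:
        e = math.exp(eta_hat)
        theta = e / (1.0 + e)
    # Delta method: SE(theta) is approximately |d theta / d eta| * SE(eta),
    # and for the logit link d theta / d eta = theta * (1 - theta).
    se_theta = theta * (1.0 - theta) * se_eta
    return theta, se_theta


# Hypothetical estimate on the beta scale, for illustration only.
theta, se_theta = back_transform_logit(1.2, 0.35)
print(round(theta, 3), round(se_theta, 3))
```

The estimate on the ß scale can be any real number, while the back-transformed value always falls between 0 and 1.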
Our advice to you is simple: stay away from link function alteration. There are no biological reasons for altering elements that are mathematical conveniences. Explorations on your part will reveal that different parameter estimates, AIC values, and perhaps model rankings result from applying different link functions. This is not justification for such explorations.
With those comments as background, let's take a tour of design matrices from a visual as well as a numerical point of view, to discover more models we can construct within any of the sampling design frameworks described yesterday.