Penalized MLE (PMLE)51-53 is a general technique for shrinking (stabilizing) regression fits. Instead of maximizing the log-likelihood, PMLE maximizes a penalized log-likelihood, which is the sum of the ordinal model log-likelihood and a penalty, resulting in

$$\log L - \frac{1}{2}\lambda \sum_{i=1}^{p} (s_i \beta_i)^2. \tag{12}$$
Here $\lambda$ is the penalty factor and $s_1, s_2, \ldots, s_p$ are scale factors chosen to make $s_i \beta_i$ unitless. Most authors standardize the data first and do not have scale factors in the equation,51 but equation (12) has the advantage of allowing estimation of $\beta$ on the original scale of the data. The usual methods (for example, Newton-Raphson) are used to maximize equation (12). The usual default values for $s$ are the sample standard deviations of the columns of the design matrix, but special consideration has to be given to dummy variables,52 which gives rise to a more general form of the penalized log-likelihood,

$$\log L - \frac{1}{2}\lambda \beta' P \beta, \tag{13}$$
where P is a penalty matrix. Rows and columns of P can easily be set to zero for parameters for which no shrinkage is desired.52,53
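The relationship between the two penalty forms can be checked numerically. The following is a minimal Python sketch, not part of lrm or pentrace: it assumes that `beta` holds only the penalized slopes (no intercept), and the design columns, coefficients, and penalty factor are made-up numbers used purely for illustration. It shows that the penalty in equation (13) equals the one in equation (12) when P is diagonal with entries $s_i^2$, and that zeroing a row and column of P removes the shrinkage on that parameter.

```python
from statistics import stdev


def penalty_scaled(beta, scales, lam):
    """Penalty term of equation (12): (1/2) * lam * sum_i (s_i * beta_i)^2."""
    return 0.5 * lam * sum((s * b) ** 2 for s, b in zip(scales, beta))


def penalty_quadratic(beta, P, lam):
    """Penalty term of equation (13): (1/2) * lam * beta' P beta."""
    n = len(beta)
    quad = sum(beta[i] * P[i][j] * beta[j] for i in range(n) for j in range(n))
    return 0.5 * lam * quad


def diagonal_penalty_matrix(scales):
    """P = diag(s_i^2), the choice under which (13) reduces to (12)."""
    n = len(scales)
    return [[scales[i] ** 2 if i == j else 0.0 for j in range(n)]
            for i in range(n)]


def unpenalize(P, k):
    """Copy of P with row and column k set to zero (no shrinkage for k)."""
    return [[0.0 if (i == k or j == k) else v for j, v in enumerate(row)]
            for i, row in enumerate(P)]


# Hypothetical design-matrix columns (illustrative values only).
columns = [[1.0, 2.0, 4.0, 7.0], [10.0, 30.0, 20.0, 60.0]]
scales = [stdev(col) for col in columns]  # default s_i: sample std. deviations
beta = [0.8, -0.05]                       # hypothetical slope estimates
lam = 2.0                                 # hypothetical penalty factor

P = diagonal_penalty_matrix(scales)
pen12 = penalty_scaled(beta, scales, lam)
pen13 = penalty_quadratic(beta, P, lam)
# Same penalty, but with the first parameter exempt from shrinkage.
pen13_free = penalty_quadratic(beta, unpenalize(P, 0), lam)
```

With the first row and column zeroed, only the second coefficient contributes to the penalty, so `pen13_free` is smaller than `pen13`.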
The main problem in using PMLE is the choice of $\lambda$. Many authors use cross-validation to solve for the $\lambda$ which optimizes an unbiased estimate of predictive accuracy, but it is easy to show that one must use a huge number of data splits to get a precise estimate of the optimum $\lambda$. A faster and usually more reliable strategy, based on findings from a small number of simulation studies, is to choose the $\lambda$ which maximizes the 'effective' AIC. Gray (Eq. 2-9)53 and others show how to compute the 'effective d.f.' in this situation (that is, higher $\lambda$ causes more shrinkage, which lowers the effective d.f.). The effective AIC is

$$\text{LR}\ \chi^{2} - 2 \times \text{effective d.f.},$$

where LR $\chi^2$ is the likelihood ratio $\chi^2$ for the penalized model, but ignoring the penalty function.
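The effect of the penalty factor on the effective d.f. can be illustrated with a small Python sketch. It rests on assumptions not spelled out in the text: the effective d.f. is taken to have the commonly used trace form, trace[I (I + λP)^{-1}] with I the unpenalized information matrix, and both I and P are assumed diagonal so that the trace reduces to a simple sum. The information values below are hypothetical, and the function names are illustrative rather than part of any library.

```python
def effective_df(info_diag, penalty_diag, lam):
    """trace[I (I + lam * P)^{-1}] when I and P are both diagonal.

    Assumes every information value is positive.
    """
    return sum(i / (i + lam * p) for i, p in zip(info_diag, penalty_diag))


def effective_aic(lr_chi2, eff_df):
    """Effective AIC: LR chi-square minus twice the effective d.f."""
    return lr_chi2 - 2.0 * eff_df


# Hypothetical diagonal information values for three parameters.
info_diag = [40.0, 12.0, 3.0]
# The third parameter is left unpenalized (zero entry in P).
penalty_diag = [1.0, 1.0, 0.0]

lambdas = (0.0, 1.0, 10.0, 100.0)
df_by_lambda = [effective_df(info_diag, penalty_diag, lam) for lam in lambdas]
```

At λ = 0 the effective d.f. equals the number of parameters, and it decreases as λ grows, approaching the number of unpenalized parameters. In practice the LR χ² also changes with λ, so both quantities would be recomputed from each penalized fit before comparing effective AIC values.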
The lrm function will do PMLE, and a separate function called pentrace searches for the optimum $\lambda$ based on effective AIC once the analyst specifies a vector of $\lambda$s to try. pentrace can also allow for differing $\lambda$ for different types of terms in the model. Here we want to do a grid search to determine the optimum penalty for simple main effect (non-interaction) terms and the penalty for interaction terms, most of which are terms interacting with cohort to allow for unequal slopes. The following code uses pentrace on the full extended CR model fit to find the optimum penalty factors. All combinations of simple and interaction $\lambda$s for which the interaction penalty $\geq$ the penalty for the simple parameters are examined. The range of penalty factors to try for each type of parameter was found by computing effective AIC in a trial-and-error process.
pentrace(full, list(simple = c(0, .025, .05, .075, .1), interaction = c(0, 10, 50, 100, 125, 150)))
simple interaction df aic
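The set of penalty combinations examined in such a search can be enumerated directly. The Python sketch below only reproduces the grid described in the text (every pair in which the interaction penalty is at least as large as the simple penalty); it does not fit any model and is not how pentrace itself is implemented.

```python
from itertools import product

# Penalty factors taken from the pentrace call above.
simple = [0, 0.025, 0.05, 0.075, 0.1]
interaction = [0, 10, 50, 100, 125, 150]

# Keep only pairs where the interaction penalty >= the simple penalty.
grid = [(s, i) for s, i in product(simple, interaction) if i >= s]
```

Of the 30 possible pairs, 26 satisfy the constraint; the four excluded pairs are those combining a nonzero simple penalty with a zero interaction penalty.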