Publication Date




Embargo Period


Degree Type


Degree Name

Doctor of Philosophy (PHD)


Biostatistics (Medicine)

Date of Defense


First Committee Member

J. Sunil Rao

Second Committee Member

Hemant Ishwaran

Third Committee Member

Lily Wang

Fourth Committee Member

Daniel A. Sussman


Many practical problems involve prediction where the main interest lies at the subject or sub-population level. In such cases, substantial gains in prediction accuracy are possible by identifying the class to which a new subject belongs. The new subject is then potentially associated with the random effect of the same class in the training data, so that the method of mixed model prediction can be used to make the best prediction. We propose a new method, called classified mixed model prediction (CMMP), to achieve this goal. We develop CMMP for both prediction of mixed effects and prediction of future observations, and consider scenarios in which there may or may not be a “match” for the new subject among the training-data subjects. Even if an actual match does not exist, CMMP still helps improve prediction accuracy. We also extend the CMMP method to handle the situation where the group information is unknown. While CMMP is a significant step forward, it is not reasonable to assume that each subgroup is defined by exactly the same predictors (genomic profiles with varying effect sizes). Subgroups are likely to share some common predictors (genomic markers) while also having some distinct ones. We discuss a series of statistical methodologies that combine the common information across subgroups with subgroup-specific information. We demonstrate that prediction errors can be remarkably reduced compared with not borrowing strength at all.


Prediction of mixed effects; model-based classification; EMS algorithm