Publication Date




Embargo Period


Degree Type


Degree Name

Doctor of Philosophy (PHD)


Public Health Sciences (Medicine)

Date of Defense


First Committee Member

Hemant Ishwaran

Second Committee Member

J. Sunil Rao

Third Committee Member

Daniel Feaster

Fourth Committee Member

Wei Sun


Estimating individual treatment effects from observational data is complicated by confounding and selection bias. A useful inferential framework for addressing this is the counterfactual model, which takes the hypothetical stance of asking what would have happened had an individual received both treatments. Using random forests (RF) within the counterfactual framework, I estimate individual treatment effects by directly modeling the response. This thesis consists of five chapters. Chapter 1 reviews the methodology of causal inference and provides mathematical notation. The major approaches reviewed are the potential outcome approach, the graphical approach, and the counterfactual approach. Chapter 2 discusses the assumptions of the counterfactual approach. P-values are useful in causal inference, but caution must be taken whenever they are used. Sections 2.3 and 2.4 propose machine learning methods as alternatives to p-values and to checking the proportional hazards assumption in survival analysis. These two sections are more general in content and extend beyond the scope of the counterfactual approach. Chapter 3 describes six random forest methods for estimating individual treatment effects under the counterfactual framework, and Section 3.6 discusses model consistency and convergence of random forests. Chapter 4 demonstrates the performance of these methods in complex simulations and shows how the most appropriate method is applied to a real dataset with a continuous outcome. Chapter 5 addresses causal inference in survival analysis of ischemic cardiomyopathy. Treatment effect is viewed as a dynamic causal procedure. New random forest methods are proposed in this chapter to assess individual therapy overlap. These methods have the unique feature of being able to incorporate external expert knowledge either in a fully supervised way (i.e., we have a strong belief that the knowledge is correct) or in a minimally supervised fashion (i.e., the knowledge is not considered gold standard).


Causal Inference; Random Forests; Machine Learning; Survival; Individual Treatment Effect; Observational Data

Available for download on Thursday, November 14, 2019