An Investigation of Missing Data Methods for Classification Trees (with Extension to Logistic Regression)

Yufeng Ding
Moody's

Classification trees are a supervised learning method for classification problems in which the response variable is categorical, with each category representing one target class. Classification tree algorithms build models (classification trees) on training data and apply those models to testing data. Ideally, all of the data points in the training data and all of the independent variables in the testing data are observed. In reality, those values can be missing (unobserved), and missing data is in fact a fairly common problem. Classification tree algorithms use many different methods to handle missing data in the predictors, but few studies have compared the appropriateness and performance of these methods. This research provides both analytic and Monte Carlo evidence regarding the effectiveness of six popular missing data methods for classification trees applied to binary response data. We recommend the best method to use in various situations when clear differences occur. We also show that, in the context of classification trees (with extension to logistic regression), the most helpful criterion for distinguishing between missing data methods is the relationship between the missingness and the dependent variable, rather than the standard missingness classification of Rubin (1976) and Little and Rubin (2002): missing completely at random (MCAR), missing at random (MAR), and not missing at random (NMAR).
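
To make the final criterion concrete, the following minimal sketch simulates a binary response with one predictor and compares two missingness settings: one in which missingness is unrelated to the response, and one in which it depends on the response. It is an illustration only, not the simulation design used in this research. The function names, sample size, and missingness probabilities are assumptions chosen for the example. Note that if the response is always observed, the second setting would typically still be labeled MAR under the standard taxonomy, which is why that label alone does not reveal whether missingness is informative about the response.

```python
import random


def simulate(n, p_miss_y0, p_miss_y1, seed=0):
    """Simulate (x, y) pairs with a binary response y and one predictor x.

    x is set to None (missing) with probability p_miss_y0 when y == 0
    and p_miss_y1 when y == 1. All settings here are illustrative.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        y = 1 if rng.random() < 0.5 else 0
        # The predictor is shifted by the class, so it carries signal about y.
        x = rng.gauss(1.0 if y == 1 else 0.0, 1.0)
        p_miss = p_miss_y1 if y == 1 else p_miss_y0
        if rng.random() < p_miss:
            x = None
        rows.append((x, y))
    return rows


def response_rate_by_missingness(rows):
    """Return (P(y=1 | x missing), P(y=1 | x observed)) estimated from rows."""
    y_missing = [y for x, y in rows if x is None]
    y_observed = [y for x, y in rows if x is not None]
    return (sum(y_missing) / len(y_missing),
            sum(y_observed) / len(y_observed))


# Missingness unrelated to the response: same probability in both classes.
unrelated = simulate(20000, p_miss_y0=0.3, p_miss_y1=0.3)
# Missingness related to the response: class 1 is missing far more often.
related = simulate(20000, p_miss_y0=0.1, p_miss_y1=0.5)

for label, rows in [("unrelated", unrelated), ("related", related)]:
    rate_missing, rate_observed = response_rate_by_missingness(rows)
    print(f"{label}: P(y=1 | missing)={rate_missing:.3f}, "
          f"P(y=1 | observed)={rate_observed:.3f}")
```

When the two estimated rates differ substantially, the fact that a value is missing is itself predictive of the response, so a method that lets the tree split on missingness can exploit information that a method discarding or imputing over it cannot.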