Abstract
This chapter covers predictive modeling approaches in the context of health economics and outcomes research. A historical perspective is presented first, followed by heuristic explanations of models commonly used in machine learning and predictive modeling. An introduction to high-dimensional data analysis is provided, including reviews of related models and available software.
Predictive modeling is the art and science of crafting a model that can be used to make the most accurate prediction possible. This chapter focuses on predictive modeling, with a special emphasis on health economics and outcomes research (HEOR) applications. It introduces methods used to handle high-dimensional data. The chapter provides a brief summary of selected nonparametric regression techniques, including multivariate adaptive regression splines (MARS), projection pursuit regression, and wavelets. It then reviews commercially and freely available software packages. One of the most commonly used analytical tools is the linear regression model. Linear models, which express the outcome as a linear function of the regression coefficients, can be estimated using the least squares method, which under certain assumptions (the Gauss-Markov conditions) provides the best linear unbiased estimators. Tree-based methods repeatedly split the data into buckets, typically two at a time, with each split chosen to optimize a suitable criterion, such as the within-bucket sum of squared errors.
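As a rough illustration of the last two ideas, the sketch below fits a linear model by least squares and finds a single binary split on one predictor. It is a minimal example under stated assumptions, not the chapter's own implementation: the function names fit_ols and best_split, the synthetic data, and the choice of the sum of squared errors as the splitting criterion are all introduced here for illustration.

```python
import numpy as np

def fit_ols(X, y):
    """Least squares coefficients; an intercept column is prepended to X."""
    X1 = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return beta

def best_split(x, y):
    """Threshold on one predictor that minimizes the total within-bucket
    sum of squared errors (an assumed criterion for a regression tree)."""
    order = np.argsort(x)
    xs, ys = x[order], y[order]
    best_t, best_sse = None, np.inf
    for i in range(1, len(xs)):
        if xs[i] == xs[i - 1]:
            continue  # no threshold can separate tied values
        left, right = ys[:i], ys[i:]
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if sse < best_sse:
            # place the threshold midway between neighboring values
            best_t, best_sse = (xs[i - 1] + xs[i]) / 2, sse
    return best_t, best_sse

# Hypothetical synthetic data: one exactly linear outcome, one step-shaped outcome.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y_linear = 2.0 + 3.0 * x
y_step = np.array([10.0, 10.0, 10.0, 50.0, 50.0, 50.0])

beta = fit_ols(x.reshape(-1, 1), y_linear)   # intercept and slope
threshold, sse = best_split(x, y_step)       # single best binary split
```

A full tree would apply best_split recursively within each bucket and across all predictors; the sketch stops at one split to keep the optimization step visible.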