Comparing supervised classification methods on small-group recovery in the two-group case: a Monte Carlo simulation study
Abstract
This study considered two-group supervised classification in which the group sizes were unequal. Under such imbalance, small-group recovery (accurate identification of cases in the smaller group) has been shown to be appreciably diminished. The present study used a Monte Carlo simulation to compare a number of novel and established classifiers under a variety of data conditions, including various group size ratios, degrees of group distinctiveness (separation), and sample sizes. The methods compared were linear and quadratic discriminant analysis (LDA and QDA), classification and regression trees (CART), random forests (RF), support vector machines (SVM), and Bayesian additive regression trees (BART). The simulation results indicated that RF demonstrated the most promising benefits to small-group recovery; BART demonstrated efficacy on all classification metrics when group sizes were equal.
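To make the notion of small-group recovery concrete, the following is a minimal sketch of one Monte Carlo cell. It is not the study's actual design: it assumes a single normally distributed predictor with unit variance in both groups, uses a hand-written univariate LDA rule with priors estimated from the training group sizes, and measures small-group recovery as the proportion of true small-group cases assigned to the small group. All function names and parameter values (`small_group_recovery`, `fit_lda_1d`, `simulate`, the sample size, proportions, and separation) are hypothetical choices made for illustration.

```python
import math
import random
import statistics


def small_group_recovery(y_true, y_pred, small_label=1):
    """Proportion of true small-group cases classified into the small group."""
    hits = [p == small_label for t, p in zip(y_true, y_pred) if t == small_label]
    return sum(hits) / len(hits)


def fit_lda_1d(x, y):
    """Fit a univariate two-group LDA rule; priors come from training group sizes."""
    groups = {k: [xi for xi, yi in zip(x, y) if yi == k] for k in (0, 1)}
    means = {k: statistics.fmean(v) for k, v in groups.items()}
    # Pooled within-group variance (LDA assumes a common variance).
    ss = sum((xi - means[k]) ** 2 for k, v in groups.items() for xi in v)
    pooled_var = ss / (len(x) - 2)
    log_priors = {k: math.log(len(v) / len(x)) for k, v in groups.items()}

    def score(xi, k):
        # Linear discriminant score for group k.
        return xi * means[k] / pooled_var - means[k] ** 2 / (2 * pooled_var) + log_priors[k]

    def predict(new_x):
        return [1 if score(xi, 1) > score(xi, 0) else 0 for xi in new_x]

    return predict


def simulate(n_total, small_prop, separation, n_reps, seed=0):
    """Mean small-group recovery over n_reps replications (assumed design)."""
    rng = random.Random(seed)
    n_small = round(n_total * small_prop)
    n_large = n_total - n_small

    def draw():
        # Large group (label 0) centered at 0, small group (label 1) at `separation`.
        x = [rng.gauss(0.0, 1.0) for _ in range(n_large)]
        x += [rng.gauss(separation, 1.0) for _ in range(n_small)]
        y = [0] * n_large + [1] * n_small
        return x, y

    recoveries = []
    for _ in range(n_reps):
        train_x, train_y = draw()
        test_x, test_y = draw()  # independent test sample from the same population
        predict = fit_lda_1d(train_x, train_y)
        recoveries.append(small_group_recovery(test_y, predict(test_x)))
    return statistics.fmean(recoveries)


if __name__ == "__main__":
    for prop in (0.5, 0.25, 0.1):
        print(prop, simulate(n_total=200, small_prop=prop, separation=1.0, n_reps=100))
```

Running the script prints the mean recovery for each group size ratio under these assumed conditions. Because the estimated prior for the smaller group shrinks as the imbalance grows, the decision threshold moves toward the small group's mean, which is one mechanism behind the diminished recovery the abstract describes.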