
Document Type: Open Access




As the number of applications to universities across America has increased in recent years, the ability to predict which applicants intend to enroll has become integral to the success of undergraduate institutions. Using data analysis to predict matriculation enables admissions departments to forecast enrollment and finances more accurately. We use machine learning techniques to estimate the likelihood of enrollment for each applicant in a given year. These enrollment probabilities are, in turn, used to calculate the aggregate incoming class profile, allowing an admissions department to tailor its admission decisions accordingly. The study experiments with several machine learning algorithms: logistic regression, random forest, and XGBoost models are all built and tested. We find that the XGBoost algorithm consistently outperforms the other algorithms in predicting enrollment. We use a random subset of the 2013-18 data for training and the remainder for validation. Academic strengths, financial offers, and applicant engagement all have predictive power for enrollment in the model. We conclude the study by applying the model to the accepted students from the 2019 applicant class. This approach serves as a proxy for predicting the profile of an incoming class: the model uses the application data to predict enrollment without knowledge of which applicants enrolled. We then compare the predicted class profile to the actual class profile to assess the model’s predictive accuracy.
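
The step from per-applicant enrollment probabilities to an aggregate class profile can be illustrated with a minimal sketch. It is not the study's implementation: the `Applicant` fields (`gpa`, `aid_offer`), the function name, and the choice of probability-weighted sums are assumptions made for illustration, and the probabilities are taken as given rather than produced by a fitted model.

```python
from dataclasses import dataclass


@dataclass
class Applicant:
    p_enroll: float   # predicted probability of enrollment (assumed to come from a model)
    gpa: float        # hypothetical academic attribute
    aid_offer: float  # hypothetical financial offer, in dollars


def expected_class_profile(applicants):
    """Aggregate per-applicant probabilities into an expected class profile.

    Expected class size is the sum of the probabilities. The mean GPA is
    approximated by a probability-weighted average (a ratio of expectations),
    and expected aid spend is the probability-weighted sum of offers.
    """
    expected_size = sum(a.p_enroll for a in applicants)
    if expected_size == 0:
        raise ValueError("all enrollment probabilities are zero")
    weighted_gpa = sum(a.p_enroll * a.gpa for a in applicants)
    return {
        "expected_size": expected_size,
        "expected_mean_gpa": weighted_gpa / expected_size,
        "expected_total_aid": sum(a.p_enroll * a.aid_offer for a in applicants),
    }
```

Comparing such a predicted profile with the realized one, as the abstract describes for the 2019 class, would then amount to comparing these aggregates against the same quantities computed over the students who actually enrolled.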

Predicting Incoming Union College Class Profiles using Machine Learning

