Analysis of first-year university student dropout through machine learning models: A comparison between universities

Diego Opazo, Sebastián Moreno, Eduardo Álvarez-Miranda, Jordi Pereira

Research output: Contribution to journal › Article › peer-review

19 Scopus citations

Abstract

Student dropout, defined as the abandonment of a higher education program before obtaining the degree without reincorporation, is a problem that affects every higher education institution in the world. This study applies machine learning models to data from two Chilean universities to predict dropout among enrolled first-year engineering students and to analyze the variables that affect the probability of dropout. The results show that fitting a separate model per university is better than combining the datasets into a single dataset. Moreover, among the eight machine learning models tested on the datasets, gradient-boosting decision trees perform best. Further analyses of the interpretative models show that a higher score on almost any university entrance test decreases the probability of dropout, the most important variable being the mathematics test. One exception is the language test, where a higher score increases the probability of dropout.

Original language: English
Article number: 2599
Journal: Mathematics
Volume: 9
Issue number: 20
DOIs
State: Published - 2 Oct 2021
Externally published: Yes

Keywords

  • First-year student dropout
  • Machine learning
  • Universities
