Early Prediction of At Risk Students Using Minimal Data: A Machine Learning Framework for Higher Education

Authors

  • Hamsiah, Sekolah Tinggi Ilmu Ekonomi Sakti Alam Kerinci
  • Nita Adiyati, Universitas Cendekia Abditama
  • Rino Subekti, Institut Bisnis dan Informatika (IBI) Kosgoro 1957

DOI:

https://doi.org/10.61978/digitus.v3i2.953

Keywords:

Early Warning Systems, Academic Risk Prediction, Learning Analytics, Machine Learning, CatBoost, LMS Data, Student Retention

Abstract

Early identification of academically at-risk students is essential for timely intervention and improved retention in higher education. This study investigates how effectively pre-admission and early-semester LMS data can predict student risk using machine learning models. The objective is to assess whether limited, readily available data from the first four weeks of instruction can reliably support early warning systems. A supervised learning framework was applied to the Open University Learning Analytics Dataset (OULAD), with features derived from student demographics and early LMS activity logs. The models evaluated were Logistic Regression, XGBoost, and CatBoost, with time-based validation and SMOTE used to address class imbalance. Model performance was measured using ROC-AUC, F1 score, and recall. The CatBoost model achieved the best performance, with an F1 score of 0.770 and a ROC-AUC of 0.750, significantly outperforming the baseline models. Quiz submission behavior, login frequency, and pre-admission qualification level emerged as the most predictive features. Results also showed a steady week-by-week improvement in model accuracy, confirming the increasing value of LMS engagement data over time. These findings indicate that early-stage student data can be used effectively to predict academic risk, enabling institutions to act before major assessments take place. The study emphasizes the need for institutional readiness, ethical implementation, and inclusive practices when deploying predictive tools. Future research should expand the feature space and test cross-institutional generalizability to further refine early warning systems.
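As an illustration of the three metrics named in the abstract, the following is a minimal sketch in plain Python. It is not the authors' evaluation code: the function names, the toy labels and scores, and the 0.5 decision threshold are all assumptions made for this example, and a real study would more likely rely on an established library.

```python
from typing import Sequence, Tuple


def recall_and_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (recall, F1) for binary labels where 1 means 'at risk'."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    if precision + recall == 0:
        return recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    return recall, 2 * precision * recall / (precision + recall)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win (the Mann-Whitney formulation).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = at risk, 0 = not at risk.
labels = [1, 1, 1, 0, 0, 0]
risk_scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]

# Assumed threshold of 0.5 to turn scores into hard predictions.
predictions = [1 if s >= 0.5 else 0 for s in risk_scores]

recall, f1 = recall_and_f1(labels, predictions)
auc = roc_auc(labels, risk_scores)
print(f"recall={recall:.3f} f1={f1:.3f} roc_auc={auc:.3f}")
```

Recall and F1 depend on the chosen threshold, whereas ROC-AUC is computed from the ranking of the scores alone, which is one reason the three metrics are usually reported together.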

Published

2025-04-30

How to Cite

Hamsiah, Adiyati, N., & Subekti, R. (2025). Early Prediction of At Risk Students Using Minimal Data: A Machine Learning Framework for Higher Education. Digitus : Journal of Computer Science Applications, 3(2), 105–116. https://doi.org/10.61978/digitus.v3i2.953

Section

Articles