Data-Driven Approaches to Fraud Detection in Health Insurance Claims: A Systematic Review of Medical and Pharmaceutical Services

Sri Rukayah; Susi Ari Kristina

doi:10.61978/medicor.v4i2.1352

Authors

Sri Rukayah, Universitas Gadjah Mada
Susi Ari Kristina, Universitas Gadjah Mada

DOI:

https://doi.org/10.61978/medicor.v4i2.1352

Keywords:

health insurance fraud detection, machine learning approaches, anomaly detection, healthcare claims analytics, medical and pharmaceutical services, systematic literature review

Abstract

Fraud in health insurance claims continues to impose significant financial and operational burdens on healthcare systems, especially as the volume and complexity of claims increase. Conventional rule-based detection mechanisms, although widely used, have limited adaptability to evolving fraud patterns and high-dimensional data environments. This limitation has driven a shift toward data-driven analytical approaches capable of identifying suspicious patterns more effectively. This systematic review synthesizes peer-reviewed, open-access studies published between 2020 and 2025 that applied rule-based, supervised, unsupervised, or hybrid methods for fraud detection in health insurance claims. A comprehensive search across major databases yielded fourteen eligible studies representing diverse systems, datasets, and methodological designs. The findings indicate a clear transition from traditional rule-based systems to machine learning approaches, particularly in addressing challenges such as label scarcity, class imbalance, and complex fraud patterns. Most studies focused on integrated medical claims, where pharmaceutical fraud was embedded rather than analyzed independently, highlighting a gap in service-specific research. Significant heterogeneity was observed in fraud definitions, preprocessing techniques, labeling strategies, and evaluation metrics, limiting cross-study comparability and emphasizing the need for greater methodological transparency. Across the literature, data-driven approaches are consistently positioned as decision-support tools rather than definitive solutions, reinforcing their role in complementing expert judgment and regulatory oversight. Overall, effective implementation requires context-aware design, reliable labeling, and rigorous real-world validation. Future research should prioritize domain-specific analyses, particularly in pharmaceutical fraud, and improve transparency to support scalable and responsible deployment.

Author Biography

Susi Ari Kristina, Universitas Gadjah Mada

Prof. Dr. Susi Ari Kristina, MPH, Pharmacist, is a researcher and lecturer in the Management and Community Pharmacy Division, Pharmaceutics Department, Faculty of Pharmacy, Gadjah Mada University. She completed her doctoral degree in Social, Economic, and Administrative Pharmacy at Mahidol University, Thailand, in 2015. She teaches social pharmacy, pharmaceutical management, and pharmacy informatics in the undergraduate and graduate curricula. Her research focuses on public health pharmacy issues, ranging from cost-of-illness studies to pharmacy practice and pharmacy education. Her articles have been published in international peer-reviewed journals on topics including the burden of disease in Indonesia and other Asian countries, the availability and price of essential medicines, and improving access to controlled medicines in Indonesia. Her interests also include studies of patient satisfaction, patient preferences, and willingness to pay for new pharmacy services.


Medicor: Journal of Health Informatics and Health Policy