Toward Transparent and Safe Clinical AI: A Framework for Explainable Large Language Models in Medicine

Authors

  • Nuraini Purwandar, Institut Bisnis dan Informatika Kosgoro 1957

Keywords

Large Language Models, Explainable AI, Clinical Decision Support, Hallucination Risk, AMPLIFY, Transparency in Medicine

Abstract

Large Language Models (LLMs) such as GPT-4 are increasingly being adopted in clinical settings to support tasks such as diagnosis, patient communication, and medical summarization. However, their black-box nature raises ethical and safety concerns, particularly regarding hallucinations, omissions, and lack of interpretability. This study evaluates the performance of LLMs on clinical tasks and proposes a transparent framework that integrates explainability and risk-assessment tools. We benchmarked LLM performance on the RJUA-SP dataset, focusing on diagnostic reasoning, therapy recommendation, and multi-turn dialogue. The CREOLA framework was applied to classify the likelihood of hallucinations and omissions in clinical outputs, and post-hoc explainability methods, especially AMPLIFY, were used to assess their impact on model interpretability and trust. Results show that GPT-4 achieves 63.63% accuracy in single-turn diagnostic QA and 18.18% in therapy recommendation, with reasoning performance peaking at 20.15% and multi-turn dialogue completeness below 16%. CREOLA flagged high-risk errors in over 90% of outputs, underscoring the need for human oversight. Integrating AMPLIFY improved reasoning accuracy by nearly 10 percentage points and enhanced clinician trust. These findings suggest that while LLMs offer valuable clinical support, they must be paired with transparent mechanisms to ensure safe deployment. This research contributes a multi-layered framework combining benchmarking, risk evaluation, and explainability strategies for responsible LLM use in high-risk healthcare domains.
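The single-turn accuracy figures reported in the abstract imply a scorer that compares model answers against gold labels. As a minimal illustrative sketch only: the field names, the exact-match criterion, and the toy data below are assumptions, not the paper's actual RJUA-SP evaluation protocol.

```python
# Hypothetical sketch of single-turn diagnostic QA scoring.
# The exact-match criterion is an assumption; the study's
# RJUA-SP protocol may use a different matching rule.

def accuracy(predictions, references):
    """Fraction of model answers that exactly match the reference
    diagnosis, after whitespace and case normalization."""
    matches = sum(p.strip().lower() == r.strip().lower()
                  for p, r in zip(predictions, references))
    return matches / len(references)

# Toy example with invented model outputs and gold labels.
preds = ["ureteral calculus", "cystitis", "benign prostatic hyperplasia"]
golds = ["ureteral calculus", "pyelonephritis", "benign prostatic hyperplasia"]
print(f"single-turn diagnostic accuracy: {accuracy(preds, golds):.2%}")
# → single-turn diagnostic accuracy: 66.67%
```

In practice a clinical benchmark would replace exact matching with expert adjudication or a concept-level comparison, since two differently worded diagnoses can be clinically equivalent.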

References

Amann, J., Blasimme, A., Vayena, E., Frey, D., & Madai, V. I. (2020). Explainability for Artificial Intelligence in Healthcare: A Multidisciplinary Perspective. BMC Medical Informatics and Decision Making, 20(1). https://doi.org/10.1186/s12911-020-01332-6

Antoniadi, A. M., Du, Y., Guendouz, Y., Wei, L., Mazo, C., Becker, B. A., & Mooney, C. (2021). Current Challenges and Future Opportunities for XAI in Machine Learning-Based Clinical Decision Support Systems: A Systematic Review. Applied Sciences, 11(11). https://doi.org/10.3390/app11115088

Arrieta, A. B., Díaz-Rodríguez, N., Ser, J. D., Bennetot, A., Tabik, S., Barbado, A., García, S., Gil-López, S., Molina, D., Benjamins, R., Chatila, R., & Herrera, F. (2020). Explainable Artificial Intelligence (XAI): Concepts, Taxonomies, Opportunities and Challenges Toward Responsible AI. Information Fusion, 58, 82–115. https://doi.org/10.1016/j.inffus.2019.12.012

Arun, N., Gaw, N., Singh, P., Chang, K., Aggarwal, M., Chen, B., Hoebel, K., Gupta, S., Patel, J., Gidwani, M., Adebayo, J., Li, M., & Kalpathy–Cramer, J. (2021). Assessing the Trustworthiness of Saliency Maps for Localizing Abnormalities in Medical Imaging. Radiology Artificial Intelligence, 3(6). https://doi.org/10.1148/ryai.2021200267

Bejger, S., & Elster, S. (2020). Artificial Intelligence in Economic Decision Making: How to Assure a Trust? Ekonomia I Prawo, 19(3). https://doi.org/10.12775/eip.2020.028

Brankovic, A., Cook, D., Rahman, J. S., Khanna, S., & Huang, W. (2024). Benchmarking the Most Popular XAI Used for Explaining Clinical Predictive Models: Untrustworthy but Could Be Useful. Health Informatics Journal, 30(4). https://doi.org/10.1177/14604582241304730

Dindorf, C., Ludwig, O., Simon, S. B., Becker, S., & Fröhlich, M. (2023). Machine Learning and Explainable Artificial Intelligence Using Counterfactual Explanations for Evaluating Posture Parameters. https://doi.org/10.20944/preprints202303.0510.v1

Falcon, R. M. G., Alcazar, R. M. U., Babaran, H. G., Caragay, B. D. B., Corpuz, C. A. A., Kho, M. E., Perez, A., & Isip-Tan, I. T. (2024). Exploring Filipino Medical Students’ Attitudes and Perceptions of Artificial Intelligence in Medical Education: A Mixed-Methods Study. Mededpublish, 14. https://doi.org/10.12688/mep.20590.1

Farah, L., Borget, I., Martelli, N., & Vallée, A. (2024). Suitability of the Current Health Technology Assessment of Innovative Artificial Intelligence-Based Medical Devices: Scoping Literature Review. Journal of Medical Internet Research, 26. https://doi.org/10.2196/51514

Fuhrman, J., Gorre, N., Hu, Q., Li, H., Naqa, I. E., & Giger, M. L. (2021). A Review of Explainable and Interpretable AI With Applications in COVID‐19 Imaging. Medical Physics, 49(1), 1–14. https://doi.org/10.1002/mp.15359

Gunning, D., & Aha, D. W. (2019). DARPA’s Explainable Artificial Intelligence Program. Ai Magazine, 40(2), 44–58. https://doi.org/10.1609/aimag.v40i2.2850

Hatherley, J., Sparrow, R., & Howard, M. (2022). The Virtues of Interpretable Medical Artificial Intelligence. Cambridge Quarterly of Healthcare Ethics, 1–10. https://doi.org/10.1017/s0963180122000305

Kaur, M., Khosla, R., & Siddiqui, M. H. (2024). Impact of Job Stress on Psychological Well-Being of Teachers. Int Res J Adv Engg MGT, 2(03), 504–515. https://doi.org/10.47392/irjaem.2024.0071

El Madi, I. A., Redjdal, A., Bouaud, J., & Séroussi, B. (2024). Exploring Explainable AI Techniques for Text Classification in Healthcare: A Scoping Review. https://doi.org/10.3233/shti240544

Markus, A. F., Kors, J. A., & Rijnbeek, P. R. (2021). The Role of Explainability in Creating Trustworthy Artificial Intelligence for Health Care: A Comprehensive Survey of the Terminology, Design Choices, and Evaluation Strategies. Journal of Biomedical Informatics, 113. https://doi.org/10.1016/j.jbi.2020.103655

Mohammad‐Rahimi, H., Ourang, S. A., Pourhoseingholi, M. A., Dianat, O., Dummer, P. M. H., & Nosrat, A. (2023). Validity and Reliability of Artificial Intelligence Chatbots as Public Sources of Information on Endodontics. International Endodontic Journal, 57(3), 305–314. https://doi.org/10.1111/iej.14014

Moradi, M., & Samwald, M. (2021). Post-Hoc Explanation of Black-Box Classifiers Using Confident Itemsets. Expert Systems With Applications, 165. https://doi.org/10.1016/j.eswa.2020.113941

Muddamsetty, S. M., Jahromi, M. N. S., & Moeslund, T. B. (2021). Expert Level Evaluations for Explainable AI (XAI) Methods in the Medical Domain (pp. 35–46). https://doi.org/10.1007/978-3-030-68796-0_3

Nava, C. F. G., Miranda-Filho, D. d B., Rodrigues, J. J. P. C., Alves, S., Bezerra, P. S., Barbosa, L. M., & Pinto, A. (2024). The Impact of Artificial Intelligence on Medicine: Applications, Challenges and Perspectives. International Journal of Science and Research Archive, 13(2), 3510–3514. https://doi.org/10.30574/ijsra.2024.13.2.2556

Oettl, F. C., Pareek, A., Winkler, P. W., Zsidai, B., Pruneski, J. A., Senorski, E. H., Kopf, S., Ley, C., Herbst, E., Oeding, J. F., Grassi, A., Hirschmann, M. T., Musahl, V., Samuelsson, K., Tischer, T., & Feldt, R. (2024). A Practical Guide to the Implementation of AI in Orthopaedic Research, Part 6: How to Evaluate the Performance of AI Research? Journal of Experimental Orthopaedics, 11(3). https://doi.org/10.1002/jeo2.12039

Okada, Y., Ning, Y., & Ong, M. E. H. (2023). Explainable Artificial Intelligence in Emergency Medicine: An Overview. Clinical and Experimental Emergency Medicine, 10(4), 354–362. https://doi.org/10.15441/ceem.23.145

Rasheed, K., Qayyum, A., Ghaly, M., Al‐Fuqaha, A., Razi, A., & Qadir, J. (2021). Explainable, Trustworthy, and Ethical Machine Learning for Healthcare: A Survey. https://doi.org/10.36227/techrxiv.14376179

Reyes, L. T., Knorst, J. K., Ortiz, F. R., Mendes, F. M., & Ardenghi, T. M. (2020). Pathways Influencing Dental Caries Increment Among Children: A Cohort Study. International Journal of Paediatric Dentistry, 31(3), 422–432. https://doi.org/10.1111/ipd.12730

Riedemann, L., Labonne, M., & Gilbert, S. (2024). The Path Forward for Large Language Models in Medicine Is Open. NPJ Digital Medicine, 7(1). https://doi.org/10.1038/s41746-024-01344-w

Rosoł, M., Gąsior, J. S., Łaba, J., Korzeniewski, K., & Młyńczak, M. (2023). Evaluation of the Performance of GPT-3.5 and GPT-4 on the Medical Final Examination. https://doi.org/10.1101/2023.06.04.23290939

Shobeiri, S. (2024). Enhancing Transparency in Healthcare Machine Learning Models Using Shap and Deeplift a Methodological Approach. Iraqi Journal of Information & Communications Technology, 7(2), 56–72. https://doi.org/10.31987/ijict.7.2.285

Singh, A., Sengupta, S., & Lakshminarayanan, V. (2020). Explainable Deep Learning Models in Medical Image Analysis. Journal of Imaging, 6(6). https://doi.org/10.3390/jimaging6060052

Vrdoljak, J., Boban, Z., Vilović, M., Kumrić, M., & Božić, J. (2024). A Review of Large Language Models in Medical Education, Clinical Decision Support, and Healthcare Administration. https://doi.org/10.20944/preprints202412.0185.v1

Waldock, W., Zhang, J., Guni, A., Nabeel, A., Darzi, A., & Ashrafian, H. (2024). The Accuracy and Capability of Artificial Intelligence Solutions in Health Care Examinations and Certificates: Systematic Review. https://doi.org/10.2196/preprints.56532

Williams, C. Y. K., Subramanian, C. R., Ali, S. S., Apolinario, M., Askin, E., Barish, P., Cheng, M., Deardorff, W. J., Donthi, N., Ganeshan, S., Huang, O., Kantor, M. A., Lai, A., Manchanda, A., Moore, K., Muniyappa, A., Nair, G., Patel, P. P., Santhosh, L., & Rosner, B. (2024). Physician- And Large Language Model-Generated Hospital Discharge Summaries: A Blinded, Comparative Quality and Safety Study. https://doi.org/10.1101/2024.09.29.24314562

Yeung, J. A., Kraljević, Ž., Luintel, A., Balston, A., Idowu, E., Dobson, R., & Teo, J. (2023). AI Chatbots Not Yet Ready for Clinical Use. Frontiers in Digital Health, 5. https://doi.org/10.3389/fdgth.2023.1161098

Published

2025-11-27

How to Cite

Purwandar, N. (2025). Toward Transparent and Safe Clinical AI: A Framework for Explainable Large Language Models in Medicine. Intellecta : Journal of Artificial Intelligence, 1(1), 55–64. Retrieved from https://journal.idscipub.com/index.php/intellecta/article/view/1206