Toward Transparent and Safe Clinical AI: A Framework for Explainable Large Language Models in Medicine
Keywords:
Large Language Models, Explainable AI, Clinical Decision Support, Hallucination Risk, AMPLIFY, Transparency in Medicine

Abstract
Large Language Models (LLMs), such as GPT-4, are increasingly being adopted in clinical settings to support tasks such as diagnosis, patient communication, and medical summarization. However, their black-box nature raises ethical and safety concerns, particularly regarding hallucinations, omissions, and lack of interpretability. This study evaluates the performance of LLMs in clinical tasks and proposes a transparent framework that integrates explainability and risk-assessment tools. We benchmarked LLM performance on the RJUA-SP dataset, focusing on diagnostic reasoning, therapy recommendation, and multi-turn dialogue. The CREOLA framework was applied to classify the likelihood of hallucinations and omissions in clinical outputs, and post-hoc explainability methods, especially AMPLIFY, were used to assess their impact on model interpretability and trust. Results show that GPT-4 achieves 63.63% accuracy in single-turn diagnostic QA and 18.18% in therapy recommendation, with reasoning performance peaking at 20.15% and multi-turn dialogue completeness below 16%. CREOLA identified high-risk errors in over 90% of outputs, underscoring the need for human oversight. Integrating AMPLIFY improved reasoning accuracy by nearly 10 percentage points and enhanced clinician trust. These findings suggest that while LLMs offer valuable clinical support, they must be paired with transparent mechanisms to ensure safe deployment. This research contributes a multi-layered framework combining benchmarking, risk evaluation, and explainability strategies for responsible LLM use in high-risk healthcare domains.
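The evaluation pipeline described above combines task-level accuracy scoring with per-output risk flagging. The following minimal Python sketch illustrates that two-layer logic; all function names, field names, and toy data are illustrative assumptions, not the actual RJUA-SP benchmark or CREOLA/AMPLIFY APIs.

```python
# Illustrative sketch of the two evaluation layers described in the abstract:
# (1) exact-match accuracy over benchmark answers, and (2) a CREOLA-style
# rule that flags any output containing a hallucination or omission as
# high-risk and therefore requiring human review. Names are hypothetical.

def accuracy(predictions, gold):
    """Fraction of exact-match answers (cf. single-turn diagnostic QA)."""
    if not gold:
        return 0.0
    return sum(p == g for p, g in zip(predictions, gold)) / len(gold)

def classify_risk(hallucinated_claims, omitted_claims):
    """CREOLA-style label: any hallucination or omission -> 'high' risk."""
    return "high" if (hallucinated_claims or omitted_claims) else "low"

# Toy stand-ins for model answers on a clinical QA benchmark.
preds = ["cystitis", "urolithiasis", "prostatitis"]
gold = ["cystitis", "urolithiasis", "pyelonephritis"]
print(f"accuracy: {accuracy(preds, gold):.2%}")  # 2 of 3 exact matches

# Toy annotated outputs with hallucination/omission findings attached.
outputs = [
    {"hallucinations": ["fabricated lab value"], "omissions": []},
    {"hallucinations": [], "omissions": ["missed drug allergy"]},
    {"hallucinations": [], "omissions": []},
]
high_risk = sum(
    classify_risk(o["hallucinations"], o["omissions"]) == "high"
    for o in outputs
)
print(f"high-risk outputs needing review: {high_risk}/{len(outputs)}")
```

In this toy run, two of three outputs are flagged high-risk, mirroring the paper's finding that most raw outputs require human oversight before clinical use.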