DRUG RESISTANCE PREDICTION IN MYCOBACTERIUM TUBERCULOSIS USING A HYBRID GRAPH NEURAL NETWORK–SENTENCE TRANSFORMER AND XGBOOST WITH GRAPHRAG-BASED EXPLAINABILITY
DOI:
https://doi.org/10.70248/jrsit.v3i4.3772
Abstract
This study aims to build a drug resistance prediction model for Mycobacterium tuberculosis that is both accurate and explainable by integrating a hybrid Graph Neural Network (GNN), a Sentence Transformer, XGBoost, and Graph Retrieval-Augmented Generation (GraphRAG). The method is a computational approach based on hybrid representation learning, using mutation data from the Tuberculosis Drug Resistance Database (TBDB), which comprises 49,328 mutation entries. The research stages are data preprocessing, construction of a biomedical knowledge graph, extraction of 64-dimensional structural embeddings with the GNN, extraction of 384-dimensional semantic embeddings with the Sentence Transformer, combination of the two feature sets into a 448-dimensional hybrid embedding, classification with XGBoost, and implementation of GraphRAG as the explainability mechanism. The results show that the Hybrid GNN–Sentence Transformer and XGBoost model achieves an accuracy of 98.85%, precision of 93.4%, recall of 95.7%, F1-score of 94.5%, and ROC-AUC of 0.998, outperforming models based on a single representation. Validation against the scientific literature and the WHO mutation catalogue yields a case-level accuracy of 95%, higher than the 86% obtained by the tabular model. In addition, GraphRAG produces interpretations grounded in biological evidence with a consistency rate of 79% and no under-risk errors, which improves the transparency and reliability of the predictions. The study concludes that integrating the structural representation of the graph with the semantic representation of text through a hybrid approach, supported by a GraphRAG-based explainability mechanism, is effective in improving the accuracy of drug resistance prediction in Mycobacterium tuberculosis and provides traceable, relevant explanations to support genomics-based clinical decision-making.
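The feature-fusion step and the reported F1-score can be illustrated with a minimal sketch. It assumes the hybrid embedding is formed by simple concatenation, which matches the stated sizes (64 + 384 = 448) but is not confirmed by the abstract. The function names `fuse_embeddings` and `f1_from` are hypothetical, and random numbers stand in for real GNN and Sentence Transformer outputs.

```python
import random

GNN_DIM = 64    # structural embedding size stated in the abstract
TEXT_DIM = 384  # semantic embedding size stated in the abstract


def fuse_embeddings(structural, semantic):
    """Concatenate a structural and a semantic vector (assumed fusion rule)."""
    if len(structural) != GNN_DIM or len(semantic) != TEXT_DIM:
        raise ValueError("unexpected embedding size")
    return list(structural) + list(semantic)


def f1_from(precision, recall):
    """F1-score as the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Placeholder vectors: random values, not real model outputs.
rng = random.Random(0)
structural = [rng.random() for _ in range(GNN_DIM)]
semantic = [rng.random() for _ in range(TEXT_DIM)]

hybrid = fuse_embeddings(structural, semantic)
print(len(hybrid))  # 448

# Consistency check on the reported metrics (precision 93.4%, recall 95.7%).
print(round(f1_from(0.934, 0.957) * 100, 1))  # 94.5
```

In a full pipeline, each 448-dimensional vector would be one input row for a classifier such as XGBoost; that step is omitted here because the abstract gives no training details. The second printed value shows that the reported F1-score of 94.5% agrees with the reported precision and recall.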