Classification of Health Indicators for Diabetes Mellitus Prediction Using a TabTransformer Model on Clinical Tabular Data
DOI:
https://doi.org/10.65126/jocosir.v2i2.54
Keywords:
Diabetes Mellitus, TabTransformer, Deep Learning, Clinical Tabular Data, Machine Learning, Classification
Abstract
Diabetes mellitus is a non-communicable disease whose global prevalence continues to rise, reducing quality of life and imposing a long-term economic burden; data-driven early detection is therefore crucial for preventing clinical complications. This study develops a diabetes prediction model based on the TabTransformer architecture, using a clinical dataset from Kaggle that contains 100,000 patient profiles with more than 35 numerical and categorical attributes. The research stages comprise preprocessing to remove features that could leak the target, separation of target and features, normalization of numerical features, and embedding of categorical features. The TabTransformer model performs binary classification of the target diagnosed_diabetes, using a self-attention mechanism to capture latent interactions between tabular features, and is evaluated with accuracy, precision, recall, F1-score, and ROC AUC. The model achieves competitive performance, with an accuracy of 82.55%, an F1-score of 0.8527 for the diabetes class, and a ROC AUC of 0.9009, indicating that it discriminates well between diabetic and non-diabetic patients. These results indicate that the TabTransformer architecture is effective for large-scale clinical tabular data and merits consideration in artificial intelligence-based medical decision support systems for predicting chronic diseases, particularly diabetes mellitus.
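The evaluation metrics named in the abstract can be illustrated with a minimal sketch in plain Python. This is not the authors' code: the function names `binary_metrics` and `roc_auc` are hypothetical, and the labels and scores are made-up toy values used only to show how the metrics are computed for a binary target such as diagnosed_diabetes.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for the positive class (label 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    accuracy = (tp + tn) / len(pairs)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the fraction of (positive, negative) pairs in which the
    positive example receives the higher score; ties count as half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (made-up values, not results from the study).
y_true = [1, 1, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.7, 0.2, 0.1]  # predicted P(diabetes)
y_pred = [1 if s >= 0.5 else 0 for s in scores]    # assumed 0.5 threshold

metrics = binary_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

Accuracy, precision, recall, and F1 depend on the chosen decision threshold (0.5 is assumed here), whereas ROC AUC is computed from the raw scores and is threshold-independent, which is why the two kinds of metric are usually reported together.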
License
Copyright (c) 2025 Al Khaidar, Sri Kurnia

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
