Model Predictive Analysis of Performance in Training and Course Institutions Using Naive Bayes and K-Means Clustering
DOI: https://doi.org/10.65126/jocosir.v3i1.68
Keywords: Predictive Analysis, Naive Bayes, K-Means, Institutional Performance, Data Mining
Abstract
The performance of course and training institutions (LKP) is a crucial factor in the quality of non-formal education in Indonesia. Performance is currently assessed manually using the assessment instrument of the National Accreditation Board for Non-Formal Education (BAN-PNF), a process that is time-consuming and prone to subjectivity. This research aims to develop a predictive analysis model for the performance of course and training institutions by combining the Naive Bayes and K-Means Clustering methods. K-Means Clustering groups institutions with similar characteristics across key variables such as trainers, infrastructure, curriculum, management, and graduate outcomes. The resulting cluster assignments are then used as additional features for a Naive Bayes classification model that predicts the performance category (high, medium, or low). Tests on data from 150 institutions showed a predictive accuracy of 89.2%, with three main clusters representing high-, medium-, and low-performing institutions. The model has the potential to become a data-driven tool that lets governments and institutions evaluate performance quickly and objectively, and that adapts to changes in the training data.
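The two-stage idea described in the abstract (cluster first, then feed the cluster assignment to Naive Bayes as an extra feature) can be illustrated with a minimal sketch. It is not the authors' implementation. The data are synthetic and purely hypothetical: five scores per institution standing in for trainers, infrastructure, curriculum, management, and graduate outcomes. The function names `nearest`, `kmeans`, `fit_nb`, and `predict_nb` are illustrative. The abstract does not say how the cluster feature is encoded, so the sketch assumes a Laplace-smoothed categorical feature alongside Gaussian likelihoods for the scores.

```python
import math
import random

def nearest(point, centroids):
    """Index of the centroid closest to `point` (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda c: math.dist(point, centroids[c]))

def kmeans(points, k, iters=100, seed=0):
    """Lloyd's algorithm: returns (centroids, cluster index per point)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        labels = [nearest(p, centroids) for p in points]
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            # Keep the old centroid if a cluster happens to be empty.
            updated.append([sum(col) / len(members) for col in zip(*members)]
                           if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return centroids, [nearest(p, centroids) for p in points]

def fit_nb(X, clusters, y, k):
    """Naive Bayes: Gaussian likelihoods for the scores plus a
    Laplace-smoothed categorical likelihood for the cluster index
    (the encoding of the cluster feature is an assumption)."""
    model = {}
    for cls in sorted(set(y)):
        rows = [x for x, t in zip(X, y) if t == cls]
        own = [c for c, t in zip(clusters, y) if t == cls]
        means = [sum(col) / len(rows) for col in zip(*rows)]
        variances = [sum((v - m) ** 2 for v in col) / len(rows) + 1e-6
                     for col, m in zip(zip(*rows), means)]
        cluster_probs = [(own.count(c) + 1) / (len(own) + k) for c in range(k)]
        model[cls] = (len(rows) / len(y), means, variances, cluster_probs)
    return model

def predict_nb(model, x, cluster):
    """Return the class with the highest log-posterior."""
    def log_posterior(params):
        prior, means, variances, cluster_probs = params
        total = math.log(prior) + math.log(cluster_probs[cluster])
        for v, m, s2 in zip(x, means, variances):
            total += -0.5 * math.log(2 * math.pi * s2) - (v - m) ** 2 / (2 * s2)
        return total
    return max(model, key=lambda cls: log_posterior(model[cls]))

# Synthetic, hypothetical data: 50 institutions per performance category,
# five scores each, drawn around an assumed category centre.
rng = random.Random(42)
centres = {"high": 80.0, "medium": 60.0, "low": 40.0}
X, y = [], []
for label, centre in centres.items():
    for _ in range(50):
        X.append([rng.gauss(centre, 5.0) for _ in range(5)])
        y.append(label)

# Hold out every fifth institution for testing.
test_idx = set(range(0, len(X), 5))
X_train = [x for i, x in enumerate(X) if i not in test_idx]
y_train = [t for i, t in enumerate(y) if i not in test_idx]
X_test = [x for i, x in enumerate(X) if i in test_idx]
y_test = [t for i, t in enumerate(y) if i in test_idx]

K = 3
centroids, train_clusters = kmeans(X_train, K)          # stage 1: clustering
model = fit_nb(X_train, train_clusters, y_train, K)     # stage 2: classification
predictions = [predict_nb(model, x, nearest(x, centroids)) for x in X_test]
accuracy = sum(p == t for p, t in zip(predictions, y_test)) / len(y_test)
print(f"held-out accuracy on synthetic data: {accuracy:.3f}")
```

The clustering is fitted on the training split only, and each held-out institution is assigned to its nearest centroid before classification, so no test information leaks into the cluster feature. The accuracy printed here reflects the synthetic data and says nothing about the 89.2% reported for the real institutions.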
License
Copyright (c) 2025 Eko Budianto, Muhammad Iqbal

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
