Document Type: Original Article


1 Department of Health Information Management, School of Health Management and Information Sciences, Iran University of Medical Sciences, Tehran, Iran

2 Department of Health Information Technology, Abadan University of Medical Sciences, Abadan, Iran; Student Research Committee Department, Abadan University of Medical Sciences, Abadan, Iran

3 Department of Health Information Technology, School of Paramedical, Ilam University of Medical Sciences, Ilam, Iran

4 Department of English Language, School of Medicine, Ilam University of Medical Sciences, Ilam, Iran


BACKGROUND: COVID‑19, an outbreak of atypical pneumonia, has spread worldwide since the
beginning of 2020. Designing a prediction system for the early detection of COVID‑19 is
therefore critical to mitigating the spread of the virus. In this study, we applied selected
machine learning techniques and compared their performance to identify the best predictive model.
MATERIALS AND METHODS: Data on 435 suspected COVID‑19 cases recorded in the Imam Khomeini
Hospital database between May 9, 2020 and December 20, 2020 were analyzed. The Chi‑square
method was used to determine the most important features for diagnosing COVID‑19. Eight
selected data mining algorithms were then applied: multilayer perceptron (MLP), J‑48,
Bayesian Net (Bayes Net), logistic regression, K‑star, random forest, Ada‑boost, and
sequential minimal optimization (SMO). Finally, the most appropriate diagnostic model for
COVID‑19 was chosen by comparing the performance of the selected algorithms.
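
As an illustration of the feature selection step, the following is a minimal sketch of ranking categorical features by their Pearson Chi‑square statistic against a binary diagnosis label. It is not the authors' implementation: the function names, the feature names, and the toy data are all hypothetical, and the abstract does not say which software or settings were used.

```python
from collections import Counter


def chi_square_score(feature, labels):
    """Pearson chi-square statistic of one categorical feature vs. the class label."""
    n = len(labels)
    observed = Counter(zip(feature, labels))  # joint counts per (value, label) cell
    feature_totals = Counter(feature)         # row totals
    label_totals = Counter(labels)            # column totals
    score = 0.0
    for f_value, f_count in feature_totals.items():
        for l_value, l_count in label_totals.items():
            expected = f_count * l_count / n  # count expected under independence
            diff = observed.get((f_value, l_value), 0) - expected
            score += diff * diff / expected
    return score


def rank_features(columns, labels):
    """Return (feature name, score) pairs sorted from most to least associated."""
    scores = {name: chi_square_score(values, labels) for name, values in columns.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Made-up toy data (NOT from the study): 1 = present / positive, 0 = absent / negative.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
columns = {
    "fever": [1, 1, 1, 1, 0, 0, 0, 0],     # perfectly associated with the label
    "headache": [1, 0, 1, 0, 1, 0, 1, 0],  # independent of the label
}
ranking = rank_features(columns, labels)
print(ranking)
```

A higher score means a stronger association with the label. In a workflow like the one described, the top‑scoring features would be kept as inputs to the classifiers.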
RESULTS: Chi‑square feature selection identified 21 variables as the most important
diagnostic criteria for COVID‑19. Among the eight data mining algorithms evaluated, J‑48
performed best, with true‑positive rate = 0.85, false‑positive rate = 0.173,
precision = 0.85, recall = 0.85, F‑score = 0.85, Matthews correlation coefficient = 0.68,
and area under the receiver operating characteristic curve = 0.68.
CONCLUSION: The evaluation of the performance criteria showed that J‑48 can be
considered a suitable computational prediction model for diagnosing COVID‑19.

