Tree-based homogeneous ensemble model with feature selection for diabetic retinopathy prediction

Tamunopriye Ene Dagogo-George  -  Department of Computer Science, Faculty of Communication and Information Sciences, University of Ilorin, Nigeria
*Hammed Adeleye Mojeed  -  Department of Computer Science, Faculty of Communication and Information Sciences, University of Ilorin, Nigeria
Abdulateef Oluwagbemiga Balogun  -  Department of Computer Science, Faculty of Communication and Information Sciences, University of Ilorin, Nigeria
Modinat Abolore Mabayoje  -  Department of Computer Science, Faculty of Communication and Information Sciences, University of Ilorin, Nigeria
Shakirat Aderonke Salihu  -  Department of Computer Science, Faculty of Communication and Information Sciences, University of Ilorin, Nigeria
Received: 20 Feb 2020; Revised: 14 Sep 2020; Accepted: 13 Oct 2020; Published: 31 Oct 2020; Available online: 19 Oct 2020.
Open Access Copyright (c) 2020 Jurnal Teknologi dan Sistem Komputer under http://creativecommons.org/licenses/by-sa/4.0.

Article Info
Section: Original Research Articles
Language: EN
Abstract
Diabetic Retinopathy (DR) is a condition that emerges from prolonged diabetes and causes severe damage to the eyes. Early diagnosis of this disease is imperative, as late diagnosis may be fatal. Existing studies employed machine learning approaches, with Support Vector Machines (SVM) having the highest performance in most analyses and Decision Trees (DT) the lowest. However, SVM is known to suffer from parameter and kernel selection problems, which undermine its predictive capability. Hence, this study presents homogeneous ensemble classification methods with DT as the base classifier to optimize predictive performance. Boosting and Bagging ensemble methods with feature selection were employed, and experiments were carried out using the Python scikit-learn library on DR datasets extracted from the UCI Machine Learning Repository. Experimental results showed that Bagged and Boosted DT were better than SVM. Specifically, Bagged DT achieved the highest f-score and AUC (accuracy 65.38%, f-score 0.664, AUC 0.731), followed by Boosted DT (accuracy 65.42%, f-score 0.655, AUC 0.724), compared with SVM (accuracy 65.16%, f-score 0.652, AUC 0.721). These results indicate that the predictive performance of DT can be optimized with homogeneous ensemble methods to outperform SVM in predicting DR.
Keywords: machine learning; ensemble learning; diabetic retinopathy; decision trees
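
The comparison described in the abstract can be sketched in scikit-learn as follows. This is a minimal illustration, not the study's actual experimental code. The abstract does not name the feature selection method, the boosting algorithm, the hyperparameters, or the validation scheme, so `SelectKBest` with `f_classif`, `AdaBoostClassifier`, the estimator counts, `k=10`, and 10-fold cross-validation are all assumptions. The data are a synthetic stand-in, and `with_selection` is a hypothetical helper, so the printed scores will not match the reported results.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption): replace X and y with the real
# DR feature matrix and binary labels.
X, y = make_classification(
    n_samples=400, n_features=20, n_informative=8, random_state=0
)


def with_selection(classifier, k=10):
    """Hypothetical helper: filter-based feature selection, then a classifier.

    Placing the selector inside the pipeline means it is refit on each
    training fold, so no information leaks from the held-out fold.
    """
    return make_pipeline(SelectKBest(f_classif, k=k), classifier)


models = {
    # Bagging: full decision trees trained on bootstrap samples, then voted.
    "Bagged DT": with_selection(
        BaggingClassifier(
            DecisionTreeClassifier(random_state=0),
            n_estimators=50,
            random_state=0,
        )
    ),
    # Boosting (AdaBoost assumed): shallow trees trained sequentially,
    # each one focusing on the examples its predecessors misclassified.
    "Boosted DT": with_selection(
        AdaBoostClassifier(
            DecisionTreeClassifier(max_depth=1, random_state=0),
            n_estimators=50,
            random_state=0,
        )
    ),
    # SVM baseline with default RBF kernel; features are scaled first.
    "SVM": make_pipeline(StandardScaler(), SelectKBest(f_classif, k=10), SVC()),
}

metrics = ("accuracy", "f1", "roc_auc")
results = {}
for name, model in models.items():
    scores = cross_validate(model, X, y, cv=10, scoring=list(metrics))
    results[name] = {m: scores[f"test_{m}"].mean() for m in metrics}
    print(name, {m: round(v, 3) for m, v in results[name].items()})
```

Both ensembles are homogeneous in the sense used in the abstract: every member is a decision tree, and only the way the trees are trained and combined differs.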
