Tree-based homogeneous ensemble model with feature selection for diabetic retinopathy prediction

Tamunopriye Ene Dagogo-George; Hammed Adeleye Mojeed; Abdulateef Oluwagbemiga Balogun; Modinat Abolore Mabayoje; Shakirat Aderonke Salihu

doi:10.14710/jtsiskom.2020.13669

Department of Computer Science, Faculty of Communication and Information Sciences, University of Ilorin, Nigeria

Received: 20 Feb 2020; Revised: 14 Sep 2020; Accepted: 13 Oct 2020; Available online: 19 Oct 2020; Published: 31 Oct 2020.

BibTeX citation data:

@article{JTSISKOM13669,
    author = {Tamunopriye Ene Dagogo-George and Hammed Adeleye Mojeed and Abdulateef Oluwagbemiga Balogun and Modinat Abolore Mabayoje and Shakirat Aderonke Salihu},
    title = {Tree-based homogeneous ensemble model with feature selection for diabetic retinopathy prediction},
    journal = {Jurnal Teknologi dan Sistem Komputer},
    volume = {8},
    number = {4},
    year = {2020},
    keywords = {machine learning; ensemble learning; diabetic retinopathy; decision trees},
    abstract = {Diabetic retinopathy (DR) is a condition that emerges from prolonged diabetes and causes severe damage to the eyes. Early diagnosis of this disease is imperative, as late diagnosis may be fatal. Existing studies have employed machine learning approaches, with Support Vector Machines (SVM) having the highest performance in most analyses and Decision Trees (DT) the lowest. However, SVM is known to suffer from parameter and kernel selection problems, which undermine its predictive capability. Hence, this study presents homogeneous ensemble classification methods with DT as the base classifier to optimize predictive performance. Boosting and Bagging ensemble methods with feature selection were employed, and experiments were carried out using the Python scikit-learn library on a DR dataset extracted from the UCI Machine Learning Repository. Experimental results showed that Bagged and Boosted DT were better than SVM. Specifically, Bagged DT performed best with accuracy 65.38 %, F-score 0.664, and AUC 0.731, followed by Boosted DT with accuracy 65.42 %, F-score 0.655, and AUC 0.724, compared with SVM (accuracy 65.16 %, F-score 0.652, and AUC 0.721). These results indicate that the predictive performance of DT can be optimized by employing homogeneous ensemble methods so that it outperforms SVM in predicting DR.},
    issn = {2338-0403},
    pages = {297--303},
    doi = {10.14710/jtsiskom.2020.13669},
    url = {https://jtsiskom.undip.ac.id/article/view/13669}
}

Abstract

Diabetic retinopathy (DR) is a condition that emerges from prolonged diabetes and causes severe damage to the eyes. Early diagnosis of this disease is imperative, as late diagnosis may be fatal. Existing studies have employed machine learning approaches, with Support Vector Machines (SVM) having the highest performance in most analyses and Decision Trees (DT) the lowest. However, SVM is known to suffer from parameter and kernel selection problems, which undermine its predictive capability. Hence, this study presents homogeneous ensemble classification methods with DT as the base classifier to optimize predictive performance. Boosting and Bagging ensemble methods with feature selection were employed, and experiments were carried out using the Python scikit-learn library on a DR dataset extracted from the UCI Machine Learning Repository. Experimental results showed that Bagged and Boosted DT were better than SVM. Specifically, Bagged DT performed best with accuracy 65.38 %, F-score 0.664, and AUC 0.731, followed by Boosted DT with accuracy 65.42 %, F-score 0.655, and AUC 0.724, compared with SVM (accuracy 65.16 %, F-score 0.652, and AUC 0.721). These results indicate that the predictive performance of DT can be optimized by employing homogeneous ensemble methods so that it outperforms SVM in predicting DR.
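
The comparison described in the abstract can be sketched with scikit-learn roughly as follows. This is an illustrative sketch only, not the authors' experimental code. The abstract does not name the feature selection method, the boosting algorithm, the hyperparameters, or the validation protocol, so `SelectKBest` with `f_classif`, `AdaBoostClassifier`, the estimator counts, and 5-fold cross-validation are all assumptions. A synthetic dataset from `make_classification` stands in for the UCI DR dataset, so the scores it produces are unrelated to the figures reported above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the DR dataset (assumption: binary labels, numeric features).
X, y = make_classification(
    n_samples=600, n_features=19, n_informative=8, random_state=0
)


def with_selection(classifier, k=10):
    """Wrap a classifier in a pipeline that first keeps the k best features.

    Selection happens inside the pipeline so it is refit on each training fold.
    """
    return make_pipeline(SelectKBest(f_classif, k=k), classifier)


# Homogeneous ensembles: every ensemble member is a decision tree.
models = {
    "Single DT": with_selection(DecisionTreeClassifier(random_state=0)),
    "Bagged DT": with_selection(
        BaggingClassifier(
            DecisionTreeClassifier(random_state=0),
            n_estimators=50,
            random_state=0,
        )
    ),
    "Boosted DT": with_selection(
        AdaBoostClassifier(
            DecisionTreeClassifier(max_depth=1, random_state=0),
            n_estimators=50,
            random_state=0,
        )
    ),
    "SVM": with_selection(SVC(random_state=0)),
}

# The three metrics named in the abstract: accuracy, F-score, and AUC.
metrics = ["accuracy", "f1", "roc_auc"]

results = {}
for name, model in models.items():
    scores = cross_validate(model, X, y, cv=5, scoring=metrics)
    results[name] = {m: scores[f"test_{m}"].mean() for m in metrics}

for name, row in results.items():
    print(
        f"{name:<11} accuracy={row['accuracy']:.3f} "
        f"f1={row['f1']:.3f} auc={row['roc_auc']:.3f}"
    )
```

Placing the feature selector inside each pipeline keeps the selection step from seeing the held-out fold, which would otherwise inflate the cross-validated scores.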

Keywords: machine learning; ensemble learning; diabetic retinopathy; decision trees

Funding: University of Ilorin, Nigeria

Article Info

Section: Original Research Articles

Language : EN

In Volume 8, Issue 4, Year 2020 (October 2020)
