A proposed method for handling an imbalance data in classification of blood type based on Myers-Briggs type indicator

Ahmad Taufiq Akbar; Rochmat Husaini; Bagus Muhammad Akbar; Shoffan Saifullah

doi:10.14710/jtsiskom.2020.13625

Department of Informatics, Universitas Pembangunan Nasional Veteran Yogyakarta, Indonesia

Received: 11 Jan 2020; Revised: 4 Sep 2020; Accepted: 11 Sep 2020; Available online: 16 Sep 2020; Published: 31 Oct 2020.

Abstract

Blood type is still widely assumed to be related to certain aspects of personality. This study examines preprocessing methods for improving the accuracy of classifying blood type from Myers-Briggs Type Indicator (MBTI) data. The data set, used for both training and testing, consists of MBTI questionnaire answers from 250 respondents. Classification uses the k-Nearest Neighbor (k-NN) algorithm. Without preprocessing, k-NN reaches only about 32% accuracy, so the data imbalance must be handled before classification. The proposed preprocessing has two stages: an unsupervised resample followed by a supervised resample. Validation uses ten-fold cross-validation. After the proposed preprocessing, k-NN classification shows significantly higher accuracy, F-score, and recall.
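The pipeline summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that the unsupervised resample is a bootstrap that ignores class labels and that the supervised resample is a bootstrap that draws the same number of rows from each class. The data, the class proportions, the eight binary features, the Hamming distance, and k = 1 are all hypothetical stand-ins.

```python
import random
from collections import Counter

def unsupervised_resample(rows, size, rng):
    """Stage 1 (assumed): bootstrap sample drawn without looking at labels."""
    return [rng.choice(rows) for _ in range(size)]

def supervised_resample(rows, per_class, rng):
    """Stage 2 (assumed): bootstrap every class to the same size."""
    by_class = {}
    for features, label in rows:
        by_class.setdefault(label, []).append((features, label))
    out = []
    for label in sorted(by_class):
        out.extend(rng.choice(by_class[label]) for _ in range(per_class))
    rng.shuffle(out)
    return out

def knn_predict(train, x, k):
    """Majority vote among the k nearest training rows (Hamming distance)."""
    nearest = sorted(train, key=lambda r: sum(a != b for a, b in zip(r[0], x)))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def cross_val_accuracy(rows, k=1, folds=10):
    """Accuracy under plain k-fold cross-validation (interleaved folds)."""
    correct = 0
    for i in range(folds):
        test = rows[i::folds]
        train = [r for j, r in enumerate(rows) if j % folds != i]
        correct += sum(knn_predict(train, x, k) == y for x, y in test)
    return correct / len(rows)

rng = random.Random(0)

# Hypothetical stand-in data: 250 rows with skewed blood-type labels and
# eight random binary questionnaire answers per respondent.
labels = ["A"] * 62 + ["B"] * 75 + ["AB"] * 20 + ["O"] * 93
rng.shuffle(labels)
data = [(tuple(rng.randint(0, 1) for _ in range(8)), y) for y in labels]

stage1 = unsupervised_resample(data, len(data), rng)
stage2 = supervised_resample(stage1, len(data) // 4, rng)
class_counts = Counter(y for _, y in stage2)

print("class counts after both stages:", dict(class_counts))
print("accuracy without preprocessing:", cross_val_accuracy(data))
print("accuracy after both stages:", cross_val_accuracy(stage2))
```

In this sketch the resampling happens before the folds are formed, so copies of the same row can land in both the training and the test fold. With random stand-in features, any accuracy gain after resampling comes from those duplicates, which is worth keeping in mind when reading scores obtained this way.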

Keywords: imbalance data; blood type; resample; k-nearest neighbor; MBTI

Funding: Universitas Pembangunan Nasional Veteran Yogyakarta

Article Info

Section: Original Research Articles

Language: EN

In Volume 8, Issue 4, Year 2020 (October 2020)

Starting from 2021, the author(s) whose article is published in the JTSiskom journal attain the copyright for their article and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. By submitting the manuscript to JTSiskom, the author(s) agree with this policy. No special document approval is required.

The author(s) guarantee that:

their article is original, written by the mentioned author(s),
has never been published before,
does not contain statements that violate the law, and
does not violate the rights of others, is subject to copyright held exclusively by the author(s), is free from the rights of third parties, and the necessary written permission to quote from other sources has been obtained by the author(s).

The author(s) retain all rights to the published work, such as (but not limited to) the following rights:

Copyright and other proprietary rights related to the article, such as patents,
The right to use the substance of the article in its own future works, including lectures and books,
The right to reproduce the article for its own purposes,
The right to archive all versions of the article in any repository, and
The right to enter into separate additional contractual arrangements for the non-exclusive distribution of published versions of the article (for example, posting them to institutional repositories or publishing them in a book), acknowledging its initial publication in this journal (Jurnal Teknologi dan Sistem Komputer).

If the article was prepared jointly by more than one author, each author submitting the manuscript warrants that all co-authors have authorized them to agree to the copyright and license notices (agreements) on their behalf, and agrees to notify the co-authors of the terms of this policy. JTSiskom will not be held responsible for anything arising from internal disputes among the authors. JTSiskom will communicate only with the corresponding author.

Authors should also understand that their articles (and any additional files, including data sets and analysis/computation data) will become publicly available once published. The license of published articles (and additional data) will be governed by a Creative Commons Attribution-ShareAlike 4.0 International License. JTSiskom allows users to copy, distribute, display, and perform the work under this license. Users must credit the author(s) and JTSiskom when distributing the work in journals or other publication media. Unless otherwise stated, the author(s) are a public entity as soon as the article is published.
