
Perbandingan pengukuran jarak Euclidean dan Gower pada klaster k-medoids

Comparison analysis of Euclidean and Gower distance measures on k-medoids cluster

Department of Informatics, Universitas Singaperbangsa Karawang. Jl. H.S. Ronggowaluyo, Teluk Jambe Timur, Karawang, Jawa Barat 41361, Indonesia

Received: 8 May 2020; Revised: 21 Oct 2020; Accepted: 24 Oct 2020; Available online: 27 Oct 2020; Published: 31 Jan 2021.
Open Access Copyright (c) 2021 The Authors. Published by Department of Computer Engineering, Universitas Diponegoro
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Abstract
K-medoids clustering relies on a distance measure to group data by their similarities and differences. The choice of distance measure can affect clustering performance on a given dataset. Several studies have used the Euclidean and Gower distances for clustering numerical data. This study compares the performance of k-medoids clustering on numerical datasets using the Euclidean and Gower distances. Seven numerical datasets were clustered, and the results were evaluated with the Silhouette, Dunn, and Connectivity indexes. The Euclidean distance scored better on two of the three indexes, Silhouette and Connectivity, indicating a well-structured grouping of the data, while the Gower distance scored better on the Dunn index, indicating better cluster separation than Euclidean. Overall, the results suggest that the Euclidean distance outperforms the Gower distance when the k-medoids algorithm is applied to numerical datasets.
Keywords: clustering; data mining; Euclidean; Gower; k-medoids
Funding: Universitas Singaperbangsa Karawang
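
The comparison described in the abstract can be illustrated with a minimal sketch. This is not the authors' code. It assumes the numeric-only form of the Gower distance (the mean of range-normalised absolute differences), a simple alternating k-medoids update rather than any particular library routine, and a made-up six-point dataset. The names `make_gower`, `k_medoids`, `mean_silhouette`, and the `init_medoids` argument are hypothetical and exist only for this illustration.

```python
import math

def euclidean(a, b):
    """Straight-line distance between two numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def make_gower(data):
    """Numeric-only Gower distance: mean of range-normalised absolute differences."""
    ranges = [max(col) - min(col) for col in zip(*data)]
    def gower(a, b):
        parts = [abs(x - y) / r if r > 0 else 0.0 for x, y, r in zip(a, b, ranges)]
        return sum(parts) / len(parts)
    return gower

def distance_matrix(data, dist):
    return [[dist(a, b) for b in data] for a in data]

def assign(D, medoids):
    """Label each point with the position of its nearest medoid."""
    return [min(range(len(medoids)), key=lambda c: D[i][medoids[c]])
            for i in range(len(D))]

def k_medoids(D, init_medoids, max_iter=100):
    """Alternate between assignment and medoid update until the medoids stop changing."""
    medoids = list(init_medoids)
    for _ in range(max_iter):
        labels = assign(D, medoids)
        updated = []
        for c, old in enumerate(medoids):
            members = [i for i, lab in enumerate(labels) if lab == c]
            # New medoid: the member with the smallest total distance to its cluster.
            updated.append(min(members, key=lambda m: sum(D[m][j] for j in members))
                           if members else old)
        if updated == medoids:
            break
        medoids = updated
    return medoids, assign(D, medoids)

def mean_silhouette(D, labels):
    """Average silhouette width; assumes at least two clusters."""
    scores = []
    for i, own in enumerate(labels):
        same = [j for j, lab in enumerate(labels) if lab == own and j != i]
        if not same:  # singleton cluster: silhouette is 0 by convention
            scores.append(0.0)
            continue
        a = sum(D[i][j] for j in same) / len(same)
        b = min(sum(D[i][j] for j, lab in enumerate(labels) if lab == other)
                / labels.count(other)
                for other in set(labels) if other != own)
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

# Made-up toy data: two features on very different scales (an assumption).
data = [(1.0, 100.0), (1.2, 110.0), (0.9, 95.0),
        (5.0, 900.0), (5.2, 950.0), (4.6, 920.0)]

D_eu = distance_matrix(data, euclidean)
D_gw = distance_matrix(data, make_gower(data))

# Initial medoids are chosen by hand for this sketch (points 0 and 3).
eu_medoids, eu_labels = k_medoids(D_eu, init_medoids=(0, 3))
gw_medoids, gw_labels = k_medoids(D_gw, init_medoids=(0, 3))

print("Euclidean:", eu_medoids, eu_labels, round(mean_silhouette(D_eu, eu_labels), 3))
print("Gower:    ", gw_medoids, gw_labels, round(mean_silhouette(D_gw, gw_labels), 3))
```

On this toy data both distances yield the same partition but a different medoid for the second cluster (point 5 under Euclidean, point 3 under Gower). The raw Euclidean distance is dominated by the large-valued second feature, whereas Gower rescales each feature by its range before averaging. Only the Silhouette index is sketched here; the Dunn and Connectivity indexes used in the study are not implemented.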


