Identification of the distribution village maturation: Village classification using Density-based spatial clustering of applications with noise

Okfalisa Okfalisa; Angraini Angraini; Shella Novi; Hidayati Rusnedy; Lestari Handayani; Mustakim Mustakim

doi:10.14710/jtsiskom.2021.13998

Okfalisa Okfalisa¹, Angraini Angraini²,³, Shella Novi¹, Hidayati Rusnedy¹, Lestari Handayani¹,⁴, Mustakim Mustakim³

¹Informatics Engineering Department, Universitas Islam Negeri Sultan Syarif Kasim Riau. Jl. HR. Soebrantas Panam Km. 15 No. 155, Tuah Madani, Kec. Tampan, Kampar Regency, Riau 28293, Indonesia

²School of Computing, Faculty Engineering, Universiti Teknologi Malaysia. UTM Johor Bahru, Johor 81310, Malaysia

³Information System Department, Universitas Islam Negeri Sultan Syarif Kasim Riau. Jl. HR. Soebrantas Panam Km. 15 No. 155, Tuah Madani, Kec. Tampan, Kampar Regency, Riau 28293, Indonesia

⁴Prism Lab, Insa Center Val de Loire. 88 Boulevard Lahitolle, Bourges 18000, France

Received: 3 Dec 2020; Revised: 19 Mar 2021; Accepted: 24 Apr 2021; Available online: 26 Apr 2021; Published: 31 Jul 2021.

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Abstract

Measuring rural development is difficult because each village has its own needs and conditions. This study classifies village performance using social, economic, and ecological indices. A total of 1,591 villages recorded by the Community and Village Empowerment Office of Riau Province, Indonesia, are grouped into five village maturation classes: very under-developed, under-developed, developing, developed, and independent. Density-based spatial clustering of applications with noise (DBSCAN) is used to mine 13 village attributes, and Python is used to run and evaluate the DBSCAN analysis. The grouping achieves a silhouette coefficient of 0.8231, indicating good clustering performance. The epsilon and minimum-points values are assessed in the DBSCAN evaluation through percentage-split simulations. The resulting grouping can serve as a guideline for governments seeking to distribute rural development subsidies more optimally.
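As an illustration of the two techniques named in the abstract, the following is a minimal from-scratch sketch of DBSCAN and the mean silhouette coefficient in Python. It is not the authors' code: the function names, the toy three-index points (standing in for IKS, IKL, and IKE values), and the `eps` and `min_pts` settings are all assumptions made for the example, and excluding noise points from the silhouette average is a choice of this sketch rather than something stated in the article.

```python
import math

NOISE = -1  # label for points that belong to no cluster


def region_query(points, i, eps):
    """Indices of all points within eps of points[i], including i itself."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters, NOISE for outliers."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: reachable but not core
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:  # j is a core point, keep expanding
                queue.extend(j_neighbors)
    return labels


def silhouette(points, labels):
    """Mean silhouette coefficient over clustered points (noise is skipped)."""
    clusters = {}
    for idx, lab in enumerate(labels):
        if lab != NOISE:
            clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    scores = []
    for lab, members in clusters.items():
        for i in members:
            if len(members) == 1:
                scores.append(0.0)  # convention for singleton clusters
                continue
            # a: mean distance to the other members of the same cluster
            a = sum(math.dist(points[i], points[j])
                    for j in members if j != i) / (len(members) - 1)
            # b: smallest mean distance to the members of any other cluster
            b = min(sum(math.dist(points[i], points[j]) for j in other) / len(other)
                    for other_lab, other in clusters.items() if other_lab != lab)
            scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Hypothetical (IKS, IKL, IKE) triples: two dense groups and one outlier.
points = [
    (0.80, 0.70, 0.60), (0.82, 0.71, 0.61), (0.79, 0.69, 0.62), (0.81, 0.72, 0.59),
    (0.40, 0.35, 0.30), (0.42, 0.36, 0.31), (0.41, 0.34, 0.29), (0.39, 0.37, 0.32),
    (0.05, 0.95, 0.10),
]
labels = dbscan(points, eps=0.05, min_pts=3)
score = silhouette(points, labels)
print(labels)
print(round(score, 4))
```

Note that DBSCAN does not take the number of clusters as an input; the cluster count follows from the epsilon and minimum-points values, which is why both parameters matter when the result is meant to line up with a fixed set of classes.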

Note: This article has supplementary file(s).

Supplementary Data

Subject: Data collected on villages in Riau Province in 2018 and the results of the DBSCAN analysis of village classification on three main attributes, namely IKS, IKL, and IKE
Type: Dataset, Data Analysis

Keywords: clustering; density-based spatial clustering of applications with noise; Python; silhouette coefficient; village maturity

Funding: Universitas Islam Negeri Sultan Syarif Kasim Riau, Indonesia; Riau Province Community and Village Empowerment Service, Indonesia; Universiti Teknologi Malaysia; Insa Center Val de Loire, Bourges, France

Article Info

Section: Original Research Articles

Language: EN

In Volume 9, Issue 3, Year 2021 (July 2021)

EDITORIAL OFFICE OF JURNAL TEKNOLOGI DAN SISTEM KOMPUTER