Optimasi nilai k dan parameter lag algoritme k-nearest neighbor pada prediksi tingkat hunian hotel

Optimization of k value and lag parameter of k-nearest neighbor algorithm on the prediction of hotel occupancy rates

*Agus Subhan Akbar  -  Department of Information System, Universitas Islam Nahdlatul Ulama Jepara, Indonesia
R. Hadapiningradja Kusumodestoni  -  Department of Informatics, Universitas Islam Nahdlatul Ulama Jepara, Indonesia
Received: 30 Jan 2020; Revised: 26 Apr 2020; Accepted: 6 May 2020; Published: 31 Jul 2020; Available online: 3 Jul 2020.
Open Access Copyright (c) 2020 Jurnal Teknologi dan Sistem Komputer
License URL: http://creativecommons.org/licenses/by-sa/4.0

Section: Original Research Articles
Language: ID
Abstract
Hotel occupancy rates are the most important factor in hotel business management. Predicting these rates for the coming months informs the manager's decisions about arranging and providing the required facilities. This study optimizes the lag parameter and the k value of the k-nearest neighbor (kNN) algorithm on historical hotel occupancy data. The historical data were arranged as supervised training data, with the number of columns per row determined by the lag parameter and the number of prediction targets. The kNN algorithm was applied with 10-fold cross-validation, varying k from 1 to 30. The optimal lag lay in the interval 14-17 and the optimal k in the interval 5-13 for predicting occupancy rates 1, 3, 6, 9, and 12 months ahead. The obtained k values do not follow the rule of thumb that sets k to the square root of the number of training samples.
Keywords: hotel occupancy rate; kNN regression; k optimization; lag; kNN prediction
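The procedure described in the abstract (sliding a lag window over the occupancy history to build supervised rows, then scoring kNN with 10-fold cross-validation over a range of k) can be sketched as follows. This is a minimal illustration, not the authors' code: it uses scikit-learn (which the paper cites), synthetic monthly data in place of the real occupancy history, and assumed names such as `make_supervised`; the lag and k ranges mirror those reported in the study.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsRegressor

def make_supervised(series, lag, horizon):
    """Arrange a time series into (X, y): each row holds `lag` consecutive
    values, and the target is the value `horizon` steps after the window."""
    X, y = [], []
    for i in range(len(series) - lag - horizon + 1):
        X.append(series[i:i + lag])
        y.append(series[i + lag + horizon - 1])
    return np.array(X), np.array(y)

# Synthetic monthly occupancy rates with yearly seasonality (assumption;
# the study used real hotel occupancy history instead).
rng = np.random.default_rng(0)
series = 60 + 20 * np.sin(np.arange(120) * 2 * np.pi / 12) + rng.normal(0, 3, 120)

best = None
for lag in range(14, 18):          # lag interval found optimal in the study
    X, y = make_supervised(series, lag, horizon=1)
    for k in range(1, 31):         # k varied from 1 to 30, as in the study
        score = cross_val_score(KNeighborsRegressor(n_neighbors=k),
                                X, y, cv=10,
                                scoring="neg_mean_absolute_error").mean()
        if best is None or score > best[0]:
            best = (score, lag, k)

print(f"best MAE={-best[0]:.2f} with lag={best[1]}, k={best[2]}")
```

The same loop extends to the other horizons (3, 6, 9, and 12 months) by changing the `horizon` argument.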

