Comparative analysis of classification algorithms for critical land prediction in agricultural cultivation areas

Critical land, that is, land that has been physically, chemically, and biologically damaged, is usually identified with a geographic information system. However, obtaining high-resolution satellite images is costly. This study proposes a comparison framework to evaluate the performance of five classification algorithms, namely C4.5, ID3, Random Forest, k-Nearest Neighbor, and Naïve Bayes. The aim is to find the best algorithm for classifying critical land in agricultural cultivation areas. The results show that Random Forest has the highest accuracy in predicting critical land, at 93.10 %, while Naïve Bayes has the lowest, at 89.32 %.


I. INTRODUCTION
A watershed is a complex system built on physical, biological, and human systems that are interconnected and interact with each other [1]. The condition of watersheds in Indonesia is getting worse, as indicated by the increasing number of priority watersheds from year to year [2]. Watershed damage is caused by the use of natural resources beyond the carrying capacity, which is a consequence of population growth and of natural resource utilization policies that do not follow the principles of sustainable development [3]. The upstream part of a watershed should function as a water catchment area. However, forest and land damage and deforestation upstream cause long dry seasons, flooding during the rainy season, and critical land [4].
Critical land is land that has been damaged physically, chemically, and biologically. This damage causes the land to decline in fertility [5]. Land that should serve as a place of production and as a water catchment area cannot function properly because it is damaged. Mapping and identifying critical land is crucial for planning and for determining which watersheds are priorities for rehabilitation [6].
The indicators of critical land include land management, land use, rainfall, slope, and land erosion [5]. However, information and data on forest and land damage often do not follow database formats and structures that can be accounted for. One of the important aspects in determining the success of land mapping is the classification of land. However, the lack of spatial data and information affects the evaluation of the validity of critical land data [7]. Accurate and informative data on the amount and distribution of degraded land are therefore highly valuable. Degraded land data will continue to be updated according to the criteria and standards for determining and processing such data. Processing critical land data is essential to obtain a critical land inventory with high validity.
The method previously used for land classification is the Geographic Information System (GIS) [8]-[10]. The GIS method captures the condition of land from satellite images. The land condition score is then calculated according to the Decree of the Director-General of RRL No: 041/Kpts/V/1998 on determining critical lands [7]. The advantage of GIS is that it can capture the condition of land or an area spatially by using satellite images [9]. However, it is costly to take high-resolution satellite images and to determine rainfall levels. For example, at least three different monitoring stations (Landsat-5 TM 1985, Landsat-7 ETM+ 2000, and Landsat-8 OLI-TIRS 2015) with 30 m spatial resolution were used for the analysis of the studied watershed [11], [12]. For critical land mapping, images with high spatial resolution are needed to obtain information about the earth's surface [13].
Presently, only attribute data on critical land are available at the Indonesian Ministry of Forestry, so the spatial distribution is difficult to determine. Consequently, multi-sectoral forest and land rehabilitation programs are difficult to synchronize, because spatial analysis is one of the main tools [7]. The unavailability of spatial data and information affects the assessment of the validity of critical land data.
Copyright ©2020, JTSiskom, e-ISSN: 2338-0403, p-ISSN: 2620-4002 Submitted: 17 February 2020; Revised: 9 July 2020; Accepted: 10 July 2020; Published: 31 October 2020
However, to get reliable information from satellite data on the right target, the right classification technique is needed. Several classification methods have been optimized over the past few years [14], one of which is the data mining approach [15], [16]. Data mining in agriculture is a relatively novel research field for predicting critical land.
Classification is one of the primary roles of data mining. It assigns a label or class to data and is called supervised learning because it requires labeled data in the process [17]. Classification predicts the label or class of a dataset. In the classification process, the dataset is divided into two parts, namely training data and testing data [18]. The training data, whose labels are known, are used to build a classification model, whereas the testing data are used to measure how accurate the resulting model is. A classification model with good accuracy can then be used to predict the label or class of data for which it is not yet known [19].
In recent years, there have been several studies comparing the performance of classification algorithms. Hall et al. [20] show that the naïve Bayes (NB) algorithm has higher accuracy than Logistic Regression (LR) and Neural Network (NN). Kim et al. [21] show that the k-Nearest Neighbor (kNN) algorithm has better performance than Quadratic Discriminant Analysis (QDA) and Linear Discriminant Analysis (LDA). Narayanan et al. [22] reported that Decision Tree (DT), NB, and kNN algorithms are the most commonly used classification algorithms. Five classification algorithms are considered to have the best performance, namely C4.5, ID3, Random Forest (RF), NB, and kNN. The C4.5, ID3, and RF algorithms are based on DT induction.
Motivated by the need for an efficient approach to critical land prediction, we propose classifying critical land with data mining. In this study, we propose a framework to compare the performance of the C4.5, ID3, RF, kNN, and NB algorithms in predicting critical land in agricultural cultivation areas. This research is expected to provide a framework for further research, especially on critical land prediction.

II. RESEARCH METHODS
The proposed algorithm comparison framework can be seen in Figure 1. The proposed framework consists of critical land datasets in the agricultural cultivation area, classification algorithm, model validation, model evaluation, and model comparison. This study aims to find out the best algorithms for critical land classification.

A. Datasets
The data used for the experiment in this study is a dataset from BPDAS Pemali Jratun. It consists of critical land parameter data for agricultural cultivation areas, with four attributes: land productivity, slope level, erosion hazard level, and land management. The class is land criticality, and the dataset contains a total of 111,003 data records [23]. The dataset structure is shown in Table 1.

B. Classification algorithms
In this study, five classification algorithms are compared to get the best model for the classification of critical land in agricultural cultivation areas, namely C4.5, ID3, Random Forest (RF), k-Nearest Neighbor (kNN), and Naïve Bayes (NB). These algorithms were chosen because they are the most commonly used, as stated in [22]. The selection aims to achieve a balance between established classification algorithms used in critical land prediction. Figure 1 shows the proposed framework for the comparison of classification algorithms. The initial stage is the collection of datasets, followed by the comparison of classification algorithms. The data are split into training and testing sets using stratified 10-fold cross-validation, and the performance of each classification algorithm is evaluated using accuracy, recall, and precision. In the final stage, a statistical test is performed to determine whether the differences between the classification models are significant.
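The stratified 10-fold split described above can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the RapidMiner procedure used in the experiments; the function names `stratified_k_fold` and `train_test_splits` and the class labels in the usage note are assumptions made for the example.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k=10, seed=0):
    """Return k lists of record indices with each class spread evenly over the folds."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    offset = 0  # continue round-robin across classes so fold sizes stay balanced
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        for i, idx in enumerate(indices):
            folds[(offset + i) % k].append(idx)
        offset += len(indices)
    return folds


def train_test_splits(labels, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs, one per held-out fold."""
    folds = stratified_k_fold(labels, k, seed)
    for held_out in range(k):
        test = folds[held_out]
        train = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
        yield train, test
```

For instance, with 30 records labeled "critical" and 70 labeled "not_critical" (hypothetical labels), every fold of `stratified_k_fold(labels, k=10)` holds 3 and 7 records of the respective classes, so each test fold preserves the class proportions of the full dataset.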

C. Model evaluation
The experimental results are evaluated to measure how well each algorithm performs against the others and whether the differences between the models' results are significant. In this study, the evaluation metrics are accuracy, recall, and precision. Accuracy (AC) is the proportion of correct predictions over the whole data, expressed as a percentage (Eq. 1). Sensitivity or recall (R), a term from information retrieval, measures the proportion of actual positives that are correctly predicted as positive (Eq. 2). It reflects the ability of a test to identify positive results among the data that are actually positive. Precision (P), or positive predictive value, is a metric that measures system performance in retrieving relevant data (Eq. 3). It is the number of true positives divided by the number of data predicted as positive. TP denotes true positives, TN true negatives, FP false positives, and FN false negatives.
AC = (TP + TN) / (TP + TN + FP + FN)   (1)
R = TP / (TP + FN)   (2)
P = TP / (TP + FP)   (3)
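The definitions of accuracy, recall, and precision above can be computed directly from a confusion matrix. The sketch below is illustrative only: it assumes rows are actual classes and columns are predicted classes, treats each class in turn as the positive class, and uses a made-up 2x2 matrix rather than any result from this study. The function name `evaluate` is hypothetical.

```python
def evaluate(matrix):
    """Compute accuracy plus per-class recall and precision.

    matrix[a][p] is the number of records of actual class a predicted as class p.
    """
    n_classes = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[c][c] for c in range(n_classes))
    accuracy = correct / total

    recall, precision = [], []
    for c in range(n_classes):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp                                # actual c, predicted otherwise
        fp = sum(matrix[a][c] for a in range(n_classes)) - tp   # predicted c, actually otherwise
        recall.append(tp / (tp + fn) if tp + fn else 0.0)
        precision.append(tp / (tp + fp) if tp + fp else 0.0)
    return accuracy, recall, precision


# Illustrative counts only (not taken from the paper's tables).
example = [[50, 10],
           [5, 35]]
accuracy, recall, precision = evaluate(example)
avg_recall = sum(recall) / len(recall)          # unweighted average over classes
avg_precision = sum(precision) / len(precision)
```

Averaging the per-class values without weights is one common way to obtain a single averaged recall and precision per algorithm; whether the reported averages were computed this way is an assumption.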

D. Model comparison
Model comparison is used to compare the performance of classification algorithms. The classification models are validated using 10-fold cross-validation, so the performance of each algorithm can be calculated and compared directly. With this approach alone, however, it is difficult to know whether a difference in mean accuracy is significant. In this study, a statistical significance test is therefore used to determine differences in the classification models' performance. The outputs of the test are a test statistic and a p-value, both of which can be interpreted and reported to quantify the level of confidence or significance of the differences between models. This allows stronger claims to be made during model selection than would be possible without a statistical hypothesis test.
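As an illustration of such a significance test, the sketch below computes a paired t statistic from the per-fold accuracies of two models evaluated on the same 10 folds. The paired form of the test, the function name `paired_t_statistic`, and the per-fold accuracies are assumptions for the example and are not values from this study. Instead of a p-value, the statistic is compared with 2.262, the two-sided critical value of Student's t distribution for 9 degrees of freedom at α = 0.05.

```python
import math
import statistics

# Two-sided critical value of Student's t for df = 9 and alpha = 0.05.
T_CRITICAL_DF9 = 2.262


def paired_t_statistic(acc_a, acc_b):
    """Return the paired t statistic for two equal-length lists of fold accuracies."""
    if len(acc_a) != len(acc_b) or len(acc_a) < 2:
        raise ValueError("need two lists of equal length with at least 2 folds")
    diffs = [a - b for a, b in zip(acc_a, acc_b)]
    sd = statistics.stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("all fold differences are identical; t is undefined")
    return statistics.mean(diffs) / (sd / math.sqrt(len(diffs)))


# Hypothetical per-fold accuracies for two models on the same 10 folds.
model_a = [0.93, 0.94, 0.92, 0.93, 0.94, 0.93, 0.92, 0.94, 0.93, 0.93]
model_b = [0.89, 0.90, 0.89, 0.88, 0.90, 0.89, 0.89, 0.90, 0.88, 0.89]

t_value = paired_t_statistic(model_a, model_b)
reject_h0 = abs(t_value) > T_CRITICAL_DF9  # True means a significant difference
```

Folds in cross-validation share most of their training data, so the independence assumption behind the t-test holds only approximately, and the result is best read as indicative.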

III. RESULTS AND DISCUSSION
The experiments in this study were run on a computing platform with an Intel Core i5 2.3 GHz CPU, 4 GB RAM, the Microsoft Windows 10 64-bit operating system, and RapidMiner Studio 9.3. RapidMiner was used to measure accuracy, recall, and precision. The five classification algorithms were applied to the critical land dataset.
The 10-fold cross-validation results of the five classification algorithms are expressed as confusion matrices in Table 2 for the kNN algorithm, Table 3 for C4.5, Table 4 for ID3, Table 5 for RF, and Table 6 for NB. The performance of the classification algorithms in predicting critical land in agricultural cultivation areas is summarized under the selected statistical criteria in Table 7. The RF algorithm has the highest prediction accuracy, while the NB algorithm has the lowest. This is consistent with [24], [25], which describe RF as an ensemble learning method regarded as a reference due to its excellent performance. Table 7 also shows that the RF algorithm has the highest averaged recall and precision, at 92.74 % and 94.94 %, followed by the kNN algorithm with averaged recall and precision of 92.02 % and 93.51 %, respectively. This is consistent with [21], which states that kNN performs better than the QDA and LDA algorithms. In contrast, the NB algorithm has the lowest performance, with an accuracy of 89.32 % and averaged recall and precision of 65.45 % and 77.68 %, respectively.
In this study, NB has the worst classification performance because it obtains its classification results by calculating the probability of each attribute value independently, that is, it assumes that the attribute values do not depend on each other. This assumption can degrade the classification performance of NB, as reported in [20], [26]-[29].
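The independence assumption can be made concrete with a small categorical naïve Bayes sketch. It is an illustration under stated assumptions and not the RapidMiner implementation used in the experiments: the function names `train_nb` and `predict_nb`, the attribute values, the class labels, and the use of add-one (Laplace) smoothing are all hypothetical choices for the example.

```python
import math
from collections import Counter, defaultdict


def train_nb(rows, labels):
    """Count class frequencies and per-class attribute value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (attribute index, class) -> value counts
    values_seen = defaultdict(set)       # attribute index -> distinct values
    for row, label in zip(rows, labels):
        for j, value in enumerate(row):
            value_counts[(j, label)][value] += 1
            values_seen[j].add(value)
    return class_counts, value_counts, values_seen


def predict_nb(model, row):
    """Pick the class with the highest log posterior.

    Each attribute contributes its own factor P(value | class); no interaction
    between attributes is modeled, which is the independence assumption.
    """
    class_counts, value_counts, values_seen = model
    n = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / n)  # log prior
        for j, value in enumerate(row):
            numerator = value_counts[(j, label)][value] + 1  # add-one smoothing
            denominator = count + len(values_seen[j])
            score += math.log(numerator / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical records: (slope level, erosion hazard level).
rows = [("steep", "high"), ("steep", "high"), ("flat", "low"),
        ("flat", "low"), ("steep", "low")]
labels = ["critical", "critical", "not_critical", "not_critical", "critical"]
model = train_nb(rows, labels)
```

When attributes such as slope and erosion hazard are in fact correlated, multiplying their per-attribute probabilities counts the shared evidence twice, which is one way the assumption can hurt the classifier.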
This study also used a t-test to determine, based on the p-value, whether there are significant differences between the classification models. If the p-value is less than α = 0.05, then H0 is rejected, meaning that there is a significant difference between the two classification models. Table 8 shows no significant difference in accuracy among the C4.5, ID3, and kNN algorithms. Based on the t-test, the RF algorithm is the best algorithm for the classification of critical land in agricultural cultivation areas.

IV. CONCLUSION
The RF algorithm has the best performance for classifying critical land in agricultural cultivation areas. The C4.5, ID3, and kNN algorithms perform well, with no significant difference among them, while the NB algorithm has the lowest performance for critical land prediction. The critical land classification results can be used to support decisions related to land rehabilitation.
For further research, it is suggested to perform attribute selection before applying the classification model. Eliminating irrelevant information in this way can improve the performance of the classification algorithms and reduce computing time.