Segmentation of university customers loyalty based on RFM analysis using fuzzy c-means clustering

One of the strategic plans of developing universities for recruiting new students is to form partnerships with surrounding high schools. However, these partnerships do not always perform as expected. This paper presents a segmentation technique applied to a previous new student admission dataset, integrating recency, frequency, and monetary (RFM) analysis with the fuzzy c-means (FCM) algorithm to evaluate the loyalty of all schools bound by a partnership with the institution. The dataset is converted using the RFM approach before being processed with the FCM algorithm. The results reveal that the schools can be segmented into high potential (SP), potential (P), low potential (CP), and very low potential (KP) categories, with a partition coefficient index (PCI) value of 0.86. From the analysis of SP, P, and CP, only 71 % of the 52 partner schools are categorized as loyal partners.


I. INTRODUCTION
All universities must recruit new students every year. For well-known universities, this is not a problem. For developing universities, in contrast, competition for new students is unavoidable. Several universities have therefore implemented strategies to reduce this competition; one of them is Bumigora University. Its strategy is to form partnerships with high schools in the vicinity. Each partnership takes the form of a Memorandum of Understanding (MoU), which is valid for a specified period. The MoU provides several conveniences for prospective students registering at Bumigora University, on the condition that they first obtain a recommendation letter from their school.
However, a partnership does not always perform as expected. Therefore, it is necessary to evaluate all schools that have signed an MoU with Bumigora University. One applicable approach is the evaluation of loyalty. A widely applied method for evaluating loyalty is recency, frequency, and monetary (RFM) analysis [1]-[5]. RFM analysis is generally applied to customer transaction data, namely the transaction date, the transaction amount, and the financial amount per transaction [6], [7]. The results of RFM analysis are customer segments based on specific criteria, such as customer lifetime value [8], customer future value [9], and consumption behavior [10]. RFM analysis is therefore well suited to being combined with data mining techniques [11]-[13].
Previous studies have also evaluated customer loyalty by combining RFM analysis with data mining techniques. Widely used techniques include clustering methods, such as k-means [1], [11], fuzzy c-means (FCM) [2], and self-organizing maps (SOM) [4], [14], [15]. Other studies apply classification and other methods, such as J48 [1], [16], the apriori algorithm [17], the analytical hierarchy process (AHP) [3], C4.5, naive Bayes, and nearest neighbor algorithms [18]. What these studies have in common is that they all use conventional customer transaction data. In this study, by contrast, RFM analysis is applied to a new student admissions dataset, so the RFM conversion must be adjusted for this dataset.
This study aims to analyze the loyalty of schools bound by a partnership with Bumigora University using RFM analysis. RFM is combined with the FCM algorithm to simplify the analysis, since Afrin and Tabassum [19], Al-Augby et al. [20], Taufik and Ahmad [21], Sheshasayee and Sharmila [22], and Ghosh and Dubey [23] generally concluded that FCM provides excellent results in customer segmentation problems. The segmentation results are then used to analyze the loyalty of these schools. In addition, they can be applied to evaluate other schools as potential partners. The analysis is also carried out by school type and by the regency in which each school is located.

A. Dataset
This research used previous new student admission data of Bumigora University, Indonesia, covering the 2011-2014 and 2016 periods. The data contain the student ID, student name, school of origin, school type, and school location. The data comprise 1348 school entries; after grouping by name, the number of distinct schools is reduced to 342. Among them, 52 schools have become partners of Bumigora University. A fragment of the dataset used in this study is shown in Table 1. The complete data can be downloaded from the dataset attached to this paper.

B. RFM conversion
Recency, frequency, and monetary are often used in marketing management to evaluate customer loyalty. In marketing, recency (R) is computed from how often the customer made transactions in one period (recency of purchase). Frequency (F) is measured from the total or average number of transactions in one period (frequency of purchase). Monetary (M) is measured from the average transaction value in one period (monetary value of the purchase). The higher the values, the bigger the customer's contribution to the company [24], [25].
In this study, the RFM method is implemented to evaluate the loyalty of partner schools in every admission period. The new student admission dataset used in this research is similar to the transaction dataset in [26], although its quality is relatively low. Some modifications and conversions are therefore applied to obtain an RFM model similar to that in [12].
In this research, only the students' school of origin is modified and converted to the RFM scale, which is given in Table 2. The range of each RFM scale is obtained from a statistical analysis of the dataset using the frequency distribution formula. The results of applying the RFM scale to the dataset are shown in Table 3.
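The conversion described above can be illustrated with a minimal Python sketch. It rests on several assumptions that are not stated in the text: each admission record is reduced to a hypothetical (school, period) pair; R is taken as the number of periods in which a school had at least one registrant, F as the total number of registrants, and M as the average number of registrants per active period; and the "frequency distribution formula" is approximated by equal-width classes. The actual ranges in Table 2 may differ.

```python
from collections import defaultdict


def rfm_raw(records):
    """Aggregate (school, period) records into raw (R, F, M) values.

    Assumed definitions (not taken from the paper's tables):
    R = number of periods with at least one registrant,
    F = total number of registrants,
    M = average number of registrants per active period.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for school, period in records:
        counts[school][period] += 1
    raw = {}
    for school, by_period in counts.items():
        active = len(by_period)
        total = sum(by_period.values())
        raw[school] = (active, total, total / active)
    return raw


def to_scale(value, low, high, classes=5):
    """Map a raw value to a class 1..classes using equal-width intervals."""
    if high == low:
        return 1
    width = (high - low) / classes
    return min(classes, int((value - low) / width) + 1)


def rfm_scaled(raw, classes=5):
    """Convert raw (R, F, M) tuples to scale values, column by column."""
    columns = list(zip(*raw.values()))
    bounds = [(min(col), max(col)) for col in columns]
    return {
        school: tuple(
            to_scale(v, lo, hi, classes) for v, (lo, hi) in zip(values, bounds)
        )
        for school, values in raw.items()
    }
```

Under these assumptions, a call such as `rfm_scaled(rfm_raw(records))` yields one (R, F, M) scale triple per school, which is the form of input that the clustering step expects.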

C. FCM algorithm
This study used the FCM method to segment customer loyalty. FCM is a segmentation method based on fuzzy logic that allows a data point to be a member of two or more groups, depending on its membership level [21], [22], [27]. The membership levels correspond to the minimum of the objective function J expressed in Eq. 1, where m is any real number greater than 1 and uij denotes the degree of membership of data point xi in cluster cj.
The FCM algorithm runs through iterative optimization of the objective function J. At each iteration, it updates the membership levels uij and the cluster centroids using Eq. 2 and Eq. 3. The iteration stops when the ϵ criterion is met (Eq. 4). The FCM method is expressed in Algorithm 1.
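For reference, the standard FCM formulation from the literature is reproduced below, on the assumption that Eq. 1 to Eq. 4 follow it. Here N is the number of data points, C the number of clusters, m > 1 the fuzziness exponent, and t the iteration counter; the exact notation of the original equations may differ.

```latex
% Objective function (assumed form of Eq. 1)
J_m = \sum_{i=1}^{N} \sum_{j=1}^{C} u_{ij}^{m} \, \lVert x_i - c_j \rVert^{2}

% Membership update (assumed form of Eq. 2)
u_{ij} = \frac{1}{\displaystyle \sum_{k=1}^{C}
         \left( \frac{\lVert x_i - c_j \rVert}{\lVert x_i - c_k \rVert} \right)^{\frac{2}{m-1}}}

% Centroid update (assumed form of Eq. 3)
c_j = \frac{\sum_{i=1}^{N} u_{ij}^{m} \, x_i}{\sum_{i=1}^{N} u_{ij}^{m}}

% Stopping criterion (assumed form of Eq. 4)
\max_{i,j} \left| u_{ij}^{(t+1)} - u_{ij}^{(t)} \right| < \epsilon
```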

D. Validation
The partition coefficient index (PCI) is used to validate the FCM clustering result. PCI evaluates the membership degrees of the data in each cluster, ignoring geometric information. PCI values range between 0 and 1; the larger the value, the better the cluster quality [28]. PCI is defined in Eq. 5, where uij denotes the membership of data point j in cluster i. PCI gives the best results for soft clustering algorithms such as FCM [29]-[31].
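As an illustration, the partition coefficient in its usual form (the mean over data points of the sum of squared memberships) can be computed as follows. This is a sketch assuming Eq. 5 follows that standard definition; the function name and the row-per-data-point layout of the membership matrix are choices made here for illustration.

```python
def partition_coefficient(memberships):
    """Partition coefficient of a fuzzy partition.

    memberships: one row per data point, one membership value per cluster.
    Returns the mean, over data points, of the sum of squared memberships.
    """
    n_points = len(memberships)
    if n_points == 0:
        raise ValueError("memberships must not be empty")
    return sum(u * u for row in memberships for u in row) / n_points
```

A crisp partition gives a value of 1, while a completely uniform partition over c clusters gives 1/c, so with four clusters the index cannot fall below 0.25.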
Four clusters are formed, with a maximum of 100 iterations. The cluster centers are analyzed to decide the category of each cluster's members. The cluster categories used are high potential (SP), potential (P), low potential (CP), and very low potential (KP). The segmentation result is then used to evaluate the loyalty of the partner schools. Next, the potential of the other schools is analyzed to assess the likelihood of establishing an MoU. School location and type are taken into consideration for establishing the MoU by means of cross-tabulation.
The FCM algorithm starts by determining the number of clusters. In this case, there are four clusters: high potential (SP), potential (P), low potential (CP), and very low potential (KP). The initial centroid of each cluster is selected randomly from the data converted to the RFM scale. Then, the distance of each data point to all initial centroids is calculated using Eq. 3. The results are used to determine the degree of membership of each data point with respect to the initial centroids using Eq. 2. The change in the degrees of membership is compared with the threshold value using Eq. 4. If it is smaller than the threshold, the process terminates. Otherwise, the process is repeated from the distance calculation with the updated centroid values using Eq. 3. It takes several iterations until the threshold criterion is satisfied and the final centroid values are obtained, as shown in Table 4.
Table 4 shows the cluster centroids obtained from the FCM implementation. The cluster centroids are denoted as high potential (SP), potential (P), low potential (CP), and very low potential (KP). The partition coefficient index (PCI) value of these clusters is 0.86, which means that the quality of the clusters passes the validation criterion.
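The iterative procedure described above can be sketched in plain Python as follows. This is a minimal illustration of standard FCM, not the implementation used in the study: the function names, the default fuzziness exponent m = 2, the threshold value, and the fixed random seed are all assumptions, and the stopping rule is taken to be the largest change in any membership value between iterations.

```python
import math
import random


def memberships_for(point, centroids, m):
    """Membership of one point in every cluster (standard FCM update)."""
    dists = [
        math.sqrt(sum((a - b) ** 2 for a, b in zip(point, c))) for c in centroids
    ]
    zeros = [d == 0.0 for d in dists]
    if any(zeros):
        # The point coincides with one or more centroids: share membership.
        share = 1.0 / sum(zeros)
        return [share if z else 0.0 for z in zeros]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((dj / dk) ** power for dk in dists) for dj in dists]


def update_centroids(points, u, centroids, m):
    """Weighted means of the points, with weights u_ij ** m."""
    updated = []
    for j, old in enumerate(centroids):
        weights = [row[j] ** m for row in u]
        total = sum(weights)
        if total == 0.0:
            updated.append(list(old))  # no point belongs here; keep centroid
            continue
        updated.append(
            [sum(w * p[d] for w, p in zip(weights, points)) / total
             for d in range(len(old))]
        )
    return updated


def fcm(points, n_clusters, m=2.0, eps=1e-5, max_iter=100, seed=0):
    """Return (centroids, memberships) for a list of equal-length tuples."""
    rng = random.Random(seed)
    # Initial centroids are drawn at random from the data, as in the text.
    centroids = [list(p) for p in rng.sample(points, n_clusters)]
    u = [memberships_for(p, centroids, m) for p in points]
    for _ in range(max_iter):
        centroids = update_centroids(points, u, centroids, m)
        new_u = [memberships_for(p, centroids, m) for p in points]
        change = max(
            abs(a - b) for r0, r1 in zip(u, new_u) for a, b in zip(r0, r1)
        )
        u = new_u
        if change < eps:  # assumed stopping criterion
            break
    return centroids, u
```

With the settings reported here, a call would take the form `fcm(rfm_points, n_clusters=4, max_iter=100)`, where `rfm_points` is a hypothetical list of (R, F, M) scale triples, one per school. Each school can then be assigned to the cluster in which its membership is largest.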

A. Cluster analysis
Based on Table 3 and Table 4, students from schools segmented into the SP category (C1) register as new students at Bumigora University almost every year, with an average of 6-7 registrants. Students from schools segmented into the P category (C2) also register, with an average of 2-3 registrants. In these two categories, an approach is needed to increase the number of registrants. Schools in the CP category (C3) show a recency of about 2-3 times over the period of analysis, but their average number of applicants is low, so these schools may or may not be considered as partners. Schools in C4, the KP category, are not analyzed because they have the lowest recency, frequency, and monetary scales. Table 5 shows the distribution of school potential. The total number of schools in the C1, C2, and C3 categories reaches 42 % of the 342 analyzed schools. Under the Pareto principle, where 80 % of business comes from 20 % of customers, the schools in the C1 cluster, which have the strongest influence on the university's business, account for only 2 %. This is still far from the 20 % threshold; even combined with C2, these clusters remain below 20 %.

B. Cluster analysis toward partner and non-partner schools
The analysis of partner and non-partner schools is presented in Figure 1 and Table 6. Of a total of 52 partner schools, only 37 are segmented into C1, C2, and C3; the remaining partner schools are segmented into C4. This means that 71 % of partner schools are categorized as loyal partners. Therefore, to reach the 20 % business threshold, Bumigora University management must implement the MoU with the partner schools categorized as C4 as soon as possible. Partnerships should also be offered, in particular, to the remaining schools categorized as C1 and C2.
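The loyalty share follows directly from the counts given above:

```latex
\frac{37}{52} \approx 0.712 \;(71\,\%), \qquad
\frac{52 - 37}{52} = \frac{15}{52} \approx 0.288 \;(29\,\%)
```

That is, 15 of the 52 partner schools fall into C4.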

C. Cluster analysis toward cross-tabulation among partnership, school type, and school region
The types of schools evaluated in this study are state high schools (SHS), private high schools (PHS), Islamic state high schools (ISHS), Islamic private high schools (IPHS), state vocational high schools (SVHS), and private vocational high schools (PVHS) in all regencies of West Nusa Tenggara (NTB) province. Some schools come from outside NTB. Based on the crosstab results in Table 6, 6 out of 8 schools in cluster C1 have not yet formed a partnership with the institution, and most of them are located in Mataram City. The rest are located in East Lombok Regency.
In cluster C2, Central Lombok Regency has five schools that have formed a partnership with the institution, the highest number of partnerships among the regencies. West Lombok Regency and East Lombok Regency have six and eight schools, respectively, that have not formed a partnership. The crosstab result for cluster C2 is shown in Table 6; all schools in this cluster have the potential to send registrants next year.
Cluster C3, as shown in Table 6, is dominated by non-partner schools. Schools in the CP category are located in East Lombok Regency (15 schools), West Lombok Regency (11 schools), West Sumbawa Regency (12 schools), Mataram (8 schools), and Central Lombok Regency (8 schools). Overall, by school type, most registrants come from state high schools, state vocational high schools, and private high schools, in that order. The distribution of registrants' school areas reveals that the top three regions are East Lombok, West Lombok, and Mataram City.
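A cross-tabulation of the kind discussed above can be sketched with the Python standard library. The record fields and the sample rows below are hypothetical and serve only to show the mechanics; they are not taken from Table 6.

```python
from collections import Counter


def crosstab(rows, row_key, col_key):
    """Count records for every (row_key, col_key) combination."""
    table = {}
    pairs = Counter((row[row_key], row[col_key]) for row in rows)
    for (r, c), n in pairs.items():
        table.setdefault(r, {})[c] = n
    return table


# Hypothetical example records (illustrative values only).
schools = [
    {"cluster": "C1", "partner": "yes", "regency": "Mataram"},
    {"cluster": "C1", "partner": "no", "regency": "Mataram"},
    {"cluster": "C2", "partner": "no", "regency": "East Lombok"},
    {"cluster": "C2", "partner": "no", "regency": "East Lombok"},
]

by_partner = crosstab(schools, "cluster", "partner")
by_regency = crosstab(schools, "cluster", "regency")
```

The same function can be applied to any pair of attributes, for example cluster against school type, to reproduce the breakdowns used in this subsection.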
All the previous results and analyses can serve as a basis for Bumigora University management to evaluate its new student admission strategy. More specifically, the following strategies can be formulated: 1) offering partnerships by establishing a Memorandum of Understanding, 2) providing appropriate rewards for schools that contribute even though they are not yet partners, 3) asking schools, as stakeholders, for their opinions regarding what they need from graduates of Bumigora University, 4) recommending other levels of education at Bumigora University, such as vocational degrees, 5) starting to build relationships with schools that have not become partners, and 6) starting the implementation of all these policies with schools in the same area as, or closer to, Bumigora University.
In general, it can be concluded that the FCM algorithm also gives satisfactory results for customer segmentation in this study. The result is consistent with Afrin and Tabassum [19], Al-Augby et al. [20], Taufik and Ahmad [21], Sheshasayee and Sharmila [22], and Ghosh and Dubey [23], even though the type of customer data clustered in this study differs from the customer data typical of marketing. This can be seen from the PCI value of 0.86, which means that the clusters formed are of good quality according to the validation criterion. The RFM approach applied before the FCM method also influenced the segmentation results, in line with [1]-[5], which show that applying RFM in customer segmentation gives relatively good results.
Further research can add more variables, such as academic ability and registrant motivation. The approach can also be compared with other clustering and classification algorithms to determine the best method for the same problem, such as k-means in [1], [11], FCM in [2], and SOM in [4], [14], [15]. The results can be applied to develop a prediction or decision support system.

IV. CONCLUSION
Customer loyalty segmentation for a university can be achieved on a previous new student admission dataset by integrating RFM analysis and the FCM algorithm. The results show that, after RFM conversion of the dataset, schools can be clustered into four potential categories, i.e., high potential (SP), potential (P), low potential (CP), and very low potential (KP), with a PCI of 0.86. This implies that the system can be used to determine the level of customer loyalty, assess each school as a potential partner, and prepare strategies for recruiting new students.