A Simple Classifier for Detecting Online Child Grooming Conversation

The massive proliferation of social media has opened possibilities for perpetrators to commit the crime of online child grooming. Because of the scale of the problem, it can only be tamed effectively and efficiently by using an automatic grooming conversation detection system. The current study addresses the issue by using the Support Vector Machine (SVM) and k-nearest neighbors (k-NN) classifiers. In addition, the study proposes a classification method with a low computational cost, which classifies a conversation by the number of grooming characteristics it contains. All proposed methods are evaluated using 150 textual conversations, of which 105 are grooming and 45 are non-grooming. We identify 17 grooming characteristics in the grooming conversations. The results suggest that the SVM and k-NN can identify grooming conversations with 98.6% and 97.8% accuracy, respectively. Meanwhile, the proposed simple method has 96.8% accuracy. The empirical study also suggests that two of the seventeen characteristics are insignificant for the classification.


Introduction
Online child grooming is defined as a process to approach, persuade, and engage a child, the victim, in sexual activity by using the Internet as a medium. Perpetrators approach the victim to build not only a sexual but also an emotional relationship [1]. The massive proliferation of social media has opened possibilities for perpetrators to conduct the crime of online child grooming on a larger scale [2]. According to the Child Exploitation and Online Protection Agency, online child grooming was the most reported crime in the UK in 2009-2010 [2]. It affects the victim's life psychologically, physically, emotionally, behaviorally, and psycho-socially [3].
To uncover this type of crime, investigators usually rely on the conversation texts, in which the grooming patterns are carefully analyzed [4]. With the vast amount of textual conversation data, the process becomes burdensome and requires a significant amount of time. The manual approach of investigating grooming patterns is also error-prone [4]; besides, the grooming process may take minutes, hours, days, or months [5][6][7].
For the reasons described above, it is important to develop an automatic system that analyzes a conversation text and detects the possibility of an online child grooming conversation. During the last five years, a number of research works have addressed the issue using various pattern detection schemes, including k-means clustering by Kontostathis, Edwards, and Leatherman [4], a rule-based approach by McGhee et al. [8], and the Support Vector Machine (SVM) by Pandey, Khapaftis, and Manandhar [9]. Recently, Pranoto, Gunawan, and Soewito [10] developed a grooming detection system utilizing a logistic regression model. The SVM method seems to work best for text-based classification according to Ref. [11]. SVM has also been demonstrated for image-based classification, such as the detection of coronary artery disease [12] and breast cancer [13]. Reference [14] used SVM for developing an intrusion detection system. This study intends to propose a simple method to detect an online child grooming conversation. In doing so, the study firstly identifies the main

Relevant Theories

The characteristics of online child grooming
Online child grooming conversation texts are complex, as they vary in duration, type, and intensity depending on the perpetrator's characteristics and behavior. However, in general, O'Connell [15] and Gupta [16] have identified the typical stages in an online child grooming process. The first is the friendship forming stage. The perpetrator tries to get introduced to the child and then to establish a possibility of exchanging name, location, age, etc. Furthermore, the perpetrator inquires about other online information related to the child and requests photos to confirm that the child is indeed a child.
The second is the relationship forming stage. The perpetrator and the child talk about the family, school, interests, and hobbies of the child so that the perpetrator can exploit them by deceptively making the child believe that they are in a relationship. The third is the risk assessment stage. The perpetrator tries to gauge the level of threat and danger by talking to the child. He ensures that the child is alone and that nobody else is reading their conversations.
The fourth is the exclusivity stage. The perpetrator tries to gain the complete trust of the child. Often, the concepts of love and care are introduced by the perpetrator at this juncture. The fifth is the sexual stage. The perpetrator and the child talk about sexual activities and develop sexual fantasies. Finally, the sixth is the conclusion stage. In this stage, the perpetrator approaches the child for a meeting in person. These stages of online child grooming may or may not occur in sequence. The frequency, order, and extent of the occurrence of these stages may vary from chat to chat. On the basis of the previous work [10] and Refs. [2][3][4][5][6][7][8][9][10][11][12][13][14][15][16], we have identified 17 grooming characteristics, see Table 1; their relation to the grooming stages is presented in Table 2. These characteristics are used to classify the online conversation texts in the current study.

Table 1. The grooming characteristics.
3 Asking picture. The perpetrator asks the victim to send his/her picture or vice versa [16].
4 Giving compliment. The perpetrator compliments the victim in order to make the victim happy and flattered [16].
5 Talking about activity, favourite hobby, and school. The perpetrator and the victim talk about daily activities, favourite hobbies, and the victim's school activities [16].
6 Talking about friend and relationship. The perpetrator and the victim talk about friendships or relationships, such as asking about a relationship with another person [16]. If the victim is not in a relationship with another person, it is easier for the perpetrator to get closer.
7 Asking questions to know the risk of conversation. The perpetrator tries to figure out the risk of their conversation, that is, whether the conversation is known to the victim's parents [10]. Usually, the perpetrator asks who else uses the victim's computer, where the computer is located, and whether the victim's parents know the password of the chat application.
8 Acknowledging wrong-doing. The perpetrator tells the potential victim that what they are doing is wrong and carries legal risks for the perpetrator [10]. By saying this, the perpetrator aims to avoid legal cases that could lead to his/her imprisonment in the future.
9 Asking if the child is alone or under adult or friend supervision. The perpetrator wants to make sure whether the victim is alone or under supervision [2].
10 Trying to build mutual trust. The perpetrator tries to build mutual trust with the victim; the next level of the relationship is easier for the perpetrator once he/she gains the victim's trust [10], [16].
11 Using falling-in-love words. In the conversation between the perpetrator and the victim, they use words to express that they are in love [2], [16].
12 Using words to express feeling. In the conversation between the perpetrator and the victim, they use words to express their feelings [10].
13 Using words about biology, body, intimate parts, and sexual category. In the conversation between the perpetrator and the victim, they use words that contain sexual context [10].
14 Asking hot picture. The perpetrator asks the victim for sexually themed photos or vice versa [10], [16]. These pictures can be used as fantasy material or as a tool to threaten the victim into obeying the perpetrator.
15 Introducing sexual stage. The conversation starts to touch on sexual context, such as asking about sexual experiences [10], [16].

Table 2. The relation between the grooming stages and the grooming characteristics.
No. Stage: Characteristic
6 Talking about friend and relationship
7 Risk assessment: Asking questions to know the risk of conversation
8 Acknowledging wrong-doing
9 Asking if the child is alone or under adult or friend supervision
10 Exclusivity: Trying to build mutual trust
11 Using falling-in-love words
12 Using words to express feeling
13 Sexual: Using words about biology, body, intimate parts, and sexual category
14 Asking hot picture
15 Introducing sexual stage
16 Sexual stage
17 Conclusion: Arranging further contact and meetings

Support Vector Machine Classification Method
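In the present study, we only use the Support Vector Machine (SVM) for linearly separable data. The SVM is a numerical method to compute a hyperplane that separates a two-class dataset; it can easily be extended to multiple-class problems. The SVM establishes the hyperplane, governed by $(\mathbf{w}, b)$, by using the support vectors, which are the data points closest to the hyperplane. The following SVM formulation is derived from Refs. [17][18]; readers are referred to the two sources for a detailed exposition. We consider the point sets $\{(\mathbf{x}_i, y_i)\}_{i=1}^{n}$ with $\mathbf{x}_i \in \mathbb{R}^d$ and $y_i \in \{-1, +1\}$, and the hyperplane $\langle \mathbf{w}, \mathbf{x} \rangle + b = 0$, where $\mathbf{w} \in \mathbb{R}^d$, $\langle \mathbf{w}, \mathbf{x} \rangle$ denotes the inner (dot) product of the vectors $\mathbf{w}$ and $\mathbf{x}$, and $b$ is a scalar constant. The hyperplane is obtained by solving:
$$\min_{\mathbf{w},\, b} \ \frac{1}{2} \lVert \mathbf{w} \rVert^2 \quad \text{subject to} \quad y_i \left( \langle \mathbf{w}, \mathbf{x}_i \rangle + b \right) \ge 1, \quad i = 1, \dots, n.$$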
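As a brief illustration of why the support vectors determine the hyperplane, the standard textbook reasoning for the linearly separable case can be sketched as follows. The notation (training points $\mathbf{x}_i$ with labels $y_i \in \{-1, +1\}$, and the scaling in which the closest points satisfy $|\langle \mathbf{w}, \mathbf{x}_i \rangle + b| = 1$) is assumed here and may differ from that of Refs. [17][18].

```latex
\begin{aligned}
\text{distance of } \mathbf{x} \text{ to the hyperplane} &= \frac{\lvert \langle \mathbf{w}, \mathbf{x} \rangle + b \rvert}{\lVert \mathbf{w} \rVert}, \\
\text{margin between the two classes} &= \frac{2}{\lVert \mathbf{w} \rVert}
  \quad \text{(closest points on each side lie at distance } 1/\lVert \mathbf{w} \rVert\text{)}, \\
\text{maximizing } \frac{2}{\lVert \mathbf{w} \rVert} \ &\Longleftrightarrow\ \text{minimizing } \frac{1}{2}\lVert \mathbf{w} \rVert^{2}, \\
\text{predicted class of a query } \mathbf{x} &= \operatorname{sign}\bigl( \langle \mathbf{w}, \mathbf{x} \rangle + b \bigr).
\end{aligned}
```

Only the points that attain $y_i(\langle \mathbf{w}, \mathbf{x}_i \rangle + b) = 1$ constrain the solution, which is why they are called the support vectors.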

k-nearest neighbor classification method
The k-Nearest Neighbor (k-NN) is an instance-based learning algorithm that classifies the query data on the basis of the training dataset. For the classification purpose, firstly, the method selects from the training dataset the k data points most similar to the query data. The term similar is usually quantified by using the Euclidean distance [19]. Secondly, the method evaluates the classes of those k selected data points. Finally, the query data is assumed to belong to the dominant class among them.
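The three steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the study: the function names `euclidean` and `knn_predict` and the toy two-dimensional points are hypothetical, and ties are not handled specially (with two classes and an odd k they cannot occur).

```python
import math
from collections import Counter

def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Step 1: pick the k training points closest to the query.
    nearest = sorted(zip(train_X, train_y),
                     key=lambda pair: euclidean(pair[0], query))[:k]
    # Step 2: collect the classes of those k points.
    votes = Counter(label for _, label in nearest)
    # Step 3: return the dominant class.
    return votes.most_common(1)[0][0]

# Toy data (hypothetical): class 0 clusters near the origin, class 1 near (5, 5).
train_X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_y = [0, 0, 0, 1, 1, 1]
print(knn_predict(train_X, train_y, (0.5, 0.5), k=3))
print(knn_predict(train_X, train_y, (5.5, 5.5), k=3))
```

Sorting the whole training set is adequate for a dataset of this size; larger collections would typically use a partial sort or a spatial index.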

Accuracy Indicator
The classification accuracy is computed by:
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN},$$
where TP stands for True Positive, TN for True Negative, FP for False Positive, and FN for False Negative [20].
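As a small worked example, the accuracy can be computed from predicted and true labels as the number of correct predictions (TP + TN) divided by the number of all predictions (TP + TN + FP + FN). The helper names and the five toy labels below are hypothetical, with 1 standing for grooming and 0 for non-grooming.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, TN, FP, FN for a two-class problem."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1   # predicted grooming, actually grooming
            else:
                fp += 1   # predicted grooming, actually non-grooming
        else:
            if truth == positive:
                fn += 1   # predicted non-grooming, actually grooming
            else:
                tn += 1   # predicted non-grooming, actually non-grooming
    return tp, tn, fp, fn

def accuracy(tp, tn, fp, fn):
    """Fraction of correct predictions among all predictions."""
    return (tp + tn) / (tp + tn + fp + fn)

# Toy labels (hypothetical), not data from the study.
y_true = [1, 1, 1, 0, 0]
y_pred = [1, 1, 0, 0, 1]
counts = confusion_counts(y_true, y_pred)
print(counts, accuracy(*counts))
```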

Research Method
The research procedure is schematically shown in Figure 1 and a few important steps are briefly explained in the following.

Dataset Preparation
Two types of conversation texts are required for this research: the first type consists of conversations of actual online child grooming, and the second type consists of non-grooming conversations that nevertheless exhibit grooming characteristics. The first-type conversations are randomly selected from the Perverted Justice website [21], a website that contains more than 500 texts of grooming conversations involving perpetrators and children, juvenile victims, or undercover law enforcement officers. Only 105 texts are selected. The source has also been used by previous researchers [4], [8][9][10], [22].
The second-type conversations are selected from the Literotic website [23]. The website contains conversation scripts of people legally expressing their sexual passion. Literotic states the purpose of its chat room as follows: "this chat room is here for adults interested in erotic subject, so be aware of that before you enter!". Forty-five non-grooming conversation texts are randomly selected from the site.

Preprocessing
The text of an online conversation contains much noise from the perspective of document classification. This noise should be minimized or eliminated, if possible, prior to the analysis that determines the grooming characteristics. All texts in this research are subjected to the following processes. Tokenization: non-letter characters in the document are removed and each document is partitioned into words. Transformation: words in the document are transformed into lowercase. Stopword elimination: words that occur frequently across documents but are not significantly useful are erased. Stemming: words in the document are reduced to their roots using the Porter algorithm. Generating 3-grams: the words in the document are formed into 3-grams, each being one contiguous sequence of three words from the document.
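The preprocessing steps listed above can be sketched as a single Python function. This is an illustrative sketch only: the tiny stopword list is an assumption, the function name `preprocess` is hypothetical, and the Porter stemmer is not implemented here; the `stem` parameter defaults to the identity function and is meant to be replaced by a real Porter stemmer.

```python
import re

# Tiny illustrative stopword list (assumption); a real system would use a fuller list.
STOPWORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "in", "you", "i"}

def preprocess(text, stopwords=STOPWORDS, stem=lambda word: word, n=3):
    """Turn raw conversation text into a list of word n-grams (3-grams by default)."""
    # Tokenization: replace runs of non-letter characters by spaces, then split into words.
    tokens = re.sub(r"[^A-Za-z]+", " ", text).split()
    # Transformation: lowercase every word.
    tokens = [token.lower() for token in tokens]
    # Stopword elimination: drop frequent but uninformative words.
    tokens = [token for token in tokens if token not in stopwords]
    # Stemming: reduce each word to its root (identity placeholder unless `stem` is supplied).
    tokens = [stem(token) for token in tokens]
    # n-gram generation: every contiguous sequence of n words.
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

grams = preprocess("Where is your school? Do you like the school, really?")
print(grams)
```

A text with fewer than three remaining words yields an empty list, so very short messages contribute no 3-gram features in this sketch.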

Feature Extraction
Texts that have been preprocessed would be transformed into a vector space model (VSM). The features are words or combinations of words that form the word list. The word list is denoted with

Feature Selection
The feature extraction results for each document in the VSM are used to create a grooming characteristic vector. The grooming characteristics used are the 17 characteristics listed in Table 1.
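One possible way to build such a characteristic vector is sketched below. The text does not specify how word-list features are mapped to characteristics, so everything here is an assumption made for illustration: the vector is binary (1 if a characteristic is present), and `lexicon`, which maps a characteristic number (1 to 17) to a set of indicator n-grams, is a hypothetical structure with made-up entries.

```python
NUM_CHARACTERISTICS = 17

def characteristic_vector(ngrams, lexicon, size=NUM_CHARACTERISTICS):
    """Return a binary vector; entry i-1 is 1 if characteristic i is found in the document."""
    present = set(ngrams)
    vector = [0] * size
    for number, indicators in lexicon.items():
        # A characteristic counts as present if any of its indicator n-grams occurs.
        if present & set(indicators):
            vector[number - 1] = 1
    return vector

# Hypothetical indicator n-grams for two of the 17 characteristics.
lexicon = {
    3: {"send picture please"},   # characteristic 3: asking picture
    9: {"home alone now"},        # characteristic 9: asking if the child is alone
}
document_ngrams = ["hello send picture", "send picture please"]
print(characteristic_vector(document_ngrams, lexicon))
```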

Classification
The classification would be performed using SVM (see Subsection 2.2), k -NN (see Subsection 2.3) and our proposed method, which is based on the number of grooming characteristics in the document.

Results and Discussion
We have analyzed 150 conversation texts consisting of 105 grooming conversations, randomly taken from www.perverted-justic.com, and 45 non-grooming conversations, randomly taken from www.literotika.com. We have identified seventeen grooming characteristics by studying those grooming conversations and by considering previous works. Those characteristics are then represented in a vector space. These characteristics and their frequencies of occurrence in grooming and non-grooming conversations are presented in Table 3. What makes automatic classification difficult is that the grooming characteristics also appear in non-grooming conversations, as shown by Table 3. For example, the most prevalent characteristic, which is the 13th characteristic, "using words about biology, body, intimate parts, and sexual category", appears in 105 grooming text conversations and in 43 non-grooming text conversations.
Another insight shown by the table is that two characteristics, namely, the 14th characteristic, "asking hot picture", and the 7th characteristic, "asking questions to know the risk of conversation", appear rarely. We hypothesize that these two characteristics may not contribute significantly to the performance of the automatic document classification. This will be empirically evaluated.
In the following, we discuss the results in terms of the classification accuracy for the various classification methods, with and without the 7th or 14th grooming characteristic. The training set consists of 70 grooming and 30 non-grooming conversations. The testing set consists of 35 grooming and 15 non-grooming conversations. For the first set of results, we compare the accuracy obtained with several SVM kernel functions, namely, the RBF, quadratic, polynomial, and linear functions. The average accuracies are shown in Table 4. These results reveal some interesting phenomena, some expected and some unexpected. We expected the highest accuracy to be achieved by using all grooming characteristics. This expectation is confirmed for three types of SVM kernels: polynomial, quadratic, and linear. With the RBF kernel, the results are rather unexpected: the accuracies without the 7th and 14th characteristics are better than the accuracy with all characteristics. Likewise, the expectation that the 7th and 14th grooming characteristics would only slightly affect the accuracy is confirmed only for the same three kernels. Using all grooming characteristics, the accuracy of the SVM method is within the range of 83-98%, depending on the selection of the kernel function. In comparison to the method utilizing the logistic regression model, see Ref. [10], the three kernels provide slightly better accuracies, whereas the RBF kernel produces a lower accuracy than the logistic model.
For the second set of results, we compare the accuracy obtained with a different classifier, namely the k-NN method with k values of 1, 3, and 5. The average results are shown in Table 5. These results completely agree with our expectation. The highest average accuracy is achieved by using all grooming characteristics, and this holds for all values of k. The expectation that the 7th and 14th grooming characteristics would only slightly affect the accuracy is also confirmed for all k values. Without the 14th characteristic, the average accuracy is the same as with all grooming characteristics. Using all grooming characteristics, the average accuracy of the k-NN method is within the range of 96.8-97.8%, depending on the k value. However, it is not clear whether increasing the k value increases or decreases the accuracy.
Finally, we propose a simple classification method, which requires a very low computational cost and is therefore suitable for implementation on mobile electronic devices. The proposed method classifies a conversation on the basis of the number of grooming characteristics it contains. The method is motivated by the observation that the number of grooming characteristics is markedly different between grooming and non-grooming conversations; see Table 6.

Table 6. The distribution of the number of grooming characteristics in the 150 grooming and non-grooming textual conversations in the current study.

The table suggests that a conversation tends to be a grooming conversation if it contains between 8 and 17 grooming characteristics. Meanwhile, a conversation tends to be a non-grooming conversation if it contains about 2-11 grooming characteristics. Thus, the number of grooming characteristics can simply be used as a classifier, despite the fact that the two categories overlap in their numbers of grooming characteristics. To evaluate a text conversation, we set a certain threshold, count the grooming characteristics, and decide that the conversation is grooming if the count equals or exceeds the threshold.
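The counting rule described above can be sketched as follows. This is a minimal illustration, assuming each conversation is already represented by a 17-entry binary characteristic vector; the function names, the label convention (1 for grooming, 0 for non-grooming), and the toy counts are hypothetical and are not data from the study.

```python
def count_characteristics(vector):
    """Number of grooming characteristics present in a binary characteristic vector."""
    return sum(1 for value in vector if value)

def classify_by_count(vector, threshold):
    """Return 1 (grooming) if the count equals or exceeds the threshold, else 0."""
    return 1 if count_characteristics(vector) >= threshold else 0

def best_threshold(vectors, labels, candidates=range(1, 18)):
    """Try each candidate threshold and return (threshold, accuracy) of the first best one."""
    best_t, best_acc = None, -1.0
    for t in candidates:
        correct = sum(classify_by_count(v, t) == y for v, y in zip(vectors, labels))
        acc = correct / len(labels)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t, best_acc

def toy_vector(count, size=17):
    """Build a toy vector with `count` characteristics present (hypothetical helper)."""
    return [1] * count + [0] * (size - count)

# Toy conversations (hypothetical): three grooming and two non-grooming.
vectors = [toy_vector(c) for c in (12, 15, 9, 3, 7)]
labels = [1, 1, 1, 0, 0]
print(best_threshold(vectors, labels))
```

Classifying one conversation costs a single pass over 17 entries and one comparison, which is what makes the rule attractive for devices with little computational power.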
We empirically evaluate the method by varying the threshold value from 1 to 17. If the number of grooming characteristics in a document is less than the threshold, the conversation is classified as a non-grooming conversation; otherwise, it is classified as a grooming conversation. The average accuracies are shown in Table 7. These empirical data suggest that the highest average accuracy is achieved at the threshold value of 11, which provides an accuracy of 96.8%. The expectation that the 7th and 14th grooming characteristics would only slightly affect the accuracy is confirmed for all of the threshold values.
Finally, we compare the accuracy of the three classification methods: SVM, $k$-NN, and our proposal. For the SVM method, we only include the results of using the linear kernel, as they are the best among the kernels. For the same reason, for the $k$-NN method, we include only the case of k = 3. The comparison is presented in Table 8. The three methods support the hypothesis that the accuracy drops only slightly when the 7th and 14th grooming characteristics are excluded. In addition, these results suggest that the SVM classifier performs best in terms of accuracy. The proposed method, despite its simplicity, also performs rather well.
With the empirical findings presented in the current and previous [10] works, the proposed seventeen characteristics seem to be highly representative and distinctive for differentiating a grooming textual conversation from a non-grooming one. Even when the characteristics are used in conjunction with a simple classification method, the classification is very likely to be correct. Despite these findings, we also note that the likelihood of success of online child grooming increases when perpetrators employ identity deception and suggest secrecy [24]. In the current classification framework, these behaviors have been identified as one of the grooming characteristics. However, it is of great interest to understand to what extent the victim's age, child or adult, affects the success of the current classification method, and we leave this issue as future work. As a final note, the present work has opened a possibility for developing an automatic grooming detection system on devices that have low computational power.

Conclusion
An automatic system to detect online child grooming has an important role in analyzing the vast amount of conversation texts. For this reason, many studies have been performed using various pattern detection schemes. In the current work, seventeen characteristics of grooming conversations are identified and utilized for classification. Two traditional classification methods are used: SVM and k-NN. Moreover, this work proposes a simple classification method based on the number of grooming characteristics present in the conversation. The numerical analysis using empirical data suggests that the SVM method with the linear kernel is the best among the methods considered, with an average accuracy of 98.6%. Our proposed method, despite its simplicity, also performs well, with an average accuracy of 96.8%. The empirical study also suggests that two of the seventeen characteristics are insignificant for the classification accuracy.