Decision Tree Algorithm for Precision Marketing via Network Channel

With the development of e-commerce, more and more enterprises attach importance to precision marketing for network channels. This study adopted the decision tree algorithm in data mining to achieve precision marketing. Firstly, precision marketing and the C4.5 decision tree algorithm were briefly introduced. Then e-commerce enterprise A was taken as an example, and its data from January to June 2018 were collected. Four attributes, namely age, income, occupation and educational background, were selected for calculation, and a decision tree was established to extract classification rules. The results showed that the consumers of the company's products were high-income young and middle-aged people, middle-income young people, middle-income middle-aged and elderly people with a college degree or above, low-income middle-aged people with a college degree or above, and low-income elderly people working in state-owned enterprises. After precision marketing to these customers, the monthly sales volume of the enterprise increased by 22.82% and the marketing cost decreased by 28.21%, which verified the effectiveness and application value of precision marketing and showed that the decision tree algorithm could provide enterprises with decision support in precision marketing.


INTRODUCTION
The rapid development of the Internet and e-commerce has brought challenges to traditional marketing methods. Only accurate marketing methods can help enterprises avoid being eliminated in fierce market price competition [Luo and Li (2014)]. Marketing activities are inseparable from data [Elsalamony (2014)], and large volumes of marketing data contain a lot of valuable customer information. Through the collection, statistics and analysis of these data [Ozyirmidokuz, Uyar and Ozyirmidokuz (2015)], precision marketing can be realized, which can reduce marketing costs and improve marketing efficiency [Abbas (2015); Mitik, Korkmaz, Karagoz et al. (2017)].
In order to extract valuable information from massive data, data mining technology has been widely applied. You et al. [You, Si, Zhang et al. (2015)] selected attributes through the RFM model, identified important attribute values with the CHAID decision tree and Pareto values, and classified customer groups. Targeted marketing strategies were then put forward, and a case study showed that the scheme could provide enterprises with good precision marketing strategies. Wang et al. [Wang, Fang and Wang (2015)] combined EM clustering analysis with a neural network algorithm: the sales data were first preprocessed by EM clustering and then classified by the neural network to predict the purchase behavior of customers, which effectively improved the accuracy of direct marketing. Hongda [Hongda (2017)] applied the decision tree algorithm to build a classifier model from historical data of power users and to mine customer characteristics and preferences; research on the promotion of the palm power APP showed that this way of precision marketing was effective. Zou and Wang [Zou and Wang (2008)] introduced the merchant gene bank model to improve the competitiveness of e-commerce enterprises. A customer preference model was proposed based on optimal neighbor and utility function; then the genetic algorithm was improved and the two models were matched to achieve precision marketing.
vol 35 no 4 July 2020 293
In the era of consumer supremacy, grasping customer information through data mining technology and conducting targeted marketing is of great practical significance for improving marketing effect and promoting enterprise development. Therefore, it is of great value to study the application of data mining technology in precision marketing. At present, decision trees are seldom applied in precision marketing, although the decision tree algorithm has low computational complexity and performs well on large data. In this study, the decision tree algorithm in data mining technology was used. The C4.5 algorithm was used to classify the collected historical customer data, construct a decision tree and extract classification rules, so as to obtain the customer categories with high purchase probability and conduct precision marketing on them. Moreover, an example analysis was carried out to verify the reliability of the decision tree algorithm. This work provides some theoretical support for the further application of the decision tree algorithm in precision marketing and is beneficial to the better and faster development of precision marketing in the field of e-commerce.

DECISION TREE ALGORITHM
Data classification is a kind of data mining. By classifying customer information data, precision marketing can be realized.
The methods of data classification include genetic algorithm, decision tree and neural network. The decision tree, with its simple structure and high efficiency, performs very well in data prediction [Wang, Li, ...].
The training sample set is expressed as $S$. The number of samples belonging to category $C_i$ ($i = 1, 2, \dots, m$) is expressed as $s_i$, and the probability that any sample belongs to $C_i$ is expressed as $\rho_i$. Then

$$\rho_i = \frac{s_i}{|S|},$$

and the expected information for the result is

$$I(s_1, s_2, \dots, s_m) = -\sum_{i=1}^{m} \rho_i \log_2 \rho_i.$$

Suppose attribute $D$ has $u$ different values, $D = \{d_1, d_2, \dots, d_u\}$. Attribute $D$ is used to divide the training sample set $S$ into subsets $\{S_1, S_2, \dots, S_u\}$, where $S_j$ contains the samples of $S$ whose value of attribute $D$ is $d_j$. If $D$ is the best splitting attribute, these subsets are the branches growing out of the node that holds $S$. Let $s_{ij}$ be the number of samples belonging to $C_i$ in subset $S_j$. The expected information of the subsets divided according to attribute $D$ is

$$E(D) = \sum_{j=1}^{u} \frac{s_{1j} + s_{2j} + \cdots + s_{mj}}{|S|} \, I(s_{1j}, s_{2j}, \dots, s_{mj}), \qquad I(s_{1j}, s_{2j}, \dots, s_{mj}) = -\sum_{i=1}^{m} \rho_{ij} \log_2 \rho_{ij},$$

where $\rho_{ij} = \frac{s_{ij}}{|S_j|}$ represents the probability that a sample in $S_j$ belongs to $C_i$.
The information gain of attribute $D$ is

$$\mathrm{Gain}(D) = I(s_1, s_2, \dots, s_m) - E(D).$$

The split information of attribute $D$ is

$$\mathrm{SplitInfo}(D) = -\sum_{j=1}^{u} \frac{|S_j|}{|S|} \log_2 \frac{|S_j|}{|S|}.$$

Then, the information gain rate of attribute $D$ is

$$\mathrm{GainRatio}(D) = \frac{\mathrm{Gain}(D)}{\mathrm{SplitInfo}(D)}.$$

The attribute with the largest information gain rate is selected as the splitting attribute. The data set is divided into subsets according to its values, and the procedure is repeated on each subset to establish the decision tree.
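The quantities defined in this section can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in the study: the function names `entropy` and `gain_ratio` and the six-row toy sample are assumptions made for the example.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Expected information I(s_1, ..., s_m) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def gain_ratio(values, labels):
    """Information gain rate of one attribute.

    values[k] is the attribute value of sample k and labels[k] is its class.
    """
    total = len(labels)
    # Partition the class labels by attribute value (the subsets S_j).
    subsets = {}
    for value, label in zip(values, labels):
        subsets.setdefault(value, []).append(label)

    # E(D): entropy of each subset weighted by its share of the samples.
    expected = sum(len(s) / total * entropy(s) for s in subsets.values())
    gain = entropy(labels) - expected
    # SplitInfo(D): entropy of the subset sizes themselves.
    split_info = -sum(
        len(s) / total * log2(len(s) / total) for s in subsets.values()
    )
    # An attribute with a single value cannot split the set.
    return gain / split_info if split_info > 0 else 0.0


# Hypothetical toy sample (not the paper's data): income level vs. purchase.
income = ["high", "high", "medium", "medium", "low", "low"]
bought = ["Y", "Y", "Y", "N", "N", "N"]
ratio = gain_ratio(income, bought)
print(round(ratio, 4))
```

Computing `gain_ratio` for every candidate attribute and taking the maximum gives the splitting attribute for the current node.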

Data Acquisition
E-commerce enterprise A is taken as the research subject. Its transactions are mainly conducted online, which allows enterprise A to communicate with customers through the network platform. Its marketing mainly takes the form of Internet advertising, which attracts customers by placing advertisements in search engines, WeChat, Weibo and other online channels. In order to achieve precision marketing, this paper classifies customers through the decision tree to determine which customers are likely to buy the products of the enterprise and are therefore worth including in the marketing plan. The browsing and purchasing information of enterprise products from January to June 2018 was collected, and SQL Server was used to establish a database.

Data Processing
Since the collected data may contain a large number of vacant values and incomplete records, it is necessary to clean the data and screen out the information that is valuable to precision marketing, in order to improve the data quality and the accuracy of data classification.
Out of consideration for consumer privacy, private information such as family status and marital status was not selected. In this study, four attributes, namely age, income, occupation and education background, were selected to establish the decision tree. Y was used to represent a final purchase, and N was used to represent no purchase. The classification is shown in Table 1.
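The cleaning and screening step can be illustrated with a short sketch. The record layout, the field names and the sample rows below are assumptions made for the example; the paper does not describe the database schema.

```python
# Attribute names follow the paper; the record layout itself is an assumption.
REQUIRED = ("age", "income", "occupation", "education", "purchased")


def clean(records):
    """Keep only complete records and only the attributes used for modelling."""
    cleaned = []
    for record in records:
        # Treat missing keys, None and empty strings as vacant values.
        if any(record.get(field) in (None, "") for field in REQUIRED):
            continue
        # Dropping every other field also removes private information
        # such as marital status.
        cleaned.append({field: record[field] for field in REQUIRED})
    return cleaned


# Hypothetical raw rows, invented for illustration.
raw = [
    {"age": 28, "income": 4200, "occupation": "state-owned",
     "education": "university or above", "purchased": "Y",
     "marital_status": "single"},
    {"age": None, "income": 3100, "occupation": "non-state-owned",
     "education": "university or above", "purchased": "N"},
    {"age": 55, "income": 2500, "occupation": "state-owned",
     "education": "", "purchased": "Y"},
]
rows = clean(raw)
print(len(rows))
```

Only the first row survives here: the second has a vacant age and the third a vacant education value.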

Establishment of Decision Tree
One hundred records were randomly selected from the database as the training sample set. The data were discretized as follows. Age was divided into three groups: (1) under 30 years old: young people; (2) 30-50 years old: middle-aged; (3) over 50 years old: the elderly. Income was divided into three groups: (1) under 3000 yuan: low income; (2) 3000-5000 yuan: medium income; (3) over 5000 yuan: high income. Occupations were divided into two groups: (1) state-owned enterprises; (2) non-state-owned enterprises. Academic qualifications were divided into two groups: (1) university degree or above; (2) below university degree.
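The discretization can be written as two small mapping functions. This is a sketch under stated assumptions: the low-income threshold is taken to be 3000 yuan, consistent with the 3000-5000 medium band, and the boundary values (30 and 50 years, 3000 and 5000 yuan) are assigned to the middle groups.

```python
def age_group(age):
    """Map an age in years to one of the three age groups."""
    if age < 30:
        return "young"
    if age <= 50:  # assumption: 30 and 50 both count as middle-aged
        return "middle-aged"
    return "elderly"


def income_group(income):
    """Map an income in yuan to low, medium or high."""
    if income < 3000:
        return "low"
    if income <= 5000:  # assumption: 3000 and 5000 both count as medium
        return "medium"
    return "high"


# Hypothetical record, invented for illustration.
record = {"age": 42, "income": 5200}
discrete = {
    "age": age_group(record["age"]),
    "income": income_group(record["income"]),
}
print(discrete)
```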
The training sample set is shown in Table 2.
The training sample set is sorted according to attributes, and the results are shown in Table 3.
Of the 100 records, 46 were classified as "Y" and 54 were classified as "N", so the expected information of the classification result is I = 0.978.
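The expected information of a two-class split can be checked directly from the class counts. The sketch below uses the base-2 entropy formula; the function name `expected_information` is an assumption. For a 46/54 split the formula evaluates to roughly 0.995, so the value of 0.978 reported above presumably reflects counts or rounding that cannot be recovered from the text.

```python
from math import log2


def expected_information(counts):
    """Base-2 entropy of a class distribution given as a list of counts."""
    total = sum(counts)
    return -sum(c / total * log2(c / total) for c in counts if c > 0)


info = expected_information([46, 54])
print(round(info, 3))  # 0.995
```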
The expected information and the information gain of the different attributes are calculated, followed by the information gain rate of each attribute. It can be found that the attribute "income" has the largest information gain rate. Therefore, it is taken as the first-level attribute of the decision tree, and the same calculation is then repeated on the samples in each branch. The final decision tree is shown in Figure 1.

computer systems science & engineering Y. ZHENG

According to the classification rules extracted from the decision tree, young and middle-aged people with high income, young people with middle income, middle-income middle-aged and elderly people with a university degree or above, low-income middle-aged people with a university degree or above, and low-income elderly people in state-owned enterprises will buy the products of this enterprise, i.e., precision marketing can be carried out for them to achieve good marketing effects.
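The extracted rules can be encoded as a small predicate that marks which customers to target. This is a sketch, not the study's code: the function name `will_buy` and the string labels are assumptions, and every combination not named in the rules is assumed to fall in the "N" class.

```python
def will_buy(income, age, education, occupation):
    """Return True if the customer falls in a 'Y' leaf of the rules."""
    if income == "high":
        # High income: young and middle-aged customers buy.
        return age in ("young", "middle-aged")
    if income == "medium":
        # Medium income: young customers buy; older ones need a degree.
        if age == "young":
            return True
        return education == "university or above"
    if income == "low":
        # Low income: middle-aged with a degree, or elderly in state-owned firms.
        if age == "middle-aged":
            return education == "university or above"
        if age == "elderly":
            return occupation == "state-owned"
        return False
    raise ValueError(f"unknown income group: {income!r}")


# Hypothetical customers, invented for illustration.
targets = [
    will_buy("high", "young", "below university", "non-state-owned"),
    will_buy("medium", "elderly", "below university", "state-owned"),
    will_buy("low", "elderly", "below university", "state-owned"),
]
print(targets)  # [True, False, True]
```

Income is tested first because it is the first-level attribute of the tree; the remaining attributes are only consulted on the branches where the rules mention them.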

Analysis of Precision Marketing Effect
In order to verify the effect of precision marketing, e-commerce enterprise A applied the decision tree to conduct precision marketing to customers in September 2018. The monthly sales volume and marketing cost of August and September were compared, and the results are shown in Table 4. With precision marketing, the monthly sales volume of e-commerce enterprise A in September 2018 was 7,632, which was 22.82% higher than before precision marketing, and the marketing cost was 28.21% lower than in August, which suggests that precision marketing worked well. The marketing cost fell significantly because the enterprise only carried out the marketing plan for the designated customers, which largely avoided ineffective marketing. At the same time, the monthly sales volume increased significantly, which indicates that the more limited marketing plan achieved a good marketing effect and that precision marketing for customers with high purchase probability is practical and effective.

DISCUSSION
With the development of Internet technology, the ways of transaction and marketing have changed a lot. With the rapid development of e-commerce, more and more transactions between enterprises, and between enterprises and consumers, are conducted through the Internet. Enterprises also increasingly use the Internet as a tool to promote marketing [Andreopoulou, Tsekouropoulos, Koliouska et al. (2014)]. As these transactions go on, a lot of data is generated on the Internet. In the era of big data, how to use data for effective marketing is a problem widely valued by enterprises. With the transformation of the marketing mode from traditional broadcast marketing to precision marketing, precision marketing oriented to network channels is welcomed by more and more enterprises. Precision marketing is a form of one-to-one marketing. It takes consumers as the center and, by accurately positioning the target audience of a product, pushes the marketing scheme to that audience accurately [Li (2014)]. Precision marketing oriented to network channels can show the enterprise's marketing plan to the audience through static web pages, dynamic web pages, floating windows, ad links and other forms [Han (2015)], which not only saves marketing cost to a large extent, but also effectively improves the marketing effect and obtains greater returns.
Customer classification is the key to precision marketing. In the past, customers' information was often collected through questionnaires to understand their needs and classify them. However, this method is time-consuming and labor-intensive, and its classification accuracy is low under the influence of various factors, so it cannot provide valuable decision support for enterprises. In the era of big data, transactions conducted over the Internet in particular have accumulated a large amount of data, which contains a lot of valuable information. If this information can be fully collected, analyzed and used, much of it will be beneficial to the enterprise. The development of technology makes the storage and analysis of massive data a reality. Data mining technology refers to the process of extracting potential and valuable information and knowledge from a large amount of fuzzy and random data.
In this study, the decision tree algorithm is selected to realize the accurate classification of customers for precision marketing. The decision tree algorithm is a method for classification and prediction, and it is widely favored by enterprises for its fast classification speed and high classification accuracy. C4.5 is one of the better decision tree algorithms. This paper first introduces the C4.5 algorithm, and then takes e-commerce enterprise A as an example to realize precision marketing through the decision tree algorithm. The collected customer information is classified according to the four attributes of age, income, occupation and education background, and the information gain rate of each attribute is calculated. From the largest to the smallest, the information gain rates are in the order of income, age, education background and occupation. Therefore, the decision tree is constructed with income as the first-level attribute.
By further calculating the final decision tree and extracting the classification rules, it can be determined which customers are more likely to buy products and which are less likely to do so. Marketing plans are then designed and pushed to customers with high purchase probability, which further increases the purchase possibility of these customers and realizes precision marketing. The analysis of the precision marketing effect showed that, before precision marketing, the monthly sales volume of enterprise A was 6,214 pieces and the marketing cost was 234,000 yuan; after precision marketing, the monthly sales volume was 7,632 pieces, an increase of 22.82%, while the marketing cost was 168,000 yuan, a decrease of 28.21%. This indicates that precision marketing could effectively reduce the marketing cost of the enterprise and that targeted marketing strategies also promoted the growth of sales. The results proved the effectiveness of the precision marketing method.
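The two percentages follow directly from the reported before and after figures. A minimal check, with `percent_change` as an assumed helper name:

```python
def percent_change(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100


sales_change = percent_change(6214, 7632)       # monthly sales volume, pieces
cost_change = percent_change(234_000, 168_000)  # marketing cost, yuan
print(f"{sales_change:.2f}% {cost_change:.2f}%")  # 22.82% -28.21%
```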
Although this study has obtained some results, it still has shortcomings. For example, the enterprise data obtained in the case analysis were not comprehensive, and the data storage and customer classification methods need further research. These are directions for future work.

CONCLUSION
In this study, the C4.5 decision tree algorithm in data mining is applied to realize precision marketing for network channels. Taking e-commerce enterprise A as an example, a decision tree is constructed and classification rules are extracted through the calculation and analysis of collected customer information. The customer categories with a high probability of purchase are thereby obtained, and precision marketing is carried out for these customers. It is found that the sales volume of the enterprise increases significantly and the marketing cost decreases, indicating that the decision tree constructed with the C4.5 algorithm is effective and reliable for precision marketing. The decision tree constructed with the C4.5 algorithm provides a basis for the marketing decisions of enterprises and has great application value.