User Behavior Path Analysis Based on Sales Data

Abstract: With the rapid development of science and technology and the growing popularity of the Internet, the number of network users keeps expanding and their behavior is becoming increasingly complex. Users' actual demand for resources on a network application platform is closely related to their historical behavior records, so analyzing the conversion rate along the user behavior path is very important. This paper therefore analyzes the user behavior path based on sales data. By analyzing the user quality of the website as well as the users' repurchase rate, buyback rate and retention rate, we can identify user habits and use the data to guide website optimization.


Introduction
User behavior path analysis is an analysis method that monitors user flow and measures the depth of product use. Based on each user's click behavior log in an App or website, it analyzes the flow patterns and characteristics of each module and mines users' access or click patterns in order to serve specific business purposes. User access path analysis is a very important part of website analysis [1,2]. By analyzing the user access path, we can help specific visitors complete their tasks more efficiently at each stage of a visit while still achieving the business goals of the website [3,4].
The primary purpose of user access path analysis is twofold. The first purpose is to fulfill the visitor's task on the premise of achieving the business goals of the website. Generally speaking, a website has only one business goal, while visitors may have multiple tasks during their visit [5,6]. For example, a website's business goal is to make money from users downloading its documents, so helping users find the documents they need faster and more accurately is the purpose of user access path analysis [7,8]. The other main purpose of user access path analysis is to optimize the business objectives of the site, so as to improve the efficiency of users in completing access tasks [9,10]. This goal builds on the previous one.
This paper studies the path users follow when accessing a site. By analyzing how users access the website and how they consume on it, we identify user habits and use the data to guide website optimization, so that the operations department can market in a more targeted way, users can complete their visits more efficiently and enjoy a better experience, and the site can save costs and improve efficiency.

Research Status at Home and Abroad
With the rapid development of science and technology and the growing popularity of the Internet, the number of network users keeps expanding and their behavior is becoming increasingly complex. A large number of studies have shown that users' actual demand for resources on a network application platform is closely related to their historical behavior records [11,12]. As an abstract concept, user behavior refers to the patterns users follow when using an application service. User behavior analysis applies knowledge from several disciplines to study the characteristics of user behavior on a network platform and to uncover the behavioral characteristics and rules implicit in the way users use a network application [13,14]. The aim is to combine these rules with the network service strategy, providing a basis for further optimizing network resources and for delivering high-quality services [15]. At present, the user behavior studied mostly refers to network user behavior on a single website platform, together with the analysis of the behavior of a website's target users.
In recent years, more and more scholars at home and abroad have studied Web user behavior mining, and the analysis of user behavior is mainly based on data mining technology [16]. As early as the early 1990s, researchers abroad began to study the information analysis of user behavior in the Internet environment, and the earliest results were the studies by Chennells et al. [17,18] of users of the British academic network. China began to pay attention to and study the behavior of Internet users relatively late. At present, the methods of user behavior analysis mainly include statistics and mining based on server logs, statistical mining based on traffic usage, and statistical mining based on users' web browsing paths. Common data mining methods include statistical analysis, clustering, classification, and association rule or frequent itemset mining. In terms of user behavior analysis methods, Chen et al. [19] used a traditional classifier and a heuristic scoring mechanism based on user behavior logs to realize personalized recommendation based on content collaborative filtering and an augmented matrix. Wei et al. [20] proposed a collaborative filtering algorithm based on joint clustering smoothing to address the sparsity of user behavior data, which improved prediction accuracy to some extent. Based on the behavior of mobile communication users, Li et al. [21] proposed a multidimensional analysis method that combines network data with market development and operation data, and verified that the method achieved good results in mobile networks.
To sum up, user behavior in each application domain is unique, and the analysis methods differ accordingly. For example, because social networks have a large number of users, a wide range of behaviors and diverse emotional factors, analysis methods that combine multiple algorithms are generally adopted. Some professional scientific sharing platforms, such as geoscience data sharing platforms, have their own professional user behavior, which is generally uniform and simple to predict, and their user behavior analysis models are tailored to it. However, much current research in China focuses on analyzing the characteristics of social network users and lacks analysis of the conversion rate along the user behavior path. It is therefore necessary to analyze this conversion rate.

User Behavior Path Funnel Model Overview
The funnel model disassembles and quantifies each link in a process. It helps us analyze and monitor the key links in product operation, find the weak ones, optimize them through user guidance or product iteration, and improve conversion. There are three types of funnel model.

AIDMA Model
AIDMA, one of the mature theoretical models in the field of consumer behavior, was proposed by the American advertising scientist E. S. Lewis in 1898. According to this theory, consumers go through the following five stages:
A: Attention - for example, fancy business cards or advertising slogans embroidered on handbags.
I: Interest - the usual method is to cut out and paste refined color catalogues and news bulletins about the products.
D: Desire - a person who sells tea must keep a tea set ready and brew the customer a cup of strong, aromatic tea; a person who sells houses must show the customer the house; a restaurant should display refined samples with full color, aroma and taste at its entrance, so that the customer feels the charm of the product and wants to buy it.
M: Memory - a successful salesman says, "Every time I promote my company's products, I bring along catalogues from other companies and compare them in detail. Because if you keep saying how good your product is, your customers won't believe you. Instead, they want to learn more about other companies' products, and if you come up with other companies' products first, customers will recognize your own."
A: Action - the salesman must be confident all the way through the sales process, from drawing attention to making a purchase. Overconfidence, however, can cause resentment among customers, who think the salesman is bluffing and therefore do not trust his word.
The theory models consumers' purchase behavior, which helps advertisers publicize products more effectively after studying consumers systematically. However, the theory does not distinguish between categories of goods. In practice it suits high-involvement goods (high price, careful decision-making) better, while for low-involvement goods the consumer's decision process is often less complicated.
The model is shown in the following figure.

AISAS Model
The AISAS model is a consumer behavior analysis model proposed by Dentsu in response to the changes in consumer lifestyle in the era of the Internet and wireless applications. It emphasizes the entry point of each link and stays close to the user experience. In this new marketing rule, the two "S" stages with network characteristics, Search and Share, highlight the importance of searching and sharing in the Internet era, as opposed to blindly pushing a one-way message at users, and fully reflect the influence of the Internet on people's lifestyle and consumption behavior.
In the traditional AIDMA model, consumers pay Attention to products, generate Interest, Desire to buy, leave Memory and make purchase actions. The whole process can be controlled by traditional marketing methods.
The AISAS model (Attention, Interest, Search, Action, Share) reconstructs this process for the network age. It treats the information gathering (Search) that follows Attention and Interest, and the information sharing (Share) that follows the purchase, as two important links. Both links depend on consumers' use of the Internet, including wireless Internet applications.
The new consumer behavior model (AISAS) determines the new consumer contact points. Under Dentsu's Contact Point Management, media are no longer limited to a fixed form and no longer treated in a fragmented way. For each media type, the media form, delivery time and delivery method are chosen by starting from the feasible points of contact at which consumers can recognize the product or brand, and information is communicated to consumers at all of these contact points. At the same time, at the center of this information communication circle, the consumer website that explains product features in detail becomes the deepest point of communication with consumers from every contact point. A consumer website not only provides detailed information, so that consumers understand the product more deeply and are influenced in their purchase decisions, but also facilitates interpersonal communication among consumers. Marketers, in turn, can develop more effective marketing plans by analyzing visitor data.
Due to the irreplaceable information integration and interpersonal communication functions of the Internet, all information will be aggregated on the Internet to produce multiple communication effects, and a cross-media full communication system with the network as the aggregation center will be born.
The model is shown in the following figure.

AARRR Model
AARRR is an acronym for Acquisition, Activation, Retention, Revenue, and Referral (self-propagation), which correspond to the five key stages of a mobile application's life cycle.
Acquisition: the first step in running a mobile app is, of course, Acquisition, or promotion. If there are no users, there is no operation.
Activation: many users enter the application passively through channels such as terminal presets and advertising. How to turn them into active users is the first problem operators face.
Retention: some apps solve the activity problem and then find another one: "users come and go quickly." We sometimes say that such an app is not sticky enough.
Revenue: Revenue acquisition is actually the core of application operation. Very few people build an app out of pure interest, and most developers are most concerned with revenue. Even free apps should have a profit model.
There are many sources of revenue, and three main types: paid apps, in-app payments, and advertising. Paid apps are poorly received in China; even the Google Play Store only offers free apps there. In China, advertising is the source of income for most developers, and in-app payment is currently more widely used in the game industry.
In every case, the revenue comes directly or indirectly from users, so the increases in activity and retention described above are necessary to generate revenue. Only with a large user base can revenue reach a meaningful volume.
Referral: the earlier operational model ended at the fourth stage, but the rise of social networks has added another aspect to operation, namely viral spread through social networks, which has become a new way to acquire users. The cost is low and the results can be very good; the only prerequisite is that the product itself is good enough to earn a good reputation.
From self-propagation to acquiring new users again, the application operation forms a spiral. And the best apps take advantage of that trajectory and expand their user base.
The model is shown in the following figure (Fig. 3).
The funnel model is in effect an overview of the user path. It can describe various processes, such as the marketing purchase process, the customer acquisition and growth process, the invite-and-share process, add-to-cart conversion, operation slot conversion, repurchase and so on. The funnel model is widely used in data analysis: it helps us understand the running status of products in the current period, track the path of user behavior, operate products in a refined way, and evaluate the results of each event.
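As an illustration of how a funnel quantifies each link, the following minimal sketch computes the step-by-step and overall conversion rates of a funnel. The stage names and user counts are hypothetical and do not come from the data analyzed in this paper; the function name funnel_conversion is likewise only an illustrative choice.

```python
from typing import List, Tuple

def funnel_conversion(stages: List[Tuple[str, int]]) -> List[Tuple[str, float, float]]:
    """Return (stage, step_rate, overall_rate) for each funnel stage.

    step_rate    = users at this stage / users at the previous stage
    overall_rate = users at this stage / users at the first stage
    """
    if not stages or stages[0][1] <= 0:
        raise ValueError("the first stage must have a positive user count")
    first = stages[0][1]
    previous = first
    rows = []
    for name, count in stages:
        # A stage that follows an empty stage gets a step rate of 0.
        step_rate = count / previous if previous else 0.0
        rows.append((name, step_rate, count / first))
        previous = count
    return rows

# Hypothetical counts for a purchase funnel (not taken from the paper).
stages = [("visit", 1000), ("add_to_cart", 400), ("order", 200), ("repurchase", 50)]
report = funnel_conversion(stages)
for name, step_rate, overall_rate in report:
    print(f"{name:12s} step={step_rate:.1%} overall={overall_rate:.1%}")
```

The stage with the lowest step rate is the weak link that user guidance or product iteration would target first.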

Experiment and Analysis
This experiment analyzes the consumer behavior of users of the CDNow website from their purchase details, covering the overall consumption trend, individual consumption data, the consumption cycle, user stratification and user quality.

Data Processing
Import the data and view its basic information (Fig. 4 and Fig. 5). The columns from left to right are the user ID, purchase date, order quantity and order amount. The statistical description shows that users bought 2.41 items and spent 35.89 yuan per order on average. The standard deviation of the order quantity is 2.33, so the data fluctuate to some degree. The median is 2 items and the 75th percentile is 3 items, indicating that most orders are small. The maximum is 99, which is very high. The order amount behaves similarly, with most orders concentrated at small amounts. Overall, the consumption data follow a long-tail distribution: most users are small spenders, while a small number of users contribute most of the revenue, which is commonly known as the 80/20 rule.
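The statistical description discussed above (mean, standard deviation, quartiles, maximum) can be sketched with the Python standard library as follows. The order records are hypothetical stand-ins with the same four columns, not the CDNow data, and the helper name describe is an assumption made for illustration.

```python
import statistics

# Hypothetical order records: (user_id, purchase_date, quantity, amount).
orders = [
    (1, "1997-01-01", 1, 12.0),
    (2, "1997-01-12", 1, 11.5),
    (2, "1997-01-12", 5, 77.0),
    (3, "1997-01-02", 2, 21.0),
    (3, "1997-03-30", 2, 20.5),
    (3, "1997-04-02", 2, 19.5),
]

def describe(values):
    """Summary statistics for one numeric column."""
    # "inclusive" interpolates linearly between the observed values.
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "std": statistics.stdev(values),  # sample standard deviation
        "min": min(values),
        "25%": q1,
        "50%": q2,
        "75%": q3,
        "max": max(values),
    }

quantity_stats = describe([row[2] for row in orders])
amount_stats = describe([row[3] for row in orders])
print(quantity_stats)
print(amount_stats)
```

A mean well above the median, together with a maximum far above the 75th percentile, is the kind of signal that points to a long-tail distribution.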
The monthly total sales amount, number of purchases, sales volume and number of consumers are shown in the figure below (Fig. 6). Sales in the first three months of 1997 were particularly high and dropped suddenly after March. From February to March the number of consumers declined slightly, but the total sales amount and the total sales volume still rose. The users in March may include high-value customers that we need to focus on developing.
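The four monthly series described above can be obtained by grouping the orders by calendar month. The sketch below is only one possible way to do this; the records are hypothetical and the dictionary keys (amount, orders, quantity, consumers) are names chosen for illustration.

```python
from collections import defaultdict

# Hypothetical order records: (user_id, purchase_date, quantity, amount).
orders = [
    (1, "1997-01-05", 2, 30.0),
    (1, "1997-01-20", 1, 15.0),
    (2, "1997-01-11", 3, 45.0),
    (2, "1997-02-03", 1, 12.5),
    (3, "1997-02-14", 4, 60.0),
    (1, "1997-03-01", 2, 25.0),
]

monthly = defaultdict(lambda: {"amount": 0.0, "orders": 0, "quantity": 0, "users": set()})
for user_id, purchase_date, quantity, amount in orders:
    month = purchase_date[:7]          # "YYYY-MM"
    bucket = monthly[month]
    bucket["amount"] += amount         # total sales amount
    bucket["orders"] += 1              # number of purchases
    bucket["quantity"] += quantity     # sales volume (items sold)
    bucket["users"].add(user_id)       # distinct consumers

summary = {
    month: {
        "amount": bucket["amount"],
        "orders": bucket["orders"],
        "quantity": bucket["quantity"],
        "consumers": len(bucket["users"]),
    }
    for month, bucket in sorted(monthly.items())
}
for month, row in summary.items():
    print(month, row)
```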
Draw the scatter diagram of users as shown below (Fig. 7). Because these are the sales data of a CD website, the product range is narrow, and the relationship between order amount and quantity is linear, with few outliers.
According to each user's consumption amount, the distribution diagram is shown below (Fig. 8). Because extreme values distort this distribution, the very large values are excluded in the figure that follows (Fig. 9).

Figure 9: Distribution of consumption amount
After selecting the users whose consumption amount is less than 800, it can be seen that the consumption capacity of most users is not high: nearly half of the users spent less than 40 yuan, and there are fewer than 2,000 high-consumption users (more than 200 yuan).
The histogram above shows that most users' consumption capacity is not high and that most of them are concentrated at a very low consumption level. High-consumption users can hardly be seen on the graph, which is consistent with the usual pattern of consumption behavior in the industry. Even with the interference of extreme values, most users remain concentrated at the lower consumption levels.
According to the user consumption times, the following distribution diagram is obtained (Fig. 10):

User Quality Analysis
How many users consume only once? (Multiple purchases within one day are counted as one.) The pie chart is drawn below (Fig. 11).
Figure 11: Consumption times pie chart
More than half of the users consume only once, which shows that both the operation and the retention effect are poor.
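The counting rule stated above, where several purchases on the same day count as one, can be sketched by collecting the distinct purchase days of each user. The records below are hypothetical.

```python
from collections import defaultdict

# Hypothetical order records: (user_id, purchase_date).
orders = [
    (1, "1997-01-05"), (1, "1997-01-05"),   # two orders on one day count once
    (2, "1997-01-11"), (2, "1997-02-03"),
    (3, "1997-02-14"),
    (4, "1997-01-01"), (4, "1997-03-01"), (4, "1997-03-01"),
]

purchase_days = defaultdict(set)
for user_id, purchase_date in orders:
    purchase_days[user_id].add(purchase_date)

# Users whose orders all fall on a single day consumed only once.
one_time_users = sorted(u for u, days in purchase_days.items() if len(days) == 1)
share_once = len(one_time_users) / len(purchase_days)
print(one_time_users, f"{share_once:.0%}")
```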
Repurchase rate, buyback rate and retention rate. Calculate the repurchase rate and plot its change over time as shown below (Fig. 12). The figure shows that the repurchase rate of the initial users is not high: it was only about 15% in January, and since April it has been stable at about 30%.
The monthly number of users with buyback consumption shows an overall downward trend. The analysis of the buyback rate again shows that the three months after a new user's first purchase are an important period, in which marketing strategies are needed to actively guide them to consume again and keep consuming. In addition, to sustain the consumption of old customers, preferential activities that reward them should be launched in a timely manner in order to strengthen their loyalty.
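The paper does not spell out how the two rates are defined, so the following sketch rests on two stated assumptions: the repurchase rate of a month is taken to be the share of that month's buyers who placed more than one order within the month, and the buyback rate is taken to be the share of that month's buyers who buy again in the following calendar month. The records and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical order records: (user_id, purchase_date).
orders = [
    (1, "1997-01-05"), (1, "1997-01-20"), (2, "1997-01-11"),
    (3, "1997-01-15"), (4, "1997-01-30"),
    (1, "1997-02-02"), (2, "1997-02-03"), (2, "1997-02-25"), (5, "1997-02-14"),
    (2, "1997-03-01"),
]

# month -> user -> number of orders in that month
orders_per_month = defaultdict(lambda: defaultdict(int))
for user_id, purchase_date in orders:
    orders_per_month[purchase_date[:7]][user_id] += 1

def next_month(month):
    """Return the calendar month after 'YYYY-MM'."""
    year, m = map(int, month.split("-"))
    return f"{year + m // 12:04d}-{m % 12 + 1:02d}"

def repurchase_rate(month):
    """Assumed definition: buyers with more than one order in the month / all buyers."""
    buyers = orders_per_month.get(month, {})
    if not buyers:
        return None
    return sum(1 for n in buyers.values() if n > 1) / len(buyers)

def buyback_rate(month):
    """Assumed definition: buyers who also buy in the next month / all buyers."""
    buyers = set(orders_per_month.get(month, {}))
    if not buyers:
        return None
    later = set(orders_per_month.get(next_month(month), {}))
    return len(buyers & later) / len(buyers)

for month in sorted(orders_per_month):
    print(month, repurchase_rate(month), buyback_rate(month))
```

Under these assumed definitions, a wave of new users who each buy only once lowers both rates in the month they arrive, so a low early value followed by a higher stable level would be expected.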
After analysis, the following retention figure was obtained (Fig. 14). 5% of users consumed again between one and three days after their first purchase, and 3% between three and seven days. The numbers do not look good, but buying CDs is not really a high-frequency consumer behavior. 20% of users made a purchase between three and six months after their first purchase, and 27% between six months and one year. From the operational perspective, while serving new users, CD marketing should cultivate user loyalty and recall users to purchase again within a certain period of time.
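A retention table of this kind can be sketched by measuring, for every order, the number of days since that user's first purchase and assigning it to a time window. The window boundaries below are an assumption chosen to match the periods quoted above, and the records are hypothetical.

```python
from collections import defaultdict
from datetime import date

# Hypothetical order records: (user_id, purchase_date).
orders = [
    (1, date(1997, 1, 1)), (1, date(1997, 1, 3)), (1, date(1997, 5, 1)),
    (2, date(1997, 1, 10)), (2, date(1997, 1, 15)),
    (3, date(1997, 2, 1)),
    (4, date(1997, 1, 1)), (4, date(1997, 8, 1)),
]

# Assumed windows in days since the first purchase: low < gap <= high.
WINDOWS = [(0, 3), (3, 7), (7, 30), (30, 90), (90, 180), (180, 365)]

first_purchase = {}
for user_id, day in orders:
    if user_id not in first_purchase or day < first_purchase[user_id]:
        first_purchase[user_id] = day

returned = defaultdict(set)  # window -> users who bought again inside it
for user_id, day in orders:
    gap = (day - first_purchase[user_id]).days
    for low, high in WINDOWS:
        if low < gap <= high:
            returned[(low, high)].add(user_id)

n_users = len(first_purchase)
retention = {window: len(returned[window]) / n_users for window in WINDOWS}
for (low, high), rate in retention.items():
    print(f"{low:3d}-{high:3d} days: {rate:.0%}")
```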
Group the orders by user ID, sum the total amount spent by each user, and compare it with the total sales to obtain each user's contribution ratio; the abscissa of the ratio diagram is the user ID. The resulting ratio diagrams are shown below (Fig. 15 and Fig. 16). The result in Fig. 16 is very close to that for sales. The first 20,000 users contributed 40% of the consumption, while the last 3,500 users contributed 60%, which is in line with the 80/20 rule. In other words, as long as we retain these 3,500 users we can complete 60% of the performance KPI, and if we operate them better they can account for 70% to 80%.
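The contribution ratio described above can be sketched as a cumulative share of total spending, with users sorted from the smallest to the largest spender. The records are hypothetical and the helper name top_share is an illustrative assumption.

```python
from collections import defaultdict
from itertools import accumulate

# Hypothetical order records: (user_id, amount).
orders = [
    (1, 10.0), (2, 5.0), (2, 15.0), (3, 30.0),
    (4, 40.0), (5, 60.0), (5, 40.0),
]

total_by_user = defaultdict(float)
for user_id, amount in orders:
    total_by_user[user_id] += amount

# Sort ascending so that the largest spenders sit at the end of the curve.
ranked = sorted(total_by_user.values())
grand_total = sum(ranked)
cumulative_share = [running / grand_total for running in accumulate(ranked)]

def top_share(k):
    """Share of total spending contributed by the k largest spenders."""
    return sum(ranked[-k:]) / grand_total

print(cumulative_share)
print(top_share(1))
```

Reading the curve from the right shows how few users are needed to reach a given share of revenue, which is the comparison the 80/20 rule summarizes.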

Conclusion
1. Overall trend: monthly sales amount and sales volume are relatively high from January to March and then drop sharply, which may be related to vigorous promotion during this period or to the seasonal nature of the goods.
2. Individual characteristics of users: the amount and the quantity of each order are concentrated at the low end of the range, and purchases are made in small amounts and small batches. For this kind of customer group, the product line can be enriched and promotional activities increased to improve the conversion rate and purchase rate.
3. The total consumption amount and purchase quantity of most users are concentrated in the low range, with a long tail, which is related to user demand. Products could be given diversified cultural value and stronger social value attributes in order to raise the value users seek from them.
4. Repeat purchasing is low among new customers and higher among old customers: the two rates reported are about 6% and about 15% for new customers, and about 20% and about 30% for old customers. Marketing strategies are needed to actively guide users to consume again and keep consuming.
5. User quality: individual consumption follows a certain regularity, and the consumption of most users is under 2000. Paying close attention to high-quality users is therefore always worthwhile. These high-quality customers are "member"-type customers, and the shopping experience should be optimized specifically for them, for example with a dedicated service line and special discounts.
6. Retention rate: half of the users will be lost, so attention should be paid to cultivating user loyalty, for example through check-in rewards, a points system, discounts for old users and a membership upgrade system.