Abstract

The development and application of big data technology have expanded the sources and methods by which enterprises in the animation industry obtain data, giving them access to more user samples and solving the computing and storage problems posed by massive data. By modeling with statistical analysis and data mining methods, the portrait of each user can be depicted comprehensively and from multiple dimensions. Against this background, this study addresses user portrait modeling based on the Recency, Frequency, Monetary (RFM) model under big data. First, the approach is outlined: the behavior data of animation client users are collected and summarized, the data are prepared, and the entropy method and the Pearson correlation coefficient method are used for data processing. Then, the weights of RFM are calculated through the analytic hierarchy process (AHP), and the behavior data collection method for animation user portrait modeling is described. Finally, based on the model tags in the constructed user portrait tag system, the RFM model under big data is used to analyze and model users from multiple dimensions. In particular, in the algorithm model of animation user recognition, the weights and values of RFM are calculated to obtain the user value, and the results of the animation user portrait are summarized. Experiments show that the approach based on the RFM model under big data can identify animation users more accurately.

1. Introduction

Users are becoming increasingly reliant on the network owing to the rapid advancement of mobile Internet technology. The amount of structured, semistructured, and unstructured data in the network is growing at an exponential rate, ushering in the era of big data [1]. Faced with massive data, enterprises increasingly aim to mine the value behind the data, understand user needs in depth, and improve marketing efficiency. Building a customer portrait model for animation enterprises with the k-means algorithm of cluster analysis can improve the efficiency of customer management and reduce management costs [2, 3]. User portraits accurately reflect the attributes and preferences of target users, enabling enterprises in the animation industry to adjust the positioning of their products accordingly, design products better suited to users, and improve user experience [4, 8]. By analyzing customers’ consumption, behavior habits, and basic data, enterprises use user portraits to obtain users’ consumption level, brand preference, and other information for accurate advertising, which improves the marketing effect of animation enterprises [5]. User portraits can also help enterprises improve product operation, optimize the process and experience of interaction with users, and enhance user engagement on the platform [6, 7].
User portraits are built on the analysis of user behavior data, with accurate marketing by animation industry enterprises as the ultimate goal [9]. Enterprises need a clear understanding of user needs to guide the derivation. The customers of animation enterprises are mainly young people, whose needs are unstable and random. Animation enterprises therefore need to apply big data to build user portraits, improve products, and implement personalized recommendation [10, 11]. The construction of user portraits is thus of great significance to the steady development of animation enterprises. Its strong technical character also makes it an important means for animation enterprises to stay competitive against their peers.

The remainder of the paper is organized as follows. The RFM model is presented in Section 2. Section 3 describes the big data collection for animation users. Section 4 discusses the RFM model of the animation user value portrait in detail. Finally, Section 5 concludes the research work.

2. RFM Model

The RFM model is a classical quantitative analysis model in customer relationship management, proposed in 1994 by Hughes, an American database marketing expert. He held that customers can be classified according to three indicators of their purchasing behavior [12]: recency (R), the time since the most recent purchase; frequency (F), the number of purchases in a given period; and monetary (M), the amount spent in that period. In theory, a smaller R means that the consumer has more purchasing interest in and demand for the enterprise’s products; a larger F means that the consumer is more inclined to purchase them; and a larger M means that the consumer makes a higher contribution [13]. The indicators of the RFM model and their meanings are shown in Table 1.
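
As a minimal sketch of how the three indicators could be derived from raw records, the following assumes that each record is a `(user_id, date, amount)` tuple and that recency is measured against a chosen reference date. The record layout and the function name `rfm_table` are illustrative assumptions, not part of the original study.

```python
from collections import defaultdict
from datetime import date


def rfm_table(records, today):
    """Return {user_id: (recency_days, frequency, monetary)}.

    records: iterable of (user_id, date, amount) tuples (hypothetical layout).
    today:   reference date against which recency is measured.
    """
    last_seen = {}
    frequency = defaultdict(int)
    monetary = defaultdict(float)
    for user_id, day, amount in records:
        # R: remember the most recent activity date per user
        if user_id not in last_seen or day > last_seen[user_id]:
            last_seen[user_id] = day
        frequency[user_id] += 1      # F: number of records
        monetary[user_id] += amount  # M: total amount
    return {
        u: ((today - last_seen[u]).days, frequency[u], monetary[u])
        for u in last_seen
    }


records = [
    ("u1", date(2021, 12, 1), 12.0),
    ("u1", date(2021, 12, 20), 8.0),
    ("u2", date(2021, 11, 5), 30.0),
]
table = rfm_table(records, today=date(2021, 12, 31))
```

In this toy input, `u1` has the smaller R and the larger F, so under the interpretation above it would be the more active of the two users.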

3. Big Data Collection of Animation Users

3.1. User Behavior Data

The behavioral data of animation users come from a specific animation platform. The data are collected and summarized through an SDK (Software Development Kit) embedded in the animation client, with the IMEI (International Mobile Equipment Identity) as the unique identifier [14]. According to statistics, the client had around 9,352,700 active users in the fourth quarter of 2021, and the total volume of user behavior data was approximately 2100 TB. To support the collection of massive user behavior data, a highly concurrent and easily scalable data acquisition system is designed to collect and process the data in parallel [15]. The definition of event interaction interface parameters is shown in Table 2.

To improve the processing capacity for massive user behavior data, the load balancing server at the data receiving end routes client requests to multiple HTTP servers. Each HTTP server then parses, cleans, and converts the parameters in the request. The user behavior data are saved on the local server according to the set entries, so that massive requests are stored safely and processed conveniently [16]. Finally, a consumer reads the user behavior data from the Kafka cluster and persists them to HDFS (Hadoop Distributed File System) [17].

The defining feature of big data is its sheer volume. Given the massive amount of user behavior data collected, the data must be further summarized and counted [18]. This processing relies on distributed computing, whose high throughput and easy scalability support parallel processing of massive data. To ensure the robustness and accuracy of the final output strategy, this study summarizes and counts the user behavior data over 30 consecutive days to obtain a comprehensive description of user behavior.
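
The 30-day summarization can be illustrated on a single machine, although the study performs it with distributed computing. The sketch below assumes events of the form `(imei, day, watch_seconds)`. This layout, the three summary fields, and the function name `summarize_window` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from datetime import date, timedelta


def summarize_window(events, end_day, window_days=30):
    """Aggregate per-user statistics over the last `window_days` days.

    events: iterable of (imei, day, watch_seconds) tuples (hypothetical layout).
    """
    start_day = end_day - timedelta(days=window_days - 1)
    summary = defaultdict(
        lambda: {"events": 0, "active_days": set(), "watch_seconds": 0}
    )
    for imei, day, watch_seconds in events:
        if not (start_day <= day <= end_day):
            continue  # event falls outside the window
        row = summary[imei]
        row["events"] += 1
        row["active_days"].add(day)
        row["watch_seconds"] += watch_seconds
    # Replace each set of days by its size so every row is a flat record
    return {
        imei: {
            "events": r["events"],
            "active_days": len(r["active_days"]),
            "watch_seconds": r["watch_seconds"],
        }
        for imei, r in summary.items()
    }


events = [
    ("A", date(2021, 12, 31), 600),
    ("A", date(2021, 12, 31), 300),
    ("A", date(2021, 12, 10), 120),
    ("A", date(2021, 12, 1), 999),   # one day before the window starts
    ("B", date(2021, 11, 15), 50),   # outside the window
]
result = summarize_window(events, end_day=date(2021, 12, 31))
```

Because each user's row depends only on that user's events, the same aggregation can be partitioned by identifier and run in parallel.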

3.2. Data Preparation

Relevant user characteristics are extracted from the wide table according to animation business scenes and existing research on the features of user behavior in such scenes, forming the original user information table I0. The indicators of table I0 are described in Table 3.

3.3. Data Preprocessing

Because the indicators in the original user information table I0 are correlated to different degrees, some noise indicators may negatively affect the effectiveness of discrimination [19]. Therefore, the entropy method and the Pearson correlation coefficient method are used to screen the core indicators. The specific steps are as follows.

3.3.1. Data Standardization

To avoid the influence of extreme values and different dimensions among indicators, a logarithmic transformation of the data is applied first, as shown in the following formula:

$$x'_{ij} = \log x_{ij}. \tag{1}$$

Here, $x_{ij}$ is the value of the $j$th indicator for the $i$th user. Then, the forward range standardization method is used to standardize the data, as shown in the following formula:

$$x''_{ij} = \frac{x'_{ij} - \min_{i} x'_{ij}}{\max_{i} x'_{ij} - \min_{i} x'_{ij}}. \tag{2}$$

A new data information table I1 is formed. At this point, information table I1 still contains n users and d1 indicators.
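
The two standardization steps can be sketched for a single indicator column as follows. The sketch uses `log(1 + x)` rather than a plain logarithm, which is an assumption made here so that zero counts stay defined, and the handling of a constant column is likewise an illustrative choice.

```python
import math


def log_transform(column):
    # log(1 + x) is an assumption so that zero counts remain defined
    return [math.log1p(x) for x in column]


def min_max(column):
    """Forward range standardization: map the column onto [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]  # a constant column has no spread
    return [(x - lo) / (hi - lo) for x in column]


raw = [0, 9, 99, 999]
standardized = min_max(log_transform(raw))
```

The logarithm compresses the long right tail of count-like indicators before the range standardization removes the unit of measurement.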

3.3.2. Use Entropy Method

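The entropy method is used to select indexes with large information content. Formula (3) is first used to calculate the proportion $p_{ij}$ of the $i$th user under index $j$:

$$p_{ij} = \frac{x''_{ij}}{\sum_{i=1}^{n} x''_{ij}}, \tag{3}$$

where $x''_{ij}$ is the standardized value of index $j$ for the $i$th user in table I1.

Then, formula (4) is used to calculate the entropy value $e_j$ of index $j$:

$$e_j = -\frac{1}{\ln n}\sum_{i=1}^{n} p_{ij}\ln p_{ij}. \tag{4}$$

Since the amount of information decreases as the entropy $e_j$ increases, the indexes with lower entropy values are selected to obtain the new user information table I2.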
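
A minimal sketch of the entropy screening is given below. It assumes the table is a list of rows of non-negative standardized values with more than one user. The number of indexes to keep is not stated in the text, so the parameter `keep` is hypothetical, as is the treatment of an all-zero column.

```python
import math


def entropy_per_index(table):
    """Return one entropy value per column, scaled to [0, 1].

    table: list of rows (users), each a list of non-negative values.
    """
    n = len(table)
    entropies = []
    for j in range(len(table[0])):
        column = [row[j] for row in table]
        total = sum(column)
        if total == 0:
            # assumption: an all-zero column is treated as uninformative
            entropies.append(1.0)
            continue
        e = 0.0
        for x in column:
            p = x / total
            if p > 0:  # the term 0 * ln(0) is taken as 0
                e -= p * math.log(p)
        entropies.append(e / math.log(n))
    return entropies


def select_low_entropy(table, keep):
    """Return the column positions of the `keep` lowest-entropy indexes."""
    e = entropy_per_index(table)
    return sorted(sorted(range(len(e)), key=lambda j: e[j])[:keep])


table = [
    [0.25, 1.0],
    [0.25, 0.0],
    [0.25, 0.0],
    [0.25, 0.0],
]
entropies = entropy_per_index(table)
selected = select_low_entropy(table, keep=1)
```

In this toy table the first column is identical for every user and reaches the maximum entropy, while the second column separates one user from the rest and is the one retained.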

3.3.3. Pearson Correlation Coefficient

The Pearson correlation coefficient is used to eliminate some strongly correlated indexes [20]. Indicators are screened by calculating the Pearson correlation coefficient, as shown in the following formula:

$$r = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^2}\,\sqrt{\sum_{i=1}^{n}(y_i-\bar{y})^2}}. \tag{5}$$

Here, $r$ is the correlation coefficient, $x_i$ and $y_i$ are the values of two different indexes for the $i$th sample in table I2, and $\bar{x}$ and $\bar{y}$ are the observed mean values of indicators $x$ and $y$, respectively. The degree of linear correlation between indicators is shown in Table 4.

In the highly correlated and perfectly correlated index groups (|r| ≥ 0.8), the indexes with larger entropy values are removed, and a new data information table I3 is obtained. The indicators in table I3 are defined in Table 5.
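
One way to realize this screening is a greedy pass over the indexes in order of increasing entropy, keeping an index only if it is not strongly correlated with any index already kept. The greedy order is an assumption of this sketch, since the text does not specify how groups of more than two correlated indexes are resolved, and the index names in the example are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # assumption: a constant column is treated as uncorrelated
    return cov / (sx * sy)


def drop_correlated(columns, entropies, threshold=0.8):
    """Keep the lower-entropy index of every strongly correlated pair.

    columns:   {name: list of values}
    entropies: {name: entropy value of that index}
    """
    names = sorted(columns, key=lambda name: entropies[name])
    kept = []
    for name in names:
        if all(abs(pearson(columns[name], columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept


columns = {
    "plays": [1, 2, 3, 4],
    "minutes": [2, 4, 6, 8.5],
    "shares": [4, 1, 3, 2],
}
entropies = {"plays": 0.6, "minutes": 0.9, "shares": 0.7}
kept = drop_correlated(columns, entropies)
```

Here `minutes` is almost a linear function of `plays` and has the larger entropy, so it is the one removed.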

Empty field values are filled by random interpolation from a normal distribution [21]. The new user information table I4 is then obtained.
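
A possible reading of this step, sketched below, draws each missing value from a normal distribution whose mean and standard deviation are estimated from the observed values of the same field. This interpretation, the use of `None` for missing entries, and the function name `fill_missing_normal` are assumptions. The sketch also assumes that every field has at least one observed value.

```python
import random
import statistics


def fill_missing_normal(values, seed=None):
    """Replace None entries by draws from N(mean, std) of the observed values."""
    rng = random.Random(seed)
    observed = [v for v in values if v is not None]
    mu = statistics.mean(observed)
    # With a single observation there is no spread to estimate
    sigma = statistics.stdev(observed) if len(observed) > 1 else 0.0
    return [rng.gauss(mu, sigma) if v is None else v for v in values]


filled = fill_missing_normal([0.2, None, 0.4, 0.6, None], seed=0)
```

Draws can fall outside the range of the observed values, so a bounded field may need clipping afterwards. Fixing the seed makes the filled table reproducible.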

4. RFM Model of Animation User Value Portrait

4.1. RFM Weight

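The analytic hierarchy process (AHP) is used to determine the weights of the RFM indexes. The nine importance grades and their assigned values given by Saaty are used to construct the judgment matrix [22]. The judgment matrix is shown in Table 6.

The sum-product method is used to rank the judgment matrix, yielding the normalized weight vector $w$ and the maximum eigenvalue $\lambda_{\max}$.

The consistency index $CI$ is then calculated:

$$CI = \frac{\lambda_{\max} - n}{n - 1},$$

where $n$ is the order of the judgment matrix ($n = 3$ here).

Because $CI$ is not equal to 0, the judgment matrix is not perfectly consistent. Therefore, it is necessary to further calculate the consistency ratio $CR$:

$$CR = \frac{CI}{RI},$$

where $RI$ is the average random consistency index for a matrix of order $n$. The judgment matrix is conventionally considered acceptably consistent when $CR < 0.1$.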

The weights of R, F and M in the RFM model are 0.5318, 0.2964, and 0.1627, respectively.
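
The sum-product calculation and the consistency check can be sketched as follows. The judgment matrix `A` is hypothetical and is not the matrix of Table 6, so the weights it produces differ from the reported ones. The random index of 0.58 for a third-order matrix is a commonly tabulated value, and other tables differ slightly.

```python
def ahp_weights(matrix):
    """Sum-product method: normalize each column, then average each row."""
    n = len(matrix)
    col_sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    weights = [
        sum(matrix[i][j] / col_sums[j] for j in range(n)) / n for i in range(n)
    ]
    # lambda_max is estimated as the mean of (A w)_i / w_i
    aw = [sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(aw[i] / weights[i] for i in range(n)) / n
    return weights, lambda_max


def consistency(lambda_max, n, random_index):
    """Return the consistency index CI and the consistency ratio CR."""
    ci = (lambda_max - n) / (n - 1)
    cr = ci / random_index
    return ci, cr


# Hypothetical judgment matrix over (R, F, M); NOT the entries of Table 6
A = [
    [1.0, 2.0, 3.0],
    [1 / 2, 1.0, 2.0],
    [1 / 3, 1 / 2, 1.0],
]
weights, lambda_max = ahp_weights(A)
# 0.58 is a commonly tabulated random index for n = 3 (assumed here)
ci, cr = consistency(lambda_max, n=3, random_index=0.58)
```

For this example matrix the consistency ratio is well below 0.1, so the matrix would be accepted under the usual criterion.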

4.2. RFM Value Degree

Based on the user’s viewing data, the RFM node in SPSS Modeler software is used to calculate the user’s RFM value. R, F, and M are set as three dimensions with weights of 0.5318, 0.2964, and 0.1627, respectively. The data flow of the animation user value analysis model is shown in Figure 1.

After the data flow runs, the RFM analysis node is inspected. The scoring rules for all dimensions of the RFM model are shown in Table 7.

The RFM value of the user is calculated as follows:

$$V = w_R S_R + w_F S_F + w_M S_M.$$

Here, $S_R$, $S_F$, and $S_M$ are the scores of the user in the R, F, and M dimensions, respectively, and $w_R$, $w_F$, and $w_M$ are the weights of R, F, and M, respectively.
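
The weighted value can be illustrated with the weights reported in Section 4.1. The scores in the example are hypothetical, since the scoring rules of Table 7 are not reproduced here, and the function name `rfm_value` is introduced only for illustration.

```python
# Weights of R, F, and M as reported in Section 4.1
WEIGHTS = {"R": 0.5318, "F": 0.2964, "M": 0.1627}


def rfm_value(scores, weights=WEIGHTS):
    """Weighted sum of the user's scores in the R, F, and M dimensions."""
    return sum(weights[k] * scores[k] for k in ("R", "F", "M"))


# Hypothetical scores for one user
value = rfm_value({"R": 5, "F": 3, "M": 2})
```

The reported weights sum to 0.9909 rather than 1, and the sketch applies them as given without renormalizing.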

4.3. User Value

After the value rating of animation users is obtained from the RFM model analysis, the users need to be segmented by value, which helps enterprises carry out personalized precision marketing according to different user needs and characteristics [23]. The standardized RFM index values of each user are compared with the overall mean value of each index. The animation user value segmentation rules are shown in Table 8.
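
The comparison with the overall means can be sketched as below. The sketch only produces a high or low flag per index for each user. Mapping flag combinations to named segments would follow Table 8, which is not reproduced here, and the input layout and the function name `segment_users` are assumptions.

```python
from statistics import mean


def segment_users(users):
    """Flag each standardized index as "high" or "low" against its overall mean.

    users: {user_id: (r, f, m)} standardized index values (hypothetical layout).
    """
    means = [mean(v[k] for v in users.values()) for k in range(3)]
    return {
        uid: tuple("high" if v[k] >= means[k] else "low" for k in range(3))
        for uid, v in users.items()
    }


users = {
    "u1": (0.9, 0.8, 0.7),
    "u2": (0.1, 0.2, 0.9),
    "u3": (0.4, 0.2, 0.2),
}
segments = segment_users(users)
```

Three binary flags give at most eight combinations, which is the usual basis for RFM value segments.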

4.4. Result of Portrait

A summary of characteristics of clustering results is shown in Table 9.

RFM model intraclass distance and interclass distance are shown in Table 10.

5. Conclusion

This study takes the animation business as its background. Based on the behavior data of animation users watching animation through the client, it uses the RFM model under big data to conduct user portrait modeling and explores the method and detailed steps of user portrait construction.

(1) User time preference analysis is applied to the new scene of the animation industry and the new field of animation big data. The time preference of animation client users is evaluated and modeled using past behavior data, and the preferred time period of each user is determined. This enables enterprises to carry out marketing activities for individual users within their preferred time range, which helps improve user experience and marketing efficiency.

(2) For enterprises in the animation industry, the user portrait is the means and precision marketing is the goal. The user portrait faithfully depicts the overall image of each individual, while a label is a high-level abstraction of a specific attribute of the user. By selecting and combining the user portrait labels they have created, enterprises can correctly identify target user groups and carry out marketing activities for them as part of the precision marketing process.

In this study, animation user portrait modeling is realized based on the animation business scene, and a marketing strategy management platform is built on the user portrait. The tag system in user portrait modeling is constructed mainly through literature reading and market research, combined with the strategic goals and business scenes of the enterprise. However, the user portraits created in this study are not ideal. Because of limits in knowledge and ability, the interpretation of strategic objectives and the comprehension of business logic are not always complete, which inevitably leads to imperfect user portraits. Creating user portraits is a long-term process. In the future, the user portraits will be further improved based on feedback from the precision marketing that they drive.

Data Availability

The datasets used during the present study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this article.

Acknowledgments

The study was supported by (1) Research on the Development of Hainan Animation and Comic Creative Industries—Based on Value Co-Creation Theory under Grant no. HNSK(ZC)21-160 and (2) Research on the Production and Communication of Network Animation Culture in the New Era From the Perspective of Value Cocreation, under Grant no. 2022YBA279.