Abstract

Wireless virtual reality integrates technologies from multiple disciplines and, combined with related industries and fields, has changed the way humans interact with computers and opened up a new field of user experience. In recent years, with the rapid improvement of computer technology and hardware, interactive technology has developed rapidly. Existing wireless virtual reality interactive systems, however, support only a single mode of use and cannot be used in multiple environments. They require a large number of sensors, which makes them costly, and traditional perception technology is too restrictive to support natural human-computer interaction. This paper proposes a dual intention perception algorithm based on the fusion of touch (obtained by experimental simulation equipment), hearing, and vision. The algorithm perceives the user’s operation intention from the user’s natural behavior and can identify two intentions at the same time. This paper also proposes a navigational interactive mode, which provides users with multimodal intelligent navigation through intelligent perception of user intent and experimental progress. We determine the impact model for evaluating the effect of the interactive system, analyze its evaluation strategy in depth, and then quantify the indicators under four effect dimensions: information perception, artistic reflection, social entertainment, and aesthetic experience. A combination of qualitative and quantitative methods, comprising effect evaluation, usability testing, and questionnaire interviews, was used to carry out the evaluation. The experimental results show that this interactive system has a better entertainment effect than other forms of film and television animation, but more attention should be paid to the construction and embodiment of film and television animation content and to the optimization of the fault-tolerance mechanism in the design process.

1. Introduction

The more urgently an industry needs wireless virtual reality content, the lower the risk it bears, and the more willing it is to enter a field where no one has gone before. Therefore, early pioneers of wireless virtual reality applications often focus on short-term development. For example, in the military and aerospace fields, building a physical site for professional training and for evaluating specific scenarios not only consumes a great deal of manpower and material resources but also has a long replacement cycle, and it is difficult to serve multiple groups of personnel or equipment at the same time [1, 2]. In a simulation scenario built with wireless virtual reality technology, however, the cost of content construction is greatly reduced, the constraints of venue and space are removed, multiple groups of people can carry out virtual experiences at the same time, and virtual content can be modified and replaced quickly [3]. Because of these characteristics, experiential games were the earliest application of wireless virtual reality technology to be commercialized, popularized, and brought into ordinary people’s lives. Thanks to the brand-new sensory experience and interactive features of virtual reality games and their relatively low content production costs, early experiential games could control production investment and obtain short-term returns, thereby avoiding some risks. At present, wireless virtual reality technology is widely used in industries such as tourism, medical treatment, and education. However, the technology is not yet mature, and neither public familiarity with it nor its level of development can meet the needs of these fields [4]. For promoting wireless virtual reality technology quickly, the fast-developing entertainment field is the most suitable.

Throughout its development, animation art has been closely integrated with advances in science and technology, which has given animation unprecedented room for expression and new patterns of development [5]. The impact of wireless virtual reality on the expression and experience of animation is quietly unfolding around us. It not only presents animation to the audience in a brand-new form but also enhances the audience’s sense of presence and interactivity. The inner feelings that arise when the audience takes the dominant position point to broad prospects for development. With its new display method, the 360-degree panoramic immersive experience of VR animation gives the audience a new perspective from which to observe the virtual world of the animation, provides a wealth of sensory information, and makes it easy for the audience to enter this highly attractive and moving experience. The application of wireless virtual reality technology frees animation from its traditional form, changes its audio-visual language, enables audiences to participate in it, and makes the presentation of animation art more diverse. In VR animation, which strongly stimulates the senses, the audience fulfills the fantasy of entering the animation world, which triggers emotional resonance and leads to a deeper understanding of the theme and connotation of the animation. The audience reflects deeply on themselves and has a more profound, internalized psychological experience [6].

This paper proposes a multimodal fusion algorithm based on SVM and set theory (the MSSN algorithm) and applies it to the wireless virtual reality perception simulation experiment scene. The MSSN algorithm integrates multimodal information from touch (obtained by experimental simulation equipment), hearing, and vision to perceive intent, and it supports navigational interaction by establishing intention behavior nodes that describe the user’s intention. In the entertainment effect evaluation stage, an evaluation strategy combining qualitative and quantitative methods is set, and the entertainment effect is measured in depth from multiple aspects, including effect evaluation, ease-of-use testing, and questionnaire interviews. Based on the results, directions and plans for future improvement are proposed. The interactive system for the time-space reproduction of film and television animation in wireless virtual reality scenes combines media technology, art, film and television animation, and other elements, and it integrates readability, immersion, artistry, interactivity, and attractiveness. While safeguarding the fundamental interests of the public and society and promoting steady social development, it also plays a positive role in the management of public opinion and opens up a new path of sustainable development for the film and television animation industry.

In research on user information acceptance, the effect of information acceptance in wireless virtual reality is often compared with the traditional information acceptance effect [7]. The wireless virtual reality environment can both promote and inhibit user information acceptance behavior. It deepens the user’s understanding and acceptance of objective reactions or complex knowledge structures by improving user perception, increasing the number of perspectives from which the user can observe information, and dynamically disassembling abstract scientific phenomena. In practice, the user’s familiarity with industrial skills can be deepened through interactive operation. At the same time, the wireless virtual reality environment may also hinder users’ information acquisition behavior [8]. The cognitive theory of multimedia learning holds that complex visual materials hinder users’ information acceptance. In addition, users may experience increased cognitive load in the wireless virtual reality environment: their tasks are more complex and difficult and require additional cognitive resources, and the compound presentation of multiple kinds of feedback can cause information overload, leading to confusion and distraction. As a result, operational skills may be strengthened while acceptance of the related technical theory remains poor [9].

In research on the acceptance of wireless virtual reality technology in the tourism industry, scholars have used empirical studies to explore the effectiveness of the technology in shaping consumer attitudes and behaviors and have examined the sense of presence in the wireless virtual reality experience [10, 11]. The sense of presence in the virtual environment increases the enjoyment of the experience, which makes it possible to explore the factors influencing user attitudes and behavioral intentions. Experimental methods are also often used to study users’ acceptance of wireless virtual reality technology [12]. When studying technology acceptance in the sports industry, scholars conducted a between-subjects controlled experiment, explored the factors that affect the audience’s mobile experience, examined the influence of mobile experience on audience satisfaction, and analyzed the factors influencing user technology acceptance [13]. Although research on the technology acceptance of wireless virtual reality users has not yet addressed the field of reading, it provides an important theoretical basis for research on the factors affecting the interactive behavior of wireless virtual reality reading users [14].

Related scholars have pointed out that all the spaces in wireless virtual reality are simulated, yet they feel real [15]. The spatial impact of wireless virtual reality on animation is not only visual and physical but also manifests in psychological aesthetic space, perceptual space, and other aspects. Scholars note that the traditional way of expressing animation space focuses on simulating three-dimensional things in two-dimensional space through color, light and shadow, sound, lens, and motion [16]. Although these techniques make the three-dimensional space seem to exist on the surface, the audience encounters difficulties when trying to explore further. The audience can only see the side that the director chooses to show; content in other directions and from other perspectives is not available, and this relationship shuts the audience out even further [17]. Wireless virtual reality technology solves this problem by allowing viewers to approach and interact with the displayed objects more closely. Take the cabin of an airplane as an example. In this scene, the experiencer can see the entire cabin, as if having checked in, boarded the aircraft, and entered the cabin [18, 19]. Passengers can view the cabin from any angle, move their luggage, and use a switch to adjust the height of the seat. This is not a real cabin, but the experience already feels very close to being in one [20].

Knowledge and information acquisition tools based on wireless virtual reality technology will not only improve users’ digital abilities and literacy but also trigger changes in the form of reading. Current research on wireless virtual reality users pays most attention to users’ subjective views on usage, continued use, transfer, and other behaviors related to wireless virtual reality technology, interaction methods, and applications. Research on user adoption behavior in the wireless virtual reality environment mainly focuses on users’ acceptance of the technology, and the technology acceptance model is most often used to analyze user experience and satisfaction. Virtual tours can increase users’ willingness to visit tourist destinations and hotels. Virtual previews generate positive psychological intentions and a stronger sense of presence, which translates into an enhanced brand experience and promotes user behavior. The enhanced sense of immersion and entertainment also stimulates users’ curiosity and promotes their intention to travel to the destination.

3. Wireless Virtual Reality Interactive Operation Based on Sensing Devices

3.1. Using Vuforia’s Virtual Reality Tracking Registration Technology

Tracking and registration is one of the three key technologies in enhanced wireless virtual reality: whether virtual information is positioned accurately in the real scene depends on it. Current tracking and registration technologies mainly include sensor-based recognition, visual recognition, and hybrid approaches that combine the two. Visual recognition does not require expensive sensor equipment, and its registration accuracy is high. Within visual recognition, however, tracking based on natural features is difficult to implement, has weak real-time performance, and places higher requirements on the recognition image. This article therefore chooses a marker-based identification method to reduce the amount of computation on mobile devices. The virtual reality system is mainly composed of users, sensor equipment, 3D scene generation and display, and a real scene simulator. The virtual reality system architecture is shown in Figure 1.

Marker-based tracking and registration requires the system to recognize the marker in the video stream output by the mobile device, calculate the relative position of the camera and the marker, and render the model corresponding to the marker on the device screen according to that relative position. The Vuforia engine used in this article provides a variety of scanning and tracking methods, including scanned images, cylinder recognition, multitarget recognition, text recognition, user-defined targets, and cloud recognition. The marker used in this article is the Frame Marker, a special frame marker that can be tracked by the Vuforia SDK. Frame Marker tracking and registration has the following advantages:

(1) Since the Frame Marker is built into the SDK and is recognized locally, there is no need to make a separate marker, which reduces the workload and makes later porting easier.

(2) The size of the marker can be adjusted. On some occasions, the recognition image must be scaled down to reduce its aesthetic impact on the surrounding environment. A recognition image based on natural features, by contrast, produces unstable effects such as shaking of the virtual object once it is smaller than a certain size.

(3) With frame markers, the device can extract, identify, and track multiple identification cards at the same time.

In the wireless virtual reality interactive system based on the mobile smart terminal, the frame of the marker is what is mainly detected. The surrounding images are not used for detection or tracking, but they can affect the performance of the frame marker. Therefore, when making the frame marker identification card, the contrast between the inside and the outside of the marker should be made as large as practical so that the outline of the border is more distinct.

To present the effect well, the model needs to be adjusted before the APP is generated; through repeated debugging, it can be displayed at a suitable size and angle. The platform built with Unity 3D and Vuforia then packages and generates the APP with the enhanced wireless virtual reality function. Opening the APP on a mobile smart device and scanning the prepared frame marker identification card carries out tracking and registration in the enhanced wireless virtual reality.

3.2. Human-Computer Interaction

Realizing human-computer interaction for enhanced wireless virtual reality on mobile devices requires design and programming on the Unity 3D platform, including animation design, animation control, and animation triggering. Because some equipment has a complex internal structure, traditional teaching cannot demonstrate it fully. This article therefore adds equipment operation functions: for different devices, the user can not only observe the device from the outside but also see its internal working mechanism.

To control the virtual scene or object later in the implementation, a one-to-one control mapping must be created during design. Unity 3D provides an animation controller, with which the animations and premade behaviors of the virtual world can be edited. A new animation controller is created and dragged to the controller field in the Inspector panel of the corresponding virtual object.

Once the controller has been established, the animations are edited under it. The premade animations are imported into the animation controller window to form animation layer components, and suitable transitions are added between them. Because the animations are to be triggered by a click, a trigger is then added to the animation parameters: a new trigger is created, each trigger is matched to its transition, and the animation speed and the connection point between animations are configured.

4. Multimodal Fusion Intention Perception Based on SVM and Set Theory

4.1. The Overall Design of Multimodal Fusion Intention Perception

In order to further enhance the intelligence of the interaction during the experiment, the first thing to do is to fully perceive the user’s more complex intentions. Therefore, this paper proposes a multimodal fusion algorithm (MSSN algorithm) based on SVM and set theory. The algorithm combines the tactile information, auditory information, and visual information in the user’s experiment process and recognizes the user’s dual intentions through a specific preprocessing method. We construct intention behavior nodes according to intentions to perceive detailed behavior information. Based on the perceived behavioral information, the navigational interactive mode is used to guide users to conduct experiments.

The MSSN method has three advantages. First, it identifies dual-intent operations well and can therefore support the dual-operation requirements of wireless virtual reality perception simulation experiments. Second, the constructed user behavior nodes describe user behavior in detail: each intention is described by the type of intention action, the object of the operation, and the attribute of the operation. This description helps the computer understand the user’s intentions completely and facilitates interaction with the user. Finally, the navigational interactive mode improves the user’s experimental efficiency and reduces the user’s load.
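As an illustration of the node description given above (intention type, operation object, and attribute), a node could be represented roughly as follows. This is a minimal sketch rather than the data structure used in the system: the class name, the field names, and the example values ("pour", "beaker", and so on) are assumptions made for illustration, and the split of the operation object into an active and a passive object anticipates Section 4.3.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class IntentionNode:
    """One intention behavior node: what is done, to what, and how."""
    intention: str                         # type of intention action
    active_object: str                     # device that performs the operation
    passive_object: Optional[str] = None   # device that receives it, if any
    attribute: Optional[str] = None        # descriptive attribute, if any

    def describe(self) -> str:
        """Render the node as a short human-readable string."""
        text = f"{self.intention}({self.active_object}"
        if self.passive_object is not None:
            text += f" -> {self.passive_object}"
        text += ")"
        if self.attribute is not None:
            text += f" [{self.attribute}]"
        return text


# Hypothetical example values, not taken from the experiment.
node = IntentionNode("pour", "beaker", "flask", "slowly")
print(node.describe())  # prints: pour(beaker -> flask) [slowly]
```

Making the node immutable (`frozen=True`) keeps it hashable, so nodes can be stored in sets, which suits the set-based processing described later.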

The MSSN algorithm fuses multimodal information at the data level through preprocessing and SVM, and fuses multi-modal information at the decision-making level through node construction based on set theory.

As shown in Figure 2, the MSSN algorithm is divided into four layers, which are a multimodal input layer, a recognition layer, a multimodal fusion intention perception layer, and an application layer. The core is the multimodal fusion intention perception layer, which organically combines multimodal data to form the form of intention behavior nodes to facilitate the computer’s understanding of the user’s more complex intentions.

4.2. Intent Set Recognition Based on SVM

Since the user may perform two operation intentions at the same time in actual operation, the preprocessing must enable the SVM model to distinguish the type of double operation. The double-operation type RF takes one of two values: the first covers a single operation or two operations with the same intention (the same intention may be applied twice to different objects), and the second covers two operations with different intentions.

Because feature selection loses part of the information, the TF-IDF weight calculation method is first applied to the feature queue to obtain the feature weight vector before feature selection, so that all of the original feature information is retained. The feature weight vector is then used for feature selection. This paper experimentally compares CHI, IG, I_G (the information grouping method, which groups the feature queue by information source and then by data type within each group), and combinations of these methods. Based on the experimental results, the linear combination of CHI and I_G (abbreviated as CHI+I_G) is selected for feature selection to obtain the feature vector. The experimental results are shown in Table 1.
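To make the weighting and scoring steps concrete, the following sketch computes TF-IDF weights and the CHI statistic for a toy feature queue. It is an illustration under stated assumptions, not the preprocessing used in the experiments: the feature tokens, labels, and function names are hypothetical, the IDF is the unsmoothed form log(N/df), and the I_G grouping score and its linear combination with CHI are not reproduced because their exact definitions are not given here.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {feature: tf * idf} dict per feature queue."""
    n = len(docs)
    df = Counter(f for doc in docs for f in set(doc))  # document frequency
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append(
            {f: (c / len(doc)) * math.log(n / df[f]) for f, c in tf.items()}
        )
    return weights


def chi_square(docs, labels, feature, cls):
    """CHI statistic of `feature` for class `cls` from a 2x2 contingency table."""
    a = b = c = d = 0
    for doc, label in zip(docs, labels):
        present = feature in doc
        if present and label == cls:
            a += 1          # feature present, class matches
        elif present:
            b += 1          # feature present, other class
        elif label == cls:
            c += 1          # feature absent, class matches
        else:
            d += 1          # feature absent, other class
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom


# Hypothetical feature queues: tactile and auditory tokens per sample.
docs = [
    ["touch:beaker", "voice:pour"],
    ["touch:beaker", "voice:stir"],
    ["touch:lamp", "voice:light"],
    ["touch:lamp", "voice:light"],
]
labels = ["pour", "stir", "light", "light"]

weights = tfidf(docs)
score = chi_square(docs, labels, "touch:lamp", "light")
```

In this toy data, "touch:lamp" occurs in exactly the samples labeled "light", so its CHI score is the maximum possible for four samples; features with high scores are the ones a selection step would keep.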

Table 1 compares 6 experimental methods. The data in the table are the recognition rates at which the comprehensive evaluation of each method (combining recognition rate and number of features) is highest. To distinguish between the two types of double operation, the probability value is used for judgment: depending on the probability value, the sample is assigned to one type or the other. Experimental verification shows that, with the CHI+I_G method, the recognition rate of dual operation intentions in the range of [0.5, 1] can reach 96% and the number of features is effectively reduced, which supports the dual operation requirements of wireless virtual reality perception simulation experiments.

The SVM uses a Gaussian kernel function. The preprocessed multimodal data are fed into the SVM model for recognition, so that the multimodal information is fused at the data level and the intent recognition result is obtained from it. Thanks to the preprocessing method used in this article, the types of double operation can be distinguished.
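For reference, the Gaussian (RBF) kernel measures the similarity of two feature vectors as exp(-gamma * ||x - y||^2). The sketch below shows only this kernel, not a full SVM; the function name and the default value of `gamma` are assumptions, since the kernel parameters used in the experiments are not stated.

```python
import math


def gaussian_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel: exp(-gamma * squared Euclidean distance)."""
    if len(x) != len(y):
        raise ValueError("feature vectors must have the same length")
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


# Identical vectors have similarity 1; similarity decays with distance.
same = gaussian_kernel([1.0, 2.0], [1.0, 2.0])
apart = gaussian_kernel([0.0, 0.0], [1.0, 1.0])
```

A larger `gamma` makes the similarity fall off faster with distance, which gives a more flexible but more overfitting-prone decision boundary.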

4.3. Construction of Intentional Behavior Nodes Based on Set Theory

According to the sequence number of the intention identified by the SVM and the split feature queue corresponding to that intention, multimodal information can be integrated at the decision-making level through set theory to obtain a specific description of the intention, that is, the intention behavior node. All objects are experimental simulation devices, and operation objects are divided into active objects and passive objects. In the design of the simulation devices in this article, the active object is generally the device that emits more signals, while the passive object seldom emits signals because it passively accepts the operations of other devices. The perception of the active object is therefore based on the information in the feature queue. The set of objects that sent signals is obtained from this information and intersected with the set of possible active objects of the intention, which gives a reasonable set of candidate active objects. If there are multiple candidates, the object that occurs most often, that is, the device that sends the most signals, is selected as the active object. Under dual intention recognition, in the applicable double-operation case, the active object of one node is not the active object of the other node.
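The active-object rule described above (intersect the signal sources with the possible active objects, then take the most frequent one) can be sketched as follows. The function and parameter names are hypothetical, and the `excluded` argument is an assumed way of expressing that an object already claimed by the other node is not reused.

```python
from collections import Counter


def select_active_object(signal_sources, candidate_actives, excluded=frozenset()):
    """Pick the candidate device that emitted the most signals.

    signal_sources: one object name per signal in the feature queue.
    candidate_actives: objects that may be the active object of the intention.
    excluded: objects already used as the active object of the other node.
    """
    counts = Counter(
        obj for obj in signal_sources
        if obj in candidate_actives and obj not in excluded
    )
    if not counts:
        return None  # no admissible object sent a signal
    return counts.most_common(1)[0][0]


# Hypothetical signals: the beaker emitted two signals, the lamp one.
signals = ["beaker", "beaker", "lamp", "flask"]
active = select_active_object(signals, {"beaker", "lamp"})
```

Here "flask" is dropped by the intersection because it is not a possible active object of the intention, and "beaker" wins because it emitted the most signals.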

Since passive objects passively accept operations, they seldom emit tactile signals. However, users are very likely to describe passive objects by voice. In addition, instruments already used in the scene may become passive objects that are operated on again. Passive objects are therefore obtained mainly by fusing auditory information with scene context information. First, according to the knowledge base, it is determined whether the intention has passive objects, and the passive objects in the knowledge base are grouped into a set O1. Then, the intersection of O1 and the object set O2 corresponding to the auditory signal in the feature queue is taken to obtain O3 (excluding the active object already determined). If O3 has a unique element, this element is the passive object; if O3 has multiple elements, the most likely element is taken as the passive object according to the priority rules; if O3 is empty, the set O4 of objects that have been used in the scene and contain reagents is added to it (again excluding the active object already determined), and the passive object is then found by the multiple-element rule.

Here, the priority function selects the object with the highest priority, and the cardinality of a set is the number of elements it contains.
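The passive-object procedure built on the sets O1 to O4 can be sketched as below. This is one reading of the procedure, not the authors' implementation: the function name, the parameter names, and the form of the priority function (a callable returning a number) are assumptions, and ties in priority are broken alphabetically only to keep the sketch deterministic.

```python
def select_passive_object(kb_passives, heard_objects, scene_objects, active, priority):
    """Choose the passive object from knowledge base, speech, and scene context.

    kb_passives: O1, passive objects the knowledge base allows for the intention.
    heard_objects: O2, objects mentioned in the auditory signal.
    scene_objects: O4, objects already used in the scene.
    active: the active object already determined (always excluded).
    priority: callable mapping an object to a number; higher wins.
    """
    o1 = set(kb_passives)
    if not o1:
        return None                          # the intention has no passive object
    o3 = (o1 & set(heard_objects)) - {active}
    if not o3:
        o3 = set(scene_objects) - {active}   # fall back to scene context (O4)
    if not o3:
        return None
    return max(sorted(o3), key=priority)     # unique element or highest priority


# Hypothetical priorities for illustration.
ranks = {"flask": 2, "tube": 3}
chosen = select_passive_object(
    {"flask", "tube"}, ["flask", "tube"], [], "beaker",
    lambda obj: ranks.get(obj, 0),
)
```

When speech names exactly one admissible object, the priority function is never decisive; it only matters when several candidates survive the intersection.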

The attribute refers to the description information of the user’s intention. By using attribute descriptions, the intention expression can be made more complete. Since the expression of attributes is relatively rich, it needs to be obtained from the fusion of tactile, auditory, and visual information.

First, according to the knowledge base, it is determined whether the intent has an attribute, and the attributes in the knowledge base are grouped into a set. Second, according to the database, the tactile information and the auditory information in the feature queue are transformed into two corresponding attribute sets, and each is intersected with the knowledge-base set to give a tactile candidate set and an auditory candidate set. Third, the intersection of the two candidate sets is taken to obtain the attribute set. If the attribute set contains only one element, this element is the attribute value. If it contains multiple elements, the attribute value is judged from the visual information when the attribute is related to visual position, and the user is asked for the attribute when it is not. If the attribute set is empty while both candidate sets are non-empty, the operation is judged to be inconsistent with the voice; if both candidate sets are empty, the default attribute in the knowledge base is used as the attribute value. If exactly one of the candidate sets is non-empty, that set is taken as the attribute set and the attribute value is judged from it.

Here, attributes are divided into those that are irrelevant to vision and those that are related to vision.
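The attribute rules can be followed case by case in a short sketch. It is an interpretation under stated assumptions: all names are hypothetical, the result is returned as a (status, value) pair for readability, and when a vision-related attribute cannot be settled from the visual information the sketch falls back to asking the user, which is a choice made here rather than something specified above.

```python
def resolve_attribute(kb_attrs, tactile_attrs, auditory_attrs, default,
                      visual_related=False, visual_choice=None):
    """Fuse tactile, auditory, and visual evidence into one attribute value.

    Returns (status, value) with status in
    {"none", "value", "conflict", "default", "ask_user"}.
    """
    allowed = set(kb_attrs)
    if not allowed:
        return ("none", None)                 # the intent has no attribute
    from_touch = allowed & set(tactile_attrs)
    from_voice = allowed & set(auditory_attrs)
    if from_touch and from_voice:
        candidates = from_touch & from_voice
        if not candidates:
            return ("conflict", None)         # operation contradicts the voice
    elif from_touch or from_voice:
        candidates = from_touch or from_voice  # only one modality has evidence
    else:
        return ("default", default)           # no evidence: knowledge-base default
    if len(candidates) == 1:
        return ("value", next(iter(candidates)))
    if visual_related and visual_choice in candidates:
        return ("value", visual_choice)       # settle by visual position
    return ("ask_user", None)                 # assumed fallback: ask the user


# Hypothetical attributes: touch and voice agree on "slow".
result = resolve_attribute({"slow", "fast"}, {"slow"}, {"slow", "fast"}, "slow")
```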

4.4. Navigational Interactive Mode

The navigation interaction describes the application layer of the MSSN method. As shown in Figure 3, it is designed based on the perceived intentional behavior node and applied to the experimental system.

4.4.1. Intentional Behavior Node Screening

Since operations can occur simultaneously in the wireless virtual reality perception simulation experiment scene, the system supports the perception of dual operation intentions, so there may be one or two intention behavior nodes to screen. Filtering out the nodes that the user really wants to execute is the prerequisite for executing them. Two nodes with the same active object but different intents are the focus of screening, because they may not be executable at the same time. Each intention behavior node includes an active object, which is generally the simulated device that mainly sends signals. The processing method is as follows: (1) Determine the number of elements in the node set NQ. If NQ has only one element, add it directly to the set of nodes to be executed, WNQ. If NQ has two elements whose active objects are the same but whose intentions differ, go to step 2. Otherwise, use NQ as WNQ. (2) Using the shortest intention conversion path method SRP, judge whether the active object of each of the two elements can reach its new intention from its current intention. If both can be reached directly, go to step 3; if not, change NQ. (3) Ask the user to select one intent node as the node to be executed, WNQ, and set the other node as an invalid node.

The system interacts according to the selected nodes. When a node is executed, the shortest intent conversion path method SRP(Node) first plans an intent conversion path for each effective intent node in the set of nodes to be executed, WNQ. If the intent can be converted directly, the intent node is executed directly; if not, the shortest planned intent conversion path is used to prompt the user.
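The screening steps and the shortest intent conversion path can be sketched together. The sketch assumes that SRP is a breadth-first search over a graph of allowed intent transitions, that "reached directly" means at most one transition, and that "change NQ" means keeping only the directly reachable nodes; none of these details is specified above, and the function names, the transition graph, and the `ask_user` callback are hypothetical.

```python
from collections import deque


def shortest_intent_path(transitions, start, goal):
    """Breadth-first search; returns the shortest list of intents, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in transitions.get(path[-1], ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


def is_direct(transitions, start, goal):
    """True if goal is the current intent or exactly one transition away."""
    path = shortest_intent_path(transitions, start, goal)
    return path is not None and len(path) <= 2


def screen_nodes(nq, current_intent, transitions, ask_user):
    """nq holds one or two (intention, active_object) pairs; returns WNQ."""
    if len(nq) != 2:
        return list(nq)                  # step 1: a single node is executed
    (i1, a1), (i2, a2) = nq
    if a1 != a2 or i1 == i2:
        return list(nq)                  # step 1: no conflict to resolve
    start = current_intent[a1]
    direct = [n for n in nq if is_direct(transitions, start, n[0])]
    if len(direct) == 2:
        return [ask_user(nq)]            # step 3: the user picks one node
    return direct                        # step 2: assumed meaning of "change NQ"


# Hypothetical transition graph for one device.
transitions = {"idle": ["hold"], "hold": ["pour", "stir"],
               "pour": ["hold"], "stir": ["hold"]}
path = shortest_intent_path(transitions, "idle", "pour")
```

When a target intent is not directly reachable, the path returned by `shortest_intent_path` is what would be used to prompt the user with the intermediate steps.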

4.4.2. System Navigation

The navigational interactive method monitors the user’s operation and the experiment process in real time in the wireless virtual reality perception simulation experiment, and it comprises voice navigation and visual navigation of the operation.

4.4.3. Voice Navigation

The navigational wireless virtual reality perception simulation experiment system proposed in this paper guides users while they experiment in order to reduce their load. When a user makes a common-sense error, the user is prompted through path planning or error prompts, which both explain why the operation is unreasonable and reduce the risk that the experiment cannot continue. For the key knowledge points of the experiment, exploratory experiments are supported (that is, users can observe the experimental phenomena caused by wrong operations). In this process, the experimental phenomena are returned and explained according to the perceived user intentions, and users are guided to perform the correct operations. In addition, the system automatically monitors the progress of the experiment and provides voice navigation at key nodes to guide the user’s operation. Navigating while experimenting is better than the traditional approach of giving simple navigation only at the very beginning: it reduces the user’s load and reduces the risk that the experiment cannot be completed because the user does not know how to use the system.

4.4.4. Visual Navigation

The system proposed in this paper uses an electronic auxiliary screen to guide users in the experimental scene. The key steps of the experiment are presented on the screen as text so that the user can follow the prompts during operation. Visual navigation and voice navigation are used together: visual navigation mainly presents the experimental steps, while voice navigation mainly guides the operations that arise dynamically during the experiment.

5. The Entertainment Effect Simulation Evaluation Experiment of Wireless Virtual Reality Scene Film and Television Animation

5.1. Setting of Data Collection and Evaluation Methods

Entertainment validity is the final validity test of the entertainment results of film and television animation. It considers whether the film and television animation has reached the goal expected by the entertainer during the entertainment process. The validity and entertainment effect of film and television animation are determined through the entertainment value experiment. Entertainment value is a quantitative measure of the entertainment effect of film and television animation; it refers to the degree to which the value of the film and television animation is realized in the entertainment process. The number of audiences here refers to the proportion of the actual number of people participating in the evaluation to the total number of recruits, and the social response index refers to the audience’s satisfaction with the film and television animation in four effect dimensions: information perception (Ip), artistic reflection (Er), social entertainment (Sc), and aesthetic experience (Ae).

This experiment processes the evaluation data in three ways. The first method calculates the entertainment value of the four forms of film and television animation and evaluates them as a whole, which addresses research question RQ1. The second method analyzes the experimental data with a comparison of two group means and an analysis of variance of the scale, using the independent-samples t-test, to determine whether the audience's satisfaction with wireless virtual reality scene film and television animation differs significantly from its satisfaction with traditional film and television animation because of the presentation form, which addresses research question RQ2. The third method calculates and compares the average ease-of-use evaluation scores to study the user's evaluation and recognition of the ease of use of the interactive system, which addresses research question RQ3. In the overall evaluation of entertainment value, it should be noted that different types of film and television animation call for different evaluation indexes. The experimental material is disaster-related film and television animation. Whether it is traditional or wireless virtual reality scene film and television animation, this material pays more attention to the impact of the content on the audience and society. Therefore, the evaluation indicators of this experiment focus on the effects of the three dimensions of information perception, artistic reflection, and social entertainment, and appropriately reduce the weight of the aesthetic experience effect. The weights in the evaluation rules are not fixed and should be adjusted according to the type of film and television animation.

Taking these considerations together, this experiment uses a two-level index subdivision method to assign proportional weights in the inspection and evaluation of the entertainment effects of the interactive system. The first-level indicators are the general dimensions of the entertainment effect: information perception, artistic reflection, social entertainment, and aesthetic experience. The second-level indicators are the detailed rules that quantify each effect. Under this two-level subdivision, the actual score of a rule is determined by the product of its first-level weight and its second-level weight, and the scores calculated for the individual rules are added together to obtain the entertainment value of the effect.
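The two-level weighting can be sketched in Python as follows. This is an illustration under stated assumptions, not the paper's actual scoring table: the weight values, the number of rules per effect, and the names `FIRST_LEVEL`, `SECOND_LEVEL`, `effect_score`, and `overall_score` are hypothetical. The first-level weights are merely chosen so that aesthetic experience (Ae) counts less than the other three dimensions.

```python
# Hypothetical weights; the actual values are not given in the text.
FIRST_LEVEL = {"Ip": 0.3, "Er": 0.3, "Sc": 0.3, "Ae": 0.1}
SECOND_LEVEL = {
    "Ip": [0.5, 0.5],
    "Er": [0.5, 0.5],
    "Sc": [0.5, 0.5],
    "Ae": [0.5, 0.5],
}


def effect_score(ratings, rule_weights):
    """Weighted sum of one user's ratings for the rules of a single effect."""
    if len(ratings) != len(rule_weights):
        raise ValueError("one rating per rule is required")
    return sum(w * r for w, r in zip(rule_weights, ratings))


def overall_score(user_ratings):
    """Combine the four effect scores using the first-level weights."""
    return sum(
        FIRST_LEVEL[effect]
        * effect_score(user_ratings[effect], SECOND_LEVEL[effect])
        for effect in FIRST_LEVEL
    )


# One hypothetical user's ratings on a 5-point scale, two rules per effect.
user = {"Ip": [4, 5], "Er": [4, 4], "Sc": [5, 5], "Ae": [3, 3]}
ip_score = effect_score(user["Ip"], SECOND_LEVEL["Ip"])  # 0.5*4 + 0.5*5
total = overall_score(user)
```

Because both weight levels sum to 1 in this sketch, the combined score stays on the same 5-point scale as the individual ratings.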

According to the above weight distribution of the primary and secondary indicators, a formula can be defined for each effect.

The formula combines, for each user, the proportional weight of each rule under a given secondary index with that user's rating of the rule, and yields the user's score for each effect.
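A plausible form of this definition is written out below. The notation is assumed for illustration and is not confirmed by the source: $E_i$ denotes the score of the $i$-th user for a given effect, $w_j$ the proportional weight of the $j$-th rule under the corresponding secondary index, $r_{ij}$ the rating that the $i$-th user gives to the $j$-th rule, and $m$ the number of rules for that effect.

```latex
E_i = \sum_{j=1}^{m} w_j \, r_{ij}
```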

After the entertainment value of each first-level indicator has been calculated, the final score of each level is summarized as the average score over all users, where the number of users is the number of effective evaluations recovered from the experiment.

According to the definition of the entertainment value of film and television animation, the entertainment value of each form of film and television animation is then defined by combining the number of audiences, the social response index, and the formula for each level.

Since the testers in this experiment evaluated the four forms of film and television animation on the spot, the number of evaluators equals the number of recruits; that is, the audience ratio is 100%. Only the product of the social response indexes therefore needs to be listed, and the formula simplifies accordingly.
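The averaging and simplification steps can be written compactly as follows. This is a sketch under assumed notation, none of which is confirmed by the source: $E_i$ is the score of the $i$-th user for an effect, $n$ the number of effective evaluations, $\bar{E}$ the final average score of a level, $N_a$ the number of people who actually took part in the evaluation, $N_r$ the number of recruits, $S$ the social response index built from the four averaged dimensions (Ip, Er, Sc, Ae), and $V$ the entertainment value. How $S$ aggregates the four dimensions is left open here.

```latex
\begin{aligned}
\bar{E} &= \frac{1}{n} \sum_{i=1}^{n} E_i \\
V &= \frac{N_a}{N_r} \cdot S \\
V &= S \qquad \text{when } N_a = N_r \text{ (audience ratio of 100\%)}
\end{aligned}
```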

This experiment is based on the above formula. By calculating and comparing the entertainment value of each form of film and television animation, we evaluate the overall entertainment effect, in order to analyze more intuitively the advantages and disadvantages of the different forms and to identify where the wireless virtual reality scene film and television animation interactive system still has room for improvement.

5.2. Overall Evaluation and Analysis of Entertainment Effects

We calculate the overall entertainment value of text film and television animation, video film and television animation, data film and television animation, and VR scene film and television animation, respectively, then calculate the general entertainment value of the four types of entertainment effects, and finally compare the results to study the overall effect of VR scene film and television animation. The full score is 5 points; the higher the entertainment value, the better the entertainment effect of that form of film and television animation. The calculation shows that the overall entertainment value of VR scene film and television animation is the highest, above that of the other three types. To a certain extent, wireless virtual reality technology therefore achieves a better entertainment effect for film and television animation.

After calculating the overall entertainment value of each film and television animation form, the entertainment value of each first-level indicator level is further calculated and presented in the form of a scatter chart. Figure 4 shows the detailed rules of the four forms of entertainment in film and television animation.

As can be seen from the detailed rules, the two interactive modes of film and television animation, data film and television animation and video film and television animation, differ little in the entertainment value of information perception and artistic reflection. Data film and television animation reports are mainly based on data, supplemented by pictures and sounds. Although they can bring a certain interactive experience to the audience, this falls far short of the dual experience of interaction and immersion that VR scene film and television animation brings. Their effects in the three aspects of goal guidance, behavior promotion, and public opinion control are also not as good as those of VR scene film and television animation. For research question RQ1, the overall entertainment effect of wireless virtual reality scene film and television animation is therefore better, and wireless virtual reality plays a positive role in the entertainment effect of film and television animation in terms of social entertainment and aesthetic experience.

5.3. Research on the Significant Differences in the Impact of Different Forms of Film and Television Animation on Entertainment Effects

5.3.1. Significant Difference in the Effect of Information Perception

To examine the significance of the detailed rules under each general dimension, we use the analysis of variance (ANOVA) in the SPSS statistical software to study the effects of the two indicators of information perception and artistic reflection. The entertainment value of the two is quite different at the social entertainment and aesthetic experience levels, so the independent-samples t-test is used to analyze the effects at these two levels.
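For readers who want to see what the independent-samples t-test computes, a minimal standard-library sketch follows. It is not the SPSS procedure itself: it assumes equal variances (the pooled Student form), returns only the statistic and its degrees of freedom, and uses hypothetical ratings rather than the data of this experiment. The function name `pooled_t_statistic` is likewise an assumption.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(a, b):
    """Student's independent-samples t statistic with pooled variance.

    Returns (t, degrees_of_freedom). The p-value lookup is omitted because
    the standard library does not provide the t distribution.
    """
    na, nb = len(a), len(b)
    if na < 2 or nb < 2:
        raise ValueError("each group needs at least two scores")
    df = na + nb - 2
    # Pooled estimate of the common variance of the two groups.
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))
    return t, df


# Hypothetical 5-point ratings for two forms of film and television animation.
vr_scores = [5, 4, 5, 4, 5]
text_scores = [3, 4, 3, 2, 3]
t, df = pooled_t_statistic(vr_scores, text_scores)
```

The resulting statistic would then be compared against the t distribution with `df` degrees of freedom to obtain the significance value that is checked against 0.05.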

In the analysis of variance, the four details of the information perception effect are used as the dependent variables, and text film and television animation and wireless virtual reality scene film and television animation are used as the factors. The resulting significance test of the information perception effect is shown in Figure 5. It can be seen from Figure 5 that the significance of the five details of the information perception effect is greater than 0.05.
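The F statistic behind a one-way analysis of variance can be sketched with the standard library as shown below. This is an illustrative computation, not the SPSS output: the function name `one_way_anova_f` and the ratings are hypothetical, and the significance value itself would still have to be read from the F distribution.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and more scores than groups")
    grand_mean = mean([x for g in groups for x in g])
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the individual scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n_total - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


# Hypothetical ratings of one detail for two forms (the factor levels).
groups = [[5, 4, 5, 4, 5], [3, 4, 3, 2, 3]]
f, df_between, df_within = one_way_anova_f(groups)
```

A significance value above 0.05 for this statistic means that the null hypothesis of equal group means is not rejected for that detail.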

5.3.2. Significant Difference in the Effect of Artistic Reflection

With the four detailed rules of the artistic reflection effect as the dependent variables and the factors unchanged, the significance test is generated, and the null and alternative hypotheses are set. The significance test of the artistic reflection effect is shown in Figure 6. It can be seen from Figure 6 that the significance of the four secondary indicators of the artistic reflection effect is greater than 0.05. For research question RQ2, the analysis of the artistic reflection effect shows that, compared with text film and television animation, wireless virtual reality scene film and television animation can significantly promote the audience's emotional resonance, moral education, and rational thinking, and that these two forms of film and television animation have a more significant impact on the audience's cognitive enhancement.

5.4. Data Analysis of Usability Test

This test uses the mean value comparison method based on ease-of-use principles to evaluate the ease of use of the wireless virtual reality scene film and television animation interactive system. The mean value of each ease-of-use question is calculated and sorted, and the sums are then calculated, so that each question is associated with the total score, total mean, and comprehensive ranking of its ease-of-use principle. From the final results, we judge and analyze the actual effect of the interactive system, the audience's satisfaction, and the severity of the problems that remain. The higher the effect and satisfaction score, the better the ease of use; for the problem items, the lower the value, the higher the audience's recognition of the corresponding item, that is, the better the ease of use of that principle. The results of the ease-of-use test data analysis are shown in Figures 7 and 8.
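The mean value comparison can be sketched as a short Python routine. The scores below are hypothetical and are not the results reported in Figures 7 and 8; the function name `rank_principles` and the principle "error tolerance" are also assumptions introduced for illustration.

```python
from statistics import mean


def rank_principles(scores_by_principle):
    """Return (principle, mean score) pairs sorted from highest to lowest mean."""
    means = {name: mean(scores) for name, scores in scores_by_principle.items()}
    return sorted(means.items(), key=lambda item: item[1], reverse=True)


# Hypothetical 5-point questionnaire scores grouped by ease-of-use principle.
responses = {
    "state visibility": [5, 4, 5],
    "interactive experience": [4, 4, 5],
    "satisfaction": [4, 4, 4],
    "error tolerance": [3, 2, 4],
}
ranking = rank_principles(responses)
ranked_names = [name for name, _ in ranking]
```

Sorting by the mean gives the comprehensive ranking directly, so the position of a principle such as satisfaction can be read from its index in `ranked_names`.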

It can be seen from Figure 7 that the scores for the principles of state visibility and interactive experience are relatively high, indicating that the state feedback between the user and the system and the interactive experience obtained in the system environment are good. The overall ranking of satisfaction is also relatively high (third), which shows that users are satisfied with and approve of this system. The overall time consumption measured in the test of wireless virtual reality film and television animation perception is shown in Figure 8.

6. Conclusion

In human-computer interaction based on wireless virtual reality perception, being people-oriented is an important design principle. It is therefore of great scientific significance to analyze and extract the user's intention from the user's behavioral representation. Multimodal information carries more semantics than a single modality, but it also produces a lot of redundancy, so it is very important to establish a multimodal fusion intention perception model. To this end, this paper proposes two multimodal fusion models to solve the problem of intention perception, namely, the multimodal fusion intention perception method based on listening sequence (MF-GVM algorithm) and the multimodal fusion intention perception method based on SVM and set theory (MSSN algorithm). Compared with the MF-GVM algorithm, the MSSN algorithm can perceive more complex operation intentions and better supports the perception of dual intentions, so it adapts better to more complex experimental scenarios. The experimental results show that, compared with other forms of film and television animation, the wireless virtual reality scene film and television animation has better entertainment effects and, while enhancing interest, also brings a positive impact to the audience and society. However, the system should pay more attention to the construction and embodiment of film and television animation content, as well as to the optimization and improvement of the error-proofing and fault-tolerant mechanisms in the design process, so as to improve user satisfaction. Generally speaking, the entertainment effect of this system meets the expected goals, but there is still room for improvement in the design of interaction and function, which requires further development of system resources, interaction, and functions.

The multimodal fusion algorithm proposed in this paper relies on a database and a knowledge base. To optimize the algorithm further, the design of the database and the knowledge base must also be optimized. In addition, future work needs to explore multimodal fusion methods that rely less on databases and knowledge bases.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.