Abstract

In recent years, the interdisciplinary exploration of combining traditional visual communication design with somatosensory interaction technology has become a new form of artistic expression. In order to explore the feasibility of somatosensory interaction technology in visual communication design, this study proposes a 2D dynamic graphic generation method based on somatosensory interaction parameterisation and uses traditional Chinese elements as an example of its specific application in visual communication design. Firstly, the motion parameters recognised by the Kinect somatosensory interaction device are bound to the function variables used to generate the images in the development environment, thus enabling human somatosensory interaction with different characters in the scene. Secondly, a linear discriminant analysis based on kernel functions is used to reduce the dimensionality of the vector space, thus solving the problem of capturing human movements accurately and in real time. Then, using the skeletal parameter binding technique, the association between the motion parameters of the somatosensory interaction and the two-dimensional dynamic graphics is achieved. The experimental results show that the visual communication technique based on somatosensory interaction has high recognition accuracy. In contrast to traditional digital video, the proposed method can greatly enrich the visual representation of traditional Chinese elements.

1. Introduction

In recent years, there have been many points of convergence between the rapidly developing digital economy and the cultural industries. Science and art are beginning to merge in terms of technology and form. In today's cultural communication scenarios, audiences increasingly expect an engaging live experience. However, in the current presentation of design content, spatial interpretation is still usually done by means of printed posters or offline videos [1–5]. Because the design content itself is self-contained, the audience cannot communicate with it. To achieve real-time interaction between the design content and the audience, a fundamental change in the design approach is required. The interdisciplinary exploration of traditional visual communication design in combination with somatosensory interaction is a new solution to these problems.

In the landscape of museum exhibitions and tourism development, designers increasingly use immersive interactive experiences to attract foot traffic. Unlike traditional interface interaction, somatosensory interaction emphasises the use of body movement responses to communicate with the product [6–8]. As a representative of natural human-computer interaction (HCI), somatosensory interaction marks a new stage of technological development. With good immersion, a low learning cost, and a good user experience, somatosensory interaction has broad application prospects and has received attention from scholars in various fields at home and abroad [9–12]. More advanced somatosensory interaction first appeared around 2007, when Nintendo combined somatosensory control with gaming and introduced the concept of “health gaming.” In 2010, Microsoft launched Kinect v1.0, released with the Xbox 360 console, which won a high reputation and strong sales. At this stage, Kinect v1.0 began to be used in the medical, fitness, and retail industries, with various leading-edge applications being attempted. As a noncontact means of interaction, somatosensory interaction has a natural advantage for applications in immersive, large-scale digital spaces. Currently, rapid advances in science and art are driving increasing demand for spatial visual applications. The use of interactively updated visual information to improve spatial functionality is a future area for joint development across media.

In order to explore the feasibility of somatosensory interaction technology in visual communication design, this study proposes a 2D dynamic graphics generation method based on somatosensory interaction parameterisation and uses traditional Chinese elements as an example for specific applications in visual communication design. It is important to note that somatosensory interaction technology requires real-time and accurate capture and recognition of human movements. Therefore, this paper employs the latest Kinect v2.0 device and designs a method for action parameter feature extraction based on linear discriminant analysis.

2. Related Work

Humans do not always speak the truth, but body language often expresses their truest emotions. Therefore, the recognition of human body movements has been an important research direction in the field of computer image recognition [13–16]. Currently, popular algorithms for body movement recognition include backpropagation (BP) neural networks, decision trees, and Support Vector Machines (SVMs), among others.

At this stage, the most commonly used hardware device for body movement recognition is the Kinect body sensor. For example, Ying et al. [17] proposed a human motion recognition method using the Kinect body sensor. Lai et al. [18] proposed a real-time, DSP-based portable human gesture recognition system using a combined spectral analysis and linear discriminant analysis (LDA) strategy. Although these two methods improve the accuracy and efficiency of action recognition, respectively, practical applications involve many more action types. Moreover, human actions in Kinect-based somatosensory interaction are more complex, producing high-dimensional action data in real time, so the dimensionality of the vector space must be reduced as much as possible in order to recognise more action types in real time.

The main objective of this research is to visualise 2D motion graphics using the Kinect v2.0 device and apply them to visual communication design. When the body posture data obtained from the Kinect v2.0 device are used as a “parameter” in the 2D motion graphics, changes in body posture can trigger changes in the 2D motion graphics, creating an interaction between the body and traditional Chinese elements. When a person walks across the screen, the Kinect sensor recognises the person’s x-coordinates and maps them to the computer, which then projects the motion parameters onto the screen via a projector, causing the traditional Chinese elements in the 2D graphics to pan dynamically. The experimental results validate the effectiveness and accuracy of the proposed method.

The main innovations and contributions include the following: (1) This paper proposes an LDA-based feature extraction method for motion parameters in order to improve the recognition accuracy of the Kinect v2.0 device. (2) Using the binding technique of skeletal parameters, the association between somatosensory interaction motion parameters and 2D dynamic graphics is achieved.

3. Kinect v2.0-Based Parameterisation of Somatosensory Interaction

3.1. Key Technologies

Microsoft first announced Kinect in June 2009 [19, 20] with the hope that the hardware would merge motion with communication. The device was officially launched in November 2010, and in May 2013, Microsoft demonstrated the next-generation Kinect v2.0, which allows developers to design applications based on the voice, gesture, and player sensory information sensed by Kinect v2.0, bringing users an unprecedented interactive experience. In this paper, we use the Kinect v2.0 device as the base hardware for development.

The colour resolution of Kinect v2.0 has been dramatically increased from 640 × 480 to 1920 × 1080, enabling much higher-quality images to be acquired. Kinect v2.0 can skeleton-bind up to six simultaneously identified users and track 25 key skeletal points for each. Also, because of the increased resolution of the depth sensor, the user can be separated from the background with a simple segmentation, and the detection range has been increased from 0.8–4 m to 0.5–4.5 m. It is important to note that the infrared sensor does not require ambient light, i.e., it can still be used in dark or dimly lit environments.

Kinect consists of a colour camera, an IR camera, and an IR projector with a microphone array underneath [21–23], as shown in Figure 1. The IR camera and IR projector work together to achieve the depth image function. The main hardware features of Kinect v2.0 are shown in Table 1.

The colour image is based on the data stream acquired by the colour camera on the far left of Kinect v2.0. The colour camera sensor is shown in Figure 2.

Depth imaging and skeleton tracking are the dominant, and most representative, technologies in the Kinect device. In addition to a grey-scale value, each pixel of the depth image contains information about the distance of the corresponding object point from the camera’s viewpoint. When enough such pixels are available, they can form a point cloud that recreates the geometry of the object as well as its position and distance. The closer an object is, the darker its colour in the depth image; the more distant it is, the whiter it appears. The time-of-flight (TOF) technique emits infrared light and recovers distance by detecting the phase shift of the returned signal, as shown in Figure 3.

The pixel information obtained from the depth image contains two parts: the higher-order 13 bits encode the distance (depth) of the detected object pixel, while the remaining lower-order bits encode the player index. The Kinect sensor can detect the target human depth information in the valid range of 0.6 m to 4.5 m. The depth image pixel information is shown in Figure 4.
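As an illustration, the following is a minimal sketch of unpacking one such depth pixel, assuming a packed 16-bit layout with the depth value in the higher-order 13 bits and a player index in the lower-order 3 bits (the exact layout depends on the SDK version); the function name and sample values are illustrative.

```python
# Minimal sketch: unpack one packed depth pixel, assuming the 16-bit
# layout described above (high 13 bits = depth in millimetres,
# low 3 bits = player index). Values are illustrative only.
def unpack_depth_pixel(pixel: int) -> tuple:
    depth_mm = pixel >> 3        # higher-order 13 bits: distance
    player_index = pixel & 0x7   # lower-order 3 bits: player index
    return depth_mm, player_index

# A pixel encoding a depth of 1200 mm for player 2:
print(unpack_depth_pixel((1200 << 3) | 2))  # -> (1200, 2)
```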

3.2. Feature Extraction and Recognition of Action Parameters

The motion data captured in Kinect-based somatosensory interaction are complex. In particular, when the set of actions to be recognised is large and variable, the action data can contain a large number of high-dimensional nonlinear features. Therefore, this paper uses linear discriminant analysis based on kernel functions to reduce the dimensionality of the vector space and thus solve the problem of capturing human actions accurately and in real time.

Due to the complexity and variability of human actions in VR scenes, a purely linear method cannot extract some of the important high-dimensional nonlinear feature information hidden in the action data. Therefore, this paper introduces a kernel function into the LDA algorithm to perform a nonlinear projection and extract the expressive features.

In the Kinect-captured human movement dataset, let $A = [a_1, a_2, \ldots, a_n] \in \mathbb{R}^{m \times n}$ be the action matrix and $A = [A_1, A_2, \ldots, A_k]$ be the full-rank matrix partitioned by class labels in the LDA algorithm [24, 25], where each $a_j$ is a data point in an $m$-dimensional space. Each block matrix $A_i \in \mathbb{R}^{m \times n_i}$ is the set of data items in class $i$, $n_i$ is the size of class $i$, and the total number of data items in the data set is $n = \sum_{i=1}^{k} n_i$. Let $N_i$ denote the index set of the columns belonging to class $i$. The global centre $c$ of $A$ and the local centre $c_i$ of each class $A_i$ are denoted, respectively, as follows:

$$c = \frac{1}{n} \sum_{j=1}^{n} a_j, \qquad c_i = \frac{1}{n_i} \sum_{j \in N_i} a_j. \tag{1}$$

Suppose that the following settings are met:

$$S_b = \sum_{i=1}^{k} n_i (c_i - c)(c_i - c)^{T}, \qquad S_w = \sum_{i=1}^{k} \sum_{j \in N_i} (a_j - c_i)(a_j - c_i)^{T}, \qquad S_t = S_b + S_w, \tag{2}$$

where $S_b$, $S_w$, and $S_t$ are referred to as the interclass scatter matrix, intraclass scatter matrix, and total scatter matrix, respectively.

The standard LDA objective function can be shown as follows:

$$W^{*} = \arg\max_{W} \frac{\operatorname{tr}\left(W^{T} S_b W\right)}{\operatorname{tr}\left(W^{T} S_w W\right)}, \tag{3}$$

where $W$ is the projection matrix that maps the $m$-dimensional action data into a lower-dimensional discriminant space.

As the standard LDA algorithm uses a linear computational principle, it is less effective in dealing with nonlinear problems and suffers from a singularity problem when $S_w$ is rank-deficient. Therefore, kernel function-based LDA is used to reduce the dimensionality of the vector space and thus effectively extract the nonlinear features in human action data.

Let the kernel matrix be $K \in \mathbb{R}^{n \times n}$, where $K_{pq} = \kappa(a_p, a_q) = \langle \phi(a_p), \phi(a_q) \rangle$ and $\phi$ is the implicit nonlinear mapping of the action data into the kernel feature space $\mathcal{F}$. The Fisher criterion function in $\mathcal{F}$ can be expressed as follows [26]:

$$J(\alpha) = \frac{\alpha^{T} M \alpha}{\alpha^{T} N \alpha}, \tag{4}$$

where $\alpha$ is the kernel space projection vector, i.e., the projection direction in $\mathcal{F}$ is $w = \sum_{j=1}^{n} \alpha_j \phi(a_j)$, and

$$(M_i)_p = \frac{1}{n_i} \sum_{q \in N_i} \kappa(a_p, a_q), \qquad (M_*)_p = \frac{1}{n} \sum_{q=1}^{n} \kappa(a_p, a_q), \tag{5}$$

where $M_i$ is the kernel-space mean of the $i$th class in $\mathcal{F}$, $M_*$ is the overall mean, and $N$ is the intraclass scatter matrix in $\mathcal{F}$. The numerator term can be expressed as follows:

$$M = \sum_{i=1}^{k} n_i (M_i - M_*)(M_i - M_*)^{T}, \tag{6}$$

where $M$ denotes the scatter matrix between kernel classes, so that $T = M + N$ denotes the overall scatter matrix of kernels [27]. Equation (6) is complemented by

$$N = \sum_{i=1}^{k} K_i \left(I - \frac{1}{n_i} \mathbf{1}_{n_i} \mathbf{1}_{n_i}^{T}\right) K_i^{T}, \tag{7}$$

where $N$ is the kernel intraclass scattering matrix, $K_i \in \mathbb{R}^{n \times n_i}$ is the submatrix of $K$ whose columns are indexed by $N_i$, and $\mathbf{1}_{n_i}$ is the all-ones vector of length $n_i$. For any collected human action data point $a$, its kernel space projection is $w^{T} \phi(a) = \sum_{j=1}^{n} \alpha_j \kappa(a_j, a)$. Let $\{\alpha^{(1)}, \ldots, \alpha^{(d)}\}$ denote the set of feature vectors of the optimal solution of equation (4); then, we obtain the kernel space projection matrix $W_{\kappa} = [\alpha^{(1)}, \ldots, \alpha^{(d)}]$ [28, 29].

Finally, the projected features are fed into an SVM classifier, with which the classification and recognition of complex actions is achieved.
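To make the pipeline concrete, the following is a minimal sketch of the same processing chain (a kernelised feature map, LDA-based dimensionality reduction, and an SVM) in Python with scikit-learn. The synthetic data, kernel parameters, and dimensions are illustrative assumptions, not the authors' actual implementation; the Nystroem map is used here as a practical stand-in for the exact kernel LDA derivation above.

```python
# Sketch: (approximate) kernel LDA projection + SVM action classifier.
# Synthetic features stand in for Kinect action data; all sizes and
# hyperparameters below are illustrative assumptions.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.kernel_approximation import Nystroem
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_classes, n_per_class, dim = 9, 100, 75   # e.g. 25 joints x (x, y, z)
X = np.vstack([rng.normal(loc=i, scale=1.0, size=(n_per_class, dim))
               for i in range(n_classes)])
y = np.repeat(np.arange(n_classes), n_per_class)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

model = make_pipeline(
    Nystroem(kernel="rbf", gamma=0.01, n_components=200,
             random_state=0),                               # kernel feature map
    LinearDiscriminantAnalysis(n_components=n_classes - 1),  # supervised reduction
    SVC(kernel="linear"),                                    # final classifier
)
model.fit(X_tr, y_tr)
print("test accuracy:", model.score(X_te, y_te))
```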

4. Two-Dimensional Motion Graphics in Visual Communication

4.1. Association of Movement Parameters with Two-Dimensional Dynamic Chinese Traditional Elements

Somatosensation is the perception of body posture, which can be achieved through a variety of sensors. Sensors can capture changes in our body posture, including the position of the head and individual joints, the speed and direction of movement, and even facial expressions, joint flexion of the hands, gestures, and so on. Essentially, these changes in body posture are changes in data, and changes in the posture data parameters in turn cause changes in the generated graphics. When the body posture data obtained from somatosensing are used as a “parameter” in 2D dynamic graphics, changes in body posture naturally lead to changes in the 2D dynamic graphics. This creates an interaction between body sensing and 2D motion graphics, enabling the association of body sensing with 2D motion graphics. From a design and user experience perspective, this involves the user’s behaviour in the design, resulting in a better user experience.
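For illustration, here is a minimal sketch of this “posture as parameter” idea: a hypothetical function in which a viewer’s tracked horizontal position pans a 2D artwork layer. The function name, sensor range, and screen width are illustrative assumptions rather than the installation’s actual code.

```python
# Sketch: bind a tracked body x-coordinate (metres) to the pan offset
# (pixels) of a 2D graphics layer. Ranges are illustrative assumptions.
def pan_offset(body_x: float,
               sensor_range: tuple = (-2.2, 2.2),  # assumed horizontal span
               screen_width: int = 1920) -> int:
    """Map the viewer's horizontal position to a pixel pan offset."""
    lo, hi = sensor_range
    t = (min(max(body_x, lo), hi) - lo) / (hi - lo)  # normalise to [0, 1]
    return int(t * screen_width)

# A viewer tracked at x = 0.5 m pans the layer to the matching offset:
print(pan_offset(0.5))
```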

As shown in Figure 5, the Cleveland Museum of Art in Ohio, USA, has developed a somatosensory interaction installation. In this installation, when a person approaches the screen, the image unfolds to form a Dunhuang fresco containing traditional Chinese elements. The Kinect body sensing device captures the movement signals of the audience and maps the recognised movement parameters to the traditional Chinese figures on the mural. This interactive approach allows the audience to control the movement of a character in the image.

The motions of the character’s arms, head, body, and legs in the motion graphics are bound to the viewer’s captured bone points, allowing the viewer to directly manipulate the character on screen. This type of interaction allows for a more fluid experience: the characters in the Dunhuang murals can be synchronised with the viewer’s movements. By binding the motion parameters recognised by the Kinect interaction device to the function variables used to generate the images in the development environment, human interaction with the different characters in the murals can be achieved. Viewers standing in different areas can thus interact with the motion graphics on different screens.

4.2. Binding of Skeletal Parameters

The most resource-intensive part of the Kinect hardware is its skeleton tracking technology: the skeleton coordinate system accounts for about 50% of the resources of the entire Kinect system. Using this technology, 2D motion graphics applications based on somatosensory human-computer interaction can be developed. Kinect v2.0 can simultaneously identify the position information of six people and bind each user’s pose to the coordinate information of key bone points. Skeleton data is also one of the most commonly used inputs for action recognition based on Kinect hardware. The correspondence between the human skeleton recognised by Kinect and the character’s bones in the mural is shown in Figure 6.

Each bone point is represented by a Joint. Kinect v2.0 adds 5 new bone points to the 20 of the previous version, giving 25 in total, each with a corresponding name in the JointType enumeration, as shown in Table 2.
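As a sketch of how such a binding might look in code, the following hypothetical mapping links a few Kinect JointType names to bones of a mural character rig; the rig bone names and the flat dictionary representation are illustrative assumptions, not the actual installation code.

```python
# Sketch: bind Kinect JointType names to hypothetical mural rig bones.
# Rig bone names are illustrative assumptions.
KINECT_TO_RIG = {
    "Head":       "mural_head",
    "SpineMid":   "mural_torso",
    "ElbowLeft":  "mural_arm_L",
    "ElbowRight": "mural_arm_R",
    "KneeLeft":   "mural_leg_L",
    "KneeRight":  "mural_leg_R",
}

def apply_pose(joints: dict, rig: dict) -> None:
    """Copy tracked joint positions onto the bound rig bones."""
    for joint_name, bone_name in KINECT_TO_RIG.items():
        if joint_name in joints:
            rig[bone_name] = joints[joint_name]  # e.g. an (x, y) position
```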

5. Experimental Results and Analysis

5.1. Experimental Setup

In order to implement the interactive features offered by the Kinect somatosensory device, the Kinect development kit needs to be installed before running the device. The main development tool currently available is the official Microsoft SDK: the colour images and depth information obtained by the Kinect camera are managed through SDK 2.0, shared with Unity 3D, and controlled by scripted functions. In the development environment configuration, the Kinect for Windows SDK 2.0 tool needs to be applied. The Kinect v2.0 device has a colour resolution of 1920 × 1080 and an FPS value of 30. The experiments were carried out using the official Kinect wrapper plugin. The experimental hardware configuration is shown in Table 3.

During the experiment, when a person walks across the screen, the Kinect sensor recognises the person’s x-coordinates and maps them to the computer algorithmically. The computer then projects the motion parameters onto the screen through a projector, causing the two-dimensional graphics in the mural to pan dynamically. The experimental environment for somatosensory interaction is shown in Figure 7.

5.2. Visual Communication Effects Based on Somatosensory Interaction

The Kinect recognition range is 0.5–4.5 m. Unless there are special circumstances, the whole-body skeleton is generally selected for recognition, and full-body sensory interaction is enabled through subsequent program control. The Kinect v2.0 device eliminates the tilt motor of the previous generation, so the capture angle needs to be set manually. Therefore, after the capture angle has been determined, a horizontal-plane distance conversion is performed for the farthest interaction distance threshold. The closest distance threshold is the horizontal-plane distance at which the colour image just fully accommodates the user’s entire body. In the absence of special requirements, the thresholds can be contracted by 10% to ensure tracking stability. After the interaction range was set for the real environment, interaction capture detection was performed, and the experimental results are shown in Table 4.
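As a small worked example of this threshold setup, the sketch below contracts an assumed raw interaction band by 10% at each end; both the raw distances and the per-end reading of the 10% contraction are illustrative assumptions.

```python
# Sketch: contract the interaction thresholds by 10% for stability.
# Raw distances and the per-end interpretation are assumptions.
near_raw, far_raw = 0.5, 4.5          # metres, Kinect v2.0 detection range
margin = 0.10 * (far_raw - near_raw)  # 10% of the raw band
near, far = near_raw + margin, far_raw - margin
print(f"effective interaction range: {near:.2f} m to {far:.2f} m")
```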

Using the test data recognised by Kinect, the results and statistical analysis of the nine human movements were derived, as shown in Figures 8 and 9, respectively.

As can be seen in Figures 8 and 9, the visual communication technology based on somatosensory interaction achieved averages of 92.1% and 92.2% on the precision and accuracy metrics, respectively. This indicates that the technology exhibits excellent performance across the nine action types. Unlike traditional digital video, which offers no interactive experience, somatosensory interaction establishes a relationship between the person and the digital landscape. Viewers can feel as if they are in the space depicted in the mural and can move freely. This allows the viewer and the digital work to break out of a binary spatial relationship and achieve a more multidimensional spatial expansion. The application of somatosensory interaction to the visualisation of two-dimensional motion graphics can greatly enrich the visual representation of traditional Chinese elements.

6. Conclusion

In order to explore the feasibility of somatosensory interaction technology in visual communication design, this study proposed a 2D motion graphics generation method based on somatosensory interaction parameterisation and used traditional Chinese elements as an example of its specific application in visual communication design. The latest Kinect v2.0 device was used, and an LDA-based feature extraction method for action parameters was designed to improve the recognition accuracy of the Kinect v2.0 device. Using the skeletal parameter binding technique between the human body and the characters in the graphics, the association between somatosensory interaction motion parameters and 2D dynamic graphics was realised, thus greatly enriching the visual representation of traditional Chinese elements.

Data Availability

The experimental data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest to report regarding the present study.

Acknowledgments

This study was supported by Research on the Improvement of Jinzhong Intangible Cultural Heritage Technology and the Research and Development of “Jinzhong Gift” Cultural and Creative Design from the Perspective of Cultural Tourism Integration (No. jzwhlyj2021hx).