GPU based real-time simulation of massive falling leaves

As an important autumn feature, scenes with large numbers of falling leaves are common in movies and games. However, it is a challenge for computer graphics to simulate such scenes in an authentic and efficient manner. This paper proposes a GPU based approach for simulating the falling motion of many leaves in real time. Firstly, we use a motionsynthesis based method to analyze the falling motion of the leaves, which enables us to describe complex falling trajectories using low-dimensional features. Secondly, we transmit a primitive-motion trajectory dataset together with the low-dimensional features of the falling leaves to video memory, allowing us to execute the appropriate calculations on the GPU.


Introduction
Falling leaves, as a universal feature of autumn, are often shown in movies and games. Simulation of falling leaves has been studied for many years in hydrodynamics and physics [1]. However, because the air flow and fluid-solid coupling is complex, it is difficult to develop an accurate physical model for the falling motion of leaves. In computer graphics, to simulate falling leaves, researchers usually use simplified physical models or leaf path templates  [10][11][12][13][14][15][16][17][18][19][20] designed by artists. Most of these methods are unsuitable for GPU parallel processing.
Large quantities of falling leaves cannot be simulated by the CPU in real time. Real-time simulation of large numbers of falling leaves is still a current challenge.
In this paper, we propose a GPU based real-time approach to simulate the motion of many falling leaves. Our major contributions are: • a data-driven approach for leaf motion synthesis, which uses six primitive motions as low-dimensional features to describe a complex falling trajectory; • a GPU based framework for real-time simulation of large numbers of falling leaves. The rest of this paper is organized as follows. In Section 2, we present related research and underlying technologies related to our proposed method. Our method is then outlined in Section 3. In Section 4, we describe how we use low-dimensional features to represent falling trajectories using motion synthesis. The GPU simulation framework is presented in Section 5. Experimental results are given in Section 6, followed by conclusions in Section 7.

Related work
Scientists began to study how to simulate the movement of light yet rigid bodies such as falling leaves as early as 1870 [2]. However, after one and a half centuries, this problem has not yet been fully solved. Prior work for the modelling of falling leaves can be divided into three categories, based on using physical models, trajectory templates, or motion synthesis.
In 1994, Tanabe and Kaneko [3] built a physical model to simulate paper falling in air. They developed a simple phenomenological model of falling paper by solving ordinary differential equations, based on the Kutta-Joukowski theorem. However, they failed to consider the influence of vortices, which makes their simulation less than realistic. Wei et al. [4] proposed an approach to simulate the falling motion of light-weight rigid bodies via simulation of vortices and air motions in space. The authenticity of their work has been widely acknowledged; however, due to the high computational costs, their method cannot be applied to real-time rendering.
In 2008, Vázquez and Balsa [5] used Maya to design templates for the paths of falling leaves. Before rendering, a template is chosen for the path of each leaf. This approach allowed the use of a GPU, making it possible to simultaneously simulate very many falling leaves. Li et al. [6] proposed an approach to generate a 2D trajectory for a falling leaf using simple interaction, which is applicable to 2D animation. Compared to the approach based on vortex simulation, these methods are more suited for use on a GPU, but the authenticity of the scene is compromised.
In 1997, Field et al. [7] suggested use of a Re-I * phase diagram for free-fall motions based on statistics. They pointed out that in fluids with varying Reynolds number (Re) and dimensionless moment of inertia (I * ), various primitive motions exist, such as steady descent (SD), periodic tumbling (PT), transitional chaotic (TC) behaviour, and periodic fluttering (PF). A complex leaf falling motion can be seen as a combination of these primitive motions. After many experiments, Zhong et al. [8] found two other primitive motions, transitional helical (TH) motion and periodic spiral (PS) motion. These ideas have enabled data-driven motion synthesis to become a new way to simulate falling leaves.
In 2014, Xie and Miyata [1] synthesized a complex falling leaf trajectory by the use of motion synthesis, and used the trajectory to model falling leaves. Since each leaf has its own particular trajectory, their simulation is realistic. Compared to a simulation using a physical model, their simulation is much faster: it takes just 3 ms to update each frame when there is only one leaf in the scene. However, the leaves' trajectories need a large amount of memory to store, making this approach impractical for GPU acceleration when there are large numbers of leaves.
As the number of leaves increases in the scene, the frame rendering rate rapidly decreases.
In this paper, we improve upon Xie and Miyata's work, using low-dimensional features to describe falling leaf trajectories, thus permitting GPU acceleration. In this way, we can simulate falling leaves realistically and efficiently. Figure 1 illustrates our method. It has two main steps: motion synthesis, and GPU calculation. In the motion synthesis phase, we establish a trajectory dataset which builds upon collected experimental data to describe six types of primitive motions, which form the basis for the synthesis of sections of each complex falling trajectory. This allows use of low-dimensional features to describe this falling trajectory. The GPU calculation phase commences by sending the primitive-motion trajectory dataset and the leaves' trajectory features to the GPU. During run-time, we input wind field information, the system time, and leaf geometry parameters frame by frame, allowing computation of the positions of the leaves, and the reconstruction of the leaf geometry by the GPU. Finally rendering is performed.

Overview
In the rest of this paper, we focus on two problems: how to synthesize the falling motions of leaves by expressing a leaf-fall trajectory in terms of lowdimensional features, and how to simulate falling leaves on the GPU.

Primitive motions
Falling motions can be divided into several primitive motions if there is no disturbance [1,8,9]. A complex falling trajectory can be represented as the combination of one or more primitive motion trajectories. Following Zhong et al. [8], we divide leaf falling motions [9] into 6 classes. As shown in Fig. 2, there are steady descent (SD), periodic tumbling (PT), transitional chaotic (TC), periodic fluttering (PF), transitional helical (TH), and periodic spiral (PS).
Considering these motions, SD can be seen as a uniform linear motion with slight disturbance. PT, PF, and TC are 2-dimensional motions which can be regarded as the combinations of similar trajectory segments. Following Andersen et al. [10], we may formulate the trajectory segments as where A x and A y are the amplitudes of vertical and horizontal velocities of the falling leaf generated by oscillations due to the surrounding viscous flow respectively, Ω describes the angular frequency of the falling motion, and U is the average descent velocity. We select segments from the function curve with We adjust the parameters in Eq. (1) to get a trajectory segment dataset with differently shaped segments as shown in Fig. 3. We choose suitable trajectory segments from the dataset to construct primitive motions PF, PT, and TC: for PT, we select segments with similar lengths fluttering in the same direction. For PF, we select segments with similar lengths but fluttering in opposite directions. For TC, we randomly select trajectories.
When projected onto the x−y plane, TH and PS  curves have characteristic shapes as shown in Fig. 4. The spiral-motion trajectory is a circle while the helical-motion trajectory is similar to an eight-petal rose curve. These curves can be represented by the following equations: where A e is the amplitude of the elliptical oscillation generated in the x−y plane, E e is the ratio of the minor axis to the major axis of the oscillation ellipse, k is the ratio of the period of elliptical oscillation to that of rotation of the falling object, and h is the height from which the object is released. When E e is close to 0 and k = 1, the curve describes PS motion; when E e is close to 1 and k = 4, the curve can be used to describe TH motion. Thus, we can obtain paths for different primitive motions by adjusting the parameters in the equations above. The rotational motion while falling largely affects the realism of the leaf falling motion. Following Tanabe and Kaneko's work [3], we have the following ordinary differential equation (ODE): where w is the angular velocity of the leaf, ρ is the density of the leaf, and θ and α are the angles the leaf makes with the x−y and x−z planes respectively, k a is the friction against the motion perpendicular to the paper, V is the velocity in the x−y plane. Using this ODE gives authentic angular velocity data for the leaf falling motion. Combining a suitable velocity and angular velocity gives a primitivemotion trajectory for a falling leaf. These are gathered into a dataset. Every primitive-motion trajectory in the dataset contains N vectors of the form (x t , y t , z t , w t ), where x t , y t , and z t refer to the velocity of the leaf in the x, y, and z directions at time t respectively and w t denotes the angular velocity of the leaf at time t. The whole leaf falling motion can be seen as the combination of many primitive motions, while the switches between these primitive motions are not arbitrary.
Xie and Miyata [1] proposed a hypothesis that the vortexes are gradually generated behind the object because of the vorticity of the surrounding flow when an object starts falling from a release point. As a consequence, the motion of a leaf becomes increasingly sensitive to internal forces, which makes the motion unpredictable.
We define the labels L i , i = 1, . . . , 6 to denote primitive motions of types SD, PT, TC, PF, TH, PS respectively. From L 1 to L 6 , the randomness of the primitive motion gradually increases. A falling leaf motion M comprises a sequence of primitive motions M = m 1 m 2 · · · m i , each of which is labelled as above: m i = L j . As shown in Fig. 5, the label subscript sequence L 1 , . . . , L i should be an increasing sequence.

Low-dimensional trajectory feature of falling leaves
Xie and Miyata's approach [1] needs to precompute the whole trajectory of each falling leaf. It performs motion synthesis using the leaves' trajectories, providing strong realism but requiring large amounts of memory, making it inappropriate for simulating many falling leaves. For the same reason, it is unsuitable for GPU use. To solve the problem, we use low-dimensional features to represent leaf trajectories.
We define the feature D to be the trajectory feature for leaf falling motion, denotes the current primitive motion stage, in which L i is the primitive motion index, T i denotes the total time for this motion stage, and P i denotes the starting point for the primitive motion. There are five stages in D, which means there are at most 4 switches in the whole falling motion sequence. From the results of experiments in previous studies, we find that 4 switches are typically sufficient. We get L i from Table 1, which is calculated by a Markov chain model. T i and P i can be obtained from the following equations: Using the 15-dimensional feature D, we can reconstruct the whole falling-motion trajectory. Moreover, by using feature D and the primitivemotion dataset, we do not need to save the whole trajectory data into video memory. We only need to calculate the current velocity and angular velocity of the leaf.

GPU simulation framework
In traditional game engines, particle systems usually operate on the CPU. This places a heavy workload on the CPU, and data exchange between memory and video memory becomes a bottleneck in the simulation of complex particle motions. Recently, GPU based particle systems have been introduced, though most of them only support simple trajectory motions, and are unable to realize complex  calculations needed for tasks such as simulating falling leaves.
To solve the problem, we have built a simulation framework for falling leaves based on the GPUboth calculations and geometry reconstruction are realized on the GPU, which reduces the workload on the CPU as well as data exchange.
As shown in Fig. 6, simulation on the GPU includes four steps: calculation of trajectories, responses of motions to wind, geometrical reconstruction, and rendering of leaves. We discuss these steps in turn in this section.
In the initial stage, we transmit the primitive motion dataset, leaf trajectory features, and initial leaf locations to the video memory. During realtime simulation, the system time and wind field information are updated at each frame. Following previous works on GPU particle systems [11][12][13], we define the following data streams in the compute shader: • Geometric feature streams: these contain the geometric features of each leaf, including leaf size and normal direction, used for the geometry reconstruction, and which can be calculated on CPU. • Trajectory feature streams: these contain the 15-dimensional features for each leaf as discribed in Section 4.2. The stream is used for free-fall motion simulation. • Location and velocity streams: these two streams contain the location, velocity, and angular velocity of each leaf particle. In the compute shader, we first compute the leaf's position in the trajectory. Given a leaf falling motion feature D = {S 1 , S 2 , S 3 , S 4 , S 5 } and the current system time t, we get the current motion state to denote the sampling operation, so L i means primitive motion samples at location i, we get the velocity and angular velocity via the following equation: where V T and Ω T are the velocity and angular velocity of the leaf's motion respectively, and t s is the time when stage S i started. In this way, we get V T and Ω T without reconstructing the trajectory, saving much video memory. The simulation of wind has been discussed in many papers [14][15][16]. Following Mann [14], we simulate wind motion using the following equation: V W = 0.5 + 0.6F W cos(2.4 + 0.2t + x) sin(0.4 + 0.35t + x) + 0.7F W cos(0.3 + 0.1t + x) sin(1.2 + 0.35x) (6) where V W is the current wind speed, x is the xcoordinate in the wind field, and F W is the wind strength controlled by the user. When F W = 1, the variation of wind speed is shown in Fig. 7. For each leaf, we have a parameter W I which describes the wind's effects on the leaf. Equation (7) gives the leaf velocity V and angular velocity Ω in the wind field.
With V and Ω, we can update the leaf's position and normal information stored in the location and velocity streams. After the computation, we have to reconstruct geometrical shape of each leaf to obtain the vertex and topology information [17]. As shown in Fig. 8, a rectangle is used to represent the leaf during  simulation. The representation includes C, the center of the leaf, N or, the normal of the leaf, T an, the tangent to the leaf, and Size x and Size y, the sizes of the leaf. We get the Lup vector by finding the vector perpendicular to T in the plane of the rectangle, allowing us to calculate the positions of the vertices of the rectangle.
Reconstruction runs on the geometry stream in the shader. Using geometric reconstruction technology [18,19], we can avoid transmitting vertex and mesh information from memory to video memory every frame, which greatly increases the rendering rate.
Using the geometric information for the leaves, we render them using vertex and pixel shaders in DirectX 11.

Results
As shown in Fig. 9, we have applied our approach in a virtual landscape application, which has substantially improved the realism of the autumn scene. In order to prove the effectiveness and the advantage of our approach to simulate large numbers of leaves, we have designed a case to compare our approach with Xie and Miyata's. We use a simple garden scene, shown in Figs. 9(c) and 9(d), with varying numbers of trees in it, rendering the scene with two different approaches: 1. Xie and Miyata's approach, computing a trajectory for each leaf, and updating the leaf position on the CPU; 2. our approach, using a low-dimensional feature to rebuild each leaf's trajectory, and the geometry of the leaf, on the GPU. For comparison, both simulation approaches were built with Microsoft Visual Studio 2010 and DirectX 11. Evaluation was performed on a desktop computer running Windows 7, an Intel Core i5-4570 CPU, 8 GB of memory, and an nVidia GTX 650TI video card.
Without falling leaves, the frame rates for both examples was close to 1200 fps. We then started the simulation, ran the programs for 40 s, and recorded the average fps and CPU utilization. Table 2 gives our test results. Approach 1 provides fewer fps and has higher CPU utilization. As the quantity of the leaves reaches 50,000, fewer than 30 fps are achieved, which is unsuitable for real-time rendering. For approach 2, even for 1 million leaves, the frame rate is still 75 fps, and the CPU utilization rate is also relatively low. Our approach clearly has

Conclusions
This study proposes an approach to realize real-time simulation of huge numbers of falling leaves based on the GPU. While ensuring realism of simulation, we have substantially improved the simulation efficiency with the ability to simulate over 1,000,000 leaves simultaneously.
This approach is not limited to the simulation of falling leaves, but is also applicable to other light rigid bodies falling such as feathers, paper scraps, and little plates.
In our future work, we would like to improve the level of realism. We intend to add a collision response mechanism to calculate collision responses between the leaves, as well as between the leaves and other objects. Such an implementation needs to be highly efficient since we will have to conduct collision detection and collision response calculations for large numbers of objects in real time.