Feature Extraction of Broken Glass Cracks in Road Traffic Accident Site Based on Deep Learning



Introduction
Automobile glass is one of the important components of an automobile. It not only provides insulation against wind, rain, heat, and sound but also serves as a safety component of the car. In recent decades, especially with the introduction of the panoramic sunroof, the glass area used in cars has grown larger [1]. Because traffic accidents are frequent and complex, automobile safety has always been one of the key research topics in many countries [2,3]. Traffic accidents are mainly divided into human-vehicle accidents, vehicle-vehicle accidents, and vehicle-environment accidents. The uncertainty of pedestrian movement direction is one of the important factors leading to traffic accidents. In human-vehicle accidents, the occupants of the car are protected by safety belts, airbags, and other devices, whereas pedestrians have almost no protective measures, so the injury and death rate of pedestrians struck by high-speed vehicles is much higher than that of the occupants in the car [4].
Deep learning is a method of high-dimensional abstraction of raw data that builds deep structures or stacks complex processing layers to obtain an effective representation of the raw data. Early neural networks were devoted to the processing of vectors: a one-dimensional vector represents the original data, and the deep structure learns to abstract the characteristics of glass breakage and cracks at road traffic accident scenes from high-dimensional sparse data [5,6]. The defect of this deep learning framework is that it has too many neural nodes and requires a large amount of calculation, so it depends on the performance of the hardware. In image recognition, flattening the image into a one-dimensional vector destroys its spatial structure and local correlation [7]. Feature extraction of broken glass cracks in road traffic accident scenes means extracting, from the original complex data, crack features that can be used for specific tasks [8]. The expressive ability of these features is closely related to the performance of the corresponding tasks. The technology for extracting the characteristics of glass breakage and cracks at the scene of road traffic accidents can be divided into two types: feature engineering and feature learning [9]. Feature engineering must rely on people's prior knowledge and experience, and corresponding extraction techniques can be designed for different tasks to achieve certain results; feature learning learns the feature representation of glass cracks at road traffic accident scenes from the original complex data [10].
Unlike the fixed process of feature engineering, feature learning avoids excessive reliance on prior knowledge and is universal. Especially since deep neural networks were proposed, the technology for learning the characteristics of glass breakage and cracks in road traffic accidents has made great progress. It can extract feature representations with high generalization ability and can improve the use of artificial intelligence in various applications.
In this paper, the convolutional layers of deep Convolutional Neural Networks (CNNs) are used to study the feature extraction and middle-level representation of glass cracks at local road traffic accident scenes. Two CNN models with different depths, Caffe Net and VGG-VD16, are used. The input of each CNN model adopts an image pyramid, so that deep convolutional-layer features of glass breakage and cracks are extracted from the input image at different scales. Specifically, the technical contributions of this article can be summarized as follows.
First, in order to enhance the distinguishability of depth descriptors, Hellinger kernel and Principal Component Analysis (PCA) feature transformations are introduced. We discuss two strategies for aggregating depth descriptors to form the global representation of the image. "Descriptor-level aggregation" aggregates the depth descriptors extracted by Caffe Net and VGG-VD16 together and enhances the robustness of the feature encoding of glass breakage and cracks at road traffic accident scenes by increasing the number of descriptors. "Middle-level feature-level aggregation" first performs Fisher vector coding on the depth descriptors extracted by Caffe Net and VGG-VD16 separately and then merges the two middle-level coding results.
Second, the experiments show that the Hellinger kernel, the PCA transformation, and the two aggregation strategies all help improve the classification performance on broken glass images from road traffic accident scenes.
Through the force on the glass specimen and the center displacement of the specimen recorded by the sensor during impact, the force-displacement curve of the specimen is integrated to obtain the energy consumption value of the specimen. The comparison shows that, in terms of impact resistance, Polyvinyl Butyral (PVB) laminated glass is stronger than flat glass and that, for the same type of glass specimen, four-frame-supported glass is stronger than four-point-supported glass. Fragment analysis was carried out on the flat glass specimens. For flat glass with four-frame support, the main crack form is a radial crack, and the resulting fragments are mainly sharp dagger-shaped fragments. For four-point-supported flat glass, in addition to dagger-shaped fragments, scaly fragments appear near the boundary.

Related Work
Relevant scholars proposed continuum damage mechanics and applied it to an axisymmetric finite element model to study the failure mode and energy absorption characteristics of windshield glass under the impact of a featureless head model [11]. A nine-node Lagrange element finite element model of laminated glass was built, and the influence of boundary conditions, stacking sequence, glass size, collision speed, and impactor mass on the energy absorption characteristics of laminated glass at low speed was analyzed. Relevant scholars studied the influence of the failure threshold of laminated glass on the test results through a series of simulation experiments; the researchers used 3D solid elements to simulate windshield glass and studied its failure behavior under explosive load [12]. In the model, the glass layer adopts the maximum principal stress criterion to define failure, and the PVB layer adopts elastic-plastic and hyperelastic material models [13]. The reliability of the modeling method is demonstrated through a four-point bending test. Relevant scholars used membrane elements to simulate the PVB layer, established a finite element model of laminated glass, fitted the experimental stress-strain curve of PVB with the Blatz-Ko, Mooney-Rivlin, and Ogden constitutive relationships, and brought the fitted model into the laminated glass material card [14]. A test of a featureless head-shaped impactor striking the windshield glass was then carried out. The researchers used LS-DYNA No. 110 material to simulate soda-lime glass and MAT24 material to simulate the PVB interlayer, established a finite element model of laminated glass, and explored the influence of material parameters on the characteristics of laminated glass under explosive load [15].
Relevant scholars tried, for the first time, to use a deep convolutional network for recognition, localization, and detection tasks [16]. The features learned by the Convolutional Neural Network were used as the input of a regression network to learn the coordinates of objects. The features learned by the convolutional network include not only classification information but also information such as the location and size of the object. The researchers replaced the softmax classifier in AlexNet with regression and reset the regression objective function for target detection [17]. Related scholars used convolution kernels of different sizes in each layer and set the number of convolution kernels according to the size of the convolution template. To reduce the number of parameters, a 1 × 1 convolution kernel is used in the Inception V1 structure to reduce dimensionality, and an average pooling layer replaces the traditional fully connected layer before classification. To prevent vanishing gradients, two auxiliary loss functions that generate gradients are added in the middle of the network structure. In addition, its training enriched the data augmentation technology by adding image data with different aspect ratios, different resolutions, and different image encodings. The researchers proposed a 19-layer VGG network structure [18]. By constructing the network structure regularly, VGG eliminates a large number of hyperparameters.
The authors add convolutional layers between each pair of pooling layers of the convolutional network, all with a 3 × 3 convolution kernel. The receptive field of two stacked 3 × 3 convolution kernels is equivalent to that of a 5 × 5 convolution kernel, and the receptive field of three stacked 3 × 3 convolution kernels is equivalent to that of a 7 × 7 convolution kernel [19]. This stacking strengthens the expressive ability of the convolutional layers while reducing the number of parameters. During training, the network uses a random multiresolution method, and during testing, it averages the results of tests at multiple resolutions. The researchers noted that the input to the fully connected layer in a deep convolutional network has a fixed size, but scaling the feature map or image of the glass breakage crack at the road traffic accident site to that size loses information. The pooling layer therefore processes input features of different sizes into output features of uniform size.
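The receptive-field and parameter argument for stacked 3 × 3 kernels can be checked with a few lines of arithmetic. The sketch below is only illustrative: it assumes stride-1 convolutions without dilation, ignores biases, and uses a hypothetical channel width of 64 that does not come from the text.

```python
def stacked_receptive_field(kernel_size: int, num_layers: int) -> int:
    """Receptive field of `num_layers` stacked stride-1 convolutions of equal kernel size."""
    field = 1
    for _ in range(num_layers):
        # Each stride-1 layer widens the receptive field by (kernel_size - 1).
        field += kernel_size - 1
    return field


def conv_weight_count(kernel_size: int, channels: int, num_layers: int = 1) -> int:
    """Weights (biases ignored) of layers that map `channels` inputs to `channels` outputs."""
    return num_layers * kernel_size * kernel_size * channels * channels


CHANNELS = 64  # hypothetical width, chosen only for illustration

two_stacked = stacked_receptive_field(3, 2)    # same field as one 5 x 5 kernel
three_stacked = stacked_receptive_field(3, 3)  # same field as one 7 x 7 kernel
weights_three_3x3 = conv_weight_count(3, CHANNELS, num_layers=3)
weights_one_7x7 = conv_weight_count(7, CHANNELS, num_layers=1)

print(two_stacked, three_stacked)
print(weights_three_3x3, weights_one_7x7)
```

Under these assumptions, three stacked 3 × 3 layers cover a 7 × 7 field with 27 weights per channel pair instead of 49.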
Related scholars proposed the deep residual network (ResNet) in the ImageNet large-scale visual recognition challenge [20]. Unlike previous neural networks that tried to use convolutional layers to learn the complete mapping, residual networks use convolutional layers to learn the residual part, which reduces the difficulty of learning and prevents the gradient from vanishing during backpropagation [21]. Combined with batch normalization and MSRA initialization, the residual network can reach a depth of hundreds or even thousands of layers and has achieved excellent accuracy in image classification tasks. Inspired by the residual network, the researchers put forward the Inception-v4 network, tested Inception-v3 with identity connections, and verified that the residual connection can significantly shorten the training time and improve performance to some extent. Relevant scholars proposed wide residual networks [22-25]. Based on residual networks, they further explored the influence of width on neural networks and showed through experiments that width is more efficient than depth [26,27]. For example, a 16-layer widened residual network performs better than a thousand-layer residual network and can save nearly half of the training time.

Fatigue Crack Growth Mechanism of Fiber-Reinforced Aluminum Alloy Laminates.
As shown in Figure 1, under high-cycle fatigue load, the fiber resin layer does not break in the unsawn area of the fiber-reinforced aluminum alloy laminate. Therefore, the intact fibers inhibit the opening of cracks in the aluminum alloy layer and have a bridging effect on the cracks.
Part of the load that the aluminum alloy should bear is transferred to the fiber resin layer, and the bridging fibers bear additional stress, which is called bridging stress. The fiber resin layer hinders the crack surfaces of the aluminum alloy from opening and so slows down crack propagation in the aluminum alloy layer. The main mechanism is the bridging effect of the fiber resin layer on the crack of the aluminum alloy layer, so that the crack bears a smaller opening load. Under the combined action of the remote load σAl and the fiber bridging, the plastic zone size at the crack tip of the aluminum alloy layer is reduced. According to the theory of incremental plastic damage, the reduction of crack tip plastic damage reduces the crack growth rate. At the same time, the bridging effect also causes shear deformation between the metal layer and the fiber resin layer, resulting in delamination growth, and the growth of the delamination changes the fiber bridging efficiency.

Two-Dimensional Model of Fatigue Crack Bridging of Fiber-Reinforced Aluminum Alloy Laminates.
In order to account for the stress ratio effect and the compressive load effect in the fatigue crack growth of fiber-reinforced aluminum alloy laminates, the incremental plastic damage theory is introduced into the two-dimensional crack bridging model. Using the two-dimensional elastoplastic finite element method, the fatigue crack growth rate is predicted by calculating the size of the plastic zone at the crack tip of the aluminum alloy layer under the action of fiber bridging. The two-dimensional model focuses on the bridging effect of the fiber resin layer on the cracks and ignores some three-dimensional features.
The model ignores the change in the crack length of the aluminum alloy layer along the thickness direction and the difference in crack length between the aluminum alloy layers, and the material constants of the aluminum alloy layer with mode I cracks are set equal to those of the aluminum alloy material. In this paper, an ideal elastoplastic material is used to simulate the elastoplastic behavior of the aluminum alloy. The crack propagates along the x-axis, and a tension-tension or tension-compression cyclic load is applied in the y-axis direction; $a$ is the half crack length, and $a_0$ is the saw-cut crack length.

Combination of Incremental Plastic Damage Theory and Two-Dimensional Crack Bridging Model.
Under elastoplastic conditions, when the crack length and the delamination boundary are constant, the size of the plastic zone at the crack tip is determined by the effective remote load σAl of the aluminum alloy layer and the bridging load distribution. The bridging stress distribution can be determined by the delamination size, and the delamination boundary is updated as the crack propagates. This study first uses the elastic-plastic finite element method to calculate the crack surface opening displacement of the aluminum alloy layer for a given crack length and delamination boundary. According to the equilibrium relationship of the crack bridging model of the fiber-reinforced aluminum alloy laminate, the bridging load on the boundary is then obtained by iterative calculation. Finally, under the combined action of the external load σAl and the bridging load, the size of the plastic zone at the crack tip is solved.

Partial Coding.
Reconstruction-based coding approximates the sample data by a linear combination of a few basis vectors, obtained by solving a least-squares problem; the reconstructed features are the linear combination coefficients. The least-squares problem can carry different forms of constraints, and the reconstruction code is the solution of an optimization problem of the form $\min_{v} \lVert x - Dv \rVert_2^2 + \phi(v)$.
Under the visual dictionary $D$, the reconstruction error of the broken glass crack feature $x$ of the road traffic accident scene is minimized as far as possible to ensure the accuracy of the reconstruction, and the term $\phi(v)$ ensures the discrimination of the reconstructed coding result.
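In sparse coding, the $L_0$ constraint is relaxed, so a more tractable $L_1$ minimization problem has to be solved. In its standard form, the mathematical representation of sparse coding is $\arg\min_{C} \sum_{m=1}^{M} \lVert x_m - V c_m \rVert_2^2 + \lambda \lVert c_m \rVert_1$, where $V$ is the visual dictionary, $c_m$ is the code of the $m$-th road traffic accident scene glass crack feature $x_m$ under the visual dictionary $V$, and $\lambda$ weights the sparsity term. The locality-constrained linear coding (LLC) method improves the locality constraint of local coordinate coding (LCC), and the corresponding standard expression is $\min_{C} \sum_{i} \lVert x_i - V c_i \rVert_2^2 + \lambda \lVert d_i \odot c_i \rVert_2^2$ subject to $\mathbf{1}^{\top} c_i = 1$, where $d_i$ is the locality constraint, expressed as $d_i = \exp\left(\mathrm{dist}(x_i, V)/\sigma\right)$. The parameter $\sigma$ adjusts the weight decay speed.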

Global Coding.
Soft assignment coding is a smoother coding method than hard assignment coding. It uses multiple words to represent a glass breakage and crack feature of a road traffic accident site and considers the distance between the feature and all words in the visual dictionary. Weights are assigned to the words according to these distances, and words at a small distance from the feature are usually assigned a larger weight. The soft assignment can be written as $u_{mk} = \exp\left(-\beta D(x_m, b_k)\right) / \sum_{j=1}^{K} \exp\left(-\beta D(x_m, b_j)\right)$, where $D(x_m, b_k)$ is the distance between the broken glass crack feature $x_m$ and the word $b_k$ in the visual dictionary, $K$ is the dictionary size, and $\beta$ is the smoothing parameter, which controls the distribution of the weights and thus how strongly the degree of similarity between the feature and each word is reflected in the code.
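As a minimal sketch of soft assignment coding, the following code computes the weights of one feature over a small dictionary. It assumes that the distance D is the squared Euclidean distance and uses a toy dictionary and a value of β chosen only for illustration; none of these values come from the experiments.

```python
import math


def squared_distance(x, b):
    """Squared Euclidean distance, assumed here as the distance D(x, b)."""
    return sum((xi - bi) ** 2 for xi, bi in zip(x, b))


def soft_assign(x, dictionary, beta):
    """Soft-assignment weights of feature x over all dictionary words."""
    dists = [squared_distance(x, b) for b in dictionary]
    d_min = min(dists)
    # Subtracting the smallest distance leaves the ratios unchanged
    # and avoids underflow when beta is large.
    unnormalized = [math.exp(-beta * (d - d_min)) for d in dists]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


# Toy visual dictionary with three two-dimensional words (hypothetical values).
dictionary = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
codes = soft_assign([0.1, 0.0], dictionary, beta=5.0)
print(codes)
```

A larger β concentrates the weight on the nearest word, so the code approaches hard assignment; a smaller β spreads the weight more evenly.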

Convolutional Layer Descriptor and Its Feature Coding of Glass Breakage and Cracks at Road Traffic Accident Scene.
Descriptors are the basis for the feature coding of broken glass cracks in road traffic accidents. Traditional descriptors such as SIFT and HOG extract features with a single-layer structure. In contrast, the convolutional features of a deep CNN model come from a multilayer structure, and their expressive ability is better than that of traditional descriptors such as SIFT and HOG. Based on this, this section discusses descriptor extraction based on the deep CNN convolutional layer representation, combined with feature encoding of glass breakage and cracks at the scene of road traffic accidents and two aggregation strategies, to realize the global representation of the image. Figure 2 shows the middle-level representation method based on the deep CNN convolutional-layer features of glass breakage and cracks at the scene of a road traffic accident.

Local Descriptors for Deep Convolutional Layers.
When a pretrained CNN model (Caffe Net or VGG-VD16) is used to extract global features of the image (such as fully connected layer features), the input image size of the network is usually fixed. For example, the input image sizes of Caffe Net and VGG-VD16 are 227 × 227 and 224 × 224 pixels, respectively. This is because, in the network structure of Caffe Net (or VGG-VD16), the last convolutional layer is fully connected to the first fully connected layer, and the number of weights (or neurons) between these layers is fixed. In contrast, the convolutional layers place no limit on the input image size, because the convolution kernel size is independent of it. When the input image is larger, the feature map of glass breakage and cracks output by the last convolutional layer of the CNN is also larger, as shown in Figure 3. Therefore, removing the fully connected layers from the CNN model overcomes the model's limitation on the input image size. Here, the image at each scale of the pyramid is input into the CNN separately, and the feature map output by the last convolutional layer of the network is obtained. The last convolutional layer is chosen in order to obtain more abstract, less redundant, and more discriminative features of broken glass in a road traffic accident scene.
In order to describe the detailed information of the convolutional features, the features at each spatial position are processed separately. As shown in Figure 4, taking the number of channels of the convolutional feature maps as the dimension, the activations of all feature maps at the same spatial position are vectorized to form a single local descriptor. The term "local" here has two meanings: one is the convolution operation on a local spatial domain, and the other is the descriptor extraction at a single spatial location. Finally, we perform L2 normalization on each of the extracted descriptors to obtain the local descriptors of the deep convolutional layer.
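The extraction of one descriptor per spatial position can be sketched as follows. The feature maps here are a toy nested list in channel-first layout (channels × height × width); this layout and the values are assumptions for illustration, not the output of Caffe Net or VGG-VD16.

```python
import math


def l2_normalize(vector, eps=1e-12):
    """Scale a vector to unit L2 norm; eps guards against an all-zero vector."""
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / (norm + eps) for v in vector]


def local_descriptors(feature_maps):
    """Turn C feature maps of size H x W into H*W descriptors of dimension C."""
    channels = len(feature_maps)
    height = len(feature_maps[0])
    width = len(feature_maps[0][0])
    descriptors = []
    for row in range(height):
        for col in range(width):
            # Stack the activations of all channels at one spatial position.
            vector = [feature_maps[c][row][col] for c in range(channels)]
            descriptors.append(l2_normalize(vector))
    return descriptors


# Two hypothetical 2 x 2 feature maps (C = 2, H = W = 2).
toy_maps = [
    [[3.0, 0.0], [1.0, 2.0]],
    [[4.0, 1.0], [0.0, 2.0]],
]
descs = local_descriptors(toy_maps)
print(len(descs), descs[0])
```

For an image pyramid, the same function would be applied to the feature maps of each scale and the resulting descriptor lists concatenated.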

Depth Local Descriptor Based on the Transformation of the Characteristics of Broken Glass in Road Traffic Accident Scene.
When performing statistics on deep local descriptors, such as cluster-based visual dictionary construction, it is often necessary to measure the distance between different descriptors, so a suitable metric has to be selected for the descriptors. The commonly used distance metric is the Euclidean distance, which is defined as $d_E(x_i, x_j) = \lVert x_i - x_j \rVert_2$, where $x_i$ and $x_j$ represent two different descriptors.

Complexity
For some local descriptors based on histogram statistics, measuring with the $\chi^2$ or Hellinger kernel works better than the Euclidean distance. Given any two local descriptors, the Hellinger kernel is defined as $H(x_i, x_j) = \sum_{m=1}^{d'} \sqrt{x_i(m)\, x_j(m)}$, where $x_i$ and $x_j$ are $L_1$-normalized descriptors, namely, $\sum_{m=1}^{d'} x_i(m) = 1$ with $x_i(m) \ge 0$, $d'$ is the dimension of the descriptor, and $m$ is the dimension index.
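A small sketch can make the link between the Hellinger kernel and the Euclidean distance concrete: after L1 normalization and an element-wise square root, the squared Euclidean distance between two descriptors equals 2 − 2H. The code assumes non-negative descriptor entries (for example, activations after a ReLU), and the two vectors are made-up values.

```python
import math


def l1_normalize(vector):
    """Scale a non-negative vector so that its entries sum to one."""
    total = sum(abs(v) for v in vector)
    if total == 0.0:
        return list(vector)
    return [v / total for v in vector]


def hellinger_kernel(x, y):
    """Hellinger kernel of two L1-normalized, non-negative descriptors."""
    return sum(math.sqrt(a * b) for a, b in zip(x, y))


def hellinger_map(vector):
    """L1-normalize, then take the element-wise square root."""
    return [math.sqrt(v) for v in l1_normalize(vector)]


def euclidean(x, y):
    """Euclidean distance between two descriptors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


# Two hypothetical non-negative descriptors.
x_raw = [4.0, 1.0, 0.0, 3.0]
y_raw = [1.0, 1.0, 2.0, 4.0]

kernel_value = hellinger_kernel(l1_normalize(x_raw), l1_normalize(y_raw))
mapped_distance = euclidean(hellinger_map(x_raw), hellinger_map(y_raw))
print(kernel_value, mapped_distance ** 2, 2.0 - 2.0 * kernel_value)
```

This is why the transformed descriptors can be fed to Euclidean-based tools such as PCA or clustering while still being compared in the Hellinger sense.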

Depth Local Descriptor Based on the Feature Coding of Glass Breakage and Cracks on Road Traffic Accident.
Given an image pyramid corresponding to any input image, a large number of depth descriptors can be extracted from it, and the Hellinger kernel and PCA transformations can be applied to them. Since each depth descriptor expresses a single spatial position and ignores the global characteristics of the image, this section uses Fisher vector coding to encode and pool the descriptors of all spatial positions to form the global representation of the image. The middle-level representation based on Fisher vector coding combines the advantages of the discriminative model and the generative model and mainly includes two processes. In standard notation, with mixture weights $\pi_k$, means $\mu_k$, and covariances $\Sigma_k$ for $K$ Gaussian components, the probability density distribution of the Gaussian mixture model is expressed as $p(x) = \sum_{k=1}^{K} \pi_k\, \mathcal{N}(x; \mu_k, \Sigma_k)$. Given a set of depth descriptors $x_1, x_2, \ldots, x_N$ extracted from the training images, the parameters of the Gaussian mixture model are learned with an expectation-maximization algorithm. The Gaussian mixture model defines the soft assignment from the descriptor $x_n$ to the Gaussian component $k$, expressed as $q_{nk} = \pi_k\, \mathcal{N}(x_n; \mu_k, \Sigma_k) / \sum_{j=1}^{K} \pi_j\, \mathcal{N}(x_n; \mu_j, \Sigma_j)$.
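The soft assignment of a descriptor to the Gaussian components can be sketched as below. The example assumes diagonal covariances, which is a common choice for Fisher vector coding but is not stated in the text, and the two-component mixture parameters are hypothetical rather than learned by expectation-maximization.

```python
import math


def log_gaussian_diag(x, mean, variance):
    """Log-density of a Gaussian with diagonal covariance."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (a - m) ** 2 / v
        for a, m, v in zip(x, mean, variance)
    )


def gmm_posteriors(x, weights, means, variances):
    """Posterior probability of each component given descriptor x."""
    log_terms = [
        math.log(w) + log_gaussian_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    top = max(log_terms)
    # Work relative to the largest log term for numerical stability.
    unnormalized = [math.exp(t - top) for t in log_terms]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


# Hypothetical two-component mixture in two dimensions.
weights = [0.5, 0.5]
means = [[0.0, 0.0], [3.0, 3.0]]
variances = [[1.0, 1.0], [1.0, 1.0]]

near_first = gmm_posteriors([0.2, -0.1], weights, means, variances)
midpoint = gmm_posteriors([1.5, 1.5], weights, means, variances)
print(near_first, midpoint)
```

These posteriors are the weights with which each descriptor contributes to the mean and variance gradients of every component when the Fisher vector is accumulated.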

Hellinger Kernel Transformation.
On two image datasets of broken glass cracks at road traffic accident scenes, Figures 5 and 6 analyze the impact of the Hellinger kernel on classification accuracy when it is used in "descriptor-level aggregation" and "middle-level feature-level aggregation." The depth descriptors extracted from the CNN model (without Hellinger kernel transformation) are easily dominated by large feature values, which also affects the distance measurement between descriptors. The Hellinger kernel transformation of the depth descriptor is a nonlinear transformation. It effectively weakens the influence of large feature values and, at the same time, enhances the discrimination of smaller feature values. Therefore, the Hellinger kernel transformation makes the depth descriptor more sensitive to small feature values when distances are measured, thereby enhancing the distinguishability of the depth descriptor.
In general, for both "descriptor-level aggregation" and "middle-level feature-level aggregation," the Hellinger kernel transformation helps improve the classification accuracy of broken glass crack images at road traffic accident scenes.

PCA Road Traffic Accident Site Glass Breakage and Crack Feature Transformation.
Figure 7 lists the classification accuracy of "descriptor-level aggregation" for different PCA dimensions. The figure also compares the three methods of "Caffe Net," "VGG-VD16," and "Caffe Net without PCA," each based on a single CNN. It can be seen from Figure 7 that, as the PCA dimension increases, the classification accuracy of "Caffe Net," "VGG-VD16," and "descriptor-level aggregation" remains above 83%. Comparing "Caffe Net" and "Caffe Net without PCA," it can be seen that preserving all principal components helps to achieve better classification accuracy. Figure 8 lists the classification accuracy of "middle-level feature-level aggregation" under different PCA dimensions. In "middle-level feature-level aggregation," the depth descriptor extracted by Caffe Net is transformed to a fixed dimension, and the depth descriptor extracted by VGG-VD16 is transformed to different dimensions. As the PCA dimension increases, the depth descriptor extracted by VGG-VD16 shows a more obvious performance advantage with the PCA transformation than without it. For the depth descriptor extracted by VGG-VD16, transforming it to 200 dimensions with PCA yields better classification performance.

The Number of Clusters for the Feature Coding of Glass Breakage and Cracks on the Road Traffic Accident Site.
Figure 9 analyzes the influence of the number of GMM clusters on classification accuracy. In "descriptor-level aggregation" and "middle-level feature-level aggregation," the depth descriptors extracted by Caffe Net and VGG-VD16 are transformed to 256 dimensions using PCA. The number of GMM clusters is analyzed in this section because it determines the dimension of the final image representation. When the number of GMM clusters is set to 64, "descriptor-level aggregation" and "middle-level feature-level aggregation" yield features of 2 × 255 × 64 = 32640 dimensions and 2 × (255 + 511) × 64 = 98048 dimensions, respectively. In general, the classification effect of "middle-level feature-level aggregation" is better than that of "descriptor-level aggregation." When the number of GMM clusters is set to 64, the classification effect on the two data sets is relatively ideal. Using a small number of GMM clusters not only helps the aggregation of different feature coding results but also avoids a sudden increase in the dimension of the final image representation.
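The dimension arithmetic above follows from the fact that a Fisher vector built from the gradients with respect to the means and variances has 2 × D × K entries for D-dimensional descriptors and K Gaussian components. A short check, using the descriptor dimensions 255 and 511 exactly as they appear in the text, is:

```python
def fisher_vector_dim(descriptor_dim: int, num_clusters: int) -> int:
    """Length of a Fisher vector with mean and variance gradients for each component."""
    return 2 * descriptor_dim * num_clusters


NUM_CLUSTERS = 64

# Descriptor-level aggregation: one Fisher vector over the pooled descriptors.
descriptor_level_dim = fisher_vector_dim(255, NUM_CLUSTERS)

# Middle-level feature-level aggregation: two Fisher vectors concatenated.
middle_level_dim = fisher_vector_dim(255, NUM_CLUSTERS) + fisher_vector_dim(511, NUM_CLUSTERS)

print(descriptor_level_dim, middle_level_dim)
```

Because the length grows linearly with the number of clusters, doubling K doubles the size of the final image representation, which is why a small K keeps the aggregated feature manageable.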

Performance Comparison of Glass Breaking and Cracking Characteristics of Convolutional Layer and Fully Connected Layer in Road Traffic Accident Scene.
In the image classification task of broken glass cracks at road traffic accident scenes, the activation vector of the CNN fully connected layer is used as the global feature of the image, which can be compared with a shallow representation. Therefore, Figure 10 compares the convolutional layer and the fully connected layer of the CNN model for the glass breakage and crack characteristics of the road traffic accident scene. The activation vectors extracted from the fully connected layers of Caffe Net and VGG-VD16 are taken as the global features of the image and are marked as "Caffe Net (fully connected layer)" and "VGG-VD16 (fully connected layer)," respectively. First, we set the size of each input image to the input size of the corresponding CNN model and then extract the L2-normalized 4096-dimensional activation vector from the last fully connected layer (excluding the classification layer) of the CNN model. In addition, extracting depth descriptors from the convolutional layers of the two CNN models and then applying the Hellinger kernel and PCA transformations gives two further types of features, which are recorded as "Caffe Net (256-dimensional convolutional layer)" and "VGG-VD16 (256-dimensional convolutional layer)."

Fragment Analysis of Flat Glass Specimens.
Under the action of impact load, glass shards splash at high speed. Because of their sharpness, they may threaten the lives of nearby people. Therefore, for impacted glass, it is necessary to analyze the fragment shape and splash speed. For PVB laminated glass, although spiderweb-like cracks appear on the specimen, the glass fragments formed by the intersection of radial cracks and circular cracks remain attached to the PVB interlayer, which greatly reduces the damage caused by fragment splashing. Therefore, this section only discusses the splashing of flat glass. For flat glass with four-frame support, the main cracks are radial cracks, and the resulting fragments are mainly sharp dagger-shaped fragments. For four-point-supported flat glass, because the radial cracks tend to "turn" around the supporting holes, the cracks bifurcate and intersect. Therefore, in addition to dagger-shaped fragments, scaly fragments appear near the boundary. As the impact velocity increases, the specimens under both boundary conditions tend toward local damage, and the number and size of fragments decrease, but the fragments remain very sharp.
The moment at which clearly identifiable debris first appears in the image is taken as time zero. The front end of the debris at that moment is used as the zero point of displacement to define a zero-displacement reference axis. Then, we mark the position of the front end of the fragment at equal time intervals to determine the displacement of the fragment. Finally, the real displacement of the fragments is calculated according to the calibration rule, and the displacement time history curve of the fragments is obtained. The resulting time history curve of the fragment displacement is shown in Figure 11.
Due to a wrong trigger time, the flying speed of the fragments of a small number of specimens was not captured by the high-speed camera. Linear fitting is performed on the displacement time history curve of the fragments to obtain the fragment splash velocity. As the impact speed increases, the fragment splash speed gradually increases. However, no obvious pattern emerges when the fragment splash velocities under the two boundary conditions are compared, which may be due to differences in the size and mass of the splashed fragments. The fragment splash speed obtained in this way will be used as an important reference index for comparison with subsequent numerical simulation results.
Discussion.
Understanding the crack distribution pattern of glass specimens under impact load is very important for clarifying the size distribution of glass fragments. In the impact test, the destruction process of the specimen is captured by a high-speed camera. To make crack propagation easier to observe, grids are laid on the surface of the test piece in advance. The final damage shape of the specimen was also recorded by taking pictures. However, due to the trigger error of the high-speed camera, the destruction process of a small number of specimens was not captured. It is worth mentioning that the specimen fractures very fast, and the crack develops within a very short period of time. We observed the fracture process of a tempered glass specimen with a high-speed camera and found that the average speed of the crack front can reach 1466 m/s. For high-speed cameras, too high a sampling frequency cannot guarantee the clarity of the image, so the sampling rate of the high-speed camera in the impact test is 2000 FPS. Because of this limited sampling rate, the crack propagation speed of the glass specimen cannot be obtained.
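Two of the calculations described above can be sketched in a few lines: the least-squares line fit that turns a displacement time history into a splash velocity, and the distance a crack front travels between two frames at the stated sampling rate. The displacement samples below are made-up values for illustration, not measured data; only the 2000 FPS rate and the 1466 m/s crack speed come from the text.

```python
def fit_line(times, displacements):
    """Ordinary least-squares fit d = slope * t + intercept."""
    n = len(times)
    mean_t = sum(times) / n
    mean_d = sum(displacements) / n
    s_tt = sum((t - mean_t) ** 2 for t in times)
    s_td = sum((t - mean_t) * (d - mean_d) for t, d in zip(times, displacements))
    slope = s_td / s_tt
    intercept = mean_d - slope * mean_t
    return slope, intercept


FRAME_RATE = 2000.0                 # frames per second, as stated in the text
frame_interval = 1.0 / FRAME_RATE   # seconds between consecutive frames

# Hypothetical front-end positions of one fragment in consecutive frames (metres).
times = [k * frame_interval for k in range(5)]
displacements = [0.0, 0.0041, 0.0079, 0.0121, 0.0160]

splash_velocity, offset = fit_line(times, displacements)  # slope in m/s

# Distance covered by a crack front moving at 1466 m/s between two frames.
CRACK_SPEED = 1466.0                # m/s, tempered glass value quoted in the text
travel_per_frame = CRACK_SPEED / FRAME_RATE

print(splash_velocity, travel_per_frame)
```

At 2000 FPS a crack front moving at that speed advances roughly 0.73 m between frames, which illustrates why the crack propagation speed cannot be resolved at this sampling rate, while the much slower fragments are sampled many times along their path.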
However, the images collected by high-speed cameras can still show the difference in failure modes of the specimen under different boundary conditions. Under impact load, the spherical wavefronts of longitudinal waves and shear waves propagate at different speeds from the point of application of the load. Longitudinal waves that propagate faster are reflected on the lower surface of the glass plate as tensile waves, which cause quite high tensile stress somewhere near the free surface. When the tensile stress meets the dynamic fracture criterion of the material, it will cause the material to break. This is the cause of radial cracks.
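The difference between the two wave speeds can be estimated from the standard relations for bulk waves in an isotropic elastic solid, $c_L = \sqrt{(\lambda + 2\mu)/\rho}$ and $c_S = \sqrt{\mu/\rho}$. The sketch below uses typical handbook-style values for soda-lime glass; these material constants are assumptions for illustration and were not measured in this study.

```python
import math


def lame_parameters(youngs_modulus_pa, poisson_ratio):
    """Lame parameters (lambda, mu) from Young's modulus and Poisson's ratio."""
    lam = (youngs_modulus_pa * poisson_ratio
           / ((1.0 + poisson_ratio) * (1.0 - 2.0 * poisson_ratio)))
    mu = youngs_modulus_pa / (2.0 * (1.0 + poisson_ratio))
    return lam, mu


def wave_speeds_m_per_s(youngs_modulus_pa, poisson_ratio, density_kg_m3):
    """Longitudinal and shear bulk wave speeds of an isotropic elastic solid."""
    lam, mu = lame_parameters(youngs_modulus_pa, poisson_ratio)
    c_long = math.sqrt((lam + 2.0 * mu) / density_kg_m3)
    c_shear = math.sqrt(mu / density_kg_m3)
    return c_long, c_shear


# Assumed typical values for soda-lime glass (not taken from the paper).
E_PA = 70e9
NU = 0.22
RHO_KG_M3 = 2500.0

c_long, c_shear = wave_speeds_m_per_s(E_PA, NU, RHO_KG_M3)
print(f"longitudinal: {c_long:.0f} m/s, shear: {c_shear:.0f} m/s")
```

With these assumed constants the longitudinal wave is markedly faster than the shear wave, which is consistent with the longitudinal wave reaching and reflecting from the lower surface first.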
In most cases, by combining the Hellinger kernel and the PCA transformation, the classification effect of the depth descriptor can be better than that of the fully connected layer activation vector (global description). For example, experimental results using 10%, 50%, and 80% training images, respectively, show that "VGG-VD16 (511-dimensional convolutional layer)" has a higher classification accuracy than "VGG-VD16 (fully connected layer)"; experimental results using 50% and 80% training images, respectively, show that "Caffe Net (256-dimensional convolutional layer)" has higher classification accuracy than "Caffe Net (fully connected layer)." Compared with "Caffe Net (256-dimensional convolutional layer)" and "VGG-VD16 (511-dimensional convolutional layer)," "descriptive sublevel aggregation" and "middle-level feature level aggregation" achieve better classification results.
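The Hellinger kernel and PCA transformation mentioned above can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions: the descriptors are random non-negative stand-ins (as would be the case after a ReLU), the output dimension of 64 is hypothetical, and the helper names are not from the original work. The Hellinger kernel is applied through its explicit feature map: L1-normalise each descriptor, then take the element-wise square root.

```python
import numpy as np


def hellinger_map(descriptors, eps=1e-12):
    """L1-normalise non-negative descriptors and take the element-wise square root.
    The dot product of two mapped vectors equals the Hellinger kernel
    of the L1-normalised originals."""
    d = np.asarray(descriptors, dtype=float)
    d = d / (d.sum(axis=1, keepdims=True) + eps)
    return np.sqrt(d)


def pca_fit(x, n_components):
    """Return the data mean and the top principal directions (as rows)."""
    mean = x.mean(axis=0)
    _, _, vt = np.linalg.svd(x - mean, full_matrices=False)
    return mean, vt[:n_components]


def pca_transform(x, mean, components):
    """Project centred data onto the principal directions."""
    return (x - mean) @ components.T


rng = np.random.default_rng(0)
# Hypothetical stand-in: 200 local descriptors with 256 channels each.
descriptors = rng.random((200, 256))

mapped = hellinger_map(descriptors)
mean, components = pca_fit(mapped, n_components=64)  # 64 is an assumed size
reduced = pca_transform(mapped, mean, components)
print(reduced.shape)
```

After the Hellinger mapping each descriptor has unit Euclidean norm, so a linear classifier on the mapped features behaves like a Hellinger-kernel classifier on the original ones.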
Similar to flat glass, the overall rigidity of the frame-supported specimen is stronger, so the force on the specimen is relatively large, and the maximum displacement of the specimen is smaller. This proves once again that the constraints of the four-frame support are stronger than those of the four-point support. After the test, statistics were made on the cracks of the PVB specimens, and the energy absorption of the specimens was also obtained by curve integration. When counting the number of cracks in the PVB laminated glass specimen after impact, it is found that the radial cracks on the front and back of the specimen overlap completely. For laminated glass, the longitudinal stress wave first causes the radial cracks on the back of the test piece to initiate and develop. When the radial cracks on the back of the laminated glass have fully expanded, due to the adhesion of the PVB interlayer, the layering of the test piece can be ignored. Therefore, in the thickness direction, the crack will usually cross the elastic interlayer and recrack at the same position on the front glass. When counting the number of radial cracks, a crack that overlaps on the front and back sides is counted only once.
When the boundary conditions are the same, the energy absorption effect of PVB laminated glass is better than that of flat glass. This is because, on the one hand, the PVB interlayer can absorb energy through deformation, friction, and intermolecular vibration, and on the other hand, the double-layer glass panel can provide more fracture energy.
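The energy absorption values compared here are obtained by integrating the force-displacement curve. A minimal sketch of that integration with the trapezoidal rule is given below; the sample curve is hypothetical and chosen only so that the result is easy to check by hand.

```python
import numpy as np


def absorbed_energy_j(displacement_m, force_n):
    """Area under the force-displacement curve (trapezoidal rule), in joules."""
    d = np.asarray(displacement_m, dtype=float)
    f = np.asarray(force_n, dtype=float)
    return float(np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(d)))


# Hypothetical triangular force-displacement curve (not measured data):
# force rises linearly to 2000 N at 5 mm and falls back to zero at 10 mm.
displacement_m = [0.0, 0.0025, 0.005, 0.0075, 0.010]
force_n = [0.0, 1000.0, 2000.0, 1000.0, 0.0]

energy = absorbed_energy_j(displacement_m, force_n)
print(f"absorbed energy: {energy:.2f} J")
```

For the triangular curve the exact area is 0.5 × 0.010 m × 2000 N = 10 J, which the trapezoidal rule reproduces because the curve is piecewise linear.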

Conclusion
The extraction of local descriptors of glass crack features at road traffic accident scenes from CNN convolutional layers is studied, and two aggregation strategies, "descriptive sublevel aggregation" and "middle-level feature level aggregation," are proposed and used to fuse two different types of CNN models. Two CNNs with different depths, Caffe Net and VGG-VD16, are used, and the fully connected layers in the models are removed. The input of the CNN model uses an image pyramid to extract the convolutional-layer features of the accident scene image at different scales. The number of channels in the convolutional feature map is taken as the feature dimension, and the values of the multiple feature maps at the same spatial position are combined into a single descriptor. The Hellinger kernel and Principal Component Analysis are then used to further transform the descriptors. The aggregation strategy is adopted to obtain the global expression of the image. The classification experiments on two road traffic accident scene glass crack image datasets show that the depth descriptor based on the image pyramid, combined with the proposed aggregation strategy, achieves a higher classification accuracy than the fully connected layer feature. Using the force on the glass specimen and the center displacement of the specimen recorded by the sensors during the impact, the force-displacement curve of the specimen is integrated to obtain the energy consumption value of the specimen.
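The descriptor construction summarized above (channels as the feature dimension, one descriptor per spatial position, collected over the scales of an image pyramid) can be sketched as follows. The feature maps here are random stand-ins with hypothetical sizes; in practice they would come from the convolutional layers of the CNN, and the function names are introduced only for this illustration.

```python
import numpy as np


def feature_maps_to_descriptors(feature_maps):
    """Turn feature maps of shape (C, H, W) into H*W descriptors of dimension C.
    Row i*W + j holds the values of all C channels at spatial position (i, j)."""
    fm = np.asarray(feature_maps, dtype=float)
    channels, height, width = fm.shape
    return fm.reshape(channels, height * width).T


def pyramid_descriptors(maps_per_scale):
    """Stack the local descriptors obtained at every pyramid scale."""
    return np.concatenate(
        [feature_maps_to_descriptors(m) for m in maps_per_scale], axis=0
    )


rng = np.random.default_rng(0)
# Hypothetical feature maps for three pyramid scales, 256 channels each.
maps_per_scale = [
    rng.random((256, 6, 6)),
    rng.random((256, 9, 9)),
    rng.random((256, 13, 13)),
]

local_descriptors = pyramid_descriptors(maps_per_scale)
print(local_descriptors.shape)
```

Because the fully connected layers are removed, the input images at different scales may produce feature maps of different spatial sizes, while the descriptor dimension stays equal to the channel count.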
Through comparison, it is found that in terms of impact resistance, PVB laminated glass is stronger than flat glass, and when the glass specimens are of the same type, the impact resistance of the four-frame glass is stronger than that of the four-point support glass. This is because the constraint effect of the four-frame support is better than that of the four-point support, which improves the overall rigidity of the glass specimen. In terms of energy consumption, the energy absorption effect of the four-point support glass specimen is stronger than that of the four-frame glass, and when the boundary conditions are the same, the energy absorption value of PVB laminated glass is higher than that of the flat glass specimen. Fragment analysis was carried out on the flat glass specimens, and the results showed that for flat glass with four frames, the main cracks were radial cracks, and the resulting fragments were mainly sharp dagger-shaped fragments. For the four-point supported flat glass, because the radial cracks have a tendency to "turn" around the supporting holes, the cracks bifurcate and intersect. Therefore, in addition to dagger-shaped fragments, scaly fragments will appear near the boundary.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.