Design and Optimization of Hemispherical Resonators Based on PSO-BP and NSGA-II

Although one of the poster children of high-performance MEMS (Micro Electro Mechanical Systems) gyroscopes, the MEMS hemispherical resonator gyroscope (HRG) is faced with the barrier of technical and process limits, which makes it unable to form a resonator with the best structure. How to obtain the best resonator under specific technical and process limits is a significant topic for us. In this paper, the optimization of a MEMS polysilicon hemispherical resonator, designed by patterns based on PSO-BP and NSGA-II, was introduced. Firstly, the geometric parameters that significantly contribute to the performance of the resonator were determined via a thermoelastic model and process characteristics. Variety regulation between its performance parameters and geometric characteristics was discovered preliminarily using finite element simulation under a specified range. Then, the mapping between performance parameters and structure parameters was determined and stored in the BP neural network, which was optimized via PSO. Finally, the structure parameters in a specific numerical range corresponding to the best performance were obtained via the selection, heredity, and variation of NSGAII. Additionally, it was demonstrated using commercial finite element soft analysis that the output of the NSGAII, which corresponded to the Q factor of 42,454 and frequency difference of 8539, was a better structure for the resonator (generated by polysilicon under this process within a selected range) than the original. Instead of experimental processing, this study provides an effective and economical alternative for the design and optimization of high-performance HRGs under specific technical and process limits.


Introduction
The MEMS hemispherical resonator gyroscope (HRG) is one of the typical representatives of micro-resonator gyroscopes for its high reliability, good accuracy, strong stability, long service life, and simple structure [1,2]. The existing MEMS hemispherical resonator gyroscopes have achieved bias stability and a Q factor to the level of 0.01 • /h and 1,000,000, respectively. To adapt it to a different application scenario, researchers have developed the micro HRG for different materials and structures, and for these different gyros, there are a few discrepancies in their manufacturing processes.
Up until now, there have been three main approaches to processing a MEMS hemispherical resonator. Since the approaches are different, the resonators have different modeling characteristics. Additionally, since the manufacturing processes are different, the resonators have different modeling characteristics [3].
Andrei M. Shkel et al. from the University of California proposed the method of using glass blow molding to make the resonator of a MEMS HRG [4]. K. Najafi et al. from the University of Michigan developed a MEMS HRG with a bias stability of 0.0391 • /h and an angular random walk of 0.0087 • / √ h by applying a constant velocity of airflow directly above the fused silica [5].
On the other hand, it is difficult to control the concentricity error and roundness error of a resonator processed via this method, as Shanghai Jiao Tong University demonstrated [6].

Data Collection and Analysis Based on PSO-BP and NSGA-II
As Figure 1 shows, MEMS HRG achieves its high-precision detection through the standing wave influenced by Coriolis affecting the resonator. The incentive part affects periodic excitation caused by coulomb forces or mechanical forces to the resonator; then, a standing wave will come into play on the resonator. The antinode points of the standing wave are set as A, B, C, and D; when the shell resonator is stimulated by an angular velocity Ω relative to the inertial system, each antinode point will have the implicated angular velocity Ω.
Under the effect of V A , V B , V C , and V D caused by the standing wave and the implicated angular velocity Ω external input of A, B, C, and D have Coriolis accelerations of W KA , W KB , W KC , and W KD , respectively. Thus, the Coriolis inertial force of P KA , P KB , P KC , and P KD are generated by the Coriolis accelerations of V A , V B , V C , and V D . Two equivalent Coriolis couples with opposite directions are formed via P KA -P KC and P KB -P KD , resulting in the precession of the standing wave relative to the resonator shell and inertial reference frame.
Due to the precession of the standing wave, the points on the harmonic oscillator that have an isometric deformation will change to other ones. Measuring electrical signal changes at specific points caused by their deformation changes, the accurate angle and angular velocity of the external input will be determined. couples with opposite directions are formed via PKA-PKC and PKB-PKD, resulting in the precession of the standing wave relative to the resonator shell and inertial reference frame.
Due to the precession of the standing wave, the points on the harmonic oscillator that have an isometric deformation will change to other ones. Measuring electrical signal changes at specific points caused by their deformation changes, the accurate angle and angular velocity of the external input will be determined. The signal quality is affected by multiple factors, of which the most significant effect is the Q factor and frequency characteristic.
The Q factor represents the ratio of the total energy that the resonator stored to the energy that dissipated in a vibration cycle. There are a few elements that influence the Q factor, such as thermoelastic damping, anchor damping, surface damping, and air damping. The Q factor can be written as follows [16]:  (1) Air damping represents the energy consumption caused by air viscous force to a certain degree. However, since the MEMS resonator always works in a vacuum environment, it does not work obviously in energy consumption for the air damping.
The anchor damping represents the energy consumption caused by anchor loss. When the resonator vibrates, a part of the energy will be lost through the basement to the earth, and that is what is called anchor loss.
Surface damping represents the energy consumption caused by surface defects such as large surface harshness. A rough surface causes increasing surface quality imbalances and other problems, which make the resonator have to carve out a portion of energy to keep balance.
Thermal elastic damping represents the energy consumption caused by the irreversible heat flow. Under the effect of the standing wave, the shell resonator receives tension and compression at the different parts; then, the temperature of the different parts under tension and compression decreases and increases, respectively, resulting in a local temperature difference in the resonator.
For the MEMS HRG, the size and deformation of the anchor are too small to make the anchor loss give considerable influence on the Q factor compared with the thermal electric damping. Additionally, as shown above, surface damping and air damping can The signal quality is affected by multiple factors, of which the most significant effect is the Q factor and frequency characteristic.
The Q factor represents the ratio of the total energy that the resonator stored to the energy that dissipated in a vibration cycle. There are a few elements that influence the Q factor, such as thermoelastic damping, anchor damping, surface damping, and air damping. The Q factor can be written as follows [16]: Air damping represents the energy consumption caused by air viscous force to a certain degree. However, since the MEMS resonator always works in a vacuum environment, it does not work obviously in energy consumption for the air damping.
The anchor damping represents the energy consumption caused by anchor loss. When the resonator vibrates, a part of the energy will be lost through the basement to the earth, and that is what is called anchor loss.
Surface damping represents the energy consumption caused by surface defects such as large surface harshness. A rough surface causes increasing surface quality imbalances and other problems, which make the resonator have to carve out a portion of energy to keep balance.
Thermal elastic damping represents the energy consumption caused by the irreversible heat flow. Under the effect of the standing wave, the shell resonator receives tension and compression at the different parts; then, the temperature of the different parts under tension and compression decreases and increases, respectively, resulting in a local temperature difference in the resonator.
For the MEMS HRG, the size and deformation of the anchor are too small to make the anchor loss give considerable influence on the Q factor compared with the thermal electric damping. Additionally, as shown above, surface damping and air damping can also be ignored. So, what we have to take into consideration is only thermal elastic damping.
The frequency characteristic includes characteristic frequency and frequency cracking. Characteristic frequency is the natural frequency of a specific vibration mode; it is influenced by the structure and materials of the resonator. Frequency cracking is the difference between the operating mode and its parasitic mode; it is influenced by structure mass imbalance and stiffness imbalance of the resonator.
The operating mode of HRG is the natural-bending vibration mode, also known as the four-antinode mode. Generally, the ambient noise is around 5000 Hz, aiming at minimizing the disturbance to measurement signal caused by ambient noise; the characteristic frequency of operating mode should keep a distance to the frequency of 5000 Hz.

Preparation
For this study, an ideal model was used, in which the stiffness and mass are absolutely symmetrical; thus, the frequency cracking was not taken into consideration.
Additionally, for the resonator to have other modes at low-frequency ranges except for the operating mode, it is necessary to make the difference between the operating mode and other low-frequency modes as large as possible.
To optimize the resonator based on the analysis of thermal elastic damping, the trajectories of the irreversible heat flow should be determined. In the heat conduction of the solid module and solid mechanics module of COMSOL, as the finite element simulation results revealed, the trajectories of the irreversible heat flows are shown in Figure 2.
ing. Characteristic frequency is the natural frequency of a specific vibration mode; it is influenced by the structure and materials of the resonator. Frequency cracking is the difference between the operating mode and its parasitic mode; it is influenced by structure mass imbalance and stiffness imbalance of the resonator.
The operating mode of HRG is the natural-bending vibration mode, also known as the four-antinode mode. Generally, the ambient noise is around 5000 Hz, aiming at minimizing the disturbance to measurement signal caused by ambient noise; the characteristic frequency of operating mode should keep a distance to the frequency of 5000 Hz.

Preparation
For this study, an ideal model was used, in which the stiffness and mass are absolutely symmetrical; thus, the frequency cracking was not taken into consideration.
Additionally, for the resonator to have other modes at low-frequency ranges except for the operating mode, it is necessary to make the difference between the operating mode and other low-frequency modes as large as possible.
To optimize the resonator based on the analysis of thermal elastic damping, the trajectories of the irreversible heat flow should be determined. In the heat conduction of the solid module and solid mechanics module of COMSOL, as the finite element simulation results revealed, the trajectories of the irreversible heat flows are shown in Figure 2. There are five tracks for the irreversible heat flows (refer to [17]). The heat flows along the lip of the resonator and the anchor of the hemispherical shell surface are relatively long, and the heat transfer time is much longer than the vibration period of the operating mode. Therefore, the energy loss caused by these three heat flows can be ignored. Then, the thermo elastic damping of the resonator can be written as follows: There are five tracks for the irreversible heat flows (refer to [17]). The heat flows along the lip of the resonator and the anchor of the hemispherical shell surface are relatively long, and the heat transfer time is much longer than the vibration period of the operating mode. Therefore, the energy loss caused by these three heat flows can be ignored. Then, the thermo elastic damping of the resonator can be written as follows: ρ, C, E, α, and T 0 are the density, constant pressure heat capacity, Young's modulus, coefficient of thermal expansion, and initial temperature of the harmonic oscillator material, respectively; τ across-h , τ across-z , and τ across-s are time constants for the heat flow transfer through the thickness direction of the hemispherical shell, the heat flow transfer through the thickness direction at the anchor, and the heat flow along the circumferential direction of the edge of the anchor, respectively; t is the thickness of the edge of the shell; − z is the average length of the heat flow passing through the thickness direction at the anchor; s is the radius of the hemispherical shell at the edge of the anchor.
It can be inferred that structure parameters that have an influence on the thermo elastic damping are t, − z, and s from (5)- (7). Additionally, the fundamental parameters of the structure parameters are a, R, r, H, H − h, and R − r (refer to Figure 3). π k τ across-s = s 2 ρC 4k (7) ρ, C, E, α, and T0 are the density, constant pressure heat capacity, Young's modulus, coefficient of thermal expansion, and initial temperature of the harmonic oscillator material, respectively; τacross-h,τacross-z, and τacross-s are time constants for the heat flow transfer through the thickness direction of the hemispherical shell, the heat flow transfer through the thickness direction at the anchor, and the heat flow along the circumferential direction of the edge of the anchor, respectively; t is the thickness of the edge of the shell; ̅ is the average length of the heat flow passing through the thickness direction at the anchor; s is the radius of the hemispherical shell at the edge of the anchor.
It can be inferred that structure parameters that have an influence on the thermo elastic damping are t, z , and s from (5)- (7). Additionally, the fundamental parameters of the structure parameters are a, R, r, H, H−h, and R−r (refer to Figure 3). Additionally, for this study, a polysilicon resonator processed via thin film deposition was taken as an example. The process of thin film deposition can be obtained from [18], as shown in Figure 4. First, the thermal oxide is patterned to mask the wafer, capacitive actuation, and sense electrodes that are isolated from the substrate are created by PN junctions. Second, A PECVD oxide layer is patterned to form the isotropic etch mask, and sulfur hexafluoride (SF6) plasma is used to etch the hemispherical mold. Third, an LPCVD low-stress silicon nitride layer is patterned to mask the topside and backside during KOH etching, which forms the back-side plug mold. Finally, the polysilicon layer on the wafer surface is removed in a short dry etching step with SF6 plasma while the shell is protected by a photoresist, and the shell is then released in HF.  Additionally, for this study, a polysilicon resonator processed via thin film deposition was taken as an example. The process of thin film deposition can be obtained from [18], as shown in Figure 4. First, the thermal oxide is patterned to mask the wafer, capacitive actuation, and sense electrodes that are isolated from the substrate are created by PN junctions. Second, A PECVD oxide layer is patterned to form the isotropic etch mask, and sulfur hexafluoride (SF6) plasma is used to etch the hemispherical mold. Third, an LPCVD low-stress silicon nitride layer is patterned to mask the topside and backside during KOH etching, which forms the back-side plug mold. Finally, the polysilicon layer on the wafer surface is removed in a short dry etching step with SF6 plasma while the shell is protected by a photoresist, and the shell is then released in HF. τ across-s = 4k (7 ρ, C, E, α, and T0 are the density, constant pressure heat capacity, Young's modulus coefficient of thermal expansion, and initial temperature of the harmonic oscillator mate rial, respectively; τacross-h,τacross-z, and τacross-s are time constants for the heat flow transfe through the thickness direction of the hemispherical shell, the heat flow transfer through the thickness direction at the anchor, and the heat flow along the circumferential direction of the edge of the anchor, respectively; t is the thickness of the edge of the shell; ̅ is the average length of the heat flow passing through the thickness direction at the anchor; s is the radius of the hemispherical shell at the edge of the anchor. It can be inferred that structure parameters that have an influence on the thermo elas tic damping are t, z , and s from (5)- (7). Additionally, the fundamental parameters of the structure parameters are a, R, r, H, H−h, and R−r (refer to Figure 3). Additionally, for this study, a polysilicon resonator processed via thin film deposi tion was taken as an example. The process of thin film deposition can be obtained from [18], as shown in Figure 4. First, the thermal oxide is patterned to mask the wafer, capaci tive actuation, and sense electrodes that are isolated from the substrate are created by PN junctions. Second, A PECVD oxide layer is patterned to form the isotropic etch mask, and sulfur hexafluoride (SF6) plasma is used to etch the hemispherical mold. Third, an LPCVD low-stress silicon nitride layer is patterned to mask the topside and backside during KOH etching, which forms the back-side plug mold. Finally, the polysilicon layer on the wafe surface is removed in a short dry etching step with SF6 plasma while the shell is protected by a photoresist, and the shell is then released in HF.  There are mainly four steps for thin film deposition to process a resonator: first, manufacturing a mask on the silicon substrate; second, etching the silicon substrate to form a mold; third, depositing the structure in the mold; fourth, releasing the structure via wet etching.
The forming defects of the resonator processed via this method are mainly caused by the mold, which may cause a rough surface and a cavity structure, with the ratio of depth to width <1 resulting from wet etching, which cannot be resolved faultlessly now.
As mentioned earlier, the forming defects of specific manufacturing processes were the problems that should be put first. So, in consideration of the modeling characteristic and gravity effect when forming the structure via deposit, the parameter characteristics of the resonator can be written as follows: A is the coefficient matrix; S is the structure parameters matrix; b is the deviation matrix. Set t as follows: Referring to [18], the varying range of structure parameters was set as shown in Table 2, and polysilicon was selected to deposit the structure.

R (µm) r (µm) H (µm) h (µm) A (µm)
301~502 299~500 251~552 249~550 50~100 Obtaining the mapping between structure parameters and performance parameters requires a lot of calculations; finite element simulation can solve the problem of data collection, and BP neural networks can solve the problem of fitting efficiently.
The variation is regular between performance parameters, and singular structure parameters can be obtained from simulation by varying one of the structure parameters and holding on to the others sequentially. Some of the results are shown in Table 3. The curve graphs drawing the singular structure parameter with all of the performance parameters are shown in Figures 5-10. For every structure parameter, the figure (a-d) demonstrate when R or r is 300, 350, 400, and 500 as samples.
It can be seen that the effects of the performance parameters of varying R and r are almost the same, and it is in conformity with the conclusion that [17] provided. The increase in R and decrease in r both result in the increase in thickness of the edge of the shell, and it affects the path length of the heat flow transfer through the thickness direction of the hemispherical shell and the heat flow transfer through the thickness direction at the anchor.
Increasing the thickness of the hemispherical shell appropriately can extend the path length of the heat flow transfer through the thickness direction of the hemispherical shell and the heat flow transfer through the thickness direction at the anchor, decreasing the energy loss in a vibration period, which results in the decrease in thermoelastic damping, improve the Q factor, and raise the characteristic frequency. The curve graphs drawing the singular structure parameter with all of the performance parameters are shown in Figures 5-10. For every structure parameter, the figure (a), (b), (c), (d) demonstrate when R or r is 300, 350, 400, and 500 as samples.  The curve graphs drawing the singular structure parameter with all of the performance parameters are shown in Figures 5-10. For every structure parameter, the figure (a), (b), (c), (d) demonstrate when R or r is 300, 350, 400, and 500 as samples.  improve the Q factor, and raise the characteristic frequency. However, whether it is the Q factor, the characteristic frequency, or the frequency difference, there is a turning point when the structure transfers from the standard hemisphere to the non-standard hemisphere. It can be inferred that it is the sudden change in the structure that results in the sudden change in the path length for the irreversible heat flow. It is obvious that except for the turning point, there is a significant linear positive correlation between t and frequency difference and a linear negative correlation between t and characteristic frequency. It means that the characteristic frequency of the operating mode is getting closer and closer to the characteristic frequency of the other mode, which can also be said that the characteristic frequency is getting more and more intense. So, it is necessary for us to make the value of t as small as possible.
Additionally, for the Q factor, it is also necessary to control the size of t at a low level as it shows a negative correlation between t and the Q factor since the heat production converged from the edge to the bottom, which results in the increase in energy loss with the t increase. It can be informed that the characteristic frequency increases as a increases. However, the Q factor and frequency difference are both generally negatively correlated with a; it can be inferred that although the path length of the heat flow at the anchor increased, the heat flow generated at the bottom also increased while a increased.  It can be informed that the characteristic frequency increases as a increases. However, the Q factor and frequency difference are both generally negatively correlated with a; it can be inferred that although the path length of the heat flow at the anchor increased, the heat flow generated at the bottom also increased while a increased.  When H and h increase synchronously, out of our expectation, the Q factor decreases within a certain range when R is less than 500 µm, and then the Q factor increases, which is par for the course. An inference that the Q factor decrease because of the increase in energy loss caused by heat flow along the circumferential direction of the edge of the anchor is less than the decrease in energy loss caused by heat production converging from the edge to the bottom was given. Additionally, this phenomenon disappears when R is However, whether it is the Q factor, the characteristic frequency, or the frequency difference, there is a turning point when the structure transfers from the standard hemisphere to the non-standard hemisphere. It can be inferred that it is the sudden change in the structure that results in the sudden change in the path length for the irreversible heat flow.
It is obvious that except for the turning point, there is a significant linear positive correlation between t and frequency difference and a linear negative correlation between t and characteristic frequency. It means that the characteristic frequency of the operating mode is getting closer and closer to the characteristic frequency of the other mode, which can also be said that the characteristic frequency is getting more and more intense. So, it is necessary for us to make the value of t as small as possible.
Additionally, for the Q factor, it is also necessary to control the size of t at a low level as it shows a negative correlation between t and the Q factor since the heat production converged from the edge to the bottom, which results in the increase in energy loss with the t increase.
It can be informed that the characteristic frequency increases as a increases. However, the Q factor and frequency difference are both generally negatively correlated with a; it can be inferred that although the path length of the heat flow at the anchor increased, the heat flow generated at the bottom also increased while a increased. Figures 9 and 10 show the variety of regulation between performance parameters and h. When H and h increase synchronously, out of our expectation, the Q factor decreases within a certain range when R is less than 500 µm, and then the Q factor increases, which is par for the course. An inference that the Q factor decrease because of the increase in energy loss caused by heat flow along the circumferential direction of the edge of the anchor is less than the decrease in energy loss caused by heat production converging from the edge to the bottom was given. Additionally, this phenomenon disappears when R is large enough.
When H and h decrease synchronously, predictably, the Q factor decreases as the energy loss caused by heat flow along the circumferential direction of the edge of the anchor decreases. Since the path length of heat flow along the circumferential direction of the edge of the anchor is reduced, this results in a reduced transfer time.
It can be seen that although the curve graphs can reflect the relationship between single structure parameter and performance parameters to some extent, it is difficult to explain the mapping reasonably since every structure parameters interact with each other, and it is, hence, the need for global optimization of all of the structure parameters.

Data Handling
BP neural network is one of the most widely used neural network models as a multilayer feedforward neural network whose training strategy is according to an error backpropagation algorithm, and its structure is shown in Figure 11. We can obtain the mapping between multiple arguments and multiple variables via the BP neural network. However, the adjusted direction of weight values of traditional BP neural network is along the direction of local improvement, which leads to local extremum easily. Moreover, due to the BP neural networks being sensitive to initial weight values, different initial weight values may lead the neural network convergence to different local minima; in addition, there lacks an effective selection method to determine the weight values, which can However, the adjusted direction of weight values of traditional BP neural network is along the direction of local improvement, which leads to local extremum easily. Moreover, due to the BP neural networks being sensitive to initial weight values, different initial weight values may lead the neural network convergence to different local minima; in addition, there lacks an effective selection method to determine the weight values, which can only be obtained through experimental learning and training, while too many hidden layer neurons may lead to over learning, but too few causes under learning. Additionally, another problem is that the number of hidden layer neurons can only be roughly obtained based on experience, making the algorithm unstable.
Particle Swarm Optimization (PSO) can solve the above problems of traditional BP neural networks to some extent. We assume that the deviations, weight values, and a number of hidden layer neurons of the neural network as a disaggregation set to be solved, and all of the disaggregation sets are treated as a population in the PSO, while each disaggregation set is treated as an individual, as shown as Figure 12. This way, we can prevent the optimal weight values and deviation obtained from falling into the locally optimal solution by combining the locally optimal solution (individual optimal solution) and the globally optimal solution (population optimal solution) and obtain uniquely determined weight values, deviation, and a number of hidden layer neurons. However, the adjusted direction of weight values of traditional BP neural network is along the direction of local improvement, which leads to local extremum easily. Moreover, due to the BP neural networks being sensitive to initial weight values, different initial weight values may lead the neural network convergence to different local minima; in addition, there lacks an effective selection method to determine the weight values, which can only be obtained through experimental learning and training, while too many hidden layer neurons may lead to over learning, but too few causes under learning. Additionally, another problem is that the number of hidden layer neurons can only be roughly obtained based on experience, making the algorithm unstable.
Particle Swarm Optimization (PSO) can solve the above problems of traditional BP neural networks to some extent. We assume that the deviations, weight values, and a number of hidden layer neurons of the neural network as a disaggregation set to be solved, and all of the disaggregation sets are treated as a population in the PSO, while each disaggregation set is treated as an individual, as shown as Figure 12. This way, we can prevent the optimal weight values and deviation obtained from falling into the locally optimal solution by combining the locally optimal solution (individual optimal solution) and the globally optimal solution (population optimal solution) and obtain uniquely determined weight values, deviation, and a number of hidden layer neurons. N is the number of hidden layers neurons; W1num, B1num, W2num, B2num, and N num correspond to the dimensions of W1, B1, W2, B2, and N. Input num is the dimension of input data, hidden num is the dimension of hidden layer neurons, output num is the x 3 y 2 Figure 12. The structure of a particle for PSO-BP.
N is the number of hidden layers neurons; W1num, B1num, W2num, B2num, and N num correspond to the dimensions of W1, B1, W2, B2, and N. Input num is the dimension of input data, hidden num is the dimension of hidden layer neurons, output num is the dimension of output data, and N num is 1, which is used to record the number of hidden layer neurons.
NSGA-II is a kind of genetic algorithm which is always used to solve the combinatorial optimization problem; it can improve the probability of excellent individuals being preserved and reduce global complexity with an elitist strategy and the concept of congestion degree.
The mapping of structure parameters and performance parameters can be obtained via the BP neural network, and the BP neural network will be set as the constraint function of NSGA-II. Then, the best disaggregation set for the combination of structure parameters that correspond to the best performance parameters, under the limitation of a specific manufacturing processing, will come out as the output of NSGA-II, as shown in Figure 13. gestion degree.
The mapping of structure parameters and performance parameters can be obtained via the BP neural network, and the BP neural network will be set as the constraint function of NSGA-II. Then, the best disaggregation set for the combination of structure parameters that correspond to the best performance parameters, under the limitation of a specific manufacturing processing, will come out as the output of NSGA-II, as shown in Figure 13. The specific steps are as follows: 1. Train BP neural network.
(1) Build a sample set from simulation data. The dimension of the input is 5, and each input contains R, r, H, h, and a. The dimension of the output is 3, and each output contains f, f0, and Q. There are 1760 sets of data.
The matrix of input is as follows: The matrix of input is as follows: (2) Normalize the dataset as follows: The specific steps are as follows:
Train BP neural network.
(1) Build a sample set from simulation data. The dimension of the input is 5, and each input contains R, r, H, h, and a. The dimension of the output is 3, and each output contains f, f 0 , and Q. There are 1760 sets of data. The matrix of input is as follows: The matrix of input is as follows: (2) Normalize the dataset as follows: Optimize the particles.
Giving the weight values, deviations, and the number of hidden layer neurons to the neural network every time, the particles update and import the structure parameters into the neural network. Take the two-norm difference between the predicted data and the actual data output as the evaluation criterion for every particle. The smaller the value, the closer the particle is to the optimal solution.
During this process, the current optimal position P of a single particle, which has the smallest evaluation value for a single particle, and the current optimal position G of the entire particle swarm, which has the smallest evaluation value over the whole group are recorded, and the particle position is continuously updated with the target of optimal position. The overall process is shown in Figure 14. (4) Optimize the particles.
Giving the weight values, deviations, and the number of hidden layer neurons to th neural network every time, the particles update and import the structure parameters int the neural network. Take the two-norm difference between the predicted data and th actual data output as the evaluation criterion for every particle. The smaller the value, th closer the particle is to the optimal solution.
During this process, the current optimal position P of a single particle, which has th smallest evaluation value for a single particle, and the current optimal position G of th entire particle swarm, which has the smallest evaluation value over the whole group ar recorded, and the particle position is continuously updated with the target of optimal po sition. The overall process is shown in Figure 14.
The function of position update is as follows: where w is the inertia factor, C 1, and C 2 are individual learning factors and social learning factors, respectively, and the value range of r 1 and r 2 is (0, 1). After iterative calculation, the particle with the lowest fitness is output, and the weight values, deviation, and a number of hidden layer neurons stored within the particle are just the optimal ones for the BP neural network.

2.
Calculate the best combination of structure parameters. The entire process is shown in Figure 15.
factors, respectively, and the value range of r1 and r2 is (0,1). After iterative calculation, the particle with the lowest fitness is output, and the weight values, deviation, and a number of hidden layer neurons stored within the particle are just the optimal ones for the BP neural network.
2. Calculate the best combination of structure parameters.
The entire process is shown in Figure 15. (1) Pick up the information of the particle the output from 1. Give the best weight values, deviation, and a number of hidden layer neurons to the BP neural network; it will be the constraint function of NSGA-II that will be used in (2). (2) Set the BP neural network as the constraint function of NSGA-II, and set as a constraint condition.
(3) Selecting the structure-parameter combinations that participate in genetic variation by elite strategies; Then, cod the structure-parameter combinations selected to simulate binary crossover and polynomial mutation, and put the offspring and the parent of structure-parameter combinations together. The combined structural parameter combinations were sorted via a fast non-dominant sorting algorithm, and a new parent is selected.
Repeat these 500 times, and output the Pareto frontier. The Pareto frontier is shown in Figure 16. (1) Pick up the information of the particle the output from 1. Give the best weight values, deviation, and a number of hidden layer neurons to the BP neural network; it will be the constraint function of NSGA-II that will be used in (2). (2) Set the BP neural network as the constraint function of NSGA-II, and set AS = b as a constraint condition. (3) Selecting the structure-parameter combinations that participate in genetic variation by elite strategies; Then, cod the structure-parameter combinations selected to simulate binary crossover and polynomial mutation, and put the offspring and the parent of structure-parameter combinations together. The combined structural parameter combinations were sorted via a fast non-dominant sorting algorithm, and a new parent is selected.
Repeat these 500 times, and output the Pareto frontier. The Pareto frontier is shown in Figure 16.

Results
We need a higher Q factor, a higher frequency difference between the frequency of ambient noise and the characteristic frequency of the operating mode, and a higher frequency difference between the characteristic frequency of other low-frequency modes and

Results
We need a higher Q factor, a higher frequency difference between the frequency of ambient noise and the characteristic frequency of the operating mode, and a higher frequency difference between the characteristic frequency of other low-frequency modes and the characteristic frequency of the operating mode.
According to Figure 16, the Pareto frontier can be divided into four groups: the first is from point 1 to point 4; the second is from point 5 to point 12; the third is from point 13 to point 23; the fourth is from point 24 to point 31. Points in group 1 have the highest Q factor and the lowest frequency difference, and, among these points, the difference in the Q factor is unusually small, while the difference in frequency difference is relatively large.
The difference in the Q factor among the points of the second group is second only to the first group; however, the difference in the Q factor among the points of this group is obviously larger than it of the first group, while the differences in frequency difference are almost the same.
For the third group and the fourth group, differences in the Q factor and frequency difference among them are less distinctive. However, compared with the first group and the second group, they have a considerable variation in the Q factor and frequency difference over their points inside. Additionally, the Q factor of the points in the third group and fourth group are all far below the points in the first group and second group.
As analyzed above, point 12, the last point in the second group, is selected as the best combination of the structure parameters, which corresponds to the best performance parameters. It has a Q factor of 42,382 and a frequency difference of 8323. Additionally, its structure parameters are R = 408.726, r = 406.855, t 0 = 0.712, h = 365.505, a = 54.415.
Taking the structure parameters of the 12th point into finite element simulation, it came that the Q factor is 42,454, and the frequency difference is 8539, which is very similar to the result from PSO-BP and NSGA-II; thus, this method is proved to be available.

Local Analysis
Local analysis is operated to ensure that point 12 does not fall into a locally optimal solution.
Some of the structure parameters and the performance parameters corresponding to them around point 12 are listed in Table 4. Additionally, the variation of performance parameters at this local range is shown in Figure 17.
It is obvious that whichever structure parameter is changed, point 12 is located in the crossover point and knee point. When a structure parameter is changed, if point 12 did not own the highest Q factor, it is inevitable that it owns the highest frequency difference. So, point 12 is the best solution to obtain the best performance parameters at the local range. Additionally, the variation of performance parameters at this local range is shown in Figure 17.

Global Analysis
A global analysis is operated to ensure that point 12 is the best solution actually. The structure parameters and the performance parameters corresponding to them in the selected value range are listed in Table 5.
Additionally, the variation of performance parameters at the global range is shown in Figure 18.
It is obvious that whichever structure parameter is changed, point 12 is located in the crossover point and knee point. When a structure parameter changes, if point 12 did not own the highest Q factor, it is inevitable that it has the highest frequency difference. So, point 12 is the best solution to obtain the best performance parameters at the global range. It has a Q factor of 42,454 and a frequency difference of 8539. Additionally, its structure-parameters are R = 408.726, r = 406.855, t 0 = 0.712, h = 365.505, and a = 54.415. So, when the thin film deposition method is operated, the best structure has the combination of structure parameters as mentioned above. The structure parameters and the performance parameters corresponding to them in the selected value range are listed in Table 5. Additionally, the variation of performance parameters at the global range is shown in Figure 18. It is obvious that whichever structure parameter is changed, point 12 is located in the crossover point and knee point. When a structure parameter changes, if point 12 did not own the highest Q factor, it is inevitable that it has the highest frequency difference. So, point 12 is the best solution to obtain the best performance parameters at the global range. It has a Q factor of 42,454 and a frequency difference of 8539. Additionally, its structureparameters are R = 408.726, r = 406.855, t0 = 0.712, h = 365.505, and a = 54.415. So, when the thin film deposition method is operated, the best structure has the combination of structure parameters as mentioned above.
A comparison between the simulation results of this study and experimental data A comparison between the simulation results of this study and experimental data from the relevant literature is as Table 6 shows. As Table 6 shows, the structure parameters of the polysilicon resonator given by this study have the highest Q factor among the resonators presented in reference [7,19], which is up to 42,454. The difference between the characteristic frequency of the operating mode and the frequency of the spurious mode of this resonator is 8539, ensuring that the frequency coupling between the operating mode and spurious mode is very small, which makes the influence on the hemispherical resonator generated by the spurious mode, and can be reduced to a lower level. Compared to the hemispherical resonator presented in reference [19], the radius and the thickness of the shell have a slight difference with these two parameters of the hemispherical resonator in this study. However, there is a major gap in the Q factor. Besides the differences in the material properties coming from simulation and experiment, the influence that the variety regulation mentioned above, between the structure parameters and the performance parameters, plays an important role. According to the analysis of Figure 5b, as the thickness of the shell increases, the characteristic frequency of the operating mode and the Q factor both increase continually because of the lengthening of the delivery time of heat flow, which results in the decrease in energy loss in a vibration period.
According to the analysis of Figure 9b, there is a decrease in the Q factor when the depth of the shell increases, while the characteristic frequency of the operating mode has a different trend.
There might be a disparity in the depth of the shell between reference [7] and this study. Additionally, for reference [19], according to the analysis of Figures 5 and 10, when the radius of the resonator is up to 650 µm, different from the variation trend of the characteristic frequency of operating mode, there is a decrease in Q factor when the thickness and depth of the resonator decrease, since a decrease occurs in the delivery time of heat flow, which is the cause for the increase in energy loss in a vibration period.
Since Table 6 shows that the performance parameters of the hemispherical resonator obtained via the method referring to this study are better than those of the hemispherical resonators presented in references [7,19], the feasibility that this method can be applied to the design and optimization under the limitation of a specific manufacturing process was proven.

Conclusions
As analyzed in Section 3, whether it is a local range or a global range, point 12 is the best solution. At the numerical intervals that we have set, the best combination of structure parameters is obtained without experimental processing. Proved by the simulation at the local range and global range, R = 408.726, r = 406.855, t 0 = 0.712, h = 365.505, and a = 54.415 is the best solution in the range of structure parameters that Table 2 actually shows.
This study provided an optimization formula for micro-resonators based on PSO-BP and NSGA-II under the limitations of specific processing conditions, avoiding experimental processing, and can be used in any one of the manufacturing processing. Obviously, it will make a difference in guiding the optimization of MEMS hemispherical resonator gyroscopes.
The structure parameters that mainly affect the performance parameters of the hemispherical shell resonator are obtained by analyzing the thermoelastic damping losses. The finite element simulations with variables controlling are used to obtain a dataset of the performance parameters of the hemispherical resonator when the structure parameters vary. By optimizing the weight, deviation, and number of hidden layer neurons via PSO, as Section 3 claims, the problem of the BP neural network that falls into a locally optimal solution is easily resolved. Finally, the limitations of molding defects under the process conditions are considered constraints, and the optimized BP neural network is used as the objective function to obtain the best structure parameters.
At the end of Section 3, validated by simulation, the result of the simulation is almost similar to the result of NSGA-II output; we prove that this approach to optimize the resonator is available. It solves the problem that it is difficult to design the best structure under a specific process because the structure parameters are under the limitation of process conditions. However, there are also some limits in this study. First, the weight of the Q factor and frequency difference is not clear, although the Pareto frontier gives the non-dominated optimal solution set, to what an extent that Q factor and frequency difference occupies the comprehensive performance should be quantized clearly; another limit is that what we study is in a selected numerical interval, in different numerical intervals, there might be a great difference among the mappings, so, the numerical intervals selected in an optimization should not be too large.
Moreover, there would be a sudden change when the hemispherical resonator changes from a standard one to a nonstandard one because that the sudden change in the structure results in a sudden change in the path length for the irreversible heat flow.
For the next work, the characteristic frequency of the operating mode can be limited under 10 kHz or 15 kHz as it is a little high for the frequency of 31 kHz from this study; another job is to discover the particular cause of the sudden change when the hemispherical resonator changes from a standard one to a nonstandard one.