Accelerated Particle Filter With GPU for Real-Time Ballistic Target Tracking

This study addresses the real-time tracking of high-speed ballistic targets. Particle filters (PFs) can overcome the nonlinearity of the motion and measurement models of ballistic targets. However, applying PFs to real-time systems is challenging because they generally require significant computation time. Most existing methods that accelerate PFs with a graphics processing unit (GPU) for target tracking applications parallelize the weight computation and resampling parts. However, the distribution of computation time across parts varies from application to application; in this work, we confirm that the model propagation part consumes most of the computation time in our application and propose an accelerated PF that parallelizes the corresponding logic. The real-time performance of the proposed method was tested and analyzed on an embedded system. Compared with a conventional PF on the central processing unit (CPU), the proposed method reduces computation time by at least a factor of 10, improving real-time performance.


I. INTRODUCTION
The performance of ballistic target interception depends highly on accurate target tracking. For high-accuracy tracking under measurement uncertainties, state estimation must be performed with a suitable filtering algorithm. Generally, the measurement noise is assumed to follow a Gaussian distribution for mathematical simplicity. However, owing to the nonlinear and non-Gaussian characteristics of the measurement noise caused by seeker random noise and scintillation, the Gaussian assumption is invalid [1], [2]. Under nonlinear and non-Gaussian uncertainties, conventional filtering algorithms may perform unsatisfactorily; for this reason, linear Kalman filter-based target tracking filters may fail to converge or even diverge during interception.
Accordingly, various nonlinear filters have been applied to target state estimation, including the extended Kalman filter (EKF), the particle filter (PF), and the unscented Kalman filter (UKF). Compared with the EKF, the PF performs more consistently under nonlinear and non-Gaussian noise [3], [4], [5], because it has the inherent capability and flexibility to deal with various types of error distributions. However, the primary difficulty of a PF in a real-time system is its heavy computational burden, as the required number of particles increases exponentially with the number of state variables. This computational issue is a crucial constraint that must be solved for real-time application.
The PF takes more time as the number of particles grows, because the sampling-based algorithm iterates once per particle. Although these iterations are necessary to extract an appropriate estimate from the particle information and the resampling process, the run time can be reduced if the computation-heavy parts are parallelized [6]. Therefore, algorithms such as PFs, which require significant time to extract target information, can be accelerated through parallelization on a graphics processing unit (GPU) [7]. PFs have been accelerated with GPUs in studies that require fast results, such as sensor-based target tracking and other real-time fields.
To achieve high-speed target tracking, the PF is accelerated with a GPU using the Compute Unified Device Architecture (CUDA). If the entire PF algorithm is ported to CUDA, all parts are converted regardless of their computation time, and all the data held by the PF on the CPU must be transferred to the GPU. This is inefficient and can lead to considerable overhead. Instead, the parts of the PF requiring a considerable amount of computation time are identified and parallelized on the GPU. In this way, the computation time can be reduced even though some overhead is incurred.
Because the PF algorithm is well suited to tracking, it is widely used. As summarized in Table 1, it has been applied to various tracking tasks, such as target tracking [11], [12], object tracking, and motion tracking. In [4] and [13], a PF was used for missile applications. Acceleration studies using GPUs have also been conducted to run the PF algorithm in real time. In [14] and [18], PFs with parallelized weight computation were proposed: in [14], a GPU was used to improve the PF estimate for target tracking rather than for acceleration, whereas in [18], a GPU was used to accelerate IoT applications, speeding up the tracking algorithm by approximately 55% compared with the CPU-based algorithm. In [16], a PF that parallelizes the likelihood-function calculation was proposed and reduced its computation time, although generating random values remained costly; such studies performed parallelization in environments with considerable changes in the signal or in the amount of particle information, such as image tracking. In [17], [19], and [20], more than one part that either required long computation time in the research environment or could be parallelized independently was parallelized; examples include the weight computation, the likelihood-function evaluation for the particle states, and resampling. When two or more parts of a PF are parallelized on the GPU, more overhead is generated, so overhead-reducing measures, such as shared kernels, are needed. Unlike these previous works, we propose a method that parallelizes the model propagation part using a GPU.
This paper describes a high-speed target tracking system and the need to accelerate the PF algorithm. The parts of the PF requiring a considerable amount of calculation time are parallelized to achieve high-speed target tracking. The GPU-based methods for accelerating those parts, and the reasons for selecting them, are described. The results of the target-tracking algorithm with the accelerated PF are compared with those of the original target-tracking algorithm with the unaccelerated PF.
The contributions of this paper are as follows:
• This is the first approach to accelerate a PF for ballistic target tracking under glint noise.
• To the best of our knowledge, this is the first study to identify and analyze the long computation time of the model propagation step of the sampling process.
• A new parallelization method was developed for real-time PFs for ballistic target tracking.
• The computation time of the PF was significantly reduced, even including the overhead of CUDA initialization, on a widely used embedded system.
The remainder of this paper is organized as follows. Section II describes the target missile tracking system based on PFs and the real-time problems of PFs. In Section III, after the computation times of the PF are profiled block-wise, a new parallelization method for the model propagation step of the sampling process is proposed. In Section IV, the evaluation results of the proposed method are presented and quantitatively compared with those of other methods on a widely used embedded system. Finally, conclusions are presented in Section V.

II. PROBLEM DESCRIPTION
The objective of the target-tracking filter is the real-time estimation of the true target states. To evaluate the performance of the tracking filter and the accelerated system, target trajectories of ballistic missiles are generated. We consider a target-tracking filter for the reentry phase of a ballistic missile. In the reentry phase, atmospheric drag is a significant force determining the path of the missile; accordingly, the forces acting on the target during reentry arise from gravity and aerodynamic drag. The ballistic missile is represented as a point mass in three-dimensional Cartesian coordinates. Since we consider target tracking only for the reentry phase, the thrust force is set to zero and the mass of the tracking target is constant. The aerodynamic drag D is expressed as a function of the air density ρ, the target velocity V, the aerodynamic coefficient C_D, and the reference area S.

A. MOTION AND MEASUREMENT MODEL FOR PARTICLE FILTER
Since target-tracking estimation is based on the target motion model, several target models have been proposed. In this study, the well-known Singer model was used [8], [9]. The Singer model assumes that the target acceleration is a zero-mean, first-order, stationary Markov process. The state-space representation of the continuous-time Singer model is

ẋ = Fx + w,

where x is the target state, w is zero-mean white Gaussian noise, and τ and I_3 in F denote the maneuver time constant and the identity matrix of order 3, respectively. Its discrete-time equivalent is as follows, where Φ_k and Δt represent the state transition matrix and the sampling time interval. The covariance Q_k in Eq. 6 consists of the power spectral density S_w and the white-noise jerk model Q_0; the acceleration increment over a time period is the integral of the jerk over that period. For the state-space representation in Eq. 2, x contains the variables of the target position, velocity, and acceleration. P, V, and A in Eq. 7 are the target position, velocity, and acceleration, respectively, in Cartesian coordinates.
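To make the discrete-time propagation concrete, the sketch below builds the standard per-axis Singer state-transition matrix (position, velocity, acceleration with α = 1/τ) and applies one step; the numeric values of τ, Δt, and the state are purely illustrative, and the full 9-state model of Eq. 2 applies this block once per Cartesian axis.

```python
import numpy as np

def singer_phi(tau, dt):
    """Discrete-time Singer state-transition matrix for one axis
    (position, velocity, acceleration), with alpha = 1/tau."""
    a = 1.0 / tau
    e = np.exp(-a * dt)
    return np.array([
        [1.0, dt, (a * dt - 1.0 + e) / a**2],
        [0.0, 1.0, (1.0 - e) / a],
        [0.0, 0.0, e],
    ])

phi = singer_phi(tau=20.0, dt=0.01)          # illustrative values
x = np.array([1000.0, -300.0, 5.0])          # position, velocity, acceleration
x_next = phi @ x                             # one propagation step
```

For small Δt relative to τ, Φ approaches the constant-acceleration transition matrix, which matches the near-Markov maneuver assumption of the model.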
where (x, y, z) represents the target position in the Cartesian coordinate system. Three radar measurements were assumed: elevation, azimuth, and range, acquired with respect to the target and radar positions. In Eq. 9, the subscript r denotes the relative position between the target and the radar, and m denotes the position of the radar. Consequently, the two bearing angles z_θ, z_ψ and the relative range z_R can be represented as the nonlinear equations in Eq. 10, using the Cartesian states and the radar noises.
where x_r, y_r, and z_r are the components of the relative position, n_θ, n_ψ, and n_R represent the receiver noise of the radar, and n_G,θ and n_G,ψ are the non-Gaussian glint noises generated in the radar measurements [3].
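As a minimal sketch of this measurement model, the function below maps a Cartesian target position to (elevation, azimuth, range) from the radar position; the specific angle conventions (elevation from the horizontal plane, azimuth as atan2(y_r, x_r)) are one common choice and may differ from the paper's exact definitions.

```python
import numpy as np

def radar_measurement(target, radar, noise=None):
    """Map a Cartesian target position to (elevation, azimuth, range)
    relative to the radar; `noise` stands in for the receiver and
    glint noise terms of Eq. 10."""
    xr, yr, zr = np.asarray(target, float) - np.asarray(radar, float)
    z_range = np.sqrt(xr**2 + yr**2 + zr**2)
    z_elev = np.arctan2(zr, np.hypot(xr, yr))   # bearing angle z_theta
    z_azim = np.arctan2(yr, xr)                 # bearing angle z_psi
    z = np.array([z_elev, z_azim, z_range])
    return z if noise is None else z + noise

z = radar_measurement([3000.0, 4000.0, 1200.0], [0.0, 0.0, 0.0])
```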

B. THE PROBLEM OF ALGORITHM ACCELERATION
For high-speed targets such as ballistic missiles, the filter update rate and estimation accuracy are crucial: precision guidance and control lead to a successful interception, so accurate target tracking is indispensable. In this study, a PF was used for higher estimation accuracy and consistency. However, the heavy computational burden of the PF must be addressed for real-time application. To cope with this problem, we propose a GPU-based acceleration method for PFs. The per-particle parts of the PF algorithm are processed once for each particle, and the entire PF algorithm is iterated a number of times predetermined by the user. For example, if the PF is iterated 300 times, the model propagation and weighting function are computed once per particle in each of those 300 iterations.
When the PF algorithm runs on a CPU, these calculations are performed sequentially over the iterations, so the computation time grows with the number of particles. On the CPU, the repetitive calculation in the PF algorithm is carried out once per particle, whereas on the GPU the same calculation can be parallelized and performed simultaneously. Therefore, with a GPU, the time spent in the computation-heavy parts of the iterations can be significantly reduced. With an appropriate parallelization technique, the larger the number of particles, the greater the savings in computation time relative to the CPU.

III. PROPOSED METHOD

A. OVERVIEW OF THE PROPOSED METHOD
The PF flowchart for target tracking is shown in Fig. 1. Initially, the particles are scattered at random within the measurement range. The model propagation step predicts how the target will move; the movement is estimated using Eq. 11. In Eq. 11, x_p holds the acceleration, velocity, and position of the initial particles; randnum is a matrix of random values; and Q_k is the 9 × 9 filter covariance matrix, given by Eq. 13. In Eq. 13, sig_acc is the signal accuracy of the filter covariance and dt is the interval over which the target state changes. The matrix A is the target state transition matrix defined in Eq. 5. From these values, x_mid1 is obtained and used to estimate the state of the target. This process is performed for each particle. All the information collected in x_bar is accumulated and used in the filter update part. After the model propagation step, the glint and sensor noise models generate measurement values.
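The vectorized form of this step can be sketched as follows. Since Eq. 11 itself is not printed in the text, the expression below is reconstructed from the kernel decomposition in Section III-D (deterministic propagation A·x_p plus propagated process noise A·√Q_k·randnum); the matrices A and Q_k are placeholders, not the paper's actual values.

```python
import numpy as np

rng = np.random.default_rng(0)
n_particles = 5000
A = np.eye(9) + 0.01 * np.diag(np.ones(6), k=3)  # placeholder transition matrix
Qk = 0.1 * np.eye(9)                             # placeholder 9x9 covariance
x_p = rng.normal(size=(9, n_particles))          # one particle state per column

# Model propagation for all particles at once (reconstructed Eq. 11).
# For a diagonal Qk the elementwise sqrt equals the matrix square root;
# a general Qk would require a Cholesky factor instead.
randnum = rng.standard_normal((9, n_particles))
x_bar = A @ x_p + A @ np.sqrt(Qk) @ randnum
```

On the CPU this product is evaluated one column of x_p at a time, which is precisely the per-particle loop the GPU version parallelizes.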

B. COMPUTATION TIME PROFILING
To accelerate the PF, the part of the algorithm requiring the most computation time must be identified. The computation time of each part was measured on an Nvidia Jetson Xavier; the results, summarized in Table 2, show that the model propagation part takes more time than the other parts. The per-part computation times in Table 2 were obtained in an experiment with 5000 particles.
The calculation performed in the model propagation part consists of taking the square root of the filter covariance and adding and multiplying matrices. The matrices A and Q_k are 9 × 9, and x_p has 9 rows and as many columns as there are particles. Each iteration of the model propagation processes one column of x_p, so, as shown in Fig. 1, the matrix calculation is repeated once per particle. Because this per-particle matrix calculation is performed in every single iteration of the particle filter algorithm, the model propagation part consumes the most computation time. The filter update and likelihood function parts in Fig. 1 are also iterated once per particle, but since they involve simple scalar operations rather than matrix operations, their computation time is small compared with that of the model propagation part. The PF was therefore accelerated by computing Eq. 11, which constitutes the model propagation, in parallel. As described in Part B, when the number of particles used in the PF algorithm is large, the matrices involved become large, so the model propagation requires a significant amount of calculation time. However, the operations in Eq. 11 are not individually complex; they are merely repeated many times. Thus, the step is easy to parallelize on a GPU and can be accelerated effectively.
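This kind of block-wise profiling can be reproduced with a simple harness like the one below; the two workloads are stand-ins (not the paper's actual filter code) chosen so that one clearly dominates, mimicking how the model propagation was identified as the hot spot.

```python
import time

def profile(parts, n_runs=10):
    """Accumulate wall-clock time per named part over repeated runs."""
    totals = {name: 0.0 for name in parts}
    for _ in range(n_runs):
        for name, fn in parts.items():
            t0 = time.perf_counter()
            fn()
            totals[name] += time.perf_counter() - t0
    return totals

# Illustrative stand-ins: "model_propagation" does 10x the work.
timings = profile({
    "model_propagation": lambda: sum(i * i for i in range(20000)),
    "likelihood":        lambda: sum(i * i for i in range(2000)),
})
hottest = max(timings, key=timings.get)
```

Profiling before parallelizing, as done here, is what lets the method target only the dominant part instead of porting the whole filter to CUDA.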
For CUDA, tasks such as CUDA initialization and memory allocation are executed first, and the input matrices are copied into the variables defined for CUDA. GPU memory is allocated as shown in Fig. 2, according to the size of the matrices being calculated: each matrix has 9 rows and each block has 1024 threads, so the more particles are used, the more GPU memory is required. As shown in Fig. 2, the GPU memory usage must be defined so that values can be stored in the designated memory and computed in parallel by the CUDA kernels. Fig. 3 shows a flowchart of the model propagation part implemented in CUDA, in which the kernels are computed in parallel on the GPU. In the first kernel, as shown in Fig. 2, normally distributed random values are generated using the ''curand_normal'' function provided by the CUDA library. To give each thread a different seed, the kernel launch time and the thread ID derived from the ParticleID in Fig. 2 are used as seeds for ''curand''. The random values produced by this kernel form a matrix with 9 rows and as many columns as there are particles, because the size of the random-value matrix must match the matrix operations applied in Kernel 4. Kernel 1 is also used in the parallelized PF 2.0 described in Section III-D. To process the matrices in Kernel 2, IDs that address the elements of the matrices must be generated. Fig. 5 shows how these IDs are created; they are defined identically within all the kernels used in this part and in Part D. The target matrices of the kernels are either 9 × 9 or have 9 rows and as many columns as there are particles. The ID is the same as the ParticleID in Fig. 3, and its range equals the number of particles, so it can address as many columns as there are particles.
StateID addresses the rows of a 9-row matrix, and StateIDy addresses the columns. The IDs generated in Fig. 5 assign addresses to the elements of the matrices targeted by the kernels so that the value at each address can be computed in parallel.

C. PARALLELIZED PARTICLE FILTER 1.0
This part describes the parallelized PF 1.0, which integrates all the operations of Eq. 11 into a single kernel. In Kernel 2, all the equations in Eq. 11 are calculated, as detailed in Fig. 6. The addresses of the matrices computed in Kernel 2 are assigned according to the sizes of the target matrices of the calculations in Eq. 11. Because the CreateIDs in Fig. 5 are declared first in Kernel 2, addresses for the matrix values can be assigned using StateID, StateIDy, and ID according to the size of the matrix used in each calculation. It is difficult to evaluate Eq. 11 as a single expression, as on the CPU, because the result matrices of the individual operations have different sizes; this is also why the IDs in Fig. 5 are defined with different sizes and addresses.
The variables x_mid1, x_mid2, and x_mid3 defined within Kernel 2 in Fig. 6 are needed, for this reason, to store the result matrices of different sizes. After Kernel 2 is executed, the x_mp matrix is obtained as the result of the parallelized model propagation.
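The ID construction can be emulated on the CPU as below. The exact formulas appear only in the paper's figures, so the mapping here (a flat global thread index split into a row and a column address for a 9-row matrix) is an assumption meant to illustrate the idea, not the authors' literal code.

```python
def create_ids(block_idx, block_dim, thread_idx, n_rows=9):
    """CPU emulation of a CUDA index computation: a flat global id
    (blockIdx.x * blockDim.x + threadIdx.x) plus row/column addresses
    into a 9 x n_particles matrix. Formulas are illustrative assumptions."""
    gid = block_idx * block_dim + thread_idx  # like ParticleID
    state_id = gid % n_rows                   # row address (like StateID)
    state_idy = gid // n_rows                 # column address (like StateIDy)
    return gid, state_id, state_idy

gid, row, col = create_ids(block_idx=2, block_dim=1024, thread_idx=5)
```

Each thread uses its own (row, column) pair, so every matrix element is written by exactly one thread without synchronization.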

D. PARALLELIZED PARTICLE FILTER 2.0
FIGURE 5. Create IDs used to calculate the matrices; each ID is defined to obtain the addresses of the matrix values.
Eq. 11 consists of a square-root operation on a matrix and multiplications and additions between matrices. The method described in Part C performs the parallel calculation by integrating Eq. 11 into one kernel; as shown in Fig. 2, this method defines two kernels. Thus, the variables that store the results of each operation are defined inside the kernel, so every time the particle filter iterates, the definition of the variables for the result matrices is repeated within the kernel. This means that the GPU memory for these variables is allocated anew in every iteration.
To reduce the time spent on these tasks, a method was devised that reserves GPU memory by declaring the variables in advance, before the kernels are used. The declared variables are then used as the inputs and outputs of the CUDA kernels. Since the calculations constituting Eq. 11 are not complicated, Kernel 2 in Fig. 6 was subdivided and the calculations were performed in parallel; the subdivided kernels are defined so that the predefined variables serve as their inputs and outputs. This method is the proposed PF 2.0: Fig. 7 restructures the kernel of Fig. 6 into three kernels, so the proposed PF 2.0 subdivides Kernel 2 of the proposed PF 1.0 into three kernels.
The subdivided kernels are shown in Figs. 8, 9, and 10. In Kernel 2 of this part, as shown in Fig. 8, the matrix multiplication of A and x_p is computed in parallel. Since A is 9 × 9, its row addresses are assigned using StateID and its column addresses using StateIDy. Matrix multiplication sums the products of the rows of the leading matrix with the columns of the trailing matrix; therefore, the row address of x_p, the trailing matrix, is assigned as StateIDy, and since x_p has as many columns as there are particles, its column addresses are assigned using ID. Next, as shown in Fig. 9, three matrices are multiplied in Kernel 3. First, A is multiplied by the square root of Q and the result is stored in the x_mid1 matrix; both factors are 9 × 9, so x_mid1 has the same size. Then x_mid1 is multiplied by the random-value matrix generated by Kernel 1 and the product is stored in x_mid3. In Kernel 4, the result matrices of Kernel 2 and Kernel 3 are added in parallel, as shown in Fig. 10. Since both result matrices have 9 rows and as many columns as there are particles, the result matrix x_bar has the same size. Because the sequential particle filter evaluates Eq. 11 and checks whether it has been repeated as many times as the target number of particles, its time complexity is determined by the number of particles. An algorithm with linear time complexity can reduce calculation time by eliminating overlapping calculations, so a parallelized particle filter performs well. In addition, from the perspective of space complexity, the optimal memory required by the algorithm is allocated in advance according to Fig. 2.
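The three-stage split can be sketched on the CPU as below, with each stage writing one intermediate (the variable names mirror the paper's x_mid1/x_mid3 but the matrices themselves are illustrative); the final sum reproduces the fused PF 1.0 computation exactly.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000
A = np.eye(9)                      # placeholder transition matrix
sqrtQ = 0.1 * np.eye(9)            # placeholder sqrt of covariance
x_p = rng.normal(size=(9, n))      # particle states, one per column
randnum = rng.standard_normal((9, n))

# PF 2.0: Eq. 11 split into three stages, as in Kernels 2-4.
term1 = A @ x_p                    # Kernel 2: A * x_p
x_mid1 = A @ sqrtQ                 # Kernel 3, step 1: A * sqrt(Q), 9x9
x_mid3 = x_mid1 @ randnum          # Kernel 3, step 2: scale random draws
x_bar = term1 + x_mid3             # Kernel 4: elementwise sum

# PF 1.0: the same computation fused into one expression.
fused = A @ x_p + A @ sqrtQ @ randnum
```

Since the staged and fused forms are algebraically identical, the PF 2.0 split changes only where intermediates live (pre-allocated buffers), not the result.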
In the proposed PF 1.0, however, space to store the result matrices of all the operations in Eq. 11 is reallocated in every iteration, which results in poor space complexity. The proposed PF 2.0 therefore pre-allocates the storage for the computational results, which improves the space complexity and uses GPU resources efficiently.

IV. RESULTS
In this study, a GPU-accelerated PF for high-speed target tracking was obtained by parallelizing the model propagation step of the PF. First, the simulation results of ballistic target tracking under glint noise are described. Then, the proposed PF algorithm is shown to compute faster than the CPU-based version across various GPUs. Compared with other PF parallelization methods that target resampling or the likelihood function, only our method showed good performance in our application.

A. RESULTS OF HIGH-SPEED TARGET TRACKING
The effectiveness of the proposed acceleration method is assessed in a ballistic target tracking scenario. For the numerical simulation, the dynamic models in Eq. 1 and Eq. 2 were used to generate the true reference trajectory. The aerodynamic drag and weight of the missile were set as in [10]. The sampling interval was set to Δt = 0.01 s, with 200 intervals, yielding a total simulation time of 2 s. The standard deviations of the radar receiver noises n_θ, n_ψ, and n_R were 0.1°, 0.1°, and 1 m, respectively. The glint noises n_G,θ and n_G,ψ are mixtures of Gaussians, which follow the distribution
where ε is the glint probability, and p_G1 ∼ N(0, 0.5²)° and p_G2 ∼ N(0, 1²)° are the component Gaussians at a range of 100 m, respectively [3]. The tracking motion model follows the Singer model in Eq. 5, and the measurement model is given by Eq. 13. The position of the radar is assumed fixed on the ground, whereas the ballistic target moves at high speed under gravity and aerodynamic drag; consequently, the velocity of the ballistic target varies over the simulation time.
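Such a two-component Gaussian mixture can be sampled as below; the glint probability ε = 0.2 used here is an illustrative value, not necessarily the paper's, while the component standard deviations 0.5° and 1° follow the distribution above.

```python
import numpy as np

def glint_noise(n, eps=0.2, sigma1=0.5, sigma2=1.0, rng=None):
    """Zero-mean Gaussian mixture (in degrees): with probability 1-eps
    draw from N(0, sigma1^2), with probability eps from N(0, sigma2^2).
    eps here is an assumed value for illustration."""
    rng = rng or np.random.default_rng()
    pick_wide = rng.random(n) < eps
    return np.where(pick_wide,
                    rng.normal(0.0, sigma2, n),
                    rng.normal(0.0, sigma1, n))

noise = glint_noise(10000, eps=0.2, rng=np.random.default_rng(0))
```

The heavy-tailed, non-Gaussian shape of this mixture is exactly what makes the Gaussian-noise assumption of Kalman-type filters break down and motivates the PF.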
The resulting trajectory and the estimation results are shown in Figs. 11-13. With 15000 particles, the filter shows satisfactory tracking performance.

B. RESULTS OF ALGORITHM ACCELERATION WITH GPU
We conducted the following experiments on the embedded system, a Jetson Xavier NX: computation-time measurements of each algorithm as a function of the number of particles. The parallelized algorithms are significantly faster, and the larger the number of particles used in the particle filter, the less time the GPU-parallelized version takes relative to the CPU version. The performance of the PF algorithms over the entire algorithm is shown in Table 3.
For parallelized computation using CUDA, overhead is inevitable; it includes CUDA initialization, the definition of variables for CUDA, kernel definitions, and so on. However, even with this overhead included, running the algorithm on the CPU alone takes more time. Fig. 15 compares the calculation time of the entire algorithm using only the CPU against using the CPU and GPU together with CUDA when the number of particles is 5000. The PF accounts for almost all of the algorithm's run time, while the other parts take little time. This result shows that the calculation time decreases significantly when CUDA parallelization is applied to the model propagation part, where the most time is spent. Fig. 15 also shows the speedup of the parallelized PF 2.0 over the conventional PF; the difference in execution time between the two is largest in the model propagation part. Other methods, which parallelize resampling or the likelihood function, are compared in Fig. 16. In conclusion, the other methods not only leave the model propagation very slow but can even perform worse than the conventional PF because of overhead, whereas our method achieves the best performance by profiling the computation time in advance and selectively parallelizing exactly the parts that need acceleration. Most importantly, comparing different parallelization methods by applying them to our application may be unfair: the other methods were designed for applications different from ours, with different inputs, and they presumably perform best in their own applications. What this shows, ultimately, is that parallelization should be applied differently for each application.
In other words, each acceleration algorithm achieves optimal results when it is configured through profiling of its own application. We therefore profiled the conventional PF, found that the model propagation part takes most of the total computation time, and showed that parallelizing the model propagation performs best in our application. Table 4 extends the above experiments by profiling the performance of a particle filter with 5000 particles on another embedded board, the Jetson AGX Orin, and on several GPU cards. The higher the hardware specification, the better the performance: for example, the computation time on the Jetson AGX Orin was smaller than on the Jetson Xavier NX because the AGX Orin has a higher hardware specification. On the other hand, the local memory usage on the AGX Orin was larger than on the Xavier NX, indicating a trade-off between computation time and local memory usage. In addition, the peak-compute utilization on the GeForce RTX 3070 and 3090 was 2.57% and 2.54%, respectively, indicating bottlenecks elsewhere because these GPUs' capability far exceeds what the proposed method requires. Therefore, hardware for algorithm acceleration should be selected carefully, considering hardware cost and this trade-off. The most important point in Table 4 is that the conventional CPU-only particle filter is greatly sped up by the proposed parallelized particle filter regardless of which GPU-based hardware system is selected.

C. DISCUSSION
The computation time of the target tracking algorithm was compared when using only the CPU and when using the parallelized particle filters 1.0 and 2.0. The entire algorithm with a parallelized particle filter required far less computation time than the CPU-only version, and of the two GPU-based methods, the proposed PF 2.0 computed faster.
The reasons for the difference between the proposed PF 1.0 and 2.0 are as follows. First, since most of the overhead time is incurred in CUDA initialization, the overhead is similar when the numbers of kernels and variables used do not differ significantly. For the overhead to differ, kernels performing much more complex computations would have to be combined, or many more inputs and outputs would have to be defined than in the experimental environment of this paper; the difference in the numbers of kernels and variables between the two methods described here is not significant, so their overhead times are similar. Second, the parallelized PF 2.0 uses predefined variables to hold the result matrices of the kernels: by defining the variables in advance, the storage areas are set up once. The parallelized PF 1.0, by contrast, defines the variables, and thus the storage areas, in every iteration in which the kernel is executed. For this reason, the parallelized PF 1.0 spends more computation time in the model propagation part than the proposed PF 2.0.
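The allocation difference can be sketched in NumPy terms: PF 1.0 corresponds to creating a fresh result buffer on every call, while PF 2.0 corresponds to writing into a buffer allocated once up front (the names and matrices are illustrative, mirroring the pre-declared CUDA variables described above).

```python
import numpy as np

n = 5000
A = np.eye(9)
x_p = np.random.default_rng(2).normal(size=(9, n))

# PF 1.0 style: the output buffer is (re)allocated on every iteration.
def step_alloc():
    return A @ x_p

# PF 2.0 style: the output buffer is allocated once and reused.
out = np.empty((9, n))
def step_prealloc():
    np.matmul(A, x_p, out=out)  # write into the pre-allocated buffer
    return out

r1 = step_alloc()
r2 = step_prealloc()
```

Both variants produce identical results; the second simply avoids per-iteration allocation, which is the same trade the PF 2.0 kernels make on the GPU.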

V. CONCLUSION
In this study, the first approach to accelerate a PF for target missile tracking was developed. A PF algorithm was used to track the high-speed moving ballistic target, and GPU acceleration was applied to achieve real-time performance. The PF tracked the target states, such as its motion and angles, and estimated them successfully without significant differences.
In the target tracking algorithm, most of the time was spent in the PF, especially in the model propagation of the information held by the particles. The part identified as requiring the most computation time was parallelized with CUDA on a GPU. As a result, the computation time was reduced compared with the CPU-only algorithm, even accounting for the overhead that inevitably arises when using CUDA. The algorithms with the parallelized PF proposed in this study require less computation time than estimating the state of the ballistic target using only the CPU, so both GPU-based methods offer much better real-time performance than the CPU-only algorithm. Of the two GPU-based methods, the proposed PF 2.0 is more effective because the calculations are not complicated and the variables to be used on the GPU are predefined.
CHAN KIM received the master's degree in information and control engineering from Kwangwoon University, Seoul, in 2015.
From 2015 to 2019, he worked as a researcher in a company related to vehicle parts. Since 2020, he has been working as a Senior Researcher with PGM Research and Development Group Development, LIG Nex1, Gyeonggi-do. His research interests include embedded systems, embedded hw/fw, real-time embedded systems, missile systems, and optimization and acceleration.
WONSEOK CHOI received the M.D. degree in defense convergence engineering from Yonsei University, Seoul, South Korea, in 2017.
From 2015 to 2017, he was a Senior Researcher at LIG Nex1 for PGM Research and Development Group Development, Gyeonggi-do, South Korea. His research interests include embedded systems, embedded SW, real-time embedded systems, and missile systems.