Curve Fitting Algorithm of Functional Radiation-Response Data Using Bayesian Hierarchical Gaussian Process Regression Model

We present a nonparametric Bayesian hierarchical (NBH) model and develop a variational approximation (VA) algorithm for the curve fitting of the functional radiation response data. The NBH model is based on a Bayesian hierarchical (BH) model with a Gaussian-Inverse Wishart process (G-IWP) prior, which simultaneously smooths multiple functional observations and estimates mean-covariance functions. We use the automatic differentiation variational inference (ADVI) algorithm with a Gaussian distribution as the variational distribution for approximating the posterior distribution of parameters of interest, which is applicable to a wide class of probabilistic models and can also be implemented in Stan (a probabilistic programming system). Using the NBH model and the Gaussian ADVI algorithm, we fit a dataset for the semiconductor obtained from the radiation response map (RRM) of South Korea.


I. INTRODUCTION
Radiation technology is usually used to refer to the use of ionization and transmission force characteristics, a phenomenon in which orbital electrons of neutral atoms or molecules are reversibly separated by the action of radiation. Radiation technology is deeply intertwined with both basic and applied sciences, and it is very fashionable thanks to its property to be integrated across industry sectors ranging from traditional businesses to high-tech industries such as new material development, advanced medical technology, biotechnology, and military science. This is because, in the past, radiation was limitedly used around material conversion through radiation irradiation, but radiation technology has lately moved to fusion technology. Radiation technology is The associate editor coordinating the review of this manuscript and approving it for publication was Wentao Fan . largely divided into three generations as follows. The use of simple radiation irradiation technology in agriculture and medical fields in the 1960s is classified as the first generation, and the second generation was a period of active application of radiation fusion technology, which incorporates existing radiation irradiation technology into various fields such as chemistry, biology, nano, environment, and informatics. Currently, radiation technology is being incorporated into the fourth industry and is evolving into specific purposes of radiation fusion technology such as radiation response big data and predictable radiation molecular conversion technology, which is categorized as the third generation.
National-level databases for radiation responses, in particular, are being created, and numerous relevant research is being performed as a result of the development of statistical techniques and machine learning-based analysis methods. The United States is conducting various studies and building VOLUME 11, 2023 This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/ data using neutrons at the National Institute of Standards and Technology and Oak Ridge National Laboratory [1]. Germany strategically promotes multidisciplinary research in the field of advanced radiation medicine, centering on the Heidelberg Ion-Beam Therapy Center [2]. Japan conducts comprehensive research in the field of radiation at the National Institute for Quantum Science and Technology [3]. The Korea Atomic Energy Research Institute (KAERI) established a Radiation Response Map (RRM) modeling platform to promote and support new radiation convergence industries as part of its project to expand the radiation research base. RRM is a database that systematically collects and organizes reaction information that appears when irradiated with radiation from food to industrial materials. It is worth highlighting, in particular, that this data was rigorously evaluated by the government and was awarded the highest grade of Data Quality Certification-Value (DQC-V). It is evaluated that the radiation utilization platform, which is tricky to predict and contains huge data, has been officially certified for reliability. RRM has a data integrity and consistency rate of 99.97% or higher and can be expected to boost its efficacy by sophisticated prediction models. We employ a nonparametric Bayesian hierarchical (NBH) model to analyze the data from the RRM modeling platform. NBH incorporates Bayesian qualities that can quantify uncertainty by estimating the posterior distribution, while also including nonparametric and hierarchical properties. Since [4] proposed a hierarchical bayesian model, NBH has gained a great deal of attention in complex network analysis and spatiotemporal modeling [5], [6], [7], [8]. The reasons are similar to the descriptions following, which are appropriate for RRM data analysis. First, it deals with nonparametric characteristics. In real-world applications, the parameter form of the probability distribution is unknown, and it is often not easy to assume this. Although Bayesian approaches are regarded as a very powerful statistical tool for measuring uncertainty, parametric Bayesian models based on assumptions about probability distributions are frequently troublesome for inference. As a result, we offer a nonparametric Bayesian approach with a Gaussian-Inverse Wishart process (G-IWP) priority in this paper. The second is about hierarchical characteristics. A hierarchical Bayesian approach is based on the premise that different models share information by sharing common parameters. The conditionally independent hierarchy is the essential foundation of hierarchical modeling, in which a set of parameters is coupled by making their distributions rely on a shared underlying parameter. As data groups with similar statistical properties exchange information, the estimated parameters' uncertainty can be reduced. This generally results in more accurate statistical estimations. Furthermore, the hierarchy is highly valuable for providing a Bayesian interpretation of the concept of shrinkage and random effects from the perspective of frequentists [5].
In general, complex models with higher parameters make it difficult to estimate the posterior distribution, which can-not be obtained directly or, if obtained, is not derived in a well-known form in Bayesian approaches. For this reason, the Markov chain Monte Carlo (MCMC) method, which generates samples from probability distributions proportional to the posterior distribution and infers, is widely used. MCMC delivers reasonably accurate results, yet because it involves a huge amount of calculation formulas, the computational cost grows extremely long as the amount of data increases or the model becomes more intricate. In this paper, we employ variational inference, a Bayesian inference approach that applies a variational method, to address the MCMC difficulty. [17] suggested variational inference as a method for discovering and inferring specific distribution which is closest to posterior distribution among easily obtained distributions. Because it applies a distribution that is relatively straightforward to calculate rather than a posterior distribution that is tricky to calculate, this method is suitable for use not just in high-dimensional complicated models like NBH, but also in large-scale data. However, obtaining a probability distribution that is similar to the posterior distribution for variational inference is frequently problematic. [20] suggested automated differentiation variable inference (ADVI) to overcome this, which is a method of transforming parameters into real coordinate space and approximating posterior distributions to Gaussian distributions. In this paper, we estimate the model described above using ADVI, which does not need to find a probability distribution similar to the posterior distribution.
The purpose of this paper is to contribute to the growth of associated sectors via the statistical analysis of radiation response data on semiconductors. To that purpose, we propose an advanced prediction model based on the NBH model. In addition, by properly presenting the computation algorithm for this method, it becomes relatively easy for readers to use. Our main contributions are summarized as follows: • We provide the applicability of statistical models for curve fitting of functional radiation response data. In addition to the NBH model for further sophisticated fitting, we describe the ADVI algorithm that is adaptable to a wide range of probabilistic models for computation. It also facilitates the implementation of the ADVI algorithm for the NBH model by providing Stan programming code for the convenience of readers.
• We demonstrate usefulness of radiation response data from the RRM modeling platform via data analysis on semiconductors. This can help us understand the properties of semiconductors while also encouraging innovation in radiation technology.
The rest of the paper is organized as follows. Section II presents preparations for building a model. Section III describes in detail the main model and its algorithm to be applied in this study, and the analysis of a real data set is presented in Section IV. Section VI summarizes the results of this paper and considers several possible directions for future research.

II. PRELIMINARIES A. GAUSSIAN-INVERSE WISHART PROCESS
G-IWPs are conjugate prior distributions for the mean and covariance functions of GPs. Here we give a briefly introduction about GP and IWP.
GPs are commonly used as a natural method for defining prior distributions on the space of functions with continuous domains in nonparametric Bayesian (NPB) inference [9], [10], [11]. Let f = (f (t), t ∈ T ) be a collection of function valued random variables at an any index set T . Then f is called a GP if the marginal distribution of any finitedimensional f (t 1 ), . . . , f (t m ), where m > 0 and t 1 , . . . , t m ∈ T , is a joint Gaussian distribution by the Kolmogorov's existence theorem [12]. The GP is completely determined by the mean function is a symmetric and positive definite Mercer kernel. We will hereafter write the GP as Yang et al. [13] and Yang et al. [14] proposed an IWP prior for the covariance function of the GP prior to reduce biases caused by misspecifying the covariance kernel and showed that the prior provides more flexible covariance estimation and accurate fitting of functional observations. The IWP is defined such that the finite-dimensional projection k(t, t ′ ) follows inverse-Wishart distribution [15] given by with δ + m − 1 degrees of freedom. Here δ > 4 denotes a positive-integer valued shape parameter, tr(·) is the trace, and ψ(t, t ′ ) is a symmetric and positive definite scaling parameter. We denote the IWP as

B. VARIATIONAL APPROXIMATION
Variational approximation (VA) methods (or variational Bayes methods) [16], [17], [18] are deterministic algorithms, which are alternatives to simulation-based MCMC algorithms, approximating the posterior distribution given by where θ is a parameter of interest, p(θ) is the prior distribution, and p(y | θ) is a likelihood of data y. VA methods assume a class of tractable distributions, Q = {q(θ); θ ∈ }, and pick the best approximation distribution (called as variational distribution) minimizing the Kullback-Leibler (KL) divergence [19] in that class as follows: where the KL divergence is non-negative and defined by Generally, it is difficult to minimize the KL divergence because of involving the true posterior distribution. Instead, we seek the variational distribution maximizing the so-called variational lower bound L(q) obtained from the following identity where log p(y) = log p(θ)p(y | θ)dθ is the log marginal likelihood function and can be often used as the evidence of the model. Note that the variational lower bound is also called the evidence lower bound (ELBO).

III. NBH MODEL
In this section, we present a NBH model and develop a VA algorithm for approximating the posterior distribution. The model is the NBH model with G-IWP prior proposed by [13] and the algorithm is based on the automatic differentiation variational inference (ADVI) [20], [21] that is the VA method with Gaussian distributions as variational distributions of parameters of interest.

A. MODEL DESCRIPTION
We now describe the NBH model with G-IWP prior (hGIWP). To do this, let {Y i (·); i = 1, . . . , N } be N independent radiation response trajectories and the ith radiation response trajectory Y i (·) has m measurements on the radiation level t i = {t i1 , . . . , t im } with t ij ∈ T for j = 1, . . . , m and t i1 < t i2 < · · · < t im . We assume that the curve of the ith radiation response is modeled as where the errors {ϵ i (·)} are independent and identically distributed (i.i.d.) Gaussian random variables with mean zero and variance σ 2 > 0, i.e., ϵ i (t ij ) ∼ N (0, σ 2 ), and the mean curves {f i (·)} are defined by t l )).
The hGIWP model assigns the G-IWP prior for the mean function µ(·) and covariance function k(·, ·) as follows: where µ = µ(t i ), µ 0 = (µ 01 , . . . , µ 0m ) ⊤ , c is a positive scale parameter that is set at a fixed value, and the scaling parameter VOLUME 11, 2023 is defined by the Matérn covariance kernel [9] ψ(t i , t l ) = σ 2 Here σ 2 z and ρ are the positive scale parameters, ν is the order, and K ν (·) is the modified Bessel function of second kind [22]. Note that the parameters ρ and ν are often fixed at certain values to estimate a stable covariance. For more details of the Matérn covariance kernel, see [9], [13], [14], and references therein.
For the scale parameter σ 2 of the model (4) and the hyperparameter σ 2 z of the Matérn covariance kernel , the hGIWP model assumes an inverse-gamma distribution and a gamma distribution as the prior distributions, respectively: IGa(a, b) denotes the inverse-gamma distribution with shape parameter a and scale parameter b and Ga(a z , b z ) is the gamma distribution with shape parameter a z and rate parameter b z .
To facilitate posterior computation, we represent the hGIWP model (4) with the following equivalent hierarchical model: where I m is a m × m identity matrix and t = ∪ N i=1 t i is the pooled grid.

B. POSTERIOR INFERENCE WITH GVA ALGORITHM
We introduce a variational algorithm, which is based on ADVI method, for approximating the posterior distribution of the hGIWP model (6) proportional to, up to a constant that does not depend on the parameters of interest, Let θ = (f ⊤ 1 , · · · , f ⊤ N , µ ⊤ , K , σ 2 z , σ 2 ) ⊤ be the parameters of interest and p(Y , θ) be the joint distribution (7) Let T (·) be also a one-to-one differentiable function transforming the support of parameters θ to the real coordinate space R D , where D is the number of parameters. Then, the transformed joint distribution is given as (8) where J T −1 is the Jacobian of T −1 (·) and det(·) is the determinant of a matrix.

IV. A REAL CASE STUDY: SEMICONDUCTOR DATA
In this section, we present the performance of the ADVI algorithm for the hGIWP model to a competitor, the   No-U-Turn Sampler (NUTS) [28] that is an extension to a simulation-based Hamiltonian Monte Carlo (HMC) method [29]. The NUTS uses a Hamiltonian dynamics simulation based on numerical integration (leapfrog) without tuning the step size parameter and number of steps to generate a candidate and then accept one by performing a Metropolis acceptance-rejection step. To this end, we use a dataset obtained from Radiation Response Map Modeling Platform developed by Korea Atomic Energy Research Institute (KAERI). The address of the webpage is https://www.kaeri.re.kr/rrm/#!/home. We set hyperparameters c = 1, δ = 5, a z = 1.515, a = 2.418, and b = b z = 1 using a heuristic empirical Bayesian method as described in [13]. For inference via the NUTS algorithm, we run simultaneously four Markov chains with different sets of initial values and use 5,000 posterior samples obtained by saving every 5th sample after a 20,000 burn-in period for each chain. The total number of posterior samples for inference is 40,000. We check the convergence of the algorithm using the potential scale reduction factor (PSRF) introduced in [30] and its improved version proposed in [31]. We run the NUTS and ADVI algorithms on an Intel 10-Core Xeon W CPU@3GHz 128 GB Ram. Figure 1 and Figure 2 show the histograms for the PSRF values of [30] and the improved PSRF values of [31], respectively. From these plots, we can see that the Markov chains are converged and mixed well. We also computed the effective sample size (ESS) in order to check the efficiency of the posterior samples and found that the minimum of ESS is about 3,000. It means that the posterior estimates obtained from these samples are reliable. We display the boxplot of ESS in Figure 3. Table 1 shows summary statistics (mean and standard deviation) of R-sqaured (R 2 ) values and root mean squared error (RMSE) values over radiation levels (see, e.g., [33]) and Figure 4 displays results of fitting functions on the radiation-response of three semiconductors. Here the ID number indicates semiconductor. In the figure, the solid grey lines denote real observed functions. The solid and dotted red lines represent the curves and 95% credible intervals fitted by VOLUME 11, 2023   the NUTS method and the solid and dotted green lines are the fitted curves and 95% credible intervals approximated from the ADVI algorithm. These results indicate that the ADVI algorithm estimates the mean curves considerably accurately as the NUTS algorithm but underestimates the uncertainty (This same phenomenon appears in [32]). Table 2 shows the computation times in seconds. From the table, we can see that the ADVI method is faster than the NUTS method. That is, the ADVI algorithm reduces the computational demands dramatically. These results suggest that the fully factorized ADVI algorithm can be used to quickly estimate the mean curves when uncertainty is not important.

V. CONCLUSION
In this paper, we have presented a nonparametric Bayesian hierarchical model with the Gaussian-Inverse Wishart process prior (hGIWP) and developed a Gaussian variational approximation (GVA) algorithm based on the automatic differentiation variational inference (ADVI) method. We have compared the GVA algorithm with the well-known No-U-Turn sampler (NUTS), a modified version of the Hamiltonian Monte Carlo (HMC) method, by fitting functions on the radiation-response of semiconductors. The results suggest that the GVA algorithm can be used for fitting the mean curves of functional data and when quick calculations are required because of reducing the computational demands.

VI. FUTURE WORK
Since the fully factorized (mean-field) GVA algorithm tends to underestimate the uncertainty, the additional approaches are needed to calculate accurate posterior variances (see, e.g., [34]). This is an important research area for future work.

APPENDIX A
In this appendix, we give the residual plots for checking normal error assumptions in model (4) of Section III. The residual plots ( Figure 5) below look fine in that there do not appear to be any strong patterns.

APPENDIX B
We provide a Stan code for the hGIWP model. JAEOH KIM received the Ph.D. degree in statistics from Korea University, in 2018. He is currently an Assistant Professor with the Department of Data Science and an Adjunct Professor with the Department of Statistics, Inha University. His research interests include data mining, data science, and statistical computation and its applications. VOLUME 11, 2023 HO-JIN JUNG received the M.S. degree in biology from Konkuk University, in 2000. She is currently a Principal Researcher with Insilicogen Inc. Her research interests include bioinformatics, and data science and its applications.
SEUNG-WON SEO received the M.S. degree in bioinformatics from the University of Science and Technology, in 2013. He is currently a Deputy Principal Developer with Insilicogen Inc. His research interests include bioinformatics, data science, and bio database system and its applications.
JI-MAN HONG received the M.S. degree in bioinformatics from Pusan National University, in 2007. He is currently a Deputy Principal Researcher with Insilicogen Inc. His research interests include bioinformatics, data science, multi-omics, and machine learning and its applications.
HYOUNG-WOO BAI received the Ph.D. degree in pharmacology from the University of Kansas Medical Center, in 2010. He is currently a Principal Researcher with the Department of Radiation Science, Korea Atomic Energy Research Institute, and a Professor with the Department of Radiation Science and Technology, University of Science and Technology. His research interest includes radiation biology and its applications.
SEONGIL JO received the Ph.D. degree in statistics from Seoul National University, in 2014. He is currently an Associate Professor with the Department of Statistics and an Adjunct Professor with the Department of Data Science, Inha University. His research interests include Bayesian statistics, data science, and spatio-temporal models and its applications.