Aerodynamic probe calibration using Gaussian process regression

During the calibration of an aerodynamic probe, the correlation between the present representative flow quantities of the fluid and the measurand is determined. Thus, a large number, sometimes several thousands, of different calibration points are set and measured, making this a very time-consuming process. The differences in the calibration data of similarly constructed probes are very small. With the help of statistical methods, more precisely Gaussian process regression, this similarity is exploited in order to use existing calibration data of different probes, reducing the calibration time while retaining sufficient reconstruction accuracy. Data from single-wire hot-wire probes and from five-hole probes are tested and show a very high reconstruction accuracy compared to the full calibration data set. The number of calibration points in the five-hole probe case is reduced by at least one order of magnitude with comparable reconstruction accuracy.


Introduction
Experimentally obtained data of flow phenomena are still of great interest for academic and industrial research, despite the ongoing development and optimization of CFD (computational fluid dynamics) simulations. Furthermore, experimental results often serve as a data basis for the validation of CFD solvers. The most commonly used intrusive measurement methods are hot-wire probes and multi-hole pressure probes. Even though hot-wire constant temperature anemometry (CTA) is known for its high temporal resolution, hot-wire probes are characterized by very low mechanical robustness when used in harsh environments. In contrast, multi-hole pressure probes are inexpensive to manufacture and easy to operate. However, attention should be paid to the fact that meaningful and accurate measurement results can only be obtained if the probe has been calibrated under representative conditions before use. During the calibration, different combinations of flow parameters (inflow velocity and flow angles) are set in a calibration free-jet wind tunnel [1,2]. The corresponding measurement data are recorded with the aerodynamic measurement probe. Depending on the probe and the expected reconstruction accuracy, this calibration process sometimes comprises several hundreds or thousands of calibration points [3]. Thus, the calibration is a very time-consuming step. Hot-wire probes, for example, require a recalibration before each measurement campaign or after damage to the wire and rewelding.
When analyzing the calibration data, it is noteworthy that the (multivariate) functions are often very alike in shape. Since aerodynamic probe calibrations can be described as regression problems, besides a standard polynomial regression approach, methods from Bayesian statistics can be used. Gelman et al describe the basics of Bayesian statistics [4]. Furthermore, Rasmussen applies Gaussian processes (GPs) to machine learning problems, both for regression and classification [5]. Gaussian processes have been applied in various research fields in the literature: in geostatistics, Gaussian process regression is better known as kriging [6]. During the application of GPs, the placement of test point locations is crucial. Krause et al describe an optimization routine for the placement of the locations in Gaussian process problems [7]. Furthermore, in engineering applications, Gaussian process regression has been used for wind turbine power curve model prediction [8]. Its application to the calibration of spectroscopic sensors has also been shown [9]. In aerodynamic metrology, Garcia-Ruiz et al show the application of Gaussian processes for hot-wire temperature compensation [10]. Moreover, Agrawal et al introduce a non-linear regression approach to minimize recalibration for non-thermal drifts [11].
Since aerodynamic probe calibrations a) require a regression within all measured data points, and b) show similarity among themselves, the idea arises of applying Bayesian statistics, viz. Gaussian process regression, to aerodynamic probe calibration. Hence, the following governing hypotheses can be identified and are investigated within this paper:
Hypothesis 1 The knowledge of former calibration data of various differently shaped probes can be transferred to future probe calibrations by incorporating the similarity among themselves.
Hypothesis 2 Bayesian statistics and machine learning algorithms, more precisely Gaussian process regression, are applicable on aerodynamic probe calibration data.
Hypothesis 3 The number of calibration points needed can be significantly reduced while still showing acceptable reconstruction accuracy, and thus, leading to a reduction of time consumption.
Especially under the assumption that the first two hypotheses hold, a confirmation of the third hypothesis could result in a significant time saving in set-up costs of a measurement campaign with aerodynamic probes.
In this paper, the usage of Gaussian process regression for the reduction of calibration points is described. Hence, the theoretical background of Gaussian processes is outlined first in section 2. Moreover, the calibration process for hot-wire anemometry and multi-hole pressure probes is described in section 3. In the last part of the paper (see section 4), investigations on the applicability of Gaussian process regression to real calibration data are demonstrated. The procedure is introduced with a generic example. Furthermore, apart from single-wire hot-wire data, the focus lies on the application of Gaussian process regression to the calibration of multi-hole pressure probes. Thereby, several GPs have to be considered simultaneously.

Gaussian process regression
In this section, the theoretical background to Gaussian process regression is explained. In section 2.1, the principles of Bayesian statistics are outlined. The theory of Gaussian process regression is introduced in section 2.2. The upcoming sections are based on the more detailed discussion given by Rasmussen and Williams, who address Gaussian processes in machine learning applications [5,12]. Furthermore, general information on pattern recognition and machine learning is given by Bishop [13].

Theoretical background to Bayesian statistics
Engineering problems are often characterized by a lack of available information. This is where probability models provide a remedy when it comes to dealing with the challenge of missing information. In Bayesian statistics, the model based on existing data can successively be improved with new information by inference. The question of how probabilities change due to new information is thus brought into a mathematical framework by using the Bayesian theorem. The more general question of what can be inferred about the population based on samples is hereby answered. Hence, the Bayesian formalism introduces various probabilities, which are described in the following: A prior probability P(H) has to be specified, expressing the belief about the hypothesis before incorporating observations. The likelihood probability P(E|H) is the probability of the observations given the hypothesis. The marginal likelihood or evidence P(E) is the normalizing constant. The posterior combines the likelihood and the prior and takes all information that is known into account. The posterior P(H|E), the probability of the hypothesis given the evidence, can be calculated by Bayes' rule:

P(H|E) = P(E|H) P(H) / P(E).
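The update expressed by Bayes' rule can be sketched in a few lines; the scenario and all numbers below are illustrative assumptions (a probe that is either well calibrated, H, or not, given a passed plausibility check, E), not values from the paper.

```python
# Hypothetical discrete Bayes' rule example: P(H|E) = P(E|H) P(H) / P(E),
# with P(E) expanded over the two hypotheses H and not-H.

def posterior(prior_h, likelihood_e_given_h, likelihood_e_given_not_h):
    """Return P(H|E) via Bayes' rule for a binary hypothesis."""
    evidence = (likelihood_e_given_h * prior_h
                + likelihood_e_given_not_h * (1.0 - prior_h))  # P(E)
    return likelihood_e_given_h * prior_h / evidence

# A passed check that is more likely under H than under not-H raises the
# belief above the 0.5 prior.
p = posterior(prior_h=0.5, likelihood_e_given_h=0.9, likelihood_e_given_not_h=0.2)
print(p)
```

The posterior here evaluates to 0.45/0.55 ≈ 0.82, illustrating how new evidence shifts the prior belief.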

Gaussian processes
In the context of Bayesian statistics, multiple machine learning algorithms have been developed, one of which is Gaussian process regression. As noted in the introductory comments on Bayesian statistics, new information can be used to infer a new posterior Gaussian process model, which incorporates the observations by updating the initial/prior Gaussian process. In contrast to basic fitting methods, the expected order of the approximation does not need to be specified beforehand for Gaussian process regression. A Gaussian process describes a distribution over functions and is fully characterized by the mean function m(x) and the covariance function k(x, x′) of a real process f(x).
For noise-free observations, the Gaussian process can be written as:

f(x) ∼ GP(m(x), k(x, x′)).

A distinct finite number n of locations X is considered further, and the mean µ and the covariance Σ can be expressed as:

µ = m(X), Σ = Σ(X, X) = k(X, X).

The joint distribution of the n known training case function values, f, and a set of n* function values f* corresponding to the test set inputs X* gives:

[f, f*]ᵀ ∼ N([µ, µ*]ᵀ, [[Σ, Σ_f,*], [Σ_f,*ᵀ, Σ_*,*]]).

Here, for example, Σ_f,* = Σ(X, X*) represents the n × n* matrix of the covariances evaluated at all n training points X and n* test points X*. The predictive joint posterior distribution can be used to sample function values f* at the test inputs X* by evaluating the mean and covariance matrix:

f̄* = µ* + Σ_f,*ᵀ Σ⁻¹ (f − µ),
cov(f*) = Σ_*,* − Σ_f,*ᵀ Σ⁻¹ Σ_f,*.

The covariance of a Gaussian process random variable can be described by the kernel or covariance function and relates one observation to another. For a valid kernel function, the kernel matrix Σ = k(X, X) has to be positive definite. This implies a symmetric covariance matrix. The prediction of the Gaussian process strongly depends on the choice of the covariance function. Instead of fixing it to a specific shape, usually a parametric family of functions is selected and its parameters are optimized by inference with the training data. In the following, two different families are introduced. Afterwards, it is shown briefly how the hyperparameters can be optimized.
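The noise-free conditioning equations can be sketched for a 1D case; the zero mean function, the squared exponential kernel with unit hyperparameters, and the training data below are illustrative assumptions.

```python
import numpy as np

# Minimal sketch of the noise-free GP posterior: mean = K(X*,X) K(X,X)^-1 f,
# cov = K(X*,X*) - K(X*,X) K(X,X)^-1 K(X,X*), assuming a zero prior mean.

def kernel(a, b):
    """Squared exponential kernel with sigma_f = sigma_l = 1 (assumption)."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2)

X = np.array([0.0, 1.0, 2.0])   # training inputs
f = np.sin(X)                   # noise-free training targets (toy function)
Xs = np.array([0.5, 1.5])       # test inputs

K = kernel(X, X)
Ks = kernel(Xs, X)
Kss = kernel(Xs, Xs)

mean = Ks @ np.linalg.solve(K, f)          # posterior mean at Xs
cov = Kss - Ks @ np.linalg.solve(K, Ks.T)  # posterior covariance at Xs
```

Conditioning at the training inputs themselves reproduces f exactly with (numerically) zero variance, which is the defining property of noise-free GP interpolation.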

Kernel function families.
One of the most basic kernel function families is the squared exponential kernel, or Gaussian kernel (see figure 1 (top)):

k_SE(x, x′) = σ_f² exp(−‖x − x′‖² / (2σ_l²)).

Here, σ_f denotes the signal standard deviation, or the maximum allowable covariance between two different observations. Further, σ_l is the characteristic length scale, which defines the range of influence of two different observations. Another commonly applied kernel function is the Matérn 3/2 covariance function (see figure 1 (bottom)). This kernel function is also used for the GPs of the CTA and the multi-hole pressure probe data in sections 4.2 and 4.3, respectively:

k_M3/2(x, x′) = σ_f² (1 + √3 ‖x − x′‖ / σ_l) exp(−√3 ‖x − x′‖ / σ_l).

Figure 2 shows a representation of the kernel matrix for both the Gaussian and the Matérn 3/2 kernels with randomly chosen hyperparameters σ_f = 1.0 and σ_l = 1.0. Comparing both kernel functions, it can be seen that the squared exponential kernel has a broader peak than the Matérn kernel but, in contrast, flattens out earlier with increasing distance ‖x − x′‖₂.
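The two kernel families can be written as functions of the distance r = ‖x − x′‖; a minimal sketch, with the same hyperparameters σ_f = σ_l = 1 as in figure 2:

```python
import numpy as np

# Both kernels equal sigma_f**2 at r = 0; the squared exponential is larger
# near the origin (broader peak) but decays faster at large distances.

def squared_exponential(r, sigma_f=1.0, sigma_l=1.0):
    """Gaussian kernel as a function of the distance r = ||x - x'||."""
    return sigma_f ** 2 * np.exp(-0.5 * (r / sigma_l) ** 2)

def matern32(r, sigma_f=1.0, sigma_l=1.0):
    """Matern 3/2 kernel as a function of the distance r."""
    s = np.sqrt(3.0) * r / sigma_l
    return sigma_f ** 2 * (1.0 + s) * np.exp(-s)
```

Evaluating both at, e.g., r = 0.5 and r = 2.0 reproduces the behaviour described above: the squared exponential dominates at small distances and the Matérn 3/2 at large ones.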
The set of hyperparameters is often pooled in the vector θ. For the previously defined kernel functions, θ is defined as:

θ = [σ_f, σ_l].

Depending on the mean and kernel function chosen in the prior step, different free parameters have to be set. The stated equations hold true for noise-free data. In real-world problems, observations are subject to noise. This can be expressed by additional terms in the GP formulation. The measured value y with noise is defined as y = f(x) + ε. Here, the noise ε is Gaussian distributed with a noise variance σ_n². Furthermore, the covariance also changes to Σ_y = Σ_f + σ_n² I, where Σ_f is the covariance matrix for noise-free observations.

Training of hyperparameters.
The evaluation of the marginal likelihood function p(y|X, θ) is the basis for the training of the hyperparameter vector θ. This can be done by maximizing the log likelihood function with efficient gradient-based algorithms, e.g. conjugate gradient solvers [14]. The log likelihood function L for multivariate Gaussian distributions is given by:

L = log p(y|X, θ) = −½ yᵀ Σ_y⁻¹ y − ½ log |Σ_y| − (n/2) log 2π.

It has to be noted that log p(y|X, θ) is a non-convex function. Hence, it can have multiple maxima. Numerous methods in the literature cover the determination and optimization of this problem by inverting Σ_y efficiently while reducing the computational costs of the O(n³) computation.
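One common way to evaluate L without forming the explicit inverse and determinant is a Cholesky factorization; a minimal sketch (this is standard practice, not necessarily the paper's exact implementation):

```python
import numpy as np

# log p(y | X, theta) for a zero-mean GP with noisy covariance K_y = Sigma_y.
# The Cholesky factor gives both K_y^-1 y (via two triangular solves) and
# log |K_y| (via the diagonal), avoiding an explicit inverse.

def log_marginal_likelihood(K_y, y):
    n = len(y)
    L = np.linalg.cholesky(K_y)                          # K_y = L @ L.T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))  # alpha = K_y^-1 y
    log_det = 2.0 * np.sum(np.log(np.diag(L)))           # log |K_y|
    return -0.5 * y @ alpha - 0.5 * log_det - 0.5 * n * np.log(2.0 * np.pi)
```

For K_y = I and y = 0 the expression reduces to −(n/2) log 2π, which is a convenient sanity check.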
Furthermore, regarding the non-linear optimization, the gradient of the log likelihood function is needed as well. It is obtained from the partial derivatives of the marginal likelihood w.r.t. the hyperparameters θ_j (here, Tr is the trace of the matrix):

∂L/∂θ_j = ½ yᵀ Σ_y⁻¹ (∂Σ_y/∂θ_j) Σ_y⁻¹ y − ½ Tr(Σ_y⁻¹ ∂Σ_y/∂θ_j).

Sparse GP for large data sets.
In the case of a high number of input/training data sets, GP models experience a high computational effort due to matrix inversions in the inference step. Considering n training points x, an exact inference via the Gaussian likelihood method is of O(n³) for the standard n × n matrix inversion. In order to reduce the computational load, there are different methods to approximate the covariance matrix. Instead of using the full covariance matrix Σ, an approximate matrix Σ̃ is used for the inference. Quiñonero-Candela and Rasmussen give an overview of different methods [15]. Generally, the approximation methods work with a set of m inducing points u at a reduced computational load of O(mn²). In the GPML Matlab toolbox, the Fully Independent Training Conditional (FITC) approximation is applied, which is briefly explained in the following. The approximated covariance matrix Σ̃ ≈ Σ can be expressed as [15-17]:

Σ̃ = Σ_u (Σ_uu + σ_nu² I)⁻¹ Σ_uᵀ + diag(Σ − Σ_u (Σ_uu + σ_nu² I)⁻¹ Σ_uᵀ),

with σ_nu being the noise variance from the inducing points. The diagonal matrix diag(A) comprises the diagonal elements of A. Besides the known n × n covariance matrix Σ, the formula also uses the n × m covariance matrix between the test points and the inducing points Σ_u and the m × m covariance matrix between the inducing points Σ_uu.
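The structure of the FITC approximation, a low-rank term from the inducing points plus a diagonal correction, can be sketched as follows. The kernel, the point locations, and the omission of the inducing-point noise term σ_nu are simplifying assumptions for illustration.

```python
import numpy as np

# Sketch of the FITC covariance: Q = Sigma_u Sigma_uu^-1 Sigma_u^T is the
# low-rank Nystroem part; the diagonal correction restores the exact
# diagonal of Sigma, so predictive variances at the training points match.

def se_kernel(a, b):
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2)

def fitc_covariance(X, U, kernel):
    """Approximate Sigma = k(X, X) via m inducing points U (m << n)."""
    K_nn = kernel(X, X)
    K_nu = kernel(X, U)                        # n x m cross covariance (Sigma_u)
    K_uu = kernel(U, U)                        # m x m inducing covariance (Sigma_uu)
    Q = K_nu @ np.linalg.solve(K_uu, K_nu.T)   # low-rank Nystroem part
    return Q + np.diag(np.diag(K_nn - Q))      # restore the exact diagonal

X = np.linspace(0.0, 4.0, 5)    # n = 5 training locations (toy example)
U = np.array([0.0, 2.0, 4.0])   # m = 3 inducing locations
Sigma_approx = fitc_covariance(X, U, se_kernel)
```

By construction, the approximation agrees with Σ on the diagonal, and it becomes exact when the inducing points coincide with the training points.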

Calibration of aerodynamic probes
The application of aerodynamic probes in experiments under unknown flow conditions requires a calibration of the probe in a known free-jet calibration wind tunnel in advance. Within the context of the aerodynamic/spatial calibration of a probe, the correlation between the mean free-jet flow conditions and the measured quantity x_meas,c at the probe is determined. The index c denotes values in the calibration step, whereas the index T will be used for values in the reconstruction of test points.
x_meas can, for example, be the voltage measured by a hot-wire probe or the pressure recorded by a multi-hole pressure probe at the location of its pressure transducer. During the calibration, different combinations of the free-jet velocity U_∞,c and the flow angles α_c and β_c are set in the free-jet calibration wind tunnel, see figure 3. In order to determine the actual flow conditions at the probe tip in an experiment, the measurand x_meas,T must be post-processed with the stored calibration data. In the literature, there are several methods for how the calibration data can be used to reconstruct the flow field properties. The most commonly used one is an interpolation approach, which is applied to calculate the flow data at the probe tip based on the acquired measurements. In the following sections, the calibration and reconstruction methods for both measurement techniques, hot-wire anemometry in section 3.1 and multi-hole pressure probes in section 3.2, are briefly introduced.

Hot-wire probes
The calibration of a CTA hot-wire probe determines a relationship between the CTA output and the flow velocity U_∞. It is performed by exposing the probe to a known flow and recording the voltages E. A curve fit through the acquired points (E, U_∞) is used when converting data sets from voltages to velocities, see figure 4. CTA is based on the cooling effect of the flow on the wire (convective heat transfer).
The current/voltage that has to be provided by the anemometer to the wire to keep the wire at a constant temperature is measured. The wire is one arm of a Wheatstone bridge and has to be balanced before being calibrated. Hence, the bridge voltage E is a direct measure of the flow velocity U_∞. Furthermore, it is important to monitor the air temperature during the probe calibration. If it varies from calibration to measurement, it is necessary to correct the CTA data for temperature variations, see [18-20]. The gold-plated tungsten wire with a length of approximately 1.25 mm and a diameter of 5 µm is welded between two prongs. The sensor temperature coefficient is 0.0036 K⁻¹ and an overheat ratio of a = 1.8 is applied within all calibrations. For X-wire or triple-wire probes, a directional calibration has to be performed. For reasons of brevity, this paper does not deal with the directional calibrations for hot-wire probes in detail. Henceforth, solely single-wire probes are considered. More details on multi-wire probes can be found in the literature [21]. For single-wire probes, the calibrated data points shown in figure 4 can be approximated with either a polynomial or a power-law curve fit:

U = a₀ + a₁ E + a₂ E² + a₃ E³ + a₄ E⁴,
E² = A + B Uⁿ.

Here, θ_poly = [a₀, a₁, a₂, a₃, a₄] are the calibration coefficients for the polynomial fit, whereas A, B and n represent the calibration constants for the power-law fit θ_pow = [A, B, n]. The exponent n usually lies in the region of n = (0.4, 0.55) and is adapted to the calibration data set.
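Both single-wire fits can be sketched in a few lines. The synthetic "calibration" data below are generated from an assumed power law (King's law with A, B, n chosen for illustration), not from the paper's probes.

```python
import numpy as np

# Sketch of the two single-wire fits: a 4th-order polynomial U = f(E) and the
# power law E^2 = A + B * U^n. All numerical values are assumptions.

A_true, B_true, n_true = 1.4, 0.9, 0.45
U_cal = np.linspace(2.0, 40.0, 15)                  # calibration velocities, m/s
E_cal = np.sqrt(A_true + B_true * U_cal ** n_true)  # synthetic bridge voltages

# Polynomial fit: U = a0 + a1*E + a2*E^2 + a3*E^3 + a4*E^4
theta_poly = np.polyfit(E_cal, U_cal, 4)

def reconstruct_poly(E):
    return np.polyval(theta_poly, E)

def reconstruct_power(E, A=A_true, B=B_true, n=n_true):
    """Invert the power law: U = ((E^2 - A) / B)^(1/n)."""
    return ((E ** 2 - A) / B) ** (1.0 / n)
```

On this smooth synthetic data, inverting the power law recovers the velocities exactly, and the quartic polynomial reproduces them to within a small residual over the calibrated range.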

Multi-hole pressure probes
The working principle of pressure probes relies on the stagnation of the flow around the probe. At the stagnation point, the total pressure p_t is equal to the sum of the static pressure p_s and the dynamic pressure q. Multi-hole probes measure the total pressure of the flow at various locations at the probe tip. By measuring all pressures and setting them into relation, the flow properties at the probe tip can be deduced. For a five-hole probe, the pressures p1-p5 are recorded and post-processed. Both pitch (α) and yaw (β) angles (see figure 5) can be resolved.
To gather the calibration data set, different angle combinations and free-stream velocities are set in the free-jet wind tunnel. Figure 6 shows an exemplary calibration grid for a multi-hole pressure probe. For each calibration velocity, several hundreds or sometimes more than a thousand angle combinations are calibrated. The maximal calibration angle of a five-hole probe, and hence its reconstruction range, is near ±60°.
Non-dimensional calibration coefficients can be calculated from the acquired pressures, which form the basis for the interpolation. The interpolation routines can be divided into global and local interpolations, depending on whether all calibration points or only points in the surroundings with similar calibration coefficients are used. In the local interpolation method, the calibration data are divided into a low- and a high-angle regime, see [22]. The pressure port with the highest measured pressure determines the set of calibration coefficients used for reconstruction. In the event that multiple pressure ports see similar pressures within a given range, overlap segments are defined where the coefficients are calculated for each dominant pressure port. For the low-angle regime, where the central port p1 measures the highest pressure, the coefficients are defined as follows: q denotes the pseudo dynamic pressure, which is used to non-dimensionalize the coefficients. In the high-angle regime, where one of the circumferential ports p_i records the highest pressure, the coefficients read accordingly. Hereby, p+ and p− denote the pressures at the circumferential pressure ports in the clockwise and counter-clockwise directions. During the reconstruction, the test point pressure data vector p_T is recorded. The subscript T indicates the values at the test point. The non-dimensional coefficients b_α,T, b_β,T or b_θ,T, b_ϕ,T for the low- and high-angle regimes are calculated as defined above. In the following step, the quantities A_t,T, A_s,T and α_T, β_T or θ_T, ϕ_T are determined by the interpolation algorithm as functions f(b_α,T, b_β,T) or f(b_θ,T, b_ϕ,T). Furthermore, the Mach number M and the deduced velocity magnitude U are calculated as functions of p_t and p_s. The velocity components can be expressed by using the flow angles α and β.
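A low-angle coefficient computation can be sketched as follows. The exact coefficient definitions and the port numbering (p1 central, p2-p5 circumferential, opposite ports paired) are assumptions in the style of common five-hole probe calibration schemes and may differ from the paper's own definitions.

```python
import numpy as np

# Hypothetical low-angle regime coefficients for a five-hole probe: angle
# coefficients from differences of opposite circumferential ports,
# non-dimensionalized by a pseudo dynamic pressure q.

def low_angle_coefficients(p):
    """p = [p1, p2, p3, p4, p5] with p1 the central port (assumed ordering)."""
    p1 = p[0]
    p_mean = np.mean(p[1:])        # mean of the circumferential ports
    q = p1 - p_mean                # pseudo dynamic pressure
    b_alpha = (p[3] - p[4]) / q    # pitch coefficient (assumed port pairing)
    b_beta = (p[1] - p[2]) / q     # yaw coefficient (assumed port pairing)
    return b_alpha, b_beta, q
```

For a symmetric pressure pattern both angle coefficients vanish, while an imbalance between one pair of opposite ports shows up only in the corresponding coefficient.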

Exploitation of Gaussian process regression for the calibration of aerodynamic probes
In this section, the findings from section 2 on the fundamentals of Gaussian process regression are applied to calibration data of aerodynamic probes. Using an introductory example in the following section 4.1, the procedure of the Gaussian process regression for the reduction in the number of calibration points will be demonstrated, see figure 7. Thereby, the application of the GPML Matlab toolbox implementation by Rasmussen and Nickisch [23] and the choice of the parameters will be discussed. In section 4.2, single-wire CTA calibration data are examined and the applicability of Gaussian process regression to real aerodynamic probe data is shown. As a final demonstration of the approximation technique for aerodynamic probe data, the more complex calibration of a five-hole pressure probe is explained (see section 4.3).

An introductory example
In this introductory example, the Gaussian process regression procedure from the flow chart in figure 7 is explained step by step using generically generated data. After starting the GPML Matlab toolbox, the available calibration data are read in first. Here, the ten different input data sets are displayed in figure 8 (left). Since the number of input data points is small, the normal GP approach is applied. A squared exponential kernel function and the Gaussian likelihood formulation are chosen for inference. The prior GP(0, k) is now initialized with the initial hyperparameters in table 1, and thereafter, a posterior GP is trained with the input data (see figure 8 (right)). In the optimization step, the hyperparameter vector θ is optimized by maximizing the log marginal likelihood as described in section 2.2. The optimized values in table 1 show that the initially chosen values vary widely from the optimized hyperparameters. The location of the first supporting point, viz. the location of the first point to be calibrated in the new probe calibration, is determined. This can be done manually or, e.g. by choosing the location of the highest predictive output variance, which is available as an output vector of the gp-routine. At this point, newly calibrated points of the probe under investigation are added to the GP regression approach. This is done in a while loop: as long as the termination condition is not fulfilled, an updated conditional GP is formed with the available supporting points from the new calibration and new locations for further supporting points are determined. If the termination condition is reached, a final GP is formed with all newly calibrated supporting points. The results of the last step serve as a new calibration curve for the probe and can be used for post-processing or visualization. In figure 9, the gp-routine outputs of the first three iterations and the final iteration are visualized.
Besides the mean and the standard deviation of the updated GPs, the test curve, which represents the probe calibration curve to be approximated, and the supporting points are shown. Already after two iterations, the output of the GP is close to the function to be approximated, y_test. After five iterations, the termination condition selected here is reached and the final GP is calculated. The GP output matches the test curve y_test almost exactly.
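The while loop described above can be sketched end to end: a GP is conditioned on the supporting points measured so far, the next point is placed at the location of the highest predictive standard deviation, and the loop stops once the maximum standard deviation falls below a margin. The kernel, its hyperparameters, the margin, and the toy target function are all illustrative assumptions.

```python
import numpy as np

# Sketch of the iterative updating loop from the flow chart in figure 7,
# assuming a zero-mean GP with a squared exponential kernel.

def se_kernel(a, b, sigma_f=1.0, sigma_l=0.5):
    return sigma_f ** 2 * np.exp(-0.5 * ((a[:, None] - b[None, :]) / sigma_l) ** 2)

def gp_predict(X, y, Xs, jitter=1e-8):
    """Posterior mean and standard deviation at the candidate locations Xs."""
    K = se_kernel(X, X) + jitter * np.eye(len(X))
    Ks = se_kernel(Xs, X)
    mean = Ks @ np.linalg.solve(K, y)
    var = np.diag(se_kernel(Xs, Xs) - Ks @ np.linalg.solve(K, Ks.T))
    return mean, np.sqrt(np.clip(var, 0.0, None))

def calibrate(f, grid, margin=0.05, max_iter=40):
    """Iteratively 'measure' f at the point of highest predictive std."""
    X = np.array([grid[0]])                # first supporting point
    y = f(X)
    for _ in range(max_iter):
        _, std = gp_predict(X, y, grid)
        if std.max() < margin:             # termination condition
            break
        x_next = grid[np.argmax(std)]      # location of the next supporting point
        X = np.append(X, x_next)
        y = np.append(y, f(np.array([x_next])))
    return X, y

grid = np.linspace(0.0, 2.0, 101)
X_sup, y_sup = calibrate(np.sin, grid)     # far fewer points than the full grid
```

On this toy target, the loop converges with a small fraction of the 101 grid locations actually "measured", mirroring the reduction in calibration points discussed in the text.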

Single-wire CTA-probes
In this test case, real aerodynamic probe calibration data of single-wire CTA probes are used. The input data set comprises 13 different single-wire calibrations. Each of the calibrations was conducted separately and is independent of the others. The bridge balancing and setup of the CTA were done in advance and the gain and offset voltages were set, respectively. Some calibrations were digitized with an A/D-converter analog input voltage range of up to 5 V and some with up to 10 V. Furthermore, the maximum calibrated air speed differs between the input data sets, with maximum velocities ranging from 10 m s⁻¹ up to 120 m s⁻¹. In conclusion, the input data sets are independent of each other and very heterogeneous.
In the first part of the hot-wire investigations, the regular GP regression methodology, as explained in the introductory example shown in figure 7, is applied to the hot-wire data. After first tests, the Matérn 3/2 kernel is chosen for the covariance and the initial hyperparameters in table 2 are chosen. For the sake of consistency and clarity, the velocity and voltage data are denoted as x and y, respectively. In figure 10 (left), all calibration data that are used for the determination of the initial GP (see figure 10 (right)) are shown. Figure 11 shows the updating process of the GP regression routine for various iterations. After eight iterations, the termination condition is reached. Figure 12 displays the maximum standard deviation of the updated GP after each iteration. It is used to determine the location of the next supporting point. The relative improvement between successive steps decreases, and after the 8th iteration the standard deviation lies below the margin for the termination condition. In figure 13, the hot-wire test calibration that is to be approximated by the Gaussian process regression is shown.
Here, the results of both fitting methods described in section 3.1 are shown as well.
In the second part of the hot-wire investigations, the influence of the initial determination of the hyperparameters on the outcome of the Gaussian process regression is studied. This is done in a sampling routine, similar to a Monte-Carlo-method (MCM) simulation. It is determined how many supporting points are needed to reach the termination criterion. Thereby, the three hyperparameters σ_l, σ_f and σ_n are sampled from three normal distributions N_σl(1000, 300), N_σf(10, 5) and N_σn(0.01, 0.0001), respectively. N_MCM = 10⁴ multivariate samples are drawn and tested. The termination criterion of each of the N_MCM GPs is set to a threshold RMS-difference value of ε_rms = 0.05 between the updated GP output and the test hot-wire calibration. Figure 14 shows the resulting number of supporting points needed to reach ε_rms for each MCM sample, as well as the initial and the updated hyperparameters.
The initialized hyperparameters show the expected Gaussian shape, and hence, the assumed number of MCM samples N_MCM = 10⁴ is appropriate, which is a prerequisite before discussing the output data. In the column for the optimized hyperparameters after the GPML optimization routine, it can be seen that the distributions are no longer fully Gaussian. The noise hyperparameter σ_n is now negligible, whereas σ_l is distributed around a mean of σ̄_l = 110 and σ_f around a mean of σ̄_f = 8.2. Furthermore, it can be seen that for most samples, only seven to ten supporting points are needed. In comparison to the full calibrations, which frequently comprise up to 30 calibration points, the introduced calibration approach applying a GP regression algorithm leads to a reduced number of calibration points and hence a reduction in calibration time.

Five-hole pressure probes
The calibration of five-hole probes results in four calibration surfaces for α, β, A_t and A_s as functions f(b_α, b_β). Hence, the results shown in this section are the outcomes of multiple, viz. four simultaneous, GP regressions. The determination of the supporting points of the updated GPs is evaluated globally, but the updating of every single GP itself is handled independently. All existing calibration data sets are read in first. They consist of data of 24 different probes, which are shaped differently. The probe stem shape varies between straight, L-shaped and cobra-shaped probe stems, and conical and hemispheric probe tips with varying diameters D are present. The overhang length for the L-shaped probes varies between (1.5, 9.5)·D. For some probes, calibration data for different inflow Mach numbers are available. In total, 45 input data sets in the Mach number range M = (0.024, 0.95) are used to train the initial GP. An overview of the different calibration data, including the range of diameters D, Reynolds numbers Re_D and Mach numbers M, is given in table 3. When concatenating the input data sets, with n = 47 577 calibration points, the number of points exceeds the limit for the standard GP regression due to the cost of the necessary O(n³) matrix manipulations. For this reason, the sparse approximation with m = 1681 locations (b_α | b_β) is applied as described in section 2.2, which in turn reduces the costs to O(mn²). The Matérn 3/2 kernel is used for the four GPs as the covariance function and is initialized with the initial hyperparameters in table 4. After initialization, the GP is updated with the sparse input values. Figure 15 shows the initial GPs for the four calibration surfaces. A visualization of the standard deviation as displayed in the preceding 1D cases is omitted for clearer visualization. In order to better compare the output of the GP regression, the probe under investigation was fully calibrated beforehand.
It is a straight five-hole probe with a hemispheric probe tip with a tip diameter of D = 3 mm. It was calibrated for three inflow velocities. Two of these calibration data sets are part of the input data sets and the remaining one is the test calibration, denoted either as full calibration or with the index *test. In the next step, optimized hyperparameters for the four GPs are found by optimizing the log marginal likelihood. The resulting hyperparameters, which remain unchanged during the upcoming updating routine, are gathered in table 4 as well. The computational costs for the initialization step and the optimization of the hyperparameters lie in the order of (a few) minutes on a state-of-the-art workstation and, hence, can be considered negligible in contrast to wind tunnel setup costs. In the updating procedure, supporting point locations are added to the GP. Due to the fact that four GPs have to be updated simultaneously, a criterion determining which GP contributes the next supporting point is defined. For each GP, the location of the maximum standard deviation of the updated GP, normalized by the standard deviation of the first iteration, is calculated. Depending on which of the four GPs experiences the highest normalized standard deviation, the location of the additional supporting point for the next iteration for all GPs is chosen. A termination criterion stopping the iterative updating process can be applied. For example, if the maximum normalized standard deviation or the RMS value of the sum of standard deviations for the four GPs falls below a pre-defined margin, the criterion is reached and the solution is converged. Since, in this example, the full calibration for the multi-hole probe was also done in order to compare it to the results of the GP, the number of iteration steps was fixed to 300 for the first investigations.
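The global supporting-point criterion described above can be sketched as follows; the flattened grid representation and the array shapes are assumptions for illustration.

```python
import numpy as np

# Sketch of the selection criterion for four simultaneous GPs: each surface
# reports its maximum predictive standard deviation, normalized by the value
# from its first iteration; the surface with the largest normalized value
# selects the (b_alpha, b_beta) grid location that all four GPs calibrate next.

def next_supporting_point(std_maps, std_max_initial):
    """std_maps: (4, n) predictive stds on a common flattened grid;
    std_max_initial: (4,) maximum stds from the first iteration."""
    normalized = std_maps.max(axis=1) / std_max_initial
    leader = int(np.argmax(normalized))           # GP with highest normalized std
    location = int(np.argmax(std_maps[leader]))   # grid index to calibrate next
    return leader, location
```

The normalization puts surfaces with very different absolute scales (angles versus pressure coefficients) on a comparable footing before the leader is chosen.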
After each iteration, the RMS error between the GP output data and the full calibration data is displayed in a semi-logarithmic plot in figure 16. In the rms(β_GP − β_test) plot, the RMS value is around 1° at the 10th iteration and drops below 0.2° after around 80 iterations. Figure 17 visualizes the GP outputs of the 80th iteration for the four calibration surfaces. Furthermore, the 80 supporting points and the full calibration test surfaces are shown. Visually, there are only very small noticeable deviations in the A_t and A_s plots. The errors between the α and β GP outputs and the full calibration surfaces seem negligible.
To quantify the visually perceived deviations in figure 17, a reconstruction of unknown test points is performed in the next step, also known as generalization. In this process, 47 test data points measured independently of the determination of the calibration surfaces are post-processed with the final GP output and the full calibration data. The test points comprise pressure measurements p_T of different angle combinations at a fixed Mach number of M = 0.1. Figures 18 and 19 show the results of the post-processing step conducted with the GP and the full calibration data. Besides a visualization of the reconstructed angles, a histogram shows the absolute angle errors in degrees, calculated with the true angle values set in the calibration wind tunnel as reference values. Table 5 gives an overview of the quantified results of the reconstruction: apart from the maximum absolute error among the test results, maxabs, the RMS errors rms and the standard deviations std are formed across all test points for the angles α and β as well as for the reconstructed Mach number M.
To conclude the investigations on the five-hole probe, a series of reconstructions with differently sized GP models is shown. The number of supporting points was increased in steps of 10 up to 300. The reconstruction accuracy for α and β, in terms of the RMS error of the GP, is displayed in the semi-logarithmic plot in figure 20. With an increasing number of supporting points, the RMS errors asymptotically approach the reference values of the full calibration reconstruction. Beyond 80 to 100 supporting points, only marginal improvements are noticeable. This leads to the conclusion that with the application of GP regression, the number of actually measured points needed to build a calibration surface capable of reconstructing with almost the same accuracy can be decreased by more than one order of magnitude. If lower requirements are placed on the reconstruction accuracy, the number of supporting points can be reduced even further. Hence, the calibration set-up costs, expressed by the number of measured points or the measurement time, could be reduced in this example by a factor of F = 1014/80 = 12.68 for high reconstruction accuracy (angle RMS below 0.15°) and by a factor of F = 1014/40 = 25.35 for lower reconstruction accuracy (angle RMS below 0.5°).
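The convergence study behind figure 20 can be emulated as follows. This is a sketch under stated assumptions: a synthetic calibration surface and random subsets stand in for the real 1014-point calibration and the iteratively chosen supporting points, so only the qualitative behaviour (a flattening RMS curve) carries over.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(1)
X_full = rng.uniform(-20.0, 20.0, size=(1014, 2))         # full calibration grid
y_full = np.sin(X_full[:, 0] / 8.0) * np.cos(X_full[:, 1] / 8.0)  # synthetic surface

rms_curve = []
for n in range(10, 310, 10):                              # 10, 20, ..., 300 points
    sub = rng.choice(len(X_full), size=n, replace=False)
    gp = GaussianProcessRegressor(kernel=RBF(length_scale=8.0), optimizer=None)
    gp.fit(X_full[sub], y_full[sub])
    err = gp.predict(X_full) - y_full                     # error vs. full calibration
    rms_curve.append(np.sqrt(np.mean(err**2)))
# rms_curve drops steeply at first and then flattens, mirroring figure 20
```

Plotting `rms_curve` on a semi-logarithmic axis reproduces the qualitative shape of figure 20: steep initial gains followed by diminishing returns once the surface is adequately sampled.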

Concluding remarks
In this paper, the application of Gaussian process regression to the calibration of aerodynamic probes is shown. Besides an introduction to the theoretical background of the Bayesian statistics approach, the basic ideas of probe calibration methods for CTA and pressure probes are presented. The approach of applying GP regression to aerodynamic calibration data is introduced with a generic example and shows the potential of the method. The method is then tested on two real data sets: the 1D calibration of a single-wire hot-wire probe and the 3D calibration of a five-hole probe. The more challenging task of modeling the four calibration surfaces of the five-hole probe can be performed very accurately with the newly developed GP methodology, yielding promising results. A reconstruction with the GP calibration using a twelfth of the calibration points provides a reconstruction of the test points comparably accurate to the full calibration. Moreover, this procedure can also be performed for probes of different sizes and probe head shapes in a reliable and robust manner. The only requirement for the input data is that they are generated during the calibration of the same type of probe; for example, calibrations of three-hole probes could not be used to generate the GP for five-hole probes. Finally, the hypotheses presented in the introduction are evaluated. The applicability of Gaussian process regression to aerodynamic calibration data, stated in hypothesis 2, is mainly shown in the introductory example in section 4.1. The flow chart in figure 7 shows the principal procedure of the GP calibration. Especially in the pressure probe test in section 4.3, hypothesis 1 is confirmed by training the initial GP with more than 47 000 input data points and exploiting the similarity of the different probe calibrations. The results obtained from the 3D calibration of the five-hole probe make the savings potential in calibration time obvious.
The prerequisite for achieving this is, obviously, that a sufficiently large input data set of different probes is available, viz. various probe shapes with different tip diameters, calibrated over a wide Reynolds number range. The reduction in the number of calibration points by at least one order of magnitude is demonstrated with comparable reconstruction accuracy; a speed-up factor of F > 10 can be realized for multi-hole pressure probes. In future developments, the GP calibration module is planned to be further refined, optimized and extended to other types of probes.