Statistical Comparison of Magnetopause Distances and CPCP Estimation by Global MHD Models

,


INTRODUCTION
The solar wind interacts with the Earth's dipole magnetic field, confining it in a magnetic cavity or magnetosphere with an outer boundary called the magnetopause. The size and shape of the magnetopause can be estimated by the dynamic and static pressure of the solar wind along with sufficient knowledge of the interplanetary magnetic field (Figure 1). This is the basis of several empirical models that have been developed to model the size and shape of the magnetopause. These relations are useful for space weather operations and have been used extensively for comparisons with numerical simulations. Various models for the estimation of magnetopause size and shape have been studied in the past. The most commonly used magnetopause models such as the Shue et al [1997,1998] model and the Petrinec and Russell [1996] model have used trigonometric functions to describe the magnetopause size and shape. The Shue et al model was one of the earliest models to shift from a geometric function towards a regressive one, and that trend has since been followed. More modern magnetopause models such as the Lin et al [2010] model or the Liu et al [2015] model have attempted to include more pressure and magnetic field components of the solar wind and compared the improvement with previous results. In spite of the development of a variety of models on the basis of different sets of magnetopause crossings of different spacecraft, studies like Case and Wild [2013] and Petrinec et al [2017] applied several of these models to a set of magnetopause crossings of their choosing and estimated deviations of the empirical value from the observed values. A similar standard deviation study was presented by Lin et al [2010] comparing their model with ten other models dating back to 1993.
The cross polar cap potential (ΦPC) serves as an instantaneous indicator of the amount of energy flowing into the Earth's magnetosphere-ionosphere system from the solar wind. While responding linearly to a nominal solar wind conditions as shown in Boyle et al. (1997), the ΦPC saturates for intense solar winds. This saturation has been consistent with observations and is found to occur in MHD simulations as well. Although several attempts have been made to explain these phenomena, none of the methods have strong observational evidence, thereby employing statistical techniques to measure ionospheric quantities. The four most commonly used techniques to measure the ionospheric quantities are: (1) Assimilative Mapping of Ionospheric Electrodynamics (AMIE), (2) Defense Meteorological Satellite Program (DMSP), (3) polar cap (PC) index, and (4) Super Dual Auroral Radar Network (SuperDARN). In this study, we have included the observational data from AMIE and SuperDARN. The methods used to estimate the ionospheric properties by these methods have been briefly described in Section 3.1.2, and can be referred in the primary studies by Ruohoniemi and Baker [1998] and Ridley and Kihn [2004]. Although several comparison studies for ionospheric potential values and the performance of MHD models in estimating similar data have been conducted for individual cases, they are not as abundant as those compared for magnetopause location models. A conclusive comparison by Gao [2012] is the basis for the present study's comparisons. The global magnetosphere density in the x-z plane (a) and the ionospheric potential in the northern hemisphere (b) for the October 2001 storm case, generated by the BATS-R-US/SWMF model via CCMC at two different times. The change in the magnetopause distances for times 1200 UT and 1600 UT is significant and has been compared with empirical data quantitatively. The similar has been done with the cross polar cap potential, which can be seen to be having a massive change with the difference between the ionospheric potential range significantly changing.
With the emergence of global magnetohydrodynamic (MHD) models, the growing interest in accurate space weather forecasting has increased the need for space weather model development. This in turn requires verification and validation of these models and the objective evaluation of their suitability for a particular purpose. Global MHD models have mostly been validated with several qualitative comparisons against observations by various spacecraft. The Geospace Environment Modeling (GEM) Challenge of 2008-09 was one of the biggest attempts to quantify the performance of MHD models in reproducing observations and comparing them with ground based data. The approach and metrics applied in the challenge described by Pulkkinen (Lu et al [2014]) has made it essential to conduct a trade-off study between the simulated environment values and the empirical estimations. In addition, the magnetopause locations and CPCP values, as indicated earlier, provide an instantaneous measure of the current sheet system and magnetic fields in the magnetosphere, and a correct inference towards the understanding of the magnetosphere-ionosphere coupling.
In this study, an attempt to graphically and numerically compare three MHD models -the Space Weather Modeling Framework (BATS-R-US MHD model), the Lyon-Fedder-Mobarry (LFM) model and the Open General Geospace Circulation Model (OpenGGCM) has been conducted to obtain a quantified metric for comparison. For this study, six empirical models for magnetopause estimation and two empirical models for CPCP have been used. The measured distance for the magnetopause standoff distance considered is the minimum distances of the magnetopause from the Earth within 30 O from the Sun-Earth line. The data for the ten solar events considered have been obtained from the database of the Coordinated Community Modeling Center (CCMC) website, and the settings used for the models are as close to each other as possible. In section 2 and 3, the metrics and the different models that have been employed in this study are described. In section 5, the model results with the corresponding measurements have been presented, and in section 6, comparison of the various metrics has been conducted. Conclusions are drawn in section 7.  , the change in dipole tilt angle has been assumed to be zero, and instead the minimum distance for any angle has been found.

MODEL PERFORMANCE METRICS
Based on the data sets, four different metrics are used in evaluating the model performances in this paper.
One of the classic means to quantify the difference between two elements of a set is to compute the root-meansquare difference (RMS) defined as RMS = √〈(x obs − x mod ) 2 〉 where xobs and xmod are the observed and the modeled signals, respectively, 〈. . 〉 indicates the arithmetic mean taken over i. Throughout this work i corresponds to the time series over individual events. RMS = 0 indicates perfect model performance. It should be noted that in contrast to other metrics used in this work, RMS has a dimension, which is equal to that of signal x. Further, it should also be noted that since RMS is not normalized, comparisons between events having large differences in the amplitude of the signal can be somewhat problematic, as will be seen below. Another commonly used metric is the prediction efficiency (PE) defined as 2 〉 where xobs and xmod are the observed and the modeled signals, respectively, 〈. . 〉 indicates the arithmetic mean taken over i, x ̅ indicates the arithmetic mean of the modeled signals, and the denominator is the variance of the observed signal. Note that PE = 1 indicates a perfect prediction.
The third metric used is the ratio of the maximum amplitudes (Max Amp): Max Amp = max (|x mod | ) max (|x obs | ) where xobs and xmod are the observed and the modeled signals, respectively, and the maximum is taken over i. Clearly, Max Amp=1 indicates perfect model performance while Max Amp>1 and Max Amp<1 indicate that model over-estimates and under-estimates, respectively, the maximum amplitude of the signal.
A new parameter named Wrong Prediction (WP) has been introduced as a subsequent alternative to a contingency table. This is a simple metric, which states the number of data points for which the modeled data was not in the range of the maximum and minimum values of the observed data including its standard deviations. Any and every prediction of the modeled data that lies outside this range has been characterized as a wrong prediction. ⊈ ( , ) ± The metric gives a percentage solution and is more effective in predicting the behavior of the MHD models with empirical data for the magnetopause locations, as have been discussed in Section 5. Due to its simplicity of implementation, we were also able to predict the under and over estimation of a model as Under-prediction: < ( , ) ± Over-prediction: > ( , ) ± A detailed explanation of the algorithm used to calculate the parameters have been explained in Section 6.

EMPIRICAL MODELS AND GLOBAL MHD MODELS
The various empirical models and global MHD models that have been used in this study have been discussed in the following sub-sections.

Magnetopause Standoff Models
Many magnetopause location models have been developed. Each of these models was based on a particular set of physical processes. These physical processes have been implicitly built in the mathematical form used in each model. For example, a power law of the dependence of the subsolar standoff distance on the solar wind pressure can model partially the nature of the geomagnetic dipole field. A nonlinear dependence of the subsolar standoff distance on IMF Bz can model the nonlinear saturation of the magnetopause erosion process. Since the data set used in each model is usually dominated by magnetopause crossings under normal solar wind conditions, the capability of a model to be used under extreme solar wind conditions depends critically on whether the assumed functional forms correctly represent the physical processes in real situations.
A numerical investigation of the above have been conducted in several studies comparing different empirical models with independent data sets and evaluating them on the basis of maximum standard deviation. In this study, the study in Lin et al [2010] is cited as the largest source for selecting the empirical models. In order to better evaluate the Lin et al model, the standard deviation σ(d) was used to compare the model with the previous models on the basis of 246 independent non-Hawkeye magnetopause crossings with 5 min average solar wind parameters, where d is the minimal distance from the observed magnetopause crossing to the predicted magnetopause surface. Since we have considered magnetopause locations within 30 O from the Sun-Earth line, the present study included the empirical studies that had a standard deviation lesser than unity. They have been listed in Table 1 along with a summary of their fitting details with the solar wind.

Cross Polar Cap Potential Models
Statistical methods have been implemented to model ionospheric potentials since the emergence of the Boyle et al [1997] model, which showed the linear dependence of the cross polar cap potential (CPCP) with the solar wind conditions. Several studies thenceforward have studied the statistical significance of ionospheric behavior while suggesting different physical reasons for the same. While the four most common techniques to calculate ionospheric quantities have been listed in the Introduction, this study employs the use of two such techniques: (1) Super Dual Auroral Radar Network (SuperDARN), and (2) Assimilative Mapping of Ionospheric Electrodynamics (AMIE). The key methods and their structure have been briefly explained in the following paragraphs.
SuperDARN measures line-of-sight ionospheric convection velocities with a ground-based network of radars and then infers functional forms of the electrostatic potential (Φ), as a function of the colatitude and longitude . This is expressed as a series expansion of spherical harmonic function truncated at order L where stands for the associated Legendre functions with order l and degree m, and and are coefficients determined by the minimizing of the loss function. This is done by converting SuperDARN measured velocity values to a mapping grid, furnishing a set of N velocity vectors and corresponding uncertainties. For more detail, please refer to Ruohoniemi and Baker [1998].
AMIE assimilates many types of data from both ground-based and space-based instruments and produce estimates of several ionospheric parameters including ΦPC. Ridley and Kihn [2004] describes the method as a technique for mapping high-latitude electric fields and currents from sets of localized observational data. The algorithm of AMIE is similar to SuperDARN. The electrostatic potential, Φ, is expanded on spherical harmonic bases truncated at order L. However, unlike L= 4 for SuperDARN, here L takes value of 11 [Matsuo et al, 2005]. AMIE establishes a linear relationship between the expansion matrices and with the observations at a given point. This relation includes a basis function and an error term, which is solved for by the technique using priori knowledge from an established loss term. Due to this, the predictions in the regions with limited observations become more reasonable.
While being used extensively for data analysis purposes, both SuperDARN and AMIE have significant disadvantages. One of the main limitations of SuperDARN lies in the ground-based radars' limited field of view. For the large fields present when saturation occurs, the polar cap can expand out of the SuperDARN radars' field of view, which can result in an underestimation of ΦPC. AMIE solves this error by the use of the error term as explained above. However, the same requirement of a priori knowledge about the flows and perturbation fields, requires a priori knowledge of the ionospheric Hall conductance (ΣH) is required when magnetometer observation are used. Ridley and Kihn [2004] suggests that incorrect conductance estimates during extreme conditions can lead to incorrect prediction, as has been shown in the present study as well. Gao [2012] compares both these techniques extensively and could be referred for a more detailed comparison of their performance for CPCP values.

Global MHD Model Features and Settings
The features and settings of global MHD models used in this study are presented in Table 2. All models have been executed through the CCMC website and receive as input the solar wind data measured by different satellites for different cases. Since most of the cases considered here have been studied as part of a GEM challenge or similar study, the same cases were found without much changes in options and features.
While all the models solve the MHD equations in the magnetosphere and the same electrostatic potential equation in the ionosphere, some differences have been listed in Table 2. For instance, the grids used in the models are extensively different, varying from 4 million cells for the SWMF/BATS-R-US cases to 7 -9 million cells for the OpenGGCM cases. This was done due to the grid optimization options available on CCMC and keeping in mind the run-time and preference of supplementary models like the Ring Current model, a rational decision was taken. A future study may include a more refined study of the same, with greatly similar characteristics between the different MHD runs. For a more detailed comparison of the models, see Honkonen et al [2013].
As has been pointed out in Table 2, the three models in consideration use different MHD equations to solve for global simulation. While SWMF/BATS-R-US uses an ideal, conservative equation, LFM uses semi-conservative equations. OpenGGCM in addition to using semi-conservative equations also adds in the resistive term. SWMF/BATS-R-US in addition to the above, has three coupled and distinct components called the Global Magnetosphere (GM) which contains the bow shock, magnetopause and magnetotail, the Inner Magnetosphere which contains the ring current model, and the ionospheric Electrodynamics (IE) model . For this study, the Rice Convection Model (RCM) has been used for all the models (to whatever extent was possible on CCMC). The LFM model employs a 3D stretched spherical grid to solve for the MHD equation in the magnetosphere, which is then coupled with their magnetosphere-ionosphere (MI) domain. Unlike the SWMF coupling where the GM module provides the field aligned currents (FACs) to the IE module, the MI coupling solves for the ionospheric electric potential and the FACs by combining the Ohm's law with current continuity and electrostatic approximation. OpenGGCM is generally coupled with the Coupled Thermosphere-Ionosphere Model (CTIM) to solve for the ionospheric potential using both first-principle based and empirical methods. OpenGGCM provides auroral precipitation and ionospheric electric fields to CTIM. The magnetopause locations are calculated by CCMC using the modified pressure balance equation, while being compared statistically with the Shue et al [1998] model. The respective ionospheric modules of the MHD models, the difference of which (available as DPhi on the CCMC website) is made available as the CPCP, calculate the ionospheric potential.
For the CCMC runs, all the models were run with a changing dipole tilt. Both ACE and WIND satellites provided solar wind parameters. This was so because, in several of the cases, the solar wind parameters were not particularly lucible in either one of the satellites considered.

EVENT DESCRIPTIONS AND DATA
Ten geospace solar events listed in Table 3, were chosen for the study. Solar wind bulk plasma and the interplanetary magnetic field observations were carried out from the CCMC database. The magnetopause and CPCP (north) data for the three MHD models were also obtained from the aforementioned database and used in this study. The present study consists of seven solar storm events, two substorm events and one general solar conditions case. The characterization of the events, however, have been conducted using the Kp index provided by the Kyoto database. An event having a Kp index value more than 7 was characterized as a highly intense solar storm, an event with Kp index lower than 4 was characterized as a low intensity solar event, while those in between these two indices were characterized as medium intensity solar events. This has been further elucidated in the following sections. The simulation results of the cases reported here are available through the CCMC website.
Appendix 1 denotes the solar bulk plasma and the interplanetary magnetic field observations of the solar wind from the WIND/ACE satellites made for the different cases. While all ten events were used for the study of the magnetopause locations, only eight could be used for the study of the CPCP data, due to unavailability of both empirical and MHD data in two separate cases.

RESULTS
The results relating to the individual event shave been included in the Appendix. These plots show the correlation of the magnetopause and CPCP data from the MHD models interpolated over a range formulated by the empirical data. The grey range includes the empirical data and has been substantially increased by adding/subtracting the standard deviations of the data and specifying the maximum and minimum value for a particular time. The process has been described using a cartoon in Figure 2, to show the general trend followed in the pictorial comparisons of the same. The numerical comparisons and results have been listed in Table 4. These comparisons have been analyzed in depth in the following section.
The present section describes the results as have been produced. The Appendix section contain the plots for the magnetopause locations and the CPCP comparisons successively. The magnetopause locations are visibly well plotted by SWMF and LFM. Apart from some cases (substorms and the general condition), LFM and SWMF have an extensively less wrong prediction rate, while having a good quantitative measure on the other parameters.
OpenGGCM under-predicts the magnetopause locations in almost all the case. It is however able to give a decent estimate of the storm once the storm begins. As can be seen with the case of the Halloween Storm, the solver is able to predict the time and enormity of the storm, while completely going off the charts in the recovery period. SWMF on the other hand overpredicts in the recovery zone. A general trend of SWMF is the over-prediction of the magnetopause location right after the storm. This is a serious as overpredicting the magnetopause distance would directly correlate to undermining the intensity of the storm. LFM is the best predictor amongst the sample cases, and clearly outperforms the rest in predicting the magnetopause locations. The range denoted in grey consists of the values given by the six magnetopause empirical models considered in this study.
Contrary to the results obtained from the magnetopause location comparisons, the CPCP comparison for all three of the models were unsatisfactory. SWMF has the best performance in predicting the CPCP pattern, doing an excessively accurate job in predicting the CPCP during substorms. LFM and OpenGGCM are not able to estimate the CPCP correctly, which is quite evident from the prediction efficiency that they receive. OpenGGCM, however, appears to be correctly estimating the trends at some locations, enabling this author to term it as a mysterious method. LFM over-predicts the CPCP to a great extent. In none of the cases, has its Wrong Prediction parameter been below 50%. However, prediction of the CPCP also shows a different problem. For the high Kp storm cases considered (Halloween Storm and November 2006 Storm), a peculiar change occurs in the grey range that includes the values from AMIE and SuperDARN. This region suddenly increases in breadth at the start of the storm, and remains highly unstable throughout. On an individual study of only the empirical model plots ( Figure  2), it is evident that either SuperDARN under-predicts, AMIE over-predicts or both happen simultaneously thereby causing a disastrous change to the observations. The reasoning could be the same as what has been mentioned in Section 3.1.2, that SuperDARN's range of radar and/or AMIE's ionospheric conductance values might be at fault. However, as seen in previous studies such as Grocott et al [2012], the number of flow vectors considered in SuperDARN assesses its reliability. In this study, not more than 150 flow vectors have been used, which may reason some of the under-prediction of SuperDARN.

ANALYSIS
In order to get a quantitative estimate for the performance of different models, the four metrics were calculated with respect to the empirical data. Table 5 and 6 presents the median values of the parameters for the three model predictions with the measurements for both magnetopause locations and CPCP data. Table 4 presents the best values of the parameters for the individual cases considered.
When examining the prediction efficiencies of the study, it was observed that the MHD models underperformed in the case of substorms and general solar wind conditions for both CPCP and magnetopause location estimation. In BATS-R-US, a general observation that was noted was the deviation of the prediction right after the storm was captured by the model. In terms of model efficiency, as shown in Table 4, LFM received the maximum prediction efficiency of 0.93 and the lowest wrong prediction, while the other parameters were topped by BATS-R-US. OpenGGCM, albeit a decent predictor, fails in all the test parameters in correctly predicting the magnetopause location. While major differences are visible in the metric curves of the prediction efficiency and theroot-mean-square difference, the maximum amplitude metric as a performance does not give significant differences in between the cases and/or the models (considering Case 9 for LFM as an anomaly), thereby failing as a good performance indicator for this case. This was evident in the last study conducted by the author, which prompted the use of the Wrong Prediction factor which had a substantial range and was able to predict correctly how accurate the model could be. The wrong prediction parameter is not however able to assess how much a model has deviated from the empirical values, which is taken care of by the prediction efficiency and RMS difference to some extent. An observation regarding the magnetopause locations also show that while the location is similar between all the MHD models, the standoff distance definitely seems to return to some baseline value which is different for each model. This was attributed in Honkonen et al [2013] as well, who later stated that finding the cause for this would require further investigations as there are no explanations in the upstream solar conditions. . As is evident from the observation, the two models deviate from each other's behavior once the storm starts and continue to be divergent throughout the storm period.
As stated in the last section, the MHD models underperform greatly when it came to estimating the CPCP values. SWMF tops the Table 6 on all the parameters, as is evident. The figures in Appendix 2 are testament to the fact that SWMF is able to achieve the best trend identification and value estimation, while LFM underperforms greatly. This trait of SWMF maybe attributed to another property in its IE module, which solves for the ionospheric potential using the same estimation techniques used in AMIE (Wang et al [2004], Ridley et al [2000], Ridley [2005]). OpenGGCM mysteriously predicts well for some of the trends exhibited in the cases considered. However, the most defining conclusion of the CPCP analysis, and perhaps, of the whole study was the massive differences in scale that occurred for the empirical values (see Figure 3). At this point, almost all the performance metrics were pointless as any comparison would be of no use The author therefore suggests more analysis of the same for high Kp solar events in the near future.

CONCLUSIONS
The present study aimed at observing the performance of MHD models with robust empirical data to predict the location of the magnetopause and the cross polar cap potential value. In order to obtain that, the study looked into the performance of three MHD models and compared solar events that have been simulated by the models on the CCMC website and compared it with empirical data from different sources. While the MHD models LFM and BATS-R-US are able to model the projected behavior of the magnetopause during a storm well, all three models exhibit large offsets during normal solar conditions and substorm events. A future study should include weighted ratios of the empirical model by including more comparative studies like Case and Wild [2010] and Petrinec et al [2017], to provide more robust data set for comparisons. The study was also able to analyze the behavior of the CPCP predictions by the MHD model and was substantially able to conclude that more work is necessary in this area. The CPCP observations were found to be excessively deviating for intense solar events, and while SWMF was able to predict the values and trends of the CPCP well, LFM and OpenGGCM over-predicted greatly in almost all the cases. The author concludes that analysis of AMIE and SuperDARN data by using similar constraints as used in Gao [2012] should be conducted, and only then can conclusive references be brought out of MHD comparisons with the empirical data. A future study may also include more ionospheric data from the PCN index or the DMSP satellites.    Figure 1 (a, b, c). Solar wind bulk plasma and the interplanetary magnetic field observations for the studied storm events (i -x) given in Table 3. See the text for details. Plotted for Events 1 to 10, listed in Table 3.

Figure 3
Cross Polar Cap Potential Data from three MHD simulations compared with median value of CPCP data from AMIE and SuperDARN as a function of time. The grey region consists of the range of max-min values that are calculated using the empirical data and their standard deviations.
Plotted for Events 1 to 10, listed in Table 3 (Event 7 and 10 not listed due to unavailability of data).