Grid-scale Fluctuations and Forecast Error in Wind Power

The fluctuations in wind power entering an electrical grid (Irish grid) were analyzed and found to exhibit correlated fluctuations with a self-similar structure, a signature of large-scale correlations in atmospheric turbulence. The statistical structure of temporal correlations for fluctuations in generated and forecast time series was used to quantify two types of forecast error: a timescale error ($e_{\tau}$) that quantifies the deviations between the high frequency components of the forecast and the generated time series, and a scaling error ($e_{\zeta}$) that quantifies the degree to which the models fail to predict temporal correlations in the fluctuations of the generated power. With no $a$ $priori$ knowledge of the forecast models, we suggest a simple memory kernel that reduces both the timescale error ($e_{\tau}$) and the scaling error ($e_{\zeta}$).


INTRODUCTION
Renewable power generation, unlike conventional power, exhibits variability owing to natural fluctuations in the energy source [1]. Wind power, in particular, shares spectral features of the turbulent wind from which it derives energy [2][3][4]. This variability in power output adds a cost to renewable power [5,6] that is absent in conventional power sources. Whereas distributed wind farms are expected to smooth the fluctuations [7], power entering the electrical grid still exhibits large amplitude fluctuations [1]. Large ramps in power fluctuations present the possibility of grid destabilization [8] and blackout, a constant source of concern for system operators [7,9]. This risk increases the cost of operating reserves [10] needed on standby to return a grid back to operation in the event of failure. Naturally, forecast models constitute essential tools in estimating the magnitude of fluctuations beforehand and in planning for the optimal operating reserves required on call. Yet, no standards for forecast accuracy currently exist [11].
Extant works on wind power forecast error, ranging from the turbine to the grid scale, focus on modeling the forecast error distribution [12][13][14][15][16][17]. Since a probability distribution is time-independent, it contains no information on temporal error variations. Several studies have considered the dependence of the mean and the variance of the error on the duration for which the power is predicted (ranging from minutes to hours) [18,19]. Other works have considered the different distributions of errors for mean power over different durations [14,15,20]. However, none of these studies account for the fluctuation correlations of atmospheric turbulence [21] transferred to the generated power in the analysis of forecast error nor for the temporal correlations of the errors.
Whereas power fluctuations at the scale of an individual turbine [3] and a wind farm [2,4] have been shown to exhibit self-similar scaling, such fluctuations from individual wind farms are expected to smooth out before they enter the electrical grid. Using data from the Irish grid operator EIRGRID [22], we show that wind power entering the grid exhibits correlated fluctuations with a self-similar structure. Such scaling points to large-scale correlations in atmospheric turbulence influencing the aggregate wind power entering the grid.
In this article, we exploit these correlations at the grid scale and draw upon the Statistical Theory of Hydrodynamic Turbulence to quantify two types of forecast error. The first is a timescale error (e τ ) that quantifies the timescales over which the forecast models fail to predict high-frequency power fluctuations. This timescale error sets a bound on the numerical resolution of forecast models and would already be known to system operators who own and run the forecast models. However, details of the models are not available to potential customers in energy spot markets [23] who could use this information to factor in the risk associated with a non-supply of promised wind power by producers. The second type of error we quantify is a scaling error (e ζ ) that establishes a difference in the self-similar scaling of fluctuations as observed for actual generated power visà vis the power that was forecast to be generated. This error could be potentially useful to model developers, and if such an error results from large-scale correlations in atmospheric turbulence, incorporating them into models is not subject to limitations arising from numerical resolution. Having established the errors, we then employ a simple memory kernel upon the forecast time series and show that the errors can be easily reduced with a minimal computational cost.
Two raw time series are provided by EIRGRID: the wind power generated nationwide across all Ireland entering the grid p g (t), and the power forecast by EIR-GRID's models p f (t) for the same period. The time series sampled at 15 minute intervals span a five-year period . Every third data point is plotted for easy visibility. b) The probability density function of the raw instantaneous forecast error Π(p d ) (solid black circles) has exponentially decaying, fat tails relative to a Gaussian distribution (solid black line) of the same mean and standard deviation as Π(p d ).
(2009 − 2014). No other information that permits meaningful data decomposition is available; forecast models employed, the number of wind farms feeding the grid, their location, date of commission or date of scheduled and unscheduled outages, etc. are all unknown. Despite the lack of further information, any quantifiable trends revealed are of immediate use to system operators in estimating operating reserves and in accounting for fluctuations that could potentially destabilize the grid [24]. Furthermore, such information [25] is potentially useful to customers in energy spot markets [23].
Raw time series for the generated p g (t) and forecast power p f (t), and their instantaneous difference p d (t) ≡ p f (t) − p g (t), which we define as the instantaneous forecast error, are shown in Fig. 1a for a 10-day period, permitting a few immediate qualitative observations. Firstly, p g (t) exhibits correlated fluctuations. Secondly, p f (t), while closely following p g (t), misses the high frequency (relative to the sampling rate of time series) components. The instantaneous forecast error p d (t) exhibits correlated fluctuations with a coefficient of variationstandard deviation/mean = 116.24/22.9 ∼ 5, implying large magnitude fluctuations in p d (t) (i.e., a broad distribution).

DISTRIBUTION OF FORECAST ERROR
As a point of comparison with prior works [12][13][14][15][16][17], we note that the probability density function (PDF) of the raw instantaneous forecast error Π(p d ) ( fig. 1b) exhibits fat exponential tails that decay slower than a Gaussian function of the same mean and standard deviation as Π(p d ). Indeed, a PDF with exponential tails may be expected for reasons detailed in the following. Being a scalar product of an instantaneous force ( f (t)) and velocity ( v(t)), the statistics of temporal variation in power are determined by the product of two random variables [26]. The statistics of the product (Z = XY ) of two normally distributed random variables (X and Y ) was first studied by C. Craig [27] (henceforth referred to as Craig's-XY distribution). An asymptotic analysis reveals that Craig's-XY distribution is logarithmically singular about zero with exponentially decaying tails, and its asymmetry (skewness) depends on the instantaneous cross-correlation between X and Y [28,29]. Whereas this asymptotic analysis is possible only when X and Y are Gaussian, the structure of Craig's XYdistribution itself is more generally observed, even when X and/or Y are non-Gaussian [28].
The basic structure of Craig's-XY distribution is expected for the PDF of power p(t) (e.g., compare Fig. 2c in [3] with Fig. 2 in [29]) and its estimation error δp(t). Suppose for a single wind turbine, the errors in estimating or forecasting velocity and force are δ v(t) and δ f (t), respectively, then Expanding the RHS permits decomposition into power p(t) = v(t) · f (t) and its error Consequently, δp(t) is a random variable whose statistics are determined by the sum of three terms (Eq. 1), each being a product of two random variables. One expects δp(t) to exhibit features of Craig's-XY distribution irrespective of whether or not v, δ v, f , and δ f are Gaussian. In fact, given that the velocity distribution of atmospheric turbulence is known to follow the Weibull distribution [30,31], a numerical approach may become necessary.
The power forecast error statistics can be easily scaled from the turbine to the grid scale. If M wind farms feed power to the grid, and the ith farm has N i turbines, the cumulative error in estimating wind power then equals the instantaneous forecast error p d (t) for the grid, shown in Fig. 1a, and is given by: Summing the power error statistics over all turbines (across all farms) causes an averaging of fluctuations, starting with the most probable ones that occur around zero, thus smoothing the logarithmic singularity. All these generic features are readily observed for Π(p d ) in Fig. 1b. Beyond contributing to the extant literature [12][13][14][15][16][17], the structure of Π(p d ) provides no useful information for our analysis and will not be discussed further. Understanding temporal variability (fluctuations) and uncertainty (error) requires analysis of the temporal evolution of the distributions, their moments and multipoint temporal correlation functions. We therefore proceed through a statistical analysis of the temporal correlations in the fluctuating time series for generated and forecast power.

DATA ANALYSIS
The time series was analyzed in two stages, with trends in the series being identified in the first stage, followed by an analysis of the fluctuations around the trends in the second stage. Trend removal permits a focus on systematic differences between p g (t) and p f (t), ignoring differences due to new wind farms and seasonal variability of the wind power. Trend identification was performed such that the cross-correlation between the generated and forecast power trends was maximal. We used the fast Fourier transform (FFT) for each of the series and defined the trends by inverting the transform using only the frequencies with maximal amplitudes. The number of maximal amplitudes was set by the requirement of the highest cross-correlation between p g (t) and p f (t). Keeping the zero frequency (to preserve the signal mean) and five more frequencies resulted in a peak cross-correlation of 0.9904 between the generated and forecast power trends (Fig. 2a). These respective trends were subtracted from the raw time series. We denote the de-trended generated power by P G (t), forecast power by P F (t) and their instantaneous difference by The characteristic fluctuation timescales for the detrended time series were first computed from their respective autocorrelation functions defined as: where P X is a time-average subtracted from the signal (de-trending does not render a zero signal mean since the zero frequency component was preserved). The subscript X should be replaced with G for generated power, F for forecast power, and D for instantaneous forecast error, respectively. The three autocorrelation functions ( fig. 2b) exhibit exponential decay for short times with a data fit following the functional form where A X 1.0, owing to C X (τ ) being normalized, and τ X represents the characteristic decorrelation time for each time series, yielding τ G = 80.94 data points (∼ 20.24 hours) for generated power, τ F = 81 points (also ∼ 20.24 hours) for forecast power, and τ D = 25.86 points (∼ 6.5 hours) for instantaneous forecast error. The de-trended series were also split into independent time series of shorter duration (1/8th of the original temporal duration). Autocorrelation functions computed for these windowed data did not reveal a measurable difference in the characteristic decay time τ X ; deviations were apparent only for long-term behavior spanning a week (or longer timescales) when the decorrelation had already occurred. The correlation time of high frequency fluctuations ( 20 hours) is much shorter than the slow varying trend (over months to years). Hence the de-trending protocol (in particular, the number of maximal amplitudes) does not influence the analysis to follow-a fact verified and reported upon later.
Autocorrelation functions for the generated (C G (τ )) and forecast(C F (τ )) power exhibit nearly identical scaling and the same characteristic decay timescales (τ G = τ F = 20.24 hours), suggesting the accurate capture of correlations in generated power by the forecast models. Yet, the autocorrelation function C D (τ ) for instantaneous forecast error P D (t) informs us that some correlations are not captured. In particular, we qualitatively know that P F (t) misses the high frequency components of P G (t), and they end up in P D (t), thereby contributing to its two-point correlator. This correlation deficit suggests that higher order moments of the two-point correlator are necessary to capture the statistical structure of the missing fluctuations.

TEMPORAL STRUCTURE FUNCTIONS
Statistical analysis of higher order correlations is a well-developed, mature tool within the Statistical Theory of Hydrodynamic Turbulence in which higher order two-point correlators are studied through Structure Functions. Kolmogorov's theory of 1941 (K41) [32] lays the foundation for structure functions through the celebrated "4/5th law": where the third moment of longitudinal velocity differences ( (∆v || (r)) 3 ) between two points spatially separated by a longitudinal distance r, is proportional to the product of the average turbulent dissipation rate (ε) and the longitudinal spacing r [33].
The nth order structure function encodes all crossterms up to order n of the two-point correlator for a given stationary signal. The physical relevance of structure functions may be appreciated by considering a stationary, fluctuating signal x(t) with zero mean. The difference between the two values of this signal taken time τ apart (∆x(τ ) ≡ x(t + τ ) − x(t)) is collected at various windows (of duration τ ) along the time series. ∆x(τ ) is therefore a random variable with statistics of its own, and the nth order structure function defined as S n (τ ) = (∆x(τ )) n is the nth moment for its PDF Π(∆x(τ )). The moment �� �� FIG. 3: (color online) Structure functions of order n = 1 − 10 (red solid circles) and their power-law fits (black solid lines) for (a) generated power S G n (τ ) and (b) forecast power S F n (τ ) plotted versus τ in log-log scale exhibit self-similar scaling S X n (τ ) ∝ τ ζ X n (X is G for generated and F for forecast power). The scaling is robust for (a) the generated power over 1.4 decades (40 time steps). (b) In contrast, for forecast power, the first-and second-order structure functions exhibit scaling up to τ = 40 time steps, but for n > 2, no scaling is observed for τ ≤ 10 time steps. Self-similar scaling is restored over a limited range of timescales 10 < τ < 40.
S n (τ ) varies with the time difference τ between signals, and its scaling (if any) reveals temporal variations in the statistical structure of fluctuations in the signal to the nth order.
Tails of the PDF Π(∆x(τ )) exert themselves with the increasing order n of the structure function, thus necessitating more data to resolve higher order structure functions. A weak test for resolving the nth order structure function involves splitting the time series into smaller windows and testing for identical scaling on the truncated series. However, this test only assures stationarity of the statistics. A strong test for the ability to resolve the nth order structure function requires that first, the moment's integrand (∆x) n Π(∆x) → 0 as |∆x| → ∞ [34] (required due to finiteness of data), and second, the PDF Π(∆x) should decay faster than 1/|∆x| n+1 for |∆x| → ∞ or else the integral (∆x) n Π(∆x) dx would diverge for large |∆x| [35] (test for existence of a PDF's nth moment). Whereas the two conditions are not independent, the second condition is theoretical and does not depend upon the available statistics. When conducting data analysis, even when the second condition is satisfied, insufficient data can lead to noise and prevent the integrand (∆x) n Π(∆x) from satisfactorily converging to zero. The first condition is therefore dependent on finiteness of data. Based on both weak and strong tests, we conclude that the EIRGRID data can resolve structure functions up to order n = 12; however, only results up to n = 10 are presented.
Since even-order structure functions take only positive values, they converge faster than ones with odd order. To overcome this distinction between odd and even orders, we compute the nth order structure function of the absolute value of differences: S X n (τ ) ≡ |P X (t + τ ) − P X (t)| n where subtraction of mean P X (t + τ ) and P X (t) is assumed. While ensuring the same convergence rate for even-and odd-order statistics, it also collates all data in the positive quadrant permitting easy visualization. Analysis of fractional-order structure functions allows better testing for anomalous scaling [36]. Fractionalorder structure functions are only defined for absolute values of signal differences [36]-another reason why we calculate structure functions of absolute differences. We also calculated the structure functions of orders n = 0.1 − 0.9 in steps of 0.1. Figure 3 plots the structure functions of order n = 1 − 10 (fractional-order structure functions are calculated but not shown) for the absolute value of signal differences of the generated power |∆(P G (τ ))| ( fig. 3a) and forecast power |∆(P F (τ ))| (Fig. 3b). Self-similar or power-law scaling is observed for the generated power structure functions over 1.4 decades spanning τ ≤ 40. Scaling over the same temporal range is also observed for the forecast power structure functions of order n = 1 and 2. For n > 2, no scaling is observed for timescales τ ≤ 10. The scaling is restored over a limited range of timescales 10 < τ < 40 (0.4 decades in time).

RESULTS
Self-similar scaling of the temporal structure functions implies a relationship of the form: where ζ X n is the scaling exponent. For simple monofractal scaling, ζ X n ∝ n. However, fluctuations with a multi-fractal character exhibit a nonlinear dependence of the scaling exponent ζ X n with respect to n. Super-(sub-) linear variation of ζ X n versus n implies temporal expansion (compression) of fluctuations [37]. Scaling exponents for all the structure functions were computed from the log derivative, ζ X n = d log(S X n (τ )) d log(τ ) , which provides a more reliable estimate of the exponent than a power-law fit [36,38]. The pre-factor A X n in Eq. 4 is subsequently obtained from fit to data. In Fig. 3, all the data (red solid circles) were divided by A X n such that all fits (solid black lines) commence from both mantissa (τ ) and ordinate (S X n (τ )) at unity, for an easy comparison of ζ X n with order n. All the data in Fig. 3, Fig. 4a, and Fig. 5b therefore follow the scaling relation: S X n (τ ) ∝ τ ζ X n (A X n ≡ 1).
The scaling in Fig. 3 reveals higher-order temporal correlations at work in the EIRGRID data. The absence of scaling for S F n (τ ) for n > 2 at timescales τ ≤ 10 confirms the qualitative observation made in Fig. 1a that forecast models do not capture high frequency fluctuations. More importantly, Fig. 3b ascribes a precise bound on the time (τ = 10, 2.5 hours) out to which the high frequency fluctuations are missed. Finally, scaling presence for S F n (τ ), n = 1, 2 explains the close agreement between the autocorrelation functions C G (τ ) and C F (τ ) and their identical characteristic decay times, τ G and τ F , observed in Fig. 2b. This is to be expected on the grounds that the second-order structure function S 2 (τ ) ≡ (∆x(τ )) 2 = x(t+τ ) 2 + x(t) 2 −2 x(t)x(t+τ ) shares a direct correspondence with the autocorrelation function where the cross-term is identical to the numerator of Eq. 3. The failure of S F n (τ ) for n > 2 to capture high frequency fluctuations out to τ = 10 reveals one type of forecast error in the models; we call this the timescale error e τ .
Before proceeding to the second type of error arising from scaling mismatch, we define the cross-structure function X F G n (τ ) ≡ |P F (t + τ ) − P G (t)| n . X F G n (τ ) represents nth order moments for the PDF of the relative magnitude of fluctuations between P G (t) and P F (t + τ ), and their cross-terms correspond to higher-order twopoint cross-correlators between the generated and forecast power. This function is plotted in Fig. 4a. Again, we notice that scaling is absent at early times (τ ≤ 10), and restored at later times (10 < τ < 40). We note that X F G n (τ ) exhibits no scaling for n = 1 and 2, unlike the forecast structure functions (Fig. 3b). Although S F n (τ ) exhibits scaling for order n = 1 and 2, its exponent ζ F n = ζ G n ; this scaling deficit is reflected in X F G n (τ ) for n = 1 and 2. FIG. 4: (color online) a) Log-log scale: cross-structure functions X F G n (τ ) versus τ (red solid circles) exhibit no scaling at early times τ ≤ 10, with scaling restored for 10 < τ < 40. Black solid lines are power-law fits to data within the scaling regime. b) Scaling exponent ζ X n versus the order of structure function n for generated G (red solid circles), forecast F (blue solid squares) and modified forecast M (black solid triangles) structure functions, and cross-structure functions F G (green solid inverted triangles), and their respective secondorder polynomial fits: solid red line for ζ G n , small dashed blue line for ζ F n , medium dashed black line for ζ M n and long dashed green line for ζ F G n .

DISCUSSION
Having established the various structure functions, we now consider the behavior of their scaling exponents ζ X n (X ≡ G for generated, F for forecast and F G for the cross-structure function). Figure 4b plots ζ X n versus the order n together with their polynomial fits to the quadratic order. ζ G n = 10 −2 + 0.67n − 0.013n 2 scales almost linearly (mono-fractal) with a small, but mea-surable, quadratic deviation towards multi-fractal behavior. The exponent ζ F n = 0.007 + 0.8n − 0.025n 2 exhibits a slightly more pronounced quadratic deviation (multi fractal behavior) relative to ζ G n . On the other hand, ζ F G n = 10 −2 + 0.54n − 0.006n 2 scales almost linearly with n, implying mono-fractal scaling.
We now consider the measurement error for the aforementioned scalings. Firstly, given that all de-trending protocols suffer from an ad hoc choice of a de-trending timescale, we tested the scalings for dependence on the de-trending procedure by varying the number of maximal amplitudes. Ignoring the condition for maximal cross-correlation between p g (t) and p f (t), the number of maximal amplitudes contributing to the trends was varied. The scalings were invariant up to the inclusion of 15 maximal amplitudes into the trend, beyond which, coefficients for the polynomial fits started varying in the second decimal place. Having ascertained the robustness of our choice for the five maximal amplitudes at which the cross-correlation peaks, we focused on a second source of scaling measurement error, namely statistical variability. Since the scalings are analyzed up to τ = 100 data points, the de-trended time series were split into eight independent windows (each with 21912 data points), and the structure functions were re-computed for each window. The variation in the log derivative (ζ X n = d log(S X n (τ )) d log(τ ) ) for the eight independent measurements was taken as the possible scatter in the scaling estimation, thereby providing a confidence interval for the polynomial fits. The scatter was found to be ζ X n ± 0.01 in both the measured value of ζ X n and the corresponding polynomial fits (for each of the polynomial coefficients) for each of the eight independent datasets, revealing that the polynomial fits were meaningful only to the linear order for ζ G n and ζ F G n . The quadratic-order polynomial coefficient for ζ F n , despite being larger than the scatter of ±0.01, is not useful owing to the fact that the corresponding quadratic terms for ζ G n and ζ F G n are smaller than the scatter magnitude.
Despite qualitatively observing a quadratic deviation for ζ X n in Fig. 4b, our inability to ascribe significance to it arises from the fact that the multi-fractal component (deviation from linear scaling) of the scalings is miniscule. This is significant in light of several studies that have demonstrated multi-fractal scaling for wind power fluctuations at the turbine [3,4] and farm scales [39]. Turbulence theory traces the source of multi-fractal behavior to intermittent fluctuations that can arise from two sources in the atmospheric context. The first, known as internal intermittency, occurs at the small scales of turbulent flow. These intermittent fluctuations would be naturally reflected in the power generated at the turbine and farm scales. However, when adding together power generated by geographically distant wind farms, internal intermittency should smooth out [40] since it is a small-scale effect and cannot extend across geographically distributed wind farms. Furthermore, the sampling interval (15 minutes) for EIRGRID data is not expected to resolve any effects that may arise from internal intermittency, which occur at much shorter timescales (high frequencies).
The second source of intermittency, known as external intermittency, occurs at the edge of any free-stream [41] and arises in the atmospheric context due to coupling between the atmospheric boundary layer turbulence and a co-moving weather system [21]. External intermittency, which can be experienced in the form of wind gusts, is of greater relevance in the present analysis as it can both correlate distributed farms through the weather system and occur at timescales longer than the 15-minute sampling interval for EIRGRID data. The nearly fractal scaling of ζ G n informs us that both internal and external intermittency are being smoothed to the point of rendering grid-level power fluctuations almost mono-fractal.
The self-similar scaling of S G n (τ ) over several hours does strongly point to the influence of large-scale turbulent structures on power fluctuations at the grid level. The 20-hour characteristic decorrelation time (τ G ) for generated power in Fig. 2c, if taken as the large eddy turnover time of atmospheric turbulence, also lends credence to such an argument. Finally, independent proof in support of this argument also comes from Katzenstein et al. [40] who show that an individual wind farm exhibits f −5/3 (f being the frequency) scaling for the wind power spectrum (equivalent to τ 2/3 scaling of the secondorder structure function in the time domain). However, as wind power from various farms is summed, the spectrum steepens (please see Fig. 3 in [40]). Such spectral steepening can be clearly attributed to the smoothing of high frequency (short timescale) fluctuations corresponding to small eddies. But the low frequency (long timescale) fluctuations corresponding to large-scale eddies lose no power spectral density, clearly indicating the influence of large-scale turbulent structures.
We finally consider the forecast error due to the scaling mismatch. We define the scaling error as e ζ ≡ ζ F n − ζ G n . Under this definition, if the time series for forecast and generated power were identical, then S G n (τ ) ≡ S F n (τ ), implying ζ G n ≡ ζ F n , and therefore e ζ = 0. Another typical case arises if forecast models fail completely, resulting in a flat time series with no fluctuations, ζ F n = 0 resulting in an error e ζ = −ζ G n . Using the polynomial fits for ζ X n (see Fig. 4b) to linear order, we obtain e ζ = (7 × 10 −2 + 0.8n) − (10 −2 + 0.67n) = −0.003 + 0.13n. This can be cross-validated against the difference ζ G n − ζ F G n = (10 −2 +0.67n)−(10 −2 +0.54n) = 0.13n. Since ζ X n → 0 as n → 0, the 0th order term falling within the scatter may be taken to be zero. Both estimates of error are identical in linear order (e ζ = 0.013n).
The analysis thus far demonstrates the importance of temporal correlations in wind power and their role in estimating forecast errors. It is reasonable to ask whether this knowledge could help improve the forecast time se-�� �� FIG. 5: (color online) a) Log-linear scale: γopt versus order of structure function shows no improvement for n < 4 but shows better agreement for n ≥ 4 with an abrupt change observed in γopt at n = 4. b) Log-log scale: Structure functions S M n (τ ) versus τ (red solid circles) for the modified forecast time series show considerable improvement over their counterparts S F n (τ ) in Fig. 3b. ries, despite having no knowledge of the models employed. In particular, to capture the short-term correlations missed by the forecast, we introduce a modified forecast that is based on the original forecast, convoluted with an exponentially decaying memory kernel derived from the generated power time series. The modified forecast power is given by P M (t) = t 0 P F (τ )e −γ(t−τ ) dτ . The memory duration (1/γ) was chosen so as to minimize the relative difference between the structure functions of the generated and forecast power. As expected (as shown earlier, the low-order structure functions of the generated and forecast power are very similar), we found that the optimal γ varies with the order of the structure function. For n < 4, the memory-modified forecast shows no improvement in the agreement between S G n and S F n . For n ≥ 4, the modified forecast exhibits better agreement with the structure functions of the generated power as shown in Fig. 5b. The optimal γ (γ opt ) was found to be γ 4 ≈ 1.06 and γ 10 ≈ 0.37, as shown in Fig. 5a, plotted in log-linear scale to show the variation in γ opt for n ≥ 4. The simple scheme, suggested here, not only tries to rectify the timescale error e τ , but also attempts to statistically align the temporal correlations by improving the scaling error e ζ .
As is apparent from Fig. 5b, the structure functions (S M n (τ ) ≡ |∆P M (τ )| n ) for modified forecast time series are substantially improved over their unmodified counterpart ( fig. 3b). Firstly, scalings are restored at high frequencies (τ ≤ 10), thus rendering the timescale error irrelevant. More importantly, the scaling itself is improved as is evident from Fig. 4b, revealing ζ M n = 0.01 + 0.7n − 0.007n 2 . To linear order, the scaling error e ζ = ζ M n − ζ G n = 0.7n − 0.67n = 0.03n, a considerable improvement over the original forecast time series. Being computationally inexpensive, and given that spinning and non-spinning reserves must act within 10 minutes of failure, with replacement reserves acting within 20-60 minutes, there are tangible benefits to incorporating such a memory kernel into models to monitor instabilities in real-time. Furthermore, it might be possible to improve the forecast models using different parameterizations of the regional climate models or weather models.

SUMMARY
In summary, wind power exhibits significant temporal correlations even at the grid level, where fluctuations are expected to average out [7] as power is fed from geographically distributed wind farms. Previous studies show that the temporal correlations of the wind are essential to studying wind-generated large-scale ocean currents [42]; a similar appreciation of large-scale correlations in atmospheric turbulence within the context of wind power is called for. Fluctuations, albeit posing a problem to system operators, possess a statistical structure through temporal correlations, which could be exploited to quantitatively analyze the error in forecast models. The technique proposed here is only limited by the sampling rate of the time series. Beyond potentially serving as a standard for quantifying wind-power forecast accuracy, it could have applications for any renewable energy source with temporally correlated fluctuations possessing a statistical structure.
MT and MMB were supported by the OIST Graduate University with subsidy funding from the Cabinet Office, Government of Japan. CPC was hosted by OIST Graduate University while performing this work. GB was supported through the European Union Seventh Frame-work Programme (FP7/2007-2013) under grant number [293825]. The authors gratefully acknowledge EIRGRID for permission to use their data and N. Ouellette for scientific discussions.