Forbush decreases: Algorithm generated dataset

Efforts to correctly detect Forbush decreases (FDs) in the cosmic ray (CR) intensity flux are ongoing (see [2], [7]). The FD data presented here are part of the data generated in [6] during a recent investigation of the effects of CR anisotropies on the simultaneity of FDs. A subset of the simultaneous and non-simultaneous FDs detected from both raw and Fourier transformed CR data is presented, together with some of the filtered CR anisotropies. The datasets provide an opportunity to investigate the interaction between CR anisotropies and FDs. In [6], FDs were identified from both raw and Fourier transformed CR data. For the FDs identified in the raw data, the impact of diurnal anisotropy was not removed. Additionally, FDs were identified after a careful adjustment for diurnal anisotropy. FD catalogues for ten CR stations are presented, all calculated with the same method and selection criteria. These catalogues can therefore be combined, contrary to the indication of [1], allowing comparison of Chree, regression or correction analysis based on CR data from isolated neutron monitors [9]. The data can also be used in the analysis of FD event simultaneity. Several claims on FD event simultaneity made by some authors ([5], [8]) using only a few FDs can be verified using the data. The diagrams show the relationships between raw and Fourier transformed CR data as well as CR anisotropic variations at different locations. The similarities and differences between these figures could be used to discuss the impact of rigidity on diurnal anisotropy and FDs. The quantitative amplitudes of the diurnal anisotropy calculated at the time of FDs could be useful in such an investigation. The amplitudes of the diurnal anisotropies accompanying FDs, calculated in [6], are presented here. The large number of strong and small FDs selected from each station for the year 2003 may encourage reanalysis of previous work employing only 3 or 4 FDs in each year.
Many studies relating FDs to terrestrial effects have focused on a single FD without strong evidence that it is generalizable to others. This may attract criticism that the conclusions drawn do not extend to all FDs. The diagrams presented here are similar to those presented in [6] for the Climax station. Rather than display separate FD events, the complete intensity variation together with the accompanying FDs are clearly displayed. Solar wind plasma data may be displayed alongside the data for comparative purposes. It is hoped that this approach will answer the numerous objections made to solar-terrestrial analysis involving FDs.


Specifications Table

Subject: Atmospheric Science.
Specific subject area: This is an aspect of space physics that examines the impact of solar and extra-galactic emissions on the Earth's weather and human civilization. Arguably, several health hazards have been attributed to the influence of incoming energetic particles accelerated to very high speeds by the interplanetary magnetic field. Those particles are commonly called cosmic rays (CRs). The invention of neutron monitors allowed continuous recording of these CRs. Rapid fluctuations in CR flux are considered to be an effect of energetic particle solar emissions arising from solar flares, coronal holes or coronal mass ejections (CMEs). While short-term rapid depressions in the CR intensity are described as Forbush decreases, transient increases are referred to as ground level enhancements (GLEs). Forbush decreases, which are the focus of this work, are considered the most important CR intensity variation ([10,12]).
Type of data: The raw data (see Supplementary data) for the ten CR stations were first downloaded from the IZMIRAN website (http://cr0.izmiran.ru/common). The raw data were processed before analysis as follows. The date/time format of the raw data was "YYYY.mm.dd HH:MM", followed by the CR count. This was transformed into "dd mm yy" format followed by the CR count.
Missing counts, recorded as zeros, were changed to NA (data Not Available). These transformations were performed in "awk", a text processing utility usually available on UNIX operating systems.
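The reformatting step described above can be sketched as follows. This is an illustrative Python version, not the authors' awk script; the function name `reformat_record` and the two sample records are invented for the example, and the input is assumed to be whitespace-separated exactly as in the stated "YYYY.mm.dd HH:MM count" layout.

```python
def reformat_record(line):
    """Convert 'YYYY.mm.dd HH:MM count' into 'dd mm yy count'.

    A count of zero is treated as a missing value and written as NA.
    """
    date_part, _time_part, count = line.split()
    year, month, day = date_part.split(".")
    value = "NA" if float(count) == 0 else count
    return f"{day} {month} {year[2:]} {value}"


# Invented sample records (not taken from the supplementary files).
sample = ["2003.05.30 00:00 8123.4", "2003.05.31 00:00 0"]
converted = [reformat_record(row) for row in sample]
```

The time-of-day field is dropped because the target format described in the text carries only the date and the count.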

Value of the Data
• Although the potential for bias in studying diurnal anisotropy using isolated neutron monitor data has long been known, it has not been quantified. The two datasets provide an opportunity for comparative studies of the interaction of diurnal anisotropy with Forbush events.
• The data presented here may be useful to a number of researchers, including those analyzing the relation between FDs and their solar sources, the impact of CRs on terrestrial weather or the global simultaneity of FDs.
• We have presented the data in such a manner that almost every CR investigator can use them directly in analysis. The event magnitude and date are presented and can be used for either superposition or correlation analysis.
• The quantitative relation between FD magnitude and the amplitude of the CR anisotropy is of interest to some researchers, e.g. [3]. The magnitudes of FDs and the usual/abnormal amplitudes presented in Tables 9-18 could be used to validate the result of the global survey method [4] used by the Russian group to calculate the amplitude of anisotropies associated with FDs.

Data Description
All the processed/filtered data presented in the diagrams and tables in this article were generated from the raw CR data (see Supplementary data). The raw data for each of the ten CR stations were downloaded from the IZMIRAN website (http://cr0.izmiran.ru/common), where the raw CR daily averages are provided in American Standard Code for Information Interchange (ASCII) format. The raw data from all ten stations share the same format (YYYY.mm.dd HH:MM, followed by the CR count).
Each file in the supplementary material contains the raw data from one station. The files are identified by the stations' four-letter abbreviated names (see table 1 of [6]). Raw data for the Climax CR station, for example, appear as CLMX, whereas SOPO stands for raw CR data from the South Pole NM. Some of the relevant characteristics of these stations, such as latitude, longitude, rigidity and altitude, are reported in table 1 of [6]. The full and abbreviated names of the ten stations are also included in that table. [...] Table 3, whereas non-simultaneous FDs are in Tables 4 and 5. Simultaneous FDs at APTY, MCMD, MOSC, OULU and SNAE are presented in Table 6.
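As a rough illustration of how simultaneous and non-simultaneous events might be separated, the sketch below assumes that an FD is "simultaneous" when the same calendar date appears in the catalogue of every station in a group. That criterion, the function name `split_by_simultaneity`, the station codes and the dates are all assumptions made for the example; they are not taken from [6] or from the tables.

```python
def split_by_simultaneity(catalogues):
    """Split FD dates into those seen at every station and the rest.

    catalogues maps a station code to the set of FD dates in its catalogue.
    """
    all_dates = set().union(*catalogues.values())
    simultaneous = set.intersection(*catalogues.values())
    return sorted(simultaneous), sorted(all_dates - simultaneous)


# Hypothetical station codes and placeholder dates, for illustration only.
catalogues = {
    "STA1": {"2003-05-30", "2003-10-31", "2003-06-23"},
    "STA2": {"2003-05-30", "2003-10-31"},
    "STA3": {"2003-05-30", "2003-10-31", "2003-11-24"},
}
simultaneous, non_simultaneous = split_by_simultaneity(catalogues)
```

A looser criterion, for example allowing a one-day offset between stations, would need a date-window comparison instead of a plain set intersection.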

Catalogues of Forbush Decreases Identified from Preliminary Processed CR Data
The FD data presented in Tables 1-6 [...] lines and filled circles. The blue filled circles represent the FDs associated with the Fourier transformed signal (FTS). The number of FDs detected from the FTS is displayed in the blue filled circles in Figs. 1-9. The method of processing the raw CR data and the attempted automated FD event identification are explained in the "R-Fourier-FD Location Algorithm" section and in Fig. 11.

Catalogues of Forbush Decreases Identified from Unprocessed CR Data
These are presented in Tables 7 and 8. The unprocessed data here are CR data from which the influence of CR diurnal anisotropy has not been removed. Although some data transformation or preprocessing is also implemented here, as indicated in Fig. 10, the CR data are referred to as unprocessed since the bias arising from CR anisotropy has not been corrected. The magnitudes of the events are comparable to those identified by other researchers employing manual identification. The advantage lies in the much greater number of events compared with those identified manually in 2003.
Additionally, the magnitude of the FDs identified from the unprocessed data is, on average, smaller than that of the FDs identified using the FTS method.

Catalogues of Forbush Decreases Identified from FTS (FD1) and Unprocessed (FD2) CR Data, and the Amplitude of the Associated Anisotropy
Although FDs and concurrent anisotropies have been well investigated, the magnitudes of anisotropies in individual NMs have not been quantified. The IZMIRAN group attempted this using data from an array of NMs (see http://spaceweather.izmiran.ru/eng/fds1965.html). Adjustment for the influence of anisotropies on the magnitude of the FDs has not been included. In the recent work of [6], two sets of FDs are calculated: FD1 (where the contribution from anisotropy is removed) and FD2 (where the impact of anisotropy is not taken into account). The amplitudes of both the normal and the enhanced anisotropies accompanying each FD at the ten stations are presented in Tables 9-18. The two FD datasets (FD1 and FD2), represented by the black and blue filled circles in Figs. 1-9, are also presented. Relating the absolute magnitudes of the anisotropies to the magnitudes of the two FD datasets at the different stations could lead to an understanding of the dependence of CR anisotropies on terrestrial/local factors.

Manual Identification of Forbush Events
CR intensity flux variations are continuously monitored by ground level neutron monitors (NMs). These variations may be increases or decreases, periodic or sporadic, short- or long-term, and rapid or gradual. FDs are regarded as the most informative events in the CR intensity changes. Unfortunately, NMs do not measure FDs directly. In the NM data, FDs must be distinguished from other variations of similar magnitude, such as the daily and enhanced CR anisotropies. Detecting FDs in CR data therefore requires a method of identification. Traditionally, the manual method has been used over the past eighty years. This method involves visual inspection of CR data, plotting of selected segments and calculation of the event magnitude. Fig. 10 illustrates the manual technique of detecting an FD event. One of the largest events (the FD of 30/5/2003, see Tables 1 and 7) is used to demonstrate the steps involved in the manual approach. This illustration should enable the reader to understand the automated approach presented hereafter.
The method involves extracting a segment of the raw CR data and plotting it, as indicated in panel a of Fig. 10. The plotted portion is first examined to identify the four most important parts of a Forbush event (onset, main phase, minimum point and recovery phase, see Fig. 10). The blue arrow shows a range of onset days that may be visually associated with the event of 30/5/2003 at the APTY station. Generally, an FD event includes a decrease lasting a day or more, depending on whether the onset is sudden or gradual, before reaching a minimum. After the minimum, the CR flux recovers over some days. The recovery may be rapid or slow. The plotted event may be discarded if any of the parts described is missing. If the researcher confirms these parts by visual inspection, the analysis proceeds to the next stage: calculation of the FD event magnitude. This stage also involves several steps. The researcher chooses the event onset and the minimum point of intensity reduction. While identifying the time of the minimum is usually easy, the time of onset may be more difficult to fix, as indicated in panel a.
The next step involves normalization and calculation of the event magnitude. There are two approaches to this, depending on the choice of the normalization baseline. The mean CR variation within the period may be used as the event threshold. On the other hand, the CR count on the day of FD minimum (labeled in panels a and b) may be used as the baseline. Separate equations are used for the two approaches. Equation 1 can be used to estimate event magnitude for the raw data displayed in panel a.
where FD min represents the CR count value on the day of minimum reduction. Equation 2 is used to calculate FD magnitude from panel b of Fig. 10 .
where mean I represents the mean value of the CR data for a predefined period. The FD magnitudes reported in panels a and b are calculated using Eqs. (1) and (2) respectively.
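The two normalization choices can be illustrated with a short sketch. The exact forms of Eqs. (1) and (2) are not reproduced in this text, so the code below assumes the common percentage-deviation form, 100 × (count − baseline) / baseline, with the baseline set either to the period mean or to the count on the day of the minimum. The function name `percent_deviation` and the daily counts are invented for the example.

```python
def percent_deviation(counts, baseline):
    """Express each count as a percentage deviation from the baseline."""
    return [100.0 * (c - baseline) / baseline for c in counts]


# Invented daily counts around a single event (not real station data).
counts = [100.0, 99.0, 95.0, 96.0, 98.0, 100.0]

# Mean of the selected period as baseline: the event is the deepest dip.
mean_baseline = sum(counts) / len(counts)
normalized = percent_deviation(counts, mean_baseline)
fd_magnitude = min(normalized)
fd_day = normalized.index(fd_magnitude)

# Minimum-day count as baseline (assumed reading of the other approach):
# the pre-event level, here the first day, relative to the minimum count.
onset_vs_min = percent_deviation([counts[0]], min(counts))[0]
```

With the mean as baseline the event appears as a negative percentage, whereas with the minimum-day count as baseline it appears as a positive percentage of the pre-event level, so magnitudes from the two conventions are not directly interchangeable.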

Automated Identification of Forbush Events
Two FD location algorithms are employed here. The first, referred to as the R-FD Location Algorithm, handles raw CR data, whereas the second (R-Fourier-FD Location) takes Fourier transformed CR data as its input. Rather than testing for the occurrence of a Forbush event using a selected portion of CR data, as demonstrated in the manual method, the two codes can take complete CR data for one or more years as their input signal. Panel a of Fig. 11 presents APTY data for the whole of 2003. The large FD of 30/5/2003 plotted in Fig. 10 is clearly marked. The largest event, that of 31/10/2003, is also evident. Other smaller dips (indications of FDs) are also noticeable. The data are used as the input signal for the two algorithms. The two methods are briefly described below.

• R-FD Location Algorithm
This algorithm is designed to search for dips/depressions/turning points in the unprocessed CR data. These dips are indications of FDs. It also records the time of the depressions. The search for the dips and the time of occurrence are executed by two subroutines implemented in the code. Normalization and filtering steps (Eq. (2)) similar to those implemented in the manual method are applied. Using the mean as the normalization baseline, the first subroutine calculates the size of the dips while the second simultaneously records the time of the depressions. The result of normalization and data filtering is displayed in panel b of Fig. 11. The event magnitude and date calculated by the program are presented in Table 10. The black signals (labeled Raw CR data) and the associated filled circles in Figs. 2-9 are other forms of output of the algorithm.
• R-Fourier-FD Location Algorithm
The design of this code is very similar to that of the R-FD program. All the subroutines included in the R-FD Location code are integrated in the R-Fourier-FD code. The major difference lies in the Fourier transformation subroutine, which is implemented only in the R-Fourier-FD code. The code was developed in [9] and implemented in [6]. We briefly describe the experimental setup.
Either of the datasets displayed in panels a and b can serve as input to the code. Note that the difference between the two lies in the normalization and filtering steps. Both datasets are considered raw/unprocessed, as the CR anisotropy has not been removed. The input data are then treated as a Fourier series composed of several harmonics. The two major components of interest are the high and low frequency signals. Rather than dealing with several other superimposed signals separately, the raw data are separated into two component signals: the high frequency and low frequency signals. The two signals are presented in panels c and d of Fig. 11. The high frequency signal of the CR data contains the Forbush events. This signal is passed to the R-FD Location code, which calculates both the FD magnitude and the time of occurrence. The high frequency signal is labeled FTS (Fourier transformed signal) in Figs. 2-9. The low frequency signal contains both the linear trend (monotonic variation) and the CR anisotropy. In order to remove the linear trend, normalization and filtering steps are further implemented. The resulting signal is displayed in Figs. 2-9 as "Diurnal wave". A simple coincidence algorithm is used to identify the amplitude of the diurnal anisotropy for each of the FDs detected at the stations. Some of the results are presented in Tables 9-11.
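The two-stage idea described above (split the series into low and high frequency parts, then search the high frequency part for dips) can be sketched in a few lines. This is a minimal pure-Python illustration, not the R code of [9] or [6]: the harmonic cutoff `n_low`, the dip `threshold`, the function names and the synthetic series are all assumptions chosen for the example.

```python
import cmath
import math


def low_frequency_part(signal, n_low):
    """Rebuild the signal from its mean and its n_low lowest harmonics."""
    n = len(signal)
    if not 0 <= n_low < n / 2:
        raise ValueError("n_low must be below half the series length")
    low = [sum(signal) / n] * n
    for k in range(1, n_low + 1):
        # Discrete Fourier coefficient of harmonic k.
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(signal)) / n
        for t in range(n):
            # For real input, harmonics k and n-k are complex conjugates,
            # so together they contribute twice the real part.
            low[t] += 2 * (coeff * cmath.exp(2j * math.pi * k * t / n)).real
    return low


def locate_dips(signal, threshold):
    """Return (index, value) for local minima lying below the threshold."""
    return [(t, signal[t]) for t in range(1, len(signal) - 1)
            if signal[t] < signal[t - 1] and signal[t] <= signal[t + 1]
            and signal[t] < threshold]


# Synthetic series: a slow wave plus one sharp depression on day 40.
n_days = 64
series = [2.0 * math.sin(2 * math.pi * t / n_days) for t in range(n_days)]
series[40] -= 5.0

low = low_frequency_part(series, n_low=2)
high = [x - l for x, l in zip(series, low)]  # plays the role of the FTS
dips = locate_dips(high, threshold=-3.0)
```

In this toy case the slow wave ends up entirely in `low`, and the only dip reported in `high` is the one placed on day 40. With real data the choice of cutoff and threshold decides how many events are reported, so both would need to be tuned against the published catalogues.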

Table 17
Catalogues of Forbush Decreases Identified from FTS and Unprocessed CR Data at OULU station and the associated amplitude of the CR anisotropy (ADV).