A data set of earthquake bulletin and seismic waveforms for Ghana obtained by deep learning

The Ghana Digital Seismic Network (GHDSN), comprising six broadband sensors, operated in southern Ghana for two years (2012-2014). The recorded data were processed for simultaneous event detection and phase picking with a Deep Learning (DL) model, the EQTransformer tool. Here we present the detected earthquakes, consisting of supporting data, waveforms (including P and S arrival phases), and an earthquake bulletin. The bulletin includes 559 arrival times (292 P and 267 S phases) and the waveforms of 73 local earthquakes in SEISAN format. The supporting data encompass the preliminary crustal velocity model obtained from the joint inversion of the detected hypocentral parameters. They comprise a 6-layer crustal velocity model (Vp and Vp/Vs ratio), the time sequence of the events, a statistical analysis of the detected earthquakes, the hypocentral parameters relocated with the updated crustal velocity model, and a live 3D figure of the hypocenters that illustrates the seismogenic depth of the region. The dataset allows earth science specialists to analyze and reprocess the detected waveforms and to characterize the seismogenic sources and active faults in Ghana. The metadata and waveforms have been deposited at the Mendeley Data repository [1].



Value of the Data
• The seismic waveforms, the crustal velocity data, and the updated catalog allow Earth Science researchers to conduct more advanced studies on the seismicity and tectonics of the region.
• The seismic catalog, updated to July 2022, can be used as a source for Probabilistic Seismic Hazard Assessment (PSHA) in the region. This is especially important because the region lacks a complete source of seismic data.
• The live 3D figures of the seismicity provide a three-dimensional view of the active seismogenic sources in Southern Ghana.

Objective
This article provides a detailed explanation of, and supplementary information on, an original paper [2], including earthquake waveforms, the compiled seismic catalog, the quality parameters of the events, the seismicity parameter estimates, the 3D hypocentral map, and some preliminary analyses of the seismicity. The earthquake waveforms are provided in Seisan format as a convenient source for additional work, such as focal mechanism studies. The compiled catalog is a ready resource for conducting seismic hazard assessments. This paper enhances the original paper by providing additional analysis and detailed information.

Data Description
A large volume of digital seismic data was recorded by the GHDSN between September 2012 and April 2014, using broadband sensors equipped with GPS-clock timing. Local earthquakes in this dataset were detected using DL methods [3]. The earthquake catalog compiled from processing the GHDSN dataset [4] includes 73 events. Post-processing was performed on the detected earthquakes to obtain an updated crustal velocity model and the hypocentral parameters of the detected earthquakes [5,6].

Compiled Updated Catalog
The compiled catalog file (presented in CSV format) contains 11 columns; each row holds the information for an individual earthquake, and the columns hold the associated parameters. The catalog file follows the standard input format of common Probabilistic Seismic Hazard Assessment (PSHA) software such as OpenQuake [7]. The catalog contains the following attributes:
The data availability timetable of the raw GHDSN dataset is shown in Fig. (1). A time sequence plot of the detected earthquakes in this dataset is shown in Fig. (2). The detected events are classified according to the criteria presented in Tables 1-3. Two sets of events, the complete set and a subset with a higher detection accuracy, are shown in Fig. (3). The located events, together with the compiled catalog [2], are depicted in Fig. (4). Finally, a magnitude of completeness and b-value plot of the detected events is shown in Fig. (5).

Seismic Waveforms and Bulletin in the Seisan Format
A set of waveforms for the 73 detected earthquakes, accompanied by the corresponding bulletin information, is provided in the Seisan format (S-file) [8].

Live 3D Matlab Figures of Hypocenters
This live 3D figure in MATLAB format (the "Hypocenters_fig6.fig" file in [1]) shows the hypocentral positions calculated using the joint-inversion method. Different earthquake clusters are shown in distinct colors, so that the seismic sources can be clearly delineated. Because the figure is live, the user can rotate it and examine the hypocenter locations from different angles.

Updated Crustal Velocity Model in SEISAN Format
The updated crustal velocity model in SEISAN format consists of 6 constant-velocity layers. The upper five layers are 1, 13, 8, 13, and 10 km thick, with Vp = 5.9, 6.1, 6.3, 6.5, and 6.9 km/s, respectively. The sixth layer is a half-space with a velocity of 7.2 km/s that underlies the model.
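Velocity-model files commonly list the depth to the top of each layer rather than its thickness, so it can help to convert the thicknesses quoted above. The following is a minimal sketch in Python that uses only the values given in this section; the function names are hypothetical and are not part of the published dataset. It also computes a one-way vertical P travel time as a simple sanity check on the layering.

```python
# Layer thicknesses (km) and P velocities (km/s) as quoted in the text.
# The last velocity belongs to the underlying half-space.
THICKNESS_KM = [1.0, 13.0, 8.0, 13.0, 10.0]
VP_KM_S = [5.9, 6.1, 6.3, 6.5, 6.9, 7.2]


def layer_tops(thicknesses):
    """Return the depth (km) to the top of each layer, half-space included."""
    tops = [0.0]
    for h in thicknesses:
        tops.append(tops[-1] + h)
    return tops


def vertical_p_time(depth_km, thicknesses=THICKNESS_KM, velocities=VP_KM_S):
    """One-way vertical P travel time (s) from the surface down to depth_km."""
    time_s, top = 0.0, 0.0
    for h, v in zip(thicknesses, velocities):
        if depth_km <= top:
            return time_s
        path = min(h, depth_km - top)  # portion of this layer that is crossed
        time_s += path / v
        top += h
    if depth_km > top:  # remaining path lies in the half-space
        time_s += (depth_km - top) / velocities[-1]
    return time_s


TOPS_KM = layer_tops(THICKNESS_KM)
for top, vp in zip(TOPS_KM, VP_KM_S):
    print(f"Vp = {vp:.1f} km/s, layer top at {top:.1f} km")
```

With the thicknesses above, the layer tops fall at 0, 1, 14, 22, 35, and 45 km, so the half-space begins at 45 km depth.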

Experimental Design, Materials and Methods
The current data set consists of earthquake waveforms, a seismic catalog, and a detailed earthquake bulletin, all detected and extracted from the GHDSN data set [4] by applying the DL method and a "conservative strategy". In addition to the extracted earthquake waveforms, an earthquake catalog updated to July 2022 is provided for the Ghana region.
The data also include a 3D map and live figures in MATLAB (.fig) format, which give readers a 3D view of the active seismogenic sources in Southern Ghana. During the recording period the network captured waveforms of both local and teleseismic earthquakes. The detected earthquakes from this interval that could be located are shown in Fig. (4). This figure also shows the catalog compiled from all the data sources (referred to in Sec. (2)).
The related article entitled "Application of deep learning for seismicity analysis in Ghana" has been published in the Geosystems and Geoenvironment journal.

Time Sequence of the Earthquakes
The hypocentral parameters of the earthquakes are estimated with a joint-inversion algorithm that searches simultaneously for the velocity model and the hypocentral parameters that best fit the arrival times [8]. Here, we briefly characterize these events. Fig. (2) shows that the earthquakes are distributed homogeneously in time, implying that the events do not cluster in time during this period.

Other Quality Parameters
As further quality control parameters for evaluating the accuracy of the detected events, we present (1) the number of events within specified azimuthal GAPs, (2) the number of events with a specified Root Mean Square Error (RMSE), and (3) the number of events recorded by a number of stations equal to or greater than a specified threshold, in Tables 1, 2, and 3, respectively.
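Counts of this kind can be tabulated directly from a catalog file. The sketch below is illustrative only: the column names (`gap`, `rms`, `nsta`), the sample rows, and the threshold values are assumptions made for the example and are not taken from the published catalog or from Tables 1-3.

```python
import csv
import io

# Hypothetical catalog excerpt. Column names and values are illustrative
# and do not come from the published files.
SAMPLE = """event_id,gap,rms,nsta
1,120,0.2,5
2,200,0.6,3
3,90,0.4,4
4,310,1.1,3
"""


def load_events(text):
    """Parse CSV text into a list of dicts with numeric quality fields."""
    events = []
    for row in csv.DictReader(io.StringIO(text)):
        events.append({
            "gap": float(row["gap"]),
            "rms": float(row["rms"]),
            "nsta": int(row["nsta"]),
        })
    return events


def count_at_most(events, key, limits):
    """Number of events whose value of `key` is <= each limit."""
    return {lim: sum(1 for e in events if e[key] <= lim) for lim in limits}


def count_at_least(events, key, limits):
    """Number of events whose value of `key` is >= each limit."""
    return {lim: sum(1 for e in events if e[key] >= lim) for lim in limits}


events = load_events(SAMPLE)
print("GAP  <=", count_at_most(events, "gap", [180, 270, 360]))
print("RMS  <=", count_at_most(events, "rms", [0.5, 1.0]))
print("NSTA >=", count_at_least(events, "nsta", [3, 4, 5]))
```

Keeping the thresholds as function arguments makes it straightforward to regenerate the counts for other bin choices.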

Depth Distribution and Clusters
According to the quality statistics presented in the previous section, we classified the events into two sets: (1) the complete set, and (2) a subset of better-quality events with RMS ≤ 0.5 and NSTA ≥ 4, which contains 43 events.
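The selection rule for the better-quality subset can be written as a simple filter. In this minimal sketch the two thresholds are those quoted above, while the record layout (`id`, `rms`, `nsta`) and the example events are hypothetical.

```python
RMS_MAX = 0.5   # threshold quoted in the text
NSTA_MIN = 4    # threshold quoted in the text


def better_quality(events, rms_max=RMS_MAX, nsta_min=NSTA_MIN):
    """Keep events that satisfy both RMS <= rms_max and NSTA >= nsta_min."""
    return [e for e in events if e["rms"] <= rms_max and e["nsta"] >= nsta_min]


# Hypothetical records, used only to exercise the filter.
sample_events = [
    {"id": "ev1", "rms": 0.3, "nsta": 5},
    {"id": "ev2", "rms": 0.5, "nsta": 4},  # exactly on both thresholds: kept
    {"id": "ev3", "rms": 0.7, "nsta": 6},  # RMS too large
    {"id": "ev4", "rms": 0.2, "nsta": 3},  # too few stations
]
subset = better_quality(sample_events)
print(len(sample_events), "events in total,", len(subset), "in the subset")
```

Both conditions are inclusive, matching the "≤" and "≥" in the criteria.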

Plot of Full Catalog Updated to April 2022
The epicentral map of the updated catalog for Ghana and the surrounding area, including the historical, reported instrumental, and DL catalogs, is plotted in Fig. (4).

Magnitude of Completeness and b-Value Calculation
In this part, we present the magnitude of completeness (Mc) and the b-value of the catalog of 73 earthquakes detected by the DL method. The Mc for the 73 DL-detected events is shown in Fig. (5). The Frequency Magnitude Distribution (FMD) and the cumulative rate show the standard pattern, with Mc = 3.2 and b-value = 1.9.
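The b-value is the slope of the Gutenberg-Richter relation, log10 N = a - bM, fitted to events at or above Mc. The text does not state which estimator was used, so the following is only a sketch of one common choice, the Aki-Utsu maximum-likelihood estimate with a correction for magnitude binning. The function name, the bin width of 0.1, and the sample magnitudes are assumptions for illustration and are not taken from the dataset.

```python
import math


def b_value_aki(magnitudes, mc, bin_width=0.1):
    """Aki-Utsu maximum-likelihood b-value for events with M >= mc.

    Returns (b, sigma), where sigma = b / sqrt(N) is Aki's first-order
    uncertainty and N is the number of events at or above mc.
    """
    complete = [m for m in magnitudes if m >= mc - 1e-9]
    if len(complete) < 2:
        raise ValueError("need at least two events at or above Mc")
    mean_mag = sum(complete) / len(complete)
    # Shifting Mc by half a bin corrects for magnitudes rounded to bin_width.
    denom = mean_mag - (mc - bin_width / 2.0)
    if denom <= 0:
        raise ValueError("mean magnitude must exceed Mc - bin_width / 2")
    b = math.log10(math.e) / denom
    return b, b / math.sqrt(len(complete))


# Hypothetical magnitudes, not taken from the published catalog.
mags = [3.2, 3.3, 3.4, 3.5, 3.2, 3.3, 2.8]
b, sigma = b_value_aki(mags, mc=3.2)
print(f"b = {b:.2f} +/- {sigma:.2f}")
```

With a small catalog the uncertainty on b is large, so an estimate of this kind is best read together with the FMD plot.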

Ethics Statements
Not applicable.