Published July 13, 2023 | Version 1
Dataset Open

Machine-learning based lightning nowcasting data archive

  • 1. Wuhan University
  • 2. University of Tennessee

Description

This data archive contains the derived data supporting the findings of the article "Lightning nowcasting with aerosol-informed machine learning and satellite-enriched dataset". The paper is currently available as a preprint: https://doi.org/10.21203/rs.3.rs-2616886/v1

The prediction results in this data archive are generated by various models:

1. Current model. This model takes aerosol observations, meteorological variables, and auxiliary datasets as input, with data enrichment from the Geostationary Lightning Mapper (GLM). In the demo dataset, data from the year 2020 are used for training and prediction under a cross-validation scheme.

2. LMA model. This baseline model uses only data labels obtained from the ground-based Lightning Mapping Array (LMA), which observes lightning occurrence accurately but over a limited spatial range.

3. No-AOD model. This baseline model uses no aerosol observations during the machine learning process.

The model results are given as continuous values between 0 and 1. The trade-off between Probability of Detection (POD) and False Alarm Ratio (FAR) can be tuned by selecting different thresholds.
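To illustrate how the threshold choice moves POD and FAR, here is a minimal Python sketch. It assumes the standard contingency-table definitions, POD = hits / (hits + misses) and FAR = false alarms / (hits + false alarms). The function name `pod_far` and the toy values are hypothetical and are not taken from the archive files.

```python
def pod_far(prob, observed, threshold):
    """Return (POD, FAR) after converting scores in [0, 1] to yes/no forecasts."""
    hits = misses = false_alarms = 0
    for p, obs in zip(prob, observed):
        forecast = p >= threshold          # binarize the continuous model output
        if forecast and obs:
            hits += 1                      # lightning forecast and observed
        elif obs:
            misses += 1                    # lightning observed but not forecast
        elif forecast:
            false_alarms += 1              # lightning forecast but not observed
    # Return NaN when a ratio is undefined (no events or no "yes" forecasts).
    pod = hits / (hits + misses) if hits + misses else float("nan")
    far = false_alarms / (hits + false_alarms) if hits + false_alarms else float("nan")
    return pod, far


# Hypothetical toy data: model scores and observed lightning flags (1 = lightning).
prob = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
observed = [1, 1, 0, 1, 0, 0]

for threshold in (0.2, 0.5, 0.7):
    pod, far = pod_far(prob, observed, threshold)
    print(f"threshold={threshold:.1f}  POD={pod:.2f}  FAR={far:.2f}")
```

On this toy data, raising the threshold from 0.2 to 0.7 lowers FAR from 0.40 to 0.00 while POD drops from 1.00 to 0.67, which is the trade-off described above.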

Other datasets:

1. Dataset for training. It is provided for public use in training the current model and the no-AOD model (the training input features differ between the two).

2. PM2.5 dataset. The real-time, spatially continuous, hourly PM2.5 dataset is obtained following the published method of Zeng et al. In this method, the underlying in-situ measurements come from the Air Quality System (AQS) monitoring network operated by the United States Environmental Protection Agency.

References:

Siwei Li, Ge Song, Jia Xing et al. Lightning nowcasting with aerosol-informed machine learning and satellite-enriched dataset, 14 March 2023, PREPRINT (Version 1) available at Research Square [https://doi.org/10.21203/rs.3.rs-2616886/v1]

Zeng, Z. et al. Estimating hourly surface PM2.5 concentrations across China from high-density meteorological observations by machine learning. Atmospheric Research 254, 105516 (2021).

Files

Files (991.9 MB)

Checksum and size of each file:
md5:4ca04a1ad6262efd0e9c1146aef33fb8  170.9 MB
md5:098964514db548cbdb40e64536e299a0  105.2 MB
md5:8e72f405a7d42aa3fcaef6866a193d7d  94.7 MB
md5:a2d9beed4115190449cc3797f2f73489  290.7 MB
md5:79ad2fdf422374b4099d71ec2cbe5527  330.4 MB
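
The md5 values listed above can be used to check that a downloaded file arrived intact. The following is a minimal sketch using Python's standard `hashlib` module. The helper names `md5_of_file` and `matches_checksum` are hypothetical, and because the file names are not shown here, the demonstration runs on a small temporary file rather than on a real archive file.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file against a checksum written as 'md5:<hex>' or '<hex>'."""
    expected_hex = expected.strip().lower().split(":")[-1]
    return md5_of_file(path) == expected_hex


# Demonstration on a temporary file (stand-in for a downloaded archive file).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
    demo_path = tmp.name

try:
    # MD5 of the bytes b"hello" is 5d41402abc4b2a76b9719d911017c592.
    demo_ok = matches_checksum(demo_path, "md5:5d41402abc4b2a76b9719d911017c592")
    # A checksum from the list above will not match this stand-in file.
    demo_bad = matches_checksum(demo_path, "md5:4ca04a1ad6262efd0e9c1146aef33fb8")
finally:
    os.remove(demo_path)

print("matching checksum accepted:", demo_ok)
print("mismatched checksum accepted:", demo_bad)
```

For a real download, the path of the saved file and the corresponding md5 string from the list would be passed to `matches_checksum`.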
