Sunspot data and human longevity

Title of referenced paper: "Solar energy at birth and human lifespan", Journal of Photochemistry & Photobiology B 186 (2018) 59–68. The paper uses National Center for Health Statistics (NCHS) death data collected from 1979 to 2013, inclusive, and average monthly solar intensity, as measured by sunspot number, collected from the National Oceanic and Atmospheric Administration (NOAA) for 1900–2013, inclusive.


Data accessibility
The data used with this article are in a public repository and are available.

Value of the data
The data used in this article are freely available to the public. Death data can be used to assess lifespan by cause of death, and sunspot data can be used to track solar cycles as they pertain to lifespan.
These data may allow researchers to see how levels of ultraviolet radiation (UVR) energy may affect lifespan by using animal models.
These data might be used prospectively to determine the effects of solar energy on specific causes of death.

For death data: https://www.cdc.gov/nchs/nvss/deaths.htm. A specific request is required for deidentified data or specific formats [1].
For solar data: www.noaa.gov [2].
Psybernetics Research Group has collected average monthly ground UVR data (UV-A and UV-B, as well as total solar energy, both in joules/m²) from 2007 through 2017 in Augusta, ME (latitude +44.308992 degrees; longitude −69.769008 degrees). Data from this period, which largely covers Solar Cycle 24, are available upon request from georgedavi@gmail.com [3].

Data
The data shared here include sunspot number (SSN) by year and month from 1900 to 2009, statistical tables from the original data sets and lifespan plots of data for various age groups. NCHS data formats are described in this attachment.

Cohort data
A total of 78,645,528 death records were obtained from NCHS for 1979 to 2013. The following variables from the dataset were used: year of birth (YOB), month of birth (MOB), sex, year of death (YOD) and race (White, Black, Native-American, and Asian). The dependent variable was lifespan, calculated as YOD minus YOB. Records with a lifespan longer than 113 years were designated as outliers and deleted from the analysis. Birth years originally ranged from 1866 to 2013. Table 1 (in the original paper) summarizes the original and scrubbed cohort data by sample size, mean age, sex, and race. For this analysis, deaths caused by accidents, suicides and war casualties were deleted, and the cohort was restricted to birthdates from 1900 to 2013. Suicides were deleted as their number was very small relative to the entire dataset. The final dataset comprised 31,807,486 females and 31,947,344 males (total = 63,754,830). Multiple sclerosis data (N = 85,202) were derived from the entire dataset by diagnosis code (ICD-10 = G35, ICD-9 = 340).
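The cohort rules described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code (the original analysis used SAS): the `DeathRecord` class, the `cause` field and its labels are hypothetical stand-ins for the NCHS cause-of-death coding, and the sample records are invented.

```python
from dataclasses import dataclass

MAX_LIFESPAN = 113                 # lifespans above this are treated as outliers (from the text)
MIN_YOB, MAX_YOB = 1900, 2013      # birth-year window used for the analysis (from the text)
# Hypothetical labels; the real NCHS files identify causes by ICD codes.
EXCLUDED_CAUSES = {"accident", "suicide", "war"}


@dataclass
class DeathRecord:
    yob: int     # year of birth
    mob: int     # month of birth (1-12)
    yod: int     # year of death
    sex: str
    race: str
    cause: str   # hypothetical simplified cause-of-death label


def lifespan(rec: DeathRecord) -> int:
    """Lifespan in years: year of death minus year of birth."""
    return rec.yod - rec.yob


def keep(rec: DeathRecord) -> bool:
    """Apply the three exclusion rules described in the text."""
    return (
        lifespan(rec) <= MAX_LIFESPAN
        and MIN_YOB <= rec.yob <= MAX_YOB
        and rec.cause not in EXCLUDED_CAUSES
    )


def scrub(records):
    """Return only the records that pass every rule."""
    return [r for r in records if keep(r)]


# Invented example records, for illustration only.
sample = [
    DeathRecord(1920, 5, 1990, "F", "White", "disease"),   # kept
    DeathRecord(1866, 1, 1985, "M", "White", "disease"),   # lifespan 119, born before 1900
    DeathRecord(1950, 3, 1980, "M", "Black", "accident"),  # excluded cause
    DeathRecord(1895, 7, 1980, "F", "Asian", "disease"),   # born before 1900
]
cleaned = scrub(sample)
```

Keeping each rule as a separate condition in `keep` makes it easy to count how many records each rule removes, which is useful when comparing against the original and scrubbed sample sizes.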

Solar data
Solar cycle data, as measured by monthly SSN, were collected and used as a surrogate for UVR; that is, the higher the SSN, the greater the UVR intensity. The average number of sunspots per month and per year was collected from the NOAA web site (https://www.ngdc.noaa.gov/stp/space-weather/solar-data/solar-indices/sunspot-numbers; see also Appendix B for the SSN data) [2]. To examine the influence of solar radiation on lifespan, sunspot numbers by year and month were match-merged with each cohort case's birth year and birth month. Mean SSN for the entire cohort was 47.68 (median 41.57) with a minimum of 0 and a maximum of 253; for the scrubbed data used in this analysis, the mean SSN was 43.4 (median 38.1) with a minimum of 0 and a maximum of 253.
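The match-merge step can be illustrated with a small Python sketch. It assumes the monthly SSN values are held in a dictionary keyed by (year, month) and that records without a matching month are dropped; both are assumptions made for the example, and the function names and SSN values shown are invented rather than taken from the NOAA files.

```python
from statistics import mean, median


def merge_ssn(records, ssn_by_month):
    """Attach the SSN of each record's birth year and month.

    Records whose (yob, mob) has no SSN entry are dropped
    (an assumption of this sketch).
    """
    merged = []
    for rec in records:
        key = (rec["yob"], rec["mob"])
        if key in ssn_by_month:
            merged.append({**rec, "ssn": ssn_by_month[key]})
    return merged


def ssn_summary(merged):
    """Mean, median, minimum and maximum SSN across merged records."""
    values = [row["ssn"] for row in merged]
    return {
        "mean": mean(values),
        "median": median(values),
        "min": min(values),
        "max": max(values),
    }


# Illustrative values only, not NOAA data.
ssn_by_month = {(1920, 5): 30.0, (1950, 3): 110.0, (1954, 1): 0.0}
records = [
    {"yob": 1920, "mob": 5, "yod": 1990, "sex": "F"},
    {"yob": 1950, "mob": 3, "yod": 2010, "sex": "M"},
    {"yob": 1954, "mob": 1, "yod": 2000, "sex": "F"},
    {"yob": 1899, "mob": 12, "yod": 1980, "sex": "M"},  # no SSN entry, dropped
]
merged = merge_ssn(records, ssn_by_month)
summary = ssn_summary(merged)
```

Because every person born in the same month receives the same SSN, the cohort-level mean and median are weighted by the number of deaths recorded for each birth month.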

Analysis strategy
Two strategies were used for the analysis. The first was to use regression analysis (SAS 9.3) to test the hypothesis that UVR, as measured by SSN at year of birth (YOB) and month of birth (MOB), affects subsequent age at death; that is, lifespan. Table 2 (in the main manuscript) displays the correlation matrix for these variables. The regression (GLM) model tested the relationship between lifespan, SSN, sex and their respective interactions.
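One plausible way to write such a model is shown here as a sketch; the exact parameterization used in SAS is not given in this article. It assumes sex is coded as a 0/1 indicator, and the coefficient symbols are introduced only for illustration.

```latex
\[
\mathrm{lifespan}_i = \beta_0 + \beta_1\,\mathrm{SSN}_i + \beta_2\,\mathrm{sex}_i
  + \beta_3\,(\mathrm{SSN}_i \times \mathrm{sex}_i) + \varepsilon_i
\]
```

Under this coding, \(\beta_1\) is the change in lifespan per unit of SSN for the reference sex, and \(\beta_3\) is how much that slope differs for the other sex.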
The second strategy was to plot lifespan by SSN to visually assess the relationship between increasing SSN (UVR at the time of birth) and lifespan by sex for all races. Charts were created by categorizing SSN into intervals of 10 (0-10, 10-20, 20-30, etc.) and summarizing the data within each interval. The mean SSN and mean lifespan, with their respective standard deviations, were calculated for each interval (Figures 3 and 4 in the main manuscript). Except for the 90-113-year-old cohort (Figure 7 in the manuscript), all plots by SSN group and age group (e.g., infancy, early life, puberty and post-menopause) are in Appendices A-1 to A-4; a table of the average lifespan for each of these groups is in Appendix A-5. Appendices A-6 to A-11 display additional SAS statistical tables referring to the data used in the original manuscript and may be of interest to statisticians.
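The interval summary behind these charts can be sketched as follows. The source does not say how values on an interval boundary were assigned, so this example assumes half-open intervals (an SSN of exactly 10 falls in 10-20); the function names and input rows are invented for illustration.

```python
from collections import defaultdict
from statistics import mean, stdev


def ssn_bin(ssn, width=10):
    """Lower edge of the SSN interval, e.g. 12.5 -> 10 (half-open bins assumed)."""
    return int(ssn // width) * width


def summarize(rows, width=10):
    """Group (sex, ssn, lifespan) rows by sex and SSN interval.

    Returns, per group, the count plus the mean and sample standard
    deviation of SSN and lifespan. The SD is None for single-record groups.
    """
    groups = defaultdict(list)
    for sex, ssn, life in rows:
        groups[(sex, ssn_bin(ssn, width))].append((ssn, life))

    summary = {}
    for key in sorted(groups):
        ssns = [s for s, _ in groups[key]]
        lives = [l for _, l in groups[key]]
        enough = len(ssns) > 1
        summary[key] = {
            "n": len(ssns),
            "mean_ssn": mean(ssns),
            "sd_ssn": stdev(ssns) if enough else None,
            "mean_lifespan": mean(lives),
            "sd_lifespan": stdev(lives) if enough else None,
        }
    return summary


# Invented rows: (sex, SSN at birth month, lifespan in years).
rows = [
    ("F", 5.0, 70),
    ("F", 7.0, 80),
    ("F", 10.0, 75),
    ("M", 12.5, 60),
]
table = summarize(rows)
```

Plotting `mean_lifespan` against `mean_ssn` for each sex then gives one point per interval, which is the kind of chart described above.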