Letter (Open Access)

Investigating permafrost carbon dynamics in Alaska with artificial intelligence


Published 16 November 2023. © 2023 The Author(s). Published by IOP Publishing Ltd.

Part of the Focus on Permafrost Vulnerability to Climate Change collection. Citation: B A Gay et al 2023 Environ. Res. Lett. 18 125001. DOI: 10.1088/1748-9326/ad0607


Abstract

Positive feedbacks between permafrost degradation and the release of soil carbon into the atmosphere impact land–atmosphere interactions, disrupt the global carbon cycle, and accelerate climate change. The widespread distribution of thawing permafrost is causing a cascade of geophysical and biochemical disturbances with global impacts. Currently, few earth system models account for permafrost carbon feedback (PCF) mechanisms. This research study integrates artificial intelligence (AI) tools and information derived from field-scale surveys across the tundra and boreal landscapes in Alaska. We identify and interpret the permafrost carbon cycling links and feedback sensitivities with GeoCryoAI, a hybridized multimodal deep learning (DL) architecture of stacked convolutionally layered, memory-encoded recurrent neural networks (NN). This framework integrates in-situ measurements and flux tower observations for teacher forcing and model training. Preliminary experiments to quantify, validate, and forecast permafrost degradation and carbon efflux across Alaska demonstrate the fidelity of this data-driven architecture. More specifically, GeoCryoAI logs the ecological memory and effectively learns covariate dynamics while demonstrating an aptitude to simulate and forecast PCF dynamics—active layer thickness (ALT), carbon dioxide flux (CO2), and methane flux (CH4)—with high precision and minimal loss (i.e. ALT RMSE: 1.327 cm [1969–2022]; CO2 RMSE: 0.697 µmol CO2 m⁻² s⁻¹ [2003–2021]; CH4 RMSE: 0.715 nmol CH4 m⁻² s⁻¹ [2011–2022]). ALT variability is a sensitive harbinger of change and a unique signal characterizing the PCF, and our model provides the first characterization of these dynamics across space and time.


Original content from this work may be used under the terms of the Creative Commons Attribution 4.0 license. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.

1. Introduction

1.1. Permafrost carbon feedback

Frozen soil and carbon-rich permafrost characterize nearly 14 million square kilometers of the global terrestrial surface, with total soil organic carbon stock estimates near 1307 ± 170 PgC (Hugelius et al 2014). Across the Circumarctic, quantifying the persistent irregularities and impacts attributed to permafrost degradation remains a scientific challenge. The transitional state of permafrost and spatiotemporal ALT heterogeneity drive abrupt changes emerging from rapid, nonlinear carbon-climate feedback mechanisms. These processes are correlated with several biotic and abiotic factors throughout the tundra and boreal zones, including tundra shrub encroachment, boreal forest migration, caribou migration patterns, topography, precipitation, solar radiation, land surface temperature, and subsurface hydrologic flow (Lloyd et al 2003, Evans et al 2020, Aguirre et al 2021, Joly et al 2021). Carbon release originating from the permafrost carbon feedback is a climate change catalyst that amplifies localized warming patterns, disrupts carbon cycle partitioning, and destabilizes land–atmosphere coupling mechanisms and feedback thresholds (Sannel and Kuhry 2011, Koven et al 2015, Schuur et al 2015, 2022).

Abrupt permafrost thaw triggers ground subsidence, thermokarst formation, and the proliferation of wetlands, ponds, and methane emission hotspots near littoral zones (Jorgenson et al 2006, 2013, Walter et al 2007, Klapstein et al 2014, Turetsky et al 2014, Olefeldt et al 2016). Across the northern latitudes, increased warming and precipitation trends continue to foster an exponential increase in permafrost carbon release, i.e. permafrost carbon loss is a function of global mean temperature change: $\text{pC}_{\text{loss}} = f(\text{GMT})$ (Pastick et al 2015). Although most model projections arbitrarily terminate between 2100 and 2500 CE, it is purported that anthropogenic-induced warming—coupled with permafrost degradation and soil carbon release—may trigger a positive carbon-climate feedback and sustain a warmer, inhospitable equilibrium for hundreds of thousands of years (Koven et al 2011, McGuire et al 2018, Steffen et al 2018, Wang et al 2023).

Permafrost degradation remains a substantial contributor to the uncertainty of soil carbon release and may promote a positive feedback on the climate system (Koven et al 2015, Burke et al 2017). Synergistic relationships exist among permafrost degradation, thaw subsidence, and terrestrial carbon cycling dynamics. Despite these revelations, the physical processes governing the rate, magnitude, and extent of permafrost degradation and soil carbon remobilization remain uncertain (Koven et al 2015). The mercurial nature of thaw-induced carbon release presents many challenges to quantifying the extent and variability of this feedback (Miner et al 2021). Therefore, it is necessary to constrain drivers of change in model projections to disentangle and interpret the PCF signal (Lawrence et al 2015, Song et al 2020). Rectifying these response patterns requires the ability to scale, simulate, and project the physical dynamics of moisture and energy at the land–atmosphere interface. However, the mechanics and interactions that currently govern each earth system model vary widely (Randall et al 2007, Li et al 2017). Furthermore, few earth system models simulate and account for the mechanisms and long-term effects of the PCF (e.g. UKESM, CanESM, CESM).

Spatiotemporal limitations and instrument constraints inherent to remote sensing technology present considerable obstacles for extracting information from the subsurface profile and restrict the ability to quantify and/or infer thaw variability with high precision and confidence (Gabarró et al 2023). In the high latitudes, the synthesis of optical remote sensing products remains a challenge due to data availability limitations from mechanical restrictions and natural phenomena, such as frequent cloud cover, short summer periods, and low illumination angles (Esau et al 2023, Gay 2023b). These acquisition challenges, compounded with instrument constraints and limited representation in process-based models, prevent accurate quantification of the PCF (i.e. frost heave-thaw dynamics, subsurface freeze- and break-up events, and nutrient cycling).

Despite these knowledge gaps, modeling and reanalysis products offer realistic solutions, adopting simple sinusoidal annual temperature gap-filling algorithms. Additionally, cross-sensor probability density functions and probabilistic modeling approaches are used to compensate for coarse spatial resolution (Westermann et al 2011). Benchmarking and coupled model intercomparisons (CMIP6) have narrowed broad swaths of uncertainty. Model frameworks continue to improve as a breadth of observational data provides intelligence for calibration and validation (CALVAL) (Collier et al 2018, Beauchamp et al 2020). Concurrently, multimodal applications with AI and observational data quantify, emulate, resolve, and improve upon the physical representation of the climate system with increased flexibility, efficiency, and precision (Koppe et al 2019, Xu et al 2021). Limitations inherent to tractable, modular-limited frameworks and sensitive, computationally intensive AI approaches present opportunities for unifying data harmonization, information sharing, and bias correction among physics-based and data-driven methods (Cuomo et al 2022, Slater et al 2022, Fernandes et al 2023). In response to these sustained and co-emerging limitations, we capture abrupt and persistent changes in subsurface conditions and disentangle control factors driving the PCF with a process-informed, data-driven formulation.

1.2. Deep learning

DL is a collection of machine learning algorithms that integrate artificial neural networks (ANN) in multiple layers to discover and extract high-level features and patterns. Convolutional neural networks (CNN) reduce the computational burden of parameterization during training without sacrificing performance. Alternatively, recurrent neural networks (RNN) are complex nonlinear systems that produce feature maps based on impulse sets at discrete time intervals (You et al 2017, Körner and Rußwurm 2021). The equation below (E1) contextualizes the feedback modulation critical for long short-term memory networks (LSTM) wherein learned representation (i.e. recurrent output) at the current time step combines hidden states and loss iteratively (S6):

$$
\begin{aligned}
i_t &= \sigma\left(W_i x_t + U_i h_{t-1} + b_i\right), \quad
f_t = \sigma\left(W_f x_t + U_f h_{t-1} + b_f\right), \quad
o_t = \sigma\left(W_o x_t + U_o h_{t-1} + b_o\right),\\
c_t &= f_t \odot c_{t-1} + i_t \odot \tanh\left(W_c x_t + U_c h_{t-1} + b_c\right),\\
h_t &= o_t \odot \tanh\left(c_t\right)
\end{aligned}
\tag{E1}
$$

where $x_t$ is the input at time step $t$, $h_{t-1}$ the previous hidden state, $c_t$ the cell state, and $i_t$, $f_t$, $o_t$ the input, forget, and output gates of the standard LSTM cell update.

The ability to handle complex multidimensional earth system data while improving data assimilation methods, capturing nonlinear relationships, quantifying uncertainty with probabilistic measures, and enhancing modeling precision, efficiency, and forecasting power presents important advantages and motivations for implementing AI within current ESM frameworks. Furthermore, encoding previous hidden states is crucial for preserving the internal dynamics and learned representation of memory. This allows the model to reconfigure and assess future cell states based on performance metrics (Peterson 2002). A classic example is soil moisture, a complex and sensitive covariate governed by abiotic factors (i.e. climatology, topography, soil hydraulics) that functions as a key explanatory variable responsible for encoding the ecological memory of the earth system (Ogle et al 2015). Ecological memory refers to the impact of previous events and drivers of change on the dynamics and complexities of ecosystem response patterns (Khalighi et al 2022). These elusive, interconnected signals illustrate the mercurial nature of memory and change and demonstrate why capturing these effects is critical for simulating real-world dynamics.
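For readers unfamiliar with these mechanics, the following minimal Keras sketch (hypothetical tensor shapes; not the GeoCryoAI configuration) shows an LSTM returning its hidden and cell states alongside its output sequence; these states are the learned memory discussed above:

```python
import numpy as np
import tensorflow as tf

# Minimal sketch: an LSTM layer returning its hidden and cell states,
# illustrating how learned representations are carried across time steps.
timesteps, features = 12, 8                       # hypothetical sequence shape
inputs = tf.keras.Input(shape=(timesteps, features))
outputs, h_t, c_t = tf.keras.layers.LSTM(
    units=64, return_sequences=True, return_state=True)(inputs)
model = tf.keras.Model(inputs, [outputs, h_t, c_t])

x = np.random.rand(1, timesteps, features).astype("float32")
seq, hidden, cell = model(x)
print(seq.shape, hidden.shape, cell.shape)        # (1, 12, 64) (1, 64) (1, 64)
```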

2. Methods

2.1. Study regions and physiography

The study region is composed of four regional subdomains delineated with borough and census tracts in Alaska (The North Slope [TNS], Interior Boreal [INT], Seward Peninsula [SEW], Yukon–Kuskokwim Delta [YKD]) and encompasses over 13.1 million unique observations across 790 field sites. Figure 1 illustrates the distribution of the flux tower network and field campaigns across Alaska. These subdomains are characterized by zone-defined climatological variability and exhibit dynamic subsurface conditions and aboveground geomorphology.


Figure 1. Spatial distribution of subdomains and ground truth networks across Alaska. The colorized polygons represent each subdomain, and the white points reflect in situ and flux sites used for teacher forcing.


TNS exhibits continuous permafrost, mesic tundra, acidic/non-acidic loess foothills, hillslope water tracks, colluvial-basin deposits, wet tussock/non-tussock sedges, open/closed erect prostrate and dwarf shrubs, moss and forb communities, and sporadic wet graminoid vegetation at higher altitudes (Walker et al 1989, Raynolds et al 2019). The INT displays extensive land cover heterogeneity, i.e. rolling hills and lowland terrain with coniferous and deciduous forest canopies, continuous and discontinuous frozen soil and permafrost, thermokarst, lowland peat bogs, and wet bottomlands. This region is characterized by continental climate regimes, frequent wildfire activity, high elevation intersections between alpine tundra and arctic lowland tundra, and rapid ecological succession ranging from wet, disturbed regions with shrub and productive graminoid colonization to warm, dry hillsides with spruce replacement. The SEW is characterized by surface/subsurface dynamics ranging from sparse upland tundra flats with low productivity and shallow active layers, to well-drained embankments with average productivity and spruce-dominated stands, to densely forested riparian networks with deep active layers (Raynolds et al 2019). The YKD is defined by lowland and upland terrain with extensive wet tussock sedge tundra, lichen-rich peat cover, low vascular plant speciation, and dry tussock graminoid-dwarf shrub communities (Raynolds et al 2019). Plant functional types and species richness include mesic lichen, sedges, and dwarf shrub vegetation, and isolated instances of erect, prostrate shrubs across volcanic outcrops and drainages (Frost et al 2020). The YKD displays terrain-dependent frozen soil, with spatially variable continuous, discontinuous, and sporadic permafrost peppering the upland tundra, flat rolling hills, and eolian surficial deposits. This region exhibits coastal maritime climate patterns with frequent upland wildfires fueled by lichen-rich peat and tussock litter, high shrub abundance, and low seed recruitment (Frost et al 2020).

2.2. Data synthesis

This research study examines over 13.1 million unique field observations from over 790 sites in Alaska. A considerable proportion of these micrometeorological and biogeochemical data originate from flux towers (AmeriFlux, NEON; appendix C), providing near-continuous measurements (i.e. 30 minute sampling frequency over 3 to 19 year periods). In addition, thaw depth survey measurements were derived via mechanical probing and ground-penetrating radar (GPR) during soil moisture and active layer monitoring campaigns from 1969 to 2022 (e.g. the NASA Arctic-Boreal Vulnerability Experiment Soil Moisture and Active Layer Thickness campaign (ABoVE SMALT), Remotely-Sensed Active Layer Thickness (ReSALT) (Schaefer et al 2015), the Global Terrestrial Network for Permafrost (GTNP), and the Circumpolar Active Layer Monitoring program (CALM) (Hinkel and Nelson 2003)). The base dataset consists of 91 variables collected across Alaska over a 53 year period (table 1); the corresponding data sources are provided in appendix C for reference and reproducibility.

Table 1.  In-situ and eddy covariance data used for teacher forcing and model training.

Data Sources | Subdomains of interest | Spatial coverage | Temporal coverage | Sampling methodology | Sampling frequency | Environmental covariates
CALM, GTNP | TNS, YKD, SEW, INT | 68 sites, Alaska | 1969–present; 2010–present | 1–100 m | Seasonal, 3 hour intervals | ALT, ST, PT
USArray | TNS, YKD, SEW, INT | 74 sites, Alaska | 2016–2019 | 0–1.5 m at depth | Hourly, 6 hour intervals | ST
ABoVE | TNS, YKD, SEW, INT | 35 sites, Alaska | 2010–present | 5–500 m | Annual, seasonal, monthly, daily | ALT, ST, VWC, SR, CH4, CO2, GPP, REco, NEE
USGS, UAF | TNS, YKD, SEW, INT | 151 sites, Alaska | 2000–2020; 2022 | 1–100 m; 0–1.4 m at depth | Seasonal | ALT, ST, SMC, CH4, CO2, PAR, SD, SP, DBH
NEON | TNS, INT | 8 sites, Alaska | 2015–present | 10 cm, 1 m, 1 km | Annual, seasonal, semi-monthly, 30 minute intervals | ALT, CH4, CO2, LST, AT, RH, PRE, SWD, WTD, WT, WS, WD, SHF, LHF, GHF, SWR, LWR, NR
AmeriFlux | TNS, YKD, SEW, INT | 41 sites, Alaska | 1974–present | Eddy covariance observations; flux/energy measurements | Annual, monthly, weekly, daily, hourly, 30 minute intervals | ALT, CH4, CO2, LST, AT, RH, PRE, SWD, WTD, WT, WS, WD, SHF, LHF, GHF, SWR, LWR, NR

Following the workflow in figure 2, the construction of the base dataset and subsequent model development required spatiotemporally constrained preprocessing methods to address latent barriers (i.e. study design), reduce dimensionality, initialize gap-filling and data transformations (i.e. scaling), and preserve information legacies. Dimensionality reduction was used to reduce model complexity, improve generalization, reduce multicollinearity, and enhance computational efficiency. Replicate samples were treated as independent features and were not resampled, to reduce overgeneralization and bias inflation. The preprocessing pipeline used supervised learning, while missing values were imputed by backward-forward gap-filling bilinear interpolation. This gap-filling method served as a padding mask for flux data, reducing sample mismatch and enabling temporal alignment over 53 years. Feature engineering addressed data type conflicts and replicate sampling by removing string objects from the dataframe, followed by stationarity treatments, i.e. logarithmic transformation and trend differencing (S3).
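As a hedged illustration of this style of backward-forward gap-filling (the column name, cadence, and values are hypothetical, not the study's pipeline), time-indexed interpolation followed by backward and forward fills pads a flux record:

```python
import numpy as np
import pandas as pd

# Hypothetical 30 minute flux record with gaps; column name is illustrative.
idx = pd.date_range("2019-06-01", periods=8, freq="30min")
df = pd.DataFrame(
    {"FCH4": [4.1, np.nan, np.nan, 5.0, np.nan, 4.7, np.nan, 4.4]}, index=idx)

# Interpolate interior gaps against the time index, then backward/forward
# fill any leading or trailing NaNs so series align over the full record.
df["FCH4"] = df["FCH4"].interpolate(method="time").bfill().ffill()
print(df)
```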


Figure 2. Data processing and model development workflow.


We examined the properties and distribution of the dataset with diagnostics including KDEs and correlograms under various scaling methods to clarify and justify scaling and feature selection (figure 3, table 2). These methods included dimensionality reduction (i.e. Variance Inflation Factorization [VIF]; Principal Component Analysis [PCA]) and autocorrelation and stationarity diagnostics (i.e. the Wiener–Khinchin theorem; the Augmented Dickey–Fuller test [ADF]; the Kwiatkowski–Phillips–Schmidt–Shin test [KPSS]). VIF functions as a diagnostic tool in multivariate forecasting to quantify multicollinearity severity, providing a scalar measurement that encapsulates how much the variance of an estimated regression coefficient increases when the predictors are correlated (Craney and Surles 2002, Akinwande et al 2015).
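A minimal sketch of iterative VIF trimming with statsmodels is shown below; the vif_reduce helper and toy data are hypothetical, while the threshold mirrors table 2:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

def vif_reduce(df: pd.DataFrame, threshold: float = 4.0) -> pd.DataFrame:
    """Iteratively drop the feature with the largest VIF until every
    remaining feature falls below the threshold (a sketch, not the
    study's implementation)."""
    X = df.copy()
    while True:
        design = sm.add_constant(X)  # intercept so VIFs are well defined
        vifs = pd.Series(
            [variance_inflation_factor(design.values, i + 1)
             for i in range(X.shape[1])],
            index=X.columns)
        if vifs.max() < threshold:
            return X
        X = X.drop(columns=[vifs.idxmax()])

# Toy demonstration with two nearly collinear columns.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
demo = pd.DataFrame({"a": a,
                     "b": 0.98 * a + rng.normal(scale=0.05, size=200),
                     "c": rng.normal(size=200)})
print(vif_reduce(demo).columns.tolist())  # one collinear column is dropped
```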


Figure 3. Correlation matrix of the VIF-trimmed 56-feature dataset after multicollinearity reduction (i.e. 35 features removed), retaining low-correlation feature dependencies with high confidence. The original 91-feature base dataset correlogram is referenced in S1.


Table 2. Comprehensive iterative ranking of multicollinearity reduction (VIF), displaying the most significant features contributing to variance inflation (VIF threshold < 4). These features were dropped, with the remaining features listed in S2.

FID | Variable | VIF | Rank | FID | Variable | VIF | Rank
88 | GPP | 4940.180878 | 91 | 48 | TS_2_2_5 | 23.131944 | 73
58 | TS_2_3_4 | 3415.836242 | 90 | 57 | SWC_1_1_4 | 19.548991 | 72
59 | TS_2_3_6 | 2397.482291 | 89 | 43 | TS_STD | 17.750684 | 71
48 | TS_1_1_3 | 1443.534747 | 88 | 13 | CH4_STD | 14.881819 | 70
57 | TS_2_3_5 | 788.597336 | 87 | 25 | TA_1_4_1 | 10.343686 | 69
53 | TS_2_2_6 | 465.929129 | 86 | 10 | Soil_[CO2]_25 cm | 7.504084 | 68
53 | TS_2_3_3 | 391.985745 | 85 | 57 | SWC_1_2_6 | 6.423598 | 67
50 | TS_2_2_3 | 101.472831 | 84 | 14 | CH4_1_1_2 | 6.318185 | 66
80 | NEE | 101.174939 | 83 | 3 | WD_STD | 6.235911 | 65
52 | TS_2_3_1 | 90.754190 | 82 | 29 | LW_OUT | 5.591408 | 64
16 | CH4_1_2_1 | 61.320749 | 81 | 51 | SWC_1_2_2 | 5.294349 | 63
49 | TS_2_2_4 | 60.214292 | 80 | 1 | WS_STD | 5.227368 | 62
77 | RECO | 35.992944 | 79 | 11 | CH4_1_4_1 | 5.196021 | 61
5 | CO2_STD | 27.751515 | 78 | 57 | LAI | 5.094008 | 60
51 | H2O_MEAN | 25.237227 | 77 | 17 | TA_1_2_1 | 4.623027 | 59
64 | SWC_1_2_4 | 24.304611 | 76 | 44 | SWC_1_1_3 | 4.279779 | 58
6 | CO2_1_2_1 | 10.232259 | 75 | 2 | CO2_MEAN | 4.108665 | 57
42 | TS_2_2_1 | 9.267590 | 74 | 17 | TA_1_5_1 | 3.825159 | 56

PCA compresses data in linear systems; however, for nonlinear relationships, multivariate dimensionality reduction with autoencoding networks was used to explore features' spatiotemporal response patterns via neural-based decomposition. ADF and KPSS methods address deterministic trends and lagging complexities by examining unit root stationarity and non-stationarity in time series data (Lopez 1997). The most significant principal components exhibiting maximum variance were extracted from the 56-feature VIF-reduced multivariate dataset in accordance with dimensionality reduction methods at an 85% cumulative explained variance (CEV) threshold (86.12% [VIF: 61.96%; PCA: 52.17%]). Moreover, pairwise correlation matrices and serial autocorrelations were computed with ranking criteria and Fourier transformations of the power spectra. Based on these results, three PCF features were targeted within the base dataset: ALT, methane flux, and carbon dioxide flux (ALT; CO2_1_1_1; FCH4_1; table 3). These measurements were reframed with lag −3/lead +2 supervision (i.e. three lagged input steps and two lead target steps to encourage learning) prior to model development.
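The lag/lead reframing can be sketched with pandas as follows; the to_supervised helper and toy ALT values are illustrative assumptions, not the study's code:

```python
import pandas as pd

def to_supervised(series: pd.Series, n_lags: int = 3,
                  n_leads: int = 2) -> pd.DataFrame:
    """Reframe a univariate series into a supervised table with n_lags
    past steps as inputs and n_leads future steps as targets, mirroring
    the lag -3/lead +2 supervision described above."""
    cols = {f"t-{i}": series.shift(i) for i in range(n_lags, 0, -1)}
    cols["t"] = series
    cols.update({f"t+{i}": series.shift(-i) for i in range(1, n_leads + 1)})
    return pd.DataFrame(cols).dropna()

alt = pd.Series([41.0, 42.5, 44.1, 45.0, 47.2, 48.9, 50.3])  # toy ALT (cm)
print(to_supervised(alt))
```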

Table 3. Stationarity results from ADF and KPSS testing on raw and gap-filled (GF) pre- and post-trend differenced datasets.

Feature | Series | ADF (raw, GF) | p(ADF) | ADF lags (raw, GF) | KPSS (raw, GF) | p(KPSS) | KPSS lags (raw, GF) | n
ALT | pre-differencing | −21.774, −11.170 | 1.448 × 10⁻²¹ | 144, 144 | 0.510, 18.449 | 0.010 | 163, 792 | 98 753
ALT | post-differencing | −52.148, −159.212 | | 67, 143 | 0.024, 0.018 | 0.100 | 862, 8594 |
CO2 | pre-differencing | −8.851, −11.700 | 1.052 × 10⁻²³ | 143, 144 | 8.193, 10.435 | 0.010 | 410, 793 | 547 101
CO2 | post-differencing | −76.897, −146.583 | | 103, 144 | 0.087, 0.003 | 0.100 | 715, 2810 |
CH4 | pre-differencing | −11.667, −4.503 | 2.625 × 10⁻⁵ | 144, 144 | 0.913, 2.704 | 0.010 | 259, 794 | 212 642
CH4 | post-differencing | −47.966, −136.205 | | 79, 144 | 0.095, 0.014 | 0.100 | 272, 3821 |

In each value pair, the first entry corresponds to the raw series and the second to the gap-filled (GF) series.
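As a hedged sketch of this ADF/KPSS testing with statsmodels (a toy random walk stands in for the study's series):

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller, kpss

rng = np.random.default_rng(0)
x = np.cumsum(rng.normal(size=500))   # toy non-stationary random walk
dx = np.diff(x)                       # trend-differenced series

for name, series in [("raw", x), ("differenced", dx)]:
    # ADF: null hypothesis of a unit root (non-stationarity).
    adf_stat, adf_p, adf_lags, *_ = adfuller(series)
    # KPSS: null hypothesis of stationarity around a deterministic trend.
    kpss_stat, kpss_p, kpss_lags, _ = kpss(series, regression="ct",
                                           nlags="auto")
    print(f"{name}: ADF={adf_stat:.3f} (p={adf_p:.3f}), "
          f"KPSS={kpss_stat:.3f} (p={kpss_p:.3f})")
```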

2.3. Model development

GeoCryoAI is a state-of-the-art DL architecture composed of bidirectional memory-based recurrent networks (Schuster and Paliwal 1997, Graves and Schmidhuber 2005, Shi et al 2015, Gay et al 2022, Gay 2023b) capable of uncovering hidden response patterns and disturbance signals from in-situ measurements, remote sensing observations, and modeling output (figure 4). This framework presents a novel method to unify spatiotemporal multimodal data. The application quantifies and resolves real-world dynamics and disparities among abrupt and persistent drivers of change. For this study, the in-situ module and corresponding layer architecture are utilized for teacher forcing (S4 and S5). The process-driven component of the network accounts for real-world physical dynamics. The model implements test data and evaluation criteria to simulate ecological memory, highlighting the direct, indirect, concurrent, lagged, internal, and external effects of thaw-induced carbon release on the surrounding environment. The data-driven component is characterized by a CNN-filtered, LSTM-encoded RNN that uses teacher forcing from in-situ and flux observations into a network of processing layers with activation and regularization functions (Williams and Zipser 1989). The outputs are trained within an autoencoder framework that encodes and imputes optimized decoding protocol for generative adversarial training and synthetic time series reconstruction (Yoon et al 2019, Klopries and Schwung 2023). This generative process utilizes encoder-decoder compression to denoise, transform, and learn the latent feature space representation of inputs while using the identity function to reconstruct the inputs from the compressed latent-space vector (Kim et al 2021, Bourland et al 2022, Chen and Guo 2023).
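The following Keras sketch conveys the flavor of a convolutionally filtered, LSTM-encoded sequence autoencoder; all dimensions and layer choices are illustrative assumptions rather than the published configuration (S4 and S5):

```python
import tensorflow as tf

# Minimal sketch (not the published configuration) of a Conv1D-filtered,
# LSTM-encoded sequence autoencoder in the spirit of the module above.
timesteps, features, latent = 30, 56, 32          # illustrative dimensions

inputs = tf.keras.Input(shape=(timesteps, features))
x = tf.keras.layers.Conv1D(64, kernel_size=3, padding="same",
                           activation="relu")(inputs)
x = tf.keras.layers.LSTM(latent)(x)               # encoder: latent state
x = tf.keras.layers.RepeatVector(timesteps)(x)    # bridge to decoder
x = tf.keras.layers.LSTM(latent, return_sequences=True)(x)
outputs = tf.keras.layers.TimeDistributed(
    tf.keras.layers.Dense(features))(x)           # reconstruct the inputs

autoencoder = tf.keras.Model(inputs, outputs)
autoencoder.compile(optimizer="adam", loss="mse")
autoencoder.summary()
```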


Figure 4. Inner mechanics of the GeoCryoAI framework.


The structure of the model configuration is optimized with multimodal inputs and a predefined hyperparameter dictionary generated by a sequential model-based optimization method (i.e. a Bayesian optimization search algorithm). These nonlinear programs culminate in a series of logical operations that scan and determine the best-performing model configuration based on regularizing model complexity and optimizing learning efficiency (i.e. early stopping and hyperparameter tuning in latent space). During training, validation loss is used as the objective function. The tuner constructs an a priori distribution model and iterates through a set of hyperparameters ($H_p$) to compute the lowest objective score ($f(x)$) across a distribution of validation metrics (E2). The $P(f(x)\,|\,H_p)$ model is updated every trial with these results, and new criteria are established from a selection function (Wu et al 2019):

$$
x^{*} = \underset{x \in H_p}{\arg\min}\ f(x)
\tag{E2}
$$
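A minimal Keras Tuner sketch of this kind of Bayesian search follows; the search space, tensor shapes, and names are hypothetical, not the study's predefined hyperparameter dictionary:

```python
import keras_tuner as kt
import tensorflow as tf

def build_model(hp):
    # Hypothetical search space; the published dictionary differs.
    model = tf.keras.Sequential([
        tf.keras.layers.LSTM(hp.Int("units", 32, 256, step=32),
                             input_shape=(30, 56)),
        tf.keras.layers.Dropout(hp.Float("dropout", 0.0, 0.5, step=0.1)),
        tf.keras.layers.Dense(1),
    ])
    model.compile(
        optimizer=tf.keras.optimizers.Adam(
            hp.Choice("learning_rate", [1e-2, 1e-3, 1e-4])),
        loss="mse")  # validation loss serves as the tuning objective
    return model

tuner = kt.BayesianOptimization(
    build_model, objective="val_loss", max_trials=20,
    directory="tuning", project_name="geocryoai_sketch")
# tuner.search(x_train, y_train, validation_data=(x_val, y_val),
#              epochs=50,
#              callbacks=[tf.keras.callbacks.EarlyStopping(patience=5)])
```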

The GeoCryoAI model emulated hidden dynamics, linking approximately 2.511 million parameters with 1.177 billion interpolated measurements to computationally reflect the state space of the earth system from in-situ measurements. The model design required an efficient, flexible approach to emulate nonlinearities with ground-truth observations, with module reconstructions including consolidated tabular time-series layer processing (i.e. LSTM) and sequential time-distributed convolving layers (i.e. Conv1DLSTM). During construction, expanded layer selection was guided by the loss of spatial information observed with LSTM-only applications and by the persistent degradation of input quality in both data structure and processing resilience. Model performance was optimized via hyperparameter tuning, with the loss function reconfiguring weight matrices while facilitating active learning and convergence, generating output vectors with every batch over each epoch. Furthermore, error metrics were the objective of the NN, with each iteration seeking to reduce the effects of the loss function on weight matrices and vector components. In response to Bayesian optimization hyperparameter tuning and the validation loss captured after training, the training and validation subsets were shifted temporally to provide more data for the model to learn (training–validation splits of ALT: 1969–2018, 2019–2020; CH4: 2011–2018, 2019–2020; CO2: 2003–2018, 2019–2020).

3. Results

3.1. Synthesis results

The cleaned 56-feature dataframe yields dynamic relationships; more specifically, thaw depth and subsurface soil respiration demonstrate moderate anticorrelation (Soil_[CO2]_15cm: r = −0.311942), while thaw depth and the CH4 and CO2 fluxes illustrate positive correlations (FCH4_1: r = 0.334563; FCH4_2: r = 0.309447; CO2_1_1_1: r = 0.243649). These results corroborate the rate-limiting biogeochemical mechanisms that fortify the permafrost carbon feedback: at depth, increased thaw yields higher water availability, suppressing aerobic respiration and instead fostering warmer anoxic environments suitable for methanogenesis. At the surface, increased thaw and water availability promote aerobic respiration and evapotranspiration in the soil and surrounding vegetation. Further anticorrelated interactions were noted, i.e. thaw depth relative to GHF (G_1_1_4: r = −0.267481) and WT (TW_1_1_1: r = −0.219022). In addition to flux variables, a positive correlation was identified between thaw depth and WTD (WTD_2_1_1: r = 0.237545). The complexities of the earth system are made apparent in the mixed relationships observed as well, with AT exhibiting both negative and positive correlations relative to thaw depth (TA_1_2_1: r = −0.205626; TA_1_3_1: r = −0.180273; TA_1_5_1: r = 0.263813). We examined the spatiotemporal distribution (i.e. statewide versus regional, 1969–2022) of permafrost degradation across the four subdomains. TNS exhibits the lowest ALT mean and variance in permafrost degradation (48.460 ± 14.976 cm), INT displays the highest ALT variance (±31.164 cm), and the YKD demonstrates the highest ALT mean (62.415 cm). The temporal period among these features varies substantially (i.e. ALT, 1969–2022; CO2, 2003–2021; CH4, 2011–2022). Thaw depth measurements were constrained in alignment with the temporal resolution of the carbon fluxes (CH4, CO2) to identify potential nuances governing the dynamics of the PCF. The temporally constrained plots in figure 5 illustrate contrasting annual thaw depth, carbon dioxide, and methane flux variability across Alaska.


Figure 5. Annual time series of ALT, CO2, and CH4 measurements relative to temporal coverage (a). The underlying plots are constrained to the temporal period of carbon dioxide and methane flux variability across Alaska, 2006–2019 (b).


Abrupt thaw signals in 2014 and 2017 are accompanied by persistent increases in CO2 and CH4 (2014) and by increased CH4 (2017). An interesting anticorrelated relationship is also observed: carbon dioxide and thaw signal relationships are less obvious, which may be attributed to increased sequestration of CO2 resulting from tundra greening and increased primary productivity. Another explanation may be rooted in feedback attribution: carbon dioxide regulates thaw rate, but due to increased warming, methane and thaw experienced runaway effects (i.e. 2014–2017). Error metrics (RMSEs) for ALT, CO2, and CH4 were computed from feature variances originating from calibration and measurement error prior to model training (i.e. aleatoric stochasticity): 4.348 cm, 20.117 µmol CO2 m⁻² s⁻¹, and 79.185 nmol CH4 m⁻² s⁻¹, respectively.

3.2. Model results

Initial baseline metrics were explored prior to model construction. Time-delayed (±1) naive persistence forecasting yielded RMSEs of 1.997 cm, 1.906 µmol CO2 m⁻² s⁻¹, and 0.884 nmol CH4 m⁻² s⁻¹. Thereafter, a simple RNN was explored to determine its suitability for emulating the complex interactions characterizing the PCF. This sequential linear regression network was evaluated for comparison, yielding RMSEs of 1.024 cm, 1.114 µmol CO2 m⁻² s⁻¹, and 0.295 nmol CH4 m⁻² s⁻¹. These results indicate substantial error reduction by an order of magnitude (RMSE: 1.389) and suggest implementing a more comprehensive and efficient NN that accurately reflects the internal dynamics and ecological memory of the Arctic system. Model simulations were informed in a data-driven sequence of site-level teacher forcing via in-situ and flux observations of ALT (ALT RMSE: 1.327 cm [1969–2022]; CO2 RMSE: 0.697 µmol CO2 m⁻² s⁻¹ [2003–2021]; CH4 RMSE: 0.715 nmol CH4 m⁻² s⁻¹ [2011–2022]). Evaluation metrics were computed on the training configurations, followed by the application of inverse transformations to the test sets prior to feature predictions (figure 6).
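The persistence baseline can be sketched as follows (the helper and toy ALT values are illustrative assumptions):

```python
import numpy as np

def persistence_rmse(series: np.ndarray, lag: int = 1) -> float:
    """Naive persistence baseline: predict each value with its predecessor,
    a sketch of the time-delayed baseline reported above."""
    pred, truth = series[:-lag], series[lag:]
    return float(np.sqrt(np.mean((truth - pred) ** 2)))

alt = np.array([41.0, 42.5, 44.1, 45.0, 47.2, 48.9])   # toy ALT record (cm)
print(f"persistence RMSE: {persistence_rmse(alt):.3f} cm")
```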


Figure 6. Loss functions (a) and predictions (b) derived from GeoCryoAI simulations of in situ thaw depth and carbon release.


The evaluation metrics demonstrate the performance and predictive capacity of the model (figure 6(a)). Although the predictions indicate accurate impulses over time (figure 6(b)), the magnitudes of ALT predictions are underestimated while CH4 and CO2 flux predictions are overestimated. The likely contributing factor for these disparities is data imbalance. Although randomization of data splitting (i.e. training, validation, testing sets) is best practice, data partitioning prioritized sample representation after cleaning while retaining the temporal structure. Moreover, the spatiotemporal sparsity of data samples likely contributed to these disparities, which was accounted for and mitigated with gap-filling and time lag-lead methods. For example, ALT training data consisted of 72.63% of the 98 753 ALT samples (1969–2022), CO2 training data consisted of 74.05% of the 547 101 CO2 samples (2003–2021), and CH4 training data consisted of 71.23% of the 212 642 CH4 samples (2011–2022). Another possible factor contributing to underestimation is the introduction of excessive regularization (i.e. dropout) in the modeling framework, leading to an oversimplification of the model and feature variance. Conversely, overestimated feature predictions may originate from model overcomplexity capturing spurious dynamics. To counteract these potential discrepancies and the misinterpretation of derived results, more sensitive evaluation metrics could be adopted, and the Bayesian hyperparameter optimization process could be refined to include more operands (i.e. loss functions, activation functions) and continuous arrays (i.e. batch size, epochs) to isolate the relevant causes of these magnitude disagreements.

4. Discussion

To our knowledge, this study is the first to use a process-constrained DL system to understand key regulators of the PCF. The model presented minor prediction errors and exposure biases, and the teacher forcing approach simplified the loss landscape in exchange for computational efficiency. Over time, however, model performance degraded, thereby warranting spatial expansion and more discriminant temporal delegation. In addition, the vanishing gradient (i.e. learning terminates, performance decreases) presented multiple challenges throughout the training process, including the risk of overfitting due to model complexity (i.e. dampened with dropout generalization) and the inability to label sparse and coarse data. To ameliorate these challenges, regularization and optimization strategies were integrated into the pipeline to isolate the ideal learning rate and optimizer for thaw-induced carbon signal retention. Optimization strategies included modifying layer structure and normalization constituency (i.e. layer, instance, or group), weight standardization, and synchronous batch normalization with in-situ data and hybridized ensembles (Ioffe and Szegedy 2015).

DL models perform well in time series prediction; however, quantifying their internal mechanisms presents many challenges limiting the utility of their application. In addition, the absence of interpretability can hinder knowledge transfer among disciplines and necessitates more explainable methodologies. The current foundational understanding of these complex systems indicates that the multilayered internal mechanisms facilitate active learning from error via backward and forward propagation, nonlinear functions (e.g. ReLU, tanh, sigmoid), and weight-optimized regularization between bias-corrected nodes (e.g. dropout, batch normalization; Vakalopoulou et al 2023). This knowledge is important when gauging the feasibility of integrating AI into ESM frameworks; more specifically, identifying feature importance for explainability (e.g. LIME, SHAP), designing attention mechanisms for interpretability, developing hybridized models with physical constraints, and employing ensemble-based Bayesian methods with confidence intervals for uncertainty quantification and enhanced reliability.

Uncertainty and prediction error can overstate performance gains that result from overfitting during training. This behavior was expected and is a function of the initial model's inability to capture sub-seasonal trends in freeze-thaw dynamics (e.g. peak thaw, abrupt thaw events) from the available data instances with standardization and scaling methods. Additional uncertainty may originate from landscape-level dynamics and sustained regional lagged effects in response to increased warming trends (i.e. LHF). Moreover, field observations and flux tower data may introduce error through instrument failure, perturbation, signal-to-noise calibration issues, and sampling redundancy (i.e. aleatoric uncertainty). Other potential sources of uncertainty may be reconciled with additional pre-processing methods (i.e. temporal downscaling), data collection, and data assimilation techniques.

Ongoing work seeks to isolate causal links and feedback drivers of the PCF with multivariate mutual information and transfer entropy. In addition, we continue to modulate the spatiotemporal data lens to quantify, scale, and forecast PCF feedback mechanisms across Alaska by assimilating multimodal remotely sensed hyperspatiospectral inputs and model outputs into the GeoCryoAI framework (e.g. AVIRIS-NG, UAVSAR, SIBBORK-TTE, TCFM-Arctic). Airborne hyperspectral imaging and synthetic aperture radar systems provide a means to quantify, scale, and interpret features at higher resolution and precision over broader domains (e.g. carbon flux, thaw subsidence). In addition, these efforts seek to extrapolate PCF signals across the Circumarctic and generate space-time zero-curtain maps in parallel with future spaceborne missions (e.g. NISAR, MicroCarb, CO2Image, MERLIN, TRISHNA). This methodology will facilitate future exploration into knowledge discovery, reconcile algorithm and instrument limitations, resolve empirical assumptions and observational bias, and inform mitigation efforts and risk management across the Arctic region.

5. Conclusion

The GeoCryoAI model successfully demonstrates the application of machine learning to quantify and understand the dynamic complexities of the PCF. The model learned the subtle complexities among covariates while demonstrating an aptitude for emulating permafrost degradation and carbon flux dynamics with high precision and minimal loss (i.e. ALT RMSE: 1.327 cm [1969–2022]; CO2 RMSE: 0.697 µmol CO2 m⁻² s⁻¹ [2003–2021]; CH4 RMSE: 0.715 nmol CH4 m⁻² s⁻¹ [2011–2022]). Continued model development may improve sensitivity, disentangle the spatial processes and contributors of environmental change, bridge gap-filling multimodality, and reconcile disparate estimations and below-ground uncertainty across the Arctic system. This work demonstrates the importance of monitoring the variability of ALT as a sensitive harbinger of change. The active layer provides a unique signal characterizing permafrost degradation, soil carbon flux, and other biogeochemical drivers of land cover change and Earth system feedbacks in the Arctic. Understanding, interpreting, and identifying the most vulnerable regions undergoing accelerated permafrost degradation and subsidence is important to inform, engage, and promote high-impact cross-disciplinary research across the northern latitudes while providing evidence and support for mitigation strategies, infrastructure security, and informed policymaking.

Acknowledgments

Acknowledgment and gratitude are owed to the following institutions and campaigns for their research support, data accessibility, and funding opportunities: NASA, USGS, ABoVE, CALM, UVA, UAF, GWU, NAU, UC, WCRC, ARCUS, and the USPA. BG envisioned the original research question, managed the analyses, and organized the drafting and revision process. BG's participation was supported by the Department of Geography and Geoinformation Science at George Mason University and by an appointment to the NASA Postdoctoral Program at the Jet Propulsion Laboratory, California Institute of Technology, administered by Oak Ridge Associated Universities under contract with NASA. N P provided remote sensing expertise, field knowledge, editing, and review support. N P's participation was supported by the U.S. Geological Survey's National Land Imaging Program and National and Alaska Climate Adaptation Science Centers. A Z conducted preliminary reviews, engaged in discussion pertaining to computational requirements, and evaluated review responses and code versioning. A Z's contribution was supported by the Department of Geography and Geoinformation Science at George Mason University and the Department of Computer Science at Emory University. A A provided novel process-based modeling insight and editorial review throughout the writing process. A A is supported by the University of Maryland Earth System Science Interdisciplinary Center (ESSIC). K M supported editing and revisions and is supported by the Jet Propulsion Laboratory, California Institute of Technology. J Q managed the scope and delivery of this study and is supported by the Environmental Science and Technology Center and the Department of Geography and Geoinformation Science at George Mason University. Much appreciation is extended to the three reviewers whose candor, insight, and expertise helped transform and improve this manuscript. Furthermore, the authors would like to thank the following institutions and campaigns for their research support, data accessibility, and funding opportunities: funding for AmeriFlux core site data was provided by the U.S. Department of Energy's Office of Science. Imagery and data attribution: Data SIO, NOAA, US Navy, NGA, GEBCO, Landsat/Copernicus, IBCAO, LDEO-Columbia, NSF, Google. Any use of trade, firm, or product names is for descriptive purposes only and does not imply endorsement by the U.S. Government.

Data availability statements

The source code and pre-processed data supporting this research are available upon request and distributed in a GitHub repository (Gay 2023a). All data that support the findings of this study are included within the article (and any supplementary files). In addition to a comprehensive list of permafrost and carbon flux data sources, acronym and terminology glossaries are provided in the appendices followed by the supplementary materials.

Conflict of interest

The authors declare that the research was conducted without any commercial or financial relationships that could be construed as a potential conflict of interest.

Appendix A: Abbreviations, symbols, and terminology

Arctic-Boreal Vulnerability Experiment.........ABoVE

Augmented Dickey-Fuller Test.........ADF

Artificial Intelligence.........AI

Active Layer Thickness.........ALT

Artificial Neural Network.........ANN

Air Temperature.........AT

Airborne Visible Infrared Imaging Spectrometer Next Generation.........AVIRIS-NG

Circumpolar Active Layer Monitoring Program.........CALM

Calibration/Validation.........CALVAL

Methane.........CH4

Coupled Model Intercomparison Project.........CMIP6

Convolutional Neural Network.........CNN

Carbon Dioxide.........CO2

Carbon Dioxide Imaging Spectrometer.........CO2Image

Diameter at Breast Height.........DBH

Deep Learning.........DL

Earth System Modeling.........ESM

Geocryological Artificial Intelligence Model.........GeoCryoAI

Ground Heat Flux.........GHF

George Mason University.........GMU

Gross Primary Productivity.........GPP

Ground Penetrating Radar.........GPR

Global Terrestrial Network for Permafrost.........GTNP

High Resolution.........HR

Interior Boreal.........INT

Indian Space Research Organization.........ISRO

Kernel Density Estimation.........KDE

Kwiatkowski–Phillips–Schmidt–Shin.........KPSS

Latent Heat Flux.........LHF

Local Interpretable Model-agnostic Explanations.........LIME

Land Surface Temperature.........LST

Long Short-Term Memory.........LSTM

Longwave Radiation.........LWR

Methane Remote Sensing Lidar Mission.........MERLIN

Microsatellite Carbon Dioxide Mission.........MicroCarb

National Aeronautics and Space Administration.........NASA

Net Ecosystem Exchange.........NEE

National Ecological Observatory Network.........NEON

Net Radiation.........NR

NASA-ISRO Synthetic Aperture Radar.........NISAR

Neural Network.........NN

Precipitation.........PRE

Photosynthetically Active Radiation.........PAR

Principal Component Analysis.........PCA

Permafrost Carbon Feedback.........PCF

Permafrost Temperature.........PT

Ecosystem Respiration.........Reco

Recurrent Neural Network.........RNN

Relative Humidity.........RH

Stem Density.........SD

Seward Peninsula.........SEW

Sensible Heat Flux.........SHF

SHapley Additive exPlanations.........SHAP

Spatially Explicit Gap Model for Boreal Forest in Three-Dimensional Terrain.........SIBBORK

Soil Moisture and Active Layer Thickness Campaign.........SMALT

Soil Moisture Content.........SMC

Snow Depth.........SWD

Soil Profile.........SP

Soil Respiration.........SR

Soil Temperature.........ST

Shortwave Radiation.........SWR

Terrestrial Carbon Flux Model for the Arctic.........TCFM-Arctic

The North Slope.........TNS

Thermal Infrared Imaging.........TRI

TRI Satellite for High Resolution Natural Resource Assessment.........TRISHNA

Taiga–Tundra Ecological Transition Zone.........TTE

University of Alaska–Fairbanks.........UAF

Uninhabited Aerial Vehicle Synthetic Aperture Radar.........UAVSAR

United States Geological Survey.........USGS

Variance Inflation Factor.........VIF

Volumetric Water Content.........VWC

Wind Direction.........WD

Wind Speed.........WS

Water Temperature.........WT

Water Table Depth.........WTD

Yukon–Kuskokwim Delta.........YKD

Appendix B: Glossary of terminology

Accuracy: The accuracy of a model is in reference to the percentage of correct predictions relative to observations.

Activation function: The activation function extracts the weighted sum of inputs from a previous layer and generates output value(s) to catalyze or 'activate' the next layer in the architecture.

Active learning: Active learning is a specialized form of semi-supervised machine learning wherein a learning agent interacts with an oracle (i.e. user) to obtain labels for new data.

Algorithm: An algorithm is a specific method and/or process that outlines how to solve a research problem and generate machine learning frameworks.

Artificial neural network: An ANN is a structure composed of successive layers of interconnected units known as artificial neurons. It incorporates non-linear activation functions and resembles the neurons in an animal brain.

Augmented Dickey–Fuller test: A common statistical test often utilized to determine the stationarity of a particular time series.

Autoencoder: An autoencoder is a type of ANN used for unsupervised and non-linear dimensionality reduction. It aims to generate efficient representations of data.

Backpropagation through time: Backpropagation through time is a technique used to train ANNs for computing gradients required for weight calculation in the network.

Batch: A batch refers to a group of examples used for one gradient update during model training.

Bayes' theorem: Bayes' theorem is explored by a wide breadth of the machine learning community, namely because it is used to determine the probability of an event in consequence of a priori information related to environmental and/or boundary conditions.

Bayesian network: A Bayesian network is a mechanism used to represent probabilistic knowledge. Inference algorithms in belief networks leverage the network structure to efficiently generate inferences, as opposed to joint probability distributions over all variables.

Bias metric: Bias metric refers to the average difference between predictions and the correct values for observations. A low bias metric can indicate accurate predictions or an equal proportion of predictions above and below the actual values, resulting in a low average difference. High bias (with low variance) suggests underfitting and an inappropriate model architecture for the task.

Bias term: A bias term enables models to represent patterns that do not pass through the origin. It considers whether the output would still be zero if all input features were zero. Bias terms are typically associated with weights and attached to neurons or filters.

Boosting: Boosting is a machine learning 'meta-algorithm' used to reduce bias and variance in supervised learning and transforms 'weak' learners into stronger, more robust models.

Classification: Classification approximates a mapping function from inputs and/or features to discrete outputs while determining the classes characterizing specified instances.

Confidence interval: A confidence interval is a probabilistic interval estimate (i.e. confidence range) containing an unknown true value corresponding to a particular parameter and/or variable.

Convergence: Convergence is the state reached during model training when the loss function changes very little between iterations.

Convolutional neural network: A CNN is often implemented into imaging workflows and refers to deep feed-forward ANNs.

Cross-validation: Cross-validation refers to a set of processes designed to evaluate the generalization performance of a predictive model on new data sets.

Deep learning: DL is a machine learning field specializing in the integration of supervised, unsupervised, and semi-supervised learning approaches that prioritize data representation and pattern recognition as opposed to linear, task-regimented pipelines.

Dimension: Number of features in a dataset.

Ensemble: Ensembles are both machine learning and statistical techniques that employ multiple learning algorithms to improve predictive performance beyond what each algorithm can achieve individually.

Epoch: In the training of DL models, an epoch refers to one pass through the entire training dataset.

Feature: A feature represents an attribute in a dataset and refers to a relevant attribute-value combination used as input for a machine learning model.

Feature learning: Feature learning is an amalgamation of strategic approaches to automating knowledge discovery and latent representations critical for feature detection or classification.

Feed-forward networks: Feed-forward networks are ANNs in which connections between neurons do not form cycles and propagate in one direction.

Hyperparameter: A hyperparameter refers to higher-level properties of a model, such as learning rate or complexity. Examples include the depth of trees in a decision tree or the number of hidden layers in a neural network. Hyperparameters are manually adjusted during the model training process to optimize performance.

Kwiatkowski–Phillips–Schmidt–Shin Test: Statistical hypothesis test that identifies whether time series observations display trend stationarity (i.e. the null hypothesis; stationarity around a deterministic trend) relative to the observations' unit root behavior (i.e. the alternative hypothesis). This test is often coupled with the Augmented Dickey–Fuller test prior to model construction.

Layer: A layer refers to a set of neurons in an ANN that processes a group of input features. A hidden layer is an invisible layer with input-output neural connectivity.

Learning rate: Learning rate represents a scalar value integrated by the gradient descent algorithm multiplied by the gradient to update corresponding weight outputs during each training iteration, i.e. epoch.

Long short-term memory network: The LSTM network is a specialized form of artificial recurrent neural network designed to address vanishing and exploding gradients. LSTMs are instantiated in multiple forms and are regulated by gate flow and hidden cell memory over arbitrary or well-defined intervals (Hochreiter and Schmidhuber 1997, Li and Wu 2014, Sak et al 2014, Xingjian et al 2015).

Loss: Loss refers to the discrepancy between the true value (from the dataset) and the predicted value (from the machine learning model). A lower loss indicates a better model, unless overfitting has occurred. The loss is calculated for training and validation sets and reflects the model's performance.

Machine learning: ML is an AI discipline that employs statistical techniques to enable computers to learn and improve performance on specific tasks with data, without explicit programming. It involves algorithms that can learn from experience, improve performance on given tasks, and measure performance based on specific metrics.

MinMax: MinMax is an algorithm used for game playing in situations with perfect information. It is associated with techniques like alpha-beta pruning.

Model: A model is an abstract representation of what a machine learning system has learned from training data. It is a data structure that stores the learned representation of a dataset, including weights and biases.

Multimodal learning: Multi-modal learning is a machine learning field that aims to interpret and combine multimodal signals to process and relate information from various types of data.

Naive Bayes: Naive Bayes falls under the probabilistic classification umbrella based on Bayes' theorem guided by the assumed hypothesis that features exhibit strong independence.

Neural networks: Neural networks are mathematical algorithms inspired by the brain's architecture. They are designed to recognize patterns and relationships in data.

Neuron: A neuron is a unit characterizing ANNs that intuitively ingest a number of parameters and input values to output single values for subsequent processing.

Noise: Noise refers to irrelevant information or randomness present in a dataset that obscures the underlying pattern.

Normalization: Normalization is the process of restricting the weight values in regression models to prevent overfitting and improve computation speed.

Optimization: Optimization involves selecting the best element from a set of available alternatives based on a specific criterion.

Overfitting: Overfitting occurs when a model becomes overly tuned to the training data, incorporating specific details and noise that are unique to the dataset. Overfitting is identified when the model performs exceptionally well on the training/validation set but poorly on the test set or new real-world data. It indicates that the model has unintentionally learned patterns in the noise, falsely assuming they represent the underlying structure. Overfitting refers to a model that closely matches a particular dataset but fails to generalize effectively to unseen observations.

Parameters: Parameters are characteristics of training data that a machine learning model or classifier learns during the training process. They are adjusted using optimization algorithms and are specific to each experiment (e.g. weights, coefficients).

Precision: Precision is a metric that measures the performance of the model in classifying positive observations. Precision evaluates how often the model's positive predictions are correct.

Prediction: Prediction refers to the predicted output of a trained model when provided with an input instance.

Preprocessing: Preprocessing is the process of transforming raw data into a more comprehensible format, making it suitable for analysis or modeling.

Principal component analysis: PCA is a dimensionality reduction strategy that utilizes orthogonal transformation to convert correlated features into simplified, linearized outputs with correlation impacts removed (i.e. principal components).

Prior: Prior refers to the probability distribution that represents existing beliefs about a specific quantity before considering new evidence.

Rectified linear unit: Rectified Linear Unit is an activation function used in neural networks.

Recurrent neural networks: Recurrent Neural Networks are memory-based class of ANNs that simulate neural connectivity and internal memory, forming directed and undirected graphs along a sequence of data, thus enabling and fostering the ability to identify and forecast sequential patterns in space and time.

Regularization: Regularization is a technique employed to address the problem of overfitting. It involves adding a complexity term to the loss function, penalizing more complex models. Regularization introduces additional information to prevent overfitting.

Semi-supervised learning: Semi-supervised learning is a category of supervised learning techniques that utilize both labeled and unlabeled data for training. It typically involves using a small number of labeled instances along with a larger number of unlabeled data points. See also supervised learning and unsupervised learning.

State space: State space refers to the formulation of an artificial intelligence problem in terms of states and operators. It usually consists of a start state, goal state, and a problem space that is searched to find a solution.

Supervised learning: Supervised learning is the process of training a model using labeled data. It involves learning a function that maps input variables to output based on example input–output pairs.

Support vector machines: Support Vector Machines are discriminative classifiers that define a diverging hyperplane for labeled training data. For optimization, these hyperplanes are used to categorize new data points.

Synthetic data: Synthetic data refers to artificially generated data used when collecting sufficient real data is challenging or when the original data does not meet specific requirements.

Test set: A test set is a collection of observations used to assess the predictive performance of a model at the end of training and validation; more concretely, this subset of the data available provides further insight into model performance based on its ability to generalize and fit new unexplored datasets.

Time series: A time series is a sequence of data points recorded at specific times and indexed according to their order of occurrence.

Testing: Testing evaluates the final performance of a model using hold-out data in supervised machine learning. Testing data refers to the subset of available data selected for the testing phase during model development.

Training data: Training data is a subset of the data available and is commonly associated with the methodology of algorithm construction, providing information to learn and subsequently infer and generate predictions throughout model development.

Training set: A training set is a set of observations used to generate machine learning models through the training process.

Transfer learning: Transfer learning is a branch of machine learning that leverages knowledge gained from solving one problem to address a different but related problem. It involves reusing a pre-trained model or its weights as a starting point for a new model on a different task.

Uncertainty: Uncertainty refers to a range of values that are likely to contain the true value, representing the lack of certainty in a prediction or estimation.

Underfitting: Underfitting occurs when a machine learning algorithm fails to capture the underlying structure of the data effectively. This often happens when the model is too simplistic or inappropriate for the given task. Underfitting results in poor performance on both the training and test sets, as the model fails to incorporate relevant variations in the data necessary for accurate predictions.

Unsupervised learning: Unsupervised learning is a branch of machine learning that focuses on inferring a function describing the structure of unlabeled data. It involves training a model to discover patterns in an unlabeled dataset, such as clustering.

Validation: Validation is the process of using a separate set of data, known as the hold-out data, to assess the performance of a trained model. It helps determine if any iterative modifications are needed to improve the model's performance. Validation provides feedback on how well the current parameters of the model generalize beyond the training set. It is distinct from the testing phase, which is the final assessment of the model's performance.

Validation set: A validation set is a collection of observations used during model training to evaluate how well the current parameters of the model generalize beyond the training set. By assessing the performance on the validation set, one can identify if the model is overfitting. If the model's performance improves on the training set but deteriorates on the validation set, it suggests overfitting, and training should be terminated.
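A hedged sketch of the stop-on-deterioration logic described above, using Keras early stopping; the toy regression data, network shape, and patience value are assumptions for illustration.

```python
import numpy as np
import tensorflow as tf

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 8)).astype("float32")
y = (X.sum(axis=1) + rng.normal(scale=0.1, size=500)).astype("float32")

model = tf.keras.Sequential([
    tf.keras.Input(shape=(8,)),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")

# Terminate training when validation loss deteriorates (overfitting begins)
stop = tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=3,
                                        restore_best_weights=True)
model.fit(X, y, validation_split=0.2, epochs=100, callbacks=[stop], verbose=0)
```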

Vanishing gradients: Vanishing and exploding gradients present significant obstacles to the performance of recurrent neural networks trained with gradient-based methods and backpropagation. Because each weight update is proportional to the partial derivative of the error function with respect to that weight, and backpropagation multiplies many such derivatives together across layers and time steps, gradients can shrink exponentially toward zero (vanish) or grow without bound (explode).
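A toy numerical demonstration of this multiplicative effect; the scalar-scaled identity matrix is an assumed stand-in for the per-step Jacobian of a recurrent network.

```python
import numpy as np

grad = np.ones(4)

# Backpropagating through 50 time steps multiplies the gradient by a
# Jacobian at each step; spectral radius below/above 1 makes it
# vanish/explode.
for scale in (0.5, 1.5):
    g = grad.copy()
    for _ in range(50):
        g = scale * np.eye(4) @ g  # stand-in Jacobian with spectral radius 'scale'
    print(scale, np.linalg.norm(g))  # ~1.8e-15 (vanished) vs ~1.3e9 (exploded)
```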

Variance: Variance refers to the error caused by the sensitivity of a model to small fluctuations in the training set. It is computed as the expectation of the squared deviation of a random variable from its mean. A low variance indicates that a model is internally consistent, with predictions exhibiting minimal variation from each other after each iteration. However, high variance and low bias suggest the model may be overfitting and excessively relying on noise present in the training set.

Variance inflation factor: A statistical measure that quantifies the multicollinearity between independent variables within a given time series and/or regression problem, reporting how much the variance of an estimated coefficient is inflated by its correlation with the other predictors.
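An illustrative check with statsmodels; the synthetic columns are assumptions, with x2 deliberately constructed to be nearly collinear with x1. (Without an intercept column the exact values shift, but the qualitative pattern holds.)

```python
import numpy as np
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 + rng.normal(scale=0.1, size=200)  # nearly collinear with x1
x3 = rng.normal(size=200)                  # independent predictor
X = np.column_stack([x1, x2, x3])

# VIF_j = 1 / (1 - R_j^2), where R_j^2 regresses column j on the others
for j in range(X.shape[1]):
    print(j, variance_inflation_factor(X, j))
# Columns 0 and 1 show large VIFs; column 2 stays near 1.
```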

Verification: Verification is the process of confirming that an implemented model functions as intended, operating according to the expected specifications and requirements and producing the desired outputs within the defined constraints.

Wiener–Khinchin theorem: Mathematical theorem that relates the autocorrelation of a wide-sense stationary time series to the power spectral density of the random process governing the series; the two form a Fourier-transform pair.
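In the standard statement of the theorem (notation assumed here), for a wide-sense stationary process x(t) with autocorrelation r_x(τ) and power spectral density S_x(f):

```latex
S_x(f) = \int_{-\infty}^{\infty} r_x(\tau)\, e^{-i 2\pi f \tau}\, \mathrm{d}\tau,
\qquad
r_x(\tau) = \int_{-\infty}^{\infty} S_x(f)\, e^{\, i 2\pi f \tau}\, \mathrm{d}f .
```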

Appendix C: Permafrost and carbon datasets

C.1. Permafrost

Bakian-Dogaheh K, Chen R H, Moghaddam M, Yi Y and Tabatabaeenejad A 2020 ABoVE: Active Layer Soil Characterization of Permafrost Sites, Northern Alaska, 2018 (ORNL DAAC)

Brown J, Hinkel K M and Nelson F E 2000 The circumpolar active layer monitoring (CALM) program: historical perspectives and initial results Polar Geography 24 165–258

Chen A, Parsekian A, Schaefer K, Jafarov E, Panda S K, Liu L, Zhang T and Zebker H A 2015 Pre-ABoVE: Ground-Penetrating Radar Measurements of ALT on the Alaska North Slope (ORNL DAAC)

Clayton L K et al 2021 Active layer thickness as a function of soil water content Environ. Res. Lett. 16 055028

Engram M J, Anthony K W and Meyer F J 2020 ABoVE: SAR-based Methane Ebullition Flux from Lakes, Five Regions, Alaska, 2007–2010 (ORNL DAAC)

Nicolsky D J, Romanovsky V E, Kholodov A L, Dolgikh K and Hasson N 2020 ABoVE: Soil Temperature Profiles, USArray Seismic Stations, AK and Canada, 2016–2019 (ORNL DAAC)

Schaefer K et al 2021 Field Measurements of Soil Moisture and Active Layer Thickness in Alaska and Canada (ORNL DAAC)

Schaefer K et al 2021 ABoVE: Active Layer Thickness Derived from Airborne L- and P-band SAR, Alaska, 2017 (ORNL DAAC)

Watts J D, Natali S and Minions C 2022 Soil Respiration Maps for the ABoVE Domain, 2016–2017 (ORNL DAAC)

Whitley M, Frost G, Jorgenson M T, Macander M, Maio C V and Winder S G 2018 ABoVE: Permafrost Measurements and Distribution across the Y-K Delta, Alaska, 2016 (ORNL DAAC)

Yi Y and Kimball J S 2020 ABoVE: Active Layer Thickness from Remote Sensing Permafrost Model, Alaska, 2001–2015 (ORNL DAAC)

C.2. Carbon

Rocha A, Shaver G and Hobbie J 2020 AmeriFlux BASE US-An1 Anaktuvuk River severe burn, ver. 2-5 AmeriFlux AMP, (Dataset) 10.17190/AMF/1246142

Rocha A, Shaver G and Hobbie J 2020 AmeriFlux BASE US-An2 Anaktuvuk River moderate burn, ver. 2-5 AmeriFlux AMP, (Dataset) 10.17190/AMF/1246143

Rocha A, Shaver G and Hobbie J 2020 AmeriFlux BASE US-An3 Anaktuvuk River unburned, ver. 2-5 AmeriFlux AMP, (Dataset) 10.17190/AMF/1246144

Billesbach D and Sullivan R 2022 AmeriFlux BASE US-A03 ARM-AMF3-Oliktok, ver. 5-5 AmeriFlux AMP, (Dataset) 10.17190/AMF/1498752

Billesbach D and Sullivan R 2021 AmeriFlux BASE US-A10 ARM-NSA-Barrow, ver. 4-5 AmeriFlux AMP, (Dataset) 10.17190/AMF/1498753

Zona D and Oechel W 2016 AmeriFlux BASE US-Atq Atqasuk, ver. 1-1 AmeriFlux AMP, (Dataset) 10.17190/AMF/1246029

Zona D and Oechel W 2019 AmeriFlux BASE US-Ivo Ivotuk, ver. 4-5 AmeriFlux AMP, (Dataset) 10.17190/AMF/1246067

Zona D and Oechel W 2016 AmeriFlux BASE US-Brw Barrow, ver. 2-1 AmeriFlux AMP, (Dataset) 10.17190/AMF/1246041

Euskirchen E 2022 AmeriFlux BASE US-BZB Bonanza Creek thermokarst bog, ver. 4-5 AmeriFlux AMP, (Dataset) 10.17190/AMF/1773401

Euskirchen E 2022 AmeriFlux BASE US-BZF Bonanza Creek rich fen, ver. 4-5 AmeriFlux AMP, (Dataset) 10.17190/AMF/1756433

Euskirchen E 2022 AmeriFlux BASE US-BZo Bonanza Creek old thermokarst bog, ver. 3-5 AmeriFlux AMP, (Dataset) 10.17190/AMF/1846662

Euskirchen E 2022 AmeriFlux BASE US-BZS Bonanza Creek black spruce, ver. 3-5 AmeriFlux AMP, (Dataset) 10.17190/AMF/1756434

Euskirchen E, Shaver G and Bret-Harte S 2022 AmeriFlux BASE US-Ich Imnavait Creek watershed heath tundra, ver. 4-5 AmeriFlux AMP, (Dataset) 10.17190/AMF/1246133

Euskirchen E, Shaver G and Bret-Harte S 2022 AmeriFlux BASE US-Ics Imnavait Creek watershed wet sedge tundra, ver. 6-5 AmeriFlux AMP, (Dataset) 10.17190/AMF/1246130

Euskirchen E, Shaver G and Bret-Harte S 2022 AmeriFlux BASE US-Ict Imnavait Creek watershed tussock tundra, ver. 4-5 AmeriFlux AMP, (Dataset) 10.17190/AMF/1246131

Kobayashi H, Ikawa H and Suzuki R 2019 AmeriFlux BASE US-Prr Poker Flat Research Range black spruce forest, ver. 3-5 AmeriFlux AMP, (Dataset) 10.17190/AMF/1246153

Randerson J 2016 AmeriFlux BASE US-Bn1 Bonanza Creek, 1920 burn site near Delta Junction, ver. 1-1 AmeriFlux AMP, (Dataset) 10.17190/AMF/1246033

Randerson J 2016 AmeriFlux BASE US-Bn2 Bonanza Creek, 1987 burn site near Delta Junction, ver. 1-1 AmeriFlux AMP, (Dataset) 10.17190/AMF/1246034

Randerson J 2016 AmeriFlux BASE US-Bn3 Bonanza Creek, 1999 burn site near Delta Junction, ver. 1-1 AmeriFlux AMP, (Dataset) 10.17190/AMF/1246035

Torn M and Dengel S 2020 AmeriFlux BASE US-NGB NGEE Arctic Barrow, ver. 3-5 AmeriFlux AMP, (Dataset) 10.17190/AMF/1436326

Torn M and Dengel S 2020 AmeriFlux BASE US-NGC NGEE Arctic Council, ver. 1-5 AmeriFlux AMP, (Dataset) 10.17190/AMF/1634883

Ueyama M, Iwata H and Harazono Y 2019 AmeriFlux BASE US-Fcr Cascaden Ridge fire scar, ver. 2-5 AmeriFlux AMP, (Dataset) 10.17190/AMF/1562388

Ueyama M, Iwata H and Harazono Y 2022 AmeriFlux BASE US-Rpf Poker Flat Research Range: succession from fire scar to deciduous forest, ver. 7-5 AmeriFlux AMP, (Dataset) 10.17190/AMF/1579540

Ueyama M, Iwata H and Harazono Y 2022 AmeriFlux BASE US-Uaf University of Alaska, Fairbanks, ver. 10-5 AmeriFlux AMP, (Dataset) 10.17190/AMF/1480322

Sullivan P 2022 AmeriFlux BASE US-KPL Lily Lake Fen, ver. 1-5 AmeriFlux AMP, (Dataset) 10.17190/AMF/1865478

Schuur T 2021 AmeriFlux BASE US-EML Eight Mile Lake permafrost thaw gradient, Healy, Alaska, ver. 4-5 AmeriFlux AMP, (Dataset) 10.17190/AMF/1418678

Oechel W 2016 AmeriFlux BASE US-Hva Happy Valley, ver. 2-1 AmeriFlux AMP, (Dataset) 10.17190/AMF/1246064

Oechel W 2019 AmeriFlux BASE US-Upa Upad, ver. 3-5 AmeriFlux AMP, (Dataset) 10.17190/AMF/1246108

NEON (National Ecological Observatory Network) 2022 AmeriFlux BASE US-xBA NEON Barrow Environmental Observatory (BARR), ver. 5-5 AmeriFlux AMP, (Dataset) 10.17190/AMF/1671892

NEON (National Ecological Observatory Network) 2022 AmeriFlux BASE US-xBN NEON Caribou Creek-Poker Flats Watershed (BONA), ver. 6-5 AmeriFlux AMP, (Dataset) 10.17190/AMF/1617727

NEON (National Ecological Observatory Network) 2022 AmeriFlux BASE US-xDJ NEON Delta Junction (DEJU), ver. 6-5 AmeriFlux AMP, (Dataset) 10.17190/AMF/1634884

NEON (National Ecological Observatory Network) 2022 AmeriFlux BASE US-xHE NEON Healy (HEAL), ver. 6-5 AmeriFlux AMP, (Dataset) 10.17190/AMF/1617729

NEON (National Ecological Observatory Network) 2022 AmeriFlux BASE US-xTL NEON Toolik (TOOL), ver. 5-5 AmeriFlux AMP, (Dataset) 10.17190/AMF/1617739

NEON (National Ecological Observatory Network) Soil temperature (DP1.00041.001), RELEASE-2021 10.48443/zayg-n802 (Dataset accessed from https://data.neonscience.org) (25 January 2021)

NEON (National Ecological Observatory Network) Triple aspirated air temperature (DP1.00003.001), RELEASE-2022 10.48443/q16j-sn13 (Dataset accessed from https://data.neonscience.org) (22 November 2022)

NEON (National Ecological Observatory Network) Soil water content and water salinity (DP1.00094.001), RELEASE-2022 10.48443/ghry-qw46 (Dataset accessed from https://data.neonscience.org) (22 November 2022)

NEON (National Ecological Observatory Network) Soil temperature (DP1.00041.001), RELEASE-2022 10.48443/9e56-pj39 (Dataset accessed from https://data.neonscience.org) (22 November 2022)

NEON (National Ecological Observatory Network) Soil CO2 concentration (DP1.00095.001), RELEASE-2022 10.48443/h4xx-yz37 (Dataset accessed from https://data.neonscience.org) (22 November 2022)

NEON (National Ecological Observatory Network) Precipitation (DP1.00006.001), RELEASE-2022 10.48443/6wkc-1p05 (Dataset accessed from https://data.neonscience.org) (22 November 2022)

NEON (National Ecological Observatory Network) 2D wind speed and direction (DP1.00001.001), RELEASE-2022 10.48443/77n6-eh42 (Dataset accessed from https://data.neonscience.org) (22 November 2022)

NEON (National Ecological Observatory Network) Shortwave and longwave radiation (net radiometer) (DP1.00023.001), RELEASE-2022 10.48443/stbf-bh38 (Dataset accessed from https://data.neonscience.org) (22 November 2022)

NEON (National Ecological Observatory Network) Groundwater and active layer measurements at permafrost sites (DP1.20099.001), RELEASE-2022 10.48443/1v5y-av98 (Dataset accessed from https://data.neonscience.org) (2 November 2022)

NEON (National Ecological Observatory Network) Soil physical and chemical properties, Megapit (DP1.00096.001), RELEASE-2022 10.48443/10dn-8031 (Dataset accessed from https://data.neonscience.org) (22 November 2022)

NEON (National Ecological Observatory Network) Albedo—spectrometer—mosaic (DP3.30011.001), RELEASE-2022 10.48443/j8yy-ky49 (Dataset accessed from https://data.neonscience.org) (22 November 2022)

NEON (National Ecological Observatory Network) Discrete return LiDAR point cloud (DP1.30003.001), RELEASE-2022 10.48443/ca22-1n03 (Dataset accessed from https://data.neonscience.org) (22 November 2022)

NEON (National Ecological Observatory Network) Elevation—LiDAR (DP3.30024.001), RELEASE-2022 10.48443/ymmp-fr93 (Dataset accessed from https://data.neonscience.org) (22 November 2022)

NEON (National Ecological Observatory Network) High-resolution orthorectified camera imagery (DP1.30010.001), RELEASE-2022 10.48443/ksv5-ts70 (Dataset accessed from https://data.neonscience.org) (22 November 2022)

NEON (National Ecological Observatory Network) Soil physical and chemical properties, periodic (DP1.10086.001), RELEASE-2022 10.48443/fk4j-ax76 (Dataset accessed from https://data.neonscience.org) (22 November 2022)

NEON (National Ecological Observatory Network) Field spectral data (DP1.30012.001), RELEASE-2022 10.48443/qy50-d561 (Dataset accessed from https://data.neonscience.org) (22 November 2022)

NEON (National Ecological Observatory Network) Dissolved gases in surface water (DP1.20097.001), RELEASE-2022 10.48443/4661-hh45 (Dataset accessed from https://data.neonscience.org) (22 November 2022)

NEON (National Ecological Observatory Network) Chemical properties of groundwater (DP1.20092.001), RELEASE-2022 10.48443/h7vt-3v55 (Dataset accessed from https://data.neonscience.org) (22 November 2022)

NEON (National Ecological Observatory Network) Litterfall and fine woody debris production and chemistry (DP1.10033.001), RELEASE-2022 10.48443/b2qt-7z79 (Dataset accessed from https://data.neonscience.org) (22 November 2022)

NEON (National Ecological Observatory Network) Bundled data products—eddy covariance (DP4.00200.001), RELEASE-2022 10.48443/7cqp-3j73 (Dataset accessed from https://data.neonscience.org) (22 November 2022)

Supplementary data (4.9 MB DOCX): supplementary .docx file including an initial correlogram, KDE and PCA plots, a comprehensive list of autocorrelation and stationarity tests, tables and layer graphs of submodule architectures, and residual deep learning background

Supplementary data (4.4 MB PDF): supplementary .pdf file including an initial correlogram, KDE and PCA plots, a comprehensive list of autocorrelation and stationarity tests, tables and layer graphs of submodule architectures, and residual deep learning background