Crop residue burning practices across north India inferred from household survey data: bridging gaps in satellite observations

In north India, agricultural burning adversely affects local and regional air quality during the post-monsoon season (October to November), when the prevailing meteorology is favorable for smog and haze formation. Quantifying the contribution of smoke to air pollution in this region, however, is challenging. While the Moderate Resolution Imaging Spectroradiometer (MODIS), aboard NASA’s Terra and Aqua satellites, provides a nearly 20-year record of global fire activity, the sensor cannot adequately capture small, short-lasting agricultural fires due to its moderate spatial resolution (500 m to 1 km) and limited overpasses (twice daily for each satellite), as well as the hazy conditions that typically obscure the north India land surface at this time of year. Moreover, current global fire emissions inventories based on MODIS observations can differ by up to an order of magnitude in this region. Here we incorporate household survey data to bridge gaps in MODIS observations and improve estimates of fire emissions over the states of Punjab, Haryana, Uttar Pradesh, and Bihar during the 2003-2018 post-monsoon burning seasons. We develop a novel method that adjusts MODIS Fire Radiative the state-level emissions to construct a gridded inventory at daily, 0.25° x 0.25° resolution over north India from 2003-2018. The inventory, SAGE-IGP ( S urvey Constraints on FRP-based Ag ricultural Fire E missions in the I ndo- G angetic P lain), improves modeling assessments of air quality impacts from agricultural burning, thus supporting effective policy development.


Introduction
Agricultural fires are an important seasonal source of outdoor emissions that degrade air quality in north India Liu et al., 2018;Vadrevu et al., 2011). The gases and aerosols released by open fires not only degrade regional air quality and increase risk to acute respiratory infection and other lung and cardiac diseases (Bikkina et al., 2019;Chakrabarti et al., 2019), but may also damage crops due to elevated surface ozone exposure (Burney and Ramanathan, 2014;Ghude et al., 2016;Sinha et al., 2015). Despite the health and environmental hazards of emissions from these fires, standard estimates can differ by an order of magnitude (Liu et al., 2019), depending on the species and year. Here we diagnose the key reasons for the differences among existing emission inventories and provide an improved inventory constrained by on-the-ground survey data and additional satellite observations.
The practice of agricultural burning in north India gained traction with the rise of combine harvester use in the mid-to-late 1980s (Badarinath et al., 2006;Liu et al., 2019). Mechanical harvesting generates abundant root-bound and loose crop residues that are difficult to manage manually, and steady increases in crop production have added to the volume of excess residues. For many farmers, burning is a convenient, cost-effective method to remove crop residues and quickly transition between the monsoon (kharif) and winter (rabi) crops; observations of active fires and burned area shows increases of ~40-142% from 2003-2016 in the western Indo-Gangetic Plain (IGP) (Liu et al., 2019(Liu et al., , 2020a. Both satellites and ground-based monitors have detected enhanced aerosol loading downwind of smoke plumes from agricultural fires across north India in recent years (Badarinath et al., 2009;Jethva et al., 2018;Kaskaoutis et al., 2014;Liu et al., 2018;Sarkar et al., 2018). Recent bans and intervention efforts, such as Happy Seeder technology, aim to reduce post-monsoon fires (Shyamsundar et al., 2019;Sidhu et al., 2015;Tallis et al., 2017).
Much of the focus in the literature so far has been on agricultural fires in Punjab and Haryana, two northern states that account for over 90% of post-monsoon fire intensity in India (Liu et al., 2020a;Sarkar et al., 2018;Vadrevu et al., 2013). Less is known about burning practices elsewhere in north India, such as Uttar Pradesh (UP) and Bihar, where many farmers also follow a rice-wheat rotation (Singh et al., 2011). This study examines crop residue burning practices in four states: Punjab, Haryana, UP, and Bihar. One difficulty in monitoring agricultural fires in this region is the coarse spatio-temporal resolution of satellite measurements (Liu et al., 2019). The small size and duration of the fires, as well as increasing haziness from the smoke itself, also complicate interpretation of satellite observations Liu et al., 2019;Thumaty et al., 2015). These challenges may lead to gross underestimation of fire emissions driving atmospheric models Dekker et al., 2019;Lasko et al., 2017). To date, models have relied on global fire emissions inventories due to the lack of inventories specific to India, but these emissions estimates, including those for aerosols, in the global inventories can differ by an order of magnitude (Liu et al., 2019). Here we use survey data to help constrain satellite-based estimates by filling observational gaps.
In this study, we use on-the-ground survey data to help constrain satellite-based estimates and provide greater clarity on the magnitude of fire emissions and their impact on air quality downwind. We also seek to better understand the drivers, consequences, and farmer perceptions of crop residue burning across north India. We develop a FRP-based approach incorporating satellite and household survey data to adjust state-level MODIS FRP for small fires from VIIRS, cloud/haze gaps in satellite observations, partial-field burns, and the diurnal cycle in fire activity. We validate our FRP-based estimates using survey burn rates, government statistics on crop production, and fuel-related factors from the literature. Finally, we spatially disaggregate total dry matter (DM) burned to construct a daily, gridded 0.25° x 0.25° emissions inventory for Punjab, Haryana, UP, and Bihar from 2003-2018. Our regional inventory, SAGE-IGP, can be used to update agricultural emissions in north India in existing global inventories for use in atmospheric modeling studies, thus bolstering confidence in assessments of the burden of smoke from agricultural fires on air quality metrics downwind.

Study region
Many agricultural regions across the Indian IGP are double-cropped with a rice-wheat rotation, which is critical to the food security and livelihood of over 400 million inhabitants across north India . In this study, we focus on four states in north India: Punjab, Haryana, UP, and Bihar ( Figure 1a). Punjab and Haryana, the western IGP states and "breadbasket" of India, have rice yields ~1.5 times those of UP and Bihar (Palanisami et al., 2019).

Household survey
In a household survey in 2017, we asked over 2000 farmers in the four target states about agricultural practices pertaining to rice harvests and rice residue burning for the 2016-17 growing season. For each village, we used a stratified purposive sampling technique to select a subset of 20 households that represent the village-level distribution of landholding sizes and social classes (Palinkas et al., 2015). We hired two survey teams to conduct the surveys on a mobile-based application in Hindi (Haryana, UP, and Bihar) and Punjabi (Punjab). In particular, we asked farmers about the method of harvesting rice (mechanical or manual) and subsequent burning of rice residues (Table S1). In 2018, we repeated the survey with 90% of the same participants for the 2017-18 growing season and expanded our list of questions to determine the farmers' primary reasons for crop residue burning, as well as details on their burning practices: (1) start year of burning, (2) method of burning (complete or partial burn of field), (3) time of day for burning, (4) wait time (in days) from harvest to burning, and (5) reasons for burning. More details about the household survey are provided in Supplementary Section S1.
As our fire metric, we rely on the daily maximum Fire Radiative Power (FRP), a proxy for fire intensity. The Fire Radiative Energy (FRE), or the time integral of FRP, scales linearly to dry matter burned (Wooster et al., 2005).
Following Zhang et al (2014) and Liu et al (2020a), we estimate the start, midpoint, and end of the cumulative FRP during each post-monsoon burning season from 2003-2018: where < " is the sigmoid-smoothed partial sums of the sequence of daily FRP over day 1 to k, is the total number of days in the burning season, and ! is the first day by when < " , normalized by the seasonal sum of FRP < % , has surpassed breakpoint . We take = 0.1, 0.5, and 0.9 to represent the start, midpoint, and end, respectively, of the burning season. Unlike Liu et al (2020a), here we test the effect of sigmoid smoothing on estimating ! and its trends (Zhang et al., 2014). Zhang et al. (2014) found that the sigmoidal function well represents the cumulative biomass consumed by fire. For sigmoid smoothing, we use the nonlinear squares nls function in the R stats package to fit a sigmoidal curve to the partial sums of FRP: where and are shape parameters to be optimized and is a sequence from 1 to representing days in the burning season.

Statistical adjustments of agricultural fire emissions using satellite and survey data
Liu et al (2019) found that MODIS cannot capture > 75% of small, short-lasting fires in Punjab and Haryana. While that study developed a hybrid MODIS-Landsat algorithm (ModL2T) to improve the spatial allocation of burned area (BA) and BA-based fire emissions, the low temporal resolution of Landsat (every 16 days) and the possible conflation of harvested area and burned area suggests that FRP-based algorithms may enable fire emissions estimates at finer temporal resolution and with lower commission errors . Here we first derive daily state-level post-monsoon fire emissions from 2003-2018 for Punjab, Haryana, UP, and Bihar from MODIS FRP (Sections 2.4.1-2.4.2). Then, we disaggregate the state-level emissions to a gridded, 0.25° x 0.25° inventory (Section 2.4.3). We estimate emissions by state first rather than by grid cell to limit inconsistencies between neighboring grid cells and for computational efficiency. A reconstruction of long-term fire activity from 2003-2018 is valuable This is a postprint for EarthArXiv. This paper has been peer-reviewed andpublished in Atmospheric Environment: X at https://doi.org/10.1016/j.aeaoa.2020.100091. for assessments of air quality trends and to compare our regional inventory with global fire inventories, which generally start from 2003 at daily scale.

Adjustment of FRP based on survey data and additional satellite observations
Using both satellite and household survey data, we adjust the MODIS FRP to account for small fires, cloud/haze gaps, partial burning, and limited satellite overpasses. For each state and year, we derive an adjusted daily FRP timeseries over a 4-month period, September to December. This extended study period for post-monsoon fires allows us to accommodate the different timing of each state's fire season and to ensure stability in smoothing FRP timeseries. Figure 2 shows the graphical depiction of each step detailed below.

MODIS observations of FRP.
We first sum daily MODIS Terra and Aqua FRP during each post-monsoon burning season and over each state. This step assumes that the agricultural fires in this region are short-lived (~ 1/2 hour), following Thumaty et al (2015), and that the instruments detect different fires at the overpass times, Terra at 10:30 a.m. and Aqua at 1:30 p.m. at the equator. Here we use the maximum FRP from the MOD/MYD14A1 gridded active fire product and apply an agricultural mask derived from MCD12Q1 to ensure that only cropland fires are considered and conservatively exclude FRP in intermixed land covers. We adjust Terra and Aqua FRP separately for Steps 2-3 but sum the adjusted Terra and Aqua FRP at the end of Step 3.

Use of VIIRS observations for small fires.
Next, we incorporate the FRP observations from VIIRS, which at 375 m has a finer spatial resolution than the MODIS products (1 km) and so can more accurately capture fine-scale fire activity than MODIS. To account for these missing small fires, we diagnose those VIIRS active fires that do not intersect with MODIS/Aqua active fires within a 1-km buffer and then add VIIRS FRP of these fires to MODIS/Aqua FRP. We use only MODIS/Aqua FRP because VIIRS does not observe active fires during the Terra overpass. Because VIIRS observations are available only from 2012, we derive the incremental VIIRS boost for 2003-2011 for the entire state by taking the average ratio of additional VIIRS FRP and MODIS/Aqua FRP over 2012-2018 and then scaling up the MODIS/Aqua FRP over the earlier years by that ratio. Assuming that the fraction of FRP missed by MODIS but detected by VIIRS is the same during the Terra and Aqua overpasses, we also boost MODIS/Terra FRP uniformly by the same ratio from 2003-2018 to account for missed small fires during the morning overpass.
3. Filling in gaps of observed FRP due to clouds and haze. The evolution of fire activity over the burning season as detected by MODIS is not smoothly varying but is instead characterized by dips or gaps in regional total FRP. Cusworth et al (2018) suggest that this large day-to-day variability in active fire detections in north India is due in large part to clouds, haze, and/or smoke, occasionally obscuring the fire activity on the ground ( Figure S3). While MODIS and VIIRS routinely detect fires through thin haze by using thermal infrared bands (Giglio et al., 2016), the persistent, dense haze during peak fire days in north India can decrease brightness temperature over a broad region, thus interfering with detection of thermal signatures of small fires. Generally, agricultural fires are short-lasting, but temporally and spatially aggregating the fires in Punjab and Haryana reveals a Gaussian-like distribution (Jethva et al., 2019;Liu et al., 2020a). Taken together, many agricultural fires act like a large, long-lasting fire, and so we can track the influence of cloudy/hazy days as deviations from the expected Gaussian curve. To test this hypothesis, we check whether these dips or gaps in the summed FRP timeseries for each state correspond with MODIS observations of surface reflectance (MOD/MYD09GA) in the red visible band, * . As noted above, surface reflectance in this band is sensitive to clouds (Xiang et al., 2013) and so would be expected to anticorrelate with the area visible to satellites during the burning season. We extend this assumption to thick haze, which can be misclassified as cloud in this region and also obscure active fires from satellite detection . We then take advantage of * measurements to gauge the extent to which clouds or haze interferes with fire detection, and we iteratively fill in the cloud/haze gaps in the statewide data for each fire season. Additional details on cloud/haze gap-filling procedure is described in Supplementary Section S3.2.

4.
Boosting FRP with survey data on partial field burning. The survey data reveal that in the four states, 30-57% of farmers piled the loose crop residue in the center of the field before setting the residue on fire, resulting in partial burning of the field. Taking the practice into account has importance in constructing fire emission inventories for three reasons. First, the FRP from partial burns, which consume small, discrete areas, are much less likely to be observed from space than the FRP from fires that completely burn a field (Liu et al., 2019). Second, only loose residues are set on fire in partial burns, yielding less DM burned than in complete burns . Third, observations suggest that the emissions factor for PM2.5 in partial burns, with respect to the mass of rice residue burned, is on average ~1.92 times that for complete burns due to the incomplete, smoldering combustion of wetter residues .
To overcome these challenges, we assume as an upper bound that all partial fires have been missed by satellite detection (Liu et al., 2019). For each state, we boost the daily FRP by the partial-burn fraction derived from survey data and normalized by operational landholding area. To account for the lower mass of DM burned in partial burns, we also apply a scaling factor to the partial-burn FRP, since FRP is linearly proportional to DM burned (Wooster et al., 2005). Here we scale partial-burn FRP by 0.75, or the approximate fraction of total crop residues that are piled in the center of the field and burned. This factor assumes a rice plant height of ~101 cm, of which 20-22 cm are left standing after harvest (Mahajan et al., 2009). Our resulting estimates of FRP from partial fires are then distributed uniformly in time across each burning season.
5. Adjustment to take into account the diurnal cycle of fire activity. The two satellites associated with MODIS each have one daytime overpass per day -Terra at ~10:30 a.m. and Aqua at ~1:30 p.m. local time at the equator. Typically, Aqua detects over five times as many fires as Terra in northwest India during the post-monsoon (Liu et al., 2019). These overpass times can miss the peak burning times of individual agricultural fires, which are small and shortlived, each lasting only about half an hour (Thumaty et al., 2015). Here we adjust the satellitederived FRP to reflect those fires unseen by satellites. While GFEDv4s provides 3-hourly diurnal fractions of fire activity, they are extrapolated to other regions using geostationary observations over North and South America, aggregated by broad land cover types (Mu et al., 2011;van der Werf et al., 2017). We thus take advantage of the survey data to adjust the FRP captured by MODIS to reflect the diurnal variation of agricultural fire activity specific to north India. The survey responses are separated into four time periods: early morning (4-10 a.m.), mid-day (10 a.m.-2 p.m.), evening (2-6 p.m.), and late night (6-11 p.m.). For example, the survey data reveal that 8-41% of IGP farmers typically set fires between 10 a.m.-2 p.m., depending on the state. Additionally, the Terra and Aqua/S-NPP daytime overpasses only partly overlap with the mid-day burning window, assuming that all fires last half an hour. Accounting for variance in when satellites see each fire and how long each fire burns, we estimate that satellite-derived FRP captures just 45 minutes of fire activity per satellite over this 4-hr mid-day time interval ( Figure  S5). The temporal distribution of active fire counts, binned at 15-minute intervals, for Terra and Aqua show that Terra peaks at ~11:15 a.m. while Aqua peaks at ~1:45 p.m. ( Figure S6). We extrapolate Terra FRP from 10:00 a.m. to 12:30 p.m., and Aqua FRP from 12:30-2 p.m. to match the survey's 4-hr mid-day window. Based on the fire duration and transition from Terra to Aqua at 12:30 p.m., we multiply Terra and Aqua FRP by a scaling factor of 3.33 and 2, respectively. We further adjust daily total FRP by assuming that all fires outside the mid-day window are undetected and by adding FRP increments according to the temporal distribution implied by the survey data. We weight these increments by the operational landholding area with reported burning in each time window.
As a post-processing step, we remove anomalous FRP spikes that often occur outside the burning season and are likely contaminated by false satellite detections. An anomalous day is tagged if its FRP exceeds three times the maximum FRP in a 2-day buffer window (5 days in total) and is above the 25th percentile of daily FRP from September-December of that year.
To account for agricultural fires that extend eastward from Haryana into the state of Rajasthan along the Ghaggar-Hakra River, we also include Ganganagar and Hanumangarh, two districts in north Rajasthan. We follow the same methods as described above but use survey data for Haryana for Steps 4-5.

Conversion to dry matter burned and emissions
For the final step in constructing our improved fire inventory, SAGE-IGP, we follow Kaiser et al (2012) to convert FRP in each grid cell to dry matter burned and then to emissions for various chemical species, as is done in constructing the Global Fire Assimilation System (GFAS) emissions inventory: where + is the emissions of species (g species), FRE is the fire radiative energy (MJ), or the time integral of FRP, is a conversion factor dependent on land use/land cover (kg DM MJ -1 ) that yields DM burned, and + is the emissions factor for species (g species kg -1 DM). To convert FRP to FRE, we multiply the adjusted daily FRP by the lifetime of the agricultural fires, which we assume to be 30 minutes, or 1.8 x 10 4 s day -1 , in this region (Thumaty et al., 2015). Following Kaiser et al (2014) and Liu et al (2015), we use a conversion factor for agricultural fires of 0.41 kg MJ -1 .
To validate the DM burned derived from adjusted FRP, we focus on Punjab, which accounts for > 85% of MODIS-observed FRP during the post-monsoon in the study region. We use a bottom-up method, following Aalde et al (2006), that involves burn rates from the household survey and government crop production estimates from the Indiastat data portal (Indiastat.com), and crop-specific parameters from literature for 2016 and 2017: where (,-%./ is the fraction burned, is crop production in kg (in this case, of kharif rice), This is a postprint for EarthArXiv. This paper has been peer-reviewed and published in Atmospheric Environment: X at https://doi.org/10. 1016/j.aeaoa.2020.100091.
is residue-to-crop ratio, 33 is combustion completeness, and 01 is the mass fraction of DM burned of total from crop production (Table 1). Here, fuel loading ( ) is the product of , , and 01 over the cultivated area ( ) in units of g m -2 ; fuel consumption ( ) is the product of fuel loading and 33 : Following the FRP-based method for estimating adjusted DM burned, here we also adjust the DM for partial burns using survey data.
As we will see, our top-down estimates of fuel load agree well with bottom-up validation for the 2016 and 2017 post-monsoon burning seasons (Section 3.2). We then extend these bottom-up estimates to 2003-2018 by first calculating the ratio of survey burn rates to the satellite-derived, adjusted FRP for 2016-17 and then applying this ratio to all years in the satellite FRP record.
Finally, application of emissions factors allows us to quantify emissions of black carbon (BC) and primary organic carbon (OC), as well as of CO2 and CO, from the agricultural fires. We focus on these four species for the following reasons: OC and BC as components of particulate matter, CO2 as a greenhouse gas, and CO as a primary pollutant from combustion. However, we note that DM burned, provided in our SAGE-IGP inventory, can be converted to any chemical species provided that a corresponding emissions factor is available. Here we use the compilation from Andreae (2019) to be consistent with fire emissions broadly designated as "agricultural" in standard global inventories, but we note differences in emissions factors that are region-specific and derived from rice residue burning in Supplementary Section S3.4. For OC and BC from partial field burns, we additionally scale DM by a factor of 1.92 to account for the higher PM2.5 emissions factor in these fires relative to complete-field burns , as described in Section 2.4.1. We compare the resulting statewide emissions estimates with five global inventories: (1)

Constructing a spatially and temporally explicit gridded emissions inventory
The steps described so far yield total seasonal emissions for each state for 2003-2018. We next disaggregate the state-level DM emissions to daily, 0.25° x 0.25° spatial resolution to create a gridded inventory, SAGE-IGP. We start with total state-level emissions rather than the finer gridded resolution to limit noise, ensure convergence in our cloud/haze gap-filling step, and aggregate survey responses from sparsely located households.
First, we allocate the seasonal DM emissions spatially according to the fraction of MODIS Terra + Aqua unadjusted FRP in each grid cell for the season. Second, we approximate the evolution of fire activity over the season in each grid cell as Gaussian, using the dates of three breakpoints, or ! , where = 0.1, 0.5, and 0.9 (defined in Section 2.2.1): where is the value of the Gaussian on day . Because the day of peak burning varies spatially within the state, we cannot simply impose uniform daily variability across the state using our daily DM emissions. For each grid cell, the corresponding Gaussian distribution, whose maximum value is 1, is multiplied by the spatially-allocated DM emissions from Step 1. Finally, we iteratively nudge the gridded DM emissions until convergence such that (1) the daily total of our gridded inventory matches the state-level adjusted DM emissions, and (2) the spatial allocation of our gridded inventory matches that of the MODIS Terra + Aqua unadjusted FRP on a seasonal basis. One caveat is that this step assumes all grid cells within each state are equally obscured by clouds and/or thick haze. However, this is a reasonable assumption given that cloud cover and thick haze generally affect large areas across the state ( Figure S3). With our gridded inventory, we also provide an ancillary dataset of gridded hourly fractions of fire activity, based on household survey data (Supplementary Section S3.3).

Ground and satellite-based measurements of aerosols
We use ground and satellite-based measurements of aerosols to check whether we improve the temporal distribution of fire emissions over current global inventories. We focus on October-November 2017, when a hazy/cloudy period lasted for almost 3 weeks during the postmonsoon fire season. The Aerosol Robotic Network (AERONET) site in Kanpur, India (80.23°E, 26.51°N) provides a long record of ground-based aerosol optical depth (AOD) measurements (from 2001-present), which have been used to infer the properties and transport of smoke aerosols emitted from post-monsoon agricultural fires across the IGP (Kaskaoutis et al., 2014). As an ancillary dataset, we use the Aerosol Index (AI) from the Ozone Measuring Instrument (OMI) aboard the Aura satellite, gridded to a spatial resolution of 1° x 1°. The OMI AI reliably indicates enhancements in absorbing aerosols, such as those in soot and smoke, using radiances at the 354 and 388-nm ultraviolet wavelengths (Kaskaoutis et al., 2014;Torres et al., 2007). We spatially average daily AI over Punjab but exclude those days with only one observation across the state.

Atmospheric modeling of smoke from agricultural fires: validation using station PM2.5 observations
We assess the utility of our regional inventory, SAGE-IGP, compared to the five global inventories, in the context of atmospheric modeling (Table S10). Following Cusworth et al (2018), we use the Stochastic Time-Inverted Lagrangian Transport (STILT) model, version 2, to generate gridded maps of the sensitivity of air quality in New Delhi to upwind fire emissions in This is a postprint for EarthArXiv. This paper has been peer-reviewed and published in Atmospheric Environment: X at https: //doi.org/10.1016/j.aeaoa.2020.100091. October-November from 2013-2018 (Fasoli et al., 2018). We drive STILT with meteorology from the Global Data Assimilation System (GDAS; https://www.ready.noaa.gov/archives.php), available at 0.5° x 0.5° spatial resolution, to simulate the 5-day back trajectories of an ensemble of 500 air parcels. The resulting STILT sensitivity footprints, in units of ppm/[μmol/(m 2 s)], are then multiplied by the emissions of primary PM2.5 from each fire inventory, including SAGE-IGP, to obtain the modeled smoke PM2.5 for New Delhi. Here we define primary PM2.5 as the sum of OC and BC, following Koplitz et al (2016) and Cusworth et al (2018). For each inventory and year, we then calculate the correlation between daily-averaged modeled PM2.5 and PM2.5 observed at the U.S. Embassy in New Delhi (https://in.usembassy.gov/embassyconsulates/new-delhi/air-quality-data/). Cusworth et al (2018) previously found that the PM2.5 observations at the U.S. Embassy are well correlated with average citywide measurements from the Indian Central Pollution Control Board (CPCB) during the post-monsoon burning period. Figure 3 shows the average temporal evolution of fire activity, crop phenology, and rainfall in Punjab, Haryana, UP, and Bihar derived from satellite data. The total post-monsoon fire intensity in Punjab is on average one to two orders of magnitude higher than that in Haryana, UP, and Bihar. Punjab is a highly productive state with larger fields and higher use of combine harvesters, thereby yielding more excess residues that need to be managed. As revealed by the upward triangles in the NDVI time series, maximum winter greenness in the western IGP is greater than that in the eastern IGP, with Haryana at 0.7 and Punjab at 0.76 versus Bihar at 0.56 and UP at 0.6; this confirms the gap in winter crop productivity between these two regions (Jain et al., 2017). Maximum monsoon greenness is more homogenous across all states (0.61-0.71). In the western IGP, the summer monsoon follows the pre-monsoon fire season from March to May and precedes the post-monsoon fire season from October to December. In the eastern IGP, the pre-monsoon fire season follows the earlier monsoon onset and thus starts in mid-March rather than mid-to-late April. As the monsoon continues through October, the post-monsoon fire season occurs later and extends to December.

Crop residue burning across the Indo-Gangetic Plain: drivers, consequences, and farmer perceptions
Traditionally, farmers across the IGP have harvested rice in the post-monsoon season manually. By 2017, 61% of households in surveyed in Bihar and 71% in UP still followed this practice, while 62% in Haryana and 93% in Punjab had transitioned to fully mechanized harvesting, namely using combine harvesters (Figure 4). The large amounts of loose and intact residues generated from combine harvesters are difficult to clear manually and thus often burned (Tallis et al., 2017). Based on 474 responses, we find that IGP farmers started to burn rice residues as early as 1957, with the most rapid growth occurring after the mid-1990s (Figure 1a). The 10-yr period with the highest rate of households adopting the practice of crop residue burning took place more than a decade earlier in Punjab (mid-1990s to early 2000s) than in Haryana, Bihar, and UP (mid-2000s to 2010s).
As crop production increased and mechanization continued spreading across eastern IGP, burn rates also increased. In 2017, over a quarter of surveyed farmers burned crop residue after This is a postprint for EarthArXiv. This paper has been peer-reviewed and published in Atmospheric Environment: X at https: //doi.org/10.1016/j.aeaoa.2020.100091. rice harvests, with post-monsoon fire activity concentrated in Punjab: 53% of farmers in Punjab burned rice residue, compared to 9-30% in Haryana, Bihar, and UP (Figure 4). At the household level, the year-to-year persistence in burning varies: in 2016, higher percentages of farmers in Punjab (82%), Haryana (20%), and UP (14%) burned rice residue compared to 2017, while a lower percentage of farmers in Bihar (18%) burned. However, the 2016-2017 decline in the burn rate in Punjab is less pronounced (71% vs. 89%) when weighted by operational landholding area, suggesting that farmers with larger fields continued to burn residues in 2017. In any case, the recent decline in burning may reflect intentional underreporting, given the recent government bans on agricultural fires.
We find that the time of burning varies spatially: peak burning occurs roughly evenly between mid-day (10 a.m.-2 p.m.) and evening (2-6 p.m.) in Punjab and Haryana but mainly in the evening in UP and Bihar (Figure 1b). Liu et al (2019) found that the method of burning also varies spatially: crop residues are primarily managed by complete-field burning in Punjab and northern Haryana and more commonly by partial-field burning in central and southern Haryana. This conclusion is supported by the increasing fraction of partial burning from western to eastern IGP (30% in Punjab to 57% in Bihar) (Figure 1c). Consistent with Kim Oanh et al (2011), we find that the type of burning is associated with the method of harvest, with 68% of fields with complete burns -and conversely, only 19% of those with partial burns -were harvested fully mechanically in the IGP.
Relative to Punjab, the more recent adoption of crop residue burning at the householdlevel in Haryana, Bihar, and UP, along with the current low rate of burning (12-46%) among survey households in these states, suggests high potential growth in agricultural fire activity (Figure 1a, Table S4). For example, assuming a future scenario in which all households across the IGP harvest rice mechanically, the rate of crop residue burning in terms of landholding area would increase by just 2-27% in Punjab and Haryana, compared to 2016-2017, but by 67-207% in UP and Bihar (Table S4). These values assume that the proportion of burned versus unburned fields relying on mechanized harvesting remains constant in each state.
Nearly 90% of farmers surveyed across the IGP believe that rice residue burning impacts the air quality of nearby cities (Table S3). Nevertheless, for farmers, the positive effects of burning, namely saving time and cost in rice residue management, ultimately outweigh the potential negative effects, including what the farmers fear could be damages to soil health and lower crop yield. We find that 56-92% farmers burn rice residue to overcome the short turnaround time to prepare the land to sow the next crop ( Figure S1a). Nearly three quarters of households wait 10 or fewer days after rice harvests to burn the crop residue, underscoring the quick transition from the kharif to rabi crops ( Figure S1b). Other factors that play a role in the decision to burn crop residue include the unsuitability of the rice residue as cattle feed (42-76% of farmers), difficulty in cutting and managing the residue (61-80%), absence of technology to manage the residue (29-64%), and lack of incentive from the government to not burn, especially in Punjab, where 81% of surveyed farmers cite this factor ( Figure S1a). In addition to circumventing the short transition period between crops, 80% of farmers say that burning saves cost in cutting and managing rice residue (Table S3). On the other hand, more farmers (39-44%) believe that crop residue burning negatively impacts soil health, in terms of crop yield, fertilizer usage, and soil color and texture. Only 7-29% of farmers think that fire improves soil health in these ways. Figure 2 shows an example of daily timeseries of MODIS FRP adjusted for small fires, cloud/haze gaps in satellite observations, partial-field burning, and the fire diurnal cycle for Punjab for the 2017 post-monsoon season. The VIIRS small fires boost increases MODIS FRP on clear-sky days and overall by 109%, while the cloud/haze gap-fill further increases the overall MODIS + VIIRS FRP by 142%, with the greatest adjustment during cloudy/hazy periods, such as from October 30 to November 17. The household survey data implies that partial field burning adds 33% more FRP. Accounting for the fire diurnal cycle results in a further 361% boost in FRP, by far the most uncertain of the adjustments. This boost is due to the large number of shortlasting fires inferred from the survey data that occur outside the satellite overpass times, leading to "missing," or unobserved fire activity in this region. Our FRP estimates are not sensitive to the assumption that fires last just half an hour, because any change in our fire duration assumption affects both the diurnal cycle adjustment and the FRP to FRE conversion. For example, if we assume instead that fires last an hour, the factor to account for fires seen outside the satellite overpasses during mid-day survey period would be halved, and the FRP to FRE conversion factor would double, thereby yielding no net change in our estimate of DM burned.

Adjusted emissions from agricultural fires using satellite and survey data
Using the adjusted FRP, we estimate on average 10.4 ± 3 Tg DM burned each year, or 65 ± 18 Gg OC, 5.6 ± 1.6 Gg BC, 791 ± 225 Gg CO, and 14.9 ± 4.2 Tg CO2, in the IGP per postmonsoon burning season from 2003-2018 (Figure 5a-b). Punjab comprises 70% of total DM burned and 67% of aerosol emissions. Importantly, for Punjab, our FRP-based estimates of DM burned calculated from adjusted FRP in 2016-2017 are consistent with our bottom-up estimates based on burn rates from the household survey and Indiastat-derived fuel loadings (Figure 6a), lending confidence to our method. Overall, we find that DM burned increased by 112% from 2003 to 2018. In contrast, without adjustment, the apparent 16% increase in MODIS Terra + Aqua FRP is not statistically significant (p = 0.45). The discrepancy in trends arises because as fire intensity increases, haze cover also likely increases and obscures fires to a greater extent. Our cloud/haze gap-fill compensates for the ~36% decline in the satellite observable fraction ( Figure S4b  2) values. Our estimates are closest in magnitude to FINNv1.5, higher than GFEDv4s, GFASv1.2, and QFEDv2.5r1 but lower than FEERv1.0-G1.2. These differences with other inventories are discussed below. As we shall see, the close match with FINNv1.5 is fortuitous, with FINNv1.5 underestimating burned area and overestimating fuel consumption.
To further examine the utility of our adjusted FRP approach, we compare our daily DM burned with different global inventories in the context of aerosol loading in Punjab from October-November 2017 (Figure 7). In 2017, an almost 3-week cloudy/hazy period persisted from October 30 to November 17, with minimal fire activity detected during the second and third weeks of November. Using a model combined with satellite data, Dekker et al (2019) suggested This is a postprint for EarthArXiv. This paper has been peer-reviewed andpublished in Atmospheric Environment: X at https://doi.org/10.1016/j.aeaoa.2020.100091. that residential and commercial combustion was the most important driver of extreme pollution over the IGP from November 11-19, 2017. However, we argue that agricultural fire activity during this period is grossly underestimated and likely also a key emissions source. Our reasoning is as follows. First, three global inventories -GFASv1.2 (used in Dekker et al (2018)), FINNv1.5, and QFEDv2.5r1 -all show a hiatus in fire activity bounded by two local maxima in fire activity (Figure 7a-b). This hiatus coincides with the cloudy/hazy period and low satellite observable fraction (mostly < 0.7) during the Aqua overpass time, or when most postmonsoon fires occur (Figure 7b; Vadrevu et al 2011). Second, the variations in aerosol loading during this time period closely follow the Gaussian-like temporal evolution expected of postmonsoon fires (Figure 7c; Kaskaoutis et al 2014, Liu et al 2020a. Finally, large enhancements in both daily AOD (> 1) at Kanpur and mean OMI AI (> 1.5) over Punjab throughout the cloudy/hazy period suggest that fire activity continued during this time although obscured from satellite detection. Figure 8 shows the correlations of observed PM2.5 in New Delhi and modeled smoke PM2.5 using our inventory, SAGE-IGP, and the five global inventories from 2013-2018. Of particular importance is our finding that modeled PM2.5 using SAGE-IGP emissions is moderately correlated with observed PM2.5 in 2017 (r = 0.57) but not correlated when the global inventories are applied (r < 0.1). With regard to the 2013-2018 time period, the higher correlation achieved using SAGE-IGP (mean r = 0.6) over the global inventories (r = 0.39 to 0.45) demonstrates the utility of our regional inventory for atmospheric modeling studies.
Our study demonstrates that more rigorous correction is needed during persistent cloudy/hazy conditions than is carried out by the current cloud correction algorithms used in some inventories, such as GFASv1.2 and QFEDv2.5r1 (Darmenov and da Silva, 2013;Kaiser et al., 2012). Our method iteratively gap-fills this hiatus in observed FRP in 2017, resulting in daily DM burned that follows a Gaussian-like distribution similar to that in other years, and more closely matches the bottom-up approach based on the household survey burn rates and Indiastat rice production (Figure 7a; Liu et al, 2020a). However, the 5-31% underestimate of DM burned in 2017 using our FRP-based approach compared to the bottom-up method suggests that our cloud/haze gap-fill adjustments to MODIS FRP may still be somewhat conservative ( Figure 6). In particular, our cloud/haze gap-fill method requires some peak fire days to be well-observed to provide an upper bound for the adjustment. In 2017, however, weeks-long thick haze persisted through the expected peak fire days, leading to severe underestimates of fire activity ( Figure S3). Figure 5c shows average total DM burned from 2003-2018 over the IGP from our gridded 0.25° x 0.25° inventory. While our adjusted FRP approach leads to a more realistic seasonal, state-level budget for DM emissions than current global inventories, we note several limitations. First, the disaggregated emissions inventory contains statistical noise from the iterative cloud/haze gap-fill adjustments as well as from our application of a Gaussian temporal distribution of emissions within each grid cell. Further analysis using AOD observations with back trajectory modeling could help constrain the daily variability in fire emissions. Second, the household survey sample sizes for Haryana, UP, and Bihar are only 10-29% of that for Punjab, leading to uncertainty in the implications of the survey for these states (Figure 1). We are relatively confident in the partial burn fractions and the fire diurnal cycle that we derived from survey data. However, the burn rate used for validation is more uncertain, given that recent bans on agricultural burning may have led to underreporting by the farmers. We also do not know if the partial-field burn fraction or diurnal cycle changed over time. To our knowledge, the household survey provides the best available ground-level data we have access to on agricultural burning in north India. Third, the adjusted FRP assumes a uniform fire diurnal cycle across each state throughout the time period, as well as a uniform spatial and temporal distribution of partialfield burning. More detailed on-the-ground data may help constrain the spatio-temporal variability in these two parameters. Fourth, as with current global fire emissions inventories, we lack information to provide detailed uncertainty estimates for our gridded emissions. Thus far, only the emissions factors for different species allow for a detailed uncertainty analysis based on standard deviations from the laboratory and field studies compiled by Andreae (2019) and Lasko and Vadrevu (2018).

Limitations and uncertainties in constructing spatio-temporal explicit emissions
In our adjusted FRP approach, the fraction of cloud/haze gap-fill FRP to satelliteobserved FRP can be taken as the relative uncertainty of satellite-derived FRP from year to year ( Figure S7). This uncertainty anti-correlates with the satellite observable fraction ( Figure S4b). A further 5-9% uncertainty stems from the static VIIRS FRP boost factor applied to years from 2003-2011, when no VIIRS observations are available (Table S7). For the survey-based components, we use a bootstrap hold-out method to estimate 10-18% uncertainty in partial burn fractions and 14-42% in the diurnal cycle boost for the four states (Table S8). A more detailed discussion of the uncertainties associated with each step is provided in Supplementary Section S3.4.
In future versions of SAGE-IGP, several technical improvements can be made to the adjusted FRP method with regard to input satellite products. First, taking into account the slight day-to-day variability in satellite overpass times may increase the accuracy of the VIIRS boost and diurnal cycle adjustment. Currently, we assume that the Aqua and S-NPP overpasses are perfectly aligned for the VIIRS FRP boost, but the offset between the two overpasses varies from a few seconds to 50 minutes (Li et al., 2018). Second, the VIIRS and MODIS FRP merging can be fine-tuned by correcting for biases in FRP due to atmospheric absorption and solar zenith angle (Dickinson et al., 2016). Li et al. (2018) found nearly a 1:1 ratio between daytime MODIS and VIIRS FRP in Asia, though this correlation may vary with fire intensity, location, timing, and land cover. When available in Google Earth Engine, the VNP14A1 gridded active fire product can replace the VIIRS VNP14IMGML active fire geolocation product currently used in our workflow to better complement the MODIS MxD14A1 gridded products. Both the MODIS and VIIRS "A1" products are in sinusoidal projection, and together their use avoids resampling errors introduced by using the VNP14IMGML product, which is unprojected. Finally, explicit calculations of the detection limits of MODIS and VIIRS in view of both the small fire size and observing conditions are needed to refine the cloud-haze gap fill.

Differences in fire activity, fuel consumption, and emissions factors assumed in inventories and the consequences for emissions estimates
Global fire emissions inventories ingest different combinations of MODIS burned area and active fire products. For example, GFEDv4s, a bottom-up inventory, relies primarily on the This is a postprint for EarthArXiv. This paper has been peer-reviewed and published in Atmospheric Environment: X at https: //doi.org/10.1016/j.aeaoa.2020.100091. MCD64A1 burned area product with the MCD14ML active fire product for its small fire boost, while GFASv1.2, QFEDv2.5r1, and FEERv1.0-G1.2 use FRP from the MOD/MYD14 active fire products (Liu et al., 2020b). In Figure S9, we compare the spatial patterns of burned area (MCD64A1) and active fire pixels (MxD14A1) stacked by year, in which for each pixel any occurrence of burning for a given post-monsoon season is assigned a value of 1. The stacked values for MCD64A1 are more spatially uneven than MxD14A1 and tend to cluster, which may reflect classification bias in this product caused by the conflation of harvest and burning. This conflation becomes especially problematic as rice production increases (Jethva et al., 2019;Liu et al., 2020a), because the greater drawdown in satellite-observed greenness after rice maturation may artificially inflate total burned area.
In contrast, MxD14 can capture smaller, fragmented burns but may miss fires occurring outside overpass times and result in inconsistent detection from year to year. In constructing GFAS, Kaiser et al (2012) surmised that FRP observed during the MODIS overpasses is representative of daily fire activity. While this is a reasonable assumption for large wildfires, the short duration of fires in north India makes this approach problematic. Some top-down inventories, such as GFASv1.2, also rely in part on bottom-up inventories, such as GFEDv4s, to linearly scale FRP to DM (Kaiser et al., 2012). Consequently, biases in bottom-up estimates of DM burned can propagate to top-down inventories.
Here we discuss differences in the bottom-up approach used in GFEDv4s versus FINNv1.5. Calculating DM burned mainly depends on two variables: burned area and fuel consumption, which is the mass of biomass burned per unit area. We next explore how the range in these two components in different datasets affects estimates of DM burned and the resulting aerosol emissions for the 2003-2016 time period. We consider (1) burned area estimates from GFEDv4s, FINNv1.5, and ModL2T (Liu et al., 2019); (2) fuel consumption estimates from GFEDv4s, FINNv1.5, and Indiastat (Table S11) (Table S12; van der Werf et al 2017, Wiedinmyer et al 2011, Kaiser et al 2012, Darmenov and da Silva 2013, and from Andreae (2019). We find that the DM burned calculated using our best estimates of burned area (ModL2T) and fuel consumption (Indiastat) is consistent with the adjusted FRP approach of this study (7.2 Tg) ( Figure 9). This indicates that ModL2T burned area has utility for deriving agricultural fire emissions, but only when paired with reasonable fuel consumption estimates. The average outof-box GFEDv4s DM burned (3.2 Tg) is 56% lower than this study, while that of FINNv1.5 (7.9 Tg) is comparable. However, the FINNv1.5 fuel consumption is more than twice that estimated from Indiastat (Table S11), compensating for its 37-55% lower burned area compared to other estimates. If we apply FINNv1.5 fuel consumption to GFEDv4s and ModL2T burned area, the resulting DM burned is ~2-3 times as high as this study's FRP-based estimates (Figure 9).
We also calculate the percent contribution of burned area, fuel consumption, and emissions factors to the range in OC, BC, CO, and CO2 agricultural fire emissions. We find that fuel consumption is the most uncertain component (54-66%), followed by burned area (24-29%) and emissions factors (5-22%) (Table S13). We also diagnose a 16-27% decline (p < 0.05) in GFEDv4s and FINNv1.5 fuel consumption from 2003-2016 (Table S11), while that based on Indiastat is relatively constant (+3%, p = 0.3). Here we focus on the fuel load component of fuel consumption (see Eq. 6) since combustion completeness is not readily observable using satellite data. In Indiastat, the increase in rice production implies a higher fuel load, but the concurrent increase in rice area cancels out this effect. The discrepancy between the two satellite products and Indiastat can be explained by assumptions in GFEDv4s and FINNv1.5 regarding vegetated fraction and fuel load. First, both global inventories use satellite-derived sub-pixel-level vegetated fraction to scale fire pixels such that only the vegetated fraction of each pixel counts toward the total burned area. Overall, these vegetated fractions increased by 9% (p < 0.05) in Punjab from 2003-2016, consistent with the increase in rice area reported by Indiastat. Second, we would also expect to see a positive trend in fuel load in FINNv1.5 and GFEDv4s due to the increase in rice production. However, FINNv1.5 assumes a constant fuel load for each region and land cover type (Wiedinmyer et al., 2011), while GFEDv4s models carbon fluxes to calculate fire emissions at monthly time steps, using climatological fuel loads (van der Werf et al., 2010). The shift in peak burning from October to November in Punjab ( Figure S10) means that a higher fraction of post-monsoon burned area in GFEDv4s is multiplied by the lower fuel consumption historically characteristic of this state in November.
Taken together, fuel consumption and the fire diurnal cycle represent two important sources of uncertainty in agricultural fire emissions in global inventories. The new inventory, SAGE-IGP, addresses these uncertainties with the bottom-up validation and use of household survey responses. While the survey constraints in this study cannot be applied globally due to the expense of conducting such surveys, our work suggests that global inventories should consider satellite-derived or government statistics of crop production, yield, and area to improve fuel load estimates. Geostationary satellite fire observations can also be useful to constrain the diurnal cycle of fire activity, where such data are available.
Despite limitations and uncertainties, our SAGE-IGP inventory is a substantial improvement compared to existing global inventories and regional estimates. While global inventories tend to place more emphasis on large wildfires, regional estimates of agricultural fire emissions are often temporally or spatially limited due to global generalizations. Here we rely on both satellite and on-the-ground household survey data to improve FRP estimates. Based on our detailed adjusted FRP approach, our SAGE-IGP inventory provides daily, gridded emissions estimates in north India over a 16-year period from 2003-2018.

Conclusion
In summary, we combine household survey results with satellite observations to revise estimates of post-monsoon agricultural fire emissions across north India from 2003-2018. To do so, we develop an approach based on MODIS FRP, adjusted for small fires from VIIRS, cloud/haze gaps in satellite observations, partial-field burns, and the diurnal cycle of fire activity. Regionally, we estimate an average of 10.4 ± 3 Tg DM burned each year, yielding emissions of 65 ± 18 Gg OC, 5.6 ± 1.6 Gg BC, 791 ± 225 Gg CO, and 14.9 ± 4.2 Tg CO2. Our estimates of DM burned for Punjab, which contributes more than two-thirds of emissions, closely match the bottom-up validation estimates using state-level government statistics and survey burn rates from 2016 and 2017. Importantly, our cloud/haze gap-filling method leads to an 84% increase in DM burned in Punjab from 2003-2018; without this adjustment, the trend in MODIS FRP is only 16% and not statistically significant. Here we constrain the fire diurnal cycle and fuel consumption, two components that contribute most to bias and disagreement across this region among standard global fire emissions inventories used in atmospheric studies. Our results suggest that additional information from household surveys and crop statistics can help constrain these two components. We construct a daily, 0.25° x 0.25° emissions inventory over the IGP by disaggregating state-level DM burned. As we show, our emissions inventory, SAGE-IGP, may be used in atmospheric transport models to improve estimates of smoke exposure downwind and evaluate the associated public health burden and climate impacts. The likely expansion of crop residue burning among smallholder farms in north India, where satellites poorly capture fire activity, makes regional, survey-constrained inventories such as ours especially valuable for improving emissions estimates.

Data Availability
The CHIRPS rainfall and MODIS/VIIRS land cover, active fire, and surface reflectance datasets are publicly available through Google Earth Engine (http://earthengine.google.com/). MODIS/VIIRS datasets are freely available from NASA's Earthdata platform (https://earthdata.nasa.gov/).   The units for the daily FRP timeseries is GW. Percentages denote the FRP increase relative to the previous step. Checkerboards in Steps 1-3 denote a 1-km grid with MODIS FRP observations in black, VIIRS hotspots with a 1-km buffer in red, and clouds/haze cover in blue. Squares in Step 4 show the difference between complete versus partial burns, where black striations depict burning within the field. In Step 5, the plot on the left shows the diurnal cycle of fire activity, with horizontal lines showing the survey fractions (same as Figure 1b but weighted by landholding area) and vertical bars depicting our adjustment of midday satellite FRP. Detailed methods for each step are described in Section 2.4.1.

Oct 1
Nov 30   The Normalized Difference Vegetation Index (NDVI), a proxy for vegetation greenness, and Fire Radiative Power (FRP), a proxy for fire intensity (note the differing y-axis scales), are derived from MODIS over agricultural regions. FRP and rainfall are stacked by year, while NDVI is shown as the weighted average across all years, with weights based on the usable fraction, or fraction of agricultural area that is cloud and haze-free on that day in each year. Horizontal segmented bars above each subplot denote the approximate midpoint (black dot) and start and end (whiskers) of the pre-monsoon (March-May) and post-monsoon (Oct-Dec) burning seasons in each state. Upward triangles denote the timing of maximum monsoon and winter greenness; conversely, downward triangles denote the timing of minimum pre-monsoon and post-monsoon greenness.

Source of Burned Area
Fuel Consumption