1 Introduction

Air pollution has been one of India’s most persistent environmental issues in recent years. Poor air quality is a substantial health and environmental concern that impacts well-being, hampers development, and incurs steep economic costs. In India, an emerging economy in South Asia, rapid industrialization and urbanization have led to extreme air pollution in urban and rural areas. The World Health Organization (WHO) reported that 14 of the 15 cities with the most severe air pollution worldwide are in India. Northern India has experienced more severe air pollution episodes than southern India due to spatial heterogeneity and diverse meteorological conditions (PHFI, 2017). In 2019, household air pollution and particulate matter contributed to 1.67 million deaths in India, accounting for 17.8% of total deaths in the country (Pandey et al., 2021). Seventy-seven percent of India’s population lives in places where the annual PM2.5 concentration is above 40 μg m−3, the standard set by the Indian National Ambient Air Quality Standards (NAAQS) (Balakrishnan et al., 2019). No state in India has met the annual PM2.5 standard limit of 10 μg m−3 set by the WHO. India’s ambient air pollution sources are classified into vehicular exhaust; households; small, medium, and large-scale industries; agriculture; power plants; waste and biomass burning; and construction and demolition activities. India’s population has been breathing increasingly toxic air and has consequently experienced increased mortality and disease burden (Balakrishnan et al., 2019; Botle et al., 2020; Dwivedi et al., 2022; Gangadharan & Nambi, 2020; Masih et al., 2019; Pandey et al., 2021; Rohra et al., 2020).

The Central and State Pollution Control Boards (CPCB and SPCBs) evaluated 88 industrial clusters and identified them as polluted areas, illustrating India’s air, water, and land pollution (SoE, 2021). The dissemination of daily levels of air pollutants, particularly in industrial areas, is essential for individuals suffering from illnesses caused by air pollution exposure. Knowledge of the statistical distribution of air quality data is necessary for predicting high pollutant concentrations, so that the relevant agencies and governments can plan and take action to tackle high pollutant events in future years. Ambient air pollutant levels vary with emission source strength and meteorological conditions. Previous studies have fitted ambient air quality parameters with normal, lognormal, Weibull, gamma, and exponential distributions (Giavis et al., 2009; Gulia et al., 2017; Lu & Fang, 2002; Mishra et al., 2021; Yang et al., 2012). Trend analysis is a helpful method for assessing variations in pollutant concentrations over time and may also be used to demonstrate the effectiveness of policy initiatives (Pandolfi et al., 2016; Sicard et al., 2021). A component common to many statistical approaches is the analysis and correlation of trends in pollutant data and meteorological parameters. Site-specific monitoring and source apportionment techniques can act as management and decision-making tools. Extreme event analysis (EVA) provides a statistical model to quantify the probability of extreme events, such as the highest concentrations of air pollutants, using return value analysis over a period of interest (Martins et al., 2017; Masseran et al., 2016). Extreme concentrations of air pollutants degrade air quality and cause health hazards.

Studies have reported the adverse health effects of air pollution, focusing on long-term (chronic) and short-term (acute) exposures. The health of vulnerable and susceptible individuals (e.g., the elderly, women, and children) can be affected even on days with low levels of air pollution. Air pollution has many health effects, including respiratory and cardiovascular disorders, cancer, premature death, hypertension, and cognitive impairment. Many researchers worldwide have used the AirQ+ model developed by the WHO to evaluate the long- and short-term health impacts of ambient air pollutants (Amoatey et al., 2020; Manojkumar & Srimuruganandam, 2021; Miri et al., 2016; Omidi Khaniabadi et al., 2019).

The present study investigates the statistical distributions and trends in the 24-h PM2.5 and PM10 concentrations measured in an industrial area of southern India. The Python-based pyMannKendall package was used to identify the trend in PM concentrations over the monitoring period. The return values of PM concentrations over a period of interest were computed with thresholdmodeling, a Python package for modelling excesses over a threshold using the Generalized Pareto Distribution. We also explored the influence of meteorological parameters on pollutant concentrations and estimated long-term health impacts due to PM2.5, namely total mortality and mortality due to chronic obstructive pulmonary disease (COPD), ischemic heart disease (IHD), lung cancer (LC), and stroke.

2 Methodology

2.1 Study Area

Kanjikode, an industrial hub in the Palakkad District of Kerala, India, covers an area of 16.88 km2, has a population of more than 50,000 people, and is located 13 km east of Palakkad town. It is one of the major industrial areas in Kerala. Air quality degradation has been observed in the Kanjikode area because of its many electric furnace–based industries (MOEF 2022). Various companies, such as PPS Steel (Kerala) Pvt. Ltd., Pepsi, Indian Telephone Industries Limited (ITI), Patspin India Ltd., United Breweries, Rubfila International Ltd., Bharat Earth Movers Limited (BEML), and Saint-Gobain India Pvt. Ltd. (SEFPRO), are located in this region (Fig. 1). There are nearly 48 industries in this region, including manufacturing units for steel, cement, paint color, distilleries, fertilizers, textiles, and chemicals. Industrial emissions in Kanjikode vary with the activities carried out in the various industries (Table S3). High-volume samplers (APM 460BL) were used to collect 24-h PM10 samples, and fine-volume samplers (APM 550) were used to collect 24-h PM2.5 samples. PM10 and PM2.5 sampling was conducted on the rooftop of a 20-m-high building located in SEPR Refractories (SEFPRO). The monitoring station was at least 250–300 m away from the major polluting sources (stacks, roads). SEFPRO is a 100% subsidiary of Saint-Gobain SEFPRO and one of the leading manufacturers of fused cast and sintered refractories for glass furnaces. The two main processes in this industry are sintering and fusion. A total of 232 samples were collected during 2018 (winter: 16, summer: 24, southwest monsoon: 33, northeast monsoon: 27), 2019 (winter: 19, summer: 18, southwest monsoon: 21, northeast monsoon: 14), and 2020 (winter: 18, summer: 15, southwest monsoon: 18, northeast monsoon: 9).

Fig. 1 Layout of monitoring location in Kanjikode industrial area

2.2 Statistical Distribution Analysis

Characterizing the probability distribution of PM data is necessary for predicting the average concentration and probability of exceeding the standard limit set by the Indian NAAQS. The present study evaluated normal, lognormal, Weibull, gamma, and exponential distributions of the measured 24-h PM2.5 and PM10 concentrations. These distribution parameters (scale, shape, and location) were computed using the maximum likelihood estimation (MLE) method. The goodness-of-fit tests used to assess the best fit in the present study were the Kolmogorov–Smirnov (K-S), modified Kolmogorov–Smirnov, and Anderson–Darling (A-D) tests. A probability plot was used to identify the best fit of a dataset that followed a given distribution. After identifying the best distribution model, we used the cumulative distribution function to calculate the exceeding probabilities. The algorithms for parameter estimation by MLE and goodness of fit tests are provided in the supplementary information (S1).
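The fitting workflow described above can be illustrated with a minimal sketch for the lognormal case. This is not the study's code (the actual algorithms are in supplementary information S1): it assumes a two-parameter lognormal whose MLE is the mean and population standard deviation of the log-concentrations, and the sample values and helper names (`fit_lognormal`, `ks_statistic`, `exceedance_probability`) are hypothetical.

```python
import math
import statistics


def fit_lognormal(data):
    """MLE for a two-parameter lognormal: mean and 1/n std of the log-values."""
    logs = [math.log(x) for x in data]
    mu = statistics.fmean(logs)
    sigma = statistics.pstdev(logs)  # population form, as required by MLE
    return mu, sigma


def lognormal_cdf(x, mu, sigma):
    """P(X <= x) for a lognormal with log-scale parameters mu and sigma."""
    z = (math.log(x) - mu) / (sigma * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))


def ks_statistic(data, cdf):
    """One-sample K-S statistic: largest gap between empirical and fitted CDF."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


def exceedance_probability(limit, mu, sigma):
    """Probability that a 24-h concentration exceeds `limit` under the fit."""
    return 1.0 - lognormal_cdf(limit, mu, sigma)


# Hypothetical 24-h PM2.5 values (µg m^-3), not the study's measurements.
pm25 = [8.0, 12.5, 15.0, 9.5, 22.0, 31.0, 18.5, 11.0, 45.0, 14.0, 27.5, 65.0]
mu, sigma = fit_lognormal(pm25)
d_stat = ks_statistic(pm25, lambda x: lognormal_cdf(x, mu, sigma))
p_exceed = exceedance_probability(60.0, mu, sigma)  # 60 µg m^-3: NAAQS 24-h limit
print(f"mu={mu:.3f} sigma={sigma:.3f} D={d_stat:.3f} P(X>60)={p_exceed:.3f}")
```

The same pattern applies to the other candidate distributions: fit the parameters, compare the test statistics, and evaluate one minus the fitted CDF at the standard limit.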

2.3 Trend Analysis

The Mann–Kendall (M–K) test (Kendall, 1970; Mann, 1945) is a nonparametric test that is extensively used to find notable trends in temporal data (Bari et al., 2016). The slope estimator method was proposed by Theil (1992) and Sen (1968) to assess the magnitude of monotonic trends. The Theil-Sen estimator is nonparametric and insensitive to outliers, so it can be applied to severely skewed datasets. The analysis produced a p-value for assessing the significance of the trend and a slope value with a 95% confidence interval. The Python-based pyMannKendall package was used for trend analysis in the present study (Hussain and Mahmud 2019).
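To make the two statistics concrete, the sketch below implements the original M–K test and the Theil-Sen slope in plain Python. It is an illustration only, not the pyMannKendall implementation: it assumes no tied values and no serial correlation (the seasonal and modified variants handle those cases), and the input series is hypothetical.

```python
import math
import statistics


def mann_kendall(series):
    """Original Mann-Kendall test; returns (S, tau, z, two-sided p).

    Assumes no ties, so the variance of S needs no tie correction.
    """
    n = len(series)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)  # sign of each pairwise difference
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    tau = s / (0.5 * n * (n - 1))
    return s, tau, z, p


def theil_sen_slope(series):
    """Median of all pairwise slopes, taking the index as the time axis."""
    n = len(series)
    slopes = [
        (series[j] - series[i]) / (j - i)
        for i in range(n - 1)
        for j in range(i + 1, n)
    ]
    return statistics.median(slopes)


# Hypothetical concentration series, not the study's data.
values = [12.0, 14.5, 13.2, 16.8, 18.1, 17.4, 21.0, 22.3, 20.9, 25.6]
s, tau, z, p = mann_kendall(values)
slope = theil_sen_slope(values)
print(f"S={s} tau={tau:.3f} z={z:.2f} p={p:.4f} slope={slope:.3f}")
```

A positive tau with a small p-value indicates a monotonic increasing trend, and the Theil-Sen slope gives its magnitude per time step.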

2.4 Extreme Value Analysis

The generalized Pareto distribution (GPD) is commonly used in extreme event analysis (EVA). This approach models all observations that exceed a predetermined threshold, rather than only a selected set of maximum or minimum values. The probability of the return level of the adverse environmental state was calculated using the fitted GPD model given in Eq. (1). The probability of the return level is an essential measure of the likelihood of air pollution occurrences over a specific period.

$$G_{\xi,\beta}(x)=1-\left(1+\frac{\xi x}{\beta}\right)^{-\frac{1}{\xi}} \tag{1}$$

where \(x\) denotes the excess over a given threshold value. The maximum-likelihood approach was used to estimate the shape (ξ) and scale (β) parameters, which determine the distribution. The threshold is a simple but essential component of a correct fit; it defines the point above which an occurrence is considered extreme (Martins et al., 2017). A recurrence interval, also known as a return period, can be used to estimate the probability of an event with an extreme level of pollutants. The danger of an air pollution event can be inferred from this indicator based on the return level over a considerable amount of time. In this study, the Python package thresholdmodeling was used to select the threshold, fit the GPD, and estimate the return values of PM concentrations over a period of interest (Lemos et al., 2020; Masseran et al., 2016).
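As an illustration of how Eq. (1) leads to a return level, the sketch below evaluates the GPD distribution function and the standard peak-over-threshold return-level expression (Coles, 2001), x_N = u + (β/ξ)[(N·n_y·ζ_u)^ξ − 1], where u is the threshold. The symbols n_y (observations per year) and ζ_u (fraction of observations above the threshold), the function names, and all numerical values are assumptions introduced for this example; they are not taken from the study or from the package it used.

```python
import math


def gpd_cdf(x, xi, beta):
    """Eq. (1): probability that an excess over the threshold is <= x."""
    if x < 0:
        return 0.0
    if abs(xi) < 1e-12:  # limiting exponential case as xi -> 0
        return 1.0 - math.exp(-x / beta)
    base = 1.0 + xi * x / beta
    if base <= 0.0:  # beyond the finite upper endpoint when xi < 0
        return 1.0
    return 1.0 - base ** (-1.0 / xi)


def return_level(years, threshold, xi, beta, exceed_rate, obs_per_year):
    """Level exceeded on average once in `years` years."""
    m = years * obs_per_year * exceed_rate  # expected exceedances in the period
    if abs(xi) < 1e-12:
        return threshold + beta * math.log(m)
    return threshold + (beta / xi) * (m ** xi - 1.0)


# Illustrative parameters only (hypothetical, not the fitted values).
threshold, xi, beta = 40.0, 0.2, 10.0
exceed_rate, obs_per_year = 0.1, 80

level_10 = return_level(10, threshold, xi, beta, exceed_rate, obs_per_year)
level_100 = return_level(100, threshold, xi, beta, exceed_rate, obs_per_year)
print(f"10-year level: {level_10:.1f}, 100-year level: {level_100:.1f}")
```

By construction, the probability that an excess is larger than the N-year return level minus the threshold equals one over the expected number of exceedances in N years, which is a quick consistency check on any fitted model.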

2.5 Health Risk Estimation

The WHO developed AirQ+, which quantifies the health impact of ambient PM concentrations (WHO, 2018). The burden of disease, including mortality and morbidity, and the associated risk can be assessed using AirQ+. The long-term impacts of ambient PM2.5 were assessed in this study using the AirQ+ software (Version 2.1.1). The input factors used in this study were the pollutant concentration and the population. In addition, running AirQ+ requires the relative risk, the baseline incidence of each disease or health endpoint, and the cut-off values. AirQ+ uses a log-linear methodology to generate relative risk levels. The baseline incidence values (per 100,000 individuals) for each health outcome, derived from previous studies, were as follows: total mortality = 1013, lung cancer = 22, COPD mortality = 106, and stroke mortality = 70 (Amoatey et al., 2020; Maji et al., 2017; Manojkumar & Srimuruganandam, 2021). The attributable proportions, number of attributable excess cases, and number of attributable cases (per 1 lakh population) were calculated with 95% confidence intervals. The mathematical expressions used for health risk estimation are given in the supplementary information (S2).
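The log-linear calculation can be sketched as follows. This is a simplified illustration of the general approach, not the AirQ+ implementation or the expressions in supplementary information S2: the relative risk of 1.062 per 10 µg m−3 and the exposure concentration of 20 µg m−3 are assumed values chosen for the example, while the baseline incidence (1013 per 100,000), the 10 µg m−3 cut-off, and the population of about 50,000 are taken from the text.

```python
import math


def relative_risk(conc, cutoff, rr_per_10):
    """Log-linear relative risk at `conc` relative to the cut-off value."""
    beta = math.log(rr_per_10) / 10.0  # risk coefficient per 1 µg m^-3
    return math.exp(beta * max(conc - cutoff, 0.0))


def attributable_proportion(rr):
    """Fraction of the health outcome attributable to the exposure."""
    return (rr - 1.0) / rr


def attributable_cases(ap, baseline_per_100k, population):
    """Number of cases attributable to the exposure in `population`."""
    return ap * baseline_per_100k * population / 100_000.0


# Assumed for illustration: RR per 10 µg m^-3 and the annual mean concentration.
rr = relative_risk(conc=20.0, cutoff=10.0, rr_per_10=1.062)
ap = attributable_proportion(rr)
cases_per_lakh = attributable_cases(ap, baseline_per_100k=1013, population=100_000)
cases_local = attributable_cases(ap, baseline_per_100k=1013, population=50_000)
print(f"RR={rr:.3f} AP={ap:.4f} per lakh={cases_per_lakh:.1f} local={cases_local:.1f}")
```

Repeating the calculation with the lower and upper bounds of the relative risk gives the corresponding confidence limits for the attributable cases.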

3 Results and Discussion

3.1 Seasonal Characteristics of PM in the Kanjikode Industrial Area

The boxplot of PM2.5 for 2018–2020 showed that 75% of the data were within the standard limit prescribed by the Indian NAAQS (60 µg m−3). The 24-h average PM2.5 concentration exceeded 60 µg m−3 on 1 day in 2018 and 4 days in 2019. Annual PM2.5 concentrations in the Kanjikode industrial area were 3, 4.8, and 3.5 times the WHO-specified limit (5 µg m−3) in 2018, 2019, and 2020, respectively. Half of the PM2.5 concentrations fell within the ranges of 5–18 µg m−3, 10–28 µg m−3, and 9–40 µg m−3 in 2018, 2019, and 2020, respectively (Fig. 2). The maximum 24-h average PM2.5 concentrations were 73 µg m−3, 167 µg m−3, and 55 µg m−3 in 2018, 2019, and 2020, respectively. The time-series data exhibited high kurtosis (K) and skewness (S), indicating that the frequency distribution departs from a normal distribution (Hair et al., 2017). The high K, high S, and low interquartile range (IQR) of the PM2.5 dataset in the winter season of 2019 (K = 2.2; S = 4.3; IQR = 27.8), the summer seasons of 2018 (K = 1.5; S = 2; IQR = 15.1) and 2019 (K = 5.8; S = 2; IQR = 14), the southwest monsoon seasons of 2018 (K = 12.9; S = 3.2; IQR = 7.4) and 2020 (K = 6.2; S = 2.1; IQR = 6.7), and the northeast monsoon season of 2018 (K = 1.5; S = 1.5; IQR = 14) indicated that the PM2.5 data have heavy tails. In addition, the mean values of this dataset were larger than the medians, indicating the occurrence of extreme concentrations during this period. Previous studies observed a similar pattern (He et al., 2020; Liu et al., 2019; Zhai & Chen, 2018). The 24-h average PM2.5 concentration was higher in the southwest monsoon and winter. The kurtosis and skewness of the 24-h average PM2.5 dataset differed markedly between seasons, reflecting the seasonal variation in PM2.5 levels.

Fig. 2 Seasonal variation of 24-h average PM2.5 concentration during 2018–2020

The 24-h average PM10 concentration exceeded 100 µg m−3 on 29, 2, and 8 days in 2018, 2019, and 2020, respectively. The annual average PM10 concentrations in the study area were 5.3, 3.6, and 3.5 times the WHO-specified limit (15 µg m−3) in 2018, 2019, and 2020, respectively. Half of the PM10 concentrations fell within the ranges of 49–102 µg m−3, 32–69 µg m−3, and 25–83 µg m−3 in 2018, 2019, and 2020, respectively (Fig. 3). The 24-h average PM10 concentration was higher in the winter, northeast monsoon, and summer seasons. The high K, high S, and low IQR of the PM10 dataset in the summer season (K = 3.1; S = 1.6; IQR = 40.5) and southwest monsoon season (K = 1.1; S = 0.9; IQR = 16.7) of 2018, the winter season of 2019 (K = 8.5; S = 2.5; IQR = 24), and the northeast monsoon season of 2020 (K = 2.9; S = 1.6; IQR = 8.1) indicated the occurrence of extreme PM10 concentrations during these periods.
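The descriptive statistics used above (kurtosis, skewness, IQR, and the comparison of mean and median) can be computed as in the following sketch. The convention is an assumption, since the text does not state one: moment-based skewness, excess kurtosis (zero for a normal distribution), and inclusive quartiles. The sample values are hypothetical.

```python
import statistics


def describe(data):
    """Moment-based skewness, excess kurtosis, IQR, mean, and median."""
    n = len(data)
    mean = statistics.fmean(data)
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return {
        "mean": mean,
        "median": statistics.median(data),
        "skewness": m3 / m2 ** 1.5,
        "excess_kurtosis": m4 / m2 ** 2 - 3.0,
        "iqr": q3 - q1,
    }


# Hypothetical seasonal sample with one extreme day (µg m^-3).
season = [9.0, 11.0, 12.5, 10.0, 14.0, 13.0, 11.5, 72.0]
stats = describe(season)
symmetric = describe([1.0, 2.0, 3.0, 4.0, 5.0])
print(stats)
```

A single extreme value is enough to pull the mean above the median and to produce positive skewness and high kurtosis while leaving the IQR small, which is the pattern described for several seasons.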

Fig. 3 Seasonal variation of 24-h average PM10 concentration during 2018–2020

Processes in manufacturing industries, including crushing, grinding, sieving, and mixing various materials to produce monolithic refractories, have been reported to emit elevated levels of PM into the atmosphere (Kuenen et al., 2019; MSME, 2010). Fugitive dust emissions may occur during the handling and transportation of raw materials. Industrial stack emissions also contribute to higher particulate matter (PM) levels.

3.2 Statistical Distribution and Trend Analysis of PM Concentrations

Statistical distributions of atmospheric concentrations are used to assess the degree of compliance of a region with ambient air quality standards. The present study performed a statistical distribution analysis of the 24-h average PM2.5 concentrations to find the representative distributions. The goodness of fit was assessed with the Kolmogorov–Smirnov (K-S), modified Kolmogorov–Smirnov, and Anderson–Darling (A-D) tests. At a significance level of 0.05, lognormal distributions fit the observed data better, with lower K-S, modified K-S, and A-D test statistics for 2018–2020 (Table 1). PM2.5 concentrations also followed gamma distributions in 2019 and 2020 (Table 1). Smaller K-S, modified K-S, and A-D test statistics signify a better fit with the observed data (McHugh, 2013). The lognormal distribution analysis of the 24-h average PM2.5 dataset predicted 2.8%, 7.9%, and 7.7% probabilities of the 24-h average PM2.5 concentration exceeding 60 µg m−3 in 2018, 2019, and 2020, respectively. The probabilities of the PM2.5 concentration exceeding the WHO 24-h standard limit (15 µg m−3) in 2018, 2019, and 2020 were 32.1%, 52%, and 55%, respectively. The estimates of the location (µ) and scale (σ) of the lognormal distributions are shown in Table 2. As demonstrated in Table 2, the values of σ were similar from 2018 to 2020, indicating that the meteorological parameters remained similar over the 3 years. The probability distributions of PM2.5 concentration were unimodal. Past studies have reported well-fitted lognormal and gamma distributions of 24-h average PM2.5 concentration in megacities across the globe (Gulia et al., 2017; Lu & Fang, 2002; Mishra et al., 2021). The probability-probability (P-P) plots are provided in supplementary Figures S1–S3 (supplementary information (S3)).

The results of the various modified M–K tests and the Theil-Sen slope estimation for the 24-h average PM2.5 dataset of 2018–2020 are provided in Table S1 (supplementary information (S4)). A significant monotonic increasing trend (Seasonal M–K test result: tau (τ) = 0.15, p-value = 0.002) was found for the 24-h average PM2.5 concentration, with a magnitude of 0.43 µg m−3 per annum during 2018–2020 (Fig. 4). This increasing trend may be attributed to the various industrial activities in the study area; industrial operations were not restricted during the COVID-19 lockdown in 2020. A similar increasing monotonic trend for PM2.5 was observed at an industrial location in Brisbane, Australia (Lorelei de Jesus et al., 2020) and in urban areas of NY, USA (Masiol et al., 2019).

Table 1 The goodness of fit test data for the PM2.5 probability distributions
Table 2 The estimates of the location (µ) and scale (σ) of lognormal distributions of PM2.5
Fig. 4 Increasing trend of 24-h average PM2.5 concentrations during 2018–2020

The statistical distribution analysis of the 24-h average PM10 dataset showed that, at a significance level of 0.05, gamma distributions fit the observed data better, with lower K-S, modified K-S, and A-D test statistics (Table 3). However, no distribution was identified for the 24-h average PM10 dataset in 2020. The gamma distribution analysis of the 24-h average PM10 dataset predicted 24.5%, 5.5%, and 9.3% probabilities of PM10 concentrations exceeding 100 µg m−3 in 2018, 2019, and 2020, respectively. The probabilities of PM10 concentrations exceeding the WHO standard limit (45 µg m−3) in 2018, 2019, and 2020 were 85%, 58%, and 51%, respectively. The P-P plots are provided in supplementary Figures S4–S6 (supplementary information (S3)). The results of the various modified Mann–Kendall tests and the Theil-Sen slope estimation for the 24-h average PM10 dataset of 2018–2020 are provided in Table S2 (supplementary information (S4)). Trend analysis of the PM10 dataset showed a decreasing trend (Modified M–K Hamed Rao approach test result: tau (τ) = −0.3, p-value = 0.002) at a low rate of 0.2 µg m−3 per annum during 2018–2020 (Fig. 5). The downward trend may be attributed to the reduction in road dust resuspension brought about by the COVID-19 restrictions on vehicle movement in the study area.

Table 3 The goodness of fit test data for the PM10 probability distributions
Fig. 5 The decreasing trend of 24-h average PM10 concentrations during 2018–2020

3.3 Extreme Value Analysis and Return Level of PM Concentrations

This study applied extreme value analysis (EVA) to estimate the probability of exceedance and return values of PM10 and PM2.5, which can be anticipated in the coming years. Threshold selection is the basis for performing EVA using the peak-over-threshold (POT) method based on GPD. Mean residual life (MRL), parameter stability, and return level plots were used in the present study to establish the threshold.

The MRL plot in Fig. 6 indicates a threshold region between 40 and 45, where the plot is approximately linear, as required for threshold selection. The parameter stability plot in Fig. 7 indicates that the stability region is also confined between 40 and 45. The estimates became unstable above a threshold value of 45, indicating a lack of sufficient exceedances. The return value stability plot (Fig. 8) can be used as an additional check on the sensitivity and stability of the GPD model with respect to the threshold value. It shows the return value for a given return period (100 years) over a range of thresholds (40 to 42). The return value appeared to be constant for threshold values between 40.25 and 41.25 for both the generalized Pareto and exponential distributions. Based on these analyses, a threshold value of 41 was selected for the PM2.5 data in the present study. Maximum likelihood estimation with this threshold gave GPD parameters of shape (ξ) = 0.47 and scale (β) = 8, with standard errors of 0.2 and 2.3, respectively. The diagnostic plots of the fitted GPD are shown in Fig. 9. The fits shown in the probability, quantile, and probability density plots are convincing, though not perfect. Thus, the fitted GPD model is considered reliable for estimating the return levels of PM2.5 for different return periods and for predicting extreme events exceeding the prescribed standards and the risk of exposure to such events. The return values were estimated for return periods of 1–1000 years with 95% confidence intervals (Fig. 10). Since shape (ξ) > 0, the distribution is unbounded with a concave shape (Coles, 2001), as depicted in Fig. 10. The return value of the PM2.5 concentration in the 100th year was 602 µg m−3. The return value of PM2.5 in the 10th year was 193 µg m−3, with PM2.5 concentrations ranging from 71 to 650 µg m−3.
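The quantity plotted in an MRL plot is simply the mean excess over each candidate threshold. A minimal sketch of that calculation is given below; the function name and the sample data are hypothetical, and the confidence bands usually drawn on such plots are omitted.

```python
def mean_residual_life(data, thresholds):
    """Return (threshold, mean excess, number of exceedances) per threshold."""
    rows = []
    for u in thresholds:
        excesses = [x - u for x in data if x > u]
        if excesses:  # skip thresholds above every observation
            rows.append((u, sum(excesses) / len(excesses), len(excesses)))
    return rows


# Hypothetical concentrations (µg m^-3), not the study's measurements.
sample = [10, 20, 30, 40, 50, 60, 70]
rows = mean_residual_life(sample, thresholds=[25, 45, 80])
for u, mean_excess, count in rows:
    print(f"u={u}: mean excess={mean_excess:.1f} from {count} exceedances")
```

A threshold is usually chosen from the region where the mean excess changes roughly linearly with the threshold while enough exceedances remain for a stable fit, which is why the exceedance count is returned alongside the mean.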

Fig. 6 MRL plot for 24-h PM2.5 concentrations during 2018–2020
Fig. 7 Parameter estimates against the threshold for 24-h PM2.5 concentrations during 2018–2020
Fig. 8 Return value plot for the threshold stability for 24-h PM2.5 concentrations during 2018–2020
Fig. 9 Diagnostic plots for threshold excess model fitted to 24-h PM2.5 concentrations during 2018–2020
Fig. 10 Return level plots for 24-h PM2.5 concentrations in future

EVA was also performed for the PM10 data, as previously explained. The MRL, parameter stability, and return level plots are shown in Figs. 11, 12 and 13. Based on these plots, a threshold value of 122 was selected for PM10 in this study. The diagnostic plots of the fitted GPD shown in Fig. 14 indicate the reliability of the model in estimating the return levels. Maximum likelihood estimation gave distribution parameters of shape (ξ) = −1.13 and scale (β) = 49.71, with standard errors of 0.5 and 0.03, respectively. The return level plot shows that the curve approaches a finite upper limit because the negative shape parameter corresponds to a distribution with a short, bounded upper tail (Coles, 2001). The return value of the PM10 concentration in the 100th year was 166 µg m−3, with a range of 144–223 µg m−3. The return value of PM10 in the 10th year was 165 µg m−3, and the PM10 concentration varied between 144 and 180 µg m−3 (Fig. 15).
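The bounded tail can be checked directly: for a GPD with ξ < 0, excesses cannot exceed −β/ξ, so the largest possible concentration is the threshold minus β/ξ. The short sketch below applies this standard result to the parameter values reported above; the function name is hypothetical.

```python
def gpd_upper_endpoint(threshold, xi, beta):
    """Upper bound of a GPD fitted above `threshold`; infinite when xi >= 0."""
    if xi >= 0:
        return float("inf")
    return threshold - beta / xi


# PM10 fit reported in the text: threshold 122, shape -1.13, scale 49.71.
pm10_bound = gpd_upper_endpoint(122.0, -1.13, 49.71)
# PM2.5 fit reported in the text: threshold 41, shape 0.47, scale 8.
pm25_bound = gpd_upper_endpoint(41.0, 0.47, 8.0)
print(f"PM10 upper endpoint: {pm10_bound:.2f}")
print(f"PM2.5 upper endpoint: {pm25_bound}")
```

The PM10 endpoint is close to 166 µg m−3, which agrees with the return values for long return periods levelling off near that concentration, whereas the positive shape parameter for PM2.5 implies no finite upper bound.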

Fig. 11 MRL plot for 24-h PM10 concentrations during 2018–2020
Fig. 12 Parameter estimates against the threshold for 24-h PM10 concentrations during 2018–2020
Fig. 13 Return value plot for the threshold stability for 24-h PM10 concentrations during 2018–2020
Fig. 14 Diagnostic plots for threshold excess model fitted to 24-h PM10 concentrations during 2018–2020
Fig. 15 Return level plots for 24-h PM10 concentrations in future

The EVA of PM2.5 and PM10 indicated that the highest concentration levels expected in the coming years exceed the NAAQS and therefore pose a public health threat. A similar trend and extreme episodes were observed for PM2.5 and PM10 concentrations measured in the metropolitan areas of São Paulo and Rio de Janeiro, South America (Martins et al., 2017).

3.4 Meteorology Dynamics on PM Concentrations

Meteorological parameters, including wind direction, wind speed, relative humidity, and temperature, influence the ambient PM concentration. Wind is an essential meteorological parameter that transports and disperses pollutants in the ambient atmosphere. Previous studies have highlighted that low wind speeds cause stagnation of pollutants near the emission source, whereas higher wind speeds transport pollutants away from it (Ji et al., 2012; Kim Oanh & Leelasakultum, 2011; Peter & Nagendra, 2021). According to the Beaufort scale, light air (0.5–1.5 m s−1), light breeze (1.6–3.3 m s−1), gentle breeze (3.4–5.5 m s−1), and moderate breeze (5.5–7.9 m s−1) were observed during the study period. The analysis in Table 4 showed elevated levels of both 24-h PM2.5 and PM10 at wind speeds of 1.6–3.3 m s−1. The predominant wind direction in the industrial area was SSW to W (210°–270°) from March to September (Fig. S7–S9 in Supplementary Information (S5)). The wind direction at the monitoring site was ENE to SSE (90°–150°) from November to February (Fig. S7–S9). The data showed elevated levels of both 24-h PM2.5 and PM10 when the wind direction was ESE (120°), impacting the air quality of the nearby residential area. Spearman correlation analysis showed that wind speed was negatively correlated with PM10 (−0.34) and had a very weak correlation with PM2.5 (0.06) (Fig. 16).

Table 4 Comparison of PM concentrations in the study area with change in wind speed
Fig. 16 Correlation between PM concentrations and meteorological parameters

The hygroscopic growth of PM is affected by ambient relative humidity (RH) and tends to result in the amalgamation, accumulation, and dry deposition of PM. In the study area, the maximum RH was observed during the southwest monsoon and post-monsoon periods, and low RH was observed during the summer. Relative humidity in this study was categorized into the ranges 35–60%, 61–70%, 71–80%, and 81–94%. The data analysis indicated that both PM10 and PM2.5 had higher concentrations when RH ranged between 35 and 71% (Table 5). McMurry and Stolzenburg (1989) observed that particle diameter increased (4–7%) as RH increased to 85–90%, resulting in dry deposition of PM. Spearman correlation analysis showed that RH had a strong negative correlation with PM10 (−0.6) and a moderate negative correlation with PM2.5 (−0.3) (Fig. 16).

Table 5 Comparison of PM concentrations in the study area with change in relative humidity

Ambient temperature significantly influences vertical air motion. The temperature in the study area ranged from 21 to 32 °C. Low temperatures were recorded during the monsoon season (July–September), and the maximum temperature was recorded during the summer (March–April). The data analysis showed that the temperature did not vary much throughout the year and therefore had little influence on the PM concentration (Table 6). Correlation analysis showed a weak correlation between temperature and PM concentrations (Fig. 16).

Table 6 Comparison of PM concentrations in the study area with temperature change
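The Spearman coefficients reported in this section are Pearson correlations computed on ranks. The sketch below shows the calculation in plain Python, with tied values receiving their average rank; the helper names and the paired values are hypothetical and are not the study's measurements.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical paired observations: relative humidity (%) and PM10 (µg m^-3).
rh = [45.0, 52.0, 60.0, 68.0, 75.0, 83.0, 90.0]
pm10 = [110.0, 95.0, 102.0, 70.0, 64.0, 58.0, 40.0]
rho = spearman(rh, pm10)
print(f"Spearman rho = {rho:.2f}")
```

Because only ranks are used, the coefficient captures monotonic relationships and is not distorted by the extreme concentrations discussed in Section 3.1.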

3.5 Health Risk Assessment of PM2.5

The AirQ+ software was used to evaluate the health risks associated with exposure to PM2.5. Long-term health impacts, namely total mortality and mortality due to chronic obstructive pulmonary disease (COPD), ischemic heart disease (IHD), lung cancer (LC), and stroke, were evaluated for 2018, 2019, and 2020. The annual PM2.5 exposure concentration of 24.34 µg m−3 in 2019 was higher than that in 2020 (21.53 µg m−3) and 2018 (16.26 µg m−3). The annual average PM2.5 concentrations exceeded the WHO annual limit (5 μg m−3) in all three years. The estimated attributable proportions and numbers of attributable cases for total mortality, COPD, IHD, LC, and stroke are given in Table 7. PM2.5 exposure was associated with higher mortality due to IHD and COPD. More than 15, 34, and 27 premature deaths (total mortality) in 2018, 2019, and 2020, respectively, could have been prevented if PM2.5 concentrations in the Kanjikode industrial area had not exceeded 10 μg m−3, as suggested by the WHO standards in 2005. The authors collected the number of mortality cases due to stroke and lung cancer during the study period from the state government’s health centre in the study area (Table S4). Mortality due to ischemic heart disease (IHD) and chronic obstructive pulmonary disease (COPD) in this region has not been documented. Previous studies have reported strong relationships between long-term and short-term PM2.5 exposure and increased mortality and hospitalization due to respiratory or cardiovascular diseases (Amoatey et al., 2020; Hoek et al., 2013; Omidi Khaniabadi et al., 2019; Pala et al., 2021).

Table 7 Estimated attributable proportions (AP), number of attributable excess cases, and number of attributable cases (per 1 lakh population) from PM2.5 exposure above 10 μg m−3 at 95% confidence intervals in the Kanjikode area during 2018–2020

3.6 Control Measures for Air Pollution in Kanjikode Industrial Area—a Way Forward

The Kanjikode industrial area is home to many medium- and small-scale factories, including those producing steel, cement, paint colors, distilleries, fertilizers, textiles, and chemicals, which contribute to elevated pollutant concentrations in the study area. Industrial point sources, such as the power sector, industrial boilers, and other industrial processes, individually contribute to local source emissions. Although this study emphasizes industrial emissions, transportation sources emitted more pollutants than the power sector. Kanjikode is situated along a national highway that connects two Indian states, where transport operations and heavy vehicle movements are everyday activities. The present research suggests the following measures to curb air pollution from industrial areas:

  • Several technologies can destroy hazardous pollutants at the source of pollution and thereby reduce air pollution. These technologies include regenerative thermal oxidizers, catalytic oxidizers, and rotary concentrators.

  • Adopting renewable energy alternatives, such as solar power and wind turbines, enables industries to become more self-sufficient in energy.

  • Industries’ existing air pollution control measures should be adequately maintained and periodically inspected for performance.

  • Conventional vehicles should be substituted with electric vehicles to eliminate exhaust emissions.

  • Pavements and roads should be maintained regularly to reduce vehicle-induced resuspension of road dust.

  • To contain industrial pollutants, adequate industrial exhaust ventilation should be provided and green belts should be expanded.

  • Stringent measures and policies should be implemented to reduce pollution, thus reducing occupational and environmental exposure to various toxic pollutants.

4 Conclusion

Industrial air pollution has long been a significant contributor to poor air quality and is recognized as a factor that exacerbates health threats. The trend and statistical distribution analyses showed that the probability of exceedance and the increasing trend of PM2.5 might be attributed to various industrial activities in the study area. Extreme event analysis of PM10 and PM2.5 indicated that the highest concentration levels expected in the coming years exceed the NAAQS and pose a public health threat. The AirQ+ software developed by the WHO was used to evaluate long-term health impacts due to PM2.5 exposure, namely total mortality and mortality due to chronic obstructive pulmonary disease, ischemic heart disease, lung cancer, and stroke. Increasing green belts, improving road transportation, and periodically monitoring the performance of air pollution control technologies will curtail PM2.5 levels in the study area, thereby safeguarding public health.