A novel dataset for analysing sub-national socioeconomic developments in the Indian coal industry

Coal use needs to decline rapidly in the global energy mix over the next few decades in order to meet the Paris climate goal of keeping global warming well below 2 degrees Celsius. In emerging economies such as India (the second largest producer and consumer of coal) this would entail reducing long-term coal dependency. Prior work has focused on a coal transition in India from a techno-economic point of view, yet little attention has been given to the socio-economic dimensions of this transition. This is in part due to a lack of the datasets required for such analysis. The first step in understanding the socio-economic dimensions of a coal transition in India is to understand the scale of current socio-economic dependency on coal at the sub-national level. We contribute to this literature by creating a novel dataset comprising all 459 operational coal mines in India, using multiple Right to Information Act applications (India’s equivalent of a Freedom of Information Act) and then combining this dataset with coal company wise employment factors to estimate direct job numbers at the district level (an administrative unit within a state). We find that coal is produced in 51 districts in 13 states in India, with large variations in employment numbers among these districts. While Korba district in Chhattisgarh state is the highest coal producing district, Dhanbad district in Jharkhand state is home to the highest number of coal mining workers. This is the first attempt at understanding the socio-economic dependency on coal at a district level, and future work could focus on quantifying other district level socio-economic indicators such as coal related revenues. The new dataset and the results of this paper will be useful for scholars conducting future work on coal transitions and related topics.


Introduction
Meeting the global Paris climate target of staying well below 2°C requires a rapid reduction in the use of fossil fuels, particularly coal [1]. While some rich countries have made plans to phase out coal as part of the Powering Past Coal Alliance, attaining the Paris Agreement goals would entail reduced long-term coal dependency in emerging economies such as India, the second largest producer and consumer of coal [2].
Prior work on coal transitions has largely focused on techno-economic analyses, including for India [3][4][5]. These technology-focused studies typically analyze least-cost or cost-effective ways to achieve coal transitions. For example, a recent study showed that the average cost of solar is lower than that of some old coal power plants in India and made the case for shutting down these older power plants [3].
Over the last few years, a rich literature has emerged on socio-economic dimensions of low carbon transitions but it is limited primarily to Organization for Economic Co-operation and Development (OECD) countries [6]. This work is focused on the need for a 'just transition' for coal workers whose livelihoods depend on coal production [6][7][8][9]. Just transition plans are also seen as important to overcome political resistance from fossil fuel workers against the low carbon policies or coal transitions [6,10].
In India, the last five decades have seen both the federal government and state governments promoting coal mining and power development through state-owned enterprises in order to meet growing energy needs and achieve social and economic development [11,12]. Over time, this has meant that certain coal supply regions in India now depend on coal for jobs and local government revenues [5,11,12]. Despite the importance of coal for regional economies, the scale of regional socio-economic dependency on coal is not well understood and has not been quantified [5,6,11,12]. Understanding the scale of socio-economic dependency and creating just transition plans will become an important factor in any coal transition in India, as it has become in other contexts (e.g. Germany, the United States, and the UK) [6,7,10].
A key reason for the lack of quantification of socio-economic dependency on coal is the absence of publicly available datasets such as locations of coal mines and their production in India [10]. For comparison, coal mining datasets are openly provided by government agencies in Global North countries such as the US [13]. Scholars have in the past highlighted that energy research in India has a data availability problem [14].
We fill this important data gap by creating a dataset of all 459 operational coal mines in India, including their production and location. We then combine this dataset with another novel dataset (coal company wise employment factors) to quantify direct coal mining job numbers by district (an administrative unit within a state).
By doing so, we contribute to the growing literature on socio-economic dimensions of coal transitions by focusing on the coal mining sector in India. It must be noted that the scope of this paper is limited to quantifying direct jobs and that jobs are only one key dimension of the socioeconomic dependence of local regions on the coal industry. However, direct jobs are a clear measure of dependency and an important first step in any analysis of this nature. Future work needs to explore a broader range of metrics and indicators such as district revenues from the coal industry, social spending by coal companies, and others.
Figure 1. Employment factors for CIL and its subsidiaries by type of mining (OC or UG) and company average [20]. The employment factors vary between subsidiaries and between the type of coal mining. Underground coal mining is much more labor intensive than open cast coal mining across subsidiaries.

Methods
In this note, we first created a coal mines dataset, and then combined it with a dataset of coal company wise employment factors (the number of employees per million tonnes of production, i.e. jobs/MT) to estimate district wise job numbers. An employment factors approach for energy sector job quantification has been used extensively in past scholarly work [10,15,16]. Direct jobs here refer to those jobs that are directly connected with the coal mining industry. These include workers working in the mines and washeries, executives working in coal company offices, and support staff. We do not quantify indirect jobs (e.g. trucking) or induced jobs (e.g. restaurant workers serving coal workers) in the coal industry. We also do not account for unauthorized coal workers (also known as informal workers) who scavenge coal for a living [11,12,17].
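The employment factors approach described above can be written compactly as follows. The notation is ours and is introduced only for illustration: $J_d$ is the number of direct jobs in district $d$, $M_d$ is the set of mines located in that district, $P_m$ is the annual production of mine $m$ in MT, and $EF_{c(m),t(m)}$ is the employment factor (jobs/MT) for the company $c(m)$ that operates the mine and the mine type $t(m)$.

```latex
J_d = \sum_{m \in M_d} P_m \, EF_{c(m),\,t(m)}
```

Under this formulation, two districts with the same production can have very different job estimates if their mines differ in operator or in type of mining.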
In 2018-2019, Coal India Limited (CIL), Singareni Collieries Company Limited (SCCL) and Neyveli Lignite Corporation (NLC), the three government-owned coal mining companies, produced 93% of the total 756 Million Tonnes (MT) of coal in India [18]. The rest (7%) was produced by a small number of government owned or private coal producers [18]. Federal government-owned CIL is the largest of the three coal mining companies and alone produced around 610 MT (81%) during the above period.
We obtained information on all coal mines in India, their production and location by filing applications under the Right to Information (RTI) Act with CIL and its subsidiary companies, SCCL, NLC, and the Coal Controller Organization (India's coal sector regulator). The RTI Act in India is similar to the Freedom of Information Act in many other countries. Overall, our 'Indian coal mine location and production' dataset includes the following details for each of the 459 operational coal mines in India: (1) Name of the mine; (2) District Name; (3) Coal production in 2019-2020; (4) Operator name; and, (5) Type of mine (open cast (OC) or underground (UG)). The dataset also includes geocoordinates for each mine [19].
Our employment factors dataset for CIL and its subsidiaries is separated by type of mining (OC or UG) and company average (figure 1). We used the latest Joint Bi-partite Committee of Coal Industry (JBCCI) [20] report to create this dataset. It must be noted that CIL subsidiaries produce coal either directly or through contractors. We created the employment factors dataset using CIL subsidiaries' direct production and employment numbers. Due to a lack of employment numbers for CIL contractors, we assumed that contractor run mines have the same employment factor as the CIL subsidiary that hires the contractor. For example, for CIL subsidiary Eastern Coalfields Limited (ECL), we assumed that all ECL mines have the same employment factor whether they are run by ECL or its contractors. We note that the impact of this assumption is mitigated by the fact that the contractor run mines are always located in the same district as the directly operated mines and in most cases right next to mines run by CIL subsidiaries. Both CIL subsidiaries and their contractors operate mines that have similar mine geology and local conditions and use a local workforce. However, we recognize that this is a limitation of our study. Future work can collect detailed datasets on employment factors for contractor run mines for each coal company and then use our mines dataset to improve the jobs quantification.
For non-CIL mines (nearly 20% of production), we used CIL country-wide employment factors for OC and UG mines. We made this assumption because employment factors data are not available for non-CIL mines, a limitation that can be addressed in future analyses by collecting employment factors data for these coal companies. A small number of mines (about 4%), operated by 3 CIL subsidiaries, are considered mixed mines, where coal is produced using both OC and UG methods. For these mines, we used the weighted average employment factor for the subsidiary that owns the mine.
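The selection rules described above can be sketched in a few lines of Python. This is a minimal illustration and not the code used for this note: the `Mine` record, the function names, the example mines and all employment factor values are hypothetical placeholders, and the `AVG` entry stands in for the weighted average (company average) factor of a subsidiary. The sketch also assumes that a mixed mine run by a non-CIL operator would fall back to the CIL country-wide average, a case the text does not cover.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Mine:
    name: str
    district: str
    operator: str         # e.g. "ECL" (CIL subsidiary) or "SCCL" (non-CIL)
    mine_type: str        # "OC", "UG" or "MIXED"
    production_mt: float  # annual production in million tonnes (MT)


# Hypothetical employment factors in jobs per MT. These are placeholder
# numbers for illustration only, NOT the values reported in the paper.
EMPLOYMENT_FACTORS = {
    "ECL": {"OC": 1000.0, "UG": 9000.0, "AVG": 4000.0},
    "CIL": {"OC": 500.0, "UG": 8000.0, "AVG": 1000.0},  # CIL country-wide
}
CIL_SUBSIDIARIES = {"ECL"}


def employment_factor(mine: Mine) -> float:
    """Pick the jobs/MT factor for a mine following the rules in the text."""
    if mine.operator in CIL_SUBSIDIARIES:
        # Directly operated and contractor run mines share the subsidiary factor.
        factors = EMPLOYMENT_FACTORS[mine.operator]
    else:
        # Non-CIL mines fall back to the CIL country-wide factors.
        factors = EMPLOYMENT_FACTORS["CIL"]
    # Mixed mines use the weighted average factor instead of an OC or UG one.
    key = "AVG" if mine.mine_type == "MIXED" else mine.mine_type
    return factors[key]


def district_jobs(mines: list[Mine]) -> dict[str, float]:
    """Sum production times employment factor over the mines in each district."""
    totals: dict[str, float] = defaultdict(float)
    for mine in mines:
        totals[mine.district] += mine.production_mt * employment_factor(mine)
    return dict(totals)


# Hypothetical mines, used only to exercise the three rules.
EXAMPLE_MINES = [
    Mine("Mine A", "District X", "ECL", "UG", 0.5),
    Mine("Mine B", "District X", "ECL", "MIXED", 1.0),
    Mine("Mine C", "District Y", "SCCL", "OC", 2.0),
]

if __name__ == "__main__":
    print(district_jobs(EXAMPLE_MINES))
```

Keeping the factor lookup separate from the aggregation means that better data for contractor run or non-CIL mines, once available, would only require changing the lookup table.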

District wise production & mines
Coal is produced in 51 districts across 13 states in India. The bulk of these districts are concentrated in the Central and Eastern Indian states of Jharkhand, Madhya Pradesh, Chhattisgarh, Odisha, and the southern state of Telangana (figures 2 and 3). While state wise coal production numbers were previously known, this is the first quantification of district level production for all coal producing districts in India. Among districts, there is a large variation in coal production and the number of mines producing coal. Korba district in Chhattisgarh is the largest coal producing district, where just 15 mines produce 120 MT. There are also others such as Singrauli (Madhya Pradesh) and Angul (Odisha) with 7 and 13 mines respectively producing just over 80 MT. The mines in these districts are operated by CIL's newer subsidiaries South Eastern Coalfields Limited (SECL), Northern Coalfields Limited (NCL) and Mahanadi Coalfields Limited (MCL), which are more efficient and operate large OC mines.
On the other hand, Paschim Bardhaman (West Bengal) and Dhanbad (Jharkhand) districts are home to 65 and 51 mines respectively but only produce about 31 MT each. The mines in these districts are operated by Bharat Coking Coal Limited (BCCL) and Eastern Coalfields Limited (ECL), the oldest subsidiaries of CIL and home to a large number of UG coal mines. All in all, 22 districts produce over 10 MT of coal, 17 districts produce between 1 & 10 MT, and 12 districts produce less than 1 MT of coal (figures 2 and 3).
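The grouping of districts into production bands can be reproduced from district level totals with a short Python sketch. The function names and the example values are illustrative: the Korba and Dhanbad figures are the rounded numbers quoted in the text, while "District P" and "District Q" are hypothetical. The treatment of the boundaries (exactly 1 MT or 10 MT falling in the middle band) is our assumption, since the text only says "over 10", "between 1 & 10" and "less than 1".

```python
from collections import Counter


def production_band(district_production_mt: float) -> str:
    """Assign a district to one of the three production bands."""
    if district_production_mt > 10:
        return "over 10 MT"
    if district_production_mt >= 1:
        return "1 to 10 MT"
    return "under 1 MT"


def count_bands(production_by_district: dict[str, float]) -> Counter:
    """Count how many districts fall into each production band."""
    return Counter(production_band(p) for p in production_by_district.values())


# Rounded figures from the text for Korba and Dhanbad; the other two
# districts are hypothetical and only illustrate the lower bands.
EXAMPLE_PRODUCTION = {
    "Korba": 120.0,
    "Dhanbad": 31.0,
    "District P": 4.2,
    "District Q": 0.3,
}

if __name__ == "__main__":
    print(count_bands(EXAMPLE_PRODUCTION))
```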

District wise direct jobs
Our results show that there were 744,984 direct coal mining jobs in India in the financial year 2019-2020. Dhanbad district in Jharkhand state has the highest number of coal mining jobs at 122,348 and Pakur district in the same state with only 63 jobs has the lowest number of jobs (table 1). Overall, 28 districts have over 5000 coal mining jobs, 12 districts have between 1000 and 5000, and 11 have less than 1000 jobs. This is the first quantification of coal mining jobs at the district level in India.
Figure 4. District wise comparison between coal production & jobs. The Y axis on the left side represents coal production and on the right side represents number of jobs (on a logarithmic scale). As coal production increases, jobs increase but with variations between districts.
When comparing district wise coal mining jobs and production, we find that the number of coal jobs generally increases with coal production (figure 4). However, job numbers vary among districts due to differences in the type of mines in operation (OC versus UG) and in the employment factors of the coal companies that operate them. Districts that have predominantly large OC mines have fewer jobs than those with more UG mines. For example, Korba district, which is the highest coal producing district (over 120 MT), has nearly 30,000 fewer coal jobs than Dhanbad district, which produces 30 MT.

Conclusion
Meeting the Paris climate targets would require a reduction in the use of coal-based energy systems globally, with implications for India, the second largest producer and consumer of coal. Globally, prior work on coal transitions has focused on least-cost techno-economic analysis. With regard to the socioeconomic dimensions (or just transition related aspects) of coal transitions, scholarly work has mainly focused on OECD countries. Understanding India's energy transition, including the socio-economic dimensions of a coal transition, requires the availability of good data.
In this note, we collected the 'Indian coal mine location and production' dataset, which includes details of all operational coal mines, and quantified district level jobs in India, a country whose coal trajectory is crucial to meeting global climate targets [5]. The novel dataset created for this note will be useful for scholars researching the spatial dimensions of coal transitions in India from both a techno-economic and a socio-economic point of view. Using the dataset and a similar employment factors approach, future work could focus on quantifying indirect and induced coal jobs at the district level. We also anticipate that a number of scholars will be interested in using our coal mines dataset to identify specific districts to focus on for more detailed local analyses of the socio-economic dimensions of the Indian coal transition. Future applications of the dataset include: (a) broader socio-economic analyses of coal mining that account for more than jobs; (b) examining the spatial dimensions of local air pollution and health issues around coal mines; (c) case selection for detailed local qualitative and quantitative studies of livelihoods, wages and working conditions of coal miners; (d) prioritization by stakeholders (e.g. policymakers or trade unions) seeking to focus their efforts on creating effective just transition plans.
While our dataset and results will be very useful for future scholarly and policy work on socio-economic dimensions of coal transitions, we note again that direct jobs are only one of the several socio-economic dimensions of coal transitions (research use (a) above). We are currently collecting data on other socioeconomic indicators such as district wise coal related revenues and social spending by coal companies. This will be included in future updates of the dataset.