Potentials of big data for corporate environmental management: A case study from the German automotive industry

Integrating more sustainability into business processes is becoming increasingly important for companies. At the same time, they aim to collect and analyze large amounts of data (big data) to improve these processes. The potentials of big data for corporate environmental protection are hardly dealt with in the scientific literature. The main contribution of this paper is to identify potential big data use cases for corporate environmental management by using the example of the German automotive industry. For this purpose, expert interviews were conducted with corporate environmental managers and evaluated using qualitative content analysis. To balance this environmental perspective and enhance it with data analytical expertise, the use cases were subsequently assessed by data analytics experts through a mixed-method approach. The presentation of the five identified use cases and their critical reflection by data analytics experts are the key results of this paper.


INTRODUCTION
In many industrialized countries, the industrial sector is currently undergoing a period of digital transformation. The implications of this digital transformation with regard to sustainability remain unclear (Fritzsche, Niehoff, & Beier, 2018). At the same time, it is leading to a continuous rise of available data on industrial production processes (Khan, Wu, Xu, & Dou, 2017). Current possibilities to collect and analyze large amounts of data (big data) offer companies opportunities to improve their business processes. "Big data is high-volume, high-velocity and/or high-variety information assets that demand cost-effective, innovative forms of information processing that enable enhanced insight, decision making, and process automation" (Gartner, 2018). Big data analytics is usually differentiated into three types that build on one another: descriptive, predictive, and prescriptive analytics (Delen, 2015).
The industrial sector can be regarded as playing an important ecological, social, and economic role. In Germany, industry accounts for 25.7% of gross domestic product (Federal Statistical Office, 2017a) and 7% of total emissions (German Environment Agency, 2018) whilst employing around 7.3 million people (Federal Statistical Office, 2017b).
From an environmental point of view, it is essential that the use of non-renewable resources is reduced to a minimum and that they are handled in a particularly responsible manner. At the same time, the sustainable access to natural resources is a crucial factor for economic development.
Therefore, the efficient use of resources within corporate resource management is of particular importance, both for companies and for environmental protection (Merino-Saum, Baldi, Gunderson, & Oberle, 2018). The economical use of resources is a central issue in which operational interests often overlap with those of corporate environmental management.
Corporate environmental management has become increasingly important for companies aiming to contribute to sustainable development from an environmental perspective (Engelfried, 2011). Through corporate environmental management, all procedures and responsibilities are to be organized in such a way that corporate and societal demands for environmentally sound action are ensured, environmental opportunities and risks are recognized early, and legal requirements are met (German Environment Agency, 2017).
The automotive industry has a particularly salient role within the industrial sector in Germany in terms of employment, value creation, and investments in research and development (Legler et al., 2009), making it one of the most relevant industries for the transformation toward sustainable development as defined in the Brundtland Report (WCED, 1987). Holden, Linnerud, and Banister (2014) derived four dimensions of sustainable development from the Brundtland Report: safeguarding long-term ecological sustainability, satisfying basic human needs, promoting intragenerational equity, and promoting intergenerational equity. The analysis in this paper focuses exclusively on the dimension of long-term ecological sustainability.
The aim of this work is to identify potential big data use cases for corporate environmental management using the example of the German automotive industry. This industry is highly relevant for all three dimensions of sustainability and, due to its high spending on research and development, is presumably also progressive with regard to modern technologies such as big data analytics. The study covers original equipment manufacturers (OEM), as well as supplying companies. With regard to the ecological perspective of sustainable development, it seeks to demonstrate ways in which industrial companies can strengthen corporate environmental protection through the use of big data.
The paper continues with a section on the theoretical background and a state of the art analysis. Afterward the study and its findings are presented in the form of use cases. Subsequently, the paper discusses the results including the critical assessment of the developed use cases through data analytics experts and provides conclusions and an outlook on future research.

THEORETICAL BACKGROUND AND STATE OF THE ART
Corporate environmental management is to be understood as a cross-sectional and managerial function that is integrated at an operational, strategic, and normative level (Kramer, Brauweiler, & Helling, 2003). It is characterized by four features: multi-dimensional targeting, cross-functional character, cross-company character, and proactive behavior (Meffert & Kirchgeorg, 1998). The aim of an environmental management system is to continuously improve the environmental performance of a company as part of a continuous improvement process (Förtsch & Meinholz, 2014).
The Plan-Do-Check-Act (PDCA) cycle is an iterative, four-step process with which this improvement process is implemented (DIN EN ISO 14001, 2015).
However, there is no uniform concept of how the tasks of corporate environmental management can be classified in the various phases of the PDCA cycle. The classification rather depends on the organization with specific tasks sometimes falling into multiple categories, so that in practice tasks cannot always be strictly assigned to one phase. Table 1 offers an orientation with regard to this classification.
The tasks of an environmental management system are implemented by the normative specifications of the ISO 14001 and the Eco Management and Audit Scheme (EMAS) regulation (Kramer et al., 2003). Both environmental management systems are based on the structure of the PDCA cycle.
They are among the most common environmental management systems (Engelfried, 2011).
A systematic analysis of the state of the art, through a literature search in the scientific databases Scopus, Google Scholar, and ScienceDirect (search string: "big data" AND environment*), has revealed that a number of studies have investigated the potentials of big data analytics in a corporate context, also touching upon environmental issues. The following paragraphs summarize the findings from the most relevant studies published in highly recognized journals, supplemented by a few other reports that are very closely related to the topic at hand.
In general, few studies have assessed the impact big data has on environmental performance, and the majority of these propose theoretical approaches (Belhadi, Kamble, Zkik, Cherrafi, & Touriki, 2020). However, recent literature provides some indication of a positive relationship between big data and corporate environmental protection. For instance, Wu et al. (2017) suggest using a big data approach based on social media, quantitative, and qualitative data to manage a multitude of environmental supply chain risks and uncertainties. Similarly, Queiroz (2018) presents two environmental frameworks based on social media and other big data analytics approaches to enhance environmental management and sustainability performance. Seles et al. (2018) identify and analyze challenges and opportunities the climate crisis presents for organizations and how organizations should respond to this scenario, while also examining the implications of big data management. Belhadi et al. (2020) examine big data and environmental performance, arguing that big data allows for a "complete and immediate solution to analyze massive environmental data that is generated at different nodes in the organization", thereby reducing the complexity that is usually associated with environmental data (Belhadi et al., 2020). Another study uses a big data analysis and finds that participation of Chinese enterprises in global value chains can considerably improve their green technology levels. Kumar, Singh, and Lamba (2018) propose a method to solve the sustainable robust stochastic cellular facility layout problem based on big data. With the help of 100 experts they identify 14 criteria for that purpose, which are grouped into four clusters: Material Handling Distance, Maintenance, Adjacency, and Hazard. In addition, a pool of potential layouts is created and a consensus ranking is generated (Kumar et al., 2018). Raut et al. (2019) analyze responses of 316 Indian professional experts from manufacturing firms to identify multiple factors influencing big data analytics and sustainability practices. Their findings suggest that management and leadership style as well as state- and central-government policy are the two most important predictors of big data analytics and sustainability practices (Raut et al., 2019). Corporate sustainable capabilities resulting from the integration of big data technologies, green supply chain management, and green human resource management practices are investigated by Singh and El-Kassar (2019), who find that corporate commitment influences big data assimilation, leading to an overall improvement in the sustainable performance of the company.
With regard to improving resource efficiency in industrial production, a case study by Zhang, Ren, Liu, and Si (2017) indicates that energy consumption of manufacturing and maintenance processes can be reduced through big data approaches. Two of the authors subsequently develop a big data driven analytical framework for energy-intensive manufacturing industries that aims at reducing energy consumption and emissions (Zhang, Ma, Yang, Lv, & Liu, 2018). On a similar topic, Mani, Delgado, Hazen, and Patel (2017) explore the application of big data analytics in mitigating supply chain social risk and demonstrate how this can contribute to environmental, economic, and social sustainability. This is in line with the findings of Dubey et al. (2019), who analyze 205 manufacturing companies in India and state that big data and predictive analytics have a significant impact on social and environmental performance in supply chains. On a more general level, a digital "planetary nervous system" on the basis of a concept called "Predictive Sustainability Control" and operationalized in a big data driven environment is proposed by Seele (2017) in order to predictively identify likely unsustainable events in industrial organizations.
Regarding the direct relation of big data and corporate environmental management, Keeso (2014) examines the connection between the two in a cross-sectoral study using different organizations (non-governmental organizations, governments, and companies). The study shows how big data is perceived in the context of environmental sustainability and how difficult it is to implement it. Keeso concludes that although the link with sustainability activities is only slowly progressing, big data is an integral part of environmental sustainability, for instance, through improved environmental performance measurement. Hampton et al. (2013) examine the question of how and why big data should be brought more into the focus of science and practice in order to solve global ecological and social problems. Song, Fisher, Wang, and Cui (2016) deal with the connection of big data and environmental management in an entrepreneurial context. They look at the potential of big data in environmental performance measurement. In addition, they summarize the latest advances in environmental management based on big data technologies.
De Camargo Fiorini, Jabbour, Lopes de Sousa Jabbour, Stefanelli, and Fernando (2019) follow a similar approach with their two case studies of Brazilian companies, identifying potential contributions of information systems in general and big data approaches in particular to the evolutionary process of corporate environmental management. Cooper, Noon, Jones, Kahn, and Arbuckle (2013) examine the potentials and challenges of big data for life cycle analysis, for example, in terms of collecting heterogeneous data from different databases. Etzion and Aragon-Correa (2016) show overlaps between big data and sustainability management and how operational and strategic corporate activities are affected in this process.
Additionally, there are numerous publications dealing with big data use cases in a corporate context, which focus on improving the efficiency of different value creation processes but do not explicitly address environmental issues. However, so far big data projects have mainly been [...] Although several studies have examined their general potential and analyzed effects of implemented use cases, there has been no in-depth discussion in the context of corporate environmental management. The possibilities of big data are already being used in other areas to address various socio-ecological problems and to promote sustainable development (Hsu, 2014; Wagner & Hamann, 2015).

TABLE 2 Overview of (anonymized) interviewees

In order to identify whether these potentials also exist in the context of corporate environmental protection, this study aims to identify potential applications for corporate environmental management using the example of the German automotive industry.
The following central question will be answered in this paper: Q (c): Research question: What are potential big data use cases for corporate environmental management in the automotive industry?
To answer the central research question, the following sub-questions will be examined:
Q (1): Which phases of corporate environmental management can be supported by big data?
Q (2): Which specific objectives can be derived for the phases of the PDCA cycle?
Q (3): Which categories of big data analytics may be applied to achieve the objectives examined in Q (2)?
Q (4): What data should be used to achieve the objectives examined in Q (2)?

METHODS
Due to the relatively unexplored field of investigation, this work has an exploratory character and uses guideline-based expert interviews (see Table 2 for an overview of interviewees) as a recognized method of qualitative social research. The study is a partial sample survey, in which the objects of investigation in the first stage of the study have been selected based on a predefined set of criteria. The following criteria have been applied for the selection of experts:
• Environmental management officer or other responsibility in the context of corporate environmental management, and
• Working for a company in the automotive industry (this includes automobile OEM and automotive suppliers), and
• The respective company is a large company (according to EU Recommendation 2003/361/EC), and
• The interviewee works for a company site in Germany.
The interview guideline is divided into three sections. The first section begins with a brief thematic introduction to the topic. In the middle section, the questions addressing the formulated research questions are asked. This section is divided into the following topics:
• big data (understanding of terminology, existence of big data strategy, departments using big data analytics),
• corporate environmental management (certified environmental management system, corporate environmental information systems used, what data is analyzed in the department, strategic goals for environmental management and data analysis),
• potential big data use cases (general knowledge about use cases from other areas, how and with which data can big data analyses support which processes/phases of corporate environmental management).
The third section provides the opportunity for interviewees to add other important aspects that have not yet been addressed. The complete interview guideline is attached in Supporting Information S2. Before the beginning of the interview, each interviewee was informed about the purpose of the study, the protection of personal data, assured of their anonymity and asked for permission to record the interview. All expert interviews were recorded electronically and completely transcribed afterward. The analysis of the collected data was carried out by means of qualitative content analysis following the approach of Mayring (2010) on the basis of the transcribed interviews. Specifically, the analysis technique of content structuring was applied: After the text was edited using a category system, the material extracted in the form of paraphrases is first grouped by sub-category and then subsequently by main category (Mayring, 2010). As main categories, the central topics were derived from sub-questions Q (1)-Q (4). Prior to the application of the category system, a coding guideline was created which defines the categories and their characteristics, as well as examples and coding rules for every single category. The complete coding guideline and the final category system are shown in Supporting Information S1.
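The content structuring step described above can be illustrated with a minimal sketch. It is not the coding procedure actually used in the study: the category names, paraphrases, and interview numbers below are hypothetical placeholders, and the function name structure_content is introduced here only for illustration. The sketch merely shows how coded paraphrases might be grouped by sub-category within each main category.

```python
from collections import defaultdict

# Hypothetical coded paraphrases (not taken from the interviews):
# (main category, sub-category, paraphrase, interview number)
coded_segments = [
    ("Data", "Energy data", "Energy consumption is recorded per site", 2),
    ("Objectives", "Energy efficiency", "Reduce energy use per unit produced", 6),
    ("Data", "Waste data", "Waste quantities are reported annually", 1),
    ("Data", "Energy data", "Metering exists only at a few points", 5),
]


def structure_content(segments):
    """Group paraphrases by sub-category within each main category."""
    structured = defaultdict(lambda: defaultdict(list))
    for main_cat, sub_cat, paraphrase, interview in segments:
        structured[main_cat][sub_cat].append((interview, paraphrase))
    # Convert to plain dicts so the result prints and compares cleanly.
    return {main: dict(subs) for main, subs in structured.items()}


category_system = structure_content(coded_segments)
for main_cat, subs in category_system.items():
    for sub_cat, entries in subs.items():
        print(main_cat, "/", sub_cat, ":", len(entries), "paraphrase(s)")
```

Keeping the interview number next to each paraphrase preserves the link back to the source, in the same way the results cite interviewees as "Int. n".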
In a subsequent process, the use cases were evaluated by five data analytics experts in order to balance the environmental perspective and enhance it with data analytical expertise. All experts in this stage of the study are working for companies offering data analytics services and software solutions and had previous experience in projects, where data analytics use cases were implemented at industrial companies in different industries. The interviewees (Int. 7-11) were asked for open feedback regarding the elaborated use cases, which were successively discussed, and their assessment of every use case regarding the following set of criteria.
• Availability of data: Can the required data easily be accessed when and where needed?
• Heterogeneity of data: What is the level of heterogeneity of the data used?
• Data protection efforts: How do you rate the efforts necessary to ensure compliance with data protection rules for this use case?
• Feasibility: How do you rate the feasibility/simplicity to implement this (potential) use case?
• Benefit: How do you see the potential benefit of big data analysis that goes beyond mere data aggregation?
The complete interview guideline and the final category system are shown in Supporting Information S3.
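To make the multi-criteria assessment more tangible, the following sketch shows how expert ratings on the five criteria could be averaged when not every expert rates every criterion. The rating values, the assumed 1 to 5 scale, and all identifiers are hypothetical and serve illustration only; they are not the ratings collected in this study.

```python
CRITERIA = ["availability", "heterogeneity", "data_protection", "feasibility", "benefit"]

# Hypothetical ratings for one use case on an assumed 1-5 scale.
# None marks a criterion an expert did not rate.
ratings = {
    "Int. 7": {"availability": 4, "heterogeneity": 2, "data_protection": None,
               "feasibility": 3, "benefit": 5},
    "Int. 8": {"availability": 2, "heterogeneity": 4, "data_protection": 2,
               "feasibility": None, "benefit": 4},
    "Int. 9": {"availability": 3, "heterogeneity": None, "data_protection": 2,
               "feasibility": 3, "benefit": 3},
}


def average_ratings(expert_ratings, criteria=CRITERIA):
    """Average each criterion over the experts who actually rated it."""
    averages = {}
    for criterion in criteria:
        values = [r[criterion] for r in expert_ratings.values()
                  if r.get(criterion) is not None]
        averages[criterion] = sum(values) / len(values) if values else None
    return averages


averages = average_ratings(ratings)
print(averages)
```

Excluding missing ratings from the denominator, rather than treating them as zero, avoids penalizing a use case merely because some experts felt unable to judge it.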

RESULTS
The data evaluation showed that one of the surveyed companies already implements a specific big data project for corporate environmental management. In addition to this use case, four further potential use cases were identified, which are described in the following sub-sections.

Use case 1: Improved creation of life cycle assessments
The optimized calculation of life cycle assessments (LCAs) was mentioned as a potential use case, although none of the companies surveyed has implemented or planned a big data project in this regard (Int. 4). In particular, respondents were interested in the optimization of product LCAs, which consider the entire product lifecycle, from raw material extraction to disposal/recycling (external data). The compilation of LCAs must take place across departments, especially in cooperation with product development (Int. 4). The aim is to create a more accurate assessment of the environmental impact of products, especially for CO2 emissions (substance and energy flow data) (Int. 4) in order to substitute substances with more environmentally friendly materials and processes (production and process data) (Int. 4). Additionally, the product materials should be automatically synchronized with different databases to immediately see whether they are subject to restrictions in certain markets or destination countries, or may be subject to future regulation (Int. 4), in order to be able to react early to legal issues (organizational data) (Int. 4).
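The automatic synchronization of product materials with restriction databases mentioned by Int. 4 could, in its simplest form, be thought of as a set comparison. The following sketch assumes that a bill of materials and market-specific restriction lists are already available as plain collections; the substance names, market names, and the function find_restricted_materials are hypothetical and do not refer to any real regulation or database.

```python
def find_restricted_materials(bill_of_materials, restrictions_by_market):
    """Return, per market, the materials of a product that appear on its restriction list."""
    hits = {}
    for market, restricted in restrictions_by_market.items():
        found = sorted(set(bill_of_materials) & set(restricted))
        if found:
            hits[market] = found
    return hits


# Hypothetical product and restriction lists (placeholders, not real regulation).
bill_of_materials = ["steel", "substance_a", "polymer_b"]
restrictions_by_market = {
    "market_x": {"substance_a"},
    "market_y": {"substance_c"},
    "market_z": {"polymer_b", "substance_a"},
}

restricted = find_restricted_materials(bill_of_materials, restrictions_by_market)
print(restricted)
```

In practice the difficulty lies less in this comparison than in keeping substance identifiers consistent across heterogeneous databases, which is where the data integration effort would be expected.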

Use case 2: Measuring energy consumption and increasing energy efficiency
Frequently mentioned goals of the interviewees included a more exact measurement of energy consumption and the increase of energy efficiency (Int. 2, 5, 6). The analysis of energy data is one of the most important tasks of data analysis in corporate environmental management (Int. 1, 2, 3, 5).
The survey revealed that big data could yield great potential for optimization especially in the field of energy (Int. 4, 6). Only one of the companies surveyed is currently implementing a big data project to increase energy efficiency. The strategic goal of this project is to reduce energy consumption per unit produced. In the first step, all energy-relevant data are consolidated in a common database, the so-called data lake (Int. 6).
This data lake is integrated into a superior system, managed by the central IT. The local systems are managed by the power supply units of the sites (Int. 6). The purpose of the data lake is to eliminate the data exchange via different individual systems and interfaces and to create a "single source of truth" (Int. 6). As a result, the data sources can be determined directly and the errors that inevitably occur when data is grouped several times can be reduced (Int. 6). In addition, since the big data project is implemented throughout the group, other business-relevant data that is normally difficult to obtain (e.g., specific process data) can be easily accessed through the shared database (Int. 6). The energy-relevant data in this case include energy consumption (related to the smallest unit in the production process, e.g., a robot cell), structure and process data (e.g., outputs per process unit, operating times of the plants), as well as external data (e.g., weather data, data from energy exchanges) (Int. 6).
With the help of the data lake, a central, multi-site energy management system can be implemented (Int. 6). As soon as the data lake is activated, the next step is to use big data analytics to identify potential savings, both for the energy side and for costs (Int. 6). On the one hand, energy consumption at defined measuring points can be observed and evaluated in real time (descriptive analytics, determination of environmental aspects/plan, regular measurements/check). On the other hand, with regard to power plant deployment planning, predictive and prescriptive analytics can be used to consume optimal amounts of energy at an optimal time (implementation of the environmental program/do; derivation of improvement measures/act) (Int. 6).
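As an illustration of the stated strategic goal, energy consumption per unit produced, the following sketch combines energy data and output data per measuring point, as they might be held in a shared database. All figures, measuring point names, and the function energy_per_unit are hypothetical; the sketch does not reproduce the system described by Int. 6.

```python
# Hypothetical records per measuring point (e.g., a robot cell) for one period.
energy_kwh = {"robot_cell_1": 1200.0, "robot_cell_2": 950.0, "paint_shop": 4000.0}
units_produced = {"robot_cell_1": 400, "robot_cell_2": 500, "paint_shop": 800}


def energy_per_unit(energy, output):
    """Return kWh per produced unit for each measuring point with usable output data."""
    result = {}
    for point, kwh in energy.items():
        units = output.get(point)
        if units:  # skip points without output data or with zero output
            result[point] = kwh / units
    return result


specific_consumption = energy_per_unit(energy_kwh, units_produced)
# The measuring point with the highest specific consumption is a candidate for closer analysis.
highest = max(specific_consumption, key=specific_consumption.get)
print(specific_consumption, highest)
```

The calculation is only possible once energy and process data share a common key, which is the practical benefit of the "single source of truth" described above.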

Use case 3: Measurement and reduction of emissions
Another potential use case relates to the reduction of emissions using big data (Int. 4, 6). The measurement of emissions associated with business activities is a key component of corporate environmental management. In addition, it was pointed out that CO2 reduction per product is achieved through optimized LCA and the resulting improved measures for product materials. This would also reduce the overall carbon footprint of the company (Int. 4).

Use case 4: Measurement and reduction of water consumption
The reduction of water consumption is a frequently mentioned goal of the respondents (Int. 2, 4, 5). The analysis of water data, such as the collection of wastewater consumption is an integral part of data analysis in the surveyed companies (Int. 1, 2, 4, 5, 6). The aim is the optimal tracking of water consumption in connection with a cause/effect analysis of the implemented measures, as well as suitable reporting tools (descriptive analytics, determination of environmental aspects/plan, regular measurements/check, derivation of improvement measures/act) (Int. 3). By tracking water consumption, the possibility of forecasting expected consumption was also mentioned as a potential application (predictive analytics; determination of environmental aspects/plan) (Int. 4).
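The forecasting of expected consumption mentioned by Int. 4 can be sketched with a simple linear trend fitted by ordinary least squares. This is a deliberately minimal stand-in for predictive analytics, not a method proposed by the interviewees; the monthly values and the function linear_forecast are hypothetical.

```python
def linear_forecast(values, steps_ahead=1):
    """Fit a least-squares line to equally spaced observations and extrapolate it."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two observations are required")
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return intercept + slope * (n - 1 + steps_ahead)


# Hypothetical monthly water consumption in cubic meters.
monthly_water_m3 = [100.0, 98.0, 96.0, 94.0]
next_month = linear_forecast(monthly_water_m3)
print(next_month)
```

Comparing such a forecast with the consumption actually measured after a measure has been implemented is one simple way to approach the cause/effect analysis mentioned above.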

Use case 5: Optimization of waste management
A further potential use case is the optimization of waste management (Int. 3, 5, 6). The reduction of waste generation was defined as a strategic goal (Int. 2). Waste data are among the most frequently analyzed data in operational environmental management (Int. 1, 2, 4, 5).
Companies should be able to track waste quantities more precisely. The aim is the tracking of waste volumes and waste movements, as well as the preparation of waste reports (descriptive analytics, determination of environmental aspects/plan, regular measurements/check) (Int. 3). In addition, the disposal should be better controlled by forecasting future waste volumes (derivation of environmental goals/plan, predictive analytics, implementation of the environmental program/do) (Int. 5).

Improved creation of life cycle assessments
LCA at the product level enables an evaluation of the advantages and disadvantages of certain products and processes with regard to their environmental effects. An optimized life cycle performance offers a competitive advantage that is important from a sustainability perspective, especially for manufacturing companies. The limited availability of environmental data across the entire life cycle is a significant problem for the creation of LCAs.
For this reason, average data is often used (Xu, Cai, & Liang, 2015). Implementing effective big data analysis seems to be one promising step within the complex process of collecting a multitude of up-to-date data from products and production processes along the value chain. Thus, big data can be one important factor in improving data aggregation and analysis for LCA, which in turn contributes to a more environmentally friendly resource management (Cooper et al., 2013). However, the possibilities of big data for the creation of LCA are hardly used yet (Li, Tao, Cheng, & Zhao, 2015).

TABLE 3 Use case 1
Phases: Plan, Check, Act
Goals: Improved life cycle assessments
Data: Organizational data, substance data and material master data, substance and energy flow data, production and process data, external data

Such potentials have been identified in all phases of a product life cycle, which generally consists of three phases (Li et al., 2015):
• Beginning of Life (BOL): includes design, sourcing, and manufacturing of a product
• Middle of Life (MOL): the use phase of a product
• End of Life (EOL): disposal or recycling of a product
In the BOL phase, a considerable amount of data is already collected today. However, data in the MOL and EOL phases are difficult to collect, as companies in these phases no longer have influence on the use of their products. One starting point is the integration of Product Embedded Information Devices (such as radio frequency identification and sensors) in products, to enable real-time detection of certain parameters during each phase of the life cycle.
The analysis of data using descriptive analytics may help to better capture the environmental aspects of products along their entire life cycle (determination of environmental aspects/plan). Predictive analytics can help to predict the expected environmental impact of a certain user behavior, whilst also assessing the effects of substituting product ingredients (derivation of environmental goals/plan). Prescriptive analytics can support decision-making processes by determining the optimal solution within the scope of all available options. As a result, environmental goals and the resulting environmental program can be planned more precisely (plan). New data collection capabilities can support the adding of potential user behavior data (e.g., social media data) and thus add another dimension to LCA that has been difficult to analyze so far (Xu et al., 2015). The improved collection of data, including data from outside the company, can help to support measurements of the environmental impact of a product in very short time frames (descriptive analytics; regular measurements/check). This data basis can be used as a starting point for an improved assessment of corrective actions (predictive and prescriptive analytics; derivation of improvement measures/act). Table 3 summarizes the key characteristics of use case 1.
The interviewed data analytics experts see a very high benefit for the application of big data approaches in the context of LCA and they agree with the high degree of heterogeneity stated in the literature, but they are on average more positive regarding the availability of data for an LCA (see Figure 1). Int. 7 also pointed out that the combination of big data and artificial intelligence could be especially helpful in the context of this use case. In contrast, Int. 10 does not regard big data in its current form as a suitable approach for LCA, preferring other digital approaches such as digital twins, Internet of Things, and Smart Factories, which are prominently used in the debate around Industry 4.0 (Beier, Ullrich, Niehoff, Reißig, & Habich, 2020), as these supposedly deliver more accurate and specific data. Int. 11 emphasized that the conceptualization and implementation of use case 1 would be very challenging as it requires significant efforts for organizational coordination due to the involvement of multiple departments and the current lack of information regarding suppliers further down the value chain. Two specific fields of application were stated as promising: first, condition assessment of used batteries and the respective forward planning of their usage in the context of e-mobility, even though the economic benefit was doubtful (Int. 8); and second, determining market and product specific regulatory documents and creating respective forecasts (Int. 7).

Energy efficiency
The accurate determination of the initial state of energy is a prerequisite for increasing energy efficiency. Since there are often only a few measuring points in many companies, a detailed assessment of the main energy consumption points is difficult. Therefore, it is necessary to build a dense network of measuring points in order to obtain a solid database for developing optimization potentials (Förtsch & Meinholz, 2014).
Currently, energy-relevant systems are often considered independently meaning that separated optimization measures are carried out. The introduction of an intelligent energy data management system which holistically considers all energy-relevant systems and processes, allows for optimization potentials to be bundled which, in turn, may lead to higher energy efficiency. However, this requires the connection and networking of these systems through a joint network. Collecting these sensor data and load profiles, as well as integrating enterprise-external data (e.g., weather data, electricity price data), and applying big data analytics on them might allow for monitoring and anticipatorily controlling these systems (Shrouf, Ordieres, & Miragliotta, 2014; Wang, Zhang, Shi, Duan, & Liu, 2018).

FIGURE 1 Multi-criteria assessment of use cases by data analysis experts: per expert (p.Exp.) and on average (avg.)

TABLE 4 Use case 2
Phases: Plan, Do, Check, Act
Data: Substance and energy flow data, organizational data, production and process data, external data

In addition to the environmental benefits of a more energy-efficient production, energy costs are also becoming an increasingly decisive competitive factor for businesses. Energy costs in production can be reduced by for instance limiting power peaks (Tschandl, 2012). Table 4 summarizes the key characteristics of use case 2.
Use case 2 was the only use case where all interviewed data analytics experts could provide insights. They see a very high benefit through the application of big data approaches for increasing corporate energy efficiency, where data is largely available and not too heterogeneous. Int. 4 considered it a typical use case implemented in many projects, where big data can help to identify patterns and therefore improvement potentials.
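How pattern identification in energy data might look in its simplest form can be sketched as follows: readings that deviate strongly from the typical level of a measuring point are flagged for inspection. The readings, the threshold of two standard deviations, and the function flag_deviations are assumptions made for illustration; the experts did not specify a particular method.

```python
from statistics import mean, pstdev


def flag_deviations(readings, threshold=2.0):
    """Return indices of readings more than `threshold` standard deviations from the mean."""
    center = mean(readings)
    spread = pstdev(readings)
    if spread == 0:
        return []  # all readings identical, nothing deviates
    return [i for i, value in enumerate(readings)
            if abs(value - center) > threshold * spread]


# Hypothetical energy readings (kWh per interval) of one measuring point.
readings_kwh = [10.0, 10.0, 11.0, 10.0, 9.0, 10.0, 25.0, 10.0]
suspicious = flag_deviations(readings_kwh)
print(suspicious)
```

Because the mean and standard deviation here include the unusual reading itself, this is a crude baseline; a production system would more likely compare against a reference period or a model of expected consumption.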
One challenge was raised by Int. 11, who cautions that if energy consumption data can be related to specific products, this data must be protected in a complex manner. Otherwise it could be critical for keeping business secrets. Int. 9 considers the implementation of the energy efficiency use case also relevant in the context of predictive maintenance, when energy consumption that varies from usual patterns can be used as an indicator for failure prognosis. Int. 7 emphasized that the currently common frequency of collection for energy data (every 15 min) could be too coarse for closely coupling renewable energy systems and markets with the automotive industry in the future. Demand Response Management as one such synergy was also named as one potential benefit by Int. 8, despite claiming that the reality in today's companies was much less visionary: many energy efficiency projects rather substitute conventional illuminants with LED.

TABLE 5 Use case 3

Phases
Plan, Do, Check

Data
Substance and energy flow data, production and process data

Emissions
The provision of primary energy and its conversion cause emissions that can be reduced by increasing energy efficiency, which can be supported through big data approaches. Likewise, CO2 reduction per product can be achieved through an optimized LCA and the resulting improvement measures with regard to product ingredients. This would also reduce the company's overall carbon footprint (Int. 4).
The main tasks for companies in the context of emissions are the monitoring and (if required) control of corporate facilities with regard to their emissions. These data are for instance structured as a series of measurements on pollutant emissions (Tschandl, 2012). Significant industrial emissions include carbon dioxide (CO2), nitrous oxide (N2O), and halogenated hydrocarbons (SF6) (German Environment Agency, 2016).
Most companies calculate their greenhouse gas emissions and carbon footprint by assessing their energy consumption and other resource consumption metrics. Using a network of sensors is a more reliable method of measuring and monitoring emissions (Tang, Yang, & Zhang, 2014). Installing sensors to measure emissions can help to identify emission sources by emission type and allows the effectiveness of reduction measures to be evaluated more effectively (descriptive analytics; determination of environmental aspects/plan; regular measurements/check).
The data-based evaluation of such relations can be regarded as a first step toward forecasting emission levels based on different production and process parameters, thereby minimizing the risk of threshold violations (predictive analytics; implementation of the environmental program/do). Table 5 summarizes the key characteristics of use case 3.
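To make the idea of forecasting emission levels from process parameters more concrete, the following minimal sketch fits an ordinary least-squares line between a single process parameter and a measured emission level and checks a planned operating point against a limit. This is an illustrative assumption, not the method of any interviewed company: the parameter (throughput), the units, the limit value, and all readings are hypothetical, and real processes would usually need several parameters and a validated model.

```python
from typing import List, Tuple


def fit_line(x: List[float], y: List[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def forecast_emission(slope: float, intercept: float, parameter: float) -> float:
    """Predicted emission level for a planned process parameter value."""
    return slope * parameter + intercept


# Hypothetical historical pairs: throughput (units/h) and emission level (mg/m3).
throughput = [10.0, 20.0, 30.0, 40.0]
emission = [12.0, 22.0, 32.0, 42.0]

slope, intercept = fit_line(throughput, emission)

# Hypothetical limit value and planned operating point.
limit = 50.0
planned_throughput = 55.0
predicted = forecast_emission(slope, intercept, planned_throughput)
exceeds_limit = predicted > limit

print(f"predicted emission: {predicted:.1f} mg/m3, exceeds limit: {exceeds_limit}")
```

A forecast of this kind only flags a possible threshold violation in advance; whether the fitted relation holds outside the observed range would have to be checked against further measurements.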
Use case 3 was evaluated most critically by the interviewed data analytics experts. According to them, it offers the least benefit, relevant data is hardly available (while greater effort could be necessary to protect it), and potential solutions will be more difficult to implement than for all other use cases. According to Int. 9, it will also be very challenging to achieve precise data recording along the entire supply chain, and the expected benefit, which will be relevant mainly in the business-to-consumer sector, largely depends on the actual CO2 performance of the respective company. Int. 9 also claims that easy-to-use, off-the-shelf solutions for the detection and calculation of emissions, which are a prerequisite for companies applying them on a grand scale, are currently still missing. Partially for these reasons, Int. 10 claims that other digital technologies could be more beneficial than big data for this use case.

Water consumption
Water is very important in many production processes, for instance, as a cleaning, cooling, transport, or storage medium, leaving the company as waste water in material and/or thermal form. Water management consists of the management of input streams (water use and provision) and output streams (waste water and waste water treatment) (Engelfried, 2011).
Improving production processes can help to reduce water consumption (Engelfried, 2011). To determine the optimization potential, a comprehensive data basis is required. Data collection through sensor-based water meters enables accurate real-time water consumption recording (descriptive analytics; regular measurements/check). Continuous data collection is a prerequisite for saving operating costs for manual readings and can facilitate the collaboration with the relevant authorities.
In addition, consumption losses and leakages can be determined more easily through complete data transparency of the supply network (descriptive analytics; derivation of improvement measures/act). Data from pressure and flow sensors in water pipes can be used as input for water consumption forecasts. In combination with effective modeling, certain machines could be operated more resource efficiently (predictive analytics; implementation of the environmental program/do). Table 6 summarizes the key characteristics of use case 4.
The interviewed data analytics experts provided the fewest insights on use case 4. However, the information that was provided suggests that data related to water is the most homogeneous and relatively well available, and that applying big data approaches to these data could be very beneficial. Similar to the energy use case, pattern recognition and monitoring techniques could also be used for predictive maintenance: "Constant monitoring of water use makes visible if there is a problem earlier, where in the past a leakage was discovered after weeks or months" (Int. 10).
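The kind of constant monitoring described by Int. 10 can be sketched as a simple deviation check against recent readings. The example below is a minimal illustration under stated assumptions, not a procedure reported in the interviews: the function `flag_unusual_flow`, the window length, the z-score limit of 3, and the flow values are hypothetical, and a trailing-window z-score is just one of many possible anomaly detection techniques.

```python
from statistics import mean, stdev
from typing import List


def flag_unusual_flow(flow: List[float], window: int, z_limit: float = 3.0) -> List[int]:
    """Return indices of readings that deviate strongly from the trailing window.

    Each reading is compared with the mean and sample standard deviation of
    the `window` readings directly before it.
    """
    flagged: List[int] = []
    for i in range(window, len(flow)):
        history = flow[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Perfectly constant history: any change counts as unusual.
            if flow[i] != mu:
                flagged.append(i)
            continue
        if abs(flow[i] - mu) / sigma > z_limit:
            flagged.append(i)
    return flagged


# Hypothetical flow readings (m3/h); the jump at index 6 mimics a leak starting.
readings = [5.0, 5.2, 4.9, 5.1, 5.0, 5.1, 9.0, 9.2]
unusual = flag_unusual_flow(readings, window=5)
print(unusual)
```

Note that this check reacts to the onset of a change: once elevated readings enter the trailing window, they raise the baseline, so a persistent leak would need an additional comparison against a longer-term reference.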

TABLE 6 Use case 4

Phases
Plan, Do, Check, Act

Data
Substance and energy flow data, production and process data

TABLE 7 Use case 5

Phases
Plan, Do, Check

Goals
Optimization of waste management

Data
Substance and energy flow data, production and process data

Waste management
Nowadays, conservation of resources and recycling of used materials are at the center of waste management. Therefore, holistic product responsibility of companies is required throughout the entire life cycle. The integration of an environmentally friendly product design with low-waste process orientation is essential in this regard (Förtsch & Meinholz, 2014). Specific goals in the context of waste management may include an increase in material efficiency, demand-driven production, and optimized separation of waste fractions for further mono-material treatment (Engelfried, 2011).
Big data can be a means to detect trends and make predictions on the basis of analyzing current and historical data. Waste data can also be used to derive conclusions about the performance of production processes (descriptive analytics; regular measurements/check).
As an example, sensor-based monitoring systems in waste containers could help to determine fill levels and temperatures automatically and in real time. Through the collection and analysis of this data, in combination with data on emptying intervals as well as production and process data, forecasts could be made about the expected amount of waste, allowing for a more efficient management of disposal measures. Therefore, not only environmental (reduction of waste) but also economic benefits (lower disposal costs) could be obtained through big data (predictive analytics; implementation of the environmental program/do). Table 7 summarizes the key characteristics of use case 5.
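A forecast based on fill-level readings can be sketched very simply by extrapolating the average fill rate since the last emptying. The following example is a deliberately naive illustration: the function `hours_until_full`, the constant-rate assumption, and the readings are hypothetical, and a realistic forecast would also draw on the production and process data mentioned above.

```python
from typing import List, Optional


def hours_until_full(
    hours: List[float], fill_percent: List[float], full_at: float = 100.0
) -> Optional[float]:
    """Estimate the remaining hours until a container reaches full_at percent.

    Assumes a constant fill rate, computed from the first and last reading.
    Returns None if the fill level is not rising.
    """
    elapsed = hours[-1] - hours[0]
    if elapsed <= 0:
        return None
    rate = (fill_percent[-1] - fill_percent[0]) / elapsed  # percent per hour
    if rate <= 0:
        return None
    remaining = full_at - fill_percent[-1]
    # A container that is already full needs emptying now.
    return max(0.0, remaining / rate)


# Hypothetical readings since the last emptying: time (h) and fill level (%).
reading_hours = [0.0, 6.0, 12.0, 18.0]
reading_fill = [10.0, 25.0, 40.0, 55.0]

remaining_hours = hours_until_full(reading_hours, reading_fill)
print(remaining_hours)
```

With an estimate like this, emptying could be scheduled on demand rather than at fixed intervals, which is the efficiency gain the paragraph above refers to.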
Use case 5 was considered the easiest to implement by the interviewed data analytics experts, but data availability is mediocre and data formats are rather heterogeneous. Int. 10 sees "application potential for big data and machine learning in discovering patterns that can lead to reduced wastage in production systems," where the prediction of waste streams is less important than their overall reduction. For Int. 9, use case 5 is "most often requested by producing companies" and therefore the "most important use case of all," especially when analyzing the wastage for every process step in manufacturing.

Limitations of corporate environmental management
The interviews also revealed a number of limitations of the potential use of big data in corporate environmental management that will be presented and discussed in this section.

Requirements for data analysis in the context of corporate environmental management
Collecting and analyzing large amounts of data is not relevant for the tasks and processes that are part of corporate environmental management as it is currently defined. For this reason, according to Int. 1 a big data use case would generally not be compatible with standard corporate environmental management. The permanent collection of data in real time, such as the quality of water, is neither required from the regulatory side as only singular samples need to be taken, nor does the need for such monitoring exist internally (Int. 1). Likewise, the collection and analysis of company external data would not be suitable for corporate environmental management due to a lack of demand (Int. 1).
In general, the intersection of big data and corporate environmental management, as defined by ISO 14001 and EMAS, does not currently exist.
The operational procedures and responsibilities of environmental management officers have no overlap with data analysis in the context of big data (Int. 1). To create overlap between both areas, the scope of the term "environmental management" would need to be expanded and the focus of the study on the operational aspect would need to be broadened. However, as the environmental experts as well as the data analytics experts see great potential especially in use cases 1, 2, and 5, such an expansion could prove to be beneficial.

Scope of corporate environmental management
Big data is mainly an important aspect in the development of new business models or in product development, to which corporate environmental management might possibly have a connection. However, this would no longer be part of the actual scope of corporate environmental management since the products are used outside the company (Int. 1). The analysis of the data confirms this assessment since some of the mentioned potential big data use cases go beyond the actual scope of corporate environmental management. LCAs as an example are not a genuine task of corporate environmental management although ISO 14001 and EMAS require product life-cycle-specific analyses. This task can only be performed together with other departments and external partners.

Overlap with other management systems (e.g., DIN EN ISO 50001, 2018)
This distinction is particularly evident in the environmental aspect of "energy." Here, the only concrete implementation project for a big data use case was identified. Frequently mentioned goals for corporate environmental management included a more exact measurement of energy consumption and the increase in energy efficiency (Int. 2, 5, 6). The analysis of energy data was among the most important data analysis tasks for the interviewees

CONCLUSION AND OUTLOOK
Within this paper potential big data use cases for corporate environmental management were identified taking the automotive industry in Germany as an example. The main purpose of the paper was to gain an insight into which areas of corporate environmental management can potentially be supported by big data and which concrete goals can be derived from them. A further aim was to determine what kind of data needs to be analyzed using which category of big data analytics to achieve this objective.
To address these central research goals, expert interviews were conducted with environmental management officers of the German automotive industry, which, in turn, were evaluated on the basis of a qualitative content analysis. As a result of the content analysis, five potential big data use cases for corporate environmental management have been identified:
• Improved creation of LCAs
• Measurement of energy consumption and increase in energy efficiency
• Measurement and reduction of emissions
• Measurement and reduction of water consumption
• Improved waste management
Use case 2 (measurement of energy consumption and increase in energy efficiency) is the only big data use case that is already being implemented as a concrete project in one of the studied automotive companies. All other use cases have been named by the interviewed experts, but are neither implemented nor currently planned in any of the studied automotive companies. This impression was put into perspective by the interviews of the data analytics experts, who had already implemented several of the use cases in parts (Int. 7: use cases 1 and 2, Int. 9: use cases 2 and 5, Int. 10: use cases 2, 4, and 5) for companies.
The results show that there are opportunities to integrate big data analytics, especially in the phases "Plan" and "Check." In the "Plan" phase, the determination of environmental aspects (energy, emissions, water, and waste) was defined as a specific objective. Data analytics experts also emphasized the potentials through the combination of big data with artificial intelligence and text mining techniques for dealing with regulatory frameworks. In the "Check" phase, the potential of big data was identified in the form of continuous measurements and derived variance analyses.
In the area of big data analytics, the category "descriptive analytics" was identified as the most important category for corporate environmental management. Descriptive analytics can support the analysis, monitoring, and control of the collected environmental data.
Furthermore, the results show that predictive analytics provide an opportunity to analyze legacy data in order to predict future trends, for instance, on expected energy and water consumption, and thereby identify overall efficiency potentials.
With regard to the type of data that could be used for big data analysis in the context of corporate environmental management, substance and energy flow data (energy, emissions, waste, and water) were considered to be highly relevant. The gathering of production and process data is also central to the identified use cases as the analysis of this data in combination with material and energy flow data is crucial to measure environmental aspects and derive measures for improvement. Two concrete measures in the context of predictive maintenance were suggested by the data analytics experts, where unusual consumption patterns could be early indicators for machine failure (energy) or pipe leakages (water).
Another finding of this work is the difficulty of embedding life cycle analysis into the scope of corporate environmental management. A comprehensive life cycle analysis or LCA has not yet been carried out by any of the studied automotive companies. In this regard, the use of big data can provide considerable potential not only to gather and analyze product related data over an entire life cycle, but also to optimize its ecological performance. Some of the data analytics experts offered a more critical perspective regarding the adequateness and feasibility of big data approaches for LCA though.
The study also revealed that the identified big data use cases cannot be exclusively assigned to the original tasks and responsibilities of corporate environmental management. Instead, there are overlaps with other areas of activity and business sectors such as energy management or product development. Besides, our results confirm the assumption that the lack of exemplary use cases inhibits the implementation of big data projects in companies. As the benefits of big data for environmental protection are currently not seen, no big data projects are initiated in this area (Int. 2).
The validity of the presented study has limitations with regard to the gathering of data, resulting from the small number of interviews and the restriction to exclusively address one specific industry in Germany. The experts interviewed were selected on the basis of their expertise, but cannot be regarded as a representative cross-section of their professional group. The combination of these factors is limiting the overall reliability of the study. For this reason, the results of this study are not representative and provide only a small selection of potential use cases.
Additionally, as mentioned in Section 1, the analysis focuses exclusively on the topic of long-term ecological sustainability. Sustainability in the sense of the Brundtland Report, however, identifies a much wider field of topics that, in their entirety, contribute to sustainable development.
Therefore, it must be considered an additional limitation of this study that it addresses only a few SDGs and neglects social aspects almost entirely.
Against this background, findings from this paper can only be integrated into broader concepts of sustainable development to a limited extent. On the one hand, the identified use cases do clearly address the goals stated in the Brundtland Report that industrial companies should consume fewer natural resources, less water, and less energy, and produce less waste. On the other hand, if all identified use cases lead to increased efficiency within companies, that might lead to a higher overall consumption of resources through rebound effects (Gillingham, Rapson, & Wagner, 2015). Therefore, as with most studies on improving corporate performance, it should be carefully examined to what extent the presented findings may lead to such unintended side effects and how they can be embedded in a broader concept of sustainable development.
So far, there has been no research focusing specifically on the identification and analysis of big data use cases for corporate environmental management. The results of this work thus contribute to closing this existing research gap. According to Int. 10 this is of utmost importance, as "there can be no sensible environmental management without a sensible data basis." The paper also provides companies with a first orientation on which potential benefits can be created for corporate environmental management through the use of big data. Future research should enhance the findings of this study by investigating a bigger sample of companies, also from other industries and countries, to eventually identify additional potential use cases and analyze them in more detail.

FUNDING INFORMATION
Grischa Beier's contribution to this work was supported by the Junior Research Group "ProMUT" (grant number: 01UU1705A) which is funded by the German Federal Ministry of Education and Research as part of its funding initiative "Social-Ecological Research." Open access funding enabled and organized by Projekt DEAL.