1 Introduction

In 2020, the number of sensor devices was reported to exceed the global Information and Communication Technology (ICT) fleet by an approximate factor of 1.2 [1], and by the end of 2023, 1.8 IoT-based internet connections are expected for each member of the global population [2]. As similar estimates project a steady, or even exponential, growth of connected devices, researchers have started to wonder what the ecological cost of using IoT systems will be.

The advances in transistor scaling and energy-efficient systems over the last decades make optimistic scenarios conceivable. For example, recent projections show that the operational energy footprint share of specialized electronic components inside IoT devices will decrease to 0.01% by the end of 2025 [3], and estimations report a low increase of only 6% in the total electricity consumption of IoT-oriented data centers over a period of 8 years [4]. However, as more researchers foresee the limits of our current technologies and progressively embrace a new beginning [5, 6], clearly steeped in massive data and pervasive computing, the scientific community is becoming more prudent with its projections. For instance, Koot and Wijnhoven [7] explain that the electricity required to power data centers for Industrial IoT (IIoT) sensor data alone in 2030 could amount to 364 TWh (assuming endless transistor scaling), but they also clarify that it could go up to 752 TWh, assuming a progressive decay of Moore’s law.

In this sense, more and more scientists advocate vigilance regarding a potential reallocation of computing intensity (from end-devices to internet networks and cloud servers) [8, 9], and increasingly recommend including mutualized infrastructures to evaluate appropriately the environmental damage incurred from using IoT systems [10,11,12]. Unfortunately, estimating the impact of full systems is challenging and the literature offers inconclusive results. Moreover, careless design related to the usage [13] or volume of sensor data and information [14] is often observed.

In fact, although the potential growth of data centers and communication networks due to the IoT boom has already been flagged [15], the correct dimensioning of sensor data recommended [16], and appropriate design promoted [3], little is said about sensor data and its flow within the Life Cycle Assessment (LCA) and eco design communities. We presume that one possible reason for this disregard is that, in general, reference flows of IoT systems tend to be modeled exclusively on the basis of energy and local equipment.

According to the standard ISO 14040 [17], the reference flow is the quantity of materials, energy (e.g., electricity), or even additional subproducts and supplies needed to fulfill a functional unit as it is expressed. The functional unit, in turn, is the quantified performance of the function a product performs. The two are interrelated concepts, essential for environmental assessment and eco design. As the basic functional unit of IoT systems is providing meaningful information to humans and/or machines in an autonomous way, we believe that an IoT system can have different environmental impacts from different reference flows (e.g., different sensor systems, edge devices, mutualized infrastructures or even different supplies), depending on the unique way in which it collects and transforms raw data and sends information, under different contexts.

This work develops this approach to highlight its relevance for comprehensive and practical impact estimation and eco design of the Internet of Things. It employs a previous framework based on function-capacities and data flow, and it extends an initial design standpoint outlined in [18]. The rest of this article has the following structure. Section 2 reviews the impact assessment and eco-design literature of IoT systems in the context of sensor data and reference flows, so that a posture around a data-driven approach for impact estimation and eco-design can be postulated. From this, the estimation of the reference flow and associated impact of a case study is provided in Sect. 3 (according to a theoretical and empirical dissection of its data flow). In Sect. 4, a discussion introduces our contributions oriented to IoT designers and LCA researchers and practitioners. These contributions include:

  • An instance of a comprehensive impact estimation procedure and sharp LCA results

  • An instance of an advanced redesign process

  • A number of practical guidelines for impact estimation and eco design

Section 5 presents our conclusions and introduces our parallel work in progress.

2 Related work

Estimating the environmental impact of using full IoT systems is challenging. The reported proliferation and complexification of IoT networks and devices due to complex data and applications [19], together with the lack of information regarding the reference flows within mutualized infrastructures (i.e., electricity consumption), make it difficult for researchers to provide comprehensive studies, limiting their efforts to partial IoT systems, as seen in several works [10, 12, 20,21,22,23,24,25,26,27]. According to Malmodin et al. [28], scientists interested in extending the scope of the analysis adopt two perspectives for modeling reference flows and estimating the environmental impact of mutualized infrastructures: a top-down approach (based on the dissection of large-scale economic and environmental data that can later be allocated to a specific ICT sector [29]); and a bottom-up approach (based on the LCA dissection of specific ICT products [30]). In the IoT field, the few authors working on the basis of the top-down approach (such as Lelah et al. [16]) usually use data from telecom operators to allocate the electricity consumption of internet & access networks. On the other hand, authors working on the basis of the bottom-up approach [28, 31,32,33,34,35,36,37] usually estimate the reference flow in cloud servers and internet & access networks according to an electricity consumption per data unit ratio (e.g., kWh per Gigabyte generated).

Unfortunately, inconclusive LCA results emanate from the reference flows obtained by both communities. For example, Ingemarsdotter et al. [36] (with a bottom-up posture) and Lelah et al. [16] (with a top-down posture) suggest that the environmental impact contribution of mutualized infrastructures is not significant, but Dekoninck and Barbaccia [11] report augmented impact contributions of internet networks. What is more, Lelah et al. [16] note that the impact contribution of the telecom infrastructure of their case study could be relevant if an hourly frequency of transmissions between local and mutualized infrastructures were adopted; and Ingemarsdotter et al. [36] warn that their conclusions could be different if other data-intensive scenarios in smart maintenance were considered.

Regarding eco design, there is also solid evidence of the pivotal role of sensor data for the sustainable conception of IoT systems. Lelah et al. [16] show that an increase in the circulation of data between sensor nodes and gateways provokes significant modifications not only in the reference flow of mutualized infrastructures, but also in the reference flow of local equipment (i.e., scaling up photovoltaic cells, accumulators and batteries); and Köhler et al. [22] demonstrated, for their part, that reducing the sensing spatial resolution of a textile-based sensor network decreases its reference flow per square meter in both sensor modules and energy consumption. What is more, they showed that more than 98% of power dissipation can be avoided simply by switching off radio receivers, putting microcontrollers (MCUs) in sleep mode (when possible) and reducing the sampling rate of sensors from 10 to 2 Hz (limiting the data resolution to what is strictly necessary).

Indeed, correctly dimensioning sensor data is crucial for sober design, but other related aspects, such as determining how information is transmitted, are fundamental too. In their study of the eco design of wireless sensor networks, Bonvoisin et al. [10] show that lowering the listening sensitivity of sensor nodes to what is strictly necessary (by adjusting the probability of successful reception of messages to 95%) reduces their energy needs by 10%, but also provokes more replacements of edge devices (embedded, battery-powered repeaters), because they spend more energy to maintain poorer communication links.

Like Köhler and Bonvoisin, some researchers focused on eco design have started to consider sensor data but, in general, the instruments they propose suffer from a lack of consensus. Indeed, although some complementarity and convergence can be found in guidelines, contradictory positions can also be found. For example, Bonvoisin et al. [20] recommend starting the eco design process of Wireless Sensor Networks (WSN) by reflecting on the essential information smart applications need, and Arshad et al. [38] suggest using “selective sensing”, that is, collecting the minimum data required in particular situations. Moreover, the guideline “Find the device coverage which minimize the number of devices deployed” suggested by Bonvoisin et al. [20] complements the posture of Huang et al. [39] and the recommendations of Arshad et al. [38], who advise reducing “the network size by efficient placement of sensor systems or ingenious routing mechanism”. However, Bonvoisin et al. [10] distance themselves from Huang et al. [39] by advocating for reducing the entire local equipment through the analysis of data flow and information (whereas the latter highlight the relevance of increasing the number of repeaters and communication ranges to extend the lifetime of IoT networks).

Besides, other authors put excessive focus on the manufacturing and energy consumption of end-devices, and IoT standards propose recommendations that are impractical in constrained contexts. For example, Gurova et al. [40] focus their guidelines on raw materials and low energy consumption of smart wearables; and international standards [41,42,43,44,45] are heavily oriented to technical design, which usually neglects the reality of today’s highly constrained sensor systems (for example, the recommendation “an IoT device should include a processor with at least two cores of 1 Ghz processing speed” included in the ITU-T L.1370 [45] standard for smart buildings simply neglects the energy-constrained nature of self-powered devices in modern buildings).

In this difficult context, we retain from the literature the fundamental role of sensor data for environmental assessment and eco design, and we apply it to a unique posture focused on the sensor data and information that flow through different devices (D) and electronic components (C), executing specific data-functions with defined capacities (CF) (Fig. 1). By adopting this approach, we believe that appropriate modeling of reference flows, sharp impact estimations and effective eco design of IoT systems can be achieved.

Fig. 1 Proposed framework for comprehensive impact estimation and effective eco design of IoT Systems [18]

Our posture is based on the premise that every electronic component contributes to both the functional unit (i.e., providing meaningful information to humans and/or machines in an autonomous way) and the environmental impact of IoT systems. Previous literature appears to support this hypothesis. For example, Morin et al. [46] have already suggested that the lifetime of IoT devices (in terms of the depletion rate of batteries) is drastically conditioned not only by the energy required in transmission and receiving functions, but also by a combination of additional factors linked to data manipulation and quality (including the data size required by the application, the data rate, and even the distance range at which wireless components operate in relation to other devices). Samie et al. [47], for their part, appear to align with the idea that eco designing full IoT systems requires careful attention to the technology selection of electronic components (on the basis of data signals, rates, Tx and Rx power consumption patterns or distance range).

To illustrate our posture and demonstrate its relevance in the context of IoT systems, the next section estimates, theoretically and empirically, the reference flow and the environmental impact of a case study from an analysis of its inner data flow, by implementing our proposed framework (Fig. 1) in a cross-typed lifecycle model of specific electronic components.

3 Research methodology

Our research methodology is based on a case study and is organized in three parts. In the first part we provide a detailed description of our case study, which was designed by a third party and selected randomly from the market. In this part we also present a customized implementation of our proposed framework oriented to estimate the reference flow of our case study from its data flow, functions, key electronic components and capacities (here focused on the use phase, with a major focus on LoRa and a minor focus on WiFi technologies). This allows the reference flow and the long-term impact of our IoT system to be estimated theoretically, by adopting an unfavorable data-intensive scenario (in the second part), and empirically, by a packet traffic analysis (in the third part). Key findings from this section are discussed so that our main contributions can be presented in Sect. 4.

3.1 Case study

Our case study consists of an IoT system oriented to track water consumption in domestic environments. Its local equipment is composed of a sensor system [48] (equipped with an induction emitter [49]) and a Long Range (LoRa)-Internet gateway [50].

The induction emitter (Fig. 2) is a separate sensor device powered by one 2/3AA-sized Li-ion battery. It is wired to the sensor system and generates electronic pulses by means of a bank of low-voltage capacitors and inductors found on both the front and the back side of its electronic card (approximately located within the red frame in Fig. 2b).

Fig. 2 a The induction emitter; b front side of its electronic card

The sensor system (Fig. 3) is a 9 V-battery-powered device that has to be located no more than 30 m from the induction emitter (according to the manufacturer). It is equipped with an RN2483 LoRa module [51] (shown in the bottom red frame in Fig. 3b) and a YWW-BLEMOD Bluetooth module [52] (shown in the top red frame in Fig. 3b). Contrary to the preliminary assumptions presented in our previous work [18], its electronic card lacks MCU and memory components. Because of that, we presume that the device uses at least the embedded microprocessor of the System-on-Chip (SoC) subcomponent of its LoRa module to manage processing and transmission tasks, and at least its embedded memory to keep transitory data and metering configuration settings.

Fig. 3 a The sensor system; b front side of its electronic card

The gateway (Fig. 4) is a device powered by the electric grid and plays the role of an edge device in our IoT system. It is equipped with the same LoRa and Bluetooth modules as the sensor system (Fig. 4a), a WiFi ESP8266EX module [53] (shown in the left red frame in Fig. 4b), an ARM microcontroller and a Flash memory (shown in the right red frames in Fig. 4b). According to the manufacturer, this device must be placed no more than 800 m from the sensor system.

Fig. 4 a The front side of the electronic card of the gateway; b its back side

The most basic functioning of the IoT system is shown in Fig. 5 below (illustrated in terms of our proposed framework recalled in Fig. 1). For simplicity, in this work we analyze the minimal network deployment for covering smart metering in an area of 1 km² (according to the maximum distance range of 800 m between the flowmeter and the internet gateway, recommended by the maker). This includes one inductive pulse emitter, one flowmeter, one gateway, one smartphone and one IAP device.

Fig. 5 Basic deployment of the IoT system

The induction emitter is attached to a conventional jet meter and sends electronic pulses to the sensor system whenever the spinning disk of the jet meter indicates water consumption. In this sense, the sensor system plays the role of a flowmeter, tracking the pulses generated by the induction emitter. Periodically, the flowmeter communicates with the gateway wirelessly using LoRa transmissions. For its part, the gateway communicates with the cloud server through an Internet Access Point (IAP) device (e.g., an internet modem) by its WiFi module.

On the other hand, a smartphone can establish Bluetooth connections with the flowmeter and the gateway to either set or modify their initial configurations (e.g., security settings or sampling rate accuracy) or consult water consumption locally (consultations and metering configuration changes can also be made online, at mySolem.com).

The flowmeter transforms the tracked pulses (raw data) into information (count of pulses), and sends this information to the gateway every 3 min (according to technical documentation provided by the maker). Because the flowmeter records the counts every 15 min (as declared by the manufacturer), it is believed that the gateway accumulates the periodic LoRa packets transmitted by the flowmeter and updates the cloud server every 15 min, to keep the water consumption statistics synchronized between the local and cloud equipment.

Bearing in mind all these aspects and based on technical documentation, Fig. 6 presents a customized implementation of our framework, which is oriented to estimate the data flow within the IoT system of our case study (starting from the maximal capacities of key components), and from that, the reference flow and impact of its use phase in the long term, assuming an unfavorable data flow scenario. The functional unit that leads this analysis is defined as “facilitating the hourly monitoring of water consumption of an area of 1 km², during 2 years”. To obtain the embodied global warming damage, environmental data from the CML-IA 2001 Life Cycle Impact Assessment (LCIA) method was used.

Fig. 6 Implementation of a cross-typed lifecycle model for the case study (focused on its use phase)

Because this work is focused on the use phase of our case study, the implementation presents all the physical elements that allow the functioning of the local equipment (i.e., the dotted elements), but their respective manufacturing and disposal life cycle phases are not taken into account. Also, notice that a battery can be seen as a component itself (power unit) and, at the same time, as a part of the reference flow (whose number varies according to the operational context of the system).

Double-headed red arrows in Fig. 6 indicate data application flow or additional traffic related to internet protocol mechanisms. We treat the IAP device as an element of the internet infrastructure (access network).Footnote 1

To define the maximal capacities of the system and establish an unfavorable data flow scenario for our analysis, in this work we focus on some features of the LoRa and WiFi technologies. LoRa is a modulation technique optimized for long-range, low-power-consumption communications in IoT environments [54]. In order to cover long distances and keep high Quality of Service (QoS), LoRa uses an Adaptive Datarate Routine (ADR) that regulates the bitrate at which data is encoded in a determined bandwidth, according to a Spreading Factor (SF). The spreading factor is a parameter that determines the number of bits transmitted per LoRa symbol and it is inversely proportional to the bitrate.

When the distance range between two devices increases, the established link in LoRa communications degrades. To assure quality in transmissions, the ADR mechanism regulates the spreading factor to high values, which lowers the data encoded per second (the bitrate reduces), extends the time-on-air of a data sequence and consequently, prolongs the transmission states of transceivers.
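The dependence of the bitrate on the spreading factor can be sketched with the nominal LoRa bitrate relation, bitrate = SF × (BW / 2^SF) × CR. The snippet below is only an illustration under stated assumptions: the coding rate of 4/5 is not given in this work, and the function name is ours. Under that assumption, SF7 at 125 kHz yields roughly 5469 bps, in line with the 5470 bps used in Sect. 3.2.

```python
def lora_bitrate_bps(spreading_factor: int, bandwidth_hz: float,
                     coding_rate: float = 4 / 5) -> float:
    """Nominal LoRa bitrate in bits per second.

    coding_rate = 4/5 is an assumption; the source does not state it.
    """
    # Each symbol lasts 2**SF chips, sent at one chip per Hz of bandwidth.
    symbol_rate = bandwidth_hz / (2 ** spreading_factor)
    # Each symbol carries SF raw bits; the coding rate removes redundancy bits.
    return spreading_factor * symbol_rate * coding_rate


if __name__ == "__main__":
    # A higher spreading factor lowers the bitrate and lengthens time-on-air.
    for sf in range(7, 13):
        print(f"SF{sf}: {lora_bitrate_bps(sf, 125_000):.0f} bps")
```

Each step up in spreading factor roughly halves the bitrate, which is why links stretched over longer distances keep the transceiver in transmission state for longer.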

Table 1 gives the optimal data application size (data payload) that a LoRa packet can carry efficiently in different bandwidths and distance ranges, according to given Spreading Factors (the affected bitrates are also shown).

Table 1 Data application size (payload) that a LoRa packet can carry according to different parameters

In this sense, to construct an unfavorable data flow scenario for our analysis, we assume that the data application size of the case study is equal to the maximal data payload allowed within a bandwidth of 125 kHz and a spreading factor of 7 (242 Bytes). This corresponds to a distance range of 2 km, which is close to the maximal distance range recommended by the manufacturer (800 m). We also assume that no data reduction task is applied in the gateway (the accumulated counts are simply resent to the cloud server).

3.2 Theoretical estimations

This section models the reference flow of our case study under the unfavorable data flow scenario suggested above, and based on available information provided by the manufacturer. The sampling rate of the induction emitter is set to 10 pulses per second. Under this configuration, the lifetime of its battery is approximately 10 years (according to the manufacturer). Consequently, the reference flow of using this device in normal conditions for 2 years (as defined by the functional unit) is only one battery, and its associated impact is 3.82 × 10⁻² kg CO2-eq. This damage corresponds to the impact of producing a single one-dry-cell Li-ion battery that weighs 5.65 g.

On the other hand, the reference flow of the flowmeter (in terms of the number of batteries (\({B}_{fm}\)) required during 2 years) is given by the following equation.

$${B}_{fm}=TT{E}_{LoRa}\left({BD}_{{T}_{x}}\times {t}_{{T}_{x}}+{BD}_{s} \times {t}_{s}\right) \tag{1}$$

where \(TT{E}_{LoRa}\) = total LoRa transmission events in 2 years (350,333); \({t}_{{T}_{x}}\) = time elapsed in transmission state during a LoRa transmission event; \({t}_{s}\) = time elapsed in sleeping state during a LoRa transmission event; \({BD}_{{T}_{x}}\) = battery depletion factor in transmission mode (1 battery per 25.6 h); \({BD}_{s}\) = battery depletion factor in sleep mode (1 battery per 712,500 h).

Below, we present the main assumptions and calculations to obtain every component of Eq. 1.

Firstly, we assumed that the energy consumption for simply counting the electrical pulses depends on the current consumption of the sleep mode and the time elapsed in this state (\({t}_{s}\)); and for transmitting the counts, on the current consumption of the transmitting mode and the time elapsed in this state (\({t}_{{T}_{x}}\)), as shown in Fig. 7.

Fig. 7 Energy consumption pattern of the LoRa module of the flowmeter based on its current consumption (not to scale)

Secondly, the time elapsed in the transmission state in a LoRa transmission event is the quotient between the final size of a LoRa packet (\(P{S}_{LoRa}\)) and the bitrate capacity (\(B{R}_{SF}\)) of the LoRa module, which is determined by a SF of 7 in a bandwidth of 125 kHz (Eq. 2).

$${t}_{{T}_{x}}=\frac{P{S}_{LoRa}}{B{R}_{SF}} \tag{2}$$

where \(P{S}_{LoRa}\) = final size of a LoRa packet (2256 bits) and \(B{R}_{SF}\) = bitrate capacity under SF7 and a 125 kHz bandwidth (5470 bps).

Thirdly, according to technical data and the established unfavorable data flow scenario, we assume that the microprocessor of the flowmeter’s LoRa module collects and counts the electrical pulses generated by the induction emitter during 3 min (180 s), creating a maximal payload of 242 bytes. To send this data payload, the LoRa module adds headers and footers to it, forming full LoRa packets of 282 bytes. Thus, by considering this packet size in bits (2256 bits), the time elapsed in transmission mode (\({t}_{{T}_{x}}\)) in a transmission event is 0.41 s (or 1.14 × 10⁻⁴ h), and the time elapsed in sleeping mode (\({t}_{s}\)) is 179.59 s (or 4.98 × 10⁻² h).

Fourthly, if the flowmeter had to transmit continuously, the battery depletion factor for the transmission mode (\({BD}_{{T}_{x}}\)) would be one battery per 25.6 h. This duration is the quotient between the nominal capacity of a 9 V PP3-type battery (1200 mAh) and the current consumption of the flowmeter’s LoRa module in transmission mode (44.5 mA), multiplied by an output current performance rate of the battery of 95%. Similarly, if the flowmeter had to sleep continuously, the battery depletion factor for sleeping mode (\({BD}_{s}\)) would be one battery per 712,500 h (considering the same nominal capacity of the battery and a current consumption of the flowmeter’s LoRa module in sleeping mode of 1.6 × 10⁻³ mA).

Thus, by considering all these aspects in Eq. 1, and by knowing that the total LoRa transmission events in 2 years amount to 350,333 (one every 3 min), the reference flow for the flowmeter is two batteries.Footnote 2 This generates a damage of 0.458 kg CO2-eq, which corresponds to the impact of producing two Li-ion 9 V PP3-type batteries of six dry cells that weigh 33.9 g.
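The chain of calculations behind Eqs. 1 and 2 can be retraced with a short script. This is a minimal sketch, not the authors' tooling: the function names are ours, and the depletion factors are expressed as batteries per hour (the reciprocal of the battery lifetime under a constant current), which is our reading of Eq. 1. All numeric inputs are the values quoted in the text.

```python
import math


def time_on_air_s(packet_bits: int, bitrate_bps: float) -> float:
    """Eq. 2: time spent in transmission state per LoRa event, in seconds."""
    return packet_bits / bitrate_bps


def battery_life_h(capacity_mah: float, current_ma: float,
                   performance: float = 0.95) -> float:
    """Hours one battery lasts at a constant current draw."""
    return capacity_mah / current_ma * performance


def batteries_needed(events: int, period_s: float, packet_bits: int,
                     bitrate_bps: float, capacity_mah: float,
                     tx_ma: float, sleep_ma: float) -> float:
    """Eq. 1: fractional number of batteries consumed over all events."""
    t_tx_h = time_on_air_s(packet_bits, bitrate_bps) / 3600
    t_s_h = period_s / 3600 - t_tx_h           # rest of the period is sleep
    bd_tx = 1 / battery_life_h(capacity_mah, tx_ma)     # batteries per hour
    bd_s = 1 / battery_life_h(capacity_mah, sleep_ma)   # batteries per hour
    return events * (bd_tx * t_tx_h + bd_s * t_s_h)


# Values from the text: 350,333 events, one every 180 s, 282-byte packets,
# 5470 bps, 1200 mAh battery, 44.5 mA transmitting, 1.6e-3 mA sleeping.
fraction = batteries_needed(350_333, 180, 282 * 8, 5470, 1200, 44.5, 1.6e-3)
print(f"{fraction:.2f} batteries, rounded up to {math.ceil(fraction)}")
```

The fractional result is about 1.6, which rounds up to the two batteries reported above, since batteries can only be replaced in whole units.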

Regarding the gateway, it is believed that this device (1) accumulates the LoRa packets generated by the flowmeter during 15 min (which generates a data payload of 1410 bytes), and (2) transmits these data to the cloud server by its WiFi module, in a Hypertext Transfer Protocol (HTTP) POST request (which generates a packet of 1480 bytes). To do all this, the device requires 6 × 10⁻³ kW (according to technical documentation), which means a total electricity consumption of 105.12 kWh during 2 years. This generates an impact of 5.15 kg CO2-eq, according to the CML-IA LCIA methodology and assuming a French electricity mix.
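The two gateway figures above follow from simple arithmetic, sketched below. The constant names are ours, and the script assumes two 365-day years of continuous operation at the rated power, and that the 1410-byte payload is made of the full 282-byte LoRa packets received in each 15-min window.

```python
LORA_PACKET_BYTES = 282      # full LoRa packet, headers and footers included
LORA_PERIOD_MIN = 3          # one LoRa transmission every 3 min
UPLOAD_PERIOD_MIN = 15       # one cloud update every 15 min
GATEWAY_POWER_KW = 6e-3      # rated power from the technical documentation
HOURS_IN_TWO_YEARS = 2 * 365 * 24  # assumption: non-leap years

# Payload accumulated by the gateway between two cloud updates.
packets_per_upload = UPLOAD_PERIOD_MIN // LORA_PERIOD_MIN
payload_bytes = packets_per_upload * LORA_PACKET_BYTES

# Electricity drawn by the gateway over the functional-unit duration.
energy_kwh = GATEWAY_POWER_KW * HOURS_IN_TWO_YEARS

print(payload_bytes, "bytes per upload;", round(energy_kwh, 2), "kWh")
```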

To estimate the reference flow and the impact of the cloud layer, we determine the data traffic from the gateway to the cloud server (regular operational mode) and the data traffic from the smartphone to the cloud server (consultation operational mode).

For the regular operational mode, one considers the total number of transmission events (assumed to be one every 15 min, during 2 years), the final packet size of HTTP POST requests (packets of 1480 bytes generated by the gateway and addressed to the cloud server, which we call “HTTP post GC”), and the additional data generated by the Transmission Control/Internet protocols (TCP/IP) overhead (packets regarding acknowledgements (“TCP ack”), TCP three-way handshake (“TCP ths”) and TCP teardown (“TCP t”) mechanisms). Thus, by considering a total of 70,066 transmission events during 2 years, the total data traffic flowing between the gateway and the cloud server amounts to 0.1331 Gigabytes (GB).

For the consultation operational mode, one considers the total number of transmission events (one every hour, during 2 years), the packet size of HTTP GET requests (assumed to be one byte each), the packet size of HTTP responses (assumed to be one byte each), and the additional data generated by the TCP/IP protocol overhead.

Thus, by considering a total of 17,520 transmission events during 2 years, the total traffic flowing from the smartphone to the cloud server amounts to 1.03 × 10⁻² GB.

From this, the reference flow related to the internet networks and cloud server amounts to 2.15 × 10⁻² kWh and 2.01 × 10⁻² kWh respectively (assuming an electricity intensity factor of 0.15 kWh/GB for the internet infrastructure, which is a median value between those proposed by Malmodin et al. [31] and Krug et al. [55]; and an electricity intensity factor of 0.14 kWh/GB for the cloud server infrastructure, as reported by Andrae and Edler [56]).
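These two reference flows can be reproduced by multiplying the data traffic by each electricity intensity factor. The sketch below assumes that the traffic of both operational modes (0.1331 GB and 1.03 × 10⁻² GB) crosses both the internet networks and the cloud server; this is our reading, chosen because it matches the reported figures. The function and variable names are ours.

```python
def electricity_kwh(traffic_gb: float, intensity_kwh_per_gb: float) -> float:
    """Electricity attributed to a data volume at a given intensity factor."""
    return traffic_gb * intensity_kwh_per_gb


REGULAR_GB = 0.1331        # gateway to cloud server, 2 years
CONSULTATION_GB = 1.03e-2  # smartphone to cloud server, 2 years
NETWORK_KWH_PER_GB = 0.15  # internet infrastructure
CLOUD_KWH_PER_GB = 0.14    # cloud server infrastructure

# Assumption: both traffic streams load the networks and the cloud server.
total_gb = REGULAR_GB + CONSULTATION_GB
network_kwh = electricity_kwh(total_gb, NETWORK_KWH_PER_GB)
cloud_kwh = electricity_kwh(total_gb, CLOUD_KWH_PER_GB)

print(f"networks: {network_kwh:.2e} kWh, cloud: {cloud_kwh:.2e} kWh")
```

Because both flows scale linearly with the data volume, any change in packet size or transmission frequency propagates directly to the mutualized part of the reference flow.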

Thus, the impact generated by the internet networks amounts to 1.7 × 10⁻² kg CO2-eq, and the impact generated by the cloud server to 1.6 × 10⁻² kg CO2-eq (both assuming a global electricity mix, and according to the CML-IA LCIA methodologyFootnote 3).

Finally, the estimation of the reference flow from using the smartphone considers only the energy needed for powering its WiFi module in the transmitting and receiving modes (1.62 × 10⁻³ and 1.375 × 10⁻³ kW respectively, according to Perrucci et al. [57]), multiplied by the total time that the WiFi module works in 2 years (292 h, if one assumes that the user turns on the WiFi function of the phone for one minute every hour to consult his or her water consumption).

Thus, the reference flow of the smartphone for sending consultation requests and receiving responses amounts to 0.5 kWh and 0.423 kWh respectively. Both reference flows add up to an impact of 4.5 × 10⁻² kg CO2-eq, which corresponds to the impact of charging a smartphone dedicated only to the functional unit of this analysis, assuming a French electricity mix.Footnote 4

Table 2 below offers a summary of all these assumptions and calculations oriented to establish the reference flow of the case study in the use phase, which generates a total impact of 5.7235 kg CO2-eq.

Table 2 Summary of the electronic components, functions and capacities considered to estimate theoretically the reference flow and the impact of the case study

3.3 Empirical estimations

This section models the reference flow of the case study from a data traffic analysis. It was conducted by adding two network analyzers (sniffers) in our IoT system, according to the experimental deployment presented in Fig. 8.

Fig. 8 Experimental deployment of the IoT system for the packet traffic analysis

In Fig. 8, a Wireshark-based sniffer (packet sniffer A) equipped with an RTL2832U-based dongle and GNU-radio companion software (running on a Linux PC) is placed to capture the LoRa packet traffic between the flowmeter and the gateway. A customized GNU-radio companion model based on the gr-lora implementation [58] was developed to allow the dongle to intercept LoRa transmissions. The second sniffer (packet sniffer B) is a Wireshark-based sniffer using the wireless Network Interface Controller of a desktop Windows PC (in promiscuous mode). It aims to capture and analyze the WiFi traffic of the system. The distance between the flowmeter and the gateway is 4 m.

The packet traffic analysis was conducted as follows. The network operated for approximately 10 h (from 11 h 40 to 22 h 20). During this period, regular water consumption was emulated by applying compressed air to the jet meter, and hourly consultations were made by accessing the user’s interface in the cloud server via a smartphone equipped with a WiFi module. Sniffers A and B were initialized before starting up the network. To capture the LoRa transmissions, the GNU-Radio companion model considered a spreading factor of 7, a frequency band of 868 MHz listening on three frequency channels, and a bandwidth of 125 kHz. The LoRa transmissions were intercepted by the RTL-SDR bundle and transformed into User Datagram Protocol (UDP) packets, which were redirected to the Wireshark interface through ports 40868, 40869 and 40870 of sniffer A (one for every frequency channel). Specific Wireshark filters were applied to the packet traffic captured and reported by sniffer B.

Figure A2 in the Supplementary Information shows a sample of the packet traffic between the flowmeter and the gateway captured by sniffer A (from 21 h 06 to 22 h 06). Each point represents the aggregated data size of a LoRa transmission at minute resolution (excluding UDP headers). It shows that transmissions do occur every 3 min (with some exceptions, in which packet loss is assumed) and that, apart from some outliers, the typical size of LoRa packets is 57 bytes (as documented in Table A2 of the Supplementary Information).

Using this packet size in Eq. 2, the time elapsed in transmission mode (\({t}_{{T}_{x}}\)) is instead 0.083 s, and the time elapsed in sleeping mode (\({t}_{s}\)) is 179.92 s. In this sense, the flowmeter requires only one battery to operate during 2 years (according to Eq. 1). This causes an impact of 0.229 kg CO2-eq, which corresponds to the impact of producing only one 9 V PP3-type battery (more details about these results are available in Sect. 6 of the Supplementary Material).
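As a check on these figures, Eqs. 1 and 2 can be evaluated with the observed 57-byte packet size, keeping the bitrate, the depletion factors and the number of transmission events from Sect. 3.2 (the intermediate rounding is ours, and the depletion factors are read as one battery per stated number of hours):

$${t}_{{T}_{x}}=\frac{57\times 8\ \text{bits}}{5470\ \text{bps}}\approx 0.083\ \text{s}\approx 2.3\times {10}^{-5}\ \text{h},\qquad {t}_{s}=180-0.083\approx 179.92\ \text{s}\approx 5.0\times {10}^{-2}\ \text{h}$$

$${B}_{fm}\approx 350{,}333\left(\frac{2.3\times {10}^{-5}}{25.6}+\frac{5.0\times {10}^{-2}}{712{,}500}\right)\approx 0.34$$

Since batteries are counted in whole units, this fraction rounds up to the single battery stated above.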

With respect to the data traffic within the internet network and cloud server, Fig. A3 in the Supplementary Information (Sect. 5) shows a refined sample of the WiFi traffic of the system (for the regular and the consultation operational modes) over approximately two minutes (from 21 h 58 m 03 s to 21 h 59 m 51 s). Each point represents the aggregated data size of Domain Name System (DNS), TCP and HTTP transmissions at second resolution. In the regular operational mode, the transmission events between the gateway and the cloud server occur every 18 s (contrary to the assumptions stated in the previous section). Moreover, a detailed inspection of the TCP traffic (also illustrated in Table A3 of the Supplementary Information) shows that the three-way handshake and teardown TCP mechanisms generate packets (of SYN, SYN/ACK, ACK, FIN or FIN/ACK type, without distinction) with a mean size that ranges from 54 to 58 bytes, which suggests that the TCP, IP or MAC headers of these packets do not include certain optional fields.

A further inspection of the HTTP traffic shows that POST request packets carrying application data have a fixed size of 200 bytes. This suggests that the gateway executes processing tasks that transform the data payload of the incoming LoRa packets into formatted, constant-size HTTP POST packets addressed to the cloud server. On the other hand, the cloud server generates HTTP timeout-request packets of approximately 468 bytes before starting a TCP teardown routine. A timeout request allows a server to announce and close an unused connection, and the continuous presence of this type of request in our case study suggests that the gateway waits for it before starting a TCP teardown routine. Besides this, the intensive HTTP traffic observed in Fig. A3 (from 21 h 59 m 13 s to 21 h 59 m 31 s) during a water consultation request via the cloud server (by accessing the user’s dashboard at www.mySolem.com) suggests that extra transmissions are established between the gateway and the cloud server, in which the server asks the gateway for additional data (data that probably differs from counts). This generates high volumes of data that seem to be fragmented into approximately 7 packets of less than 800 bytes each (as documented in Table A3 of the Supplementary Material).

Finally, the packet traffic analysis also reveals unexpected DNS packets. DNS is an upper-layer protocol in charge of finding the IP address that corresponds to a Uniform Resource Locator (URL), in this case www.mySolem.com. When a device needs to find and save the IP address of a remote server from a URL, it sends a query request to a DNS server (usually hosted in the cloud), which returns the requested information in a query response [59]. This operation usually takes place in early connections, so the recurrent presence of DNS requests (with a mean size of 71 bytes each) and DNS responses (approximately 87 bytes each) in the regular operation of our case study (sending counts every 18 s) suggests that the gateway does not keep the IP address of the cloud server in memory, which generates extra traffic in mutualized infrastructures.
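To give an order of magnitude for this extra traffic, the sketch below multiplies the mean DNS packet sizes by the number of transmission events. It assumes exactly one DNS query and one DNS response per event, takes the event count (3,503,888) from Guideline 3, and expresses the result in decimal gigabytes (10^9 bytes). All three choices are assumptions made for illustration and may differ from the accounting used in Table A4.

```python
DNS_QUERY_BYTES = 71      # mean size of a DNS request (reported above)
DNS_RESPONSE_BYTES = 87   # approximate size of a DNS response (reported above)
EVENTS = 3_503_888        # transmission events every 18 s, as quoted in Guideline 3


def dns_overhead_bytes(events: int,
                       query_bytes: int = DNS_QUERY_BYTES,
                       response_bytes: int = DNS_RESPONSE_BYTES) -> int:
    """Total DNS traffic, assuming one query/response pair per event."""
    return events * (query_bytes + response_bytes)


dns_bytes = dns_overhead_bytes(EVENTS)
dns_gb = dns_bytes / 1e9  # assumption: decimal gigabytes

print(f"DNS overhead: {dns_bytes} bytes (about {dns_gb:.3f} GB)")
```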

Table A4 in the Supplementary Information (part 7, subsection a) synthesizes the data flow observed in the packet traffic analysis for the regular and consultation operational modes, highlighting the new behavior disclosed above (gray cells) and reporting a total data volume of 4.305 GB. From this, the reference flows of the internet networks and the cloud server amount to 0.6457 kWh and 0.6027 kWh, respectively (assuming the electricity intensity factors mentioned before). This results in an impact of 0.5 kg CO2-eq for the internet use and 0.467 kg CO2-eq for the cloud server (both assuming a global electricity mix). Considering that the reference flows of the induction emitter, the gateway and the smartphone, and therefore their impacts, remain unchanged, the total impact estimated from the packet traffic analysis of the case study amounts to 6.4296 kg CO2-eq.
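The conversion from data volume to electricity and impact can be retraced with a short calculation. In the sketch below, the intensity factors (0.15 and 0.14 kWh/GB) and the emission factor (0.775 kg CO2-eq/kWh) are not quoted from this section: they are back-calculated from the figures reported above and should be read as assumptions.

```python
DATA_VOLUME_GB = 4.305          # total data volume reported from Table A4

# Assumptions, back-calculated from the reported kWh and kg CO2-eq values.
NETWORK_KWH_PER_GB = 0.15       # internet and access networks
CLOUD_KWH_PER_GB = 0.14         # cloud server
GLOBAL_MIX_KG_PER_KWH = 0.775   # global electricity mix


def electricity_kwh(volume_gb: float, intensity_kwh_per_gb: float) -> float:
    """Reference flow: electricity needed to handle a given data volume."""
    return volume_gb * intensity_kwh_per_gb


def impact_kg_co2eq(energy_kwh: float,
                    factor: float = GLOBAL_MIX_KG_PER_KWH) -> float:
    """Climate impact of an electricity consumption."""
    return energy_kwh * factor


network_kwh = electricity_kwh(DATA_VOLUME_GB, NETWORK_KWH_PER_GB)
cloud_kwh = electricity_kwh(DATA_VOLUME_GB, CLOUD_KWH_PER_GB)
network_impact = impact_kg_co2eq(network_kwh)
cloud_impact = impact_kg_co2eq(cloud_kwh)

print(f"network: {network_kwh:.4f} kWh, {network_impact:.3f} kg CO2-eq")
print(f"cloud:   {cloud_kwh:.4f} kWh, {cloud_impact:.3f} kg CO2-eq")
```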

4 Discussion and contributions

Methodologically, the theoretical procedure proposed in this paper shows how LCA practitioners can estimate sensor data generation and volume by constructing data-intensive scenarios from limited information about devices and key electronic components in the local equipment of IoT systems (e.g., connectivity patterns or technical specifications around data transmission performance thresholds, usually provided by manufacturers or found in the literature). On the other hand, the empirical procedure conducted in this work shows how researchers can dissect and precisely estimate the dataflow behavior within IoT systems through simple adaptations to the deployment of local networks. Both approaches are complementary and overcome the documented issues around complexity and missing information when modeling reference flows (especially under a bottom-up logic), facilitating a detailed examination of the environmental impact of using full IoT systems.

4.1 Contribution with an instance of sharp impact estimations of IoT systems

Indeed, the theoretical and empirical procedures seen in Sects. 3.2 and 3.3 allow us to obtain detailed estimations of the ecological damage of using an IoT system in the long term. From our theoretical procedure, we obtain an absolute impact of 5.7235 kg CO2-eq, and from our empirical procedure, 6.4296 kg CO2-eq. The latter is 12.34% higher than the former because our packet traffic analysis reveals additional data flow in the regular and consultation operational modes (red text in Fig. 9), which significantly increased the reference flow of the Internet and Access networks and the cloud server (in terms of electricity used per GB generated by the gateway and the cloud server).
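The relative difference between the two estimations follows directly from the two totals. The short check below uses only the values reported above; the variable names are illustrative.

```python
THEORETICAL_KG = 5.7235   # absolute impact from the theoretical procedure
EMPIRICAL_KG = 6.4296     # absolute impact from the empirical procedure

# Relative increase of the empirical estimate over the theoretical one.
increase_pct = (EMPIRICAL_KG - THEORETICAL_KG) / THEORETICAL_KG * 100

print(f"relative increase: {increase_pct:.2f}%")
```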

Fig. 9 a Packet traffic share in the regular operational mode of the case study; b in the consultation operational mode

Importantly, the impact contribution of the Internet and Access networks and the cloud server went from insignificant in the theoretical estimation of the reference flow of the system to relevant in the empirical estimation (exceeding the impact contribution of the flowmeter, as shown in Fig. 10).

Fig. 10 a Impact contributors calculated from the theoretical estimation of the reference flow of the case study; b from the empirical estimation

These findings shed light on the environmental load of mutualized infrastructures, which would be subject to different aspects of sensor data and communication links. Indeed, the packet traffic analysis presented in this work demonstrates that the frequency at which the local infrastructure connects with the cloud infrastructure, the volume of data generated and transmitted, and the protocol overhead of transmissions in regular and/or user-driven operations are all fundamental aspects to be considered in the modeling of reference flows and the impact estimation of full IoT systems. This last aspect invites Life Cycle Assessment (LCA) practitioners to be cautious about the positions of Malmodin et al. [31] and Coroama et al. [60], who suggest that, for estimating the electricity intensity of ICT (which is different from estimating its ecological impact), end devices such as sensor systems should be considered only from a use-time perspective.

4.2 Contribution with an instance of advanced eco design of IoT systems

From the detailed results reported in previous sections, an advanced eco design run around dataflow policies and sensor data (here, in the context of high-level protocols and LoRa technology) is provided. It aims to reduce the reference flows of our case study (regarding electricity consumption, hardware, and number of batteries), and it illustrates a conciliatory decision-making process based on a data-centered language that, to our knowledge, has not been reported so far.

Basically, three redesign initiatives could be discussed. Firstly, designers could reconfigure the gateway to keep the IP address of the cloud server in memory and trigger mechanisms to obtain it again whenever it changes (this would avoid unnecessary DNS traffic in every transmission event of the regular operational mode). Secondly, designers could set the gateway to initiate a TCP teardown routine automatically (this would prevent the server from sending HTTP timeout requests after the application data have been uploaded). Along the same line, designers could alternatively consider the use of connectionless protocols. Thirdly, designers could configure the gateway to execute preprocessing routines (data aggregation or data reduction techniques) to avoid massive HTTP traffic (POST requests) whenever an online consultation of water consumption occurs. However, designers should proceed with caution, as the synthesis or reduction of sensor data and information may damage the accuracy of the system and/or require more energy.
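The first initiative can be illustrated with a minimal sketch of an in-memory address cache. It is not the firmware of the gateway, whose design is not documented here: the class name, the one-hour refresh interval, the stub resolver and the address 192.0.2.10 (taken from a range reserved for documentation) are all hypothetical choices made for illustration.

```python
import socket
import time
from typing import Callable, Dict, Tuple


def system_resolver(hostname: str) -> str:
    """Resolve a hostname through the operating system (one DNS lookup)."""
    return socket.gethostbyname(hostname)


class CachedResolver:
    """Keep resolved addresses in memory and refresh them only after ttl_s."""

    def __init__(self, resolve: Callable[[str], str] = system_resolver,
                 ttl_s: float = 3600.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._resolve = resolve
        self._ttl_s = ttl_s
        self._clock = clock
        self._cache: Dict[str, Tuple[str, float]] = {}
        self.lookups = 0  # number of real lookups performed

    def address(self, hostname: str) -> str:
        now = self._clock()
        entry = self._cache.get(hostname)
        if entry is not None and now - entry[1] < self._ttl_s:
            return entry[0]            # cache hit: no DNS traffic
        ip = self._resolve(hostname)   # cache miss or expired entry
        self.lookups += 1
        self._cache[hostname] = (ip, now)
        return ip

    def invalidate(self, hostname: str) -> None:
        """Forget an address, e.g. after a failed connection attempt."""
        self._cache.pop(hostname, None)


def simulate(events: int, period_s: float = 18.0) -> int:
    """Count lookups for a series of uploads, using a stub resolver and clock."""
    now = [0.0]
    resolver = CachedResolver(resolve=lambda host: "192.0.2.10",
                              ttl_s=3600.0, clock=lambda: now[0])
    for _ in range(events):
        resolver.address("www.mySolem.com")
        now[0] += period_s
    return resolver.lookups


print(simulate(200))  # uploads every 18 s during the first hour
```

With the hypothetical one-hour refresh interval, the 200 uploads of the first hour trigger a single lookup instead of 200. Calling `invalidate` after a connection failure is one possible way to obtain the address again when it changes.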

On the other hand, the impact contribution of the gateway is undeniable. To mitigate it, designers should consider alternative self-powered versions of this device. However, for this drastic change to make sense, it should be accompanied by a redesign of the entire data flow of the system, oriented to managing only the necessary transmission frequency with sufficient quality (in this sense, an analysis of the reasons behind the high frequency of transmission events in the regular operational mode needs to be conducted).

Besides that, it is intriguing that the manufacturer recommends a maximal distance range of 800 m between the flowmeter and the gateway, when LoRa technology offers longer communication ranges (from 2 to 14 km). Although revealing the reasons why the manufacturer suggests this distance is beyond the scope of this work, a probable motivation is to ensure high QoS and maximize the lifetime of the flowmeters’ batteries (by avoiding long time-on-air periods, as explained in Sect. 3.1). However, this would lead to drawbacks, because the number of gateways in a network could increase, depending on their locations and especially on the number of flowmeters and the extent of the terrain.

Figure 11 shows, for example, that if we decided to change our functional unit and included additional flowmeters at points A, B, C and D [to cover a total area of 2.56 km2 (1.6 × 1.6 km)], a deployment of at least three gateways would be required (according to the recommendations of the manufacturer). However, this does not necessarily have to be so. The packet traffic analysis conducted in this work shows that the flowmeter does not generate significant volumes of application data (approximately 57 bytes per LoRa transmission event). Considering that this load was managed without difficulty by one gateway in a 125 kHz bandwidth and with a spreading factor of 7, there is reason to believe that, under the same conditions, a flowmeter could be covered within a distance range of 2 km and for more than 5 years, as the estimations in the second row of Table 3 suggest.

Fig. 11 a Coverage shortcomings (shaded zones) of one gateway (central point) in an area of 2.56 km2 (scale 1:75,000). According to the manufacturer, flowmeters at points A, B, C and D would not be covered by the gateway; b recommended network deployments according to the manufacturer for covering flowmeters at points A, B, C and D (scale 1:150,000)

Table 3 Estimation of the maximal lifetime (in years) of one Li-ion 9 V battery for transmitting periodic LoRa packets of 57 bytes, under different operational conditions of the flowmeter. The second row shows the results obtained from the empirical conditions seen in our experiments

On the other hand, the two rightmost columns of Table 3 (Bfm and Lifetime) reveal two interesting aspects for the reference flow of the case study and its eco design. Firstly, replacing the battery-powered design of the flowmeter with a self-powered design would be beneficial only in specific contexts. For example, if the distance range is about 14 km, switching to a self-powered version of the flowmeter would avoid the continuous replacement of batteries in the short term (i.e., approximately six batteries every 2 years). However, if the local transmissions involve distance ranges of 2 or 4 km and a selected bandwidth of 125 kHz, a self-powered design would bring only marginal benefits (one would avoid changing one battery approximately every 6 or 3 years, respectively). What is worse, one could transfer impacts from the use phase of the flowmeter to its manufacturing phase (depending on the complexity of its self-powered design). Secondly, maximal benefits could be obtained by using a bitrate of 11 kbps in a 250 kHz bandwidth (a battery would be replaced approximately once every 11 years).

Although this analysis of distance ranges and technical capacities could reveal even more tradeoffs in design stages, it should be interpreted with caution, as more variables influencing the quality of LoRa transmission may exist (further research and estimations should be conducted as documentation about the design of the flowmeter, the gateway and the dataflow between them becomes publicly available).

4.3 Contribution with guidelines for impact estimation and practical eco design

To promote further research and eco design initiatives in the context of similar case studies, the following guidelines are offered to the LCA and eco design communities.

  • Guideline 1: The protocol overhead can reconfigure the reference flow of IoT systems and alter the conclusions of LCA studies. As seen in Sect. 3, the environmental impact contribution of the sensor, edge and cloud layers of our case study differs moderately between the theoretical estimation (in which we considered only TCP and HTTP traffic) and the empirical estimation (in which we additionally acknowledged DNS traffic and high-frequency transmissions). The difference is explained by the high frequency of internet transmissions in the regular mode and the unexpected internet traffic in the consultation mode (large HTTP POST messages, suggesting massive data transfer when users view their water consumption). Accordingly, Guideline 1 suggests that, when modeling reference flows and estimating the impact of the use phase of IoT systems, LCA practitioners should neither underrate the transmission frequency in regular operations nor dismiss additional transmissions between IoT layers. In particular, they should be aware of the additional data flow that occurs when a user or a device retrieves information directly from cloud resources. Also, when high accuracy is not needed for an application (as seen in our case study), IoT designers should reduce the sampling rate on the sensor layer and the transmission frequency on the edge layer, and use approximate computing on the cloud side (if possible).

  • Guideline 2: When invariable behavior characterizes an application (as in our case study, in which constant water consumption was emulated), information can be extrapolated from historical data in the cloud infrastructure so that massive local-to-cloud transmissions can be avoided.

  • Guideline 3: When a high transmission frequency is necessary, consider connectionless protocols and reinforce security (for example, by using message authentication, rate throttling, robust encryption, etc.). However, evaluate simultaneously the additional energy requirements of security routines and find a balanced solution distributed over the sensing, edge and cloud layers. In our case study, for example, designers can avoid the TCP and HTTP traffic in the regular mode by establishing UDP-based connections between the internet relay and the cloud server. This would produce an approximate reduction of 69% in the dataflow, electricity consumption and environmental impact of mutualized infrastructures (by discarding HTTP timeout requests and the three-way handshake, acknowledgement and teardown TCP mechanisms, and by assuming a data payload of 208 bytes for each of the POST messages occurring every 18 s [3,503,888 messages, according to our empirical results summarized in subsection b of part 7 of the additional material]).

  • Guideline 4: When projecting the reference flows of IoT systems for (re)design, consider not only the operational time of local devices but also how their energy consumption patterns change over time under different circumstances and with the different capacities of their electronic components, in the context of functions and data flow (e.g., sampling rates in data collection, maximal payloads in data processing, and bitrates, frequency bandwidths or distance ranges in data transmission). For example, according to our empirical estimations in Sect. 3.3, the flowmeter requires only one battery to transmit a data payload of 57 bytes to the internet relay every 3 min for 2 years [considering a sampling rate of 10 pulses/s, a bitrate of 5470 bps, a frequency bandwidth of 125 kHz and a distance range covering 800 m (< 2 km)]. However, for 14 km, and considering the same data payload and frequency bandwidth, the reference flow of the flowmeter could increase to 6 batteries (according to the calculations presented in Table 3).

  • Guideline 5: Study the functions-capacities of electronic components carefully and exhaustively in order to propose reasonable designs or to find technical features that boost or challenge a design. For example, using Frequency-Shift-Keying (FSK) modulation makes no sense in the LoRa-based transmissions of our case study, because its application type does not generate voluminous sensor data. Similarly, the high bitrate available in the WiFi module of the gateway is simply unused, because the wireless transmissions between the local equipment and the cloud server do not involve big or irreducible payloads.
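As an illustration of the connectionless alternative mentioned in Guideline 3, the sketch below sends one reading as a single UDP datagram with a fixed 208-byte payload and estimates the resulting payload volume over the 3,503,888 messages quoted there. The payload encoding, the function names and the decimal gigabyte convention are hypothetical. The volume excludes UDP, IP and link-layer headers as well as any security overhead, so it is not meant to reproduce the 69% reduction reported in the guideline.

```python
import socket

PAYLOAD_BYTES = 208     # payload size assumed in Guideline 3
MESSAGES = 3_503_888    # messages every 18 s, as quoted in Guideline 3


def build_datagram(count: int, size: int = PAYLOAD_BYTES) -> bytes:
    """Encode a pulse count as ASCII and pad it to a fixed-size payload."""
    body = f"count={count}".encode("ascii")
    if len(body) > size:
        raise ValueError("count does not fit in the payload")
    return body.ljust(size, b"\x00")


def send_count(count: int, host: str, port: int) -> int:
    """Send one reading as a single UDP datagram and return the bytes sent.

    There is no handshake, acknowledgement or teardown, so delivery is
    not guaranteed by the transport layer.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        return sock.sendto(build_datagram(count), (host, port))


def payload_volume_gb(messages: int, size: int = PAYLOAD_BYTES) -> float:
    """Payload volume in decimal gigabytes, headers excluded."""
    return messages * size / 1e9


print(len(build_datagram(1234)))
print(f"{payload_volume_gb(MESSAGES):.4f} GB of payload")
```

Because a datagram can be lost silently, a design along these lines would still need the security and reliability measures discussed in Guideline 3, whose energy cost should be evaluated as well.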

5 Conclusions and on-going research

In our increasingly connected world, this research aims to point out the central role of sensor data in modeling the reference flow of IoT systems and, in this way, to help LCA practitioners and IoT designers conduct comprehensive environmental assessments and efficient eco design. To do this, we proceeded in three parts. Firstly, we presented previous LCA and eco design literature that reports interesting variations in reference flows and ecological damage resulting from changes made to sensor data. From this, a data-driven posture for comprehensive impact estimation and effective eco design was outlined. Secondly, we proposed a customized implementation of our previous data-driven framework to estimate the reference flow of a case study and, in this manner, to obtain a rough idea of its impact in the use phase, in the long term, and in an unfavorable data flow context. Thirdly, we conducted a packet traffic analysis to estimate empirically the real reference flow of our case study. Overall, the influence of data flow on reference flows and impact estimation, and the relevance of sensor data, functions and capacities for the ecological design of IoT systems, are demonstrated.

Indeed, in the first part, we introduced the main challenges of estimating the environmental impact of using full IoT systems and acknowledged the uncertainty in results when introducing the sensor-data angle. We also anticipated the fundamental role of data flow in modeling reference flows and conducting eco design (something that is addressed only superficially in the controversial eco design guidelines and impractical standards used so far).

From the second part, it was shown that comprehensive and agile estimations can be conducted from available information (e.g., technical specifications around data manipulation) about IoT devices and key electronic components, all applied in a cross-type life cycle model. Indeed, the theoretical estimation of the reference flow conducted in this part shows a low impact contribution of the mutualized infrastructure, a moderate impact contribution of the flowmeter (which comes from the use of two batteries) and a significant impact contribution of the internet gateway of our case study (which comes from its electricity consumption over 2 years).

From the third part, the central role of data has been verified by demonstrating that its flow can modify the reference flow of a system and even redistribute the impact contributions of the local and mutualized infrastructures calculated theoretically. Indeed, the empirical estimation of the reference flow conducted in this part reveals a redistribution of the impact contributions of the flowmeter and the mutualized infrastructure of our case study (the impact contribution of the latter exceeds that of the former). This is explained by the additional electricity consumed to deal with the additional dataflow between the gateway and the cloud server, and by the reduced number of batteries used by the flowmeter of our case study. In this part, the potential complexity of the interactions between local and mutualized infrastructures was also observed, and it was noted that, in the long term, the absolute impact of the internet and cloud components of IoT systems should not be neglected.

Based on our findings, we presented a discussion of our contribution, which is oriented to LCA practitioners and IoT designers and is fourfold: (1) a theoretical and an empirical procedure showing how researchers and LCA practitioners can overcome the reported complexity and lack of information when modeling the reference flow in mutualized infrastructures, (2) an instance of detailed LCA results that sheds new light on the inconclusive results seen so far around the environmental damage incurred from using IoT systems, (3) an instance of an advanced redesign run that shows a conciliatory decision-making process based on a common denominator centered on sensor data, and (4) practical guidelines.

For all this, in this paper we made use of a customized implementation derived from our previous data-driven framework, which is based on a simple yet powerful association of functions-capacities of electronic components. This framework, together with another one oriented to facilitating the integration of ecological aspects into the New Product Development (NPD) of IoT prototypes [61], is being tested in parallel in industrial and educational contexts. Together, they form a methodology for sustainable IoT systems. LCA practitioners and designers should apply them simultaneously in the context of their own projects, as a sort of roadmap that recalls that each part, such as data, information, devices, functions, capacities, and electronic components (including their physical and technical characteristics and circularity potential), must be examined holistically and carefully.