Internet of Things, Challenges for Demand Side Management

The adoption of any new product brings new issues and challenges, and this is especially true of mass adoption. The advent of Internet of Things (IoT) devices will be, in the opinion of the authors of this paper, the largest and fastest product adoption yet seen, as several early sources predicted that 50 billion IoT devices would be active by 2020 [1][2]. While later forecasts reduced the predicted amount to about 20-30 billion devices [3], even for such a “reduced” number demand side management issues are foreseeable, since the potential economic impact of IoT applications in 2025 is estimated at between 3.9 and 11.1 trillion USD [4]. Not only will new patterns emerge in energy consumption and Internet traffic, but we predict that the sheer amount of data produced by this quantity of IoT devices will give birth to a new sort of demand side management: the demand side management of IoT data. How this will work is yet to be seen but, at the current moment, one can at least identify the pieces that will constitute it. This paper is intended to serve as a short guide to the possible challenges raised by the adoption of IoT devices. Data types and structures, data life cycles and application patterns are briefly discussed.


Introduction
The concept of the Internet of Things (IoT) mainly implies Internet connectivity of things such as smart appliances or devices (lighting or heating systems, automatic pet feeders, medical monitoring implants, surveillance monitoring systems, vehicles, etc.), together with sensors, actuators, communication protocols, big data management and analytics. Even today, IoT produces large volumes of data, which need to be collected, aggregated, processed and stored more effectively. This ocean of data opens new perspectives on services and the environment and promises improved efficiency and quality of life, but it also raises some challenges. Data is mainly acquired by embedded sensors or electronics platforms designed to measure additional data such as current and voltage, operation duration and other parameters. For instance, some smart refrigerators sense the food items being cooled inside and keep track of the stock based on barcode or radio-frequency ID scanning, informing the householders whenever food needs to be replenished. These features are relevant for improving quality of life, but they are not relevant or sufficient for other purposes such as energy demand side management or device performance. Similarly, smart washing machines and dishwashers can be programmed via mobile phones to operate within certain hours and can detect through embedded sensors that the detergent level is low, but their applications could also provide detailed information about the consumption of each specific washing program for a given loading of the tank. Sensor data from the different washing programs can then be used in the consumption optimization process, since the consumption level and duration depend on the washing program. A smart vacuum cleaner could sense the level of dust, start operating under certain conditions, and so on.
Therefore, in the coming years the producers of smart appliances or
Figure 2 shows the Device to Cloud model. Smart devices can transmit data to the Cloud and then receive analyzed data in return, such as voice recognition results or the time left until the next fitness session. Also, video cameras can be set to transmit images from certain angles, based on information received from the user through the Cloud.

Fig. 2. Device to Cloud
The Device to Gateway model is presented in Figure 3. The Gateway gathers data from sensors and sends it to Application Service Providers, which process it and return the analyzed data to the same sensor or to another one. Figure 4 shows that data received and processed from device sensors is then sent to other application services that interact with the IoT system. Table 1 gives some examples of devices that can connect to each type of IoT system.

IoT data types and structures
IoT devices are primarily characterized by their data production function, only some of them having data processing abilities. According to [11], one can classify IoT devices in four categories: devices in the first category simply make one-way service requests while monitoring themselves; those in the second category perform a one-way export of monitored data, but also function as real-time reporters of more complex data in a much more ramped-up fashion; the third category includes devices with interactive abilities; and the fourth category includes IoT devices capable of processing data, sometimes to the extent of some level of artificial intelligence. Taking data as the primary product of IoT devices, one should consider the data types and structures involved.

Types of data generated by IoT
IoT systems work with a huge volume of data, usually collected through sensors, RFIDs, and mobile devices. In IoT systems, data streams continuously from a variety of sources, which has led companies to need to rapidly collect, store and analyze large volumes of heterogeneous data [12]. Big data technologies, such as NoSQL databases and Hadoop, can address these problems. According to [13], there are four types of data in the Internet of Things:
Status data: indicates the status of every activity monitored by an IoT device;
Location data: helps to track an IoT device across a geographical area;
Automation data: represents the core of an IoT system and the main reason for IoT development;
Actionable data: can be defined as status data that may influence how customers act in the future.
When we refer to IoT data management, two main questions need to be answered, as detailed in [14]: where the data will be stored (the storage facilities) and how the data will be stored (the format to be used for storage).
The solution to the first issue is NoSQL databases, the only effective way of storing huge volumes of heterogeneous, streaming and geographically dispersed real-time data. The second issue concerns the data format used for storage. The main supported data formats are briefly described below:
XML (eXtensible Markup Language): markup language used for encoding unstructured and semi-structured data in a format that is both human-readable and machine-readable;
JSON (JavaScript Object Notation): data interchange format, derived from JavaScript, which is used for storing and exchanging data. Since JSON defines only a text data format that can be read and used by any programming language, it is language independent;
PNG (Portable Network Graphics): extensible file format for the lossless, portable, well-compressed storage of raster images [15];
CSV (Comma Separated Values): file format that stores tabular data as plain text. Each file consists of records divided into fields separated by commas;
XDR (eXternal Data Representation): data serialization format for the description and encoding of data;
RDF (Resource Description Framework): data format recommended by W3C for data interchange on the Web. According to [16], the RDF data model is similar to classical modeling approaches, such as entity-relationship or class diagrams. Using RDF, we can model data by making statements about it as subject-predicate-object triples.
Other particular data formats are as follows:
GeoJSON / TopoJSON: data formats used for storing spatial data as JSON objects. GeoJSON stores both non-spatial attributes and simple geographical attributes (such as points, lines, and polygons), whereas TopoJSON is an extension that encodes topology, enabling more accurate computations [17];
N3, Turtle, and N-Triples: data formats for storing and exchanging data using the RDF model;
Entity Notation: data format for distributed systems, compatible with RDF.
A detailed comparison regarding the semantic aspects of RDF, N3, and Entity Notation can be found in [18].
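To make the differences between these formats concrete, the following minimal sketch serializes one sensor reading as JSON, CSV, GeoJSON and N-Triples using only the Python standard library. The reading, its field names and the example.org URIs are hypothetical and chosen purely for illustration; they do not come from any of the cited sources.

```python
import csv
import io
import json

# Hypothetical reading; field names and values are illustrative only.
reading = {
    "device_id": "fridge-01",
    "timestamp": "2017-10-01T12:00:00Z",
    "temperature_c": 4.5,
    "lon": 26.1,
    "lat": 44.43,
}


def to_json(r: dict) -> str:
    """Serialize the reading as a single JSON object."""
    return json.dumps(r, sort_keys=True)


def to_csv(r: dict) -> str:
    """Serialize the reading as a header row plus one comma-separated record."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(r), lineterminator="\n")
    writer.writeheader()
    writer.writerow(r)
    return buf.getvalue()


def to_geojson(r: dict) -> str:
    """Wrap the reading in a GeoJSON Feature; coordinates are [longitude, latitude]."""
    properties = {k: v for k, v in r.items() if k not in ("lon", "lat")}
    feature = {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [r["lon"], r["lat"]]},
        "properties": properties,
    }
    return json.dumps(feature)


def to_ntriples(r: dict, base: str = "http://example.org/") -> str:
    """Express two fields as RDF subject-predicate-object statements (N-Triples).

    The base URI is a placeholder, not a real vocabulary.
    """
    subject = f"<{base}device/{r['device_id']}>"
    return "\n".join(
        f'{subject} <{base}prop/{key}> "{r[key]}" .'
        for key in ("timestamp", "temperature_c")
    )
```

The same reading costs a different number of bytes and carries a different amount of self-description in each format, which is one reason the choice of storage format matters at IoT scale.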

Structure requirements for various data generated by IoT
Since 2012, two developments have led to an impressive extension of IoT connectivity: both sensors and communications have improved drastically, and this is the origin of the large amounts of data mentioned above. The biggest challenge is to process and use such a large amount of data generated by IoT while it is still "alive", that is, while it still has value and major impact (before the data becomes stale). For instance, the sensors of an appliance may send alerts regarding its performance that could affect other things, such as the food inside a refrigerator when the cooling system is not working properly, or even the life cycle of the appliance itself. If data regarding the low performance of the refrigerator is not used rapidly, the food will spoil and the life of the refrigerator engine may shorten drastically. It is very important to prevent failures of expensive and critical devices such as a heating system. In such cases, storing this data and running only post-event analytics on it is useless. When the data is processed and used rapidly, almost real-time actions become possible. Therefore, the aim is to process and use the data as soon as the event occurs. However, data stream processing needs assessment, aggregation, correlation and time-based analyses (as seen in Figure 5). Data assessment can be seen as a filter that retains the significant data, identifies events and discards irrelevant data. Different appliances with different sensors provide various formats and recording rates. Also, data generated by sensors could be missing, inconsistent or incomplete due to sensor damage or drops in the communication network. The recording rate might be too detailed for certain analyses and for monitoring trends, therefore aggregation is necessary. Even if a sensor is able to record operating values at short time intervals, they can be aggregated on an hourly basis, since other analyses rely on hourly consumption data.
DOI: 10.12948/issn14531305/21.4.2017.05
Based on simple correlations, alerts are generated by relating the stream to other data sets and basic rules. Temporal analyses usually identify irregularities in the consumption pattern that could trigger certain alerts and actions. For instance, if the heating system is operating more than usual under similar outside temperature conditions, a window may have been left open. This information may help consumers save energy, since they can decide to turn down the heating system until the cause is removed.
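The assessment, aggregation and time-based analysis steps described above can be sketched in a few lines. This is only an illustration under stated assumptions: readings are dictionaries with hypothetical keys `device`, `minute` (minutes since the start of the day) and `watts`, and the alert rule (average power above a baseline multiplied by a tolerance factor) is an invented stand-in for whatever rule a real system would use.

```python
from collections import defaultdict
from statistics import mean


def assess(readings):
    """Assessment: keep only complete, plausible readings; discard the rest."""
    return [
        r for r in readings
        if r.get("watts") is not None and r["watts"] >= 0
    ]


def aggregate_hourly(readings):
    """Aggregation: average power per (device, hour of day)."""
    buckets = defaultdict(list)
    for r in readings:
        hour = r["minute"] // 60
        buckets[(r["device"], hour)].append(r["watts"])
    return {key: mean(values) for key, values in buckets.items()}


def consumption_alerts(hourly, baseline_watts, tolerance=1.5):
    """Time-based analysis: flag hours whose average exceeds baseline * tolerance."""
    limit = baseline_watts * tolerance
    return sorted(key for key, avg in hourly.items() if avg > limit)


# Hypothetical stream from a heating system, including two unusable readings.
sample = [
    {"device": "heater", "minute": 0, "watts": 1000},
    {"device": "heater", "minute": 30, "watts": 1000},
    {"device": "heater", "minute": 60, "watts": 2000},
    {"device": "heater", "minute": 90, "watts": None},   # missing value
    {"device": "heater", "minute": 100, "watts": 2200},
    {"device": "heater", "minute": 110, "watts": -5},    # implausible value
]

hourly = aggregate_hourly(assess(sample))
alerts = consumption_alerts(hourly, baseline_watts=1000)
```

In this sample the second hour averages well above the baseline, so it is the one that would be flagged, for example as a hint that a window may have been left open.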

Fig. 5. Steps in processing data streams
These steps should be performed immediately after the data is generated, in order to find out what is happening now rather than what happened some time ago. When predictive, optimization and other specific algorithms are included, what will happen in the near future can also be estimated [19] [20]. For on-device data management, SQL (Structured Query Language) is too resource-intensive and probably inappropriate, while a combination of a language producing fast code (e.g. C, C++ or, maybe, C#/.NET Core) with a database management system (DBMS) offering a performant native API is most appropriate. On-device embedded databases mainly collect data, act based on that data, and perform some data processing. However, for upstream data already accumulated from the devices, DBMSs that collect, aggregate and process data produced by the IoT can take advantage of the benefits provided by SQL or Not only SQL (NoSQL) technologies. Usually, a combination of technologies is envisioned, for example SQL and NoSQL for fast processing of semi-structured or unstructured data, and Hadoop for data storage and analytics [21].
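As an illustration of the upstream case, where SQL is appropriate, the sketch below aggregates accumulated readings per device and hour. It uses Python's built-in `sqlite3` module with an in-memory database purely as a convenient stand-in for an upstream DBMS; the table name, column names and sample rows are assumptions made for this example.

```python
import sqlite3


def hourly_consumption(rows):
    """Sum energy per device and hour with a plain SQL GROUP BY.

    rows: iterable of (device, timestamp 'YYYY-MM-DD HH:MM:SS', watt_hours).
    """
    conn = sqlite3.connect(":memory:")  # stand-in for an upstream database
    try:
        conn.execute(
            "CREATE TABLE readings (device TEXT, ts TEXT, watt_hours REAL)"
        )
        conn.executemany("INSERT INTO readings VALUES (?, ?, ?)", rows)
        cursor = conn.execute(
            """
            SELECT device,
                   strftime('%Y-%m-%d %H:00', ts) AS hour,
                   SUM(watt_hours) AS total_wh
            FROM readings
            GROUP BY device, hour
            ORDER BY device, hour
            """
        )
        return cursor.fetchall()
    finally:
        conn.close()


# Hypothetical readings from one washing machine.
sample_rows = [
    ("washer", "2017-10-01 08:05:00", 120.0),
    ("washer", "2017-10-01 08:35:00", 80.0),
    ("washer", "2017-10-01 09:10:00", 50.0),
]
totals = hourly_consumption(sample_rows)
```

A declarative query of this kind is easy to write and change upstream, which is the benefit the text attributes to SQL; on the device itself the same logic would more plausibly be hand-written against a native API.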

How are the data models and application patterns evolving because of the IoT?
Traditional applications, such as OLTP (Online Transaction Processing) or OLAP (Online Analytical Processing), follow data life cycles we are accustomed to. For OLTP data, a normal pathway passes through data capture, maintenance, synthesis, usage, publication, archival and purging [22]. For OLAP data, one adds some transformation, filtering and aggregation stages between capture and maintenance. Also, the OLAP data life cycle is longer in time than the one for OLTP data. Data modeling and application design patterns are well known for both OLTP and OLAP. The next historical step was the emergence of Big Data and, while the data life cycle for Big Data is largely the same as the one for OLTP data [23], new constraints appeared on the data itself, namely the so-called four V's [24]: Volume (large volumes of data), Velocity (the data life cycle happens at a higher speed), Variety (data comes from a much larger number of sources and has different forms and structures) and Veracity (Big Data has a higher degree of uncertainty than is usual for traditional applications). Later on, a fifth V was added, Value [25], but in the opinion of this paper's authors, this fifth V is less a data characteristic from a purely data science point of view and more an economic concern, which can be raised not only for Big Data.

Data modeling
The latest arrival in the data world, the IoT has the potential to be a game changer because it involves new data modeling and application design patterns, even negating some of the old ones [26] while others are kept [27]. From a pure design point of view, a complete IoT application will have to include components / models for the following features [26][28]:
Data Ingestion: IoT solutions require the ability to ingest various types of data, such as real-time telemetry data, at a massive scale while remaining effective (i.e. providing a low
Gateway is an intermediate device that communicates with sensors and actuators using low-level protocols. Such a gateway will offer capabilities such as sanitization, aggregation and edge analytics of telemetry data.
Business Rules Engine: Once ingested, the data has to be processed in order to derive control decisions and business insights. The processing can be done at two points in time: immediately, on real-time data streams (the so-called Hot Path), and later, on offline data (the so-called Cold Path).
Heartbeat: For the same reason mentioned earlier at Loose Coupling, IoT devices should send a 'heartbeat' to the Cloud platform from time to time, which includes information about the internal health of the device.
Watermark: an explicit model of the age of data, used to ensure that accurate data is processed (e.g., by using heartbeats).
Lineage: similar to a watermark, lineage maintains a history of data as it moves across all devices.
Control totals: consistency checks between values.
Canary Firmware Releases: Most IoT devices require their firmware to be remotely updated, which means a remote download and an integrity check before replacing the older firmware on the device. Unlike software updates for traditional applications, which should ideally be done on the entire base of systems at the same time, firmware updates should be rolled out incrementally, limiting the risks due to a faulty firmware release and avoiding the data transfer spike generated in the cloud infrastructure by a mass update.
Unified Endpoint Management: The ability to manage IoT devices remotely, usually via a dashboard and a set of APIs.
Device Authorization: IoT devices are often headless, offering no means to provide authorization via credentials, so they use client certificates to authorize themselves with the IoT Cloud.
State Synchronization: It can be based on virtualizing the IoT device's "state" as an object in the Cloud. Web and mobile applications can interact with this "state" object to change the state of the IoT device. The changes are then synchronized with the real device.
Device Registry: IoT applications need to keep track of all deployed IoT devices (identifiers, certificates, configuration, state, etc.).
By including the above-mentioned techniques in an application, one obtains a consistent and stable IoT solution.
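The Canary Firmware Releases feature described above combines two mechanisms: an incremental rollout and an integrity check before installation. The following sketch shows one possible way to implement both, assuming a stable hash of the device identifier decides which devices belong to the current rollout stage and a SHA-256 digest is published alongside the firmware image. The function names and the bucketing scheme are illustrative assumptions, not taken from [26] or [28].

```python
import hashlib


def rollout_bucket(device_id: str) -> int:
    """Map a device id to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(device_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % 100


def in_rollout(device_id: str, percent: int) -> bool:
    """True if the device belongs to the first `percent` percent of buckets.

    Because buckets are stable, raising `percent` only ever adds devices,
    so a device that received the canary release stays in later stages.
    """
    return rollout_bucket(device_id) < percent


def firmware_is_intact(image: bytes, expected_sha256: str) -> bool:
    """Integrity check performed on the device before replacing old firmware."""
    return hashlib.sha256(image).hexdigest() == expected_sha256


# Hypothetical fleet and rollout stages of 5%, 25% and 100%.
fleet = [f"device-{i}" for i in range(1000)]
stages = {p: [d for d in fleet if in_rollout(d, p)] for p in (5, 25, 100)}
```

Spreading the update over stages in this way addresses both concerns raised in the text: a faulty release reaches only a fraction of the fleet, and the download load on the cloud infrastructure is spread over time.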

Application patterns
Volkmar Denner, chairman of the board of management at Robert Bosch GmbH, identified in [29] five application patterns for IoT solutions:
Cloud-based applications: This pattern supports applications that reside solely in the cloud; in other words, applications that are not connected to any assets or devices. In many respects, this pattern is supported by any normal cloud, but some of its aspects are specific to the IoT. Before connecting to the asset, most IoT solutions require basic functions, such as master data management for users and assets. IoT application developers demand basic functions from a cloud, including the ability to manage the relationship between users and assets, also with different access rights.
Asset-based applications: This pattern sees the IoT solution from the opposite point of view: an IoT cloud has to support application logic and data that enable autonomous asset behavior. This is important for many mission-critical solutions that simply cannot rely on the assumption that the asset is always connected. However, even if asset functions have to perform autonomously in this pattern, there is one dependency on the cloud. Since most IoT solutions are constantly evolving, the IoT cloud's job is to ensure that software is distributed to the asset whenever a sufficient level of connectivity is available.
Distributed IoT applications: Many IoT applications will leverage the ability to combine and integrate an application's logic and data, both on the asset and in the cloud. Distributed IoT applications can be extremely powerful, with the means to harness the IoT's full potential.
Digital twin: The idea of this pattern is that IoT will give us the ability to create a digital twin of the physical asset, based on readouts from machine components as well as additional sensors. This digital twin in the cloud will open up many new functions and solutions, including predictive analytics (see the State Synchronization model described under Data modeling). The key benefit for application developers in this scenario is that they do not have to worry about connecting to the asset and extracting the data. Because the applications are not deployed on the asset, but only in the cloud, the sandbox approach reduces the security risks.
Social IoT: Multiple assets can be aggregated and used by multiple applications. These applications can use this "social" IoT data to benefit either the community or an individual asset user. In order to support the social IoT pattern, an IoT cloud would have to be capable of supporting data ingestion and processing on a massive scale. Also, the IoT cloud needs an enforceable data management and security policy to ensure that only designated data is shared.
Greg Gorbach later identified in [30] two more application patterns, on top of the five imagined at Bosch:
Edge-optimized IoT: In industrial applications, there are cases which may benefit from a pattern in which intelligence and analytics are deployed at or near the assets. Rapid responses and actions would be facilitated, and only a subset of the streaming data would need to be sent to the cloud for storage and processing.
Multi-party IoT: The concept behind this pattern is that there is often more to the story than just the asset and the cloud. For example, consider a mining operation. Data from an asset (a heavy machine) may be monitored by the machine manufacturer in order to improve the machine design, by the mine operator to coordinate the machine's work with other machines, by a local third-party field service company, and by a replacement parts company.
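The digital twin pattern, and the state synchronization model it builds on, can be illustrated with a very small cloud-side object. This is a minimal sketch under our own assumptions: the class name `DeviceTwin`, the split into `reported` and `desired` state, and the thermostat example are hypothetical and are not taken from [29].

```python
from dataclasses import dataclass, field


@dataclass
class DeviceTwin:
    """Cloud-side stand-in for one physical asset."""

    device_id: str
    reported: dict = field(default_factory=dict)  # last readouts sent by the asset
    desired: dict = field(default_factory=dict)   # changes requested by applications

    def report(self, **readouts) -> None:
        """Called when the asset sends readouts; updates the cloud-side copy."""
        self.reported.update(readouts)

    def request(self, **changes) -> None:
        """Called by web or mobile applications; never contacts the asset directly."""
        self.desired.update(changes)

    def pending_changes(self) -> dict:
        """Desired values the asset has not yet confirmed in its reports."""
        return {
            key: value
            for key, value in self.desired.items()
            if self.reported.get(key) != value
        }


# Hypothetical thermostat: an application lowers the target temperature.
twin = DeviceTwin("thermostat-1")
twin.report(target_c=21, room_c=19.5)
twin.request(target_c=18)
to_sync = twin.pending_changes()   # what must still be pushed to the asset
twin.report(target_c=18)           # the asset confirms the change
```

Applications only ever read and write the twin, which is why, as the text notes, developers do not have to deal with connecting to the asset itself.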

Issues to solve in order to make IoT work
As with any new technology, IoT adoption means the emergence of new issues to be solved. Some of them may even lead, in the opinion of the authors of this paper, to the advent of a new specialty, the demand side management of IoT data, as various sources indicate 4.9 billion active IoT devices in 2015, 6.4 billion in 2016 [31], 8.4 billion in 2017 [32] and a predicted 20.4-20.8 billion in 2020 [3]. New energy consumption patterns will surely be induced in the demand side management of energy. While it is not the purpose of this paper to discuss energy consumption issues, and one of the uses of IoT devices is to provide data that helps reduce the energy consumption of other devices, on a demand side management note we can remark that a typical IoT device draws an average of 0.4-8.0 W in standby, which leads to a staggering predicted standby consumption of 46 TWh for 2025, not taking into account the energy consumed by IoT devices while active or the energy used in auxiliary activities (e.g. battery and solar cell production, IoT device production, etc.) [33]. Some of the issues to be solved relate to the sheer number of devices and to the volume of data they generate, while other challenges are technology related. The following subchapters briefly describe these types of issues.
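To give a sense of scale for the standby figures quoted above, the arithmetic can be worked through directly: energy is power multiplied by time, so a device that is permanently in standby consumes its standby power times 8760 hours per year. The sketch below assumes continuous standby over a 365-day year; it does not attempt to reproduce the 46 TWh forecast of [33], only to convert the quoted numbers into more familiar units.

```python
HOURS_PER_YEAR = 24 * 365  # 8760 hours, assuming a non-leap year


def annual_standby_kwh(standby_watts: float) -> float:
    """Energy used in one year by a device that is always in standby (kWh)."""
    return standby_watts * HOURS_PER_YEAR / 1000.0


def average_power_gw(annual_twh: float) -> float:
    """Constant power (GW) that would add up to the given annual energy (TWh)."""
    watt_hours = annual_twh * 1e12
    return watt_hours / HOURS_PER_YEAR / 1e9


low = annual_standby_kwh(0.4)    # about 3.5 kWh per device per year
high = annual_standby_kwh(8.0)   # about 70 kWh per device per year
fleet = average_power_gw(46)     # about 5.25 GW drawn around the clock
```

In other words, each device accounts for only a few kilowatt-hours per year, but a yearly total of 46 TWh corresponds to roughly 5.25 GW of continuous demand.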

Volumes of data implied by IoT
For an individual accustomed to the day-to-day data volumes of activities involving PCs or traditional mobile devices (megabytes, gigabytes, or maybe even terabytes), the amount of data produced by inconspicuous systems, such as small IoT devices, may seem amazingly large. Cloud traffic will rise 3.7-fold by 2020, from 3.9 zettabytes (ZB) per year in 2015 to 14.1 ZB per year in 2020 [34]. Big Data and the associated Internet of Things account for a large part of this growth. As IoT grows, so do the volumes of data it generates. Globally, the data created by Internet of Everything (IoE) devices will reach 507.5 ZB per year (42.3 ZB per month) by 2019 (269 times greater than the data transmitted to data centers from end-user devices and 49 times higher than total data center traffic), up from 134.5 ZB per year (11.2 ZB per month) in 2014 [35]. By 2020, database, analytics and IoT workloads will account for 22% of total business workloads, compared to 20% in 2015. The total volume of data generated by IoT will reach 600 ZB per year by 2020, 275 times higher than the projected traffic going from data centers to end users/devices (2.2 ZB) and 39 times higher than total projected data center traffic (15.3 ZB) [34]. Again by 2020, for a typical smart city, the following daily volumes are expected [36]:
connected airplanes: 40 TB per day, 0.1% of which is transmitted further to a cloud / data center;
connected factories: 1 PB per day, 0.2% transmitted;
public safety systems: 50 PB per day, less than 0.1% transmitted;
weather sensors: 10 MB per day, 5% transmitted;
intelligent buildings: 275 GB per day, 1% transmitted;
smart hospitals: 5 TB per day, 0.1% transmitted;
smart cars: 70 GB per day, 0.1% transmitted;
smart grid: 5 GB per day, 1% transmitted.
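The smart city figures quoted from [36] become easier to compare once the transmitted share is converted into an absolute daily volume. The sketch below does this conversion under two assumptions of ours: decimal units (1 TB = 10^12 bytes), and an upper bound of 0.1% for public safety systems, for which the source only states "less than 0.1%".

```python
# Bytes per unit, assuming decimal (SI) prefixes.
UNIT = {"MB": 1e6, "GB": 1e9, "TB": 1e12, "PB": 1e15}

# (amount produced per day, unit, fraction transmitted), as quoted in the text.
SOURCES = {
    "connected airplanes": (40, "TB", 0.001),
    "connected factories": (1, "PB", 0.002),
    "public safety systems": (50, "PB", 0.001),  # upper bound: "less than 0.1%"
    "weather sensors": (10, "MB", 0.05),
    "intelligent buildings": (275, "GB", 0.01),
    "smart hospitals": (5, "TB", 0.001),
    "smart cars": (70, "GB", 0.001),
    "smart grid": (5, "GB", 0.01),
}


def produced_bytes_per_day(name: str) -> float:
    """Daily volume produced by one source category, in bytes."""
    amount, unit, _ = SOURCES[name]
    return amount * UNIT[unit]


def transmitted_bytes_per_day(name: str) -> float:
    """Daily volume actually sent on to a cloud / data center, in bytes."""
    _, _, fraction = SOURCES[name]
    return produced_bytes_per_day(name) * fraction


def kept_at_edge_share(name: str) -> float:
    """Fraction of the produced data that never leaves the edge."""
    return 1.0 - SOURCES[name][2]
```

For example, connected airplanes transmit about 40 GB of the 40 TB they produce each day, and connected factories about 2 TB of 1 PB; in every category at least 95% of the data stays at the edge.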

Challenges of IoT
The following challenges can be identified when working with data produced by IoT devices [37]:
Latency: Communications and data processing are characterized by latency. For some mission-critical functions, high latency is intolerable, so computing at the edge is a must. Latency in data transfer delays insight, which slows time to action for business and responses to the data.
Bandwidth: Even when each device sends small amounts of data, a large number of devices will need hefty bandwidth and, while the available bandwidth is growing rapidly [38], there has never been a large excess available for an additional application.
Cost: Sending large amounts of data is costly. Processing data at the edge reduces the network-related costs.
Threats: Using data communication channels exposes data to attacks and security breaches. These risks strengthen the case for securing and analyzing data at the edge.
Duplication: While traditional OLTP application design loathes data redundancy, duplicated data is a fact of life for Big Data / IoT applications. The complexity and cost of the additional storage and other assets is a challenge.
Corruption: Various factors (attacks, faults) can lead to data corruption. Given that data quality for Big Data / IoT is already low, additional corrupted data is an issue.
Compliance: Regional and country compliance regulations can restrict or complicate data transfer across borders and over long distances. Such issues can be mitigated with edge analytics.
As noted by Kevin Kalish, IoT Domain Lead at SAS, "The view of big data in IoT is that it is more a commodity and that sometimes can lead businesses to the desire to become a bit of a data hoarder." "The misconception is that storage is a commodity and big data will solve these problems but the volumes and the costs are quickly becoming unsustainable." "Unless part of your business model is data monetization, it's highly likely that you can afford to only send back filtered data." [39]. In simpler terms, half of the above-mentioned challenges can be mitigated by not sending all the data produced by the IoT devices to a central cloud / data center. Several solutions, all of them involving smaller amounts of data to be transmitted to a central cloud / data center, are already available or are envisioned as answers to these challenges [40]:
Edge Computing / Fog Computing: A "fog" (term proposed by Cisco) extends the cloud to be closer to the things that produce and act on IoT data. These devices, called fog nodes, can be deployed anywhere with a network connection: on a factory floor, on top of a power pole, alongside a railway track, in a vehicle, or on an oil rig. Any device with computing, storage, and network connectivity can be a fog node. Examples include industrial controllers, switches, routers, embedded servers, and video surveillance cameras [41] [42].
Mobile Edge Computing: Mobile Edge Computing (MEC) is a solution proposed by ETSI which offers application developers and content providers cloud-computing capabilities and an IT service environment at the edge of the network. This environment is characterized by ultra-low latency and high bandwidth as well as real-time access to radio network information that can be leveraged by applications. MEC provides a new ecosystem and value chain. Operators can open their Radio Access Network (RAN) edge to authorized third parties, allowing them to flexibly and rapidly deploy innovative applications and services towards mobile subscribers, enterprises and vertical segments [43].
Edge computing / Cloudlet: a new architectural element that arises from the convergence of mobile computing / IoT and cloud computing. It represents the middle tier of a 3-tier hierarchy: mobile or IoT device - cloudlet - cloud. A cloudlet can be viewed as a "data center in a box" (it is self-managing, requiring little more than power, Internet connectivity, and access control for setup) whose goal is to "bring the cloud closer" [44] [45].
Micro Data Center: a smaller or containerized (modular) data center architecture designed to solve different sets of problems, handling types of compute workload that do not require traditional facilities. Typically, it is a standalone rack-level system containing all the components of a "traditional" data center [46].
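The common idea behind all of these solutions, sending less data upstream, can be sketched with a simple edge-side summarization step and the corresponding bandwidth arithmetic. The fleet size, payload size and message rates used below are hypothetical numbers chosen only to make the effect visible; the summary fields (count, minimum, mean, maximum) are likewise just one possible choice.

```python
from statistics import mean


def summarize_window(samples):
    """Edge-side reduction: replace a window of raw samples by one summary record."""
    return {
        "count": len(samples),
        "min": min(samples),
        "mean": mean(samples),
        "max": max(samples),
    }


def uplink_bits_per_second(devices, payload_bytes, messages_per_second):
    """Aggregate uplink bandwidth needed by a fleet of identical devices."""
    return devices * payload_bytes * 8 * messages_per_second


# Hypothetical fleet: 10,000 devices sending 100-byte messages.
raw = uplink_bits_per_second(10_000, 100, 1)            # one message per second
summarized = uplink_bits_per_second(10_000, 100, 1 / 60)  # one summary per minute
reduction = raw / summarized
```

With these assumed numbers, the raw stream needs 8 Mbit/s in total, while forwarding one summary per minute needs about one sixtieth of that, which illustrates why small per-device payloads still add up and why filtering at the edge addresses the bandwidth and cost challenges at the same time.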

Conclusions
IoT is already a part of our daily lives, solving various problems and providing answers and insights, while it is still in its infancy. This paper surveyed some of the already large quantity of information available about IoT devices, attempting a synthesis of the challenges raised by their usage, together with some available or proposed solutions.