1 Introduction

In the age of Industry 4.0, industrial digital technologies (IDTs), such as the Industrial Internet of Things (IIoT), artificial intelligence (AI), edge computing (Sittón-Candanedo et al. 2019), and pervasive knowledge (Deng et al. 2020), have developed rapidly (Maier 2017; Dopico et al. 2016). This development requires a large number of digital devices to be integrated into industrial systems, which places high demands on data storage, transfer and analysis. For example, an automobile manufacturing company generated about 480 TB of data in 2013, a volume that was expected to triple by the end of 2020. Furthermore, the data is generated from different sources, then fused and nested together (Luckow 2015). When handling such big and complex data, conventional big data analytics methods face many challenges, such as high and unreliable latency, high energy consumption, incomplete data fusion and poor security. Current industry needs a reliable, stable, and fault-tolerant data communication system, together with adequate real-time processing power, to exploit the knowledge hidden in the data (Gai et al. 2016). Meanwhile, edge devices have become more powerful in terms of computing speed, memory space, and embedded functions (Wang et al. 2020). This has pushed applications, data and computing power away from centralised points to locations closer to the user, which provides low-latency, low-energy and secure support for delay-sensitive applications. Every edge of this IoT environment has sufficient capability to learn and discover knowledge based on the big data and on each other (Deng et al. 2020; Shi et al. 2016). The IIoT has therefore become a critical research topic for solving industrial big data challenges.

IIoT-based frameworks, architectures, and taxonomies have been published in increasing numbers since the term IIoT was first used (Boyes et al. 2018). According to previous research, an IIoT system generally consists of four layers: the device layer, network layer, service layer, and content layer (Hylving and Schultze 2013; Jansen and van der Merwe 2020; Hossain and Muhammad 2016). Data is collected on the device layer, transferred on the network layer, and then analysed on the service layer. Finally, the discovered knowledge is presented on the content layer. In previous research, these four layers commonly follow a logical sequence when the framework is applied. However, with the rapid development of edge devices and communication technology, the functions of these four layers are no longer clearly separated. Data can be analysed on the service, device and network layers. Furthermore, knowledge of different types and levels is discovered pervasively; it is presented to the users and also improves the performance of every point of the IIoT framework. Both academia and industry need to understand how to achieve these functions.

This paper proposes a novel approach, the Industrial Internet of Learning (IIoL), by delivering the concept, introducing the framework and presenting case studies. The approach exploits the potential of the entire data network to solve the issues of data centralisation by means of LPWAN and edge computing technologies. In Sect. 2, relevant literature is reviewed to establish the state of the art of LPWAN and edge computing technologies. Several LPWAN technologies are discussed and compared. Edge computing is also reviewed and compared with cloud and fog computing. Based on this understanding of LPWAN and edge computing, the IIoL is proposed in Sect. 3. The framework includes two main parts, the LPWAN part and the cloud part. The LPWAN consists of smart sensors and gateways. Smart sensors sense and collect the raw data from the industrial environment, such as production lines, robots, and Computer Numerical Control (CNC) machines. Multi-dimensional data is integrated and pre-processed in the smart sensors. The pre-processed information is then sent to smart gateways, which discover knowledge for the smart sensors and the industrial environment. Finally, the data is processed further and uploaded to the cloud, which means that only the important processed information reaches the cloud. The knowledge chains flow through the entire architecture, which creates a pervasive knowledge network. This approach realises the computing and data communication potential of edge devices and LPWAN. In Sects. 4 and 5, two cases, the health prognosis of a water plant and the asset monitoring and management of an automobile factory, are presented to demonstrate the feasibility of the proposed IIoL. The final section concludes the paper.

2 Literature review

2.1 Data communication for LPWAN

In its early days, industrial communication used dedicated networks known as Fieldbus systems. This type of network was limited by parallel cabling between systems, sensors, actuators, and controllers (Sauter 2010). From the 1990s, wireless networks became increasingly popular in industry because they are not restricted by cables. These wireless networks mainly adopted the IEEE 802 protocol group (Tramarin et al. 2015). However, the main challenge of wireless networks was to ensure real-time capability and reliability, especially in the manufacturing environment (Vitturi et al. 2013). Wollschlaeger et al. (2017) believed that the Internet of Things (IoT) and cyber-physical systems (CPS) would change the industrial scenery again, because the concepts of IoT and CPS can fulfil industrial requirements regarding, e.g., real-time behaviour, mobility, safety and security. They also pointed out that IoT and CPS still faced some challenges in current industry, such as hard time boundaries, isochronous communication, low jitter, high availability and low cost. In the current industrial environment, LPWAN has become increasingly popular (Qin et al. 2019; Al-Sarawi et al. 2017) because it offers affordable connectivity to industrial devices (Raza et al. 2017). In Fig. 1, the communication technologies are compared from two perspectives, data rate and range.

Fig. 1 IoT communication technologies comparison

Compared to other wireless communication methods, LPWANs generally cover a long range. This wireless communication technology group includes several individual technologies, such as LoRa (Georgiou and Raza 2017), Sigfox (Zuniga 2016), Narrowband IoT (NB-IoT) (Ratasuk et al. 2016), and ZETA (Mekki et al. 2019). These four most popular LPWANs differ in bandwidth, data rate, communication range, support for private networks and standardisation, as shown in Table 1.

Table 1 Comparison between LoRa, Sigfox, NB-IoT and ZETA

LoRa and NB-IoT are the two leading emergent LPWAN technologies (Sinha et al. 2017). ZETA is an emerging technology that aims to reduce energy consumption while maintaining the required communication quality of the network. It is currently popular in some Asian countries such as China and Japan (ZiFiSense 2020). According to Table 1, LoRa and ZETA have the largest bandwidth with a reasonable data rate. In contrast, the bandwidth and data rate of Sigfox are lower than those of LoRa, NB-IoT and ZETA.

2.2 IIoT and edge computing: challenges and opportunities

Industrial big data refers to data generated in high volume, high variety, and high velocity that requires high veracity to create high-value knowledge, the so-called five ‘V’s (Yin and Kaynak 2015). Generally, the data generated by industry is uploaded to factory management systems or the cloud, where it is analysed by centralised cloud computing technologies (Gonçalves 2015). Cloud computing allows industrial organisations to store massive amounts of data centrally and to optimise computational resources for their data processing needs. However, conventional centralised cloud computing is encountering severe challenges, such as unreliable latency, high energy consumption, and poor security (Shi et al. 2016).

Firstly, compared to the fast-growing speed of data generation, network bandwidth has come to a standstill. With the growing quantity of data generated by the IIoT, the speed of data transportation is becoming a bottleneck for the cloud-based computing paradigm. For example, a typical automated manufacturing company generates 24 TB of data per day, resulting in 13 billion data samples per day (GE 2012). If all the data needs to be sent to the cloud for processing, the response time becomes a serious issue. In some industrial scenarios, the difference between a response time of 100 ms and 1 ms can be life-threatening (Bosch 2018). Secondly, big data could lead to an explosion in energy use. Data centres use an estimated 200 terawatt-hours (TWh) each year and contribute around 0.3% to overall carbon emissions (Jones 2018). Recent research suggests that data centres will become one of the biggest energy consumers on the planet, exceeding the energy consumption of many countries. One of the most worrying models predicts that the electricity consumed by information and communication technologies (ICT) could exceed 20% of the global total (Andrae and Edler 2015). As more applications are deployed to the cloud, meeting the growing energy demand may become untenable. Thirdly, security is another big challenge for cloud computing. With so many IoT devices, smartphones, and other computing systems now available, cyber-attacks against cloud platforms, once mostly unthinkable, have become a real threat. At present, not all cloud computing services can provide a high level of network security. Some cloud solutions hardly deliver the required security between users who share resources, applications, and systems. Therefore, threats can originate from other users of the same cloud service, and an attack that targets one user can also affect other users (Kaufman 2009).
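To make the bandwidth argument concrete, the following back-of-the-envelope sketch converts the daily data volume quoted above into the sustained uplink rate that would be needed to ship everything to the cloud. Only the 24 TB and 13 billion samples per day come from the text; the use of decimal units (1 TB = 10^12 bytes) and all variable names are assumptions made for illustration.

```python
# Rough estimate of the uplink needed to send all raw data to the cloud.
# Assumption: decimal units, 1 TB = 10**12 bytes.

SECONDS_PER_DAY = 24 * 60 * 60


def sustained_rate_bytes_per_s(bytes_per_day: float) -> float:
    """Average byte rate if the daily volume is spread evenly over the day."""
    return bytes_per_day / SECONDS_PER_DAY


daily_bytes = 24 * 10**12      # 24 TB per day (figure quoted in the text)
daily_samples = 13 * 10**9     # 13 billion samples per day (figure quoted in the text)

rate_bytes = sustained_rate_bytes_per_s(daily_bytes)
rate_gbit = rate_bytes * 8 / 10**9
bytes_per_sample = daily_bytes / daily_samples

print(f"{rate_bytes / 10**6:.1f} MB/s sustained")
print(f"{rate_gbit:.2f} Gbit/s sustained")
print(f"{bytes_per_sample:.0f} bytes per sample on average")
```

Under these assumptions, a single plant would need a permanently saturated multi-gigabit link before any retransmission or protocol overhead is counted, which illustrates why sending only processed data upstream is attractive.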

To address these challenges, a new technology, edge computing, is driving a trend that shifts the functions of centralised cloud computing to the edge of networks (Ai et al. 2018). Similar to cloud computing, edge service providers furnish application, data computation, and storage services to end-users. However, edge services provide low-latency, low-energy and secure support for delay-sensitive applications. Although edge computing has several advantages over cloud computing, research in this emerging domain is still in its infancy (Khan et al. 2019). There are three conventional edge computing technologies: cloudlet (Satyanarayanan et al. 2009), mobile edge computing (Hu et al. 2015) and fog computing (Bonomi et al. 2012). Mobile edge computing was defined by the European Telecommunications Standards Institute (ETSI) as an edge computing technology. It allows mobile users to use computing services from the edge of the mobile network, within the Radio Access Network (RAN) and close to mobile subscribers (Hu et al. 2015). A cloudlet is a mobility-enhanced small-scale cloud data centre located at the edge of the network. It supports resource-intensive and interactive mobile devices with lower latency. Unlike the cloud, a cloudlet needs to be more agile in its configuration in order to associate with mobile devices. The offloaded services also need to be migrated seamlessly between cloudlets (Satyanarayanan et al. 2009). Fog computing extends the cloud computing paradigm to the edge of networks, in particular wireless networks for the IoT. It is a highly virtualised platform that provides compute, storage, and networking services between end devices and traditional cloud computing data centres, but it is not exclusively located at the edge of the network (Bonomi et al. 2012).

Li et al. (2019) proposed a resource scheduling approach for manufacturing systems on edge devices. The approach contains two aspects: an algorithm for selecting the edge server (SA-ES) and the cooperation of edge computing for low-latency tasks (CEC). The computing results were obtained based on the communication, computing and queuing times. In the experiments, the proposed approach was compared with cloud server computing and ordinary edge computing. The validation results showed that the approach generally achieved the lowest latency together with the lowest energy consumption. Interestingly, for tasks with a small data size, the performance of the proposed approach was similar to that of ordinary edge computing. Compared with edge computing, cloud computing was typically about 30% slower and consumed about 50% more energy. The paper also showed that resource scheduling is necessary for edge computing, as it lowers both computing latency and energy consumption.

Qi and Tao (2019) proposed several reference architectures of edge, fog, and cloud computing in smart manufacturing. These architectures were related to, yet independent of, each other. The hierarchical architecture included three levels of computing, in which edge computing was closest to the real world. Fog computing sat between edge and cloud computing with medium speed, and cloud computing was at the top of the pyramid, focusing on collaboration tasks with high latency. The article also provided three further architectures, one each for edge, fog and cloud, which displayed their features and development roadmap.

In addition, the services on these three computing platforms can be summarised. Generally, services on the edge are less intelligent than those on the fog and cloud. Nevertheless, their response time is much shorter, which makes real-time control achievable. On the cloud, high-level intelligent services are provided, such as task management, supply–demand matching, personalised customisation, smart design collaboration and commerce collaboration. The authors aimed to generate an overall structure for smart manufacturing that includes edge, fog and cloud computing. Based on their capabilities and features, services of various levels are provided, which allows the three levels of computing technology to cooperate in meeting the requirements of smart manufacturing.

3 IIoL: pervasive knowledge network—concepts and framework

3.1 Concept and framework of IIoL

The IIoL architecture, shown in Fig. 2, consists of three main sections: the physical section, the LPWAN section, and the cloud section. In the physical section, the industrial equipment, including industrial robots, computer numerical control (CNC) machines, and other industrial machines, is the primary data source. The smart sensors of the LPWAN section collect the raw data from these data sources. Another critical part of the physical section is the information platforms, which present the information and knowledge discovered in the cloud and LPWAN sections.

Fig. 2 The framework of IIoL

In the LPWAN section, this research proposes an internet-of-learning-based framework for industry, which is called the Industrial Internet of Learning (IIoL). The smart sensors collect raw data from industrial equipment and pre-process it on their embedded processors. Low-level information, such as system status warnings, is delivered to the physical section to assist people in making correct decisions. All of the pre-processed data is uploaded to the smart gateways, where it is analysed by advanced data analytics algorithms, such as regression, classification, and clustering. Through this processing, the data becomes more abstract. Two types of knowledge are generated in the smart gateways. The first is for the smart sensors and improves their parameter settings in both software and hardware. The second is presented in the physical section and focuses on the entire manufacturing system. For example, twin models and machine health analyses are generated based on this knowledge. In the proposed IIoL framework, only processed data is uploaded to the cloud. In the cloud, high-level data analysis, such as integrated simulation and synthesis, collaborative diagnostics and decision making, is completed. Cloud-discovered knowledge can be used to improve the internet-of-learning-based LPWAN. In general, the components of the proposed LPWAN, including smart sensors and gateways, can learn from each other and from the cloud. Additionally, AI knowledge, such as self-configuration, self-adjustment, and self-optimisation, is discovered on the cloud.

The proposed framework matches the structure and requirements of the 5C architecture proposed by Lee et al. (2015), as shown in Fig. 3. In the 5C architecture, five levels of functions are described: the Sensor connection level, Data to information conversion level, Cyber level, Cognition level, and Configuration level. They are classified under three main technologies: Data technology, Analytic technology, and Operation technology. In the proposed structure, the smart sensors map to the functions of the Sensor connection level and the Data to information conversion level. This benefits from the data collection and processing capabilities of edge devices. These capabilities correspond to the expected features of Data technology and Analytic technology defined for the 5C architecture, such as high data transfer rate, low latency, high reliability, data pre-processing and data visualisation. The proposed smart gateways provide the functions of the Cyber level, where Analytic technology is applied. Furthermore, the cloud makes the manufacturing system intelligent and self-configured, which are the core functions of the Cognition level and Configuration level. To discover hidden patterns and knowledge, Operation technology is used for achieving enterprise control and intelligent optimisation. In general, the proposed framework includes the primary functions and features of the 5C architecture (Lee et al. 2015). In the next section, the knowledge chains in this framework are examined and discussed in more detail.

Fig. 3 The mapping between the proposed framework IIoL and 5C architecture (Lee 2019)

3.2 Pervasive knowledge network: IIoL-enabled LPWAN and cloud

Based on the computing capability of each edge in the proposed LPWAN, including smart sensors and smart gateways, knowledge is discovered and flows between smart sensors and gateways. With the cloud integrated, there are several knowledge chains in the proposed framework. This section discusses these knowledge chains in detail. Before the methodology is introduced, the nomenclature is listed in Table 2.

Table 2 The list of nomenclature which is used in the methodology

3.2.1 On smart sensors

The raw data collected from the physical section by smart sensors is represented as follows:

$${\text{D}} = \left[ {{\text{D}}_{1} , {\text{D}}_{2} , \ldots , {\text{D}}_{r} } \right],$$
(1)

where \({\text{D}}\) is the entire raw dataset, which consists of r features, \({\text{D}}_{1} , {\text{D}}_{2} , \ldots , {\text{D}}_{r}\).

There are three types of knowledge for the physical world, Knowledge I, III and V, which are discovered on smart sensors, smart gateways and the cloud, respectively. For Knowledge I, the basic representation is:

$$\begin{aligned} {\text{E}}_{1}^{kp} &= g_{kp} \left( {{\text{D}},e_{1}^{kp} } \right), \\ {\text{E}}_{2}^{kp} &= g_{kp} \left( {{\text{D}},e_{2}^{kp} } \right), \\ {\text{E}}_{n}^{kp} &= g_{kp} \left( {{\text{D}},e_{n}^{kp} } \right), \\ \end{aligned}$$
(2)

where \({\text{E}}_{1}^{kp} \ldots {\text{E}}_{n}^{kp}\) are the knowledge obtained on the edge for the physical world (Knowledge I), \(g_{kp} \left( \cdot \right)\) is the function that discovers the knowledge for the physical world, \({\text{n}}\) is the number of edge devices, and \(e_{1}^{kp} \ldots e_{n}^{kp}\) are the variables of the function \(g_{kp} \left( \cdot \right)\). Furthermore, the data is also pre-processed on smart sensors, which is defined as:

$$\begin{array}{*{20}c} {E_{1}^{kf} = g_{kf} (D_{1}^{E} ,e_{1}^{kf} ),} & {E^{KF} = [E_{1}^{kf} ,E_{2}^{kf} , \ldots ,E_{n}^{kf} ],} \\ {E_{2}^{kf} = g_{kf} (D_{2}^{E} ,e_{2}^{kf} ),} & {E_{1}^{FK} = [E_{1}^{kf} ,E_{2}^{kf} , \ldots ,E_{j}^{kf} ],} \\ {{\text{E}}_{n}^{kf} = g_{kf} \left( {{\text{D}}_{n}^{E} ,e_{n}^{kf} } \right),} & { {\text{E}}_{2}^{FK} = \left[ {{\text{E}}_{j + 1}^{kf} , {\text{E}}_{j + 2}^{kf} , \ldots , {\text{E}}_{k}^{kf} } \right],} \\ \end{array}$$
(3)

In the above functions, \({\text{E}}_{1}^{kf} \ldots {\text{E}}_{n}^{kf}\) are the pre-processing results for the smart gateway, \(g_{kf} \left( \cdot \right)\) is the function that discovers the knowledge for the smart gateway, and \(e_{1}^{kf} \ldots e_{n}^{kf}\) are the variables of the function \(g_{kf} \left( \cdot \right)\).

3.2.2 On smart gateways

In the smart gateway section, two types of knowledge are discovered. One is for the physical world (Knowledge III), for which the knowledge chain is represented as:

$$\begin{aligned} {\text{F}}_{1}^{KP} = & \, f_{kp} \left( {{\text{E}}_{1}^{FK} ,f_{1}^{kp} } \right) \\ {\text{F}}_{2}^{KP} = &\, f_{kp} \left( {{\text{E}}_{2}^{FK} ,f_{2}^{kp} } \right) \\ {\text{F}}_{n}^{KP} = & \, f_{kp} \left( {{\text{E}}_{m}^{FK} ,f_{n}^{kp} } \right) \\ \end{aligned}$$
(4)

In this set of functions, \({\text{F}}_{1}^{KP} \ldots {\text{F}}_{n}^{KP}\) are the knowledge for the physical world (Knowledge III), \(f_{kp} \left( \cdot \right)\) is the function that discovers the knowledge for the physical world, and \(f_{1}^{kp} \ldots f_{m}^{kp}\) are the variables of the function \(f_{kp} \left( \cdot \right)\). The other type of knowledge (Knowledge II) is for smart sensors, for which the knowledge chains are denoted as:

$$\begin{array}{*{20}l} {{\text{F}}_{1}^{KE} = f_{ke} \left( {{\text{E}}_{1}^{FK} ,f_{1}^{ke} } \right)} & {f_{1}^{ke} = \left[ {{\text{e}}_{1} ,{\text{e}}_{2} , \ldots ,{\text{e}}_{j} } \right]} \\ {{\text{F}}_{2}^{KE} = f_{ke} \left( {{\text{E}}_{2}^{FK} ,f_{2}^{ke} } \right)} & {f_{2}^{ke} = \left[ {{\text{e}}_{j + 1} ,{\text{e}}_{j + 2} , \ldots ,{\text{e}}_{k} } \right]} \\ {{\text{F}}_{n}^{KE} = f_{ke} \left( {{\text{E}}_{n}^{FK} ,f_{n}^{ke} } \right)} & {f_{n}^{ke} = \left[ {{\text{e}}_{n - 2} ,{\text{e}}_{n - 1} , \ldots ,{\text{e}}_{n} } \right]} \\ \end{array}$$
(5)

where \({\text{F}}_{1}^{KE} \ldots {\text{F}}_{n}^{KE}\) are the knowledge for smart sensors (Knowledge II), \(f_{ke} \left( \cdot \right)\) is the knowledge discovery function, and \(f_{1}^{ke} \ldots f_{n}^{ke}\) are the variables of the function \(f_{ke} \left( \cdot \right)\), which are related to the inputs of the smart sensors \({\text{e}}\). Meanwhile, in the smart gateways, the data collected from smart sensors is processed again for uploading to the cloud:

$$\begin{aligned} {\text{F}}_{1}^{KC} &= f_{kc} \left( {{\text{E}}_{1}^{FK} ,f_{1}^{kc} } \right) \\ {\text{F}}_{2}^{KC} &= f_{kc} \left( {{\text{E}}_{2}^{FK} ,f_{2}^{kc} } \right) \\ {\text{F}}_{m}^{KC} &= f_{kc} \left( {{\text{E}}_{m}^{FK} ,f_{m}^{kc} } \right) \\ \end{aligned}$$
(6)

Here, \({\text{F}}_{1}^{KC} \ldots {\text{F}}_{m}^{KC}\) are the processed data for the cloud, \(f_{kc} \left( \cdot \right)\) is the data processing function for the cloud, \({\text{m}}\) is the number of smart gateways, and \(f_{1}^{kc} \ldots f_{m}^{kc}\) are the variables of the function \(f_{kc} \left( \cdot \right)\).

3.2.3 On cloud

On the cloud, the primary knowledge (Knowledge V) \({\text{C}}^{KP}\) for the physical world is obtained by using the function \(h_{kp} \left( \cdot \right)\):

$${\text{C}}^{KP} = h_{kp} \left( {{\text{F}}^{KC} ,\Delta^{kp} } \right)$$
(7)

The input of \(h_{kp} \left( \cdot \right)\) is the processed data obtained from the smart gateways. The cloud can also provide another type of knowledge (Knowledge IV) to the smart gateways, based on high-level information, to improve their performance. It is represented as:

$$\begin{aligned} {\text{C}}_{1}^{kf} &= h_{kf} \left( {{\text{F}}_{1}^{KC} ,\Delta^{kf} } \right) \\ {\text{C}}_{2}^{kf} &= h_{kf} \left( {{\text{F}}_{2}^{KC} ,\Delta^{kf} } \right) \\ {\text{C}}_{m}^{kf} &= h_{kf} \left( {{\text{F}}_{m}^{KC} ,\Delta^{kf} } \right) \\ \end{aligned}$$
(8)

Here, \({\text{C}}_{1}^{kf} \ldots {\text{C}}_{m}^{kf}\) are the knowledge (Knowledge IV) delivered to each smart gateway, \(h_{kf} \left( \cdot \right)\) is the knowledge discovery function, and \({\text{F}}_{1}^{KC} \ldots {\text{F}}_{m}^{KC}\) are its variables. In both expressions, \(\Delta^{kf}\) and \(\Delta^{kp}\) are the bias terms. According to the framework details shown above, five types of knowledge are discovered in the proposed framework. They are summarised in Table 3, which clarifies the main contents, the target of the knowledge and the location of discovery.
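The chain defined by Eqs. (1) to (8) can be read as a composition of functions across the three tiers. The sketch below is only an illustration of that data flow: the function names mirror the symbols in the equations, but their bodies, the thresholds, and the sample readings are hypothetical placeholders and are not the functions used in the case studies.

```python
from statistics import mean


# Sensor tier (Eqs. 2 and 3). `d` is the raw sample list of one smart sensor.
def g_kp(d, limit):
    """Knowledge I: a status flag shown directly to operators."""
    return max(d) > limit


def g_kf(d):
    """Pre-processing: reduce raw samples to a [mean, span] feature vector."""
    return [mean(d), max(d) - min(d)]


# Gateway tier (Eqs. 4 to 6). `e_group` holds the feature vectors of one gateway.
def f_kp(e_group, span_limit):
    """Knowledge III: indices of sensors whose signal span looks abnormal."""
    return [i for i, (_, span) in enumerate(e_group) if span > span_limit]


def f_ke(e_group):
    """Knowledge II: a tuned alarm limit sent back down to the sensors."""
    return 1.5 * mean(m for m, _ in e_group)


def f_kc(e_group):
    """Processed data for the cloud: one summary vector per gateway."""
    return [mean(m for m, _ in e_group), max(s for _, s in e_group)]


# Cloud tier (Eqs. 7 and 8). `f_all` holds the summaries of all gateways.
def h_kp(f_all):
    """Knowledge V: a plant-wide indicator for the physical world."""
    return mean(f[0] for f in f_all)


def h_kf(f_one, plant_level):
    """Knowledge IV: a per-gateway offset relative to the plant-wide level."""
    return f_one[0] - plant_level


# Made-up readings: two gateways with two and one attached sensors.
raw = {
    "gw1": [[1.0, 1.2, 0.9], [1.1, 3.0, 1.0]],
    "gw2": [[2.0, 2.1, 1.9]],
}
LIMIT, SPAN_LIMIT = 2.5, 1.0  # hypothetical thresholds

features = {gw: [g_kf(d) for d in sensors] for gw, sensors in raw.items()}
knowledge_1 = {gw: [g_kp(d, LIMIT) for d in sensors] for gw, sensors in raw.items()}
knowledge_3 = {gw: f_kp(e, SPAN_LIMIT) for gw, e in features.items()}
knowledge_2 = {gw: f_ke(e) for gw, e in features.items()}
to_cloud = {gw: f_kc(e) for gw, e in features.items()}
knowledge_5 = h_kp(list(to_cloud.values()))
knowledge_4 = {gw: h_kf(f, knowledge_5) for gw, f in to_cloud.items()}

print(knowledge_1, knowledge_3)
```

Note that only `to_cloud`, not `raw`, crosses the gateway boundary, which is the property the framework relies on to limit upstream traffic.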

To validate the feasibility of the concept, framework and pervasive knowledge network, two cases are demonstrated in the next two sections. In the first case, the proposed framework was applied to the machine health prognosis of a water plant. The plant consists of three pumping stations located in three different places, with over ten pump machines installed in each pumping station. The vibration data of the pump machines is monitored, transferred and analysed for predictive maintenance. The second case concerns asset monitoring and management in an automobile production factory with an area of over 200,000 square metres. The status of laser rooms and assembly robots in this factory is monitored and controlled, and predictive maintenance (PdM) is applied to the robots. Both cases have been implemented in real scenarios, guided by the IIoL framework, whose LPWAN is based on the ZETA wireless communication technology. The data in these two cases is collected from operating machines, systems, and working environments, and is used for improving and maintaining the relevant manufacturing systems.

Table 3 The list of knowledge discovered by the proposed method

4 Case study I: machine health prognosis and management of a water plant

4.1 Background

In this case, machine health prognosis was analysed under the framework of the proposed IIoL approach, focusing on a water plant that includes three pumping stations. These pumping stations are located in Shenzhen, China, as displayed on the map in Fig. 4. The distance between the central station and the west station is about 1.1 km, and the distance to the east station is about 0.35 km. The smart gateway is located in the central station. In each station, over ten water pumps are monitored by smart sensors. The data is sensed, collected and pre-processed in each smart sensor. The pre-processed data is then sent to the smart gateway for further analytics. The knowledge for improving data collection and pre-processing is returned to the smart sensors. Essential information about urgent maintenance is displayed to the maintenance operators. This knowledge and information are also uploaded to the cloud for complete analytics. Advanced big data analysis, such as machine learning and deep learning, is applied on the cloud for machine health prognosis, which will also integrate further data and information from outside the system.

Fig. 4 Locations of pumping stations

4.2 Data pre-processing on smart sensors

The smart sensors used in the case study collect the vibration data of the pumps. A sensor is mounted on each pump and connects wirelessly to the smart gateway over the ZETA network. Figure 5 shows the pumps in the pump station and the smart sensor mounted on a pump machine. The smart sensor is built on the STM32L452CC processor, an ultra-low-power processor with FlexPowerControl (STMicroelectronics 2019). The vibration sensor in the smart sensor is a micro-electro-mechanical systems (MEMS) vibration sensor, which collects acceleration signals on three axes: the x-axis, y-axis, and z-axis. The sensor collects 4096 sampling points at a sampling frequency of 3200 Hz.
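The sampling figures above fix a few derived quantities that matter for vibration analysis. The following sketch works them out, assuming that the 4096 points form one contiguous acquisition window per axis (the text does not state this explicitly); the variable names are illustrative.

```python
# Derived quantities for one acquisition, assuming the 4096 points
# are a single contiguous window sampled at 3200 Hz.

SAMPLES = 4096
RATE_HZ = 3200

window_s = SAMPLES / RATE_HZ        # duration of one window in seconds
nyquist_hz = RATE_HZ / 2            # highest frequency that can be resolved
resolution_hz = RATE_HZ / SAMPLES   # spacing between spectral bins of an FFT

print(f"window length:        {window_s} s")
print(f"Nyquist frequency:    {nyquist_hz} Hz")
print(f"frequency resolution: {resolution_hz} Hz")
```

Under this assumption, each window lasts 1.28 s, vibration components up to 1600 Hz can be observed, and spectral features computed on the sensor have a bin spacing of 0.78125 Hz.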

Fig. 5 Pumps (left) and smart sensor (right) in the water station

Based on the IIoL framework, the raw data is pre-processed on the smart sensors, and only the pre-processed data is sent to the smart gateway. In this case study, 57 pre-processed features are generated on the smart sensors, which are listed in Table 4.

Table 4 Pre-processed features generated by smart sensors

Apart from being sent to the gateway, these features can be displayed on the user interface according to the interests of the operators and plant managers. The displayed information corresponds to Knowledge I in the proposed framework. In this case, the mean, maximum and minimum values of the x, y, and z axes are displayed, which gives plant operators and managers a general idea of the vibrations on the three axes in terms of basic statistics.
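As an illustration of the Knowledge I statistics named above, the following sketch computes the mean, maximum and minimum per axis for one window of three-axis acceleration samples. The function name, the dictionary keys and the three sample tuples are made up for the example; the remaining features of Table 4 are not reproduced here.

```python
from statistics import mean


def basic_axis_stats(window):
    """Return mean, max and min for each axis.

    `window` is a list of (x, y, z) acceleration tuples.
    """
    stats = {}
    # zip(*window) regroups the samples into one sequence per axis.
    for name, values in zip("xyz", zip(*window)):
        stats[f"{name}_mean"] = mean(values)
        stats[f"{name}_max"] = max(values)
        stats[f"{name}_min"] = min(values)
    return stats


# Made-up samples; a real window would hold 4096 tuples.
window = [(0.0, 1.0, -1.0), (2.0, 1.0, -3.0), (4.0, 1.0, -2.0)]
stats = basic_axis_stats(window)
print(stats)
```

These nine values are cheap enough to compute on a microcontroller and small enough to send over a low-data-rate link, which is why such summary statistics suit the sensor tier.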

4.3 Knowledge and information discovery on smart gateways

Technical details of the smart gateway are presented in Table 5. The smart gateway uses the ATSAMA5D36 as its core processor. This processor is based on the ultra-low-power ARM Cortex-A5 core, yet it has reasonable computing capability, with a 536 MHz CPU speed and 128 KB of SRAM (Cortex-A5™ 2009). The smart gateway can cover over 5 km in an urban environment, so all three pumping stations lie within its communication range.

Table 5 Technical details of the smart gateway

On the smart gateway, all pre-processed data collected from the pump machines is integrated. The data from each pump machine is explicitly labelled to identify the machine. A clustering process is applied to capture the general machine behaviours, which are represented as different clusters on the smart gateway. K-Means is used as the clustering algorithm in this case study (Jiang et al. 2020). The number of clusters is set to three, corresponding to the three pump stations. Outlier data is then highlighted and presented to the operator as Knowledge III. The clustering results are also integrated into the dataset that is uploaded to the cloud. In parallel with the clustering, a feature selection process runs on the smart gateway to determine the most important of the pre-processed features. Entropy-based feature ranking is used to quantify the importance of each feature. To make better use of the processing capability of the smart sensors, only the top 80% of features in the ranking are used for pre-processing on the smart sensors. This knowledge (Knowledge II) is sent down to each smart sensor. However, it is dynamic information that is influenced not only by the clustering on the smart gateway but also by the results sent by the cloud, as explained in the next subsection.
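The clustering step can be sketched with a plain implementation of K-Means (Lloyd's algorithm) with three clusters. The text does not say how outliers are identified, so the rule used here, flagging points that lie more than twice the average distance from their cluster centroid, is an assumption, as are the two-dimensional made-up feature vectors and the choice to seed the centroids with the first three points for determinism.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm; the first k points seed the centroids."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def flag_outliers(points, labels, centroids, factor=2.0):
    """Assumed rule: distance to centroid exceeds factor * cluster average."""
    dists = [math.dist(p, centroids[lab]) for p, lab in zip(points, labels)]
    flagged = []
    for c in range(len(centroids)):
        idx = [i for i, lab in enumerate(labels) if lab == c]
        if not idx:
            continue
        avg = sum(dists[i] for i in idx) / len(idx)
        flagged += [i for i in idx if avg > 0 and dists[i] > factor * avg]
    return sorted(flagged)


# Made-up feature vectors: three groups plus one stray point (index 11).
points = [
    (0.0, 0.0), (5.0, 5.0), (10.0, 0.0),
    (0.1, 0.0), (0.0, 0.1), (-0.1, 0.0), (0.0, -0.1),
    (5.1, 5.0), (4.9, 5.0),
    (10.0, 0.1), (10.0, -0.1),
    (1.5, 1.5),
]

labels, centroids = kmeans(points, k=3)
outliers = flag_outliers(points, labels, centroids)
print(labels)
print(outliers)
```

In a deployment, the flagged indices would be mapped back to pump identifiers before being shown to the operator, and a library implementation with multiple random restarts would usually replace this fixed seeding.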

4.4 Machine health prognosis on the cloud

Using the processed data, the health of each pump machine is predicted, especially for its critical component, the bearing. Based on historical data and domain knowledge, the health of each pump machine is expressed as a health score from 0 to 100. A score of 0 means the pump machine can no longer be used, and 100 means the machine is new. This score is the main prediction target for machine health prediction and is presented to the managers as Knowledge V. In this case, three prediction methods are used: linear regression, decision tree and neural network. Comparing the prediction accuracy of these three algorithms, the neural network achieved the highest accuracy. The predicted health score is presented to operators and system managers, who use it for understanding and maintaining the machines.

Furthermore, on the cloud, all the data is integrated and the features are ranked again. The feature selection results are sent to the smart gateways, which improves their processing capability. This knowledge (Knowledge IV) is also compared with the feature selection results of the smart gateway to enrich the understanding of the processed vibration features.
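The feature ranking performed on the gateway and repeated on the cloud can be sketched as follows. The text names entropy-based ranking and an 80% cut-off but does not give the scoring formula, so this example assumes one simple variant: each feature is scored by the Shannon entropy of an equal-width histogram of its values, and higher entropy ranks higher. The feature names and values are made up.

```python
import math
from collections import Counter


def histogram_entropy(values, bins=4):
    """Shannon entropy (in bits) of an equal-width histogram of `values`."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return 0.0  # a constant feature carries no information
    width = (hi - lo) / bins
    # The maximum value is clamped into the last bin.
    counts = Counter(min(int((v - lo) / width), bins - 1) for v in values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def rank_features(table, keep_ratio=0.8):
    """Rank features by entropy and keep the top `keep_ratio` share.

    `table` maps a feature name to its list of observed values.
    """
    ranking = sorted(
        table, key=lambda name: histogram_entropy(table[name]), reverse=True
    )
    keep = max(1, round(keep_ratio * len(ranking)))
    return ranking, ranking[:keep]


# Made-up observations of five features over four windows.
table = {
    "x_mean": [0.0, 1.0, 2.0, 3.0],
    "y_mean": [0.0, 0.0, 3.0, 3.0],
    "z_mean": [0.0, 0.0, 0.0, 3.0],
    "x_max": [0.0, 1.0, 3.0, 3.0],
    "const": [1.0, 1.0, 1.0, 1.0],
}

ranking, kept = rank_features(table)
print(ranking)
print(kept)
```

Running the same routine on the gateway's local data and on the cloud's pooled data yields two selections, and their overlap is one simple way to compare the two tiers as described above.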

5 Case study II: assets monitoring and management for automobile factory

5.1 Background

The second case, designed under the IIoL framework, concerns asset monitoring and management in an automobile factory whose production area is over 200,000 square meters. In this case, two laser rooms and over 40 industrial robots are the main targets. The working environment is shown in Fig. 6. The left image shows the smart sensors that monitor the cooling water status and the working environment of the laser room. The right image shows the smart sensor on the robots, which senses the temperature and vibration of the industrial robots.

Fig. 6 The working environment of the automobile production factory

Several types of smart sensors have been installed for data collection and pre-processing. Because of the large production area and the signal interference caused by the factory facilities, the smart gateway cannot cover the entire factory area within a reasonable network. To solve this issue, two mote devices are used in this case to enhance wireless communication strength and extend the communication range. The details of the mote devices are introduced in Sect. 5.3. By using the data collected from the smart sensors and the proposed IIoL approach, the working environment and device status are monitored and analysed, and the industrial robots are maintained predictively.

5.2 Data pre-processing on smart sensor

The smart sensor in this case uses the same processor as in the first case; the details are given in Sect. 4.3. However, more types of sensors are applied in this case. The features of the raw data are explained in Table 6. According to the table, the temperature and humidity of the laser room are monitored both inside and outside. For the industrial robots, the temperature of the cooling water and the temperature and vibration of the robot servo motors are also monitored. Each smart sensor is connected wirelessly to a smart gateway across the entire automobile factory.

Table 6 Feature description

Similar to case one, 19 pre-processed features are generated on the smart sensor for each raw feature, so that 133 features in total are generated and sent to the smart gateway over the LPWAN. Additionally, the mean, maximum and minimum values of the laser room readings over a period are presented to the operator, together with an alert, as the defined Knowledge I. Three colour-coded alert levels are used in this case.
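A minimal sketch of how the Knowledge I summary and alert could be produced on the smart sensor is given below. The colour names, the thresholds and the choice to base the alert on the maximum value are hypothetical; the paper only states that the mean, maximum and minimum are shown with one of three colour alerts.

```python
def summarise(readings):
    """Mean, maximum and minimum of the readings collected over one period."""
    return {
        "mean": sum(readings) / len(readings),
        "max": max(readings),
        "min": min(readings),
    }

def colour_alert(summary, warn=30.0, critical=35.0):
    """Map a summary to one of three alert colours (thresholds are hypothetical)."""
    if summary["max"] >= critical:
        return "red"
    if summary["max"] >= warn:
        return "amber"
    return "green"

# Invented laser room temperature readings (degrees Celsius) for three periods.
normal = summarise([22.0, 23.0, 24.0])
warm = summarise([22.0, 31.0, 25.0])
hot = summarise([36.0, 20.0])
print(normal, colour_alert(normal), colour_alert(warm), colour_alert(hot))
```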

5.3 Knowledge and information discovery on smart gateways

To avoid unnecessary data loss in the larger and more complex working environment, this case uses an additional device, called a mote, to enhance the communication strength of the LPWAN. The mote is a low-power, battery-supported wireless network middleware designed for extending and enhancing the LPWAN, especially in production factories that require a larger network range. The details of the mote used in this case are shown in Table 7.

Table 7 Technique details of the mote

Compared with a smart gateway, the mote gives up computing capability in exchange for much lower power usage, especially in standby, which allows it to run on battery power. It has proved to be an indispensable component of the IIoT in this case study. After the pre-processed data is sent to the smart gateway via the mote, all the data is integrated and divided into two datasets: the laser room dataset and the industrial robot dataset. Similar to the last case, clustering is one of the primary data analysis methods used on the smart gateway, and its results are presented to the relevant users as the defined Knowledge III. The status cluster information is integrated with the pre-processed data and uploaded to the cloud. Additionally, the feature selection method is also applied on the smart gateway in this case. The laser room and industrial robot features are selected separately. The selected feature knowledge (Knowledge II) is sent back to the smart sensors for improvement.

5.4 Factory assets monitoring and management on the cloud

After the processed data is received from the smart gateways, it is integrated with historical data. Based on the laser room monitoring data and the processed data, the laser room monitoring status is predicted. The predictive time window is set to 1 s. A time-series neural network, the long short-term memory (LSTM) network, is used in this case study. Another primary function on the cloud is predictive maintenance of the industrial robots. The processed data is used together with the historical robot maintenance dataset. This case takes the remaining useful life (RUL) of the robots as the output of the modelling. A suitable modelling approach is explored to establish a RUL prediction model for the industrial robots. Since there are multiple types of data in the integrated database, a merged neural network, which consists of a long short-term memory network and an artificial neural network, is designed for modelling. A merged neural network can learn the hidden patterns from different data types and fuse these patterns to further learn the abstract patterns that are relevant to the RUL of the robots. The RUL information is displayed to the factory managers as Knowledge V. The RUL results can also help the smart gateway to improve feature selection, which is referred to as the defined Knowledge IV.
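Before a time-series network such as an LSTM can be trained for the laser room status prediction, the monitoring series has to be cut into fixed-length input windows, each paired with the value to be predicted. The sketch below shows one way to build such pairs. It assumes a sampling period of 1 s, so that the 1 s predictive window corresponds to a horizon of one step; the window length, sampling period and readings are hypothetical, and the network itself is not shown.

```python
def make_windows(series, window, horizon=1):
    """Pair each run of `window` samples with the sample `horizon` steps after it."""
    pairs = []
    for start in range(len(series) - window - horizon + 1):
        inputs = series[start:start + window]
        target = series[start + window + horizon - 1]
        pairs.append((inputs, target))
    return pairs

SAMPLE_PERIOD_S = 1.0       # assumed sensor sampling period
PREDICTION_WINDOW_S = 1.0   # predictive time window stated in the text
horizon_steps = round(PREDICTION_WINDOW_S / SAMPLE_PERIOD_S)

# Invented laser room temperature series (degrees Celsius).
temperature = [20.0, 20.5, 21.0, 21.5, 22.0, 22.5]
training_pairs = make_windows(temperature, window=3, horizon=horizon_steps)
for inputs, target in training_pairs:
    print(inputs, "->", target)
```

With a faster sampling rate, the same 1 s window would simply map to a larger number of horizon steps.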

6 Discussion

In the last two sections, two industrial cases were presented to validate the feasibility of the proposed approach, IIoL. The IIoL has been implemented successfully in these two industrial scenarios, whose working environments, situations and performance differ. In the first case, the health of the pump machines was analysed, predicted and presented to the operators and managers. Knowledge and data were exchanged between the smart sensors and the smart gateway, which improves the performance of the LPWAN and cloud sections. In the second case, the proposed method was applied in an even larger and more complex working area. This case focused on two types of industrial targets, the laser rooms and the industrial robots. The industrial robots are maintained predictively, benefiting from the proposed IIoL approach.

Although the working environments of the two case studies differ, they share some commonalities. Both cases are located in a large and complicated working field, generally larger than one hundred thousand square meters. Furthermore, both cases are surrounded by sophisticated electrical devices, which cause signal interference. These two commonalities highlight the advantages of LPWAN: wide range and high stability. Additionally, LPWAN is a low-power network, so the power cost of running it is low. However, according to the principle of the proposed approach, both cases are still at an early stage. The cases are still being developed under the proposed framework to achieve their goals, mainly the functions on the cloud. More relevant data needs to be collected by integrating more sensors. The results of the machine health prognosis and predictive maintenance are at the testing stage and are expected to be used in practice soon.

Apart from the industrial case studies introduced above, the proposed approach can also be applied to other manufacturing systems, subject to some requirements. First of all, the smart sensors need a friendly interface to sense and collect essential data from the target system, which provides the system with a digital industrial environment. A suitable digital environment can show real-time data and information to the relevant people. Moreover, the target system needs to be covered by an LPWA network. In the above case studies, the ZETA network was implemented as the supporting LPWAN technology, and it can be replaced by another LPWAN technology. The choice of technology depends on the local wireless communication policy and the infrastructure budget, and should be planned before the proposed approach is applied.

7 Conclusions

IIoT is one of the core technologies in the age of Industry 4.0. The computing capability of current edge devices is strong enough to carry out some complex data processing. LPWAN technologies can cover a large area with low power consumption, which makes them suitable wireless communication technologies for factories. The focus of this paper has been on discovering the full potential of the IIoT network, which includes edge and network devices. A data and knowledge framework (IIoL) was proposed, inspired by a review of related research indicating the significant computing and data-communication capability of edge devices and LPWAN. In the proposed framework, every point of the network is used to generate relevant information and knowledge. The information and knowledge are exchanged between edge devices and gateways to improve the performance of the entire system. Two case studies are presented in this paper, in which the feasibility of the proposed approach was demonstrated on a pump station and an automobile factory. Data was collected for different purposes, and the smart sensors and gateways were improved by the knowledge discovered through the proposed IIoL approach. On the cloud, functional models were built using advanced data analytics technologies, such as machine learning and deep learning. The results are used to assist relevant professionals in understanding the system and maintaining the critical components predictively. Finally, this approach is data-driven in nature and can also be developed to help many other manufacturing systems beyond the cases introduced in this paper.