A HIERARCHICAL MACHINE LEARNING FRAMEWORK FOR THE IDENTIFICATION OF AUTOMATED CONSTRUCTION OPERATIONS

A robust monitoring system is essential for ensuring safety and reliability in automated construction. Activity recognition is one of the critical tasks in automated monitoring. Existing studies in this area have not fully exploited the potential for enhancing the performance of machine learning algorithms using domain knowledge, especially in problem formulation. This paper presents a hierarchical machine learning framework for improving the accuracy of identification of Automated Construction System (ACS) operations. The proposed identification framework arranges the operations to be identified in a hierarchy and uses multiple classifiers, organized hierarchically, to separate the operation classes. It is tested on a laboratory prototype of an ACS, which follows a top-down construction method. The ACS consists of a set of lightweight and portable machinery designed to automate the construction of the structural frame of low-rise buildings. Accelerometers were deployed at critical locations on the structure. The acceleration data collected while operating the equipment were used to identify the operations through machine learning techniques. The performance of the proposed framework is compared with that of the conventional approach for equipment operation identification, which involves a flat list of classes to be separated. The performance was comparable at the top level. However, the hierarchical framework outperformed the conventional one when fine levels of operations were identified. The versatility and noise tolerance of the hierarchical framework are also reported. Results demonstrate that the framework is robust and that it is feasible to identify the ACS operations precisely. Although the proposed framework is validated on a full-scale prototype of the ACS, the effects of strong ambient disturbances on actual construction sites have not been evaluated.
This study will support the development of an automated monitoring system and assist the main operator to ensure safe operations. The high-level operation details collected for this purpose can also be utilised for project performance assessment and progress monitoring. The potential application of the proposed hierarchical framework in the operation recognition of conventional construction equipment is also outlined.


INTRODUCTION AND BACKGROUND
Growing demand for complex buildings and affordable mass housing, along with the recognized need for improving working conditions and safety, has given a greater push to the adoption of automation and robotics in construction. Researchers have studied automation of various aspects of construction such as planning and scheduling (Piroozfar et al., 2019; Wang and Azar, 2019; Aslam et al., 2021), construction materials and methods (Tamayo et al., 2018; Lemke et al., 2019; Men and Zhang, 2019), construction progress monitoring (Golparvar-Fard et al., 2009; Harichandran et al., 2018; Mahami et al., 2019), resource allocation and tracking (Azar, 2016; Kargul et al., 2017; Kim and Chi, 2017; Hongjo Kim et al., 2018), quality assurance and quality control (Zhong et al., 2018; Kazemian et al., 2019; Lakhal et al., 2019), improving safety at the worksite (Park et al., 2017; Nnaji and Karakhan, 2020; Ammad et al., 2021), assessing labour productivity (Joshua and Varghese, 2014; Cheng et al., 2017, 2021), and structural health monitoring (Alavi et al., 2016; Liu and Zhang, 2019; Valero et al., 2019; Sun et al., 2020). However, the application of automation and robotic technologies on actual construction sites is still at a very early stage. In particular, automated systems for the construction of low-rise buildings are rare. The vast majority of automated construction systems (ACS) and related technologies were developed for high-rise buildings (Gassel; Hamada et al., 1998; Bock and Linner, 2016b). Urgent relocation, treatment or temporary accommodation of a large population affected by natural calamities or a pandemic are examples of situations that demand rapid construction of low-rise buildings. In this context, automation and robotic technologies for low-rise buildings are gaining increasing attention. A top-to-bottom construction method for low-rise buildings with automated coordinated lifting is described in .
This is further developed into an automated top-down construction system for modular construction of low-rise buildings (Harichandran et al., 2019a, 2019b, 2020). Even in the midst of rapid technological advancements, the construction industry is far away from a fully automated or robotic construction site (Melenbrink et al., 2020). Until we reach this stage, the automation systems, robots and workers need to coexist on construction sites (Bock and Linner, 2016a). These scenarios involve complex interactions between machines and workers and have risks associated with unsafe conditions. This necessitates the development of an automated monitoring system for safe operations (Fig. 1). Construction monitoring using sensor data involves several challenges, such as designing an optimal sensor configuration and data interpretation. There have been several studies on sensor placement, both in construction and other domains. For examples, see (Papadopoulou et al., 2016; Yu et al., 2018; Goyal et al., 2019; Mahami et al., 2019; Mahjoubi et al., 2020; Pachón et al., 2020). While challenges related to sensor placement are not fully solved yet, greater challenges exist in data interpretation. Whether construction operations can be accurately identified using sensor data is an interesting question. In this paper, the term 'Operation' refers to low-level activities related to the use of the construction equipment. These might be considered as subparts in the decomposition of higher-level construction activities.
Operation identification is one of the critical tasks in an automated monitoring system (Sherafat et al., 2020). The monitoring system must identify operations in progress and possibly discern faulty operations to warn the operator on time. To identify faulty operations, precise identification of the operating states is necessary. This paper presents the first stage of this research, which focuses on identifying the normal operations in an ACS. This will be further extended to the detection of faults. The ACS developed for this study is for the construction of low-rise buildings (Harichandran et al., 2020). However, the identification framework proposed in this study can be applied to traditional, automated or robotic construction (Fig. 1). Sensors measuring structural responses are installed on the structure during the construction. The interactions between labour, materials, equipment, and structure will be reflected in the structural responses. These responses reveal the operation being carried out. This is the central idea of this operation identification framework (Harichandran et al., 2018, 2019a, 2019b). Sensor placement is not considered within the scope of this study, and the present framework attempts to achieve the best identification for a given sensor configuration. Activity recognition is a widely studied area in construction. Most of these studies focus on identifying the right data to be collected, the data analysis method, the learning algorithm or features to be selected to improve identification performance (Sherafat et al., 2020). Some examples can be found in (Catal et al., 2015; Akhavian and Behzadan, 2016; Cheng et al., 2017; Twomey et al., 2018). A critical review of these studies is given in section 2.2. However, the importance of the problem formulation for operation identification is seldom considered.
By incorporating knowledge of relationships between activities, the efficiency of operation identification might be improved (Soman et al., 2017). This paper incorporates this knowledge in the problem formulation by arranging the classes to be identified in a hierarchy. Thus, the operation identification problem is decomposed into a hierarchy of learning tasks. The lower levels of the hierarchy contain the finer details of the operation. The information from previous classification levels is used to improve the identification in subsequent levels. This novel hierarchical identification framework based on machine learning is the main contribution of this paper. The performance of the proposed framework is compared with that of the conventional operation identification approach, which involves a flat list of operation classes. Artificial neural networks (ANN) are adopted as classifiers for both methods.

FIG. 1: Role of the current study in the context of automated/robotic construction
The overall objective of this study is the precise identification of Automated Construction System (ACS) operations using the hierarchical machine learning framework from acceleration data acquired from the structure. The broader context and the scope of the current study are shown in Fig. 2. The three major components of an automated construction monitoring system are operation identification, operation tracking and performance estimation (Sherafat et al., 2020). The present study addresses the operation identification problem focusing on the normal operations of an ACS. Identifying failure conditions is essential for the further development of a monitoring system. Generally, vibration, sound, images or videos of the construction equipment are collected for activity recognition. The present study uses the vibration data from the structure for operation identification. While there are several model-based methods like system identification (Soman et al., 2017) to estimate the actual condition from sensor data, the model-free method based on machine learning (Golparvar-Fard et al., 2013; Ahn et al., 2015) is adopted for this study. To benchmark the performance of the newly proposed hierarchical problem formulation, the conventional approach is also presented.
The hierarchical identification framework will be most beneficial to the operators of automated construction equipment. It helps them to ensure the safety and stability of the structure being constructed. For example, consider the coordinated lifting operation in an ACS. During coordinated lifting, all supports should lift simultaneously to move the structure upwards. Suppose one of the supports moves faster due to some internal error in the machine. One part of the structure will then be lifted faster than the rest, which can eventually result in the overturning of the entire structure. Situations like these would cause catastrophic accidents in real construction scenarios. Hence, the monitoring system should be trained to recognise each operation accurately so that early signs of anomalies can be detected. During automated construction, the operator will be updated with the status of the operations at the finest level of detail so that necessary actions can be taken in potential failure cases. The final goal is to develop an integrated automated construction monitoring system. Such a system will provide real-time information about all the construction operations. This will help to ensure the correct execution of the operations. In addition, the construction progress can be monitored, productivity rates can be estimated, and good construction quality can be ensured. The construction progress information can be made accessible to the client, the main subcontractor and the project manager for assessing the performance of the whole project. Even though the identification framework is validated on a top-down construction case study here, it can be applied to any type of construction system. However, the hierarchy of learning tasks varies with the chosen construction method.

FIG. 2: Research objective: Hierarchical identification of automated construction operations from sensor data (Scope of the current study is displayed in green colour boxes with dashed outline)
The rest of the paper is organised as follows. Previous research on automated monitoring systems and construction operation identification is briefly described in Section 2. A brief description of the ACS and the automated top-down construction method is given in Section 3. Section 4 contains the research methodology and the concept of the hierarchical identification framework. Section 5 describes the application of the developed framework to an ACS. The subsequent section covers the results of the analysis and a discussion of the observations. The final sections illustrate the generality of the proposed framework and present the conclusions of the study.

REVIEW OF RELATED WORK
The review of previous work in this area is divided into two sections. The first section gives an overview of the existing Automated Construction Systems and the monitoring methods adopted. The second section covers the studies related to operation identification of construction equipment.

Monitoring in automated construction systems
Automated construction systems are widely developed for the construction of high-rise buildings. In the literature, the space containing the automation equipment is referred to as the factory. This is because of its analogy with industrial production, where components are produced with a high level of automation. Construction automation systems belong to different classes based on the location of the main factory (Bock and Linner, 2016b). If the main factory is located at the top of the building (sky factory) (Wakisaka et al., 2000), it is shifted upwards as the work progresses. If the main factory is located on the ground floor (ground factory) (Sekiguchi et al., 1997), the constructed part of the building is pushed vertically upwards or horizontally, depending on the orientation of the building. There are other factory systems that combine both onsite and offsite construction (Bock and Linner, 2016b). Table 1 lists some of the ACSs and the monitoring systems adopted in them. The components of the monitoring systems include sensors, cameras, barcodes, control rooms, lasers, RFID and software for data collection and analysis (Sekiguchi et al., 1997; Tanijiri et al., 1997; Yamazaki and Maeda, 1998; Wakisaka et al., 2000; Ikeda and Harada, 2006). Some current ACSs lack real-time monitoring systems (Gassel; Bock and Linner, 2016b). In the other ACSs, monitoring is usually performed only to check whether specific tasks have been completed successfully. Most construction systems have independent sub-systems focusing on designated tasks like material handling, assembling, lifting, etc. (Kang et al., 2011). Hence, collecting integrated information about the whole automated construction process is highly challenging. This information is crucial for critical decision making, especially to avoid major accidents (Harichandran et al., 2019a).
Machines in automated or robotic construction should not be entrusted to make logical decisions when there are large uncertainties in situations that are likely to cause accidents. A human operator can act better in those scenarios. However, software systems are better equipped for discerning minute variations in patterns of construction-related data (Harichandran et al., 2019b). If the meaning of these patterns is readily interpretable, humans can take quick decisions based on the circumstances. This is why an integrated automated monitoring system with a human operator will have better control over overall construction than discrete construction subsystems. Unlike high-rise ACSs with numerous sub-systems, the development of an integrated monitoring system for low-rise automated construction is feasible. Presently, there are limited studies in this area. The authors of the current study have developed an ACS for low-rise building construction (Harichandran et al., 2019a, 2019b, 2020). Identifying the basic operations of the ACS is the primary step in the development of an automated monitoring system. Integrated information about the construction process can be obtained from sensor measurements taken from either the structure under construction (Harichandran et al., 2019b) or the construction equipment (Soman et al., 2017). The present study attempts to identify the operations of an ACS from the sensor data collected from the structure. This approach applies to automated and robotic construction involving different scenarios such as a site factory and single or multiple robots on site. The construction operation can be identified irrespective of the construction method, since the interaction between the robots and the structure creates structural responses that have characteristic patterns.

Identification of construction equipment operations
In most studies, construction equipment is monitored to calculate cycle time, productivity, cost and fuel consumption or to allocate resources optimally (Chen et al., 2020; Shi et al., 2020; Slaton et al., 2020). In such cases, minor mistakes in identification are not critical. Since this work aims to develop an automated monitoring system for the safe operation of construction equipment, the expected identification accuracy is high. The majority of existing equipment activity identification methods use computer vision, sensor data, audio data, or other characteristic measurements from the equipment. Tables 2 and 3 summarise various equipment activity recognition methods and their performances. The methods are subdivided based on the data collected for identification (visual data, sensor data and audio data). The following paragraphs further examine each of these activity recognition methods and their applicability to identifying automated construction operations.
With the advent of low-cost recording devices and better computing platforms, computer vision-based methods of activity identification have become extremely popular. Golparvar-Fard et al. (Golparvar-Fard et al., 2013) used spatiotemporal features from video recordings to identify activities of excavators and dump trucks with Support Vector Machines (SVM). They focused on identifying the single actions of earthmoving equipment. The problems due to noisy feature points, varied backgrounds, poses of equipment and levels of occlusion were addressed. Kim et al. (J. Kim et al., 2018) included the interactions between excavators and dump trucks to identify their operations using tracking-learning-detection. They showed that the incorporation of domain knowledge in problem formulation considerably improved the identification accuracy. Kim and Chi (Kim and Chi, 2019) considered the sequential working pattern of excavators for improving vision-based action recognition. They used a hybrid of two deep learning methods for classification. The activities of heavy equipment and labour were identified with the bag-of-video-feature-words framework, which overcame the limitations of variations in scale, partial obstruction and point of view (Gong et al., 2011). All of these methods show promising results. However, the construction site is a complex environment with various disturbances and obstructions. The dynamic nature of operations cannot be fully captured by still cameras (Sherafat et al., 2020). The applicability of computer vision-based methods is limited in this respect. Most of these studies identify activities of earth excavation or moving equipment. This equipment has articulating parts or movements which can be clearly captured through visual data. Identifying minute variations in the parts of equipment during various operations is exceptionally challenging with computer vision-based methods.
Hence, activity recognition based on visual data will not be suitable for identifying operations of ACS. Sensor-based activity recognition methods rely on a wide range of characteristic measurements from the equipment. The most popularly used data include acceleration or vibration, the location of the equipment, or a combination of these. Ahn et al. (Ahn et al., 2015) demonstrated the feasibility of using a low-cost accelerometer for identifying the operations of an excavator with machine learning classifiers and achieved 93% identification accuracy. Akhavian and Behzadan (Akhavian and Behzadan, 2015) used accelerometer and gyroscope data for predicting the operations of a front-end loader with machine learning classifiers. The identified operations were used to estimate the activity duration for simulation input modelling. Even though the identification of major classes of operations was highly accurate, the performance reduced while identifying finer classes. For cycle time measurement of equipment, Kim et al. (Hyunsoo Kim et al., 2018) used IMUs embedded in a smartphone. With the help of the dynamic time warping algorithm, they achieved 91.83% accuracy in cycle time estimation. Rashid and Louis (Rashid and Louis, 2019) placed inertial measurement units (IMUs) on the articulated parts of the equipment to identify their operations using deep learning methods. The improvement of prediction results with various types and levels of data augmentation was explored in that study. Shi et al. (Shi et al., 2020) considered the working stages of an excavator and main pump pressure for operation identification. Instead of using complex deep learning methods, they applied machine learning classifiers for identification. Domain knowledge was introduced in the problem formulation by employing a rule-based intelligent calibration system to obtain a prediction accuracy of 93.82%.
The location of the equipment and vibration patterns captured by sensors have the potential to identify operations better compared to limited visual data. The sensor-based activity recognition methods are capable of delivering high performance in real-time. Most of these methods are unaffected by ambient or climatic conditions. These serve as promising attributes for identification of automated construction operations.
Audio-based activity recognition methods are mainly suitable for equipment that produces significantly measurable sounds. Cheng et al. (Cheng et al., 2017) used audio signals and SVM classifiers to identify various construction equipment activities. This method attempts to address the limitations of computer vision methods and sensor methods by capturing the sound patterns of heavy equipment to identify its activity. Similar studies have been carried out for equipment activity recognition using audio data, as listed in the last sections of Tables 2 and 3. Audio-based methods can identify multiple machines at once. However, the level of detail of the activities identified by these methods is minimal. Hence, audio-based activity recognition methods are not suitable for developing a monitoring system.
Previous research shows that the operations which involve the limited movement of equipment are best identified by sensor-based methods or by characteristic measurements from the equipment. Operations that involve machine vibrations are best captured by accelerometers. The development of an automated construction system requires a high level of detail about the operations. Sensor measurements have the potential to provide detailed information about the equipment. Among all the activity recognition methods, sensor-based methods seem to be the best option for identifying automated construction operations.
The existing studies mainly focus on identifying construction activities at a macro level without considering the hierarchical relationship. This is primarily because the objective of these studies is to collect overall information about the construction cycle. However, close monitoring of the micro-level operations to detect early signs of failure requires a high level of activity detail. Hence, a new approach that focuses on problem formulation is essential for the broad research objective of this study, i.e., the development of an automated monitoring system. While monitoring the construction operations for purposes such as cycle time estimation or resource allocation, minor mistakes in identification are not critical. However, even minor identification mistakes in fast-paced automated construction could have disastrous consequences. Therefore, the development of a monitoring system for automated construction demands high identification accuracy. Most of the existing identification methods have adopted conventional machine learning or deep learning methods for classification. However, none of these methods has achieved the high accuracy of identification required to develop an automated monitoring system. Most of the studies focused on improving the performance either through the type of data collected or by exploring multiple classification algorithms. This study explores the possibility of improving the performance of operation identification by focusing on the problem formulation. Instead of using high-performing deep learning methods, this study uses a well-established machine learning algorithm (artificial neural networks) for identifying automated construction operations. In summary, the current research identifies automated construction operations at a high level of detail based on acceleration data from the structure using a machine learning-based identification framework.

TOP-DOWN MODULAR CONSTRUCTION
The ACS in the present study consists of a set of lightweight and portable machinery designed to automate the construction of a low-rise building's structural frame (Harichandran et al., 2020). This ACS adopts the top-down modular construction method as explained in (Harichandran et al., 2019a, 2019b, 2020). This method involves constructing the structural frame of the building from the topmost floor and lifting it upwards to add floors below. It is similar to the 'ground factory and building push-up' category of automated construction described by Bock and Linner (Bock and Linner, 2016b). However, the present method uses modules of structural elements rather than using the building components as a whole. This method is developed primarily for the construction of low-rise buildings. It eliminates the need for heavy equipment like tower cranes.
In top-down construction, the construction works progress in the vertical direction. The control and main operating units are placed at ground level. The prototype used in the current study consists of six lifting machines placed inside the core of the structural frame. Each lifting machine has a small platform that can be moved up or down using a hydraulic or electrical motor system. Each of the platforms supports a column of the structural frame during the construction. Fig. 3 shows a schematic representation of the operations in a top-down construction. The components of the structural frame are shown in blue and the supporting platforms in orange. For enhancing the clarity of representation, only a diagonal section of the entire construction system is shown. C1, C2 and C3 represent the columns whose modules are added sequentially in each step of the construction process. S1, S2 and S3 represent corresponding platforms that support the columns.

FIG. 3: Operations of top-down modular construction
Simultaneous operation (lifting or lowering) of all the lifting machines in the top-down construction system will result in coordinated lifting (or lowering) of the entire structure under construction. In addition, each machine can be individually operated to lower or raise its supporting platform. When one lifting platform is individually lowered, the structural frame will be supported on the remaining platforms, and the column at this location will be hanging from the system of beams above. When the platform is raised again to make contact with the column base, the load will be transferred to the platform.
The structural configuration, number and position of columns are designed to ensure the stability of the structure during automated construction. The structural frame is divided into smaller modules. The modules are assembled and lifted upwards step by step, starting with the topmost components of the structure. The first step is the connection of modules at the roof, supported on lifting platforms kept at construction level 0. The top-down modular construction consists of various stages depending on the total number of construction levels at which the operations happen. Construction stage 0 (CS0) is when the operations are happening at construction level 0 and so on. The addition of each module of the column increases the height of the structure by one stage.
There are four major operations in top-down modular construction: coordinated lifting, lowering of support, connection of a column module, and lifting of support. These operations are further divided into subclasses based on the position of the supporting platform (SupNo1, SupNo2, …, SupNo6) and the construction stage (CS0, CS1, CS2) at which the operation starts. Identifying the operations and the construction stage helps to monitor the progress of construction and to detect faults and their locations. This makes restoration and further corrective actions easier. The sequence of operations in one cycle of top-down modular construction is illustrated in Fig. 3. All the operations in this construction method are carried out at the ground level. This improves the safety of workers and makes it easy to automate the connection of modules with equipment fixed on the ground. More details and alternate schemes of top-down modular construction can be found in (Harichandran et al., 2020).

METHODOLOGY
The overall methodology for hierarchical identification of automated construction operations is shown in Fig. 4. First, sensor data from the structure is collected during controlled experiments. The raw data are then subjected to pre-processing, and features are extracted for supervised learning. The next stage is machine learning classification and operation identification. The novel hierarchical operation identification framework adopted in this study is described in the next paragraph. The output of the hierarchical operation identification is supplied to an automated construction monitoring system. The monitoring system evaluates the operation execution and signals the operator in case of anomalies. The scope of the current study is limited to operation identification. The development of the automated construction monitoring system is a work in progress.
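The pre-processing and feature extraction stages can be illustrated with a minimal sketch. It assumes fixed-length, non-overlapping windows and three simple time-domain features (mean, RMS and peak-to-peak amplitude). The window length, the feature set and the function names are assumptions made for illustration only; they are not the settings used in this study.

```python
import math
from typing import Dict, List, Sequence


def split_windows(signal: Sequence[float], size: int) -> List[Sequence[float]]:
    """Split a signal into consecutive non-overlapping windows of `size` samples.

    Trailing samples that do not fill a complete window are dropped.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, size)]


def window_features(window: Sequence[float]) -> Dict[str, float]:
    """Compute illustrative time-domain features for one window (assumed feature set)."""
    n = len(window)
    mean = sum(window) / n
    rms = math.sqrt(sum(x * x for x in window) / n)
    return {"mean": mean, "rms": rms, "peak_to_peak": max(window) - min(window)}


def extract_features(signal: Sequence[float], size: int) -> List[Dict[str, float]]:
    """Return one feature dictionary per complete window of the signal."""
    return [window_features(w) for w in split_windows(signal, size)]
```

Each feature dictionary would then serve as one input sample for supervised learning, labelled with the operation in progress during that window.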

FIG. 4: Methodology for identification of automated construction operations
In the proposed operation identification framework, operations are hierarchically decomposed using domain knowledge about the construction equipment and operations types. A schema containing the equipment states, operations and their hierarchical relationships is developed first. Activities at the top level are general; specialized operations with more details appear at lower levels. Activity recognition occurs in multiple stages, starting from the topmost level using different machine learning classifiers at each level. That is, a single machine learning model (classifier) is not used to separate all the classes. Instead, a new classifier is used to explore the subclasses of a previously identified operation class.
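The multi-stage recognition described above can be sketched as a tree of classifiers, where the label predicted at one level selects the classifier used at the next. The sketch below is a simplified illustration, not the implementation used in this study: the `Node` structure, the `predict_path` function and the threshold rules standing in for trained classifiers are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

Features = Dict[str, float]
Classifier = Callable[[Features], str]


@dataclass
class Node:
    """One classification level: a classifier plus one child per refinable class."""
    classify: Classifier
    children: Dict[str, "Node"] = field(default_factory=dict)


def predict_path(root: Node, x: Features) -> List[str]:
    """Walk the hierarchy from the top level, refining the label at each level."""
    path: List[str] = []
    node: Optional[Node] = root
    while node is not None:
        label = node.classify(x)
        path.append(label)
        # A class without subclasses has no child node, which ends the walk.
        node = node.children.get(label)
    return path


# Hypothetical stand-in rules; in practice each would be a trained classifier.
operation_type = Node(lambda x: "lifting" if x["mean"] > 0 else "lowering")
root = Node(
    classify=lambda x: "operating" if x["rms"] > 0.1 else "idle",
    children={"operating": operation_type},
)
```

With these stand-in rules, `predict_path(root, {"rms": 0.5, "mean": 0.2})` follows the "operating" branch and then consults the second-level classifier, whereas a sample classified as "idle" stops at the first level.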

CASE STUDY: HIERARCHICAL OPERATION IDENTIFICATION OF AN AUTOMATED CONSTRUCTION SYSTEM
The proposed hierarchical framework is implemented to identify the operations of an ACS prototype developed at the Building Automation Laboratory, IIT Madras. The development of the ACS, hardware and software specifications are elaborated in (Harichandran et al., 2020). A detailed description of the ACS operation cycles and top-down construction method is given in section 3. Vibration measurements from the structure are collected through controlled experiments using accelerometers. Six repetitions of experiments were conducted to capture multiple operations of the ACS. More than 19 million readings were collected from 8 accelerometers. The acceleration pattern associated with each operation is used for identification through 4 classification levels. The lowermost classification level delivers the finest operation details. Classification level 1 recognizes whether the ACS is in operating condition or idle. If the ACS is in operating condition, the second classification level determines the major operation class. The third classification level identifies the sub-operation class. The construction stage in which the operation happens is identified in the fourth and final classification level. The hierarchical identification relies on the hierarchical relationships of operations to refine the operation category at each classification level. After identifying an operation to the finest level of detail, the next operation is identified.
The process completes when all operations are identified up to classification level 4. To benchmark the performance of the hierarchical identification framework, a conventional identification approach is also evaluated.

Controlled automated construction
The controlled automated construction experiments are conducted in a laboratory using the top-down modular construction system prototype described in Section 3. Fig. 5 shows the complete experimental setup. The automated construction system (ACS) consists of six lifting machines, each with a 2-ton lifting capacity and with its supporting platform facing outwards. The structural frame to be constructed is modularised into small components. These structural modules are made of standard steel tube sections with external threading on both ends (50 mm nominal bore, 60.3 mm outer diameter and 4.5 mm thickness). The column modules are connected by standard steel sockets (couplers) with internal threading (50 mm nominal bore, 70 mm outer diameter and 65 mm length).

FIG. 5: Experimental Setup
Based on an extensive review of equipment activity recognition methods (section 2.2), a sensor-based method suitable for identifying ACS operations was chosen. All operations induce vibrations in the structure, and these vibrations have signature patterns associated with them. After careful consideration of the configuration and operation sequence of the ACS prototype, accelerometers were selected for data collection. The locations of the sensors on the structure were determined based on heuristics and the following criteria: a) locations that give maximum vibration during construction, b) locations where normal operations will not be affected, c) locations where the entire duration of the construction can be captured. Eight monoaxial piezoelectric accelerometers (1000 mV/g sensitivity and -5 g to +5 g measurement range) are fixed on the topmost beam-column assembly of the structure. They are numbered AM_01 to AM_08. AM_07 and AM_08 are positioned at the mid-height of the topmost column modules, parallel to ground level and perpendicular to each other. AM_01 to AM_06 are placed at different locations on the bottom surface of the beam assembly, perpendicular to ground level.
The control unit of the ACS and the data acquisition system are located at ground level. An HBM universal measuring amplifier (model: QuantumX MX840B, number of channels: 8) is used for acquiring time-stamped accelerometer data. Based on previous studies on construction equipment activity recognition (Behzadan, 2014, 2015; Hyunsoo Kim et al., 2018) and the Nyquist criterion (Lyons et al., 2005), the sampling frequency for data collection is set to 200 Hz. This sampling rate captured minute vibrations during machine operations without creating excessive data. The data was collected using HBM Catman data acquisition software (catman Data Acquisition Software, no date) and later imported to Microsoft Excel (XLSX format) and MATLAB (mat format) files for further analysis. Separate time-tracking Excel sheets are used for recording the timestamps of each operation during the experiments. These records are compared with the timestamps from the data acquisition system to accurately extract the signals corresponding to each operation. The automated construction experiments involve the construction of two stages of a structural frame. The experiment is repeated six times, and accelerometer data is collected continuously during the experiments. All the operations except the connection of modules were automated in the current prototype of the ACS. A trained operator controls the ACS while two unskilled labourers carry out the connection of the modules. The operations involved in the top-down modular construction are described in section 3.
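As a rough illustration of how manually logged operation times can be mapped onto the continuously sampled record, the following sketch converts start and end times into sample indices at the 200 Hz sampling rate. The function name, the log layout and the example times are hypothetical and are not taken from the software used in the study.

```python
SAMPLING_RATE_HZ = 200              # sampling frequency stated in the text
NYQUIST_HZ = SAMPLING_RATE_HZ / 2   # highest frequency representable without aliasing


def extract_segment(signal, record_start_s, op_start_s, op_end_s, fs=SAMPLING_RATE_HZ):
    """Return the samples recorded between op_start_s and op_end_s.

    All times are in seconds on a common clock; record_start_s is the time
    of signal[0]. Indices are clamped to the limits of the record.
    """
    first = max(0, round((op_start_s - record_start_s) * fs))
    last = min(len(signal), round((op_end_s - record_start_s) * fs))
    return signal[first:last]


# Hypothetical example: a 10 s record and an operation logged from 2.0 s to 3.5 s.
record = list(range(10 * SAMPLING_RATE_HZ))
segment = extract_segment(record, record_start_s=0.0, op_start_s=2.0, op_end_s=3.5)
```

With these assumed values, the 1.5 s operation corresponds to 300 samples per accelerometer channel.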

Raw sensor data
Each operation in the automated top-down construction has a pattern of acceleration associated with it. The vibration of the machine and the structure during the operation cycle is captured in the acceleration data. Intuitively, all the automated operations should have similar patterns irrespective of the repetition of the experiment or operating cycle. However, that is not the case in the actual scenario. The structure changes with every operation either due to the addition of modules or the changes in supporting conditions during lifting and lowering. Hence the vibration patterns corresponding to these operations will show variations (Fig. 6). This makes the identification problem far more complex than it appears. For example, the acceleration patterns of operations at support number 1 for two construction stages (CS1 and CS2) can be studied in Fig. 6. The operations, lowering of support and lifting of support are entirely automated. However, the patterns in the data for these operations do not appear to be similar in the corresponding regions of CS1 and CS2. This dissimilarity in patterns can be observed in other operations as well. In the case of connection of modules, the pattern of measurement and duration of the operation is likely to change in every repetition of the experiment and operation cycle. Even though these are highly dependent on the labourer involved in the operation, a general trend can be observed. Among the operations, some of them have similar patterns. As the classification becomes finer, the complexity of identification increases.

Feature extraction
The pre-processing of the raw data, feature extraction and machine learning classifications are carried out in MATLAB. Features that represent important characteristics of raw data have to be extracted to get good results in supervised learning. Features should have good discriminatory power in separating model classes. Based on a preliminary assessment of the raw data and previous studies (Figo et al., 2010; Joshua and Varghese, 2011; Akhavian and Behzadan, 2015; Hyunsoo Kim et al., 2018) on activity recognition from accelerometer data, ten features have been identified for the current study. Peak, mean, interquartile range, variance and root mean square error are the time-domain features. In addition, signal energy and the period of the signal are extracted using autocorrelation. Finally, the first three prominent frequencies from the spectral analysis are also used. These ten features are extracted from acceleration data measured at eight different locations of the structure. Thus, 80 features are extracted from the whole data set. The data set was not divided into small overlapping windows as described in previous studies. The capability of the features to represent the whole dataset is tested here. The whole feature space is used for supervised learning. With the current processing speed of the computer (Processor: Intel(R) Core (TM) i7-8700T CPU @ 2.40 GHz, installed memory (RAM): 16GB), computing cost and time are not too high for this feature space. The long-term goal of this research is to develop an automated monitoring system. The monitoring system will be implemented on a computer with similar computing power. After the training phase, there is no need to do the feature extraction, which might reduce the requirements on computational power. Hence using the whole feature space will not affect the monitoring time, and advanced feature selection methods for dimensionality reduction were not attempted here.
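The ten features per channel can be sketched as follows. This is a minimal Python illustration rather than the MATLAB implementation used in the study. It makes several assumptions: the root-mean-square feature is computed as the plain RMS of the signal, the period is taken as the lag of the first positive local maximum of the autocorrelation, and the prominent frequencies are the largest non-DC magnitudes of a naive discrete Fourier transform (adequate for short signals only). All function names are hypothetical.

```python
import cmath
import math
import statistics


def autocorr_period(x, fs):
    """Period (s) from the first positive local maximum of the autocorrelation."""
    n = len(x)
    mean = sum(x) / n
    c = [v - mean for v in x]
    acf = [sum(c[i] * c[i + lag] for i in range(n - lag)) for lag in range(n // 2)]
    for lag in range(1, len(acf) - 1):
        if acf[lag] > 0 and acf[lag] > acf[lag - 1] and acf[lag] >= acf[lag + 1]:
            return lag / fs
    return float("nan")  # no periodicity detected


def dominant_frequencies(x, fs, count=3):
    """Frequencies (Hz) of the `count` largest DFT magnitudes, DC excluded."""
    n = len(x)
    mean = sum(x) / n
    c = [v - mean for v in x]
    mags = []
    for k in range(1, n // 2 + 1):
        coeff = sum(v * cmath.exp(-2j * math.pi * k * i / n) for i, v in enumerate(c))
        mags.append((abs(coeff), k * fs / n))
    mags.sort(reverse=True)
    return [freq for _, freq in mags[:count]]


def extract_features(x, fs):
    """Return the ten features for one accelerometer channel."""
    q1, _, q3 = statistics.quantiles(x, n=4)
    energy = sum(v * v for v in x)  # equals the zero-lag autocorrelation of x
    features = {
        "peak": max(abs(v) for v in x),
        "mean": statistics.fmean(x),
        "iqr": q3 - q1,
        "variance": statistics.pvariance(x),
        "rms": math.sqrt(energy / len(x)),
        "energy": energy,
        "period_s": autocorr_period(x, fs),
    }
    for rank, freq in enumerate(dominant_frequencies(x, fs), start=1):
        features[f"freq_{rank}_hz"] = freq
    return features


# Synthetic check signal (not measured data): a 5 Hz sine sampled at 100 Hz for 2 s.
signal = [math.sin(2 * math.pi * 5 * i / 100) for i in range(200)]
features = extract_features(signal, fs=100)
```

Applying `extract_features` to each of the eight channels and concatenating the results gives the 80-element feature vector described above.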

Selection of machine learning classifier
Machine learning techniques are widely used for solving activity recognition problems using sensor data. Supervised learning methods have been shown to deliver better results compared to unsupervised learning methods for activity identification problems (Golparvar-Fard et al., 2013). According to Akhavian and Behzadan, unsupervised learning methods tend to cause overfitting during classification with imbalanced equipment activity classes (Akhavian and Behzadan, 2015). The classification of automated construction operations is similar to that of construction equipment activity identification in imbalanced activity classes. Hence supervised learning methods are adopted for this study. Deep learning methods show promising results for equipment activity identification. However, these methods demand large datasets for training. The automated construction experiments are costly and time-consuming. It is not practical to generate large datasets by experiments. Augmentation of data also requires expert knowledge, and the generated datasets should capture the possible working conditions of automated construction. The evaluation of deep learning techniques for this task is in progress. The current paper explores the possibility of using a well-established machine learning technique for identification of automated construction operations. It is essential to evaluate the attributes of the classifier for precise identification of automated construction operations. Given the limited experimental data, the classifier should have good generalisability without overfitting. The acceleration patterns measured are a complex combination of the vibration from the structure and the ACS during construction. The classifier should be able to learn the nonlinear relationship between the acceleration measurements from different locations of the structure and the automated construction operations. 
The classifier should have clear parameters to indicate the confidence for the predicted results so that necessary control actions can be taken during construction monitoring. Based on the above requirements, an Artificial Neural Network (feed-forward classification network) is selected as the classifier for identification of automated construction operations. A separate study was carried out to determine the best learning algorithm for operation recognition. It was found that the ANN delivers the best performance at all classification levels. Therefore, further studies to validate the identification framework were performed using the ANN. The study conducted to identify the best learning algorithm is not included in this paper because the focus here is the use of domain knowledge in the formulation of the identification problem to ensure high accuracy. However, a summary of the study results is included in Appendix A. The ANN is the classifier adopted for both the hierarchical identification framework and the conventional identification approach.

Frameworks for operation identification
Our long-term goal is to develop an automated monitoring system. Unlike other scenarios of operation recognition for progress monitoring, cycle time estimation or productivity calculation, the identification accuracy and level of detail are of prime importance. Automated construction will be faster than conventional construction. Hence, it demands an automated monitoring system that provides accurate and detailed information about the ongoing construction. This information should be easily comprehensible by the operator of the ACS to take appropriate actions in time. The status of the operations and ACS from macro-level to micro-level should be readily accessible at any instant. If the operation is going well, the operator needs only general information like the major class of operation (classification level 2). If there is a fault in operation, the operator should know the details like the subclass of the operation (classification level 3) and stage of construction (classification level 4). The main operations, sub-operations and construction stages vary with the ACSs. Presently, the identification framework is applied to an automated top-down construction system. Hence the descriptions include the operations specific to this construction method. However, the operation identification framework proposed in this study is applicable to other construction methods as well.
This study evaluates two different problem formulations for the identification of automated construction operations. The first one is the conventional methodology adopted in previous studies (Fig. 7), and the second one is the hierarchical framework developed here (Fig. 8). Both frameworks are evaluated for their ability to identify operations at four classification levels. Even though all the classes are input as a flat list in the conventional framework, to test the performance of the two frameworks, operations are separated into four levels. From top level to bottom level, operations are classified into finer subclasses. Classification level 1 consists of the operation states of the ACS, viz. idle and operations. The idle state indicates that the automation system is turned on, but no operations are being performed. The data corresponding to this state is primarily due to ambient vibrations. Classification level 2 further divides the operations into four major classes. Classification level 3 contains the subclasses of operations. It divides two operations (lifting and lowering) into subclasses based on which lifting machine is in operation. The 'connection of column module' operation is divided based on which column is being constructed at that time. All operations are subdivided at classification level 4 based on the stage of construction at which the operation was performed.

FIG. 7: Conventional framework for identification of operations or states. A flat list of classes is used by a classifier to identify operations. Four classifiers are used to compare with the proposed hierarchical identification framework.
The conventional framework containing a flat list of classes is shown in Fig. 7. Here, there is one identification task (classification problem) per classification level. A machine learning classifier in the current context is a predictive model developed and trained to solve a classification problem. There is one machine learning classifier per classification level in the conventional framework, as shown in Fig. 7 (Classifier 1, Classifier 2, …, Classifier 4). The yellow boxes represent the classification levels. The grey boxes represent machine learning classifiers at a classification level. The white boxes are the operations classified by a particular machine learning classifier. Most previous studies have adopted this problem formulation for operation identification [58,60,61]. It does not use any prior information from the previous classification level. As the classification level increases, the complexity of the learning task also increases. Classification level 1 has only two operations, while classification level 4 has 41 operations. This conventional framework of identification seems to give good performance only when the number of operation classes is small. As the number of similar operation classes increases, performance appears to decline consistently. To verify the suitability of the conventional framework for identifying a large number of classes to develop a monitoring system, the initial classification was performed using that framework.
The hierarchical framework for identification proposed in this study formulates the identification problem into a hierarchy of learning tasks (Fig. 8). Each classification level in this identification framework uses prior information from the previous classification level to simplify the identification task. There can be more than one identification task per classification level. Accordingly, there is a hierarchy of machine learning classifiers, each assigned to solve an identification task. There are 25 machine learning classifiers numbered systematically as 'Classifier L.N' where L represents the classification level, and N represents the number of the classifier at classification level L. Each machine learning classifier will classify similar operations at a particular level.
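The 'Classifier L.N' naming scheme can be illustrated with a short sketch. The per-level counts (1, 1, 3 and 20) are those reported for the hierarchical framework in this study; the function name is hypothetical.

```python
# Number of classifiers at each classification level of the hierarchical framework.
CLASSIFIERS_PER_LEVEL = {1: 1, 2: 1, 3: 3, 4: 20}


def classifier_names(per_level):
    """Generate 'L.N' names: L is the classification level, N the index at that level."""
    return [
        f"{level}.{number}"
        for level, count in sorted(per_level.items())
        for number in range(1, count + 1)
    ]


names = classifier_names(CLASSIFIERS_PER_LEVEL)
```

The list runs from '1.1' to '4.20' and contains 1 + 1 + 3 + 20 = 25 entries.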

FIG. 8: Hierarchical framework for identification of operations or states. Instead of a flat list of classes, the classes representing the operations are arranged in a hierarchy.
The identification tasks for the hierarchical framework are formulated based on the logical flow of information required in an automated monitoring system. Consider this example for the flow of information and measured data among classification levels: A particular operation is going on in the automated construction. The monitoring system identifies the status of the ACS as 'Operations' at classification level 1 (Classifier 1.1). Now, the main operation needs to be identified in the next classification level. The class 'Idle' can be removed from the further identification tasks to simplify the problem (Classifier 2.1). If the main operation is identified as "Lifting Support" in classification level 2, only the sub-classes of "Lifting Support" need to be investigated for further classification. This means that there should be specific identification tasks for each subclass of the main operation. The sensing data will be redirected to a particular identification task based on the prior information from the previous classification level. In this way, there are three simple machine learning classifiers (Classifier 3.1, Classifier 3.2, Classifier 3.3) in the hierarchical framework instead of one complex machine learning classifier (Classifier 3) in the conventional framework at classification level 3. Each of these classifiers solves an identification task with six classes instead of one classifier that solves an identification task with 20 classes. In the previous classification level, the operation is identified as "Lifting Support". Now, the sensing data will be redirected to classifier 3.3 for further classification. If the operation is identified as "LiftSupNo6" at classification level 3, the next classifier in classification level 4 will be classifier 4.20. This classifier will identify the operation based on the construction stage (LiftSupNo6_CS1 or LiftSupNo6_CS2).
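The routing described in this example can be sketched as a cascade in which the label predicted at one level selects the classifier used at the next. The class and its attributes are hypothetical, and the lambda functions are placeholders standing in for trained neural networks; only the classifier numbers and class labels are taken from the example above.

```python
class HierarchicalIdentifier:
    """Route a feature vector through a cascade of classifiers."""

    def __init__(self, root_name, classifiers, children):
        self.root_name = root_name
        self.classifiers = classifiers  # classifier name -> callable(features) -> label
        self.children = children        # (classifier name, label) -> next classifier name

    def identify(self, features):
        """Return the list of (classifier, label) pairs from level 1 downwards."""
        path = []
        name = self.root_name
        while name is not None:
            label = self.classifiers[name](features)
            path.append((name, label))
            # Stop when the predicted label has no subclasses to explore.
            name = self.children.get((name, label))
        return path


# Placeholder classifiers (assumptions, not trained models).
classifiers = {
    "1.1": lambda f: "Operations" if f["rms"] > 0.1 else "Idle",
    "2.1": lambda f: "Lifting Support",
    "3.3": lambda f: "LiftSupNo6",
    "4.20": lambda f: "LiftSupNo6_CS1",
}
children = {
    ("1.1", "Operations"): "2.1",
    ("2.1", "Lifting Support"): "3.3",
    ("3.3", "LiftSupNo6"): "4.20",
}

identifier = HierarchicalIdentifier("1.1", classifiers, children)
path = identifier.identify({"rms": 0.5})
```

Each classifier in the cascade only has to separate the subclasses of the label chosen one level up, which is what keeps the individual identification tasks small.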
The first two classification levels have only one machine learning classifier each in the hierarchical identification framework. Classifier 1 (conventional framework) and classifier 1.1 (hierarchical framework) are essentially the same. Classifier 2 is slightly different from classifier 2.1 since it also includes 'idle' in the classification along with the operations. Classifier 3 is replaced by classifiers 3.1 to 3.3 (3 classifiers), and classifier 4 is replaced by classifiers 4.1 to 4.20 (20 classifiers) in the hierarchical framework. The purpose of designing this more elaborate framework is to develop robust machine learning classifiers for each classification level. The overall objective of this operation identification is to develop an automated monitoring system. Ensuring high accuracy in operation identification will reduce the possibility of false alarms during monitoring and decrease the chances of a faulty operation going unreported. This will eventually reduce workplace accidents.

Evaluation of performance
The performance of each classifier is evaluated through k-fold cross-validation to avoid dependency on a particular dataset and to guard against overfitting. In k-fold cross-validation, data is randomly split into k folds. Then, one fold is reserved for validation (used as unseen data) and the others are used for training. Next, another fold is used for validation, while the remaining folds are used for training. This process is repeated k times until all the folds are used for validation once. Each performance parameter of the cross-validated classifier is computed as the average of that parameter over all the folds. The classifiers in the first three levels of classification are 10-fold cross-validated. The classifiers in classification level 4 are 5-fold cross-validated since fewer data points are available.
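The splitting procedure can be sketched as follows. This is a generic illustration of k-fold cross-validation, not the MATLAB routine used in the study; the function names and the `fit_and_score` callback are hypothetical.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (training, validation) index lists; each fold is validated exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i, validation in enumerate(folds):
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield training, validation


def cross_validate(n_samples, k, fit_and_score):
    """Average the score returned by fit_and_score(training, validation) over k folds."""
    scores = [fit_and_score(tr, va) for tr, va in k_fold_indices(n_samples, k)]
    return sum(scores) / len(scores)


splits = list(k_fold_indices(23, 5))
```

In practice, `fit_and_score` would train a classifier on the training indices and return one performance parameter (for example, accuracy) computed on the validation indices.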
Accuracy, precision, recall and F1 score are the parameters used to assess the performance of a classifier. Accuracy is the percentage of data points correctly identified out of the total number of data points (equation 1). Identification accuracy is an overall estimate of the performance of a classifier. Precision and recall are computed to investigate the relevance of the information retrieved by a classifier. Precision is also known as Positive Predictive Value (PPV). It is the percentage of the identifications which are relevant out of all the identification results (equation 2). The recall is also called the true positive rate. In other words, it is the percentage of the relevant operation classes correctly identified by the classifier (equation 3). F1 score is the harmonic mean of these two parameters (equation 4).
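The four performance parameters can be computed from the counts of true positives (TP), false positives (FP), false negatives (FN) and true negatives (TN) for a class. The sketch uses the standard definitions, which are assumed to correspond to equations 1 to 4; the function name and the example counts are hypothetical.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall and F1 score as fractions (multiply by 100 for %)."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if tp + fp else 0.0  # positive predictive value
    recall = tp / (tp + fn) if tp + fn else 0.0     # true positive rate
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts for one class.
metrics = classification_metrics(tp=8, fp=2, fn=1, tn=9)
```

For these counts, accuracy is 17/20, precision is 8/10, recall is 8/9 and the F1 score is 16/19.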

Performance of identification frameworks
The performance of the hierarchical framework of identification is compared with the conventional framework at different classification levels. The overall identification accuracy per classification level of both frameworks is shown in Fig. 9. Precision, recall, F1 score and accuracy of classifiers and overall accuracy of the identification framework per classification level are displayed in tabular form (Table 4 and Table 5). In the conventional identification framework, the prediction accuracy decreases overall as the classification level increases. Hence, the finest level of classification has the lowest accuracy. This is due to the problem formulation in the conventional identification framework. There is only one machine learning classifier per classification level. The numbers of classes in the identification tasks from classification level 1 to 4 are 2, 5, 20 and 41. As the complexity of the identification task increases, the accuracy decreases. These results confirm the observations from previous studies (Akhavian and Behzadan, 2015). Other performance parameters such as precision, recall and F1 score show similar trends. Even though classifier 2 shows slightly better performance than classifier 1, the downward trend continues with a higher number of classes. There is only a marginal difference between classifiers 1 and 2 in terms of the number of classes, while the difference is substantially larger for classification levels 3 and 4. Hence declining performance becomes evident for these classification levels. The results show that the conventional framework of machine learning classifiers is not suitable for developing an automated monitoring system.
The performance of the hierarchical framework of identification is independent of the classification level. It depends mainly on the complexity of the identification task. The performance of the two identification frameworks was comparable in the initial classification levels. However, at the finer levels of classification, the hierarchical framework outperforms the conventional one with significantly higher accuracy. Hence, the hierarchical framework is promising in delivering the high accuracy of identification required for an automated monitoring system. The remainder of this section discusses the performance of the classifiers in the hierarchical framework in detail.
The classifiers in the hierarchical framework consistently give prediction results close to 100% accuracy, except for two classifiers in classification level 4. The classifiers have a simple neural network architecture with one hidden layer, and the number of neurons in the hidden layer is less than 10 in most of the classifiers. There are no studies on the identification of automated construction operations. However, the results can be compared with those of construction equipment activity identification. Kim et al. (J. Kim et al., 2018) used vision-based activity identification methods incorporating the interactions between excavators and dump trucks to identify their activity with an accuracy of 91.27%. Cheng et al. (Cheng et al., 2017) used audio signals and SVM classifiers to identify construction equipment activities and obtained a best identification accuracy of over 90%. Golparvar-Fard et al. (Golparvar-Fard et al., 2013) used spatio-temporal features and SVM classifiers to identify activities of an excavator and a dump truck with accuracies of 86.33% and 98.33%, respectively. Akhavian and Behzadan (Akhavian and Behzadan, 2015) reported a highest accuracy of 98.59% in predicting the operations of a front-end loader using a neural network. However, the prediction performance in that study decreases with finer levels of classification. The hierarchical framework in the current research ensured consistently high performance even at the finest classification level. This was possible using a simple artificial neural network architecture, and no overfitting was observed, as indicated by the high prediction accuracy with unseen data. It is acknowledged that such high accuracy may not be achieved in real-world site conditions. However, the hierarchical framework is still expected to have higher performance than the conventional approach because it takes advantage of the domain knowledge available in the form of decomposition of operations.
The latest studies show reasonable identification performance with deep learning methods. It is frequently claimed that the major advantage of these methods is avoiding feature extraction. However, the major challenges in implementing those methods include the generation of large datasets and high computational time. This study follows a different approach to improve the performance of the existing methods, that is, appropriate problem formulation that makes use of domain knowledge.
There are 246 instances of idle and normal operation classes. Except for the first two classification levels, all other classifiers have classes with an equal number of data points. Hence the identification tasks at a particular classification level in the hierarchical framework have relatively similar complexity in terms of class size distribution. Fig. 10 and Fig. 11 show confusion matrices for selected classifiers. All these classifiers belong to the hierarchical framework. In the confusion matrix, the actual class (target class) is represented by columns, whereas rows represent the predicted class (output class). The correctly classified data points are located on the matrix's main diagonal, and misclassifications are in the off-diagonal positions. Each cell shows the fraction of data points that belong to that particular cell. In k-fold cross-validation, each fold generates a confusion matrix. The entries of each cell in the displayed confusion matrices are the average of the corresponding values of all folds. In classification level 1, classifier 1.1 delivered good performance even with unbalanced data sets (Table 5). This confirms the ability of supervised learning classifiers to handle unbalanced data. However, the performance seems to be better with balanced classes, which explains the superior performance of the other classifiers compared to classifier 1.1. At classification level 3, classifiers 3.1 and 3.3 show a minor dip in performance. The confusion matrices (Fig. 10 and Fig. 11) show that this is caused by occasional misidentification of lowering or lifting operations at support 3, since these are similar to the operations at support 1. At classification level 4, identification becomes more complex since the classifier has to detect subtle changes in the patterns to determine the stage of construction in which the operation happens. In this case also, the classifiers identified all instances accurately except for classifiers 4.8 and 4.14.
These classifiers separate the operations 'Lowering support no.6' and 'Connection of column module step6' into two classes based on the stage of construction (CS1 and CS2). Since the number of instances is small in classification level 4, a single misclassification reduces the accuracy considerably. The classifiers in classification level 4 are 5-fold cross-validated due to the limited number of instances. This raises the question of the robustness of the classifiers. The next section discusses this issue in detail.
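The confusion matrix convention described above (columns for the actual class, rows for the predicted class, cells as fractions of all data points, averaged over folds) can be sketched as follows. The function names and the example labels per fold are hypothetical and do not reproduce the measured results.

```python
def confusion_fractions(actual, predicted, classes):
    """Confusion matrix with rows = predicted class, columns = actual class.

    Each cell holds the fraction of all data points that fall in that cell.
    """
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0.0] * len(classes) for _ in classes]
    for a, p in zip(actual, predicted):
        matrix[index[p]][index[a]] += 1 / len(actual)
    return matrix


def average_matrices(matrices):
    """Element-wise average of the per-fold confusion matrices."""
    size = len(matrices[0])
    return [
        [sum(m[r][c] for m in matrices) / len(matrices) for c in range(size)]
        for r in range(size)
    ]


classes = ["CS1", "CS2"]
# Hypothetical fold with one CS1 instance misidentified as CS2.
fold_a = confusion_fractions(["CS1", "CS1", "CS2", "CS2"], ["CS1", "CS2", "CS2", "CS2"], classes)
# Hypothetical fold with no misclassification.
fold_b = confusion_fractions(["CS1", "CS1", "CS2", "CS2"], ["CS1", "CS1", "CS2", "CS2"], classes)
mean_matrix = average_matrices([fold_a, fold_b])
accuracy = sum(mean_matrix[i][i] for i in range(len(classes)))  # sum of the main diagonal
```

With only four instances in a fold, the single misclassification in `fold_a` lowers that fold's accuracy to 0.75, which illustrates why small class sizes make the accuracy sensitive to individual errors.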

Noise tolerance of the classifiers
The data for the current study is acquired through controlled laboratory experiments. Data from the actual construction site may contain higher levels of noise. Once trained, the machine learning classifiers should identify the operations correctly even if the collected data contains noise. The generalisability and noise tolerance of the classifiers are tested by supplying all the classifiers with data containing noise. Random noise whose maximum value ranges from 5% to 50% of the root mean square (RMS) of the signal is added to the raw acceleration signals. In total, six different sets of augmented data were created, and features were extracted from the data as described in Section 5.3. This data was supplied to all the trained neural network classifiers in the hierarchical framework. The prediction results are given in Tables 6 and 7. Classifier 1.1 has a high tolerance for noisy data. Even with 50% noise in the signal, the prediction accuracy is 95.1%, compared with a 10-fold cross-validation accuracy of 99.58% for this classifier. A high percentage of noise in the signal thus reduces the performance of the classifier only slightly. Classifier 2.1 has a fairly high noise tolerance up to 20% noise in the signal. However, the performance of this classifier reduces considerably from 30% noise onwards. A similar trend can be observed for classifiers 3.1 and 3.2, but the noise threshold at which performance drops drastically varies between these classifiers. Classifier 3.3 shows a consistent reduction in performance with an increase in noise. Classifiers 3.1 and 3.3 had 10-fold cross-validation accuracies of 98.57% and 98.75%, respectively. Nevertheless, these classifiers could identify operations with high accuracy up to a certain percentage of noise. This shows the generalisability of the classifiers. The classifiers in classification level 4 show relatively high noise tolerance compared to all other classifiers.
Some of the classifiers identified all operations correctly, even with 50% noise, which demonstrates their robustness. The noise threshold and the variation in performance with an increasing percentage of noise differ from classifier to classifier.
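The noise augmentation can be sketched as follows. The text specifies only that the maximum noise value is a percentage of the signal RMS; drawing the noise from a uniform distribution between the negative and positive limits is an assumption made here for illustration, and the function name and test signal are hypothetical.

```python
import math
import random


def add_bounded_noise(signal, fraction, rng):
    """Add random noise whose magnitude never exceeds fraction * RMS(signal).

    A uniform distribution is assumed; the original study may have used another.
    """
    rms = math.sqrt(sum(v * v for v in signal) / len(signal))
    limit = fraction * rms
    return [v + rng.uniform(-limit, limit) for v in signal]


# Synthetic signal (not measured data) with the smallest and largest noise levels.
clean = [math.sin(0.1 * i) for i in range(500)]
clean_rms = math.sqrt(sum(v * v for v in clean) / len(clean))
noisy_5 = add_bounded_noise(clean, 0.05, random.Random(1))
noisy_50 = add_bounded_noise(clean, 0.50, random.Random(1))
```

Features would then be extracted from each noisy signal in the same way as for the clean data and passed to the already trained classifiers, without retraining.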

ILLUSTRATION OF THE GENERALITY OF THE PROPOSED FRAMEWORK
The hierarchical identification framework proposed in this study is generic and can potentially be applied to several operation recognition tasks in construction. This section illustrates the generality of the proposed framework by applying it to the recognition of excavator operations. This example is adapted from (Akhavian and Behzadan, 2015). An excavator is a commonly used piece of equipment for digging and moving soil and other material on construction sites. The core idea of the hierarchical framework is to identify a specialized operation with more details by exploring the subclasses of a previously identified operation class. Hence, the first step is to develop a schema containing the equipment states, operations and their hierarchical relationships. This step helps to determine the maximum number of classification levels for the particular equipment. A possible operation decomposition for excavator operations is shown in Fig. 12. In this example, all operations of an excavator can be identified within four classification levels if we include the states 'Engine off' and 'Engine on' in the hierarchy. Development of the schema helps to enumerate the operation classes to be identified. In Fig. 12, white boxes represent the operation classes or states, and yellow boxes represent the classification levels.

FIG. 12: Hierarchical framework for identification of operations or states of an excavator
The next step is to identify the purpose and level of details required for operation recognition. If the purpose is to estimate the cycle time for simulation input modelling, a high level of operation details is required (Akhavian and Behzadan, 2015). This means that the operations have to be identified up to classification level 4, in which all suboperations are recognised. If the purpose is to identify the overall productivity of the equipment, information up to classification level 3 is sufficient to recognise major operation classes (Kim and Chi, 2019). Classification level 2 is sufficient for estimating the emission rate for sustainability analysis. How much time the engine is turned on in the idle condition gives an estimate of wasteful emission. Fuel consumption can be estimated from classification level 1 itself.
The next step is to identify the machine learning classifiers that are needed to separate the operation classes. In general, at each level in the hierarchy, one classifier is chosen for each operation (or state) that needs to be separated further. This is the fundamental difference between the hierarchical framework and the conventional approach. In the conventional approach, a single machine learning classifier separates the data into all the operation classes. For example, in Akhavian and Behzadan (2015), one machine learning classifier is used to separate the operations into five classes, which are represented as a flat list of output nodes of a neural network. In contrast, in the hierarchical framework, multiple classifiers are trained to separate the classes one after the other in a cascading network. In Fig. 12, the classifiers are represented by grey boxes which contain the operation classes identified by them. These classifiers are named as shown in the grey ovals next to them. For example, at classification level 1, classifier 1.1 is assigned to identify the 'Engine off' and 'Engine on' states. If the engine is turned off, there is no need for further classification. If the engine is turned on, further classification is needed to identify the subclasses of that state. Similarly, at subsequent classification levels, the subclasses of a previously identified operation are separated further, and classifiers are assigned accordingly. Unlike the ACS, the excavator has a fairly simple hierarchy of operations: only at classification level 4 is there more than one classifier. In the ACS, the number of combinations of operations and states is large, and the classification problem is extremely complex. As discussed in the previous section, the current identification framework presents a novel problem formulation for simplifying the problem and enhancing the robustness of identification.
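The cascading arrangement described above can be sketched as a tree in which each node holds one classifier and hands its prediction to the classifier that refines it. This is a minimal illustration under stated assumptions, not the implementation used in the study: the class `ClassifierNode`, the function `identify`, the two-element feature vector, the threshold rules standing in for trained classifiers, and the labels 'Idle', 'Busy', 'Digging' and 'Swinging' are all hypothetical. Only the 'Engine off' and 'Engine on' states and the classifier naming pattern come from the text.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Sequence

Features = Sequence[float]


@dataclass
class ClassifierNode:
    """One classifier in the hierarchy plus the classifiers that refine it."""
    name: str
    predict: Callable[[Features], str]
    # Maps a predicted label to the classifier that separates its subclasses.
    # Labels without an entry are leaves and need no further classification.
    children: Dict[str, "ClassifierNode"] = field(default_factory=dict)


def identify(root: ClassifierNode, x: Features, max_level: int = 4) -> List[str]:
    """Walk the cascade from the root, returning one label per level."""
    path: List[str] = []
    node = root
    while node is not None and len(path) < max_level:
        label = node.predict(x)
        path.append(label)
        node = node.children.get(label)
    return path


# Toy stand-ins for trained classifiers: thresholds on a hypothetical
# feature vector [vibration_rms, boom_motion]. Labels below level 1
# are invented for illustration.
level_3 = ClassifierNode("3.1", lambda x: "Digging" if x[1] > 0.5 else "Swinging")
level_2 = ClassifierNode(
    "2.1", lambda x: "Busy" if x[1] > 0.1 else "Idle", {"Busy": level_3}
)
level_1 = ClassifierNode(
    "1.1", lambda x: "Engine on" if x[0] > 0.05 else "Engine off",
    {"Engine on": level_2},
)

print(identify(level_1, [0.0, 0.0]))  # ['Engine off']
print(identify(level_1, [0.3, 0.0]))  # ['Engine on', 'Idle']
print(identify(level_1, [0.3, 0.9]))  # ['Engine on', 'Busy', 'Digging']
```

In this sketch each classifier only ever sees the subclasses of one parent, which is the property the hierarchical formulation relies on. A flat formulation would instead use a single `predict` over every leaf label.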
Activity recognition in automated construction is a novel and challenging application that has not been discussed in the published literature. Activity recognition with high accuracy and a high level of detail is essential for monitoring the ACS. Hence, a new framework is developed for meeting these requirements. Four classification levels provide sufficient detail to take corrective actions. However, the number of classification levels may vary with the complexity of the operations to be identified for a given piece of construction equipment.
Since the primary application area of this research is automated construction, a more detailed discussion of the application of the framework to other construction operations is not attempted here. The case of the excavator is presented purely for illustrating the generality of the approach. The particular hierarchical representation of the excavator operations is provided as an example. The sub-operations could be modified based on the purpose of identification.

CONCLUSIONS
The main contribution of this research is the development of a hierarchical machine learning framework for achieving high accuracy in operation identification. The conventional approach adopted in previous operation recognition studies uses a single machine learning classifier that separates all the operation classes. The performance of this approach drops as the complexity of the identification problem increases. The newly developed hierarchical framework does not use a flat list of classes like the conventional methodology. Instead, it utilises the hierarchical relationship between operations to decompose them into various classification levels. Multiple hierarchically organized machine learning classifiers address the identification problem at each classification level. The hierarchical and conventional frameworks were tested for their efficiency in identifying the operations of an automated construction system prototype.
The performances of the two identification frameworks were comparable at the initial classification levels. However, at finer classification levels, the hierarchical framework outperformed the conventional one, with 3% to 15% higher accuracy. This study emphasises the significance of problem formulation for operation identification. The hierarchical organization of classes incorporates domain knowledge that helps the machine learning algorithm to separate the operations more efficiently.
Neural network classifiers with a simple architecture consistently delivered high performance at all classification levels of the hierarchical framework. This study also confirms the efficiency of neural network classifiers for equipment operation identification from sensor data. The generalisability and noise tolerance of these classifiers demonstrate their promise for developing an automated construction monitoring system.
There are several studies on equipment activity recognition from sensor data using machine learning classifiers. However, a hierarchical operation identification framework applicable to general construction equipment is introduced for the first time through this research. Most of the previous studies aimed at estimating the production cycles of earthmoving equipment. This is the first study that uses a systematic and robust method for recognising automated construction operations. Another significant contribution is the incorporation of domain knowledge into operation recognition problems. In addition, the studies were performed on one of the first full-scale prototypes of automated construction systems for low-rise buildings.

LIMITATIONS AND FUTURE WORK
The current research was conducted using a laboratory prototype of an ACS, and the experiments were conducted in a controlled environment. A commercial automated construction system will be much more complex and subject to stronger ambient disturbances. Collecting sensor data from the structure is still possible in that scenario through wireless sensors. However, the sensitivity requirements of the sensors should be evaluated carefully. The proposed framework works best when continuous time-series data are used as raw input. The data can come either from the structure, as in the present study, or from various parts of the construction equipment, as in previous studies. Multiple types of sensor data from the same location will not pose any problem; in those cases, the raw input data supplied to the framework will simply have additional dimensions. The only requirement is that the data provided should contain enough information about the ongoing construction activities. The use of visual data has not been tested with the present identification framework. Future studies will involve the identification of complex failure scenarios in automated top-down construction. Advanced deep learning methods and model-based system identification methods are being explored for this research.

DETERMINATION OF THE BEST LEARNING ALGORITHM FOR OPERATION RECOGNITION
There are 25 machine learning classifiers in the hierarchical identification framework, each corresponding to an identification task. The classifiers were tested with six different machine learning algorithms to determine the best-performing algorithm. The best identification results of each learning algorithm are presented in this section. The results of operation recognition for identification level 1 are summarized in Table 1 and illustrated in Fig. 1. Classifier 1.1 identifies the idle and operating states in automated construction. All the learning algorithms have an identification accuracy above 95%. Although slightly lower, the F1 score follows a trend similar to that of accuracy, except for DA. The ANN has the best overall performance in terms of accuracy and relevance of information retrieval. SVM attains a high accuracy (95.528%), only slightly below that of the other learning algorithms, but its precision and recall are considerably lower than theirs. Because this is the first identification task in the framework, the performance of this classifier strongly influences the performance of the overall framework.

FIG. 1: Accuracy of prediction for identification level 1
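The level 1 results note that a classifier can reach high accuracy while its precision and recall remain low. The sketch below shows how this can happen when one class is much rarer than the other. It is an illustration only: the labels are synthetic and are not data from the study, the class proportions are an assumption (the text does not state the class balance at level 1), and `binary_metrics` is a hypothetical helper name.

```python
def binary_metrics(y_true, y_pred, positive):
    """Accuracy, precision, recall and F1 with one class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return accuracy, precision, recall, f1


# Synthetic labels, NOT data from the study: 95 'operating' and 5 'idle' windows.
y_true = ["operating"] * 95 + ["idle"] * 5
# A classifier that finds only 1 of the 5 idle windows and raises 1 false alarm.
y_pred = ["operating"] * 94 + ["idle"] + ["operating"] * 4 + ["idle"]

acc, prec, rec, f1 = binary_metrics(y_true, y_pred, positive="idle")
print(f"accuracy={acc:.2f} precision={prec:.2f} recall={rec:.2f} f1={f1:.2f}")
# accuracy=0.95 precision=0.50 recall=0.20 f1=0.29
```

Here 95 of 100 windows are labelled correctly, yet four of the five rare-class windows are missed, which is why accuracy alone is not a sufficient performance index for the first classifier in a cascade.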
The operation recognition results of identification level 2 are given in Table 2 and Fig. 2. Classifier 2.1 distinguishes the major operation category of the given input data. The ANN identifies the operations with 100% accuracy and F1 score. This classifier has a simple network architecture: one hidden layer and 11 neurons. All the performance indices show a similar trend across the learning algorithms. SVM and DA show comparable performance for this identification task. They rank next to the ANN, in contrast to the results at identification level 1. Interestingly, both SVM and DA follow a discriminant, or boundary-based, classification strategy; for the current identification task, however, SVM used a polynomial kernel function whereas DA used a linear discriminant. All other learning algorithms show less than 95% accuracy and F1 score for this identification task. Compared with identification level 1, all learning algorithms improved their F1 score: even where accuracy decreased, the retrieval of relevant information improved.

FIG. 2: Accuracy of prediction for identification level 2
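The simple network architecture reported for Classifier 2.1 can be pictured with a forward pass through a network with a single hidden layer. The sketch below is illustrative only and makes several assumptions that are not stated in the text: the 11 neurons are read as the hidden layer size, the input has 8 features, there are 3 output classes, the hidden units use a tanh activation, and the weights are random rather than trained. All function names are hypothetical.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases for a fully connected layer."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(x, weights, biases):
    """Affine map: one output per row of the weight matrix."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def softmax(z):
    """Numerically stable softmax."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def forward(x, hidden, output):
    """Single-hidden-layer forward pass with tanh hidden units."""
    h = [math.tanh(v) for v in dense(x, *hidden)]
    return softmax(dense(h, *output))


# Assumed sizes: only the figure of 11 neurons comes from the text.
N_FEATURES, N_HIDDEN, N_CLASSES = 8, 11, 3
rng = random.Random(0)
hidden = init_layer(N_FEATURES, N_HIDDEN, rng)   # untrained weights
output = init_layer(N_HIDDEN, N_CLASSES, rng)    # untrained weights

probs = forward([0.1] * N_FEATURES, hidden, output)
predicted_class = probs.index(max(probs))
```

With these assumed sizes the network has 8 × 11 + 11 + 11 × 3 + 3 = 135 trainable parameters, which gives a sense of how small such a classifier is compared with deep architectures.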
Identification level 3 has three classifiers, each assigned to identify sub-operation categories. Each of these identification tasks contains six categories with an equal number of instances, so the classifiers need not handle the problem of an unbalanced dataset. From this identification level onwards, the stark difference between the performance of the ANN and that of the other learning algorithms is evident (Table 3 and Fig. 3). The ANN delivers close to 100% accuracy and F1 score for all classifiers. Except for Classifier 3.2 using kNN, all other algorithms demonstrate a considerable decline in identification performance. The similarity among sub-operation classes is much higher than in the previous identification tasks. As the complexity of identification increases, all learning algorithms except the ANN fail to achieve the performance required for this identification task.
The operation identification results of identification level 4 are given in Table 4 to Table 7 and Fig. 4. For convenience, the performance parameters are displayed in separate tables, and the summarised results can be seen in Fig. 4. There are 20 classifiers (Classifier 4.1 to Classifier 4.20) at this identification level. For clarity, only the best, the worst and the median results are included in the figures. At this final identification level, the classifiers need to identify the construction stage at which the operation happens. The operations to be classified are essentially the same except for a minor difference in the stage of construction, which makes the identification tasks at this level extremely difficult. Nevertheless, the ANN classifiers perform consistently well here. Except for Classifier 4.8 and Classifier 4.14, all classifiers deliver an accuracy and F1 score of 100%. The operations classified by these two classifiers are observed to be related to support 6, which shows the dependency of the results on the data collected from that particular support. Considering the complexity of the identification problem, the accuracy is good enough and meets the purpose of identification. All other performance indices exhibit a similar pattern.
kNN is observed to be the second-best learning algorithm based on overall performance. However, only 7 out of its 20 classifiers delivered 100% accuracy, and the accuracy of the worst-performing kNN classifiers (Classifier 4.14 and Classifier 4.20) is as low as 75%. The prediction results of the other learning algorithms are not comparable. These results emphasize the significance of selecting the correct machine learning algorithm for operation identification.
In summary, ANN classifiers deliver the best performance in operation recognition at all identification levels. This shows that the ANN can model the complex non-linear decision boundary that separates different operation classes. Other machine learning algorithms cannot easily model this relationship, or their learning strategies are not efficient enough to learn the correct relationship. Another interesting observation is that all ANN classifiers have a simple network architecture. All classifiers possess only one hidden layer and, in most cases, the number of neurons is less than 10. Even for the most complex identification task, the ANN has high accuracy. The accuracy obtained here is higher than what is reported in other operation recognition studies which used complex deep learning methods (Chen et al., 2020; Kim and Chi, 2019; Rashid and Louis, 2019; Roberts and Golparvar-Fard, 2019; Slaton et al., 2020). These results indicate that, irrespective of the complexity of the identification problem, conventional machine learning methods can outperform complex identification methods when combined with the right set of features, identification framework and learning algorithm.