Role and task allocation framework for Multirobot collaboration with latent knowledge estimation

In this work, a novel framework for modeling role and task allocation in Cooperative Heterogeneous Multi-Robot Systems (CHMRSs) is presented. This framework encodes a CHMRS as a set of multidimensional relational structures (MDRSs). This set of structures defines collaborative tasks through both temporal and spatial relations between processes of heterogeneous robots. These relations are enriched with tensors, which allow for geometrical reasoning about collaborative tasks. A learning schema is also proposed in order to derive the components of each MDRS. According to this schema, the components are learnt from data reporting the situated history of the processes executed by the team of robots. Data are organized as a multirobot collaboration treebank (MRCT) in order to support learning. Moreover, a generative approach, based on a probabilistic model, is combined with nonnegative tensor decomposition (NTD) for both building the tensors and estimating latent knowledge. A preliminary evaluation of the framework is performed in simulation with three heterogeneous robots, namely, two Unmanned Ground Vehicles (UGVs) and one Unmanned Aerial Vehicle (UAV).

(1) minimizing personal injuries to rescue workers and rescue dogs by accessing unsettled structures, (2) raising the speed of response by penetrating ordinarily inaccessible voids, and (3) extending the reach of rescue workers to otherwise inaccessible regions through sensor fusion and multiple cameras.
Rescue robots have many advantages compared to rescue workers and trained rescue dogs: (1) unlike humans, a rescue robot will not become stressed or fatigued; 3 (2) rescue robots can be produced in large quantities, while experienced rescue professionals and trained rescue dogs are scarce resources; 2 (3) robots are expendable but humans and rescue dogs are not: if a rescue robot is damaged, it can be easily repaired or replaced, whereas the loss of rescue workers or dogs is far harder to bear because of their ties within society. 2 Thus, robot or multirobot deployment minimizes human exposure to danger. In planetary exploration, homogeneous robot deployment introduces redundancy, making the overall system more robust and reliable. 4 In agriculture as well as in manufacturing, the use of heterogeneous robots can overcome the payload limitations of individual robots in the accomplishment of complex tasks. 5 Likewise, in assembly, the deployment of multiple robots reduces human workload.
However, a fundamental prerequisite for making multirobot deployment effectively operational is the capability of the robots to collaborate with one another. 6,7 Collaboration strongly depends on the capability of the robots to communicate, that is, to share as well as to exchange information about context-dependent tasks. 8,9 Information sharing cannot take place without a communication platform common to all the robots, that is, a knowledge management structure in which a language for representing knowledge, a mechanism for knowledge association, a protocol for information exchange and, finally, a memory system are well established. [10][11][12][13] Languages for representing knowledge in multirobot systems are commonly based on beliefs and intentions, 14 semantic networks, 15 frame languages, 16 and resource description frameworks. 17,18 Knowledge association is responsible for the bidirectional information flow, where low-level data are passed upward and high-level information is returned downward using logical inference, 19 Bayesian inference, 14 semantic relationships and hierarchies, 20 or computational learning methods. 21 FIPA (Foundation for Intelligent Physical Agents) 22 together with KIF (Knowledge Interchange Format) 23 are the standard protocols for communication. Finally, memory is usually deployed on either centralized 24 or distributed 25 database systems.
Apart from providing a common ground for knowledge sharing that supports efficient collaboration among robots, knowledge management systems allow for information reuse. 11 In this regard, imagine a team of robots operating in an environment such as a rescue scenario or a manufacturing building. Consider the variety of data that the robots store in the database of the knowledge management system, ranging from measurements, sensory information, robot descriptions, states and commands to environmental data, such as positions, maps and spatial relations. Now, imagine the same team of robots entering the same scenario again, possibly changed due to some event. The team might access the information gathered during previous operating activities and stored in the database, and filter out the information that is irrelevant to the current activities. Then, it might integrate this information with the incoming sensor measurements and reuse the fused information to reduce the computational overhead of the current task assignment process while taking the environmental changes into account.
Recently, several research efforts in robotics as well as in Artificial Intelligence (AI) have been made for developing robotic systems based on knowledge management structures incorporating information reuse. Tenorth et al 26 introduced KnowRob, a knowledge processing system that combines knowledge representation and reasoning methods with techniques for acquiring knowledge and for grounding the knowledge in a physical system thus serving as a common semantic framework for integrating information from different sources. KnowRob combines static encyclopedic knowledge, common-sense knowledge, task descriptions, environment models, object information and information about observed actions that has been acquired from various sources.
An extension of the KnowRob framework, named RoboEarth, has been developed in Reference 12. On top of the KnowRob robot knowledge base, Beetz et al developed Open-Ease, 27 a remote knowledge representation and processing service for robots. Open-Ease makes it possible to retrieve the memorized experiences of task episodes and to pose queries regarding what the robots saw, reasoned and did, as well as how they did it, why, and what effects it caused.
Until a few years ago, knowledge management structures for modeling multi-robot systems were fully integrated into a single standalone architecture, thus limiting the knowledge of the robots to that stored in memory over the past task episodes. 16 Recently, with the advent of the emerging fields of the Internet of Things and of the Web of Things, robots can rely on cloud-computing infrastructures to access vast amounts of processing power and data in order to improve capabilities such as speech recognition, language translation, path planning and three-dimensional (3D) mapping. 5,[28][29][30][31] The possibility of accessing big data repositories, such as the Web, has made the use of learning methods and techniques in the context of multirobot systems even more effective. 32,33
In robotics, assigning tasks to multiple robots is a challenging problem, although it has numerous practical applications. In the past, several strategies for task assignment were proposed for different application domains such as surveillance (aerial/underwater), 34,35 search and rescue, 36 multirobot patrolling, 37 feeding operation, 38 and health care. 39 For example, multirobot search and rescue missions aim to rescue survivors by exploring the surrounding environment. Each subtask explores a small area of the environment and seeks survivors. The successful completion of each of these subtasks results in the completion of the entire mission. As the number of survivors (available tasks) and the size of the robot team increase, the task allocation process becomes more difficult. 40 Techniques for allocating tasks in multirobot systems are commonly based on clustering, 41,42 swarm intelligence, 43 and resources. 44 Some other task assignment methods for multirobot systems are listed in  In this regard, we propose a framework for learning multidimensional relational structures (MDRSs) regulating role and task assignment in cooperative heterogeneous multirobot systems (CHMRSs).
These structures define collaborative tasks as both temporal and spatial relations between the processes and tasks of the single robots, in a logical as well as in a geometrical fashion. Through learning, we extract the syntaxes, the semantics, and the geometrical spaces in which the MDRSs lie from memorized experiences of task episodes, stored in the form of a suitably defined Multirobot Collaboration Treebank (MRCT). Moreover, we propose a decomposition technique for dealing with both the uncertainty of the data and the missing information in the treebank. Such a decomposition is also employed as a technique for knowledge discovery, based on link prediction. This framework allows for reasoning about task assignment based on both logical and geometrical inference. This work has been extended from the deliverable report of the TRADR project. 48
The remainder of this work is organized as follows. Section 2 introduces the multidimensional relational structures. In Section 3 we describe the MRCT. Section 4 illustrates the main components of the proposed schema for learning MDRSs together with the probabilistic model for building the geometrical spaces on top of which collaboration lies. In Section 5 we describe the approach based on nonnegative tensor decomposition (NTD) for latent knowledge estimation. Section 6 concludes the work with preliminary results in simulation and discussion.

MODELING MULTIROBOT COLLABORATION THROUGH MULTIDIMENSIONAL RELATIONAL STRUCTURES
Let us consider a CHMRS composed of two heterogeneous robots, that is, a UGV, named UGV1, and a UAV, named UAV2. Suppose that UGV1 has to explore an area of the environment (see Figure 1A). In order to navigate, this robot needs a representation of the area specifying which regions are traversable. The analysis of traversability builds upon a 3D metric representation of the area. However, it might happen that, due to the harshness of the terrain, the metric map built by UGV1 is quite sparse, thus making traversability analysis very inaccurate (see the missing points on the left side of the robot in Figure 1C). In this situation, UGV1 sends a request to UAV2 to fly over the area and build a denser metric map (see Figure 1B and 1D). Upon the successful completion of this process, UGV1 requests the map of the area from UAV2, integrates its own map with the map provided by UAV2 and, finally, computes a more accurate estimate of the traversability of the surroundings. This form of collaboration can be represented by an MDRS encoding a temporal relation Before between four entities, two of type Robot and two of type Process.
Under this perspective, we model a CHMRS as a set $\mathcal{U} = \{\mathcal{S}_1, \dots, \mathcal{S}_n\}$ of multidimensional relational structures. Each MDRS $\mathcal{S}_i$ is defined by a pair $\langle \sigma_i, Y_i \rangle$, where $\sigma_i$ is the signature of the MDRS. The signature comprises a relational symbol $R_i$ of arity $K \in \mathbb{N}$, a finite set $\Sigma_i = \{\tau_1, \dots, \tau_{n_i}\}$ of types, also called sorts, with $n_i \le K$, a finite set $C_{i,\tau_k}$ of constant symbols $c_{i,\tau_k}$ for each sort $\tau_k \in \Sigma_i$ and, finally, a countable set $V_{i,\tau_k}$ of variable symbols $v_{i,\tau_k}$ for each sort $\tau_k \in \Sigma_i$. Furthermore, $Y_i \in \mathbb{R}_+^{M_1 \times \dots \times M_K}$ is a nonnegative multidimensional matrix, also called a $K$-order tensor, where each $M_k$ is equal to the cardinality of the set $C_{i,\tau_k}$, with $\tau_k \in \Sigma_i$ the sort of the $k$-th input term of $R_i$, for $k = 1, \dots, K$.
In other words, $Y_i$ has a number of dimensions equal to the arity $K$ of $R_i$. Along the $k$-th direction, $Y_i$ has a number of elements equal to the cardinality of the set $C_{i,\tau_k}$, where $\tau_k \in \Sigma_i$ is the sort of the $k$-th input argument of $R_i$.
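The correspondence between a signature and the shape of its tensor can be made concrete with a short sketch. The dictionary layout, the function names `tensor_shape` and `index_of`, the ordering of the constants and the value 0.7 are illustrative assumptions, not part of the framework; indices are 0-based here, whereas the text counts from 1.

```python
from itertools import product

# Hypothetical signature for a relation Before(Robot, Process, Robot, Process).
signature = {
    "relation": "Before",
    "arg_sorts": ["Robot", "Process", "Robot", "Process"],  # sort of each argument
    "constants": {
        "Robot": ["UGV1", "UAV2"],
        "Process": ["exploring", "mapping"],
    },
}

def tensor_shape(sig):
    """Return (M_1, ..., M_K): one dimension per argument of the relation."""
    return tuple(len(sig["constants"][s]) for s in sig["arg_sorts"])

def index_of(sig, constants):
    """Map a tuple of constant symbols to its tuple of 0-based indices."""
    return tuple(sig["constants"][s].index(c)
                 for s, c in zip(sig["arg_sorts"], constants))

shape = tensor_shape(signature)
all_indices = list(product(*map(range, shape)))  # every position of the tensor

# Sparse storage: only positions that carry a value are kept.
Y = {index_of(signature, ("UAV2", "mapping", "UGV1", "exploring")): 0.7}  # made-up value
```

Storing the tensor as a dictionary keyed by index tuples is only one possible choice; it keeps missing entries implicit, which is convenient for the sparse tensors discussed later.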
Note that, according to the definition of $Y_i$, each tuple of indices $(i_1, \dots, i_K)$, with $i_k = 1, \dots, M_k$, corresponds to a tuple $(c_{i,\tau_1}, \dots, c_{i,\tau_K})$ of constant symbols, with $c_{i,\tau_k} \in C_{i,\tau_k}$ and $\tau_k \in \Sigma_i$ the sort of the $k$-th input term of $R_i$, for $k = 1, \dots, K$.

FIGURE 1 UGV1 and UAV2 in both the virtual simulated environment 49 and the Robot Operating System visualizer RViz 50

The temporal relation representing the collaboration between UGV1 and UAV2 in the above example is encoded by a four-dimensional structure $\mathcal{S}_i \in \mathcal{U}$. Its signature $\sigma_i$ comprises the relation symbol Before of arity four, the sorts Robot and Process, and constant sets containing UGV1 and UAV2 (sort Robot) and mapping and exploring (sort Process). Finally, for the tuple (UAV2, mapping, UGV1, exploring) of constant symbols, there exists a tuple of indices $(i_1, i_2, i_3, i_4)$ such that $y^i_{i_1,i_2,i_3,i_4} \in \mathbb{R}_+$ is the element of the four-tensor $Y_i$ associated with this tuple. MDRSs are well suited for supporting two different types of inference. The first type is the standard logical entailment. The second type is based on the extraction of fragments (eg, tensor fibers and slices) of the tensors through mode-n operations. The strength of combining these two types of inference for role and task assignment in multirobot collaboration is demonstrated in the following example.

Example 1. Let us consider a CHMRS composed of three UGVs, named UGV1, UGV2, and UGV3, respectively. Let us suppose that $\mathcal{U}$ includes an MDRS $\mathcal{S}_1$ whose signature $\sigma_1$ comprises the relation symbol Equal of arity four over the sorts Robot and Process. $\mathcal{S}_1$ represents a collaborative pick and place task. This task requires that two UGVs simultaneously hold and lift an object. According to this specification, this form of collaboration has been encoded by a temporal relation Equal between four entities, two of type Robot and two of type Process. Moreover, let us assume that there is another structure $\mathcal{S}_2$, with signature $\sigma_2$, representing the processes which can be performed by each robot through a binary relation Process. Now, suppose that UGV1 has to grasp an object, a task that requires the collaboration of another UGV.
Then, we want to know to which robot this collaborative task should be assigned. According to logical entailment, in order to answer this query we have to find, as usual, an interpretation $\mathcal{I} = (\mathcal{F}, \nu)$, with $\mathcal{F}$ a first-order structure and $\nu$ a variable assignment function mapping sorted variable symbols to elements of the domain of the right sort, under which the query is satisfied. Let us assume that the interpretation maps the constant symbols into themselves in the domain. Moreover, according to $\mathcal{I}$, let us assume that the tuples ⟨UGV1, grasping, UGV2, grasping⟩ and ⟨UGV1, grasping, UGV3, grasping⟩ belong to Equal$^{\mathcal{F}}$, and let $y^1_{i_1,i_2,i_3,i_4} \in \mathbb{R}_+$ and $y^1_{i_1,i_2,\hat{i}_3,i_4} \in \mathbb{R}_+$ be the elements in the four-tensor $Y_1$ associated with these tuples. Let us also assume that these elements represent the number of times that UGV1 performed this task in collaboration with the two UGVs under consideration. According to the meaning given to these tensor elements, the choice of the UGV which has to collaborate with UGV1 might be dictated by comparing the values of the two elements. Assuming that the value associated with UGV3 is the larger one, the choice falls onto UGV3.

As above, let $(i_1, i_2)$, $(\hat{i}_1, i_2)$ and $(\tilde{i}_1, i_2)$ be the three tuples of indices whose elements $y^2 \in \mathbb{R}_+$ in the matrix $Y_2$ are associated with the terms in the interpretation Process$^{\mathcal{F}}$. Now, suppose that these elements represent the failure rate of the process grasping for the corresponding robots, and in particular for UGV2 and UGV3. A comparison between the values of these two tensor elements should also be considered before assigning the task to UGV3. If the failure rate of UGV3 is greater than the failure rate of UGV2, then a better choice would be to assign the collaborative task to UGV2 rather than to UGV3.
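The two-stage choice in Example 1 can be sketched as follows. The function name `choose_partner`, the dictionary representation of the tensor elements and all numeric values are hypothetical; the decision rule is one literal reading of the example (prefer the most frequent collaborator unless it has a higher failure rate than the safest candidate), not a rule prescribed by the framework.

```python
def choose_partner(collab_count, failure_rate):
    """Prefer the most frequent collaborator, unless a safer candidate exists."""
    preferred = max(collab_count, key=collab_count.get)   # comparison on Y_1
    safest = min(failure_rate, key=failure_rate.get)      # comparison on Y_2
    if failure_rate[preferred] > failure_rate[safest]:
        return safest
    return preferred

# Made-up tensor elements for the candidates UGV2 and UGV3.
collab_count = {"UGV2": 3, "UGV3": 8}        # times each robot grasped together with UGV1
failure_rate = {"UGV2": 0.1, "UGV3": 0.4}    # failure rate of the process grasping

partner = choose_partner(collab_count, failure_rate)
```

With these values the collaboration counts alone would select UGV3, but its higher failure rate shifts the assignment to UGV2.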
Example 1 demonstrates the ability of a system modeled by a set of MDRSs to reason flexibly on multiple choices by combining logical inference with reasoning on the tensors. However, this joint reasoning mechanism, as highlighted in Example 1, strictly depends on the meaning given to the tensor elements. Moreover, it is still unclear where these multidimensional relational structures come from, how they are built and where the tensor elements come from.
To this end, we developed a learning schema through which both the signatures and the tensors of each MDRS are learnt from data reporting the situated history of the activities performed by a group of cooperative heterogeneous robots. Here, data and, in particular, their linguistic structure play a crucial role in making learning more tractable, as described in the next section.

MULTIROBOT COLLABORATION TREEBANK
Documents describing reports of missions executed by a team of robots are organized as a treebank. 51-54 A treebank is a collection of pairs $\langle s_i, \mathcal{T}_i \rangle$, where each $s_i$ is a statement and each $\mathcal{T}_i$ is a syntactic tree. Each statement is a sequence $s_{i,1}, \dots, s_{i,n}$ of words. Here, we assume that statements contain neither anaphoric nor elliptical references. Each $\mathcal{T}_i$ is composed of a set $\mathcal{N}_i$ of nodes and a set $\mathcal{E}_i$ of edges. $\mathcal{N}_i$ is composed of a set $\mathcal{N}_{i,int}$ of intermediate nodes and a set $\mathcal{N}_{i,leaf}$ of leaf nodes. Each node $n_u \in \mathcal{N}_{i,int}$ is labeled with a nonterminal symbol in the set $NT$ of a formal system FS. On the other hand, each node $n_v \in \mathcal{N}_{i,leaf}$ is labeled with a terminal symbol in the set $T$ of FS. There exists an edge $\langle n_u, n_v \rangle \in \mathcal{E}_i$, with $n_u, n_v \in \mathcal{N}_i$, if there exists a production rule of the form $\alpha \rightarrow \beta$ in FS such that the label of $n_u$ is equal to $\alpha$, the label of $n_v$ is equal to $\beta$, $\alpha \in NT$ and $\beta \in NT \cup T$. Intuitively, each syntactic tree is a derivation of the string of words composing a statement. A derivation is a sequence of rule expansions defined by the formal system.
In linguistics, a formal language is commonly used as a generative grammar. In natural language processing, instead, it is used for semantic parsing of natural language sentences, possibly unconstrained and including complex compositional expressions. In this context, the use of a formal language has a different purpose. It provides a priori a well-defined syntax for the relations encoding both individual and collaborative tasks. It also has the effect of filtering noise in the documents. Indeed, if a statement is annotated by a syntactic tree which does not correspond to any sequence of rule expansions of the formal system, then it can simply be disregarded and excluded from the learning phase. Finally, the syntactic trees associated with the statements, together with the production rules of FS, suggest a set of strategies to speed up learning of the MDRSs modeling a multirobot system.
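The noise-filtering role of the formal system can be illustrated with a minimal sketch. The production rules in `RULES`, the nested-tuple encoding of trees and the function names are toy assumptions made for this illustration; the actual rules of FS are those of Appendix A1.

```python
# Toy production rules (hypothetical): nonterminal -> set of admissible right-hand sides.
RULES = {
    "S": {("APP",)},
    "APP": {("RP", "PP")},
    "RP": {("robot", "r4")},
    "PP": {("mapped", "AreaP")},
    "AreaP": {("area", "a37")},
}

def label(node):
    """Leaves are plain strings (terminals); inner nodes are (label, children)."""
    return node if isinstance(node, str) else node[0]

def is_derivation(node):
    """True if every inner node is expanded by some production rule of RULES."""
    if isinstance(node, str):
        return True
    lab, children = node
    rhs = tuple(label(child) for child in children)
    return rhs in RULES.get(lab, set()) and all(is_derivation(c) for c in children)

good = ("S", [("APP", [("RP", ["robot", "r4"]),
                       ("PP", ["mapped", ("AreaP", ["area", "a37"])])])])
# Same constituents in an order that no rule produces.
bad = ("S", [("APP", [("PP", ["mapped", ("AreaP", ["area", "a37"])]),
                      ("RP", ["robot", "r4"])])])

treebank = [("robot r4 mapped area a37", good), ("mapped area a37 robot r4", bad)]
kept = [(s, tree) for s, tree in treebank if is_derivation(tree)]
```

Only the first pair survives the filter, which mirrors the idea that statements whose trees are not derivations of the formal system are disregarded before learning.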
In the following we provide a partial description of the formal system together with examples of syntactic trees annotating admissible statements describing the activities of heterogeneous robots cooperating in the execution of both individual and collaborative tasks. The full version of the formal system underlying the treebank is provided in Appendix A1.
The formal system consists of a set of domain-specific constituents. A constituent is either a single word or a group of words acting as a unit. For the multirobot collaboration application domain we specify atomic process phrase (APP), temporal task phrase (TTP), shifting-inhibition phrase (SIP), and atomic process execution state phrase (APESP) constituents. According to these domain-specific constituents, a statement (S) can consist of an APP, a TTP, a SIP or an APESP, as specified by the following production rule

$$S \rightarrow \mathrm{APP} \mid \mathrm{TTP} \mid \mathrm{SIP} \mid \mathrm{APESP} \qquad (1)$$

APP constituents are introduced to annotate statements describing single processes executed by individual robots. These processes comprise simple operations like acquiring an image, mapping an area of the environment or holding an object, but also spatial relations between objects, spatial relations between robots and relations of dispatching of information among robots. Syntactic trees associated with statements describing these activities and generated from the expansion of an APP constituent are shown in Figure 2.
FIGURE 2 Syntactic trees of statements on the basis of production rules (A1) to (A16) in Appendix A1

FIGURE 3 S: Robot r4 picked up object o5 inside container c1 after robot r7 lifted up lid l1 of container c1

TTP constituents syntactically specify a wide class of complex collaborative behaviors of a group of robots. According to the definition of TTP, each behavior is a composition of individual robot processes linked by a temporal constraint. An example of a syntactic tree annotating a statement represented by the expansion of a TTP is shown in Figure 3.
Statements in the treebank describing shifting and inhibition behaviors of individual robots, and statements reporting the status of the execution of a robot process, are annotated by expanding SIP and APESP constituents, respectively. Syntactic trees associated with these kinds of statements are shown in Figure 4A and 4B.
Note that the formal system differs from the semantic grammars used in case theory in its choice of word alignment. 55 In fact, it is not a completely word-aligned grammar. Constituents may correspond to multiword expressions, such as the spatial relation on top of or the process picked up. The main motivation is to facilitate learning of the MDRSs, as described in the next section.

LEARNING OF THE MULTIDIMENSIONAL RELATIONAL STRUCTURES
Let $D$ be a collection of statement-syntactic-tree pairs $\langle s_i, \mathcal{T}_i \rangle$, where each syntactic tree annotates the corresponding statement on the basis of the formal system FS introduced in Section 3. Learning of an MDRS from the dataset $D$ involves three main steps: (i) the definition of the signatures $\sigma_i$; (ii) the building of the tensors $Y_i$; and, finally, (iii) the estimation of the values of the elements of each tensor.
Signatures $\sigma_i$ are directly derived from the specification of both the constituents and the production rules of the formal system and from the syntactic trees $\mathcal{T}_i$ annotating the statements $s_i$. For example, let us consider the syntactic tree in Figure 5 annotating the following statement

S: Robot r5 inhibited wifi connection lost while moving from area a37 to area a51

The child of the root node S is labeled with the nonterminal symbol SIP of FS. As introduced in Section 3, this symbol denotes a SIP constituent. This constituent encodes the behavior of a robot that either switches between processes or inhibits a stimulus by focusing on the process at hand. These two behaviors can be discriminated by looking at the label of the right child of the node labeled with SIP. More precisely, if the right child is labeled with the nonterminal symbol SwitchP of FS, then the statement encodes a switching behavior. In other words, the production rule SIP → RP SwitchP has been applied to annotate the right branch of this subtree. Conversely, if the node is labeled with the nonterminal symbol InhibitP, then the statement represents an inhibition behavior. This means that the node labeled with SIP has been expanded by applying the production rule SIP → RP InhibitP of FS (more details about the meaning of these rules can be found in Appendix A1). The syntactic tree in Figure 5 falls into the latter case. Therefore, from the specification of both the constituents and the production rules of FS we know that the statement encodes a relation Inhibition between three entities, the first of type Robot, the second of type Stimulus and the third of type Process.
We also know that the set of constants $C_{i,Robot}$ of sort Robot contains a constant symbol r5, the set of constants $C_{i,Stimulus}$ of sort Stimulus includes a constant symbol wifi connection lost and, finally, the set of constants $C_{i,Process}$ of sort Process contains the symbol moving. By applying these simple heuristics to every pair $\langle s_i, \mathcal{T}_i \rangle$, the signature $\sigma_i$ of the MDRS representing inhibition behaviors is extracted from the treebank $D$ (see left side of Figure 6).

Now, let $\sigma_i$ be the signature of $\mathcal{S}_i$, extracted from a treebank according to the heuristics mentioned above. The tensor $Y_i$ associated with $\mathcal{S}_i$ is built as follows. The number of dimensions of $Y_i$ is fixed to be equal to the arity $K$ of $R_i \in \sigma_i$. If the $k$-th input term of $R_i$ is of sort $\tau_k \in \Sigma_i$ then, along the $k$-th direction, the number of elements is fixed to be equal to the cardinality $M_k$ of the constant set $C_{i,\tau_k}$. Moreover, each index $i_k \in \{1, \dots, M_k\}$ is linked to one and only one constant symbol $c_{i,\tau_k} \in C_{i,\tau_k}$. The third-order tensor associated with the 3D relational structure representing inhibition behaviors of a robot, built on the basis of these rules, is shown on the right side of Figure 6.
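The signature-extraction heuristics above can be sketched as an accumulation of constants per sort. The list `atoms` stands in for what would be read off the leaves of the SIP subtrees; its second entry, the function name `extract_signature` and the returned layout are hypothetical and serve only as an illustration.

```python
from collections import defaultdict

# Hypothetical ground atoms read off annotated statements:
# (relation symbol, [(sort, constant), ...]) in argument order.
atoms = [
    ("Inhibition", [("Robot", "r5"), ("Stimulus", "wifi connection lost"),
                    ("Process", "moving")]),
    ("Inhibition", [("Robot", "r2"), ("Stimulus", "low battery"),
                    ("Process", "moving")]),
]

def extract_signature(relation, atoms):
    """Collect the argument sorts and the constant set of each sort for one relation."""
    arg_sorts = None
    constants = defaultdict(list)
    for rel, args in atoms:
        if rel != relation:
            continue
        sorts = [sort for sort, _ in args]
        if arg_sorts is None:
            arg_sorts = sorts
        elif sorts != arg_sorts:
            raise ValueError("inconsistent argument sorts for " + relation)
        for sort, const in args:
            if const not in constants[sort]:   # keep first-seen order, no duplicates
                constants[sort].append(const)
    return {"relation": relation, "arg_sorts": arg_sorts, "constants": dict(constants)}

sig = extract_signature("Inhibition", atoms)
```

The position of each constant in its list can then serve as the index linked to that constant along the corresponding direction of the tensor.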
Finally, the last step of learning of an MDRS is to estimate the values of the entries of the tensors. To this end, we propose a generative approach based on a probabilistic model, similar to those applied for document classification and information retrieval. [56][57][58] Let $\mathcal{L}_i$ be the language defined by the signature $\sigma_i$ of the MDRS $\mathcal{S}_i$. Let $\mathcal{H}_i$ be the Herbrand Universe of $\mathcal{L}_i$, namely, the set of all ground terms of $\mathcal{L}_i$. Let $\mathcal{B}_i$ be the Herbrand Base of the signature $\sigma_i$, namely, the set of all possible ground atoms that can be formed from the relation symbol $R_i \in \sigma_i$ and terms in $\mathcal{H}_i$. We denote with $\mathcal{B}_i^D \subseteq \mathcal{B}_i$ the subset of those ground atoms in $\mathcal{B}_i$ whose argument terms label the leaf nodes of the syntactic trees $\mathcal{T}_i$ annotating the statements $s_i$ in the treebank $D$ encoded by relation $R_i$. The Herbrand interpretation $R_i^{\mathcal{F}_i}$ specifies which ground atoms are true in the interpretation. Now, we introduce a binary vector $\mathbf{x} \in \{0, 1\}^W$, where $W$ is the cardinality of $\mathcal{B}_i^D$ and each component $x_w$ is equal to 1 if the labels of the leaf nodes of the syntactic trees $\mathcal{T}_i$ annotating the statements $s_i$ encoded by relation $R_i \in \sigma_i$ are the argument terms of the ground atoms which are true, that is, that are in $R_i^{\mathcal{F}_i}$. We denote with $\mu_w \in [0, 1]$ the probability that $x_w = 1$. Then the probability of the vector $\mathbf{x}$ is given by

$$p(\mathbf{x} \mid \boldsymbol{\mu}) = \prod_{w=1}^{W} \mu_w^{x_w}$$

Let us assume that the pairs $\langle s_i, \mathcal{T}_i \rangle \in D$ are independent. Let us denote with $m_w$ the number of times that the component $x_w = 1$ in $D$. According to this definition, a treebank can be represented by a vector $\mathbf{m} = (m_1, \dots, m_W)^\top$ of counts whose probability distribution follows a Multinomial with parameter vector $\boldsymbol{\mu} = (\mu_1, \dots, \mu_W)^\top$

$$p(\mathbf{m} \mid \boldsymbol{\mu}, N) = \frac{N!}{m_1! \cdots m_W!} \prod_{w=1}^{W} \mu_w^{m_w} \qquad (2)$$

Here $N = \sum_{w=1}^{W} m_w$.
Now, knowing that
• from the definition of $Y_i$, each tuple of indices $(i_1, \dots, i_K)$, with $i_k = 1, \dots, M_k$, corresponds to a tuple $(c_{i,\tau_1}, \dots, c_{i,\tau_K})$ of constant symbols, with $c_{i,\tau_k} \in C_{i,\tau_k}$ and $\tau_k \in \Sigma_i$ the sort of the $k$-th input term of $R_i$;
• the leaf nodes of the syntactic trees $\mathcal{T}_i$ might or might not be labeled with symbols in the tuples $(c_{i,\tau_1}, \dots, c_{i,\tau_K})$;
• the terms $c = (c_{i,\tau_1}, \dots, c_{i,\tau_K})$ that do label leaf nodes correspond to ground atoms in $\mathcal{B}_i^D$ and hence to components $x_w$;

under these isomorphisms, the entries of the tensors $Y_i$ are filled according to the following rule

$$y^i_{i_1, \dots, i_K} = \begin{cases} p(x_w = 1 \mid \mathbf{m}) & \text{if } R_i(c_{i,\tau_1}, \dots, c_{i,\tau_K}) \in \mathcal{B}_i^D \\ \text{null} & \text{otherwise} \end{cases} \qquad (3)$$

FIGURE 6 Signature and the third-order tensor associated with the three-dimensional relational structure representing inhibition behaviors of a robot, with respect to both processes and stimulus occurrences

Here null is introduced as a placeholder to denote the missing entries of the tensor, and $p(x_w = 1 \mid \mathbf{m})$ is the posterior predictive probability of the term $c$ associated with the tuple of indices $(i_1, \dots, i_K)$. This probability is computed according to the following distribution

$$p(x_w = 1 \mid \mathbf{m}) = \int p(x_w = 1 \mid \boldsymbol{\mu}) \, p(\boldsymbol{\mu} \mid \mathbf{m}) \, d\boldsymbol{\mu} \qquad (4)$$

Here $p(\boldsymbol{\mu} \mid \mathbf{m})$ is the posterior distribution of $\boldsymbol{\mu}$. This distribution is obtained by multiplying the likelihood in Equation (2) with the prior distribution of $\boldsymbol{\mu}$. Since the distribution over the space of the parameter vector $\boldsymbol{\mu}$ is confined to a simplex of dimensionality $W - 1$, as a consequence of the constraints $0 \le \mu_w \le 1$ and $\sum_{w=1}^{W} \mu_w = 1$, a Dirichlet prior with concentration parameters $\alpha_1, \dots, \alpha_W$ is chosen. Knowing that the Dirichlet is a conjugate prior of the multinomial and that the posterior distribution is also Dirichlet, the posterior predictive distribution in Equation (4) takes the following form

$$p(x_w = 1 \mid \mathbf{m}) = \frac{\alpha_w + m_w}{\alpha_0 + N}, \qquad \alpha_0 = \sum_{w=1}^{W} \alpha_w \qquad (5)$$

that is, it is equal to the expected value $E[\mu_w \mid \mathbf{m}]$ of the posterior. Therefore, when the condition of the rule in (3) is satisfied, the values of the elements $y^i_{i_1, \dots, i_K}$ of a tensor $Y_i$ are fixed to be equal to the probabilities computed according to Equation (5).
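The filling step can be sketched numerically under a symmetric Dirichlet prior. The function names, the choice of a single concentration value `alpha` shared by all components, the use of `None` for null, and the counts are assumptions made for this illustration; the closed form used is the standard Dirichlet-multinomial posterior mean.

```python
from itertools import product

def posterior_predictive(counts, alpha=1.0):
    """E[mu_w | m] = (alpha + m_w) / (W * alpha + N) for a symmetric Dirichlet prior."""
    W = len(counts)                 # number of ground atoms observed in the treebank
    N = sum(counts.values())        # total number of observations
    return {atom: (alpha + m) / (W * alpha + N) for atom, m in counts.items()}

def fill_tensor(shape, counts, alpha=1.0):
    """Observed positions get their posterior predictive value, the others None (null)."""
    probs = posterior_predictive(counts, alpha)
    return {idx: probs.get(idx) for idx in product(*map(range, shape))}

# Made-up counts m_w, keyed by the (0-based) index tuple of each observed ground atom.
counts = {(0, 0, 0): 3, (1, 1, 0): 1}
Y = fill_tensor((2, 2, 1), counts)
```

With these counts and `alpha = 1.0`, the two observed positions receive 4/6 and 2/6, while the remaining positions stay null and are left to the decomposition step.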
However, after this step, the tensor turns out to be sparse due to the lack of ground atoms in the interpretation $R_i^{\mathcal{F}_i}$. This depends upon the lack of statements in the treebank. In order to fill the missing entries of the tensors we resort to Non-Negative Tensor Decomposition (NTD). [59][60][61] Apart from completing the tensors, there are other reasons which motivate its application in the context of multi-robot collaboration. NTD reduces the dimensionality of the tensor through factorization. The components resulting from this factorization require fewer computational resources for both storage and information retrieval. NTD also filters the data, thus reducing noise. Finally, by applying NTD, new knowledge can be discovered through link prediction. 62,63

LATENT KNOWLEDGE ESTIMATION THROUGH NTD WITH MISSING DATA
The tensor $Y_i$ is decomposed according to the following model

$$Y_i = \mathcal{I} \times_1 A^{(1)} \times_2 A^{(2)} \cdots \times_K A^{(K)} + E_i = \sum_{j=1}^{J} a_j^{(1)} \circ a_j^{(2)} \circ \cdots \circ a_j^{(K)} + E_i = \hat{Y}_i + E_i \qquad (6)$$

Here, $\times_k$ denotes the mode-$k$ tensor-matrix product, $\mathcal{I}$ is the identity tensor, $E_i$ is the residual error and $\circ$ denotes the outer product. The model in Equation (6) is often referred to as the Parallel Factor Analysis model with non-negativity constraints. 61,64 Our objective here is to estimate the non-negative component matrices $A^{(k)}$ or, equivalently, the set of vectors $a_j^{(k)}$, with $k = 1, \dots, K$ and $j = 1, \dots, J$, given the number of factors $J$. Approaches based on Alternating Least Squares (ALS) minimization of the squared Euclidean distance are commonly employed for estimating the component matrices in NTD. 59 However, these approaches do not deal with missing entries in $Y_i$ due to the incompleteness of the observations (see Section 4). In order to cope with this issue, we propose an approach which embeds a variant of ALS, named Fast Hierarchical Alternating Least Squares (F-HALS), 65 into an imputation-alternation schema. 66 In this approach the missing values of $Y_i$ are imputed using the interim model $\hat{Y}_i = [\![ A^{(1)}, A^{(2)}, \dots, A^{(K)} ]\!]$, computed at the $n$-th iteration, as follows

$$\tilde{Y}_i = Y_i \circledast W_i + \hat{Y}_i \circledast (\mathbf{1} - W_i) \qquad (7)$$

Here $W_i \in \{0, 1\}^{M_1 \times \dots \times M_K}$ is the indicator tensor specifying which entries are missing in $Y_i$ (its elements are 0 at the missing positions and 1 elsewhere), $\mathbf{1}$ is a tensor of the same dimension as $W_i$ whose elements are all equal to 1 and $\circledast$ is the Hadamard product. Once $\tilde{Y}_i$ is generated, the factor matrices are updated via the sequential minimization of a set of local cost functions with the same global minima (e.g., squared Euclidean distances) performed by F-HALS. 65 $\hat{Y}_i$ is then updated at every iteration according to Equation (6), as is $\tilde{Y}_i$ on the basis of Equation (7).
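More precisely, let us consider the following set of local cost functions

$$D_F^{(j)}\big(a_j^{(1)}, \dots, a_j^{(K)}\big) = \frac{1}{2} \big\| \tilde{Y}_i^{(j)} - a_j^{(1)} \circ a_j^{(2)} \circ \cdots \circ a_j^{(K)} \big\|_F^2 \qquad (8)$$

for $j = 1, \dots, J$, subject to the nonnegativity constraints. Here $a_j^{(k)} \in \mathbb{R}_+^{M_k}$ are the $j$-th column vectors of the loading matrices $A^{(k)}$, with $k = 1, \dots, K$, $\|\cdot\|_F$ is the Frobenius norm and $\tilde{Y}_i^{(j)}$ is the $j$-th sub-tensor of $\tilde{Y}_i$, defined as follows

$$\tilde{Y}_i^{(j)} = \tilde{Y}_i - \sum_{p \neq j} a_p^{(1)} \circ \cdots \circ a_p^{(K)} = \tilde{Y}_i - \hat{Y}_i + a_j^{(1)} \circ \cdots \circ a_j^{(K)} \qquad (9)$$

Based on the mode-$k$ unfolding representation of a tensor, the cost function in Equation (8) can be rewritten as follows

$$D_F^{(j)} = \frac{1}{2} \big\| \tilde{Y}^{(j)}_{i,(k)} - a_j^{(k)} \big( \{a_j\}^{\odot_{-k}} \big)^{\top} \big\|_F^2 \qquad (10)$$

for $j = 1, \dots, J$ and $k = 1, \dots, K$, where

$$\{a_j\}^{\odot_{-k}} = a_j^{(K)} \odot \cdots \odot a_j^{(k+1)} \odot a_j^{(k-1)} \odot \cdots \odot a_j^{(1)} \qquad (11)$$

Here $\odot$ denotes the Khatri-Rao product. By setting the gradient of the cost functions in Equation (10) to zero, by replacing the $\tilde{Y}^{(j)}_{i,(k)}$ terms with those in Equation (9) and by exploiting the properties of the Khatri-Rao and Kronecker products in Equation (11), we arrive at the update rules referred to as the Fast HALS NTD algorithm

$$a_j^{(k)} \leftarrow \Big[ \gamma_j^{(k)} a_j^{(k)} + \big[ \tilde{Y}_{i,(k)} A^{\odot_{-k}} \big]_j - A^{(k)} \big[ \{A^{\top} A\}^{\circledast} \oslash \big( A^{(k)\top} A^{(k)} \big) \big]_j \Big]_+ \qquad (12)$$

for $j = 1, \dots, J$ and $k = 1, \dots, K$. Here $\gamma_j^{(k)}$ are the scaling coefficients, $\tilde{Y}_{i,(k)}$ is the mode-$k$ unfolding representation of $\tilde{Y}_i$, $A^{\odot_{-k}}$ is the Khatri-Rao product of all the loading matrices except $A^{(k)}$, and $\{A^{\top} A\}^{\circledast}$ is the Hadamard product of the matrices $A^{(l)\top} A^{(l)}$, for $l = 1, \dots, K$.
$\oslash$ denotes the element-wise division.
$[\cdot]_+$ is the nonlinear half-wave rectifying projection replacing negative values of the argument by zero or by a small positive value $\varepsilon$.
$[\cdot]_j$ selects the $j$-th column vector of the matrix argument. At the $n$-th iteration the common loading factors are first updated according to the rules in Equation (12) and then fed into Equation (7) to update both $\hat{Y}_i$ and $\tilde{Y}_i$ for the next iteration. This imputation-alternation schema is illustrated in Algorithm 1.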
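The imputation-alternation loop can be sketched in a few lines of plain Python. To keep the sketch short, it uses the plain HALS column update (each column is set to the rectified least-squares minimizer of its local cost) rather than the Fast HALS rule, and it is not Algorithm 1 itself. The dictionary representation of the tensor, in which missing entries are simply absent, the function names, the random initialization and the toy data are all assumptions.

```python
import itertools
import random

def cp_entry(A, idx):
    """Model value at idx: sum over j of the product over k of A[k][idx[k]][j]."""
    total = 0.0
    for j in range(len(A[0][0])):
        term = 1.0
        for k, i in enumerate(idx):
            term *= A[k][i][j]
        total += term
    return total

def impute(Y, A, shape):
    """Keep observed entries of Y; fill the missing ones from the interim model."""
    return {idx: Y[idx] if idx in Y else cp_entry(A, idx)
            for idx in itertools.product(*map(range, shape))}

def hals_sweep(Yt, A, shape, eps=1e-9):
    """One pass of plain HALS: update every column a_j^(k) in turn."""
    K, J = len(shape), len(A[0][0])
    for j in range(J):
        for k in range(K):
            gamma = 1.0  # product of squared norms of the other modes' j-th columns
            for l in range(K):
                if l != k:
                    gamma *= sum(A[l][m][j] ** 2 for m in range(shape[l]))
            num = [0.0] * shape[k]
            for idx, y in Yt.items():
                other = 1.0
                for l in range(K):
                    if l != k:
                        other *= A[l][idx[l]][j]
                # Entry of the j-th sub-tensor: add component j back to the residual.
                resid = y - cp_entry(A, idx) + other * A[k][idx[k]][j]
                num[idx[k]] += resid * other
            for m in range(shape[k]):
                # Half-wave rectification with a small positive floor.
                A[k][m][j] = max(eps, num[m] / max(gamma, eps))

def fit(Y, shape, J, iters=60, seed=0):
    rng = random.Random(seed)
    A = [[[rng.uniform(0.1, 1.0) for _ in range(J)] for _ in range(M)] for M in shape]
    for _ in range(iters):
        Yt = impute(Y, A, shape)   # imputation step
        hals_sweep(Yt, A, shape)   # alternation step
    return A

# Toy rank-1 nonnegative tensor with two entries withheld as "missing".
a, b, c = [1.0, 2.0, 3.0], [1.0, 0.5, 2.0], [2.0, 1.0]
shape = (3, 3, 2)
full = {(i, j, k): a[i] * b[j] * c[k]
        for i, j, k in itertools.product(*map(range, shape))}
missing = {(0, 0, 0), (2, 1, 1)}
Y = {idx: v for idx, v in full.items() if idx not in missing}
A = fit(Y, shape, J=1)
```

On this toy example the fitted model reproduces the observed entries and recovers the withheld ones, for instance a value close to 2.0 at position (0, 0, 0). For tensors of realistic size, an array library and the Fast HALS form of the update would be the natural choice.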
Let $A^{(k)} = [a_1^{(k)}, a_2^{(k)}, \dots, a_J^{(k)}]$, for $k = 1, \dots, K$, be the set of common loading factors estimated according to the imputation-alternation Fast HALS in Algorithm 1. Let $(i_1, i_2, \dots, i_K)$ be a fixed tuple of indices, with $i_k = 1, \dots, M_k$. Then, the estimate of the value of the tensor at position $(i_1, i_2, \dots, i_K)$ is given by

$$\hat{y}^i_{i_1, i_2, \dots, i_K} = \sum_{j=1}^{J} a^{(1)}_{i_1 j} \, a^{(2)}_{i_2 j} \cdots a^{(K)}_{i_K j}$$

However, reasoning about multiple choices for decision-making in a multirobot system requires that entire one-dimensional fragments be extracted from the tensors. To this end, we resort to mode-$k$ fiber operations on the tensors. Fibers are obtained from tensors by fixing all indices except one, as illustrated in Figure 7. In Example 2 we illustrate how these operations support reasoning as well as how link prediction handles knowledge discovery.

Example 2.
Let us consider the scenario described in Example 1. UGV1 has to grasp an object, and this requires the collaboration of another UGV. Therefore, we have to decide which UGV has to take on this collaborative task. Now, let us assume that the set $C_{1,\mathrm{Robot}}$ of the constant symbols of sort Robot includes a constant symbol referring to another UGV, namely UGV4. This means that the treebank under consideration contained a pair, formed by a statement $s_i$ and its syntactic tree, which encoded a temporal relation Equal between four entities, two of type Robot and two of type Process, and in which UGV4 appeared as the label of a leaf of the right subtree. Moreover, let us assume that the tuple ⟨UGV1, grasping, UGV4, grasping⟩ does not belong to the Herbrand interpretation of Equal in the first multidimensional relational structure. This means that there was no pair in the treebank that, as above, encoded a temporal relation Equal and in which UGV1 appeared as the label of a leaf of the left subtree. Consequently, the ground atom Equal(UGV1, grasping, UGV4, grasping) does not belong to the corresponding set of ground atoms (see Section 4). Therefore, once the tensor $Y_1$ has been built, its element at position $(i_1, i_2, \tilde{i}_3, i_4)$, corresponding to the term (UGV1, grasping, UGV4, grasping), is empty.

FIGURE 7 From left to right, 1-mode, 2-mode and 3-mode fibers of a third-order tensor
However, after the factorization of $Y_1$ by means of Algorithm 1, we have found a set $A^{(k)} = [a_1^{(k)}, a_2^{(k)}, \ldots, a_J^{(k)}]$ of factor matrices, with $k = 1, \ldots, 4$, from which the element at this position can be estimated. That is, we have an estimate of the value of the tensor element associated with the term ⟨UGV1, grasping, UGV4, grasping⟩, even though this term was not present in the treebank. In other words, through the decomposition, we have discovered new knowledge. Now, by suitably performing a fiber operation on the tensor, we obtain a one-dimensional fragment whose entries estimate the collaboration of UGV1 with all the other UGVs (or at least with all those included in the domain of discourse). By interpreting this fragment as a recommendation vector, we can decide which UGV has to be in charge of supporting UGV1 in the task of grasping the object. 67
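The decision step of Example 2 can be sketched as follows. This is a hedged illustration, not the authors' implementation: it assumes a fourth-order tensor with modes (robot, process, robot, process) and a CP-style reconstruction, the factor matrices are random placeholders rather than learnt values, and the process `moving` as well as the function names are hypothetical. The fiber along the third mode is computed directly from the factors and read as a recommendation vector.

```python
import numpy as np

ROBOTS = ["UGV1", "UGV2", "UGV4"]
PROCESSES = ["grasping", "moving"]  # "moving" is a hypothetical second process

def partner_scores(factors, requester, process):
    """Fiber along the third mode (candidate partner), all other indices fixed."""
    a1, a2, a3, a4 = factors
    r, p = ROBOTS.index(requester), PROCESSES.index(process)
    weights = a1[r] * a2[p] * a4[p]  # per-component weight of the fixed indices
    return a3 @ weights

def recommend_partner(factors, requester, process):
    """Pick the highest-scoring robot, excluding the requester itself."""
    scores = partner_scores(factors, requester, process).astype(float)
    scores[ROBOTS.index(requester)] = -np.inf
    return ROBOTS[int(np.argmax(scores))]

# Placeholder factors with J = 2 components; real ones would come from the decomposition.
rng = np.random.default_rng(0)
factors = [
    rng.random((len(ROBOTS), 2)),
    rng.random((len(PROCESSES), 2)),
    rng.random((len(ROBOTS), 2)),
    rng.random((len(PROCESSES), 2)),
]
scores = partner_scores(factors, "UGV1", "grasping")
partner = recommend_partner(factors, "UGV1", "grasping")
```

Masking the requester is a design choice of this sketch: it prevents the trivial recommendation in which a robot is asked to collaborate with itself.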

EXPERIMENTS
Throughout this paper we have shown, with several examples, the main features of the framework. In particular, the use of a many-sorted first-order logic, even if restricted to conjunctions of ground atoms, makes the framework expressive enough for modeling a heterogeneous team of robots. The use of temporal relations makes the framework expressive enough for encoding a wide class of collaborative tasks among the robots. Operations on tensors allow us to reason about multiple choices in role and task assignment. Tensor decomposition endows the framework with the capability of both dealing with missing information and discovering new knowledge through link prediction. In this section we aim to show another key attribute of the framework, namely, the capability of supporting inference for task assignment in communication-denied situations.
In this regard, we built a multirobot system, composed of two UGVs and one UAV, within a virtual simulated environment. This environment comprises two main software modules. The first module is responsible for modeling the dynamics of the system through a physics engine, which is based on the well-known Bullet Physics Library. 68 The first module also implements a layer of bidirectional interfaces which allows for the integration with the Robot Operating System (ROS). 50 The second software module is responsible for interconnecting the main functionalities of the robots under consideration (eg, mapping, planning and control), developed in ROS, with the physics engine, through these bidirectional interfaces. The first module has been developed within the cross-platform V-REP. 69 The first UGV is endowed with an arm for manipulating objects (see Figure 8A). The second UGV is equipped with a Pan-Tilt Unit carrying an RGB-D camera sensor for dense 3D reconstruction (see Figure 8B). The UAV is endowed with a two-dimensional laser range finder, mounted on a servomotor on the bottom of the chassis, for 3D scanning of the surroundings (see Figure 8C).

These robots are deployed in an urban virtual environment simulating a disaster scenario. The scenario is composed of nine buildings (see Figure 9). Some of them are accessible only from a staircase. Others are accessible only from windows, since we simulated blocked doors at the ground floor. Others can be entered only through holes in the roof caused by the collapse. With reference to the scenario in Figure 9, buildings A, C, E, F, H and I are accessible only through staircases leading to the unblocked doors of the ground floor. Building B is accessible through both the hole in the roof and a door at the first floor, which has not been blocked. Similarly, building G, namely the church on the right of Figure 9, is accessible through both the holes in the roof and a door; however, this door has been blocked. Building D is accessible only via its windows.
The goal of the mission is to ensure that each building has been visited at least once by at least one robot of the team. A graph-based topological representation of the virtual environment is provided to all the robots. 70,71 We simulate the presence of a communication channel through which robots can exchange information about the buildings yet to be visited as well as those already explored. A building is considered explored if a robot has visited the node of the topological graph associated with that building. We also simulate events which reproduce, in the virtual environment, the deterioration of the channel, up to the point of denying communication between a robot and the rest of the team.

At the beginning of the mission, we manually instructed the UGV endowed with the arm, denoted by UGV1 in the MDRS above, to explore the ground floor of building B. We instructed the UAV, encoded in the MDRS as UAV1, to enter building D through a window in order to inspect the building from the inside. Finally, we manually instructed the UGV equipped with a Pan-Tilt RGB-D camera, represented in the framework by UGV2, to move inside building H. After this initial bootstrapping phase, UGV1 successfully computed a path from its current position, represented by the term base0 in the MDRS, to building B (encoded in the MDRS by the symbol buildingB) and started to execute it. 72 Similarly, both UGV2 and UAV1 started to move toward their own destination targets. 73,74 While UGV1 was climbing the stairs at the entrance of building B, its trajectory tracking process 75 failed, causing the interruption of the task. Here, UGV1 has two possible choices: (i) recompute the path from its current position to the target or (ii) send a request for support to another robot. In order to show how the proposed framework handles collaborative tasks, we forced UGV1 to take the second option, namely to ask for support.
From the treebank we learnt that UAV1 built the map of a certain area before UGV1 explored that area. Moreover, we learnt that UGV1 requested the map of that area from UAV1 and that UAV1 sent this map to UGV1. On the other hand, from tensor decomposition we also have an estimate of this form of collaboration between UGV1 and UGV2. Therefore, UGV1 knows that, before it explores building B, the map of the building can be built by either UAV1 or UGV2. In this regard, UGV1 has to decide which robot to contact in order to receive the map of building B. According to the estimates of this form of collaboration, UGV1 chose UAV1. On the basis of this choice, UGV1 sent a request to UAV1, and UAV1 flew over building B. Then UAV1 sent the map of the building to UGV1, which first integrated it with its own map of the area, then performed traversability analysis 76 and, finally, resumed climbing the stairs.
In the meantime, UGV2 finished exploring building H and was moving toward another node of the topological graph which had not been visited yet, namely the node corresponding to building F. At this point of the mission we simulated an event which interrupted the communication between UGV2 and both UGV1 and UAV1. UGV1 and UAV1 only know that UGV2 was moving toward building F, but they do not know why UGV2 is not reachable on the communication channel. Moreover, they do not know whether UGV2 is still executing the task at hand or has switched to another task. From the treebank we learnt that during a motion task UGV2 might be elicited by several stimuli, such as wifi connection lost, subtrack failure, battery low, tip over, etc. We also learnt the behavior of UGV2 when one of these stimuli occurs. However, UGV1 and UAV1 know neither which event occurred, causing the interruption of the communications with UGV2, nor which task UGV2 is currently carrying out. This is the situation in which both the result of the tensor factorization and the mode-$k$ operations on the tensors can be employed for estimating the current status of UGV2. Indeed, by suitably extracting the one-dimensional fragments of the tensors $Y_1$ and $Y_2$, associated with the corresponding multidimensional relational structures, we obtain an estimate of the kind of stimuli which occurred as well as of the response behavior of UGV2 to these stimuli. In fact, these estimates tell us whether UGV2 shifted from the current task to another task (and, if so, which task was chosen and which stimulus occurred) or whether the robot inhibited the stimulus (and which stimulus was inhibited) to focus on the task at hand. After reasoning on these estimates, we discover that the stimulus wifi connection lost may have occurred and that UGV2 probably inhibited this stimulus to continue its current task, that is, exploring building F.
Note that, if UGV2 had shifted to another task, such as going back to the base station base0, then building F would have remained unvisited and a re-allocation of the task to either UGV1 or UAV1 would have been needed in order to accomplish the goal of the mission.
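The reasoning about the silent robot can be sketched as a comparison between two fibers, one taken from the tensor of task-shifting behaviors and one from the tensor of stimulus-inhibition behaviors. The scores below are toy placeholders chosen only to make the example readable; they are not values obtained in the experiment, and the function name is hypothetical.

```python
STIMULI = ["wifi connection lost", "subtrack failure", "battery low", "tip over"]

def explain_silence(shift_scores, inhibit_scores, stimuli=STIMULI):
    """Return the (behavior, stimulus, score) triple with the highest estimate."""
    candidates = [("shifted", s, float(v)) for s, v in zip(stimuli, shift_scores)]
    candidates += [("inhibited", s, float(v)) for s, v in zip(stimuli, inhibit_scores)]
    return max(candidates, key=lambda c: c[2])

# Toy fibers over the stimulus mode (placeholders, not experimental results).
shift_fiber = [0.10, 0.30, 0.25, 0.05]
inhibit_fiber = [0.60, 0.05, 0.10, 0.02]

behavior, stimulus, score = explain_silence(shift_fiber, inhibit_fiber)
needs_reallocation = behavior == "shifted"  # a shift would leave the target unvisited
```

With these placeholder values the most plausible explanation is that the robot inhibited the stimulus and kept its current task, so no re-allocation would be triggered.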

CONCLUSIONS
In this work we proposed a framework capable of learning a model of collaboration from data reporting the situated history of the activities performed by a team of robots. From the data we extracted information about the relations linking the robots, the tasks and the content-dependent features of the collaborative tasks. This information is used to define both the set of symbols and the terms of the many-sorted first-order language encoding multirobot collaboration. It is further used to learn the geometrical spaces underlying both individual and group task execution. The spaces are decomposed to obtain the latent factors regulating the tasks of the robots. The decomposition is also used for dealing with the lack of information in the data as well as for discovering new forms of collaboration between robots, never encountered in the memorized task episodes. We demonstrated performance in simulation experiments with a team of three heterogeneous robots. In the future, we plan to demonstrate the usefulness of the framework in different robotic domains and scenarios. In this context, more simulated experiments with different environments are needed. Finally, the impact of this framework will be measured.

PEER REVIEW INFORMATION
Engineering Reports thanks the anonymous reviewers for their contribution to the peer review of this work.

… aimed to provide natural collaboration between humans and teams of robots in urban search and rescue scenarios. At NIFTI, he designed an all-terrain 3D autonomous navigation framework for an articulated tracked robot. In 2012, he deployed the robotic system to assess damage to historical buildings and cultural artifacts in Mirandola, Italy, which had been hit by an earthquake. From 2014 until 2017 he worked for the EU-FP7-ICT-609763 project TRADR, developing new technology for long-term human-robot teaming for robot-assisted disaster response, where he developed a framework for autonomous exploration, patrolling and coverage of rescue scenarios for a team of heterogeneous robots. In 2016, he deployed a team of ground and aerial robots to collect data for 3D textured models of the interior and exterior of two churches located in Amatrice, Italy, which were badly damaged by an earthquake. Recently, he was awarded funding by the EPSRC National Centre for Nuclear Robotics (NCNR) to undertake research activities in robot teaming for collaborative terrain traversability assessment of nuclear environments. He established the Autonomous Mobile Robotics in Extreme Environments (AMREE) Lab at the University of Plymouth, UK. The AMREE is a new and emerging laboratory of SECaM pursuing cutting-edge research activities in the fields of Computer Vision, Perception, AI and Robot Learning applied to autonomous robot-assisted response in hazards, prevention and safety-critical operations. Currently, he is collaborating with a company in Dorset to automate remotely operated vehicles for Explosive Ordnance Disposal applications.

APPENDIX A. THE FORMAL SYSTEM UNDERLYING THE MULTIROBOT COLLABORATION TREEBANK
The formal system FS used for annotating the statements in the multirobot collaboration treebank, introduced in Section 3, is composed of
• a finite set NT of nonterminal symbols;
• a finite set T of terminal symbols;
• a finite set P of production rules, each rewriting a nonterminal symbol of NT into symbols of NT ∪ T;
• the symbol S ∈ NT used to represent the whole statement.

Tables A1 and A2 report the description of each nonterminal symbol of NT. Given the definition of all possible nonterminal symbols of the formal system FS, the production rules of P have been constructed as follows. A statement (S) can consist of an Atomic process phrase (APP), a Temporal task phrase (TTP), a Shifting-inhibition phrase (SIP) or an Atomic process execution state phrase (APESP). An Atomic process phrase (APP) is composed of a robot phrase (RP) and a process phrase (PP). A robot phrase is composed of a robot noun (RobotNoun) followed by a robot identifier (RobotID). A process phrase is composed of a process (Process) followed by a target phrase (TP). A target phrase can simply comprise either an object phrase (OP) or a source-destination spatial phrase (SDSpaP). It can also be composed of an object phrase (OP) followed by a destination phrase (DP).
An object phrase can be specified by either a robot phrase or by an object noun (ObjNoun) followed by an object identifier (ObjectID). A source-destination spatial phrase comprises a source particle (SourceParticle) followed by an object phrase, followed by a destination particle (DestinationParticle), followed by another object phrase.
A destination phrase can be composed of a specification phrase (SpecP), a spatial phrase (SpaP) or an interaction phrase (IntP). A specification phrase is composed of a specification particle (SpecParticle) followed by an object phrase. According to this initial specification, the production rules (A1) to (A12) are included in the formal system. Given these production rules, we can build syntactic trees of statements admitted in the treebank, such as those shown in Figure A1.
A spatial phrase can be composed of a relational phrase (RelP), a relational phrase followed by a spatial phrase, or a source-destination spatial phrase. A relational phrase is composed of a simple spatial relation (SimSpaRel) followed by an object phrase. The corresponding production rules, (A13) to (A16), have been suitably defined for aligning statements describing both robot manipulation tasks and robot motions from a starting to an ending position. Syntactic trees of such statements are illustrated in Figure 2A, Section 3, and in Figure A2.
An interaction phrase is composed of either a receiver phrase (RecP) or a receiver phrase preceded by a specification phrase. A receiver phrase is defined by an interaction particle (IntParticle) and a robot phrase. The corresponding rules added to the system, (A17) to (A19), model statements specifying a preliminary form of collaboration, such as information dispatching with or without prepositions. An example of these forms of collaboration is shown in Figure A3.
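To make the rules described so far concrete, the sketch below encodes them as a small context-free grammar with a backtracking recognizer. It is an illustration only: the rules are transcribed from the prose, so their order and numbering are not meant to match (A1) to (A19); the TTP, SIP and APESP alternatives of S are omitted; multiword terminals such as "picked up" or "on top of" are left out; and the particles ("of", "from", "to") and the identifier pattern are assumptions inferred from the example statements in the figure captions.

```python
import re

GRAMMAR = {
    "S": [["APP"]],  # TTP, SIP and APESP alternatives omitted in this sketch
    "APP": [["RP", "PP"]],
    "RP": [["RobotNoun", "RobotID"]],
    "PP": [["Process", "TP"]],
    "TP": [["OP", "DP"], ["SDSpaP"], ["OP"]],
    "OP": [["RP"], ["ObjNoun", "ObjectID"]],
    "SDSpaP": [["SourceParticle", "OP", "DestinationParticle", "OP"]],
    "DP": [["SpecP"], ["SpaP"], ["IntP"]],
    "SpecP": [["SpecParticle", "OP"]],
    "SpaP": [["RelP", "SpaP"], ["RelP"], ["SDSpaP"]],
    "RelP": [["SimSpaRel", "OP"]],
    "IntP": [["SpecP", "RecP"], ["RecP"]],
    "RecP": [["IntParticle", "RP"]],
}

LEXICON = {
    "RobotNoun": {"robot"},
    "ObjNoun": {"image", "area", "door", "handle", "table", "path", "map"},
    "Process": {"acquired", "requested", "computed", "sent"},
    "SimSpaRel": {"behind", "along", "inside"},
    "SpecParticle": {"of"},          # assumed particle
    "SourceParticle": {"from"},      # assumed particle
    "DestinationParticle": {"to"},   # assumed particle
    "IntParticle": {"to"},           # assumed particle
}
ID_PATTERN = re.compile(r"[a-z]+\d+")  # assumed shape of identifiers such as r3 or m8

def matches(symbol, token):
    """True if `token` can be a leaf labelled with the pre-terminal `symbol`."""
    if symbol in ("RobotID", "ObjectID"):
        return ID_PATTERN.fullmatch(token) is not None
    return token in LEXICON.get(symbol, ())

def parse(symbol, tokens, pos=0):
    """Yield (tree, next_pos) for each way `symbol` derives a prefix of tokens[pos:]."""
    if symbol not in GRAMMAR:
        if pos < len(tokens) and matches(symbol, tokens[pos]):
            yield (symbol, tokens[pos]), pos + 1
        return
    for rhs in GRAMMAR[symbol]:
        yield from expand(rhs, tokens, pos, (symbol,))

def expand(rhs, tokens, pos, partial):
    """Derive the symbols of `rhs` one after another, extending the partial tree."""
    if not rhs:
        yield partial, pos
        return
    for subtree, nxt in parse(rhs[0], tokens, pos):
        yield from expand(rhs[1:], tokens, nxt, partial + (subtree,))

def parse_statement(text):
    """Return the first syntactic tree covering the whole statement, or None."""
    tokens = text.lower().split()
    for tree, end in parse("S", tokens):
        if end == len(tokens):
            return tree
    return None

def leaves(tree):
    """Words at the leaves of a syntactic tree, from left to right."""
    if isinstance(tree[1], str):
        return [tree[1]]
    return [word for child in tree[1:] for word in leaves(child)]

tree_a2 = parse_statement("Robot r1 computed path p10 from area a3 to area a6")
tree_a3 = parse_statement("Robot r3 sent map m8 of area a5 to robot r1")
```

The grammar contains no left-recursive rule, which is why a plain top-down recognizer of this kind terminates on every input.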
Note that rules (A13) to (A16) can also be recursively expanded to align statements specifying forms of coordination among robots, in particular in motion tasks, as illustrated in Figure A4. Finally, a Temporal task phrase (TTP) is composed of either an Atomic process phrase (APP) or an Atomic process phrase (APP) followed by a Temporal constraint phrase (TCP). A Temporal constraint phrase (TCP) consists of a temporal constraint operator (TCO) and a Temporal task phrase (TTP). According to this definition, the following set of production rules is added to the system

TTP → APP. (A20)
TTP → APP TCP. (A21)
TCP → TCO TTP. (A22)

FIGURE A1 Syntactic trees of statements on the basis of production rules in (A1) to (A12)
FIGURE A2 Syntactic tree of the statement S: Robot r1 computed path p10 from area a3 to area a6, aligned with the production rules in (A13) to (A16)
FIGURE A3 Syntactic tree of statement S: Robot r3 sent map m8 of area a5 to robot r1, on the basis of production rules in (A17) to (A19)
FIGURE A4 Syntactic trees of statement S: Robot r3 followed robot r5 along corridor c9, on the basis of production rules in (A13) to (A16)
Recursive rules (A20) to (A22) align statements describing a wide class of complex tasks, performed by a group of robots under collaboration. Each collaborative task is a composition of individual robot tasks linked by a temporal constraint. Examples of syntactic trees of statements expressing complex collaborative tasks are shown in Figure 3, Section 3, and in Figure A5.
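The recursion in the temporal rules can be sketched as follows. The snippet treats each atomic process phrase as an unparsed run of words and only builds the nested TTP and TCP structure. The set of temporal constraint operators is an assumption: only "before" appears in the example statements, the other entry is a hypothetical placeholder.

```python
TEMPORAL_OPERATORS = {"before", "after"}  # assumed TCO lexicon; "after" is hypothetical

def parse_ttp(tokens):
    """TTP -> APP | APP TCP, with TCP -> TCO TTP; the APP is kept as plain text."""
    if not tokens:
        raise ValueError("empty statement")
    for i, token in enumerate(tokens):
        if token in TEMPORAL_OPERATORS:
            if i == 0 or i == len(tokens) - 1:
                raise ValueError("a temporal operator needs a phrase on both sides")
            app = ("APP", " ".join(tokens[:i]))
            tcp = ("TCP", ("TCO", token), parse_ttp(tokens[i + 1:]))
            return ("TTP", app, tcp)
    return ("TTP", ("APP", " ".join(tokens)))

def flatten(ttp):
    """List the atomic tasks and the operators linking them, from left to right."""
    items = [ttp[1][1]]
    if len(ttp) == 3:
        _, (_, operator), nested = ttp[2]
        items += [operator] + flatten(nested)
    return items

statement = ("Robot r3 requested map m22 of area a7 to robot r2 "
             "before robot r3 computed path p61 from area a11 to area a7")
ttp_tree = parse_ttp(statement.lower().split())
sequence = flatten(ttp_tree)
```

Because a TCP contains a TTP, a statement with several operators yields a right-nested tree, and `flatten` recovers the chain of individual robot tasks with the temporal constraint between each pair.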
A SIP (shifting-inhibition phrase) is composed of either a robot phrase followed by a switching phrase (SwitchP) or a robot phrase followed by an inhibition phrase (InhibitP). A switching phrase is composed of a switching process (SwitchProcess) followed by a switching relation process (SwitchRelP). A switching relation process is composed of a current process phrase (CPP) followed by a new process phrase (NPP) followed by a causal stimulus phrase (CSP). A current process phrase is composed of a source particle followed by a participle process phrase (PartPP). A participle process phrase is specified by a participle process (PartProcess) followed by a target phrase. A new process phrase is composed of a destination particle followed by a participle process phrase. Finally, a causal stimulus phrase is defined by a causal stimulus particle (CSParticle) followed by a stimulus phrase (StimulusP). A stimulus phrase can be composed of either a stimulus noun (SNoun) or a stimulus noun followed by a stimulus phrase. An inhibition phrase is composed of an inhibition process (InhibitProcess) followed by an inhibition relation process (InhibitRelP). An inhibition relation process is composed of a stimulus phrase followed by an inhibition conjunction phrase (InhibitConjP). This is composed of a conjunction (Conj) followed by a participle process phrase. On the basis of the above specification, the formal system includes the production rules (A23) to (A35). An example of syntactic trees of statements for task switching behaviors is illustrated in Figure 4A, Section 3, and in Figure A6.

FIGURE A5 Syntactic tree of the statement S: Robot r3 requested map m22 of area a7 to robot r2 before robot r3 computed path p61 from area a11 to area a7, aligned according to production rules in (A20) to (A22)
An atomic process execution state phrase (APESP) is composed of either a robot phrase followed by a success state execution phrase (SSEP) or a robot phrase followed by a failure state execution phrase (FSEP). A success state execution phrase is composed of a success process (SuccProcess) followed by a participle process phrase, whilst a failure state execution phrase is specified by a failure process (FailProcess) followed by a participle process. Execution status statements are annotated by the production rules (A36) to (A39). Examples of syntactic trees of statements produced by these rules are shown in Figure 4B, Section 3, and in Figure A7.
To conclude, the formal system is completed with a set of production rules generating the leaves of the syntactic trees. An excerpt of these rules is reported in the following

ObjNoun → image | area | door | handle | table | path | map. (A40)
Process → acquired | requested | picked up | computed | sent.
SimpSpaRel → on top of | behind | along | in front of | inside.