Loose Coupling: An Invisible Thread in the History of Technology

We present an interdisciplinary survey of the history of loosely coupled systems. We apply the presented concepts in communication networks and suggest hybrid self-organizing networks (SONs) as a universal model for future networks. Self-organizing networks can fulfill the tight requirements of future networks but are challenging to use due to their complexity and immaturity. Moreover, the lack of an externally defined goal and centralized control has resulted in many distributed self-organizing systems failing. This is because the nonlinear relationships between the system parts result in emergence, i.e., we cannot predict the behavior of the whole from the behavior of the parts. Furthermore, a set of local optima does not produce a global optimum. Hybrid SONs tackle these challenges with loose or weak coupling of interacting agents that combine centralized control for global optimization with distributed control for local optimization. In the loose centralized control of almost autonomous agents, decisions are made mostly locally with small delays. This architecture has beneficial properties such as stability, obtained by decoupling the feedback loops: vertically with time-scale separation and horizontally with interference avoidance. Applications of loose coupling include modular electronics and computer design, structured software design, and service-oriented architectures, especially for microservices. Cross-layer design for network optimization is a new reason to use loose coupling in networks to improve stability. We also summarize some recent trends and present a roadmap to the future. We expect that loose coupling will be widely used in self-organizing networks of future wireless systems.

µ Step size. d_k Output signal of the unknown system. Output of the adaptive filter.

I. INTRODUCTION
Loose or weak coupling [1] is a general principle independently used in many disciplines with different terms. Just as feedback [2], loose coupling is ''an invisible thread in the history of technology,'' based on the results of natural and social sciences and engineering. Complex systems use a hierarchical layer architecture. The essence of loose coupling is to reduce communications both between layers and between subsystems in the same layer, and to balance centralized and local decision making. It has many benefits and can solve problems when developing complex information and communication technologies (ICT) systems. Specifically, loose coupling supports sustainable development with its system-wide focus on resource usage. Large systems need a hierarchical and modular structure to manage complexity and cope with dynamic environments. This leads to a set of agents making local decisions within their sense-decide-act feedback loops, as in [3]. The feedback loops facilitate operating in uncertain and dynamic environments. On the other hand, global goals and requirements on resource usage call for centralized control that is aligned with local decisions. This can be achieved by organizing the set of agents into a hierarchy where the higher layer or level agents give goals and requirements for the lower layer local agents but leave detailed decision-making for those local agents. In other words, the higher and lower layer agents are vertically loosely coupled.
Loose coupling can also manage unintentional coupling, such as interference between system components (i.e., agents) at the same hierarchy level. This case is horizontal loose coupling. A hierarchical multi-agent system applying intentionally both vertical and horizontal loose coupling can operate in a dynamic environment and achieve system goals with good performance. The ability to adapt to changes in the environment and requirements can be further improved with self-organization [4], [5], [6]. That is, the agents adapt by updating their organization, structure, or architecture without any external control.
We apply loose coupling to build a hybrid self-organizing system. A minimum amount of control information moves downwards in the layer hierarchy, and a minimum amount of sensing signals moves upwards in the hierarchy or horizontally in the same layer, see FIGURE 1. We suggest this system architecture as complex systems are invariably formed by rational agents [7], [8], and self-organization improves adaptability and agility.
In this article, we introduce loose coupling, rational agents, feedback, hierarchy, self-organization, degree of centralization, and open systems and explain how they support building future complex ICT systems. Although the general principles of these concepts are widely known, this is the first time that results from a wide set of disciplines are collected together and used to propose a general architecture for complex systems. We concretize Simon's vision of vertical and horizontal loose coupling in general systems [1].
We discuss communication networks in detail as a prime example of systems entering new application areas, becoming increasingly pervasive and complex, and operating in more and more dynamic environments. Sustainable development calling for efficient resource usage further tightens the requirements set for communication networks. In addition, the networks must be stable and scalable to support future needs. Reliability is an important general performance criterion. Finally, the networks must be agile. It is well known that negative feedback can be used to improve the stability of an otherwise unstable system [11]. Our focus is on coupling between two or more feedback loops, which creates instability unless the loops are loosely coupled.
Cross-layer design can be used for joint optimization of the layers and their subsystems. The layers and subsystems must both be mutually loosely coupled to improve performance. We present how time-scale separation [12] and interference avoidance [13] belong to the loosely coupled paradigm. However, clear time-scale separation is not used in the Open Radio Access Network (O-RAN) [14]. Interference avoidance is not used in the nonorthogonal multiple access (NOMA) system [15]. Lack of loose coupling may result in instability in the network if not carefully designed. The problems with stability are demonstrated with simulations. A delay in the feedback loop increases stability problems. We also summarize some recent trends and present a roadmap to the future. This paper is a major extension of our earlier papers in [16], [17], and [18]. A historical approach is used in all the sections of this paper. The history of loose coupling is presented in detail. Furthermore, as we have observed that the history of open systems and emergence are still not well known in the IEEE literature, we decided to present their history in more detail. We often refer to survey papers and books to manage the number of references. Many concepts are explained in some detail in our earlier paper [19] using figures and references. When the literature is fragmented, knowledge about the origin of each idea has a unifying effect. The parallel threads related to hierarchy, modularity, and loose coupling are summarized in the timeline in FIGURE 3 for the last hundred years. In the figure, the development is presented in social and natural sciences, control theory, computer science, and communication theory to show the different terminology. The terminology is explained later.
The rest of this paper is organized as follows. Section II summarizes the basic ideas in intelligent systems, including feedback, rational agent and game theory, and optimal systems. Section III introduces loosely coupled systems, including vertical and horizontal coupling and some simulation results. In Section IV, we apply the ideas to self-organizing systems and communication networks. Finally, conclusions are made in Section V.

II. INTELLIGENT SYSTEMS
A. FEEDBACK SYSTEMS
1) DEFINITION OF A SYSTEM
An observer defines the boundary between a system and its environment. A system can be defined in two ways [20]. According to the first definition, which is more general, a system is a set of parts with causal relationships between the parts [21]. Without relationships between the parts, we would have a set instead of a system. An open system also has relationships with its environment. The parts are coupled through the relationships to form a whole. The coupling or interaction may be intentional or unintentional. The coupling with the environment of open systems and between their parts takes place in the form of materials, energy, or information [22].
The second definition of a system is more specific. In this definition, a system is a set of active elements called agents interacting with each other and their environment [7]. The agents adapt or learn as they interact. Since the 1500s, the word agent has meant ''the one who acts'' or ''deputy, representative'' [23]. A human agent is called an actor. In artificial intelligence, an agent is ''anything that can be viewed as perceiving its environment through sensors and acting upon that environment through actuators'' [3].

2) COMPLEX SYSTEMS
A complex system is a system with emergent behavior, i.e., system behavior cannot be predicted with analytical tools from the behavior of the parts because of the nonlinearities involved [24], [25], [26]. Thus, the system is mathematically intractable, although it could be simulated. An example of intractability is a three-body system in physics, whereas a two-body system is mathematically tractable [27].
An intricate system with no emergent properties is called complicated [26]. Although emergence is often observed, especially in biology, no theory exists for it [28]. Complex systems also require considering the fundamental limits of nature that constrain system design, the tragedy of the commons that hampers fair use of resources, and incommensurate resources that hinder decision making.

3) FEEDBACK
Because of complexity, optimization must often be done hierarchically and iteratively using feedback. The optimum cannot be found directly except in some simple linear cases, hence iteration with feedback is crucial. A feedback loop consists of sense, decision making, and act blocks. The act block controls the environment, also called the plant or process [11]. The task of the decision-making block is to move the environment from the present state to an externally given goal, which may be a desired state or improved performance, usually iteratively [29]. Performance is optimized using an optimization criterion, also called a metric. Many metrics can be combined into a utility to be maximized using a utility function such as weighted arithmetic or geometric average [30], [31], [32], [33]. Some simple feedback systems, such as a feedback amplifier, do not need a goal. Another example is a primitive reflex agent or robot [3], [34] such as an automatic vacuum cleaner or lawnmower, which are goalless. A reflex robot moves in some direction, and after finding an obstacle, it turns to a new direction selected randomly. However, the working area is defined using a boundary or constraint beyond which the robot does not move, and it may have a memory to know where it has already been.
VOLUME 11, 2023 59459 Authorized licensed use limited to the terms of the applicable license agreement with IEEE. Restrictions apply.
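The two utility functions named in the feedback discussion above, the weighted arithmetic and the weighted geometric average, can be compared with a minimal sketch. The metric names, values, and weights below are hypothetical and chosen only for illustration; they are not taken from the cited references.

```python
import math

def weighted_arithmetic_utility(metrics, weights):
    """Weighted arithmetic average of the metric values."""
    total_weight = sum(weights)
    return sum(w * m for m, w in zip(metrics, weights)) / total_weight

def weighted_geometric_utility(metrics, weights):
    """Weighted geometric average; every metric must be strictly positive."""
    total_weight = sum(weights)
    log_sum = sum(w * math.log(m) for m, w in zip(metrics, weights))
    return math.exp(log_sum / total_weight)

# Hypothetical metrics normalized to (0, 1]: throughput, energy efficiency, reliability.
balanced = [0.6, 0.6, 0.6]
lopsided = [0.98, 0.8, 0.02]   # same arithmetic average, but one metric is nearly zero
weights = [1.0, 1.0, 1.0]      # assumed equal importance

for name, metrics in (("balanced", balanced), ("lopsided", lopsided)):
    print(name,
          round(weighted_arithmetic_utility(metrics, weights), 3),
          round(weighted_geometric_utility(metrics, weights), 3))
```

In this example the arithmetic average cannot tell the two operating points apart, whereas the geometric average penalizes the point where one metric collapses. Which behavior is preferable depends on the application.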
In control systems, negative feedback targets reducing the difference between the actual and desired states, whereas positive feedback reinforces the difference. In general, negative feedback with small enough feedback gain and small enough loop delay is known to be stable, and positive feedback tends to be unstable [11], [35]. Loop delays may change a stable feedback loop to an unstable loop.
The feedback gain defines the speed of the loop, which must be decreased using a smaller feedback gain if the delay within the loop is made larger. In the case of two nested feedback loops, positive feedback can be used in the inner loop if the outer negative feedback loop dominates so that the loop as a whole forms a negative feedback loop. In a network, stability requires that negative feedback dominates at all levels of the system [36]. In complex ICT systems, many coupled feedback loops may cause instability and chaotic emergent behavior.
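The interplay of feedback gain and loop delay described above can be illustrated with a minimal discrete-time sketch. It assumes a simple integrating loop, x[k+1] = x[k] + gain * (goal - x[k - delay]), which is a textbook toy model and not the simulation setup of this paper; the function names and parameter values are hypothetical.

```python
def simulate_loop(gain, delay, goal=1.0, steps=300):
    """Return |goal - state| per step for x[k+1] = x[k] + gain * (goal - x[k - delay])."""
    history = [0.0] * (delay + 1)  # the state is 0 now and at all past instants
    errors = []
    for _ in range(steps):
        sensed = history[-(delay + 1)]  # measurement that is `delay` steps old
        history.append(history[-1] + gain * (goal - sensed))
        errors.append(abs(goal - history[-1]))
    return errors

def tail_peak(errors, window=20):
    """Largest error magnitude over the last `window` steps."""
    return max(errors[-window:])

no_delay = tail_peak(simulate_loop(gain=0.5, delay=0))          # error dies out
delayed = tail_peak(simulate_loop(gain=0.5, delay=4))           # same gain, oscillation grows
delayed_low_gain = tail_peak(simulate_loop(gain=0.2, delay=4))  # smaller gain restores stability

print(no_delay, delayed, delayed_low_gain)
```

With the same gain, adding a delay of four steps turns the converging loop into a growing oscillation, and reducing the gain makes the delayed loop converge again, at the price of a slower response.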
In present communication networks, common applications of feedback include transmitter power control, synchronization, channel estimation and equalization, automatic repeat request (ARQ), and flow and congestion control in the physical, data link, and transport layers [37], [38]. Feedback is also used in the network layer, for example, in the form of routing, admission control, handover, and load balancing [39]. Conventionally, feedback has been used strictly within each layer. In some cases, sensing information has been transmitted from the routers in the network layer to the transport layer [38], but these are exceptions. Instability may be caused by a long delay in a feedback loop [35] and tight vertical and horizontal coupling between feedback loops [12], [13], [40]. Now cross-layer design in the form of feedback loops is introduced in standards [41] and new proposals for standards [14]. These feedback loops may cause additional stability problems if not carefully designed.

4) HISTORY OF FEEDBACK SYSTEMS
The history of feedback is briefly summarized in [2]. Homeostasis and equilibrium in biological systems are closely related to the feedback concept. Bernard (1878) was the first to study homeostasis and equilibrium in living systems [42]. The term homeostasis has been used since 1926 with the meaning ''tendency toward stability among interdependent elements'' [23]. As an example of homeostasis, our body temperature is kept almost constant, independently of the temperature of the environment. Cannon (1932) defined the concepts of homeostasis and equilibrium in open systems [42]. Rosenblueth (1943) later linked homeostasis to the feedback concept. He noticed that goal-directed operation in negative feedback systems is purposeful behavior, the opposite of purposeless or random behavior [29].
The term feedback has been used since 1920, meaning ''the return of a fraction of an output signal to the input of an earlier stage'' [23]. The feedback concept has been known since antiquity [2]. Drebbel invented the thermostat in the 1600s. Watt (1769) used feedback in his steam engine, and Maxwell (1868) offered the first analysis. Minorsky (1922) developed the proportional-integral-derivative (PID) controller, and Black (1927) invented a negative feedback amplifier. Interest in feedback rose after Nyquist (1932) published his stability analysis, and the generality of the concept was understood. Since then, feedback has formed the basis of automation, a term used since 1948; the adjective ''automatic'' has been used since 1812 with a related meaning [23]. The concept of feedback has been reinvented many times with different terms.
Wiener (1948) used feedback for his cybernetics, which combines communication and control theories. The concept of artificial intelligence (AI) was developed in 1956 to separate it from cybernetics [43]. Therefore, since then, computing has been included in system theories in addition to control and communications.

5) HIERARCHY
Hierarchy is a common method to manage complexity by dividing a complex problem into smaller problems. Hierarchical systems are divided into nested and nonnested hierarchies, and nonnested hierarchies are divided into dominance and layer hierarchies [19], [44]; see FIGURE 4. In a nested hierarchy, the higher levels contain the lower levels. In a nonnested hierarchy, the higher levels do not contain the lower levels. The dominance hierarchy is also called an organizational hierarchy, as in human organizations. The layer hierarchy is a special case of dominance hierarchy where each layer contains only one decision-making block to be controlled from above. Communication networks are based on the dominance hierarchy, which is often called layer hierarchy.
Each hierarchy level or layer includes one or more modular parts called subsystems. In the physical layer, an important module is, for example, power control for each network user. The users may be coupled through radio waves; hence, power control is needed in the uplink from a mobile terminal to the base station. Simon noticed that many complex systems are hierarchical, having: i) loose or sparse connections between different levels (this is vertical loose coupling), ii) at any level of hierarchy loose connections between different subsystems (this is horizontal loose coupling), but iii) tight or dense connections within each subsystem [1], [45]. Network functionalities and resources can be allocated optimally in a hierarchical layered architecture using optimization decomposition defined in the network utility maximization (NUM) theory [31], [46], [47]. The NUM theory is based on joint optimization. Such decomposition has been used earlier in control theory [48], [49].
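One well-known form of optimization decomposition in NUM is dual decomposition, sketched below for a single shared link. The text does not say which decomposition is meant, so this choice is an assumption. The sketch maximizes the sum of w_i * log(x_i) subject to the sum of rates x_i not exceeding the link capacity. The weights, capacity, step size, and function name are hypothetical. Only one scalar, the link price, couples the sources, which makes the subproblems loosely coupled.

```python
def solve_num_dual(weights, capacity, step=0.02, iterations=200):
    """Dual decomposition for: maximize sum(w_i * log(x_i)) s.t. sum(x_i) <= capacity."""
    price = 1.0  # link price (dual variable), initial guess
    rates = []
    for _ in range(iterations):
        # Local subproblems: each source maximizes w * log(x) - price * x, giving x = w / price.
        rates = [w / price for w in weights]
        # Link update: raise the price when overloaded, lower it when underused.
        price = max(1e-9, price + step * (sum(rates) - capacity))
    return rates, price

weights = [1.0, 2.0, 3.0]   # assumed source priorities
capacity = 12.0             # assumed link capacity
rates, price = solve_num_dual(weights, capacity)
print([round(r, 4) for r in rates], round(price, 4))
```

For this problem the optimum is known in closed form, x_i = w_i * capacity / sum(w), so the iteration can be checked against it. The step size must be small enough for the price update to converge.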
The empty world hypothesis [45] presents a useful system design guideline. The hypothesis describes that in a complex system, one must consider only a small part of relationships, and the expression ''everything depends on everything'' is useless and misleading [50]. Usually, one can focus on the nearest neighbors vertically and horizontally.

6) HISTORY OF HIERARCHY
The term hierarchy has been used since the 1600s in the modern meaning ''ranked organization of persons or things'' [23]. The concept of hierarchy has been used much earlier in the army. The Roman army was one of the first formal hierarchies. The various military units in the army form a nested hierarchy, and the commanders and leaders in the command hierarchy form a nonnested hierarchy [50]. Egler (1942) and Novikoff (1945) studied hierarchy in ecology and biology, respectively [44]. Simon started scientific research on hierarchies and described hierarchy [44], [50] and modularity [1], [45], [51]. Mesarovic (1970) divided control hierarchies into stratified (nested), multilayer, and multiechelon (dominance) hierarchies [48]. The Open Systems Interconnection (OSI) reference model (1984) used the multilayer hierarchy concept [38].
In [52], the author referred to [45] and observed that biological systems use a nested hierarchy that consists of holons. The corresponding architecture was called a holarchy. In manufacturing systems, the holonic control architecture combines centralized and distributed control [53]. Some early papers on holonic manufacturing systems are mentioned in [53] and [54]. In [54], the author says that the first experiments with the holonic concept applied to manufacturing were already done in Japan in 1989. References [6], [55], and [56] present the holonic architecture as a nested or recursive architecture. In other references [53], the holonic architecture is based on a nonnested dominance hierarchy.
The concept of modularity started from the architectural theories by Bemis (1936) [51]. He used the term modular in the modern meaning ''composed of interchangeable units'' [23]. Modular electronics design (1949) was started at the National Bureau of Standards [51]. Later IBM used modular design in its computers (1957). Parnas presented criteria for modular design [57]. The modular design of products improves a system's comprehensibility and flexibility and shortens its development time [4], [57]. In modular design, each module is only loosely coupled with other modules.

7) DEGREES OF CENTRALIZATION
Complexity can be managed with various degrees of centralization, often combined with hierarchy. Architectures can be divided into centralized, decentralized, distributed, and hybrid architectures [19], [58], all used in communication networks. Decentralized and distributed architectures are not hierarchical, but centralized and hybrid architectures are hierarchical. In some disciplines, a distributed system is called a heterarchy as the opposite of a hierarchy [53], [59].
Centralized and decentralized control form two extremes of control, and distributed control is an intermediate form between them as in [60]. In centralized control, no autonomy is allowed; in decentralized control, complete autonomy exists if defined as in [49]. In decentralized systems, the agents are autonomous and compete with each other. This is the initial condition in an ad hoc network that does not yet form an actual network but only a set of nodes. An ad hoc network has no fixed infrastructure available. In distributed systems, the agents exchange information with their nearest neighbors (FIGURE 2).
In centralized control, all the intelligence is in the central agent, which must solve an optimization and decision problem with exponential complexity [61]. In decentralized and distributed control, all the intelligence is in the local agents, and the decision problem is divided into smaller problems that are easier to solve. However, a set of local optima does not result in a global optimum, as the suboptimization principle states: ''If each subsystem, regarded separately, is made to operate with maximum efficiency, the system as a whole will not operate with utmost efficiency'' [62], [63]. Therefore, systems cannot generally be optimal unless they use at least some weak centralized control [64]. Simon's bounded rationality principle (1953) explains this: the subsystems do not have full knowledge of the overall situation [65]. Although Simon's idea dates from 1953, he used the term ''bounded rationality'' for the first time in 1957 [66].
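The suboptimization principle can be illustrated with a toy example of two mutually interfering transmitters. This is a minimal sketch under assumed values (power levels, noise, and cross gain are hypothetical) and is not a model from the cited references. Each agent selfishly maximizes its own rate, while a central search maximizes the sum rate.

```python
import itertools
import math

POWER_LEVELS = [0.0, 0.5, 1.0]  # assumed discrete transmit powers
NOISE = 0.1                     # assumed noise power
CROSS_GAIN = 1.0                # assumed strong interference between the two links

def rate(own_power, other_power):
    """Rate of one link when the other link's power is treated as interference."""
    return math.log2(1.0 + own_power / (NOISE + CROSS_GAIN * other_power))

def sum_rate(powers):
    return rate(powers[0], powers[1]) + rate(powers[1], powers[0])

def selfish_equilibrium(rounds=10):
    """Each agent repeatedly picks the power that maximizes only its own rate."""
    powers = [0.0, 0.0]
    for _ in range(rounds):
        for i in (0, 1):
            other = powers[1 - i]
            powers[i] = max(POWER_LEVELS, key=lambda p: rate(p, other))
    return tuple(powers)

def global_optimum():
    """Exhaustive central search over all joint power choices."""
    return max(itertools.product(POWER_LEVELS, repeat=2), key=sum_rate)

local = selfish_equilibrium()
best = global_optimum()
print(local, round(sum_rate(local), 3))
print(best, round(sum_rate(best), 3))
```

Both agents end up at full power, which is locally optimal for each of them, but the sum rate is higher when one of them stays silent. Finding that joint choice needs a view of the whole system, which only a central or at least weakly central agent has.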
On the other hand, centralized systems exchange much control information and have long delays, which may lead to slowness and instability. Distributed systems, in turn, can require a long time to obtain a general view of the environment unless the agents exchange sufficient information, increasing communication. Distributed systems' stability analysis is complicated [67]. Decentralized control is especially suitable for decoupled and loosely coupled problems. It may also be desirable for economic reasons when there is a geographical separation between the control units or unreliable links between them [49].
Centralized and distributed control can be combined with the hierarchy concept. The hierarchy levels may have different degrees of autonomy depending on the rate of exchanged information, and the control signals can be seen as commands to be obeyed strictly or as advice that can sometimes be ignored [53]. Coalitions can be formed, and there can be cooperation within those coalitions and competition between the coalitions. This kind of hybrid architecture has a weak central agent to obtain a general view, leading to loose coupling. The lack of a general view is a major challenge of decentralized and distributed systems, bringing unpredictability to global behavior and hindering global optimization of the system [53]. On the other hand, a hybrid system has high flexibility since it can implement different degrees of centralization, from centralized to decentralized, depending on the situation in the environment.
In many systems, resilience is an important requirement. Resilience is the ''degree to which a service recovers its operational condition quickly after a failure occurs'' [68]. The best resilience can be achieved with decentralized and distributed systems. Self-organizing biological ecosystems of organisms are typically highly resilient since they are not organisms as a whole [69]. The central agent in centrally controlled systems introduces a single point that may fail.
A typical solution in our society is to use redundancy in the form of a deputy agent with all the central agent's information. In hybrid control architectures, the lower-level agents can operate without the central agent. Thus, if the central agent fails to perform its tasks or is even destroyed, the hybrid control system can continue its operation although with a lower performance, which is impossible in ordinary centrally controlled systems.
To summarize, in a distributed system, the operation is locally optimal and resilient, the changes can be fast, and stability is improved. In a centralized system, a global goal improves stability. Since a general view is available, the system can be globally optimal, although the higher layers operate at a slower rate than the local optimization at lower levels because of the delays. A hybrid system realizes loose coupling and combines the benefits of both systems.
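The hybrid architecture summarized above can be sketched as a slow central loop that only sets goals and fast local loops that track them. The sketch is a deliberately simplified assumption: the demand values, the budget, the update period, and the proportional scaling rule are all hypothetical. It also shows the degraded mode in which the central agent is missing and the local agents fall back to their own demands.

```python
def run_hybrid(demands, budget, steps=60, period=10, gain=0.5, central_alive=True):
    """Fast local tracking loops under a slow, weak central goal setter."""
    targets = list(demands)          # default goals when no central advice arrives
    states = [0.0] * len(demands)
    for k in range(steps):
        # Slow outer loop: every `period` steps the central agent rescales the goals
        # so that the total resource usage fits the global budget.
        if central_alive and k % period == 0:
            scale = min(1.0, budget / sum(demands))
            targets = [d * scale for d in demands]
        # Fast inner loops: each local agent moves toward its own goal at every step.
        states = [x + gain * (t - x) for x, t in zip(states, targets)]
    return states

demands = [4.0, 6.0, 10.0]  # assumed local resource demands
healthy = run_hybrid(demands, budget=10.0)
degraded = run_hybrid(demands, budget=10.0, central_alive=False)
print(round(sum(healthy), 3), round(sum(degraded), 3))
```

With the central agent present, the total usage settles at the budget. Without it, the local agents keep operating, but the total usage exceeds the budget, which corresponds to continued operation with lower performance.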

8) HISTORY OF DEGREES OF CENTRALIZATION
The term centralization was first used in 1801, and decentralization in 1839 [23]. The term centralization is from Napoleonic France, meaning ''concentration of administrative power in the central government at the expense of local self-government.'' Decentralization was later defined as ''act or principle of removing local or special functions of government from immediate control of central authority.'' Subsidiarity combines centralized and local control. Subsidiarity originates from social sciences [70] and is an efficient method of organizing hierarchy [71]. Subsidiarity is ''the principle that a central authority should have a subsidiary function, performing only those tasks which cannot be performed at a more local level'' [72]. A good example of subsidiarity is municipalities within a country where the municipalities are almost autonomous agents. Higher-level agents, such as countries, have a subsidiary role in the system; lower-level autonomy should be maximized to the point beyond which it would become harmful. Subsidiarity is a form of loose coupling and has similarities with the hybrid control architecture used in hybrid SONs (H-SONs). However, distributed control is not explicitly used in subsidiarity, although it is seen as desirable, as in the heterarchy [59].
Subsidiarity has been used with different terms for centuries [70]. Aristotle (300s BCE) already discussed subsidiarity. Althusius (1603) had the idea of sovereignty related to federalism and subsidiarity. The subsidiarity concept was used in the constitutions of the USA (1787) and EU (1992). The term subsidiarity (Subsidiarität) was first used in German legal literature (1809) [73]. The term comes from the Latin verb 'subsidio' (to aid or help) and the related noun 'subsidium' (aid or assistance). Pope Pius XI (1931) selected subsidiarity as one of the three basic principles of the Catholic church, and the idea became well-known. A similar idea was presented for general social systems in [74] but with different terms. Boulding emphasized that power must be well specified, and central control should not interfere with local or an individual's problems.
The heterarchy concept was first developed in the modern context by McCulloch (1945) for cognitive sciences as the opposite of hierarchy [59]. Heterarchy corresponds with a distributed control architecture.
Theoretical research on different degrees of centralization started in the 1960s, first in the areas of communications and distributed computing [49], [75], [76], [77], [78]. One of the first attempts to define the degrees of centralization in communications is in [75]. In that paper, a centralized network is a star network, a decentralized network is a hierarchical network with a central controller, and a distributed network is a mesh network. However, in [49], the decentralized control is assumed to be completely decentralized without any central controller. A hybrid system, i.e., a combination of hierarchical centralized and distributed control, was presented in Fig. 2.10 of [48].
The paper [79] describes one of the first distributed computing systems, but Dijkstra had started the whole field already in 1965 [80]. The opposite of distributed computing is local computing. Computing, in general, is divided into sequential and concurrent computing, and the latter is divided into parallel and distributed computing [81]. A shared memory, a global clock, and tight coupling between computing units characterize parallel computing. Distributed computing, in turn, is defined by a distributed memory using message passing, local clocks, and loose coupling between computing units. A special property in distributed computing is independent failures of computing units [77]. Honeywell developed the first distributed control system in 1975 [82]. In some disciplines, the hybrid form is called a holonic architecture [53].

9) OPEN SYSTEMS
The open system forms a unifying concept in systems thinking, that is, thinking of wholes [83]. Hence, understanding open systems is crucial when developing complex systems. Interactions between systems and between subsystems are possible only if they are open systems; thus, open systems are closely related to the emergence concept through nonlinear coupling.
A system is open if it exchanges matter, energy, and information with the environment [21]. Information is carried by matter or energy, for example, in the form of mail, sound, or radio waves.
If there is no exchange of matter and energy between a system and its environment, the system is called an isolated system [84], [85]. If there is an exchange of energy but no exchange of matter between a system and its environment, the system is called a closed system. Energy can be transferred by conduction or convection in a matter or by radiation without any matter [86]. Energy can also be transferred using different forces, such as gravitation.
All biological and technical systems are open, but most conventional physics focuses on isolated systems to simplify the analysis. In a closed system, the internal energy will decrease towards a minimum value at equilibrium [87], [88]. The second law of thermodynamics follows the principle of minimum energy in closed systems and the principle of maximum entropy in isolated systems. Biological systems are not isolated. Hence, they do not follow two of the most important laws of physics, namely Newtonian mechanics and the entropy law [89].
The term system of systems describes systems made of loosely coupled systems with a common goal [90]. Multiagent systems realizing feedback loops in uncertain and dynamic environments are open in the sense that they exchange information with their environment [21].

10) HISTORY OF OPEN SYSTEMS
Open systems were at least implicitly used in early celestial mechanics. Newton (1687) used a simplified model for our solar system where the Sun and each planet, respectively, form an isolated two-body system [89], which corresponds to a second-order feedback loop [27]. The model is additive, which makes analysis simple. Higher-order effects appear when the two-body systems are open, and each planet affects every other. The interactions cause some perturbations to the idealized model. Laplace (1786) developed his perturbation theory to explain the higher-order effects [91]. The theory is now used for satellite orbits, where we must consider that the Earth is not a perfect sphere and is inhomogeneous. The model must include the attraction of the Moon, solar radiation pressure, and aerodynamic drag. Henderson (1913) observed that in living systems, there must always be an exchange of matter and energy with the environment [92], but he did not use the term ''open system.'' He might be one of the first to describe the concept of open systems.
Lotka (1922) observed that biological processes generally improve their energy efficiency and simultaneously increase the total use of energy [93], which often happens also in technical systems: they become more popular when their energy efficiency is improved. According to [94], it was Lotka (1925) who introduced the open system concept, influenced by Boltzmann's statistical mechanics (1877) [95], [96], developed earlier by Maxwell (1866).
In his two-part paper (1875, 1878), Gibbs used the idea of open systems, but only for defining the chemical potential in statistical mechanics [84], [85], [97]. At about the same time as Lotka, Bauer also studied open systems independently [98]. Inspired by Lotka, von Bertalanffy (1932) further developed the open system concept. He published his results in English in 1950 [99], but in this paper, he defined open systems with the exchange of matter only; that is, the exchange of energy was not included in the definition. This is the paper from which the open system concept is best known, although the idea is 25 years older.
Prigogine (1955) developed a theory explaining why biological systems do not follow Newtonian mechanics or the entropy law [89]. Prigogine developed thermodynamics of open systems called nonequilibrium thermodynamics and showed that as open systems, biological systems are far from their equilibrium.
Eventually, information was included in the definition of open systems in addition to matter and energy [21]. In cybernetics, Wiener (1948) defined information as the quantity that can increase order and reduce uncertainty [83]. Information is related to patterns; thus, communication may be interpreted as exchanging patterns to improve order. This quantity should not be confused with the statistical information defined by Shannon (1948) since the latter has no semantic content. The importance of information became obvious after the introduction of the modern model of deoxyribonucleic acid (DNA) (1953).
A definition of automation, including the terms materials, energy, and information, was published in a Report of Automation Committee A, Radio-Electronics-Television Manufacturers Association, in 1955 [100]. The term open system was not used. Finally, Hall presented a clear and complete definition of open systems in [21]. Miller bridged the natural and social sciences with the open system concept [101]. He started the work in 1965 and published it in book form in 1978, presenting a comprehensive hierarchy of natural and social systems.
The first to recognize that human management organizations are open systems were March and Simon (1958) [102]. Later Katz and Kahn (1966) presented a detailed analysis.

11) EMERGENCE
Emergence means that the high-level properties cannot be derived from the low-level properties, and the global behavior of a system cannot be predicted from the local behavior [53], [78], [103]. Emergence often arises from nonlinear unintentional coupling and can even cause chaotic behavior.
Nonlinearity is a prerequisite for emergence, but simple nonlinear systems do not produce emergence [24], [25], [99], [104], [105]. In nonlinear systems, the principle of superposition fails; that is, the net response of a system to multiple stimuli is no longer equal to the sum of the responses to each stimulus individually. In complex nonlinear systems, the conventional analytical approach must be replaced with the systems approach, which covers the emergent phenomena [19], [106]. Emergence is a major reason why designing distributed self-organizing systems is not easy.
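The failure of superposition can be checked numerically. The following minimal sketch compares a linear and a simple nonlinear system; the two example systems and the input values are illustrative assumptions and are not taken from the cited references. It only demonstrates the failure of superposition, not emergence itself, since such a simple nonlinear system does not produce emergence.

```python
def linear(x):
    # Hypothetical linear system: the output is proportional to the input.
    return 3.0 * x


def nonlinear(x):
    # Hypothetical simple nonlinear system: the output is the squared input.
    return x ** 2


def superposition_error(system, a, b):
    # Response to the combined stimulus minus the sum of the
    # responses to each stimulus applied individually.
    return system(a + b) - (system(a) + system(b))


err_linear = superposition_error(linear, 1.0, 2.0)        # zero: superposition holds
err_nonlinear = superposition_error(nonlinear, 1.0, 2.0)  # cross term 2ab remains
print(err_linear, err_nonlinear)
```

For the squaring system, the residual is the cross term 2ab, which represents the interaction between the two stimuli.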
A rather new concept is complexity engineering, also called emergent engineering [28], [107]. In complexity engineering, which is not yet well developed, the goal is to manage emergent properties for our benefit by using strategies from biological evolution and free markets.

12) HISTORY OF EMERGENCE
Already Aristotle noticed that a whole is more than the sum of its parts. Mill (1843) and Lewes (1875) developed the emergence concept [24], [108], although Mill did not yet use the term emergence. It was further discussed by Broad (1923) and Morgan (1923) [83], [108].

B. RATIONAL AGENTS AND GAME THEORY
The challenges introduced by system complexity can be managed with rational agents [7] that are loosely coupled and realize feedback loops in uncertain and dynamic environments. The environment in controlling communication networks is the network and its environment. In the physical layer, the environment is the physical channel. As will be seen later in this paper, many disciplines have converged independently to the multi-agent model and recognized the need for a weak central agent, for example, [41], [109]. Multiobjective or joint optimization can be used in the presence of incommensurate resources and objectives (see the section on optimal systems).
In this paper, we do not discuss the details of multi-agent systems that are thoroughly described, for example, in [5], [6], and [110]. We focus on an agent-based control architecture realizing feedback loops and loose coupling. We use a broad definition of an agent: An agent is any system, implemented in whatever means, that forms a sense-decide-act feedback loop to control the environment and has in general an externally defined goal [3]. Agents may be, for example, human agents, robots, or software agents. The rational agent contains (1) optimization and robustification [111], (2) decision making, and according to the Conant-Ashby theorem, (3) a model of the environment in its memory [112], see FIGURE 5.
The separation of control and estimation of the model parameters is possible if the environment is linear, the metric is quadratic, and the noise is additive and Gaussian [113]. This is called the separation theorem. In wireless communications, a similar separation theorem is valid with similar assumptions for Kailath's estimator-correlator [114], where the correlator works as an optimized demodulator, and the estimator estimates the channel which forms the environment.
The model includes information on the earlier states of the environment to make the proactive operation possible. In optimization, the best possible solutions are selected based on certain objectives or criteria. If many objectives are to be maximized, the optimum is not unique, and separate decision making is needed. Optimal solutions are not always robust since they work optimally only in a certain environment. The term robustification or robust design is used to complement optimization [111], but it can also be interpreted as a form of optimization. In robustification, we look for systems that work well even when the conditions change. A robust system can function correctly in stressful environments or with invalid inputs [68]. Decision making selects one of the many optima to provide fairness. Usually, the selection is made on subjective grounds. In computer science, an agent is often an IF-THEN structure where IF selects one of the stimuli (corresponding to the sense operation) and THEN defines the response (corresponding to the act operation) [7].
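The IF-THEN view of an agent can be sketched as a small sense-decide-act loop. The example below is only an illustration: the thresholds, the step size, the action names, and the dB-domain environment model are hypothetical and not taken from the cited references.

```python
# Hypothetical thresholds in dB; not taken from the text.
LOW_SIR_DB = 8.0
HIGH_SIR_DB = 12.0


def decide(sensed_sir_db):
    # IF part: match the sensed stimulus against the conditions.
    # THEN part: return the response to be acted on.
    if sensed_sir_db < LOW_SIR_DB:
        return "increase_power"
    if sensed_sir_db > HIGH_SIR_DB:
        return "decrease_power"
    return "hold"


def act(power_dbm, action, step_db=1.0):
    # Apply the selected response to the controlled variable.
    if action == "increase_power":
        return power_dbm + step_db
    if action == "decrease_power":
        return power_dbm - step_db
    return power_dbm


def run_loop(power_dbm, gain_db, steps):
    # Sense-decide-act feedback loop over a static, noiseless channel (assumed).
    for _ in range(steps):
        sensed = power_dbm + gain_db  # sense: SIR in dB
        power_dbm = act(power_dbm, decide(sensed))
    return power_dbm


final_power = run_loop(power_dbm=0.0, gain_db=-5.0, steps=30)
print(final_power)
```

The loop raises the power until the sensed value enters the band between the two thresholds and then holds, which is the simplest form of a goal-directed feedback loop.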
Often intelligence and rationality are synonymous [3], but in modern psychology, rationality is seen as a broader concept that includes intelligence [115]. The latter refers to algorithmic abilities. A rational agent is an intelligent agent that is able to reach its goals efficiently with the available resources in an uncertain environment [115], [116].
In communication networks, agents include network managers, transmitters, and receivers. For example, a transmitter forms an agent in the form of a transmitter power control loop. A receiver forms an agent in the form of a channel estimator. However, the term ''agent'' is not often used in present communication networks, although feedback is used in most layers [37], [38], [39].
Agents have lately been used to model programmable networks [117] and especially SONs [41], [118], [119]. Still, loose coupling is not covered in detail, and because of possible coupling between feedback loops, there is a danger that the networks become unstable [12]. Moreover, agents have been used to provide intelligence in the form of cross-layer design using feedback; see, for example, [14], [41], [120]. Feedback is also used in ETSI Zero Touch Network and Service Management (ZSM) and ETSI Experiential Networked Intelligence (ENI) networks, developing material for prestandardization in the form of Group Specifications (GSs). Other similar efforts are summarized in [121].

1) HISTORY OF AGENTS
Originally AI focused on self-organizing systems, and according to [122], even a book was published in 1962. The approach was too optimistic, and researchers had to select more focused topics since they understood that high-level concepts could not be learned without any knowledge at all, and a bottom-up approach is needed. The first successful AI applications were expert systems that were rule-based systems imitating the decision-making process of human experts. Hewitt developed the idea of a software agent (originally an ''actor'') in 1977 [123].
At about the same time, the research on distributed artificial intelligence started [124], [125], [126]. This new field was divided into multi-agent systems and distributed problem solving. Around 1987, AI became a theory of rational agents, which unified the whole discipline [3].
In addition to AI, interacting agents were taken as the central concept also in complexity theory that focuses on self-organizing systems called complex adaptive systems (CASs) [7], [8], [127], see the section on self-organizing systems. Thus, AI research has traveled a full circle since 1962, but now self-organizing systems are studied with more focused ideas.
The holonic multi-agent system is mentioned in [6], [55], and [56] among agent organizations. In [54], the author considers agent-oriented and holonic manufacturing paradigms, which had received much attention in industry and academia at that time. The paper shows that both paradigms have different views on manufacturing control. A combination is beneficial to both paradigms.
The holon has been compared with an agent also in [56]. Originally holons were recursive structures as implied by the nested hierarchy, but this property is not characteristic of agents. Holons form holarchies generally represented as dynamic hierarchical structures, but agent architectures form horizontal and vertical organizations. A holonic manufacturing system (HMS) has been standardized in the IEC 61499 standard (2005) [109]. The standard has also been applied in smart grids [128].

C. OPTIMAL SYSTEMS
In the future smart and sustainable world, efficient use of scarce basic resources calls for optimality [19], [83], [129]. Basic resources can be divided into materials, energy, information (data and control), frequency (bandwidth), time (delay), and space (size). Even when the available resources are sufficient, sustainability calls for minimizing their use. The end of Moore's law [130] is an example of approaching the fundamental limits of nature [19], [131] and hence illustrates the importance of resource efficiency.
Optimizing resource usage prevents the tragedy of the commons [132], that is, the overuse of limited resources or commons when everyone can freely compete in their use, but the costs are divided equally, usually with some delay. The three basic solutions to the tragedy of the commons are to educate and exhort, privatize, and regulate [65]. Eventually, some form of cooperation is needed to solve the problem, as has been done for radio frequencies by the International Telecommunication Union Radiocommunication Sector (ITU-R). In complex systems, cooperation is achieved by weak commands from the higher layer agents.
System complexity challenges the efficient use of resources. Hence we suggest accompanying optimization with simplified solutions and optimized hierarchy to tackle the complexity challenges. That is, in addition to optimization by individual agents in decision making, system structure needs to be optimized. Both are considered in loose coupling, where the responsibilities of agents at different levels are defined. Self-organization, in turn, optimizes system structure while the system operates in a changing environment. Complex systems are hierarchical and modular. This structure has demonstrated many benefits in evolution and engineering. Architectural solutions can also improve resilience and robustness, but this section focuses on optimization methods.
Schoemaker noticed the central role of optimization in various disciplines [129], including natural and social sciences. In engineering, the objectives are in practice efficiency metrics [22], such as energy efficiency in bit/J. In the case of time, the objective is delay in milliseconds. Multiobjective optimization (MOO) is required to make decisions when the goals can be conflicting or even incommensurate, and constraints limit the solutions.
Decision making may be satisficing rather than optimizing [133]. Several aspects prevent full optimization. Firstly, finding the global optimum by exhaustive search is a problem with exponential complexity, and some heuristic methods must be used [61], usually hierarchically and iteratively, using feedback. Furthermore, convergence may be a problem, and an iterative optimization process may lead to a local optimum. Bellman (1957) called the complexity problem the curse of dimensionality. Loose coupling can significantly alleviate the problem [134]. When considering optimization, loose coupling is an example of a more general concept called optimization decomposition using separation principles [31].
Secondly, optimal systems imply a lack of redundancy, which in turn implies a lack of robustness. When robustness is improved through robustification, the system is not optimal anymore. Thirdly, many complex systems are distributed systems that generally cannot achieve the optimum because of emergence. Resilient systems in nature are distributed systems, suggesting that distributing system components forms a basis for resilience. In more detail, distributed systems generally cannot obtain a Pareto optimum, but in the best case, they achieve a Nash equilibrium as in game theory [135], [136], see multiobjective optimization below.
Fourthly, different resources are incommensurate; therefore, the trade-off depends on the availability of those resources, which may also depend on time. Pareto optimum cannot be objectively defined for incommensurate resources. The free market economy provides an example of satisficing decision making with incommensurate resources that can be applied also in allocating resources in complex ICT systems.
The free market economy can be seen as a self-organizing social system where the law of supply and demand finds the prices for products and services [69]. The relative prices are found in an evolutionary approach based on survivability. The process is a kind of game; hence game theory (see below) was originally applied to describe decision making in free markets. For example, the price of energy is defined by competing power producers, and the price of radio frequencies may be defined at an auction between operators organized by the state. Even in a free market, the system tends to drift toward monopolies, and centralized control (e.g., the state government) must intervene for the benefit of the citizens. In a society, perfect competition (i.e., decentralization) tends to drift toward partial cooperation, and perfect cooperation (i.e., centralization) tends to drift toward partial competition [137]. Thus, both of these extremes are somewhat unstable situations. The drift towards monopolies is called the Matthew effect.
Optimization is a broad area, and we refer to the books [61], [135], [138] for further details.

1) MULTIOBJECTIVE OPTIMIZATION (MOO)
Multiobjective optimization is the joint optimization of many objectives, also called criteria or metrics [30]. Game theory is a theoretical framework for studying MOO. Instead of agents, the term players is commonly used in the literature. A game can be non-cooperative or cooperative. A non-cooperative game with rational players leads in an evolutionary way to a Nash equilibrium (1950), which is a situation where players cannot gain anything by changing their strategy [135], [136].
If the objectives are commensurate, the ideal solution is the Pareto optimum [30], [139]. A solution is Pareto optimal if no objective can be improved without making some other objective worse. The Pareto optimum is generally neither unique nor fair. Convergence to the Pareto optimum generally requires that all players cooperate. Cooperation corresponds to a single player game [30] and the use of centralized control. Pareto optima are only stable if they are Nash equilibria. In general, a Nash equilibrium is not Pareto optimal.
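The difference between a Nash equilibrium and a Pareto optimum can be illustrated with a two-player game of the prisoner's dilemma type. The payoff values below are hypothetical and chosen only for illustration; the sketch enumerates the pure-strategy profiles and checks both properties by brute force.

```python
from itertools import product

# Hypothetical payoffs (higher is better): PAYOFF[(a1, a2)] = (u1, u2).
# Actions: "C" = cooperate, "D" = defect.
ACTIONS = ("C", "D")
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def is_nash(profile):
    # No player can gain by unilaterally changing its own action.
    a1, a2 = profile
    u1, u2 = PAYOFF[profile]
    best1 = all(PAYOFF[(alt, a2)][0] <= u1 for alt in ACTIONS)
    best2 = all(PAYOFF[(a1, alt)][1] <= u2 for alt in ACTIONS)
    return best1 and best2


def is_pareto_optimal(profile):
    # No other outcome is at least as good for both players
    # and strictly better for at least one of them.
    u = PAYOFF[profile]
    for other in PAYOFF.values():
        no_worse = all(o >= x for o, x in zip(other, u))
        better = any(o > x for o, x in zip(other, u))
        if no_worse and better:
            return False
    return True


profiles = list(product(ACTIONS, repeat=2))
nash = [p for p in profiles if is_nash(p)]
pareto = [p for p in profiles if is_pareto_optimal(p)]
print("Nash equilibria:", nash)
print("Pareto optima:", pareto)
```

In this example, the only Nash equilibrium is mutual defection, which is not Pareto optimal, and there are three Pareto optima, two of which are clearly unfair. Mutual cooperation is Pareto optimal but not a Nash equilibrium, so it is not stable without coordination.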
A Pareto optimum can be obtained in a free market, but only with strict conditions [140]. For example, all the market participants must have perfect information, and the market must be perfectly competitive, but in practice, much of the information is confidential. The problem with incomplete information can be modeled as a Bayesian probability distribution [135], [136]. This is equivalent to a Bayesian game with complete information, and the resulting equilibrium is a Bayesian equilibrium.
Since many Pareto optimal solutions may exist, we need additional criteria to select the final solution [141]. A possible criterion is fairness. Nash Bargaining Solution (NBS) forms a Pareto optimal, unique, and fair solution to multiobjective problems using a cooperative game [135], [136], [142]. NBS is a distributed solution that can be obtained when everyone negotiates with everyone else. However, in practice, not all players can or want to negotiate. In geographically distributed and dynamical situations with limited control information, the solution must be approximated since the cost of negotiation can be significant, but this cost is ignored in optimization.
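For two players, the NBS can be found by maximizing the product of the players' utility gains over the disagreement point (the Nash product). The sketch below divides one unit of a resource between two players by grid search; the utility functions, the zero disagreement point, and the grid resolution are illustrative assumptions, and the negotiation cost discussed above is ignored.

```python
def utility_1(share):
    # Hypothetical utility of player 1 for its share of the resource.
    return share


def utility_2(share):
    # Hypothetical utility of player 2, measured on a different scale.
    return 2.0 * share


DISAGREEMENT = (0.0, 0.0)  # assumed utilities if the negotiation fails


def nash_product(x):
    # Player 1 receives x, player 2 receives the remainder 1 - x.
    gain_1 = utility_1(x) - DISAGREEMENT[0]
    gain_2 = utility_2(1.0 - x) - DISAGREEMENT[1]
    return gain_1 * gain_2


# Grid search over the share of player 1 in steps of 0.01.
grid = [i / 100 for i in range(101)]
nbs_share = max(grid, key=nash_product)
print(nbs_share, utility_1(nbs_share), utility_2(1.0 - nbs_share))
```

The result is an equal split even though the two utilities are measured on different scales, which illustrates why the product, unlike a plain sum, does not depend on the units of the individual utilities.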
An alternative to the free market is coordination done by an arbitrator or leader, who can send private or public signals to the players [135], [136]. The resulting equilibrium is a correlated equilibrium. In centralized systems, all players are coordinated and have a global perspective, which helps achieve a social optimum. A social optimum is a situation where the sum of the utilities is maximized. This optimum is efficient, i.e., optimal for a social group, but, in many cases, unfair for the individuals in the group. Examples of the sum utility used for the social optimum are the sum of journey times and the sum of users' bit rates [143]. A rather new idea in social sciences is nonequilibrium economics by Georgescu-Roegen (1971) [144], inspired by a similar theory by Prigogine and further developed by Arthur (2015) and Ayres (2016). The idea has not yet been used in technical self-organizing systems.
An interesting, coordinated game is the Stackelberg game (1934), which includes a leader and a set of followers that compete with each other on certain resources [145], thus forming a hybrid solution combining centralized and distributed decision making. It has been applied in wireless networks since 2011 [146]. The Stackelberg game can be made either optimal [147] or fair [148].
Multiobjective optimization algorithms include scalarization, which searches for a single optimal solution, and multipolicy algorithms, which search for a set of optimal solutions in a single run [149]. A conventional reinforcement learning algorithm receives a scalar feedback signal for its behavior, but more generally, a multiobjective reinforcement learning algorithm called Pareto Q-learning (PQL) is needed.
In scalarization, a multiobjective problem is reduced to a single-objective problem by combining the objectives, for example, using a weighted sum or product [30]. This works well if the multiple objectives are monotonic and almost independent, and the set of all possible solutions is curving out, i.e., convex [31]. Scalarization has the same limitation with the incommensurate resources since we must somehow define the weights of the efficiency metrics. Those weights depend on the prices, which are assumed to be known in engineering design as a starting point. Furthermore, we can only find all possible Pareto optima for convex problems by selecting suitable weights.
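The limitation of weighted-sum scalarization for nonconvex problems can be seen in a small discrete example. The candidate solutions and their objective values below are hypothetical; both objectives are to be maximized, and the weight w is swept from 0 to 1.

```python
# Hypothetical candidate solutions with two objectives (f1, f2) to maximize.
CANDIDATES = {
    "A": (1.0, 0.0),
    "B": (0.0, 1.0),
    "C": (0.4, 0.4),  # Pareto optimal, but lies in a nonconvex "dent"
    "D": (0.3, 0.3),  # dominated by C
}


def weighted_sum(point, w):
    # Scalarization: combine the two objectives with weights w and 1 - w.
    f1, f2 = point
    return w * f1 + (1.0 - w) * f2


def scalarized_best(w):
    # Single-objective optimum for a given weight.
    return max(CANDIDATES, key=lambda name: weighted_sum(CANDIDATES[name], w))


def pareto_front():
    # Brute-force search for the nondominated candidates.
    front = []
    for name, p in CANDIDATES.items():
        dominated = any(
            all(q[i] >= p[i] for i in range(2)) and any(q[i] > p[i] for i in range(2))
            for q in CANDIDATES.values()
        )
        if not dominated:
            front.append(name)
    return front


found = sorted({scalarized_best(i / 10) for i in range(11)})
front = sorted(pareto_front())
print("found by weighted sum:", found)
print("Pareto front:", front)
```

Candidate C is Pareto optimal, but no weight selects it, because its weighted sum is always 0.4 while the better of A and B always reaches at least 0.5. The weight sweep therefore recovers only the part of the Pareto front that lies on the convex hull.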
The original idea regarding favorable properties of utility functions is from Kelly (1998), and the idea resulted in the NUM theory, see Section II-A. The selection of the utility function and the weights is not a scientific problem; thus, if available, the scientific solution is a set of Pareto optima, not a single optimum, unless, for example, fairness is used as an additional criterion. A form of optimization is constrained MOO, where most objectives form constraints, and only one objective is used for actual optimization. For example, the constraints may include minimum bit rate and maximum delay, and the energy consumption is minimized.
Recently, large multilayer neural networks have been used in the form of deep learning [150]. Such networks can beat humans in perfectly known fixed environments, such as in the Go game [151]. However, humans can usually beat machines in uncertain dynamic environments like the real world. Some new results on AI in uncertain environments are in [152], showing that AI can be successful also in the Stratego game. Large neural networks are tightly coupled and complex systems and may need lengthy learning times due to the generality of the structure [153]. The neural network operation is not easily understandable since hierarchy and modularity are not used. A neural network is flexible because it can provide a model of many kinds of environments, even nonlinear.

2) HISTORY OF OPTIMAL SYSTEMS
In [137], the author defined the basic resources listed above, except the bandwidth, as limiting factors of production. Moreover, the author called information know-how and knowledge. Bandwidth is a specific resource used in communications and distributed computing.
Smith (1776) proposed the free market economy. The tragedy of the commons was first outlined by Lloyd (1833) and later by Gordon (1954), Scott (1955), and Hardin (1968) [65], [154]. Edgeworth (1881) developed the idea of the Pareto optimum 25 years before Pareto (1906); hence the optimum could be called the Edgeworth-Pareto optimum. Debreu (1959), Arrow (1964), and Greenwald and Stiglitz (1986) developed the theorems for welfare economics for obtaining a Pareto optimum in a free market [140]. The desirable properties of a utility function are commented on in [30] and [32]. Kelly's (1998) discussion on utility functions led to the NUM theory [31].
Game theory was developed in economics by von Neumann and Morgenstern (1944) [135], [136]. Game theory preceded artificial intelligence but is now part of it [3]. The Nash equilibrium and Nash Bargaining Solution were introduced by Nash (1950). In communication networks, it was first applied to network optimization by Mazumdar et al. (1991), to bandwidth allocation by Yaiche et al. (2000), and to radio resource management by Boche and Schubert (2009) [136], [142].

III. LOOSELY COUPLED SYSTEMS

A. VERTICAL AND HORIZONTAL LOOSE COUPLING

1) LOOSELY COUPLED SYSTEMS
As mentioned in the introduction, loose coupling has been independently used in many disciplines with different terms. In loose coupling, all relationships between layers and subsystems in the same layer are minimized. Regarding control, centralized and local control are balanced, making the system also resilient. Loose coupling has a solid theoretical basis in the NUM theory [31], [46], [47]. The authors in these references show that clean-slate optimization naturally results in a vertically loosely coupled cross-layer solution.
Unintentional and harmful horizontal coupling often occurs in the lowest physical layer through the environment. In addition to optimizing the system, loose coupling facilitates avoiding unintended coupling. Loose coupling is the general rule of systems and systems of systems to improve stability [63], as stated by the system separability principle: ''System stability increases as the mean strength of interaction between components is decreased.'' Thus, stability can be improved by separating the parts of a system from each other, thus decreasing failure propagation.
When the size of a connected system increases, the likelihood of the system being stable decreases. The stability analysis of a complex system is, in general, a demanding task. Thus, loose coupling is a practical approach to obtaining a stable network. This is known as the golden rule of system design [155].
The degree of vertical coupling at high hierarchy levels should be loose, and the speed slow according to time-scale separation (see below). The lowest hierarchy levels are the opposite: the degree of coupling should be tight, and the speed high since there is no other level below the lowest level. In other words, an agent at higher hierarchy levels should control the next lower level weakly and roughly, but an agent at lower levels works more accurately and tightly. Ignoring these guidelines can lead to conflicts (see below).
Loose coupling facilitates the analysis of complex systems. Similarly, analysis is simplified if the interactions between subsystems are nonexistent, weak, or linear [99]. When the interactions between subsystems are nonlinear, the system may be intractable.

2) VERTICAL AND HORIZONTAL COUPLING
As described above, loose vertical coupling refers to loose connections between different levels, and loose horizontal coupling refers to loose connections between different subsystems at the same level of hierarchy.
Interference avoidance targets minimizing the interference between feedback loops at the same hierarchy level, i.e., unintended horizontal coupling, see FIGURE 6. This is loose horizontal coupling. Interference avoidance through signal design [13] is an important approach in communication networks. The signals of different users should be orthogonal.
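Orthogonal signal design can be illustrated with two users sharing the same channel. In the sketch below, the length-4 spreading codes, the symbol values, and the ideal synchronous and noiseless channel are illustrative assumptions. Because the codes have zero cross-correlation, each receiver recovers its own symbol as if the other user did not exist.

```python
# Two orthogonal spreading codes of length 4 (Walsh-type sequences).
CODE_1 = [1, 1, 1, 1]
CODE_2 = [1, -1, 1, -1]


def correlate(x, y):
    # Zero-lag cross-correlation (inner product) of two sequences.
    return sum(a * b for a, b in zip(x, y))


def spread(symbol, code):
    # Transmit one symbol by multiplying it with every chip of the code.
    return [symbol * c for c in code]


def despread(received, code):
    # Correlate with the user's own code and normalize by the code length.
    return correlate(received, code) / len(code)


# Hypothetical symbols of user 1 and user 2.
s1, s2 = +1, -1

# Ideal channel (assumed): the two signals simply add, with no noise or delay.
received = [a + b for a, b in zip(spread(s1, CODE_1), spread(s2, CODE_2))]

est1 = despread(received, CODE_1)
est2 = despread(received, CODE_2)
print(correlate(CODE_1, CODE_2), est1, est2)
```

In practice, synchronization errors and multipath propagation reduce the orthogonality, so the horizontal coupling becomes loose rather than zero.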
In time-scale separation, the lower or inner hierarchy level of a system is assumed to be operating fast enough compared to the higher or outer level so that the lower level has reached a steady state between consecutive commands from the higher level [12], [156]. For the lower level, the changes from the higher level are slow, and because of the slowness of the higher level, it sees the changes of the lower level in an averaged form. This is vertical loose coupling used to improve the stability of the system when different hierarchy levels control the same variable, such as transmitter power [12], [48], [49], [156], see FIGURE 6. At least some vertical loose coupling is needed because otherwise, there would be no control. Different time scales may allow the use of complex algorithms at higher levels where the time scale can sometimes be minutes or even hours [157]. A requisite for time-scale separation is that phenomena have different time scales in the network.
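Time-scale separation can be sketched with a fast inner loop and a slow outer loop that both influence the transmitter power. The sketch below is a simplified dB-domain model. The loop gains, the constant channel gain, the quality offset, and the update period are hypothetical and not taken from the cited references. The outer loop updates the target only once per period and uses only the averaged quality.

```python
def simulate(outer_period=20, n_outer=10):
    gain_db = -10.0         # channel gain seen by the fast loop (assumed constant)
    loss_db = 3.0           # offset unknown to the inner loop (hypothetical)
    goal_quality_db = 7.0   # goal given to the slow outer loop (hypothetical)
    power_db = 0.0
    target_sir_db = 5.0
    history = []
    for _ in range(n_outer):
        samples = []
        # Fast inner loop: drive the SIR towards the current target.
        for _ in range(outer_period):
            sir_db = power_db + gain_db
            power_db += 0.5 * (target_sir_db - sir_db)
            samples.append(power_db + gain_db - loss_db)
        # Slow outer loop: sees the lower level only in averaged form
        # and gives a new target (command) once per period.
        avg_quality_db = sum(samples) / len(samples)
        history.append((target_sir_db, power_db + gain_db, avg_quality_db))
        target_sir_db += 0.5 * (goal_quality_db - avg_quality_db)
    return history


history = simulate()
for target, sir, avg in history:
    print(f"target={target:.3f} dB  settled SIR={sir:.3f} dB  avg quality={avg:.3f} dB")
```

With these values, the inner loop has settled to the current target before each new command arrives, and the target approaches the value at which the averaged quality meets the goal (10 dB in this example).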
The difference between hierarchy levels in terms of speed is ideally several orders of magnitude so that the levels are decoupled from each other. The whole hierarchical system achieves a steady state from bottom up. In wireless communications, the changes in a physical channel form a hierarchy where the changes in the path loss are the slowest, shadowing is faster, and multipath fading is the fastest [114].
The range (sometimes called scope) in amplitude, time, frequency, and spatial domains should be broad and the resolution low at high hierarchy levels [116]. The opposite is valid at low hierarchy levels. The resolution or quantizing interval is the smallest measurable change within the range in each domain [158]. The range should increase geometrically, and the resolution decrease geometrically when one moves upwards in the hierarchy so that the complexity is reasonable at each level and the energy consumption is balanced. The ratio of resolutions at adjacent levels can be optimized to minimize computational complexity [116], often measured with energy consumption. In general, the ratio of range and resolution is roughly constant at each level [116]. The ratio can be called the number of resolution bins within the range [159].
Unintended coupling may result in conflicts. A conflict between hierarchy levels is called an interlevel conflict, and at the same hierarchy level, an intralevel conflict [48]. These conflicts are related to unintended tight vertical and horizontal coupling, respectively.
Unintended vertical coupling arises, e.g., if an upper level controls the environment at the same speed as a lower level. Such behavior may lead to conflict since the levels may control the environment (i.e., network) in different directions, implying instability and chaos. This conflict can be avoided with time-scale separation. Unintended horizontal coupling leads to interference between feedback loops and possibly instability [160], [161]. Eventually, chaotic behavior may appear. Such conflicts can be avoided with interference avoidance.
A deadlock means that the system reaches a state where it cannot continue. Deadlocks and different conflicts are commonly found in distributed systems [77], [162], [163]. Conflicts appear easily since there may be conflicting objectives between different network users. An obvious conflict comes from using common resources such as energy and bandwidth. The common resources are sometimes called ''commons,'' as in the expression ''tragedy of the commons'' [65]. In a hierarchical system, prioritizing the upper levels over the lower levels helps avoid deadlocks [48].
In control theory, chaos can be avoided in two parallel interfering loops by using the complex decoupling multivariable controller developed by Falb and Wolovich [40], [85]. This controller is based on a similar idea as in the recursive least-squares (RLS) algorithm developed by Gauss (1826) and Plackett (1950), in the Kalman filter by Swerling (1958) and Kalman (1960) [164], [165], and in the orthogonalized least-mean square (LMS) algorithm [166] where the simple LMS algorithm would have tight coupling in the form of high correlation. Orthogonality means that subsystems are isolated so that the used signals do not interfere with each other since they have zero cross-correlation [114], [167]. The convergence rate of the orthogonalized LMS algorithm is the fastest of all adaptive algorithms, similar to that of the RLS algorithm.

3) DEGREE OF COUPLING
The degree of coupling is usually defined only qualitatively, but in network theory, it has been defined quantitatively to have values between zero and unity [168]. The degree can be classified as uncoupled, loosely coupled, tightly coupled, and fully coupled [10], [153], [169]. In the uncoupled case, also called decoupled or noninteracting, the subsystems are isolated, there are no interconnections, and no real system is formed as defined by [21]. In the loosely coupled case, interconnections are loose or slow. In the tightly coupled case, the interconnections are strong, as well as in the fully coupled case, also called interleaved. In distributed software systems, loose coupling corresponds with information exchange via message passing, tight coupling with shared memory, and full coupling with function calls [124], [170].
In communication networks, the horizontal degree of coupling in the physical layer can be defined as the inverse of the received signal-to-interference ratio (SIR) [121]. Loose coupling implies that the degree of coupling is close to zero (i.e., the SIR is high), and tight coupling implies that the degree of coupling is close to unity (i.e., the SIR is low). In vertical coupling, the degree of coupling is the ratio of the speed of the higher layer and the speed of the lower layer. The speed corresponds to the bandwidth of the corresponding changes. In loose coupling, the feedback loops in the different layers and the same layer work as if the other loops do not exist. Coupling metrics for layered and modular software design are discussed in [171].
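The two definitions above can be written as simple functions. In the sketch below, the function names and the example values are hypothetical, powers are given in linear scale, and clipping the horizontal degree to unity when the SIR falls below one is our assumption to keep the value within the range from zero to unity.

```python
def horizontal_coupling(signal_power, interference_power):
    # Inverse of the received SIR in linear scale.
    # Clipping to 1.0 is an assumption, not part of the definition in the text.
    return min(1.0, interference_power / signal_power)


def vertical_coupling(higher_layer_bandwidth_hz, lower_layer_bandwidth_hz):
    # Ratio of the speed (bandwidth of the changes) of the higher layer
    # to the speed of the lower layer.
    return higher_layer_bandwidth_hz / lower_layer_bandwidth_hz


# SIR of 20 dB (a factor of 100): loose horizontal coupling.
print(horizontal_coupling(signal_power=100.0, interference_power=1.0))

# Higher layer changes at 1 Hz, lower layer at 1 kHz: loose vertical coupling.
print(vertical_coupling(higher_layer_bandwidth_hz=1.0, lower_layer_bandwidth_hz=1000.0))
```

Both example values are close to zero, which corresponds to loose coupling in both directions.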
Pautasso describes the degree of coupling as a multi-faceted phenomenon and presents 12 facets [10]. Few systems are tightly or loosely coupled according to all the facets. For example, one of the facets is interaction, which can be synchronous, i.e., tight coupling, or asynchronous, i.e., loose coupling. In asynchronous systems, a lower-level system does not wait for responses from the higher level. For service-oriented architectures, loose coupling means that software modules and services share only a small set of assumptions. Therefore, the impact of change is limited, and the software modules and services can evolve independently and rapidly and scale easily.
Software agents can be analyzed based on the 12 facets [10]. Shared or distributed agent memory and partial isolation of control loops are two examples of facets of hierarchical multiagent systems. According to [171], the types of coupling in software design can be compactly classified into parameter coupling, external medium coupling, inheritance coupling, and common coupling. Two modules have parameter coupling if one module passes a parameter to another. Two modules have external medium coupling if they access the same external medium, for example, a file. Two modules have inheritance coupling if one module descends from another module. This type of coupling is typical of object-oriented software systems. Two modules have common coupling if they use the same global variable.
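The four coupling types can be shown side by side in a short sketch. All names are hypothetical, and an in-memory text buffer stands in for the external file so that the example is self-contained.

```python
import io

# Common coupling: two functions depend on the same global variable.
GLOBAL_GAIN = 2.0


def scale(value):
    return GLOBAL_GAIN * value


def offset(value):
    return value + GLOBAL_GAIN


# Parameter coupling: one module passes a parameter to another.
def square(value):
    return value * value


def sum_of_squares(values):
    return sum(square(v) for v in values)


# External medium coupling: two modules access the same external medium.
shared_medium = io.StringIO()  # stands in for a file


def write_report(text):
    shared_medium.write(text)


def read_report():
    return shared_medium.getvalue()


# Inheritance coupling: one module descends from another.
class Agent:
    def act(self):
        return "sense-decide-act"


class TransmitterAgent(Agent):
    def act(self):
        return super().act() + " (power control)"
```

Changing `GLOBAL_GAIN` silently alters both `scale` and `offset`, which is why common coupling is usually regarded as the tightest of the four, whereas `sum_of_squares` depends on `square` only through the value it passes.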
In self-organization, a system must be neither too tightly nor too loosely coupled [85]. In communication networks, vertical loose coupling can be implemented with time-scale separation [12] and horizontal loose coupling with interference avoidance [13]. Orthogonality has been used in interference avoidance, which needs additional control but improves capacity [13]. This is a form of loose horizontal coupling. In a decentralized system, we must take care that there is no interference between different agents.

4) HISTORY OF LOOSE COUPLING
After inventing the pendulum clock, Huygens observed the resonance (1665) because of loose coupling between two clocks [172]. Later it was observed that rotating parts might have flexible mechanical couplings, providing a physical model of stable loose coupling [173]. If in a multi-body system, the distances of the bodies are large enough and one of the bodies is much larger in mass than the others, as in our solar system, a form of loose coupling is formed using mass hierarchy, and the system is highly stable.
Orthogonal signals became popular in communications after the work of Peterson et al. [174] and Gabor [175]. Packet-switched networks are based on loose coupling [176]. Loose coupling is also used in network roaming in interworking architectures [177].
Poincaré (1890) was the first to study chaotic phenomena in three-body systems, which is, in practice, an intractable problem, although Sundman (1912) found an infinite series solution with slow convergence [91]. Deterministic chaos is easily produced by a feedback loop that includes a nonlinearity [27]. In meteorology, time-scale separation can be observed between climate and weather [178].
Loosely coupled systems were first studied scientifically in [1] and [179], initially using the descriptive term near decomposability, a term that has since been used in biology [179]. The term decomposition is used in mathematical optimization [31]. The term decomposition has been used since 1762, meaning ''act or process of separating the constituent elements of a compound body; state of being decomposed'' [23]. After [45], Milne (1965) analyzed loosely coupled dynamical systems [49], [180]. Since then, such systems have had many applications in control theory.
Simon described the principle of loose coupling in physical, biological, and social systems. All multicell biological organisms use this principle because only such systems with their stable intermediate forms could succeed in evolving in the available time. They have survived since they have a fast adaptation rate.
Simon defined the vertical and horizontal loose coupling in [1]. Time-scale separation has been observed in biological systems since the work by Michaelis and Menten (1913) [156], [179]. Thompson was one of the first to use the term loose coupling in organizations in 1967 [10], [102]. Klir defined loose coupling in general systems [181]. Independently of Thompson and Simon and referring to Klir, Glassman used the term loose coupling in biology [173]. Weick (1976) referred to the work of Glassman [173] and used the term loose coupling in educational organizations [9]. In this form of coupling, the coupled subsystems respond to each other, but each subsystem also has its own identity and logical or physical separateness.
Loose coupling has been used in computing to support modularity [79], [182]. Constantine developed structured software design in the 1960s, and the results were later published in the paper [183] and the book [169]. The book [169] includes a separate chapter on loose coupling. In addition, the idea of loose coupling is widely used in modular design [51] and service-oriented architectures [10], [184], [185].
Hoare (1978) developed the idea of message passing, motivated by its use in the 1960s in the design of operating systems [170]. In tight coupling, shared memory is used in information exchange in the blackboard architecture. An example of message passing and shared memory is a cognitive radio system proposed by Mitola (1999) [186]. An example of tight coupling is federated learning by McMahan et al. [187].
In control systems, weak coupling between parallel feedback loops is preferred [180]. Similarly, different decision intervals in different hierarchy levels are a form of time-scale separation [48], [49]. Conflicts are avoided and resolved using different self-coordination methods [48], [163]. Conflicts were already discussed by March and Simon (1958) in human organizations.
In [188], the author suggests that the term subsidiarity is more prescriptive than the term loose coupling. The term subsidiarity includes the idea of loose centralized control. Subsidiarity and loose coupling are discussed in parallel only in a few papers, such as in [188], which shows that they were developed independently.
Holonic systems combine the beneficial properties of hierarchical, centrally controlled, and distributed systems and loose coupling [109], [128] as in H-SONs. In the holonic architecture, the lower-level agents are almost autonomous [53], just as the subsidiarity principle defines. However, hybrid solutions are not usually explicitly mentioned.
In addition to subsidiarity and holonic control, locality or local interaction is used in cellular automata and systolic arrays. Cellular automata were developed by von Neumann (1948,1963) to simulate self-reproducing systems [189]. Systolic arrays were originally used in the Mark 2 Colossus computer (1944) for massively parallel computing and regular data flow, and Kung and Leiserson (1979) elaborated on the idea [190]. Locality is used for minimizing energy consumption and delays. Edge computing (1999) uses the same idea near the terminals at a network's edge [191]. Edge computing was originally called content delivery. Since edge computing reduces delays, the stability of the network is improved compared to the older concept called cloud computing, whose origin can be traced back to 1961 [192]. A cloud is a platform for distributed computing.

B. SIMULATIONS
In the simulations, we estimate an unknown system using an adaptive filter. Additional details of simulations with adaptive filters can be found in [165]. All the signals and models are real. The system model is shown in FIGURE 7. The effect of correlation or coupling is demonstrated with a moving average (MA) process $x_k$ of order $q$, which is driven by an uncorrelated zero-mean random signal with Gaussian distribution and power equal to unity. Division by $q + 1$ in the MA process is needed so that the power of the signal is not changed. The samples $x_k$ and $x_{k-1}$ are correlated because the values depend on the previous samples. In the simulations, we used the value $q = 3$.
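A correlated input of this kind can be generated with a minimal sketch. It assumes equal MA weights and an amplitude scaling of $1/\sqrt{q+1}$, which divides the power by $q + 1$; the exact MA coefficients used in the simulations are not reproduced here, and the function names are hypothetical.

```python
import math
import random


def ma_process(num_samples, q, rng):
    """Generate an MA(q) process from unit-power white Gaussian noise.

    Assumption: equal weights, scaled so that the output power stays at unity.
    """
    noise = [rng.gauss(0.0, 1.0) for _ in range(num_samples + q)]
    scale = 1.0 / math.sqrt(q + 1)
    return [scale * sum(noise[k:k + q + 1]) for k in range(num_samples)]


def sample_autocorrelation(x, lag):
    """Time-average estimate of E[x_k x_{k+lag}]."""
    n = len(x) - lag
    return sum(x[k] * x[k + lag] for k in range(n)) / n
```

Under these assumptions the autocorrelation at lag $m$ is $(q + 1 - |m|)/(q + 1)$ for $|m| \le q$ and zero otherwise, so for $q = 3$ neighboring samples have a correlation of 0.75, while samples more than three steps apart are uncorrelated.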
The unknown system is modeled as a finite impulse response (FIR) filter whose output is corrupted by additive white Gaussian noise (AWGN) [166]. We model the unknown system as a low-pass filter. We have used the first example in [193] for the filter. The number of taps for both adaptive and low-pass filters is N = 13, although they do not have to be equal. To focus on the functionality of adaptive algorithms, a very high signal-to-noise ratio (SNR) is assumed to ignore the effect of noise.
The objective of the adaptive filter with $N$ weights is to minimize the error signal $e_k$ between the outputs of the adaptive filter $y_k$ and the unknown system $d_k$. The signals are assumed to be real. The delayed LMS algorithm is given as $W_{k+1} = W_k + 2\mu e_{k-D} X_{k-D}$, where the weight and input sample vectors are $W = (w_1, \ldots, w_N)^T$ and $X_k = (x_k, \ldots, x_{k-N+1})^T$, respectively [166], [194]. The delay is $D$, and the delayed error signal is $e_{k-D} = d_{k-D} - y_{k-D}$. The ordinary LMS algorithm is obtained when $D = 0$. A common simpler alternative to the LMS algorithm is the clipped LMS algorithm [195], where $e_k$ is replaced by $\mathrm{sgn}(e_k)$, with $\mathrm{sgn}(x) = 1$ for $x > 0$, $\mathrm{sgn}(x) = -1$ for $x < 0$, and $\mathrm{sgn}(x) = 0$ for $x = 0$.
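The update rule can be sketched in plain Python as follows. This is an illustration of the recursion only, not the simulation code behind the figures; the function names, the zero initial weights, and the zero padding before the first sample are assumptions.

```python
from collections import deque


def sgn(x):
    """Sign function with sgn(0) = 0."""
    return (x > 0) - (x < 0)


def fir_output(h, x):
    """Noise-free output of an FIR system with impulse response h."""
    return [sum(h[i] * x[k - i] for i in range(len(h)) if k - i >= 0)
            for k in range(len(x))]


def delayed_lms(x, d, num_taps, mu, delay=0, clipped=False):
    """Delayed LMS: W_{k+1} = W_k + 2*mu*e_{k-D}*X_{k-D}.

    delay=0 gives the ordinary LMS algorithm; clipped=True replaces the
    error by its sign. Returns the final weights and the error sequence.
    """
    weights = [0.0] * num_taps
    pending = deque(maxlen=delay + 1)  # holds (e_j, X_j) for j = k-D, ..., k
    errors = []
    for k in range(len(x)):
        # X_k = (x_k, ..., x_{k-N+1})^T, with zeros before the first sample
        x_vec = [x[k - i] if k - i >= 0 else 0.0 for i in range(num_taps)]
        y = sum(w * xi for w, xi in zip(weights, x_vec))
        e = d[k] - y
        errors.append(e)
        pending.append((e, x_vec))
        if len(pending) == delay + 1:
            e_old, x_old = pending[0]  # error and input from D steps ago
            drive = sgn(e_old) if clipped else e_old
            weights = [w + 2.0 * mu * drive * xi
                       for w, xi in zip(weights, x_old)]
    return weights, errors
```

With an uncorrelated input, no noise, and a small step size, the weights returned by `delayed_lms(x, fir_output(h, x), len(h), mu)` approach the taps of `h`; increasing `delay` for a fixed `mu` eventually destabilizes the recursion.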
The step size parameter $\mu$ controls stability and convergence rate: the larger the value, the faster the convergence rate, but too large a step size causes instability. In the simulations, the fixed step size $\mu$ is 0.04. For a stationary process, the autocorrelation matrix is $R = E[X_k X_k^T]$. If the process is uncorrelated, the matrix has constant values in the main diagonal and the other elements are zero. However, for correlated processes, the LMS and clipped LMS algorithms are associated with a deterioration in performance.
The LMS algorithm slows down when there is a correlation between the samples. In addition to slowing down the convergence, the coupling also causes instability. To eliminate a potential deficiency of the LMS algorithm due to the correlation, in the orthogonalized LMS algorithm we transform the input vector $X_k$ as $X_k^{\dagger} = R^{-1/2} X_k$ so that the correlation is reduced [166]. Coupling is lack of orthogonality, and in the orthogonalized LMS algorithm, the correlation is reduced by the $R^{-1/2}$ operation. Thus loose coupling can be implemented by using the orthogonalized LMS algorithm.
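The reason the $R^{-1/2}$ operation removes the correlation can be seen from a short derivation. It assumes that $R$ is known exactly and is symmetric and positive definite, so that $R^{-1/2}$ exists and is symmetric; in practice $R$ would have to be estimated, and the decorrelation would be only approximate.

```latex
\begin{align}
E\!\left[X_k^{\dagger} X_k^{\dagger T}\right]
  &= E\!\left[R^{-1/2} X_k X_k^{T} R^{-1/2}\right] \\
  &= R^{-1/2}\, E\!\left[X_k X_k^{T}\right] R^{-1/2} \\
  &= R^{-1/2} R \, R^{-1/2} = I .
\end{align}
```

The transformed input therefore has an identity autocorrelation matrix, i.e., its elements are mutually uncorrelated and have unit power, which is the uncorrelated case in which the LMS algorithm performs best.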
In the figures, the ensemble average mean-square error (MSE) is presented as $E[e_k^2] \approx \frac{1}{L} \sum_{i=1}^{L} e_{k,i}^2$, where $L$ is the number of independent simulations. The convergence of the algorithms is demonstrated by using the learning curves as in [196].
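The ensemble average can be computed with a few lines of code. The function name and the list-of-lists layout, with one error sequence per independent run, are assumptions made for illustration.

```python
def ensemble_mse(error_runs):
    """Average the squared error over L independent runs at each time index k.

    error_runs[i][k] is the error e_{k,i} of run i at time k.
    """
    num_runs = len(error_runs)
    num_steps = len(error_runs[0])
    if any(len(run) != num_steps for run in error_runs):
        raise ValueError("all runs must have the same length")
    return [sum(run[k] ** 2 for run in error_runs) / num_runs
            for k in range(num_steps)]
```

Plotting the returned sequence against the time index gives a learning curve of the kind used to compare the algorithms.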
In FIGURE 8, we compare the LMS and clipped LMS algorithms (FIGURE 8a) and demonstrate the effect of coupling (FIGURE 8b) and delays (FIGURE 8c). The clipped LMS algorithm does not even converge in this case. If we used a smaller step size, the algorithm would converge but very slowly. Because of the coupling in the form of correlation, the LMS algorithm converges, but eventually it becomes unstable and starts to behave chaotically. The orthogonalized LMS algorithm has reduced the coupling and improved the stability. The same happens with the delay. The orthogonalization compensates for the degradations caused by delay, but if the delay increases sufficiently, the orthogonalized LMS algorithm also becomes unstable.
In hierarchical systems, the delays are caused by geographical distances. In loosely coupled systems, coupling is minimized by using time-scale separation and interference avoidance, and thus stability is improved. The simulation example shows that coupling and delays may produce instability and chaos in feedback loops. Similarly, the coupling may cause a cocktail party effect in power control loops [159].

IV. SELF-ORGANIZING SYSTEMS

A. SELF-ORGANIZING AND HYBRID SELF-ORGANIZING SYSTEMS
In self-organization, individual subsystems' cooperative behavior forms some organization, structure, or pattern autonomously without any external control and with or without internal centralized control [5]. The degree of self-organization is determined by the ratio of inside control vs. total control [197]. Self-organizing systems are the least mature systems since they are the highest in the hierarchy of human-made systems [19], [106].
Self-organization provides many benefits but is not widely exploited in technical systems since it can become the primary source of failure [103]. In fact, many distributed self-organizing systems have failed due to emergence that may even produce chaotic phenomena. Even when the arising emergence can be managed, local optimization rarely results in global optimization, as explained in the suboptimization principle [63]. Hybrid systems with loose coupling tackle this problem.
Much information on self-organization is available, but the literature is disconnected, and different terms for similar concepts are used. General discussions on different self-* terms are included in [198], [199], and [200]. We focus on SONs. Earlier general surveys on SONs include [58], [163], [201]. They offer a good state-of-the-art survey, but the history presented in these papers is mainly limited to the work after the change of the millennium, although the SONs have a much longer history, as we explained in [121]. Earlier surveys cover distributed SONs [202] and ad hoc networks [203]. Emergence is rarely discussed in papers on SONs.
In the available literature, SONs are often defined to be distributed without any centralized control [204], [205], probably because natural self-organizing systems are often distributed. The need for central control has recently been observed but is usually not studied in detail [5]. Centralized control introduces a hierarchy missing from distributed systems [6].
Hybrid self-organizing networks (H-SONs) combine the ideas of C-SONs and D-SONs [58] and thus form a universal model for different SONs. An H-SON can act as a general solution for stability problems and the tragedy of the commons [65] in SONs.
In communication networks, packet switching realizes self-organization [206]. The network selects the route of each data packet autonomously from one of the predetermined routes. If one route is blocked, another route is selected. In communications, self-organizing networks that, in the beginning, have separate parts and form connections as they operate are commonly called ad hoc networks.
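The route selection described above can be sketched as follows. The representation of a route as a list of node names and of a blocked link as a node pair is an assumption, and real routing protocols are considerably more involved.

```python
def select_route(routes, blocked_links):
    """Return the first predetermined route that uses no blocked link.

    routes: list of routes, each a list of node names in order of traversal.
    blocked_links: set of (from_node, to_node) pairs that are unavailable.
    Returns None when every predetermined route is blocked.
    """
    for route in routes:
        links = set(zip(route, route[1:]))
        if links.isdisjoint(blocked_links):
            return route
    return None
```

Because the selection is repeated for each packet, a blocked link only diverts later packets to another route and does not require any central reconfiguration.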
Generally, a self-organizing communication network improves the quality of service (QoS) of all users by changing its topology and routing and by adapting its transmitters in each link [58], [201]. The QoS is measured by bit rate (often called throughput), reliability (one minus error rate), and delay, implemented with minimum energy.

B. SELF-ORGANIZATION USING A MULTI-AGENT SYSTEM
Self-organizing systems can be implemented as loosely coupled systems in the form of interacting agents, as in complexity theory, where such systems are called complex adaptive systems (CAS) [4], [7], [8], [127]. Systems in which the interactions between the parts do not change have an analytical description [207]. The analysis becomes difficult in complex systems where the interactions change over time. Such systems have an algorithmic description, which corresponds to self-organization. The CAS concept has been extended to complex, adaptive, and evolvable systems (CAES) [8].
An obvious approach to implementing a self-organizing, loosely coupled system is thus a multi-agent system. Automatic and autonomous systems are, in general, stable because they have an externally given goal, which may be a set-point value, a reference signal, a reference trajectory, or performance [11], [116], [196]. The goal acts like a handlebar in a bicycle to steer the system in the right direction unless there are convergence problems. If there is no given goal, this may lead to unpredictable behavior and stability problems.
In biology, self-organization is called morphogenesis [99]. Biological systems generally do not have any set-point value or target performance [34]. In fact, biological systems need a new set of fundamental explanatory principles [208]. Organisms are optimizing fitness [209], but energy efficiency is an important part of fitness because of the scarcity of resources.
We focus on self-organizing multi-agent systems. In [6], the multi-agent systems are divided into leaderless (i.e., distributed) and leader-follow (i.e., centralized) systems, but self-organization is not discussed in detail. Self-coordination has been proposed to avoid and resolve emergent conflicts in SONs in 3GPP Rel. 11 (2011) [163], but hierarchy and vertical and horizontal loose coupling are the most obvious methods for self-coordination.
With a multi-agent H-SON, local decisions can be made by agents, and global decisions through agent collaboration in a hierarchical way. More specifically, we interpret an H-SON as a group of loosely coupled interacting agents where both time-scale separation [12] and interference avoidance [13] are used to decouple the feedback loops and avoid instability and chaotic phenomena. Harmful interactions between loops are difficult to analyze and should be avoided [160], [161].
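The effect of time-scale separation on two nested feedback loops can be illustrated with a toy model. The first-order update rules, the gains, and the function name are all assumptions chosen for illustration and do not model any particular network.

```python
def simulate_two_level(num_steps, slow_period, global_target=10.0,
                       slow_gain=0.5, fast_gain=0.5):
    """Two vertically coupled feedback loops.

    The higher-level agent adjusts the local goal once every slow_period
    steps; the lower-level agent tracks that goal at every step.
    Returns the trajectory of the lower-level state.
    """
    goal = 0.0    # set by the higher-level agent
    state = 0.0   # controlled by the lower-level agent
    trace = []
    for k in range(num_steps):
        if k % slow_period == 0:
            # Slow loop: move the goal according to the global error.
            goal += slow_gain * (global_target - state)
        # Fast loop: move the state toward the current goal.
        state += fast_gain * (goal - state)
        trace.append(state)
    return trace
```

With `slow_period=10`, the fast loop settles between goal updates, and the state approaches the target without overshoot. With `slow_period=1`, both loops run at the same rate, interact, and the state overshoots the target before settling, although this simple model remains stable.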
H-SONs are general and universal since they can adapt to all degrees of centralization, implying flexibility. Hierarchy, modularity, and loose coupling result in systems that are simple and easy to comprehend. Stability is improved since large delays are managed at high levels with the slow operation, and at lower levels, the delays are smaller due to locality. Small delays help to avoid instability [35]. In addition, the feedback loops are decoupled at the same hierarchy level.
Scalability means that the system can easily be adapted to meet greater needs in the future, i.e., the systems are allowed to grow and adapt to new user requirements. H-SONs have good scalability as well as flexibility since they can adapt to completely centralized and distributed control depending on the available resources for decision making and the situation in the environment. The loose centralized control somewhat limits scalability, but the problem can be minimized by using more hierarchy levels. A weakly centrally controlled system can offer efficiency and fairness using basic resources. Loose coupling improves reliability since local failures are less likely to propagate [10]. The purpose of locality at the edge of the network near users is to minimize delays. The network is agile because of the separation of feedback loops from each other. Evolution has had a finite, although long, time but has benefited from stable, loosely coupled intermediate forms to speed up the development [45], [52], thus showing agility.
Recent papers on self-organization focus on using machine learning [211]. The capability to achieve the planned macroscopic behavior through emergence, that is, by controlling only local interactions, would enable the engineering of highly robust technical systems. However, methods for developing self-organization using multi-agent systems are still in progress [5]. Especially the proper solution of the trade-off between centralized and distributed self-organization and managing emergence are open problems [28], [212].
In general, automatic, autonomous, and self-organizing systems need a goal for their stable operation. These systems were defined in [19], forming a hierarchy in FIGURE 9. Thus, all self-organizing systems are autonomous systems, and all autonomous systems are automatic. Since automatic systems usually need a goal, as also observed in [213], this implies that for stable and reliable operation and rapid convergence, autonomous and self-organizing systems also need a similar more general goal using a desired state to be attained or performance criterion to be maximized [29].
The lack of an externally defined goal may be the reason why many distributed self-organizing systems fail. Some form of centralized control is needed as goalless progress may lead to staggering behavior similar to a random walk process and eventually to instability. In self-organizing systems, the goals provided to the loose centralized control can define constraints for using basic resources. Such goals guide the system towards efficient resource usage and help avoid the tragedy of the commons.

1) HISTORY OF SELF-ORGANIZATION
Summaries of the history of self-organization are included in [85] and [105], and a history of multi-agent systems is in [125]. Surveys on self-organizing and multi-agent systems are presented in [5], [6], [204], and [205].

FIGURE 9. Relationship between automatic, autonomous, and self-organizing systems. Autonomous systems are advanced automatic systems, and self-organizing systems are advanced autonomous systems and, therefore, advanced automatic systems. Self-organizing systems are on top of the hierarchy and, therefore, the most complex and least mature.
Morphogenesis is the greatest problem in biology [99]. The term was first used in 1863, meaning ''the production of the form or shape of an organism'' [23]. Thompson (1917) and Turing (1952) were the first to describe it scientifically [214]. Wiener used the negative feedback concept in his cybernetics (1948) to describe control and communication in animals and machines [83]. A related term to morphogenesis is autopoiesis meaning self-producing [43]. The term was proposed by Maturana (1974).
Ashby proposed the terms adaptive system and self-organization in 1947 [43]. Two theoretical approaches have been proposed to the problem of self-organization, using either a combination of positive and negative feedback or second-order cybernetics [215]. Maruyama (1963) first studied positive feedback in detail, and von Förster (1981) studied second-order cybernetics. Such ideas have not been widely used in technical self-organizing systems.

C. SELF-ORGANIZING COMMUNICATION NETWORKS
Agent theory can be applied to wireless communication networks when we interpret the network manager and all transmitters and receivers as rational agents. A bidirectional link connects a transmitter and a receiver. In FIGURE 10, a transmitter on the left is shown as an agent, which receives sensing signals on the state of the channel from the corresponding receiver on the right. An example of the actions of the transmitter is power control based on a feedback loop. The receiver is an agent since it includes feedback loops in synchronization and channel estimation implementing actions on the received signal. Hierarchically the receiver is below the transmitter. In FIGURE 11, we show a network using agents. They implement feedback loops, and thus the term ''agent,'' commonly used in artificial intelligence within computer science, a convenient and well-defined general term [3], is also useful in communications. The network is hierarchical, and the network manager implements the central agent that controls the use of network resources such as energy, time, bandwidth, and space.
Human network administrators supervise the whole network [216] since automatic and autonomous systems may sometimes have failures. The network may combine centralized and distributed control, thus implementing the hybrid SON architecture. The network must be loosely coupled to avoid excessive control information: the network manager should be weak, and interference should be avoided. The network is thus based on time-scale separation and interference avoidance. Each receiver is hierarchically below the corresponding transmitter. Therefore the feedback loop in the transmitter should be much slower than the loops in the corresponding receiver to avoid conflicting behavior.

FIGURE 10. Transmitters can be seen as rational agents. The sensing information comes from the corresponding receiver, another agent hierarchically below the transmitter agent, including synchronization and channel estimation. A feedback loop often requires an externally given goal but is ignored for brevity.
In the physical layer, the unintended coupling effect can be seen clearly (FIGURE 11). The transmitter typically has a power control loop that may interfere with other power control loops since the signals are not completely orthogonal. This is called coupling through additive interference and can be classified as unintentional horizontal coupling at the same hierarchy level (FIGURE 12a). Transmitter 1 (Agent 1) typically has a power control loop that may interfere with other power control loops (Agent 2) since the transmitted radio waves propagate in all directions.
Vertical coupling between hierarchy levels corresponds to multiplicative interference [217] (FIGURE 12b and c). If we consider hierarchical power control, the transmitted signal in the physical layer is multiplied or modulated by the act signal from a higher-level agent (Agent 1). In this way, the transmitted power is changed with a slower time scale than in Agent 2. This scheme resembles Brooks's (1986) subsumption robot architecture [34].
In communication networks, links are generally open systems and interfere with each other because the radio waves spread in all directions. The energy consumption is large because of the high attenuation. In [180], the authors explain that in mobile wireless communications, each mobile user is loosely coupled with every other mobile user that uses the same communication channel. The idea is not developed further in the book.
Understanding the information requirements to describe the network is crucial for efficient operation. In an H-SON, the information is contained in the state of the network. The state includes bit rate, delay, availability, reliability, and energy efficiency of the links and the whole network and interference between the links [218], [219]. Interference is usually measured using the signal-to-interference ratio (SIR). More generally, the state of the network includes the impulse responses and the noise and interference spectral density of all links between nodes.
Examples of intelligent agents in communication networks include transmitter power, frequency, and timing control, and beamforming, which reduce interference in the receiver. These are interference avoidance methods [13] needed for loose coupling. When additive interference is reduced, the required energy is reduced both in the transmitter (transmission energy reduced) and the receiver (simple processing). Especially transmitter power control may easily lead to instability in a network since many power control loops may be coupled by interference, as in a cocktail party [159], [220]. NOMA [15] and code division multiple access (CDMA) systems are based on nonorthogonal or quasi-orthogonal signals between users, respectively. Such systems may need complicated multiuser receivers to avoid the cocktail party effect [114].
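How interference couples power control loops can be illustrated with two symmetric links that both run a standard distributed SIR-balancing iteration, in which each transmitter scales its power by the ratio of the target SIR to the measured SIR (an iteration commonly attributed to Foschini and Miljanic). This algorithm is not taken from the references above, and the unit direct gain, the cross gain, the noise power, and the function name are illustrative assumptions.

```python
def power_control(cross_gain, target_sir, noise=0.1, num_steps=50):
    """Two links, each adapting its transmit power toward a target SIR.

    The direct channel gain is 1 and cross_gain is the gain from each
    transmitter to the other link's receiver. Returns the power history.
    """
    powers = [1.0, 1.0]
    history = [tuple(powers)]
    for _ in range(num_steps):
        sir = [powers[0] / (cross_gain * powers[1] + noise),
               powers[1] / (cross_gain * powers[0] + noise)]
        # Each loop reacts only to its own measured SIR.
        powers = [target_sir / sir[0] * powers[0],
                  target_sir / sir[1] * powers[1]]
        history.append(tuple(powers))
    return history
```

In this symmetric case the powers converge when the product of the target SIR and the cross gain is below one (loose coupling) and grow without bound otherwise, since each transmitter keeps raising its power to overcome the interference caused by the other, as speakers do at a cocktail party.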
The European Telecommunications Standards Institute (ETSI) has published a system called Generic Autonomic Networking Architecture (GANA) representing the H-SON architecture, a standard called Technical Specification (TS) 103 195-2, thus forming a holistic framework for SONs [41]. Similar ideas are now introduced also in the O-RAN system that is under development [14]. The goal of O-RAN is to implement its design principles on top of the 3rd Generation Partnership Project (3GPP) Long-Term Evolution (LTE) and New Radio (NR) RANs. H-SONs have also been suggested in the literature [142], [146]. In [142], the authors noticed that the resource allocation interval in centralized control is a few minutes, in distributed control, typically milliseconds, and in the H-SON, a few seconds. This numerical example of the time scales shows the benefits of H-SONs compared to C-SONs.
In the GANA architecture, the time scales of the fast and slow control loops are left open. They are defined in the implementation phase [41]. The number of control loops depends on the number of relationships between Decision Making Elements (DEs) and Managed Entities (MEs). The hierarchy is nested hierarchy since the highest layer can directly control the lowest layer, although at a slow speed.
In the O-RAN architecture, the time scales are called loop times. They exceed 1 s in the non-real-time control loop, vary from 1 ms to 1 s in the near-real-time control loop, and are below 1 ms in the real-time control loop [14]. The scales have no clear time-scale separation; thus, stability problems may arise due to possible conflicts. Usually, such conflicts can be avoided and resolved with self-coordination methods, as summarized in [163].
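The loop-time ranges can be written as a small classifier. The function name is hypothetical, and assigning the boundary values of exactly 1 ms and 1 s to the near-real-time loop is our assumption.

```python
def oran_control_loop(loop_time_s):
    """Map a loop time in seconds to the O-RAN control loop it falls into."""
    if loop_time_s <= 0.0:
        raise ValueError("loop time must be positive")
    if loop_time_s < 1e-3:
        return "real-time"
    if loop_time_s <= 1.0:
        return "near-real-time"
    return "non-real-time"
```

The classifier makes the point above concrete: loop times of 0.9 s and 1.1 s fall into different control loops although their ratio is close to one, so the adjacent loops are not separated in time scale near the boundaries.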
C-SONs, D-SONs, and H-SONs differ regarding optimality, stability, energy consumption, and control or signaling overhead [121]. C-SONs can be optimal, but the delays and control overhead are large, and there may be stability problems because of the long loop delays. In D-SONs, the stability is improved with small control overhead, energy consumption, and small delays. Still, the global behavior cannot be predicted from local behavior, and a set of local optima does not lead to a global optimum [53]. H-SONs implement the advantages of C-SONs and D-SONs but have only a few of their disadvantages. The advantages include improved stability, reduced delays and control overhead, improved energy efficiency compared to C-SONs, and improved global optimality compared to D-SONs. The main disadvantage of H-SONs results from the flexibility that somewhat reduces global energy efficiency, but this is a trade-off that must be made in all programmable solutions.
The number of configuration parameters illustrates the complexity of communication networks. A typical 2G node has 500 parameters to be configured and optimized, a 3G node 1000, and a 4G node 1500 [221]. A typical 5G node can be estimated to have 2000 parameters.

1) HISTORY OF SONs
The first self-organizing networks in communications used packet switching, invented by Kleinrock (1961) [206]. Baran proposed a distributed network to survive nuclear attacks [75]. Independently of Baran's work, Licklider, Kleinrock, and Roberts developed the Arpanet (1969) as the first self-organizing network, leading eventually to the Internet (1983) as a distributed best-effort network [206], [222]. Arpanet was one of the first applications of loose coupling in communications [176]. Its performance can be improved using various techniques to look more like a dedicated Internet for a user. Cherry (1953) observed the cocktail party effect in social systems [220].
Packet radio networks have been developed in wireless communications since 1972 [203]. The interest in distributed self-organizing networks increased in the 1980s [202]. A special form of them is ad hoc networks. Although the term ''ad hoc network'' is older, it was first recommended by the IEEE 802.11 subcommittee in 1993 [203], [223]. Beni and Wang (1993) invented swarm intelligence using the concept of cellular automata [224]. Swarm intelligence is based on the collective intelligence of simple agents. The agents form a distributed self-organizing system, useful in ad hoc networks [225], [226]. Swarm intelligence is a form of evolutionary computing.
Self-organization can be realized with programmable networks. They are divided into active networks and software-defined networks (SDNs) [227]. Active networks are based on mobile agents by Chess [117], [228]. The networks are active because nodes can modify the packet contents. Mobile agents are software agents that can roam between the nodes. In general, the agents in modern networks are not mobile since roaming may increase energy consumption, and network management becomes complex. Chess's paper [117] is one of the first to use the agent concept in communication networks. Later SDNs were defined as networks where the data and control planes are separated. Now the more general term network automation is preferred since one can use application programming interfaces (APIs) [229] in programmable networks. Using agents in SONs was suggested in [118] and [119].
Already in [75], a mixture of central and distributed control was mentioned as a practical form of communication network. One of the first papers to discuss H-SONs in communications is [230]. The concepts of C-SON, D-SON, and H-SON were defined briefly without any details in 3GPP Rel. 8 (2008) [58]. The GANA architecture was originally proposed in [231] using the terminology developed for autonomic computing in [232]. The ETSI published a GANA white paper (2016), later becoming an ETSI standard [41]. Hybrid systems have been used in other disciplines with different names, including multilevel, multigoal systems in hierarchical control [48], and holonic control architecture [53]. In social systems, subsidiarity is the closest to hybrid systems [70].

2) RECENT TRENDS AND ROADMAP
Understanding recent trends and developing a roadmap must be based on deep knowledge of the history and the relevant literature. The GANA architecture represents one of the first standardized H-SONs, and it also uses vertical loose coupling in the form of time-scale separation. The O-RAN, ZSM, and ENI architectures are still under development, and they must consider loose coupling to guarantee stability. Loose coupling is still not very well known in the physical and network layers of the OSI model [12], [14], [15], although it is well known in the application layer, where it has had a long history since the introduction of structured software design [10], [183]. SOA is now implemented in the form of microservices (2011), which are a new form of service-oriented computing (SOC) [233], [234]. SOC is a computing paradigm that uses services as fundamental elements. Microservice architectures consist of small, vertically and horizontally loosely coupled services that can be independently replaced. We expect that H-SONs with vertical and horizontal loose coupling will be widely used in future networks because of their beneficial properties, especially stability and agility [17].
An open problem is whether AI can work reliably in uncertain dynamical environments [151], although some initial results exist [152]. Some fundamental limitations of AI are discussed in detail in [235]. One important limitation is that AI uses deductive, inductive, and statistical methods, whereas humans are more versatile and creative and are able to use abduction, i.e., inference to the best explanation.
In engineering, we now need knowledge from biology (especially systems biology [208]) and social sciences, in addition to physics and chemistry. Biological systems are known to be very resource efficient, which is mandatory in sustainable development [19]. The use of biology in engineering is called bionics or biomimetics [236]. For example, our brain is the most complex system we know and, therefore, a good model for us. The brain is known to form a small-world network [237] and is based on reinforcement learning [238]. A small-world network is formed with shortcuts between otherwise distant nodes and is both globally and locally efficient [239], [240]. Applications already exist in wireless networks [241].
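The effect of shortcuts on global efficiency can be illustrated with a minimal sketch. It is not taken from the cited works: the ring-lattice construction, the function names, the network size, and the chosen shortcut pairs are all assumptions made for illustration. The sketch keeps every local neighbor link and adds a few long-range links, then compares the average hop count before and after.

```python
from collections import deque

def ring_lattice(n, k):
    """Ring of n nodes, each linked to its k nearest neighbors on each side."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for step in range(1, k + 1):
            j = (i + step) % n
            adj[i].add(j)
            adj[j].add(i)
    return adj

def add_shortcuts(adj, shortcuts):
    """Return a copy of the graph with extra long-range links (hypothetical pairs)."""
    new = {node: set(neigh) for node, neigh in adj.items()}
    for a, b in shortcuts:
        new[a].add(b)
        new[b].add(a)
    return new

def average_path_length(adj):
    """Mean hop count over all ordered node pairs, using breadth-first search.

    The graph is assumed to be connected.
    """
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

# Assumed example: 40 nodes, 2 neighbors on each side, 4 shortcuts across the ring.
lattice = ring_lattice(n=40, k=2)
small_world = add_shortcuts(lattice, [(0, 20), (10, 30), (5, 25), (15, 35)])
print(average_path_length(lattice), average_path_length(small_world))
```

Because the shortcuts only add links, the local neighborhoods stay intact while the average hop count between distant nodes drops.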
We expect that reinforcement learning will be common in multiobjective optimization in the form of Pareto Q-learning [149]. It is also possible that Prigogine's nonequilibrium thermodynamics for open systems [89] will find applications in SONs since it has been applied in economics for decades [144]. Finally, complex adaptive systems are sets of interacting agents and, together with the hierarchy concept, lead naturally towards more advanced self-organizing networks [8], [207].
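Multiobjective methods such as Pareto Q-learning rest on the notion of Pareto dominance: one candidate is preferred over another only if it is at least as good in every objective and strictly better in at least one. The following minimal sketch shows only this non-dominance filter, not the learning algorithm of [149]. The function names and the two-objective sample values are assumptions for illustration, and both objectives are assumed to be maximized.

```python
def dominates(a, b):
    """True if a is at least as good as b in every objective and better in one."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))

def pareto_front(points):
    """Keep the points that no other point dominates (order preserved)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical (objective 1, objective 2) values of five candidate decisions.
candidates = [(1, 5), (2, 4), (3, 3), (2, 2), (1, 1)]
front = pareto_front(candidates)
print(front)  # [(1, 5), (2, 4), (3, 3)]
```

The remaining candidates are mutually incomparable trade-offs, so a single global optimum does not exist without an additional preference, for example one set by a higher-layer agent.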

V. CONCLUSION
We have presented a multidisciplinary history of loosely coupled systems. Loose coupling has a long history in structured software design but is not very well known in the physical and network layers of communication networks. We have proposed a vertically and horizontally loosely coupled hybrid SON as a universal solution for system design to improve the performance of complex networks. Coupling between layers and their subsystems may be intentional or unintentional. Like feedback, loose coupling has been ''an invisible thread in the history of technology.'' It is a simple form of self-coordination. Loose coupling has a solid theoretical basis in optimization decomposition. Furthermore, the need for weak centralized control can be derived directly from welfare economics and game theory in the form of the Stackelberg game, which strengthens the theoretical basis.
Hybrid SONs combine centralized SONs for global optimization and distributed SONs for local optimization. Hierarchy, modularity, and local interactions are used. The principles of a loosely coupled hybrid SON are applied in the ETSI GANA architecture for communications and the IEC 61499 standard for manufacturing systems.
Harmful interactions between feedback loops are difficult to analyze and should be avoided. To guarantee stability, the feedback loops must be loosely coupled using time-scale separation in vertical loose coupling and interference avoidance in horizontal loose coupling. In a distributed system, information exchange using sensing results is possible. In communication networks, the central agent in the form of a network manager must act much more slowly than the transmitter agent, and the transmitter agent must act much more slowly than the receiver agent. Possible interference between the links must be avoided using orthogonality in different domains, including time, frequency, and space.
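Time-scale separation between two vertically coupled feedback loops can be sketched as follows. This is a toy model under stated assumptions, not an algorithm from the cited standards: the function name, the gains, the update period, and the scalar state are all hypothetical. A slow higher-layer agent updates a set-point only every `slow_period` ticks, while a fast local agent tracks that set-point on every tick.

```python
def run_two_level_loop(goal, ticks=400, slow_period=20,
                       fast_gain=0.5, slow_gain=0.5):
    """Simulate a slow higher-layer agent steering a fast local agent.

    All parameter values are illustrative assumptions.
    """
    setpoint = 0.0  # target handed down by the slow agent
    state = 0.0     # quantity controlled by the fast agent
    history = []
    for t in range(ticks):
        # Slow loop: acts only once per slow_period ticks, using the
        # state that the fast loop has had time to settle.
        if t % slow_period == 0:
            setpoint += slow_gain * (goal - state)
        # Fast loop: acts on every tick and tracks the current set-point.
        state += fast_gain * (setpoint - state)
        history.append(state)
    return history

history = run_two_level_loop(goal=10.0)
print(round(history[-1], 3))
```

With these assumed values, the fast loop has 20 ticks to settle between slow updates, so the slow agent effectively observes a steady state each time it acts, and the overall state approaches the global goal.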
When nonlinear relationships between the parts of a system lead to emergence, the global behavior cannot be predicted from the local behavior, and a set of local optima does not lead to a global optimum. Presently there is no theory for emergence; therefore, it is seen as a harmful phenomenon that should be avoided by using loosely coupled architectures. Moreover, analysis of systems is, in practice, possible only when the interactions between subsystems are linear, loose, or nonexistent. However, in the last case, there is no system to analyze, only a set of parts.
The advantages of loosely coupled networks include stability, scalability, efficiency, reliability, agility, and resilience. Stability is achieved by using loose coupling between the control loops. Weak centralized control may be useful in avoiding deadlock situations. Scalability is improved since the network can become either centrally controlled or distributed depending on the requirements and the situation in the environment. The network aims at using basic resources efficiently. Resource efficiency can be improved by applying the subsidiarity principle, which is a general solution to the tragedy of the commons. Using the externally given goal, a leader in agent theory, an arbitrator in game theory, a network manager in communications, or a weak central agent can prioritize and ration the use of basic resources using constraints to improve fairness without too much control information. The high flexibility of these systems somewhat reduces efficiency, but this is a trade-off that must be made in all programmable solutions.
Reliability is improved since errors do not easily propagate in loosely coupled networks. The network is agile because of the separation of feedback loops from each other. The networks are resilient since they can also act as distributed networks with lower performance. There are various applications of loose coupling in many disciplines, including structured software design, modular electronics design, cross-layer design, service-oriented architectures, and interworking architectures. Open Radio Access Network (O-RAN) and nonorthogonal multiple access (NOMA) systems are examples of state-of-the-art communication technologies that would benefit from loose coupling. Introducing such capabilities for H-SONs facilitates building future communication networks that fulfill their requirements and support sustainable development.
A recent trend is towards microservices based on loose coupling as a form of service-oriented architecture. We need further work on concrete algorithms for hierarchical distributed multiobjective optimization and robustification of the network. Good models can be found in biology since living systems are highly resource efficient and thus support sustainable development in engineering. For example, our brain forms a small-world network and is based on reinforcement learning. An open problem is whether AI can really work reliably in uncertain dynamical environments.