Steering the aggregative behavior of noncooperative agents: a nudge framework

This paper considers the problem of steering the aggregative behavior of a population of noncooperative price-taking agents towards a desired behavior. Different from conventional pricing schemes where the price is fully available for design, we consider the scenario where a system regulator broadcasts a price prediction signal that can be different from the actual price incurred by the agents. The resulting reliability issues are taken into account by including trust dynamics in our model, implying that the agents will not blindly follow the signal sent by the regulator, but rather follow it based on the history of its accuracy, i.e., its deviation from the actual price. We present several nudge mechanisms to generate suitable price prediction signals that are able to steer the aggregative behavior of the agents to stationary as well as temporal desired aggregative behaviors. We provide analytical convergence guarantees for the resulting multi-component models. In particular, we prove that the proposed nudge mechanisms earn and maintain full trust of the agents, and the aggregative behavior converges to the desired one. The analytical results are complemented by a numerical case study of coordinated charging of plug-in electric vehicles.


Introduction
Nudging is an approach in behavioral economics that is proposed to improve people's health and happiness by providing "indirect suggestions" termed nudges. A nudge, by definition, is any characteristic of the choice structure that predictably changes people's behavior without restricting any options or significantly affecting economic incentives 1 . Therefore nudges differ from mandates, as they are easy and cheap to avoid [27]. Owing to their preservation of freedom of choice and their non-intrusive nature, nudge policies have become popular over the last few years. The most notable example is the "Behavioural Insights Team" (known as the "Nudge Unit"), which applies nudge theory in the British government; for instance, its most recent report concerns energy consumption analysis and the impact of smart meters on customers' energy consumption [24]. Another example is "informational nudging", defined as sending manipulated, and possibly misleading, information about options to a decision maker in order to alter its choices [10]. Informational nudging has recently been studied in the context of transportation systems [5] and boundedly rational decision makers [6].

Email addresses: m.shakarami@rug.nl (Mehran Shakarami), a.k.cherukuri@rug.nl (Ashish Cherukuri), n.monshizadeh@rug.nl (Nima Monshizadeh).
1 Nudge was originally defined as the minimalist intervention in a given situation such that a desired outcome is achieved [28]. However, the Nobel laureate Richard Thaler presented another definition in [27], which is more popular and is used here.
The problem of coordinating a population of noncooperative price-taking agents and altering their aggregative behavior appears in various applications such as charging of plug-in electric vehicles in a coordinated way [18], residential energy consumption scheduling [20], and congestion control in networks [2]. To address this problem, a common approach in the literature is to treat the price as a design signal. If the system regulator has access to all the agents' information, a price that is linear in the actions of the agents is sufficient to achieve a desired behavior [1]. In case such information is not available, which is often the case, dynamic pricing algorithms are proposed as a solution to overcome this lack of knowledge; see e.g. [1,2,8,9,17]. The underlying assumption in dynamic pricing is that the price is fully controllable, which in turn facilitates the regulator's task in steering the behavior of the agents. However, the actual price could depend on various elements such as fixed and variable production costs and daily market conditions; see e.g. [22] in the context of power systems. Here, instead, we allow the signal designed by the regulator to be different from the actual price dictating the costs incurred by the agents. Motivated by the advantages of nudging, we propose a framework in which the regulator alters the aggregative behavior of price-taking agents, without directly designing the price and without fully knowing the cost/utility functions of the agents. In our setup, the regulator transmits a price prediction signal to all the agents. The agents choose their actions taking this prediction into account; however, they do not blindly follow it since they are aware that the prediction signal can differ from the actual price that they will incur. We model such behavior by associating a trust variable to each agent, which increases/decreases depending on the history of the accuracy of the communicated price prediction.
In other words, here the agents cross-check the validity of the communicated information. This novel cross-checking step is a key feature of our work, and distinguishes it from the existing informational nudging schemes [5,6,10]. Moreover, the trust dynamics couple the price prediction dynamics to the actual price, consequently the proposed nudge mechanisms do not simplify to conventional dynamic pricing schemes.
The presented framework is referred to as a nudge since it does not directly affect economic incentives of the agents and respects their freedom of choice. Putting it differently, we use price information to indirectly suggest desired behaviors to the agents rather than enforcing mandates. For the idea of nudging through price information in a different discipline, namely agricultural economics, we refer the interested reader to [3].
Contributions: We present a novel framework which is able to capture the multi-component model resulting from nudge mechanisms in conjunction with agents' actions and trust dynamics 2 . Within this framework, we first consider stationary desired behaviors and design two nudge mechanisms for the regulator, termed hard and soft nudge. We show that under these mechanisms, full trust of the agents is gained in finite time and the aggregative behavior of the agents converges asymptotically to a desired set point. Afterwards, we extend the results to temporal desired behaviors and present an adaptive nudge mechanism that can cope with variations in the desired behavior. We analytically show that this mechanism obtains and maintains full trust of the agents, and consequently the aggregative behavior converges to the time-dependent desired behavior. Moreover, a byproduct of our analysis gives sufficient conditions for the existence of Carathéodory solutions of nonautonomous projected dynamical systems.

2 Preliminary results of this work are presented in the conference article [25]. Different from the conference article, this paper reports the proofs of Theorems 4.1 and 4.4, studies convergence for stationary desired behaviors outside of the admissible set (Corollary 4.2), presents a nudge mechanism for temporal desired behaviors (Section 5) and establishes its convergence (Theorem 5.2 and Appendix B), applies these results to coordinated charging of plug-in electric vehicles (Section 6), and studies existence of solutions for nonautonomous projected dynamical systems (Appendix A).
The structure of the paper is as follows. Preliminaries are provided in Section 2. The proposed framework is introduced in Section 3. Section 4 includes the hard and soft nudge mechanisms for stationary desired behaviors and their convergence analysis. The adaptive nudge mechanism for temporal desired behaviors is presented in Section 5. The case study is included in Section 6, and finally, conclusions are drawn in Section 7. Existence of solutions for nonautonomous projected dynamical systems is established in Appendix A and stability analysis for the adaptive nudge is provided in Appendix B.
Notation. We denote the set of natural, real, and nonnegative real numbers by N, R, and R_{≥0}, respectively. The standard Euclidean norm is denoted by ‖·‖. The symbols 1_n and 0_n respectively denote the vectors of all ones and all zeros in R^n. We denote the Kronecker product by ⊗. The vectorization of a matrix M ∈ R^{m×n} is denoted by vec(M). We denote the boundary, the interior, and the closure of a set X ⊆ R^n by bd(X), int(X), and cl(X), respectively. Given vectors x_1, …, x_N ∈ R^n, we use the notation col(x_1, …, x_N) := (x_1^⊤, …, x_N^⊤)^⊤. For a given vector x ∈ R^n and a positive semidefinite matrix M, we denote the weighted Euclidean norm of x by ‖x‖_M := √(x^⊤ M x). The Frobenius norm of a matrix M ∈ R^{m×n} is denoted by ‖M‖_F := √(Tr(M^⊤ M)), where Tr(·) is the trace operator. A closed ball with center x ∈ R^n and radius r > 0 is denoted by B̄(x, r) := {y ∈ R^n | ‖x − y‖ ≤ r}. A function F : X → R^m is locally Lipschitz on an open set X ⊂ R^n if for any point x ∈ X, there exist a positive scalar r and a Lipschitz constant L, both dependent on x, such that ‖F(y′) − F(y)‖ ≤ L‖y′ − y‖ for all y′, y ∈ B̄(x, r). The function F is Lipschitz on X if there exists a positive constant L satisfying ‖F(y′) − F(y)‖ ≤ L‖y′ − y‖ for all y′, y ∈ X.

Preliminaries
This section provides basic notions on convex analysis and projected dynamical systems.
Convex analysis: Consider a nonempty, closed, convex set X ⊆ R^n. The map proj_X : R^n → X denotes the Euclidean projection onto the set X, i.e., proj_X(z) := arg min_{y∈X} ‖y − z‖. The normal cone to X at a given point x ∈ X is the set N_X(x) := {y ∈ R^n | y^⊤(s − x) ≤ 0, ∀s ∈ X}, and the tangent cone is defined as the set T_X(x) := cl(∪_{y∈X} ∪_{λ>0} λ(y − x)). The projection of a vector z ∈ R^n onto T_X(x) is denoted by Π_X(x, z) := proj_{T_X(x)}(z). Given any point x ∈ X, it follows from Moreau's decomposition theorem [14, Thm. 3.2.5] that any vector z ∈ R^n can be written as z = proj_{N_X(x)}(z) + proj_{T_X(x)}(z). The reader may refer to [14, Fig. 5.3.1] for a geometrical representation of normal and tangent cones.
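Moreau's decomposition can be checked numerically for the special case of a Euclidean ball. The sketch below is purely illustrative (the function names and the choice of a ball centered at the origin are ours, not the paper's): it projects a vector onto the normal and tangent cones of B̄(0, r) at a boundary point and verifies that the two components are orthogonal and sum back to the original vector.

```python
import numpy as np

def proj_normal_ball(x, z, r):
    # Normal cone to B(0, r) at a boundary point x is {lam * x : lam >= 0};
    # projecting z onto it keeps only the nonnegative radial component.
    if np.linalg.norm(x) < r:          # interior point: normal cone is {0}
        return np.zeros_like(z)
    lam = max(0.0, z @ x / (x @ x))
    return lam * x

def proj_tangent_ball(x, z, r):
    # Moreau's decomposition: z = proj_N(z) + proj_T(z) for the two cones.
    return z - proj_normal_ball(x, z, r)

r = 2.0
x = np.array([2.0, 0.0])               # boundary point of B(0, 2)
z = np.array([1.0, 3.0])               # a vector with an outward component
zN = proj_normal_ball(x, z, r)
zT = proj_tangent_ball(x, z, r)
assert np.allclose(zN + zT, z)         # Moreau decomposition holds
assert np.isclose(zN @ zT, 0.0)        # the two components are orthogonal
```

Here the radial part of z lands in the normal cone and the remainder in the tangent cone, mirroring the decomposition used later in the convergence proofs.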
Projected dynamical systems: Given a nonempty closed set X ⊆ R^n and a continuous function h : R^n × [0, ∞) → R^n, the nonautonomous projected dynamical system associated to them is
ẋ = Π_X(x, h(x, t)), x(0) ∈ X. (1)
The right-hand side of this system is discontinuous on the boundary of the set X. Following [21, Def. 2.5], we specify a notion of solution to the above projected dynamical system. A map x : [0, ∞) → X is a Carathéodory solution of the projected dynamical system (1) if it is absolutely continuous and satisfies ẋ(t) = Π_X(x(t), h(x(t), t)) for almost all t ∈ [0, ∞).

Problem formulation and the model
We consider a set of agents I := {1, . . . , N } that interact repeatedly with a central regulator. The agents are noncooperative, that is, each agent i is associated with a cost function J i that it wishes to minimize by choosing its action. In particular, the cost function of each agent i ∈ I is given by J i (z i , p), which determines the total cost of action z i ∈ R n given the price p ∈ R n and n ∈ N.
For simplicity, we assume that J_i admits the following linear-quadratic form:
J_i(z_i, p) := ½ (z_i − c_i)^⊤ Q_i (z_i − c_i) + z_i^⊤ p, (2)
where Q_i = Q_i^⊤ ∈ R^{n×n}, Q_i ≻ 0, and c_i ∈ R^n. The cost function J_i consists of two terms, the local penalty ½ ‖z_i − c_i‖²_{Q_i} and the cost of action z_i^⊤ p. Note that c_i is the optimal action of the agent when the price is zero. The structure (2) appears in applications where z_i indicates the demand of a product that comes at price p, for instance coordinated charging of plug-in electric vehicles [18] and scheduling of residential energy consumption [20].
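For a cost of this linear-quadratic type, the first-order condition Q_i(z_i − c_i) + p = 0 gives the minimizer in closed form, consistent with c_i being optimal at zero price. A minimal sketch with made-up parameters (Q, c, p below are ours, chosen only for illustration):

```python
import numpy as np

# Illustrative parameters (not from the paper): a 2-d action space.
Q = np.array([[2.0, 0.5],
              [0.5, 1.0]])   # Q = Q^T, positive definite
c = np.array([1.0, 3.0])     # cost-free optimal action
p = np.array([0.4, 0.2])     # price seen by the agent

def J(z):
    # Linear-quadratic cost: local penalty plus cost of the action.
    return 0.5 * (z - c) @ Q @ (z - c) + z @ p

# Closed-form minimizer from the first-order condition Q (z - c) + p = 0.
x_opt = c - np.linalg.solve(Q, p)

# Sanity checks: gradient vanishes and nearby points are not better.
grad = Q @ (x_opt - c) + p
assert np.allclose(grad, 0.0)
rng = np.random.default_rng(0)
assert all(J(x_opt) <= J(x_opt + 0.1 * rng.standard_normal(2))
           for _ in range(100))
```

The same closed form, with the price replaced by the trust-adapted price perception, reappears as the agents' action map later in the paper.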
Before providing further details, we give an overview of our model. The regulator provides a prediction of the price for all the agents. This prediction is potentially different from the actual price that determines the costs incurred by the agents. The agents use the price prediction to choose their actions with the aim of minimizing the cost they incur under the actual price. The actual price is determined and revealed only after the actions are chosen.
The regulator, on the other hand, aims at steering the aggregative behavior of the agents to a desired point using the price prediction signal. We assume that the regulator does not know the cost functions of the agents. A common approach of steering aggregate behavior, often referred to as dynamic pricing, is to use the price as a control signal to regulate the system of agents [1,8,17]. In contrast, here the actual price signal is not available for design and the regulator needs to rely on the price prediction signal to manipulate the agents' behavior. Our motivation stems from the fact that, in reality, the actual price may not be prescribed a priori as a dynamic function of demands/actions.
The discrepancy between the price prediction and the actual price readily brings the issue of trust or reliability. Namely, the central regulator needs to earn and maintain the trust of the agents in order to influence their decisions. We take this into account by considering that the agents associate a level of trust/reliability to the regulator's prediction based on the history of its accuracy.
In the sequel, we aim to carefully model the above described features and design update schemes, termed nudge mechanisms, that enable the regulator to steer the aggregative behavior of the agents to a desired reference. We first look at the problem from the agents' side and put forward a model where agents use available information to decide on their actions. The regulator's side will be dealt with in Section 4, where nudge mechanisms are proposed.

Agents' actions and trust dynamics
In choosing their actions at time t ∈ [0, ∞), the agents have access to a price prediction p̂(t) ∈ R^n sent out by the regulator. Note that this value is common to all agents. In addition, we assume that each agent i ∈ I has a local perception of the price, denoted by λ̂_i ∈ R^n, that the agent would have used in the absence of the prediction p̂(t).
As mentioned before, different from conventional dynamic pricing, the distinction between the actual price and its prediction brings the issue of reliability, and we incorporate this in our model by associating a level of trust/reliability to the regulator's prediction based on the history of its accuracy. In particular, let γ_i(t) ∈ [0, 1] be the trust variable of agent i associated with the price prediction p̂(t). Note that γ_i(t) = 1 and γ_i(t) = 0 stand for full and no trust, respectively. Given the amount of trust, the predicted price, and the local perception, agent i adopts a trust-adapted price perception
λ_i(t) := γ_i(t) p̂(t) + (1 − γ_i(t)) λ̂_i. (3)
If γ_i(t) is close to 1, the agent disregards its own perception of the price and follows the price prediction communicated by the regulator. Conversely, as γ_i(t) approaches 0, the agent loses trust in the price prediction p̂(t) and follows its own price perception λ̂_i when deciding on its optimal action. Agent i uses this trust-adapted price perception to determine its optimal action as follows:
x_i(t) := arg min_{z_i ∈ R^n} J_i(z_i, λ_i(t)).
By using (2) and (3), the explicit expression of the optimal action of the agents is given by
x_i(t) = c_i − Q_i^{-1} λ_i(t) = c_i − Q_i^{-1} (γ_i(t) p̂(t) + (1 − γ_i(t)) λ̂_i). (4)
The actual price t → p(t) is available to the agents once they have taken their actions. If the discrepancy between the predicted and the actual price is large, then the agents lose their trust in the predictions. We capture the changes of trust based on these positive or negative experiences by providing a trust update rule. In particular, we consider the following trust dynamics:
γ̇_i(t) = η_i ψ_i(‖p(t) − p̂(t)‖), (5)
where η_i > 0 and ψ_i : R_{≥0} → [−1, 1] determines whether the agent loses or gains trust in the price prediction. We assume that ψ_i(·) satisfies the following assumption, and an example of this function is depicted in Fig. 1. The scalar δ_i quantifies the tolerance of agent i towards the prediction error. That is, if the error ‖p(t) − p̂(t)‖ between the actual and the predicted price is greater than δ_i, agent i begins losing trust in the prediction at the rate η_i.
Conversely, trust increases as long as the error is within the tolerance δ_i. The rationale behind these dynamics is that, excluding the extreme cases of unconditional trust or distrust, trust can be gained or lost after several positive or negative experiences [15]. Note that the trust variables are defined on the interval between 0 and 1. To respect this, we slightly revise (5) by adding projection operators to it, namely:
γ̇_i(t) = Π_{[0,1]}(γ_i(t), η_i ψ_i(‖p(t) − p̂(t)‖)). (6)
We note that the essence of the trust update rule remains the same as (5). The projection operators become active only if the bounds γ_i = 0 or γ_i = 1 are hit. In particular, if γ_i(t_1) = 1 at some time t = t_1 and ψ_i(‖p(t_1) − p̂(t_1)‖) is positive (thus suggesting an increase in γ_i), the projection becomes active and sets γ̇_i(t_1) to 0, thus prohibiting the trust variable from exceeding its maximum value 1. An analogous scenario occurs for the case γ_i(t_1) = 0.
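The projected trust update is straightforward to simulate with a forward-Euler scheme. Below is a hedged sketch: the piecewise-linear tolerance function `psi` is our own example (the paper only fixes the sign pattern of ψ_i around δ_i), and all parameters are illustrative. For the scalar interval [0, 1], the projection simply discards the drift at an active bound.

```python
import numpy as np

def psi(err, delta):
    # Example tolerance function with the sign pattern described in the text:
    # positive (trust gain) for err < delta, negative (trust loss) beyond it.
    return np.clip((delta - err) / delta, -1.0, 1.0)

def trust_step(gamma, err, eta, delta, dt):
    # Euler step of the projected trust dynamics: the projection onto [0, 1]
    # zeroes the drift whenever it points outside the interval at a bound.
    drift = eta * psi(err, delta)
    if gamma >= 1.0 and drift > 0:   # projection active at the upper bound
        drift = 0.0
    if gamma <= 0.0 and drift < 0:   # projection active at the lower bound
        drift = 0.0
    return min(1.0, max(0.0, gamma + dt * drift))

gamma, eta, delta, dt = 0.2, 1.0, 0.5, 0.01
for _ in range(2000):                # accurate predictions: error below delta
    gamma = trust_step(gamma, err=0.1, eta=eta, delta=delta, dt=dt)
assert gamma == 1.0                  # full trust is reached and maintained
for _ in range(2000):                # persistent large error erodes trust
    gamma = trust_step(gamma, err=2.0, eta=eta, delta=delta, dt=dt)
assert gamma == 0.0                  # trust is lost entirely
```

The two phases illustrate the saturation behavior described above: the bound is hit in finite time and the projection then keeps the trust variable there.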
For simplicity of presentation, we rewrite the model of agent i, consisting of (4) and (6), as follows:
γ̇_i = Π_{[0,1]}(γ_i, η_i ψ_i(‖p − p̂‖)), (7a)
x_i = π_i(p̂, γ_i), (7b)
where
π_i(p̂, γ_i) := c_i − Q_i^{-1} (γ_i p̂ + (1 − γ_i) λ̂_i). (8)
Note that the actual price p and the price prediction p̂ are the inputs of the model, and the action vector x_i is the output. Having introduced the model of the agents, we next discuss the desired aggregative behavior.

Desired aggregative behavior
The goal of the system regulator is to coordinate the agents such that they cumulatively behave in a desired fashion. Here, we are interested in regulating Σ_{i∈I} x_i(t), which we refer to as the aggregative behavior. This quantity often reflects total production or total demand, depending on the application at hand. More precisely, the regulator aims to achieve
lim_{t→∞} Σ_{i∈I} x_i(t) = x* (9)
for some desired setpoint x* ∈ R^n. To this end, we propose suitable nudge mechanisms that can be implemented by the regulator. A mechanism is a nudge if it influences the behavior of a group of individuals through indirect suggestions. We use this concept and propose mechanisms in which the regulator manipulates the price prediction p̂(t) to achieve its goal, namely (9).
Recall that the actual price is considered here as an exogenous signal. In particular, we assume that it admits the decomposition
p(t) = p_0 + ∆p(t),
where p_0 is a constant base price, known to the regulator, and ∆p(t) accounts for price fluctuations, with ‖∆p(t)‖ ≪ ‖p_0‖. We assume that the following condition holds throughout the paper:

Assumption 3.2. The actual price function p : [0, ∞) → R^n is continuous, and its fluctuations satisfy ‖∆p(t)‖ < min_{i∈I} δ_i for all t ∈ [0, ∞).

Note that in the absence of the objective (9), the best the regulator can do is to provide the agents with the true value of p_0. In that case, the price prediction error amounts to ‖∆p(t)‖. Therefore, the inequality constraint in Assumption 3.2 simply means that the prediction error in such a manipulation-free case is within the tolerances of all agents. In other words, the price fluctuations, per se, should not lead to a loss of trust. •
The fact that the agents do not blindly follow the price prediction p̂(t) implies that not every aggregative behavior x* is achievable. Next, we identify a set of aggregative behaviors to which the agents can be driven by applying our nudge mechanisms.
Let Assumption 3.2 hold, and choose δ̄ ∈ R such that
0 < δ̄ < min_{i∈I} δ_i − sup_{t≥0} ‖∆p(t)‖. (10)
We leverage the idea that if Assumption 3.1 holds and p̂(t) belongs to the closed ball
B := B̄(p_0, δ̄), (11)
then ψ_i(·) takes positive values and γ_i(t) increases for all i ∈ I following (7a). As a result, the regulator can gain the agents' trust in the price prediction by constraining p̂(t) to the ball B. Bearing this and the action of the agents in (7b) in mind, we define the set of admissible x* as:
X* := { Σ_{i∈I} (c_i − Q_i^{-1} p̂) | p̂ ∈ B }. (12)
From (11), the set X* can be explicitly written as
X* = { x_0 + Σ_{i∈I} Q_i^{-1} v | ‖v‖ ≤ δ̄ }, x_0 := Σ_{i∈I} (c_i − Q_i^{-1} p_0). (13)
Thus, the regulator can alter the aggregative behavior inside a compact set around x_0. Putting it differently, X* characterizes the set of aggregative behaviors that are potentially achievable while monotonically increasing the trust variables. Note from (10) and (13) that the bigger the agents' tolerances δ_i are, the larger δ̄ and thus the set X* can be.
For any x* ∈ X*, there exists a unique p* ∈ B such that
x* = Σ_{i∈I} (c_i − Q_i^{-1} p*), (14)
or equivalently
p* = ( Σ_{i∈I} Q_i^{-1} )^{-1} ( Σ_{i∈I} c_i − x* ). (15)
The vector p* is an important quantity. If the agents fully trust the price prediction and the regulator communicates p* as the prediction, then the aggregative behavior of the agents will be x*. However, the regulator cannot directly compute p* since it does not know the exact parameters defining the individual cost functions. Moreover, trust can only be gained over time. To address these issues, suitable nudge mechanisms are designed in the next section. Each of those mechanisms can be interconnected with the agents' dynamics, as demonstrated in Fig. 2, in order to drive the price prediction p̂(t) to p*, and consequently x(t) to x*. The key parameter used in the proposed mechanisms is δ̄ satisfying (10). The precise values of the agents' tolerances δ_i are unknown to the regulator, and the price fluctuations ∆p(t) are not available a priori. Thus the regulator typically needs to rely on a lower estimate of min_{i∈I} δ_i − sup_{t≥0} ‖∆p(t)‖ to select δ̄. The less the regulator knows about the right-hand side of (10), the more conservatively the value of δ̄ has to be chosen, which in turn results in a smaller ball B as well as a smaller set of admissible desired behaviors X*. Learning a feasible δ̄ from experiments is an interesting question for future research.
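The relation x* = Σ_{i∈I}(c_i − Q_i^{-1} p*) in (14) also yields a practical admissibility test: x* belongs to X* exactly when the corresponding p* lies in the ball B. A sketch with randomly generated, hypothetical agent data (all names and values below are ours):

```python
import numpy as np

rng = np.random.default_rng(1)
n, N = 2, 5
# Hypothetical agent data: diagonal Q_i > 0 and cost-free optima c_i.
Qs = [np.diag(rng.uniform(1.0, 2.0, n)) for _ in range(N)]
cs = [rng.uniform(0.0, 2.0, n) for _ in range(N)]
p0, delta_bar = np.array([1.0, 1.0]), 0.5

Qbar = sum(np.linalg.inv(Q) for Q in Qs)   # sum of Q_i^{-1}
x0 = sum(cs) - Qbar @ p0                   # aggregate behavior at the base price

def p_star(x_star):
    # Unique price such that x* equals sum_i (c_i - Q_i^{-1} p*).
    return np.linalg.solve(Qbar, sum(cs) - x_star)

def admissible(x_star):
    # x* is admissible iff its price p* falls inside the ball B(p0, delta_bar).
    return np.linalg.norm(p_star(x_star) - p0) <= delta_bar

assert admissible(x0)                      # the base behavior is always admissible
far_target = x0 + 100.0 * np.ones(n)
assert not admissible(far_target)          # too far from x0 to be nudged
```

The check mirrors the geometry of the admissible set: behaviors reachable by moving the price within B form a compact set around x0.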

Nudge mechanisms for stationary desired behaviors
In this section, we design two nudge mechanisms, referred to as hard and soft, that provide suitable price prediction signals.

Hard nudge mechanism
The first nudge mechanism that we propose is the following projected-integral control law:
ṗ̂(t) = Π_B(p̂(t), Σ_{i∈I} x_i(t) − x*), (16)
where B is defined in (11) and x* is the desired aggregative behavior. We note that from [21, Lem. 2.1], the projection operator on the right-hand side can be explicitly expressed using the definition of B. In particular, let e(t) := Σ_{i∈I} x_i(t) − x*; then we obtain
ṗ̂(t) = e(t) − (α(t)/δ̄²)(p̂(t) − p_0) if ‖p̂(t) − p_0‖ = δ̄, and ṗ̂(t) = e(t) otherwise,
where α(t) := max{0, e(t)^⊤(p̂(t) − p_0)}. The intuition behind the nudge mechanism in (16) is as follows: this mechanism provides a suitable integral action that updates the price prediction such that the error between the desired behavior and the current aggregative behavior diminishes. To gain and maintain the trust of the agents, the price prediction is constrained to the ball B for all time, and thus we refer to (16) as the hard nudge.
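The projection appearing in the hard nudge can be implemented directly for a ball: on the boundary of B the outward radial component of the integral action is removed, while in the interior the action passes through unchanged. A sketch (function name and parameters are ours):

```python
import numpy as np

def hard_nudge_rhs(p_hat, e, p0, delta_bar, tol=1e-9):
    # Projection of the integral action e onto the tangent cone of the
    # ball B(p0, delta_bar) at the current prediction p_hat.
    d = p_hat - p0
    if np.linalg.norm(d) < delta_bar - tol:   # interior: tangent cone is R^n
        return e
    alpha = max(0.0, e @ d)                   # outward radial component, if any
    return e - (alpha / (d @ d)) * d          # strip it on the boundary

p0, delta_bar = np.zeros(2), 1.0
p_bd = np.array([1.0, 0.0])                   # boundary point of B
e_out = np.array([2.0, 1.0])                  # action pushing outward
v = hard_nudge_rhs(p_bd, e_out, p0, delta_bar)
assert np.allclose(v, [0.0, 1.0])             # radial part removed
e_in = np.array([-2.0, 1.0])                  # action pushing inward: untouched
assert np.allclose(hard_nudge_rhs(p_bd, e_in, p0, delta_bar), e_in)
```

Only the outward component is ever discarded, so the prediction can slide along the boundary of B while the tangential part of the integral action keeps acting.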
The overall system, as shown in Fig. 2, is obtained by interconnecting (16) with the agents' model (7), and the theorem below addresses its convergence.

Theorem 4.1. Consider the closed-loop system formed by the agents' model (7) and the hard nudge mechanism (16) with x* ∈ X*. Then, for any initial condition (p̂(0), col(γ_i(0))) ∈ B × [0, 1]^N, there exists a Carathéodory solution t → (p̂(t), col(γ_i(t))) of the closed-loop system over the domain [0, ∞). Moreover, along any such solution, full trust of the agents is achieved in finite time and Σ_{i∈I} x_i(t) converges to x*.
Proof. The proof is divided into two parts. Since the vector field of the overall system is discontinuous, we show existence of Carathéodory solutions of the system in the first part. The second part is devoted to convergence analysis.
Existence of solutions: Let ξ := (p̂, col(γ_i)) and Ω := B × [0, 1]^N. Then, by substituting the expression of x_i from (7b) into (16), we obtain the nonautonomous projected dynamical system
ξ̇ = Π_Ω(ξ, h(ξ, t)),
which represents the closed-loop system (7), (16). Note that the map (p̂, t) → ψ_i(‖p(t) − p̂‖) is measurable in t and locally Lipschitz in p̂. The former follows from Assumptions 3.1 and 3.2 and the fact that every continuous function is measurable [23, Prop. 3.3]. The latter is a consequence of Assumption 3.1 and the fact that the norm operator is Lipschitz. Consequently, the function (ξ, t) → h(ξ, t) is locally Lipschitz in ξ and measurable in t, and using the compactness of the set Ω, existence of solutions for any initial condition (p̂(0), col(γ_i(0))) ∈ B × [0, 1]^N is guaranteed by Lemma A.1.
Convergence analysis: Our proof proceeds by showing that, for any solution of the system, there exists a finite time by which full trust of the agents is achieved and maintained. Subsequently, with full trust, we show that p̂(t) converges to p*.
Note from (16) that p̂(t) ∈ B for all t ≥ 0. Using this fact along with Assumption 3.1, we obtain ψ_i(‖p(t) − p̂(t)‖) > 0 for all i ∈ I and t ≥ 0. Consequently, along any solution, the trust variable of agent i at any time t is given by
γ_i(t) = min{ 1, γ_i(0) + η_i ∫_0^t ψ_i(‖p(s) − p̂(s)‖) ds }. (17)
Bearing in mind that p̂(t) belongs to the ball B given by (11), we have ‖p(t) − p̂(t)‖ ≤ ρ < min_{i∈I} δ_i for some ρ > 0. Hence, by Assumption 3.1, we obtain that ψ_i(‖p(t) − p̂(t)‖) ≥ ψ_i(ρ) > 0 for all i ∈ I, and thus there exists a finite time T ≥ 0 such that γ_i(t) = 1 for all i ∈ I and all t ≥ T. As a consequence, in the time interval [T, ∞), the price prediction dynamics (16) reduce to
ṗ̂ = Π_B(p̂, f(p̂)), (18)
where
f(p̂) := Σ_{i∈I} (c_i − Q_i^{-1} p̂) − x*. (19)
We next analyze the asymptotic properties of (18) and show that its solutions converge asymptotically to p*. Consider the Lyapunov candidate V(p̂) := ½ ‖p̂ − p*‖². Since solutions of (18) are absolutely continuous and V is continuously differentiable, the time derivative of the evolution of V along any solution of (18) is equal to the inner product of the gradient of V and the right-hand side of (18). This inner product is computed as
(p̂ − p*)^⊤ Π_B(p̂, f(p̂)) = (p̂ − p*)^⊤ ( f(p̂) − proj_{N_B(p̂)}(f(p̂)) ) ≤ (p̂ − p*)^⊤ f(p̂),
where we used Moreau's decomposition theorem (cf. Section 2) to obtain the equality, and the inequality follows from the definition of the normal cone N_B(p̂) together with p* ∈ B. We use (19) and the expression of x* in (14) to obtain
(p̂ − p*)^⊤ f(p̂) = −(p̂ − p*)^⊤ ( Σ_{i∈I} Q_i^{-1} ) (p̂ − p*) < 0 for all p̂ ≠ p*.
This implies that V decreases monotonically along every solution of (18). Consequently, p̂ converges to p*, and the aggregative behavior Σ_{i∈I} x_i converges to x*.
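The convergence established above can be illustrated with a crude forward-Euler simulation of the full interconnection of Fig. 2. Everything below is illustrative: scalar prices (n = 1), made-up agent parameters, and a simple piecewise-linear tolerance function of the kind assumed for ψ_i.

```python
import numpy as np

# Scalar (n = 1) sketch of the closed loop with hypothetical parameters.
N, dt, T = 3, 0.005, 20000
Q = np.array([2.0, 1.0, 1.5])          # Q_i > 0
c = np.array([1.0, 2.0, 1.5])          # cost-free optima c_i
lam_loc = np.array([0.8, 1.2, 1.0])    # local price perceptions
delta = np.array([0.6, 0.7, 0.5])      # tolerances delta_i
eta = 1.0
p0, delta_bar = 1.0, 0.3               # delta_bar below min_i delta_i

x_star = np.sum(c - (p0 + 0.2) / Q)    # admissible target: price p0 + 0.2 in B
p_hat, gamma = p0, np.full(N, 0.3)

for k in range(T):
    p = p0 + 0.05 * np.sin(0.01 * k)   # actual price with small fluctuations
    # agents' actions under the trust-adapted price perception
    x = c - (gamma * p_hat + (1 - gamma) * lam_loc) / Q
    # trust dynamics: projected onto [0, 1] via clipping (scalar interval)
    err = abs(p - p_hat)
    gamma = np.clip(gamma + dt * eta * np.clip((delta - err) / delta, -1, 1),
                    0, 1)
    # hard nudge: projected integral action keeping p_hat in B
    e = np.sum(x) - x_star
    p_hat = np.clip(p_hat + dt * e, p0 - delta_bar, p0 + delta_bar)

assert np.allclose(gamma, 1.0)                 # full trust achieved
assert abs(np.sum(x) - x_star) < 1e-2          # aggregate behavior near x*
```

Trust saturates at 1 within a few time units, after which the prediction dynamics behave like a stable linear integral loop and the aggregate demand settles at the target.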
As shown in Theorem 4.1, the hard nudge mechanism (16) successfully steers the agents to the desired aggregative behavior, for any x* ∈ X*. An implicit requirement is that the regulator has partial knowledge of the expected desired aggregative behaviors, i.e., a subset of X*, to pick a feasible x*. In case this information is not available and x* ∉ X*, convergence of the aggregative behavior is still guaranteed, but to a different point, namely to the point x′ ∈ X* that is closest to x* in a suitable norm. This is formally stated in the following corollary.
Corollary 4.2. Consider the closed-loop system formed by the agents' model (7) and the hard nudge mechanism (16) with x* ∉ X*. Then, for any initial condition (p̂(0), col(γ_i(0))) ∈ B × [0, 1]^N, there exists a Carathéodory solution t → (p̂(t), col(γ_i(t))) of the closed-loop system over the domain [0, ∞). Moreover, Σ_{i∈I} x_i(t) converges to
x′ = arg min_{y∈X*} ‖y − x*‖_{Q̄^{-1}}, Q̄ := Σ_{i∈I} Q_i^{-1}.
Proof. Based on the proof of Theorem 4.1, the closed-loop system admits a Carathéodory solution for all x* ∈ R^n, and thus existence of a solution t → (p̂(t), col(γ_i(t))) is guaranteed for all t ∈ [0, ∞). Next, we consider x′ and characterize its corresponding price prediction, namely p′. We prove convergence of (p̂, col(γ_i)) to (p′, 1_N) afterwards. Subsequently, convergence of Σ_{i∈I} x_i to x′ follows from the definition of p′.
Recalling the definition of X* given by (12), we see that for any y ∈ X*, there exists some s ∈ B such that the relation y = Σ_{i∈I} (c_i − Q_i^{-1} s) holds. Therefore, the optimality condition of x′ can be rewritten as
(x′ − x*)^⊤ ( Σ_{i∈I} Q_i^{-1} )^{-1} ( Σ_{i∈I} (c_i − Q_i^{-1} s) − x′ ) ≥ 0, ∀s ∈ B. (20)
Note from (16) that p̂(t) ∈ B for all t ≥ 0. Following the steps of the proof of Theorem 4.1, there exists some finite time T ≥ 0 such that col(γ_i(t)) = 1_N and the hard nudge mechanism reduces to (18) for all t ≥ T. Considering again the Lyapunov candidate V(p̂) := ½ ‖p̂ − p′‖², its derivative along (18) yields
V̇ ≤ (p̂ − p′)^⊤ f(p̂).
Now we add the left-hand side of (20) evaluated at s = p̂ to the right-hand side of the foregoing inequality to get
V̇ ≤ (p̂ − p′)^⊤ f(p̂) + (x′ − x*)^⊤ ( Σ_{i∈I} Q_i^{-1} )^{-1} ( Σ_{i∈I} (c_i − Q_i^{-1} p̂) − x′ ) = −(p̂ − p′)^⊤ ( Σ_{i∈I} Q_i^{-1} ) (p̂ − p′),
where the equality follows from the definition of f given by (19). We conclude that V decreases monotonically along every solution of (18) and p̂ converges to p′.

Remark 4.3. If Assumption 3.2 is not satisfied, one may still be able to provide convergence guarantees under suitable conditions. In particular, let S denote the collection of agents that violate Assumption 3.2 for all time, i.e.,
S := {j ∈ I | ‖∆p(t)‖ ≥ δ_j, ∀t ∈ [0, ∞)}.
The remaining agents satisfy the assumption, namely ‖∆p(t)‖ < min_{i∈I\S} δ_i for all time. We can then show that under the hard nudge (16) with δ̄ ∈ R satisfying the revised inequality
0 < δ̄ < min_{i∈I\S} δ_i − sup_{t≥0} ‖∆p(t)‖,
the aggregative behavior of the agents in S converges to x̄ := Σ_{j∈S} (c_j − Q_j^{-1} λ̂_j), whereas the aggregative behavior of the agents in I \ S converges to the point x′ ∈ Y closest to x* − x̄. The set Y is similar to X* in (12) but with the set of agents restricted to I \ S. In case x* − x̄ ∈ Y, we have x′ = x* − x̄, which implies that the aggregative behavior of all agents converges to x′ + x̄ = x*. The details of the analysis are omitted due to lack of space. •

Soft nudge mechanism
While using the nudge mechanism in (16) is effective for driving the aggregative behavior of the agents to a desired point, convergence is guaranteed only if the price prediction is initialized in the ball B. We now present an alternative nudge mechanism under which convergence is guaranteed globally, i.e., for all (p̂(0), col(γ_i(0))) ∈ R^n × [0, 1]^N. The proposed mechanism is given by
ṗ̂(t) = Σ_{i∈I} x_i(t) − x* + (1/ε)(proj_B(p̂(t)) − p̂(t)), (21)
where B is defined in (11) and ε > 0 is a design parameter. We note that the explicit expression of the projection of p̂(t) onto the ball B is as follows:
proj_B(p̂) = p_0 + δ̄ (p̂ − p_0) / max{δ̄, ‖p̂ − p_0‖}. (22)
In the mechanism (21), the term Σ_{i∈I} x_i(t) − x* provides a suitable integral action as before to steer the aggregative behavior towards x*. However, different from (16), this term is outside the projection operator, and solutions of (21) need not belong to the ball B at all times. To emphasize this feature, we denote the dynamics (21) as the soft nudge 7 . We note that outside the ball B, the term proj_B(p̂(t)) − p̂(t) is nonzero and acts with the penalty gain ε^{-1}, thus attracting the price prediction p̂(t) to the ball and preventing the loss of trust. The parameter ε is chosen sufficiently small such that the trust variables increase and reach the value of 1 in finite time. Below we establish the convergence properties of the soft nudge mechanism.

7 For related work on replacing projected dynamical systems with dynamics consisting of a penalty term, as in (21), see the anti-windup approximation scheme studied in [11].

Theorem 4.4. Consider the closed-loop system formed by the agents' model (7) and the soft nudge mechanism (21) with x* ∈ X*. Then, there exists ε* > 0 such that for any ε ∈ (0, ε*] and any initial condition (p̂(0), col(γ_i(0))) ∈ R^n × [0, 1]^N, there exists a bounded Carathéodory solution t → (p̂(t), col(γ_i(t))) of the closed-loop system over the domain [0, ∞). Moreover, full trust of the agents is achieved in finite time, p̂(t) converges exponentially to p*, and Σ_{i∈I} x_i(t) converges to x*.
Proof. The proof is divided into three parts. In the first part, we show that for any given (p̂(0), col(γ_i(0))) ∈ R^n × [0, 1]^N, there exists a bounded Carathéodory solution of (7) and (21). The second part argues that there exists some ε* > 0 such that for all ε ∈ (0, ε*], the price prediction converges exponentially fast to a neighborhood of the ball B. We prove convergence of the solution to the point (p*, 1_N) in the last part.
Existence of solutions: By using (7) and (21), we write the dynamics of the overall closed-loop system as
ṗ̂ = h(p̂, col(γ_i)), (23a)
γ̇_i = Π_{[0,1]}(γ_i, η_i ψ_i(‖p(t) − p̂‖)), i ∈ I, (23b)
where h(p̂, col(γ_i)) := Σ_{i∈I} π_i(p̂, γ_i) − x* + (1/ε)(proj_B(p̂) − p̂). Noting the nonexpansive property of proj_B [4, Prop. 2.1.3(c)] and the definition of π_i given by (8), the map (p̂, col(γ_i)) → h(p̂, col(γ_i)) is locally Lipschitz in its arguments. Also, as discussed in the proof of Theorem 4.1, the map (p̂, t) → ψ_i(‖p(t) − p̂‖) is locally Lipschitz in p̂ and measurable in t. Consequently, existence of solutions follows by showing that the hypotheses (i)-(iii) of Lemma A.2 are satisfied.
We use the expression of π_i given by (8) and rewrite the dynamics (23a) as follows:
ṗ̂ = −( Σ_{i∈I} γ_i Q_i^{-1} + ε^{-1} I_n ) p̂ + ν, (24)
where ν := Σ_{i∈I} (c_i − (1 − γ_i) Q_i^{-1} λ̂_i) − x* + ε^{-1} proj_B(p̂). Note that the term ν is bounded for all p̂ ∈ R^n and γ_i ∈ [0, 1]. In particular, it follows from proj_B(p̂) ∈ B that there exists a constant ν̄ > 0 such that ‖ν‖ ≤ ν̄ for all (p̂, col(γ_i)) ∈ R^n × [0, 1]^N. Now consider the Lyapunov candidate V(p̂) := ½ ‖p̂ − proj_B(p̂)‖². Since proj_B(p̂) is unique at any point p̂ ∈ R^n (cf. equation (22)), it follows from Danskin's Theorem [4, Prop. B.25(a)] that V(p̂) is differentiable and ∇V(p̂) = p̂ − proj_B(p̂). Therefore V(p̂) satisfies Lemma A.2(i)-(ii). We next establish existence of solutions by analyzing the inner product of ∇V(p̂) and the right-hand side of (24). Recalling that h(p̂, col(γ_i)) denotes the right-hand side of (24) (cf. equation (23a)), this inner product is computed as
∇V(p̂)^⊤ h(p̂, col(γ_i)) = −(p̂ − proj_B(p̂))^⊤ ( Σ_{i∈I} γ_i Q_i^{-1} ) p̂ − ε^{-1} (p̂ − proj_B(p̂))^⊤ p̂ + (p̂ − proj_B(p̂))^⊤ ν.
The first term on the right-hand side of the above equation is nonpositive as γ_i ∈ [0, 1] and Q_i ≻ 0 for all i ∈ I. Using this fact and the bound on ν, we get
∇V(p̂)^⊤ h(p̂, col(γ_i)) ≤ −ε^{-1} (p̂ − proj_B(p̂))^⊤ p̂ + ν̄ ‖p̂ − proj_B(p̂)‖.
This implies that the inner product ∇V(·)^⊤ h(·) is negative for all ‖p̂‖ ≥ ‖p_0‖ + δ̄ + 2εν̄ and γ_i ∈ [0, 1]. Therefore, hypothesis (iii) of Lemma A.2 is satisfied, and the closed-loop system has a bounded Carathéodory solution for all t ≥ 0.
Here, T_1 is equal to zero if p̂(0) ∈ B̄(p_0, δ̄), and T_1 is given by (28) otherwise. In other words, T_1 is the time after which the trajectory t → p̂(t) enters and stays in the set B̄(p_0, δ̄). Recall that γ_i(t) ∈ [0, 1] at all times. We will next show that full trust of all the agents is achieved in the time interval [T_1, T_2] for some finite time T_2.
Noting that δ̄ satisfies (26) and p̂(t) ∈ B̄(p_0, δ̄) in the time interval [T_1, ∞), there exists some ρ̄ > 0 such that ‖p(t) − p̂(t)‖ ≤ ρ̄ < min_{i∈I} δ_i in the same time interval. By Assumption 3.1, we deduce that ψ_i(‖p(t) − p̂(t)‖) ≥ ψ_i(ρ̄) > 0 for all i ∈ I. This implies that, analogous to the discussion of the trust variables in the proof of Theorem 4.1 and (17), there exists a finite time T_2 ≥ T_1 such that γ_i(t) = 1 for all i ∈ I and t ≥ T_2. In the time interval [T_2, ∞), using γ_i(t) = 1 for all i ∈ I, the dynamics of the price prediction (24) reduce to (29), where p̂(T_2) ∈ B̄(p_0, δ̄) and we used the expression of x* in (14). Now, we consider the Lyapunov candidate W(p̂) := ½‖p̂ − p*‖² and analyze its evolution along the solution of (29). The resulting Ẇ is negative definite in p̂ − p*, which implies that p̂ exponentially converges to p* in the time interval [T_2, ∞), and the aggregate behavior Σ_{i∈I} x_i converges to x*.
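Under full trust, the reduced prediction dynamics (29) contract toward p*. The sketch below illustrates this with hypothetical positive definite Q_i, assuming the reduced dynamics take the explicit form ṗ̂ = −(Σ_i Q_i^{-1})(p̂ − p*); both the parameter values and this explicit form are our illustrative stand-ins for (29), not the paper's exact expressions:

```python
import numpy as np

rng = np.random.default_rng(0)
n, N = 4, 5
Q_inv = [np.diag(rng.uniform(1.0, 2.0, n)) for _ in range(N)]  # hypothetical Q_i^{-1} > 0
A = sum(Q_inv)                                                 # positive definite
p_star = rng.standard_normal(n)
p_hat = p_star + rng.standard_normal(n)                        # start away from p*

dt, W = 0.01, []
for _ in range(2000):
    W.append(0.5 * np.linalg.norm(p_hat - p_star) ** 2)        # Lyapunov candidate W
    p_hat = p_hat + dt * (-A @ (p_hat - p_star))               # forward-Euler step
```

The recorded values of W decrease monotonically and geometrically, mirroring the exponential convergence claimed above.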
Remark 4.5. While Theorem 4.4 guarantees existence of a sufficiently small ε* given by (27), computing its value requires the knowledge of bounds on the agent parameters c_i, Q_i, δ_i, and λ_i. If such bounds are not available, one can opt for the hard nudge mechanism (16) at the cost of restricting the initial condition p̂(0) to B.
Remark 4.6. The results on the hard and soft nudge mechanisms remain valid for more general classes of cost functions than (2). In particular, let the cost functions be of the form J_i(z_i, p) := c_i(z_i) + z_iᵀp, where c_i : R^n → R is C² and strongly convex. The model of the agents in (7) is then modified accordingly, and it can be shown that, for any desired behavior x* ∈ X*, both hard and soft nudges guarantee convergence of the aggregative behavior to x*. However, when the desired behavior is time-varying, as considered in the next section, devising a suitable nudge mechanism becomes much more challenging. Therefore, to unify the presentation throughout the paper, we have provided our results for the linear-quadratic cost function (2). •

A nudge mechanism for temporal desired behaviors
So far, we have treated the desired aggregative behavior as a fixed point. However, this point may vary with time in practice due to changes in market conditions, the climate, and government policies. In the context of power systems, for instance, climate change affects the efficiency of power production as well as energy consumption [7]. Policies passed by the government also affect the market substantially; see, e.g., [29] regarding renewable energy. These changes entail variations of the desired aggregative behavior over time. Building on (21), we design here a nudge mechanism that steers the aggregative behavior of the agents to a desired time-varying signal t → x*(t). The set of admissible reference signals x*(·) is characterized by the assumption below.
Assumption 5.1. The signal t → x*(t) belongs to the set X* given by (12) for all t ∈ [0, ∞). In addition, x*(·) is continuously differentiable with bounded derivative over the domain [0, ∞); that is, there exists a constant θ > 0 such that ‖ẋ*(t)‖ ≤ θ for all t ∈ [0, ∞). •
The above assumption indicates that the desired aggregative behavior of the agents satisfies a regularity condition in the sense that it is smooth and belongs to the admissible set X*. For all t ∈ [0, ∞), since x*(t) ∈ X*, we obtain from (12) that there exists a unique p*(t) ∈ B such that (31) holds. Rearranging the terms, p*(t) can be written explicitly as in (32). Note from Assumption 5.1 that the signal t → p*(t) is differentiable with a bounded derivative. If the system regulator had accurate knowledge of all Q_i and c_i parameters, it could have obtained the desired behavior by setting the price prediction equal to p*(t). However, since the cost functions of the agents are unknown to the system designer, such a simple strategy cannot be implemented. This calls for a more sophisticated design, and to that end, we propose the adaptive nudge mechanism (33), where B is given by (11), ‖K(t)‖_F is the Frobenius norm of K(t), ε > 0, τ > 0, and the function σ_s : R_{≥0} → [0, σ̄] is given by (34). In this definition, σ̄ > 0 and k_0 > 0 are design parameters that are selected afterwards.
Interpretation of the adaptive nudge mechanism: Several remarks are in order concerning the adaptive nudge (33): (i) This mechanism simplifies to the soft nudge mechanism (21) in the case of a stationary desired aggregative behavior. Namely, with ẋ*(t) = 0, the dynamics (33a) reduce to (21) and (33b) can be discarded. (ii) Compared to the soft nudge mechanism, the additional term K(t)ẋ*(t) is included to cope with the temporal nature of the desired aggregative behavior by tracking the signal ṗ*(t) = K*ẋ*(t) (cf. equation (32)). Again, since the regulator is not aware of all cost functions, a static choice K(t) = K* would not be feasible, and we therefore appeal to the adaptive law (33b). (iii) The first term on the right-hand side of (33b) is chosen such that sign-indefinite terms in the time-derivative of the Lyapunov function are canceled out. The second term provides a state-dependent damping that prevents the matrix K(t) from becoming unbounded.
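The value of the feedforward term in remark (ii) can already be seen in a scalar toy model: a purely feedback prediction ṗ̂ = −a(p̂ − p*(t)) lags a moving target, whereas adding the ideal feedforward ṗ*(t) removes the tracking error. The model and gain below are hypothetical stand-ins for illustration, not the closed loop (33):

```python
import math

a, dt, T = 2.0, 1e-3, 20.0                  # illustrative gain and horizon
p_star = lambda t: math.sin(t)              # moving desired price (stand-in)
p_star_dot = lambda t: math.cos(t)          # its derivative (ideal feedforward)

def simulate(use_feedforward):
    """Euler-simulate the scalar tracker; return per-step tracking errors."""
    t, p_hat, errs = 0.0, p_star(0.0), []
    while t < T:
        ff = p_star_dot(t) if use_feedforward else 0.0
        p_hat += dt * (-a * (p_hat - p_star(t)) + ff)
        t += dt
        errs.append(abs(p_hat - p_star(t)))
    return errs

err_ff = simulate(True)    # feedback + feedforward: error stays near zero
err_fb = simulate(False)   # feedback only: persistent tracking lag
```

With feedforward the error remains at discretization level, while pure feedback settles into a nonvanishing oscillatory lag of amplitude roughly 1/√(a² + 1).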
Selection of design parameters: In order to guarantee convergence of the adaptive nudge algorithm, the design parameters ε, σ̄, and k_0 should be chosen appropriately. The treatment in Lemma B.1 in the appendix suggests choosing ε ∈ I_ε, σ̄ ∈ I_σ, and k_0 ∈ I_{k0}, with the intervals given by (36). Note that the design parameters can take any values within the bounds indicated above, and therefore their selection does not require the exact values of the cost parameters.
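The boundedness provided by the state-dependent damping σ_s(‖K‖_F)K, and the role of the thresholds σ̄ and k_0, can be illustrated numerically. The saturation below is a hypothetical Lipschitz choice in the spirit of (34), whose exact expression we do not reproduce, and the bounded drive U is a stand-in for the learning term in (33b):

```python
import numpy as np

def sigma_s(r, sigma_bar, k0):
    """Hypothetical Lipschitz saturation: 0 on [0, k0], ramping up to sigma_bar."""
    return sigma_bar * min(max((r - k0) / k0, 0.0), 1.0)

rng = np.random.default_rng(1)
n, tau, sigma_bar, k0, dt = 3, 1.0, 5.0, 2.0, 0.01
K = np.zeros((n, n))
norms = []
for _ in range(5000):
    U = rng.uniform(-1.0, 1.0, (n, n))  # bounded drive (stand-in for the adaptive term)
    K = K + dt * tau * (U - sigma_s(np.linalg.norm(K, "fro"), sigma_bar, k0) * K)
    norms.append(np.linalg.norm(K, "fro"))
```

However the bounded drive pushes, ‖K(t)‖_F cannot grow much past the threshold: once the norm exceeds it, the damping dominates, which is the mechanism behind the uniform ultimate bound on K in Lemma B.1.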
The main result of this section is provided in the following theorem.
Proof. Our proof builds on the results of Lemma B.1. Let ε ∈ I_ε, σ̄ ∈ I_σ, and k_0 ∈ I_{k0}; then it follows from Lemma B.1 that the closed-loop system admits a bounded Carathéodory solution over the domain [0, ∞). Consider any solution t → (p̂(t), K(t), col(γ_i(t))). Again from Lemma B.1, there is a finite time T ≥ 0 such that for all t ≥ T, we have ‖p̂(t)‖ ≤ p̄ and ‖K(t)‖_F ≤ k̄, with p̄ and k̄ given by (B.7). Next we prove convergence of (p̂(t), col(γ_i(t))) to (p*(t), 1_N) by considering three time intervals [T, T_1], [T_1, T_2], and [T_2, ∞). The first time interval concerns the convergence of p̂(t) to a neighborhood of B. Full trust of the agents is achieved in the second time interval, while convergence of p̂(t) to p*(t) is established in the last time interval.
We analyze the interval [T, T_1] by considering the price prediction dynamics (33a) as a system with bounded exogenous signals. In particular, we substitute the expression of x_i given by (7b) and (8) into (33a), and treat t → γ_i(t) and t → ν(t) as exogenous signals. From the proof of Lemma B.1, we see that the time instant T and the ultimate bounds p̄ and k̄ are uniform for all ε ∈ I_ε. This, in addition to proj_B(p̂) ∈ B, γ_i(t) ∈ [0, 1], and boundedness of x*(t) and ẋ*(t) (cf. Assumption 5.1), implies that ν(t) is uniformly ultimately bounded. More precisely, there exists some constant ν̄ > 0 such that ‖ν(t)‖ ≤ ν̄ for all t ≥ T and all ε ∈ I_ε. Next we use this property and show that a suitable selection of ε provides convergence of p̂(t) to a neighborhood of B in finite time. Let ε* be defined by the analogue of (27), with δ̄ satisfying (26); this results in ε* ∈ I_ε. Moreover, following the steps of the proof of Theorem 4.4, there exists some T_1 ≥ T such that, by choosing 0 < ε ≤ ε*, p̂(t) belongs to the ball B̄(p_0, δ̄) for all t ≥ T_1. We note that such a selection of ε is possible since ν̄, and hence ε*, are independent of the choice of ε ∈ I_ε.
Bearing in mind that p̂(t) ∈ B̄(p_0, δ̄) for all t ≥ T_1, an argument analogous to the proof of Theorem 4.4 can be used to show that there exists a finite time T_2 ≥ T_1 such that γ_i(t) = 1 for all i ∈ I and t ≥ T_2. Next we exploit γ_i(t) = 1 to establish convergence of p̂ to p* in the time interval [T_2, ∞). We perform a change of coordinates to ease the notation, namely, (p̂, K) → (p̃, Φ) with p̃ = p̂ − p* and Φ = K − K*, where K* is given by (35). In these coordinates, the closed-loop system, comprised of (7) and (33), takes the form (37), where we have used γ_i(t) = 1 and the expressions of π_i, x*(t), and ṗ*(t), respectively given by (8), (31), and (35). For the rest of the proof, we use the definition (38) for notational simplicity.
Consider the Lyapunov candidate V(p̃, Φ) := ½‖p̃‖² + (1/2τ)Tr(ΦᵀQ^{-1}Φ). It follows from p̃ᵀΦẋ*(t) = Tr(ẋ*(t)p̃ᵀΦ) and (30) that the evolution of V along the solutions of (37) takes the form (39). We proceed to show that, given k_0 ∈ I_{k0}, the second term on the right-hand side of (39) is nonpositive. We note that Φ = K + Q^{-1} due to (35) and (38). It then follows that the bound (40) holds, where we used Tr(KᵀQ^{-1}K) ≥ λ_min(Q^{-1})‖K‖²_F and λ_min(Q^{-1}) = 1/λ_max(Q) to find the first term on the right-hand side, and the second term is obtained using the Cauchy-Schwarz inequality as |Tr(KᵀQ^{-2})| ≤ ‖K‖_F‖Q^{-2}‖_F. In addition, notice that ‖Q^{-2}‖_F ≤ √n/λ²_min(Q). It then follows from the definition of I_{k0} that ‖Q^{-2}‖_F ≤ k_0/λ_max(Q) for all k_0 ∈ I_{k0}. This implies that (40) can be further bounded accordingly. Bearing in mind the definition of σ_s(·) ≥ 0 given by (34), we find that σ_s(‖K‖_F)(‖K‖_F − k_0) ≥ 0 for all K ∈ R^{n×n}. Combining this with the above inequality results in −σ_s(‖K‖_F)Tr(KᵀQ^{-1}Φ) ≤ 0. Consequently, the relation (39) provides the bound (41) on V̇. Next, recalling that the dynamics (37) are nonautonomous, we use Barbalat's lemma [26, Lem. 4.2] to conclude convergence of p̃(t) to the origin. Let f(t) := ∫_{T_2}^{t} ‖p̃(s)‖²_Q ds for t ≥ T_2. From (37), we see that ṗ̃(t) is bounded for all t ≥ T_2. This implies that f̈(t) is bounded too, and thus ḟ(t) is uniformly continuous. The next step is to show that the function f(t) has a finite limit as t → ∞. To that end, we integrate both sides of (41) and use the definition of f(t) together with V(t) ≥ 0 to obtain f(t) ≤ V(T_2) − V(t) ≤ V(T_2). The left-hand side of this inequality is bounded since V(T_2) is bounded. It then follows from Barbalat's lemma that lim_{t→∞} ḟ(t) = 0, i.e., p̃(t) → 0 as t → ∞. We conclude that p̂(t) converges to p*(t) in the time interval [T_2, ∞), and in turn, the aggregative behavior Σ_{i∈I} x_i(t) converges to x*(t) as desired.
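The Barbalat argument above — a bounded f with uniformly continuous ḟ forces ḟ(t) → 0 — can be checked numerically on a stand-in trajectory; the exponential decay rate below is arbitrary and merely illustrative:

```python
import numpy as np

t = np.linspace(0.0, 40.0, 40001)
ptilde = np.exp(-0.3 * t)              # stand-in for ||p~(t)||, not the actual solution
fdot = ptilde ** 2                     # integrand of f(t)
f = np.cumsum(fdot) * (t[1] - t[0])    # f(t) = int_0^t fdot(s) ds (left Riemann sum)
```

Here f(t) increases toward a finite limit (1/0.6 for this stand-in) while ḟ(t) → 0, consistent with the lemma's conclusion.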
Remark 5.3. We note that one can also devise an adaptive nudge mechanism built on the hard nudge (16), where ‖K(t)‖_F is the Frobenius norm of K(t), τ > 0, and the function σ_s : R_{≥0} → [0, σ̄] is defined in (34). We can then show that, for any t → x*(t) satisfying Assumption 5.1, choosing the design parameters σ̄ > 0 and k_0 ∈ I_{k0}, with I_{k0} given by (36), results in convergence of the aggregative behavior to x*(t). We note, however, that the resulting convergence is restricted to the ball B and is thus not global, unlike for the adaptive (soft) nudge mechanism (33). The details of the analysis are omitted due to space limitations. •

Case study
We illustrate the performance of our nudge mechanisms by considering the problem of coordinated charging of plug-in electric vehicles [19]. In this problem, the objective of the regulator is to control the aggregative power demand over a charging horizon.
We consider a population of I = {1, . . . , 10} agents, where each agent i aims at choosing its charging strategy over a charging horizon of length n = 24, namely z_i ∈ X_i ⊂ R^n, such that its cost function (43) is minimized, where a_i ∈ [0.004, 0.006] and b_i ∈ [0.065, 0.085]. The set X_i is nonempty, compact, and convex, and is defined via x̄_i ∈ [8, 10] (kW), the maximum charging rate at any instant, and d_i ∈ [25, 35] (kWh), the total energy required by the agent.
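With the quadratic cost above, each agent's best response to a price p is the Euclidean projection of the unconstrained minimizer −(b_i 1_n + p)/(2a_i) onto X_i. Assuming the standard charging-set form X_i = {z : 0 ≤ z ≤ x̄_i 1_n, 1_nᵀz = d_i} (our reading of the constraints, since the display is not reproduced here), this projection reduces to a one-dimensional bisection:

```python
import numpy as np

def proj_charging_set(w, xbar, d, iters=200):
    """Project w onto {z : 0 <= z <= xbar, sum(z) = d} (assumed form of X_i).
    Bisect on the scalar shift lam: sum of clip(w + lam, 0, xbar) is monotone in lam."""
    lo, hi = -np.max(w), xbar - np.min(w)   # brackets: sums 0 and n*xbar, resp.
    for _ in range(iters):
        lam = 0.5 * (lo + hi)
        if np.clip(w + lam, 0.0, xbar).sum() < d:
            lo = lam
        else:
            hi = lam
    return np.clip(w + 0.5 * (lo + hi), 0.0, xbar)

def best_response(p, a, b, xbar, d):
    """argmin_z a*||z||^2 + b*1'z + z'p over the assumed charging set."""
    w = -(b + p) / (2.0 * a)                # unconstrained minimizer
    return proj_charging_set(w, xbar, d)
```

Since the objective equals a‖z − w‖² up to a constant, projecting w onto the feasible set is exactly the constrained best response.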
Since the agents choose their actions from the sets X_i, rather than R^n, the expression of the optimal action (4) modifies to (44) [4, Prop. 2.1.2 and 2.1.3(b)]. Note that for X_i = R^n, the expression (44) reduces to (4). As for the choice of ψ_i, we pick a function satisfying Assumption 3.1. Taking Assumption 3.2 regarding the actual price signal into consideration, we pick p_0 = 0.3·1_n ($/kWh) and consider price fluctuations satisfying ‖∆p(t)‖ ≤ 0.1 ($/kWh) for all t ≥ 0. Let ρ̄ = 0.2; then ρ̄ is less than or equal to the expression on the right-hand side of (10). Consequently, the open ball B(p_0, ρ̄) = {p ∈ R^n : ‖p − p_0‖ < ρ̄} is a feasible set for the price prediction such that the regulator can gain the agents' trust. We also define the ball B by choosing δ̄ = 0.15; the condition (10) is then satisfied noting that δ̄ < ρ̄.
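The role of ψ_i can be illustrated with a hypothetical decreasing choice ψ_i(r) = λ_i(δ_i − r): positive while the prediction error r stays below the tolerance δ_i and negative otherwise, driving the projected trust dynamics on [0, 1]. This is a sketch of the trust model, not the expression used in the simulations:

```python
import numpy as np

def psi(r, delta_i=0.15, lam_i=1.0):
    """Hypothetical decreasing psi_i: rewards accurate predictions (r < delta_i)."""
    return lam_i * (delta_i - r)

def trust_step(gamma, err, dt=0.01):
    """Forward-Euler step of the trust dynamics, projected onto [0, 1]."""
    return float(np.clip(gamma + dt * psi(err), 0.0, 1.0))

gamma_good, gamma_bad = 0.2, 1.0
for _ in range(2000):
    gamma_good = trust_step(gamma_good, err=0.05)  # accurate prediction: trust grows
    gamma_bad = trust_step(gamma_bad, err=0.40)    # inaccurate prediction: trust decays
```

Consistently accurate predictions drive trust to full (γ_i = 1), while persistent errors above the tolerance erode it to zero.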

Stationary desired behavior
Here we demonstrate convergence of the aggregative behavior to a desired behavior x * shown in Fig. 3, under both hard and soft nudge mechanisms. The desired aggregative behavior specifies the goal of the system regulator in nudging the vehicles to charge their batteries in a specific interval.
We choose p̂(0) = p_0 ∈ B for the hard nudge, whereas we set ε = 10^{-3} and p̂(0) = p_0 + 0.06·1_n ∉ B for the soft nudge to demonstrate convergence for an initialization outside the ball B. Fig. 4 shows the distance of the mechanisms' price predictions to p_0 and the average of the trust variables. We observe that for the hard nudge, the price prediction belongs to the ball B at all times, and as a result, the trust variables converge to one. The latter is deduced from convergence of the average of the trust variables to one and γ_i ∈ [0, 1]. For the soft nudge, the price prediction converges to a positively invariant set inside the open ball B(p_0, ρ̄), which in turn increases the agents' trust in p̂. After gaining full trust of the agents, the price predictions of both mechanisms converge to p* ∈ B. Therefore, the aggregative behavior of the agents, namely the aggregative power demand, converges to x* as demonstrated in Fig. 5.

Temporal desired behavior
Next, we consider the case where the desired aggregative behavior varies with time, and employ the adaptive nudge protocol to steer the aggregative behavior towards it. We choose the desired behavior as x*(t) = ((1 + cos(3t))/2) m + ((1 − cos(3t))/2) s, with m and s shown in Fig. 6. Recalling the structure of the cost function C_i in (43), we observe that its minimization is equivalent to minimization of J_i given by (2) with Q_i = 2a_i I_n and c_i = −(b_i/(2a_i)) 1_n. Therefore, the matrix K* in (35), and thus the matrix K in (33), becomes a scalar matrix, i.e., K = kI_n, and the adaptive nudge (33) reduces accordingly. For the design parameters of the mechanism, we set ε = 2 × 10^{-5}, σ̄ = 10^5, and k_0 = 10. Noting the bounds on the a_i's, i.e., 0.004 ≤ a_i ≤ 0.006, the chosen parameters belong to the intervals defined in (36). Fig. 7 presents the simulation results for τ = 1, p̂(0) = p_0 + 0.06·1_n ∉ B, and k(0) = 0. The results demonstrate that the price prediction enters the ball B(p_0, ρ̄) and the trust variables converge to one. Subsequently, the price prediction converges to p*(t), and as a consequence, the aggregative behavior converges to the desired one as depicted in Fig. 8.
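That K(t) can be taken as a scalar matrix here follows from Q_i = 2a_i I_n: the matrix (Σ_i Q_i^{-1})^{-1} entering the explicit expression (32) of p*(t) is then itself a scalar matrix, so K*, and hence K, can be parameterized as kI_n. A quick numerical check with case-study parameter ranges (K* equals this matrix up to the sign convention of (32), which we do not fix here):

```python
import numpy as np

rng = np.random.default_rng(2)
n, N = 24, 10
a = rng.uniform(0.004, 0.006, N)                  # a_i drawn as in the case study
Q_inv = [np.eye(n) / (2.0 * a_i) for a_i in a]    # Q_i = 2 a_i I_n
M = np.linalg.inv(sum(Q_inv))                     # (sum_i Q_i^{-1})^{-1}
k = M[0, 0]                                       # candidate scalar gain
```

The check confirms M = kI_n with k = 1/Σ_i 1/(2a_i), so a single scalar adaptation variable suffices in (33b).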

Conclusions
We have presented a nudge framework where a regulator can steer the aggregative behavior of a set of price-taking agents to a desired behavior by sending a suitable price prediction signal. Due to the discrepancy between the signal sent out by the regulator and the actual price, we have incorporated trust dynamics in the agents' model, where the trust variables are updated based on the history of the accuracy of the price prediction signal. Nudge mechanisms have been proposed to steer the aggregative behavior of the agents to desired stationary as well as temporal behaviors. Analytical convergence guarantees have been provided for the proposed nudge mechanisms, and the results are demonstrated on a numerical case study. Future work includes investigating the application of the proposed nudge framework in transportation as well as power networks.
have that the function (x, t) → h(x, t) is Lipschitz on the set X [16, Ex. 3.19] and measurable in t. Consequently, by [13, Thm. 2], the system admits Krasovskii solutions. Note that in the referenced results, the map h is required to be Lipschitz everywhere in the domain. However, the implication holds even when h is Lipschitz only on the set X, that is, the set to which the solutions are restricted. The proof concludes by using [12, Thm. 6.3], which shows that the sets of Krasovskii and Carathéodory solutions are equivalent for autonomous projected dynamical systems. The result extends to the nonautonomous case using the same reasoning.
Lemma A.2. Consider a nonempty compact set Y ⊂ R^m and two vector fields h : R^n × R^m × [0, ∞) → R^n and g : R^n × R^m × [0, ∞) → R^m that are locally Lipschitz in the first two arguments and measurable in the third one. Consider the nonautonomous projected dynamical system ẋ = h(x, y, t), ẏ = Π_Y(y, g(x, y, t)). (A.2) Moreover, assume that there exists a continuously differentiable function V : R^n → R satisfying the following: there exists a constant µ > 0 such that, for all y ∈ Y, t ∈ [0, ∞), and ‖x‖ ≥ µ, ∇V(x)ᵀh(x, y, t) ≤ 0.
Proof. Our proof proceeds in two steps. First, for each initial condition, we design a nonautonomous projected dynamical system that admits a solution starting from the said initial point. Second, we show that this solution is also a solution of (A.2).
Next, we rewrite the dynamics of the overall closed-loop system in a suitable form to argue existence of solutions. Let ϕ := vec(K) and ξ := col(p̂, ϕ); then the closed-loop system, comprised of (7) and (33), takes the form (B.1), where h denotes the right-hand side of (B.1). Note that the map t → h(ξ, col(γ_i), t) is measurable as a consequence of Assumption 5.1. Further, using the fact that σ_s is Lipschitz and following arguments analogous to those provided in the proof of Theorem 4.4, we deduce that the map (ξ, col(γ_i), t) → h(ξ, col(γ_i), t) is locally Lipschitz in (ξ, col(γ_i)). Also, the map (p̂, t) → ψ_i(‖p(t) − p̂‖) is locally Lipschitz in p̂ and measurable in t. Hence, the existence of bounded solutions over the domain [0, ∞) follows from verifying that hypotheses (i)-(iii) of Lemma A.2 hold. The rest of the proof achieves this.
In the following discussion, we show the existence of some µ > 0 such that the required decrease condition holds for all col(γ_i) ∈ [0, 1]^N and t ≥ 0. This verifies that Lemma A.2(iii) holds and establishes existence.
For simplicity of presentation, we compute H in the coordinates (p̂, K, col(γ_i)). Note that in these coordinates the Lyapunov candidate becomes V(p̂, K), where (ṗ̂, K̇) stands for the right-hand side of (B.1).
Expanding the expression, we get (B.4), where we have dropped the arguments of H for simplicity. Since γ_i ∈ [0, 1] and Q_i ≻ 0 for all i ∈ I, the first term on the right-hand side of (B.4) is nonpositive. Further, one can show that ‖I_n − Σ_{i∈I} γ_i Q_i^{-1}‖ ≤ 1 + λ_max(Σ_{i∈I} Q_i^{-1}). This yields (B.5), where we used ‖ẋ*(t)‖ ≤ θ (cf. Assumption 5.1), ‖K‖ ≤ ‖K‖_F, and Young's inequality 2d(p̂)‖K‖_F ≤ d(p̂)² + ‖K‖²_F. Consequently, using the above inequality and the bounds on ν(t) and ẋ*(t), we obtain a bound on H. We proceed by showing that, with a careful selection of the design parameters, there exists a compact set outside of which the right-hand side of the foregoing inequality is negative. Toward this end, we make use of the definition of σ_s(·) and deduce that, for any σ̄ > 0 and k_0 > 0, the last term on the right-hand side of (B.5) satisfies −σ_s(‖K‖_F)‖K‖²_F ≤ −(σ̄/2)‖K‖²_F + (σ̄/2)k_0². This implies that