Breaking Consensus in Kinetic Opinion Formation Models on Graphons

In this work, we propose and investigate a strategy to prevent consensus in kinetic models for opinion formation. We consider a large interacting agent system and assume that agent interactions are driven by compromise as well as self-thinking dynamics, and are modulated by an underlying static social network. This network structure is included using so-called graphons, which modulate the interaction frequency in the corresponding kinetic formulation. We then derive the corresponding limiting Fokker-Planck equation and analyze its large time behavior. This microscopic setting serves as a starting point for the proposed control strategy, which steers agents away from the mean opinion and is characterized by a suitable penalization depending on the properties of the graphon. We show that this minimalist approach is very effective by analyzing the quasi-stationary solutions of the mean-field model for a variety of graphon structures. Several numerical experiments are also provided to show the effectiveness of the approach in preventing the formation of consensus, steering the system toward a declustered state.


Introduction
Opinion formation models have been extensively studied in several research communities in the last decades. Classical models are based on the assumption that an individual's opinion is influenced by binary interactions with others as well as their surroundings (for example through social media). Most of them describe the dynamics of each individual, resulting in complex large systems, see for example [8,26,32,36,37,39,40,45,54]. In many of these models the underlying microscopic dynamics lead to the formation of complex macroscopic patterns and collective states. Methodologies from statistical mechanics, especially kinetic theory, have been used successfully to derive and analyze these complex stationary states in suitable scaling limits. Toscani's seminal work on kinetic opinion formation models, see [55], was one of the starting points of various kinetic models for collective dynamics, studying for example the effects of leaders on the opinion formation process [4,28,29], decision making [50,51] and the influence of exogenous factors [15,31,59], to name a few.
There has been a general agreement that individuals change their opinion due to interactions with others (these interactions are almost always assumed to be binary). Most models assume that only like-minded individuals interact, known as bounded confidence models, and that dynamics are strongly driven by the tendency to compromise. In addition they often assume that individuals change their opinion due to self-thinking, for example because of exposure to different media channels. There is a rich literature on models for opinion formation in large interacting agent systems, see for example [18,37,45], as well as on the influence of social networks on the opinion formation process. The latter trend was accelerated as more and more data from social networks, such as X (formerly known as Twitter) or Facebook, became available. This allowed researchers to investigate, for example, challenging questions related to the influence of voters' behavior on the success of vaccination campaigns, see for example [2,6]. For other related works, see [5] on the control of opinions on evolving networks, [9,41,43] on polarization, and [56] on marketing aspects.
Several works proposing different control strategies to enhance consensus formation can be found in the multi-agent system literature, see for example [20,21]. On the other hand, control strategies to prevent consensus formation, which we refer to as declustering, have been less studied [53]. However, these declustering strategies can be a useful tool to understand how, for example, software-managed social media accounts, also known as bots, can prevent consensus formation or steer opinions in social networks; see for example [7,34]. It is therefore of interest to understand control mechanisms that prevent consensus formation in large social networks.
Social networks can be studied using graph theory. Graph theory has become one of the most active fields of research in connection with the collective behavior of large populations of agents, see for example [10-13, 47, 58]. The necessity to handle millions, and often billions, of vertices led to the study of large-scale statistical properties of graphs. In recent years, large discrete networks have been treated as continuous objects through the introduction of new mathematical structures called graphons, which stands for "graph functions", see [17,19,35,42,57]. The main feature of graphons is the possibility to bypass the classical adjacency matrix by looking at an associated function W(x, y) encoding all the information on the connectivity of the original discrete graph. Graph-theoretical research aside, the concept of graphons has also been applied to optimal control [33,38] and especially epidemiological theory [27,30,46]. We mention the recent works on kinetic and mean-field equations on graphons [14,16,24,48]: in particular, graphons are leveraged to analyze the behavior of mathematical models acting on networks when their number of nodes becomes very large. Such limiting procedures have also been analyzed thoroughly in recent years, see e.g. [44] and [52] and references therein, increasing the interest in differential equation models as continuous limits of stochastic processes on discrete graphs, such as random walks and diffusion processes.
Fokker-Planck type equations acting on suitable limits of large dense graphs in particular have been receiving increasing attention [14,24]. Within this framework, our analysis is devoted to a continuous model acting on a very large network (not necessarily dense) via its graphon representation rather than its limiting sequence of finite graphs. This allows for the very natural interpretation, from the kinetic theoretical point of view, of the network as a (continuous) interaction kernel, which then leads to an effective surrogate mean-field model that incorporates the graphon.
In this paper we propose and investigate possible strategies to prevent consensus formation in a kinetic model for opinion formation on networks. Our main contributions are:
• Development and analysis of a minimal control problem on the agent-based level.
• Derivation of a closed form solution of the controlled model in suitable scaling limits, showing that the proposed strategy does indeed prevent consensus in the long time limit (under certain conditions on the parameters).
• Extensive computational experiments illustrating the effectiveness of the proposed control strategy for power-law, k-NN and small-world graphons.
This paper is structured as follows: in Section 2 we introduce and analyze a kinetic opinion formation model on a stationary network. In Section 3 we propose a simple but very effective control mechanism, which prevents the formation of consensus. In particular we are able to provide closed form optimal controls, which prevent consensus formation in certain parameter regimes. We corroborate our analytical results with computational experiments in Section 4.

Kinetic models for opinion dynamics on graphon structures
We consider a large population of indistinguishable agents, each characterized by their opinion w belonging to I := [−1, 1], where ±1 corresponds to two opposite beliefs. Agents change their opinion through binary interactions with an interaction frequency modulated by an underlying static network. The opinion formation process itself is based on two different mechanisms:
• first, compromise dynamics, so that individuals with close opinions try to find a compromise;
• second, opinion fluctuations, which are included via random variables.
The interaction frequency of agents depends on the underlying graph structure, which we model by a graphon in this paper.
Before discussing the binary agent interactions, we recall some basics about graphons (and refer the interested reader to Appendix A for a more detailed introduction as well as further references). Graphons are continuous objects that generalize the concept of simple graphs with a large number of vertices. In the case of discrete graphs, nodes are usually referred to using the index i = 1, 2, ..., N, where N is the number of vertices. However, in graphons the discrete set {1, ..., N} is mapped onto the continuous interval [0, 1], so that nodes are labeled as x ∈ Ω ⊆ [0, 1], where Ω is a suitable subset of the unit interval of R. We will therefore consider agents which are not only characterized by their individual opinion w on a topic, but also by their static position on the graphon x ∈ Ω ⊆ [0, 1].
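The relabeling of nodes onto [0, 1] can be illustrated with a step graphon built from an adjacency matrix. The following is a minimal sketch for illustration only: the name `step_graphon` and the convention that node i (counted from zero) occupies the cell [i/N, (i+1)/N) are assumptions, not taken from the text.

```python
def step_graphon(adjacency):
    """Return W(x, y) on [0, 1]^2, piecewise constant on an N x N grid.

    Assumption: node i (0-based) occupies the cell [i/N, (i+1)/N).
    """
    n = len(adjacency)

    def W(x, y):
        # Clamp so that x = 1.0 or y = 1.0 falls into the last cell.
        i = min(int(x * n), n - 1)
        j = min(int(y * n), n - 1)
        return adjacency[i][j]

    return W


# A path graph on three nodes: 0 - 1 - 2.
path = [[0, 1, 0],
        [1, 0, 1],
        [0, 1, 0]]
W = step_graphon(path)
print(W(0.1, 0.5))  # 1: nodes 0 and 1 are adjacent
print(W(0.1, 0.9))  # 0: nodes 0 and 2 are not adjacent
```

As N grows, such step functions are the objects whose limits graphons are meant to capture; the continuous kernel then plays the role of the adjacency matrix.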
We consider the following setup for binary interactions: given two interacting agents characterized by their opinion and position in the graphon, that is (x, w), (y, w*) ∈ Ω × I, we compute their post-interaction opinions (x, w′), (y, w′*) (note that their positions x and y in the graphon do not change) as
$$
w' = w + \gamma P(x,y)\,(w_* - w) + \eta\, D(x,w), \qquad
w'_* = w_* + \gamma P(y,x)\,(w - w_*) + \tilde\eta\, D(y,w_*), \tag{1}
$$
where γ ∈ (0, 1) is the so-called compromise parameter. The interaction function P(·,·) ∈ [0, 1] depends on the graphon coordinates x, y ∈ Ω and may also depend on the opinion variables w, w* ∈ I. For instance, in [55] the case P = P(|w|) as a non-increasing function with respect to |w| and such that 0 ≤ P(|w|) ≤ 1 is explored. A different choice, resembling the case of a bounded confidence model like Hegselmann-Krause's [37], can be obtained by setting P = P(|w − w*|). Moreover, we remark that this particular choice has the advantage of ensuring the conservation of the average opinion but presents analytical difficulties arising from the presence of the absolute value. In the rest of the manuscript, we focus on the case where the interaction function only depends on the agents' positions on the graphon. A possible choice could be where d_i(z) is the in-degree of the node at coordinate z ∈ Ω as defined in Definition 2 of Appendix A. Hence interactions depend on the connectivity of each agent: the more connected an agent is, the less it is influenced by the other, while agents with a lower connectivity are affected more. In particular, d_i(x)/d_i(y) ≫ 1 implies that, on average with respect to the random variable η, the agent with the highest degree keeps their opinion; d_i(x)/d_i(y) ≈ 1 implies that agents with a similar number of incoming connections are the ones that can most influence each other; d_i(x)/d_i(y) ≪ 1 implies that the less influential agents tend to adopt the opinion of the more connected ones.
A different possible choice of P(x, y) that would give rise to similar dynamics would be, for instance, P(x, y) = (1 + d_i(x)/d_i(y))^(−α), with α > 0.
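To make the role of the in-degree concrete, the sketch below evaluates P(x, y) = (1 + d_i(x)/d_i(y))^(−α) by numerical quadrature. It rests on two assumptions that are not part of the text: the in-degree is taken to be the integral of B(y, x) over y (the actual definition sits in Appendix A), and the kernel `B` used here is a hypothetical graphon chosen only because its in-degree is easy to check by hand.

```python
def in_degree(B, x, n=1000):
    """Midpoint-rule approximation of d_i(x) = integral of B(y, x) over y in [0, 1].

    Assumption: this is the in-degree convention; the text defers the
    definition to its Appendix A.
    """
    h = 1.0 / n
    return sum(B((k + 0.5) * h, x) for k in range(n)) * h


def compromise_propensity(B, x, y, alpha=1.0):
    """P(x, y) = (1 + d_i(x) / d_i(y)) ** (-alpha), with alpha > 0."""
    return (1.0 + in_degree(B, x) / in_degree(B, y)) ** (-alpha)


# Hypothetical graphon for illustration: connectivity decays with the label,
# so that d_i(x) = (1 - x) / 2.
def B(x, y):
    return (1.0 - x) * (1.0 - y)


hub, leaf = 0.05, 0.95
print(compromise_propensity(B, hub, leaf))   # about 0.05: the hub barely moves
print(compromise_propensity(B, leaf, hub))   # about 0.95: the leaf follows the hub
```

With this choice a highly connected agent meeting a poorly connected one has a small propensity to compromise, which matches the qualitative behavior described above for d_i(x)/d_i(y) ≫ 1.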
In equation (1), η and η̃ are independent and identically distributed centered random variables with finite moments up to order three and such that ⟨η²⟩ = ⟨η̃²⟩ = σ² < +∞. Here we denote by ⟨·⟩ the expectation with respect to the distribution of the random variables. The variables η, η̃ account for random fluctuations in an individual's opinion due to, for example, media exposure. The function D(·,·) ≥ 0 encodes the local relevance of the diffusion; possible choices include D(w) = √(1 − w²). In this case agents diffuse the most if they have an indifferent opinion, that is w ≈ 0, while they are less influenced by external factors once they have settled on one of the two 'extreme' choices. Note that this choice also ensures that opinions stay within I.
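A single binary interaction can be sketched in a few lines. The code assumes a Toscani-type update, w′ = w + γP(x, y)(w* − w) + ηD(w) and symmetrically for w′*, and it assumes uniformly distributed noise; both choices are illustrative, and the function names are hypothetical.

```python
import math
import random


def diffusion(w):
    """Local diffusion D(w) = sqrt(1 - w^2); it vanishes at w = -1 and w = 1."""
    return math.sqrt(max(0.0, 1.0 - w * w))


def binary_interaction(w, w_star, p_xy, p_yx, gamma, sigma, rng=random):
    """One compromise-plus-noise update for a pair of opinions in [-1, 1].

    Assumptions: the update has the form
        w'  = w  + gamma * P(x, y) * (w* - w) + eta  * D(w)
        w*' = w* + gamma * P(y, x) * (w - w*) + eta~ * D(w*),
    and eta, eta~ are uniform on [-sqrt(3) sigma, sqrt(3) sigma], which gives
    zero mean and variance sigma^2. No clipping is applied: whether a noisy
    update stays in [-1, 1] depends on bounds on the noise not modeled here.
    """
    a = math.sqrt(3.0) * sigma
    eta, eta_tilde = rng.uniform(-a, a), rng.uniform(-a, a)
    w_new = w + gamma * p_xy * (w_star - w) + eta * diffusion(w)
    w_star_new = w_star + gamma * p_yx * (w - w_star) + eta_tilde * diffusion(w_star)
    return w_new, w_star_new


# Noise-free, symmetric case: both opinions move toward each other.
w1, w2 = binary_interaction(0.8, -0.4, p_xy=1.0, p_yx=1.0, gamma=0.25, sigma=0.0)
print(w1, w2)  # approximately 0.5 and -0.1; the sum 0.4 is unchanged
```

In the noise-free symmetric case the sum of the two opinions is unchanged while their distance shrinks by the factor 1 − 2γP, which is the compromise mechanism in its simplest form.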
Next we discuss some basic properties of the binary interaction defined before.Under the assumption that the compromise propensity function P satisfies 0 ≤ P (x, y) ≤ 1 and 0 < γ ≤ 1/2, the following Proposition (see, e.g.[50,55,56]) ensures that the post-interaction opinions still belong to the reference interval.then the binary interaction (1) preserves the interval and the post-interaction opinions are such that w ′ , w ′ * ∈ I.
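A direct computation shows that for all w, w* ∈ I
$$
\langle w' + w'_* \rangle = w + w_* + \gamma\,\bigl(P(x,y) - P(y,x)\bigr)(w_* - w). \tag{2}
$$
If P(x, y) = P(y, x), i.e., the compromise function P is symmetric, the mean opinion is preserved in interactions, that is ⟨w′ + w′*⟩ = w + w*. On the other hand, the energy is not conserved on average since
$$
\begin{aligned}
\langle (w')^2 + (w'_*)^2 \rangle = {} & w^2 + w_*^2
 - 2\gamma\,(w - w_*)\bigl(P(x,y)\,w - P(y,x)\,w_*\bigr)
 + \gamma^2 \bigl(P^2(x,y) + P^2(y,x)\bigr)(w - w_*)^2 \\
 & + \sigma^2 \bigl(D^2(x,w) + D^2(y,w_*)\bigr).
\end{aligned}
$$
If σ² ≡ 0 and we have symmetric interactions, the right-hand side reduces to w² + w*² − 2γP(1 − γP)(w − w*)², and we see that the mean energy is dissipated.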
We can now state the evolution equation for the distribution of agents f = f (x, w, t) with respect to their position x ∈ Ω and opinion w ∈ I. Consider a fixed number of players, N , then the binary interactions (1) induce a discrete-time Markov process with N -particle joint probability distribution P N (x 1 , w 1 , x 2 , w 2 , . . ., x N , w N , t).This allows us to write a kinetic equation for the one-marginal distribution function, using only the one-and two-particle distribution functions [22,23], Here, ⟨•⟩ denotes the mean with respect to the random variables η, η.By continuing this process one obtains a hierarchy of equations, the so-called BBGKY-hierarchy [22,23], describing the dynamics of the system of a large number of interacting agents.
It is a standard assumption to neglect correlations, implying that P 2 (x i , w i , x j , w j , t) = P 1 (x i , w i , t)P 1 (x j , w j , t).
By scaling time and performing the thermodynamical limit N → ∞, one can use standard methods of kinetic theory [22,23] to show that the time-evolution of the one-agent distribution function f is governed by the following non-Maxwellian Boltzmann equation where Q(f, f ) is the so-called collisional operator Here (x, ′ w) and (y, ′ w * ) are pre-interaction opinions generating the post-interaction opinions (x, w) and (y, w * ) and ′ J is the Jacobian of the transformation ( ′ w, ′ w * ) → (w, w * ).In equation ( 4) the kernel B(x, y) : Ω 2 → R + is a given graphon.It can be thought of as the continuous equivalent of an adjacency matrix.Its use in (4) allows us to include an underlying network structure on the continuous level.
Recent approaches to opinion formation modeling in the kinetic communities, e.g., [2,5,6,56], take into account a graph structure via some of its statistical descriptions, like for example considering the number of connections of each agents as an adjoint variable.Instead, the use of a graphon kernel allows for a richer and more general description of individual connections among agents.Solutions to equation (3) preserve features of the underlying microscopic interaction rule.To compute the evolution of the mean opinion, we consider the weak formulation of equation ( 3).Let φ(x, w) be a test function, then Setting φ(x, w) ≡ 1 in (5) yields conservation of mass.Furthermore, from the conservation of the microscopic average opinion (2) we see that φ(x, w) = w gives conservation of the mean opinion (again assuming that the interaction function is symmetric).Using conservation of the mean opinion, we can show that for any φ(x, w) = w α φ(x), where φ( • ) is a suitable test function, the macroscopic quantities are conserved in time.Indeed, from (5) choosing φ(x, w) = φ(x)w α we get From the above equation we get-for α = 0, 1-conservation of any weighted macroscopic moment of order α defined in (6).
To investigate the second order moment of f (x, w, t) we introduce the quantities Λ(x, t) := I wf (x, w, t) dw, and Ξ(x, t) that is, the first-and second-order moment, respectively, with respect to w of agents with label x ∈ Ω.Clearly, integrating Λ(x, t) and Ξ(x, t) over Ω gives us the mean opinion and the energy of the population.We will see that for bounded graphons B(x, y) and no diffusion, that is σ = 0, the agent distribution f ( • , w, t) converges toward a Dirac delta distribution centered in the initial mean opinion.This is not surprising since the opinion dynamics corresponds to a consensus formation process modulated by the graphon.Indeed, we have for all x ∈ Ω, Therefore, in the uniform interaction case P (x, y) ≡ 1, even in the presence of non-homogeneous graphon structure supposing ∥B(x, y)∥ L ∞ (Ω×Ω) > 0, we get Thus, we have that the second order moment of f ( • , w, t) tends to its mean squared, and its variance vanishes exponentially fast.Integrating both sides with respect to x ∈ Ω gives the estimate where are the second order moment for the whole population and the global mean opinion, respectively.The latter quantity, thanks to equation ( 6) is conserved in time.Therefore, a bounded graphon kernel implies exponential convergence of the distribution f (x, w, t) toward a Dirac's Delta centered at the initial mean opinion.
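The consensus mechanism described above can be observed in a small particle experiment. The sketch below is a simplified illustration rather than the scheme used in the paper: it assumes all-to-all interactions (B ≡ 1), P ≡ 1, no noise, and a random pairing of agents at each sweep; the function name is hypothetical.

```python
import random


def simulate_consensus(n_agents=2000, n_steps=40, gamma=0.25, seed=0):
    """Pairwise compromise with P = 1, no noise, and all-to-all interactions
    (B = 1). Returns the mean and variance of opinions after every sweep."""
    rng = random.Random(seed)
    w = [rng.uniform(-1.0, 1.0) for _ in range(n_agents)]
    history = []
    for _ in range(n_steps):
        order = list(range(n_agents))
        rng.shuffle(order)
        # Pair agents at random; each pair moves toward its midpoint.
        for i, j in zip(order[0::2], order[1::2]):
            wi, wj = w[i], w[j]
            w[i] = wi + gamma * (wj - wi)
            w[j] = wj + gamma * (wi - wj)
        mean = sum(w) / n_agents
        var = sum((v - mean) ** 2 for v in w) / n_agents
        history.append((mean, var))
    return history


history = simulate_consensus()
print(history[0])    # mean and variance after the first sweep
print(history[-1])   # same mean, variance several orders of magnitude smaller
```

The mean stays fixed up to rounding while the variance shrinks geometrically from sweep to sweep, which is the discrete counterpart of the exponential convergence toward a Dirac delta at the initial mean opinion.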

Derivation of a mean-field description
Since large time statistical properties of the introduced kinetic model are very difficult to obtain, several reduced complexity models have been proposed. In this direction, a deeper insight into the large time distribution of the introduced kinetic model can be obtained in the quasi-invariant regime presented in [55]. The idea is to rescale both the interaction and diffusion parameters, making the binary scheme (1) quasi-invariant. The idea has its roots in the so-called grazing collision limit of the classical Boltzmann equation, see [28,49] and the references therein. The resulting model has the form of an aggregation-diffusion Fokker-Planck-type equation which is capable of encapsulating the information of microscopic dynamics and for which the study of asymptotic properties is typically easier. We consider ϵ ≪ 1 and introduce the following scaling for which w′ ≈ w and w′* ≈ w*. Next we Taylor-expand the term encoding the binary interactions in the weak form of the collision operator of equation (5).
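The effect of the scaling on a single interaction can be written out explicitly. The computation below is a sketch under two assumptions that are not quoted from the text: the scaling is taken to be γ → ϵγ, σ² → ϵσ², as in the quasi-invariant regime of [55], and the interaction is taken to have the form w′ = w + γP(x, y)(w* − w) + ηD(x, w).

```latex
\begin{aligned}
\langle w' - w \rangle &= \epsilon\,\gamma\,P(x,y)\,(w_* - w), \\
\langle (w' - w)^2 \rangle &= \epsilon\,\sigma^2 D^2(x,w)
   + \epsilon^2 \gamma^2 P^2(x,y)\,(w_* - w)^2, \\
\langle \varphi(x,w') - \varphi(x,w) \rangle
  &= \partial_w \varphi(x,w)\,\langle w' - w \rangle
   + \tfrac{1}{2}\,\partial_w^2 \varphi(x,w)\,\langle (w' - w)^2 \rangle
   + \text{remainder}.
\end{aligned}
```

Both moments are of order ϵ, so after dividing by ϵ the first-order term produces the compromise drift and the second-order term produces the diffusion, while the ϵ² contribution and the third-order remainder vanish in the limit.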
Introducing the new time variable τ = ϵt and the corresponding rescaled density g(x, w, τ ) = f (x, w, τ /ϵ) we can rewrite (5) as and R φ (f, f )/ϵ → 0 under the hypothesis ⟨|η| 3 ⟩ < +∞, see [25,55].Consequently, in the limit ϵ → 0 + we get Therefore, integrating back by parts, we formally obtained a Fokker-Planck equation for the evolution of the distribution g(x, w, τ ) where The operator K corresponds to the network-modulated compromise process, the operator H corresponds to the network-weighted density g(x, w, τ ).Equation ( 9) is complemented with no-flux boundary conditions for all x ∈ Ω: This choice of boundary conditions ensures that system (9)- (11) shares the same conservation properties as its microscopic kinetic counterpart.Indeed, we see that the mean opinion is conserved since where we dropped for clarity the dependence on x, w and t for the operators K[g] and H[g].Furthermore, any macroscopic quantity of the form is conserved in time.

Large time agent distribution
Due to the presence of a general compromise propensity P ( • , • ), a closed solution to equation ( 9) is difficult to obtain.Nevertheless, under suitable assumptions on the graphon structure and on the diffusion function, we can write down a closed formulation for the large time agent distribution (9).
In the following we restrict our analysis to the simplified situation where the interactions are homogeneous, i.e., P(x, y) ≡ 1, and the diffusion function is defined as
$$
D(x,w) = \sqrt{1 - w^2}. \tag{14}
$$
We recall that this choice of diffusion function ensures that w stays within the domain I. Furthermore, we suppose separability of the graphon B(x, y), which corresponds to
$$
B(x,y) = B_1(x)\,B_2(y). \tag{15}
$$
From the modeling point of view this choice is coherent with relevant examples of graphon structures, like the graphon associated to the case of scale-free networks as proposed in [17]. Indeed, in this case we have satisfying the introduced separability assumption. Network structures that are found commonly in life and social sciences are often modeled using scale-free networks [10,11,57], i.e., simple graphs whose degree distribution possesses fat tails.
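Separability is easy to check numerically for a power-law kernel. The sketch below assumes the unnormalized form B(x, y) = (xy)^(−α) with 0 < α < 1 as a stand-in for the scale-free graphon; the exact expression and constant used in the paper are not reproduced here, and the helper names are hypothetical.

```python
def power_law_graphon(alpha):
    """Return B(x, y) = (x * y) ** (-alpha) on (0, 1]^2 together with its factor B1.

    Assumption: this unnormalized form, with 0 < alpha < 1, stands in for the
    scale-free graphon of the text; the exact constant is not reproduced here.
    """
    def B1(x):
        return x ** (-alpha)

    def B(x, y):
        return (x * y) ** (-alpha)

    return B, B1


def truncate(B, cap):
    """Bounded version min{B(x, y), cap}; the power-law kernel blows up near 0."""
    return lambda x, y: min(B(x, y), cap)


B, B1 = power_law_graphon(alpha=0.5)
x, y = 0.04, 0.25
print(B(x, y), B1(x) * B1(y))   # both approximately 10: the kernel factorizes
print(truncate(B, 5.0)(x, y))   # 5.0: the capped kernel
```

Here both factors coincide, B_1 = B_2, so the kernel is symmetric; the cap is included because an unbounded kernel has to be truncated before it can be used as an interaction frequency in a simulation.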
From (13) we define the weighted mass and momentum as Note that both quantities ρ and µ are conserved in time.
Assuming relation (15) holds, the steady state g ∞ (x, w) of the Fokker-Planck model ( 9) satisfies the following equation Due to mass conservation and definitions (16) we can simplify it further and obtain For our particular choice of diffusion function, that is (14), we can compute the steady state of g ∞ (x, w) explicitly, see [55].In particular, setting λ = σ 2 /γ, we get which, as a function of the opinion, is a Beta distribution, weighted by the in-degree d i (x) at x ∈ Ω times a graphon-dependent constant C B which depends on the way the splitting B(x, y) = B 1 (x)B 2 (y) is obtained and such that the right-hand-side of equation ( 17) has unitary mass.Model parameters appearing in the steady state allow to get insights on the shape and other characteristics of the equilibrium opinion distribution: for instance, when µ = 0, g ∞ is an even function of w, so that the population has a neutral opinion on average.On the other hand, the balance between the actions of compromise and self-thinking dynamics expressed by the parameter λ tells us that if the action of the self-thinking is much stronger than the compromise one, i.e., λ ≫ 1, then the tendency of the population would be to polarize at the extremes, tending to a mixture of Dirac's deltas at the boundary points of I. We refer the interested reader to [55] for an in-depth analysis of the roles of interactions parameters on the equilibrium distribution.
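The influence of λ and of the mean opinion on the equilibrium profile can be explored numerically. The sketch below assumes the Beta form known from [55], a density in w proportional to (1 + w)^((1+m)/λ − 1) (1 − w)^((1−m)/λ − 1) on [−1, 1], where m stands for the normalized mean opinion; the graphon-dependent prefactor of (17) is left out, and the function names are hypothetical.

```python
import math


def beta_steady_state(w, m, lam):
    """Normalized density on (-1, 1) proportional to
    (1 + w)^((1 + m)/lam - 1) * (1 - w)^((1 - m)/lam - 1)."""
    a, b = (1.0 + m) / lam, (1.0 - m) / lam
    # Log of 2^(a + b - 1) * Beta(a, b), the integral of the unnormalized profile.
    log_norm = ((a + b - 1.0) * math.log(2.0)
                + math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b))
    return math.exp((a - 1.0) * math.log1p(w) + (b - 1.0) * math.log1p(-w) - log_norm)


def moments(m, lam, n=20000):
    """Midpoint-rule mass and mean of the density (accurate when lam < 1 - |m|)."""
    h = 2.0 / n
    grid = [-1.0 + (k + 0.5) * h for k in range(n)]
    mass = sum(beta_steady_state(w, m, lam) for w in grid) * h
    mean = sum(w * beta_steady_state(w, m, lam) for w in grid) * h
    return mass, mean


print(moments(m=0.2, lam=0.25))                 # approximately (1.0, 0.2)
print(beta_steady_state(0.0, m=0.0, lam=4.0) <
      beta_steady_state(0.99, m=0.0, lam=4.0))  # True: mass piles up near the boundary
```

For small λ the profile is unimodal with mean m, whereas for λ ≫ 1 both exponents become negative and the density concentrates at ±1, in line with the polarization described above.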

Analytical properties
We continue by discussing some analytical properties for solutions to (9).First we show that (9) preserves the L 1 regularity.To this end, we may rewrite equation (9) as Note that we will again consider a specific form of diffusion, that is D(w) = √ 1 − w 2 .Furthermore, we introduce the following quantities, dependent on the compromise propensity function ρ P (x, τ ) := Ω×I B(x, y)P (x, y)g(y, w, τ ) dy dw, µ P (x, τ ) := 1 ρ P (x, τ ) Ω×I B(x, y)P (x, y)w g(y, w, τ ) dy dw.
Note that µ_P(·, τ) is well-defined since we are considering P to be positive almost everywhere and the graphon B to be nonnegative as well. Next, we take, for a given parameter ξ, a regularized non-decreasing approximation of the sign function sgn_ξ(·), and then introduce the anti-derivative of sgn_ξ[g(x, w, τ)](w) for every w ∈ I as the function |g(x, w, τ)|_ξ(w), where we stress the dependence of these functions on the variable w. Now, let us fix x ∈ Ω, multiply each side of equation (18) and integrate with respect to w. This gives where we used the boundary conditions (11). Since we can substitute this expression and obtain Now, the first integrand on the right-hand side vanishes as ξ → 0⁺ if we integrate by parts one more time (since by construction lim_{ξ→0⁺} sgn_ξ[g(x, w, τ)](w) g(x, w, τ) = |g(x, w, τ)|(w) for almost every w ∈ I). This leaves us with the second integrand, which is nonnegative since we chose a non-decreasing approximation of the sign function. Finally, since graphons are nonnegative by definition at all points in their domain, we conclude that for all x ∈ Ω. This implies that an initial datum in L^1(I) ensures that g(x, w, τ) ∈ L^1(I) for all τ > 0.
Remark 1.The weak contractivity of the L 1 norm with respect to the opinion also allows us to prove uniqueness of solutions to (18).The proof is based on contradiction; assume there exist two solutions g(x, w, τ ) and s(x, w, τ ) and evaluate the regularized modulus of their difference for each point x ∈ Ω.If we fix x ∈ Ω, then due to linearity with respect to w we have that g(x, w, τ ) − s(x, w, τ ) is a solution to equation (18), too.Therefore which implies that, at x ∈ Ω, g = s for almost all w ∈ I and τ > 0 since by construction we have g(x, w, 0) = s(x, w, 0) for all x ∈ Ω and w ∈ I.The claim then follows since x ∈ Ω was chosen arbitrarily.
Remark 2. The weak contractivity of the L 1 norm with respect to w (i.e., the norm is not increasing in time) gives us as a corollary that the model ( 18) is positivity-preserving.The claim follows noting that its solution g(x, w, τ ) has a vanishing negative part if the initial datum is nonnegative.Indeed, we can express the negative part of g(x, w, τ ) via the regularization we introduced earlier, that is This way, if we integrate with respect to w, we have thanks to the first boundary condition in (11).Then it holds for all x ∈ Ω, which implies that the nonnegativity of the initial datum is preserved by the model (18).
We conclude this section by extending the previous regularity result: if the initial datum is in L^p(I), p > 1, at x ∈ Ω, then the solution satisfies g ∈ L^p(I) at x for all τ > 0. We will use both the positivity-preserving property and the L^1 regularity of the solution in the following.
We note that we only show L^p-regularity for p ≥ 2, since the result then also holds for p ∈ (1, 2) due to the boundedness of the interval I. The idea is to rewrite equation (18) and impose the associated no-flux boundary conditions in order to estimate the time evolution of ∥g∥_{L^p} using integration by parts. The hypothesis p ≥ 2 is needed since we will take the derivative of g^{p−1}(x, w, τ) under the integral sign, but as stated above it is not restrictive.
We define the right hand side of (18) as Q(g, g)(x, w, τ ) that is Suppose that the following no-flux boundary conditions hold for all x ∈ Ω and for all τ > 0 Now let us multiply each side of equation ( 9) by g p−1 (x, w, τ ) and integrate with respect to w Now the goal is to show that T 2 is non-positive, so that it can be ignored in the estimate, and then focus on T 1 .Starting with T 2 , we integrate by parts and use the second boundary condition in ( 19) to obtain since w ∈ I, g(x, w, τ ) is nonnegative and ρ P (x, τ ) is nonnegative at all x ∈ Ω.Next we consider T 1 , and derive two different estimates for it.If we expand the derivative with respect to w under the integral sign, we have On the other hand, we could as well integrate by parts and using the first boundary condition in (19) and get If we now use the identity to replace T 1 as the appropriate convex combination of the two equations we obtain: Putting everything together we deduce that for a given x ∈ Ω.Then Gronwall's lemma implies that if the initial datum belongs to L p (I) at x ∈ Ω, then g(x, w, τ ) ∈ L p (I) at x for all τ > 0.

Declustering: preventing consensus via control strategies
In this section we focus on control strategies to prevent consensus, in particular the consensus driven by the compromise process. This objective is different from more common optimal control strategies, which would for example steer the average opinion to a given target [1,3,5]. To this end, we propose an additional interaction to prevent the formation of opinion clusters by enforcing a controlled interaction. In particular, we consider a convex combination of two updates weighted by the parameter θ ∈ (0, 1) such that a fraction 1 − θ of the population follows an opinion transition of the type (1), whereas a fraction of size θ follows an opinion update given by a controlled interaction of the form where u* is an agent-based control arising from the solution of a suitable optimization problem and S(w) ≥ 0 is a suitable selection function dependent on the opinion. In particular, the optimization problem focuses on the minimization of a suitable convex cost functional J on the set U of admissible controls, which in our case are those such that the post-interaction opinion w′′ stays within the interval I and has the form The quantity w′′* appearing on the right-hand side of (21) is a virtual update which the functional J would be subject to: in fact, the optimal control u* is the solution of the following optimization problem where m denotes the average opinion of the population on the network at time t and ν > 0 is a regularization parameter. Notice that the actual update w′′ and the virtual update w′′* have an opposite effect on w. Solving the associated Lagrange-multiplier problem and so the resulting interaction is In particular, in the rest of the paper we focus on the selection function
Remark 3. Thanks to the presence of the indicator function in (24), we can verify that w′′ ∈ [−1, 1] and therefore that the controlled interaction is admissible.
We recall that we balance the two types of interactions: at a rate 1 − θ, agents update their opinion according to (1), and at a rate θ they interact with the external control (23). This yields a kinetic model whose right-hand side is a convex combination of two non-Maxwellian operators The role of the parameter θ ∈ [0, 1] is to model the frequency at which the different kinds of interaction take place: it can be thought of as the percentage of automated users (e.g., bots programmed by a third party) on the network. The operator Q(f, f) is the same as the one introduced in equation (4); the operator Q_u(f)(x, w, t), instead, encodes the controlled update of agents' opinions as prescribed by the elementary interaction (23) and is therefore given by for any test function φ(·,·).

Mean-field limit of the controlled model
In this section we explicitly show how the introduced control is capable of breaking consensus on the mean-field level for suitable choice of the penalization.We proceed like we did in Section 2.1 to derive a more approachable mean-field limit of equation ( 25), using the same scaling for a certain κ ∈ R + .In this case, particular care is needed in treating the weak form (26) due to the presence of the indicator function in the interaction (23).Using Taylor expansion like we did in Section 2.1, yields Here, K[g], H[g] and R φ (g, g) are the same operators as in equations ( 10) and ( 8) in Section 2.1, while we denote which can be shown to go to zero with computations analogous to the ones for R φ (g, g).We continue with the term θA[g](x, w, τ )/ϵ: since in the limit ϵ → 0 + we have where we recall that κ > 0 and |m| ≤ 1.Therefore, we obtain the following Fokker-Planck equation where this time we define Here d i (x) is the in-degree of x as defined in Section A and The associated boundary conditions which are necessary to perform the integration by parts are Remark 4. The mean opinion of the population is preserved.Indeed, multiplying each side of equation ( 28) and then integrating by parts we get thanks to equations ( 29) and ( 12).The evolution of the second order moment is trickier, due to the presence of general graphon kernel and the compromise propensity function.In case of the specific diffusion function D( • , • ) ( 14) we can simplify the expression and obtain where we note Φ(x, τ ) := I g(x, w, τ ) dw. Equation ( 31) is still quite general: further insights on the trend of the second order moment can be found in the specialized setting of Remark 6.

Effects of the penalization coefficient on the quasi-equilibrium distribution
In the following we will compute the quasi-equilibrium distribution of (28). This corresponds to solving which we can write in closed form (due to the no-flux boundary conditions (30)). Then the quasi-equilibrium distribution f_qe is given by where C is a normalizing constant and under the formal assumption that H[g](x, τ)D²(x, w) can be taken to be nonzero almost everywhere at x ∈ Ω and on [−1, 1].
To compare the controlled case to the one obtained in equation (17) we focus on the case $D(x, w) = \sqrt{1 - w^2}$ and, having chosen a unitary compromise propensity function, we can write where we define If we fix $x \in \Omega$, the quasi-equilibrium state of equation (32) is a Beta distribution with respect to the variable $w$. In fact, taking $\theta = 0$ and a separable graphon kernel $B(x, y)$ in equation (32) gives precisely the steady state (17) that we found for the uncontrolled problem. We can exploit our knowledge of the quasi-equilibrium to influence the level of declustering of the system: indeed, the opinion distribution is in its least clustered form when it is uniform, i.e., when $\alpha_- = \alpha_+ = 0$. If we impose these constraints, we can solve them for the penalty term $\kappa$ as a function of the network position $x \in \Omega$ and of time $\tau$, i.e., $\kappa = \kappa(x, \tau)$. We obtain which, under the further hypothesis that $B(x, y) = B_1(x) B_2(y)$ and $P(x, y) \equiv 1$, simplifies to
Remark 5. The constraint $m = 0$ in (33) is necessary to have a uniform distribution over a symmetric domain, while imposing $\mu_1(x, \tau) = 0$ is needed to have $\alpha_- = 0$ and $\alpha_+ = 0$ simultaneously. When $g \approx U(\Omega \times [-1, 1])$, that is, when the distribution is almost uniform over the domain $\Omega \times [-1, 1]$, we have $\rho_1(x, \tau) \approx d_i(x)$, so that Finally, we stress that the choices (33) and (34) that would appear in the controlled update (20) come from the analysis of a quasi-equilibrium state for the distribution $g(x, w, \tau)$, which has been computed from the Fokker-Planck equation (28). This implies that the penalty term is effective in achieving the declustering effect only in a parameter regime in which $\epsilon$ is sufficiently small.
Remark 6. If we consider again the evolution of the second order moment as in (31), assume that , and use the $\kappa = \kappa(x, \tau)$ defined in equation (33), we obtain This estimate can be further simplified if $m = \Lambda(x, \tau) = 0$ for all $x \in \Omega$ (a condition needed to observe a centered uniform distribution): In particular, by Gronwall's inequality, whenever the graphon kernel is bounded the variance converges exponentially in time toward $1/3$, since integrating both sides with respect to $x \in \Omega$ gives where $1/3$ is the variance of the uniform distribution over $\Omega \times I$.
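The link between vanishing exponents, the uniform state and the limiting variance $1/3$ can be checked numerically. The following Python sketch assumes a generic Beta-type density $g(w) \propto (1 + w)^{\alpha_+} (1 - w)^{\alpha_-}$ on $[-1, 1]$; the pairing of $\alpha_\pm$ with the two factors and the function name are illustrative assumptions, not the exact expression of the quasi-equilibrium above.

```python
import numpy as np


def beta_type_moments(alpha_minus, alpha_plus, n=200_000):
    """Mean and variance of g(w) ~ (1 + w)^alpha_plus * (1 - w)^alpha_minus on [-1, 1].

    Assumed generic Beta-type form. A midpoint rule is used so that the
    endpoints w = -1 and w = 1 are never evaluated.
    """
    h = 2.0 / n
    w = -1.0 + (np.arange(n) + 0.5) * h           # cell midpoints
    g = (1.0 + w) ** alpha_plus * (1.0 - w) ** alpha_minus
    g /= g.sum() * h                               # normalize to unit mass
    mean = (w * g).sum() * h
    var = ((w - mean) ** 2 * g).sum() * h
    return mean, var


# alpha_minus = alpha_plus = 0 gives the uniform density on [-1, 1]
mean_u, var_u = beta_type_moments(0.0, 0.0)
# positive exponents concentrate mass around the center (clustered state)
mean_c, var_c = beta_type_moments(4.0, 4.0)
print(var_u, var_c)
```

In this sketch the uniform case recovers the variance $1/3$, while positive exponents give a strictly smaller variance, which is the clustered regime the penalty term is meant to avoid.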

Numerical tests
We conclude by illustrating the declustering strategy with various computational experiments. We first show the consistency of the quasi-invariant limit of the controlled model (25) in the network-homogeneous case. In this case, we choose a uniform, constant graphon kernel $B(x, y) \equiv c \in (0, 1]$; in particular, $B(x, y) \equiv 1$. In the second example we check the quasi-invariant limit using the power-law graphon, which is separable, to model interactions on a scale-free network. In the third experiment we illustrate the dynamics for non-separable graphon kernels. All tests were performed using direct simulation Monte Carlo methods for the Boltzmann equation (25); we refer to [49,50] and references therein for further details.
We start describing our method by rewriting equation (25) in strong form as a sum of gain and loss parts: We indicate with $Q^{\Sigma}$ and $Q^{\Sigma}_u$ the operators obtained by replacing the graphon kernel $B(x, y)$ with the approximated version $B^{\Sigma}(x, y)$, given by $B^{\Sigma}(x, y) := \min\{B(x, y), \Sigma\}$, where $\Sigma$ is an upper bound for $B(x, y)$ over $\Omega^2$. Whenever we consider an unbounded interaction kernel (e.g., the power-law case) we choose a suitable truncation value for $\Sigma$. If we now highlight the gain and loss parts of $Q^{\Sigma}$ and $Q^{\Sigma}_u$, we have where we define Then, we discretize the time interval $[0, T]$ with time step $\Delta t > 0$ and denote by $f^n(x, w)$ the approximation of $f(x, w, n\Delta t)$; we then consider the forward-Euler-type scheme where we define We remark that under the condition $\Sigma \Delta t \le 1$, $f^{n+1}$ is well-defined as a probability density. We report in Table 1 all the parameters used in our computational experiments. Moreover, we always fix $D(x, w) = \sqrt{1 - w^2}$ as diffusion function and $P(x, y) \equiv 1$ as compromise tendency function. Finally, we consider as initial distribution a state close to full consensus, represented by a truncated Gaussian distribution over the interval $I$ for all $x \in \Omega$ where $u_0 = 0$, $\sigma_0^2 = 1/10$ and $C > 0$ is a normalization constant.
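The ingredients just described (kernel truncation, the constraint $\Sigma \Delta t \le 1$ and the near-consensus initial state) can be sketched in Python as follows. This is a minimal illustration under stated assumptions: network positions are drawn uniformly on $\Omega = [0, 1]$, pairs are selected with a Nanbu-type acceptance step, and all function names are hypothetical. The binary interaction rule itself is not reproduced here.

```python
import numpy as np


def power_law_graphon(x, y):
    """Power-law graphon B(x, y) = 9/16 * (x * y)^(-1/4), unbounded near 0."""
    return 9.0 / 16.0 * (x * y) ** (-0.25)


def truncated_kernel(kernel, x, y, sigma):
    """Truncated kernel min{B(x, y), Sigma}."""
    return np.minimum(kernel(x, y), sigma)


def sample_truncated_gaussian(n, rng, u0=0.0, var0=0.1):
    """Draw n opinions from a Gaussian N(u0, var0) restricted to I = [-1, 1]."""
    samples = np.empty(0)
    while samples.size < n:
        cand = rng.normal(u0, np.sqrt(var0), size=2 * n)
        samples = np.concatenate([samples, cand[np.abs(cand) <= 1.0]])
    return samples[:n]


def select_pairs(x, kernel, sigma, dt, rng):
    """Form disjoint random pairs; accept each with probability dt * min{B, Sigma}.

    Assumed Nanbu-type selection: Sigma * dt <= 1 keeps this a probability.
    """
    if sigma * dt > 1.0:
        raise ValueError("need sigma * dt <= 1 for a valid probability")
    half = x.size // 2
    perm = rng.permutation(x.size)
    i, j = perm[:half], perm[half:2 * half]
    accept = rng.random(half) < dt * truncated_kernel(kernel, x[i], x[j], sigma)
    return i[accept], j[accept]


rng = np.random.default_rng(0)
n_agents = 10_000
x = rng.uniform(0.0, 1.0, n_agents)             # assumed uniform network positions
w = sample_truncated_gaussian(n_agents, rng)    # opinions close to consensus
i, j = select_pairs(x, power_law_graphon, sigma=5.0, dt=0.1, rng=rng)
```

The returned index arrays `i` and `j` identify the agents that would interact in one time step; agents that are not selected keep their opinion, which mirrors the loss part of the scheme.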

Consistency of the mean-field model: the network-homogeneous case
We take model (28) with $B(x, y) \equiv 1$, which corresponds to a fully connected network, in which every node is adjacent to every other. Using the quasi-invariant scaling, we again approximate the dynamics of (25) by the one-dimensional Fokker-Planck equation Note that we choose the penalty coefficient $\kappa$ as in equation (34), which corresponds to the optimal scaling to ensure declustering. From Figure 1 we can see that as $\epsilon$ approaches zero, the controlled update becomes fully effective and the state relaxes toward a uniform distribution. This is also confirmed by Figure 2, where we report the profile of the distribution $g(w, \tau)$ at time $\tau = T$, with $T = 8$. We also report the evolution from $\tau = 0$ to $\tau = T$ of the entropy, computed as $H[g](\tau) = -\int_I g(w, \tau) \log g(w, \tau) \, dw$. We can see that when $\epsilon \ll 1$ the entropy approaches the value $\log(2)$, which corresponds to the entropy of the uniform distribution over the interval $I$.
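The reference value $\log(2)$ can be reproduced with a few lines of Python. The sketch below assumes the Shannon-type entropy $-\int_I g \log g \, dw$ (the sign convention that yields $\log 2$ for the uniform density on $I$) and a midpoint grid; the function name is illustrative.

```python
import numpy as np


def entropy(g, h):
    """Discrete version of -int g log g dw on a uniform grid, with 0 log 0 := 0."""
    g = np.asarray(g, dtype=float)
    safe = np.where(g > 0.0, g, 1.0)     # log(1) = 0 wherever g vanishes
    return -np.sum(g * np.log(safe)) * h


n = 100_000
h = 2.0 / n
w = -1.0 + (np.arange(n) + 0.5) * h      # midpoints of I = [-1, 1]

uniform = np.full(n, 0.5)                # uniform density on I
gauss = np.exp(-w**2 / (2.0 * 0.1))      # near-consensus state, variance 1/10
gauss /= gauss.sum() * h                 # normalize on I

H_uniform = entropy(uniform, h)
H_gauss = entropy(gauss, h)
print(H_uniform, np.log(2.0), H_gauss)
```

The concentrated Gaussian profile has lower entropy than the uniform one, so an entropy curve increasing toward $\log(2)$ signals declustering.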

Consistency of the mean-field model: the power-law network case
Next, we take model (28) with $B(x, y) = \frac{9}{16}(xy)^{-1/4}$, i.e., the power-law graphon. We recall that this special choice yields equation (9). Since the power-law graphon is separable, the optimal value for the penalty term is the $\kappa$ of equation (34), as for the network-homogeneous case. Figure 3 shows the evolution of $g(x, w, \tau)$ for different values of $\epsilon$. Since the distribution depends on both the opinion and the network position, the first row shows $g(x, w, \tau)$ at three instants in time, namely $\tau = 0$, $\tau = T/2$ and $\tau = T$, where this time we fix $T = 32$. The second row of plots in Figure 3 shows the opinion marginal $\int_\Omega g(x, w, \tau) \, dx$ for different values of $\epsilon$. In Figure 4 we report again, for ease of viewing, the power-law graphon kernel that we use in our simulations and the time evolution of the entropy, computed as Again, we see that in the limit $\epsilon \to 0^+$ the state reaches a uniform distribution over $\Omega \times I = [0, 1] \times [-1, 1]$, since the power-law graphon is defined on the entire unit square $[0, 1]^2$.
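Two quantities used in this test are easy to evaluate on a grid: the degree of the power-law graphon and the opinion marginal. The Python sketch below is illustrative only (grid sizes and function names are assumptions). For this kernel, direct integration gives the degree $d(x) = \int_0^1 B(x, y) \, dy = \frac{3}{4} x^{-1/4}$, which the quadrature should reproduce.

```python
import numpy as np


def power_law_graphon(x, y):
    """Power-law graphon B(x, y) = 9/16 * (x * y)^(-1/4)."""
    return 9.0 / 16.0 * (x * y) ** (-0.25)


def midpoints(a, b, n):
    """Midpoints and cell width of a uniform grid on [a, b]."""
    h = (b - a) / n
    return a + (np.arange(n) + 0.5) * h, h


def degree(x, n=200_000):
    """d(x) = int_0^1 B(x, y) dy by the midpoint rule (integrable singularity at y = 0)."""
    y, h = midpoints(0.0, 1.0, n)
    return np.sum(power_law_graphon(x, y)) * h


def opinion_marginal(g, hx):
    """Integrate g[k, l] ~ g(x_k, w_l) over x to obtain the marginal in w."""
    return g.sum(axis=0) * hx


x, hx = midpoints(0.0, 1.0, 64)
w, hw = midpoints(-1.0, 1.0, 128)
g_uniform = np.full((x.size, w.size), 0.5)    # uniform density on [0, 1] x [-1, 1]
marginal = opinion_marginal(g_uniform, hx)    # constant 1/2 in w

d_half = degree(0.5)
print(d_half, 0.75 * 0.5 ** (-0.25))
```

Since the kernel is symmetric, the same function gives both the in-degree and the out-degree.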

Declustering on non-separable networks
The last computational experiments illustrate the dynamics in the case of non-separable graphons: the k-NN graphon and the small-world graphon. The first one models k-NN networks as described, e.g., in [27,57,58] and is defined ([27]) as where $\chi(\cdot)$ is the indicator function and $p, r \in (0, 1)$ are constant real numbers, while the small-world graphon (see, e.g., [27,57,58]) can be defined as In Figure 5 we report the surface plots of both graphon kernels for the fixed values $r = 1/8$ and $p = 3/4$. We consider model (28) with scaling parameter $\epsilon = 10^{-3}$ and let it evolve in time until $T = 8$, where for this test we use the network- and time-dependent optimal penalty coefficient $\kappa_{x,\tau}$ given in equation (33). As we can see, $g(x, w, \tau)$ approaches a uniform distribution over both network topologies.
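Whether a kernel is separable can be probed numerically: a kernel of the form $B_1(x) B_2(y)$ samples to a rank-one matrix, so its second singular value vanishes. The Python sketch below illustrates this check. The band kernel is a hypothetical stand-in for a non-separable kernel and is not the k-NN or small-world definition used in this section.

```python
import numpy as np


def kernel_matrix(kernel, n=200):
    """Sample a kernel at the midpoints of an n x n grid on [0, 1]^2."""
    h = 1.0 / n
    x = (np.arange(n) + 0.5) * h
    return kernel(x[:, None], x[None, :])


def separability_defect(kernel, n=200):
    """Ratio s_2 / s_1 of the two largest singular values of the sampled kernel.

    A separable kernel B1(x) * B2(y) gives a rank-one matrix, hence a ratio
    equal to zero up to rounding.
    """
    s = np.linalg.svd(kernel_matrix(kernel, n), compute_uv=False)
    return s[1] / s[0]


def power_law(x, y):
    """Separable power-law graphon 9/16 * (x * y)^(-1/4)."""
    return 9.0 / 16.0 * (x * y) ** (-0.25)


def band(x, y, r=1.0 / 8.0):
    """Hypothetical non-separable kernel: indicator of |x - y| <= r (illustration only)."""
    return (np.abs(x - y) <= r).astype(float)


defect_power_law = separability_defect(power_law)
defect_band = separability_defect(band)
print(defect_power_law, defect_band)
```

A defect close to zero indicates that the simplified, separable form of the penalty may be used, while a large defect calls for the position- and time-dependent coefficient.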

Conclusion
In this paper we proposed a simple yet very efficient optimal control strategy to break consensus in a kinetic model for opinion formation on graphons. The proposed approach allows us to include complex microscopic features, such as social networks, in the continuum limit and to understand the impact of simple declustering mechanisms.
In doing so, we investigated the uncontrolled and controlled models in the mean-field limit. We then studied the large time behavior and were able to write down closed form solutions of the (quasi-)stationary agent distribution of the uncontrolled and controlled problems for certain choices of parameters. This formulation allows us to identify the necessary controls to prevent consensus and steer the crowd toward a uniform distribution. We corroborated our analytical results with computational experiments for various types of graphons. The numerical results confirm our theoretical findings and the success of the proposed declustering strategy. Extensions of the designed approach to include dynamic networks for fully nonlinear equations are currently under study and will be presented in future research.

A Basic definitions on graphs and graphons
We follow here [17,35] to give a brief overview on graphs and graphons.
A (simple, unweighted) graph $G$ is a pair of sets: $V(G)$, which indicates the vertices, or nodes, of $G$, and $E(G)$, which refers to the edges between nodes, pairwise distinct. Two connected vertices are also said to be adjacent, and in this spirit we can describe the graph $G$ via its associated adjacency matrix $A_{ij}(G)$, where $A_{ij} = 1$ if and only if $(i, j) \in E(G)$, that is, if nodes $i$ and $j$ are connected. A way to represent the adjacency matrix of a graph is its pixel picture, i.e., we discretize the unit square $[0, 1]^2 \subset \mathbb{R}^2$ into a grid of $N^2$ squares of side $1/N$, where $N$ is the number of nodes of $G$. Then the square whose north-west corner is $((i - 1)/N, (j - 1)/N)$ is painted black if and only if $A_{ij}(G) = 1$. Since matrices are labeled starting from their upper-left corner, the same happens for the pixel pictures, where the origin is at their upper-left corner as well. If we let the number $N$ become large, we see that the pixel picture "converges" toward a grayscale image. For example, if we consider a random graph where each node is linked to another with probability $1/2$, the pixel picture seems to approach the constant function $1/2$ over $[0, 1]^2$. This concept of limit of a sequence of graphs can be made rigorous by introducing the notion of graphon.
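The "convergence" of pixel pictures can be observed numerically by averaging the adjacency matrix of a random graph over a fixed coarse grid. The following Python sketch is an illustration under assumptions (undirected graph, edge probability $1/2$, a $4 \times 4$ coarse grid, hypothetical function names).

```python
import numpy as np


def random_adjacency(n, p, rng):
    """Adjacency matrix of a simple undirected random graph with edge probability p."""
    upper = np.triu(rng.random((n, n)) < p, k=1)
    return (upper | upper.T).astype(int)


def coarse_pixel_picture(adj, blocks):
    """Average the pixel picture over a blocks x blocks grid (gray levels in [0, 1])."""
    n = adj.shape[0]
    size = n // blocks
    trimmed = adj[:size * blocks, :size * blocks]
    return trimmed.reshape(blocks, size, blocks, size).mean(axis=(1, 3))


rng = np.random.default_rng(1)
for n in (40, 400, 2000):
    gray = coarse_pixel_picture(random_adjacency(n, 0.5, rng), blocks=4)
    # maximal deviation of the gray levels from the constant 1/2
    print(n, np.abs(gray - 0.5).max())
```

As the number of nodes grows, the gray levels of all coarse cells get closer to $1/2$, in line with the heuristic picture described above.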
We define a (labeled) graphon as a Lebesgue-measurable function from $\Omega^2 \subseteq [0, 1]^2$ to $\mathbb{R}_+$, with the tacit assumption that we identify graphons that are equal almost everywhere. An unlabeled graphon is a labeled graphon to which an invertible, measure-preserving map from $[0, 1]$ to $[0, 1]$ (called a re-labeling) is applied.
A familiar example of a graphon is just a finite graph, so that it is sensible to see graphons as a consistent generalization of graphs. Indeed, finite graphs can be described as step functions on $[0, 1]^2$, i.e., measurable functions that are piecewise constant. We can construct the step graphon $W_G$ associated to the finite graph $G$ directly, much as we did to pass from adjacency matrix to pixel picture: where This implies that the space of graphons endowed with the cut metric is complete: indeed, it is the completion of the space of finite graphs when the latter is equipped with the cut norm.
While these results give us a nice framework to analyze graphs when their number of vertices grows very large, an issue arises when we consider the limit of a sequence of sparse graphs. The (edge) density of a graph can be defined as the fraction of its number of edges $|E(G)|$ with respect to the maximum possible, which for simple, directed graphs is |V The form (38) of the density of a graph is a particular case of the $p$-norm for weighted graphs where $\alpha_i$ is the weight of the $i$-th node, $\beta_{ij}$ is the weight of the edge that connects node $i$ with node $j$ (with $\beta_{ij} = 0$ if the two are not adjacent) and finally $\alpha_G$ is the total weight of the graph. Now we see that when we consider an unweighted, directed, simple graph, equation (38) is equation (39) with $p$ equal to 1. We may define the $p$-norm for graphons, too: for $1 \le p < +\infty$ it has the form $\|W\|_p := \left( \int_{\Omega^2} |W(x, y)|^p \, dx \, dy \right)^{1/p}$. This tells us that a sequence of sparse graphs will converge toward one of vanishing density, so that its limit graphon will effectively be the uninteresting null graphon.
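The vanishing density of sparse sequences, and the effect of normalizing by the 1-norm, can be illustrated with step graphons. The Python sketch below assumes the standard $L^p$ norm, which for the step graphon of an $N \times N$ matrix reduces to $(N^{-2} \sum_{i,j} |A_{ij}|^p)^{1/p}$; the edge probability $1/\sqrt{N}$ is an arbitrary illustrative choice of a sparse regime.

```python
import numpy as np


def step_graphon_p_norm(adj, p=1.0):
    """L^p norm of the step graphon of adj: (N^-2 * sum |A_ij|^p)^(1/p)."""
    n = adj.shape[0]
    return (np.sum(np.abs(adj) ** p) / n**2) ** (1.0 / p)


def sparse_random_adjacency(n, rng):
    """Directed random graph with edge probability 1/sqrt(n) (illustrative choice)."""
    adj = (rng.random((n, n)) < n ** -0.5).astype(float)
    np.fill_diagonal(adj, 0.0)
    return adj


rng = np.random.default_rng(2)
densities = []
for n in (100, 400, 1600):
    adj = sparse_random_adjacency(n, rng)
    # for a 0/1 matrix the 1-norm is the fraction of nonzero entries
    densities.append(step_graphon_p_norm(adj, p=1.0))

normalized = adj / step_graphon_p_norm(adj, p=1.0)   # rescaled to unit 1-norm
print(densities, step_graphon_p_norm(normalized, p=1.0))
```

The computed 1-norms decrease with $N$, whereas the rescaled matrix has unit 1-norm for every $N$, which is what makes sequences of sparse graphs comparable.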
To overcome this difficulty, we will consider normalized graphs, i.e., we will divide them by their 1-norm: this will allow us to compare graphons with different densities and to consider meaningful sequences of sparse graphs, with the aim of introducing and studying sparse graphons.
Results in [17] ensure that normalized graphs converge, up to subsequences, to an $L^p$ graphon in the cut metric; so do sequences of step graphons. Moreover, the $L^p$ ball of graphons is compact with respect to the cut metric (up to identification of objects at zero distance).
Finally, we recall the concepts of connectivity and connected component for a graphon.
We say that $W$ is connected if it is not disconnected. A subset $S \subset \Omega$ such that $S$ is connected and $S \cup \{x\}$ is not, for all $x \in \Omega \setminus S$, is called a connected component of the graphon $W$.
Clearly, we can always write a graphon as a disjoint union of its connected components.
as the in-degree of $x$ and the out-degree of $y$ with respect to the graphon $W(x, y)$, respectively. The two definitions agree in the case of undirected, i.e., symmetric, graphons.

Figure 2: Left: comparison of the distribution $g(w, T)$ for various values of $\epsilon$, as in Figure 1. Right: comparison of the time evolution of the entropy $H[g](\tau)$ for the same values of $\epsilon$.

Figure 5: Top row: simulation results for the small-world graphon kernel. Bottom row: simulation results for the k-NN graphon kernel. Column-wise, from left to right: surface plot of the graphon kernel; snapshots of the distribution $g(x, w, \tau)$ at $\tau = 0$, $\tau = 4$ and $\tau = 8$; time evolution of the opinion marginal distribution $g(w, \tau)$.

Figure 6: Example of pixel picture of a 5 × 5 adjacency matrix.

Figure 7: Convergence of pixel pictures of adjacency matrices of random graphs to the function 1/2.

is an interval of length $1/|V(G)|$. Graphons themselves become elements of a metric space (which we refer to as $(\mathcal{G}, \delta_\square)$) once we consider a distance between pairs $(W, U)$ of them, called the cut distance $\delta_\square(W, U) = \inf_{\varphi, \psi} \sup_{S, T} \left| \int_{S \times T} \big[ W(\varphi(x), \varphi(y)) - U(\psi(x), \psi(y)) \big] \, dx \, dy \right|$, where $\varphi$ and $\psi$ are re-labelings while $S$ and $T$ are measurable subsets of $\Omega \subseteq [0, 1]$. Notice that the cut distance is just a pseudo-metric, since $\delta_\square(W, U) = 0$ does not imply $W = U$. There are two main results that are useful when dealing with graphons [35], which we report for convenience.
Theorem 1. Given a graphon $W$ there exists a sequence $(W_n)_n$ such that $\delta_\square(W_n, W) \to 0$ as $n \to +\infty$.
Theorem 2. The space $(\mathcal{G}, \delta_\square)$ is compact.
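For small step graphons the supremum over $S$ and $T$ can be evaluated by brute force, which gives some intuition for the cut distance. The Python sketch below is an illustration under assumptions: it keeps the labeling fixed (no infimum over re-labelings), so the value it returns is only an upper bound on the cut distance, and it relies on the fact that for step functions it suffices to let $S$ and $T$ range over unions of steps. The function name is hypothetical.

```python
import itertools

import numpy as np


def labeled_cut_norm(m):
    """sup over S, T of |sum_{i in S, j in T} m_ij| / N^2, labeling kept fixed.

    For each row subset S the best column set T collects the column sums of
    one sign, so only 2^N row subsets are visited. Feasible for small N only.
    """
    n = m.shape[0]
    best = 0.0
    for mask in itertools.product((0.0, 1.0), repeat=n):
        col_sums = np.asarray(mask) @ m
        pos = col_sums[col_sums > 0].sum()
        neg = -col_sums[col_sums < 0].sum()
        best = max(best, pos, neg)
    return best / n**2


rng = np.random.default_rng(3)
n = 10
adj = (rng.random((n, n)) < 0.5).astype(float)   # step graphon of a random graph
half = np.full((n, n), 0.5)                      # constant graphon 1/2 on the same grid
dist_upper = labeled_cut_norm(adj - half)        # upper bound on the cut distance
print(dist_upper)
```

The exhaustive search has exponential cost in $N$ and is meant only as a sanity check on very small examples.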