The ambiguous role of social influence on the wisdom of crowds: An analytic approach

"Wisdom of crowds" refers to the phenomenon that the average opinion of a group of individuals on a given question can be very close to the true answer. It requires a large group diversity of opinions, but the collective error, the difference between the average opinion and the true value, has to be small. We consider a stochastic opinion dynamics where individuals can change their opinion based on the opinions of others (social influence $\alpha$), but to some degree also stick to their initial opinion (individual conviction $\beta$). We then derive analytic expressions for the dynamics of the collective error and the group diversity. We analyze their long-term behavior to determine the impact of the two parameters $(\alpha,\beta)$ and the initial opinion distribution on the wisdom of crowds. This allows us to quantify the ambiguous role of social influence: it helps to improve the wisdom of crowds only if the initial collective error is large, but in most cases it deteriorates the outcome. In these cases, individual conviction still improves the wisdom of crowds because it mitigates the impact of social influence.


Introduction
The idea to establish social science in the spirit of mathematics and physics dates back to the first half of the 19th century, when Auguste Comte (1798-1854) launched sociology based on the belief that society follows general laws very much like the physical world. At about the same time Adolphe Quetelet (1796-1874) published his "Essays on Social Physics" (1835), where he applied probability theory to data from humans. Later developments in sociophysics [30] tried to adhere to these two approaches: derive a general dynamics applicable to societies, and analyze social data to find universal laws. Our paper aligns with these developments.
We do not enter the controversial discussion of the extent to which sociophysics has really contributed to the understanding of social systems. But a few conceptual frameworks from physics have indeed inspired the discussion about how to formalize social dynamics. At the heart of statistical physics, as proposed in the late 19th century based on the fundamental works by Ludwig Boltzmann (1844-1906) and J. Willard Gibbs (1839-1903), is the problem of how the microscopic dynamics of system elements is linked to the dynamics of macroscopic system variables. This question is of paramount importance also for the description of social and economic systems.
In this paper, we aim at deriving formal expressions for the systemic measures that are able to capture the effect of social influence on the wisdom of crowds (WoC) effect. These measures, as proposed in the literature [17,26], are the collective error, measuring how close the average opinion is to the true value, and the group diversity, measuring the variance of individual opinions. For the wisdom of crowds, it is required that the group diversity is large, while the collective error is small. But it is to be expected that, in the presence of social influence, the group diversity reduces drastically. This bears the risk that individual opinions converge to a common opinion that is far away from the true value. But there may be conditions under which social influence helps to converge to an opinion closer to the truth. With our investigations, we want to better understand the ambiguous role of social influence. Our approach is motivated by (i) published experimental findings [17,27] and (ii) their analysis by means of an agent-based model [22,23]. The novel contribution of this paper is the derivation of analytical expressions for the macroscopic quantities that describe the WoC effect.
So far, agent-based simulations have been used to obtain insights into the evolution of these quantities. But the challenge to solve the problem analytically has not been tackled yet, also because it requires a lot of effort and some patience, as we demonstrate in this paper. Simulation results are usually quicker to obtain and handier, but provide less insight into how the macroscopic indicators for the WoC effect are composed, and what they depend on. Eventually, one wishes to quantify the range of parameters of the opinion dynamics that may lead to an enhancement or a deterioration of the WoC effect.
2 Quantifying the wisdom of crowds

Measures for the wisdom of crowds
Opinion distribution.
Let us consider an experimental situation where subjects had, for instance, to estimate the length of the border of Switzerland, which is a non-zero, positive and possibly large value [17]. Each of these subjects returns an individual estimate, $x_i$, which we call an opinion in the following. The values of $x_i$ are strictly positive and very broadly distributed. As the mentioned experiments have shown, the expected distribution of opinions, $P(x)$, is right-skewed and can be proxied by a log-normal distribution. This means that the logarithms of $x_i$ follow a normal distribution, $\ln x_i \sim \mathcal{N}(\mu_{\ln x}, \sigma^2_{\ln x})$, with mean $\mu_{\ln x}$ and variance $\mathrm{Var}[\ln x] = \sigma^2_{\ln x}$.
Because the underlying distribution is very broad, the arithmetic mean does not represent the average opinion well: it is much larger than most opinion values. Instead, the geometric mean is the appropriate aggregation measure for the average opinion:
$$\exp\langle \ln x \rangle = \Big(\prod_{i=1}^{N} x_i\Big)^{1/N}, \qquad \langle \ln x \rangle = \frac{1}{N}\sum_{i=1}^{N} \ln x_i$$
In the following, we consider that subjects can change their opinion over time either because of random influences or in response to information received from other subjects. The details of these influences are specified in the following section. But from now on, all quantities become time-dependent, i.e. opinions become $x_i(t)$. While their initial values $x_i(0)$ follow a log-normal distribution, the same does not hold for times $t > 0$, because of the assumed influences on the change of opinions.
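The contrast between the two aggregation measures can be illustrated with a short Python sketch. It is only an illustration: the function names, the sample size and the parameter values ($\mu_{\ln x} = \ln 1000$, $\sigma_{\ln x} = 1.5$) are assumptions chosen for the example, not values from the experiments.

```python
import math
import random
import statistics

def sample_opinions(n, mu_ln, sigma_ln, seed=0):
    """Draw n strictly positive opinions with ln x ~ N(mu_ln, sigma_ln^2)."""
    rng = random.Random(seed)
    return [rng.lognormvariate(mu_ln, sigma_ln) for _ in range(n)]

def geometric_mean(opinions):
    """Geometric mean, computed as exp of the mean log-opinion <ln x>."""
    return math.exp(sum(math.log(x) for x in opinions) / len(opinions))

# Hypothetical example values, chosen only for illustration.
opinions = sample_opinions(n=1000, mu_ln=math.log(1000.0), sigma_ln=1.5)
arithmetic = statistics.fmean(opinions)
geometric = geometric_mean(opinions)

# Fraction of opinions that lie below the arithmetic mean.
share_below_arithmetic = sum(x < arithmetic for x in opinions) / len(opinions)
print(arithmetic, geometric, share_below_arithmetic)
```

For a broad log-normal sample, most opinions fall below the arithmetic mean, whereas the geometric mean stays close to the median of the sample.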
The wisdom of crowds is expected to work if the diversity of individual opinions is large, while the deviation of the average opinion from the true value $T$ is small. Therefore, in line with previous studies [17], we will use the group diversity, $D(t)$, and the collective error, $E(t)$, as macroscopic measures to evaluate these conditions. We are particularly interested in the time-dependent change and the stationary values of these measures, as they can indicate under which conditions the wisdom of crowds will break down. To analyze these conditions is the aim of our paper.
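Collective error. $E(t)$ shall be defined as the squared deviation of the average opinion from the true value, $T$:
$$E(t) = \left[\ln T - \langle \ln x(t)\rangle\right]^2 \qquad (3)$$
Its dynamics follows as:
$$\frac{dE(t)}{dt} = -2\left[\ln T - \langle \ln x(t)\rangle\right]\frac{d\langle \ln x(t)\rangle}{dt} \qquad (4)$$
Group diversity. We express $D(t)$ by the variance of the opinion distribution:
$$D(t) = \mathrm{Var}[\ln x(t)] \qquad (5)$$
In order to derive a dynamic equation for the group diversity we use the delta method to approximate the variance. This method is in essence a first-order Taylor expansion of the form:
$$\mathrm{Var}[f(X)] \approx \left[f'(\langle X\rangle)\right]^2\,\mathrm{Var}[X] \qquad (6)$$
The method will be a poor approximation in cases where $f(X)$ is highly non-linear. This is not the case when $f(X) = \ln X$. For the calculation we write
$$x_i(t) = \langle x(t)\rangle + \delta_i(t) \qquad (7)$$
where $\delta_i(t)$ is an individual's deviation from the average opinion, with $\langle\delta(t)\rangle = 0$. With this notation and Eqn. (6), the group diversity becomes:
$$D(t) \approx \frac{\langle\delta^2(t)\rangle}{\langle x(t)\rangle^2} \qquad (8)$$
This compact expression allows us to derive the dynamics in the form:
$$\frac{dD(t)}{dt} = \frac{1}{\langle x(t)\rangle^2}\frac{d\langle\delta^2(t)\rangle}{dt} - \frac{2\,\langle\delta^2(t)\rangle}{\langle x(t)\rangle^3}\frac{d\langle x(t)\rangle}{dt} \qquad (9)$$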
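As an illustration of the two measures defined in this section, the following Python sketch computes the collective error as the squared distance between $\ln T$ and $\langle\ln x\rangle$, the group diversity as the variance of the log-opinions, and the delta-method approximation $\langle\delta^2\rangle / \langle x\rangle^2$. The function names are hypothetical, and the sketch assumes population (not sample) variances.

```python
import math

def collective_error(opinions, truth):
    """Squared deviation of the mean log-opinion from ln(truth)."""
    mean_log = sum(math.log(x) for x in opinions) / len(opinions)
    return (math.log(truth) - mean_log) ** 2

def group_diversity(opinions):
    """Population variance of the log-opinions, Var[ln x]."""
    logs = [math.log(x) for x in opinions]
    mean_log = sum(logs) / len(logs)
    return sum((v - mean_log) ** 2 for v in logs) / len(logs)

def group_diversity_delta(opinions):
    """Delta-method approximation: Var[x] / <x>^2 = <delta^2> / <x>^2."""
    mean_x = sum(opinions) / len(opinions)
    var_x = sum((x - mean_x) ** 2 for x in opinions) / len(opinions)
    return var_x / mean_x ** 2

# Narrow example distribution, where the first-order expansion works well.
narrow = [95.0, 100.0, 105.0]
print(group_diversity(narrow), group_diversity_delta(narrow))
```

The two diversity values agree closely for narrow distributions; the agreement degrades as the spread of opinions relative to the mean grows, since the approximation is only first order.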

Opinion dynamics
To formalize how the opinion $x_i(t)$ of each subject changes over time, we build on the framework of Brownian agents [29], which considers a superposition of deterministic and stochastic influences on the dynamics:
$$\frac{dx_i(t)}{dt} = \sum_j F_{ij}(t) + \beta\left[x_i(0) - x_i(t)\right] + A\,\xi_i(t) \qquad (10)$$
$x_i(0)$ denotes the initial value. The parameter $\beta$ describes the individual conviction about the own opinion. The larger $\beta$, the more an agent tries to stick to the initial opinion. $\xi_i(t)$ is Gaussian white noise, i.e. it is not correlated in time, $\langle\xi_i(t)\xi_i(t')\rangle = \delta(t-t')$, and zero on average, $\langle\xi_i(t)\rangle = 0$.
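$A$ denotes the strength of the stochastic force. The term $F_{ij}(t)$ describes how the change of opinion of agent $i$ is influenced by the opinion of other agents $j$. Here, we assume that agents have information only about the average opinion of all other agents, which is equivalent to a mean-field scenario. This is reflected in the following assumption for $F_{ij}(t)$:
$$F_{ij}(t) = \frac{\alpha}{N}\left[x_j(t) - x_i(t)\right] \qquad (11)$$
The parameter $\alpha$ describes the strength of the social influence of other opinions $x_j(t)$ on the opinion $x_i(t)$. In our ansatz the social influence from other opinions increases with the difference between opinions. While this sounds like a simplifying assumption, it has been empirically justified in [23], therefore we use it here. Because the coupling variable $\alpha$ is effectively a constant, equal for all $i$, we have:
$$\sum_j F_{ij}(t) = \alpha\left[\langle x(t)\rangle - x_i(t)\right], \qquad \langle x(t)\rangle = \frac{1}{N}\sum_j x_j(t) \qquad (12)$$
where $\langle x(t)\rangle$ is denoted as the mean opinion in the following. Eqn. (10) therefore results in the stochastic dynamics:
$$\frac{dx_i(t)}{dt} = \alpha\left[\langle x(t)\rangle - x_i(t)\right] + \beta\left[x_i(0) - x_i(t)\right] + A\,\xi_i(t) \qquad (13)$$
Averaging the dynamics of Eqn. (13) over the whole agent population yields a simple linear form:
$$\frac{d\langle x(t)\rangle}{dt} = \beta\left[\langle x(0)\rangle - \langle x(t)\rangle\right] + A\,\langle\xi(t)\rangle \qquad (14)$$
where $\langle\xi(t)\rangle = \frac{1}{N}\sum_i \xi_i(t)$ is the noise averaged over the population. This is a standard Ornstein-Uhlenbeck process with the solution:
$$\langle x(t)\rangle = \langle x(0)\rangle + A\int_0^t e^{-\beta(t-s)}\,\langle\xi(s)\rangle\,ds \qquad (15)$$
Therefore the time average of the ensemble average of the opinions, $\langle x(t)\rangle$, equals $\langle x(0)\rangle$ for large $t$.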
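The opinion dynamics described above can be explored numerically. The sketch below is a minimal Euler-Maruyama integration, assuming that each opinion follows $dx_i/dt = \alpha[\langle x\rangle - x_i] + \beta[x_i(0) - x_i] + A\xi_i(t)$, which is how the text describes the mean-field coupling, the individual conviction and the noise term. The function name, the step size and the example values are assumptions made for the illustration.

```python
import math
import random

def simulate_opinions(x0, alpha, beta, noise=0.0, dt=0.01, steps=2000, seed=0):
    """Euler-Maruyama integration of the assumed mean-field opinion dynamics.

    x0    : list of initial opinions x_i(0)
    alpha : social influence, beta : individual conviction
    noise : strength A of the stochastic force (0.0 gives the noise-free limit)
    """
    rng = random.Random(seed)
    x = list(x0)
    sqrt_dt = math.sqrt(dt)
    for _ in range(steps):
        mean_x = sum(x) / len(x)
        x = [
            xi
            + (alpha * (mean_x - xi) + beta * (x0i - xi)) * dt
            + noise * sqrt_dt * rng.gauss(0.0, 1.0)
            for xi, x0i in zip(x, x0)
        ]
    return x

# Hypothetical example: four agents, moderate social influence, no noise.
final = simulate_opinions([1.0, 2.0, 3.0, 10.0], alpha=0.3, beta=0.1)
print(final, sum(final) / len(final))
```

In the noise-free case the mean opinion stays at its initial value, while each deviation from the mean relaxes toward the fraction $\beta/(\alpha+\beta)$ of its initial size. With a large noise strength, opinions in this simple sketch can become negative, which would have to be handled separately.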

Initial configuration.
In order to solve our dynamic equations for the macroscopic measures $D(t)$ and $E(t)$, we still need to specify the initial distribution of opinions. We take the log-normal distribution $P(x, 0)$ with the parameters $\mu_{\ln x}(0)$ and $\sigma^2_{\ln x}(0)$ as an input, from which $N$ values $x_i(0)$ are sampled. Each of these initial values can be represented as $x_i(0) = \langle x(0)\rangle + \delta_i(0)$, Eqn. (7), where $\delta_i(0)$ is the deviation of the initial opinion from the initial mean $\langle x(0)\rangle$. By definition $\langle\delta(0)\rangle = 0$. We note that only the $\delta_i(0)$ result from the log-normal distribution, while in Eqn. (7) the $x_i(t)$ and $\delta_i(t)$ are determined by Eqn. (13).
The initial collective error $E(0)$ and the initial average opinion are related by Eqn. (3):
$$E(0) = \left[\ln T - \langle\ln x(0)\rangle\right]^2$$
Hence, we only need an additional value, $\sigma^2_{\ln x}(0)$, to calculate the initial group diversity $D(0)$. Then, the initial configuration on the macroscopic level is given by the pair $\{E(0), D(0)\}$. It will be of interest to us to study the dynamics of these values, in particular their long-term values
$$E_{LT} = \lim_{t\to\infty} E(t), \qquad D_{LT} = \lim_{t\to\infty} D(t)$$
With these specifications of the systemic measures and the agent variables, we now proceed to solve the dynamics analytically.
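The mapping between the parameters of the initial distribution and the macroscopic pair $\{E(0), D(0)\}$ can be sketched as follows. The sketch assumes a population large enough that the sample mean and variance of $\ln x(0)$ coincide with $\mu_{\ln x}(0)$ and $\sigma^2_{\ln x}(0)$; both function names are hypothetical.

```python
import math

def initial_configuration(truth, mu_ln0, sigma2_ln0):
    """Return (E(0), D(0)) for log-normal initial opinions (large-N assumption)."""
    e0 = (math.log(truth) - mu_ln0) ** 2   # squared distance of <ln x(0)> from ln T
    d0 = sigma2_ln0                        # variance of the log-opinions
    return e0, d0

def log_mean_for_error(truth, e0, below_truth=True):
    """Invert the relation: the <ln x(0)> that yields a given E(0).

    Two solutions exist, one below and one above ln T; the flag selects one.
    """
    offset = math.sqrt(e0)
    return math.log(truth) - offset if below_truth else math.log(truth) + offset

# Hypothetical example: start below the truth with E(0) = 0.8.
mu0 = log_mean_for_error(truth=1000.0, e0=0.8, below_truth=True)
print(initial_configuration(1000.0, mu0, sigma2_ln0=0.5))
```

The flag `below_truth` matters because the same initial collective error can arise from an average opinion below or above the true value.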

Collective error
Dynamic solution.
To calculate $E(t)$, Eqn. (3), and $\frac{dE(t)}{dt}$, Eqn. (4), we need to have the explicit expressions for the following quantities: $\langle\ln x(t)\rangle$ and $\frac{d\langle\ln x(t)\rangle}{dt}$. These are derived in Appendices E and F. Here we only present the results:
Integration leads to the solution:
With these expressions we have completely described the dynamics of the collective error.

P. Mavrodiev, F. Schweitzer: The ambiguous role of social influence on the wisdom of crowds: An analytic approach

Submitted for publication

Asymptotic solution.
The long-term behavior of the collective error results from $t \to \infty$, and we find
As we see, the final outcome of the collective error is mainly determined by the properties of the initial opinion distribution, in particular $\langle x(0)\rangle$ and $\langle\delta^2(0)\rangle$, and further by the two model parameters social influence $\alpha$ and individual conviction $\beta$. We will discuss the consequences of this in the next Section.

Group diversity
Dynamic solution.
To calculate $D(t)$, Eqn. (8), and $\frac{dD(t)}{dt}$, Eqn. (9), we need to have the explicit expressions for $\langle x(t)\rangle$ and $\langle\delta^2(t)\rangle$ as well as their time derivatives. Again, here we only present the result; the derivation is provided in Appendix D. Plugging $\langle\delta^2(t)\rangle$ from Eqn. (47) into $D(t)$, Eqn. (8), yields:
which is always positive.

Asymptotic solution.
The long-term behavior of the group diversity results from $t \to \infty$, and we find
where we made use of the fact that $\langle x(t)\rangle \approx \langle x(0)\rangle$, see Eqn. (15).
We can further derive how the long-term group diversity depends on the two model parameters, social influence $\alpha$ and individual conviction $\beta$:
This will be tested in the next section. Further, we see that the final group diversity increases with the initial deviation from the mean, $\langle\delta^2(0)\rangle$:
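The qualitative dependencies stated here can be reproduced with a small Python sketch. It is not the closed-form result of this paper, but an expression derived for the illustration under two assumptions: the noise is neglected, and each deviation obeys $d\delta_i/dt = -(\alpha+\beta)\,\delta_i(t) + \beta\,\delta_i(0)$, which follows from subtracting the mean dynamics from the individual dynamics. In that limit each deviation relaxes to $\delta_i(0)\,\beta/(\alpha+\beta)$, and the delta-method diversity becomes $\beta^2\langle\delta^2(0)\rangle / [(\alpha+\beta)^2\langle x(0)\rangle^2]$. The function name is hypothetical.

```python
def long_term_diversity(alpha, beta, var0, mean0):
    """Noise-free sketch of the long-term group diversity.

    alpha : social influence, beta : individual conviction
    var0  : initial variance <delta^2(0)>, mean0 : initial mean <x(0)>
    """
    if alpha < 0 or beta < 0 or alpha + beta == 0:
        raise ValueError("need alpha, beta >= 0 and alpha + beta > 0")
    shrink = beta / (alpha + beta)      # long-term size of a deviation relative to t = 0
    return shrink ** 2 * var0 / mean0 ** 2

# Hypothetical values: stronger social influence lowers the long-term diversity.
for a in (0.0, 0.1, 0.5):
    print(a, long_term_diversity(a, beta=0.2, var0=4.0, mean0=10.0))
```

Under these assumptions the long-term diversity decreases with $\alpha$, increases with $\beta$, and grows linearly with the initial variance $\langle\delta^2(0)\rangle$.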

Collective error
We now use our analytical solutions from the previous section to study the dependence of the collective error and the group diversity on the two model parameters, social influence α and individual conviction β, and on the variance of the initial opinion distribution.
The collective error $E(t)$ is defined as the squared difference between the true value, $\ln T$, and the mean opinion, $\langle\ln x(t)\rangle$. Hence, the plot gives a parabola with the minimum at $\ln T = \langle\ln x(t)\rangle$.
The asymptotic value $\langle\ln x_{LT}\rangle$ is calculated from Eqn. (18). Figure 1 illustrates how $\langle\ln x_{LT}\rangle$ and consequently $E_{LT}$, Eqn. (19), depend on the two parameters $(\alpha, \beta)$ (a) and on the initial variance $\langle\delta^2(0)\rangle$ (b). Comparing the parameter sets $(0.01, 0.1)$ and $(0.1, 0.1)$, we see that for smaller social influence $\alpha$ the motion of $\langle\ln x(t)\rangle$ is slower and the quasi-stationary state of $\langle\ln x\rangle$ is correspondingly closer to the initial value (Figure 1(a)). The outcome thus depends on the interplay between social influence $\alpha$ and individual conviction $\beta$. A larger $\alpha$ expands the range of motion of $\langle\ln x(t)\rangle$ and consequently increases the convergence limit, whereas a larger $\beta$ restricts it. These opposing effects may lead to a reduction of the long-term $E$, as in Figure 1(a). But, as we will demonstrate in Figure 3 in the next Section, conditional on the initial conditions $E(0)$ and $D(0)$, these two opposing effects may also lead to a deterioration.
The effect of the initial variance, $\langle\delta^2(0)\rangle$, on the long-term collective error is similar to the influence of $\alpha$. A larger heterogeneity in the initial opinions, which directly translates to a larger initial group diversity, leads to a longer motion of $\langle\ln x(t)\rangle$ (Figure 1(b), dotted vs. dotted-dashed lines) and consequently to a higher $\langle\ln x_{LT}\rangle$. In line with the above argument, whether this has a positive or a negative net effect on the collective error depends entirely on the initial distribution of opinions and is further discussed in Section 4.3.
In conclusion, for given parameters $(\alpha, \beta)$ the initial condition $\langle\ln x(0)\rangle$ uniquely determines the end value of the collective error. It should be noted that the log of the geometric mean exhibits only rightward motion, since $\frac{d\langle\ln x(t)\rangle}{dt} > 0$, Eqn. (17). Hence, if we start from $\langle\ln x(0)\rangle > \ln T$ the collective error will always increase.
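One way to see why the log of the geometric mean can only move to the right is a second-order expansion of $\ln x_i(t)$ around the mean opinion. The following worked equation is only a sketch under two assumptions: the expansion is truncated after the quadratic term, and $\langle x(t)\rangle \approx \langle x(0)\rangle$. It is not the full expression of Eqn. (17).

```latex
\begin{align}
\langle \ln x(t) \rangle
  &= \Big\langle \ln\big(\langle x(t)\rangle + \delta(t)\big) \Big\rangle
  \approx \ln \langle x(t)\rangle
    - \frac{\langle \delta^2(t)\rangle}{2\,\langle x(t)\rangle^2}, \\
\frac{d\langle \ln x(t)\rangle}{dt}
  &\approx -\frac{1}{2\,\langle x(0)\rangle^2}\,
     \frac{d\langle \delta^2(t)\rangle}{dt}
  \;\ge\; 0
  \quad \text{whenever } \langle \delta^2(t)\rangle \text{ does not grow.}
\end{align}
```

The linear term vanishes because $\langle\delta(t)\rangle = 0$. In this reading, social influence shrinks the spread of opinions around a fixed mean, and the shrinking spread pushes $\langle\ln x(t)\rangle$ upward toward $\ln\langle x(0)\rangle$.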

Group diversity
The dependency of the group diversity $D(t)$ on the parameters $(\alpha, \beta)$ and the initial variance $\langle\delta^2(0)\rangle$ illustrates that an increase of the initial variance also slightly increases the group diversity. However, because of the social influences, the impact of the initial variance becomes smaller over time.

Exploring the parameter space
To complement the discussion of the specific cases above and to gain deeper insight into the impact of social influence and individual conviction, we run a parameter sweep on $(\alpha, \beta)$. In Figure 3 we have calculated the long-term collective error $E_{LT}$, Eqn. (19), for three different initial values of the collective error, which are chosen such that they represent three characteristic scenarios. The color code indicates the value of $E_{LT}$: blue for low values, which is the favorable outcome, and red for high values, which is the unfavorable outcome. We note that the color code represents different values in each plot, and blue does not always mean $E_{LT} = 0$. Each plot also indicates with a black line those parameter combinations $(\alpha, \beta)$ for which $E_{LT} = E(0)$. Except for Figure 3(c), these lines are barely noticeable because they coincide with $\alpha = 0$.
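A parameter sweep of this kind can be sketched in a few lines of Python. The sketch does not evaluate Eqn. (19); instead it assumes the noise-free limit of the opinion dynamics, in which the mean opinion stays at $\langle x(0)\rangle$ and each deviation relaxes to $\delta_i(0)\,\beta/(\alpha+\beta)$, and computes the long-term collective error directly from a set of initial opinions. Function names, grid values and the example opinions are assumptions made for the illustration.

```python
import math

def long_term_error(x0, truth, alpha, beta):
    """Long-term collective error for one (alpha, beta) pair, noise neglected."""
    mean0 = sum(x0) / len(x0)
    # Without social influence, opinions stay at (or return to) their initial values.
    shrink = beta / (alpha + beta) if alpha > 0 else 1.0
    mean_log = sum(math.log(mean0 + shrink * (x - mean0)) for x in x0) / len(x0)
    return (math.log(truth) - mean_log) ** 2

def sweep_long_term_error(x0, truth, alphas, betas):
    """Grid of long-term errors: one row per beta, one column per alpha."""
    return [[long_term_error(x0, truth, a, b) for a in alphas] for b in betas]

# Hypothetical initial opinions with geometric mean 200, true value 1000.
x0 = [50.0, 100.0, 200.0, 400.0, 800.0]
grid = sweep_long_term_error(x0, 1000.0, alphas=[0.0, 0.5, 1.0], betas=[0.1, 0.5])
for row in grid:
    print([round(v, 3) for v in row])
```

In this sketch the column for $\alpha = 0$ reproduces the initial collective error, in line with the observation that the line $E_{LT} = E(0)$ coincides with $\alpha = 0$. For an initial average below the true value, increasing $\alpha$ lowers the error in the example; for an initial average above the true value it raises it.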
We recall that the average opinion can only increase, since $\frac{d\langle\ln x(t)\rangle}{dt} > 0$, Eqn. (17). This was described as a rightward motion of $\langle\ln x(t)\rangle$ in Figure 1. So, can we reach a situation where $E_{LT} = 0$? Figure 3(a,c) depict initial configurations where $\langle\ln x(0)\rangle < \ln T$. Thus, according to the plot in Figure 1, there is a chance that $E_{LT} = 0$, i.e. a convergence of the collective opinion to the true value. This is not the case for Figure 3(b), where the initial configuration is $\langle\ln x(0)\rangle > \ln T$.
Nevertheless, in all situations there are parameter sets $(\alpha, \beta)$ that minimize the collective error, $E_{LT}$. Thus, we are interested to know whether this minimum value increases or decreases if we vary $\alpha$ or $\beta$. Figure 3(a) illustrates a situation where we initially have a large collective error, $E(0) = 0.8$. Because of this, an increase in social influence, $\alpha$, always improves the collective error. But an increase in individual conviction $\beta$ makes this worse, because it reinforces the initial situation, which was bad. For $\alpha = 0.4$, for instance, we see with increasing $\beta$ a color change from green to red.
The situation is different in Figure 3(c), which has a similar initial configuration, but a very small initial collective error, $E(0) = 0.01$. Because of this, a large increase in social influence, $\alpha > 0.5$, will lead to a deterioration. This can be counterbalanced by an increase of individual conviction $\beta$. At $\alpha = 0.8$, for example, an increase of $\beta$ leads to a considerable improvement with respect to the collective error.
In contrast to the two other configurations (a) and (b), in Figure 3(c) we see a non-monotonic dependency of the collective error on the parameters $(\alpha, \beta)$. Even though the initial collective error was very small, there are large parameter ranges where $E_{LT} < E(0)$ (deep blue). The collective error can also reach the minimum $E_{LT} = 0$. Obviously, those parameter ranges optimize the wisdom of crowds. We note that this is the case for a non-zero, but not too large social influence, $\alpha < 0.5$.

Figure 3(b), despite the different initial condition $\langle\ln x(0)\rangle > \ln T$, resembles the dependency of Figure 3(c) more than that of Figure 3(a), because the initial collective error is also small, $E(0) = 0.02$. But here we do not find a non-monotonic dependency of the collective error; instead it always increases. In this situation, any social influence $\alpha$ will only deteriorate the outcome, in particular if it becomes large. This can be counterbalanced by an increasing individual conviction $\beta$.
From this discussion we have to conclude that there is no simple monotonic impact of $\alpha$ or $\beta$ on the collective error. It is very important how far the initial average opinion is away from the truth, and it is equally important whether it is below or above the true value. Given that, there are parameter ranges where an increasing social influence can also improve the collective error. But this impact cannot be decoupled from the influence of the individual conviction, which reinforces a good or a bad initial opinion distribution.
In Figure 4 we have plotted how the long-term group diversity depends on the parameters $(\alpha, \beta)$ for the three cases shown in Figure 3. We note that $D_{LT}$ behaves the same for all three cases, as it depends only on $\alpha$, $\beta$ and $\langle\delta^2(0)\rangle$, which are invariant across the three starting states. We recall that the wisdom of crowds depends on a large group diversity, thus red indicates the better and blue the worse outcomes. As can be expected from the discussion above, a larger social influence $\alpha$ always negatively impacts the group diversity, regardless of the starting configuration. Individual conviction $\beta$ acts in the opposite way: it maintains diversity in the group by increasing the reluctance against a change of opinions.

Conclusion
The wisdom of crowds is an intriguing phenomenon. The observation that the average of diverse opinions about a given question is often more accurate than any single estimate is counterintuitive at first sight. But numerous anecdotal, empirical and theoretical investigations across a variety of settings support this finding [10,16,28,32].

However, the wisdom of crowds depends on a crucial assumption that is hard to maintain under real circumstances, namely the independence of opinions. Quite often, external events, random occurrences or social influences from others lead to a change of individual opinions that also impacts the wisdom of crowds.
The aim of our paper was to study this impact of social influence in more detail. We utilize an opinion dynamics model proposed before [22] that considers three different ingredients: (i) the individual conviction $\beta$ to keep the initial opinion $x_i(0)$ despite other influences, (ii) the social influence $\alpha$ to change the own opinion $x_i(t)$ if information about the average opinion $\langle x(t)\rangle$ becomes available, and (iii) small random influences $A\xi_i(t)$.
Instead of running agent-based simulations, we aimed at deriving analytic expressions for the two most important systemic quantities that describe the wisdom of crowds effect: the collective error $E(t)$ and the group diversity $D(t)$. Our only input, in addition to the opinion dynamics model, Eqn. (13), are assumptions about the initial opinion distribution, which allow us to derive expressions for the average initial opinion $\langle\ln x(0)\rangle$ and the average initial variance $\langle\delta^2(0)\rangle$. A quite cumbersome derivation allowed us to find closed-form expressions for the long-term collective error $E_{LT}$, Eqn. (19), and the long-term group diversity $D_{LT}$, Eqn. (21). These expressions could then be calculated numerically to analyze the impact of the two parameters $(\alpha, \beta)$ and the initial conditions, $E(0)$, $D(0)$, which have been derived from $\langle\ln x(0)\rangle$ and $\langle\delta^2(0)\rangle$.
Our systematic evaluation of the impact of the parameters and the initial conditions reveals the ambiguous role of social influence α on the wisdom of crowds. We could identify scenarios where increasing the social influence indeed improves the wisdom of crowds: if the initial collective error is high, but the initial average opinion is below the true value. But we could also demonstrate that an increasing social influence deteriorates the wisdom of crowds, if the initial collective error is already low.
This behavior is mitigated by the influence of the individual conviction β, which always reinforces the impact of the initial opinion. In those cases, where the initial collective error is large, i.e. the starting configuration of opinions is rather bad, a large individual conviction does not improve the outcome. But in those cases, where the initial collective error is already small, a large individual conviction helps to counterbalance the impact of social influence, and thus leads to better outcomes.
In particular, we could identify scenarios where the collective error vanishes, i.e. the average opinion converges to the true value. This is the case if the initial collective error is low, the initial average opinion is below the true value and the social influence is at low to moderate values.
The generality of our results of course depends on the assumptions about the opinion dynamics. It may seem that a coupling of the individual opinion to the average opinion is not the most realistic scenario. But this impression is misleading. An analysis of experimental data has shown that this dynamics indeed captures the opinion dynamics of subjects in these experiments [23]. So, there is evidence for our proposed dynamics.
One could wish to generalize the derivations provided in this paper to more complex opinion dynamics that do not assume couplings to the mean. However, we were unable to find analytic expressions for the relevant measures, collective error and group diversity, for more complex cases. Nevertheless, the analysis provided in this paper allows us to understand and to quantify the impact of social influence on the wisdom of crowds, without the need for agent-based computer simulations. As we have demonstrated, social influence is not inherently "good" or "bad". Whether social influence can improve the wisdom of crowds depends particularly on the initial opinion distribution, i.e. the prior knowledge of the individuals.
Before we can calculate our quantities of interest, we have to derive expressions for some terms involving $\delta(0)$ and $\delta(t)$ to be used later. We recall that these $\delta$, according to Eqn. (7), describe deviations from the mean at different times. $\delta(0)$ is the initial deviation and results from the log-normal distribution. $\delta(t)$, however, depends on the dynamics of $x(t)$, Eqn. (13).
Keeping in mind that the ensemble average and the differential are linear operators and thus interchangeable:
The terms $\frac{d\delta_i(0)}{dt} = 0$ since all $\delta_i(0)$ are constants. Hence:
Now we can plug in $x(t)$ from Eqn. (13):
where we have used the fact that $\xi_i(t)$ and $\delta_i(0)$ have negligible covariance. This is true because the generation of the $\delta_i(0)$ does not depend on the white noise, nor do the $\xi_i$ depend on the initial distribution of the $\delta_i(0)$. The solution of the above equation is:
The constant $C$ is given by the initial condition $\langle\delta(0)\delta(0)\rangle$:
However, $\langle\delta(0)\delta(0)\rangle = \langle\delta^2(0)\rangle$, hence:
Therefore:

B Derivation of $\langle\delta(0)\delta^2(t)\rangle$
We use again the fact that the ensemble average and the differential are interchangeable:
The $\delta(0)$ are constant, hence $\frac{d\delta(0)}{dt} = 0$:
Plugging in $x(t)$ from Eqn. (13) yields:
The noise term becomes negligible for large $N$, hence we can rewrite:
Making use of the expression for $\langle\delta^2(0)\delta(t)\rangle$, Eqn. (40), derived below, we can rewrite Eqn. (34) as:
which has the closed-form solution:
This can be generalized as follows:

C Derivation of $\langle\delta^2(0)\delta(t)\rangle$
We can use the same method applied in the previous section to expand:
Using the expressions for $x(t)$ and $\langle x(t)\rangle$ from Eqns. (13), (14), we obtain:
where we have used that the expectations involving the noise terms are 0. The solution to this first-order ODE is given by:
Using the same line of arguments, we can obtain:
which is a generalization of the previous Eqns. (30), (37), (41).
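The kind of result derived in these appendices can be checked numerically. The sketch below assumes the noise-free deviation dynamics $d\delta_i/dt = -(\alpha+\beta)\,\delta_i(t) + \beta\,\delta_i(0)$, obtained by subtracting the mean dynamics from the individual dynamics, for which the cross moment has the solution $\langle\delta(0)\delta(t)\rangle = \langle\delta^2(0)\rangle\,[\beta + \alpha e^{-(\alpha+\beta)t}]/(\alpha+\beta)$. This expression is derived here for the illustration and may differ in form from the numbered equations of the paper; the function names are hypothetical.

```python
import math

def cross_moment_closed_form(var0, alpha, beta, t):
    """Noise-free solution for <delta(0) delta(t)> with <delta^2(0)> = var0."""
    rate = alpha + beta
    return var0 * (beta + alpha * math.exp(-rate * t)) / rate

def cross_moment_numeric(x0, alpha, beta, t, dt=1e-3):
    """Same quantity from an explicit Euler integration of the agent dynamics."""
    n = len(x0)
    mean0 = sum(x0) / n
    d0 = [x - mean0 for x in x0]            # initial deviations delta_i(0)
    x = list(x0)
    for _ in range(int(round(t / dt))):
        m = sum(x) / n
        x = [xi + (alpha * (m - xi) + beta * (x0i - xi)) * dt
             for xi, x0i in zip(x, x0)]
    m = sum(x) / n
    return sum(a * (xi - m) for a, xi in zip(d0, x)) / n

# Hypothetical example with four agents; <delta^2(0)> = 12.5 for these values.
x0_demo = [1.0, 2.0, 3.0, 10.0]
print(cross_moment_closed_form(12.5, 0.3, 0.1, 5.0),
      cross_moment_numeric(x0_demo, 0.3, 0.1, 5.0))
```

Both values agree up to the discretization error of the Euler scheme, which supports the exponential relaxation with rate $\alpha + \beta$ toward the plateau $\beta\langle\delta^2(0)\rangle/(\alpha+\beta)$ in the noise-free limit.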

D Derivation of $\langle\delta^2(t)\rangle$
Eventually, we express:
Rewriting $\delta(t)$ from Eqn. (7), and using Eqns. (13), (14) for $x(t)$ and $\langle x(t)\rangle$, respectively:
$\xi_i(t)$ is Gaussian white noise, and we assume that it has a negligible influence on $x_i(t)$. This is a reasonable assumption, because the major contribution to $\frac{dx_i(t)}{dt}$ comes from $\alpha$ and $\beta$, hence it is justified to conclude that $x_i(t)$ and $\xi_i(t)$ have a negligible covariance. Hence $\langle\delta(t)\xi(t)\rangle - \langle\delta(t)\langle\xi(t)\rangle\rangle \approx 0$. Using this we obtain:
Plugging in Eqn. (30), the closed-form solution is given by:
where $C_1$ is a constant obtained from $\langle\delta^2(t = 0)\rangle = \langle\delta^2(0)\rangle$, and $C$ is the constant from Eqn. (29). Substituting the constants yields:
Again, this can be generalized to:

E Derivation of Eqn. (17)
We start from Eqn. (13), which can be rewritten as:
The ensemble average of Eqn. (48) is:
$x_i(t)$ and $x_i(0)$ can be expressed by small deviations from their respective means, as written in Eqn. (7). We further use a Taylor expansion around zero:
With this, Eqn. (49) can be approximated as follows:
We now express $\langle\delta^n(t)\rangle$ by means of Eqn. (47) and $\langle\delta(0)\delta^n(t)\rangle$ by means of Eqn. (41). Further, we make again the assumption that the random noise, $\xi_i(t)$, is negligibly small and not correlated to $x_i(t)$. As a consequence:
Hence, we obtain: d ⟨ln x(t)⟩ dt = α + β)