Lightning Network: a second path towards centralisation of the Bitcoin economy

The Bitcoin Lightning Network (BLN), a so-called "second layer" payment protocol, was launched in 2018 to scale up the number of transactions between Bitcoin owners. In this paper, we analyse the structure of the BLN over a period of 18 months, ranging from 14th January 2018 to 13th July 2019, at the end of which the network had reached 8,216 users, 122,517 active channels and 2,732.5 transacted bitcoins. We consider three representations of the BLN: the daily snapshot one, the weekly snapshot one and the daily-block snapshot one. By studying the topological properties of the binary and weighted versions of these three representations, we find that the total volume of transacted bitcoins grows approximately as the square of the network size; however, despite the huge activity characterising the BLN, the distribution of bitcoins is very unequal: the average Gini coefficient of the node strengths is approximately 0.88, with 10% (50%) of the nodes holding 80% (99%) of the bitcoins at stake in the BLN (on average, across the entire period). As for other economic systems, we hypothesise that local properties of nodes, like the degree, ultimately determine part of the network's characteristics. Therefore, we have tested the ability of the Undirected Binary Configuration Model (UBCM) to reproduce the structural features of the BLN: the UBCM recovers the disassortative and the hierarchical character of the BLN but underestimates the centrality of nodes; this suggests that the BLN is becoming an increasingly centralised network, more and more compatible with a core-periphery structure. Further inspection of the resilience of the BLN shows that removing hubs leads to the collapse of the network into many components, evidence suggesting that this network may be a target for so-called split attacks.


Introduction
The growing popularity of Bitcoin [1] has made apparent the scalability problems of the technology upon which it is based: in fact, Bitcoin can process only a limited number of transactions per second, a number proportional to the size of a block and to its release frequency. This shortcoming may prevent the adoption of this payment network at a global scale, especially when considering that classic payment mechanisms (e.g. traditional credit cards) are able to achieve tens of thousands of transactions per second. A naïve (and short-term) solution would be an increase of the block size: larger blocks, however, would require longer validation times, larger storage capability and higher bandwidth costs, in turn favouring centralisation, as fewer entities would be able to validate the new blocks that are appended to the Blockchain; moreover, centralisation in the validation process would make the system less resilient, i.e. more prone to faults and attacks.
The Bitcoin Lightning Network (BLN) [2][3][4] aims at breaking the trade-off between block size and centralisation by processing most of the transactions off-chain: it is a "Layer 2" protocol that can operate on top of Blockchain-based cryptocurrencies such as Bitcoin. The origin of the BLN can be traced back to the birth of Bitcoin itself, as an attempt to create payment channels across which any two users could exchange money without burdening the entire network with their transaction data, thus allowing for cheaper and faster transactions (as neither the mining fees nor the Blockchain confirmation are required any longer). The BLN can, thus, be seen as a solution that does not sacrifice the key feature of Bitcoin, i.e. decentralisation, which characterises its architecture (i.e. the number of computers constituting the network), its political organisation (i.e. the number of individuals controlling the network) and its wealth distribution (i.e. the number of individuals owning the actual supply), while enhancing the circulation and the exchange of the native assets.
The BLN has recently attracted a lot of interest: Seres [4] argued that the BLN structure can be modified to improve its security; Rohrer [5] showed that the current BLN can be prone to channel exhaustion or to attacks aimed at isolating nodes, thus compromising the reachability of nodes, the payment success ratio, etc. In this paper, we consider the BLN payment channels across a period of 18 months, i.e. from 14th January 2018 to 13th July 2019, and analyse them at both the daily and the weekly timescale. Our results show that the BLN is characterised by an unequal wealth distribution and by a larger-than-expected centrality of nodes, thus suggesting that the BLN indeed suffers from the aforementioned centralisation issue.

Methods
Notation. For each time snapshot $t$, the BLN can be described as a weighted, undirected network with total number of nodes $N^{(t)}$, represented by the $N^{(t)} \times N^{(t)}$ symmetric matrix $\mathbf{W}^{(t)}$ [6, 7] whose generic entry $w_{ij}^{(t)}$ indicates the total amount of money exchanged between $i$ and $j$, across all channels, at time $t$. The total amount of money exchanged by node $i$, at time $t$, is $s_i^{(t)} = \sum_{j(\neq i)=1}^{N^{(t)}} w_{ij}^{(t)}$, a quantity that will also be called capacity. For the present analysis, we also consider the BLN binary adjacency matrix $\mathbf{A}^{(t)}$, whose generic entry reads $a_{ij}^{(t)} = 1$ if $w_{ij}^{(t)} > 0$ and $a_{ij}^{(t)} = 0$ otherwise.
Centrality measures. Indices measuring the centrality of a node aim at quantifying the importance of a node in a network, according to some specific topological property [8][9][10][11]. Among the measures proposed so far, of particular relevance are the degree centrality, the closeness centrality, the betweenness centrality and the eigenvector centrality. Let us briefly describe them:
• the degree centrality [10, 11] of node $i$ coincides with the degree of node $i$, i.e. the number of its neighbours, normalized by the maximum attainable value, i.e. $N - 1$:
$$k_i^c = \frac{k_i}{N-1}$$
where $k_i = \sum_{j(\neq i)=1}^{N} a_{ij}$. From the definition above, it follows that the most central node, according to the degree variant, is the one connected to all the other nodes;
• the closeness centrality [10, 11] of node $i$ is defined as
$$c_i^c = \frac{N-1}{\sum_{j(\neq i)=1}^{N} d_{ij}}$$
where $d_{ij}$ is the topological distance between nodes $i$ and $j$, i.e. the length of the shortest path(s) connecting them: in a sense, the closeness centrality answers the question "how reachable is a given node?" by measuring the length of the paths that connect it to the other vertices. From the definition above, it follows that the most central node, according to the closeness variant, is the one lying at distance 1 from every other node;
• the betweenness centrality [10, 12-14] of node $i$ is given by
$$b_i^c = \sum_{s(\neq i)=1}^{N} \sum_{t(\neq i,s)=1}^{N} \frac{\sigma_{st}(i)}{\sigma_{st}}$$
where $\sigma_{st}$ is the total number of shortest paths between nodes $s$ and $t$ and $\sigma_{st}(i)$ is the number of shortest paths between nodes $s$ and $t$ that pass through node $i$. From the definition above, it follows that the most central node, according to the betweenness variant, is the one lying "between" any two other nodes;
• the eigenvector centrality [10, 14, 15] of node $i$, $e_i^c$, is defined as the $i$-th element of the eigenvector corresponding to the largest eigenvalue of the binary adjacency matrix (whose existence is ensured by the Perron-Frobenius theorem). According to the definition above, a node with large eigenvector centrality is connected to other "well connected" nodes. In this sense, its behaviour is similar to that of the PageRank centrality index.
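As an illustration, the degree, closeness and betweenness centralities described above can be sketched in plain Python on a toy star graph. This is a minimal sketch, not the code used for the analysis: the function names are hypothetical, the graph is assumed to be connected and stored as a dictionary of neighbour sets, and the betweenness is left unnormalised and counts ordered pairs $(s, t)$, which is an assumption about the convention.

```python
from collections import deque


def star_graph(n):
    """Adjacency sets of a star: node 0 is the centre, nodes 1..n-1 are leaves."""
    adj = {0: set(range(1, n))}
    for leaf in range(1, n):
        adj[leaf] = {0}
    return adj


def bfs_distances(adj, source):
    """Shortest-path lengths from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def degree_centrality(adj):
    """Degree of each node divided by the maximum attainable value N - 1."""
    n = len(adj)
    return {i: len(adj[i]) / (n - 1) for i in adj}


def closeness_centrality(adj):
    """(N - 1) divided by the sum of distances to all other nodes (connected graph)."""
    n = len(adj)
    result = {}
    for i in adj:
        dist = bfs_distances(adj, i)
        result[i] = (n - 1) / sum(d for j, d in dist.items() if j != i)
    return result


def betweenness_centrality(adj):
    """Brandes' algorithm; unnormalised, summing over ordered pairs (s, t)."""
    bc = {i: 0.0 for i in adj}
    for s in adj:
        # Forward pass: BFS counting shortest paths (sigma) and predecessors.
        sigma = {i: 0 for i in adj}
        sigma[s] = 1
        dist = {s: 0}
        preds = {i: [] for i in adj}
        order = []
        queue = deque([s])
        while queue:
            u = queue.popleft()
            order.append(u)
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
                if dist[v] == dist[u] + 1:
                    sigma[v] += sigma[u]
                    preds[v].append(u)
        # Backward pass: accumulate pair dependencies from the farthest nodes.
        delta = {i: 0.0 for i in adj}
        for w in reversed(order):
            for u in preds[w]:
                delta[u] += sigma[u] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return bc


if __name__ == "__main__":
    toy = star_graph(5)
    print(degree_centrality(toy))
    print(closeness_centrality(toy))
    print(betweenness_centrality(toy))
```

On the star graph the centre attains the maximum of all three measures, consistently with the remarks closing each definition.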
Gini coefficient. The Gini coefficient was introduced to quantify the inequality of a country's income distribution [16, 17]: it ranges between 0 and 1, with a larger Gini coefficient indicating a larger "unevenness" of the income distribution. Here, we apply it both to the distribution of the centrality measures of nodes, i.e.
$$G_c = \frac{\sum_{i=1}^{N} \sum_{j=1}^{N} |c_i - c_j|}{2N \sum_{i=1}^{N} c_i}$$
where $c_i = k_i^c, c_i^c, b_i^c, e_i^c$, and to the distribution of the total amount of money exchanged by the nodes of the BLN, i.e.
$$G_s = \frac{\sum_{i=1}^{N} \sum_{j=1}^{N} |s_i - s_j|}{2N \sum_{i=1}^{N} s_i}.$$
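The Gini coefficient can be computed directly from its mean-absolute-difference form. The sketch below assumes that standard form (the sum of all pairwise absolute differences divided by $2N$ times the total); the function name and the toy values are illustrative and not taken from the dataset.

```python
def gini(values):
    """Gini coefficient as the sum of |v_i - v_j| over all pairs, divided by 2 * N * sum(v)."""
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        return 0.0  # convention adopted here for empty or all-zero input
    abs_diff = sum(abs(a - b) for a in values for b in values)
    return abs_diff / (2 * n * total)


if __name__ == "__main__":
    print(gini([1, 1, 1, 1]))  # perfectly even distribution
    print(gini([0, 0, 0, 1]))  # everything held by a single node
```

The double loop costs $O(N^2)$ operations, which is acceptable for networks of a few thousand nodes; for a finite sample the maximum attainable value is $(N-1)/N$ rather than 1.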
Centralisation measures. The centrality indices defined above are all normalized between 0 and 1 and provide a ranking of the nodes of a network, according to the topological feature chosen for their definition. Sometimes, however, it is useful to compactly describe a certain network structure in its entirety. To this aim, a family of indices has been defined (the so-called centralisation indices), encoding the comparison between the structure of a given network and that of a reference network, according to the chosen index. In mathematical terms, any centralisation index reads
$$C_c = \frac{\sum_{i=1}^{N} (c^* - c_i)}{\max\left\{\sum_{i=1}^{N} (c^* - c_i)\right\}}$$
where $c^* = \max_i \{c_i\}$ represents the maximum value of the chosen centrality measure computed over the network under consideration and the denominator is calculated over the benchmark, defined as the graph providing the maximum attainable value of the quantity $\sum_{i=1}^{N} (c^* - c_i)$. As it can be proven that the most centralised structure, according to the degree, closeness and betweenness centrality, is the star graph, the corresponding centralisation indices (i.e. the degree-, the closeness-, the betweenness- and the eigenvector-centralisation index) are defined by evaluating the denominator on a star graph with the same number of nodes. For what concerns the eigenvector index, the star graph does not represent the maximally centralised structure: however, we keep it for the sake of homogeneity with the other quantities.
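As a worked instance, consider the degree-centralisation index. With the degree centrality normalized by $N - 1$, the centre of a star graph has centrality 1 and each of the $N - 1$ leaves has centrality $1/(N-1)$, so that the benchmark denominator equals $(N-1)\left(1 - \frac{1}{N-1}\right) = N - 2$. The sketch below relies on this derivation, which is ours and is not quoted from the original formulas; the function name is hypothetical.

```python
def degree_centralisation(adj):
    """Degree-centralisation index, with a star graph of the same size as benchmark.

    `adj` maps each node to the set of its neighbours (undirected graph).
    The denominator N - 2 is the value of sum_i (c* - c_i) on a star graph
    when the degree centrality is normalised by N - 1 (derivation assumed here).
    """
    n = len(adj)
    if n < 3:
        raise ValueError("the index is defined for graphs with at least 3 nodes")
    kc = [len(neighbours) / (n - 1) for neighbours in adj.values()]
    c_star = max(kc)
    return sum(c_star - c for c in kc) / (n - 2)


if __name__ == "__main__":
    star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
    ring = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
    print(degree_centralisation(star))  # benchmark structure
    print(degree_centralisation(ring))  # all nodes equivalent
```

The index equals 1 on the star graph and 0 on any regular graph, such as the ring, where no node is more central than the others.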

Benchmarking the observations. Besides providing an empirical analysis of the BLN, in what follows we will also benchmark our observations against a model discounting available information to some extent. As for other economic and financial systems, we hypothesise that local properties of nodes ultimately determine the BLN structure: specifically, we focus on the degrees and adopt the Undirected Binary Configuration Model (UBCM) as a reference model [18, 19]. The UBCM captures the idea that the probability for any two nodes to establish a connection depends on their degrees and can be derived within the constrained entropy maximization framework, the score function being represented by Shannon entropy and the constraints being represented by the degree sequence $\{k_i\}_{i=1}^{N}$. Upon solving the aforementioned optimization problem [18, 19], one derives the probability that any two nodes establish a connection
$$p_{ij} = \frac{x_i x_j}{1 + x_i x_j}$$
with the unknowns $\{x_i\}_{i=1}^{N}$ representing the so-called Lagrange multipliers enforcing the constraints. In order to numerically determine them, one can invoke the likelihood maximization principle, prescribing to search for the maximum of the function
$$\mathcal{L}(\vec{x}) = \sum_{i=1}^{N} k_i \ln x_i - \sum_{i=1}^{N} \sum_{j=i+1}^{N} \ln(1 + x_i x_j)$$
with respect to the vector $\{x_i\}_{i=1}^{N}$, a procedure leading to the resolution of the following system of equations [18, 19]:
$$k_i = \sum_{j(\neq i)=1}^{N} \frac{x_i x_j}{1 + x_i x_j}, \quad \forall\, i.$$
Core-periphery detection. Inspecting the evolution of centralisation is useful to understand to what extent the structure of a given network becomes increasingly (dis)similar to that of a star graph; however, although the star graph encodes the prototypical centralised structure, a comparison with it can be too simplistic. Hence, we also check for the presence of the "generalized" star graph structure, also known as core-periphery structure, composed of a densely-connected core of nodes surrounded by a periphery of loosely-connected vertices. In order to do so, we implement a recently-proposed approach [20], prescribing to minimize the score function known as bimodular surprise and reading
$$S = \sum_{i \geq l_c} \sum_{j \geq l_p} \frac{\binom{C}{i} \binom{P}{j} \binom{V - (C + P)}{L - (i + j)}}{\binom{V}{L}}$$
where $V = \frac{N(N-1)}{2}$ is the total number of node pairs, $L = \sum_{i=1}^{N} \sum_{j=i+1}^{N} a_{ij}$ is the total number of links, $C$ is the number of node pairs in the core portion of the network, $P$ is the number of node pairs in the periphery portion of the network, $l_c$ is the observed number of links in the core and $l_p$ is the observed number of links in the periphery. From a technical point of view, $S$ is the p-value of a multivariate hypergeometric distribution [20].
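To make the score function concrete, the sketch below evaluates the bimodular surprise for a given bipartition. It assumes that the score is the upper tail of a multivariate hypergeometric distribution over three classes of node pairs (core, periphery and the remaining core-periphery pairs), summed over at least $l_c$ core links and at least $l_p$ periphery links; the function name and argument names are hypothetical, and the search over partitions that minimises the score is not shown.

```python
from fractions import Fraction
from math import comb


def bimodular_surprise(n_nodes, n_links, core_pairs, periphery_pairs,
                       l_core, l_periphery):
    """Exact p-value of observing >= l_core core links and >= l_periphery periphery links.

    n_nodes          N, number of nodes
    n_links          L, number of links
    core_pairs       C, node pairs inside the core
    periphery_pairs  P, node pairs inside the periphery
    """
    total_pairs = n_nodes * (n_nodes - 1) // 2            # V
    rest_pairs = total_pairs - core_pairs - periphery_pairs
    if rest_pairs < 0:
        raise ValueError("core and periphery pairs exceed the total number of pairs")
    numerator = 0
    for i in range(l_core, min(core_pairs, n_links) + 1):
        # j cannot exceed the links left once i of them sit in the core
        for j in range(l_periphery, min(periphery_pairs, n_links - i) + 1):
            numerator += (comb(core_pairs, i)
                          * comb(periphery_pairs, j)
                          * comb(rest_pairs, n_links - i - j))
    return Fraction(numerator, comb(total_pairs, n_links))


if __name__ == "__main__":
    # Toy case: 5 nodes, 4 links, a 3-node core (3 pairs), a 2-node periphery (1 pair).
    print(bimodular_surprise(5, 4, 3, 1, l_core=3, l_periphery=0))
```

Exact integer arithmetic is used because the binomial coefficients quickly exceed floating-point range; for networks of the size considered here, working with logarithms of the binomials would be the more practical choice.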

Data
Since payments in the Bitcoin Lightning Network are source-routed and onion-routed, the sender must have a reasonably up-to-date view of the network topology, in order to pre-compute the entire payment route. Nodes in the BLN regularly broadcast information about the channels they participate in: each time a channel is opened, or any of its details changes, the two endpoints of the channel announce such changes to the rest of the network. This exchange of information, called gossip, allows other nodes to keep their view of the network topology up-to-date; this view is, then, used to initiate a payment.
The network topology can be visualised by means of the so-called routing table. For this paper, we took regular snapshots of the routing table (every 15 minutes, between January 14th 2018, at blockheight 503816, and July 13th 2019, at blockheight 585844); these snapshots were, then, aggregated into timespans, each timespan representing a constant state of a channel from its start to its end. In addition, this information is enriched with data from the Blockchain: since every channel consists of an unspent transaction output on the Bitcoin Blockchain, we can determine the size of a channel and its open and close dates within minutes. Other heuristics can be used to search for potential channels on the Blockchain, without involving the gossip mechanism: this allows us to put a lower bound on the completeness of our measurements.
In the Bitcoin Blockchain, block arrivals approximately follow a Poisson process, i.e. the time between consecutive blocks is exponentially distributed with an expected value of 10 minutes. On a single day, the expected number of new blocks added to the Blockchain is, thus, 144. For the sake of simplicity, and without altering the results in any way, we take this number of blocks as our natural timescale (for example, the blocks of the first day range from the 503816th one to the 503959th one while the blocks of the second day range from the 503960th one to the 504103rd one). In this paper, three different representations of the BLN are studied, i.e. the daily snapshot one, the weekly snapshot one and the daily-block snapshot one, even if the results of our analysis will be shown for the daily-block snapshot representation only. A daily/weekly snapshot includes all channels that were found to be active during that day/week; a daily-block snapshot consists of all channels that were found to be active at the time the first block of the day was released: hence, the channels considered for the daily-block representation are a subset of the ones constituting the daily representation.
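The block-based timescale amounts to simple integer arithmetic on block heights. The following sketch maps a block height to its "day" and back, using the first block height (503816) and the 144-blocks-per-day convention stated above; the function names are hypothetical.

```python
FIRST_BLOCK = 503816     # block height of the first snapshot (from the text)
BLOCKS_PER_DAY = 144     # expected number of blocks per day (24 * 60 / 10)


def day_index(block_height):
    """1-based index of the block-defined day containing `block_height`."""
    if block_height < FIRST_BLOCK:
        raise ValueError("block precedes the observation period")
    return (block_height - FIRST_BLOCK) // BLOCKS_PER_DAY + 1


def day_block_range(day):
    """First and last block height (inclusive) of the given 1-based day."""
    if day < 1:
        raise ValueError("days are numbered from 1")
    start = FIRST_BLOCK + (day - 1) * BLOCKS_PER_DAY
    return start, start + BLOCKS_PER_DAY - 1


if __name__ == "__main__":
    print(day_block_range(1))
    print(day_block_range(2))
    print(day_index(585844))
```

The first block returned by `day_block_range` is the one at which a daily-block snapshot would be taken under this convention.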

Results
Empirical analysis of the BLN binary structure. Figure 1 plots the evolution of basic network quantities since the launch of the BLN, i.e. the number of nodes $N$ (a proxy for the number of users), the number of links $L$ and the link density $\rho$. As can be seen, although the network size increases (for the daily-block snapshot $N$ ranges from 2 to 6476 and $L$ ranges from 1 to 55866; in particular, in the last daily snapshot of our dataset we have 6476 nodes and 54440 links), the network becomes sparser. Two different regimes are visible: a first phase with a steep increase of $N$ and $L$ (decrease of $\rho$) is followed by a phase during which a much smoother increase (decrease) of the same quantities is observed. Further insight into the BLN evolution can be gained by plotting the link density $\rho = \frac{2L}{N(N-1)}$ versus the total number of nodes $N$: a trend whose functional form reads $\rho \sim cN^{-\gamma}$, with $\gamma \simeq 1$, clearly appears. Such a functional form, however, describes the BLN evolution quite satisfactorily only up to the period when $N \simeq 10^3$: afterwards, a different functional dependence seems to hold. Notice also that, for $\gamma = 1$, the value of the numerical constant $c$ approximately coincides with the value of the average degree, since $c = \frac{2L}{N-1} \simeq \frac{2L}{N} = \bar{k}$. By imagining a growth process according to which each new node enters the network by establishing at least one new connection with the existing ones, which ensures that $L_t \geq N_t - 1$, a lower bound on $c \simeq \bar{k}$ can be deduced: $c \geq 2$ (fig. 1 shows the trend $y = 3N^{-1}$, even if the inspection of the evolution of the quantity $c = \frac{2L}{N-1}$ reveals that periods where $c \simeq \bar{k}$ assumes different, constant values can be identified).
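The relation between link density, the constant $c$ and the average degree can be checked numerically. The sketch below uses the figures quoted for the last daily snapshot (6476 nodes, 54440 links) and assumes $\gamma = 1$ exactly, so that $c = \rho N = 2L/(N-1)$; the function names are hypothetical.

```python
def link_density(n_nodes, n_links):
    """rho = 2L / (N (N - 1)) for an undirected network without self-loops."""
    return 2 * n_links / (n_nodes * (n_nodes - 1))


def density_prefactor(n_nodes, n_links):
    """c in rho = c / N (i.e. assuming gamma = 1), which equals 2L / (N - 1)."""
    return link_density(n_nodes, n_links) * n_nodes


def average_degree(n_nodes, n_links):
    """Mean degree 2L / N."""
    return 2 * n_links / n_nodes


if __name__ == "__main__":
    n, l = 6476, 54440  # last daily snapshot, as quoted in the text
    print(link_density(n, l))
    print(density_prefactor(n, l), average_degree(n, l))
    # A tree (L = N - 1) saturates the lower bound c >= 2.
    print(density_prefactor(10, 9))
```

For large $N$ the prefactor and the average degree differ only by the factor $N/(N-1)$, which is why the two are treated as interchangeable in the discussion above.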
In order to comment on the centrality structure of the BLN, let us explicitly draw it: fig. 2 shows the largest connected component of the BLN daily-block snapshot representation on day 16 and on day 34. Several hubs are present (e.g. on day 34, the largest one, having degree $k_{hub}^{34} = 121$, is linked to 34.3% of the nodes): notice that each of them is linked to a plethora of other nodes that, instead, are scarcely linked among themselves. The emergence of structurally-important nodes is further confirmed by plotting the evolution of the Gini index for the distribution of the centrality measures defined in the Methods section (i.e. the degree, the closeness, the betweenness and the eigenvector centrality): fig. 3 shows that $G_c$ is increasing for three measures out of four, pointing out that the values of centrality are more and more unevenly distributed. The flat trend characterizing the closeness centrality could be explained by the presence of nodes with large degree, which make the vast majority of nodes quite easily reachable. On the other hand, the evolution of the centralisation indices indicates that the BLN is not evolving towards a star graph, although the eigenvector-centralisation index reaches quite large values in the middle stages of the BLN history. As anticipated above, the picture provided by a star-like structure is too simplistic to provide a good description of the BLN topology.
[Figure caption fragment, label not recovered] A visual inspection of the network evolution suggests the presence of a core-periphery structure since its early stages.
Benchmarking the observations. Let us now benchmark the observations concerning the centrality and the centralisation indices against the predictions for the same quantities output by the UBCM. More specifically, we have computed the expected value of $G_c$ and $C_c$ (with $c_i = k_i^c, c_i^c, b_i^c, e_i^c$, $\forall\, i$) and the corresponding error, by explicitly sampling the ensemble of networks induced by the UBCM. In fig. 4 we plot and compare the evolution of the observed and expected values of $G_c$ and $C_c$, both as functions of $N$. Such a comparison reveals that the UBCM tends to overestimate the values of the Gini index for the degree, the closeness and the betweenness centrality and to underestimate the values of the Gini index for the eigenvector centrality. These results point out a behaviour that is not reproducible by just enforcing the degree sequence (irrespective of the chosen index). The evidence that the UBCM predicts a more-heterogeneous-than-observed structure could be explained starting from the result concerning the eigenvector centrality. The latter, in fact, seems to indicate a non-trivial (i.e. not reproducible by lower-order constraints like the degrees) tendency of well-connected nodes to establish connections among themselves, likely with nodes having a smaller degree attached to them. Such a disassortative structure could explain the less-than-expected level of unevenness characterizing the other centrality measures: in fact, each of the nodes behaving as the "leaves" of the hubs would basically have the same values of degree, closeness and betweenness centrality.
On the other hand, the betweenness- and the eigenvector-centralisation indices suggest that the BLN structure is indeed characterized by some kind of more-than-expected star-likeness: the deviations from the picture provided by such a benchmark, however, could be explained by the co-existence of multiple star-like sub-structures (see also fig. 2 and the Appendix for a more detailed discussion about this point).
Figure 3. (colour online) Top panels: evolution of the Gini index for the degree, closeness, betweenness and eigenvector centrality for the daily-block snapshot representation: $G_c$ is characterised by a rising trend, irrespective of the chosen indicator, pointing out that the values of centrality are increasingly unevenly distributed. Bottom panels: evolution of the degree-, closeness-, betweenness- and eigenvector-centralisation measures: although the eigenvector-centralisation index reaches quite large values in the middle stages of the BLN history, the picture provided by a star graph is too simple to faithfully represent the BLN structure.
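The benchmarking procedure, i.e. solving the UBCM for a given degree sequence and then sampling networks from the induced ensemble, can be sketched as follows. This is a minimal illustration under stated assumptions: the connection probability is taken to be $p_{ij} = x_i x_j / (1 + x_i x_j)$, the unknowns are found with a plain fixed-point iteration (one of several possible solvers, not necessarily the one used for the analysis), and the function names and the toy degree sequence are hypothetical.

```python
import random


def solve_ubcm(degrees, tol=1e-12, max_iter=10000):
    """Solve k_i = sum_{j != i} x_i x_j / (1 + x_i x_j) by fixed-point iteration."""
    n = len(degrees)
    two_l = sum(degrees)
    if two_l == 0:
        return [0.0] * n
    x = [k / two_l ** 0.5 for k in degrees]  # common initial guess
    for _ in range(max_iter):
        new_x = []
        for i in range(n):
            denom = sum(x[j] / (1.0 + x[i] * x[j]) for j in range(n) if j != i)
            new_x.append(degrees[i] / denom if denom > 0 else 0.0)
        if max(abs(a - b) for a, b in zip(new_x, x)) < tol:
            return new_x
        x = new_x
    raise RuntimeError("fixed-point iteration did not converge")


def connection_probability(x, i, j):
    """p_ij = x_i x_j / (1 + x_i x_j)."""
    return x[i] * x[j] / (1.0 + x[i] * x[j])


def expected_degrees(x):
    """Ensemble average of each degree, which should match the constraints."""
    n = len(x)
    return [sum(connection_probability(x, i, j) for j in range(n) if j != i)
            for i in range(n)]


def sample_network(x, rng):
    """Draw one network: each pair (i, j), i < j, is linked independently with p_ij."""
    n = len(x)
    return {(i, j) for i in range(n) for j in range(i + 1, n)
            if rng.random() < connection_probability(x, i, j)}


if __name__ == "__main__":
    toy_degrees = [3, 2, 2, 2, 1]        # hypothetical degree sequence
    x = solve_ubcm(toy_degrees)
    print(expected_degrees(x))           # should reproduce toy_degrees
    print(sample_network(x, random.Random(0)))
```

Expected values and errors of any network statistic (e.g. a Gini index or a centralisation index) can then be estimated by averaging that statistic over many sampled networks. Degree sequences containing a node linked to all the others drive the corresponding $x_i$ to infinity and are not handled by this sketch.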
Core-periphery detection. A clearer picture of the BLN topological structure is provided by the analysis aimed at clarifying the presence of a "core-periphery-like" organization. Inspecting the evolution of the bimodular surprise $S$ across the entire considered period reveals that the statistical significance of the recovered core-periphery structure increases, a result leading to the conclusion that the description of the BLN structure provided by such a model becomes more and more accurate as the network evolves. As an example, fig. 5 shows the detected core-periphery structure on the snapshots depicted in fig. 2: the nodes identified as belonging to the core and to the periphery are coloured in blue and yellow, respectively.
Empirical analysis of the BLN weighted structure. Let us now move to the empirical analysis of the weighted structure of the BLN, by inspecting the evolution of the total capacity $W$ of (i.e. the total number of bitcoins within) the BLN daily-block snapshot representation: fig. 6 shows the evolution of $W$ as a function of the network size $N$. The trend shown in the same figure reads $y = aN^b$ with $a = 2 \cdot 10^{-5}$ and $b = 2$. Although the total number of bitcoins rises, inequality rises as well: in fact, the percentage of nodes holding a given percentage of the bitcoins at stake in the BLN steadily decreases (on average, across the entire period, about 10% (50%) of the nodes hold 80% (99%) of the bitcoins; see the second panel of fig. 5). This trend is further confirmed by the evolution of the Gini coefficient $G_s$, whose value is 0.9 for the last snapshots of our dataset (and whose average value is 0.88 for the daily-block snapshot representation).
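Statements of the kind "10% of the nodes hold 80% of the bitcoins" correspond to reading a point off the cumulative distribution of node strengths. A minimal sketch of this computation is given below; the function name and the toy strengths are hypothetical and are not taken from the dataset.

```python
def top_share(strengths, fraction):
    """Share of the total strength held by the richest `fraction` of nodes."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must lie in (0, 1]")
    ordered = sorted(strengths, reverse=True)
    total = sum(ordered)
    if total == 0:
        return 0.0
    # Number of nodes in the top group, at least one.
    n_top = max(1, round(fraction * len(ordered)))
    return sum(ordered[:n_top]) / total


if __name__ == "__main__":
    toy_strengths = [80, 5, 5, 2, 2, 2, 1, 1, 1, 1]  # hypothetical capacities
    print(top_share(toy_strengths, 0.1))  # share held by the top 10% of nodes
    print(top_share(toy_strengths, 0.5))  # share held by the top 50% of nodes
```

Evaluating the function on every snapshot and averaging over time would give figures comparable to the ones quoted above, under the assumption that "holding" refers to the node strength $s_i$.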

Conclusions
The Bitcoin Lightning Network is a "Layer 2" protocol aimed at speeding up the Blockchain by enabling fast transactions between nodes. Originally designed to allow for cheaper and faster transactions without sacrificing the key feature of Bitcoin, i.e. its decentralisation, it is evolving towards an increasingly centralised architecture, as our analysis reveals. In particular, its structure seems to become increasingly similar to a core-periphery one, with well-connected nodes clustering together (as revealed by the study of the eigenvector centrality). More precisely, our analysis reveals the presence of many star-like sub-structures with the role of centres played by the hubs, seemingly acting as channel-switching nodes. Such a tendency seems to be observable even when considering weighted quantities, as only about 10% (50%) of the nodes hold 80% (99%) of the bitcoins at stake in the BLN (on average, across the entire period); moreover, the average Gini coefficient of the node strengths is 0.88. These results seem to confirm the tendency for the BLN architecture to become "less distributed", a process having the undesirable consequence of making the BLN increasingly fragile towards attacks and failures.
Figure 4. (colour online) Top panels: comparison between the observed Gini index for the degree, closeness, betweenness and eigenvector centrality (blue dots) and their expected value, computed under the UBCM (red diamonds), for the daily-block snapshot representation. Once the information contained in the degree sequence is properly accounted for, a (residual) tendency to centralisation is still visible. Bottom panels: comparison between the observed degree-, closeness-, betweenness- and eigenvector-centralisation measures and their expected value, computed under the UBCM (red diamonds). Once the information contained in the degree sequence is properly accounted for, the emerging picture is that of a network characterized by some kind of more-than-expected star-likeness: deviations from this benchmark, however, are clearly visible and probably due to the co-existence of many star-like sub-structures (see also fig. 2).

Author contributions
Jian-Hong Lin and Kevin Primicerio performed the analysis. Tiziano Squartini, Christian Decker and Claudio J. Tessone designed the research. All authors wrote, reviewed and approved the manuscript.
[Figure caption fragment, label not recovered] A visual inspection of these networks confirms that star-like sub-structures are present to a much lesser extent with respect to the observed BLN in the same snapshots. Bottom panel: evolution of the comparison between the empirical assortativity coefficient $r$ (blue dots) and its expected value, computed under the UBCM (red diamonds), for the daily-block snapshot representation. The BLN is significantly more disassortative than expected.