How to Measure the Liquidity of Cryptocurrency Markets?

This paper investigates the efficacy of low-frequency transaction-based liquidity measures to describe actual (high-frequency) liquidity. We show that the Corwin and Schultz (2012) and Abdi and Ranaldo (2017) estimators outperform other measures in describing time-series variations, irrespective of the observation frequency, trading venue, high-frequency liquidity benchmark, and cryptocurrency. Both measures perform well during high and low return, volatility and volume periods. The Kyle and Obizhaeva (2016) estimator and the Amihud (2002) illiquidity ratio outperform when estimating liquidity levels. These two estimators also reliably identify liquidity differences between trading venues. Overall, the results suggest that there is not yet a universally best measure but there are reasonably good low-frequency measures.


Introduction
Bitcoin and other cryptocurrencies are now firmly entrenched in the financial system. Bitcoin is becoming a widely accepted form of online payment and more than 35 million bitcoin wallets are in existence. Trading in bitcoin exceeded USD 930 billion in January 2020. Bitcoin also forms a growing part of investment [...] The growing importance of bitcoin for payments and investments is dependent on an efficient transfer of bitcoin for other currencies on cryptocurrency exchanges. The number of exchanges has exploded, making it difficult for investors to select an exchange for trading and hedging. While trading has become relatively frequent in cryptocurrencies, the liquidity of these markets is difficult to determine. Cryptocurrency markets lack a regulated data feed like the consolidated tape for U.S. equities. The lack of a consolidated feed, coupled with the high number of exchanges and jurisdictions, makes it difficult to calculate high-frequency bid-ask spreads, thereby hampering the comparison of liquidity across cryptocurrency exchanges. The bid-ask spread is an important metric when assessing an exchange in that it represents the costs of immediately buying or selling a security. Bid-ask spreads are usually calculated using high-frequency intraday data that are both expensive to purchase and time-consuming to process. We compare high-frequency measures of liquidity with easy-to-compute low-frequency measures.
1 See: https://www.cmegroup.com/trading/bitcoin-futures.html
2 https://www.coindesk.com/bitcoin-futures-pass-1b-in-open-interest-on-bitmex-for-firsttime-since-march-crash
3 Data for bitcoin volume come from https://coinmarketcap.com/currencies/bitcoin/historical-data/ and NYSE volume data are from https://focus.world-exchanges.org/issue/january-2020/market-statistics. Both sites were accessed on February 15, 2020. We note, though, that some cryptocurrency exchanges appear to be overstating their reported volume (e.g. Hougan et al. (2019)) and that ranking websites like coinmarketcap.com may be tempted to report inflated trading volume due to their revenue model, which is largely dependent on crypto-exchanges (Alexander and Dakos (2019)).
Characterising liquidity across exchanges is important for investors, traders, and hedging strategies that use cryptocurrencies (Hu et al. (2019)), all of which can be negatively affected by the costs of illiquidity. Additionally, cryptocurrency prices are not integrated across exchanges (Makarov and Schoar (2020)) and the decision to trade on an exchange is binding as orders cannot easily be rerouted to exchanges that are more liquid or offer better prices. With little information about individual exchanges, traders may have to rely on transactions data such as the daily high, low and closing prices to evaluate market quality.
We study the accuracy of liquidity measures derived from transactions data. To this end we estimate low-frequency measures derived from aggregate transactions data 4 (prices and volumes) and compare them to high-frequency measures of transaction costs and price impact calculated from order book data. Our objective is to identify the transactions-based measure that best describes actual liquidity on a cryptocurrency exchange.
Data on best bid and ask prices and order books are hard to obtain and process. 5 As such, few papers use full order book data to study the liquid- [...] (2020)). Cryptocurrency markets have characteristics that differ from traditional markets, 6 suggesting that liquidity formation on cryptocurrency exchanges may differ from that of other asset markets.
We use a novel and comprehensive set of continuous transactions data and order book snapshots comprising the 50 best bids and asks for two major cryptocurrencies (bitcoin and ethereum) and three large exchanges (Bitfinex, Bitstamp and Coinbase Pro) over a two-year period. First, we use these data to construct high-frequency measures of transaction costs and price impact. These measures serve as our liquidity benchmarks. In a second step, we use transactions data (prices and volumes) and calculate various liquidity proxies at lower frequencies (1 hour, 1 day, and 15 days, 7 respectively). Data to compute the measures are collected at the 1-minute, 1-hour and 1-day frequency. 8 Individual low-frequency measures have been used to describe liquidity in cryptocurrency markets (e.g. Brauneis and Mestel (2018), Dimpfl (2017), Fink and Johann [...]
5 It can be downloaded in real time from the REST APIs of each cryptocurrency exchange or it can be purchased from vendors such as Kaiko.
6 E.g. markets are highly fragmented and weakly regulated; trading takes place 24 hours a day, 365 days a year, with direct market access for all traders; trading platforms allow a direct transfer of fiat currency from and to bank accounts or credit cards and transactions are cleared and settled by exchanges directly; margin trading and short-selling are uncommon.
7 Providing results at the monthly frequency is infeasible because high-frequency data for cryptocurrencies are limited to 24 months.
8 In contrast to CRSP for equity markets, which provides daily prices, cryptocurrency prices are available at higher than daily frequency. For example, the site cryptodatadownload.com provides free data for many currency pairs and trading venues at the hourly frequency.
[...] trade) and for both bitcoin and ethereum. 9 Average time-series correlations describe the average relationship between benchmark measures and proxies but do not capture the relationship for extreme liquidity events that may be important for investment and hedging strategies. We use quantile dependence plots to understand how well transactions-based liquidity measures capture the time-series properties of the benchmark measures across the distribution. This is an important extension since the relative performance of liquidity proxies might be different depending on the liquidity regime. We perform several sample splits and find similar performance rankings of our liquidity proxies for high and low volume and volatility periods. Given the extreme volatility associated with bitcoin and cryptocurrency markets more generally, identifying liquidity proxies that are invariant to volatility and volume swings is an important contribution.
The popular Amihud (2002) illiquidity ratio does not capture the time-series variability of liquidity in the cryptocurrency markets. 10 The poor performance is driven by the relationship between volume and illiquidity, which is assumed to be negative in Amihud (2002) but is positive in cryptocurrency markets. The positive relation between bid-ask spreads and volume is at odds with most theoretical predictions but has recently also been documented by Bogousslavsky and Collin-Dufresne (2020) for large US stocks.
Consistent with prior efforts to identify good low-frequency proxies of liquidity (Hasbrouck and Seppi, 2001), we construct a composite estimator, the first principal component of the low-frequency proxies, and find that it does not improve on the performance of the best individual proxies using intraday data, but offers some improvements at the daily and lower frequencies.
9 Consistent with our results, Karnaukh et al. (2015) report the Corwin and Schultz (2012) measure to have the highest correlation with high-frequency bid-ask spreads in FX markets.
10 Conceptually, the Amihud (2002) illiquidity ratio is a proxy for the price impact, not for the spread. We find that it also performs poorly in tracking the time-series variation of price impacts, a component of the effective spread.
The measures that best describe the level of the benchmark measures are the Kyle and Obizhaeva (2016) and Amihud (2002) estimators. In this application the Corwin and Schultz (2012) and Abdi and Ranaldo (2017) estimators perform poorly. We find that the values obtained for these two estimators and for the Roll (1984) estimator are negatively related to the data frequency, a finding that has been documented previously for the Roll estimator (Roll (1984), Harris (1990)) but has, to the best of our knowledge, not been documented for the high-low spread estimators.
An important application of liquidity proxies is to select an execution venue among a number of alternatives. We use the low-frequency estimators to rank trading venues according to their liquidity. We find that the Amihud (2002) illiquidity ratio and the Kyle and Obizhaeva (2016) estimator replicate the 'true' ranking when compared to the ranking generated using high-frequency order book measures. 11
Our findings are useful for researchers, investors, traders, trading venue operators and regulators to understand liquidity levels and dynamics on cryptocurrency exchanges with relatively easy to acquire and process aggregate price and volume data. Investors seeking the most liquid exchanges are best advised to use the Amihud (2002) illiquidity ratio or the Kyle and Obizhaeva (2016) estimator.
These two measures also provide good approximations of the level of liquidity. The remainder of the paper is organized as follows. In section 2 we describe our data and methodology, section 3 presents the results, and section 4 concludes.

We compile a high-frequency data set that covers the two-year period from 12/16/2017 00:00 UTC to 12/16/2019 00:00 UTC, a total of 730 trading days (17,520 hours). Over this period we used Matlab to continuously access the public and freely accessible REST APIs of three large trading venues, Bitfinex, Bitstamp and Coinbase Pro (formerly known as GDAX). These are among the largest cryptocurrency spot trading platforms. All three venues operate an electronic central limit order book with orders being matched based on price and time priority.
The REST APIs provide live information on transactions and the current state of the order book. All public endpoints at each of these exchanges use GET requests for different types of information. We request records on 'Trades' / 'Transactions' and the 'Order book'. Depending on the venue, request parameters vary. For instance, Bitstamp only provides the full order book (with usually thousands of entries) whereas order book requests at Bitfinex and Coinbase Pro may be limited to the 50 best orders on each side of the market.

A potential problem associated with transactions data from cryptocurrency exchanges is fake data. A widely cited report by Hougan et al. (2019) argues that up to 95% of exchange-reported trading volume in bitcoin might not represent economically meaningful transactions or might even be plain fake. Collecting unique high-frequency trade and order book data for bitcoin, the authors subject 83 cryptocurrency exchanges to several tests to identify exchanges that are likely to overstate trading volume. Only 10 exchanges passed all the tests and are characterized as "real volume" exchanges. The three trading venues that we consider in the present study all belong to the latter group.
From each trading venue we download data for two currency pairs, bitcoin versus US dollar (BTCUSD) and ethereum against US dollar (ETHUSD). The data set includes the price and the corresponding dollar trading volume for each transaction, a UNIX time stamp, a unique exchange-specific ID and a trade indicator which indicates whether a transaction was buyer-initiated or seller-initiated. Table 1 lists the total number of transactions and order book snapshots for both currencies and all three markets. A total of 90.7 (53.4) million transactions were executed for bitcoin (ethereum) during the investigation period, most of them on Coinbase Pro, while Bitstamp reports the fewest transactions.

INSERT TABLE 1 ABOUT HERE
We observe several time intervals with gaps in the data. These may be due to actually missing trading activity, technical problems (failure of the internet connection, no response from the server etc.), or exchange-specific trading halts (e.g. due to maintenance, updates or hacker attacks). We identify between 6,329 (Coinbase Pro - BTC) and 187,254 (Bitstamp - ETH) intervals without transaction data exceeding 60 seconds (1 minute), between 2,641 and 5,920 intervals exceeding 600 seconds (10 minutes) and between 1,573 and 2,199 intervals exceeding 1,800 seconds (30 minutes).
Besides transactions data we retrieve order book data from the three trading platforms. Specifically, we collect the 50 best bid and best ask prices with corresponding volumes, resulting in a total of 14.6 million (13.5 million, 16.4 million) order book snapshots for Bitfinex (Bitstamp, Coinbase Pro) for the two cryptocurrencies under investigation (see Table 1). As for the transactions data we observe a considerable number of intervals without order book snapshots. There are between 13,038 (Coinbase Pro) and 84,461 (Bitfinex) intervals without data exceeding 60 seconds. The numbers of intervals without order book snapshots exceeding 600 seconds and 1,800 seconds are roughly equal across the three exchanges and amount to approximately 3,300 and 2,200, respectively. The standard deviations of quote midpoint returns are similar across trading venues and are generally higher for ETHUSD than for BTCUSD (see Table 2).
12 We note that the differences in the standard deviation of returns may reflect liquidity differences because the return standard deviation is affected by bid-ask bounce. In fact, as shown in Table 3 below, bid-ask spreads are largest on Bitstamp, a result that has also been confirmed by Brauneis et al. (2019).
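As an illustration of how gaps of this kind can be counted, the following minimal Python sketch takes a list of UNIX time stamps and counts the intervals between consecutive observations that exceed 60, 600 and 1,800 seconds. The function name `count_gaps` and the toy time stamps are hypothetical and are not taken from the original data collection code.

```python
def count_gaps(timestamps, thresholds=(60, 600, 1800)):
    """Count gaps between consecutive observations longer than each threshold.

    timestamps: UNIX time stamps in seconds (any order).
    Returns a dict mapping each threshold (seconds) to the number of gaps
    that strictly exceed it.
    """
    ts = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ts, ts[1:])]
    return {thr: sum(1 for g in gaps if g > thr) for thr in thresholds}


# Toy example (hypothetical time stamps, in seconds).
example = [0, 10, 30, 100, 105, 800, 2700]
gap_counts = count_gaps(example)
```

Note that a gap exceeding 1,800 seconds is also counted under the 60-second and 600-second thresholds, which matches the nested way the counts are reported above.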

Measures of liquidity
The purpose of our paper is to assess and compare the accuracy of transactions-based measures of liquidity. In doing so we take the perspective of a researcher who has access to data on open, high, low and close prices and on the number of transactions and the dollar trading volume.
For our analysis we need to specify (a) the frequency at which these data are available (measured by the length of the subintervals i in the sequel) and (b) the frequency at which the transactions-based measures are calculated (measured by the length of the intervals t). Unlike for other financial markets (e.g. stock markets), price and volume data for cryptocurrencies are easily available for higher than daily frequencies. We therefore choose three distinct setups.
• Data are available at the 1-minute frequency and are used to estimate transactions-based liquidity measures at the hourly frequency.

• Data are available at the 1-hour frequency and are used to estimate liquidity measures at the daily frequency.
• Data are available at the daily level and are used to calculate liquidity measures at a 15-day frequency.
To construct the data set we use our record of all transactions and extract the open, high, low and close price as well as the number of transactions and the dollar volume at the respective frequencies of one minute, one hour and one day. These data are then used to calculate the transactions-based measures (to be described below) at the hourly, daily and 15-day frequencies, respectively.
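The aggregation step can be sketched as follows. This is a minimal illustration in Python, not the original implementation: the function name `aggregate_bars`, the tuple layout of a trade and the toy trades are assumptions made for the example. Because UNIX time counts seconds from 00:00 UTC, flooring a time stamp to a multiple of the bar width aligns 1-minute, 1-hour and 1-day bars with UTC clock time.

```python
def aggregate_bars(trades, width):
    """Aggregate trades into open/high/low/close bars.

    trades: iterable of (unix_time, price, dollar_volume) tuples.
    width:  bar length in seconds (60, 3600 or 86400 for the three setups).
    Returns a dict keyed by the bar's start time; bars without trades
    are simply absent from the result.
    """
    bars = {}
    # Sort by time only, so trades sharing a time stamp keep their input order.
    for ts, price, dollar_volume in sorted(trades, key=lambda trade: trade[0]):
        start = int(ts // width) * width
        bar = bars.get(start)
        if bar is None:
            bars[start] = {
                "open": price, "high": price, "low": price, "close": price,
                "n_trades": 1, "dollar_volume": dollar_volume,
            }
        else:
            bar["high"] = max(bar["high"], price)
            bar["low"] = min(bar["low"], price)
            bar["close"] = price
            bar["n_trades"] += 1
            bar["dollar_volume"] += dollar_volume
    return bars


# Toy example: three hypothetical trades aggregated to 1-minute bars.
toy_trades = [(5, 100.0, 50.0), (40, 102.0, 30.0), (70, 101.0, 20.0)]
minute_bars = aggregate_bars(toy_trades, 60)
```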
Because the three trading venues under investigation are located in different time zones we follow coinmarketcap.com and define a trading day as lasting from 00:00 UTC to 23:59 UTC. For an interval to be included in the analysis we require that data are available for at least 80% of the subintervals. Thus, when we aggregate minute-by-minute (hour-by-hour, daily) data to the hourly (daily, 15-day) frequency we require at least 48 minutes (19 hours, 12 days) with valid data.
In the sequel we first describe the high-frequency measures which we use as benchmark measures and then the transactions-based measures that we wish to evaluate.
The percentage quoted spread is the difference between the best ask price $P^a$ and the best bid price $P^b$ of each order book snapshot, divided by the quote midpoint $M^Q = (P^b + P^a)/2$ and averaged over all observations in the interval:
$$QS_t = \frac{1}{N_t} \sum_{j=1}^{N_t} \frac{P^a_j - P^b_j}{M^Q_j}$$
The subscript $j$ denotes the $j$th order book snapshot in interval $t$ and $N_t$ is the total number of order book snapshots in interval $t$.

To estimate the effective bid-ask spread we combine order book snapshots with the first transaction that occurs after the snapshot. 13 The price of this transaction is denoted $P^+$. The average percentage effective spread in interval $t$ is then calculated over the $N^+_t$ order book snapshots that are followed by a transaction before the next order book snapshot is recorded.
• Percentage Price Impact (PI)
To estimate the price impact we use data sequences consisting of an order book snapshot, the first transaction after the snapshot and the subsequent order book snapshot. The percentage price impact is then calculated as the signed percentage change in the quote midpoint from the pre-transaction order book snapshot to the post-transaction snapshot, 14 averaged over all data sequences (as defined above) in the interval. $Q^+_j$ denotes the trade indicator (+1 for a buyer-initiated trade and −1 for a seller-initiated trade) of the transaction occurring after the order book snapshot $j$. 15
• Percentage cost of a roundtrip trade (CRT(Y))
To assess the liquidity for larger trades we use the order book data to calculate the weighted average prices at which a buy and a sell order of a given size $Y$ would execute. The weighted average price for executing a transaction of size $Y$ USD given the current state of the order book is defined in terms of $A_k$ and $V_k$, the price and volume of the $k$th order, respectively. Note that the $K$th order may be subject to partial execution, depending on the outstanding dollar volume required to entirely fill the transaction volume $Y$. We set $Y$ equal to the 99% quantile of the corresponding (aggregate) trade size distribution. For the currency pair BTCUSD this value is approximately equal to USD 32,100, while for ETHUSD $Y$ roughly corresponds to USD 17,400.
To estimate the cost of a roundtrip trade of size $Y$, CRT(Y), we calculate the weighted average prices for a market buy order and a market sell order of size $Y$ and then express the difference between the two prices as a fraction of the quote midpoint. Finally, we calculate an equally-weighted average across all order book snapshots in interval $t$.
13 When there is more than one transaction between two order book snapshots we only use the first of these transactions.
14 From the numbers in Table 1 it follows that we observe an order book snapshot every nine seconds on average. Thus, the horizon over which we calculate the price impact is slightly less than nine seconds on average. This is in line with Conrad and Wahal (2020), who recommend using a horizon of no more than 15 seconds for liquid stocks.
15 As before, when there is more than one transaction between two order book snapshots we only use the first of these transactions. We lose one observation in each interval because the last order book snapshot in an interval is discarded as it is not followed by another snapshot in the same interval.
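The four benchmark measures described above can be sketched in a few lines of Python. The sketch is illustrative only and rests on assumptions that are not confirmed by the text: the effective spread is taken to be twice the absolute distance between the transaction price and the preceding quote midpoint, relative to that midpoint (a common textbook convention); order book volumes are assumed to be quoted in units of the cryptocurrency; and all function names and toy numbers are hypothetical.

```python
from statistics import mean


def quoted_spread(quotes):
    """quotes: list of (best_bid, best_ask), one pair per order book snapshot."""
    return mean([(ask - bid) / ((ask + bid) / 2.0) for bid, ask in quotes])


def effective_spread(pairs):
    """pairs: list of (midpoint_before_trade, trade_price).
    ASSUMPTION: full spread, i.e. 2 * |P+ - M| / M."""
    return mean([2.0 * abs(price - mid) / mid for mid, price in pairs])


def price_impact(sequences):
    """sequences: list of (mid_before, trade_sign, mid_after), sign = +1 or -1."""
    return mean([sign * (after - before) / before
                 for before, sign, after in sequences])


def average_price(levels, y):
    """Average price of a market order worth y USD.
    levels: list of (price, quantity_in_coins), best price first (ASSUMED units).
    The last level touched may be only partially executed."""
    remaining, coins = y, 0.0
    for price, quantity in levels:
        take = min(remaining, price * quantity)  # dollars filled at this level
        coins += take / price
        remaining -= take
        if remaining <= 0:
            break
    if remaining > 0:
        raise ValueError("order book too thin for the requested size")
    return y / coins


def cost_of_roundtrip(asks, bids, y):
    """Difference between average buy and sell price, relative to the midpoint."""
    mid = (asks[0][0] + bids[0][0]) / 2.0
    return (average_price(asks, y) - average_price(bids, y)) / mid


# Toy order book (hypothetical numbers).
asks = [(101.0, 1.0), (102.0, 2.0)]
bids = [(99.0, 1.0), (98.0, 2.0)]
crt_small = cost_of_roundtrip(asks, bids, 50.0)    # fills at the best quotes
crt_large = cost_of_roundtrip(asks, bids, 150.0)   # walks into the second level
```

For an order small enough to execute entirely at the best quotes, the roundtrip cost in this sketch coincides with the quoted spread; it exceeds the quoted spread once the order walks deeper into the book.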
The CRT(Y) measure is conceptually similar to the quoted bid-ask spread. [...]
We also calculated correlations between our benchmark liquidity measures. Using the daily data set for the pair BTCUSD (results for other frequencies as well as for ETHUSD are similar and available upon request) we find the highest correlation between QS and ES (exchange average: 0.92) while PI has the lowest correlations with all other benchmark measures (e.g. average across exchanges between PI and QS: 0.55). This confirms that PI captures a different dimension of market liquidity than QS and ES. Concerning the correlations between our spread measures QS and ES, respectively, and our hypothetical [...] 2000)). We take these results as evidence that QS and ES are good indicators for market liquidity not only at but also beyond the inside spread.
[...] Table 3, Table 9 in the appendix provides descriptive statistics for the benchmark liquidity measures for the currency pair ETHUSD at the daily frequency (again, results for the other frequencies are essentially identical and are available upon request). As for BTCUSD we find the levels of our four benchmark measures to be very similar on Bitfinex and Coinbase Pro. Average quoted spreads are about twice as high as those for the pair BTCUSD. Average effective spreads for ETHUSD are below 1 bp on Bitfinex and Coinbase Pro, but again are higher than those for BTCUSD. Average price impacts on both trading venues are almost equal to effective half-spreads, implying that the suppliers of liquidity earn a very small realized spread on average.
16 By way of comparison: Mancini et al. (2013) report liquidity for the 9 most traded exchange rates on the EBS platform over the period January 2007 to December 2009. They find EURUSD to be the most liquid rate with a mean relative quoted spread (effective spread) of 1.05 (0.31) basis points. USDCAD is the least liquid of the analyzed pairs with respect to the quoted spread (8.27 basis points) while AUDUSD has the highest effective spread (1.38 basis points).
17 Because of the higher execution costs on Bitstamp traders may want to avoid Bitstamp. However, there are several reasons why we may still observe significant trading activity on Bitstamp. First, most transactions are small. The median trade size on Bitstamp is 354 USD (the corresponding values for Bitfinex and Coinbase Pro are 500 and 140 USD, respectively). Assuming a quoted spread of 6.6 bp (the median quoted spread on Bitstamp), the execution costs of a median-sized trade on Bitstamp amount to 0.14 USD (354 USD multiplied by the half-spread), an amount which traders may deem negligible. Second, there are frictions beyond the bid-ask spread. For example, trading venues differ in how traders can transfer and withdraw fiat money to and from their accounts. These differences can result in cost and speed differences between the exchanges. Third, not all traders are free to choose where to trade. For example, Bitfinex did not accept US residents as customers during our sample period. Fourth, traders may prefer to trade on a venue in or close to their home country, e.g. because they are more familiar with the legislative regime.
As for BTCUSD, Bitstamp is substantially less liquid for ETHUSD than the other two exchanges. 18 The average quoted spread (effective spread) amounts to 13.26 bp (1.77 bp). Again the average price impact is not larger on Bitstamp than on Bitfinex and Coinbase Pro, implying substantial realized spreads to be earned by liquidity suppliers on Bitstamp.

Transactions-based proxy measures
As noted previously, all transactions-based liquidity measures are calculated from data on open, high, low and closing prices as well as the number of transactions and the dollar trading volume for each subinterval i. The data for the subintervals are then aggregated to one liquidity estimate for each interval t.
We use the following transactions-based measures.

For each interval $t$ we calculate the unweighted average of the number of transactions in the subintervals $i$, $TX_t = \frac{1}{I} \sum_i TX_{t,i}$, where $I$ denotes the number of subintervals in interval $t$.

• Dollar Volume ($Vol)
Our second transactions-based measure is the unweighted average of the reported dollar transaction volume, $\$Vol_t = \frac{1}{I} \sum_i \$Vol_{t,i}$, in all subintervals $i$ belonging to interval $t$.
• The Amihud (2002) illiquidity ratio (Amihud)
The Amihud (2002) illiquidity ratio for each subinterval is the absolute return (measured from the opening price to the closing price of the subinterval) divided by the dollar trading volume in the subinterval,
$$Amihud_t = \frac{1}{I} \sum_i \frac{|r_{t,i}|}{\$Vol_{t,i}}$$
where $r_{t,i}$ is the return from the opening price $O_{t,i}$ to the closing price $C_{t,i}$ in subinterval $i$ in $t$. The illiquidity ratio for interval $t$ is the unweighted average of the ratios for the subintervals $i$ in $t$. We note that conceptually the illiquidity ratio is a measure of price impact. However, in empirical applications it is routinely used as a proxy for liquidity at large.
• The Roll (1984) serial covariance estimator (Roll)
The Roll (1984) estimator infers the spread from the first-order serial covariance of price changes or, alternatively, of returns; $\Delta$ denotes the first difference operator. In the results section, Roll p (Roll r) refers to the price-based (return-based) version.
18 Evidently, ETHUSD is a rather infrequently traded pair on Bitstamp (roughly 5 million transactions over our investigation period, compared to roughly 24 million on Bitfinex and Coinbase Pro), which is why we only have 4,410 observations of hourly data on Bitstamp that match the 80% data availability criterion (compared to more than 11,920 one-hour intervals that match this criterion on Bitstamp for the pair BTCUSD).
• The Kyle and Obizhaeva (2016) estimator (Kyle)
Kyle and Obizhaeva (2016) derive an illiquidity index based on the ratio of volatility to dollar volume of an asset within a given interval. The volatility estimator $\sigma^2_{t,i}(r)$ entering the index is the mean of the squared returns of all subintervals $i$ in interval $t$.

• The Corwin and Schultz (2012) estimator (CS).
The CS estimator is calculated from the high and low prices of two adjacent subintervals $i$, $i+1$. In its definition, $H_i$ and $L_i$ denote the high and low prices, respectively, in subinterval $i$, while $H_{i,i+1}$ and $L_{i,i+1}$ refer to the high and low price, respectively, of two adjacent subintervals $i$ and $i+1$. We follow Corwin and Schultz (2012) and set negative values of the proxy to zero. The $CS_t$ estimator for period $t$ is the unweighted average of all CS estimators for adjacent subintervals in $t$.

Corwin and Schultz (2012) propose a method to adjust their estimator for the overnight trading halt. We do not need to implement this modification because cryptocurrency exchanges operate 24 hours a day and seven days a week. There are thus no regular trading halts.
• The Abdi and Ranaldo (2017) estimator (AR)
The Abdi and Ranaldo (2017) estimator uses high and low price data from two adjacent subintervals $i$ and $i+1$. The $AR_t$ estimator for interval $t$ is the average of the $AR_{t,i}$ measures for all adjacent subintervals $i$ in $t$.
[...] smallest spread estimates. We will compare the mean values shown in Table 4 to the effective spread calculated from high-frequency quote data in section 3.5 below.

INSERT TABLE 4 ABOUT HERE
For the currency pair ETHUSD descriptive statistics for our proxy liquidity measures are reported in Table 10 in the appendix. As for our benchmark measures the results indicate that the pair ETHUSD is less liquid than BTCUSD: volume-based proxy measures show lower values, while price-based measures are higher. 19 We will discuss these results in more detail in section 3.7.
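The transactions-based proxies of this section can be sketched in Python from a list of subinterval bars. The sketch follows the published forms of the estimators as we understand them and makes several assumptions that the text above does not settle: returns are simple open-to-close returns; the Roll estimate is set to zero when the serial covariance is non-negative; the Kyle and Obizhaeva index is taken as the cube root of mean squared return over mean dollar volume; and the AR measure uses the two-period form with negative products truncated at zero. All function names and toy bars are hypothetical.

```python
import math
from statistics import mean


def simple_return(bar):
    return bar["close"] / bar["open"] - 1.0  # ASSUMPTION: simple, not log, return


def amihud(bars):
    """Mean of |return| / dollar volume over subintervals with positive volume."""
    return mean([abs(simple_return(b)) / b["dollar_volume"]
                 for b in bars if b["dollar_volume"] > 0])


def roll_price(closes):
    """Price-based Roll estimator: 2 * sqrt(-cov(dP_i, dP_{i-1}))."""
    d = [b - a for a, b in zip(closes, closes[1:])]
    x, y = d[1:], d[:-1]
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)
    return 2.0 * math.sqrt(-cov) if cov < 0 else 0.0  # ASSUMPTION: zero if cov >= 0


def kyle_obizhaeva(bars):
    """ASSUMED form: (mean squared return / mean dollar volume) ** (1/3)."""
    sigma2 = mean([simple_return(b) ** 2 for b in bars])
    return (sigma2 / mean([b["dollar_volume"] for b in bars])) ** (1.0 / 3.0)


def corwin_schultz(bars):
    """High-low spread estimator, averaged over adjacent pairs of bars."""
    k = 3.0 - 2.0 * math.sqrt(2.0)
    estimates = []
    for a, b in zip(bars, bars[1:]):
        beta = math.log(a["high"] / a["low"]) ** 2 + math.log(b["high"] / b["low"]) ** 2
        gamma = math.log(max(a["high"], b["high"]) / min(a["low"], b["low"])) ** 2
        alpha = (math.sqrt(2.0 * beta) - math.sqrt(beta)) / k - math.sqrt(gamma / k)
        spread = 2.0 * (math.exp(alpha) - 1.0) / (1.0 + math.exp(alpha))
        estimates.append(max(spread, 0.0))  # negative values are set to zero
    return mean(estimates)


def abdi_ranaldo(bars):
    """Two-period AR measure: sqrt(max(4 (c_i - eta_i)(c_i - eta_{i+1}), 0))."""
    def eta(bar):  # mid-range of log high and log low
        return (math.log(bar["high"]) + math.log(bar["low"])) / 2.0
    estimates = []
    for a, b in zip(bars, bars[1:]):
        c = math.log(a["close"])
        estimates.append(math.sqrt(max(4.0 * (c - eta(a)) * (c - eta(b)), 0.0)))
    return mean(estimates)


# Toy data: two identical bars whose range reflects only the bid-ask bounce.
toy_bar = {"open": 100.0, "high": 101.0, "low": 100.0, "close": 101.0,
           "dollar_volume": 1000.0}
toy_bars = [toy_bar, toy_bar]
```

In the toy example the price range is constant and entirely due to the difference between a bid of 100 and an ask of 101, so both high-low estimators return a value close to 1%.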

We present the results in six steps. We first report time-series correlations between the transactions-based proxies and the benchmark measures. Correlations are a global measure of linear dependence. To analyze whether the dependence structure is different in the tails of the distribution we analyze, in step 2, quantile dependencies based on the empirical distribution functions. In a third [...]
In sections 3.1 to 3.6 we present results for the currency pair BTCUSD in detail; qualitative results for the pair ETHUSD are similar in most respects and are summarized in section 3.7.
19 We note that the price-based Roll measure delivers an estimate of the dollar spread, not of the percentage spread. The numerical values are lower for ETHUSD than for BTCUSD because the dollar price of ethereum is only a fraction of the bitcoin price.

Time-Series Correlations
An accurate transactions-based measure should capture the time-series variation in liquidity and should thus be positively correlated with the benchmark measures. We therefore estimate time-series correlations between the low-frequency measures and the high-frequency measures. We do so separately for three exchanges (Bitfinex, Bitstamp and Coinbase Pro) and three time frames (1-minute data (1-hour data, daily data) aggregated to the hourly (daily, 15-day) frequency). As it turns out, the results for the three exchanges are very similar. We therefore report averages across the trading venues. 20 In the description of the results we emphasize the findings for the daily data (i.e. hourly data aggregated to the daily frequency). We believe that most researchers using transactions-based liquidity proxies will do so to obtain daily estimates, and the hourly raw data required to calculate these daily estimates are easily and freely available, e.g. from cryptodatadownload.com.
The results are presented in Figures 2, 3 and 4. Focusing on the results for the daily intervals (Figure 2) we find that the Abdi and Ranaldo (2017) and Corwin and Schultz (2012) [...] of the correlation is even negative. 21 The poor performance of the Amihud (2002) illiquidity ratio deserves discussion because this ratio is widely used as a measure of liquidity in empirical microstructure research. We argue that the lack of correlation between the illiquidity ratio and the benchmark measures is caused by the strong and positive relation between illiquidity and trading activity discussed above. The illiquidity ratio is based on the presumption that, in a less liquid market, a given dollar trading volume will have a larger impact on prices and will thus result in a larger price change. Put differently, for a given price change higher volume points to a more liquid market and should thus be associated with lower execution costs according to the inherent logic of the measure. However, in the markets under investigation volume is positively related to execution costs, a relation that runs counter to the logic of the illiquidity ratio.
We wish to reemphasize that the finding of a positive relation between trading activity and execution costs, even though at odds with the predictions of stan- [...]
The strong and positive correlations documented above for the daily data frequency between the transaction frequency and dollar trading volume on the one hand and the benchmark measures on the other hand persist at the other data frequencies.
The performance of the Kyle and Obizhaeva (2016) estimator is better at higher data frequencies while the Roll (1984) estimator appears to perform better at lower frequencies. The Amihud (2002) illiquidity ratio continues to be the worst-performing measure. As explained above, the most likely reason is the positive relation between trading activity and spreads in the cryptocurrency markets.

INSERT FIGURE 3 ABOUT HERE INSERT FIGURE 4 ABOUT HERE
So far we have documented differences in correlations across the different transactions-based measures, but we do not know whether the differences are significant. We therefore now perform a formal test based on the Fisher r-to-z 545 transformation. Specifically we test, separately for each of the four benchmark measures and the three trading venues, whether the correlation between the best-performing proxy and the benchmark measure is significantly higher than the correlation between the second-best performing proxy and the benchmark.
The results are reported in Table 5. The evidence in favor of significant differences is limited to the highest data frequency and to two benchmark measures, the quoted and the effective spread. In four out of six cases the Corwin and Schultz (2012) estimator performs best and displays significantly higher correlation with the respective benchmark measure than the second-best transactions-based measure (which, in all four cases, is the Abdi and Ranaldo (2017) estimator). In one case (effective spreads on Bitstamp) the ranking of the two best-performing proxies is the same but the difference is insignificant, and in one case (quoted spreads on Bitstamp) the order of the CS and AR estimators is reversed. For the other benchmark measures and for lower data frequencies there is no compelling evidence in favor of significant differences between the two best-performing transactions-based measures.
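The comparison of two correlation coefficients via the Fisher r-to-z transformation can be sketched as follows. This is a minimal illustration, not the exact procedure behind Table 5: it assumes the two correlations are estimated from independent samples, whereas here both share the same benchmark series, and the text does not state whether or how that overlap is adjusted for. The function name `fisher_z_test` and its arguments are hypothetical.

```python
import math


def fisher_z_test(r_best, r_second, n_best, n_second):
    """One-sided test of H0: rho_best <= rho_second.

    Assumption (not from the paper): the two sample correlations are
    treated as independent, so the variances of the z-values simply add.
    """
    # Fisher transformation: z = atanh(r) is approximately normal
    # with variance 1 / (n - 3).
    z_best = math.atanh(r_best)
    z_second = math.atanh(r_second)
    se = math.sqrt(1.0 / (n_best - 3) + 1.0 / (n_second - 3))
    z_stat = (z_best - z_second) / se
    # Upper-tail p-value of the standard normal distribution.
    p_value = 0.5 * math.erfc(z_stat / math.sqrt(2.0))
    return z_stat, p_value


# Illustrative numbers only (not taken from the paper).
z_stat, p_value = fisher_z_test(0.80, 0.60, 1000, 1000)
print(f"z = {z_stat:.2f}, one-sided p = {p_value:.4g}")
```

With few observations, as at the 15-day frequency, the standard error is large, which is one reason why significant differences are more likely to show up at the highest data frequency.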

Quantile dependence
The correlation between the time series of transactions-based proxies and the benchmark measures of liquidity provides a global measure of dependence.

However, it is conceivable that a proxy measure that fits the benchmark well in times of high liquidity (i.e. in times of low bid-ask spreads) performs poorly in times of low liquidity and vice versa. We therefore use quantile dependence to analyze the dependence structure between the benchmark and proxy measures in more detail. The quantile dependence of order q between two random variables $\eta_p$ and $\eta_b$ is generally defined as the conditional probability that $F_p(\eta_p)$ is smaller (greater) than q given that $F_b(\eta_b)$ is smaller (greater) than q for q ≤ 0.5 (q > 0.5): 22

$$
\lambda^{p,b}_q =
\begin{cases}
P\left[F_p(\eta_p) \le q \mid F_b(\eta_b) \le q\right], & \text{for } q \in (0, 0.5], \\
P\left[F_p(\eta_p) > q \mid F_b(\eta_b) > q\right], & \text{for } q \in (0.5, 1).
\end{cases}
$$

In our application the subscripts b and p refer to the benchmark measures and the transactions-based proxies, respectively. q denotes a quantile and $F_j(\eta_j)$, $j \in \{b, p\}$, denotes the empirical distribution functions of the benchmark and proxy measure, respectively. We estimate it using scaled ranks, i.e. we transform the data into ranks and then rescale these ranks onto the unit interval.
Intuitively, quantile dependence works as follows. For any q ≤ 0.5 consider the q · T smallest observations for the benchmark measure, where T is the total number of observations. Then consider the q · T smallest values for a transactions-based proxy and determine the fraction of coinciding values. This fraction is the estimate of the quantile dependence, $\lambda^{p,b}_q$.
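The estimation procedure just described can be sketched in a few lines. This is an illustrative sketch only: the function names are hypothetical, ranks are scaled as rank/T, and ties are broken by position in the sample, which are assumptions not stated in the text.

```python
def scaled_ranks(values):
    """Map each observation to rank / T, i.e. onto the interval (0, 1].

    Assumption: ties are broken by order of appearance.
    """
    n_obs = len(values)
    order = sorted(range(n_obs), key=lambda i: values[i])
    ranks = [0.0] * n_obs
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank / n_obs
    return ranks


def quantile_dependence(proxy, bench, q):
    """Estimate the quantile dependence of order q between proxy and benchmark."""
    u_proxy = scaled_ranks(proxy)
    u_bench = scaled_ranks(bench)
    if q <= 0.5:
        # Lower tail: condition on the benchmark being among its smallest values.
        cond = [i for i, u in enumerate(u_bench) if u <= q]
        hits = sum(1 for i in cond if u_proxy[i] <= q)
    else:
        # Upper tail: condition on the benchmark being among its largest values.
        cond = [i for i, u in enumerate(u_bench) if u > q]
        hits = sum(1 for i in cond if u_proxy[i] > q)
    return hits / len(cond) if cond else float("nan")


# Toy example: a proxy that orders the days exactly like the benchmark.
bench = [0.8, 1.1, 0.5, 2.0, 1.4, 0.9, 3.1, 0.7]
proxy = [b * 10 for b in bench]
print(quantile_dependence(proxy, bench, 0.25))
```

Evaluating the function on a grid of q values and plotting the result against q gives a quantile dependence plot of the kind described in the text.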
We use data at the daily frequency 23 to estimate the quantile dependence separately for each trading venue and then calculate averages across venues.
We present the results using quantile dependence plots which show the quantile dependence as a function of q. Higher quantile dependence implies a closer relation between the benchmark measures and the transactions-based proxies. The results for our four benchmark measures and eight proxies are shown in Figure 5. The dependence between the benchmark and proxy measures is generally stronger in the center of the distribution and weaker in the tails. The dependence in the tails appears to be asymmetric: it tends to be higher for larger than for smaller values. 24

INSERT FIGURE 5 ABOUT HERE
23 Results for the hourly frequency are qualitatively similar and are available upon request. The number of observations at the 15-day frequency is too low to reliably estimate quantile dependence, particularly in the tails of the distributions.
24 When interpreting the results note that a quantile dependence of 0.5 for q = 0.5 is expected when the two distributions are independent.
With respect to the ranking of the transactions-based liquidity measures the results from the quantile dependence analysis are consistent with the results shown in Figure 2 above. In particular, the Abdi and Ranaldo (2017) and Corwin and Schultz (2012) estimators perform very well over the entire distribution, i.e. for high as well as low levels of liquidity. The two measures of trading activity, the number of transactions and the dollar volume, perform very well when the effective spread is used as benchmark. As before, the Amihud (2002) illiquidity ratio performs poorly.

Composite Estimator
The different transactions-based measures capture different aspects of liquidity. It is, therefore, conceivable that a combination of these measures better captures the time-series variation of liquidity. To test whether this is the case we construct a composite estimator based on the eight low-frequency measures. We first standardize all variables by subtracting the mean and dividing by their standard deviation. We then extract the first principal component of the standardized data and estimate the time-series correlations between the first principal component and the benchmark measures. Table 6 shows the results for each time frame and each of the three exchanges. Comparing the results in Table 6 to those in Figures 2, 3 and 4 reveals that the best performing individual estimators, the Corwin and Schultz (2012) and Abdi and Ranaldo (2017) estimators, achieve higher time-series correlation than the composite estimator for each benchmark measure at the hourly frequency and for three out of four benchmark measures (the exception being the effective spread) at the daily data frequency. At the lowest data frequency, on the other hand, the composite estimator improves upon the best-performing individual measure for three out of four benchmark measures. We thus conclude that the benefit of calculating all transactions-based proxies and aggregating them to a composite estimator is limited, at least at higher data frequencies.
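The construction of the composite estimator (standardize, extract the first principal component, correlate with a benchmark) can be sketched as follows. This is a sketch under stated assumptions rather than the implementation used for Table 6: to stay within the standard library, the leading eigenvector is obtained by power iteration instead of a library eigendecomposition, and all function names and the toy data are hypothetical. The sign of a principal component is arbitrary, so only the absolute value of the correlation is meaningful.

```python
import math


def standardize(series):
    """Subtract the mean and divide by the sample standard deviation."""
    mean = sum(series) / len(series)
    sd = math.sqrt(sum((v - mean) ** 2 for v in series) / (len(series) - 1))
    return [(v - mean) / sd for v in series]


def first_principal_component(proxies, iterations=500):
    """Return the scores of the first principal component.

    proxies: list of equally long time series, one per liquidity proxy.
    Assumption: power iteration with a fixed start vector; it can fail in the
    rare case where the start vector is orthogonal to the leading eigenvector.
    """
    z = [standardize(p) for p in proxies]
    k, n_obs = len(z), len(z[0])
    # Correlation matrix of the standardized proxies.
    corr = [[sum(a * b for a, b in zip(z[i], z[j])) / (n_obs - 1)
             for j in range(k)] for i in range(k)]
    weights = [float(i + 1) for i in range(k)]
    for _ in range(iterations):
        new = [sum(corr[i][j] * weights[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(v * v for v in new))
        weights = [v / norm for v in new]
    # Project the standardized data onto the leading eigenvector.
    return [sum(weights[i] * z[i][t] for i in range(k)) for t in range(n_obs)]


def pearson(x, y):
    """Sample Pearson correlation between two equally long series."""
    zx, zy = standardize(x), standardize(y)
    return sum(a * b for a, b in zip(zx, zy)) / (len(x) - 1)


# Toy data (not from the paper): three proxies and one benchmark.
proxies = [[1, 2, 3, 4, 5, 6, 7, 8],
           [2, 4, 6, 8, 10, 12, 14, 17],
           [1, 3, 2, 5, 4, 7, 6, 9]]
benchmark = [1.0, 1.9, 3.2, 4.1, 4.8, 6.3, 7.1, 8.2]
composite = first_principal_component(proxies)
print(round(abs(pearson(composite, benchmark)), 3))
```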

Sample splits
It may be the case that some of the transactions-based liquidity measures perform better under specific circumstances, e.g. earlier or later in the sample [...] Further, because the magnitude of the execution costs determines whether a given price difference (e.g. for the same cryptocurrency at two different trading venues) can be profitably exploited, the level of liquidity is also related to market efficiency. We use as performance metrics the prediction error between the liquidity benchmark and the liquidity proxy as measured by the root mean squared error (RMSE) and the mean absolute error (MAE). Table 7 [...] Table 7 suggest that the high-low spread estimators developed by Corwin and Schultz (2012) and Abdi and Ranaldo (2017) are subject to a similar bias.
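The two error metrics named above, RMSE and MAE, can be computed as in the following sketch. It assumes that proxy and benchmark are expressed in the same units and aligned by interval; the text available here does not say whether any rescaling is applied first, so that step is left out. Function names and numbers are illustrative.

```python
import math


def rmse(bench, proxy):
    """Root mean squared error between benchmark and proxy."""
    return math.sqrt(sum((p - b) ** 2 for b, p in zip(bench, proxy)) / len(bench))


def mae(bench, proxy):
    """Mean absolute error between benchmark and proxy."""
    return sum(abs(p - b) for b, p in zip(bench, proxy)) / len(bench)


# Illustrative spreads in basis points (not taken from the paper).
benchmark_spread = [2.0, 3.0, 2.5, 4.0]
proxy_spread = [2.5, 2.0, 2.5, 6.0]
print(rmse(benchmark_spread, proxy_spread), mae(benchmark_spread, proxy_spread))
```

Because the RMSE squares the deviations, it penalizes a few large misses more heavily than the MAE does, so the two metrics can rank proxies differently.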

Cross-Sectional Analysis
One potential application of transactions-based liquidity measures is to compare the liquidity of different trading venues. A good proxy measure should produce the same ranking of the venues as the benchmark measures. Therefore, in order to evaluate the low-frequency measures we simply analyze how frequently the liquidity ranking across trading venues produced by the transactions-based measures is equal to the ranking produced by the benchmark measures. We perform the analysis separately for each exchange pair (Bitfinex/Bitstamp, Bitfinex/Coinbase Pro, and Bitstamp/Coinbase Pro) and for each time frame. For each interval (one hour, one day, 15 days) and each exchange pair we record the corresponding liquidity ranking based on the benchmark measures and based on the transactions-based proxies and then simply count the fraction of identical rankings. By chance, this fraction should be 50%. Therefore, we test whether the actual fractions are significantly larger than 50% using a simple binomial test. The results are presented in Table 8.

INSERT TABLE 8

26 Specifically, he showed that the expected value of the serial covariance estimator is $E[\widehat{\mathrm{cov}}] = -\frac{s^2}{4} - \frac{\sigma^2}{n}$, where s is the spread, $\sigma^2$ is the variance of price changes and n is the number of observations. The bias in the serial covariance estimator, $-\frac{\sigma^2}{n}$, increases with the square of the observation interval. Under ideal conditions (i.e. i.i.d. returns and continuous trading seven days a week, as is the rule in cryptocurrency markets), the variance of weekly price changes is seven times the variance of daily price changes while the number of observations is one seventh. Consequently, the bias in weekly data is 49 times the bias in daily data.
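The ranking comparison and the binomial test for one exchange pair can be sketched as follows. The sketch assumes a one-sided exact test against a chance level of 50% and treats ties as "not more illiquid"; neither detail is specified in the text, and the function names and data are hypothetical.

```python
from math import comb


def count_identical_rankings(bench_a, bench_b, proxy_a, proxy_b):
    """Count intervals in which proxy and benchmark rank venues A and B alike.

    Each argument is a series of liquidity values per interval for one venue.
    Assumption: ties count as 'A is not larger than B'.
    """
    same = sum(1 for ba, bb, pa, pb in zip(bench_a, bench_b, proxy_a, proxy_b)
               if (ba > bb) == (pa > pb))
    return same, len(bench_a)


def binomial_upper_pvalue(k, n):
    """Exact P[X >= k] for X ~ Binomial(n, 0.5), using integer arithmetic."""
    return sum(comb(n, j) for j in range(k, n + 1)) / 2 ** n


# Toy example with four intervals (not from the paper).
same, total = count_identical_rankings(
    bench_a=[1, 2, 3, 4], bench_b=[2, 1, 4, 3],
    proxy_a=[5, 9, 1, 8], proxy_b=[6, 7, 2, 9],
)
print(same / total, binomial_upper_pvalue(same, total))
```

Summing binomial coefficients as integers and dividing once avoids the floating-point overflow that term-by-term probabilities would cause for the several thousand hourly intervals in a sample of this kind.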

In this section we briefly discuss our results for the currency pair ETHUSD which are in most respects qualitatively similar to those for BTCUSD. As for BTCUSD, results for the three trading venues are very similar. We therefore report averages across the venues. All tables and figures we are referring to are in the appendix.

When considering the time-series correlations between our low-frequency liquidity measures and the high-frequency benchmark measures we generally find correlation levels that are slightly lower for ETHUSD than for BTCUSD, particularly at the lowest data frequency (see Figures 10 to 12). The Corwin and Schultz (2012) and Abdi and Ranaldo (2017) estimators yield the highest correlations for QS, ES and PI at the hourly and the daily frequency, with often almost identical correlation levels achieved by these two estimators. When the cost of a roundtrip trade CRT(Y) 28 is used as benchmark measure the Kyle and Obizhaeva (2016) estimator (which performs rather poorly for the other benchmark measures) performs best. As for the pair BTCUSD the volume proxy measures are surprisingly highly positively correlated with the benchmark measures, particularly with the effective spread.
27 Note that the Amihud (2002) illiquidity ratio does not capture the liquidity differences between Bitfinex and Coinbase well. However, as documented in Table 3 above, the liquidity differences between these two exchanges are small. When liquidity differences are small, ranking venues according to their liquidity is less important.
28 We set the dollar trading volume Y to USD 17,400 which corresponds to the 99% quantile of the aggregate trade size distribution for the currency pair ETHUSD.
The quantile dependence plots for ETHUSD yield results similar to those for BTCUSD. The ranking of the low-frequency proxy measures mostly holds over the entire distribution, implying that the ranking of the proxies tends to be independent of the level of liquidity (see Figure 13). When we construct a composite estimator from the eight proxy measures by means of a principal component analysis we find the time-series correlations between the composite estimator and our benchmark measures to be mostly lower than the correlations between the benchmarks and the Corwin and Schultz (2012) and the Abdi and Ranaldo (2017) estimators, respectively (see Table 11). Only for the 15-day time frame does the composite estimator achieve higher correlations than the best-performing individual proxy for two of the benchmark measures (QS and ES).
We separately calculate the time-series correlations between the proxy liq- [...] To summarize, our findings relating to time-series correlations between proxy and benchmark measures for the pair ETHUSD are similar to the results for BTCUSD.

In a next step we investigate the ability of the low-frequency measures to capture the level of the high-frequency benchmarks. We use the same performance metrics as in section 3.5 and obtain results for the pair ETHUSD that are again qualitatively very similar to those for BTCUSD (see Table 12). The [...] Again, the performance of these proxy measures gets worse the longer the time frame, probably because of the small sample bias mentioned on page 31.
Finally, we analyze the ability of the low-frequency measures to replicate the ranking produced by the benchmark measures as in section 3.6 above. Results are displayed in Table 13. Two measures stand out, the Amihud (2002) illiquidity ratio and the Kyle and Obizhaeva (2016)

Conclusion
In this paper we compare the performance of transactions-based liquidity measures to benchmark measures derived from high-frequency order book data.
We use data for the two most actively traded cryptocurrencies, bitcoin and ethereum, and from three trading venues. We consider four benchmark mea- [...] These differing findings suggest that the setting is important in determining the best liquidity proxy.
Our results can be used by researchers, investors, traders, and regulators to understand liquidity levels and dynamics with relatively easy to acquire and process aggregate price and volume data. In many applications, the transactions-based aggregate measures perform adequately when describing high-frequency measures derived from order book data. The use of these low-frequency measures is far less time-consuming and memory-intensive, offering a reasonable compromise between accuracy and computational workload. Strategies that require more granular data such as triangular arbitrage or market-making will of course require higher frequency measures.

Tables

                number of transactions       number of order books
                BTC           ETH            BTC          ETH
Bitfinex        37,148,069    23,820,982     7,271,422    7,336,639
Bitstamp        15,310,565     5,207,030     6,913,021    6,611,326
Coinbase Pro    38,241,727    24,420,844     8,186,287    8,186,287