A Replication of Four Quasi-Experiments and Three Facts from ‘The Effect of File Sharing on Record Sales: An Empirical Analysis’ (Journal of Political Economy, 2007)

The influential piracy paper by Professors Oberholzer-Gee and Strumpf (OS), although mainly based on proprietary data, contained an "important complement" to the main results, consisting of four "quasi-experiments" using publicly available data. This replication examines all of these quasi-experiments using identical data and statistical methods where possible, as well as sometimes extending or augmenting the data or methods. It also checks three factual claims that OS make in support of their analysis. This study concludes that the quasi-experiments performed by OS each contain important errors, oversights, or inconsistencies that most often, but not always, overturn the results claimed in the original OS article. (Published as a Replication Study.) JEL codes: Z1, O3, L8


Introduction
One of the most influential articles, if not the most influential, on the effects of digital piracy (file-sharing) on the sound recording industry appeared as a lead article in the Journal of Political Economy in 2007. 1 Its authors, Felix Oberholzer-Gee and Koleman Strumpf (OS), using data obtained from a small pirate server, concluded that piracy had essentially no impact on record sales. Although I have elsewhere (Liebowitz, 2016b) discussed issues with their main data set and analyses, direct replication of their main regressions is not possible because OS never made their main data public. 2 Nevertheless, OS used publicly available data when conducting four "quasi-experiments" described as "an important complement" to their main analysis. In this replication I examine these quasi-experiments and also several pieces of data that OS use to support their analysis.
Note that additional details and results can be found in the Appendix, to which I will sometimes refer.

Does American Piracy Decrease Every Summer?
Here is how OS describe their first quasi-experiment:

"The first experiment involves variation over time. The number of file-sharing users in the United States drops 12 percent over the summer (estimated from BigChampagne [2006]) because college students are away from their high-speed campus Internet connections. If downloads crowd out sales, we should observe that the share of albums sold in the summer increases following the advent of file sharing." [2007, page 36]

OS purport to demonstrate that American piracy fell each summer, supposedly because American college students lost access to their high-speed Internet connections when they went home for the summer. 3 Thus the effects of piracy should have been weaker in the summer than during the rest of the year. OS then compared the yearly summer shares of record sales for four pre-Napster years and seven post-Napster years, expecting the summer share of record sales to be higher in the post-Napster years if piracy hurts sales, everything else equal.

1 According to Google Scholar, Oberholzer-Gee and Strumpf (2007)
I would agree that this quasi-experiment would be a test of the piracy thesis if piracy routinely fell in the summer, as OS claim. Since a regular decline in summer piracy is the key requirement for this quasi-experiment, OS's examination of summer piracy is a worthy target for replication.
The BigChampagne data that OS use to measure piracy run from August of 2002 to May of 2006. OS assume that what happens in the three complete summers during this 46-month period holds for all the summers during the full seven-year post-Napster period, because college students presumably went home every summer. Figure 1 charts the monthly number of pirates using the BigChampagne data, with the three complete summer periods denoted by circles. 4 Visual inspection of Figure 1 would seem to indicate that only the first summer clearly reflects lower piracy than the non-summer months. Indeed, if one calculates piracy levels in the summer months relative to the other months in each of those three years, as was done in my prior replication (Liebowitz, 2007), it is only the first summer that has a notable decline. The average summer piracy decline is 12%, matching the value reported by OS. 5

Nevertheless, OS have explained, in response to the prior replication, that they were comparing the summer decline relative to a trend. They stated "[a]t a time of rapid growth in the number of file sharers - the number of US users doubled between January 2004 and January 2006 - these summer months represent clear breaks from the growth trend in this period." 6 Figure 1 does not seem to comport any better with this latter explanation. It does not reveal a "clear break" in an otherwise upward trend during the summer of 2005, since piracy in that summer is higher than in the previous winter/spring and is essentially the same as during the following winter. But let's bring this claim more formally to the data.

3 Note the contradiction between OS's claim here that American school holidays (which affect both secondary and college students) reduce American piracy and their main regressions, in which OS claim that German secondary school holidays increase German piracy. This claim also appears inconsistent with OS's main (first-stage) regressions, which imply that German summer school holidays would strongly increase American summer piracy.
4 American university summers tend to take place during June through August, although OS also include May and September. I use their 5-month definition of summer vacation in all the calculations below.
In their response to my earlier replication, OS tell us that they regressed the number of pirates "in a specification which includes a time trend term and an indicator variable for summer. The regression results imply file sharing activity dropped 12% in the summer when we include a time trend and fell by 8% when we do not."

5 The first summer has a piracy drop of 38.8%, the second summer a drop of 0.2%, and the third summer an increase of 3.8%.
6 This quote comes from a response by OS to an editor from an earlier submission of this replication.
The singular term "indicator variable" suggests that they used a single dummy variable to represent all three complete summers. This seems strange, since a single dummy cannot answer the question of whether a summer piracy decline is a regular, routine occurrence; answering that question requires a separate dummy for each summer. Nevertheless, it is easy to replicate the regression described above, and the results (highlighted in red) are found in the first two data columns of Table 1. With the time trend included, the summer dummy is negative and of borderline statistical significance (8%), indicating that on average piracy falls in the summer, but the average decline appears to be only 8.8%, not the 12% reported by OS. 7 If we are going to test the claim that piracy declines in each summer due, say, to college vacations, it is necessary to examine each summer individually. The two rightmost columns of Table 1 provide the coefficients from including separate dummies (all statistically significant) for each of the three summers (highlighted in blue).
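To make the two specifications concrete, here is a minimal sketch of both regressions, assuming a monthly series of BigChampagne user counts (the file and variable names are hypothetical). The first model uses a single summer indicator plus a time trend, as the OS description implies; the second uses separate dummies for each of the three complete summers.

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical monthly file with columns: date, us_pirates (BigChampagne user counts)
df = pd.read_csv("bigchampagne_monthly.csv", parse_dates=["date"]).sort_values("date")
df["log_pirates"] = np.log(df["us_pirates"])   # logs, so dummy coefficients read roughly as percent changes
df["trend"] = range(len(df))                   # linear time trend
df["summer"] = df["date"].dt.month.isin([5, 6, 7, 8, 9]).astype(int)  # OS's five-month "summer"

# (1) One summer indicator plus a time trend, as the OS description implies
m1 = smf.ols("log_pirates ~ trend + summer", data=df).fit()

# (2) Separate dummies for each complete summer (2003-2005), which is what a test
#     of a routine, every-summer decline requires
for yr in (2003, 2004, 2005):
    df[f"summer_{yr}"] = ((df["summer"] == 1) & (df["date"].dt.year == yr)).astype(int)
m2 = smf.ols("log_pirates ~ trend + summer_2003 + summer_2004 + summer_2005", data=df).fit()

print(m1.params["summer"])
print(m2.params[["summer_2003", "summer_2004", "summer_2005"]])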
The results of these regressions confirm the intuition from Figure 1. The key finding is that the summer of 2005 has a significant increase in piracy. This runs counter to OS's claim that piracy routinely falls in each summer, thus destroying the logic of this quasi-experiment.
Interestingly, the average decline is the same as was originally found simply by taking the average of the three summer declines.
But note, also, that the 2003 decline might be due, at least in part, to special circumstances.

OS have also pointed to piracy statistics from Internet2 in support of their claimed summer declines. According to Internet2's website, "Internet2 operates the nation's largest and fastest, coast-to-coast research and education network." OS are clearly aware of this since they define Internet2, in their 2009 article (p. 28), as "the U.S. high-speed network which primarily connects universities." But American university activities slow during the summer, with most students and many faculty members on vacation. Therefore, Internet2 usage, particularly piracy, declines in the summer for idiosyncratic reasons having nothing to do with overall changes in piracy.
Internet2 piracy statistics, therefore, cannot legitimately be used to resurrect their quasi-experiment from the failed replication using the same BigChampagne data that they used.

East Coast Sales versus West Coast Sales
The second quasi-experiment rests on the argument that listeners in the eastern time zone are awake during more of the hours when European file sharers are online, giving them better access to pirated files; if piracy reduces sales, East Coast sales should therefore have fallen relative to West Coast sales. OS present some statistics that they take as supporting their conclusion that piracy has not harmed East Coast sales relative to West Coast sales. Here is their empirical summation:
In 1998, the last year in the pre-P2P [pre-piracy] period, the share of album sales in the eastern time zone was 43.9 percent. This share has hardly moved since then. In 1999-2002, the mean was 43.5 percent and the range was 42.7-44.0 percent. This is consistent with some common national factors, rather than file sharing, driving sales trends.
OS note that the share of East Coast sales has remained in a narrow range and conclude that this small variation somehow demonstrates that sales in the East did not fall relative to sales in the West. But OS do not compare actual market share changes across these two (out of four) U.S. time zones. Surely, to test their hypothesis, it would seem to matter whether shares fell on the East Coast relative to the West Coast after piracy became popular in 2000, since that is what the experiment claims to test.
Using the same Nielsen SoundScan 'album-sales-by-city' data for the same years as OS, plus an additional year's data (2003), allows a comparison of yearly East Coast and West Coast market shares. After examining the data, I find only small changes in East Coast market shares between the pre-Napster and post-Napster periods, just as OS did, so in that sense the replication largely confirms their claim about East Coast market shares hardly changing. 8 But I also examined and compared the changes in sales for the East and West Coasts before and after piracy's birth, shown in Table 2 (with the raw data in the Appendix) for various starting and ending years. When the data are arranged to actually answer the question this quasi-experiment posed, the results are contrary to OS's conclusion and are entirely consistent with piracy being harmful to sales. The sales declines in the East are greater than the declines in the West, 9 meaning that market shares rise in the West and fall in the East. 10 The decline in East Coast sales was almost ten percentage points greater than the decline on the West Coast (equivalently, a 70% larger decline), a surprisingly large number. The implication, of course, is that if the slight increase in piracy due to enhanced access to a sliver of non-sleeping Europeans caused a sales decline of ten percentage points in the East (relative to the West), then the overall impact of piracy should be considerably larger.

9 Although not shown, the market share of the combined middle time zones rose relative to the East but fell relative to the West, also consistent with the hypothesis that OS reject. Calculation of correct market shares for the middle time zones was not possible for OS given the way they treated a large category ("others") of unallocated sales that could not be fit into DMAs. OS put those sales into the middle time zones, an arbitrary procedure that obviously overstates the market shares of the middle time zones. I removed those "others" sales from the analysis, which allows the calculation of a correct market share for the middle time zones while not changing the shares of the East and West Coasts.
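The comparison in Table 2 is a straightforward computation once each DMA's sales are assigned to a time zone. Below is a minimal sketch, assuming a yearly file of album sales by DMA with a time-zone column already attached (the file and column names are hypothetical); it computes the change in sales, and in market share, for the Eastern and Pacific zones between an example pre-piracy year and post-piracy year.

import pandas as pd

# Hypothetical yearly file with columns: year, dma, time_zone, album_sales
sales = pd.read_csv("soundscan_zone_sales.csv")
sales = sales[sales["dma"] != "others"]        # drop unallocated sales that cannot be placed in a time zone

by_zone = sales.groupby(["year", "time_zone"])["album_sales"].sum().unstack("time_zone")
shares = by_zone.div(by_zone.sum(axis=1), axis=0)          # each zone's market share, by year

pre, post = 1998, 2003                                     # example pre- and post-piracy years
pct_change = 100 * (by_zone.loc[post] - by_zone.loc[pre]) / by_zone.loc[pre]
share_change = shares.loc[post] - shares.loc[pre]

print(pct_change[["Eastern", "Pacific"]])                  # sales changes, East versus West
print(share_change[["Eastern", "Pacific"]])                # market-share shifts between the coasts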
This replication fails to support OS's conclusion.
I should note that I do not consider this to be evidence supporting a claim that piracy harms sales. I find the suggestion that variation in the small number of Europeans awake late at night could measurably influence the piracy behavior of Americans sufficiently farfetched that I attribute these results to some other, unknown factor.

The relationship between monthly changes in file-sharing and record sales
The next quasi-experiment is a simple regression relating American record sales to the number of American pirates, with monthly fixed effects, over a 46-month period. Table 3 contains the original OS results alongside a direct replication using identical data, and a regression that also includes a simple measure of the health of the economy (the full results, with coefficients for the monthly fixed effects, are in the Appendix). The OS results are found in the first column and the second column contains a seemingly exact replication.
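A minimal sketch of these regressions, assuming a monthly file of national album sales, BigChampagne pirate counts, and the (not seasonally adjusted) unemployment rate (the file and variable names are hypothetical):

import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical monthly file with columns: date, album_sales, us_pirates, unemployment
df = pd.read_csv("monthly_sales_piracy.csv", parse_dates=["date"])
df["month"] = df["date"].dt.month

# Album sales on the number of pirates with month-of-year fixed effects
m_base = smf.ols("album_sales ~ us_pirates + C(month)", data=df).fit()

# Broadened specification: add the unemployment rate as a rough control for the economy
m_unemp = smf.ols("album_sales ~ us_pirates + unemployment + C(month)", data=df).fit()

print(m_base.params["us_pirates"], m_unemp.params["us_pirates"])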
It is somewhat difficult to categorize this attempted replication. The coefficient in the replication is 48% larger than OS's coefficient, although the two should have been quantitatively identical. 11 The qualitative results are largely the same: a negative relationship between piracy and record sales that is not statistically significant. This direct replication fails, therefore, but it does not change the qualitative conclusion. (The Appendix contains the raw data, which in principle could allow researchers to understand the cause of the different coefficients, if OS were to make their data available.)

A very straightforward broadening of the replication is to include some measure of the health of the economy. To this end, a monthly unemployment rate was included in the regression, seen in the last column of Table 3. 12 The coefficient on piracy quadruples in size while achieving statistical significance (4%). The unemployment rate is also of borderline significance (7%) with a negative sign, implying that record sales fall when the economy worsens, as we would expect. 13

OS claim that the coefficient on the number of pirates is economically unimportant. To address this claim, the next-to-last row of Table 3 converts each piracy coefficient into a predicted decline in album sales, which can be compared with the actual decline in sales since 2000, just as Napster was ramping up. 14 The OS result implies that somewhat more than one fifth of the actual sales reduction was due to piracy and the directly replicated results imply that approximately one third of the actual sales reduction was due to piracy. When the unemployment rate is added, the entire decline in album sales could be attributed to piracy, 15 a result that happens to match what the majority of the economic studies have found about the impact of piracy on record sales (Liebowitz, 2016a). I do not believe that declines of this size are economically unimportant.

11 Stata, SPSS, and Excel each generated identical coefficients.
12 These values come from the U.S. Bureau of Labor Statistics. I used the non-seasonally adjusted values since the purpose is to measure the overall strength of the economy, and if it is always a little stronger during the Christmas season we would want to include that extra strength. Using the seasonally adjusted values would slightly increase the absolute value of the coefficients and t-statistics for both the number of pirates and the unemployment rate.
13 OS have argued, in their response to an earlier version of this replication, that multicollinearity between the number of pirates and the unemployment rate makes these coefficients unreliable (the correlation is -.68). Their concern has some merit, but the VIFs are 5.91 and 4.78 for unemployment and piracy, respectively. These VIFs are moderately high, but in a range normally thought not to be indicative of a serious multicollinearity problem (typically thought to require a VIF above 10).
14 The predicted decline in sales due to piracy is calculated as the product of the average number of pirates over the 2003-2005 interval and the piracy coefficient. In the Appendix I provide these measurements separately for each year. The overall sales decline is based on Nielsen SoundScan average record sales over this period, compared to 2000, which was the peak year of sales using SoundScan statistics.
15 A value over 100% means that there would have been an increase in sales from 2000 to the 2003-2005 period, except for the negative consequences of piracy.
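In notation (the symbols below are my own shorthand, not OS's), the calculation described in footnote 14 is

\[
\widehat{\Delta S}_{\text{piracy}} = \left|\hat{\beta}_{\text{pirates}}\right| \times \bar{P}_{2003\text{-}2005},
\qquad
\text{share of actual decline} = \frac{\widehat{\Delta S}_{\text{piracy}}}{S_{2000} - \bar{S}_{2003\text{-}2005}},
\]

where \(\hat{\beta}_{\text{pirates}}\) is the estimated piracy coefficient, \(\bar{P}\) is the average number of pirates over 2003-2005, \(S_{2000}\) is album sales in the 2000 peak year, and \(\bar{S}\) is average album sales over 2003-2005.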
In conclusion, a narrow replication produced a coefficient 48% larger than the one reported by OS, although it should have been identical, and a slightly broader replication indicated a considerably larger economic impact of piracy that is clearly inconsistent with their findings.

The relationship between genre piracy and genre sales
The final quasi-experiment examines whether the genres of music most prone to piracy suffered larger sales declines than less piracy-prone genres. 16 OS found that genres with high piracy intensities were less likely to suffer sales declines than genres with low piracy intensities, although this result was statistically insignificant. They provided few details of the underlying analysis other than to say that they ran regressions of a genre's sales changes on genre piracy intensity, controlling for changes in the genre's popularity as measured on radio.
These missing details are important because it appears to be impossible for OS to have correctly done what they claim to have done for the genres used throughout their paper.
Throughout their paper they include 11 "genres" into which albums were classified, when in fact several of these "genres" are merely categories of sound recordings and not musical genres. Four of those categories (catalog, new artists, current hits, and soundtrack) 18 are based on sales volume, age of the recording, experience of the artist, or whether the music was in a movie.
The reason it does not seem possible to correctly do what they claim to have done is that there are no radio station music genres that match these four record categories. For example, consider the record category "new artists." A radio station may play new rap artists, or new country and western artists, or new rock artists, or even new classical or jazz artists, but these stations would also play established artists and there are no radio stations that play only new artists from every musical genre. Further, there are no radio genres that match the 'new artists' category (a list of radio genres is found in the Appendix). Nor are there radio stations playing only soundtracks of every type of music. Ditto for "current hits" and albums older than 18 months (catalog) of any style of music. Since there are no radio station genres that match these record categories, it does not seem possible to control for radio format popularity trends for these four "genres," although that is what OS claim to have done. OS do not specifically mention whether they include these four genres in their genre regressions, although it appears that they did. 19 Nor do they tell us what radio genres they "matched" to these four record categories that do not have similar radio genres.
It should be noted that even for actual musical genres, it is surprisingly difficult to match radio genres to sound recording genres. 20

A separate issue is the OS measure of piracy intensity, constructed from their proprietary piracy data (the quality of which I have questioned elsewhere 21). OS construct a variable intended to measure piracy intensity which they name "downloads per album," but this is apparently a ratio of pirate downloads per album title in a genre, 22 not downloads per album sold. I create an alternative measure of piracy intensity (which I refer to as "OS corrected"), which is downloads per album sold. The Appendix contains a discussion of why the latter is the superior variable, and also addresses how all the piracy intensity variables used here differ from more ideal measures. The correlation between the OS variable and my "corrected" version of this variable, for all OS genres, is only 6.5%, although it increases to 62.3% when the four non-music record categories are removed.

Putting these difficulties aside, we can try to replicate their regressions using the seven actual musical genres in their data. I broaden the replication by also examining the period 2000-2005, which appears to be a more appropriate time frame because file-sharing (beginning with Napster) did not become prominent until 2000 23 and because SoundScan's measured sales of sound recordings peaked in 2000, making this period more representative of the post-Napster regime of declining record sales. One further broadening of the replication is to use an additional source of genre piracy intensity data: I was able to acquire 2005 data on piracy intensity from the firm NPD. 24,25

18 As an example of the heterogeneity within any of these four categories, consider the soundtrack from the movie "Frozen" and the soundtrack from the movie "Straight Outta Compton." Although both are successful soundtracks, one is a Disney movie for children and the other is a biography of the first gangsta rap group. There are no radio station genres which would play both albums.
19 It appears that the four categories are not all removed, because OS state on page 37 that the mean value of pirate downloads per album title, across genres, is 61.2 (which, strangely, does not match the 57.1 value in their Table 3), which is very different from the average value of 35.9 obtained if the four non-musical categories are removed, using the numbers in their Table 3.
20 For example, one of the OS-chosen sales genres is "jazz," but although there is a radio "jazz" subcategory of New AC/Smooth Jazz, most listening measurements in that category are zero. On the other hand, "new adult contemporary," which is also a subcategory of "New AC/Smooth Jazz," appears to be the closest to jazz, although it is not clear how close. Similarly, the OS "hard" genre (related to "metal") is not a radio category, with the closest appearing to be the "new rock" subcategory under "Alternative" or the "active rock" subcategory of "Rock." A complete list of American radio genres is in the Appendix.
21 See Liebowitz (2016b).
22 As seen in OS's Tables 1 and 3, their album sales averages are more than a thousand-fold larger than their measure of downloads, implying that the mean value of downloads per sold album would be much less than 1. Therefore, the average value that OS provide (61.2 downloads per album) implies that the number of album titles per genre is in the denominator.
23 I provide, in the Appendix, details on Napster's size that make clear that Napster was not economically important until the summer of 2000.
The actual replicated regressions linking piracy intensity and genre sales changes (controlling for radio genre audience changes) are found in Table 5. Higher piracy intensity is linked to lower sales in all six instances, although the coefficients are statistically significant in only one case (using NPD data during 2000-2005), which should not be surprising with only seven observations. The results of the regressions can be summarized in an intuitive fashion, as in the bottom two rows of Table 5, which calculate the change in genre sales as piracy levels increase under two hypotheticals. In the first hypothetical (Economic Impact I), the calculation is performed under the assumption that a genre's piracy intensity increases from zero to that of the highest-piracy-intensity genre (which differs across the three measures of genre piracy intensity). 27 The first hypothetical gives an idea of the loss of revenue that might occur as other genres reach the same piracy intensity as the most piracy-intense genre in the early 2000s. Five of the six values are unambiguously large and economically important, and the sixth is not trivial. The second hypothetical (Economic Impact II) calculates the expected loss when a genre, starting from zero piracy, reaches the average piracy rate across all genres. 28 The results from the second hypothetical, found in the bottom row of Table 5, are considerably smaller than those from the first, since it measures the piracy-induced decline for the average genre. Although smaller, these average predicted declines are a fairly large portion of the actual decline in sales that took place from 2000 to 2005, which was 16.6%, 29 so the piracy-induced component would be an economically important portion of the decline.

24 NPD data do not necessarily reflect a representative sample of Internet users because its web users voluntarily allowed themselves to be monitored. There is no reason to think that this problem would bias the relative amount of piracy across genres, however.
25 NPD provided data for nine genres, but we could only match six to the OS genres and seven to the SoundScan sales genres. OS's data did not include "classical" and NPD's data did not include "Latin."
26 OS have wondered why the NPD correlations here are weaker than in my 2007 working paper. The answer is that in 2007 I equated their "hard" genre with NPD's "rock," but I later decided that NPD's "metal" genre was probably a better match.
27 The highest piracy-intensity value for the original OS measure of downloads per album title is associated with "alternative"; for the corrected OS measure of downloads per unit sold it is "hard"; and for NPD it is "rap."
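To make the regressions and the two hypothetical calculations concrete, here is a minimal sketch, assuming a small genre-level table (the file and column names are hypothetical; the actual data are the seven musical genres discussed above).

import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical genre-level file with columns: genre, sales_change, piracy_intensity, radio_change
g = pd.read_csv("genre_data.csv")              # the seven musical genres only

# Sales change on piracy intensity, controlling for the change in radio audience
m = smf.ols("sales_change ~ piracy_intensity + radio_change", data=g).fit()
beta = m.params["piracy_intensity"]

# Economic Impact I: move a genre from zero piracy to the most piracy-intense genre's level
impact_max = beta * g["piracy_intensity"].max()

# Economic Impact II: move a genre from zero piracy to the average piracy intensity
impact_avg = beta * g["piracy_intensity"].mean()

print(beta, impact_max, impact_avg)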
The conclusion that the genres with the highest piracy rates suffered the largest sales losses is fairly consistent across specifications (although we cannot have a great deal of confidence in these results): it makes little difference which time period is used, or whether the original OS piracy-intensity numbers, the corrected OS piracy-intensity numbers, or the NPD values are used. This is in contrast to OS, who found a positive (statistically insignificant) relationship.

28 The average NPD value is only an approximation since we only have data for 67% of album sales. Also, the difference between the average genre and the maximum genre differs between the NPD data and the OS data.
29 The equivalent value for 1999-2005 is 13.3%.
I view these results as economically quite different from the OS genre results, even if the small number of observations makes it difficult to have normal levels of confidence in them.

Data Replication
In this section we check three of OS's factual claims, made in support of their thesis, against the available industry data. Because OS provide no reference for these factual claims (and the JPE apparently did not ask for one), we cannot directly source-check them but instead must go to our own direct sources. The primary, perhaps only, source for international comparisons of record sales by year is the IFPI (International Federation of the Phonographic Industry).
With regard to the first claim, that there are major markets with flat or rising sales, we first need to define "major." I would think that major national markets are presumably larger than Switzerland, which has a population similar to that of New York City. I think most economists would define rising sales as an increase in real revenues. 30

30 In my earlier 2007 working paper I discussed other claimed facts that I believe to be incorrect, but the demonstration in some of those cases was quite lengthy, so I merely refer the reader to Liebowitz (2007).
Using IFPI data and looking at the top 10 markets (Switzerland is number 10) reveals that real retail sales rose in none of the top 10 markets during 2000-2005. 31 The IFPI data also reveal that only one of the top five markets had a real increase in revenue in 2005, and only two had a nominal increase, contrary to the claim that sales rose in four of the top five markets (the details are in the Appendix). These data replications clearly fail.
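The real-versus-nominal comparison underlying these checks can be sketched as follows, assuming a table of IFPI retail revenues by country and year and a matching table of consumer price indexes (all file and column names are hypothetical).

import pandas as pd

# Hypothetical files: country, year, retail_revenue (local currency) and country, year, cpi
rev = pd.read_csv("ifpi_retail_revenue.csv")
cpi = pd.read_csv("cpi_by_country.csv")

df = rev.merge(cpi, on=["country", "year"])
df["real_revenue"] = df["retail_revenue"] / df["cpi"]      # deflate by the local price index

wide = df.pivot(index="country", columns="year", values="real_revenue")
real_change = 100 * (wide[2005] - wide[2000]) / wide[2000] # percent change in real revenue, 2000-2005

# A market counts as "rising" only if real revenue increased over the period
print(real_change.sort_values(ascending=False))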

Conclusion
None of the economic conclusions from OS's quasi-experiments hold up under replication.
Several factual claims made about the industry also could not be replicated.

APPENDIX

EAST COAST AND WEST COAST
Calculating market shares of music albums by U.S. time zone merely requires putting the SoundScan cities (DMAs) into their appropriate time zones, although this is not as simple as just assigning DMAs to states. 34 The table in the text shows the change in sales over different periods of time; below are the yearly raw numbers, including the shares of the East and West Coasts. The middle time zones (Central and Mountain) had market share increases after 2000, but not as large as those in the West. This is also consistent with the hypothesis that OS reject.
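A minimal sketch of the assignment step, assuming a hand-built mapping from DMA names to time zones (the DMAs shown are illustrative only; a full mapping must handle DMAs that straddle time-zone boundaries, which is why assigning by state is not sufficient):

import pandas as pd

# Illustrative (and incomplete) DMA-to-time-zone mapping; DMAs that straddle a time-zone
# boundary must be assigned individually, which is why a simple state rule is not enough
DMA_TIME_ZONE = {
    "New York": "Eastern",
    "Atlanta": "Eastern",
    "Chicago": "Central",
    "Denver": "Mountain",
    "Los Angeles": "Pacific",
    # ... remaining DMAs assigned one by one
}

sales = pd.read_csv("soundscan_dma_sales.csv")   # hypothetical columns: year, dma, album_sales
sales["time_zone"] = sales["dma"].map(DMA_TIME_ZONE)
zone_sales = sales.groupby(["year", "time_zone"])["album_sales"].sum()
zone_shares = zone_sales / zone_sales.groupby(level="year").transform("sum")
print(zone_shares)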

MONTHLY PIRACY AND RECORD SALES
Here are the full regressions, including the monthly dummies, that lead to Table 3.

GENRE PIRACY AND GENRE SALES

Piracy Intensity Measure
The genre piracy intensity measure, for our purposes, should reflect the number of pirated recorded songs listened to as a share of all recorded songs being listened to by those consumers who would have listened to those songs if piracy did not exist. If piracy has no effect on sales, this ratio should be close to zero, since pirated songs would only be listened to in order to decide whether to purchase the song, and that decision should not take too many repeated plays. Pirate listeners who would not have purchased the song without piracy should be excluded from these calculations since their behavior does not directly affect record sales.
For obvious reasons, we cannot construct an ideal measure of piracy intensity. OS have information from their sample of pirates on the number of pirated downloaded songs and the national sales for their sample of albums, classified by genre. NPD has data on the share of users in their sample who listen to pirated songs and the share who listen to purchased songs.
These variables allow the creation of rough measures of piracy intensity. OS use "number of downloads" [number of pirated files] and appear to divide it by the number of album titles in their sample of albums in a genre [footnote 22 discussed why this appears to be the case] although it is not clear that this was their intention. I create a variable which I believe better fits their variable name "downloads per album" by using album sales in the denominator, not album titles.
Downloads per album title is not a good choice for a piracy intensity measure because the number of album titles may not reflect the number of albums sold. Here is a simple example to illustrate the point: Genre A has 1 million albums sold, 1 million total pirated albums, and 1,000 titles (each selling 1,000 units, on average). Genre B has 1 million albums sold, 1 million total pirated albums, and 2 titles (each selling half a million, on average). Each genre has the same number of albums sold. Each genre has the same number of pirated albums. Each genre, therefore, has the same piracy intensity, about 50%. Yet the OS measure of piracy intensity would indicate that Genre B has 500 times the piracy intensity of Genre A because the number of pirated files per album title is 500 times greater.
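The arithmetic of the example can be checked directly; the short sketch below computes the three candidate measures for the two hypothetical genres.

# The two hypothetical genres from the example above
genres = {
    "A": {"sold": 1_000_000, "pirated": 1_000_000, "titles": 1_000},
    "B": {"sold": 1_000_000, "pirated": 1_000_000, "titles": 2},
}

for name, g in genres.items():
    per_title = g["pirated"] / g["titles"]                  # the apparent OS measure
    per_sold = g["pirated"] / g["sold"]                     # the "OS corrected" measure
    share = g["pirated"] / (g["pirated"] + g["sold"])       # piracy as a share of the potential market
    print(name, per_title, per_sold, share)

# per_title is 1,000 for Genre A but 500,000 for Genre B (a 500-fold difference),
# while per_sold and share are identical across the two genres.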
Finally, these measures take the form a/b, where, if we could do it correctly, a is a measure of piracy (on the part of those who would otherwise purchase the music) and b is a measure of sales. A more correct variable would take the form a/(a+b) [where a+b equals the size of the potential market], and this ratio would be linearly related to piracy intensity as it would be expected to influence sales, assuming that piracy reduces sales. The variable actually used, a/b, grows more rapidly as piracy grows than does a/(a+b), if piracy reduces sales (b). In other words, increases in piracy not only increase the numerator of a/b but also decrease its denominator. Thus, the measures of piracy being used will show too high a level of piracy for genres with high levels of piracy, and the coefficients relating sales changes to piracy intensity will be understated relative to those from the more appropriate variable a/(a+b). This understatement merely lends more credence to the claim that this evidence supports the hypothesis that piracy decreases sales.
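The relationship between the two ratios can be written out explicitly. Letting x = a/(a+b) denote piracy as a share of the potential market (my notation, not OS's),

\[
\frac{a}{b} = \frac{x}{1-x},
\]

which rises more than proportionally as x increases. So, when piracy also depresses sales b, a/b exaggerates the measured piracy intensity of the most heavily pirated genres relative to a/(a+b), pulling the estimated coefficient on intensity toward zero, as described above.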

RADIO STATION GENRES
Here is a list of radio station genres and subgenres according to Arbitron, the leading organization measuring the size of radio audiences. It should make clear how difficult it can be to match a record category, of which there are not as many, to the correct radio genre, since the two use different terminologies:

LEADING MARKETS WITH SALES INCREASES, 2000-2005 AND 2004-2005
OS claim to have found leading markets with increases in sales over the period 2000-2005. Below are listed the nominal and real sales changes for the ten leading markets over that period. None of the top 10 markets had an increase in real retail revenue over this period. These data come from the IFPI annual publication "Recording Industry in Numbers."

2000-2005 Change in Retail Revenue
OS also report that music sales rose in 4 of the top 5 markets in 2005. The table below, using the same IFPI data on retail sales, reveals that 1 of the top 5 markets had a real increase in revenue and 2 of the top 5 had a nominal increase in revenue. An additional five countries (ranked by size) are shown to indicate that none of the other top 10 markets had real sales increases that year, although Italy had a nominal increase.