P2P, Online File-Sharing, and the Music Industry
Rufus Pollock
2005-11-10, updated 2006-03-31

2007-11-07: info on more recent papers can be found in the filesharing section of the main site

Please post comments and corrections on the main site or by email to comments [at] rufuspollock.org.
Licensed under a CC Attribution License v2.0 (all jurisdictions).

Contents

  1. Introduction
  2. Summary of Evidence
    1. Zentner (2003)
      1. Summary
      2. Comments
    2. Oberholzer and Strumpf (2004)
      1. Summary
      2. Comments
    3. Oberholzer and Strumpf (2005)
      1. Summary
      2. Comments
    4. Peitz and Waelbroeck (2004)
      1. Summary
      2. Comments
    5. Rob and Waldfogel (2004)
      1. Summary
      2. Comments
    6. Blackburn (2004)
      1. Summary
      2. Comments
    7. Hong (2004)
      1. Summary
    8. Rochelandet and Le Guel (2005)
      1. Summary
      2. Comments

Introduction

Peer-to-peer (P2P) technology and its relation to online file-sharing have been a matter of great controversy for several years. Intersecting, as it does, the interests of innovators, content owners and consumers, it has posed difficult and interesting questions, not least regarding how the interests of some IP owners should affect the development of technology. This brief literature summary does not seek to address these wider questions about how copyright and technology policy can be balanced in the best interests of society, but rather to address the basic question of the impact of online file-sharing on sales and welfare.

An explosion in research (mainly dependent on access to proprietary data) as a result of public interest in these issues means that we are now in a position to provide answers with some degree of certainty. The basic result is that illegal online file-sharing probably has some negative impact on traditional sales but the effect appears to be quite small. The size of this effect is debated, with estimates ranging from 0 to 100% of the sales decline in recent years, but a figure of between 0 and 30% would be a reasonable consensus value (i.e. file-sharing accounted for 0-30% of the decline in sales, not a 0-30% decline in sales). At the same time there is still substantial disagreement in the literature, with the most impressive paper to date (Oberholzer and Strumpf 2005) estimating no impact from file-sharing.

Beyond this basic result several other very interesting facts have emerged. The first is the differential impact of file-sharing on an artist depending on their existing popularity. According to Blackburn, who investigates this issue, the 'bottom' 3/4 of artists sell more as a consequence of file-sharing while the top 1/4 sell less. The second is the first tentative estimate (by Rob and Waldfogel) of the welfare consequences of file-sharing. Rob and Waldfogel's dramatic result is that file-sharing on average yields a gain to society around three times the loss to the music industry in lost sales. While, as they emphasize, this result is preliminary and based on limited data, it indicates the urgent need for more research on this issue as well as the possibility of a win-win situation in which both creators and the public get a better deal, for example by using an alternative compensation system such as a levy.

Summary of Evidence

Zentner (2003)

Summary

Zentner obtained proprietary data from a Forrester Research European consumer survey conducted in October 2001. The data contain a discrete {0,1} variable on music purchase in the previous month. Using this and other data in the survey, Zentner regresses music purchase on a vector of other variables including the binary 'regularly download MP3 files' response.

Summarizing, if music downloading reduces the probability of buying by 30%; if 15% of the population download music; if downloaders are twice as likely to buy music than non-downloaders; if -- conditional on buying -- downloaders and non-downloaders buy the same quantity of units, under all these assumptions sales in 2002 would have been 7.8% (0.3*[(0.15*2)/1.15]) higher than the level they experienced.

[zentner_2003:23]
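The arithmetic in the quoted passage can be checked directly. The following is only a sketch of that calculation: the function and parameter names are our own and do not come from Zentner's paper, and a non-downloader's purchases are normalised to 1.

```python
def zentner_sales_loss(reduction=0.30, downloader_share=0.15, purchase_ratio=2.0):
    """Fraction by which sales would be higher without downloading,
    under the assumptions listed in the quoted passage."""
    # Purchases made by downloaders, with a non-downloader's purchases set to 1.
    downloader_purchases = downloader_share * purchase_ratio
    # Total purchases across both groups (0.85 * 1 + 0.15 * 2 = 1.15).
    total_purchases = (1.0 - downloader_share) + downloader_purchases
    # Downloaders' share of all purchases, scaled by the assumed reduction.
    return reduction * downloader_purchases / total_purchases


print(f"{zentner_sales_loss():.1%}")  # prints 7.8%
```

The 1.15 in the denominator is thus the total purchase weight of the population, which is why the estimate is so sensitive to the assumption that downloaders buy twice as much.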

Comments

Zentner, who is a PhD student at Chicago, produced one of the first papers in this area, but there are reasons to have reservations about his analysis and his conclusions. For example, in his calculation of the loss he assumes that downloaders are likely to buy twice as much music on average as non-downloaders. But this entirely ignores the frequently cited possibility that downloaders' large propensity to purchase is positively correlated with downloading (the so-called sampling effect). If this is so, you cannot simply multiply the estimated reduction in purchases due to (illegal) downloading by downloaders' likelihood to purchase uncorrected for downloading (the two variables may be correlated in opposite directions, which means the uncorrected estimates overestimate the loss). To illustrate, suppose that normally all individuals would purchase the same amount of music (call it 100) but downloaders hear more and therefore would purchase twice as much (200). Then in this framework downloading actually increases sales by (1-0.3)*200 - 100 = 40%.
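The illustration at the end of the preceding paragraph can be written out as a short calculation. This is a sketch of that toy example only; the function name and the baseline of 100 are our own choices.

```python
def net_sales_change(baseline=100.0, sampling_multiplier=2.0, substitution=0.30):
    """Net change in a downloader's purchases relative to the no-downloading baseline."""
    # Sampling: hearing more music raises desired purchases (here, doubles them).
    boosted = baseline * sampling_multiplier
    # Substitution: downloading then displaces a fraction of those purchases.
    actual = boosted * (1.0 - substitution)
    return (actual - baseline) / baseline


print(f"{net_sales_change():+.0%}")  # prints +40%
```

With `sampling_multiplier=1.0` (no sampling effect) the same function returns -30%, which is the case Zentner's calculation implicitly assumes.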

Another issue relates to his method for dealing with unobserved heterogeneity in taste for music by using an instrumental variables approach. His chosen instruments are measures of internet sophistication and broadband access. However, both these sets of variables, while clearly correlated with downloading (as is intended), are also likely to be correlated with unobserved heterogeneity (through indirect factors such as living in a city), which would undermine their validity as instruments.

Oberholzer and Strumpf (2004)

Summary

Oberholzer and Strumpf use data on downloads obtained from a P2P system, combined with sales data extracted from Billboard, to look at the impact of file-sharing. They conclude:

We find that file sharing has no statistically significant effect on purchases of the average album in our sample. Moreover, the estimates are of rather modest size when compared to the drastic reduction in sales in the music industry. At most, file sharing can explain a tiny fraction of this decline. This result is plausible given that movies, software, and video games are actively downloaded, and yet these industries have continued to grow since the advent of file sharing. While a full explanation for the recent decline in record sales are beyond the scope of this analysis, several plausible candidates exist. These alternative factors include poor macroeconomic conditions, a reduction in the number of album releases, growing competition from other forms of entertainment such as video games and DVDs (video game graphics have improved and the price of DVD players or movies have sharply fallen), a reduction in music variety stemming from the large consolidation in radio along with the rise of independent promoter fees to gain airplay, and possibly a consumer backlash against record industry tactics. It is also important to note that a similar drop in record sales occurred in the late 1970s and early 1980s, and that record sales in the 1990s may have been abnormally high as individuals replaced older formats with CDs (Liebowitz, 2003).

[:24]

Like most other authors they run a panel data regression of album sales over time against instrumented estimates of downloading (in an attempt to eliminate the default correlation between downloading and sales that will otherwise bias estimates). Instruments consist of variables that affect downloads but would not be expected to affect sales. They suggest two possible sets of instruments. The first, track length, which influences download size and speed, is constant across time but varies across albums. The second, network congestion and exogenous shifts in the supply of albums, is time-varying but affects all albums equally. They also first-difference after finding non-stationarity in their sales data.
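To see why instruments matter here, consider a minimal two-stage least squares sketch on simulated data. Everything in it is an assumption made for illustration (the data-generating process, the coefficients, the single instrument and the function names); it is not Oberholzer and Strumpf's specification. Unobserved popularity drives both downloads and sales, so a naive regression finds a spurious positive relationship even though the true effect is set to zero.

```python
import random


def ols_slope(x, y):
    """Slope from a simple regression of y on x (with an intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def two_stage_least_squares(z, x, y):
    """2SLS with one instrument z for one endogenous regressor x."""
    n = len(x)
    mean_z, mean_x = sum(z) / n, sum(x) / n
    first_stage = ols_slope(z, x)                          # stage 1: downloads on instrument
    x_hat = [mean_x + first_stage * (a - mean_z) for a in z]  # fitted downloads
    return ols_slope(x_hat, y)                             # stage 2: sales on fitted downloads


def simulate(n=5000, true_effect=0.0, seed=0):
    """Hypothetical data: popularity is unobserved and confounds the naive estimate."""
    rng = random.Random(seed)
    z, x, y = [], [], []
    for _ in range(n):
        popularity = rng.gauss(0, 1)   # unobserved; raises both downloads and sales
        instrument = rng.gauss(0, 1)   # shifts downloads only (e.g. a download-cost shifter)
        downloads = instrument + popularity + rng.gauss(0, 1)
        sales = true_effect * downloads + 2.0 * popularity + rng.gauss(0, 1)
        z.append(instrument)
        x.append(downloads)
        y.append(sales)
    return z, x, y


z, x, y = simulate()
print("naive OLS slope:", round(ols_slope(x, y), 2))
print("2SLS slope:     ", round(two_stage_least_squares(z, x, y), 2))
```

In this setup the naive slope is pulled well above zero by the shared popularity term, while the 2SLS slope stays close to the true effect, provided the instrument really is unrelated to sales except through downloads.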

Comments

This study has sparked a great deal of debate and a detailed critique by Liebowitz. The main shortcoming is the method of estimating the effect of downloading on legal sales by looking at contemporaneous changes, i.e. what effect downloading in week X has on sales in week X. However it seems clear that substitution can occur across time not simply contemporaneously.

Furthermore, as Blackburn argues persuasively, pooling across albums is a major problem (a problem not, of course, confined to this paper but present in all of the other papers discussed). As Blackburn demonstrates, pooling may lead to a zero or even positive estimated effect across all albums (because positive effects on some albums cancel out negative effects on others). However, this does not mean that the overall effect on the industry is zero or positive, since the per-album effect may be correlated with per-album sales.
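A toy calculation makes the pooling point concrete. The four albums and their numbers below are entirely hypothetical, chosen so that the per-album effects average to zero while the download-weighted total is clearly negative; this illustrates the logic of the argument and is not Blackburn's estimator.

```python
# Hypothetical albums: (weekly downloads, change in sales per download).
# Three niche albums benefit from sampling; one hit album suffers substitution.
albums = [
    (100, +0.10),
    (100, +0.10),
    (100, +0.10),
    (5000, -0.30),
]

# Unweighted average of the per-album effects: positive and negative effects cancel.
average_effect = sum(effect for _, effect in albums) / len(albums)

# Industry-wide change in sales: each effect is weighted by that album's downloads.
aggregate_change = sum(downloads * effect for downloads, effect in albums)

print("average per-album effect:", round(average_effect, 3))
print("aggregate change in weekly sales:", round(aggregate_change))
```

Because the hit album accounts for most downloads, its negative effect dominates the industry total even though the "average album" shows no effect.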

Oberholzer and Strumpf (2005)

Summary

This paper updates and extends Oberholzer and Strumpf's original effort (see previous) to take account of more recent literature and does significant additional work to ensure that the results obtained are robust.

Instruments used are similar to the original paper but they add the case of 'misspellings' of album titles as well as more complex ones based on German school holidays. They then perform several analyses including:

  1. Pooled sample models (20ff): pool sales and downloads across all weeks. Using 2SLS (IV, instrumental variables) they find a positive impact of downloads on sales, though at a statistically insignificant level.
  2. Various panel data models (21ff): they continue to find a small and statistically insignificant effect of file-sharing on sales. They extend their analysis to the case where downloads may have non-contemporaneous effects on sales (Table 8) and continue to find a small, statistically insignificant positive impact of downloads on sales. They also specifically allow for the possibility that the effect of downloading varies with album popularity and find:

    The estimates in Table 9 provide some evidence that the effect of file sharing varies by popularity. While the download terms are typically positive, the estimated coefficients on the three types of popularity interactions in columns (I)-(VI) are all negative, indicating that the effect of downloads on sales is less positive for more popular artists. However, the joint effect of the download and the popularity interaction terms is never statistically significant (see the hypothesis tests at the bottom of the Table.)

    [:24]

  3. They perform an extensive set of additional robustness checks including: the possibility of a heterogeneous response to the instruments, the 'Drop-out' hypothesis (file-sharers and buyers are completely separate groups and therefore growth in the file-sharing community matters), excluding the holiday season (December), etc.

Overall the impact on sales, even in the most pessimistic model, is minimal: Focusing on the most negative point estimate (column X in Table 9), the annual industry sales loss due to file sharing is 3 million copies. This is virtually rounding error given that sales in 2002 were 803m CD albums (RIAA, 2004).[:34]
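To put the quoted figures in proportion, the implied share of annual sales is a one-line calculation (variable names are ours; both numbers are taken from the quotation):

```python
loss_copies = 3_000_000      # most pessimistic annual loss quoted above
sales_2002 = 803_000_000     # CD album sales in 2002 quoted above (RIAA, 2004)

share = loss_copies / sales_2002
print(f"{share:.2%}")  # prints 0.37%
```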

Comments

The excellent dataset and instruments combined with the significant additional work to examine the robustness of the results under a variety of specifications make this the most impressive paper on these issues published to date.

Peitz and Waelbroeck (2004)

Summary

Peitz and Waelbroeck run a cross-country analysis and find a significant impact of downloading:

We have analyzed the RIAA’s claim that music downloads are causing a substantial decrease in music sales. Our macro data confirm their fear: we find that music downloading could have caused a 20% reduction in music sales worldwide between 1998-2002. While this is only a crude estimate, we believe that it is a good reference value that other studies, especially microeconometric ones, could use to assess the exact substitution that has taken place between CDs and MP3s. Our analysis also reveals that other factors than music downloads on file-sharing networks are likely to be responsible for the decline in music sales in 2003.

[:78]

Comments

The results in this case are very unreliable and should probably be ignored. Cross-country analyses are notoriously difficult (due to the problem of controlling for unobserved heterogeneity) and in this case are done over a relatively short time period with a very limited set of independent variables. Furthermore, as the authors state: We do not have data on MP3 downloads, Broadband, DMP and DVD variables for the years prior to 2002. Thus we used these variables in level in the regressions. However, levels are likely to be a good proxy for first differences in that period for most countries: in particular, music downloading on file-sharing networks and broadband access essentially started in 1999[:74]. This is a huge assumption and one about which one should have grave doubts given that their sample period is 1998-2002!

Moreover in several of the regressions including the most complete one (i.e. the one with all explanatory variables included) downloads are not statistically significant (at the 10% level).

Rob and Waldfogel (2004)

Summary

Rob (Univ of Pennsylvania) and Waldfogel (Wharton Business School) base their study on a survey of students at the University of Pennsylvania:

We argue that successfully measuring the possible sales-displacing effect of unpaid music downloading requires data on the quantities of purchases and downloads made by individuals, leading us to conduct original surveys. Using a variety of empirical approaches, we document that downloading displaces sales among a convenience sample of college students. The estimate we consider most conservative indicate that an additional download reduces sales by between 0.1 and 0.2 units. As a result, for the individuals in our sample, downloading reduced expenditure by about 10 percent but possibly much more. Supporting incomplete sales displacement is our finding that downloaded music is valued much less than purchased music.

While downloading reduces expenditure (on hit albums, 1999-2003) by $25 per capita in the sub-sample for which we perform a direct welfare analysis of downloading, it raises sample consumers’ welfare associated with these albums by $70 per capita. Some of the benefit to consumers are transfers from sellers, but most of the benefit ($45 per capita) comes from reductions in deadweight loss.

Two facts bear emphasis again. First, our sample is not representative, so our results should not be generalized. Second, our evaluation of welfare takes supply as given. It is entirely possible that downloading has important effects on the quantity and types of music recorded and marketed in the first place. This is an important area for further research.

[:28-29]

Comments

Thanks to its survey design, and the individual-level quantity and (self-reported) valuation data thereby acquired, the paper is able to address the more fundamental question of social welfare. The authors point out that survey-based studies are the only real way to identify the effect of downloading, as they allow us to monitor individual changes and avoid the issues that pooling brings (for example, a negative relation between downloading and sales could simply indicate that downloaders and buyers have different tastes, p. 8).

But the really important aspect of this paper is that it is among the first to obtain the data necessary to estimate the welfare impact of downloading. They find a consumer welfare benefit of $70 per person, of which around $45 is a social welfare benefit. This is a very large amount and over twice the loss estimated in terms of reduced sales.
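The relationship between the three per-capita figures reported by Rob and Waldfogel can be laid out explicitly. The variable names are ours; the dollar amounts are those in the quoted passage.

```python
consumer_gain = 70.0      # $ per capita rise in consumer welfare from downloading
lost_expenditure = 25.0   # $ per capita fall in spending (a transfer away from sellers)

# What remains after netting out the transfer is the reduction in deadweight loss.
deadweight_gain = consumer_gain - lost_expenditure

print("reduction in deadweight loss:", deadweight_gain)                      # 45.0
print("consumer gain / industry loss:", consumer_gain / lost_expenditure)    # 2.8
print("net social gain / industry loss:", deadweight_gain / lost_expenditure)  # 1.8
```

So the consumer gain is 2.8 times the industry's loss, while the net gain to society (after subtracting the transfer) is 1.8 times that loss.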

Blackburn (2004)

Summary

Blackburn (PhD student, Harvard) provides one of the most detailed studies so far, in large part thanks to the excellent dataset (proprietary Nielsen SoundScan data). Blackburn uses the announcement of the RIAA lawsuits as an instrumental variable affecting downloads but not album sales.

Blackburn focuses on issues related to aggregation of data (i.e. regressing all sales on all downloads rather than sales for an individual artist or album against downloads) and suggests that this is a source of significant bias. The reason why aggregate estimates (i.e. without accounting for popularity) can yield no effect while the actual effect is negative is that popularity is interacted with download quantity, i.e. the model includes a term phi * P_{i} * q_{i}^{t}. Since download quantity is correlated with popularity, setting phi to 0 is not innocent.

Blackburn uses his data to reproduce the 0 effect result of Oberholzer and Strumpf but then goes on to show that disaggregation changes this result.

In particular, the point estimates imply that the median 'new' artist, whose weekly sales are 2,163 albums, would see a decrease in weekly sales of 101 albums per week were files shared to be reduced by 10%. A similar calculation can be made for an artist of maximum popularity. At the median level of sales for these artist, the estimate implies an increase in sales of 490 albums per week if file sharing were to be reduced by 10%. This stark contrast between the magnitudes of the effects for artists of varying levels of popularity highlights the importance of this heterogeneity in estimating the aggregate effects of file sharing.

...

A similar calculation can be made for estimating the total effect of file sharing on sales. To estimate the aggregate effect of a 30% reduction in file sharing across the board, I simply subtract out the effect of the deleted files from the second stage estimation in Table 4 and then aggregate up to market level numbers using the appropriate weights. The estimated effect of such an across-the-board reduction in file sharing is to increase aggregate sales by 15%. Again, while these calculations were useful for placing the analysis inside the framework of the previous literature, they do not take into account competition effects across albums, and so the effects of file sharing will be overstated in these estimates.

[:21]

Competition Effects: Blackburn introduces a multinomial logit model to allow for competition effects (i.e. the fact that buying one album reduces the likelihood of buying another as the budget is fixed).

Doing this reveals that as a result of the lawsuit strategy followed by the RIAA against users of file sharing networks, album sales increased by 2.9% over the 23 weeks in the data sample after the strategy was announced. During this period, actual record sales in the U.S. were an average of 11,470,652 albums per week, based on national level data reported by Billboard magazine (2003) each week, and thus would have been 11,147,378 per week in the absence of the reduction in file sharing caused by the lawsuit strategy. Again using a baseline of $5 markup per CD, this translates to an increase in industry profits of $1,616,370 per week, or $37 million over the 23 week period after the lawsuit strategy was announced to the public. Note that, as expected, this increase in profits is much smaller than the number obtained using the simple reduced form estimates (approximately $160 million), which fail to account for the effect of competition among albums.

[:30]
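The figures in the quoted passage follow from one another, as this short check shows. The variable names are ours; all inputs (weekly sales, the 2.9% increase, the $5 markup and the 23 weeks) are taken from the quotation.

```python
actual_weekly_sales = 11_470_652   # average albums sold per week after the announcement
sales_increase = 0.029             # estimated rise in sales due to the lawsuit strategy
markup_per_cd = 5                  # baseline markup in dollars
weeks = 23

# Weekly sales had the lawsuits not reduced file sharing.
counterfactual_sales = round(actual_weekly_sales / (1 + sales_increase))
extra_albums = actual_weekly_sales - counterfactual_sales
weekly_profit_gain = extra_albums * markup_per_cd
total_profit_gain = weekly_profit_gain * weeks

print(counterfactual_sales)   # 11147378
print(weekly_profit_gain)     # 1616370
print(total_profit_gain)      # 37176510
```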

Page 32 provides a very interesting table that lists the estimated effect of 30% less file-sharing on artists depending on their position in the popularity distribution. By percentile (with 1% being the lowest selling and 100% the highest selling), the break-even point is at around the 75th percentile: that is, roughly the bottom 3/4 of artists gain from file-sharing while the top 1/4 lose.

Percentile | Actual Sales | Sales with 30% less file-sharing
1%         | 73           | 70
5%         | 170          | 166
10%        | 281          | 277
25%        | 757          | 745
50%        | 2852         | 2851
75%        | 10110        | 9831
90%        | 26531        | 26934
95%        | 45255        | 47357
99%        | 133983       | 165054
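The percentage changes implied by the table are easier to compare than the raw sales figures. The sketch below simply recomputes them from the table's rows; the variable and function names are ours.

```python
# (percentile, actual sales, sales with 30% less file-sharing), copied from the table.
rows = [
    (1, 73, 70),
    (5, 170, 166),
    (10, 281, 277),
    (25, 757, 745),
    (50, 2852, 2851),
    (75, 10110, 9831),
    (90, 26531, 26934),
    (95, 45255, 47357),
    (99, 133983, 165054),
]


def pct_change(actual, with_less_sharing):
    """Relative change in sales if file-sharing were cut by 30%."""
    return (with_less_sharing - actual) / actual


for percentile, actual, reduced in rows:
    print(f"{percentile:>3}%  {pct_change(actual, reduced):+.1%}")

# Artists who would sell less with reduced file-sharing currently gain from it.
gain_from_sharing = [p for p, actual, reduced in rows if reduced < actual]
lose_from_sharing = [p for p, actual, reduced in rows if reduced > actual]
```

In the tabulated rows the sign flips between the 75th and 90th percentiles, and the effect at the 99th percentile (over +20%) dwarfs the changes lower down the distribution.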

Comments

There are two concerns about bias in Blackburn's results. First, Blackburn does not have data on downloading itself but only on the stock of files on file-sharing networks. Using the stock of files to proxy for downloading activity is problematic since there is no necessary correlation between the two. This is particularly concerning since sales of an album generally peak in the weeks just after its release and then decline, while the stock of files on file-sharing networks tends to show a steady increase. This will lead to an automatic negative correlation between the stock of files and sales even when there may be no connection at all.

Second, Blackburn's identification strategy uses the exogenous timing of the RIAA lawsuits as the instrumental variable affecting downloads but not sales. However, the impact of the lawsuits would likely be to deter 'marginal' downloaders but have little impact on 'serious' users of file-sharing networks. It is likely that 'marginal' users are more like samplers who go on to ultimately purchase albums they download, while 'serious' users are more likely to be substituters who don't purchase what they download. Thus the lawsuits, while reducing downloads, would also alter the composition of downloaders. As a consequence there would be a negative bias in the estimate of the effect of downloading on sales.

One also has some reservations about the estimates of financial harm. Most notably, extrapolating point estimates of elasticity to substantial changes is dubious and might be the reason for the (slightly incredible) size of the effects on sales: given competition from other areas such as DVDs and the internet generally, it seems implausible that music sales should have continued on a trend extrapolated from 1996-1999.

Hong (2004)

Summary

From the abstract:

This paper quantifies the magnitude of changes in household-level expenditures on recorded music in the United States, particularly attributed to the emergence of Napster. Exploiting the rich information contained in the Consumer Expenditure Survey, I use three approaches to measure the effect of Napster. The difference-in-difference kernel matching (DDM) method directly quantifies the effect. I find that the quarterly music expenditure of the average U.S. household has declined by approximately three dollars as a result of using the Internet and plausibly Napster. This accounts for 39% of the decrease in total recording sales in 2000. The second approach estimates a demand system for entertainment goods. The estimated cross-price elasticities imply that changes in prices of other entertainment goods also explain the slump in recorded music sales. In 2000, roughly 37% of the decline in recording sales is due to such changes in prices. The final method constructs synthetic cohorts. The results indicate that transition from LPs to CDs might describe the increase in music sales during the 1990’s as well as the recent slowdown. These two other methods indirectly measure the effect of Napster in that they explicate that more than 80% of music sales decrease in 2000 might have resulted from factors aside from Napster. This implies that the estimated magnitude using DDM may quantify changes in the household-level music expenditure due to not only Napster but also factors other than file-sharing of copyrighted music.

Rochelandet and Le Guel (2005)

Summary

The authors conducted a survey of 2533 individuals in January and February 2005. The dependent variable was an integer 0-3 scale ranging from never download (0) to frequently download (3) from P2P networks. Responses were very evenly divided between the categories and overall 74% of participants stated that they had downloaded from P2P networks. Armed with a large number of independent variables, the authors then estimated an ordered logit model to investigate what variables were associated with copying. The econometric results are summarized in their table on page 5, which is reproduced here:

Effects on copying behavior | Determinants
Favorable                   | Social neighboring***
                            | Internet skills***
                            | Copying of software***
                            | Cultural diversity***
                            | Being male**
Unfavorable                 | Increasing in the WTP for originals***
                            | Increasing in ethical concerns***
                            | Higher education***
                            | Increasing in age**
Neutral                     | Perception of legal and technical risks
                            | Cultural spending
                            | Location
                            | Experience in copying
                            | Socio-professional group
                            | Income
                            | Diploma
                            | Household size

The main facts relevant to us are that 'cultural spending' appears to be unrelated to copying but increased willingness to pay (WTP) is negatively related. Willingness to pay for a track averaged 0.30 euros with a standard deviation of 0.27 euros (ed: For comparison the iTunes music store charges 0.99 euros a track).

Comments

The two facts singled out above regarding cultural spending and willingness to pay would both tend to increase the welfare benefits of P2P file-sharing. The first does so by indicating that spending, and therefore the supply of creative works, is unaffected by P2P (the authors suggest that this indicates the substitution effect approximately cancels the sampling effect). The second does so by indicating that copying is concentrated among those who would not purchase anyway and who would, under monopoly pricing, make up the deadweight cost of monopoly (i.e. those who would acquire the good at marginal cost but who will not when it is priced at monopoly rates).
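The deadweight-loss argument can be illustrated with a toy example. The seven willingness-to-pay values and the zero marginal cost below are hypothetical assumptions chosen for illustration, not survey data; only the 0.99 euro price comes from the text above.

```python
price = 0.99          # price per track in euros (the iTunes price mentioned above)
marginal_cost = 0.0   # assumption: an extra digital copy costs nothing to supply

# Hypothetical willingness to pay (euros) of seven consumers for one track.
wtp = [0.05, 0.10, 0.30, 0.30, 0.50, 1.20, 1.50]

buyers = [w for w in wtp if w >= price]
# Consumers priced out of the market although they value the track above its cost.
priced_out = [w for w in wtp if marginal_cost < w < price]

revenue = price * len(buyers)
deadweight_loss = sum(w - marginal_cost for w in priced_out)

print("buyers:", len(buyers), "revenue:", round(revenue, 2))
print("priced-out consumers:", len(priced_out))
print("deadweight loss recovered if they copy:", round(deadweight_loss, 2))
```

If only the priced-out consumers copy, revenue is unchanged while the value they obtain is a pure gain, which is the sense in which copying by low-WTP individuals recovers deadweight loss rather than displacing sales.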

Bibliography and References

  1. [blackburn_2004] Online Piracy and Recorded Music Sales, Blackburn, David; 2004-12-30
    1. http://www.economics.harvard.edu/~dblackbu/papers/blackburn_fs.pdf
  2. [geist_2005] Piercing the peer-to-peer myths: An Examination of the Canadian Experience, Geist, Michael; First Monday 2005-04-05
    1. http://www.firstmonday.org/issues/issue10_4/geist/
  3. [hong_2004] The Effect of Napster on Recorded Music Sales: Evidence from the Consumer Expenditure Survey, Hong, Seung-Hyun; 2004-01
    1. http://siepr.stanford.edu/papers/pdf/03-18.pdf
  4. [leguel_ea_2005] P2P Music-Sharing Networks: Why Legal Fight Against Copiers May be Inefficient?, Le Guel, Fabrice; Rochelandet, Fabrice; 2005-10-01
    1. http://papers.ssrn.com/sol3/cf_dev/AbsByAuth.cfm?per_id=476297
  5. [oberholzer_ea_2004] The Effect of File Sharing on Record Sales: An Empirical Analysis, Oberholzer, Felix; Strumpf, Koleman; 2004-03
    1. http://www.unc.edu/~cigar/papers/FileSharing_March2004.pdf
  6. [oberholzer_ea_2005] The Effect of File Sharing on Record Sales: An Empirical Analysis, Oberholzer, Felix; Strumpf, Koleman; 2005-06
    1. http://www.unc.edu/~cigar/papers/FileSharing_June2005_final.pdf
  7. [peitz_ea_2004] The Effect of Internet Piracy on Music Sales: Cross-Section Evidence, Peitz, M.; Waelbroeck, P.; Review of Economic Research on Copyright Issues pp. 71-79 2004
    1. http://www.serci.org/docs_1_2/waelbroeck.pdf
  8. [rob_ea_2004] Piracy on the High C's: Music Downloading, Sales Displacement, and Social Welfare in a Sample of College Students, Rob, Rafael; Waldfogel, Joel; 2004-09-30
    1. Preliminary Draft 2004-09-30 NBER working paper (have to pay!)
  9. [zentner_2003] Measuring the Effect of Music Downloads on Music Purchases, Zentner, Alejandro; 2003-06
    1. http://home.uchicago.edu/~alezentn/musicindustrynew.pdf