Yesterday the MPAA issued a report commissioned from the global PR firm Millward Brown looking at "the role of search in online piracy." This coincided with testimony by the RIAA's Cary Sherman before the House IP subcommittee that search engines are not doing enough to protect his industry from piracy. Here are some thoughts on the new report and the issue generally.
The report tries to ascertain how much of the traffic to infringing content is sent there by search engines. To measure this, the report employs "a customized, hybrid approach" that doesn't merely look at whether the visit to an infringing URL came from a link on a search page. Instead, it looks at whether a user searched for a "qualifying" search term within 20 minutes of reaching the infringing URL. "Qualifying" search queries, the report says, are associated with attempts to find illegal content and include "domain terms like '1Channel' and 'sidereel', generic terms like 'watch movies online' and movie and TV title-based terms like 'Dark Knight Rises'." As the report puts it, "This holistic approach contrasts with a more narrow definition that counts search only when a visit is preceded by a visit to a search engine."
The report is clear that "this method did not seek to indicate the degree to which infringing content appears on search engine results pages themselves," but merely sought to show that search engines "influenced the path" users took to reach infringing content. It concluded that "approximately 20% of all visits to infringing content were influenced by a search query from 2010-2012."
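To make the attribution rule concrete, here is a minimal sketch of how I read it. The 20-minute window comes from the report; everything else (the function names, the tiny term list, the timestamps) is my own hypothetical stand-in, since the report does not publish its full list of qualifying terms or its implementation.

```python
from datetime import datetime, timedelta

# The 20-minute window is the figure stated in the report.
WINDOW = timedelta(minutes=20)

# Hypothetical stand-in for the report's "qualifying" term list.
QUALIFYING_TERMS = {"1channel", "sidereel", "watch movies online",
                    "dark knight rises", "transformers"}


def is_qualifying(query):
    """Assumed matching rule: exact, case-insensitive match on the term list."""
    return query.lower() in QUALIFYING_TERMS


def is_influenced(visit_time, searches, window=WINDOW):
    """True if any qualifying search occurred within `window` before the visit.

    `searches` is a list of (timestamp, query) pairs. Note that the rule never
    checks whether the search results actually linked to the visited URL.
    """
    return any(
        is_qualifying(query) and timedelta(0) <= visit_time - when <= window
        for when, query in searches
    )


# Arbitrary illustrative timestamp for a visit to an infringing URL.
visit = datetime(2013, 1, 1, 21, 0)

# A title search 15 minutes earlier counts, even for an unrelated title.
within_window = is_influenced(visit, [(visit - timedelta(minutes=15), "transformers")])

# The same search 25 minutes earlier does not.
outside_window = is_influenced(visit, [(visit - timedelta(minutes=25), "transformers")])
```

Under this reading, the only things that matter are the search term and the clock; whether the search had anything to do with the page eventually visited never enters into it.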
I have a couple of concerns with this methodology. The first is that it implicitly puts search engines on the hook not just for linking directly to infringing content (for which there is a notice-and-takedown process available), but also for "influencing the path" that a user takes on their web travels. As we all know, correlation is not causation. If I searched for "transformers" 15 minutes before I visited the URL for a pirate stream of Game of Thrones, it's not clear to me that the search engine influenced me in any way, much less that it should be responsible for my behavior.
Second, I'm concerned that "qualifying search queries" include generic terms like "watch movies online" and title-based terms like "Dark Knight." That means that if you searched simply for the title of any movie or TV show, the clock started ticking and, if you hit an infringing URL within 20 minutes, your path to that URL was deemed influenced by the search engine. I don't know about most users, but I probably visit a search engine more than once every 20 minutes. For a film and TV enthusiast, I wonder if the influence shot clock wouldn't be constantly ticking.
It's therefore not surprising that, according to the report, 58% of all visits to infringing URLs that were "influenced" by a search engine came from queries for either generic or title-based terms, not from the more clearly suspicious "domain" terms. As the report points out, this "indicat[es] that these consumers did not display an intention of viewing content illegally." So the question is, why did these consumers who had no illegal intent end up at infringing sites? Could it be that they did not have a legal alternative for accessing the content they were seeking? That would not excuse their behavior, and it's the movie industry's prerogative whether and when to make its content available. Indeed, release windows are part of its business model, although that business model is seemingly in tension with consumer demand, as evidenced by the shrinking theatrical release window. That all said, it's not clear to me why search engines should be in the business of ensuring other industries' business models remain unchanged.
Finally, the report looks at whether Google's change to its search algorithm last summer affected referrals to infringing content. The change penalizes a site in Google's search results based on the number of copyright removal notices received for that site. According to the report, the new algorithm did not really change the percent of direct referrals from Google to sites listed in Google's own transparency report, which remained at around 9%.
That's very interesting, but what might be more interesting than the percent of direct referrals is the total number of direct referrals. After all, the percent of referrals could remain unchanged, but if the total traffic to those sites decreases as a result of the algorithmic change, that would be a success. Indeed, even if traffic to those sites increases but the share remains the same, it could also be considered a success, given the potential counterfactual situation in which, but for the change, Google's share of referrals would have been bigger.
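A little arithmetic shows why a flat share says little on its own. The 9% share is the report's figure; the traffic totals and the 12% counterfactual share below are numbers I made up purely for illustration, and the function name is mine.

```python
def direct_referrals(total_visits, share_percent):
    """Number of visits referred directly by the search engine.

    Integer arithmetic keeps the illustrative figures exact.
    """
    return total_visits * share_percent // 100


SHARE = 9  # percent, the roughly constant share cited in the report

# Hypothetical totals: traffic to the penalized sites falls after the change.
before = direct_referrals(1_000_000, SHARE)   # 90,000 referrals
after_drop = direct_referrals(800_000, SHARE)  # 72,000 referrals, same share

# Hypothetical totals: traffic grows, but the share would have risen to 12%
# without the change (an assumed counterfactual, not a measured one).
after_growth = direct_referrals(1_200_000, SHARE)   # 108,000 referrals
counterfactual = direct_referrals(1_200_000, 12)    # 144,000 referrals

print(before, after_drop, after_growth, counterfactual)
```

In the first scenario the share never moves, yet Google sends 18,000 fewer visitors to those sites. In the second, referrals rise in absolute terms but are still 36,000 below what they would have been under the assumed counterfactual. A share figure alone cannot distinguish either case from no effect at all.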
Also, as Google explained when it announced the algorithmic change, penalized sites would appear lower in search results than they otherwise would, not be blocked altogether. So it may well be the case that a search for something like "dark knight free download divx pirate bay" will still return links to infringing content high in the search results. Those links may have been algorithmically demoted, but no other content beats them for that query even with the demotion. I don't think anyone can seriously expect Google to display links to Amazon and Netflix and Warner Bros. as the top results for that search.
So, what might be more interesting is to figure out the effect that the algorithmic change had not on the share of referral traffic broadly, but on direct referrals from search results for generic and title-based searches made by "consumers [that] did not display an intention of viewing content illegally." I wonder if the algorithmic change has not had an impact there.