Who Writes Wikipedia?

September 30, 2006

Via Patri Friedman, here’s a fascinating article on the people who write Wikipedia:

I purchased some time on a computer cluster and downloaded a copy of the Wikipedia archives. I wrote a little program to go through each edit and count how much of it remained in the latest version. Instead of counting edits, as Wales did, I counted the number of letters a user actually contributed to the present article.

If you just count edits, it appears the biggest contributors to the Alan Alda article (7 of the top 10) are registered users who (all but 2) have made thousands of edits to the site. Indeed, #4 has made over 7,000 edits while #7 has over 25,000. In other words, if you use Wales’s methods, you get Wales’s results: most of the content seems to be written by heavy editors.

But when you count letters, the picture dramatically changes: few of the contributors (2 out of the top 10) are even registered and most (6 out of the top 10) have made less than 25 edits to the entire site. In fact, #9 has made exactly one edit–this one! With the more reasonable metric–indeed, the one Wales himself said he planned to use in the next revision of his study–the result completely reverses.

I don’t have the resources to run this calculation across all of Wikipedia (there are over 60 million edits!), but I ran it on several more randomly-selected articles and the results were much the same. For example, the largest portion of the Anaconda article was written by a user who only made 2 edits to it (and only 100 on the entire site). By contrast, the largest number of edits were made by a user who appears to have contributed no text to the final article (the edits were all deleting things and moving things around).

When you put it all together, the story becomes clear: an outsider makes one edit to add a chunk of information, then insiders make several edits tweaking and reformatting it. In addition, insiders rack up thousands of edits doing things like changing the name of a category across the entire site–the kind of thing only insiders deeply care about. As a result, insiders account for the vast majority of the edits. But it’s the outsiders who provide nearly all of the content.

And when you think about it, this makes perfect sense. Writing an encyclopedia is hard. To do anywhere near a decent job, you have to know a great deal of information about an incredibly wide variety of subjects. Writing so much text is difficult, but doing all the background research seems impossible.

On the other hand, everyone has a bunch of obscure things that, for one reason or another, they’ve come to know well. So they share them, clicking the edit link and adding a paragraph or two to Wikipedia. At the same time, a small number of people have become particularly involved in Wikipedia itself, learning its policies and special syntax, and spending their time tweaking the contributions of everybody else.
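To make the difference between the two metrics concrete, here is a rough sketch of how the counting might work. The quoted article doesn't show its program, so everything here is an assumption for illustration: the function names, the in-memory list of (user, full text) revisions, and the use of Python's difflib to diff each revision against the previous one character by character. It also counts every surviving character, whitespace included, rather than strictly letters.

```python
from collections import Counter
from difflib import SequenceMatcher


def surviving_letters(revisions):
    """Credit each character of the final text to the user who introduced it.

    `revisions` is a hypothetical list of (user, full_text) pairs in
    chronological order. Returns a Counter mapping user -> characters
    still present in the last revision.
    """
    text = ""
    owners = []  # owners[i] is the user credited with text[i]
    for user, new_text in revisions:
        # autojunk=False keeps difflib from ignoring very common characters.
        matcher = SequenceMatcher(None, text, new_text, autojunk=False)
        new_owners = []
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag == "equal":
                # Unchanged text keeps whoever was credited before.
                new_owners.extend(owners[i1:i2])
            elif tag in ("insert", "replace"):
                # New text is credited to the user making this edit.
                new_owners.extend([user] * (j2 - j1))
            # "delete": the removed text simply drops out of the tally.
        text, owners = new_text, new_owners
    return Counter(owners)


def edit_counts(revisions):
    """The simpler metric: count how many edits each user made."""
    return Counter(user for user, _ in revisions)


if __name__ == "__main__":
    # Made-up revision history, not real Wikipedia data.
    base = "Anacondas are large snakes found in tropical South America."
    history = [
        ("outsider", base + " TODO"),               # adds the actual content
        ("insider", base),                          # deletes a stray marker
        ("insider", base + "\n[[Category:Snakes]]"),  # adds a category tag
    ]
    print("edits:  ", edit_counts(history))
    print("letters:", surviving_letters(history))
```

In this toy history the insider leads on edit count, while the outsider, with a single edit, is credited with most of the text that survives. A character-level diff is only one way to attribute text; it can mis-credit material that gets moved around, so treat the numbers as approximate.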

This dovetails perfectly with Yochai Benkler’s explanation for the power of peer production. One of Benkler’s key points is that peer production is a means of accomplishing an efficient division of labor in circumstances where potential contributors have radically varying aptitudes for the various discrete tasks. If you’re a physics grad student doing research on high-energy physics, then adding a couple of paragraphs to an article on particle colliders requires almost no effort at all. For any given subject, there’s almost certain to be somebody who happens to know about that subject and is willing to devote an hour or two of his time to it.

The problem is that in traditionally structured markets, with firms and money-mediated transactions, finding that guy and negotiating his participation requires far more work than his actual contribution. Writing a paragraph might take a guy with the appropriate expertise only 10 minutes, but finding him might take an editor hours, if she can find him at all. As a result, the editor of a traditional encyclopedia is inevitably forced to economize on her own search time by choosing a much smaller number of contributors (say, a single high-energy physicist to write all the articles about high-energy physics) and having each of them write more content. This causes two related problems. First, the physicist chosen may not be an expert on all the subjects he’s asked to write about, requiring him to do a lot of original research. Second, because he’s being asked to contribute a large amount of content, some of it outside his immediate area of speciality, he’s likely to ask for substantial compensation. That raises the per-article cost of the encyclopedia, limiting both how much information it can contain and how many people will be able to afford the finished product.

Peer production, then, can be seen as a way of enhancing the division of labor by reducing the search costs borne by the editors. Because potential contributors self-select for the tasks at which they’ve got the most expertise, all the editor needs to do is aggregate the various contributions into a coherent whole. It turns out that for encyclopedias, at least, aggregating a lot of small contributions from self-selected authors is much less work than seeking out contributors.

Hence, when the human inputs to the production process are highly heterogeneous (when each small part of the finished product is best provided by a different person), the mechanics of the traditional market can become an impediment to the efficient division of labor. Finding the right guy, giving him access to the work-in-progress, and paying him for his trouble has a non-zero cost. This cost is fairly small when you’re in the market for a relatively homogeneous good like oil or janitorial services. But when you’re looking for a kernel patch or a paragraph about strange quarks, finding the best person can be extremely difficult.

Luckily, people tend to know their own abilities. So if you make it easy enough for talented people to find tasks where their skills are needed–and easy enough for them to actually contribute–at least some of them will do so. And in a few years, as if by magic, you’ll have an encyclopedia that in at least some respects outperforms every reference work ever written.
