Library clips

sharing ideas thoughts and feedback

February 13, 2009

Measuring participation distribution

Filed under: network

I was reading Clay Shirky’s book "Here Comes Everybody" over the weekend, and came across another version of his great analysis of how participation is distributed in social systems.

Rather than a Gaussian bell curve, he uses a power law distribution, which fits this kind of data far better.

It plays on the Pareto principle (roughly 80% of the work comes from 20% of the people), and also on the participation inequality described by the 90-9-1 principle.

Basically, "in social groups, some people actively participate more than others."

Some statistics:

"Over 50% of all the Wikipedia edits are done by just .7% of the users … 524 people."

"Just 0.16% of all visitors to YouTube upload videos to it"

In fact in social systems "averages" really don’t mean much, as there is such a massive difference in levels of participation. 

That is, a few heavy contributors put an average measure way out of whack, so much so that it doesn’t reflect the "average" at all.
In fact this means that most people contribute below the average.

SCREENSHOT - AN EXAMPLE OF PHOTO CONTRIBUTIONS TO A SITE

118 photographers
Top 10th contributed half
Top contributor uploaded 238 photos (about one in twelve)
Average is 26
Median 11
Mode 1

“…the nth position has 1/nth of the first position’s rank. In a pure power law distribution, the gap between the first and second position is larger than the gap between second and third, and so on” 
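To see what that 1/n rule looks like in numbers, here is a minimal Python sketch. It assumes a pure power law and borrows the top contributor's 238 photos as the starting value purely for illustration; the real site's counts won't follow the rule this neatly, and the function name is my own.

```python
def zipf_counts(first: float, positions: int) -> list[float]:
    """Value at position n is first / n (a pure 1/n power law)."""
    return [first / n for n in range(1, positions + 1)]


# 238 is the top contributor's count from the example; the rest is hypothetical.
counts = zipf_counts(first=238, positions=5)

# Gap between each position and the next one down.
gaps = [a - b for a, b in zip(counts, counts[1:])]

for position, value in enumerate(counts, start=1):
    print(f"position {position}: {value:.1f}")
print("gaps:", [round(g, 1) for g in gaps])
```

Each gap is smaller than the one before it, which is the steep drop followed by a long flat tail that the quote describes.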

- average (all items divided amongst all contributors)
- median (the middle contributor)
- mode (the number of items that appeared most frequently)
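As a quick check on those three definitions, here is a small Python sketch using the standard `statistics` module. The upload counts are made up for illustration (they are not the data behind the screenshot), chosen so that one heavy contributor dominates.

```python
from statistics import mean, median, mode

# Hypothetical uploads per contributor: many small numbers, one very large one.
uploads = [1, 1, 1, 1, 2, 3, 5, 8, 40, 238]

average = mean(uploads)    # all items divided amongst all contributors
middle = median(uploads)   # the middle contributor
most_common = mode(uploads)  # the count that appears most frequently

below_average = sum(1 for u in uploads if u < average)

print(f"average={average}, median={middle}, mode={most_common}")
print(f"{below_average} of {len(uploads)} contributors are below the average")
```

With these made-up numbers the average is 30, the median is 2.5 and the mode is 1, and 8 of the 10 contributors sit below the average.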

Notice the sharp drop off after the two heavy contributors

- because of the disproportionate contributions of these two people, three quarters of photographers contributed a below-average number of pictures

“…the imbalance drives large social systems rather than damaging them…no effort is made to even out their contributions. The spontaneous division of labour…wouldn’t be possible if there were concern for reducing inequality…most large social experiments are engines for harnessing inequality rather than limiting it”

I really like this bit, because a power law distribution describes what is actually going on in these cases, whereas a bell curve is cursed by an abstraction (the "average" user) that doesn’t exist:

“…large social systems cannot be understood as a simple aggregation of the behaviour of some nonexistent ‘average’ user”

“for anything like [measuring people’s] height that falls on a bell curve, knowing any one of these numbers - average, median, mode - is a clue to the others”

We are used to the average being roughly the same as the median (the middle contributor), but under a power law the two can be far apart.

Example - “Bill Gates walks into a bar, and suddenly everyone inside becomes a millionaire, on average. The corollary is that everyone else in the bar also acquires a below-average income”
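The bar example is easy to reproduce. This is only a sketch: the nine patrons' figures and the 50 billion for the newcomer are numbers I made up for illustration, not anything from the book.

```python
from statistics import mean, median

# Hypothetical wealth of nine patrons already in the bar.
patrons = [20_000, 35_000, 40_000, 50_000, 60_000,
           75_000, 80_000, 90_000, 120_000]

# Hypothetical figure for the very wealthy newcomer.
newcomer = 50_000_000_000

before_mean, before_median = mean(patrons), median(patrons)

everyone = patrons + [newcomer]
after_mean, after_median = mean(everyone), median(everyone)

# How many people in the bar are now below the average?
below_average = sum(1 for w in everyone if w < after_mean)

print(f"before: mean={before_mean:,.0f} median={before_median:,.0f}")
print(f"after:  mean={after_mean:,.0f} median={after_median:,.0f}")
print(f"{below_average} of {len(everyone)} people are below the new average")
```

The average jumps into the billions while the median barely moves, so the median is the number that still says something about a typical patron.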

This almost goes back to the problem of induction that Popper wrote about, and whether inductive reasoning is valid:

“…you cannot understand…any large social system by looking at any one user or even a small group and assuming they are representative of the whole”

This is key in that it takes into account some of the complex nature of what it’s measuring:

“power law distributions tend to describe systems of interacting elements, rather than just collections of variable elements. Height is not a system - my height is independent of yours. My use of Wikipedia is not independent of yours…we’re used to being able to extract useful averages from small samples and to reason about the whole system based on those averages…to understand the creation of something like a Wikipedia article, you can’t look for a representative contributor, because none exists. Instead, you have to change your focus, to concentrate not on the individual users but on the behaviour of the collective”

I agree with Stewart Mader that once social software matures, social influence and discipline will shift the 90-9-1 participation ratio inside organisations towards 60%-40%. People will influence others to re-purpose emails, and will discipline them to use forums, blogs and wikis, because they will be used to working that way in other groups…kind of like a participation virus.

Another great book I recommend is Nassim Taleb’s "The Black Swan". He really goes to town on the bell curve, especially because risk management is based on bell curve assumptions…here’s a review.

For example, what we get when Bill Gates walks into a room is an L-curve rather than a Bell curve…and if we base predictions and forecasting on a Bell curve (what we would like real life to be, but it just isn’t), we are going to get into trouble when we are hit by a black swan.

Another related phenomenon comes from Chris Anderson’s book "The Long Tail", where in e-commerce the less popular items in aggregate make up a market share similar to that of the popular items.

i.e. "products that are in low demand or have low sales volume can collectively make up a market share that rivals or exceeds the relatively few current bestsellers and blockbusters …"
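A rough way to see how the tail can rival the head is to assume sales follow a pure 1/n curve and add up both parts. Everything in this Python sketch is an assumption for illustration: the 1/n demand curve, the catalogue of 100,000 items, and treating the top 100 as the "bestsellers".

```python
def head_and_tail_share(catalogue_size: int, head_size: int) -> tuple[float, float]:
    """Share of total sales from the top `head_size` items versus all the rest,
    assuming the item at rank n sells in proportion to 1/n."""
    sales = [1 / rank for rank in range(1, catalogue_size + 1)]
    total = sum(sales)
    head = sum(sales[:head_size]) / total
    return head, 1 - head


# Hypothetical store: 100,000 items, of which the top 100 are the bestsellers.
head, tail = head_and_tail_share(catalogue_size=100_000, head_size=100)
print(f"top 100 items: {head:.0%} of sales, everything else: {tail:.0%}")
```

Under these assumptions the long tail of low sellers adds up to more than the bestsellers do. A real store's curve will differ, so the split depends heavily on the catalogue size and where the head is cut off.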

When it comes to measuring community participation at work I’m going to use a power law, as we want to see the reality of the situation, not some average user that doesn’t exist.
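As a starting point, here is a minimal Python sketch of the kind of summary I have in mind. The function name, the choice of figures to report, and the sample counts are all my own assumptions for illustration, not data from a real community.

```python
from statistics import mean, median


def participation_summary(counts: list[int]) -> dict[str, float]:
    """Summarise how unevenly contributions are spread across contributors."""
    if not counts or sum(counts) == 0:
        raise ValueError("need at least one contribution to summarise")
    ordered = sorted(counts, reverse=True)
    total = sum(ordered)
    top_n = max(1, len(ordered) // 10)  # size of the top tenth, at least one person
    avg = mean(ordered)
    return {
        "contributors": len(ordered),
        "mean": avg,
        "median": median(ordered),
        # Share of all contributions made by the top tenth of contributors.
        "top_tenth_share": sum(ordered[:top_n]) / total,
        # Share of contributors who are below the average.
        "below_mean_share": sum(1 for c in ordered if c < avg) / len(ordered),
    }


# Hypothetical contributions per person in a workplace community.
sample = [120, 60, 20, 10, 8, 6, 5, 4, 3, 3, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1]
summary = participation_summary(sample)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Reporting the top-tenth share and the share of people below the average next to the mean and median shows the shape of the distribution, instead of hiding it behind a single average.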

2 Comments »

The URI to TrackBack this entry is: http://libraryclips.blogsome.com/2009/02/13/measuring-participation-distribution/trackback/

  1. Who contributes and participates in web 2.0?

    The blog Library Clips has this excellent post about measuring participation on web 2.0 sites: Measuring participation distribution
    The conclusion is that very few people participate. There are a few who are big contributors, while the rest …

    Trackback by CBS Bibliotek Blog - Innovation & Ny Viden — February 13, 2009 @ 11:08 am

  2. It’s an interesting study..

    Comment by edrams — February 13, 2009 @ 7:22 pm
