Wednesday, August 29, 2012

The problem with crowdsourced content

We have a whole lot of sites featuring reviews written by their readers on pretty much every topic. Of these, at least the front runners attract large traffic volumes, have devoted user bases, and are, I assume, making money. However, I want to discuss what I think is wrong with them.

The problem is the method of generating content. All these sites rely on a bevy of users to come and write reviews of the products or services that the site focuses on and that they have used or experienced. These can be restaurants, books, travel, gadgets, or whatever else. Via friendly UIs, facilitated social media interaction, etc., users are encouraged to contribute data for each other's benefit. This strategy is very effective at generating large amounts of data. However, it is very bad at generating cohesive data.

In general, I have three somewhat interconnected problems with this:
  1. Data is of poor quality – Since the website wants readers to submit reviews, it can rarely hold them accountable for the quality of their writing. The aim is to lower the barrier to writing and social media sharing. Get the user to write. No matter what he writes, get him to write. To be fair, most of the reputable websites will intervene if you write inflammatory or profane material, but apart from that, pretty much anything goes. As a direct consequence, the quality of reviews in terms of both content (what is written) and form (how it is written) goes down. Most people write unbalanced reviews, either giving full marks and endless praise or griping about a very bad experience. This can be avoided to some extent by making sincere efforts at moderation and community building (StackOverflow is a great example), but none of the major commercial websites seem to be doing so.
  2. There is just too much data – This is the direct result of successfully inducing readers to submit content to a site. Since anyone can submit anything, the data volumes are large, and it becomes well-nigh impossible to find information. This is what I like to call the Problem of 500 Reviews: on any successful reviews site today, you can find 500 reviews for pretty much every single item. Too much data is not much better than no data. The best this deluge of disjointed reader inputs can give us is a general sense of how well people like something. As an experiment, pick any famous book about which you know nothing, and try to learn about it using only GoodReads reviews. I am fairly regular on the site, but I mostly use it for the bragging rights, and to let my friends know what I am up to.
  3. Data is without structure – I totally agree that a successful travel site of the kind we are talking about will have all the data about a given destination. But how do we find this data? Since it is broken up across a large number of unconnected reviews, it is very difficult to present the information in a coherent, intuitive manner. It is left to the reader to sift through the data each reviewer has provided and collate what he needs (when to go, how to get there, etc.).
The crowdsourced content model is like a group discussion where everyone is talking at the same time. There is no anchor or reference around which a discussion can be built.

IMO, a far better alternative is to have an informed member write one review, and then use that to gather all sorts of varied and personalized experiences regarding the topic of discussion. It may seem so at first, but such a model (a critic-driven model, if you will) is not about classroom-style information broadcast. This is not a case of "the Expert has spoken." It is about providing a structured core, the basic information, and then inviting readers to extend it into a wider body of information. If you want people to spend time and effort sharing their opinions, it is only fair that you offer them something in return.