Gaming recommender systems for fun and profit
There's a lot of demand from the Drupal community to add fivestar-like ratings to the contrib modules. This would be a pretty cool feature, but it also raises some concerns.
In a paper presented at the WWW'04 conference titled "Shilling recommender systems for fun and profit" (PDF), the researchers discussed different ways of gaming recommender systems, with examples. Attackers might do this because they want to 1) promote certain items, 2) "nuke" certain items, or 3) disrupt the entire system. The researchers also simulated two kinds of automated attacks, NormalBot and AverageBot, and showed that recommender systems are indeed vulnerable to such manipulation.
Beyond the academic research literature, it's easy to imagine manipulation of Drupal module ratings, either for fun or for profit. New modules are especially vulnerable: with only a handful of ratings on record, a few fake negative ratings early on can drag the score down and could very likely discourage anyone else from evaluating the module. In fact, if you search for "rating manipulation" on Google, you'll find that eBay, IMDb, Amazon, etc. have all been victims. I've also heard that the module rating system in Joomla suffered from the same thing.
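To make the point about new modules concrete, here is a quick back-of-the-envelope sketch in Python. The rating counts and star values are made up for illustration, and a plain mean is assumed as the displayed score; nothing here reflects an actual drupal.org implementation.

```python
from statistics import mean

def average_after_attack(genuine, fake):
    """Mean star rating once fake ratings are mixed in with genuine ones."""
    return mean(genuine + fake)

# Hypothetical numbers: the same three fake one-star ratings hit two modules.
fake = [1, 1, 1]
new_module = [5, 5]        # a new module with only two genuine ratings
old_module = [5] * 100     # an established module with 100 genuine ratings

new_avg = average_after_attack(new_module, fake)
old_avg = average_after_attack(old_module, fake)

print(f"new module: {new_avg:.2f}")
print(f"established module: {old_avg:.2f}")
```

The new module drops from 5 stars to well under 3, while the established one barely moves, which is why the first few ratings matter so much.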
To cope with this issue, we need more sophisticated algorithms. My advisor, Paul Resnick, is one of the leading researchers in this area. He and Prof. Rahul Sami published a paper proposing a manipulation-resistant algorithm (PDF). However, a simpler and more intuitive alternative might be to maintain two rating scores: one from all users and one from the experts. Experts could be defined as, for example, users who have been registered for more than one year and have submitted issues.
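The two-score idea could be sketched roughly as follows. This is only an illustration under my own assumptions: the `Rating` record, the `is_expert` and `two_scores` helpers, the one-issue minimum, and the sample data are all hypothetical, and this is not the algorithm from the paper mentioned above.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from statistics import mean

@dataclass
class Rating:
    stars: int               # 1 to 5
    registered_on: date      # when the rater's account was created
    issues_submitted: int    # how many issues the rater has filed

def is_expert(r, today, min_age_days=365, min_issues=1):
    """Assumed expert rule: registered for over a year and filed at least one issue."""
    old_enough = (today - r.registered_on) > timedelta(days=min_age_days)
    return old_enough and r.issues_submitted >= min_issues

def two_scores(ratings, today):
    """Return (score from all users, score from experts); None when there are no ratings."""
    all_stars = [r.stars for r in ratings]
    expert_stars = [r.stars for r in ratings if is_expert(r, today)]
    overall = mean(all_stars) if all_stars else None
    expert = mean(expert_stars) if expert_stars else None
    return overall, expert

# Made-up example: two long-time contributors and two brand-new accounts.
today = date(2009, 6, 1)
ratings = [
    Rating(5, date(2006, 1, 1), 12),
    Rating(4, date(2007, 3, 15), 3),
    Rating(1, date(2009, 5, 20), 0),
    Rating(1, date(2009, 5, 21), 0),
]
overall, expert = two_scores(ratings, today)
print(overall, expert)
```

Showing both numbers side by side lets a reader spot a large gap between the crowd score and the expert score, which may hint at manipulation. The expert rule itself can of course be gamed by patient attackers, so the thresholds are a design choice rather than a guarantee.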
To sum up, in this post I'm arguing that:
- A module rating system for drupal.org would be helpful.
- But the concern about gaming the system is real and legitimate.
- Countermeasures should be put in place if the rating system is deployed.
My summer funding will probably come from an NSF research grant on manipulation-resistant algorithms. After I read more papers in the area, I'll try to make some proposals on how to prevent gaming of the module rating system, should the d.o. infrastructure team decide to implement it.