Friday, August 27, 2010

Lessons from another million dollar prize


 I remember that on October 2, 2006 Netflix announced a million dollar prize for improving the algorithm it uses to suggest movies based on a viewer's previous ratings; mathematically speaking, the goal was to reduce the Root Mean Squared Error by 10% (equivalently, the Mean Squared Error by 19%). If nobody reached this target, Netflix promised $50,000 each year for the best result. I have to confess I was hoping somebody would come up with a better algorithm within a year, but I did not hear anything after a year, and somehow I missed the news last September that the winners of the million dollar prize had been announced. So I was really happy to notice in the first seminar bulletin of the new academic year that today Robert Bell from AT&T Labs-Research would be talking about this Netflix Prize.
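To see why the two numbers match: RMSE is the square root of MSE, so cutting RMSE by 10% cuts MSE by 1 - 0.9² = 19%, whatever the starting error. A tiny sketch (the rmse helper and the numbers are purely illustrative, not Netflix's own code or figures):

```python
import math

def rmse(predicted, actual):
    """Root Mean Squared Error over paired predicted and true ratings."""
    errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(errors) / len(errors))

print(rmse([3.8, 2.1, 4.6], [4, 1, 5]))   # error of some made-up predictions

# MSE = RMSE**2, so a 10% drop in RMSE gives (0.9 * RMSE)**2 = 0.81 * MSE,
# i.e. a 19% drop in MSE, regardless of the baseline value.
baseline = 1.0                             # any baseline RMSE
improved = 0.9 * baseline                  # 10% better RMSE
print(1 - improved ** 2 / baseline ** 2)   # 0.19
```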

In the beginning there were 50,051 contestants for the prize. It was interesting to learn about the amount of data and the parameters of this problem, and also that solutions were sought not by individuals but by teams. The whole process was a long and dramatic marathon that ended with a photo finish.

First, all teams were given training data: 100 million ratings made over six years (2000-2005) by 480,000 users for 17,700 movies (where "movies" also included TV shows etc.). Then there were test data: the last few ratings of each user. Robert Bell gave some interesting details about these ratings. For example, in 2004 people became happier and movie ratings went up; why is not clear, and as he put it, it does not mean that Hollywood became better. The data were scattered: some people rated only movies they liked or disliked, some rated every movie they watched, some rated only occasionally. There were also some extreme cases: one user had rated 17,651 movies (almost every single one) with an average rating of 1.9 (on a scale from 1 to 5), another rated 17,32 with an average rating of 1.81. One user managed to rate 5,000 movies in one day; as Bob Bell commented, that user must have had some computer science background to do so. For some unknown reason the most-rated movie turned out to be Miss Congeniality. The average number of ratings per user was 208. A huge amount of data, and nevertheless 99% of the data were missing, and certainly not missing at random. Users were identified only by their ID number, but if a user did not rate any children's movies, you can assume that this user is not a child. It became clear that what you rate and what you don't provides information about your preferences. User behavior may also change over time.
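Bell did not go into data-handling details, but the scale itself dictates the representation: with 99% of the user-by-movie matrix empty, you store only the observed entries. A toy sketch of such a sparse store and the kind of per-user summaries quoted above (all identifiers and numbers here are made up):

```python
from collections import defaultdict

# Illustrative only: the real training set is ~100 million (user, movie, rating)
# rows for ~480,000 users and ~17,700 movies, so roughly 99% of the
# user-by-movie matrix is empty and only observed entries are kept.
ratings_by_user = defaultdict(dict)   # user_id -> {movie_id: rating}

def add_rating(user_id, movie_id, rating):
    ratings_by_user[user_id][movie_id] = rating

add_rating(7, 42, 4)
add_rating(7, 99, 1)
add_rating(8, 42, 5)

# Per-user summaries like the extremes mentioned above (counts, averages).
for user, movies in ratings_by_user.items():
    count = len(movies)
    avg = sum(movies.values()) / count
    print(f"user {user}: {count} ratings, average {avg:.2f}")
```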
The research team came up with various models; the largest one took about a month to create and had billions of parameters.
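The talk summary does not name the individual models, but matrix factorization fitted by stochastic gradient descent is one of the publicly documented workhorses of the winning blends, so here is a deliberately tiny, illustrative sketch (factor count, learning rate and regularization are arbitrary choices of mine, not the team's):

```python
import random

def factorize(ratings, n_users, n_movies, n_factors=10,
              lr=0.01, reg=0.05, epochs=20):
    """Tiny SGD matrix factorization: rating ~ dot(p[user], q[movie])."""
    random.seed(0)
    p = [[random.gauss(0, 0.1) for _ in range(n_factors)] for _ in range(n_users)]
    q = [[random.gauss(0, 0.1) for _ in range(n_factors)] for _ in range(n_movies)]
    for _ in range(epochs):
        for user, movie, rating in ratings:
            pred = sum(pu * qm for pu, qm in zip(p[user], q[movie]))
            err = rating - pred
            for f in range(n_factors):
                pu, qm = p[user][f], q[movie][f]
                p[user][f] += lr * (err * qm - reg * pu)   # gradient step on user factor
                q[movie][f] += lr * (err * pu - reg * qm)  # gradient step on movie factor
    return p, q

# (user, movie, rating) triples; indices and ratings are made up.
train = [(0, 0, 5), (0, 1, 3), (1, 0, 4), (2, 1, 1), (2, 2, 2)]
p, q = factorize(train, n_users=3, n_movies=3)
print(sum(a * b for a, b in zip(p[0], q[2])))  # predicted rating of user 0 for movie 2
```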

Statistician G. Box: "All models are wrong; some of them are useful."
When solving such large problems it is important to find good teammates. During the competition some teams merged and some invited helpers. In the first year Bell's team had 107 sets (methods); by the end the number was over 800. But one of the lessons they learned was that a handful of simple models achieved 80% of the improvement.
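On combining hundreds of methods: the published Netflix Prize write-ups describe blending, i.e. fitting a weight for each model's predictions on held-out ratings. A minimal sketch using ordinary least squares (the three "models" and the ratings are stand-ins, not real data):

```python
import numpy as np

# Each column is one model's predictions on a hold-out set; y is the true ratings.
# The three "models" and ratings below are invented for illustration.
preds = np.array([
    [3.9, 3.6, 4.2],
    [1.4, 2.0, 1.1],
    [4.8, 4.5, 5.0],
    [2.9, 3.1, 2.7],
])
y = np.array([4.0, 1.0, 5.0, 3.0])

# Fit blending weights by least squares, then combine the models.
weights, *_ = np.linalg.lstsq(preds, y, rcond=None)
blend = preds @ weights
print(weights, np.sqrt(np.mean((blend - y) ** 2)))  # weights and blended RMSE
```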
The race in the final days of the competition was very, very close: with 11 days left to go, the first three teams had results of 9.91%, 9.89% and 9.83%. The two leading teams ended up with the same result, but the winner was the team that submitted its result 20 minutes earlier...
At the end, of course, there is the question: what about the money? Robert Bell said that the million dollars was initially important for attracting researchers to the problem; later the money somehow lost its significance. It was the challenge of solving this multi-layered, large-data problem that kept the researchers going. Interesting ideas appeared during the competition, along with lots of collaborative work.
Who received the million dollars? Well, look here.
More about how this Netflix Prize was won.
