Thursday, September 14, 2006

The perceived value of recommendations

On the TidBITS Talk list, a discussion started about the quality of recommender systems. A rather laughable emailed suggestion from Amazon prompted the original poster to ask why these systems are still mediocre at best.

It reminded me of the session on the TechLens project at the CNI fall 2005 task force conference. There were some interesting observations on when recommenders work and when they don't (most of these came up in the Q&A, so they aren't covered in the abstract).


Such systems face two problems: the quality of the underlying data, and the question of the desired neighbourhood. I'll start with the latter.

How widely do you, as a user, want the recommendations to vary? When you are new to a subject, you want the defining standard works - a narrow view. As you get more versed in the subject, you no longer want those predictable results, because you will already be familiar with them. Without surprise, a recommendation has no value. Different users want different results.
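
To make that concrete: in a toy user-based recommender (a Python sketch with invented users and items, not anyone's actual system), the width of the neighbourhood is literally a parameter - a small k sticks to your closest peers and their standard works, while a larger k lets the oddities in:

    from collections import Counter

    # Hypothetical purchase/reading histories.
    history = {
        "alice": {"intro_book", "classic_1"},
        "bob":   {"intro_book", "classic_1", "classic_2"},
        "carol": {"classic_1", "obscure_1"},
        "dave":  {"obscure_1", "obscure_2"},
    }

    def similarity(a, b):
        # Jaccard overlap between two users' histories.
        inter = len(history[a] & history[b])
        union = len(history[a] | history[b])
        return inter / union if union else 0.0

    def recommend(user, k):
        # Vote over the k most similar users; small k gives the
        # predictable standard works, larger k lets outliers in.
        neighbours = sorted((u for u in history if u != user),
                            key=lambda u: similarity(user, u),
                            reverse=True)[:k]
        votes = Counter(item for u in neighbours
                        for item in history[u] - history[user])
        return [item for item, _ in votes.most_common()]

    print(recommend("alice", k=1))  # narrow: ['classic_2']
    print(recommend("alice", k=3))  # wide: 'obscure_1' rises to the top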

Research on users' expectations showed that users were most content with a recommender service if it gave five suggestions (in an unobtrusive interface), as long as one or two of those five were 'interesting'. Keep in mind that this was research on users in a strictly defined research field, so it can't be translated directly to other fields, but it gives an indication, and at least it is real, non-anecdotal data.


How does this translate to Amazon? Like the original poster, I get the occasional Amazon suggestion by email, most of which I delete instantly. Only rarely is one actually interesting. As a result, I find them annoying or amusing, depending on the suggestion - and they irritate me almost as much as spam.

However, when I browse Amazon, the recommendations are much less obtrusive: I glance at them when I want to, and then I sometimes do find something interesting in there. And I find myself agreeing with the outcome of the TechLens research: my Amazon miss:hit ratio is about 25:1, and while I would like more hits, it needn't be 1:1.


Now the data. The quality of the suggestions depends on the quality of the underlying data. The ACM TechLens used citations to see which objects were linked, which provides high-quality information on the links between objects.
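
For illustration, the co-citation idea can be sketched in a few lines - my own toy reconstruction with made-up papers, not TechLens code. Two works get linked when later papers cite them together:

    from itertools import combinations
    from collections import Counter

    # Made-up citation lists: each paper -> the papers it cites.
    cites = {
        "p1": {"a", "b", "c"},
        "p2": {"a", "b"},
        "p3": {"b", "c"},
    }

    cocitation = Counter()
    for refs in cites.values():
        for pair in combinations(sorted(refs), 2):
            cocitation[pair] += 1

    print(cocitation.most_common())
    # [(('a', 'b'), 2), (('b', 'c'), 2), (('a', 'c'), 1)]
    # -> 'a' and 'b' are strongly linked, 'a' and 'c' only weakly.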

Amazon, however, has to rely on more primitive metadata, such as the author, and refines this with buying and browsing patterns. It is actually surprisingly good at this, but as with all 'social sites' (of which Amazon is arguably the granddaddy), this needs a critical mass to become reliable. In the dustier corners of the inventory, you get oddball results.
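
A toy example of why that critical mass matters (invented baskets, arbitrary threshold - nothing from Amazon itself): with plain co-purchase counting, a single eccentric order in a thin corner of the catalogue produces a 'link' that looks just as real as a well-supported one, unless you demand a minimum support:

    from itertools import combinations
    from collections import Counter

    # Invented order baskets; the last one is a single odd purchase.
    orders = [
        {"bestseller_1", "bestseller_2"},
        {"bestseller_1", "bestseller_2"},
        {"bestseller_1", "bestseller_2", "cookbook"},
        {"rare_monograph", "random_gift"},
    ]

    copurchase = Counter()
    for basket in orders:
        for pair in combinations(sorted(basket), 2):
            copurchase[pair] += 1

    MIN_SUPPORT = 2  # arbitrary cut-off; below it, one customer dominates

    for pair, n in copurchase.most_common():
        label = "reliable" if n >= MIN_SUPPORT else "oddball corner"
        print(pair, n, label)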

(Nothing new here, BTW - until recently, the quality or even the availability of indices of specialized collections in our rare books department depended entirely on the personal interest of the specialist...)

A good recommender system will always give you some surprising suggestions. It may not always be the surprise you wanted, but if it were predictable, it would be of no value at all! So by definition, there is a high miss-to-hit ratio. The key is that the system must be unobtrusive enough that the misses can be ignored.

~

PS: in the long run, this will all change, once systems can parse the actual objects and build relationships based on their content. There is a lot of research in this area, largely as spin-off from 'Homeland Security' projects. But it is still years away.
