Analytics Strategist

January 27, 2009

Recommendation Algorithm and Personalization

Filed under: Datarology, Web Analytics — Huayin Wang @ 6:29 pm

Recommendation algorithms are at the heart of content personalization!

Why? The answer lies in the growing importance and availability of data, and in the speed of change.

This of course breaks a tradition in which human knowledge and insights drive designs and decisions directly. In the case of personalization, many of these decisions increasingly mix manual human processes with data- and algorithm-driven processes, and in many other cases they can be completely data/algorithm driven.

We should really not be too surprised by this if we stop and ask where human knowledge and insights come from. They are based on data, many different kinds of data. When the relevant data are sufficiently available and the learning process is well understood, putting human effort in between otherwise automatable processes can only add inefficiency.

I have just run into an interesting post about music recommendation, a field rich with different ways of doing personalization/recommendation. Here it is: Four Approaches to music recommendations.

December 18, 2008

Google’s Achilles’ heel – a follow-up

Filed under: Datarology, Technology — Huayin Wang @ 9:52 pm

As a follow-up to one of my earlier posts about Google’s Achilles’ heel, I’d like to add that Google’s latest SearchWiki seems to be an interesting response to what I mentioned earlier. I know, I know, it is not quite like that 🙂

I love to count the many different ways of ranking stuff in response to a search query. The objects being ranked can be text, links, documents, images, videos, etc. The ranking principle is essentially a rule of relevance and/or similarity. I count four main types:

1) by content similarity; the algorithm could be PageRank, HITS, etc. (strictly speaking, those two rank by link structure, which stands in for the relevance of the content). For images and videos, this can prove to be very difficult because it involves not only hard-core technology such as pattern recognition for images, but also large stocks of prior knowledge about object categorization, etc.

2) by similarity of user behavior, using some kind of collective intelligence or collaborative filtering algorithm. User behavior can serve as implicit voting; with the algorithms’ help, the complexity of the ranking operation can be dramatically reduced.

3) by similarity of users’ explicit ratings. This covers users’ search phrases and explicit ratings (ratings/reviews on Amazon, as well as Google’s latest SearchWiki, which interestingly only affects what the user sees next time, not what anyone else sees). Some type of social/collective intelligence algorithm has to be applied in order to solve the complexity issue, as well as the sparse data problem that arises when crossing search queries with user ratings.

4) of course, there is always the money logic.
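
To make the first two types a bit more concrete, here is a minimal Python sketch. It is only an illustration under my own assumptions, not how Google, Amazon, or anyone else actually does it: the function names (`rank_by_content`, `rank_by_behavior`), the bag-of-words representation, and the toy data are all hypothetical. Type 1 is approximated by cosine similarity between the query words and each document's words; type 2 is a bare-bones user-based collaborative filter that treats clicks as implicit votes.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_content(query, docs):
    """Type 1 (assumed form): rank documents by word overlap with the query.

    docs maps a document id to its text. Returns ids, best match first.
    """
    q = Counter(query.lower().split())
    scores = {
        doc_id: cosine(q, Counter(text.lower().split()))
        for doc_id, text in docs.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


def rank_by_behavior(user, clicks):
    """Type 2 (assumed form): user-based collaborative filtering.

    clicks maps a user to the set of items that user clicked on.
    Items the user has not seen are scored by summing the similarity
    of every other user who clicked on them (implicit voting).
    """
    own = {item: 1.0 for item in clicks[user]}
    scores = Counter()
    for other, items in clicks.items():
        if other == user:
            continue
        sim = cosine(own, {item: 1.0 for item in items})
        if sim == 0:
            continue  # users with nothing in common contribute no votes
        for item in items - clicks[user]:
            scores[item] += sim
    return [item for item, _ in scores.most_common()]


# Hypothetical toy data, for illustration only.
docs = {"d1": "jazz piano trio", "d2": "rock guitar solo", "d3": "jazz guitar"}
clicks = {"ann": {"a", "b"}, "bob": {"a", "b", "c"}, "cat": {"d"}}

print(rank_by_content("jazz piano", docs))
print(rank_by_behavior("ann", clicks))
```

Type 3 would look much like `rank_by_behavior` with explicit rating values in place of the 0/1 click votes, which is exactly where the sparse data problem shows up: most users rate very few items.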

If you know of more ranking logic than what is posted here, I’d like to hear about it.
