Looking at micro attribution through a conversion modeling lens, there are a few insights we can offer right away without getting into the details.
1) The sampling bias
If your attribution analysis uses only data from convertors, you have a sampling bias problem.
Understanding the data sample, and therefore the potential sampling bias, is a first-order question for any modeling project. How is this relevant to the attribution problem?
Consider a hypothetical, but commonly asked, type of question:
What are the right attributions to banner and search, given the following conversion path data?
Banner -> conversion: 20%
Banner -> search -> conversion: 40%
Search -> conversion: 25%
Search -> Banner -> conversion: 15%
Well, what could be wrong with the question? A standard setup for a conversion model is to use conversion as the dependent variable, with banner and search as predictors. The problem here is that we only have convertor cases and no non-convertor cases. We simply can't fit a model at all. We need more data, such as the number of people who clicked on the banner but did not convert.
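To make this concrete, here is a minimal sketch. The convertor counts below are implied by the 20/40/25/15 split (assuming 100 conversions); the exposure counts are entirely made up for illustration. The point is that without a denominator from non-convertors, P(conversion | path) is undefined, and with it, the picture can change dramatically.

```python
# Convertor path counts alone cannot support a conversion model: the
# dependent variable never varies, since every record is a conversion.
convertors = {  # implied by the 20/40/25/15 split, assuming 100 conversions
    ("banner",): 20,
    ("banner", "search"): 40,
    ("search",): 25,
    ("search", "banner"): 15,
}

# Hypothetical exposure counts (convertors + non-convertors), made up
# for illustration -- this is the missing data the post is asking for.
exposed = {
    ("banner",): 2000,
    ("banner", "search"): 400,
    ("search",): 500,
    ("search", "banner"): 300,
}

# With a denominator, conversion rates per path become computable.
rates = {}
for path, conv in convertors.items():
    rates[path] = conv / exposed[path]
    print(" -> ".join(path), f"conversion rate = {rates[path]:.1%}")
```

Note that under these made-up exposures, banner-only is the most common convertor path yet has the lowest conversion rate, which is exactly the kind of reversal that convertor-only data hides.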
The sampling bias issue actually runs deeper than this. We want to know whether the coverage of banner and search is biased in the data we are using; for example, the banner campaign may have been national while search was regional. We also need to ask whether future campaigns will be run in ways similar to what happened before: the modeling setup must mimic the context in which the model will be applied.
2) Encoding sequential patterns
The data for micro attribution naturally takes the form of a collection of events/transactions.
Some may think that this form of data makes predictive modeling infeasible. This is not the case. Plenty of predictive modeling is done with transaction/event data: fraud detection and survival models, to name a couple. In fact, there are sophisticated practices for mining and modeling sequential patterns that go far beyond what is typically covered in discussions of the attribution problem. The simple message is: this is a well-researched, well-practiced area, and a great amount of knowledge and expertise already exists.
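As a small illustration of how event data can feed a model, here is a sketch of encoding an ordered sequence of channel touches into flat features. The feature choices (touch counts, first/last touch, an ordering flag) are illustrative, not a recommendation from any vendor or standard.

```python
def encode_path(events):
    """Encode an ordered list of channel touches as flat model features.

    `events` is a sequence like ["banner", "search"], ordered in time.
    """
    return {
        # Frequency features: how often each channel appears.
        "n_banner": events.count("banner"),
        "n_search": events.count("search"),
        # Position features: first and last touch.
        "first_touch_banner": int(bool(events) and events[0] == "banner"),
        "last_touch_search": int(bool(events) and events[-1] == "search"),
        # A simple sequential pattern: did a banner touch precede search?
        "banner_before_search": int(
            "banner" in events
            and "search" in events
            and events.index("banner") < events.index("search")
        ),
    }

features = encode_path(["banner", "search"])
print(features)
```

Once encoded this way, each visitor's path becomes one row of predictors, and conversion (1/0) becomes the dependent variable for a standard model.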
3) Separating model development from implementation processes
Again, common sense from the predictive modeling world can shed light on how our web analytics industry should approach the attribution problem. All WA vendors are trying to figure out this crucial question: how should they provide data and tool services to help clients solve their attribution problem? Should they provide data, provide attribution rules, or provide flexible tools so that clients can specify their own attribution rules?
The modeling perspective says there is no generic conversion model that is right for all clients, much as in Direct Marketing we all know there is no one right response model for all clients, even clients in the same industry. Discover Card will have a different response model than American Express, partly because of differences in their targeted populations and their services, and partly because of differences in the availability of data. Web Analytics vendors should provide data sufficient for clients to build their own conversion models, not build ONE standard model for all clients (of course, they can offer separate modeling services, which is a different story). Web Analytics vendors should also provide tools so that a client's model can be specified and implemented once it has been developed. Given the parametric nature of conversion models, none of the tools from the current major Web Analytics vendors seems sufficient for this task.
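The separation being argued for can be sketched as follows. The development side (the client's own model build) produces a parametric specification; the implementation side only needs to store and apply it. All coefficient values below are made up for illustration, and the logistic form is just one common choice of parametric conversion model.

```python
import math

# Hypothetical output of a client's own model development: a parametric
# spec, not a vendor-built model. Coefficients here are invented.
model_spec = {
    "intercept": -3.0,
    "coefficients": {
        "n_banner": 0.4,
        "n_search": 0.9,
        "banner_before_search": 0.3,
    },
}

def score(spec, features):
    """Apply a logistic model spec to one visitor's feature dict.

    This is the only piece the implementation side needs: a generic
    scorer that works for ANY spec of this shape, whatever the client
    developed.
    """
    z = spec["intercept"] + sum(
        weight * features.get(name, 0)
        for name, weight in spec["coefficients"].items()
    )
    return 1.0 / (1.0 + math.exp(-z))

p = score(model_spec, {"n_banner": 1, "n_search": 1, "banner_before_search": 1})
print(f"predicted conversion probability: {p:.3f}")
```

A tool built around this separation can serve every client's model without the vendor ever building the model itself, which is the point of the paragraph above.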
That is all for today. Please come back to read the next post: conversion model – not what you want but what you need.