Analytics Strategist

May 24, 2012

The Principles of Attribution Model

Filed under: attribution analytics – Huayin Wang @ 7:36 pm

(Disclaimer: some questions and answers below are totally made up; any resemblance to anything anyone said is purely coincidental.)

How do we know an attribution model, such as Last Click Attribution, is wrong?

  • it is incorrect – surprise, surprise, a lot of people just make the claim and are done with it
  • it does not accurately capture the real influence a campaign has on purchase – but how do you know it?
  • it only credits the closer – isn’t this just a re-statement of what it is?
  • it is unfair to the upper funnel and only rewards the lower funnel – are you suggesting that it should reward every stage of the funnel? why?
  • it leads to budget mis-allocation so your campaign is not optimized – how do you know?
  • it is so obvious, I just know it – what?

How do we know an attribution model, such as an equal attribution model, is right?

  • it is better than LCA – intuition?
  • it gives out different credits than LCA, so you can see how much mis-allocation LCA does to your campaign – being different from LCA does not automatically make it right
  • we tested it and it generates better success metrics for the campaign – sounds good, how?
  • it is fair – what does that mean?

How do we find the right attribution model?

  • try different attribution models and test the outcome – an attribution model does not generate campaign outcomes directly
  • play with different models and see which one fits your situation better – how do I measure the fit?
  • use statistical modeling methodology to measure influence objectively – what models? conversion models?
  • use predictive model for conversion – why predictive models? what models? how to calculate influence and credit from the models?
  • test and control experiments – how many test and control groups, and what formula to use to calculate credit?
  • you decide, we allow you to choose and try whatever attribution weights you want – but I want to know which one is right
  • the predictive models help you with optimization, once we get that, you do not care about attribution – but I do care …
  • shh … it is proprietary: I won’t tell you or I will kill you! – ?

The Principle of Influence

Three principles are often implicitly used:  the “influence principle”,  the “fairness principle” and the “optimization principle”.

The influence principle works like this: assuming we can measure each campaign’s influence on a conversion, the correct attribution model gives credit to campaigns in proportion to their influence.  The second principle is often worded in terms of “fairness”, but it is very much the same as the first: if multiple campaigns contribute to a conversion, giving 100% of the credit to only one of them is “unfair” to the others.  The third principle, the optimization principle, is in my understanding more about the application of attribution (or the benefit of it) than about the principle of attribution.
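As a minimal sketch of the influence principle stated above, assume the per-campaign influence on one conversion has somehow already been measured. The function name and the scores are hypothetical; how to obtain such scores is exactly the open question. Under that assumption, the credit is just the normalized influence:

```python
def influence_credit(influence):
    """Split one conversion's credit in proportion to each campaign's influence.

    `influence` maps campaign name -> non-negative influence score.
    The scores are assumed inputs; this sketch does not measure them.
    """
    total = sum(influence.values())
    if total <= 0:
        raise ValueError("at least one campaign must have positive influence")
    return {campaign: score / total for campaign, score in influence.items()}


# Hypothetical influence scores for a single conversion.
credits = influence_credit({"display": 1.0, "email": 1.0, "search": 2.0})
print(credits)  # {'display': 0.25, 'email': 0.25, 'search': 0.5}
```

Last Click Attribution corresponds to the special case where every campaign except the last is assumed to have zero influence.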

The principle of influence is the anchor of the three; the fairness and optimization principles are either a softer version or a derivative of it.

Now that we have our principle, are we close to figuring out the right approach to attribution models?  We need to look more closely at the assumption behind this principle.  Can we objectively measure (quantify) influence?  Are there multiple solutions, or just one right way to do this?

If the influence principle is the only justification for attribution models, then quantitative measurement methodology such as probabilistic modeling (sometimes called an “algorithmic solution”, which I think is a misnomer) will be the central technology to use.  It leaves no room for arguing on the grounds of intuition alone.  Those who offer only intuition and experience, plus tools for clients to play with whatever attribution weights they like, are not attribution solution providers but merely vendors of flexible reporting.

Those of the intuition-and-experience school like to frame attribution models around the order and position of touch points: first/last/even and introducer/assist/closer (how many vendors are doing this today?).  They have trouble providing a quantitative, probabilistic solution to the attribution problem.  The little-known fact is that this framing is analytically flawed: labels such as “last touch” and “closer” are only known post-conversion, and are therefore not usable inside a probabilistic modeling framework.  In predictive modeling and data mining lingo, this is known as the “leakage problem” (search on Google, or read Xuhui’s article that mentions this).
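To illustrate why position labels are only known after the fact, here is a small sketch. The function, the campaign names and the paths are all hypothetical and are not taken from any vendor's implementation. The same touch point gets a different label depending on where the path happens to end:

```python
def position_labels(path):
    """Label each touch point by position: introducer, assist or closer."""
    labels = []
    for i, campaign in enumerate(path):
        if i == len(path) - 1:
            role = "closer"       # the last touch in whatever path we are given
        elif i == 0:
            role = "introducer"
        else:
            role = "assist"
        labels.append((campaign, role))
    return labels


# What is observable right after the second touch: search looks like the closer.
observed_so_far = ["display", "search"]
# The complete path, known only once the conversion has happened.
full_path = ["display", "search", "email"]

print(position_labels(observed_so_far))
# [('display', 'introducer'), ('search', 'closer')]
print(position_labels(full_path))
# [('display', 'introducer'), ('search', 'assist'), ('email', 'closer')]
```

Because the “closer” label is fixed by where the conversion cuts the path off, a model that uses it as an input is being fed information about the outcome it is supposed to predict.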

Unfortunately, we have a problem with the data scientist camp as well, but of a different nature: the lack of transparency about metrics, models and process details.  Some vendors are unwilling to open up their “secret sauce”.  Perhaps, but is that all?  I will try to demystify and discuss the “secret sauce” of attribution modeling.



May 17, 2012

Attribution Model vs Attribution Modeling

Attribution is a difficult topic, growing into a mess of tangled threads.

I hope this post, and subsequent ones, will help to untangle the messy threads.  I like to start with simple stuff, be meticulous with the use of words and concepts and be patient; after all, haste makes waste.

When an advertiser records a conversion or purchase, sometimes there are multiple campaigns in the touch point history of the conversion; how do we decide which campaign(s) are responsible for the conversion, and how should the conversion be credited to each of these campaigns?  This is the attribution problem: a practical issue first raised in digital advertising, but in itself a general analytical challenge.  It is applicable to many marketing/advertising contexts, across channels or within a particular channel, for example.

Micro vs Macro

Notice that Attribution is a “micro” level problem: it deals with each individual conversion event.  In contrast, the Marketing Mix Model (or Media Mix Model) deals with a “macro” level problem: crediting conversion volume to each channel/campaign in aggregate.  There are similarities between the two when viewed from the business side, but they are quite different analytic problems, different in all major aspects of the analytic process: from data to methodology to application.

Attribution Model vs Attribution Modeling

Advertisers implement business rule(s) to handle this “attribution”, or credit distribution, process.  These rules are generally called “attribution rules” or “attribution models”; examples are the Last Click Model, First Click Model, Fractional Attribution Model, etc.  Rules and models are interchangeable in this regard: they serve as an instruction set for the execution of the attribution process.

There is no shortage of attribution rules or models being discussed.  Anyone can come up with a new one, as long as it partitions the credit.  The challenge is finding the right one among too many to choose from.  In other words, the problem is the lack of justification for the approach, process and methodology behind Attribution Rules/Models.
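To make the point concrete, here is a minimal sketch of three such rules as functions over one conversion's touch point history. The function names, the example path and the equal-split reading of “fractional” are illustrative assumptions, not a standard definition. The only requirement on a rule is that it partitions the credit:

```python
from collections import defaultdict


def last_click(path):
    """All credit goes to the final touch point."""
    return {path[-1]: 1.0}


def first_click(path):
    """All credit goes to the first touch point."""
    return {path[0]: 1.0}


def equal_fractional(path):
    """Equal share per touch point; a repeated campaign accumulates its shares."""
    credit = defaultdict(float)
    for campaign in path:
        credit[campaign] += 1.0 / len(path)
    return dict(credit)


def partitions_credit(credit):
    """Check the one requirement: non-negative shares that sum to 1."""
    return all(v >= 0 for v in credit.values()) and abs(sum(credit.values()) - 1.0) < 1e-9


# Hypothetical touch point history for a single conversion.
path = ["display", "email", "search", "email"]
for rule in (last_click, first_click, equal_fractional):
    credit = rule(path)
    print(rule.__name__, credit, partitions_credit(credit))
```

All three rules pass the partition check on the same path while disagreeing about who gets the credit, which is why the check alone cannot tell us which rule is right.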

Now comes Attribution Modeling – a statistical, model-based approach to quantifying the impact of each campaign on each individual’s conversion behavior.  It is a data-driven algorithmic approach; it is hot and cool, with an aura of objectivity around it.  It is often treated as the secret sauce that unlocks attribution and optimization, and is kept inside a proprietary black box.

Let me slow down a bit.  I have discussed two important concepts here: Attribution Model and Attribution Modeling.  The former refers to the attribution rules; the latter refers to the process of generating/justifying the rules.  I understand that not everyone agrees with my use of the words, or with the distinction between the two; but I think this distinction is critical for untangling the threads in the attribution discussion.

Domain Expert vs Data Scientist

There are generally two camps when it comes to the generation/justification of attribution models.  The first is the “domain experts” and the second the “data scientists”.  Domain experts take issue with attribution models by pointing out the good and the bad, arguing on behalf of common sense, experience and intuition; their approach is qualitative, insightful and at times interesting, but generally pessimistic, and it falls short when it comes to building a rigorous data-driven solution.  The general principle for justifying attribution is one of two: influence or fairness.  The influence principle attributes credit based on the influence of the campaign on the conversion, whereas the fairness principle is usually stated only in general terms.

The fairness principle is not a concern for the data scientist camp; for them, it is all about modeling/quantifying the impact or influence.  After all, if you can do the attribution based on precise measurements of the influence of each touch point, what other principle do you need?  Of course, the problem is often about the promise and less about the principle.  In contrast to the domain experts, the data scientists’ approach is quantitative, rigorous and data driven.  You can argue with the choice of a specific modeling methodology, but the resulting model itself does not require common sense or past experience to justify.

Principle of Attribution:  Influence, Fairness, Optimality

A third principle in picking the right attribution model is optimality, for lack of a better word.  Do the right attribution models lead to optimal campaign management?  Some argue yes.  Is the reverse true?  Can optimality be a principle for choosing or justifying an attribution model?  These are some of the things I will discuss and debate in my next writeup.

Thanks for reading!
