Analytics Strategist

March 30, 2013

An Unusually Blunt Dialogue on Attribution – Part 3

Filed under: misc — Tags: , , , — Huayin Wang @ 4:53 pm
Q: It’s been over a week since we talked last time, and I am still in disbelief.  If what you said is true, the multi-touch attribution problem is solved! Then again, I feel there are still so many holes. Before we discuss some challenging questions, can you sketch out the attribution process – the steps you proposed?

A: Sure.  There are four steps:

Step 1. Developing conversion model(s)
Step 2. Calculating conditional probability profile for each conversion event
Step 3. Applying Shapley Value formula to get the S-value set
Step 4. Calculating fractional credit: dividing S-value by the (non-conditional) conversion probability
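Here is how the four steps might look end to end – a minimal Python sketch in which Step 1 is stubbed out with hard-coded conversion-model scores (the two channels and all probabilities are purely illustrative, not from any real campaign):

```python
from itertools import permutations
from math import factorial

# Steps 1-2 (stubbed): a conversion model would normally be estimated from data.
# Keys are subsets of channels present; values are modeled P(conversion | subset).
p_conv = {
    frozenset(): 0.008,             # P(c | no channel) -- the baseline
    frozenset({"A"}): 0.010,
    frozenset({"B"}): 0.012,
    frozenset({"A", "B"}): 0.020,   # P(c | all channels touched)
}

def shapley_values(channels, v):
    """Step 3: each channel's marginal contribution, averaged over all orderings."""
    sv = {ch: 0.0 for ch in channels}
    for order in permutations(channels):
        present = set()
        for ch in order:
            before = v[frozenset(present)]
            present.add(ch)
            sv[ch] += v[frozenset(present)] - before
    n_orders = factorial(len(channels))
    return {ch: total / n_orders for ch, total in sv.items()}

channels = ["A", "B"]
sv = shapley_values(channels, p_conv)               # Step 3: the S-value set
all_on = p_conv[frozenset(channels)]                # P(c | all channels present)
credits = {ch: s / all_on for ch, s in sv.items()}  # Step 4: fractional credit
```

By the Efficiency property, the S-values sum to P(c|A,B) − P(c|nA,nB), so the fractional credits sum to the total creditable fraction discussed in Part 2.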

Q: And what to call this – the attribution process? algorithm? framework? approach?
A: Attribution agenda – of course, you can call it anything you like.

Q: Why don’t you start with data collection – I am sure you have heard of the GIGO principle and how important having good and correct data is for attribution …
A: I am squarely focusing on attribution logic – data issues are outside the scope of this conversation.

Q: I noticed that there are no rule-based attribution models in your agenda. Are the rules really so arbitrary that they are of no use at all?
A: They are not arbitrary – like any social/cognitive rules, they are customary in nature. For attribution purposes, however, they are neither conversion models, which measure how channels actually impact conversion probability, nor clearly stated justification principles.

Q: What about the famous Introducer, Influencer and Closer framework – the thing everyone uses in defining attribution models – and the insights it provides to attribution?
A: It is really the same concept as the last touch and first touch rules – a position-based way of looking at how channel and touch point sequence are correlated. You can use an alternative set of cleaner and more direct metrics to get similar insights – metrics derived from counting the proportions of a channel’s appearances in conversion sequences as first touch, last touch and neither.
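Those position metrics are easy to compute directly. A minimal sketch, using made-up converting paths and channel names (the convention that a single-touch path counts as a first touch is my own assumption):

```python
from collections import Counter

# Hypothetical converting paths: the ordered channel sequence for each converter.
paths = [
    ["search", "display", "email"],
    ["display", "email"],
    ["search", "email"],
    ["email"],
]

first, middle, last = Counter(), Counter(), Counter()
for path in paths:
    for i, ch in enumerate(path):
        if i == 0:                    # a single-touch path counts as a first touch here
            first[ch] += 1
        elif i == len(path) - 1:
            last[ch] += 1
        else:
            middle[ch] += 1

totals = first + middle + last
# For each channel: proportions of its touches that are first, neither, last.
share = {ch: (first[ch] / totals[ch], middle[ch] / totals[ch], last[ch] / totals[ch])
         for ch in totals}
```

With these paths, "email" closes 3 of its 4 appearances – the same insight a Closer label would give, but as a direct, transparent count.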

Q: Do these rules have no use at all in attribution process? Can they be used in conjunction with conversion models?
A: You do not use them together; there is simply no need for them anymore once you have conversion models.  However, there are cases when you do not have sufficient data to build your models. In that case, you can borrow from other models, or use these rule-based models as heuristic rules.

Q: You are clearly not in the “guru camp” – as you said in your “Guru vs PhD” tweet. Are you in the PhD camp then?
A: No. I also think they may be more disappointed than the gurus from the web analytics side …

Q: I have the same feeling – I think you are killing their hope of being creative in the attribution modeling area. With your agenda, there are no more attribution models aside from conversion models, and no more attribution modeling aside from this one Shapley Value formula and the adjustment factor.
A: The real creativity should be in the development of better conversion models.

Q: Let’s slow down a little bit. I think you may be oversimplifying the attribution problem.  Your conversion models seem to work only when there is one touch event per channel — how do you handle cases with multiple touch events per channel?
A: You may be confusing the conditional probability profile – in which a channel is treated as one single entity – with conversion models. In my mind, you can create multiple variables per channel that reflect complex features of the touch point sequence for that channel: frequency, recency, interval, first-touch indicator, etc. Once the model is developed, you construct the conditional probability profile by turning all the touch points for that channel on or off at the same time.

Q: Ok. How do you deal with the ordering effect – the fact that channel A first, and B second (A,B) is different from (B,A)?
A: You construct explicit order indicator variables in your conversion models … that way, your attribution formula (the Shapley Value) can remain the same.
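A sketch of such explicit order indicator variables – the feature names and the two-channel setup are illustrative, not from the original text:

```python
def conversion_features(path, channels=("A", "B")):
    """Turn one user's ordered touch sequence into conversion-model inputs."""
    f = {}
    for ch in channels:
        f[ch + "_touched"] = int(ch in path)   # channel present at all?
        f[ch + "_freq"] = path.count(ch)       # how many touches from this channel
    # Explicit order indicator: does A's first touch precede B's first touch?
    if "A" in path and "B" in path:
        f["A_before_B"] = int(path.index("A") < path.index("B"))
    else:
        f["A_before_B"] = 0
    return f

# (A, B) and (B, A) now map to different model inputs:
x1 = conversion_features(["A", "B"])
x2 = conversion_features(["B", "A"])
```

If the order truly does not matter, the fitted coefficient on `A_before_B` will simply come out insignificant, as the dialogue notes below.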

Q: And what if the order does not matter?
A: Then the order indicator variables will not be significant in the conversion models.

Q: and the channel interaction?
A: through the usual way you model the interaction effects between two or more main effects.

Q: The separation of conversion model and attribution principle in your agenda is quite frustrating.  Why can’t we find innovative ways of handling both in one model – a sort of magic model, Bayesian, Markovian or whatever?
A: Go find it out.

Q: Control/experiment could be an alternative, couldn’t it?
A: Control/experiment is at best a way of measuring the marginal impact; needless to say, it is an impractical way to measure all the levels of marginal impact that a conversion model will support.  If we have more than a couple of channels, the number of experiments needed goes up exponentially.  It also does not allow post-experiment analysis, and there is no practical way to incorporate recency and sequence patterns, etc.

Q: What about an optimization principle?  If, by requiring that the best attribution rule reflect the optimal way of allocating campaign budget to maximize the number of conversions, one can derive a unique attribution rule, can that be the solution to attribution?
A: No. The attribution problem is about events that have already happened and needs to be answered that way, without requiring any assumptions about the future.  Campaign optimization is a related, but separate, topic.

Q: Your attribution agenda is limited to the conversion event. In reality, there are a lot of other metrics we care about, such as customer lifetime value, engagement value, etc. How do you attribute those metrics?
A: If you can attribute the (conversion) event, you can attribute all metrics derived from it, by figuring out what value is linked to that event. In short, you figure out the fractional credit for the event first, then multiply by the value of the event – that gives you the attribution for the new metric.
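As a quick illustration of that multiplication step – the credits and the lifetime value below are made-up numbers:

```python
# Fractional credits from the attribution step (hypothetical; they sum to the
# creditable fraction, with the remainder belonging to the baseline), and the
# value linked to this conversion event, e.g. the customer's lifetime value.
fractional_credit = {"search": 0.25, "display": 0.35}
lifetime_value = 500.0

# Attributing the derived metric is just credit * value, channel by channel.
attributed_value = {ch: c * lifetime_value for ch, c in fractional_credit.items()}
```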

Q: You have so far not talked about media cost at all – when we know every attribution vendor is using it in the process. How come there is no media cost in your attribution agenda?

A: Media cost is needed to evaluate cost-based channel performance, not for attribution. How much a channel has impacted a conversion is a fact, not dependent on how much you paid the vendor – if there is any relationship, it should be the opposite. The core of the attribution process can be done without media cost data — all vendors ask for it because they want to work on more projects aside from attribution.

Q: Regarding the issue of where the attribution process should reside, you picked the agency.  Isn’t the agency the last place you’d think of when it comes to any technology matter? Since when have you seen an agency put technology at its core competency?
A: Understandable. I said that not for any of the reasons you mentioned, but for what an ideal world should be.  The attribution process is central to campaign planning, execution and performance reporting, at both the tactical and strategic level.  Having that piece sitting outside of the integration center can cause a lot of friction in moving your advertising/marketing to the next level.  I said that it should live inside your agency, but I did not say that it should be “built” by the agency; I did not say it should live inside your “current” agency; and certainly, there is nothing preventing you from making your technology vendor into your “new agency”, as long as they take up the planning, execution and reporting work from your agency, at both strategic and tactical levels.

Q: What about Media Mix Modeling? If we have resources doing that, do we still need to worry about attribution?
A: The micro vs. macro attribution technologies. It is complicated and certainly needs a separate discussion in order to do justice to the topic.  The simplest distinction between them is this: when you know the most detailed data about who was touched by what campaigns, you do attribution.  If you have none of that data, but only aggregate-level media delivery and conversion data, you do MMM.

Q: I have to say that your agenda brings a lot of clarity to the state of attribution. I like the prospect of order; still, I can’t help but think about what a great time everyone has had around attribution models in recent years …

A: Yes – the state of extreme democracy without consensus. To those who have guns, money and power, anarchy may just be the perfect state; not being cynical, just my glass-half-full kind of perspective.

March 21, 2013

An Unusually Blunt Dialogue on Attribution – Part 2

Q: Continuing our conversation from yesterday … I am still confused about the difference between a conversion model, an attribution model and attribution modeling.  Can you demonstrate using a simple example?

A: Sure.  Let’s look at a campaign with one vendor/channel on the media plan …

Q: Wait a minute, that will not be an attribution problem.  If there is only one channel/vendor, does it matter what attribution model you use?

A: It does. Do we give the vendor 100% of the credit? A fraction less than 100% of the credit?

Q: Why not 100%?  I think all commonly used attribution models will use 100% …

A: You may want to think twice, because some users may convert on their own.  Let’s assume the vendor reaches 10,000 users and 100 of them converted. Let’s also assume that, through analysis and modeling work (such as using a control group), you conclude that 80 out of the 100 converters would have converted on their own.  How many converters did the vendor actually (incrementally) impact?

Q: 20.

A: If you assign 100% credit to the vendor, the vendor will get all 100 converters’ credits.  Since the actual number of impacted conversions is 20, a fraction of the credit should be used; in this case it is 20% instead of 100%.  That’s attribution modeling, in its simplest form.

Q: Really? Can you recap the process and highlight the attribution modeling part of it?

A:  Sure. In this simplest example, the conversion model provides us with two numbers (scores):

1) The probability of conversion given that the user was exposed to the campaign, call it P(c|camp) – in this case it is 100/10000 = 1%, and

2) The probability of conversion given that the user was not exposed to the campaign, call it P(c|no-camp) – in this case it is 80/10000 = 0.8%

The attribution modeling says that only a fraction of the credit, (P(c|camp) – P(c|no-camp)) / P(c|camp) = 0.2 or 20%, should be credited out.

Notice that this fraction for attribution is not 100%. It is not P(c|camp) which is 1%; and it is not P(c|camp) – P(c|no-camp) which is 0.2%.
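The arithmetic of this example, as a quick sketch (the numbers are the ones from the dialogue above):

```python
exposed = 10_000     # users reached by the vendor
converted = 100      # of those, how many converted
baseline = 80        # how many would have converted anyway (from a control group)

p_c_camp = converted / exposed       # P(c|camp)    = 1%
p_c_nocamp = baseline / exposed      # P(c|no-camp) = 0.8%

# The creditable fraction: incremental impact relative to total touched conversions.
fraction = (p_c_camp - p_c_nocamp) / p_c_camp   # 20%, i.e. 20 of the 100 converters
```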

Q: This is an interesting formula.  I do not recall seeing it anywhere before.  Does this formula come from the conversion model?

A: Not really.  The conversion model only provides the best possible estimates for P(c|camp) and P(c|no-camp), that’s all.  It will not provide the attribution fraction formula.

Q: Where does this formula come from then?

A: It comes from the following reasoning:  vendor(s) should get paid for what they actually (incrementally) impacted, not all the conversions they touched.

Q: So the principle of this “attribution modeling” is not data-driven but pure reason.  How much should I trust this reasoning?  Can this be the ground on which to build industry consensus?

A: What else can we build consensus on?

Q: Ok, I see how it works in this simple case, and I see the principle of it.  Can we generalize this “incremental impact” principle to multi-channel cases?

A: What do you have in mind?

Q: Let me try to work out the formula myself.  Suppose we have two channels, call them A and B.  We start with conversion model(s), as usual.  From the conversion model(s), we find our best estimates for P(c|A,B), P(c|nA,nB), P(c|nA,B), P(c|A,nB).  Now I understand why it does not matter if we use logistic regression, a probit model or a neural network to build our conversion model – all that matters is to make sure we get the best estimates for the above scores :)

A: Agree.  By the way, I think I understand the symbols you used, such as c, A, nA, nB etc. – let me know if you think I may guess it wrong :)

Q: This is interesting, I think I can get the formula now.  Take channel A first, and let’s call the fractional credit A should get C_a;  we can calculate it with this formula:  C_a = (P(c|A,B) – P(c|nA,B)) / P(c|A,B), right?

A: If you do that, C_a + C_b may be over 100%.

Q: What’s wrong, then?

A: We need to first figure out what fraction of the credit is available to be credited out to A and B, just as in the simplest case discussed before. It should be (P(c|A,B) – P(c|nA,nB)) / P(c|A,B).
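In code, with illustrative probabilities (the two numbers below are made up for the sketch):

```python
# Scores from a two-channel conversion model (hypothetical values).
p_all = 0.020    # P(c|A,B):   conversion probability with both channels present
p_none = 0.008   # P(c|nA,nB): baseline conversion probability with neither

# The fraction of credit available to be shared between A and B;
# the remainder belongs to the baseline, not to any channel.
available = (p_all - p_none) / p_all
```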

Q: I see.  How should we divide the credit between A and B next?

A: That is a question we have not discussed yet.  In the simplest case, with one vendor, it is a trivial question. With more than one vendor/channel, we need some new principle.

Q: I have an idea:  we can re-adjust the fractions on top of what we did before, like this:  C’_a = C_a / (C_a + C_b) and C’_b = C_b/(C_a + C_b);  and finally, we use C’_a and C’_b to partition the above fraction of credit.  Will that work?

(note: the following example has an error in it, as pointed out by Vadim in his comment below)

A: Unfortunately, no.  Take the following example:

suppose A adds no incremental value, except when B is present:  P(c|A,nB) = P(c|nA,nB) and P(c|A,B) > P(c|nA,B)

also, B does not add anything when A is present:  P(c|A,B) = P(c|A,nB)

The calculation will lead to:  C_b = 0 and C_a > 0.  Therefore, A gets all the available credit and B gets nothing.

Do you see a problem?

Q: Yes.  B will feel it is unfair, because without B, A would contribute nothing.  However, A gets all the credit and B gets nothing.

A: And this is just a case with two channels and two players.  Imagine if we get 10 channels/players – what a complicated bargaining game this is going to be!

Q: Compared with this, the conversion model part is actually easy; well, not easy, but more like a non-issue.  We can build conversion models to generate all these conditional probability scores.  However, we are still stuck here and can’t figure out a fair division of credit.
A: This is attribution modeling:  the process or formula that translates the output of conversion models into an attribution model (or fractional credits). We need to figure this thing out.

Q: What is it, really?

A: We are essentially looking for a rule or a formula to divide the total credit that we can all agree is fair.  Is that right?

Q: Right, but we have to be specific about what we mean by “fair”.

A:  That’s right.  So, let’s discuss a minimal set of “fairness” principles that we can all agree upon.  There are three of them, as I see it:

Efficiency: we are distributing all available credit, not leaving any on the table

Symmetry: if two channels are functionally identical, they should get the same credit

Dummy Channel: if a channel contributes nothing in all cases, it should get no credit

What do you think?

Q: I think we can agree with these principles.  How can they help?

A: Well, someone has proved that there is one and only one formula that satisfies this minimal set of principles. I think this is our attribution formula!

Q: Really? I do not believe this.  Who proved this?  Where can I read more of it?

A: In 1953, Lloyd Shapley published the proof in his PhD dissertation, and the resulting formula became known as the Shapley Value. The field of knowledge is called Cooperative Game Theory.  You can Google it and you will find tons of good references. Of course, Shapley did not call it the “attribution problem” and he talked about players instead of channels. His collection of principles is larger than these three; however, the Transferable Utility and Additivity principles are automatically satisfied when applied to the credit partitioning problem.
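For the two-channel case, the Shapley Value has a simple closed form: average each channel’s marginal contribution over the two orders in which the channels can join. A sketch with illustrative conversion-model scores (the four probabilities are made up):

```python
# Conversion-model scores: P(c|A,B), P(c|A,nB), P(c|nA,B), P(c|nA,nB).
pAB, pA, pB, p0 = 0.020, 0.010, 0.012, 0.008

# Average of each channel's marginal contribution over the two join orders.
sv_A = 0.5 * ((pA - p0) + (pAB - pB))
sv_B = 0.5 * ((pB - p0) + (pAB - pA))

# Efficiency: the S-values distribute exactly the available credit P(c|A,B) - P(c|nA,nB).
assert abs((sv_A + sv_B) - (pAB - p0)) < 1e-12

# Fractional credits, dividing by P(c|A,B) as in the single-channel case above.
credit_A = sv_A / pAB
credit_B = sv_B / pAB
```

Note how a dummy channel (one whose presence never changes the probability) gets both marginal contributions equal to zero, hence zero credit – the Dummy Channel principle in action.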

Q: Now, how do you apply this attribution rule differently for different converters?

A: You do not.  The differences among converters are reflected in the scores generated from the conversion models, not in the attribution formula itself – the Shapley Value.

Q: Ok, if that is the case, everyone in the industry will be using the same attribution formula, the Shapley Value.  How do we then creatively differentiate from each other?  How should different types of campaigns be treated uniquely?  How are the effects of channels on different types of conversions reflected in attributed credits?

A: Well, all of these will be reflected in how the conversion models are built, how the parameters of the conversion models are estimated, and finally the scores that come out of the conversion models.  You will innovate on statistical model development techniques. The attribution formula is, fortunately, not where you are going to innovate.

Q: This is quite shocking to me. I can’t imagine how the industry will react …

A: How did the industry deal with Marketing Mix Modeling?  We accepted the fact that those are simply regression models in essence, and started selling expertise on being able to do it thoroughly and do it right.  We do not have to create our own attribution model to be able to compete with each other.

March 20, 2013

An Unusually Blunt Dialogue on Attribution – Part 1

Filed under: misc — Tags: , , — Huayin Wang @ 10:04 pm

Q:  I will begin with this question: what do you NOT want to talk about today?

A:  I do not want to waste time on things that most people know and agree with, such as “Last Touch Attribution is flawed”.

Q: Why is the attribution model such a difficult challenge that, after many years, we still seem to be just scratching the surface of it?

A: No idea.

Q: Let me try a different way, why is it so hard to build an attribution model?

A: It is not.  It is NOT difficult to build an attribution model – in fact, you can build 5 of them in less than a minute:  Last Touch, First Touch etc… :)  It is difficult to build good attribution modeling – a process that produces methodologically sound attribution models.

Q: “Attribution modeling” – is this the kind of tool already available through Google Analytics?

A: No. Those are attribution model specification tools – “you specify the kind of attribution models to your heart’s content and I do reporting using them”.  They do not tell you what the RIGHT attribution model IS. An attribution reporting tool does not make an attribution modeling tool.

Q: “Methodologically sound” – that seems to be at the heart of all attribution debates these days.  Do you think we will ever reach a consensus on this?

A:  Without a consensus on this, how can anyone sell an attribution product or service?

Q: On the other hand, isn’t “algorithmic attribution” already a consensus that everyone can build on?

A: What is that thing?

Q: All vendors seem to take the “algorithmic attribution” approach, possibly adding additional phrases, such as “statistical models”, “data-driven”, etc.  Isn’t that sufficient?

A: How? They never show how it works.

Q: Do you really need to get into that level of detail, the “Black Box” – the proprietary algorithm that people legitimately do not release to the public?

A: There is no reason to believe that anyone has a “proprietary algorithm” for attribution.  Unlike predictive modeling, a domain of technology that can be “externally” evaluated without going inside the Black Box, attribution modeling is like math, a methodology whose validity needs to be internally justified. A Black Box for attribution sounds like an oxymoron to me.  You do not see people claim that they have a “proprietary proof” of Fermat’s Last Theorem.   (Ironically, Fermat himself claimed the proof in the margin of a book without actually showing it, but everyone knows he never intended it to be like that.)

Q: Why then do people claim to have but do not show their algorithmic and/or modeling approach?

A: It is anyone’s guess.  I see no reason for it;  it hurts them and it hurts the advertising industry, particularly the online advertising industry.  I suggest that, from today, every vendor should either stop claiming that they have a proprietary attribution model/modeling or get out of the “Black Box” (the new emperor’s clothes?) and prove the legitimacy of their claim.

Q: Ok, suppose I say: I build a regression model to quantify which channels impact conversion and by how much, then calculate proportional weights based on that, and partition the credits according to the proportions.  What would you say?

A: How?

Q: You are not serious, right?  I am giving you so much detail – how much more do you want?

A: The program and process sound like they will work, and it is quite CLEAR that they are going to work – to non-practitioners’ eyes.  But you know and I know that it does NOT work.  Having built conversion models does not solve the attribution problem.  The attribution problem comes down to the partitioning of credit, i.e. how much of the conversion credit is to be partitioned and how much given to each channel.  The logic has to be explicitly presented and justified.   The core challenge has been glossed over and covered up, but not solved.

Q: Please simplify it for me.

A: There is no automatic translation available from conversion models to attribution models – the process of doing that, which is attribution modeling, has to be explicitly stated.

Q: You defined the attribution problem as partitioning credit to channels – are you talking only about Cross-Channel Attribution?  If I want to focus only on Digital Attribution, or even Publisher Attribution, is what you said still relevant?

A: Yes.  I am talking about it from the data analytics angle – you can just replace the word “channel” with others and the rest will apply.

Q: Ok, what if the conversion model I use is not regression, but some kind of Bayesian model?

A: It does not matter.  It can be Bayesian, a Neural Net or a Hidden Markov Model, as long as it is a conversion model.  The automatic translation is not there.

Q: Does it matter if the conversion model is predictive or descriptive?

A: It should be a conversion model – there are multiple meanings of “predictive model”;  it is essentially a predictive model, but it need not handle the “information leakage” type of issues that a predictive model should.

Q: Does it need to be “causal” model, and not a “correlational” model?

A: Define causal for me.  Specifically, do people know what they mean by a “correlational” model?  Do they know multivariate models and dependence concepts?

Q: I assume we know.  Causal vs. correlational are just common-sense concepts to help us make the discussion around “models” more precise …

A: But neither is a more precise concept than statistical modeling language.  Even philosophers themselves have begun to use statistical modeling language to clarify their “causal” frameworks …

Q: Now I am confused.  Where are we right now?

A: We are discussing statistical models and attribution modeling …

Q: Ok, should we use statistical models when we do attribution?

A: We have to.  Quantifying the impact of certain actions on conversion should be the foundation for any valid attribution process;  there is no more precise way to do that than developing solid statistical models of conversion behavior!

Q: Not even experimental design?

A: Not even that.

Q: But what is the right statistical model?  Some type of regression model, or some Bayesian model, or a Markovian model?

A: It does not have to be any one of them, and yet, any one of them may do the job.

Q: If that is true, how can one justify the objectivity of the model?

A: A conversion model provides the basis for what reality looks like – to the best of our knowledge at the moment.  There can be different types of statistical methodologies for modeling conversion behavior, and that does not create problems for the objectivity of the model output.  We have seen this in marketing response models,  where the modelers have the freedom to choose whatever methodology (type of model) they deem appropriate, and yet it does not compromise the objectivity of the results.

Q: But attribution is different;  when building marketing response models, what is important is the score, not the coefficients or any “form factors” of the model.  In attribution, those form factors, not the scores, are central to deriving the attribution formula.

A: That’s exactly the problem that needs to be corrected. Attribution formula should NOT be built on the “form factors” of the conversion model, but rather on the scores of the conversion models!

Q: Explain more …

A: If you can’t claim that the linear regression model IS the only right model for conversion behavior, you can’t claim that those regression coefficients, the “form factors” of the regression model, are intrinsic to the conversion behavior.  Thus, any attribution formula built on top of them cannot be justified.

Q: And the conclusion, in simpler language …

A: A conversion model is needed for attribution, but the attribution model is not the conversion model.  The attribution model should be built on top of the “essence” part of the conversion models, i.e. the scores, and not the form factors. Attribution modeling is the process of translating conversion modeling results into an attribution model.

Q:  What does that say about the offerings from current vendors?

A:  They often tell us that they build conversion models, but reveal nothing about their attribution modeling methodology.

Q: What if they say that they are hiding their proprietary attribution technology in a Black Box?  Are they just covering up the fact that they have nothing in there, and that they do not know how?

A: Anyone’s guess.  The bottom line is, anyone claiming anything should acknowledge their audience’s right to doubt.

Q: It is common to see companies hiding their predictive modeling (or recommendation engine technology) in a Black Box … why not attribution?

A: Predictive modeling, or even recommendation modeling, are things that can be externally tested and verified.  You can take two predictive models’ scores and test which one has more predictive power without knowing how the models were built.  Attribution modeling is different;  you have to make explicit how and why your way of allocating credit is justified – otherwise, I have no way of verifying and validating your claim.

Q: We are not in the faith business …

A: Amen.

Q: Ok, big deal.  I am an advertiser, what should I do?

A: Demand that anyone selling you attribution products/services show you their attribution methodology.  It is ok if they hide the conversion model part of it, but do not compromise on the attribution modeling.

Q: I am in the vendor business, what should I do?

A: Defend yourself – not by working on defensive rhetoric, but by building and presenting your attribution modeling openly.

Q: If I am an agency, what should I do?

A: Attribution should live inside the agency. You can own it, or rent it;  you should not be fooled by those who would like to make you think attribution modeling is a proprietary technology – it is not.  Granted, you are not a technology company, but attribution modeling is not a proprietary technology.  If you have people who can build conversion models, you are right up there with those “proprietary” attribution vendors.

Q: If attribution modeling becomes an “open” methodology, what about those attribution vendors?  What will they own, and why wouldn’t advertisers and agencies build it themselves?

A: That’s my question too :)

Q: Are vendors going to be out of business?

A: Well, they can still own the conversion modeling part of it … and there are still predictive modeling shops out there, in business …

Q: Somehow, you sound like you know something about this “open secret” already :)  Can you share a little on that?

A: Can we talk tomorrow? I need to leave for this “Attribution Revolution” conference tonight …

March 6, 2013

The difficult problems in attribution modeling

Filed under: misc — Tags: — Huayin Wang @ 10:58 pm

The term “attribution modeling” can mean different things to different people – it is sometimes used interchangeably with “attribution model”. To me, an attribution model refers to things like “last touch”, “first touch” etc. – rules that specify how attribution should be done. Attribution modeling is the process by which an attribution model is generated.  Attribution modeling gives us the model generation process, as well as the reasons and justification for the attribution model being derived.

It is not difficult to come up with an attribution model; in fact, we can make one up in seconds. What is difficult is to determine which one is the right attribution model. Despite all the discussion and progress made over the last few years, there is no consensus about it. And the lack of industry consensus really hurts.

The question about the right attribution model is perhaps misguided; we all know that a model that is right for one business, say e-commerce, may be wrong for another, such as B2B. What is right may also depend on the type of campaign, the type of conversion and even the type of user (male vs. female, adults vs. teens).  The right question should be: what’s the right attribution modeling – the right process by which an attribution model is generated?

Each one of us can easily list the 4 or 5 most commonly used attribution models. What about attribution modeling? Through how many different processes can an attribution model be produced?

Last Click/Last Touch attribution models are examples where intuition is the modeling process.  It is not data driven.  You can argue about the good and the bad conceptually.  The data-driven approach, on the other hand, holds the fundamental belief that the right attribution model should be derived from data.  Within the data-driven approach, there are two slightly differing approaches: experimental design vs. algorithmic attribution.

You may ask, what about Google’s Attribution Modeling Tool in Google Analytics? It is not really an attribution modeling tool in my use of the word; it helps you specify attribution models, not create data-driven ones. It does not tell you how to derive the “right” attribution model.

The data-driven approach is what we will focus on here. There has been great progress in the “algorithmic attribution” approach, and significant businesses have been built on it (Adometry and VisualIQ, to name a couple).  However, none is clear and transparent enough about its key technologies – as an industry, we are left with a lot of confusion.

The set of difficult problems is about exactly that – the core technology of attribution modeling. We need to answer these questions so we can build upon a common ground and move on.  Here’s a list of the questions/problems:

1) Is attribution modeling the same as statistical conversion modeling?

2) What’s the right type of models to use: predictive modeling, descriptive modeling, causal modeling?

3) Does it matter if the model is linear regression, logistic regression or some Bayesian network model?

Stay tuned for more.

September 17, 2012

The state of the attribution business (in marketing/advertising)

Filed under: misc — Tags: , , — Huayin Wang @ 8:01 pm

I have had some chats with friends and colleagues lately on the topic of attribution models; the shared feeling is that there is a glaring contrast between the growing number of vendors and the utter lack of clarity and consensus on approaches. To put this more concretely:

We all know that last click is wrong, but do not know what to do about it.

We do not know what the right approach is and why (and why the others are not) – is it algorithmic attribution? Or is it experimental design?

Why does the right or wrong attribution model matter, and to whom? What does it have to do with optimization, and how?

We are very confused about Marketing Mix Modeling vs. Attribution – is one more right than the other? Is one enough?

And finally: who should build attribution services, and who should own them?

If you think the current state of attribution is clearer than the above picture, please make your voice heard. If you know the answer to any of the above questions, even better! You are more than welcome to share your thoughts and enlighten us, right here!

Thanks!


May 24, 2012

The Principles of Attribution Model

Filed under: attribution analytics — Tags: , , — Huayin Wang @ 7:36 pm

(Disclaimer: some of the questions and answers below are totally made up; any resemblance to anything anyone said is purely coincidental)

How do we know an attribution model, such as Last Click Attribution, is wrong?

  • it is incorrect – surprise surprise, a lot of people just make the claim and are done with it
  • it does not accurately capture the real influence a campaign has on purchase – but how do you know that?
  • it only credits the closer – isn’t that just a restatement of what it is?
  • it is unfair to the upper funnel and only rewards the lower funnel – are you suggesting it should reward the whole funnel? why?
  • it leads to budget misallocation, so your campaign is not optimized – how do you know?
  • it is so obvious, I just know it – what?

How do we know an attribution model, such as an equal attribution model, is right?

  • it is better than LCA – by intuition?
  • it gives out different credits than LCA, so you can see how much misallocation LCA causes in your campaign – but being different from LCA does not automatically make it right
  • we tested it and it generates better success metrics for the campaign – sounds good; how?
  • it is fair – what does that mean?

How do we find the right attribution model?

  • try different attribution models and test the outcome – but an attribution model does not generate outcomes for campaigns directly
  • play with different models and see which one fits your situation better – how do I know the fit?
  • use statistical modeling methodology to measure influence objectively – what models? conversion models?
  • use predictive models for conversion – why predictive models? what models? how do you calculate influence and credit from the models?
  • test and control experiments – how many tests and controls, and what formula do you use to calculate credit?
  • you decide; we let you choose and try whatever attribution weights you want – but I want to know what the right one is!
  • the predictive models help you with optimization, and once we get that, you will not care about attribution – but I do care …
  • shh … it is proprietary: I could tell you, but then I’d have to kill you! – ?

The Principle of Influence

Three principles are often implicitly used:  the “influence principle”,  the “fairness principle” and the “optimization principle”.

The influence principle works like this: assuming we can measure each campaign’s influence on a conversion, the correct attribution model gives credit to campaigns in proportion to their influence. The second principle is often worded in terms of “fairness”, but is very much the same as the first: if multiple campaigns contribute to a conversion, giving 100% of the credit to only one of them is “unfair” to the others. The third, the optimization principle, is in my understanding more about the application (or benefit) of attribution than about the principle of attribution itself.
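The influence principle can be sketched in a few lines. This is a toy illustration only: the influence numbers below are hypothetical, and in practice they would have to come from some upstream conversion model.

```python
def proportional_credit(influences):
    """Split one conversion's credit in proportion to measured influence
    (the 'influence principle')."""
    total = sum(influences.values())
    return {channel: value / total for channel, value in influences.items()}

# Hypothetical influence measurements for a single conversion path:
credits = proportional_credit({"search": 0.6, "display": 0.3, "email": 0.1})
print({ch: round(c, 2) for ch, c in credits.items()})
```

The normalization step is the whole principle; everything hard lives in producing the influence measurements themselves.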

The influence principle is the anchor of the three; the fairness and optimization principles are either a softer version or a derivative of it.

Now that we have our principle, are we close to figuring out the right approach to attribution modeling? We need to look closer at the assumption behind this principle. Can we objectively measure (quantify) influence? Are there multiple solutions, or just one right way to do it?

If the influence principle is the only justification for attribution models, then quantitative measurement methodology such as probabilistic modeling – sometimes called an algorithmic solution, which I think is a misnomer – will be the central technology to use. It leaves no room for argument on the grounds of intuition alone. Those who offer only intuition and experience, plus tools for clients to play with whatever attribution weights they like, are not attribution solution providers, but merely vendors of flexible reporting.

Those of the intuition-and-experience school like to frame attribution models around the order and position of touch points: first/last/even, or introducer/assist/closer (how many vendors are doing this today?). They have trouble providing a quantitative, probabilistic solution to the attribution problem. The little-known fact is that the framing is analytically flawed: the labels “last touch” and “closer” are only known post-conversion, and are therefore not usable inside a probabilistic modeling framework. In predictive modeling and data mining lingo, this is known as the “leakage problem”. (Search on Google, or read Xuhui’s article that mentions it.)
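A toy illustration of the leak, on made-up paths: the “closer” label is a function of the outcome itself, so it is only defined once the conversion has been observed and cannot legitimately appear as a predictor in a conversion model.

```python
def label_closer(path, converted):
    """Return the 'closer' channel; only well-defined post-conversion."""
    if not converted:
        return None  # the path is still open: no closer exists yet
    return path[-1]

converted_path = ["display", "search"]  # this user converted
ongoing_path = ["display", "search"]    # identical touches, still browsing

print(label_closer(converted_path, True))   # the label exists only here
print(label_closer(ongoing_path, False))    # undefined before the outcome
```

Two identical histories get different labels purely because of the outcome – which is exactly why a model trained on such a feature learns the answer rather than predicting it.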

Unfortunately, we have a problem with the data scientist camp as well, though of a different nature: the lack of transparency about metrics, models and process details. Some vendors are unwilling to open up their “secret sauce”. Perhaps, but is that all? I will try to demystify and discuss the “secret sauce” of attribution modeling.


May 17, 2012

Attribution Model vs Attribution Modeling

Attribution is a difficult topic, and it has grown into a mess of tangled threads.

I hope this post, and subsequent ones, will help untangle them. I like to start with the simple stuff, be meticulous with the use of words and concepts, and be patient; after all, haste makes waste.

When an advertiser records a conversion or purchase, there are sometimes multiple campaigns in the touch-point history of that conversion; how do we decide which campaign(s) are responsible for the conversion, and how should the conversion be credited to each of them? This is the attribution problem: a practical issue first raised in digital advertising, but in itself a general analytical challenge. It applies to many marketing/advertising contexts – across channels, or within a particular channel, for example.

Micro vs Macro

Notice that attribution is a “micro”-level problem: it deals with each individual conversion event. In contrast, Marketing Mix Modeling (or Media Mix Modeling) deals with a “macro”-level problem: crediting conversion volume to channels/campaigns in aggregate. There are similarities between the two when viewed from the business side, but they are quite different analytic problems, differing in all major aspects of the analytic process: from data to methodology to application.

Attribution Model vs Attribution Modeling

Advertisers implement business rules to handle this “attribution”, or credit distribution, process. These rules are generally called “attribution rules” or “attribution models”; examples are the Last Click Model, the First Click Model, the Fractional Attribution Model, etc. Rules and models are interchangeable in this regard; they serve as the instruction set for executing the attribution process.

There is no shortage of attribution rules or models being discussed. Anyone can come up with a new one, as long as it partitions credit. The challenge is finding the right one – choosing from too many of them. In other words, the problem is the lack of justification for the approach, process and methodology behind attribution rules/models.

Now comes Attribution Modeling – a statistical, model-based approach to quantifying the impact of each campaign on each individual’s conversion behavior. It is a data-driven, algorithmic approach; it is hot and cool, with an aura of objectivity around it. It is often treated as the secret sauce that unlocks attribution and optimization, and wrapped in a proprietary black box.

Let me slow down a bit. I have introduced two important concepts here: Attribution Model and Attribution Modeling. The former refers to the attribution rules; the latter refers to the process of generating and justifying those rules. I understand that not everyone agrees with my use of the words, or with the distinction between the two; but I think it is a critical distinction for untangling the threads in the attribution discussion.

Domain Expert vs Data Scientist

There are generally two camps when it comes to generating and justifying attribution models: the “domain experts” and the “data scientists”. Domain experts take issue with attribution models by pointing out the good and the bad, arguing from common sense, experience and intuition; this is qualitative, insightful and at times interesting, but generally pessimistic, and it falls short when it comes to building rigorous, data-driven solutions. The general principle for justifying attribution is one of two: influence or fairness. The influence principle attributes credit based on a campaign’s influence on conversion, whereas fairness is usually stated only in general terms.

The fairness principle is not a concern for the data scientist camp; for them, it is all about modeling and quantifying impact or influence. After all, if you can do the attribution based on precise measurements of the influence of each touch point, what other principle do you need? Of course, the problem is often about the promise and less about the principle. In contrast to the domain experts, the data scientists’ approach is quantitative, rigorous and data-driven. You can argue with the choice of a specific modeling methodology, but the resulting model itself does not require common sense or past experience to justify it.

Principle of Attribution:  Influence, Fairness, Optimality

A third principle for picking the right attribution model is optimality, for lack of a better word. Do right attribution models lead to optimal campaign management? Some argue yes. Is the reverse true? Can optimality be a principle for choosing or justifying an attribution model? These are some of the things I will discuss and debate in my next writeup.

Thanks for reading!

April 23, 2012

Attribution Model and Attribution Modeling do not mean the same thing

Filed under: misc — Tags: , — Huayin Wang @ 9:27 pm

With great frustration (at myself and the many others who have spoken about attribution models before), I am making a plea here: please make it clear what you mean when you write or speak about attribution models! For those who do not have the patience to think it over, pick one of the two most common uses:

A: Attribution Model as a reference to the process or rules for crediting marketing/advertising success to individual campaigns. Commonly used credit allocation rules have names such as Last Click, First Click and even distribution.

B: Alternatively, people use attribution model to mean the statistical modeling methodologies and/or processes that produce the credit allocation rules above – this could be any kind of test/control experimentation, regression modeling, Bayesian statistical modeling, etc. There are arguments about whether the right model has to be a causal model, an explanatory model or a predictive model.

A or B, which one are you? In other words, which one do you mean when you utter “attribution model”?

I am A; and I use “attribution modeling” for B. This is the best I can do, after quite some time struggling with it.

I believe this is a serious matter.  To quote Confucius: “If language is not correct, then what is said is not what is meant; if what is said is not what is meant, then what must be done remains undone.”

April 11, 2012

Funny analogies of wrong attribution models

Few topics are as near and dear to my heart as attribution modeling. I first bumped into it more than four years ago; my first written piece on attribution was a LinkedIn Q&A answering a question from Kevin Lee on duplication rate (in August 2007). Since then, my interest in attribution got seriously real, resulting in a dozen attribution-related blog posts. The interest never died, although I have not written anything in the last three years.

I am back on it with a vengeance! Consider this as my first one back.

I want to start on a gentle note though. I am amused that people are still debating First Touch vs Last Touch as viable attribution models – a bit beside the point, in my opinion. I want to share some funny analogies for what can go wrong with them.

Starting with the Last Touch attribution model, a football analogy goes like this: “relying solely on a last click attribution model may lead a manager to sack his midfielder for not scoring any goals. Despite creating countless opportunities he gets no credit as his name isn’t on the score-sheet. Similarly a first click attribution model may lead the manager to drop his striker for not creating any goals, despite finishing them.” – BrightonSEO presentation slides

There are a lot of good analogies like this derived from team sports. This one applies not only to Last Touch, but to all single-touch-point attribution models. The funniest one I have heard is about First Touch attribution, from none other than the prolific Avinash Kaushik: “first click attribution is like giving his first girlfriend credit for his current marriage.” – Avinash quote

An analogy is only an analogy; it does not do full justice to what is being discussed. Still, we should learn at least this much: if your attribution model is based solely on the sequencing order of touch points, you are wrong. Those who propose Last, First, Even, Linear or whatever attribution models, watch out!

A good attribution model needs a disciplined development process, and better yet, a data-driven one. The fewer the assumptions made about the values of touch points, the better – we should learn to let the empirical evidence speak for itself.

Do you have any interesting analogy, or thought?

March 5, 2009

The three generations of (micro) attribution analytics

For marketing and advertising, the attribution problem normally starts at the macro level: we have total sales/conversions and marketing spend. Marketing Mix Modeling (MMM) is the commonly used analytics tool, providing a solution based on time series of these macro metrics.

The MMM solution has many limitations that are intrinsically linked to the nature of the (macro-level) data it uses. Micro attribution analytics, when micro-level touch point and conversion tracking is available, provides a better attribution solution. Sadly, MMM is more often practiced even when the data for micro attribution is available; this is primarily due to the lack of development and understanding of micro attribution analytics, particularly the model-based approach.

There have been three types – or better yet, three generations – of micro attribution analytics over the years: the tracking-based solution, the order-based solution and the model-based solution.

The tracking-based solution has been popular in the multi-channel marketing world. The main challenge here is figuring out through which channel a sale or conversion event happens. The book Multichannel Marketing – Metrics and Methods for On and Offline Success by Akin Arikan is an excellent source of information on the most often used methodologies – covering customized URLs, unique 1-800 numbers and many other cross-channel tracking techniques. Tracking is normally implemented at the channel level, not at the level of individual events. Without a tracking solution, the sales numbers by channel are inferred through MMM or other analytics; with proper tracking, the numbers are directly observed.

The tracking solution is essentially a single-touch attribution approach to a multi-touch attribution problem; it does not deal with the customer-level multi-touch experience. This single-touch approach leads naturally to the last-touch-point rule when viewed from a multi-touch attribution perspective. Another drawback is that it is simply a data-based solution without much analytic sophistication behind it – it provides relationship numbers without a strong argument for a causal interpretation.

The order-based solution explicitly recognizes the multi-touch nature of individual consumers’ experience with brands and products. With the availability of micro-level touch point and conversion data, order-based attribution generally seeks attribution rules in the form of a weighting scheme based on the order of events. For example, when all weights are zero except the last touch point’s, it reduces to last-touch-point attribution. Many such rules have been discussed, with constant debate about the virtues and drawbacks of each and every one of them. There are also derived metrics built on these low-level order-based rules, such as the appropriate attribution ratio (Eric Peterson).

Despite the many advantages of the order-based multi-touch attribution approach, there are still methodological limitations. One, as many already know, is that no weighting scheme is generally applicable, or appropriate for all businesses under all circumstances. There is no point arguing which rule is best without the specifics of the business and data context. The proper rule should differ depending on the context; yet there is no provision or general methodology for how the rule should be developed.

Another limitation of the order-based weighting scheme is that, for any given rule, the weight of an event is determined solely by the order of events and not by the type of event. For example, a rule may specify that the first click gets 20% of the attribution – when it might be more appropriate to give the first click 40% if it is a “search” and 10% if it is a “banner click-through”.
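The order-based scheme, and the type-aware extension it lacks, can be sketched as follows. The position weights (40/20/40) and channel factors below are hypothetical, chosen only for illustration.

```python
def position_weights(path):
    """Plain order-based rule: 40% to first touch, 40% to last,
    the remaining 20% split evenly over the middle touches."""
    n = len(path)
    if n == 1:
        return [1.0]
    if n == 2:
        return [0.5, 0.5]
    middle = 0.2 / (n - 2)
    return [0.4] + [middle] * (n - 2) + [0.4]

def typed_weights(path, type_factor):
    """Extension: scale each position weight by a channel-type factor,
    then renormalize so the credits still sum to one conversion."""
    raw = [w * type_factor.get(ch, 1.0)
           for w, ch in zip(position_weights(path), path)]
    total = sum(raw)
    return [w / total for w in raw]

path = ["search", "banner", "email", "search"]
factors = {"search": 2.0, "banner": 0.5}  # hypothetical type adjustments
print([round(w, 3) for w in typed_weights(path, factors)])
```

With type factors, a first-position search touch and a first-position banner touch no longer receive identical credit, which is exactly the flexibility the pure order-based rule cannot express.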

Intrinsic to the intuition-based rule development process is the lack of a rigorous methodology to support any causal interpretation – which is central to correct attribution and to operational optimization.

Here comes the third generation of attribution analytics: model-based attribution. It promises to deliver a sound modeling process for rule development, and provides the analytic rigor for finding relationships that can support a causal interpretation.

More details to come.  Please come back to read the next post: a deep dive example of model-based attribution.

Related post: Micro and Macro Attribution
