Analytics Strategist

April 5, 2013

The multi-touch attribution problem is solved – agree?

Filed under: attribution analytics — Huayin Wang @ 10:08 pm

I believe the MTA modeling problem is solved with the approach I discussed in the Unusually Blunt Dialogue on Attribution.  I have since received some questions about the approach and the agenda; some related to the content and others to the formatting.  Today, I am going to try a simple recap to address those questions.

First of all, the formatting issue. The format in WP is hard to read. A friend of mine (thank you, Steve!) was kind enough to put the content into MS Word.  Anyone interested in reading the dialogues in a better format can download it here:  the attribution dialogue.

Below are Q&A for other questions:

Q: Is the attribution problem solved?

A: Hardly. The attribution problem consists of many challenges: data, model/modeling, behavioral insight, reporting, and finally optimization.

Q: When you started, you were aiming to reach a consensus on Attribution Model and Modeling. Have we reached the consensus? Is this attribution modeling problem solved?

A: Consensus is never easy to build and may never be achieved. I believe I have covered enough ground to build consensus on this issue, so we can move on to other business. I believe the MTA modeling problem is solved, but I am open to anyone who can convince me otherwise.

Q: Are there any remaining issues not covered in your agenda?

A: Yes. One example of the left-out issues is the search–display interaction; we handle part of it, but not completely.

Q: What do you mean?

A: There are two types of interactions:  the interaction effect at the behavioral level, which is covered in the conversion model, and the interaction effect on media exposure.  The latter type of interaction is not capturable by conversion models.

Q: This is quite dense … do we need another methodology to model the likelihood of exposure?

A: I do not think individual level modeling is the right approach – lack of data is not the only challenge …

Q: Ok, if this is so, how can we say attribution modeling is solved?

A: I consider this to be outside the main attribution modeling problem.  This trailing piece may need a different handle – a “re-attribution” methodology?

(more to come)


March 21, 2013

An Unusually Blunt Dialogue on Attribution – Part 2

Q: Continuing our conversation from yesterday … I am still confused about the difference between the conversion model, the attribution model and attribution modeling.  Can you demonstrate using a simple example?

A: Sure.  Let’s look at a campaign with one vendor/channel on the media plan …

Q: Wait a minute, that would not be an attribution problem.  If there is only one channel/vendor, does it matter what attribution model you use?

A: It does. Do we give the vendor 100% of the credit? A fraction less than 100% of the credit?

Q: Why not 100%?  I think all commonly used attribution models will use 100% …

A: You may want to think twice, because some users may convert on their own.  Let’s assume the vendor reaches 10,000 users and 100 of them converted. Let’s also assume that, through analysis and modeling work (such as using a control group), you conclude that 80 of the 100 converters would have converted on their own.  How many converters did the vendor actually (incrementally) impact?

Q: 20.

A: If you assign 100% credit to the vendor, the vendor gets credit for all 100 converters.  Since the actual number of impacted conversions is 20, a fraction of the credit should be used; in this case it is 20% instead of 100%.  That’s attribution modeling, in its simplest form.

Q: Really? Can you recap the process and highlight the attribution modeling part of it?

A:  Sure. In this simplest example, the conversion model provides us two numbers (scores):

1) The probability of conversion given exposure to the campaign, call it P(c|camp) – in this case 100/10000 = 1%, and

2) The probability of conversion given no exposure to the campaign, call it P(c|no-camp) – in this case 80/10000 = 0.8%.

Attribution modeling says that only a fraction of the credit, (P(c|camp) − P(c|no-camp)) / P(c|camp) = 0.2, or 20%, should be credited out.

Notice that this fraction for attribution is not 100%. It is not P(c|camp), which is 1%; and it is not P(c|camp) − P(c|no-camp), which is 0.2%.
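The arithmetic of this single-vendor case can be checked in a few lines (a minimal sketch, using only the numbers from the example above):

```python
# Single-vendor attribution fraction, using the numbers from the example.
p_camp = 100 / 10000     # P(c|camp): 100 converters among 10,000 exposed users
p_no_camp = 80 / 10000   # P(c|no-camp): 80 would have converted on their own

# Fraction of the credit the vendor should receive.
fraction = (p_camp - p_no_camp) / p_camp
print(round(fraction, 4))      # 0.2, i.e. 20%
print(round(fraction * 100))   # 20 incremental conversions out of the 100
```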

Q: This is an interesting formula.  I do not recall seeing it anywhere before.  Does this formula come from the conversion model?

A: Not really.  The conversion model only provides the best possible estimates for P(c|camp) and P(c|no-camp), that’s all.  It does not provide the attribution fraction formula.

Q: Where does this formula come from then?

A: It comes from the following reasoning:  vendor(s) should get paid for what they actually (incrementally) impacted, not all the conversions they touched.

Q: So the principle of this “attribution modeling” is not data-driven but pure reason.  How much should I trust this reasoning?  Can this be the ground on which to build industry consensus?

A: What else can we build consensus on?

Q: Ok, I see how it works in this simple case, and I see the principle of it.  Can we generalize this “incremental impact” principle to multi-channel cases?

A: What do you have in mind?

Q: Let me try to work out the formula myself.  Suppose we have two channels, call them A and B.  We start with conversion model(s), as usual.  From the conversion model(s), we find our best estimates for P(c|A,B), P(c|nA,nB), P(c|nA,B), P(c|A,nB).  Now I understand why it does not matter whether we use logistic regression, a probit model or a neural network to build our conversion model – all that matters is to make sure we get the best estimates for the above scores 🙂

A: Agree.  By the way, I think I understand the symbols you used, such as c, A, nA, nB etc. – let me know if you think I may guess it wrong 🙂

Q: This is interesting, I think I can get the formula now.  Take channel A first, and let’s call the fractional credit A should get C_a;  we can calculate it with this formula:  C_a = (P(c|A,B) − P(c|nA,B)) / P(c|A,B), right?

A: If you do that, C_a + C_b may be over 100%.

Q: What’s wrong, then?

A: We need to first figure out what fraction of attribution is available to be credited out to A and B, just as in the simplest case discussed before. It should be (P(c|A,B) − P(c|nA,nB)) / P(c|A,B).
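To see the issue concretely, here is a small sketch with hypothetical conversion-model scores (the probabilities are made up for illustration and do not come from the dialogue):

```python
# Hypothetical conversion-model scores for two channels A and B.
p_AB  = 0.020   # P(c|A,B)
p_AnB = 0.006   # P(c|A,nB)
p_nAB = 0.005   # P(c|nA,B)
p_nn  = 0.004   # P(c|nA,nB)

# Naive per-channel fractions, as first proposed in the dialogue:
c_a = (p_AB - p_nAB) / p_AB   # 0.75
c_b = (p_AB - p_AnB) / p_AB   # 0.70
print(round(c_a + c_b, 2))    # 1.45 -- over 100%!

# The total fraction actually available to be credited out:
available = (p_AB - p_nn) / p_AB
print(round(available, 2))    # 0.8
```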

Q: I see.  How should we divide the credit between A and B next?

A: That is a question we have not discussed yet.  In the simplest case, with one vendor, this is a trivial question. With more than one vendor/channel, we need some new principle.

Q: I have an idea:  we can re-adjust the fractions on top of what we did before, like this:  C’_a = C_a / (C_a + C_b) and C’_b = C_b/(C_a + C_b);  and finally, we use C’_a and C’_b to partition the above fraction of credit.  Will that work?

(note: the following example has error in it, as pointed out by Vadim in his comment below)

A: Unfortunately, no.  Take the following example:

suppose A adds no incremental value, except when B is present:  P(c|A,nB) = P(c|nA,nB) and P(c|A,B) > P(c|nA,B)

also, B does not add anything when A is present:  P(c|A,B) = P(c|A,nB)

The calculation will lead to:  C_b = 0 and C_a > 0.  Therefore, A gets all the available credit and B gets nothing.

Do you see a problem?

Q: Yes.  B will feel this is unfair, because without B, A would contribute nothing.  Yet A gets all the credit and B gets nothing.

A: This is just a case with two channels and two players.  Imagine if we had 10 channels/players – what a complicated bargaining game that would be!

Q: Compared with this, the conversion model part is actually easy; well, not easy, but more like a non-issue.  We can build conversion models to generate all these conditional probability scores.  However, we are still stuck here, unable to figure out a fair division of credit.

A: This is attribution modeling:  the process or formula that translates the output of conversion models into an attribution model (or fractional credits). We need to figure this thing out.

Q: What is it, really?

A: We are essentially looking for a rule or a formula to divide the total credit that we can all agree as fair.  Is that right?

Q: Right, but we have to be specific about what we mean by “fair”.

A:  That’s right.  So, let’s discuss a minimal set of “fair” principles that we can all agree upon.  There are three of them, as I see it:

Efficiency: we are distributing all available credit, not leaving any on the table

Symmetry: if two channels are functionally identical, they should get the same credit

Dummy Channel: if a channel contributes nothing in all cases, it should get no credit

What do you think?

Q: I think we can agree with these principles.  How can they help?

A: Well, someone has proved that there is one and only one formula that satisfies this minimal set of principles. I think this is our attribution formula!

Q: Really? I do not believe this.  Who proved this?  Where can I read more of it?

A: In 1953, Lloyd Shapley published the proof in his PhD dissertation, and the resulting formula became known as the Shapley value. The field is called Cooperative Game Theory.  Google it and you will find tons of good references. Of course, Shapley did not call it the “attribution problem” and he talked about players instead of channels. The full collection of axioms includes more than three principles; however, the Transferable Utility and Additivity axioms are automatically satisfied when applied to the credit-partitioning problem.
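As a sketch of how the Shapley value divides credit: each channel's credit is its incremental contribution averaged over all orders in which the channels could be added. The lift numbers below are hypothetical, not from the dialogue; think of v(S) as something like P(c|S) − P(c|no exposure), as produced by a conversion model.

```python
from itertools import permutations
from math import factorial

def shapley_credits(channels, v):
    """Shapley value for a credit-partitioning game.
    v maps a frozenset of channels to that coalition's conversion lift;
    v[frozenset()] must be 0."""
    credit = {ch: 0.0 for ch in channels}
    for order in permutations(channels):
        coalition = frozenset()
        for ch in order:
            # Marginal contribution of ch when added in this order.
            credit[ch] += v[coalition | {ch}] - v[coalition]
            coalition = coalition | {ch}
    n_orders = factorial(len(channels))
    return {ch: c / n_orders for ch, c in credit.items()}

# Hypothetical coalition lifts for two channels:
v = {
    frozenset(): 0.0,
    frozenset({'A'}): 0.002,
    frozenset({'B'}): 0.004,
    frozenset({'A', 'B'}): 0.010,
}
credits = shapley_credits(['A', 'B'], v)
print(credits)  # A gets 0.004, B gets 0.006 -- and they sum to v({A,B})
```

Notice the three principles at work: the credits sum to the full lift (Efficiency), functionally identical channels would receive identical averages (Symmetry), and a channel whose marginal contribution is always zero ends up with zero (Dummy Channel).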

Q: Now, how do you apply this attribution rule differently for different converters?

A: You do not.  The differences among converters are reflected in the scores generated from the conversion models, not in the above attribution formula – or Shapley value.

Q: Ok, if that is the case, everyone in the industry will be using the same attribution formula, the Shapley value.  How do we then creatively differentiate from each other?  How should different types of campaigns be treated uniquely?  How do the effects of channels on different types of conversions get reflected in attributed credits?

A: Well, all of this will be reflected in how the conversion models are built, how their parameters are estimated, and finally the scores that come out of them.  You will innovate on statistical model development techniques. The attribution formula is, fortunately, not where you are going to innovate.

Q: This is quite shocking to me. I can’t imagine how the industry will react …

A: How did the industry deal with Marketing Mix Modeling?  We accepted the fact that those are simply regression models in essence, and started selling expertise on being able to do it thoroughly and do it right.  We do not have to create our own attribution model to be able to compete with each other.

May 24, 2012

The Principles of Attribution Model

Filed under: attribution analytics — Tags: , , — Huayin Wang @ 7:36 pm

(Disclaimer:  some questions and answers below are totally made up,  any resemblance to anything anyone said is purely coincidental)

How do we know an attribution model, such as Last Click Attribution, is wrong?

  • it is incorrect – surprise, surprise: a lot of people just make the claim and are done with it
  • it does not accurately capture the real influence a campaign has on purchase – but how do you know that?
  • it only credits the closer – isn’t this just a re-statement of what it is?
  • it is unfair to the upper funnel and only rewards the lower funnel – are you suggesting that it should reward all funnel stages, and why?
  • it leads to budget mis-allocation so your campaign is not optimized – how do you know?
  • it is so obvious, I just know it – what?

How do we know an attribution model, such as an equal attribution model, is right?

  • it is better than LCA – intuition?
  • it gives out different credits than LCA so you can see how much mis-allocation LCA does to your campaign – being different from LCA is not automatically right
  • we tested it and it generates better success metrics for the campaign – sounds good, but how?
  • it is fair – what does that mean?

How do we find the right attribution model?

  • try different attribution models and test the outcome – an attribution model does not generate outcomes for campaigns directly
  • play with different models and see which one fits your situation better – how do I know the fitness?
  • use statistical modeling methodology to measure influence objectively – what models? conversion models?
  • use predictive models for conversion – why predictive models? what models? how do you calculate influence and credit from the models?
  • test-and-control experiments – how many test and control groups, and what formula do you use to calculate credit?
  • you decide: we allow you to choose and try whatever attribution weights you want – but I want to know what the right one is
  • the predictive models help you with optimization; once we get that, you do not care about attribution – but I do care …
  • shh … it is proprietary: I won’t tell you or I will kill you! – ?

The Principle of Influence

Three principles are often implicitly used:  the “influence principle”,  the “fairness principle” and the “optimization principle”.

The influence principle works like this: assuming we can measure each campaign’s influence on a conversion, the correct attribution model gives credit to campaigns in proportion to their influence.  The second principle is often worded in terms of “fairness”, but is very much the same as the first:  if multiple campaigns contribute to a conversion, giving 100% credit to only one of them is “unfair” to the others.  The third principle, the optimization principle, is in my understanding more about the application (or benefit) of attribution than about the principle of attribution.

The principle of influence is the anchor of the three; the fairness and optimization principles are either softer versions or derivatives of it.

Now that we have our principle, are we close to figuring out the right approach to attribution modeling?  We need to look closer at the assumption behind this principle.  Can we objectively measure (quantify) influence?  Are there multiple solutions, or just one right way to do this?

If the influence principle is the only justification for attribution models, then quantitative measurement methodology such as probabilistic modeling – sometimes called an algorithmic solution, which I think is a misnomer – will be the central technology to use.  It leaves no room for arguing on the ground of intuition alone.  Those who offer only intuition and experience, plus tools for clients to play with whatever attribution weights they like, are not attribution solution providers, but merely vendors of flexible reporting.

Those of the intuition-and-experience school like to frame attribution models around the order and position of touch points:  first/last/even and introducer/assist/closer. (How many vendors are doing this today?)  They have trouble providing a quantitative probabilistic solution to the attribution issue.  The little-known fact is that this framing is analytically flawed:  the labels “last touch” and “closer” are only known post-conversion, and are therefore not usable inside a probabilistic modeling framework.  In predictive modeling and data mining lingo, this is known as the “leakage problem”.  (Search on Google, or read Xuhui’s article that mentioned this.)

Unfortunately, we have a problem with the data scientist camp as well, but of a different nature: the lack of transparency around metrics, models and process details.  Some vendors are unwilling to open up their “secret sauce”.  Perhaps, but is that all?  I will try to demystify and discuss the “secret sauce” of attribution modeling.


May 17, 2012

Attribution Model vs Attribution Modeling

Attribution is a difficult topic, growing into a mess of tangled threads.

I hope this post, and subsequent ones, will help to untangle the messy threads.  I like to start with the simple stuff, be meticulous with the use of words and concepts, and be patient; after all, haste makes waste.

When an advertiser records a conversion or purchase, sometimes there are multiple campaigns in the touch-point history of the conversion; how do we decide which campaign(s) are responsible for the conversion, and how should the conversion be credited to each of these campaigns?  This is the attribution problem: a practical issue first raised in digital advertising, but in itself a general analytical challenge.  It is applicable to many marketing/advertising contexts, for example across channels or within a particular channel.

Micro vs Macro

Notice that attribution is a “micro”-level problem: it deals with each individual conversion event.  In contrast, Marketing Mix Modeling (or Media Mix Modeling) deals with a “macro”-level problem: crediting conversion volume to each channel/campaign in aggregate.  There are similarities between the two when viewed from the business side, but they are quite different analytic problems, different in all major aspects of the analytic process: from data to methodology to application.

Attribution Model vs Attribution Modeling

Advertisers implement business rule(s) to handle this “attribution”, or credit-distribution, process.  These rules are generally called “attribution rules” or “attribution models”; examples are the Last Click Model, First Click Model, Fractional Attribution Model, etc.  Rules and models are interchangeable in this regard; they serve as the instruction set for the execution of the attribution process.

There is no shortage of attribution rules or models being discussed. Anyone can come up with a new one, as long as it partitions credit.  The challenge is finding the right one among so many.  In other words, the problem is the lack of justification for the approach, process and methodology behind attribution rules/models.

Now comes attribution modeling – a statistical model-based approach to quantify the impact of each campaign on each individual’s conversion behavior.  It is a data-driven algorithmic approach; it is hot and cool, with an aura of objectivity around it.  It is often treated as the secret sauce that unlocks attribution and optimization, and covered with a proprietary black box.

Let me slow down a bit.  I have discussed two important concepts here: attribution model and attribution modeling. The former refers to the attribution rules; the latter refers to the process of generating/justifying the rules. I understand that not everyone agrees with my use of the words, or the distinction between the two; but I think this is a critical distinction for untangling the threads in the attribution discussion.

Domain Expert vs Data Scientist

There are generally two camps when it comes to the generation/justification of attribution models: the “domain experts” and the “data scientists”.  Domain experts take issue with attribution models by pointing out the good and the bad, arguing on behalf of common sense, experience and intuition; this is qualitative, insightful and at times interesting, but generally pessimistic, and it falls short when it comes to building rigorous data-driven solutions. The general principle for justifying attribution is one of two: influence or fairness.  The influence principle attributes credit based on the influence of the campaign on conversion, whereas fairness is often stated only in general terms.

The fairness principle is not a concern for the data scientist camp; for them, it is all about modeling/quantifying impact or influence. After all, if you can do the attribution based on precise measurements of the influence of each touch point, what other principle do you need? Of course, the problem is often about the promise and less about the principle.  In contrast to the domain experts, the data scientists’ approach is quantitative, rigorous and data-driven. You can argue with the choice of a specific modeling methodology, but the resulting model itself does not require common sense or past experience to justify.

Principle of Attribution:  Influence, Fairness, Optimality

A third principle for picking the right attribution model is optimality, for lack of a better word.  Do right attribution models lead to optimal campaign management?  Some argue yes.  Is the reverse true? Can optimality be a principle for choosing or justifying an attribution model?  These are some of the things I will discuss and debate in my next writeup.

Thanks for reading!

April 11, 2012

Funny analogies of wrong attribution models

Few topics are as near and dear to my heart as attribution modeling.  I first bumped into it more than 4 years ago; my first written piece on attribution was a LinkedIn Q&A answering a question from Kevin Lee on duplication rate (in August 2007).  Since then, my interest in attribution got seriously real, resulting in a dozen attribution-related blog posts.  The interest never died, although I have not written anything in the last three years.

I am back on it with a vengeance! Consider this as my first one back.

I want to start on a gentle note though.  I am amused that people are still debating First Touch vs Last Touch attribution as viable attribution models – a debate a bit beside the point, in my opinion.  I want to share some funny analogies for what can go wrong with them.

Starting with the Last Touch Attribution Model, a football analogy goes like this: “relying solely on a last click attribution model may lead a manager to sack his midfielder for not scoring any goals. Despite creating countless opportunities he gets no credit as his name isn’t on the score-sheet. Similarly a first click attribution model may lead the manager to drop his striker for not creating any goals, despite finishing them.” – BrightonSEO presentation slides

There are a lot of good analogies like this derived from team sports.  This one applies not only to Last Touch, but to all single-touch-point attribution models.  The funniest one I have heard is about First Touch Attribution, from none other than the prolific Avinash Kaushik: “first click attribution is like giving his first girlfriend credit for his current marriage.” – Avinash quote

An analogy is only an analogy; it does not do full justice to what has been discussed.  However, we should learn at least this much: if your attribution model is based solely on the sequential order of touch points, you are wrong.  Those who propose Last, First, Even, Linear or whatever attribution models, watch out!

A good attribution model needs a disciplined development process, and better yet, a data-driven one.  The fewer the assumptions made about the values of touch points, the better – we should learn to let the empirical evidence speak for itself.

Do you have any interesting analogy, or thought?

March 17, 2009

the wrong logic in attribution of interaction effect

Attribution would not be such a difficult problem – if only reality conformed to our linear additive model of it. Interaction, sequential dependency and nonlinearity are the main troublemakers.

In this discussion, I am going to focus on the attribution problem in the presence of interaction effects.

Here’s the story setup: there are two ad channels, paid search (PS) and display (D).  

Scenario 1)
      When we run both (PS) & (D), we get $40 in revenue.  How should we attribute this $40 to PS and D?

The simple answer is: we do not know – for one thing, we do not have sufficient data.
What about making the attribution in proportion to each channel’s spending? You can certainly do it, but it is no more justifiable than any other split.

Scenario 2)
    when we run (PS) alone we get $20 in revenue;  when we run (PS) & (D) together, we get $40.
    Which channel gets what?

The simple answer is again: we do not know – we do not have enough data.
A common line of reasoning here is:  (PS) gets $20 and (D) gets $20 (= $40 − $20).  The logic seems reasonable, but it is still flawed, because it takes no account of the interaction between the two.  Of course, under the assumption that there is no interaction, this is the conclusion.

Scenario 3)
    when we run (PS) alone we get $20 in revenue; running (D) alone gets $15 in revenue; running both (PS) & (D) the revenue is $40.
    Which channel gets what?

The answer:  we still do not know. However, we can’t blame the lack of data anymore.  This scenario forces us to face the intrinsic limitation of the linear additive attribution framework itself.

Number-wise, the interaction effect is a positive $5, $40 − ($20 + $15), and we do not know what portion of it should be attributed to which channel. The $5 is up for grabs for whoever fights harder for it – and, usually to nobody’s surprise, it goes to the powers that be.

Does this remind anyone of how CEO’s salary is often justified?

What happens when the interaction effect is negative, such as in the following scenario?

Scenario 4)
    when we run (PS) alone we get $20 in revenue; running (D) alone gets $15 in revenue; running both (PS) & (D) the revenue is $30.
    Which channel gets what?
How should the $5 loss be distributed?  We do not know.
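The interaction term in these scenarios is simply the revenue the linear additive model cannot explain; a minimal sketch using the numbers from the scenarios:

```python
# Interaction effect: revenue unexplained by the linear additive model.
def interaction(rev_ps_alone, rev_d_alone, rev_both):
    return rev_both - (rev_ps_alone + rev_d_alone)

print(interaction(20, 15, 40))  # Scenario 3:  5  (positive synergy, up for grabs)
print(interaction(20, 15, 30))  # Scenario 4: -5  (negative: the channels cannibalize)
```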

What do you think? Do we have any way to justify it other than bringing out the “fairness” principle?

If the question is not answerable, the logic we use will be at best questionable, or plain wrong.

However, all is not lost. Perhaps we should ask ourselves a question: why do we ask for this in the first place? Is it really what we need, or just what we want? This was the subject of one of my recent posts: what you wanted may not be what you needed.

March 16, 2009

the new challenges to Media Mix Modeling

Among the many themes discussed in the 2009 Digital Outlook report by Razorfish, there is a strand linking media and content fragmentation, the complex and non-linear consumer experience, and interaction among multiple media and multiple campaigns – all of which lead to one of the biggest analytics challenges: the failure of traditional Media Mix Modeling (MMM) and the search for better attribution analytics.

The very first article of the research and measurement section is on MMM. It has some of the clearest discussion of why MMM fails to handle today’s marketing challenges, despite its decades of success.  But I believe it can be made clearer. One cited reason is MMM’s failure to handle media and campaign interaction, which I think is not a modeling failure but rather a failure of the purpose of attribution (I have discussed this extensively in my post: Attribution, what you want may not be what you need).  The interaction between traditional media and digital media, however, is of a different nature, and it has to do with the mixing of push and pull media.  Push media influence pull media in a way that renders many of the modeling assumptions problematic.

Here’s its summary paragraph:

“Marketing mix models have served us well for the last several decades. However, the media landscape has changed. The models will have to change and adapt. Until this happens, models that incorporate digital media will need an extra layer of scrutiny. But simultaneously, the advertisers and media companies need to push forward and help bring the time-honored practice of media mix modeling into the digital era.”

The report limits its discussion to MMM, the macro attribution problem.  It does not give a fair discussion of the general attribution problem – there is no discussion of the recent developments in attribution analytics (called by many names, such as Engagement Mapping, Conversion Attribution, Multi-campaign Attribution, etc.).

For those interested in the attribution analytics challenges, my prior post on the three generations of attribution analytics provides an in-depth overview of the field.

Other related posts: micro and macro attribution, and the relationship between attribution and optimization.

March 14, 2009

Eight trends to watch: 2009 Digital Outlook from Razorfish

1. Advertisers will turn to “measurability” and “differentiation” in the recession

2. Search will not be immune to the impact of the economy

3. Social Influence Marketing™ will go mainstream

4. Online ad networks will contract; open ad exchanges will expand

     with Google’s new interest-based targeting, things look set to change even more rapidly.

5. This year, mobile will get smarter

6. Research and measurement will enter the digital age

     This is an issue dear to my heart, and I have written about the importance of Attribution Analytics and Micro and Macro Attribution many times in recent months; directly from the report:

    “Due to increased complexity in marketing, established research and measurement conventions are more challenged than ever. For this reason, 2009 will be a year for research reinvention. Current media mix models are falling down; they are based on older research models that assume media channels are by and large independent of one another. As media consumption changes among consumers, and marketers include more digital and disparate channels in the mix, it is more important than ever to develop new media mix models that recognize the intricacies of channel interaction.”

7. “Portable” and “beyond-the-browser” opportunities will create new touchpoints for brands and content owners

8. Going digital will help TV modernize

Read the Razorfish report for details.

March 10, 2009

fairness is not the principle for optimization

In my other post, what you want may not be what you need, I wrote about the principle of optimization. Some follow up questions I got from people made me realize that I had not done a good job in explaining the point. I’d like to try again.

Correct attribution provides businesses a way to implement accountability. In marketing, correct attribution of sales and/or conversions presumably helps us optimize marketing spend. But how?  Here’s an example of what many people have in mind:

    Suppose you have the following sale attributions to your four marketing channels:
             40% direct mail
             30% TV
             20% Paid Search
             10% Online Display
    then you should allocate future budget to the four channels in proportion to the percentages they got.

This is intuitive, and perhaps what the fairness principle would do:  award according to contribution.  However, this is not the principle of optimization. Why?

Optimization is about maximization under constraints.  In the case of budget optimization, you ask how to spend the last (or marginal) dollar most efficiently.  Your last dollar should be allocated to the channel with the highest marginal ROI.  In fact, this principle dictates that as long as there is a difference in marginal ROI across channels, you can always improve by moving dollars around.  Thus, with a truly optimal allocation, the marginal ROI is equalized across channels.

The 40% sales/conversion attribution to Direct Mail is used to calculate the average ROI.  In most DM programs, the early part of the budget goes to the better names on the list, which tends to contribute to a higher ROI; on the other hand, fixed costs, such as the cost of the model development effort, lower the ROI for the early part of the budget.  ROI and marginal ROI are functions of budget, and the marginal ROI in general is not equal to the average ROI.  Every channel has its own reasons, with a similar conclusion.  This is why those attribution percentages do not automatically tell us how to optimize.

You may ask: assuming all the marginal ROIs are proportional to the average ROIs, are we then justified in using attribution percentages for budget allocation?  The answer is no.  If your assumption were right, you should give all your dollars to the one channel with the highest ROI, not to all channels in proportion to the percentages.
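The marginal-ROI logic can be sketched with a greedy allocator. The response curves below are hypothetical (linearly declining marginal ROI), purely for illustration; nothing here is the post's own data:

```python
# Greedy budget allocation: each incremental dollar goes to the channel
# whose *next* dollar has the highest marginal ROI.
def allocate(budget, marginal_roi, step=1.0):
    spend = {ch: 0.0 for ch in marginal_roi}
    for _ in range(int(budget / step)):
        best = max(spend, key=lambda ch: marginal_roi[ch](spend[ch]))
        spend[best] += step
    return spend

# Hypothetical diminishing-returns curves for two channels.
curves = {
    'direct_mail': lambda s: 4.0 - 0.03 * s,  # high starting ROI, fast decay
    'paid_search': lambda s: 3.0 - 0.01 * s,  # lower start, slower decay
}
spend = allocate(100, curves)
print(spend)
# The greedy allocation ends with (nearly) equal marginal ROI across
# channels -- NOT an allocation proportional to the channels' average ROI.
```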

We used an example of macro attribution to illustrate the point; the same thinking applies to micro attribution as well. Contrary to the common sense that regards attribution as the foundation for accountability, and further for operation and/or budget optimization, attribution percentages should not be used directly in optimization. The proportional rule, or the principle of fairness, is not the principle of optimization.

March 5, 2009

The three generations of (micro) attribution analytics

For marketing and advertising, the attribution problem normally starts at the macro level: we have total sales/conversions and total marketing spend. Marketing Mix Modeling (MMM) is the commonly used analytics tool, providing a solution using time series data of these macro metrics.

The MMM solution has many limitations that are intrinsically linked to the nature of the macro-level data it uses. Micro attribution analytics, when micro-level touch point and conversion tracking is available, provides a better attribution solution. Sadly, MMM is more often practiced even when the data for micro attribution is available; this is primarily due to the lack of development and understanding of micro attribution analytics, particularly the model-based approach.

There have been three types, or better yet, three generations, of micro attribution analytics over the years: the tracking-based solution, the order-based solution, and the model-based solution.

The tracking-based solution has been popular in the multi-channel marketing world. The main challenge here is to figure out through which channel a sale or conversion event happens. The book Multichannel Marketing – Metrics and Methods for On and Offline Success by Akin Arikan is an excellent source of information on the most often used methodologies, covering customized URLs, unique 1-800 numbers, and many other cross-channel tracking techniques. Tracking is normally implemented at the channel level, not at the level of individual events. Without a tracking solution, sales numbers by channel are inferred through MMM or other analytics; with proper tracking, the numbers are directly observed.

The tracking solution is essentially a single-attribution approach to a multi-touch attribution problem. It does not deal with the customer-level multi-touch experience; viewed from a multi-touch attribution perspective, this single-touch approach leads naturally to the last-touch-point rule. Another drawback is that it is simply a data-based solution without much analytics sophistication behind it: it provides relationship numbers without a strong argument for causal interpretation.

The order-based solution explicitly recognizes the multi-touch nature of individual consumers' experience with brands and products. With the availability of micro-level touch point and conversion data, order-based attribution generally seeks attribution rules in the form of a weighting scheme based on the order of events. For example, when all weights are zero except the last touch point, it reduces to last-touch-point attribution. Many such rules have been discussed, with constant debate about the virtues and drawbacks of each and every one of them. There are also derived metrics built on these low-level order-based rules, such as the appropriate attribution ratio (Eric Peterson).
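The order-based weighting schemes described above can be sketched in a few lines. This is a minimal illustration, not any vendor's implementation; the rule names and the sample path are assumptions for demonstration:

```python
def attribution_weights(path, rule="last"):
    """Return one credit weight per touch point in `path`, summing to 1.

    Implements three common order-based rules: all credit to the last
    touch, all credit to the first touch, or equal credit to every touch.
    """
    n = len(path)
    if rule == "last":    # classic last-touch-point attribution
        return [0.0] * (n - 1) + [1.0]
    if rule == "first":   # all credit to the first touch
        return [1.0] + [0.0] * (n - 1)
    if rule == "linear":  # equal credit to every touch
        return [1.0 / n] * n
    raise ValueError(f"unknown rule: {rule}")

# A hypothetical four-touch path to conversion.
path = ["display", "search", "email", "search"]
print(attribution_weights(path, "last"))  # [0.0, 0.0, 0.0, 1.0]
```

Note that each rule is purely positional: the weights depend only on where an event sits in the path, which is exactly the limitation discussed next.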

Despite the many advantages of the order-based multi-touch attribution approach, there are still methodological limitations. One limitation, as many already know, is that no weighting scheme is generally applicable, or appropriate for all businesses under all circumstances. There is no point in arguing which rule is best without the specifics of the business and data context. The proper rule should differ depending on the context; however, there is no provision or general methodology for how the rule should be developed.

Another limitation of the order-based weighting scheme is that, for any given rule, the weight of an event is determined solely by the order of the event and not by its type. For example, a rule may give the first click 20% attribution, when it may be more appropriate to give the first click 40% if it is a search click and 10% if it is a banner click-through.
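One way around this limitation is to let the weight depend on both position and event type. The sketch below is purely illustrative: the base position weights and type multipliers are hypothetical numbers chosen for demonstration, not values any model has produced.

```python
# Hypothetical base weights by position and multipliers by event type.
BASE = {"first": 0.2, "middle": 0.2, "last": 0.6}
TYPE_FACTOR = {"search": 2.0, "banner": 0.5, "email": 1.0}

def type_aware_weights(path):
    """Weights that combine position and event type, normalized to sum to 1."""
    raw = []
    for i, event in enumerate(path):
        pos = "first" if i == 0 else ("last" if i == len(path) - 1 else "middle")
        raw.append(BASE[pos] * TYPE_FACTOR.get(event, 1.0))
    total = sum(raw)
    return [w / total for w in raw]

print(type_aware_weights(["search", "banner", "email"]))
```

With these assumed multipliers, a path that starts with a search click credits its first touch more heavily than an otherwise identical path that starts with a banner click-through, which is the behavior a purely order-based rule cannot express.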

Intrinsic to its intuition-based rule development process is that it lacks a rigorous methodology to support causal interpretation, which is central to correct attribution and operation optimization.

Here comes the third generation of attribution analytics: model-based attribution. It promises a sound modeling process for rule development, and provides the analytical rigor for finding relationships that can bear a causal interpretation.

More details to come. Please come back to read the next post: a deep-dive example of model-based attribution.

Related post: Micro and Macro Attribution
