Analytics Strategist

March 17, 2009

the wrong logic in attributing the interaction effect

Attribution should not be such a difficult problem – as long as reality conforms to our linear additive model of it. Interaction, sequential dependency and nonlinearity are the main troublemakers.

In this discussion, I am going to focus on the attribution problem in the presence of interaction effects.

Here’s the story setup: there are two ad channels, paid search (PS) and display (D).  

Scenario 1)
      When we run both (PS) & (D), we get $40 in revenue.  How should we attribute this $40 to PS and D?

The simple answer is: we do not know – for one thing, we do not have sufficient data.
What about making the attribution in proportion to each channel’s spending? You can certainly do it, but it is no more justifiable than any other rule.

Scenario 2)
    When we run (PS) alone we get $20 in revenue; when we run (PS) & (D) together, we get $40.
    Which channel gets what?

The simple answer is again: we do not know – we still do not have enough data.
A common line of reasoning is: (PS) gets $20 and (D) gets $20 (= $40 – $20). The logic seems reasonable, but it is flawed because it takes no account of the interaction between the two channels. Only under the assumption that there is no interaction does this conclusion hold.

Scenario 3)
    When we run (PS) alone we get $20 in revenue; running (D) alone gets $15 in revenue; running both (PS) & (D), the revenue is $40.
    Which channel gets what?

The answer: we still do not know. However, we can no longer blame a lack of data. This scenario forces us to face the intrinsic limitation of the linear additive attribution framework itself.

Numerically, the interaction effect is a positive $5 = $40 – ($20 + $15), and we do not know what portion of it should be attributed to which channel. The $5 is up for grabs for whoever fights harder for it – and, usually to nobody’s surprise, it goes to the powers that be.

Does this remind anyone of how CEOs’ salaries are often justified?

What happens when the interaction effect is negative, such as in the following scenario?

Scenario 4)
    When we run (PS) alone we get $20 in revenue; running (D) alone gets $15 in revenue; running both (PS) & (D), the revenue is $30.
    Which channel gets what?
How should the $5 loss be distributed? We do not know.

What do you think? Do we have any way to justify a split, other than bringing out the “fairness” principle?
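
Here is a minimal sketch of the arithmetic of these scenarios. The equal split of the interaction term below is the two-player Shapley value – one embodiment of the “fairness” convention, not an answer the data justifies; the revenue numbers are the ones from the scenarios above.

```python
def attribute(ps_alone, d_alone, both):
    """Split combined revenue, given each channel's run-alone revenue."""
    interaction = both - (ps_alone + d_alone)  # synergy (+) or cannibalization (-)
    # "Fairness" convention (two-player Shapley value): split the
    # interaction term equally between the two channels.
    return {"PS": ps_alone + interaction / 2,
            "D": d_alone + interaction / 2,
            "interaction": interaction}

print(attribute(20, 15, 40))  # Scenario 3: interaction = +5
print(attribute(20, 15, 30))  # Scenario 4: interaction = -5
```

Nothing in the observed revenues forces the 50/50 split of the interaction term; any other split is equally consistent with the data – which is precisely the point.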

If the question is not answerable, any logic we use will be at best questionable, at worst plain wrong.

However, all is not lost. Perhaps we should ask ourselves a question: why do we ask for it in the first place? Is this really what we needed, or just what we wanted? This was the subject of one of my recent posts: what you wanted may not be what you needed.


March 16, 2009

the new challenges to Media Mix Modeling

Among the many themes discussed in the 2009 Digital Outlook report by Razorfish, there is a strand linking media and content fragmentation, the complex and non-linear consumer experience, and interaction among multiple media and multiple campaigns – all of which lead to one of the biggest analytics challenges: the failure of traditional Media Mix Modeling (MMM) and the search for better Attribution Analytics.

The very first article of the research and measurement section is on MMM. It has some of the clearest discussion of why MMM fails to handle today’s marketing challenges, despite its decades of success. But I believe it can be made clearer still. One reason is MMM’s failure to handle media and campaign interaction, which I think is not a modeling failure but rather a failure of attribution itself (I have discussed this extensively in my post: Attribution, what you want may not be what you need). The interaction between traditional media and digital media, however, is of a different nature: it has to do with the mixing of push and pull media. Push media influence pull media in ways that render many of the modeling assumptions problematic.

Here’s its summary paragraph:

“Marketing mix models have served us well for the last several decades. However, the media landscape has changed. The models will have to change and adapt. Until this happens, models that incorporate digital media will need an extra layer of scrutiny. But simultaneously, the advertisers and media companies need to push forward and help bring the time-honored practice of media mix modeling into the digital era.”

The report limits its discussion to MMM, the macro attribution problem. It does not give a fair treatment of the general attribution problem – there is no discussion of the recent developments in attribution analytics (known by many names, such as Engagement Mapping, Conversion Attribution, Multicampaign Attribution, etc.).

For those interested in the attribution analytics challenges, my prior post on the three generations of attribution analytics provides an in-depth overview of the field.

Other related posts: micro and macro attribution and the relationship between attribution and optimization.

March 14, 2009

Eight trends to watch: 2009 Digital Outlook from Razorfish

1. Advertisers will turn to “measurability” and “differentiation” in the recession

2. Search will not be immune to the impact of the economy

3. Social Influence Marketing™ will go mainstream

4. Online ad networks will contract; open ad exchanges will expand

     With Google’s new interest-based targeting, things look set to change even more rapidly.

5. This year, mobile will get smarter

6. Research and measurement will enter the digital age

     This is an issue dear to my heart; I have written about the importance of Attribution Analytics and of Micro and Macro Attribution many times in recent months. Directly from the report:

    “Due to increased complexity in marketing, established research and measurement conventions are more challenged than ever. For this reason, 2009 will be a year for research reinvention. Current media mix models are falling down; they are based on older research models that assume media channels are by and large independent of one another. As media consumption changes among consumers, and marketers include more digital and disparate channels in the mix, it is more important than ever to develop new media mix models that recognize the intricacies of channel interaction.”

7. “Portable” and “beyond-the-browser” opportunities will create new touchpoints for brands and content owners

8. Going digital will help TV modernize

Read the Razorfish report for details.

March 10, 2009

fairness is not the principle for optimization

In my other post, what you want may not be what you need, I wrote about the principle of optimization. Some follow-up questions I received made me realize that I had not done a good job of explaining the point. I’d like to try again.

Correct attribution gives a business a way to implement accountability. In marketing, correct attribution of sales and/or conversions presumably helps us optimize marketing spend. But how? Here’s an example of what many people have in mind:

    Suppose you have the following sale attributions to your four marketing channels:
             40% direct mail
             30% TV
             20% Paid Search
             10% Online Display
    then, you should allocate future budget to the four channels in proportion to the percentage they got.

This is intuitive, and perhaps what the fairness principle would do:  award according to contribution.  However, this is not the principle of optimization. Why?

Optimization is about maximization under constraints. In the case of budget optimization, you ask how to spend the last (or marginal) dollar most efficiently. Your last dollar should be allocated to the channel with the highest marginal ROI. In fact, this principle dictates that as long as there is a difference in marginal ROI across channels, you can always improve by moving dollars around. Thus, at a truly optimal allocation, the marginal ROI is equalized across channels.
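
To make this concrete, here is a minimal sketch of greedy marginal allocation, assuming made-up diminishing-returns revenue curves of the form revenue = a·√spend (the coefficients are hypothetical, chosen only to echo the four channels above):

```python
import math

# Hypothetical response curves: revenue = a * sqrt(spend).
# The coefficients are made up for illustration only.
curves = {"direct_mail": 40.0, "tv": 30.0, "paid_search": 20.0, "display": 10.0}

def marginal_roi(a, spend, step=1.0):
    return (a * math.sqrt(spend + step) - a * math.sqrt(spend)) / step

def allocate(budget, step=1.0):
    spend = {ch: 0.0 for ch in curves}
    for _ in range(int(budget / step)):
        # Each marginal dollar goes to the channel with the highest marginal ROI.
        best = max(curves, key=lambda ch: marginal_roi(curves[ch], spend[ch], step))
        spend[best] += step
    return spend

print(allocate(1000.0))
```

With these curves, a $1,000 budget ends up split roughly 533/300/133/33 – not the 400/300/200/100 that allocating in proportion to the attribution percentages would give – and the marginal ROIs come out (approximately) equal across the four channels.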

The 40% sales/conversion attribution to direct mail is used to calculate an average ROI. In most DM programs, the early dollars go to the better names on the list, which tends to produce a higher ROI; on the other hand, fixed costs, such as those incurred for model development, lower the ROI of the early part of the budget. ROI and marginal ROI both vary as functions of the budget, and the marginal ROI is in general not equal to the average ROI. Every channel has its own reasons, with the same conclusion. This is why those attribution percentages do not automatically tell us how to optimize.

You may ask: if we assume all the marginal ROIs are proportional to the average ROIs, are we then justified in using attribution percentages for budget allocation? The answer is no. If that assumption were right, you should give all your dollars to the one channel with the highest ROI, not to all channels in proportion to their percentages.

We used an example of macro attribution to illustrate the point; the same thinking applies to micro attribution as well. Contrary to the common sense that regards attribution as the foundation for accountability and for operational and/or budget optimization, attribution percentages should not be used directly in optimization. The proportional rule, the principle of fairness, is not the principle of optimization.

March 5, 2009

The three generations of (micro) attribution analytics

For marketing and advertising, the attribution problem normally starts at the macro level: we have total sales/conversions and marketing spends. Marketing Mix Modeling (MMM) is the commonly used analytics tool, providing a solution using time series data of these macro metrics.

The MMM solution has many limitations that are intrinsically linked to the nature of the (macro-level) data it uses. Micro attribution analytics, when micro-level touch point and conversion tracking is available, provides a better attribution solution. Sadly, MMM is often practiced even when the data for micro attribution are available; this is primarily due to the lack of development and understanding of micro attribution analytics, particularly the model-based approach.

There have been three types – or better yet, three generations – of micro attribution analytics over the years: the tracking-based solution, the order-based solution and the model-based solution.

The tracking-based solution has been popular in the multi-channel marketing world. The main challenge here is to figure out through which channel a sale or conversion event happens. The book Multichannel Marketing – Metrics and Methods for On and Offline Success by Akin Arikan is an excellent source of information on the most often used methodologies – covering customized URLs, unique 1-800 numbers and many other cross-channel tracking techniques. Tracking is normally implemented at the channel level, not the individual event level. Without a tracking solution, the sales numbers by channel are inferred through MMM or other analytics; with proper tracking, the numbers are directly observed.

The tracking solution is essentially a single-touch attribution approach to a multi-touch attribution problem: it does not deal with the customer-level multi-touch experience. Viewed from a multi-touch attribution perspective, this single-touch approach leads naturally to the last-touch-point rule. Another drawback is that it is simply a data-based solution without much analytical sophistication behind it – it provides relationship numbers without a strong argument for causal interpretation.

The order-based solution explicitly recognizes the multi-touch nature of an individual consumer’s experience with brands and products. With the availability of micro-level touch point and conversion data, order-based attribution generally seeks attribution rules in the form of a weighting scheme based on the order of events. For example, when all weights are zero except for the last touch point, the scheme simply reduces to last-touch-point attribution. Many such rules have been discussed, with constant debate about the virtues and drawbacks of each and every one of them. There are also metrics derived from these low-level order-based rules, such as the appropriate attribution ratio (Eric Peterson).
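
To make the idea concrete, here is a minimal sketch of a few such weighting schemes (the rule names follow common usage; the base of the exponential scheme is an arbitrary illustrative choice):

```python
def attribute_by_order(path, rule="last"):
    """Distribute one conversion over an ordered list of touch events."""
    n = len(path)
    if rule == "last":
        weights = [0.0] * (n - 1) + [1.0]
    elif rule == "first":
        weights = [1.0] + [0.0] * (n - 1)
    elif rule == "equal":
        weights = [1.0 / n] * n
    elif rule == "exponential":  # later touches weigh more; base 2 is arbitrary
        raw = [2.0 ** i for i in range(n)]
        weights = [w / sum(raw) for w in raw]
    else:
        raise ValueError("unknown rule: %s" % rule)
    return list(zip(path, weights))

print(attribute_by_order(["banner", "search", "banner"], rule="exponential"))
# [('banner', 0.143), ('search', 0.286), ('banner', 0.571)] (rounded)
```

Note that in every scheme the weight of an event depends only on its position, never on its type – exactly the limitation discussed below.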

Despite the many advantages of the order-based multi-touch attribution approach, there are still methodological limitations. One limitation, as many already know, is that no weighting scheme is generally applicable, or appropriate for all businesses under all circumstances. There is no point in arguing which rule is best without the specifics of the business and data context. The proper rule should differ depending on the context; however, there is no provision or general methodology for how the rule should be developed.

Another limitation of the order-based weighting scheme is that, for any given rule, the weight of an event is determined solely by the order of the event, not by its type. For example, one rule may specify that the first click gets 20% attribution – when it may be more appropriate to give the first click 40% if it is a search click and 10% if it is a banner click-through.

Intrinsic to the intuition-based rule development process is the lack of a rigorous methodology to support a causal interpretation, which is central to correct attribution and operational optimization.

Here comes the third generation of attribution analytics: model-based attribution. It promises to deliver a sound modeling process for rule development, and provides the analytical rigor for finding relationships that can support a causal interpretation.
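
As a taste of what this could look like – a minimal sketch on synthetic data, with an assumed logistic model; it is an illustration of deriving per-event-type effects from data rather than from intuition, not a prescription:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 5000

# Synthetic users: columns count each user's search clicks and banner clicks.
X = rng.poisson(lam=[1.0, 1.5], size=(n, 2))

# Assumed "true" conversion process (unknown in practice).
logit = -2.0 + 1.2 * X[:, 0] + 0.3 * X[:, 1]
y = rng.random(n) < 1.0 / (1.0 + np.exp(-logit))

model = LogisticRegression().fit(X, y)
print(dict(zip(["search", "banner"], model.coef_[0])))
# The fitted per-event-type effects -- not an intuition-based weighting
# scheme -- become the basis for attribution weights.
```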

More details to come.  Please come back to read the next post: a deep dive example of model-based attribution.

Related post: Micro and Macro Attribution

the counting game around attribution analytics


How many ways do people describe the solution to the attribution problem? Let’s count:
        attribution analytics
        attribution modeling
        multi-campaign attribution
        marketing attribution
        revenue attribution
        engagement mapping
        conversion attribution
        attribution management
        impression attribution
        marketing mix modeling
        response attribution
        attribution rules
        multi touch attribution analysis
        advanced attribution management
        online campaign attribution 
        multiple protocol attribution

       (and these do not include attribution theory, performance attribution, etc., which are less related to marketing and advertising.)

It is also interesting to read all the different rules and heuristics that have been proposed as solutions for attributing a conversion to the prior events by their order: first-touch attribution, last-touch, mean attribution, equal attribution, exponentially weighted attribution, using engagement metrics as a proxy, looking at the ratio of first-touch to last-touch attribution, etc.

What about all the articles and posts talking about it? There are perhaps over 100 of them just making the point that it is important to think about the multi-touch attribution problem and do some analysis – very interesting indeed.

I am sure that I have missed some names and rule variations in the lists above. Please add whatever I missed in the comments to help me out.

The use of different terminologies creates some confusion – making it difficult to stay focused on the core methodology issue.

Please come back to read the next post on the three generations of attribution analytics.

March 4, 2009

attribution: what you want may not be what you need

… or should I say, what you need may not be what you want?

The attribution problem, particularly the macro attribution problem, traditionally asks for a way to partition the success metric among marketing efforts, in the form of percentages, so that relative contributions can be measured. The hope is that these percentages can be used, aside from figuring out how to distribute bonuses, to guide optimal budgeting decisions. However, the promise of using attribution as an optimization methodology is flawed. Attribution is basically an additive model of the business process, in which success can be partitioned as if the efforts were mutually independent. It is problematic when the actual relationship between marketing efforts and their success metric is non-linear – due either to the presence of interaction effects (in the form of synergy or cannibalization) or to an intrinsically non-linear quantitative relationship.

When the data-driven relationship/model turns out to be non-additive and non-linear, it may not be intuitively clear how to use it for attribution, i.e. how to come up with the percentages. On the other hand, the non-linear, non-additive model is just what you need for operational optimization, such as budget optimization. This is because true optimization follows the equalization principle of marginal returns, rather than averages. Attribution percentages are not necessary for optimization precisely because they are based on averages; they are still useful in the many cases where averages and marginals are highly correlated.
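
A tiny numerical illustration of the difference, assuming a hypothetical diminishing-returns revenue curve:

```python
import math

def revenue(spend):
    return 100.0 * math.sqrt(spend)  # assumed concave response curve

spend = 400.0
average_roi = revenue(spend) / spend                  # 5.0
marginal_roi = revenue(spend + 1.0) - revenue(spend)  # ~2.5

print(average_roi, marginal_roi)
# The average dollar returned $5, but the next dollar returns only ~$2.50:
# budgeting by the average overstates the value of the marginal dollar.
```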

This is the basic idea of this post, and the reason I was asking everyone to come back and read. Fundamentally, you need to optimize your operation – not just obtain a set of percentages for distributing credit.

The other point I want to make is that the percentages every attribution exercise is trying to get at, or every attribution rule is trying to produce, only make sense with some type of causal interpretation. If the data show that a factor/effort has no real influence on success/conversion, then it is conceptually not justifiable to attribute anything to it. Again, the right interpretation of an attribution rests on the affirmation of a causal relationship. This is another reason why statistical modeling is fundamentally important.

I am not saying that the conversion model I mentioned is the typical “conversion model” we use in the DM context; it does not have to be exclusively predictive modeling. The special type of conversion model we are building is a causal model based on empirical data. Much of the predictive modeling technique does apply, but there are still differences. For example, for prediction purposes, proxy variables are as valid as any other variables; that is not automatically acceptable when building a model that requires causal interpretation.

Correct attribution rules should be based on a sound conversion model. Their implementation in web analytics tools can facilitate the reporting process for insight generation and monitoring, playing a role similar to the one attribution plays now. What I am arguing is that the rules should be built on a sound data-driven conversion model, not simply on intuitions. My point goes a little further: I am also arguing for using the conversion models themselves (be they linear or non-linear, additive or non-additive), not the attribution percentages.

In sum, a conversion model will provide what you need, which is the ability to optimize your operation, but it may not provide what you wanted from attribution; those percentages that we all like to see and talk about are ultimately less critical than we thought.

Please come back and read the next post on a deep dive example of the conversion modeling approach to attribution.

Comments?

March 2, 2009

the first 3 insights on attribution analytics

Looking at micro attribution from a conversion modeling framework, there are a few insights we can contribute right away without getting into the details.

1)  The sampling bias

If your attribution analysis uses only data from converters, then you have a sampling bias problem.

As a first-order question for any modeling project, understanding the data sample, and therefore the potential sampling bias, is crucial. How is this relevant to the attribution problem?

Consider a hypothetical, but commonly asked, type of question:

What is the right attribution to banner and search, given that I know the conversion path data:
    Banner -> conversion:   20%
    Banner -> search -> conversion: 40%
    Search -> conversion: 25%
    Search -> Banner -> conversion:  15%

Well, what could be wrong with the question? A standard setup for a conversion model is to use conversion as the dependent variable, with banner and search as predictors. The problem here is that we only have converter cases and no non-converter cases. We simply cannot fit a model at all. We need more data, such as the number of users who clicked on a banner but did not convert.

The sampling bias issue is actually deeper than this. We want to know whether the coverage of banner and search is “biased” in the data we are using – for example, when the banner campaign was national while the search campaign was regional. We also need to ask whether future campaigns will be run in ways similar to what happened before – the requirement that the modeling setup mimic the context in which the model will be applied.
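
To see the first point numerically, here is a sketch that grafts two hypothetical sets of non-converter counts onto the path percentages above (read as converter counts) – the non-converter numbers are invented purely for illustration:

```python
# Converter counts from the example above (percentages read as counts).
converters = {"banner": 20, "banner->search": 40,
              "search": 25, "search->banner": 15}

# Two hypothetical -- and equally plausible -- sets of non-converter counts.
scenarios = [
    {"banner": 980, "banner->search": 60, "search": 975, "search->banner": 85},
    {"banner": 180, "banner->search": 960, "search": 75, "search->banner": 985},
]

for non_converters in scenarios:
    rates = {path: converters[path] / (converters[path] + non_converters[path])
             for path in converters}
    print({path: round(rate, 3) for path, rate in rates.items()})
# Same converter shares (20/40/25/15), wildly different per-path conversion
# rates: without non-converter data, no model can tell these worlds apart.
```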

2) Encoding sequential patterns

The data for micro attribution naturally come in the form of a collection of events/transactions:
User1:
    banner_id time1.1
    search_id time1.2
    search_id time1.3
    conversion time1.4
User2:
    banner_id time2.1
User3:
    search_id time3.1
    conversion time3.2

Some may think that this form of data makes predictive modeling infeasible. This is not the case. Much predictive modeling is done on transaction/event data: fraud detection and survival models, to name a couple. In fact, there are sophisticated practices in mining and modeling sequential patterns that go way beyond what is usually contemplated in discussions of the attribution problem. The simple message is: this is a well-researched, well-practiced area, and a great deal of knowledge and expertise already exists.
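
For the curious, a minimal sketch of one common encoding: flattening each user’s event stream into per-type counts plus simple order features (the specific features are illustrative choices, not recommendations):

```python
from collections import defaultdict

# Toy event log shaped like the data above: (user, event, timestamp).
events = [
    ("user1", "banner", 1), ("user1", "search", 2),
    ("user1", "search", 3), ("user1", "conversion", 4),
    ("user2", "banner", 1),
    ("user3", "search", 1), ("user3", "conversion", 2),
]

rows = defaultdict(lambda: {"n_banner": 0, "n_search": 0,
                            "first_touch": None, "converted": 0})
for user, event, _ in sorted(events, key=lambda e: (e[0], e[2])):
    row = rows[user]
    if event == "conversion":
        row["converted"] = 1
    else:
        row["n_" + event] += 1                            # count touches by type
        row["first_touch"] = row["first_touch"] or event  # remember first touch

for user, row in sorted(rows.items()):
    print(user, row)
```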

3) Separating model development from implementation processes

Again, common sense from the predictive modeling world can shed some light on how our web analytics industry should approach the attribution problem. All WA vendors are trying to figure out this crucial question: how should we provide data/tool services to help clients solve their attribution problem? Should we provide data, should we provide attribution rules, or should we provide flexible tools so that clients can specify their own attribution rules?

The modeling perspective says that there is no generic conversion model that is right for all clients, much as in Direct Marketing we all know there is no one right response model for all clients – even for clients in the same industry. Discover Card will have a different response model than American Express, partly because of the differences in their targeted populations and their services, and partly because of the availability of data. Web Analytics vendors should provide data sufficient for clients to build their own conversion models, not build ONE standard model for all clients (of course, they can offer separate modeling services, which is a different story). Web Analytics vendors should also provide tools so that a client’s model can be specified and implemented once it has been developed. Given the parametric nature of conversion models, none of the tools from the current major Web Analytics vendors seems sufficient for this task.

That is all for today. Please come back to read the next post: conversion model – not what you want but what you need.

February 27, 2009

it is about who gets the job

If you ever wonder what all my recent attribution posts are about … they are about who gets the job of handling the problem. Statisticians and modelers should be the ones – it is their job, and I am speaking on their behalf.

If in fact the solution to attribution analytics is conversion modeling, then why does everyone seem to talk about it as everything but conversion modeling?

Well, in my humble opinion, it is a sign of a lack of involvement, or lack of engagement, from statisticians, modelers and data miners. Today’s web analytics certainly has a lot of hammers and power tools; however, attribution may just be a different kind of problem.

Stay tuned for the next post on the first 3 insights from the conversion modeling perspective.

February 26, 2009

micro attribution analytics is conversion modeling

If you are surprised by the title statement, you are in the majority.  

This is actually a very strong statement, and I did not make it lightly. It says that micro attribution is an area of data analytics that can be defined and studied with rigorous statistical methodologies. In short, it is more science than art or common sense. The micro attribution problem is more like a response modeling or risk modeling problem than like the problem of finding a fair rule for distributing year-end bonuses.

Does this sound the same as how others describe the attribution problem and its solutions?

It is certainly different from the view of those who think the solution to the attribution problem is about tracking. Tracking is important because it provides you the data, but the data in themselves do not tell you which factors or customer experiences have more or less influence on conversion.

It is also different from the views of the many who think of “last click”, “first click”, etc. when they speak about attribution models. Those are not the data analytics models or statistical models I am referring to. One is intuition-based smart rules; the other is data-driven behavioral modeling. The smart-rule vs. modeling debate was over long ago in Direct Marketing, but it is just beginning in web analytics and online, right here in the micro attribution problem.

It is also different from the view of the many who think this is all about metrics (because of the claim that there is no right solution to attribution :). It is not about averaging first-click and last-click attribution. It is not about using engagement metrics as a proxy either.

It is definitely not the same as the wisdom-of-the-crowd type of solution. The percentage of you who think early keywords should get 15% attribution for an “assist” may be right, but it has no bearing on me. I do not believe there is an average truth in any of these, for the same reason I do not believe one retailer’s offer-X response model should be used for a loyalty campaign of a telecom company.

It is categorically different from the position of those who hold that there is no right answer to the attribution problem. I agree that there is no perfect model, free of prediction error, but that is not a refutation of statistical modeling. Statistics is founded on imprecision in data and is never afraid of counterexamples.

It is an approach of simplification, not of complication – and certainly not a proposal to bring psychology, media-logy or astrology into the picture. In that regard, it could be a spoiler for the fun party we have had so far.

Still, it is really just a claim at this point.  Please come back to read the next post: (TBD)

