Analytics Strategist

January 2, 2010

a decade in data analytics …

Filed under: misc, Web Analytics — Huayin Wang @ 10:53 pm

I was reading an article, The Decade of Data: Seven Trends to Watch in 2010, this morning and found it a fitting retrospective and perspective piece. I have been working in data analytics for the past 15 years, so naturally I went searching for similar articles with more of a focus on analytics, but came back empty-handed.

I wish I could write a similar post, but the task feels too big to take on. A systematic review with a vision into the future would require much more dedication and effort than I can afford at this point. However, I do have a couple of thoughts and went ahead and gathered some evidence to share. I'd love to hear your thoughts; please comment and share your perspective.

The chart above shows search volume indices for several data-analytics-related keywords over the last six years. There are many interesting patterns. The one that caught my eye first is the birth of Google Analytics on Nov 14, 2005. Not only did it cause a huge spike in the search trend for "analytics" – the first day "analytics" surpassed "regression" – it became the driving force behind the growth of web analytics and of the analytics discipline in general. Today, more than half of all "analytics" searches are associated with "Google Analytics". Anyone who writes the history of data analytics will have to study the impact of GA seriously.

I wish I could chart the impact of SAS and SPSS on data analytics in a similar fashion, but unfortunately it is hard to isolate searches for the SAS statistics software from other "SAS" searches. When limited to the "software" category, SAS appears to have about twice the search volume of SPSS, so I used SPSS instead.
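
For readers who want to pull similar search-interest data themselves, here is a minimal sketch using the unofficial pytrends library. This is an assumption for illustration only – it is not how the chart above was produced, and the keyword list and timeframe are just examples.

    # Minimal sketch (assumption: the unofficial pytrends library is installed,
    # e.g. via `pip install pytrends`); NOT how the chart above was made.
    import matplotlib.pyplot as plt
    from pytrends.request import TrendReq

    pytrends = TrendReq(hl="en-US", tz=360)
    keywords = ["analytics", "regression", "SPSS", "google analytics"]  # illustrative
    pytrends.build_payload(kw_list=keywords, timeframe="2004-01-01 2009-12-31")

    trends = pytrends.interest_over_time()   # weekly search-volume indices (0-100)
    print(trends.tail())
    trends.drop(columns=["isPartial"]).plot(title="Search interest, 2004-2009")
    plt.show()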

Many years ago, before Google Analytics and the "web analyst" generation, statistical analysis and modeling dominated the business applications of data analytics. Statisticians and their predictive modeling practice sat in an ivory tower. Since the early years of the 21st century, data mining and machine learning have become strong competing disciplines to statistics – I remember the many heated debates between statisticians and computer scientists about statistical modeling vs. data mining. New jargon came about, such as decision trees, neural networks, association rules and sequence mining. To whoever had the newest, smartest, most mathematically elegant, efficient and powerful algorithm went the spoils.

Google Analytics changed everything. Along with data democratization came the democratization of data intelligence. Who would have guessed that today, for a large crowd of (web) analysts, analytics would become near-synonymous with Google Analytics, and that building dashboards and tracking and reporting the right metrics would become the holy grail of analytics? Those statisticians may still inhabit the ivory tower of data analytics, but the world is already owned by others – the people – as democracy would dictate.

No question about it, data analytics is trending up and flourishing as never before.

Comments? Please share your thoughts here.


March 17, 2009

the wrong logic in attribution of interaction effect

Attribution should not be such a difficult problem – as long as reality conforms to our linear additive model of it. Interaction, sequential dependency and nonlinearity are the main troublemakers.

In this discussion, I am going to focus on the attribution problem in the presence of an interaction effect.

Here’s the story setup: there are two ad channels, paid search (PS) and display (D).

Scenario 1)
      When we run both (PS) & (D), we get $40 in revenue.  How should we attribute this $40 to PS and D?

The simple answer is: we do not know – for one thing, we do not have sufficient data.
What about making the attribution in proportion to each channel's spending? You can certainly do that, but it is no more justifiable than any other split.

Scenario 2)
    when we run (PS) alone we get $20 in revenue;  when we run (PS) & (D) together, we get $40.
    Which channel gets what?

The simple answer is again: we do not know – we do not have enough data.
A common line of reasoning is: (PS) gets $20 and (D) gets $20 (= $40 – $20).  The logic seems reasonable, but it is still flawed because it takes no account of the interaction between the two channels.  Of course, under the assumption that there is no interaction, this is the right conclusion.

Scenario 3)
    when we run (PS) alone we get $20 in revenue; running (D) alone gets $15 in revenue; running both (PS) & (D) the revenue is $40.
    Which channel gets what?

The answer: we still do not know. However, we can't blame the lack of data anymore. This scenario forces us to face the intrinsic limitation of the linear additive attribution framework itself.

Numerically, the interaction effect is a positive $5 ($40 − ($20 + $15)), and we do not know what portion of it should be attributed to which channel. The $5 is up for grabs for whoever fights for it harder – and, usually to nobody's surprise, it goes to the powers that be.

Does this remind anyone of how CEOs' salaries are often justified?

What happens when the interaction effect is negative, such as in the following scenario?

Scenario 4)
    when we run (PS) alone we get $20 in revenue; running (D) alone gets $15 in revenue; running both (PS) & (D) the revenue is $30.
    Which channel gets what?
How should the $5 loss be distributed?  We do not know.
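
To make the arithmetic in these scenarios explicit, here is a tiny sketch that computes the interaction effect for Scenarios 3 and 4, using only the numbers from the post; it also marks where the additive framework runs out of road.

    # Interaction effect for Scenarios 3 and 4 (numbers from the post).
    scenarios = {
        "Scenario 3": {"PS alone": 20, "D alone": 15, "PS & D": 40},
        "Scenario 4": {"PS alone": 20, "D alone": 15, "PS & D": 30},
    }

    for name, rev in scenarios.items():
        interaction = rev["PS & D"] - (rev["PS alone"] + rev["D alone"])
        print(f"{name}: interaction effect = ${interaction:+d}")
        # The linear additive framework offers no rule for splitting this
        # remainder between PS and D; any split is an extra assumption.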

What do you think? Do we have any way to justify a split other than by bringing out the "fairness" principle?

If the question is not answerable, any logic we use will be at best questionable, at worst plain wrong.

However, all is not lost. Perhaps we should ask ourselves: why do we ask for this in the first place? Is it really what we need, or just what we want? This was the subject of one of my recent posts: what you wanted may not be what you needed.

March 16, 2009

the new challenges to Media Mix Modeling

Among the many themes discussed in the 2009 Digital Outlook report by Razorfish, there is a strand linking media and content fragmentation, the complex and non-linear consumer experience, and interaction among multiple media and multiple campaigns – all of which lead to one of the biggest analytics challenges: the failure of traditional Media Mix Modeling (MMM) and the search for a better Attribution Analytics.

The very first article in the research and measurement section is on MMM. It has some of the clearest discussion of why MMM fails to handle today's marketing challenges despite its decades of success. But I believe it can be made clearer. One reason is MMM's failure to handle media and campaign interaction, which I think is not a modeling failure but rather a failure for the purpose of attribution (I have discussed this extensively in my post: Attribution, what you want may not be what you need). The interaction between traditional media and digital media, however, is of a different nature: it has to do with the mixing of push and pull media. Push media influence pull media in a way that renders many of the modeling assumptions problematic.

Here’s its summary paragraph:

"Marketing mix models have served us well for the last several decades. However, the media landscape has changed. The models will have to change and adapt. Until this happens, models that incorporate digital media will need an extra layer of scrutiny. But simultaneously, the advertisers and media companies need to push forward and help bring the time-honored practice of media mix modeling into the digital era."

The report limits its discussion to MMM, the macro attribution problem.  It does not give a full treatment of the general attribution problem – there is no discussion of the recent developments in attribution analytics (known by many names, such as Engagement Mapping, Conversion Attribution, Multi-campaign Attribution, etc.).

For those interested in the attribution analytics challenge, my prior post on the three generations of attribution analytics provides an in-depth overview of the field.

Other related posts: micro and macro attribution and the relationship between attribution and  optimization.

March 14, 2009

Eight trends to watch: 2009 Digital Outlook from Razorfish

1. Advertisers will turn to "measurability" and "differentiation" in the recession

2. Search will not be immune to the impact of the economy

3. Social Influence Marketing™ will go mainstream

4. Online ad networks will contract; open ad exchanges will expand

     With Google's new interest-based targeting, things look set to change even more rapidly.

5. This year, mobile will get smarter

6. Research and measurement will enter the digital age

     This is an issue dear to my heart; I have written about the importance of Attribution Analytics and Micro and Macro Attribution many times in recent months. Directly from the report:

    "Due to increased complexity in marketing, established research and measurement conventions are more challenged than ever. For this reason, 2009 will be a year for research reinvention. Current media mix models are falling down; they are based on older research models that assume media channels are by and large independent of one another. As media consumption changes among consumers, and marketers include more digital and disparate channels in the mix, it is more important than ever to develop new media mix models that recognize the intricacies of channel interaction."

7. "Portable" and "beyond-the-browser" opportunities will create new touchpoints for brands and content owners

8. Going digital will help TV modernize

Read the Razorfish report for details.

March 10, 2009

fairness is not the principle for optimization

In my other post, what you want may not be what you need, I wrote about the principle of optimization. Some follow-up questions I got from people made me realize that I had not done a good job of explaining the point. I'd like to try again.

Correct attribution gives a business a way to implement accountability. In marketing, correct attribution of sales and/or conversions presumably helps us optimize marketing spend. But how?  Here's an example of what many people have in mind:

    Suppose you have the following sale attributions to your four marketing channels:
             40% direct mail
             30% TV
             20% Paid Search
             10% Online Display
    then, you should allocate future budget to the four channels in proportion to the percentage they got.

This is intuitive, and perhaps what the fairness principle would do:  award according to contribution.  However, this is not the principle of optimization. Why?

Optimization is about maximization under constraints.  In the case of budget optimization, you ask how to spend the last (or marginal) dollar most efficiently.  Your last dollar should go to the channel with the highest marginal ROI.  In fact, this principle dictates that as long as there is a difference in marginal ROI across channels, you can always improve by moving dollars around.  Thus, at the true optimal allocation, marginal ROI is equalized across channels.
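
To see the difference between proportional allocation and marginal-ROI thinking, here is a sketch with made-up diminishing-returns revenue curves. The curves and numbers are purely illustrative assumptions, not estimates for any real channel.

    import math

    # Hypothetical diminishing-returns revenue curves, revenue = f(spend).
    # Illustrative assumptions only, not estimated from data.
    curves = {
        "direct_mail":    lambda s: 120 * math.log1p(s / 50),
        "tv":             lambda s: 100 * math.log1p(s / 60),
        "paid_search":    lambda s:  80 * math.log1p(s / 30),
        "online_display": lambda s:  40 * math.log1p(s / 40),
    }

    budget, step = 300.0, 1.0
    spend = {channel: 0.0 for channel in curves}

    # Greedy allocation: each marginal dollar goes to the channel with the
    # highest marginal return, so marginal ROIs end up (roughly) equalized.
    for _ in range(int(budget / step)):
        marginal = {ch: f(spend[ch] + step) - f(spend[ch]) for ch, f in curves.items()}
        best = max(marginal, key=marginal.get)
        spend[best] += step

    print({ch: round(s) for ch, s in spend.items()})
    # The result is generally NOT proportional to each channel's share of
    # attributed sales (the 40/30/20/10 split above).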

The 40% sale/conversion attribution to direct mail is used to calculate an average ROI.  In most DM programs, the early dollars go to the better names on the list, which tends to produce a higher ROI; on the other hand, fixed costs such as model development will lower the ROI for the early part of the budget.  ROI and marginal ROI are functions of budget, and marginal ROI is in general not equal to average ROI.  Every channel has its own reasons, with a similar conclusion.  This is why attribution percentages do not automatically tell us how to optimize.

You may ask: assuming every channel's marginal ROI is proportional to its average ROI, are we then justified in using attribution percentages for budget allocation?  The answer is still no.  If that assumption held, you should give all your dollars to the single channel with the highest ROI, not spread them across channels in proportion to the percentages.

We used an example of macro attribution to illustrate the point; the same thinking applies to micro attribution as well.  Contrary to the common sense that regards attribution as the foundation for accountability and, further, for operational and budget optimization, attribution percentages should not be used directly in optimization.  The proportional rule – the fairness principle – is not the principle of optimization.

March 5, 2009

The three generations of (micro) attribution analytics

For marketing and advertising, the attribution problem normally starts at the macro level: we have total sales/conversions and total marketing spend.  Marketing Mix Modeling (MMM) is the commonly used analytics tool, providing a solution using time series data on these macro metrics.

The MMM solution has many limitations that are intrinsically linked to the nature of the macro-level data it uses.  Micro attribution analytics, when micro-level touch point and conversion tracking is available, provides a better attribution solution.  Sadly, MMM is more often practiced even when the data for micro attribution are available; this is primarily due to the lack of development and understanding of micro attribution analytics, particularly the model-based approach.

There have been three types – or, better yet, three generations – of micro attribution analytics over the years: the tracking-based solution, the order-based solution and the model-based solution.

The tracking-based solution has been popular in the multi-channel marketing world.  The main challenge here is to figure out through which channel a sale or conversion event happens. The book Multichannel Marketing – Metrics and Methods for On and Offline Success by Akin Arikan is an excellent source of information on the most frequently used methodologies, covering customized URLs, unique 1-800 numbers and many other cross-channel tracking techniques.  Tracking is normally implemented at the channel level, not at the individual event level.  Without a tracking solution, sales numbers by channel are inferred through MMM or other analytics; with proper tracking, the numbers are directly observed.

The tracking solution is essentially a single-touch answer to a multi-touch attribution problem. It does not deal with the customer-level multi-touch experience.  This single-touch approach leads naturally to the last-touch rule when viewed from a multi-touch attribution perspective.  Another drawback is that it is simply a data-based solution without much analytics sophistication behind it – it provides relationship numbers without a strong argument for a causal interpretation.

The order-based solution explicitly recognizes the multi-touch nature of individual consumers' experience with brands and products. With micro-level touch point and conversion data available, order-based attribution generally seeks attribution rules in the form of a weighting scheme based on the order of events. For example, when all weights are zero except the last touch point, it reduces to last-touch attribution.  Many such rules have been discussed, with constant debate about the virtues and drawbacks of each.  There are also metrics derived from these low-level order-based rules, such as the appropriate attribution ratio (Eric Peterson).
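
As an illustration of what such order-based weighting schemes look like, here is a small sketch of a few common rules. It is a simplified rendering under my own assumptions, not any vendor's actual implementation.

    def position_weights(path, rule="last"):
        """Attribution weights for an ordered list of touch points.

        A sketch of common order-based rules; real schemes vary by vendor
        and, as argued below, none of them is right for every business.
        """
        n = len(path)
        if rule == "last":
            return [0.0] * (n - 1) + [1.0]
        if rule == "first":
            return [1.0] + [0.0] * (n - 1)
        if rule == "equal":
            return [1.0 / n] * n
        if rule == "exponential":  # later touches weigh progressively more
            raw = [2.0 ** i for i in range(n)]
            return [w / sum(raw) for w in raw]
        raise ValueError(f"unknown rule: {rule}")

    path = ["search", "banner", "search"]  # a hypothetical conversion path
    for rule in ("last", "first", "equal", "exponential"):
        print(f"{rule:12s}", [round(w, 2) for w in position_weights(path, rule)])
    # Note that the weights depend only on position, never on the type of event.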

Despite the many advantages of the order-based multi-touch attribution approach, there are still methodological limitations. One limitation is that, as many already know, there is no weighting scheme that is generally applicable or appropriate for all businesses under all circumstances. There is no point arguing which rule is the best without the specifics of the business and data context.  The proper rule should differ depending on the context; however, there is no provision or general methodology for how the rule should be developed.

Another limitation of the order-based weighting scheme is that, for any given rule, the weight of an event is determined solely by the order of the event and not by its type.  For example, a rule may specify that the first click gets 20% of the attribution – when it may be more appropriate to give the first click 40% if it is a "search" and 10% if it is a "banner click-through".

Intrinsic to its intuition-based rule development process is the lack of a rigorous methodology to support a causal interpretation, which is central to correct attribution and operational optimization.

Here comes the third generation of attribution analytics: model-based attribution.  It promises to deliver a sound modeling process for rule development, and it provides the analytical rigor for finding relationships that can bear a causal interpretation.
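
As a concrete, if oversimplified, taste of what model-based attribution can look like – a minimal sketch under my own assumptions, not the deep-dive example promised in the next post – one could fit a conversion model on user-level touch counts and read relative channel influence off the fitted model.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    # Hypothetical user-level data: touch counts per channel plus a 0/1
    # conversion label. Non-converters are included on purpose (see the
    # sampling-bias discussion in "the first 3 insights on attribution analytics").
    #            banner, search
    X = np.array([[1, 0],
                  [1, 2],
                  [0, 1],
                  [2, 0],
                  [0, 0],
                  [1, 1]])
    y = np.array([0, 1, 1, 0, 0, 1])   # converted?

    model = LogisticRegression().fit(X, y)
    print(dict(zip(["banner", "search"], model.coef_[0].round(2))))
    # The coefficients (on the log-odds scale) estimate each channel's influence
    # on conversion; attribution weights can then be derived from the fitted
    # model rather than being chosen by intuition.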

More details to come.  Please come back to read the next post: a deep dive example of model-based attribution.

Related post: Micro and Macro Attribution

the counting game around attribution analytics

Filed under: Datarology, Technology, Web Analytics — Huayin Wang @ 3:59 am

How many ways do people describe the solution to the attribution problem? Let's count:
        attribution analytics
        attribution modeling
        multi-campaign attribution
        marketing attribution
        revenue attribution
        engagement mapping
        conversion attribution
        attribution management
        impression attribution
        marketing mix modeling
        response attribution
        attribution rules
        multi touch attribution analysis
        advanced attribution management
        online campaign attribution 
        multiple protocol attribution

       (and these do not include attribution theory, performance attribution, etc. – terms less related to marketing and advertising.)

It is also interesting to read all the different rules and heuristics that have been proposed as solutions for attributing a conversion to the events that precede it, based on their order: first-touch attribution, last-touch, mean attribution, equal attribution, exponentially weighted attribution, using engagement metrics as a proxy, looking at the ratio of first-to-last attribution, etc.

And what about all the articles and posts talking about it?  There are perhaps over 100 of them just making the point about how important it is to think about the multi-touch attribution problem and do some analysis – very interesting indeed.

I am sure that I missed some of the names and rule variations in the lists above.  Please add whatever I missed in the comments to help me out.

The use of different terminologies creates some confusion, making it difficult to stay focused on the core methodology issue.

Please come back to read the next post on the three generations of attribution analytics.

March 4, 2009

attribution: what you want may not be what you need

… or should I say, what you need may not be what you want?

The attribution problem, particularly the macro attribution problem, traditionally asks for a way to partition a success metric across marketing efforts, in the form of percentages, so that relative contributions can be measured. The hope is that these percentages can be used, aside from figuring out how to distribute bonuses, to guide optimal budgeting decisions.  However, the promise of using attribution as an optimization methodology is flawed. Attribution is basically an additive model of the business process, in which success can be partitioned as if the efforts were mutually independent. It is problematic when the actual relationship between marketing efforts and the success metric is non-linear – due either to the presence of interaction effects (in the form of synergy or cannibalization) or to the intrinsic quantitative relationship.

When the data-driven relationship/model turns out to be non-additive and non-linear, it may not be intuitively clear how to use it for attribution, i.e., how to come up with the percentages.  On the other hand, the non-linear, non-additive model is exactly what you need for operational optimization, such as a budget optimization decision.  This is because true optimization follows the equalization principle of marginal returns, rather than averages. Attribution percentages are not necessary for optimization precisely because they are based on averages; they are still useful in the many cases where averages and marginals are highly correlated.

This is the basic idea of this post, and the reason I was asking everyone to come back and read. Fundamentally, you need to optimize your operation, not just obtain a set of percentages for distributing credit.

The other point I want to make is that the percentages every attribution effort is trying to get at, and every attribution rule is trying to produce, only make sense under some type of causal interpretation.  If the data show that a factor/effort has no real influence on success/conversion, then it is conceptually not justifiable to attribute anything to it. Again, the right interpretation of an attribution rests on the affirmation of a causal relationship. This is another reason why statistical modeling is fundamentally important.

I am not saying that the conversion model I mentioned is the typical "conversion model" we use in the DM context; it does not have to be exclusively predictive modeling. The particular type of conversion model we are building is a causal model based on empirical data. Much of the predictive modeling toolkit still applies, but there are some differences.  For example, for pure prediction, proxy variables are as valid as any other variables; they may not be automatically acceptable when building a model that requires a causal interpretation.

Correct attribution rules should be based on a sound conversion model. Their implementation in web analytics tools can facilitate the reporting process for insight generation and monitoring, playing a role similar to the one attribution plays now. What I am arguing is that the rules should be built on a sound, data-driven conversion model, not simply on intuition. My point goes a little further: I am also arguing for using the conversion models themselves (be they linear or non-linear, additive or non-additive), not the attribution percentages.

In sum, a conversion model will provide what you need, which is the ability to optimize your operation, but it may not give you what you wanted from attribution; those percentages that we all like to see and talk about are ultimately less critical than we thought.

Please come back and read the next post on a deep dive example of the conversion modeling approach to attribution.

Comments?

March 2, 2009

the first 3 insights on attribution analytics

Looking at micro attribution from a conversion modeling framework, there are a few insights we can contribute right away without getting into the details.

1)  The sampling bias

If your attribution analysis uses only data from converters, then you have a sampling bias issue.

As a first-order question for any modeling project, understanding the data sample, and therefore the potential sampling bias, is crucial.  How is this relevant to the attribution problem?

Consider a hypothetical, but commonly asked, type of question:

What is the right attribution to banner and search, given the following conversion path data:
    Banner -> conversion:   20%
    Banner -> search -> conversion: 40%
    Search -> conversion: 25%
    Search -> Banner -> conversion:  15%

Well, what could be wrong with the question?  A standard setup for a conversion model is to use conversion as the dependent variable, with banner and search as predictors. The problem here is that we only have converter cases and no non-converter cases.  We simply cannot fit a model at all.  We need more data, such as the number of users who clicked on a banner but did not convert.

The sampling bias issue actually runs deeper than this.  We want to know whether the coverage of banner and search is "biased" in the data we are using – for example, when banner ads ran nationally while search ran regionally. We also need to ask whether future campaigns will be run in ways similar to what happened before – the requirement that the modeling setup mimic the context in which the model will be applied.
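
To see why the converter-only paths are not enough, here is a toy calculation with made-up exposure counts. The non-converter numbers are pure assumptions, invented only to show how much the answer depends on them.

    # Path shares among converters, from the question above.
    paths = ["Banner", "Banner -> Search", "Search", "Search -> Banner"]
    share_of_converters = [0.20, 0.40, 0.25, 0.15]

    total_converters = 1_000        # hypothetical scale
    # Hypothetical counts of users on each path who did NOT convert -
    # exactly the information a converters-only sample lacks.
    non_converters = [9_800, 1_600, 4_750, 350]

    for path, share, nc in zip(paths, share_of_converters, non_converters):
        converted = share * total_converters
        print(f"{path:18s} conversion rate = {converted / (converted + nc):.1%}")
    # "Banner -> Search" has the largest share of converters (40%), yet with
    # these denominators "Search -> Banner" has the highest conversion rate -
    # the ranking is driven by numbers the converter-only data set lacks.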

2) Encoding sequential pattern

The data for micro attribution naturally come in the form of a collection of events/transactions:
User1:
    banner_id time1.1
    search_id time1.2
    search_id time1.3
    conversion time1.4
User2:
    banner_id time2.1
User3:
    search_id time3.1
    conversion time3.2

Some may think that this form of data makes predictive modeling infeasible. That is not the case.  Plenty of predictive modeling is done with transaction/event data – fraud detection and survival models, to name a couple.  In fact, there are sophisticated practices in mining and modeling sequential patterns that go far beyond what is usually contemplated in discussions of the attribution problem. The simple message is: this is a well-researched and well-practiced area, and a great amount of knowledge and expertise already exists.
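
Here is a minimal sketch of one simple way to encode such an event log into a modeling data set – per-user touch counts plus a conversion label. Richer encodings would add recency, order and inter-event gaps along the same lines.

    from collections import defaultdict

    # The toy event log from above, as (user, event, time) tuples.
    events = [
        ("User1", "banner", 1), ("User1", "search", 2),
        ("User1", "search", 3), ("User1", "conversion", 4),
        ("User2", "banner", 1),
        ("User3", "search", 1), ("User3", "conversion", 2),
    ]

    features = defaultdict(lambda: {"banner": 0, "search": 0})
    converted = defaultdict(int)

    for user, event, _time in events:
        if event == "conversion":
            converted[user] = 1
        else:
            features[user][event] += 1

    for user in sorted(features):
        print(user, features[user], "converted =", converted[user])
    # User2 appears with converted = 0 - keeping such non-converters in the
    # data set is what makes a conversion model estimable in the first place.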

3) Separating model development from implementation processes

Again, common sense from the predictive modeling world can shed some light on how our web analytics industry should approach the attribution problem.  All WA vendors are trying to figure out this crucial question: how should we provide data/tool services to help clients solve their attribution problem? Should we provide data, should we provide attribution rules, or should we provide flexible tools so that clients can specify their own attribution rules?

The modeling perspective says that there is no generic conversion model that is right for all clients, much as in Direct Marketing we all know there is no one right response model for all clients – even for clients in the same industry. Discover Card will have a different response model than American Express, partly because of differences in their targeted populations and their services, and partly because of the availability of data.  Web analytics vendors should provide data sufficient for clients to build their own conversion models, not build ONE standard model for all clients (of course, they can offer separate modeling services, which is a different story). Web analytics vendors should also provide tools so that a client's model can be specified and implemented once it has been developed.  Given the parametric nature of conversion models, none of the tools from the current major web analytics vendors seems sufficient for this task.

That is all for today. Please come back to read the next post: conversion model – not what you want but what you need.

February 27, 2009

it is about who gets the job

If you ever wonder what all my recent attribution posts are about … they are about who gets the job of handling the problem. Statisticians and modelers should be the ones — it is their job, and I am speaking on their behalf.

If in fact the solution to attribution analytics is conversion modeling, then why does it seem like everyone is talking about it as everything but conversion modeling?

Well, in my humble opinion, it is a sign of the lack of involvement, or lack of engagement, from statisticians, modelers and data miners. Today's web analytics certainly has a lot of hammers and power tools; however, attribution may just be a different kind of problem.

Stay tuned for the next post on the first 3 insights from the conversion modeling perspective.

