Analytics Strategist

May 20, 2013

Hi Mr. Wanamaker, only half of your money is wasted?

Filed under: Advertising, metrics — Huayin Wang @ 6:09 pm

Everyone working in the advertising industry, or related fields, has probably heard of the famous Wanamaker Quote: “Half the money I spend on advertising is wasted; the trouble is I don’t know which half.”

What he said seems obvious at first; read it a little more closely, however, and it becomes problematic.  Below are a few related points:

a) The waste may not be 50%; it may in fact be as high as 99%

Let’s begin by asking, how did he estimate the advertising waste?  Can someone know that amount of waste without being able to identify which part?

There are two ways to estimate media waste.  The first involves breaking advertising campaigns down into different tactics and identifying the ineffective ones.  Tactics can vary by audience attribute (age, gender, behavior), geography, creative, etc.  Take gender as an example: your Male audience may be twice as effective as your Female audience, so you treat the Female tactic as waste.  The problems with this estimation methodology: the ineffective tactics are not all “wasted”, and the effective tactics contain waste too.  The estimate is also quite subjective, since it depends not only on how you define “effective” but also on how you break campaigns down into “tactics”.

The second way of estimating waste, the only defensible one in my view, relies on counting outcomes directly.  Take a direct response campaign as an example: if conversion is the outcome, the money spent without resulting in a conversion is wasted.  If display ads reached 30 million users and only 3,000 converted, then the spend on the other 99.99% of users is wasted.  The actual waste can be even higher when you consider that the ads shown to the converters may themselves have been ineffective, and should not be counted as incrementally effective.
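
(As a back-of-the-envelope illustration, here is the arithmetic in a few lines of Python, using the hypothetical numbers above.)

    # hypothetical numbers from the example above
    reached_users = 30_000_000
    converters = 3_000
    waste_fraction = 1 - converters / reached_users
    print(f"waste: {waste_fraction:.2%}")   # -> waste: 99.99%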

b) It is not just about measurement alone, but about the granularity of the underlying measurement

Knowing how to measure the waste, the next question is: how do we solve the waste issue?

The common (traditional, offline) scheme is to define a target audience first, and then define “waste” as media delivered outside of that target audience.  This practice ignores the waste inherent in the definition of the target audience itself.  Age 20-34 may be five times as likely to convert as others and therefore a valid target audience.  However, if the average conversion rate is 1%, then the conversion rate for this target audience is only 5% – which means 95% of that spend is waste as well.
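
(The same arithmetic, sketched in Python with the assumed numbers above.)

    # hypothetical numbers: the target segment converts at 5x the average rate
    average_rate = 0.01
    segment_rate = 5 * average_rate           # 5% conversion within the target audience
    waste_within_target = 1 - segment_rate    # 95% of in-target impressions still do not convert
    print(f"{waste_within_target:.0%}")       # -> 95%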

Creating different targeting tactics and measuring them does not necessarily address the issue of waste! I am horrified to see how many people believe that bringing offline GRP metrics online solves the display advertising waste problem.  Age and gender data do not generate tactics that are waste-free.  You need higher-dimensional data to create and identify much more granular audience and context/creative groupings in order to truly combat the advertising waste problem!

 

Are GRP metrics the cure for online advertising waste?  I do not think so.  In fact, I think they will do more harm than good.

c) Targetability is key, but often ignored

To keep this writeup from getting long, I will make the point really short: without event-level targeting, we are not going to solve the waste problem; in fact, we are not even facing it squarely.  If nothing else, the most granular level of media transaction mechanism, such as the AdExchange RTB implemented today, is necessary.

April 5, 2013

Multi-touch attribution problem is solved, agree?

Filed under: attribution analytics — Huayin Wang @ 10:08 pm

I believe the MTA modeling problem is solved with the approach I discussed in the Unusually Blunt Dialogue on Attribution.  I have since received some questions about the approach, or the agenda; some related to the contents and others about formatting.  Today, I am going to try a simple recap, to address those questions.

First of all, the formatting issue. The format in WP is hard to read. A friend of mine (thank you, Steve!) was kind enough to put the content into MS-Word.  Anyone interested in reading the dialogues in a better format can download it here:  the attribution dialogue.

Below are Q&A for other questions:

Q: Is attribution problem solved?

A: Hardly. Attribution problem consists of many challenges: data, model/modeling, behavioral insight, reporting, and finally optimization.

Q: When you started, you were aiming to reach a consensus on Attribution Model and Modeling. Have we reached the consensus? Is this attribution modeling problem solved?

A: Consensus is never easy to build and may never be achieved. I believe I have covered enough ground to build consensus on this issue, so we can move on to other businesses. I believe the MTA modeling problem is solved, but I am open to someone who can convince me otherwise.

Q: Are there any remaining issues not covered in your agenda?

A: Yes. One example of the left-out issues is the search–display interaction; we handle part of it, but not completely.

Q: What do you mean?

A: There are two types of interactions:  the interaction effect at behavioral level, which is covered in the conversion model, and the interaction effect on media exposure.  The latter type of interaction is not capturable by conversion models.

Q: This is quite dense … do we need another methodology to model the likelihood of exposure?

A: I do not think individual level modeling is the right approach – lack of data is not the only challenge …

Q: Ok, if this is so, how can we say attribution modeling is solved?

A: I consider this to be outside the main attribution modeling.  This trailing piece may need a different handle – a “re-attribution” methodology?

(more to come)

Bid quality score: the missing piece in the programmatic exchange puzzle

Filed under: Ad Exchange, Game Theory, Matching game, misc, Technology — Huayin Wang @ 7:45 pm

On the eve of the Programmatic IO and Ad Tech conferences in SF, I want to share my idea for a new design feature of Exchange/SSP, a feature that has the potential of significantly impacting our industry. This feature is the bid auction rule.

The bid auction rule is known to be central to Search Engine Marketing.  Google’s success story underscores how important a role it can play in shaping the process and dynamics of the marketplace. There is reason to believe that it has similar potential for the RTB Exchange industry.

The current auction model implemented in Ad Exchanges and SSPs is commonly known as a Vickrey auction, or second-price auction. It goes like this:  upon receiving a set of bids, the exchange decides on a winner based on the highest bid amount, and sets the price to the second highest bid amount.  In the RTB bidding process diagram below, this auction rule is indicated by the green arrow #7:

(Diagram: the RTB bidding process)

(I am simplifying the process a lot for our purpose by removing non-essential details, e.g. ad servers.)

The new auction I’d like to propose is a familiar one: it is a modified Vickrey auction with a quality score!  Here, the bid quality score is defined as the quality of an ad to the publisher, aside from the bid price.  It essentially captures everything a publisher may care about in the ad. I can think of a few factors:

  1. Ad transparency and related data availability
  2. Ad quality (adware, design)
  3. Ad content relevancy
  4. Advertiser and product brand reputation
  5. User response

Certainly, the bid quality scores are going to be publisher specific.  In fact, they can be made site-section specific or page specific.  For example, a publisher may have a reason to treat the home page of their site differently from other pages.  The scores can also vary by user attributes if the publisher likes.

Given that, the Exchange/SSP will no longer be able to carry out the auction all by itself – as the rule no longer depends only on bid amounts.  We need a new processing component, as shown in the diagram below.

(Diagram: the new design, with the Publisher Decider component)

Now, #7 is replaced with this new component, called Publisher Decider.  Owned by the publisher, the decider works through the following steps:

  1. it takes in multiple bids
  2. calculates the bid quality scores
  3. for each bid, calculates the Total Bid Score (TBS), by multiplying bid amount and quality score
  4. ranks the set of bids by the TBS
  5. makes the bid with highest TBS the winner
  6. sets the bid price based on a formula below, as made famous by Google

P1 = B2 × Q2 / Q1

Here, P1 is the price set for the winning bid, Q1 is the winner’s bid quality score, B2 is the bid amount of the bid with the second highest TBS, and Q2 is the bid quality score of that same bid.
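
(To make the mechanics concrete, here is a minimal sketch of how such a Publisher Decider might rank and price bids.  The data shapes, names and numbers are illustrative assumptions, not a spec.)

    # illustrative sketch of the proposed Publisher Decider; all names and numbers are assumptions
    def publisher_decider(bids, quality):
        """bids: list of (bidder, bid_amount); quality: publisher-assigned bid quality scores."""
        # steps 2-5: compute the Total Bid Score (TBS = bid amount * quality score) and rank by it
        ranked = sorted(bids, key=lambda b: b[1] * quality[b[0]], reverse=True)
        (winner, _), (runner_up, b2) = ranked[0], ranked[1]
        q1, q2 = quality[winner], quality[runner_up]
        # step 6: quality-adjusted second price, P1 = B2 * Q2 / Q1
        return winner, b2 * q2 / q1

    bids = [("apple", 1.00), ("adware_x", 5.00)]    # CPM bids
    quality = {"apple": 1.0, "adware_x": 0.15}      # publisher's quality scores
    print(publisher_decider(bids, quality))         # -> ('apple', 0.75): the $1 brand bid beats the $5 adware bid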

This is not a surprise and it’s not much of a change. So, why is this so important?

Well, with the implementation of this new auction rule, we can anticipate some natural impacts:

  • Named brands will have an advantage on bid price, because they tend to have better quality scores. A premium publisher may be willing to take a $1 CPM from Apple rather than a $5 CPM from potential adware.  This is achieved by Apple having a quality score five or more times higher than that of the other ad.
  • Advertisers will have an incentive to be more transparent. Named brands will be better off being transparent, to distinguish themselves from others. This will drive the quality scores of non-transparent ads lower, starting a virtuous cycle.
  • DSPs and bidders will have no reason not to submit multiple bids, since they cannot know beforehand which ad will win.
  • Premium publishers will have more incentive to put their inventory into the exchanges now that they have transparency and a finer level of control.
  • The Ad Tech ecosystem will respond with new players, such as ad-centric data companies serving publisher needs, similar to the contextual data companies serving advertisers.

You may see missing links in the process I described here.  That is expected, because a complete picture is not the focus of this writing.  I hope you will be convinced that the bid quality score / Publisher Decider idea is interesting, and that it could have a significant impact by pushing the Ad Tech space toward more unified technologies and a consistent framework.

March 30, 2013

An Unusually Blunt Dialogue on Attribution – Part 3

Filed under: misc — Tags: , , , — Huayin Wang @ 4:53 pm
Q: It’s been over a week since we talked last time, and I am still in disbelief.  If what you said is true, the multi-touch attribution problem is solved! Then again, I feel there are still so many holes. Before we discuss some challenging questions, can you sketch out the attribution process, the steps you proposed?

A: Sure.  There are four steps:

Step 1. Developing conversion model(s)
Step 2. Calculating conditional probability profile for each conversion event
Step 3. Applying Shapley Value formula to get the S-value set
Step 4. Calculating fractional credit: dividing S-value by the (non-conditional) conversion probability
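
(Aside: below is a minimal sketch of steps 2–4 in Python.  It assumes the conversion model from step 1 is available as a function predict_p(channels_on) returning P(conversion | that subset of channels on), and that the step-4 denominator is the conversion probability with all touched channels on, as in the Part 2 worked example; the function and variable names are placeholders, not part of the agenda itself.)

    from itertools import combinations
    from math import factorial

    def shapley_values(channels, predict_p):
        """Steps 2-3: use the conditional probability profile P(c | subset of channels on)
        as the coalition value and apply the Shapley Value formula."""
        n = len(channels)
        phi = {}
        for ch in channels:
            others = [c for c in channels if c != ch]
            total = 0.0
            for k in range(n):
                for subset in combinations(others, k):
                    weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                    total += weight * (predict_p(frozenset(subset) | {ch}) - predict_p(frozenset(subset)))
            phi[ch] = total
        return phi

    def fractional_credits(channels, predict_p):
        """Step 4: divide each channel's S-value by the conversion probability with all
        touched channels on (the denominator used in the Part 2 dialogue)."""
        p_all = predict_p(frozenset(channels))
        return {ch: v / p_all for ch, v in shapley_values(channels, predict_p).items()}

    # tiny usage example with made-up conditional probabilities
    toy_p = {frozenset(): 0.002, frozenset({"search"}): 0.010,
             frozenset({"display"}): 0.004, frozenset({"search", "display"}): 0.015}
    print(fractional_credits(["search", "display"], lambda s: toy_p[frozenset(s)]))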

Q: And what to call this – the attribution process? algorithm? framework? approach?
A: attribution agenda – of course, you can call it anything you like.

Q: Why don’t you start with data collection – I am sure you have heard of the GIGO principle and how important having good and correct data is for attribution …
A: I am squarely focusing on attribution logic – data issues are outside the scope of this conversation.

Q: I noticed that there are no rule-based attribution models in your agenda. Are the rules really so arbitrary that they are of no use at all?
A: They are not arbitrary – like any social/cognitive rules, they are customary in nature. For attribution purposes, however, they are neither conversion models, which measure how channels actually impact conversion probability, nor clearly stated justification principles.

Q: What about the famous Introducer, Influencer and Closer framework – the thing everyone uses in defining attribution models – and the insights it provides for attribution?
A: It is really the same concept as the last touch and first touch rules – a position-based way of looking at how channels and touch point sequences are correlated. You can use an alternative set of cleaner and more direct metrics to get similar insights – metrics derived from counting the proportions of a channel appearing in conversion sequences as first touch, last touch, and neither.

Q: Do these rules have no use at all in the attribution process? Can they be used in conjunction with conversion models?
A: You do not use them together; there is simply no need for them anymore once you have conversion models.  However, there are cases where you do not have sufficient data to build your models. In those cases, you can borrow from other models, or use these rule-based models as heuristics.

Q: You are clearly not in the “guru camp” – as you said in your “Guru vs PhD” tweet. Are you in the PhD camp then?
A: No. I also think they may be more disappointed than the gurus from the web analytics side …

Q: I have the same feeling – I think you are killing their hope of being creative in the attribution modeling area. With your agenda, there are no more attribution models aside from conversion models, and no more attribution modeling aside from this one Shapley Value formula and the adjustment factor.
A: The real creativity should be in the development of better conversion models.

Q: Let’s slow down a little bit. I think you may be oversimplifying the attribution problem.  Your conversion models seem to work only when there is one touch event per channel — how can you handle cases with multiple touch events per channel?
A: You may be confusing the conditional probability profile – in which a channel is treated as one single entity – with conversion models. In my mind, you can create multiple variables per channel that reflect complex features of the touch point sequences for that channel: frequency, recency, interval, first-touch indicator, etc. Once the model is developed, you construct the conditional probability profile by turning all the touch points for that channel On or Off at the same time.
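
(A small illustration of that per-channel toggle, with made-up variable names and a stand-in for the fitted conversion model; nothing here is a specific vendor API.)

    # illustrative sketch: turn all of a channel's engineered variables Off together
    channel_vars = {
        "display": ["display_freq", "display_recency", "display_first_ind"],
        "search":  ["search_freq", "search_recency"],
    }

    def conditional_profile(score, features, channel_vars):
        """score(features) stands in for the fitted conversion model and returns P(conversion)."""
        profile = {"all_on": score(features)}
        for channel, cols in channel_vars.items():
            off = dict(features)
            for col in cols:
                off[col] = 0      # or whatever encoding the model uses for "not touched"
            profile[channel + "_off"] = score(off)
        return profile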

Q: Ok. How do you deal with the ordering effect – the fact that channel A first, and B second (A,B) is different from (B,A)?
A: You construct explicit order indicator variables in your conversion models … that way, your attribution formula (the Shapley Value) can remain the same.

Q: And what if the order does not matter?
A: Then the order indicator variables will not be significant in the conversion models.

Q: And the channel interaction?
A: Through the usual way you model interaction effects between two or more main effects.

Q: The separation of conversion model and attribution principle in your agenda is quite frustrating.  Why can’t we find innovative ways of handling both in one model – a sort of magic model, Bayesian, Markovian or whatever.
A: Go find it out.

Q: A control/experiment design could be an alternative, couldn’t it?
A: A control/experiment design is at best a way of measuring marginal impact; suffice it to say that it is an impractical way to measure all the levels of marginal impact that a conversion model will support.  If we have more than a couple of channels, the number of experiments needed goes up exponentially.  It also does not allow post-experiment analysis, and there is no practical way to incorporate recency and sequence patterns, etc.

Q: What about an optimization principle?  If, by requiring that the best attribution rule reflect the optimal way of allocating campaign budget to maximize the number of conversions, one can derive a unique attribution rule, can that be the solution to attribution?
A: No. The attribution problem is about events that have already happened and needs to be answered that way, without requiring any assumptions about the future.  Campaign optimization is a related but separate topic.

Q: Your attribution agenda is limited to the conversion event. In reality, there are a lot of other metrics we care about, such as customer lifetime value, engagement value, etc. How do you attribute those metrics?
A: If you can attribute the (conversion) event, you can attribute all metrics derived from it, by figuring out what value is linked to that event. In short, you figure out the fractional credit for the event first, then multiply it by the value of the event; that gives you the attribution for the new metric.

Q: You have so far not talked about media cost at all – when we know every attribution vendor uses it in the process. How come there is no media cost in your attribution agenda?

A: Media cost is needed to evaluate cost-based channel performance, not for attribution. How much a channel impacted a conversion is a fact, not dependent on how much you paid the vendor —  if there is any relationship, it should run the other way. The core attribution process can be done without media cost data — vendors ask for it because they want to work on more projects aside from attribution.

Q: Regarding the issue of where the attribution process should reside, you picked the agency.  Isn’t the agency the last place you’d think of when it comes to any technology matter? Since when have you seen an agency make technology a core competency?
A: Understandable. I said that not for any of the reasons you mentioned, but for what an ideal world should be.  The attribution process is central to campaign planning, execution and performance reporting, at both the tactical and strategic levels.  Having that piece sit outside of the integration center can cause a lot of friction in moving your advertising/marketing to the next level.  I said it should live inside your agency, but I did not say it should be “built” by the agency; I did not say it should live inside your “current” agency; and certainly, there is nothing preventing you from making your technology vendor your “new agency”, as long as they take up the planning, execution and reporting work from your agency, at both strategic and tactical levels.

Q: What about Media Mix Modeling? If we have resources doing that, do we still need to worry about attribution?
A: Micro vs. macro attribution technologies. It is complicated and certainly needs a separate discussion to do the topic justice.  The simplest distinction between the two is this:  when you have the most detailed data about who was touched by what campaigns, you do attribution.  If you have none of that data, and only know aggregate-level media delivery and conversion data, you do MMM.

Q: I have to say that your agenda brings a lot of clarity to the state of attribution. I like the prospect of order; still, I can’t help but think about what a great time everyone has had around attribution models in recent years …

A: Yes – the state of extreme democracy without consensus. To those who have guns, money and power, anarchy may just be the perfect state; not to be cynical, just my glass-half-full kind of perspective.

March 21, 2013

An Unusually Blunt Dialogue on Attribution – Part 2

Q: Continuing our conversation from yesterday … I am still confused about the differences between a conversion model, an attribution model and attribution modeling.  Can you demonstrate using a simple example?

A: Sure.  Let’s look at a campaign with one vendor/channel on the media plan …

Q: Wait a minute, that will not be an attribution problem.  If there is only one channel/vendor,  does it matter what attribution model you use?

A: It does. Do we give the vendor 100% of the credit? A fraction less than 100% of the credit?

Q: Why not 100%?  I think all commonly used attribution models will use 100% …

A: You may want to think twice, because some users may convert on their own.  Let’s assume the vendor reached 10,000 users and 100 of them converted. Let’s also assume that, through analysis and modeling work (such as using a control group), you conclude that 80 of the 100 converters would have converted on their own.  How many converters did the vendor actually (incrementally) impact?

Q: 20.

A: If you assign 100% credit to the vendor, the vendor will get credit for all 100 converters.  Since the actual incremental impact is 20 conversions, a fractional credit should be used; in this case it is 20% instead of 100%.  That’s attribution modeling, in its simplest form.

Q: Really? Can you recap the process and highlight the attribution modeling part of it?

A: Sure. In this simplest example, the conversion model provides us with two numbers (scores):

1) The probability of conversion given exposure to the campaign, call it P(c|camp) – in this case 100/10,000 = 1%, and

2) The probability of conversion given no exposure to the campaign, call it P(c|no-camp) – in this case 80/10,000 = 0.8%.

Attribution modeling says that only a fraction of the credit, (P(c|camp) – P(c|no-camp)) / P(c|camp) = 0.2 or 20%, should be credited out.

Notice that this fraction for attribution is not 100%. It is not P(c|camp), which is 1%; and it is not P(c|camp) – P(c|no-camp), which is 0.2%.
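
(The same arithmetic in a few lines of Python, using the numbers above.)

    p_camp    = 100 / 10_000     # P(c|camp)    = 1.0%
    p_no_camp =  80 / 10_000     # P(c|no-camp) = 0.8%
    fraction  = (p_camp - p_no_camp) / p_camp
    print(fraction)              # 0.2 -> the vendor is credited for 20 of the 100 conversions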

Q: This is an interesting formula.  I do not recall seeing it anywhere before.  Does this formula come from the conversion model?

A: Not really.  The conversion model only provides the best possible estimates for P(c|camp) and P(c|no-camp), that’s all.  It will not provide the attribution fraction formula.

Q: Where does this formula come from then?

A: It comes from the following reasoning:  vendor(s) should get paid for what they actually (incrementally) impacted, not all the conversions they touched.

Q: So the principle of this “attribution modeling” is not data-driven but pure reason.  How much should I trust this reasoning?  Can this be the ground to build industry consensus?

A: What else can we build consensus on?

Q: Ok, I see how it works in this simple case, and I see the principle of it.  Can we generalize this “incremental impact” principle to multi-channel cases?

A: What do you have in mind?

Q: Let me try to work out the formula myself.  Suppose we have two channels, call them A and B.  We start with conversion model(s), as usual.  From the conversion model(s), we find our best estimates for P(c|A,B), P(c|nA,nB), P(c|nA,B), P(c|A,nB).  Now I understand why it does not matter if we use logistic regression, a probit model or a neural network to build our conversion model – all that matters is to make sure we get the best estimates for the above scores :)

A: Agree.  By the way, I think I understand the symbols you used, such as c, A, nA, nB etc. – let me know if you think I may guess it wrong :)

Q: This is interesting, I think I can get the formula now.  Take channel A first, and let’s call the fractional credit A should get as C_a;  we can calculate it with this formula:  C_a= (P(c|A,B)–P(c|nA,B)) / P(c|A,B), right?

A: If you do that, C_a + C_b may be over 100%.

Q: What’s wrong, then?

A: We need to first figure out what fraction of the credit is available to be credited out to A and B, just as in the simplest case discussed before. It should be (P(c|A,B) – P(c|nA,nB)) / P(c|A,B).

Q: I see.  How should we divide the credit to A and B next?

A: That is a question we have not discussed yet.  In the simplest case, with one vendor, it is a trivial question. With more than one vendor/channel, we need some new principle.

Q: I have an idea:  we can re-adjust the fractions on top of what we did before, like this:  C’_a = C_a / (C_a + C_b) and C’_b = C_b/(C_a + C_b);  and finally, we use C’_a and C’_b to partition the above fraction of credit.  Will that work?

(note: the following example has an error in it, as pointed out by Vadim in his comment below)

A: Unfortunately, no.  Take the following example:

suppose A adds no incremental value, except when B is present:  P(c|A,nB) == P(c|nA,nB) and P(c|A,B) > P(c|nA,B)

also, B does not add anything when A is present:  P(c|A,B) = P(c|A,nB)

The calculation will lead to:  C_b == 0 and C_a > 0.  Therefore, A gets all the available credit and B gets nothing.

Do you see a problem?

Q: Yes.  B will feel it is unfair, because without B, A would contribute nothing.  However, A gets all the credit and B gets nothing.

A: This is just a case with two channels and two players.  Imagine if we had 10 channels/players – what a complicated bargaining game this is going to be!

Q: Compared with this, the conversion model part is actually easy; well, not easy, but more like a non-issue.  We can build conversion models to generate all these conditional probability scores.  However, we are still stuck here and can’t figure out a fair division of credit.
A: This is attribution modeling:  the process or formula that translates the output of conversion models into an attribution model (or fractional credits). We need to figure this thing out.

Q: What is it, really?

A: We are essentially looking for a rule or a formula to divide the total credit that we can all agree is fair.  Is that right?

Q: Right, but we have to be specific about what do we mean by “fair”.

A:  That’s right.  So, let’s discuss a minimal set of “fair” principles that we can all agree upon.  There are three of them, as I see it:

Efficiency: we are distributing all available credit, not leaving any on the table

Symmetry: if two channels are functionally identical, they should get the same credit

Dummy Channel: if a channel contributes nothing in all cases, it should get no credit

What do you think?

Q: I think we can agree with these principles.  How can they help?

A: Well, someone has proved that there is one and only one formula that satisfies this minimal set of principles. I think this is our attribution formula!

Q: Really? I do not believe this.  Who proved this?  Where can I read more of it?

A: In 1953, Lloyd Shapley published the proof in his PhD work, and the resulting formula became known as the Shapley Value. The field of knowledge is called Cooperative Game Theory.  You can Google it and you will find tons of good references. Of course, Shapley did not call it the “attribution problem”, and he talked about players instead of channels. The full collection of principles is more than three; however, the Transferable Utility and Additivity principles are automatically satisfied when applied to the credit partitioning problem.
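
(An illustrative two-channel calculation of the Shapley split, with made-up conditional probabilities.  With two players, the formula reduces to averaging each channel's marginal contribution over the two possible orderings.)

    # hypothetical conditional conversion probabilities for channels A and B
    p = {frozenset(): 0.010,         # P(c|nA,nB)
         frozenset("A"): 0.010,      # A alone adds nothing
         frozenset("B"): 0.020,
         frozenset("AB"): 0.050}     # together they add the most

    def v(s):                        # incremental value of a coalition of channels
        return p[frozenset(s)] - p[frozenset()]

    # Shapley value = average marginal contribution over the orderings (A,B) and (B,A)
    phi_A = 0.5 * (v("A") - v("")) + 0.5 * (v("AB") - v("B"))
    phi_B = 0.5 * (v("B") - v("")) + 0.5 * (v("AB") - v("A"))

    p_all = p[frozenset("AB")]
    print(phi_A / p_all, phi_B / p_all)   # fractional credits 0.3 and 0.5; they sum to (P(c|A,B) - P(c|nA,nB)) / P(c|A,B) = 0.8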

Q: Now, how do you apply this attribution rule differently for different converters?

A: You do not.  The differences among converters are reflected in the scores generated by the conversion models, not in the attribution formula itself – the Shapley Value.

Q: Ok, if that is the case, everyone in the industry will be using the same attribution formula, the Shapley Value.  How do we then creatively differentiate ourselves from each other?  How should different types of campaigns be treated uniquely?  How do the effects of channels on different types of conversions get reflected in attributed credits?

A: Well, all of that will be reflected in how the conversion models are built, how the parameters of the conversion models are estimated, and finally the scores that come out of the conversion models.  You will innovate on statistical model development techniques. The attribution formula is, fortunately, not where you are going to innovate.

Q: This is quite shocking to me. I can’t imagine how the industry will react …

A: How did the industry deal with Marketing Mix Modeling?  We accepted the fact that those are simply regression models in essence, and started selling expertise in doing it thoroughly and doing it right.  We do not each have to create our own attribution model to be able to compete.

March 20, 2013

An Unusually Blunt Dialogue on Attribution – Part 1

Filed under: misc — Tags: , , — Huayin Wang @ 10:04 pm

Q: I will begin with this question: what do you NOT want to talk about today?

A:  I do not want to waste time on things that most people know and agree with, such as “Last Touch Attribution is flawed”

Q: Why is the attribution model such a difficult challenge that, after many years, we still seem to be just scratching the surface of it?

A: No idea.

Q: Let me try a different way, why is it so hard to build an attribution model?

A: It is not.  It is NOT difficult to build an attribution model – in fact, you can build 5 of them in less than a minute:  Last Touch, First Touch, etc. :)  It is difficult to build good attribution modeling – a process that produces a methodologically sound attribution model.

Q: “Attribution modeling” – is this the kind of tool already available through Google Analytics?

A: No. Those are attribution model specification tools – “you specify the kind of attribution models to your heart’s content and I do reporting using them”.  They do not tell you what IS the RIGHT attribution model. An attribution reporting tool does not make an attribution modeling tool.

Q: “Methodologically sound” – that seems to be at the heart of all attribution debates these days.  Do you think we will ever reach a consensus on this?

A:  Without a consensus on this, how can anyone sell an attribution product or service?

Q: On the other hand, isn’t “algorithmic attribution” already a consensus, that everyone can build on it?

A: What is that thing?

Q: All vendors seem to take the “algorithmic attribution” approach, possibly adding additional phrases, such as “statistical models” and data-driven etc.  Isn’t that sufficient?

A: How? They never show how it works.

Q: Do you really need to get into that level of detail, the “Black Box” – the proprietary algorithm that people legitimately do not release to the public?

A: There is no reason to believe that anyone has a “proprietary algorithm” for attribution.  Unlike predictive modeling, a domain of technology that can be “externally” evaluated without going inside the Black Box, attribution modeling is like math: a methodology whose validity needs to be internally justified. A Black Box for attribution sounds like an oxymoron to me.  You do not see people claim that they have a “proprietary proof” of Fermat’s Last Theorem.   (Ironically, Fermat himself claimed a proof in the margin of a book without actually showing it, but everyone knows he never intended it that way.)

Q: Why then do people claim to have but do not show their algorithmic and/or modeling approach?

A: It is anyone’s guess.  I see no reason for it;  it hurts them and it hurts the advertising industry, particularly the online advertising industry.  I suggest that, from today, every vendor either stop claiming they have a proprietary attribution model/modeling process or come out of the “Black Box” (the emperor’s new clothes?) and prove the legitimacy of their claim.

Q: Ok, suppose I say, I build a regression model to quantify which channels impact conversion and by how much, then calculate the proportional weights based on that and partition the credits according to the proportions.  What would you say?

A: How?

Q: You are not serious, right?  I am giving you so much detail – how much more do you want?

A: The program and process sound like they will work, and to non-practitioners’ eyes it is quite CLEAR that it is going to work.  But you know and I know that it does NOT work.  Having built conversion models does not solve the attribution problem.  The attribution problem comes down to the partitioning of credit, i.e. how much of the conversion credit is to be partitioned and how much is given to each channel.  The logic has to be explicitly presented and justified.   The core challenge has been glossed over and covered up, not solved.

Q: Please simplify it for me.

A: There is no automatic translation available from conversion models to attribution models – the process of doing that, which is attribution modeling, has to be explicitly stated.

Q: You defined attribution problem as partitioning credit to channels – are you talking about only Cross-Channel Attribution?  If I want to focus only on Digital Attribution, or even Publisher Attribution only, is what you said still relevant?

A: Yes.  I am talking about it from data analytics angle – you can just replace the word “channel” with others and the rest will apply.

Q: Ok, what if the conversion model I use is not regression, but some kind of Bayesian models?

A: It does not matter.  It can be Bayesian, Neural Net or a Hidden Markov Model.  As long as it is a conversion model.  The automatic translation is not there.

Q: Does it matter if the conversion model is predictive or descriptive?

A: It should be a conversion model – there are multiple meanings of “predictive model”;  it is essentially a predictive model, but it need not handle “information leakage” type issues the way a predictive model should.

Q: Does it need to be “causal” model, and not a “correlational” model?

A: Define causal for me.  Specifically, do people know what they mean by “correlational” model?  Do they know multivariate models and dependence concepts?

Q: I assume we know.  Causal vs. correlational are just common sense concepts to help us make the discussion around “model” more precise …

A: But neither is a more precise concept than statistical modeling language.  Even philosophers themselves have begun to use statistical modeling language to clarify their “causal” frameworks …

Q: Now I am confused.  Where are we right now?

A: We are discussing statistical models and attribution modeling …

Q: Ok, should we use statistical models when we do attribution?

A: We have to.  Quantifying the impact of certain actions on conversion should be the foundation of any valid attribution process;  there is no more precise way to do that than developing solid statistical models of conversion behavior!

Q: Not even experimental design?

A: Not even that.

Q: But what is the right statistical model?  Some types of regression models or some Bayesian models or Markovian models?

A: It does not have to be any one of them, and yet, any one of them may do the job.

Q: If that is true, how can one justify the objectivity of the model?

A: A conversion model provides the basis for what reality looks like – to our best knowledge at the moment.  There can be different types of statistical methodologies to model the conversion behavior, and that does not create problems with the objectivity of the model output.  We have seen this in marketing response models,  where the modelers have the freedom to choose whatever methodology (type of models) they deem appropriate and yet it does not compromise the objectivity of its results.

Q: But attribution is different;  when building marketing response models, what is important is the score, not the coefficients or any “form factors” of the model.  In attribution, those form factors, not the scores, are central to deriving the attribution formula.

A: That’s exactly the problem that needs to be corrected. Attribution formula should NOT be built on the “form factors” of the conversion model, but rather on the scores of the conversion models!

Q: Explain more …

A: If you can’t claim that the linear regression model IS the only right model for conversion behavior, you can’t claim that the regression coefficients, the “form factors” of the regression model, are intrinsic to the conversion behavior.  Thus, any attribution formula built on top of them cannot be justified.

Q: And the conclusion, in simpler language …

A: A conversion model is needed for attribution, but the attribution model is not the conversion model.  The attribution model should be built on top of the “essence” of the conversion models, i.e. the scores, not the form factors. Attribution modeling is the process of translating conversion modeling results into an attribution model.

Q:  What is that saying about the offering from current vendors?

A: They often tell us that they build conversion models, but reveal nothing about their attribution modeling methodology.

Q: What if they say they are hiding their proprietary attribution technology in a Black Box?  Are they just covering up the fact that there is nothing in there, and that they do not know how?

A: Anyone’s guess.  The bottom line is, anyone claiming anything should acknowledge their audience’s right to doubt.

Q: It is common to see companies hiding their predictive modeling (or recommendation engine technology) in Black Box … why not attribution?

A: Predictive modeling, or even recommendation modeling, are things that can be externally tested and verified.  You can take two predictive models’ scores and test which one has more predictive power without knowing how the models were built.  Attribution modeling is different;  you have to make explicit how and why your way of allocating credit is justified – otherwise, I have no way of verifying and validating your claim.

Q: We are not in the faith business …

A: Amen.

Q: Ok, big deal.  I am an advertiser, what should I do?

A: Demand that anyone selling you attribution products/services show you their attribution methodology.  It is ok if they hide the conversion model part, but do not compromise on the attribution modeling.

Q: I am in the vendor business, what should I do?

A: Defend yourself – not by working on defensive rhetoric, but by building and presenting your attribution modeling openly.

Q: If I am an agency, what should I do?

A: Attribution should live inside the agency. You can own it, or rent it;  you should not be fooled by those who would like to make you think attribution modeling is a proprietary technology –  it is not.  Granted, you are not a technology company, but attribution modeling is not a proprietary technology.  If you have people who can build conversion models, you are right up there with the “proprietary” attribution vendors.

Q: If attribution modeling becomes an “open” methodology, what about those attribution vendors?  What will they own, and why wouldn’t advertisers and agencies build it themselves?

A: That’s my question too :)

Q: Are vendors going to be out of business?

A: Well, they can still own the conversion modeling part of it … and there are still predictive modeling shops out there, in business …

Q: Somehow, you sound like you know something about this “open secret” already :)  Can you share a little on that?

A: Can we talk tomorrow? I need to leave for this “Attribution Revolution” conference tonight …

March 6, 2013

The difficult problems in attribution modeling

Filed under: misc — Tags: — Huayin Wang @ 10:58 pm

The term “attribution modeling” can have different meanings to different people – it is sometimes used interchangeably with “attribution model”. To me, an attribution model refers to things like “last touch”, “first touch”, etc. – rules that specify how attribution should be done. Attribution modeling is the process by which an attribution model is generated.  Attribution modeling gives us the model generation process, as well as the reasons and justification for the attribution model being derived.

It is not difficult to come up with an attribution model; in fact, we can make one up in seconds. What is difficult is to determine which one is the right attribution model. Despite all the discussion and progress made over the last few years, there is no consensus about it. And the lack of industry consensus really hurts.

The question about the right attribution model is perhaps misguided, for we all know that a model that is right for one business, say e-commerce, may be wrong for another, such as B2B. What is right may also depend on the type of campaign, the type of conversion and even the type of user (male vs. female, adults vs. teens).  The right question should be: what is the right attribution modeling – the right process by which an attribution model is generated?

Each one of us can easily list the 4 or 5 most commonly used attribution models. What about attribution modeling? By how many different processes can an attribution model be produced?

Last Click/Last Touch attribution models are examples where intuition is the modeling process.  It is not data driven.  You can argue about the good and the bad conceptually.  The data-driven approach, on the other hand, holds the fundamental belief that the right attribution model should be derived from data.  Within the data-driven approach, there are two slightly differing schools: experimental design vs. algorithmic attribution.

You may ask, what about Google’s Attribution Modeling Tool in Google Analytics? It is not really an attribution modeling tool in my use of the word; it helps you specify attribution models, not create data-driven ones. It does not tell you how to derive the “right” attribution model.

The data-driven approach is what we will focus on here. There has been great progress in the “algorithmic attribution” approach, and significant businesses are built on it (Adometry and VisualIQ, to name a couple).  However, none is clear and transparent enough about their key technologies – as an industry, we are left with a lot of confusion.

The set of difficult problems is about exactly that – the core technology of attribution modeling. We need to answer these questions so we can build upon common ground and move on.  Here’s a list of the questions/problems:

1) Is attribution modeling the same as statistical conversion modeling?

2) What’s the right type of models to use: predictive modeling, descriptive modeling, causal modeling?

3) Does it matter if the model is linear regression, logistic regression or some Bayesian network model?

stay tuned for more.

September 17, 2012

The state of the attribution business (in marketing/advertising)

Filed under: misc — Tags: , , — Huayin Wang @ 8:01 pm

I had some chats with friends and colleagues lately on the topic of attribution models; the shared feeling is that there is a glaring contrast between the growing number of vendors and the utter lack of clarity and consensus on approaches. To put this more concretely:

We all know that last click is wrong, but we do not know what to do about it.

We do not know which approach is right and why (and why the others are not) – is it algorithmic attribution? Or is it experimental design?

Why does the right or wrong attribution model matter and to whom? What does it have to do with optimization, and how?

We are very confused about Marketing Mix Modeling vs. attribution – is one more right than the other? Is one enough?

And finally: who should build attribution services and who should own it?

If you think the current state of attribution is clearer than the above picture, please make your voice heard.  If you know the answer to any of the above questions, even better! You are more than welcome to share your thoughts and enlighten us, right here!

Thanks!

 

 

June 6, 2012

The professed love of data science

Filed under: Business — Tags: , — Huayin Wang @ 8:01 pm

It seems everyone has fallen in love with Big Data, Data Science and Data Scientists lately;  yet there are not a lot of good stories outside the few poster-child start-ups.  It reminds me of an ancient Chinese story and a well-known idiom:  Professed Love of What One Really Fears.

In the Spring and Autumn period (770-476 BC), there lived in Chu a person named Ye Zhuliang, who addressed himself as “Lord Ye”.

It’s said that this Lord Ye was very fond of dragons – the walls had dragons painted on them; the beams, pillars, doors and windows were all carved with them. As a result, word of his love for dragons spread.

When the real dragon in heaven heard of Lord Ye, he was deeply moved. He decided to visit Lord Ye to thank him. You might think Lord Ye would be very happy to see a real dragon. But actually, at the very sight of the creature, he was scared out of his wits and ran away as fast as he could. From then on, people knew that Lord Ye only loved pictures or carvings that look like dragons, not the real thing.

May 24, 2012

The Principles of Attribution Model

Filed under: attribution analytics — Tags: , , — Huayin Wang @ 7:36 pm

(Disclaimer:  some questions and answers below are totally made up,  any resemblance to anything anyone said is purely coincidental)

How do we know an attribution model, such as Last Click Attribution, is wrong?

  • it is incorrect – surprise surprise, a lot of people just make the claim and are done with it
  • it does not accurately capture the real influence a campaign has on purchase – but how do you know that?
  • it only credits the closer – isn’t this just a re-statement of what it is?
  • it is unfair to the upper funnel and only rewards the lower funnel – are you suggesting that it should reward all funnel stages, and why?
  • it leads to budget mis-allocation so your campaign is not optimized – how do you know?
  • it is so obvious, I just know it – what?

How do we know an attribution model, such as an equal attribution model, is right?

  • it is better than LCA – intuition?
  • it gives out different credits than LCA, so you can see how much mis-allocation LCA causes in your campaign – being different from LCA is not automatically right
  • we tested it and it generates better success metrics for the campaign – sounds good, how?
  • it is fair – what does that mean?

How do we find the right attribution model?

  • try different attribution models and test the outcome – an attribution model does not generate campaign outcomes directly
  • play with different models and see which one fits your situation better – how do I judge the fit?
  • use statistical modeling methodology to measure influence objectively – what models? conversion models?
  • use predictive models for conversion – why predictive models? what models? how do I calculate influence and credit from the models?
  • test and control experiments – how many tests and controls, and what formula do we use to calculate credit?
  • you decide, we allow you to choose and try whatever attribution weights you want – but I want to know which one is right
  • the predictive models help you with optimization; once we get that, you do not care about attribution – but I do care …
  • shh … it is proprietary: I won’t tell you or I will kill you! – ?

The Principle of Influence

Three principles are often implicitly used:  the “influence principle”,  the “fairness principle” and the “optimization principle”.

The influence principle works like this: assuming we can measure each campaign’s influence on a conversion, the correct attribution model will give credit to campaigns in proportion to their influence.  The second principle is often worded in terms of “fairness”, but is very much the same as the first:  if multiple campaigns contribute to a conversion, giving 100% credit to only one of them is “unfair” to the others.  The third principle, the optimization principle, in my understanding, is more about the application of attribution (or the benefit of it) than about the principle of attribution.

The principle of influence is the anchor of the three; the fairness and optimization principles are either softer versions or derivatives of it.

Now that we have our principle, are we close to figuring out the right approach to attribution modeling?  We need to look more closely at the assumption behind this principle.  Can we objectively measure (quantify) influence?  Are there multiple solutions, or just one right way to do this?

If the influence principle is the only justification of attribution models, then quantitative measurement methodology such as probabilistic modeling – sometimes called the algorithmic solution, which I think is a misnomer – will be the central technology to use.  It leaves no room for arguing on the ground of intuition alone.  Those who offer only intuition and experience, plus tools for clients to play with whatever attribution weights they want, are not attribution solution providers, but merely vendors of flexible reporting.

Those of the intuition-and-experience school like to frame attribution models around the order and position of touch points:  first/last/even and introducer/assist/closer. (How many vendors are doing this today?)  They have trouble providing a quantitative probabilistic solution to the attribution issue.  The little-known fact is that this framing is analytically flawed:  the labels “last touch” and “closer” are only known post-conversion, and are therefore not usable inside a probabilistic modeling framework.  In predictive modeling and data mining lingo, this is known as the “leakage problem”.  (Search on Google, or read Xuhui’s article that mentions this.)

Unfortunately, we have a problem with the data scientist camp as well, but of a different nature: the lack of transparency around metrics, models and process details.  Some vendors are unwilling to open up their “secret sauce”.  Perhaps, but is that all?  I will try to demystify and discuss the “secret sauce” of attribution modeling.

