Analytics Strategist

March 17, 2009

the wrong logic in attribution of interaction effect

Attribution should not be such a difficult problem – as long as reality conforms to our linear additive model of it. Interaction, sequential dependency, and nonlinearity are the main troublemakers.

In this discussion, I am going to focus on the attribution problem in the presence of an interaction effect.

Here’s the story setup: there are two ad channels, paid search (PS) and display (D).  

Scenario 1)
      When we run both (PS) & (D), we get $40 in revenue.  How should we attribute this $40 to PS and D?

The simple answer is: we do not know – for one thing, we do not have sufficient data.
What about making the attribution in proportion to each channel’s spending? You can certainly do it, but it is no more justifiable than any other split.

Scenario 2)
    When we run (PS) alone, we get $20 in revenue; when we run (PS) & (D) together, we get $40.
    Which channel gets what?

The simple answer is again: we do not know – we do not have enough data.
Again, a common line of reasoning is: (PS) gets $20 and (D) gets $20 (= $40 – $20). The logic seems reasonable, but it is flawed because it ignores any interaction between the two. The conclusion holds only under the assumption that there is no interaction between them.
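To make the "not enough data" point concrete, here is a minimal Python sketch. The function name `implied_interaction` and the guesses for display-alone revenue are hypothetical, chosen only for illustration; Scenario 2 gives no figure for (D) alone.

```python
def implied_interaction(ps_alone, both, d_alone):
    """Interaction implied by a guess for display-alone revenue.

    Uses the additive decomposition: both = ps_alone + d_alone + interaction.
    """
    return both - ps_alone - d_alone


PS_ALONE, BOTH = 20, 40  # the only two numbers observed in Scenario 2

# Hypothetical guesses for the unobserved display-alone revenue.
for d_alone in (10, 15, 20, 25):
    inter = implied_interaction(PS_ALONE, BOTH, d_alone)
    print(f"D alone = ${d_alone}: implied interaction = ${inter}")
```

Only the guess of $20 for (D) alone gives a zero interaction and reproduces the "$20 each" answer; every other guess fits the same two observations equally well.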

Scenario 3)
    When we run (PS) alone, we get $20 in revenue; running (D) alone gets $15 in revenue; running both (PS) & (D), the revenue is $40.
    Which channel gets what?

The answer: we still do not know. However, we can’t blame the lack of data anymore. This forces us to face the intrinsic limitation of the linear additive attribution framework itself.

Number-wise, the interaction effect is a positive $5, that is, $40 – ($20 + $15), and we do not know what portion of it should be attributed to which channel. The $5 is up for grabs for whoever fights harder for it – and usually, to nobody’s surprise, it goes to the powers that be.
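The arithmetic can be written as a small Python sketch. The function name `interaction_effect` is hypothetical, and, like the calculation above, it assumes revenue is $0 when neither channel runs, which the scenarios do not state explicitly.

```python
def interaction_effect(ps_alone, d_alone, both):
    """Revenue from running both channels minus the sum of the solo revenues."""
    return both - (ps_alone + d_alone)


# Scenario 3 figures from the post.
print(interaction_effect(ps_alone=20, d_alone=15, both=40))  # 5
```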

Does this remind anyone of how a CEO’s salary is often justified?

What happens when the interaction effect is negative, such as in the following scenario?

Scenario 4)
    When we run (PS) alone, we get $20 in revenue; running (D) alone gets $15 in revenue; running both (PS) & (D), the revenue is $30.
    Which channel gets what?
How should the $5 loss be distributed? We do not know.

What do you think? Do we have any way to justify a split other than invoking the “fairness” principle?

If the question is not answerable, the logic we use to answer it will be at best questionable, or plain wrong.

However, all is not lost. Perhaps we should ask ourselves a question: Why do we ask for it in the first place? Is this really what we needed, or just what we wanted? This was the subject of one of my recent posts: what you wanted may not be what you needed.
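As an illustration, the Python sketch below applies several allocation rules to Scenarios 3 and 4. The function `split_interaction`, the `ps_share` parameter, and the four rules are all hypothetical, made up for this example; none of them is endorsed by the data.

```python
def split_interaction(ps_alone, d_alone, both, ps_share):
    """Give each channel its solo revenue plus a share of the interaction.

    ps_share is the fraction of the interaction handed to paid search;
    it is a free choice that the data does not pin down.
    """
    interaction = both - (ps_alone + d_alone)
    ps = ps_alone + ps_share * interaction
    d = d_alone + (1 - ps_share) * interaction
    return ps, d


# (PS alone, D alone, both) for Scenarios 3 and 4.
scenarios = {"Scenario 3": (20, 15, 40), "Scenario 4": (20, 15, 30)}

# Hypothetical rules. "pro rata" splits by solo revenue (20 of 35 to PS),
# which is the same in both scenarios.
rules = {"all to PS": 1.0, "equal": 0.5, "pro rata": 20 / 35, "all to D": 0.0}

for name, (ps_alone, d_alone, both) in scenarios.items():
    for rule, share in rules.items():
        ps, d = split_interaction(ps_alone, d_alone, both, share)
        print(f"{name}, {rule}: PS = ${ps:.2f}, D = ${d:.2f}, total = ${ps + d:.2f}")
```

Every rule adds up to the observed total, so the totals cannot tell the rules apart; choosing one is a matter of assumption or negotiation, not of data.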


  1. These scenarios and intuition tell us that it is reasonable to expect that distinct marketing activities (e.g. search ad vs. display) each exert a main effect (on whatever the targeted event is, e.g. sales) and that the combination of simultaneous/overlapping marketing activities also exerts a joint effect. It is also clear from these scenarios that the joint effect is not necessarily the sum of the main effects; it can be more or less, and it can be non-linear.

    If multiple marketing activities never/rarely coincided, it might not be necessary to factor in the joint effect when trying to understand what activities customers were responding to. However, this is seldom the case. Multiple marketing activities overlap by design, in part because there is an expectation that there will be a synergistic response. The trick then is to know how the synergy works so that it can be taken advantage of in the most profitable way possible.

    Formulating the pathway by which this synergistic effect manifests itself is, I believe, at the heart of this evolving story.

    Until the next chapter…


    Comment by Satindra Chakravorty — March 18, 2009 @ 3:02 am

  2. It was actually a simple point that I failed to communicate clearly. My point is: there is no data-driven, justifiable attribution of the interaction effect. Still, people try to find ways to do it, using all kinds of logic, which I believe is misguided if not wrong.

    Comment by huayin — March 18, 2009 @ 3:07 am

  3. When the interaction effect is positive, it is called synergy; when negative, it is called cannibalization. In my experience, I can easily convince people that there is no right way to partition the interaction effect. Still, the same person will turn around a second later and ask for the attribution percentages for each of his marketing channels.
    I do not know how to get my point across. Do you?

    Comment by huayin — March 18, 2009 @ 3:28 am

  4. There are data-driven methods, Kenshoo, Atlas and Clear Sale among them. The key is WHO sets up the weightings / parameters. If (PS) sets them, then the last click will dominate, and if (D) sets them, then guess what?

    What you need is a channel-independent team running the numbers and crunching them to provide insight that will lead to the best mix possible.

    Comment by ppc_guru — March 20, 2009 @ 2:23 pm

  5. PPC guru,
    Thanks for the comments.
    You are right in saying that “the key is WHO sets up the weightings / parameters” — IF you want to know what the weightings (attribution percentages) will be.
    However, the point that I perhaps failed to communicate is that there is NO right way to do this when an interaction effect is present, no matter who is doing it, and that includes your “channel independent team” as well.
    I could be wrong though – I need someone to hit me with clearer arguments to change my mind.

    Comment by huayin — March 20, 2009 @ 3:50 pm

  6. I agree with your analysis. The problem is in fact underidentified with data alone. Without theory (in the form of a parametric model), there is really no way to come up with inferences. As it is, the inferences will be conditional on the assumptions and structure underlying whatever parametric model is chosen.

    It is perhaps best to make the assumptions as simple, modest, common-sense, and transparent as possible. I think of such assumptions and model forms as “conservative”, and sometimes they just show the bounds of interpretation, rather than a definitive answer.

    That way, someone with a different set of assumptions can make their case … the process of knowledge generation (even about attributions and interactions/synergy) should be empirical, and hopefully can minimize the ever-present power politics.

    Comment by Charlie H — November 4, 2009 @ 8:48 pm
