Analytics Strategist

May 20, 2013

Hi Mr. Wanamaker, only half of your money is wasted?

Filed under: Advertising, metrics — Huayin Wang @ 6:09 pm

Everyone working in the advertising industry, or related fields, has probably heard of the famous Wanamaker Quote: “Half the money I spend on advertising is wasted; the trouble is I don’t know which half.”

What he said seems obvious at first; read a little deeper, however, and it becomes problematic.  Below are a few related points:

a) The waste may not be 50%; it may in fact be as high as 99%

Let's begin by asking: how did he estimate the advertising waste?  Can someone know the amount of waste without being able to identify which part it is?

There are two ways to estimate media waste.  The first involves breaking advertising campaigns down into different tactics and identifying the ineffective ones.  The tactics can vary by audience attribute (age, gender, behavior), geography, creative, etc.  Take gender as an example: your Male audience may be twice as effective as your Female audience, so you treat the Female tactic as waste.  The problems with this methodology: the ineffective tactics are not all "wasted", and the effective tactics contain waste too.  The estimate is also quite subjective, since it depends not only on how you define "effective", but also on how you break the campaign down into "tactics".

The second way of estimating waste, the only defensible one in my view, relies on counting outcomes directly.  Take a direct response campaign as an example: if conversion is the outcome, then money spent without resulting in a conversion is wasted.  If display ads reached 30 million users and only 3,000 converted, then the spend on the other 99.99% of users is wasted.  The actual waste can be even higher, considering that the ads shown to the converters may themselves have been ineffective and should not be counted as incrementally effective.
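As a quick back-of-the-envelope illustration of this outcome-counting view, here is a minimal sketch in Python, using the hypothetical reach and conversion numbers from the example above (the function name is mine, purely for illustration):

```python
# Outcome-based waste estimate: spend on reached users who never convert
# is counted as waste. Numbers are the hypothetical ones from the example.
def waste_share(users_reached: int, converters: int) -> float:
    return 1 - converters / users_reached

print(f"{waste_share(30_000_000, 3_000):.2%}")  # 99.99% of reached users never converted
```

Even before discounting the non-incremental ads shown to converters, the waste is already well above the 99% mark.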

b) It is not just about measurement alone, but about the granularity of the underlying measurement

Knowing how to measure the waste, the next question is: how to solve the waste issue?

The common (traditional, offline) scheme is to define a target audience first, then follow up with a "waste" measurement defined as media delivered outside of that target audience.  This practice ignores the waste inherent in the definition of the target audience itself.  Ages 20-34 may be five times as likely to convert as others and therefore a valid target audience.  However, if the average conversion rate is 1%, then the conversion rate for this target audience is only 5% – which means 95% of it is waste as well.

Creating different targeting tactics and measuring them does not necessarily address the issue of waste! I am horrified to see how many people believe that bringing offline GRP metrics online solves the display advertising waste problem.  Age and gender data do not generate tactics that are waste-free.  You need higher-dimensional data to create and identify much more granular audience and context/creative groupings in order to truly combat the advertising waste problem!

 

Are GRP metrics the cure for online advertising waste?  I do not think so.  In fact, I think they will do more harm than good.

c) Targetability is key, but often ignored

To keep this writeup short, I will make the point very briefly: without event-level targeting, we are not going to solve the waste problem; in fact, we are not even facing it squarely.  If nothing else, a media transaction mechanism at the most granular level, such as the RTB implemented in ad exchanges today, is necessary.

April 5, 2013

Bid quality score: the missing piece in the programmatic exchange puzzle

Filed under: Ad Exchange, Game Theory, Matching game, misc, Technology — Huayin Wang @ 7:45 pm

On the eve of the Programmatic IO and Ad Tech conferences in SF, I want to share my idea for a new design feature for Exchanges/SSPs, a feature that has the potential to significantly impact our industry. This feature is the bid auction rule.

The bid auction rule is known to be central to Search Engine Marketing.  Google's success story underscores how important a role it can play in shaping the process and dynamics of a marketplace. There is reason to believe it has similar potential for the RTB exchange industry.

The auction model currently implemented in ad exchanges and SSPs is commonly known as a Vickrey auction, or second-price auction. It goes like this: upon receiving a set of bids, the exchange decides on a winner based on the highest bid amount, and sets the price to the second-highest bid amount.  In the RTB bid process diagram below, this auction rule is indicated by the green arrow #7:

RTB bidding process

(I am simplifying the process a lot by removing details that are non-essential for our purpose, e.g. ad servers.)

The new auction I'd like to propose is a familiar one: it is a modified Vickrey auction with a quality score!  Here, the bid quality score is defined as the quality of an ad to the publisher, aside from the bid price.  It essentially captures everything a publisher may care about in an ad. I can think of a few factors:

  1. Ad transparency and related data availability
  2. Ad quality (adware, design)
  3. Ad content relevancy
  4. Advertiser and product brand reputation
  5. User response

Certainly, the bid quality scores are going to be publisher-specific.  In fact, they can be made site-section-specific or page-specific.  For example, a publisher may have a reason to treat the home page of its site differently than other pages.  They can also vary by user attributes if the publisher likes.

Given that, the Exchange/SSP will no longer be able to carry out the auction all by itself – as the rule no longer depends only on bid amounts.  We need a new processing component, as shown in the diagram below.

new-design

Now, #7 is replaced with this new component, called the Publisher Decider.  Owned by the publisher, the decider works through the following steps:

  1. it takes in multiple bids
  2. calculates the bid quality scores
  3. for each bid, calculates the Total Bid Score (TBS), by multiplying bid amount and quality score
  4. ranks the set of bids by the TBS
  5. makes the bid with highest TBS the winner
  6. sets the winning bid's price based on the formula below, as made famous by Google

P1 = (B2 × Q2) / Q1

Here, P1 is the price set for the winning bid, Q1 is its bid quality score, B2 is the bid amount of the bid with the second-highest TBS, and Q2 is the bid quality score of that second-highest bid.
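To make the mechanics concrete, here is a minimal sketch of the Publisher Decider logic under the rules above; the `Bid` structure, the function name and the example numbers are all mine, purely for illustration, not an existing API:

```python
from dataclasses import dataclass

@dataclass
class Bid:
    bidder: str
    amount: float   # bid price (CPM)
    quality: float  # publisher-specific bid quality score

def publisher_decider(bids):
    """Rank bids by Total Bid Score (TBS = amount * quality); the winner pays
    just enough to beat the runner-up's TBS: P1 = B2 * Q2 / Q1."""
    ranked = sorted(bids, key=lambda b: b.amount * b.quality, reverse=True)
    winner, runner_up = ranked[0], ranked[1]
    price = runner_up.amount * runner_up.quality / winner.quality
    return winner, price

bids = [
    Bid("premium brand", amount=1.00, quality=5.0),     # low CPM, high quality
    Bid("potential adware", amount=5.00, quality=0.9),  # high CPM, low quality
]
winner, price = publisher_decider(bids)
print(winner.bidder, f"{price:.2f}")  # premium brand wins and pays 5.00 * 0.9 / 5.0 = 0.90
```

Note how the low-quality bidder's higher CPM is not enough to win once quality scores enter the ranking – exactly the dynamic discussed below.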

This is not a surprise and it’s not much of a change. So, why is this so important?

Well, with the implementation of this new auction rule, we can anticipate some natural consequences:

  • Named brands will have an advantage on bid price, because they tend to have better quality scores. A premium publisher may be willing to take a $1 CPM bid from Apple over a $5 CPM bid from a potential adware vendor.  This would be achieved by Apple having a quality score 5 times or more higher than that of the other, crappy ad.
  • Advertisers will have an incentive to be more transparent. Named brands will be better off being transparent, to distinguish themselves from others. This will drive the quality scores of non-transparent ads lower, starting a virtuous cycle.
  • DSPs or bidders will have no reason not to submit multiple bids, since they won't be able to know beforehand which ad will win.
  • Premium publishers will have more incentive to put their inventory into the exchange now that they have transparency and a finer level of control.
  • The ad tech ecosystem will respond with new players, such as ad-centric data companies serving publisher needs, similar to the contextual data companies serving advertisers.

You may see missing links in the process I described here.  That is expected, because a complete picture is not the focus of this writing.  I hope you are convinced that the bid quality score / Publisher Decider is interesting, and that it can have a significant impact by pushing the ad tech space in the direction of more unified technologies and a consistent framework.

April 11, 2012

Funny analogies of wrong attribution models

Few topics are as near and dear to my heart as Attribution Modeling.  I first bumped into it more than 4 years ago; my first written piece on attribution was a LinkedIn Q&A answering a question from Kevin Lee on duplication rate (in August 2007).  Since then, my interest in attribution got seriously deep, resulting in a dozen attribution-related blog posts.  The interest never died, although I have not written anything in the last three years.

I am back on it with a vengeance! Consider this my first post back.

I want to start on a gentle note though.  I am amused that people are still debating First Touch vs. Last Touch attribution as viable attribution models; both are out of the question in my opinion.  I want to share some funny analogies for what can go wrong with them.

Starting with the Last Touch Attribution model, a football analogy goes like this: "relying solely on a last click attribution model may lead a manager to sack his midfielder for not scoring any goals. Despite creating countless opportunities he gets no credit as his name isn't on the score-sheet. Similarly a first click attribution model may lead the manager to drop his striker for not creating any goals, despite finishing them." – BrightonSEO presentation slides

There are a lot of good analogies like this derived from team sports.  This one applies not only to Last Touch, but to all single-touch-point attribution models.  The funniest one I have heard is about First Touch attribution, from none other than the prolific Avinash Kaushik: "first click attribution is like giving his first girlfriend credit for his current marriage." – Avinash quote

An analogy is just an analogy; it does not do full justice to what's being discussed.  However, we should learn at least this much: if your attribution model is based solely on the sequencing order of touch points, you are wrong.  Those who propose Last, First, Even, Linear or whatever attribution models, watch out!

A good attribution model needs a disciplined development process, and better yet, a data-driven one.  The fewer assumptions made about the values of touch points, the better – we should learn to let the empirical evidence speak for itself.

Do you have any interesting analogy, or thought?

November 16, 2011

Ad exchange, matching game and mechanism design

Over the years, I have learned some interesting things in this new ad:tech industry, particularly around the Ad Exchange and RTB ad auction market model.  I want to share some of my thoughts here and hope you find them interesting to read.

Ad Exchange is not like a financial exchange

The "exchange" in the name is suggestive of a financial stock exchange, and interesting observations can be made based on this analogy.  However, there are some fundamental differences, such as the lack of liquidity in ad impressions and information asymmetry.  Jerry Neumann has blogged about this topic extensively; it is still a topic of great interest today, as seen in a recent article by Edward Montes.

In fact, the differences are easy to understand. The harder part is, as Jerry asked, if ad exchanges aren't exchanges, what are they?  Or, I should add, what should they be like?

Publisher’s preference is the missing piece

The analogy with financial exchanges (stocks and futures) is not a good one, partly because of its inability to fully model advertiser preference. Not all impressions are of the same value to an advertiser, and not all advertisers assign the same value to an impression. The commodity auction model embedded in ad exchanges does better, because it allows advertisers to bid based on any form of evaluation – a chance for advertisers to fully express their preferences over audience, contextual content and publisher brand.

Still, there is a problem with the current auction model: after collecting all the bids from advertisers, it takes the highest bidder as the winner, as if price were the only thing publishers care about.  In reality, not all bids at the same price are of the same value to a publisher.  Publishers care about brand safety and contextual relevancy as well; in fact, the quality of the user experience may mean more to publishers than to advertisers!  In sum, publishers care about the quality of the ads above and beyond the bid price.  Unfortunately, the current ad exchanges lack a proper mechanism allowing publishers to articulate their full preferences.  This results in a loss of market efficiency and lost opportunities to remove transaction frictions.  This is a design flaw.

The display marketplace is still far from perfectly efficient, and this design flaw does not help.  The recent developments in Private Marketplaces are piecemeal attempts to overcome this design issue.  Some recent merger and acquisition attempts can also be understood from this angle.

Where can we look for design ideas on how to handle this issue?  Paid search and game theory!

The quality score framework from paid search

In many ways, paid search is just like an ad exchange, with Google playing one of a few "publisher" roles.   In both markets, advertisers compete for ad views through auction bidding;  if we equate audience in display to keywords in search, then the bidding processes are quite similar:  search advertisers do extensive keyword research and look at past performance alongside other planning parameters (such as time of day) to optimize their bids;  similarly, display advertisers look at audience attributes, site and page content, past performance and planning parameters as they perform bid optimization.

The bidding processes in both markets are similar;  the differences lie in the post-bidding ad evaluation.

After all bids are collected, ad exchanges today simply select the highest bidder.  In the case of paid search, bids are evaluated on price, ad relevancy and many other attributes.  Google has mastered this evaluation process with its Quality Score framework.  The difference between having a Quality Score framework and not having one is not a small thing.  As anyone familiar with the history of paid search knows, the quality score framework played a pivotal role in shaping the search industry when Google introduced it around the turn of the century.  Post-bidding ad evaluation for display may likewise be a critical piece of technology, with a potentially significant impact on the efficiency and health of the display market.

The need for non-trivial post-bidding ad evaluation calls for an extra decision process (and algorithm) to be added, either at the ad exchange or at the publisher's site, or both.  In this new model and with this extra component, the ad exchange would send the full list of bids to the publisher instead of picking a winner based on price alone.  It would then be up to the publisher to decide which ad is shown.  With millions of publishers, large and small, this seemingly small change may trigger much more inside an industry whose technology is already orders of magnitude more complex than paid search.

The matching game analogy

With full preferences taken into account on both the advertiser and publisher sides, an ad exchange looks less like a commodity marketplace and more like a matching game.  It is interesting to look at market efficiency from the perspective of mechanism design in game theory – another way of talking about the operational market process.

Matching advertisers with publishers under a continuous series of Vickrey auctions is our setup for the discussion – the best model I can think of that mimics the matching game setup;  it shouldn't be too surprising to anyone that the matching game is an interesting analogy for the ad exchange.  As a game theory abstraction of many practical cases, matching games include the college admission and marriage markets.  Let's take the marriage market as an example.

Using a simplistic description, a marriage market involves a set of men and a set of women.  Each man has a preference vector over the set of women (a ranking of women);  similarly, each woman has a preference vector over the set of men (a ranking of men).  A matching is an assignment of men to women such that each man is assigned to at most one woman and vice versa.  A matching is unstable if there exists a man-woman pair who are not currently matched to each other but who both prefer each other to their current partners – such a pair is called a blocking pair.  When no blocking pair exists, the matching is called stable.

Clearly, stability is a good quality: a stable matching is not vulnerable to any voluntary pairwise rematch (translating into ad exchange language, a stable matching is one in which no advertiser-publisher pair currently not matched to each other has an incentive to switch and form a new match).  A matching is male-optimal if every man does at least as well in it as in any other stable matching; female-optimal is defined similarly.  A stable matching that is both male-optimal and female-optimal would look like a perfectly efficient market; we would hope to find a mechanism that leads to such a unique stable matching – something we could then mimic in a future ad exchange model.

Unfortunately, there is no unique stable matching for a matching game in general (in this case, having too many good things may not be a good thing).  There is also no single matching that is optimal from both the men's and the women's perspectives.  We know that the male-proposing Deferred Acceptance Algorithm – somewhat like the current auction process in ad exchanges, with advertisers playing the male role – produces the male-optimal stable matching.  If we switch the roles of men and women, a similar algorithm produces the female-optimal matching.  The two algorithms/mechanisms lead to two distinctly different results.  You can read more about algorithmic/computational game theory, specifically matching games and mechanism design, if interested.
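For readers who want to see the mechanism itself, here is a minimal, textbook sketch of the male-proposing Deferred Acceptance Algorithm (plain Gale-Shapley, assuming complete preference lists and equal numbers on both sides; nothing here is specific to ad exchanges):

```python
def deferred_acceptance(men_prefs, women_prefs):
    """Male-proposing Gale-Shapley: returns the male-optimal stable matching.
    men_prefs / women_prefs map each person to an ordered list of the other
    side, most preferred first (complete lists, equal numbers assumed)."""
    rank = {w: {m: i for i, m in enumerate(prefs)} for w, prefs in women_prefs.items()}
    free_men = list(men_prefs)                 # men still without a partner
    next_choice = {m: 0 for m in men_prefs}    # index of the next woman to propose to
    engaged = {}                               # woman -> man

    while free_men:
        m = free_men.pop()
        w = men_prefs[m][next_choice[m]]
        next_choice[m] += 1
        if w not in engaged:
            engaged[w] = m                     # first proposal is accepted
        elif rank[w][m] < rank[w][engaged[w]]:
            free_men.append(engaged[w])        # w trades up, old partner is free again
            engaged[w] = m
        else:
            free_men.append(m)                 # w rejects m, he proposes again later

    return {m: w for w, m in engaged.items()}

men = {"a": ["x", "y"], "b": ["x", "y"]}
women = {"x": ["b", "a"], "y": ["a", "b"]}
print(deferred_acceptance(men, women))  # {'b': 'x', 'a': 'y'} – stable and male-optimal
```

Running the same algorithm with the roles of men and women switched produces the female-optimal stable matching, which in general is a different assignment.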

So, why are we looking into this, and what have we learned from it?  Below is my translation, or transliteration to be more appropriate, from game theory speak to the ad:tech domain.

We all like to believe that there is an efficient market design for everything, including the exchange marketplace for ads. Our belief is justified for commodity marketplaces by general equilibrium theory.  Unfortunately, there is no equivalent of a "general equilibrium" or universally optimal stable matching for a marriage market, which implies that there is no universally optimal advertiser-publisher matching in an ad exchange.  If this is the case, the search for an optimal market mechanism for ad exchanges is a mission impossible.

However, one-sided optimal matchings do exist: advertiser-optimal and/or publisher-optimal.  It is also easy to find the corresponding mechanisms that lead to those one-sided optimal stable matchings.  The auction market as currently implemented in ad exchanges, with the addition of a post-bidding evaluation process, is similar to the mechanism leading to the advertiser-optimal matching.

The future seems open for all kinds of good mechanism design.  Still, I believe there is a "naturalness" to the current style of auction market.  It is quite natural for the auction process to start from the publisher side, by putting the ad impression up for auction, because it all starts with an audience member requesting a webpage – a request sent to a publisher. It is not easy to imagine how an advertiser could set up a "reverse auction" starting from the demand side within an RTB context. We can never rule out the possibility, and it may potentially work for trading ad futures.

Conclusion:

I am reluctant to draw any conclusions – these are all food for thought and discussion.  I’d love to hear your comments!

April 28, 2009

Mining twitter data

Who was the first to report the Mexico City earthquake?  I remember watching Twitter second by second, and @cjserrato was the first one to report the earthquake (the tweet id is 1630381373):

 
mexico city

Mining Twitter data is a huge challenge.  So far I have not seen much interesting data/text mining or analytics around Twitter data.  I have been playing with the data lately, and here's a thematic/topic graph I made – a visualization of all tweets from the last eight hours that are related to "mexico city":

tweets of mexico city topic graph 

You can tell that "Swine Flu" is still at the center of all topics, whereas the earthquake is clustered alone to the side.
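The post does not spell out how the graph above was built, so as a purely illustrative sketch, here is one simple way to derive a term co-occurrence graph from a list of tweets (the sample tweets, stopword list and threshold are placeholders of my own, not the actual pipeline behind the visualization):

```python
from collections import Counter
from itertools import combinations

STOPWORDS = {"the", "a", "an", "in", "of", "to", "and", "is", "on", "this", "rt"}

def cooccurrence_graph(tweets, min_count=2):
    """Nodes are terms, edge weights count how many tweets mention both terms."""
    edges = Counter()
    for tweet in tweets:
        terms = {w.strip("#@.,!?").lower() for w in tweet.split()}
        terms = {t for t in terms if t and t not in STOPWORDS}
        for pair in combinations(sorted(terms), 2):
            edges[pair] += 1
    return {pair: n for pair, n in edges.items() if n >= min_count}

sample = [
    "Swine flu cases reported in Mexico City",
    "Earthquake shakes Mexico City this morning",
    "Mexico City on alert over swine flu",
]
print(cooccurrence_graph(sample))  # ('city', 'mexico') has weight 3; the 'earthquake' pairs fall below the threshold
```

A graph like this, fed into any force-directed layout, would show the same qualitative picture: heavily co-mentioned terms cluster in the middle, while thinly connected ones drift to the side.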

Have you seen any interesting Twitter analytics (by the way, I do not mean the Twitter metrics or counters, etc.)?

Jeff Clark of NeoFormix has a great set of applications, the best I have found so far.  FlowingData is another one.

March 17, 2009

the wrong logic in attribution of interaction effect

Attribution would not be such a difficult problem if reality conformed to our linear additive model of it. Interaction, sequential dependency and nonlinearity are the main troublemakers.

In this discussion, I am going to focus on the attribution problem in the presence of an interaction effect.

Here’s the story setup: there are two ad channels, paid search (PS) and display (D).  

Scenario 1)
      When we run both (PS) & (D), we get $40 in revenue.  How should we attribute this $40 to PS and D?

The simple answer is: we do not know – for one thing, we do not have sufficient data.
What about making the attribution in proportion to each channel's spend? You can certainly do that, but it is no more justifiable than any other split.

Scenario 2)
    when we run (PS) alone we get $20 in revenue;  when we run (PS) & (D) together, we get $40.
    Which channel gets what?

The simple answer is again: we do not know – we do not have enough data.
Again, a common reasoning is:  (PS) gets $20 and (D) gets $20 (= $40 – $20).  The logic seems reasonable, but it is still flawed, because it takes no account of the interaction between the two.  Of course, under the assumption that there is no interaction, this is the right conclusion.

Scenario 3)
    when we run (PS) alone we get $20 in revenue; running (D) alone gets $15 in revenue; running both (PS) & (D) the revenue is $40.
    Which channel gets what?

The answer: we still do not know. However, we can't blame a lack of data anymore.  This scenario forces us to face the intrinsic limitation of the linear additive attribution framework itself.

Number-wise, the interaction effect is a positive $5, i.e. $40 – ($20 + $15), and we do not know what portion should be attributed to which channel. The $5 is up for grabs for whoever fights for it harder – and, usually to nobody's surprise, it goes to the powers that be.

Does this remind anyone of how CEO’s salary is often justified?

What happens when the interaction effect is negative, such as in the following scenario?

Scenario 4)
    when we run (PS) alone we get $20 in revenue; running (D) alone gets $15 in revenue; running both (PS) & (D) the revenue is $30.
    Which channel gets what?
How should the $5 loss be distributed?  We do not know.
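To make the arithmetic of the two scenarios explicit, here is a tiny sketch; it computes the interaction effect but, deliberately, makes no attempt to split it between the channels:

```python
def interaction_effect(rev_ps_alone, rev_d_alone, rev_both):
    """Whatever the combined run earns beyond (or below) the sum of the solo runs."""
    return rev_both - (rev_ps_alone + rev_d_alone)

print(interaction_effect(20, 15, 40))  # Scenario 3: +5, no principled way to attribute it
print(interaction_effect(20, 15, 30))  # Scenario 4: -5, the same problem in reverse
```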

What do you think? Do we have any way to justify an attribution other than bringing out the "fairness" principle?

If the question is not answerable, the logic we use will be at best questionable, or plain wrong.

However, all is not lost. Perhaps we should ask ourselves: why do we ask for it in the first place? Is this really what we need, or just what we want? This was the subject of one of my recent posts: what you wanted may not be what you needed.

March 16, 2009

the new challenges to Media Mix Modeling

Among the many themes discussed in the 2009 Digital Outlook report by Razorfish, there is a strand linked to media and content fragmentation, the complex and non-linear consumer experience, and interaction among multiple media and multiple campaigns – all of which lead to one of the biggest analytics challenges: the failure of traditional Media Mix Modeling (MMM) and the search for better Attribution Analytics.

The very first article of the research and measurement section is on MMM. It has some of the clearest discussion of why MMM fails to handle today's marketing challenges, despite its decades of success.  But I believe it can be made clearer. One reason is MMM's failure to handle media and campaign interaction, which I think is not a modeling failure but rather a failure of the attribution purpose itself (I have discussed this extensively in my post: Attribution, what you want may not be what you need).  The interaction between traditional media and digital media, however, is of a different nature, and has to do with mixing push and pull media.  Push media influence pull media in a way that renders many of the modeling assumptions problematic.

Here’s its summary paragraph:

” Marketing mix models have served us well for the last several decades. However, the media landscape has changed. The models will have to change and adapt. Until this happens, models that incorporate digital media will need an extra layer of scrutiny. But simultaneously, the advertisers and media companies need to push forward and help bring the time-honored practice of media mix modeling into the digital era.”

The report limits its discussion to MMM, the macro attribution problem.  It does not give a fair treatment of the general attribution problem – there is no discussion of the recent developments in attribution analytics (called by many names, such as Engagement Mapping, Conversion Attribution, Multicampaign Attribution, etc.).

For those interested in the attribution analytics challenges, my prior post on the three generations of attribution analytics provides an in-depth overview of the field.

Other related posts: micro and macro attribution and the relationship between attribution and  optimization.

SIM: the brightest spot in the 2009 Digital Outlook report

Filed under: Advertising, Datarology — Huayin Wang @ 3:57 am

Social Media is superhot these days, so naturally I expected it to be a significant topic in Razorfish's Digital Outlook report; I was not disappointed 🙂

Social Influence Marketing (SIM) is one of the eight trends to watch; social object theory sits right in the middle of the report, followed by the secrets of powering SIM campaigns; the Pulse (tagged as one of the three things every CEO must know) and mobile are both connected to SIM in fundamental ways. Most importantly, in the research and measurement section (dearest to my heart), two out of three articles are about social influence measurement and research; both are excellent.

I am particularly fond of Marc Sanford's Social Influence Measurement piece.  Marc approaches the topic methodically, providing good conceptual lead-ins as well as a rigorous measurement framework. I enjoyed its evenly paced, matter-of-fact writing style.  Starting from a discussion of the two aspects of SIM – sharable content and people – and the Generational Tag technology, Marc makes clear the importance of separating the value created where campaigns touch consumers directly from the incremental value created when content passes through viral media, through the power of endorsement.  The methodology and technology parts of SIM come together nicely in the article.

The Social Influence Research piece by Andrew Harrison and Marcelo Marer is equally interesting. There is an excellent, detailed discussion of the challenges facing traditional survey and focus group research, and how it can evolve and adapt into a new form of social influence research.

March 14, 2009

Eight trends to watch: 2009 Digital Outlook from Razorfish

1. Advertisers will turn to “measurability” and “differentiation” in the recession

2. Search will not be immune to the impact of the economy

3. Social Influence Marketing™ will go mainstream

4. Online ad networks will contract; open ad exchanges will expand

     With Google's new interest-based targeting, things look to change even more rapidly.

5. This year, mobile will get smarter

6. Research and measurement will enter the digital age

     This is an issue dear to my heart, and I have written about the importance of Attribution Analytics and Micro and Macro Attribution many times in recent months; directly from the report:

    "Due to increased complexity in marketing, established research and measurement conventions are more challenged than ever. For this reason, 2009 will be a year for research reinvention. Current media mix models are falling down; they are based on older research models that assume media channels are by and large independent of one another. As media consumption changes among consumers, and marketers include more digital and disparate channels in the mix, it is more important than ever to develop new media mix models that recognize the intricacies of channel interaction."

7. “Portable” and “beyond-the-browser” opportunities will create new touchpoints for brands and content owners

8. Going digital will help TV modernize

Read the Razorfish report for details.

reading notes: 2009 Digital Outlook

Filed under: Advertising, business strategy, misc, reading — Huayin Wang @ 6:04 pm

With six hundred tweets from readers in 5 days, the 180-page 2009 Digital Outlook from Razorfish has certainly captured the attention of many working in marketing and advertising. It is an exciting read, and I will share a couple of my notes here.

Clark Kokich’s introduction sets up the story line really well.  

The opening paragraphs go directly to the key point:

 “I spent the first 30 years of my advertising career focused on saying things. What do we need to say to persuade people to buy our product or service? How do we say it in a unique and memorable way? Where do we say it? How much will it cost to say it? How do we measure consumer reactions to the things we say to them?”

"Now, after 10 years in the digital space, I find myself spending my time talking to clients about building things. What do customers need to make smart decisions? What applications do we need to build to satisfy that need? Where are our customers when they make a decision?"

He then describes the new role agencies need to play: ".. it's about the actual role they should be playing in setting business strategy, designing product and service offerings, delivering service after the sale, creating innovative distribution channels and developing new revenue models."

These are great insights.  Ad agencies are experts at creative messaging – "saying things"; the new challenge is about shifting the focus away from that and going beyond it. This is a tremendous challenge indeed, one that requires new skills and "deep collaboration between creative, technology, media, user experience and analytics".
