I recently read a fascinating paper written by some folks at Microsoft called “Online Controlled Experiments at Large Scale.” Skip to the end if you want a link to the paper.
The paper’s topic was how Microsoft has scaled its testing and optimization program on the Bing search engine. It was written with something of an engineering bent, so it wasn’t 100% relevant to my world. The kinds of optimization tests I design and conduct are neither automated nor conducted at such massive scale.
However, the authors laid out 3 “Testing Tenets” that they felt were instrumental to the development (and scaling) of their testing program. These simple rules were fantastic to read and 100% relevant to the world of Marketing Optimization and testing!
I enjoyed them so much that I’m going to summarize them for you, with commentary on why you might want to steal them ;-).
Without further ado, here they are: Bing’s 3 Testing Tenets.
Tenet 1: The Organization wants to make data-driven decisions and has formalized the Overall Evaluation Criterion
This tenet is a bit wordy, IMO, but what it essentially means is that in order to have a successful Optimization program, the (Marketing) organization needs to want to be data-driven. This sounds like “table stakes,” and it is, but seldom will you find a Marketing organization where the culture of data-driven decision making has taken hold 100%.
The fact is that many organizations are getting into Optimization not because they want to be data-driven and scientific about their marketing, but because it has become a trendy thing to do. These programs will not be successful.
They will falter because the results and insights of experiments must be acted upon in order to be worthwhile. Too often I see Optimization specialists deliver statistically significant, improved customer experiences via testing only to see their “winning version” never get implemented because to do so would prove someone’s intuition wrong, go against “creative guidelines,” or cross a political boundary.
In these cases, the marketers in charge don’t want to be data-driven; they are merely paying lip service to the data-driven philosophy. In other words, if you’re not ready to “walk the walk,” don’t invest heavily in Optimization.
The second part of this testing tenet refers to having agreed-upon, formalized success criteria for individual tests, as well as for the testing program at large.
Again, I often see clients dive head-first into running experiments without sorting out what they’re truly trying to achieve. Any tests with the stated goal of “brand engagement” or “improved customer experience” are probably half-baked initiatives.
There’s nothing inherently wrong with “brand engagement” or “improved customer experience,” but these concepts need to be boiled down to measurable KPIs as part of a hypothesis that can be proven or disproven.
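To make that “boiling down” a little more concrete, here is a minimal Python sketch of writing a hypothesis down before the test launches. Everything in it is my own illustration rather than anything from the Bing paper: the `TestHypothesis` class, its field names, and the example numbers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TestHypothesis:
    """Hypothetical record of a test's success criterion, agreed on up front."""

    goal: str                 # the fuzzy business goal, e.g. "brand engagement"
    kpi: str                  # the measurable KPI that stands in for that goal
    baseline_rate: float      # current KPI value for the control experience
    min_relative_lift: float  # smallest relative lift worth acting on

    def target_rate(self) -> float:
        # The KPI value the challenger must reach for the test to count as a win.
        return self.baseline_rate * (1 + self.min_relative_lift)

    def statement(self) -> str:
        # A plain-language hypothesis that the data can support or contradict.
        return (
            f"To improve {self.goal}, the new experience will raise {self.kpi} "
            f"from {self.baseline_rate:.1%} to at least {self.target_rate():.1%}."
        )


# Illustrative numbers only.
hypothesis = TestHypothesis(
    goal="brand engagement",
    kpi="newsletter signup rate",
    baseline_rate=0.04,
    min_relative_lift=0.10,
)
print(hypothesis.statement())
```

The point is less the code than the discipline: if you can’t fill in the `kpi` and `min_relative_lift` fields, the test probably isn’t ready to run.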
Tenet 2: Controlled experiments can be run and their results are trustworthy
The second tenet again sounds like ‘stating the obvious,’ but it should not be overlooked or taken lightly. Not having this 2nd tenet locked down can quickly ruin a testing program because of a lack of trust in the data.
This is no different than the reports coming out of a web analytics platform like Webtrends, Google Analytics, etc. If the data comes into question, it’s a very serious thing that needs to be addressed ASAP.
The organization needs to have confidence that the experiments being run are “controlled,” meaning that they’re properly designed, instrumented, conducted, and analyzed. It takes a skilled team to make sure that the hypotheses are sound, the testing platform works as intended, the tests are set up correctly, and the analysis done post-test is rigorous.
Miss any of these targets, and you could end up reporting false information to business stakeholders. If they make business decisions based on false data, it will mean more than some professional “egg on your face.” It could mean the death of the Optimization program.
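As one small example of what “rigorous” post-test analysis can look like, here is a sketch of a standard two-proportion z-test for comparing the conversion rates of a control and a challenger. This is a textbook technique that I’m using for illustration, not the method described in the Bing paper, and the function name and sample numbers are my own assumptions.

```python
import math


def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int):
    """Return (z, two-sided p-value) for the difference in conversion rates.

    conv_a / n_a: conversions and visitors for the control (A)
    conv_b / n_b: conversions and visitors for the challenger (B)
    """
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate under the null hypothesis that A and B convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        raise ValueError("No variation in the data; the test is undefined.")
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Illustrative numbers only: 200 of 1,000 visitors convert on A, 260 of 1,000 on B.
z, p = two_proportion_z_test(200, 1000, 260, 1000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A low p-value alone does not make a result trustworthy. It only helps if the test was also designed, instrumented, and run correctly in the first place.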
The various validity threats to experimentation are a topic worthy of a separate post. I may or may not post on this in the near future, but Marketing Experiments has explored this topic at some depth here [PDF].
Tenet 3: We are poor at assessing the value of ideas
This 3rd tenet is somewhat related to the first, but it indicates that in order to be successful in testing and optimization, you must be humble enough to admit that you (and your organization) are generally lousy at predicting outcomes. This is extra true when it comes to your fickle customers and prospects, right? 😉
The Bing paper quotes several industry professionals from companies like Google, Quicken, and Netflix, along with their humbling statistics on how often their companies are wrong about what customers really respond to and/or what will increase business KPIs.
It seems to be a part of human nature to trust “authority,” so when a Product Manager, or Usability expert, or Optimization guru, or HiPPO (Highest Paid Person’s Opinion) tells us that a design or offer or feature will resonate, we actually go into the experiment subconsciously biased! Admitting this tendency to be biased is what tenet #3 is all about.
I joke that I’m right 51% of the time about test outcomes, and that qualifies me as an “Expert.” All testing experts are “wrong” about test results on a regular basis, and it’s their lack of fear about being “wrong” that makes them successful…and experts.
Conclusion
I hope you’ll converse with me about these 3 testing tenets, and share them if you like them. Here’s a link to the paper, if you really want to geek out: http://www.exp-platform.com/Documents/2013%20controlledExperimentsAtScale.pdf