A team of researchers examined 2,101 commercial experiments run with A/B splitting tools like Google Optimize, Mixpanel, Monetate and Optimizely, and used regression analysis to detect whether p-hacking had taken place. P-hacking is a statistical cheating technique that makes it look like you've found a valid cause-and-effect relationship when you haven't.
They found that 57% of experimenters were p-hacking by halting the experiment once it looked like their initial hypothesis was borne out, rather than completing the run and possibly discovering some disconfirming data.
The researchers hypothesize that the cheating is the result of poor statistical ability, bugs in the tools that encourage bad statistical practice, and a desire to please your boss, either by proving that you had an idea that was borne out by data, or by proving that your boss was right when they pronounced that things would work a certain way.
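To see why halting a test the moment it looks significant is a problem, here is a minimal simulation sketch. It is an illustration only, not the researchers' regression method: it runs A/A tests (both arms share the same true conversion rate, so any "win" is a false positive) and compares an experimenter who peeks after every batch with one who only checks at the planned end. The function names, the 10% conversion rate, the batch sizes and the 1.96 cutoff are all assumptions chosen for the example.

```python
import math
import random


def z_stat(conv_a, n_a, conv_b, n_b):
    """Two-proportion z statistic using a pooled standard error."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return 0.0 if se == 0 else (conv_b / n_b - conv_a / n_a) / se


def run_aa_test(rng, rate=0.1, batch=100, n_batches=20, z_crit=1.96):
    """Simulate one A/A test (assumed parameters, no real effect).

    Returns (peeking_significant, final_significant):
    - peeking_significant: True if ANY interim look crossed z_crit,
      i.e. an experimenter who stops at the first "win" would declare one.
    - final_significant: True if the test is significant at the planned end.
    """
    conv_a = conv_b = n = 0
    peeking_significant = False
    for _ in range(n_batches):
        conv_a += sum(rng.random() < rate for _ in range(batch))
        conv_b += sum(rng.random() < rate for _ in range(batch))
        n += batch
        if abs(z_stat(conv_a, n, conv_b, n)) > z_crit:
            peeking_significant = True
    final_significant = abs(z_stat(conv_a, n, conv_b, n)) > z_crit
    return peeking_significant, final_significant


def false_positive_rates(n_tests=1000, seed=0):
    """Share of A/A tests declared significant under each stopping rule."""
    rng = random.Random(seed)
    results = [run_aa_test(rng) for _ in range(n_tests)]
    peek_rate = sum(p for p, _ in results) / n_tests
    fixed_rate = sum(f for _, f in results) / n_tests
    return peek_rate, fixed_rate


if __name__ == "__main__":
    peek_rate, fixed_rate = false_positive_rates()
    print(f"stop at first significant look: {peek_rate:.1%} false positives")
    print(f"check only at planned end:      {fixed_rate:.1%} false positives")
```

Under these assumptions, the fixed-horizon rate should land near the nominal 5%, while the stop-at-first-win rate comes out several times higher, even though neither arm is actually better.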
The behavior of experimenters in our data seems to deviate from profit maximization. If the
experiments are run to maximize learning about effect sizes while ignoring short term profits, we
should not observe p-hacking that inflates FDRs. If, in contrast, experiments are run to maximize
profits, we should not observe experiments with larger effect sizes being terminated later, as this
prevents the most effective intervention from being rolled out quickly.
Finally, on a more positive note, we find that stopping an experiment early or late is not driven
solely by p-hacking. Specifically, we find a pronounced day-of-the-week pattern, a 7-day cycle in
the first 35 days, and a tendency to terminate sooner when observing effects small rather than large.
p-Hacking and False Discovery in A/B Testing [Ron Berman, Leonid Pekelis, Aisling Scott and Christophe Van den Bulte/SSRN]
(via Four Short Links)
(Image: http://www.beeze.de, CC-BY)