Causal Inference

The Alternative to Causal Inference is Worse Causal Inference

Some of the most important questions data scientists investigate are causal questions. They're also some of the hardest to answer! A well-designed A/B test often provides the cleanest answer, but when a test is infeasible, there are plenty of other causal inference techniques that may be useful. While not perfect, these techniques are much better than the alternative: ad hoc methods with no logical foundation.

Supervised Learning as Function Approximation

Supervised learning is perhaps the most central idea in Machine Learning. It is equally central to statistics, where it is known as regression. Statistics formulates the problem in terms of identifying the distribution from which observations are drawn; Machine Learning, in terms of finding a model that fits the data well.

Naive Comparisons Under Endogeneity

Recently I have been reading Causal Inference: The Mixtape by Scott Cunningham. One thing I think Cunningham explains very well is the role of endogeneity in confounding even simple comparisons. I don't have a background in economics, so I had never really grokked the concepts of endogenous and exogenous factors, especially as they relate to causal inference.

Violations of the Stable Unit Treatment Value Assumption

We have previously mentioned the Stable Unit Treatment Value Assumption, or SUTVA, a complicated-sounding term that is one of the most important assumptions underlying A/B testing (and Causal Inference in general).

Sprinkle some Maximum Likelihood Estimation on that Contingency Table!

Maximum Likelihood Estimation provides consistent estimators, and can be efficiently computed under many null hypotheses of practical interest.
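To make the excerpt's claim concrete, here is a minimal sketch (with made-up counts, not data from the post) of MLE under the null hypothesis of independence in a 2x2 contingency table: the MLE of each cell probability is just the product of the marginal proportions, which makes the expected counts, and hence a chi-square statistic, cheap to compute.

```python
import numpy as np

# Hypothetical 2x2 contingency table: rows = variant (A, B),
# columns = outcome (no conversion, conversion).
table = np.array([[430, 70],
                  [410, 90]])

n = table.sum()
row_p = table.sum(axis=1) / n   # MLE of the row marginal probabilities
col_p = table.sum(axis=0) / n   # MLE of the column marginal probabilities

# Under the null of independence, the MLE of each cell probability is
# the product of the marginal MLEs, so the expected counts are:
expected = np.outer(row_p, col_p) * n

# Pearson's chi-square statistic compares observed to expected counts.
chi2 = ((table - expected) ** 2 / expected).sum()
print(chi2)
```

The same marginal-product structure is what lets many practical null hypotheses be fit in closed form rather than by numerical optimization.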

Contingency Tables Part II: The Binomial Distribution

In our last post, we introduced the potential outcomes framework as the foundational framework for causal inference. In the potential outcomes framework, each unit (e.g. each person) is represented by a pair of outcomes, corresponding to the result of the experience provided to them (treatment or control, A or B, etc.).
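The setup described above can be sketched in a short simulation (hypothetical rates, not numbers from the post): each unit carries a pair of potential outcomes, random assignment reveals exactly one of them, and with binary outcomes the per-arm successes are binomial counts.

```python
import numpy as np

rng = np.random.default_rng(0)

# Each unit carries a pair of potential outcomes: y0 under control,
# y1 under treatment. Hypothetical conversion rates of 10% and 12%.
n = 10_000
y0 = rng.random(n) < 0.10
y1 = rng.random(n) < 0.12

# Random assignment reveals exactly one potential outcome per unit.
treated = rng.random(n) < 0.5
observed = np.where(treated, y1, y0)

# The true average treatment effect needs both outcomes per unit,
# so it is unobservable in practice; the simple difference in means
# across randomized arms estimates it.
ate_true = y1.mean() - y0.mean()
ate_hat = observed[treated].mean() - observed[~treated].mean()
```

Because assignment is random, the difference in observed means is an unbiased estimate of the average treatment effect, which is the point of the framework.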

Contingency Tables Part I: The Potential Outcomes Framework

"Why can't I take the results of an A/B test at face value? Who are you, the statistics mafia? I don't need a PhD in statistics to know that one number is greater than another." If this sounds familiar, it is helpful to remember that we do an A/B test to learn about different potential outcomes. Comparing potential outcomes is essential for smart decision making, and this framework is the cornerstone of causal inference.

A/B Testing

Calculators for planning and analyzing A/B tests

A/B Testing Best Practices

When I started this blog, my primary objective was less about teaching others A/B testing and more about clarifying my own thoughts on A/B testing. I had been running A/B tests for about a year, and I was starting to feel uncomfortable with some of the standard methodologies.

Optimal Experiment Design

We can plan sample sizes to control the width of confidence intervals.
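As an illustration of that idea, here is a small planning helper (a hypothetical function, not any particular library's API) that inverts the normal-approximation confidence interval for a difference of two proportions to find the per-arm sample size giving a desired half-width.

```python
import math
from statistics import NormalDist

def n_per_arm(p: float, half_width: float, conf: float = 0.95) -> int:
    """Smallest n per arm so a conf-level normal-approximation CI for
    the difference of two proportions (both near p) has at most the
    requested half-width."""
    # Two-sided critical value from the inverse normal CDF.
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    # The CI half-width is z * sqrt(2 p (1 - p) / n); solve for n.
    return math.ceil(z ** 2 * 2 * p * (1 - p) / half_width ** 2)

# e.g. a 1-percentage-point half-width around a ~10% baseline rate
print(n_per_arm(0.10, 0.01))
```

Note this targets interval width directly rather than power against a specific alternative, which is the distinction the post's title points at.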