Randomization provides the "reasoned basis for inference" in an experiment. Yet some approaches to analyzing experiments ignore the special structure of randomization. Simple, familiar approaches like regression models sometimes give wrong answers when applied to experiments. Approaches exploiting randomization deliver more reliable inferences than methods neglecting it. Randomization inference should be the first method we reach for when analyzing experiments.
Adjusting for observed factors does not elevate an observational study to the reliability of an experiment. P-values are not appropriate measures of the strength of evidence in an observational study. Instead, sensitivity analysis allows us to identify the magnitude of hidden biases that would be necessary to invalidate study conclusions. This leads to a strength-of-evidence metric appropriate for an observational study.
In a previous post, we discussed why randomization provides a reasoned basis for inference in an experiment. Randomization not only quantifies the plausibility of a causal effect but also allows us to infer something about the size of that effect.
In his 1935 book, “Design of Experiments”, Ronald Fisher described randomization as the “reasoned basis for inference” in an experiment. Why do we need a “basis” at all, let alone a reasoned one?
Tech companies spoil data scientists. It’s so easy for us to A/B test everything.
Some of the most important questions data scientists investigate are causal questions. They're also some of the hardest to answer! A well-designed A/B test often provides the cleanest answer, but when a test is infeasible, there are plenty of other causal inference techniques that may be useful. While not perfect, these techniques are much better than the alternative: ad hoc methods with no logical foundation.
Supervised learning is perhaps the most central idea in Machine Learning. It is equally central to statistics, where it is known as regression. Statistics formulates the problem in terms of identifying the distribution from which observations are drawn; Machine Learning formulates it in terms of finding a model that fits the data well.
Recently I have been reading Causal Inference: The Mixtape by Scott Cunningham. One thing I think Cunningham explains very well is the role of endogeneity in confounding even simple comparisons. I don’t have a background in economics, so I had never really grokked the concepts of endogenous and exogenous factors, especially as they relate to causal inference.
We have previously mentioned the Stable Unit Treatment Value Assumption, or SUTVA, a complicated-sounding term that is one of the most important assumptions underlying A/B testing (and Causal Inference in general).