Posts

Principal Stratification and Mediation

This post explores principal stratification and mediation analysis as tools for understanding causal effects, decomposing them into direct and indirect components. It covers scenarios like non-compliance, missing outcomes, and surrogate indices, highlighting the importance of assumptions such as no direct effects and no Defiers. Practical methods, including multiple imputation, regression, and matching, are discussed for estimating effects even when key quantities are unobserved. Real-world examples, like marketing lift studies and product funnels, illustrate the relevance of these techniques for addressing complex causal questions.

Modes of Inference in Randomized Experiments

Randomization provides the “reasoned basis for inference” in an experiment. Yet some approaches to analyzing experiments ignore this special structure: simple, familiar tools like regression models can give wrong answers when applied to experimental data. Methods that exploit randomization deliver more reliable inferences than methods that neglect it, and randomization inference should be the first method we reach for when analyzing an experiment.
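The post develops this argument in detail; as a rough illustration of what randomization inference involves, here is a minimal permutation-test sketch on simulated data (the simulated outcomes and the difference-in-means test statistic are illustrative assumptions, not taken from the post):

```python
import numpy as np

rng = np.random.default_rng(0)

def randomization_test(outcomes, treated, n_draws=10_000):
    """Test the sharp null of no effect by re-randomizing treatment labels."""
    observed = outcomes[treated].mean() - outcomes[~treated].mean()
    null_stats = np.empty(n_draws)
    for i in range(n_draws):
        shuffled = rng.permutation(treated)
        null_stats[i] = outcomes[shuffled].mean() - outcomes[~shuffled].mean()
    # One-sided p-value: share of re-randomizations at least as extreme as observed.
    return observed, (null_stats >= observed).mean()

# Toy experiment: 100 units, half treated at random, small positive effect.
treated = rng.permutation(np.repeat([True, False], 50))
outcomes = rng.normal(loc=0.3 * treated, scale=1.0)
effect, p_value = randomization_test(outcomes, treated)
print(f"difference in means: {effect:.3f}, randomization p-value: {p_value:.3f}")
```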

Sensitivity Analysis for Matched Sets with One Treated Unit

Adjusting for observed factors does not elevate an observational study to the reliability of an experiment, and p-values are not appropriate measures of the strength of evidence in such a study. Sensitivity analysis instead tells us how large a hidden bias would have to be to overturn the study’s conclusions, which yields a strength-of-evidence metric suited to observational studies.

Sensitivity Analysis for Matched Pairs

Observational studies involve more uncertainty than randomized experiments. Sensitivity analysis offers an approach to quantifying this uncertainty.

Attributable Effects

In a previous post, we discussed why randomization provides a reasoned basis for inference in an experiment. Randomization not only quantifies the plausibility of a causal effect but also allows us to infer something about the size of that effect.

The Reasoned Basis for Inference in Experiments

In his 1935 book, “The Design of Experiments”, Ronald Fisher described randomization as the “reasoned basis for inference” in an experiment. Why do we need a “basis” at all, let alone a reasoned one?

Tests with One-Sided Noncompliance

Tech companies spoil data scientists. It’s so easy for us to A/B test everything.

Eglot+Tree-Sitter in Emacs 29

I’ve been an Emacs user for about 15 years, and for the most part I use Emacs for org-mode and Python development. Jorgen Schäfer’s elpy has been the core of my Python development workflow for the last 5 years or so, and I’ve been happy with it.

Compiling Emacs 29 With Tree-Sitter

I started a new job recently and took the opportunity to install a new version of Emacs. Emacs 29 includes tree-sitter and built-in eglot support, which I’ll write about some other time.

User Segmentation from Heterogeneous Treatment Effects

Imagine we are attempting to identify segments within an audience, perhaps so we can market to them more effectively through personalization. A common approach is to apply a clustering algorithm such as K-means to a set of user covariates.
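As a point of reference for the covariate-based approach the excerpt describes, a minimal clustering sketch might look like the following (the covariates, the number of segments, and the scikit-learn workflow are illustrative assumptions, not taken from the post):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)

# Hypothetical user covariates: tenure (days), weekly sessions, and monthly spend.
covariates = np.column_stack([
    rng.integers(1, 1000, size=500),   # tenure
    rng.poisson(5, size=500),          # weekly sessions
    rng.exponential(20, size=500),     # monthly spend
])

# Standardize so no single covariate dominates the distance metric,
# then assign each user to one of four segments.
scaled = StandardScaler().fit_transform(covariates)
segments = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(scaled)
print(np.bincount(segments))  # number of users per segment
```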