Posts

Design-Based Inference and Sensitivity Analysis for Survey Sampling

In this note, we consider sampling from a finite population, without replacement and with unequal probabilities. We seek an estimate of the population mean of some characteristic.

Principal Stratification and Mediation

This post explores principal stratification and mediation analysis as tools for understanding causal effects, decomposing them into direct and indirect components. It covers scenarios like non-compliance, missing outcomes, and surrogate indices, highlighting the importance of assumptions such as no direct effects and no Defiers. Practical methods, including multiple imputation, regression, and matching, are discussed for estimating effects even when key quantities are unobserved. Real-world examples, like marketing lift studies and product funnels, illustrate the relevance of these techniques for addressing complex causal questions.

Interpretable and Validatable Uplift Modeling

In this note, we introduce a method for interpreting and validating the results of uplift modeling. We propose two novel strategies for controlling the Familywise Error Rate in this setting.

Modes of Inference in Randomized Experiments

Randomization provides the “reasoned basis for inference” in an experiment. Yet some approaches to analyzing experiments ignore the special structure of randomization. Simple, familiar approaches like regression models sometimes give wrong answers when applied to experiments. Approaches exploiting randomization deliver more reliable inferences than methods neglecting it. Randomization inference should be the first method we reach for when analyzing experiments.

Sensitivity Analysis for Matched Sets with One Treated Unit

Adjusting for observed factors does not elevate an observational study to the reliability of an experiment. P-values are not appropriate measures of the strength of evidence in an observational study. Instead, sensitivity analysis allows us to identify the magnitude of hidden biases that would be necessary to invalidate study conclusions. This leads to a strength-of-evidence metric appropriate for an observational study.

Sensitivity Analysis for Matched Pairs

Observational studies involve more uncertainty than randomized experiments. Sensitivity analysis offers an approach to quantifying this uncertainty.

Attributable Effects

In a previous post, we discussed why randomization provides a reasoned basis for inference in an experiment. Randomization not only quantifies the plausibility of a causal effect but also allows us to infer something about the size of that effect.

The Reasoned Basis for Inference in Experiments

In his 1935 book, “The Design of Experiments”, Ronald Fisher described randomization as the “reasoned basis for inference” in an experiment. Why do we need a “basis” at all, let alone a reasoned one?

Tests with One-Sided Noncompliance

Tech companies spoil data scientists. It’s so easy for us to A/B test everything. We can alter many aspects of the product from a configuration UI. We have the sample size to get a good read in as little as a few days. We have the data infrastructure to analyze and report results quickly.

Eglot+Tree-Sitter in Emacs 29

I’ve been an Emacs user for about 15 years, and for the most part I use Emacs for org-mode and Python development. I’ve happily used Jorgen Schäfer’s elpy as the core of my Python development workflow for the last 5 years or so. Unfortunately the current maintainer, Gaby Launay, hasn’t had time to work on elpy for over a year now. In one sense this doesn’t matter: elpy is pretty stable; it’s open source, so it can’t just disappear on me; and I feel comfortable making minor changes myself.