Posts

Statistics and Machine Learning: Better Together!

My master's degree focused on Machine Learning, but when I got my first job as a data scientist, I quickly realized there was a lot I still needed to learn about Statistics. Since then, I have come to appreciate the nuanced differences between Statistics and Machine Learning, and I'm convinced they have a lot to offer one another!

Contingency Tables Part IV: The Score Test

The score test can be used to calculate p-values and confidence intervals for A/B tests. It is based on the slope of the log-likelihood function, evaluated at the parameter value specified by the null hypothesis.
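
To make that concrete, here is a minimal Python sketch of a score test comparing two conversion rates (the pooled two-proportion z-test). The function name and the counts are illustrative, not taken from the post:

import math
from scipy.stats import norm

def score_test_two_proportions(x_a, n_a, x_b, n_b):
    # Score test of H0: p_A = p_B. The standard error is evaluated at
    # the pooled estimate under the null, which is what distinguishes
    # the score test from the Wald test (which uses the unrestricted
    # per-group estimates).
    pooled = (x_a + x_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (x_a / n_a - x_b / n_b) / se
    return z, 2 * norm.sf(abs(z))  # two-sided p-value

z, p = score_test_two_proportions(120, 1000, 150, 1000)  # made-up counts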

Thoughts on Principal Components Analysis

This is a post with more questions than answers. I’ve been thinking about Principal Components Analysis (PCA) lately.

Sprinkle some Maximum Likelihood Estimation on that Contingency Table!

Maximum Likelihood Estimation provides consistent estimators and can be computed efficiently under many null hypotheses of practical interest.
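
For example, in a 2x2 contingency table, the MLE under the null hypothesis of a common conversion rate has a closed form (pooled successes over pooled trials), and a numerical optimizer recovers it. A small sketch with made-up counts, not the post's own code:

import numpy as np
from scipy.optimize import minimize_scalar

def neg_log_lik(p, x, n):
    # Binomial log-likelihood for a shared rate p under the null,
    # dropping additive terms that do not depend on p.
    return -(x * np.log(p) + (n - x) * np.log(1 - p))

x_a, n_a, x_b, n_b = 120, 1000, 150, 1000  # made-up counts
res = minimize_scalar(neg_log_lik, bounds=(1e-9, 1 - 1e-9),
                      args=(x_a + x_b, n_a + n_b), method="bounded")
# The numerical optimum matches the closed-form pooled proportion.
assert abs(res.x - (x_a + x_b) / (n_a + n_b)) < 1e-4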

Contingency Tables Part II: The Binomial Distribution

In our last post, we introduced the potential outcomes framework as the foundation of causal inference. In this framework, each unit (e.g. each person) is represented by a pair of outcomes, corresponding to the result of the experience provided to them (treatment or control, A or B, etc.).

Contingency Tables Part I: The Potential Outcomes Framework

“Why can’t I take the results of an A/B test at face value? Who are you, the statistics mafia? I don’t need a PhD in statistics to know that one number is greater than another.” If this sounds familiar, it is helpful to remember that we do an A/B test to learn about different potential outcomes. Comparing potential outcomes is essential for smart decision making, and this framework is the cornerstone of causal inference.

Unshackle Yourself from Statistical Significance

Don’t be a prisoner to statistical significance. A/B testing should serve the business, not the other way around!

Commit Message Linting with Magit

I have a confession to make. I’ve been writing bad commit messages for years. It takes time to write good commit messages, and often I’m in a hurry. Or so I tell myself. But that’s a false dichotomy. I can have my cake and eat it too! Recently I discovered how to use magit to enforce best practices for commit messages.

Viterbi Algorithm, Part 2: Decoding

This is my second post describing the Viterbi algorithm. As before, our presentation follows Jurafsky and Martin closely, merely filling in some details omitted in the text.

Viterbi Algorithm, Part 1: Likelihood

The Viterbi algorithm finds the most likely sequence of hidden states given a sequence of observations emitted by those states, together with transition and emission probabilities. It has applications in Natural Language Processing (e.g. part-of-speech tagging), in error-correcting codes, and more!
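
Here is a compact numpy sketch of that dynamic program; the matrices pi, A, and B stand in for the initial, transition, and emission probabilities, and this is my illustration rather than the post's code:

import numpy as np

def viterbi(obs, pi, A, B):
    # v[t, j] is the probability of the best state path ending in state
    # j at time t; back[t, j] records that path's predecessor state.
    T, n_states = len(obs), A.shape[0]
    v = np.zeros((T, n_states))
    back = np.zeros((T, n_states), dtype=int)
    v[0] = pi * B[:, obs[0]]
    for t in range(1, T):
        # scores[i, j] = prob of best path ending in i, then i -> j,
        # then emitting obs[t] from state j.
        scores = v[t - 1][:, None] * A * B[:, obs[t]]
        back[t] = scores.argmax(axis=0)
        v[t] = scores.max(axis=0)
    path = [int(v[-1].argmax())]  # best final state, then backtrack
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]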