PyMCon Afterword
A couple of weeks ago I gave a talk for the 2023 PyMCon Web Series. The talk advocated a first-principles approach to modeling: one that focuses on the data generating process (DGP) rather than on outcomes alone. This post covers two slides, Tips and Resources, that didn’t make the final cut. Below, each would-have-been bullet point is covered briefly in two sentences or fewer.
Tips
Remember: there’s no “right” answer
The best first-principles models can feel so natural that it is tempting to think of them as the right model, but any one approach is usually chosen from among several reasonable alternatives. It’s approximations all the way down, so take the liberty to reimagine your models and exercise creativity.
Don’t rush to your favorite modeling package
Take time to sketch, whiteboard, brainstorm with colleagues, do EDA, simulate from your assumptions or work things out on paper before model building.
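As one way to "simulate from your assumptions", here is a minimal sketch in plain Python, using putting (the motivating example of the talk) as the subject. The geometry is an assumption borrowed from the commonly cited golf putting model, not necessarily the model from the talk: aim error is normally distributed, and a putt drops when the error angle is small enough. The constants, `sigma`, and the function name `simulate_putts` are all hypothetical.

```python
import math
import random

# Hypothetical constants for illustration (inches); not taken from the talk.
HOLE_RADIUS = 4.25 / 2
BALL_RADIUS = 1.68 / 2


def simulate_putts(distance_in, sigma_rad, n_putts, rng):
    """Simulate putts from one distance and return the fraction holed.

    Assumed DGP: the golfer's aim error is Normal(0, sigma_rad) in radians,
    and a putt drops when the error is small enough for the ball's centre
    to pass within (HOLE_RADIUS - BALL_RADIUS) of the hole's centre.
    """
    threshold = math.asin((HOLE_RADIUS - BALL_RADIUS) / distance_in)
    made = sum(abs(rng.gauss(0.0, sigma_rad)) < threshold for _ in range(n_putts))
    return made / n_putts


rng = random.Random(2023)
sigma = 0.03  # assumed aim error in radians (hypothetical value)
for feet in (2, 5, 10, 20):
    rate = simulate_putts(feet * 12, sigma, 5_000, rng)
    print(f"{feet:>2} ft: simulated success rate = {rate:.2f}")
```

If the simulated success rates look nothing like what you would expect on a real green, that is a cue to revisit the assumptions before opening a modeling package.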
Start simple and add assumptions incrementally
Examine your progress between iterations and stop when you exhaust the “information available to estimate any additional parameters”. Increase complexity judiciously.
Think graphically
See this excellent case study.
Learn to love priors
This methodology unashamedly places priors front-and-center, so lean into it. Use informative priors and avoid the temptation to limit yourself with flat priors.
Prior predictive samples
Do it.
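For a sense of what that looks like without any modeling package, here is a rough sketch in plain Python: draw a parameter from its prior, push it through the assumed data generating process, and look at the data it implies. The putting geometry, the half-normal prior, the `clearance_in` default, and the two prior scales are all assumptions chosen for illustration; in practice a probabilistic programming library would typically handle this step for you.

```python
import math
import random


def success_prob(distance_in, sigma_rad, clearance_in=1.285):
    """P(putt drops) when aim error ~ Normal(0, sigma_rad).

    clearance_in is an assumed hole-radius-minus-ball-radius in inches.
    """
    threshold = math.asin(clearance_in / distance_in)
    # P(|Z| < t / sigma) for standard normal Z equals erf(t / (sigma * sqrt(2))).
    return math.erf(threshold / (sigma_rad * math.sqrt(2)))


def prior_predictive(distance_in, n_putts, n_draws, prior_scale, rng):
    """Draw sigma from a half-normal prior, then simulate made-putt counts."""
    counts = []
    for _ in range(n_draws):
        # Half-normal prior draw; the floor avoids dividing by exactly zero.
        sigma = max(abs(rng.gauss(0.0, prior_scale)), 1e-9)
        p = success_prob(distance_in, sigma)
        counts.append(sum(rng.random() < p for _ in range(n_putts)))
    return counts


rng = random.Random(0)
for scale in (0.05, 1.0):  # a tighter prior vs. a much wider one
    draws = sorted(prior_predictive(120, 100, 2_000, scale, rng))
    lo, mid, hi = (draws[int(q * len(draws))] for q in (0.05, 0.5, 0.95))
    print(f"prior scale {scale}: made putts out of 100 -> 5%={lo}, median={mid}, 95%={hi}")
```

Comparing the two printed summaries shows how much the choice of prior scale changes the data the model considers plausible before seeing any observations.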
Get comfortable with basics of probability
A basic vocabulary is needed to express your mental models mathematically. A few non-exhaustive examples of things you should know include the law of total probability, the rules of conditional probability, joint probability and expectations.
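For reference, the identities named above can be written as follows, for events $A$ and $B$, a partition $B_1, \dots, B_n$ of the sample space with each $P(B_i) > 0$, and a discrete random variable $X$ (the notation is mine, not from the talk):

```latex
\begin{align}
P(A \mid B) &= \frac{P(A, B)}{P(B)}, \quad P(B) > 0
  && \text{(conditional probability)} \\
P(A, B) &= P(A \mid B)\, P(B)
  && \text{(joint probability, product rule)} \\
P(A) &= \sum_{i=1}^{n} P(A \mid B_i)\, P(B_i)
  && \text{(law of total probability)} \\
\mathbb{E}[X] &= \sum_{x} x\, P(X = x)
  && \text{(expectation)}
\end{align}
```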
Challenge yourself to see these problems everywhere
Learning to recognize when a first-principles approach is or isn’t appropriate is key. This skill comes with practice, and you can get reps in by looking for problems in everyday life; for instance, check out the post on my quarantine playlist.
Resources
Other industry examples. Each illustrates a nice “middle ground” between traditional ML and the very domain-specific model I demonstrated in my talk: both examples model the DGP but are not tied to any one business model.
- PyMC Labs x Hello Fresh Mixed Media Marketing
- Pareto/NBD
Additional resources on the putting case study (the motivating example of the talk):
Other resources I was reading/listening to while preparing the talk:
- What’s the Probabilistic Story?
- The Prior Can Often Only Be Understood in the Context of the Likelihood
- Learning Bayesian Statistics interview with Michael Betancourt and the authors of Regression and Other Stories
- Andrew Gelman’s Keynote at PyData NYC - my first introduction to the putting example.