What is the posterior, why does it matter?

Data Philly, Sept. 2024
Slides
Abstract

We’ve all seen it. That neon sign hanging on the wall of some math department somewhere. Bayes’ Rule. An expression of conditional probability. And when it’s a model conditioned on data, it’s Bayesian Inference. Evaluating the right-hand side of that equation, transforming the left into a posterior distribution. Though often reduced to a point estimate or credible interval, when left as a distribution, interesting possibilities abound…

The primary goal of this talk is to answer the second half of its title: why should you care? After providing a brief overview of how the posterior fits into Bayesian Inference, we’ll focus on application. Examples from domains such as finance, investing, e-commerce and baseball will drive the content and demonstrate the versatility this approach affords. Attendees should leave with an understanding of the role of the posterior distribution and ideas for how to apply it to their own work.

What is optimization, why does it matter?

Data Philly, Feb. 2024
Slides
Abstract

In data science, we often think of optimization as the engine that fires each time model.fit(X, y) is called. Perhaps less considered is that these methods are a subset of a much larger discipline, with applications in data science that reach beyond machine learning: mathematical optimization.

In this talk we’ll try to answer the questions: “What makes mathematical optimization special, and why should data scientists care?” Along the way we’ll walk through examples that demonstrate how these tools can extend and complement the familiar methods within the machine learning toolbox. Having established a basic understanding of the sort of problems these methods place within our grasp, we’ll conclude with examples of real-world applications from industries such as healthcare, marketing and advertising.

The Power of Bayes in Industry

PyMCon Web Series 2023
Abstract

This talk will attempt to answer the question “what is a Data Generating Process and why does it matter?” While we will begin our discussion with a bit of theory, don’t worry about this being too technical or inaccessible if you’re new to Bayesian Statistics. Our primary goal is to focus on the second half of the question and give you tools to use for real-world applications.

With the core concepts and background covered, we’ll demonstrate how incorporating this understanding into our modeling decisions allows us to embed elements of a business function directly into our statistical models, and how this can provide immense value in industry settings, especially where traditional machine learning techniques fail. The benefits include:

  • The ability to tackle critical problems when data is lacking, like launching a new product

  • Building powerful, predictive models that are difficult to overfit

  • Explainability that is built in and already expressed in the terms of your business

Best of all, the design techniques we propose here are such that when you get one of the benefits above, the rest usually come for free.

All of this and more will be illustrated through concrete examples drawn from both publicly available data and proprietary data we use here at Perpay.

Model Evaluation For Humans

PyData Miami 2019
Abstract

Model evaluation is arguably the most important step of the modeling process. As data scientists we constantly make decisions based on these evaluations. Is model A better than model B? Will this model be profitable in production? Is performance in production declining? Model evaluation for humans discusses approaches for assessing our models practically in ways we understand.

Practitioners who are relatively new to releasing models into production or a product are likely to benefit the most from this talk; however, data scientists of all abilities may find useful strategies to take away. The approaches that we will discuss serve better as a playbook than a rulebook and will help build confidence in your model throughout all phases of the business lifecycle, from development through deployment.