Your Business Model is Your Data Generating Process
dantegates.github.io |
Disambiguating an overloaded term:
Goal: Align these
Stan Case Study: Golf putting
row | x (distance, ft) | n (putts attempted) | y (putts made) |
---|---|---|---|
0 | 2 | 1443 | 1346 |
1 | 3 | 694 | 577 |
2 | 4 | 455 | 337 |
3 | 5 | 353 | 208 |
4 | 6 | 272 | 149 |
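As a quick sanity check on the table, the observed success rate at each distance is simply y / n. The sketch below assumes the columns follow the Stan case study's convention (x is distance in feet, n is attempts, y is successes); the helper name `empirical_rates` is made up for illustration.

```python
import math

# Golf putting data from the table: distance x (feet), attempts n, successes y.
x = [2, 3, 4, 5, 6]
n = [1443, 694, 455, 353, 272]
y = [1346, 577, 337, 208, 149]

def empirical_rates(successes, attempts):
    """Observed success proportion y_i / n_i at each distance."""
    return [s / a for s, a in zip(successes, attempts)]

def binomial_std_errors(rates, attempts):
    """Binomial standard error sqrt(p * (1 - p) / n) for each proportion."""
    return [math.sqrt(p * (1.0 - p) / a) for p, a in zip(rates, attempts)]

rates = empirical_rates(y, n)
errors = binomial_std_errors(rates, n)
for dist, rate, err in zip(x, rates, errors):
    print(f"{dist} ft: {rate:.3f} +/- {err:.3f}")
```

Any model of the putting process, black box or not, has to reproduce these proportions, so they make a useful baseline to plot against.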
Photo Credit: insidescience.org
Image Source: Stan Development Team
Summary
Modeled outcome: "Will the ball go in the hole?", not "y = 0/1?"
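Modeling "will the ball go in the hole?" directly is what the Stan case study's geometry-based model does: a putt succeeds when the angular error is smaller than a threshold set by the ball and hole sizes. The following is a minimal sketch of that success probability using only the standard library. The value of `SIGMA` is an illustrative assumption, not a fitted estimate, and the function names are made up for this example.

```python
import math

# Dimensions used in the Stan case study, converted from inches to feet.
HOLE_RADIUS_FT = (4.25 / 2) / 12   # hole diameter 4.25 in
BALL_RADIUS_FT = (1.68 / 2) / 12   # ball diameter 1.68 in

def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def p_success(distance_ft, sigma_rad):
    """P(|angle error| < threshold) for angle error ~ Normal(0, sigma_rad)."""
    threshold = math.asin((HOLE_RADIUS_FT - BALL_RADIUS_FT) / distance_ft)
    return 2.0 * normal_cdf(threshold / sigma_rad) - 1.0

SIGMA = 0.03  # radians; illustrative assumption, NOT a fitted value
for d in [2, 3, 4, 5, 6]:
    print(d, round(p_success(d, SIGMA), 3))
```

The only free parameter is the golfer's angular precision, which has a physical meaning, unlike the coefficients of a logistic regression on y = 0/1.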
Black Box Model?
👶
First Principles TODO
\(D\): Learned parameter for the mature default rate as \(t\to\infty\), e.g. pm.Beta()
\(L\): The mature liquidation rate as \(t\to\infty\)
\(L\times\text{another_cdf}: \mathbb{R}^{+}\to[0,L]\)
\(I_t\): The percentage of loans in-flight at time \(t\)
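To make the scaled-CDF idea concrete, here is a rough sketch of how a cohort's state shares could evolve with age. It rests on several assumptions that are not in the slides: an exponential CDF stands in for the unspecified `some_cdf` and `another_cdf`, the rate constants are invented, \(D + L \le 1\), and the in-flight share \(I_t\) is taken to be whatever has neither defaulted nor liquidated.

```python
import math

def exp_cdf(t, rate):
    """Exponential CDF; a stand-in for the unspecified some_cdf / another_cdf."""
    return 1.0 - math.exp(-rate * t) if t > 0 else 0.0

def loan_state_shares(t, D, L, default_rate=0.02, liquidation_rate=0.01):
    """Shares of a loan cohort in each state at age t (days).

    Assumes D + L <= 1 and that in-flight is the remainder (an assumption).
    """
    defaulted = D * exp_cdf(t, default_rate)        # maps R+ into [0, D]
    liquidated = L * exp_cdf(t, liquidation_rate)   # maps R+ into [0, L]
    in_flight = 1.0 - defaulted - liquidated        # I_t, assumed remainder
    return defaulted, liquidated, in_flight

# Hypothetical mature rates: 10% of loans default, 90% liquidate.
for age in [0, 30, 90, 365]:
    d, l, i = loan_state_shares(age, D=0.1, L=0.9)
    print(f"day {age}: defaulted={d:.3f} liquidated={l:.3f} in_flight={i:.3f}")
```

Multiplying a CDF by the mature rate keeps the curve bounded by that rate, which is why \(D\) and \(L\) can be read as the limits as \(t\to\infty\).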
Technically we need to do `pt.stack(...).T` for `p` and `observed=`, but this makes the slide less readable.
Advantages over vanilla ML
Critiques
`N` changing over time

with pm.Model() as model:
    ...
    # R: learned parameter of recovery, e.g. pm.Beta()
    # D_T: number of loans that defaulted `t-T` days ago
    # PD_T: number of loans in a past-due state `t-T` days ago
    # N: total number of loans
    R = ...
    D_t = (D_T + (1-R) * PD_T) / N
    D = D_t / some_cdf(t-T)
    # ↓↓↓ everything else same as before ↓↓↓
    ...
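The arithmetic in the model block can be checked by hand with fixed numbers. The sketch below is not the PyMC model: `R` is a plain float rather than a learned parameter, `some_cdf` is a hypothetical exponential maturation curve, and all cohort figures are invented for illustration.

```python
import math

def some_cdf(age_days, rate=0.02):
    """Hypothetical maturation curve (exponential CDF) standing in for some_cdf."""
    return 1.0 - math.exp(-rate * age_days)

def mature_default_rate(D_T, PD_T, N, R, age_days):
    """Mirror the slide's arithmetic with plain numbers.

    age_days plays the role of t - T in the slide.
    """
    # Observed defaults plus past-due loans that are not expected to recover.
    D_t = (D_T + (1.0 - R) * PD_T) / N
    # Scale the partially matured rate up to its value as t -> infinity.
    return D_t / some_cdf(age_days)

# Invented cohort: 1,000 loans, 30 defaulted, 20 past due, 50% recovery, 60 days old.
D = mature_default_rate(D_T=30, PD_T=20, N=1000, R=0.5, age_days=60)
print(f"implied mature default rate: {D:.4f}")
```

Because the cohort is only partly matured, dividing by the CDF pushes the implied mature rate above the 4% observed so far.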
👋
From Wikipedia,
Priors! All Bayesian models are DGPs!
🧍 | 1 |
---|---|
🤖 | 0 |