Your Probabilistic Modeling Copilot
🤔 What is this?
Generative AI meets Probabilistic Programming.
ppchain is an open-source toolkit for intuitive, effective modeling.
Your copilot for building internal model representations and optimizing your Bayesian workflow.
🚀 What can this help with?
ppchain aims to ease the pains of building a model.
Following the 3 main steps of the Bayesian data analysis process, as defined in [1], ppchain provides a (progressively growing) toolbox of AI-assisted functions aiming to make your life easier along the way:
- Setting up a full probability model: a joint probability distribution for all observable and unobservable quantities in a problem. ppchain searches for domain knowledge about your underlying problem and helps build an internal representation that is consistent with both background knowledge and collected data.
- Conditioning on observed data: calculating and interpreting the appropriate posterior distribution, that is, the conditional probability distribution of the unobserved quantities of ultimate interest, given the observed data.
- Evaluating the fit of the model and the implications of the resulting posterior distribution: how well does the model fit the data, are the substantive conclusions reasonable, and how sensitive are the results to the modeling assumptions made?
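In symbols, the three steps can be sketched as follows. The notation is an assumption made here for illustration: $\theta$ stands for the unobserved quantities, $y$ for the observed data, and $\tilde{y}$ for a future or replicated observation that is assumed to be conditionally independent of $y$ given $\theta$.

$$
\begin{aligned}
\text{(1) full probability model:} \quad & P(\theta, y) = P(\theta)\, P(y \mid \theta) \\
\text{(2) conditioning on data:} \quad & P(\theta \mid y) = \frac{P(\theta)\, P(y \mid \theta)}{P(y)}, \qquad P(y) = \int P(\theta)\, P(y \mid \theta)\, d\theta \\
\text{(3) evaluating the fit:} \quad & P(\tilde{y} \mid y) = \int P(\tilde{y} \mid \theta)\, P(\theta \mid y)\, d\theta
\end{aligned}
$$

In step (3), the posterior predictive distribution $P(\tilde{y} \mid y)$ is one common quantity to compare against the observed data when judging model fit; it is not the only possible check.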
⚙ Workflow
ppchain provides a (progressively growing) set of AI-assisted functions to move through the following workflow (where $P$ denotes a probability distribution, $\theta$ the parameters, and $y$ the data):
- Define the problem statement
  - Problem statement (conversational AI)
  - Specify hypothesis
  - Select model type
  - Data collection method
- Formalize priors, $P(\theta)$
  - Search for background knowledge
  - Prior elicitation
  - Formalize prior distributions
  - Prior predictive checks
- Determine the likelihood function, $P(y \mid \theta)$
  - Search for background knowledge
  - Load & preprocess data
  - Formalize the likelihood function
- Compute the posterior distribution, $P(\theta \mid y) \propto P(y \mid \theta) \, P(\theta)$
  - Variable selection: identify the subset of predictors to include in the model
  - Determine the functional form of the model
  - Fit the model to the observed data to estimate the unknown model parameters
  - Compute posterior distribution
- Run posterior inference
  - Compute posterior inference
  - Posterior predictive checking
  - Sensitivity analysis
  - Make predictions about future events
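To make the workflow above concrete, here is a minimal sketch on a toy problem. It uses plain Python only and does not call ppchain's API, which is not described here. The coin-flip setting, the Beta(2, 2) prior, the data (7 successes in 10 trials), and every function name are assumptions chosen purely for illustration. The sketch evaluates the prior and the likelihood on a grid, multiplies and normalizes them to approximate the posterior, and then repeats the computation under a flat prior as a very small sensitivity analysis.

```python
import math


def beta_pdf(theta, a, b):
    """Prior P(theta): density of a Beta(a, b) distribution, for 0 < theta < 1."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    log_density = log_norm + (a - 1) * math.log(theta) + (b - 1) * math.log(1 - theta)
    return math.exp(log_density)


def binomial_likelihood(k, n, theta):
    """Likelihood P(y | theta): probability of k successes in n trials."""
    return math.comb(n, k) * theta**k * (1 - theta) ** (n - k)


def grid_posterior(k, n, a, b, grid_size=2000):
    """Approximate P(theta | y) on an evenly spaced grid over (0, 1)."""
    # Cell midpoints avoid theta = 0 and theta = 1, where log() is undefined.
    thetas = [(i + 0.5) / grid_size for i in range(grid_size)]
    # Posterior is proportional to likelihood times prior.
    unnormalized = [binomial_likelihood(k, n, t) * beta_pdf(t, a, b) for t in thetas]
    total = sum(unnormalized)
    weights = [u / total for u in unnormalized]  # normalized to sum to 1
    return thetas, weights


def posterior_mean(k, n, a, b):
    """Posterior expectation of theta under a Beta(a, b) prior."""
    thetas, weights = grid_posterior(k, n, a, b)
    return sum(t * w for t, w in zip(thetas, weights))


# Hypothetical data: 7 successes out of 10 trials.
k, n = 7, 10

# Posterior under the illustrative Beta(2, 2) prior.
thetas, weights = grid_posterior(k, n, 2.0, 2.0)
mean_informative = posterior_mean(k, n, 2.0, 2.0)

# Sensitivity analysis: the same data under a flat Beta(1, 1) prior.
mean_flat = posterior_mean(k, n, 1.0, 1.0)

# Prediction: for a single future trial, the posterior predictive
# probability of success equals the posterior mean of theta.
p_next_success = mean_informative

# Cross-check against the closed-form result: with a Beta(a, b) prior the
# posterior is Beta(a + k, b + n - k), whose mean is (a + k) / (a + b + n).
exact_informative = (2.0 + k) / (2.0 + 2.0 + n)
exact_flat = (1.0 + k) / (1.0 + 1.0 + n)

print(f"posterior mean, Beta(2, 2) prior: {mean_informative:.4f}")
print(f"posterior mean, flat prior:       {mean_flat:.4f}")
print(f"P(next trial succeeds | y):       {p_next_success:.4f}")
```

A grid is used here only because it keeps the proportionality $P(\theta \mid y) \propto P(y \mid \theta) \, P(\theta)$ visible in a few lines; it does not scale beyond a handful of parameters, where sampling-based inference is the usual choice.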
📖 Documentation
Documentation: https://ppchain.readthedocs.io
💁 Contributing
Contributions are very welcome, whether it is in the form of a new feature, improved infrastructure, or better documentation. For detailed information on how to contribute, see CONTRIBUTING.
If you are interested in getting further involved with the ValueGrid team, please contact us.
License
Usage is provided under the MIT license. See LICENSE for full details.
Credits & references
Initial inspiration for ppchain came from Thomas Wiecki, PhD and Daniel Lee, as explained in more detail in this LinkedIn post and Medium article.
This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.