Introducing the teller

There is an increasing need for transparency and fairness in the predictions of Machine Learning (ML) models. Consider, for example, a banker who has to explain to a client why their loan application was rejected, or a health professional who must explain the basis of a diagnosis. Some ML models are indeed very accurate, but are considered hard to explain relative to popular linear models.

Source of figure: James, Gareth, et al. An Introduction to Statistical Learning. Vol. 112. New York: Springer, 2013.

We do not want to sacrifice this high accuracy for explainability. Hence: ML explainability. There are a lot of ML explainability tools out there in the wild for that purpose (don’t take my word for it).

The teller is a model-agnostic tool for ML explainability: agnostic, as long as the model has fit and predict methods. The teller’s philosophy is to rely on Taylor series to explain ML model predictions: apply a small increase and a small decrease to each of the model’s explanatory variables, and we obtain approximate sensitivities of its predictions to changes in these variables.
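To make the "little increase + little decrease" idea concrete, here is a minimal sketch of a central finite difference, which is what a first-order Taylor expansion suggests. It is not the teller's actual code: the function name sensitivities, the rel_step parameter, and the toy prediction function are all assumptions made for illustration, and only NumPy is used.

```python
import numpy as np


def sensitivities(predict, X, rel_step=1e-4):
    """Central finite-difference estimate of d predict / d x_j at each row of X.

    Illustrative helper (hypothetical name), not part of the teller's API.
    """
    X = np.asarray(X, dtype=float)
    n_obs, n_features = X.shape
    grad = np.empty((n_obs, n_features))
    for j in range(n_features):
        # Step size scaled to each value's magnitude, with a floor of rel_step
        h = rel_step * np.maximum(np.abs(X[:, j]), 1.0)
        X_up, X_down = X.copy(), X.copy()
        X_up[:, j] += h    # a little increase
        X_down[:, j] -= h  # a little decrease
        grad[:, j] = (predict(X_up) - predict(X_down)) / (2.0 * h)
    return grad


# Hypothetical smooth "black box": f(x) = x0**2 + 3 * x1
def toy_predict(X):
    return X[:, 0] ** 2 + 3.0 * X[:, 1]


X = np.array([[1.0, 2.0], [2.0, -1.0]])
effects = sensitivities(toy_predict, X)
print(effects)  # exact partial derivatives are 2 * x0 and 3
```

Any object with a predict method can be plugged in by passing model.predict as the first argument. The step size is a judgment call: too large and the approximation degrades, too small and rounding error dominates.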

Installation

Currently from GitHub, for the development version:

pip install git+https://github.com/Techtonique/teller.git

Package description

This notebook will give you a good introduction:

thierrymoudiki_011119_boston_housing.ipynb

Two models are used in the notebook: a linear model and a Random Forest (here, the black-box model). The most straightforward way to illustrate the teller is to use a linear model. In this case, the effects of model covariates on the response can be directly related to the linear model’s coefficients. Also, note that if there are a lot of variables in your model, the teller’s explainer can be created with the option n_jobs=-1 (for parallel execution).
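The link between covariate effects and coefficients in the linear case can be checked numerically. The sketch below does not use the teller or the notebook's data: LeastSquares is a hypothetical stand-in model exposing fit and predict, the data are simulated, and the step size h is an arbitrary choice. It compares averaged central differences of the predictions with the fitted coefficients.

```python
import numpy as np


class LeastSquares:
    """Tiny linear model exposing fit and predict (illustrative only)."""

    def fit(self, X, y):
        A = np.column_stack([np.ones(len(X)), X])  # add an intercept column
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        self.intercept_, self.coef_ = beta[0], beta[1:]
        return self

    def predict(self, X):
        return self.intercept_ + X @ self.coef_


# Simulated data (assumption): three covariates, known linear signal plus noise
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 1.0 + X @ np.array([2.0, -1.0, 0.5]) + 0.1 * rng.normal(size=200)

model = LeastSquares().fit(X, y)

h = 1e-3  # perturbation size, chosen arbitrarily for this sketch
avg_effects = np.empty(X.shape[1])
for j in range(X.shape[1]):
    step = np.zeros(X.shape[1])
    step[j] = h
    # Central difference on covariate j, averaged over all observations
    diff = model.predict(X + step) - model.predict(X - step)
    avg_effects[j] = np.mean(diff / (2 * h))

# For a linear model the average effects should match the fitted coefficients
print(avg_effects)
print(model.coef_)
```

For a black-box model such as a Random Forest there are no coefficients to compare against, which is where numerical sensitivities of this kind become the main source of information.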

Contributions and remarks are welcome as usual; you can submit a pull request on GitHub.

Note: I am currently looking for a gig. You can hire me on Malt or send me an email: thierry dot moudiki at pm dot me. I can do descriptive statistics, data preparation, feature engineering, model calibration, training and validation, and model outputs’ interpretation. I am fluent in Python, R, SQL, Microsoft Excel, Visual Basic (among others) and French. My résumé? Here!