AdaOpt classification on MNIST handwritten digits (without preprocessing)

Today, give a try to Techtonique web app, a tool designed to help you make informed, data-driven decisions using Mathematics, Statistics, Machine Learning, and Data Visualization. Here is a tutorial with audio, video, code, and slides: https://moudiki2.gumroad.com/l/nrhgb. 100 API requests are now (and forever) offered to every user every month, no matter the pricing tier.

Last week on this blog, I presented AdaOpt for R, applied to iris dataset classification. And the week before that, I introduced AdaOpt for Python. AdaOpt is a novel probabilistic classifier, based on a mix of multivariable optimization and a nearest neighbors algorithm. More details about the algorithm can be found in this (short) paper. This week, we are going to train AdaOpt on the popular MNIST handwritten digits dataset without preprocessing, a.k.a neither convolution nor pooling.

Install mlsauce’s AdaOpt from the command line (for R, cf. below):

!pip install git+https://github.com/Techtonique/mlsauce.git --upgrade

Import the packages that will be necessary for the demo:

from time import time
from tqdm import tqdm
import mlsauce as ms
import numpy as np
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.datasets import fetch_openml

Get MNIST handwritten digits data (notice that here, AdaOpt is trained on 5000 digits, and evaluated on 10000):

Z, t = fetch_openml('mnist_784', version=1, return_X_y=True)

print(Z.shape)
print(t.shape)

t_ = np.asarray(t, dtype=int)

np.random.seed(2395)

train_samples = 5000

X_train, X_test, y_train, y_test = train_test_split(
    Z, t_, train_size=train_samples, test_size=10000)

Creation of an AdaOpt object:

obj = ms.AdaOpt(**{'eta': 0.13913503573317965, 'gamma': 0.1764634904063013, 
                   'k': np.int(1.2154947405849463), 
                   'learning_rate': 0.6161538857826013, 
                   'n_iterations': np.int(245.55517115592275), 
                   'reg_alpha': 0.29915416038957043, 
                   'reg_lambda': 0.163411853029936, 
                   'row_sample': 0.9477046112286693, 
                   'tolerance': 0.05877163298305207})

Adjusting the AdaOpt object to the training set:

start = time()
obj.fit(X_train, y_train)
print(time()-start)

0.7025153636932373

Obtain the accuracy of AdaOpt on test set:

start = time()
print(obj.score(X_test, y_test))
print(time()-start)

0.9372
9.997464656829834

Classification report including additional error metrics:

preds = obj.predict(X_test)
print(classification_report(preds, y_test))

   precision    recall  f1-score   support

           0       0.99      0.94      0.96      1018
           1       0.99      0.95      0.97      1205
           2       0.93      0.97      0.95       955
           3       0.92      0.91      0.91      1064
           4       0.91      0.95      0.93       882
           5       0.89      0.95      0.92       838
           6       0.97      0.96      0.96       974
           7       0.95      0.95      0.95      1054
           8       0.88      0.93      0.91       953
           9       0.93      0.88      0.91      1057

    accuracy                           0.94     10000
   macro avg       0.94      0.94      0.94     10000
weighted avg       0.94      0.94      0.94     10000

Confusion matrix, true label vs predicted label:

import matplotlib.pyplot as plt
import seaborn as sns; sns.set()
from sklearn.metrics import confusion_matrix

mat = confusion_matrix(y_test, preds)
sns.heatmap(mat.T, square=True, annot=True, fmt='d', cbar=False)
plt.xlabel('true label')
plt.ylabel('predicted label');

image-title-here

In R, the syntax is quite similar to what we’ve just demonstrated for Python. After having installed mlsauce, we’d have:

For the creation of an AdaOpt object:

library(mlsauce)

# create AdaOpt object with default parameters
obj <- mlsauce::AdaOpt()

# print object attributes
print(obj$get_params())

For fitting the AdaOpt object to the training set:

# fit AdaOpt to training set
obj$fit(X_train, y_train)

For obtaining the accuracy of AdaOpt on test set:

# obtain accuracy on test set 
print(obj$score(X_test, y_test))

Note: I am currently looking for a gig. You can hire me on Malt or send me an email: thierry dot moudiki at pm dot me. I can do descriptive statistics, data preparation, feature engineering, model calibration, training and validation, and model outputs’ interpretation. I am fluent in Python, R, SQL, Microsoft Excel, Visual Basic (among others) and French. My résumé? Here!