Today, give the Techtonique web app a try: a tool designed to help you make informed, data-driven decisions using Mathematics, Statistics, Machine Learning, and Data Visualization. Here is a tutorial with audio, video, code, and slides: https://moudiki2.gumroad.com/l/nrhgb. 100 API requests are offered to every user every month (now and forever), regardless of the pricing tier.
Disclaimer: Updated on 2025-06-28
Bayesian optimization (BO) is a popular (and clever, and elegant, and beautiful, and efficient) optimization method for hyperparameter tuning in Machine Learning and Deep Learning. BO relies on a surrogate model that approximates the objective function (the function to be minimized) in a probabilistic way. Instead of evaluating the expensive objective at every candidate, it optimizes a cheaper acquisition function that selects the next point to evaluate.
The most common surrogate model in BO is the Gaussian process regressor, a Bayesian model with a Gaussian prior, and the most common acquisition function is the Expected Improvement (EI). EI selects the next point to evaluate based on its expected improvement over the current best point.
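For reference, under a Gaussian process surrogate with posterior mean $\mu(x)$ and standard deviation $\sigma(x)$, and with $f^*$ the best (lowest) value observed so far, EI has the standard closed form for minimization:

$$
\mathrm{EI}(x) = \big(f^* - \mu(x)\big)\,\Phi(z) + \sigma(x)\,\phi(z),
\qquad z = \frac{f^* - \mu(x)}{\sigma(x)},
$$

where $\Phi$ and $\phi$ denote the standard normal CDF and PDF, and $\mathrm{EI}(x) = 0$ whenever $\sigma(x) = 0$.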
Conformal Prediction is a framework that, among other things, equips supervised learning predictions with prediction intervals having valid coverage under mild (exchangeability) assumptions. For more details on Bayesian optimization and Conformal Prediction, see the following references:
In this post, I'll show how to use conformalized surrogates for optimization, thanks to GPopt and nnetsauce. With this approach, any surrogate model can be used for optimization, and there is no longer any constraint on the choice of a prior (Gaussian, Laplace, etc.). The acquisition function is the lower confidence bound (LCB) of the conformalized surrogate model.
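To make the acquisition step concrete, here is a minimal, self-contained sketch of the idea (illustrative only, not GPopt's internal code; the helpers `conformal_lower_bound` and `next_point_lcb` and the toy objective are hypothetical names): build split conformal prediction intervals around any point-prediction regressor, then select the candidate minimizing the lower bound of its interval. In the GPopt calls below, this combination is requested with `acquisition="ucb"` and `method="splitconformal"`.

```python
# Illustrative sketch (not GPopt's implementation): split conformal prediction
# intervals around any regressor, used as a lower-confidence-bound acquisition.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split


def conformal_lower_bound(model, X_obs, y_obs, X_cand, alpha=0.05, seed=0):
    """Split conformal: fit on one half of the observations, calibrate on the
    other half, and return point predictions minus the (1 - alpha) quantile
    of the absolute calibration residuals (symmetric intervals)."""
    X_fit, X_cal, y_fit, y_cal = train_test_split(
        X_obs, y_obs, test_size=0.5, random_state=seed
    )
    model.fit(X_fit, y_fit)
    q = np.quantile(np.abs(y_cal - model.predict(X_cal)), 1 - alpha)
    return model.predict(X_cand) - q  # lower bounds of the prediction intervals


def next_point_lcb(model, X_obs, y_obs, X_cand, alpha=0.05):
    """For minimization, pick the candidate with the smallest lower bound
    (optimism in the face of uncertainty)."""
    lb = conformal_lower_bound(model, X_obs, y_obs, X_cand, alpha=alpha)
    return X_cand[np.argmin(lb)]


def toy_objective(x):
    """Hypothetical smooth objective, only for this demo."""
    return (x[0] - 1.0) ** 2 + (x[1] + 0.5) ** 2


rng = np.random.default_rng(123)
X_obs = rng.uniform([-3, -2], [3, 2], size=(50, 2))    # points evaluated so far
y_obs = np.array([toy_objective(x) for x in X_obs])    # objective values
X_cand = rng.uniform([-3, -2], [3, 2], size=(500, 2))  # candidate points
print(next_point_lcb(Ridge(), X_obs, y_obs, X_cand))   # next point to evaluate
```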
A future post will show how to use conformalized surrogates for Machine Learning and Deep Learning hyperparameter tuning.
pip install GPopt nnetsauce
import GPopt as gp
import nnetsauce as ns
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.ensemble import RandomForestRegressor, ExtraTreesRegressor
from sklearn.linear_model import RidgeCV, LassoCV
from sklearn.kernel_ridge import KernelRidge
from sklearn.svm import SVR
from sklearn.model_selection import cross_val_score
from scipy.optimize import minimize
from statsmodels.nonparametric.smoothers_lowess import lowess
# Six-Hump Camel Function (Objective function, to be minimized)
def six_hump_camel(x):
    """
    Six-Hump Camel Function:
    - Global minima located at:
        (0.0898, -0.7126),
        (-0.0898, 0.7126)
    - Function value at the minima: f(x) = -1.0316
    """
    x1 = x[0]
    x2 = x[1]
    term1 = (4 - 2.1 * x1**2 + (x1**4) / 3) * x1**2
    term2 = x1 * x2
    term3 = (-4 + 4 * x2**2) * x2**2
    return term1 + term2 + term3
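As a quick sanity check (not part of the original code), evaluating the function at one of its known global minima should return approximately -1.0316:

```python
# Sanity check at a known global minimum of the six-hump camel function
print(six_hump_camel([0.0898, -0.7126]))  # approximately -1.0316
```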
import matplotlib.pyplot as plt
import numpy as np
# Generate a grid of points in the input space
x = np.linspace(-3, 3, 100)
y = np.linspace(-2, 2, 100)
X, Y = np.meshgrid(x, y)
# Evaluate the objective function at each point in the grid
Z = np.zeros_like(X)
for i in range(X.shape[0]):
    for j in range(X.shape[1]):
        Z[i, j] = six_hump_camel([X[i, j], Y[i, j]])
# Plot the contour map
plt.figure(figsize=(8, 6))
contour = plt.contourf(X, Y, Z, levels=50, cmap='viridis')
plt.colorbar(contour, label='Objective function value')
plt.title('Contour plot of the Six-Hump Camel function')
plt.xlabel('x1')
plt.ylabel('x2')
plt.show()
from sklearn.utils import all_estimators
from tqdm import tqdm
# Get all available scikit-learn estimators
estimators = all_estimators(type_filter='regressor')
results = []
# Loop through all regressors
for name, RegressorClass in tqdm(estimators):
    try:
        # Instantiate the regressor (you might need to handle potential exceptions or required parameters)
        regressor = RegressorClass()
        print(f"\n Successfully instantiated regressor: {name} ----------")
        # GPopt for Bayesian optimization
        gp_opt = gp.GPOpt(objective_func=six_hump_camel,
                          lower_bound=np.array([-3, -2]),
                          upper_bound=np.array([3, 2]),
                          acquisition="ucb",
                          method="splitconformal",
                          surrogate_obj=ns.PredictionInterval(regressor),  # Any surrogate model can be used, thanks to nnetsauce
                          n_init=10,
                          n_iter=190,
                          seed=432)
        print(f"gp_opt.method: {gp_opt.method}")
        res = gp_opt.optimize(verbose=1, ucb_tol=1e-6)
        print(f"\n\n result: {res}")
        display(res.best_params)
        display(res.best_score)
        results.append((name, res))
    except Exception as e:
        print(f"Could not instantiate regressor {name}: {e}")
import pandas as pd
results_df = pd.DataFrame(columns=['Regressor', 'Best Params', 'Best Score'])
for name, res in results:
    best_params = res.best_params
    best_score = res.best_score
    results_df = pd.concat([results_df, pd.DataFrame({'Regressor': [name], 'Best Params': [best_params], 'Best Score': [best_score]})], ignore_index=True)
results_df.sort_values(by='Best Score', ascending=True, inplace=True)
results_df.reset_index(drop=True, inplace=True)
results_df.style.format({'Best Score': "{:.5f}"})
| | Regressor | Best Params | Best Score |
|---|---|---|---|
0 | BaggingRegressor | [ 0.09649658 -0.71691895] | -1.03133 |
1 | GaussianProcessRegressor | [ 0.09649658 -0.71691895] | -1.03133 |
2 | NuSVR | [ 0.09649658 -0.71691895] | -1.03133 |
3 | SVR | [ 0.09649658 -0.71691895] | -1.03133 |
4 | MLPRegressor | [-0.09155273 0.69482422] | -1.02905 |
5 | GradientBoostingRegressor | [ 0.04907227 -0.71142578] | -1.02514 |
6 | KNeighborsRegressor | [ 0.08203125 -0.6640625 ] | -1.01372 |
7 | ExtraTreeRegressor | [ 0.08203125 -0.6640625 ] | -1.01372 |
8 | RandomForestRegressor | [ 0.08203125 -0.6640625 ] | -1.01372 |
9 | DecisionTreeRegressor | [ 0.08203125 -0.6640625 ] | -1.01372 |
10 | HistGradientBoostingRegressor | [-0.00732422 -0.72167969] | -0.99277 |
11 | AdaBoostRegressor | [ 0.09375 -0.8125 ] | -0.93858 |
12 | ExtraTreesRegressor | [-0.05877686 -0.66418457] | -0.93331 |
13 | ElasticNet | [-0.06650758 -0.66453519] | -0.92451 |
14 | ARDRegression | [-0.06650758 -0.66453519] | -0.92451 |
15 | ElasticNetCV | [-0.06650758 -0.66453519] | -0.92451 |
16 | KernelRidge | [-0.06650758 -0.66453519] | -0.92451 |
17 | HuberRegressor | [-0.06650758 -0.66453519] | -0.92451 |
18 | Lars | [-0.06650758 -0.66453519] | -0.92451 |
19 | LarsCV | [-0.06650758 -0.66453519] | -0.92451 |
20 | LassoLars | [-0.06650758 -0.66453519] | -0.92451 |
21 | LassoLarsCV | [-0.06650758 -0.66453519] | -0.92451 |
22 | Lasso | [-0.06650758 -0.66453519] | -0.92451 |
23 | LassoCV | [-0.06650758 -0.66453519] | -0.92451 |
24 | LinearRegression | [-0.06650758 -0.66453519] | -0.92451 |
25 | LassoLarsIC | [-0.06650758 -0.66453519] | -0.92451 |
26 | LinearSVR | [-0.06650758 -0.66453519] | -0.92451 |
27 | OrthogonalMatchingPursuit | [-0.06650758 -0.66453519] | -0.92451 |
28 | OrthogonalMatchingPursuitCV | [-0.06650758 -0.66453519] | -0.92451 |
29 | PLSRegression | [-0.06650758 -0.66453519] | -0.92451 |
30 | DummyRegressor | [-0.06650758 -0.66453519] | -0.92451 |
31 | BayesianRidge | [-0.06650758 -0.66453519] | -0.92451 |
32 | QuantileRegressor | [-0.06650758 -0.66453519] | -0.92451 |
33 | PassiveAggressiveRegressor | [-0.06650758 -0.66453519] | -0.92451 |
34 | RadiusNeighborsRegressor | [-0.06650758 -0.66453519] | -0.92451 |
35 | RANSACRegressor | [-0.06650758 -0.66453519] | -0.92451 |
36 | Ridge | [-0.06650758 -0.66453519] | -0.92451 |
37 | RidgeCV | [-0.06650758 -0.66453519] | -0.92451 |
38 | SGDRegressor | [-0.06650758 -0.66453519] | -0.92451 |
39 | TheilSenRegressor | [-0.06650758 -0.66453519] | -0.92451 |
40 | TransformedTargetRegressor | [-0.06650758 -0.66453519] | -0.92451 |
41 | TweedieRegressor | [-0.06650758 -0.66453519] | -0.92451 |
# Michalewicz Function
def michalewicz(x, m=10):
    """
    Michalewicz Function (for n=2 dimensions):
    """
    return -sum(np.sin(xi) * (np.sin((i + 1) * xi**2 / np.pi))**(2 * m) for i, xi in enumerate(x))
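As a quick sanity check (again, not part of the original code): the commonly reported global minimum of the unrestricted 2D Michalewicz function is approximately -1.8013, near (2.20, 1.57). The search domain used below excludes that point, which is why the optimizers report less negative best scores.

```python
# Sanity check near the commonly reported 2D global minimum
print(michalewicz([2.20, 1.57]))  # approximately -1.80
```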
import matplotlib.pyplot as plt
import numpy as np
# Generate a grid of points in the input space
x = np.linspace(0, 2, 100)
y = np.linspace(np.pi, 2, 100)
X, Y = np.meshgrid(x, y)
# Evaluate the objective function at each point in the grid
Z = np.zeros_like(X)
for i in range(X.shape[0]):
    for j in range(X.shape[1]):
        Z[i, j] = michalewicz([X[i, j], Y[i, j]])
# Plot the contour map
plt.figure(figsize=(8, 6))
contour = plt.contourf(X, Y, Z, levels=50, cmap='viridis')
plt.colorbar(contour, label='Objective function value')
plt.title('Contour plot of the Michalewicz function')
plt.xlabel('x1')
plt.ylabel('x2')
plt.show()
from sklearn.utils import all_estimators
from tqdm import tqdm
# Get all available scikit-learn estimators
estimators = all_estimators(type_filter='regressor')
results = []
# Loop through all regressors
for name, RegressorClass in tqdm(estimators):
    try:
        # Instantiate the regressor (you might need to handle potential exceptions or required parameters)
        regressor = RegressorClass()
        print(f"\n Successfully instantiated regressor: {name} ----------")
        # GPopt for Bayesian optimization
        gp_opt = gp.GPOpt(objective_func=michalewicz,
                          lower_bound=np.array([0, np.pi]),
                          upper_bound=np.array([2, 2]),
                          acquisition="ucb",
                          method="splitconformal",
                          surrogate_obj=ns.PredictionInterval(regressor),  # Any surrogate model can be used, thanks to nnetsauce
                          n_init=10,
                          n_iter=190,
                          seed=432)
        print(f"gp_opt.method: {gp_opt.method}")
        res = gp_opt.optimize(verbose=1, ucb_tol=1e-6)
        print(f"\n\n result: {res}")
        display(res.best_params)
        display(res.best_score)
        results.append((name, res))
    except Exception as e:
        print(f"Could not instantiate regressor {name}: {e}")
import pandas as pd
results_df = pd.DataFrame(columns=['Regressor', 'Best Params', 'Best Score'])
for name, res in results:
    best_params = res.best_params
    best_score = res.best_score
    results_df = pd.concat([results_df, pd.DataFrame({'Regressor': [name], 'Best Params': [best_params], 'Best Score': [best_score]})], ignore_index=True)
results_df.sort_values(by='Best Score', ascending=True, inplace=True)
results_df.reset_index(drop=True, inplace=True)
results_df.style.format({'Best Score': "{:.5f}"})
| | Regressor | Best Params | Best Score |
|---|---|---|---|
0 | BaggingRegressor | [1.9989624 2.71631734] | -0.77895 |
1 | GradientBoostingRegressor | [1.9989624 2.71631734] | -0.77895 |
2 | GaussianProcessRegressor | [1.9989624 2.71631734] | -0.77895 |
3 | AdaBoostRegressor | [1.99511719 2.70736381] | -0.76882 |
4 | MLPRegressor | [1.99978638 2.68494514] | -0.74841 |
5 | RandomForestRegressor | [1.99978638 2.68494514] | -0.74841 |
6 | ExtraTreesRegressor | [1.97668457 2.67872644] | -0.67143 |
7 | ExtraTreeRegressor | [1.9453125 2.68227998] | -0.60804 |
8 | HuberRegressor | [1.93724655 2.67858092] | -0.58092 |
9 | KNeighborsRegressor | [1.93724655 2.67858092] | -0.58092 |
10 | KernelRidge | [1.93724655 2.67858092] | -0.58092 |
11 | ElasticNetCV | [1.93724655 2.67858092] | -0.58092 |
12 | LarsCV | [1.93724655 2.67858092] | -0.58092 |
13 | LassoCV | [1.93724655 2.67858092] | -0.58092 |
14 | Lars | [1.93724655 2.67858092] | -0.58092 |
15 | ARDRegression | [1.93724655 2.67858092] | -0.58092 |
16 | OrthogonalMatchingPursuitCV | [1.93724655 2.67858092] | -0.58092 |
17 | PLSRegression | [1.93724655 2.67858092] | -0.58092 |
18 | NuSVR | [1.93724655 2.67858092] | -0.58092 |
19 | OrthogonalMatchingPursuit | [1.93724655 2.67858092] | -0.58092 |
20 | LinearRegression | [1.93724655 2.67858092] | -0.58092 |
21 | LassoLarsIC | [1.93724655 2.67858092] | -0.58092 |
22 | LinearSVR | [1.93724655 2.67858092] | -0.58092 |
23 | LassoLarsCV | [1.93724655 2.67858092] | -0.58092 |
24 | PassiveAggressiveRegressor | [1.93724655 2.67858092] | -0.58092 |
25 | QuantileRegressor | [1.93724655 2.67858092] | -0.58092 |
26 | SGDRegressor | [1.93724655 2.67858092] | -0.58092 |
27 | RidgeCV | [1.93724655 2.67858092] | -0.58092 |
28 | Ridge | [1.93724655 2.67858092] | -0.58092 |
29 | RadiusNeighborsRegressor | [1.93724655 2.67858092] | -0.58092 |
30 | RANSACRegressor | [1.93724655 2.67858092] | -0.58092 |
31 | BayesianRidge | [1.93724655 2.67858092] | -0.58092 |
32 | TweedieRegressor | [1.93724655 2.67858092] | -0.58092 |
33 | TransformedTargetRegressor | [1.93724655 2.67858092] | -0.58092 |
34 | TheilSenRegressor | [1.93724655 2.67858092] | -0.58092 |
35 | DecisionTreeRegressor | [1.8515625 2.73579214] | -0.47178 |
36 | SVR | [0.76176453 2.71127445] | -0.41275 |
37 | DummyRegressor | [0.75 2.71349541] | -0.41257 |
38 | ElasticNet | [0.75 2.71349541] | -0.41257 |
39 | HistGradientBoostingRegressor | [0.75 2.71349541] | -0.41257 |
40 | Lasso | [0.75 2.71349541] | -0.41257 |
41 | LassoLars | [0.75 2.71349541] | -0.41257 |