Today, give the Techtonique web app a try: a tool designed to help you make informed, data-driven decisions using Mathematics, Statistics, Machine Learning, and Data Visualization. A tutorial with audio, video, code, and slides is available here: https://moudiki2.gumroad.com/l/nrhgb. Every user now gets (and will keep getting) 100 free API requests per month, regardless of pricing tier.
Introduction
In this post, we compare, on 72 data sets, the balanced accuracy of XGBoost with its default hyperparameters against that of a GenericBooster using Linear Regression as base learner, also with its default hyperparameters. The first clearly overfits when left untuned (to an extent I wanted to quantify) but reaches a higher test set balanced accuracy overall, whereas the second keeps training and test set scores highly correlated, at the cost of a lower test set balanced accuracy overall.
What We’ll Cover
- Setting up the environment with necessary packages
- Loading and preparing the data
- Comparing model performances
- Analyzing overfitting patterns
- Drawing conclusions about which approach might be better for specific scenarios
Let’s dive into the implementation and analysis.
!pip install genbooster
!pip install nnetsauce
Import the necessary packages.
import xgboost as xgb
import joblib
import requests
import io
from genbooster.genboosterclassifier import BoosterClassifier
from genbooster.randombagclassifier import RandomBagClassifier
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import BaggingRegressor
from sklearn.linear_model import Ridge, LinearRegression
from sklearn.gaussian_process import GaussianProcessRegressor
from tqdm import tqdm
from scipy import stats
from sklearn.model_selection import train_test_split
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import cross_val_score
from sklearn.metrics import make_scorer
from time import time
Obtain the 72 data sets, a reduced version of the OpenML-CC18 benchmark, available at https://github.com/thierrymoudiki/openml-cc18-reduced.
# Fetch the file content from the URL - **Use the raw file URL!**
url = 'https://github.com/thierrymoudiki/openml-cc18-reduced/raw/main/openml-cc18-Xys-2024-05-20.pkl'
# Use a session to handle potential connection interruptions
session = requests.Session()
response = session.get(url, stream=True)
response.raise_for_status() # Raise an exception for bad responses
# Load the data from the downloaded content in chunks
with io.BytesIO() as buffer:
    for chunk in response.iter_content(chunk_size=1024*1024):  # Download in 1MB chunks
        buffer.write(chunk)
    buffer.seek(0)  # Reset buffer position to the beginning
    clf_datasets = joblib.load(buffer)
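As a quick sanity check, here is a minimal sketch (assuming, as in the loops below, that each entry of clf_datasets maps a dataset name to a dictionary whose 'dataset' key holds an (X, y) pair) that lists a few datasets and their dimensions:

# Minimal sanity check on the downloaded collection
# (assumes clf_datasets[name]['dataset'] == (X, y), as used in the loops below)
print("Number of datasets:", len(clf_datasets))
for name, content in list(clf_datasets.items())[:5]:  # peek at the first 5 datasets
    X, y = content['dataset'][0], content['dataset'][1]
    print(name, "- X shape:", getattr(X, 'shape', None), "- number of classes:", len(set(y)))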
1 - XGBoost results
results = {}
balanced_accuracy_scorer = make_scorer(balanced_accuracy_score)
for i, dataset in tqdm(enumerate(clf_datasets.items())):
    dataset_name = dataset[0]
    print("\n ----------", (i + 1), "/", len(clf_datasets.items()), ": ", dataset_name)
    try:
        X, y = dataset[1]['dataset'][0], dataset[1]['dataset'][1]
        # Split dataset into training and testing sets
        X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42, stratify=y)
        model = xgb.XGBClassifier()
        model.fit(X_train, y_train)
        y_pred_train = model.predict(X_train)
        y_pred = model.predict(X_test)
        # Calculate balanced accuracy
        balanced_acc_train = balanced_accuracy_score(y_train, y_pred_train)
        balanced_acc_test = balanced_accuracy_score(y_test, y_pred)
        print("Training Balanced Accuracy:", balanced_acc_train)
        print("Testing Balanced Accuracy:", balanced_acc_test)
        results[dataset_name] = {"training_score": balanced_acc_train,
                                 "testing_score": balanced_acc_test}
    except Exception as e:
        print("Error:", e)
0it [00:00, ?it/s]
---------- 1 / 72 : kr-vs-kp
1it [00:00, 4.01it/s]
Training Balanced Accuracy: 0.9793965874420882
Testing Balanced Accuracy: 0.975187969924812
---------- 2 / 72 : letter
2it [00:03, 2.19s/it]
Training Balanced Accuracy: 1.0
Testing Balanced Accuracy: 0.7513736263736264
---------- 3 / 72 : balance-scale
3it [00:04, 1.52s/it]
Training Balanced Accuracy: 1.0
Testing Balanced Accuracy: 0.7260334744908249
---------- 4 / 72 : mfeat-factors
4it [00:09, 2.97s/it]
Training Balanced Accuracy: 1.0
Testing Balanced Accuracy: 0.86
---------- 5 / 72 : mfeat-fourier
5it [00:16, 4.19s/it]
Training Balanced Accuracy: 1.0
Testing Balanced Accuracy: 0.86
---------- 6 / 72 : breast-w
6it [00:17, 3.20s/it]
Training Balanced Accuracy: 1.0
Testing Balanced Accuracy: 0.9470108695652174
---------- 7 / 72 : mfeat-karhunen
7it [00:22, 3.84s/it]
Training Balanced Accuracy: 1.0
Testing Balanced Accuracy: 0.905
---------- 8 / 72 : mfeat-morphological
8it [00:22, 2.76s/it]
Training Balanced Accuracy: 0.9862499999999998
Testing Balanced Accuracy: 0.7950000000000002
---------- 9 / 72 : mfeat-zernike
10it [00:23, 1.46s/it]
Training Balanced Accuracy: 0.99875
Testing Balanced Accuracy: 0.7799999999999999
---------- 10 / 72 : cmc
Training Balanced Accuracy: 0.9798009974256786
Testing Balanced Accuracy: 0.6773695839418791
---------- 11 / 72 : optdigits
13it [00:23, 1.66it/s]
Training Balanced Accuracy: 1.0
Testing Balanced Accuracy: 0.905
---------- 12 / 72 : credit-approval
Training Balanced Accuracy: 1.0
Testing Balanced Accuracy: 0.8353204172876304
---------- 13 / 72 : credit-g
Training Balanced Accuracy: 1.0
Testing Balanced Accuracy: 0.6797619047619048
---------- 14 / 72 : pendigits
16it [00:24, 3.01it/s]
Training Balanced Accuracy: 1.0
Testing Balanced Accuracy: 0.9240601503759398
---------- 15 / 72 : diabetes
Training Balanced Accuracy: 1.0
Testing Balanced Accuracy: 0.7112962962962963
---------- 16 / 72 : spambase
Training Balanced Accuracy: 1.0
Testing Balanced Accuracy: 0.9041740767862747
---------- 17 / 72 : splice
18it [00:24, 4.30it/s]
Training Balanced Accuracy: 1.0
Testing Balanced Accuracy: 0.9529914529914528
---------- 18 / 72 : tic-tac-toe
Training Balanced Accuracy: 1.0
Testing Balanced Accuracy: 0.9776119402985075
---------- 19 / 72 : vehicle
Training Balanced Accuracy: 1.0
Testing Balanced Accuracy: 0.7570075757575757
---------- 20 / 72 : electricity
21it [00:24, 5.32it/s]
Training Balanced Accuracy: 1.0
Testing Balanced Accuracy: 0.778005115089514
---------- 21 / 72 : satimage
Training Balanced Accuracy: 1.0
Testing Balanced Accuracy: 0.8195766960114664
---------- 22 / 72 : eucalyptus
22it [00:25, 5.05it/s]
Training Balanced Accuracy: 1.0
Testing Balanced Accuracy: 0.6379919821780285
---------- 23 / 72 : sick
Training Balanced Accuracy: 1.0
Testing Balanced Accuracy: 0.9583333333333333
---------- 24 / 72 : vowel
24it [00:25, 4.56it/s]
Training Balanced Accuracy: 1.0
Testing Balanced Accuracy: 0.898989898989899
---------- 25 / 72 : isolet
25it [00:26, 2.19it/s]
Training Balanced Accuracy: 1.0
Testing Balanced Accuracy: 0.6002747252747253
---------- 26 / 72 : analcatdata_authorship
Training Balanced Accuracy: 1.0
Testing Balanced Accuracy: 0.9850446428571429
---------- 27 / 72 : analcatdata_dmft
27it [00:27, 2.97it/s]
Training Balanced Accuracy: 0.5354622066211941
Testing Balanced Accuracy: 0.1808442851453604
---------- 28 / 72 : mnist_784
30it [00:27, 3.67it/s]
Training Balanced Accuracy: 0.9974358974358974
Testing Balanced Accuracy: 0.5634728124659475
---------- 29 / 72 : pc4
Training Balanced Accuracy: 1.0
Testing Balanced Accuracy: 0.7357954545454546
---------- 30 / 72 : pc3
Training Balanced Accuracy: 0.9939024390243902
Testing Balanced Accuracy: 0.7138888888888889
---------- 31 / 72 : jm1
33it [00:28, 5.62it/s]
Training Balanced Accuracy: 0.9894845464612907
Testing Balanced Accuracy: 0.6519350215002389
---------- 32 / 72 : kc2
Training Balanced Accuracy: 0.9764705882352941
Testing Balanced Accuracy: 0.6002190580503833
---------- 33 / 72 : kc1
Training Balanced Accuracy: 0.9660003848559195
Testing Balanced Accuracy: 0.7827829738499714
---------- 34 / 72 : pc1
35it [00:28, 7.30it/s]
Training Balanced Accuracy: 0.9818181818181818
Testing Balanced Accuracy: 0.7803379416282642
---------- 35 / 72 : adult
Training Balanced Accuracy: 1.0
Testing Balanced Accuracy: 0.7384868421052632
---------- 36 / 72 : Bioresponse
Training Balanced Accuracy: 1.0
Testing Balanced Accuracy: 0.7790603891521323
---------- 37 / 72 : wdbc
39it [00:28, 9.19it/s]
Training Balanced Accuracy: 1.0
Testing Balanced Accuracy: 0.933531746031746
---------- 38 / 72 : phoneme
Training Balanced Accuracy: 1.0
Testing Balanced Accuracy: 0.8967423969227071
---------- 39 / 72 : qsar-biodeg
Training Balanced Accuracy: 1.0
Testing Balanced Accuracy: 0.9139827179890023
---------- 40 / 72 : wall-robot-navigation
Training Balanced Accuracy: 1.0
Testing Balanced Accuracy: 0.9885802469135803
---------- 41 / 72 : semeion
43it [00:29, 8.21it/s]
Training Balanced Accuracy: 0.6431816659664762
Testing Balanced Accuracy: 0.63
---------- 42 / 72 : ilpd
Training Balanced Accuracy: 1.0
Testing Balanced Accuracy: 0.5405740609496811
---------- 43 / 72 : madelon
Training Balanced Accuracy: 1.0
Testing Balanced Accuracy: 0.845
---------- 44 / 72 : nomao
45it [00:29, 9.50it/s]
Training Balanced Accuracy: 0.9890350877192983
Testing Balanced Accuracy: 0.9456508403876824
---------- 45 / 72 : ozone-level-8hr
Training Balanced Accuracy: 1.0
Testing Balanced Accuracy: 0.7531879884821061
---------- 46 / 72 : cnae-9
Training Balanced Accuracy: 0.7333304959709455
Testing Balanced Accuracy: 0.699604743083004
---------- 47 / 72 : first-order-theorem-proving
50it [00:30, 7.28it/s]
Training Balanced Accuracy: 0.9974161661661661
Testing Balanced Accuracy: 0.44510582010582017
---------- 48 / 72 : banknote-authentication
Training Balanced Accuracy: 1.0
Testing Balanced Accuracy: 1.0
---------- 49 / 72 : blood-transfusion-service-center
Training Balanced Accuracy: 0.8398196194712132
Testing Balanced Accuracy: 0.6754385964912281
---------- 50 / 72 : PhishingWebsites
Training Balanced Accuracy: 0.9715260585285342
Testing Balanced Accuracy: 0.9516145358841988
---------- 51 / 72 : cylinder-bands
52it [00:30, 8.60it/s]
Training Balanced Accuracy: 1.0
Testing Balanced Accuracy: 0.8183730715287517
---------- 52 / 72 : bank-marketing
Training Balanced Accuracy: 1.0
Testing Balanced Accuracy: 0.6352247605011054
---------- 53 / 72 : GesturePhaseSegmentationProcessed
Training Balanced Accuracy: 1.0
Testing Balanced Accuracy: 0.473961038961039
---------- 54 / 72 : har
54it [00:31, 4.44it/s]
Training Balanced Accuracy: 1.0
Testing Balanced Accuracy: 0.7054457054457054
---------- 55 / 72 : dresses-sales
Training Balanced Accuracy: 0.9970238095238095
Testing Balanced Accuracy: 0.5472085385878489
---------- 56 / 72 : texture
57it [00:31, 5.19it/s]
Training Balanced Accuracy: 1.0
Testing Balanced Accuracy: 0.9494949494949495
---------- 57 / 72 : connect-4
Training Balanced Accuracy: 0.8687185238384995
Testing Balanced Accuracy: 0.490677451203767
---------- 58 / 72 : MiceProtein
58it [00:32, 4.66it/s]
Training Balanced Accuracy: 1.0
Testing Balanced Accuracy: 0.9510996240601504
---------- 59 / 72 : steel-plates-fault
59it [00:32, 3.88it/s]
Training Balanced Accuracy: 1.0
Testing Balanced Accuracy: 0.8302055604850634
---------- 60 / 72 : climate-model-simulation-crashes
Training Balanced Accuracy: 1.0
Testing Balanced Accuracy: 0.7222222222222222
---------- 61 / 72 : wilt
Training Balanced Accuracy: 0.9880952380952381
Testing Balanced Accuracy: 0.9090909090909092
---------- 62 / 72 : car
62it [00:32, 5.28it/s]
Training Balanced Accuracy: 1.0
Testing Balanced Accuracy: 0.9357142857142857
---------- 63 / 72 : segment
63it [00:34, 1.90it/s]
Training Balanced Accuracy: 0.9974826446647592
Testing Balanced Accuracy: 0.9649894440534835
---------- 64 / 72 : mfeat-pixel
64it [00:35, 2.15it/s]
Training Balanced Accuracy: 0.9012499999999999
Testing Balanced Accuracy: 0.7649999999999999
---------- 65 / 72 : Fashion-MNIST
66it [00:36, 2.41it/s]
Training Balanced Accuracy: 1.0
Testing Balanced Accuracy: 0.545
---------- 66 / 72 : jungle_chess_2pcs_raw_endgame_complete
Training Balanced Accuracy: 1.0
Testing Balanced Accuracy: 0.7222113037136032
---------- 67 / 72 : numerai28.6
67it [00:36, 2.87it/s]
Training Balanced Accuracy: 1.0
Testing Balanced Accuracy: 0.5044004400440043
---------- 68 / 72 : Devnagari-Script
68it [00:38, 1.32it/s]
Training Balanced Accuracy: 0.9909686700767264
Testing Balanced Accuracy: 0.2217391304347826
---------- 69 / 72 : CIFAR_10
71it [00:39, 2.08it/s]
Training Balanced Accuracy: 1.0
Testing Balanced Accuracy: 0.215
---------- 70 / 72 : Internet-Advertisements
Training Balanced Accuracy: 0.967741724282422
Testing Balanced Accuracy: 0.9613787375415282
---------- 71 / 72 : dna
Training Balanced Accuracy: 0.9147506693440429
Testing Balanced Accuracy: 0.9043803418803419
---------- 72 / 72 : churn
72it [00:39, 1.84it/s]
Training Balanced Accuracy: 1.0
Testing Balanced Accuracy: 0.9111295681063123
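Since the training scores above are computed on the very data each model was fit on, they are an optimistic yardstick. As a hedged, illustrative sketch (reusing the balanced_accuracy_scorer defined earlier and the imported cross_val_score, and assuming, as in the loop above, that the targets are already numerically encoded), a cross-validated estimate for default XGBoost on a single dataset could be obtained like this:

# Illustrative sketch: 5-fold cross-validated balanced accuracy for default XGBoost
# on one dataset ('kr-vs-kp'), using the balanced_accuracy_scorer defined above
X_cv, y_cv = clf_datasets['kr-vs-kp']['dataset'][0], clf_datasets['kr-vs-kp']['dataset'][1]
cv_scores = cross_val_score(xgb.XGBClassifier(), X_cv, y_cv,
                            cv=5, scoring=balanced_accuracy_scorer)
print("CV balanced accuracy: %.3f +/- %.3f" % (cv_scores.mean(), cv_scores.std()))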
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np
# Your data
data = results
# Convert to DataFrame - correct approach
df = pd.DataFrame.from_dict({(i,j): data[i][j]
for i in data.keys()
for j in data[i].keys()},
orient='index')
# The index is a MultiIndex with (dataset, score_type)
# Let's properly unpack it
df = df.reset_index()
df.columns = ['index', 'score'] # Temporary column names
# Split the multi-index into separate columns
df[['dataset', 'score_type']] = pd.DataFrame(df['index'].tolist(), index=df.index)
df = df.drop(columns=['index'])
# Now pivot the data
plot_df = df.pivot(index='dataset', columns='score_type', values='score')
# Sort by test score for better visualization
plot_df = plot_df.sort_values('testing_score')
# Create figure
plt.figure(figsize=(14, 10), dpi=100)
# Plot training and test scores
sns.scatterplot(data=plot_df, x=plot_df.index, y='training_score',
color='blue', label='Training Score', s=100, alpha=0.7)
sns.scatterplot(data=plot_df, x=plot_df.index, y='testing_score',
color='red', label='Test Score', s=100, alpha=0.7)
# Add connecting lines
for i, dataset in enumerate(plot_df.index):
    plt.plot([i, i], [plot_df.loc[dataset, 'training_score'],
                      plot_df.loc[dataset, 'testing_score']],
             color='gray', alpha=0.3, linestyle='--')
# Add perfect score line
plt.axhline(1.0, color='green', linestyle=':', alpha=0.5, label='Perfect Score')
# Customize plot
plt.title('Training vs Test Scores Across Datasets', fontsize=16, pad=20)
plt.xlabel('Dataset', fontsize=12)
plt.ylabel('Score', fontsize=12)
plt.xticks(rotation=90, fontsize=8)
plt.yticks(np.arange(0, 1.1, 0.1))
plt.ylim(0, 1.05)
plt.legend(loc='upper right', bbox_to_anchor=(1.15, 1))
plt.grid(True, alpha=0.3)
# Adjust layout
plt.tight_layout()
plt.show()
import numpy as np
import pandas as pd
from scipy.stats import ttest_rel, wilcoxon, shapiro
import matplotlib.pyplot as plt
import seaborn as sns
def detect_overfitting(train_scores, test_scores, alpha=0.05, plot=True):
    """
    Performs statistical tests to detect overfitting between training and test scores,
    including BOTH paired t-test and Wilcoxon signed-rank test for robustness.

    Parameters:
    -----------
    train_scores : array-like
        Array of training scores (e.g., accuracies) for multiple datasets
    test_scores : array-like
        Array of test scores for the same datasets
    alpha : float, default=0.05
        Significance level for statistical tests
    plot : bool, default=True
        Whether to generate diagnostic plots

    Returns:
    --------
    dict
        Dictionary containing test results and effect sizes
    """
    # Convert to numpy arrays
    train_scores = np.asarray(train_scores)
    test_scores = np.asarray(test_scores)
    differences = train_scores - test_scores

    # Initialize results dictionary
    results = {
        'n_datasets': len(train_scores),
        'mean_train_score': np.mean(train_scores),
        'mean_test_score': np.mean(test_scores),
        'mean_difference': np.mean(differences),
        'median_difference': np.median(differences),
        'std_difference': np.std(differences, ddof=1),
    }

    # Normality test (for interpretation guidance)
    _, shapiro_p = shapiro(differences)
    results['normality_shapiro_p'] = shapiro_p
    results['normal_distribution'] = shapiro_p >= alpha

    # ========================
    # 1. Paired t-test (parametric)
    # ========================
    t_stat, t_p = ttest_rel(train_scores, test_scores, alternative='greater')
    results['paired_ttest'] = {
        'statistic': t_stat,
        'p_value': t_p,
        'significant': t_p < alpha
    }

    # ========================
    # 2. Wilcoxon signed-rank test (non-parametric)
    # ========================
    w_stat, w_p = wilcoxon(train_scores, test_scores, alternative='greater')
    results['wilcoxon_test'] = {
        'statistic': w_stat,
        'p_value': w_p,
        'significant': w_p < alpha
    }

    # ========================
    # Effect size measures
    # ========================
    # Cohen's d (standardized mean difference)
    results['cohens_d'] = results['mean_difference'] / results['std_difference'] if results['std_difference'] > 0 else 0

    # Rank-biserial correlation (for Wilcoxon)
    n = len(differences)
    results['rank_biserial'] = 1 - (2 * w_stat) / (n * (n + 1)) if n > 0 else 0

    # Effect size classification
    for eff_size in ['cohens_d', 'rank_biserial']:
        val = abs(results[eff_size])
        if val > 0.8:
            results[f'{eff_size}_interpret'] = 'large'
        elif val > 0.5:
            results[f'{eff_size}_interpret'] = 'medium'
        else:
            results[f'{eff_size}_interpret'] = 'small'

    # ========================
    # Generate plots
    # ========================
    if plot:
        plt.figure(figsize=(15, 5))

        # Plot 1: Training vs Test scores
        plt.subplot(1, 3, 1)
        sns.scatterplot(x=train_scores, y=test_scores)
        plt.plot([0, 1], [0, 1], 'k--', alpha=0.5)
        plt.xlabel('Training Scores')
        plt.ylabel('Test Scores')
        plt.title('Training vs Test Scores')
        plt.grid(True, alpha=0.3)

        # Plot 2: Distribution of differences
        plt.subplot(1, 3, 2)
        sns.histplot(differences, kde=True)
        plt.axvline(results['mean_difference'], color='r', linestyle='--', label='Mean')
        plt.axvline(results['median_difference'], color='g', linestyle=':', label='Median')
        plt.xlabel('Train Score - Test Score')
        plt.title('Distribution of Differences')
        plt.legend()
        plt.grid(True, alpha=0.3)

        # Plot 3: Boxplot of scores
        plt.subplot(1, 3, 3)
        pd.DataFrame({
            'Training': train_scores,
            'Test': test_scores
        }).boxplot()
        plt.title('Score Distributions')
        plt.grid(True, alpha=0.3)

        plt.tight_layout()
        plt.show()

    return results
# Run the analysis
training_scores = [result['training_score'] for result in results.values()]
testing_scores = [result['testing_score'] for result in results.values()]
results = detect_overfitting(training_scores, testing_scores)
# Print results in a readable format
print("=== Overfitting Detection Report ===")
print(f"Datasets: {results['n_datasets']}")
print(f"Mean training score: {results['mean_train_score']:.3f}")
print(f"Mean test score: {results['mean_test_score']:.3f}")
print(f"Mean difference: {results['mean_difference']:.3f} (SD={results['std_difference']:.3f})")
print(f"Spearman rho: {stats.spearmanr(training_scores, testing_scores).correlation:.3f}")
print("\n=== Normality Test ===")
print(f"Shapiro-Wilk p-value: {results['normality_shapiro_p']:.4f}")
print("Interpretation:", "Normal distribution" if results['normal_distribution'] else "Non-normal distribution")
print("\n=== Statistical Tests ===")
print("1. Paired t-test:")
print(f" t = {results['paired_ttest']['statistic']:.3f}, p = {results['paired_ttest']['p_value']:.5f}")
print(f" Significant: {'✅' if results['paired_ttest']['significant'] else '❌'}")
print("\n2. Wilcoxon signed-rank test:")
print(f" W = {results['wilcoxon_test']['statistic']:.3f}, p = {results['wilcoxon_test']['p_value']:.5f}")
print(f" Significant: {'✅' if results['wilcoxon_test']['significant'] else '❌'}")
print("\n=== Effect Sizes ===")
print(f"Cohen's d: {results['cohens_d']:.3f} ({results['cohens_d_interpret']} effect)")
print(f"Rank-biserial correlation: {results['rank_biserial']:.3f} ({results['rank_biserial_interpret']} effect)")
=== Overfitting Detection Report ===
Datasets: 72
Mean training score: 0.975
Mean test score: 0.764
Mean difference: 0.210 (SD=0.175)
Spearman rho: 0.208
=== Normality Test ===
Shapiro-Wilk p-value: 0.0001
Interpretation: Non-normal distribution
=== Statistical Tests ===
1. Paired t-test:
t = 10.184, p = 0.00000
Significant: ✅
2. Wilcoxon signed-rank test:
W = 2556.000, p = 0.00000
Significant: ✅
=== Effect Sizes ===
Cohen's d: 1.200 (large effect)
Rank-biserial correlation: 0.027 (small effect)
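As a quick check of the effect size reported above, Cohen's d is simply the mean train-test difference divided by the standard deviation of those differences:

# Cohen's d for XGBoost defaults, from the figures reported above
print(round(0.210 / 0.175, 2))  # approximately 1.20, the "large" effect in the report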
2 - GenBooster results
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor, ExtraTreeRegressor
from sklearn.neighbors import KNeighborsRegressor
from sklearn.neural_network import MLPRegressor

estimators = [LinearRegression()]
name_estimators = ["LinearRegression"]
balanced_accuracy_scorer = make_scorer(balanced_accuracy_score)

for j, est in enumerate(estimators):
    results = {}
    print("\n ---------- GenBooster +", name_estimators[j])
    try:
        for i, dataset in tqdm(enumerate(clf_datasets.items())):
            dataset_name = dataset[0]
            print("\n ----------", (i + 1), "/", len(clf_datasets.items()), ": ", dataset_name)
            try:
                X, y = dataset[1]['dataset'][0], dataset[1]['dataset'][1]
                # Split dataset into training and testing sets
                X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42, stratify=y)
                X_train = X_train.astype(np.float64)
                X_test = X_test.astype(np.float64)
                model = BoosterClassifier(base_estimator=est)
                model.fit(X_train, y_train)
                y_pred_train = model.predict(X_train)
                y_pred = model.predict(X_test)
                # Calculate balanced accuracy
                balanced_acc_train = balanced_accuracy_score(y_train, y_pred_train)
                balanced_acc_test = balanced_accuracy_score(y_test, y_pred)
                print("Training Balanced Accuracy:", balanced_acc_train)
                print("Testing Balanced Accuracy:", balanced_acc_test)
                results[dataset_name] = {"training_score": balanced_acc_train,
                                         "testing_score": balanced_acc_test}
            except Exception as e:
                print("Error", e)
                continue
    except Exception as e:
        print("Error", e)
        continue
# Your data
data = results
# Convert to DataFrame - correct approach
df = pd.DataFrame.from_dict({(i,j): data[i][j]
for i in data.keys()
for j in data[i].keys()},
orient='index')
# The index is a MultiIndex with (dataset, score_type)
# Let's properly unpack it
df = df.reset_index()
df.columns = ['index', 'score'] # Temporary column names
# Split the multi-index into separate columns
df[['dataset', 'score_type']] = pd.DataFrame(df['index'].tolist(), index=df.index)
df = df.drop(columns=['index'])
# Now pivot the data
plot_df = df.pivot(index='dataset', columns='score_type', values='score')
# Sort by test score for better visualization
plot_df = plot_df.sort_values('testing_score')
# Create figure
plt.figure(figsize=(14, 10), dpi=100)
# Plot training and test scores
sns.scatterplot(data=plot_df, x=plot_df.index, y='training_score',
color='blue', label='Training Score', s=100, alpha=0.7)
sns.scatterplot(data=plot_df, x=plot_df.index, y='testing_score',
color='red', label='Test Score', s=100, alpha=0.7)
# Add connecting lines
for i, dataset in enumerate(plot_df.index):
    plt.plot([i, i], [plot_df.loc[dataset, 'training_score'],
                      plot_df.loc[dataset, 'testing_score']],
             color='gray', alpha=0.3, linestyle='--')
# Add perfect score line
plt.axhline(1.0, color='green', linestyle=':', alpha=0.5, label='Perfect Score')
# Customize plot
plt.title('Training vs Test Scores Across Datasets for ' + name_estimators[j], fontsize=16, pad=20)
plt.xlabel('Dataset', fontsize=12)
plt.ylabel('Score', fontsize=12)
plt.xticks(rotation=90, fontsize=8)
plt.yticks(np.arange(0, 1.1, 0.1))
plt.ylim(0, 1.05)
plt.legend(loc='upper right', bbox_to_anchor=(1.15, 1))
plt.grid(True, alpha=0.3)
# Adjust layout
plt.tight_layout()
plt.show()
# Run the analysis
training_scores = [result['training_score'] for result in results.values()]
testing_scores = [result['testing_score'] for result in results.values()]
results = detect_overfitting(training_scores, testing_scores)
# Print results in a readable format
print("=== Overfitting Detection Report ===")
print(f"Datasets: {results['n_datasets']}")
print(f"Mean training score: {results['mean_train_score']:.3f}")
print(f"Mean test score: {results['mean_test_score']:.3f}")
print(f"Mean difference: {results['mean_difference']:.3f} (SD={results['std_difference']:.3f})")
print(f"Spearman rho: {stats.spearmanr(training_scores, testing_scores).correlation:.3f}")
print("\n=== Normality Test ===")
print(f"Shapiro-Wilk p-value: {results['normality_shapiro_p']:.4f}")
print("Interpretation:", "Normal distribution" if results['normal_distribution'] else "Non-normal distribution")
print("\n=== Statistical Tests ===")
print("1. Paired t-test:")
print(f" t = {results['paired_ttest']['statistic']:.3f}, p = {results['paired_ttest']['p_value']:.5f}")
print(f" Significant: {'✅' if results['paired_ttest']['significant'] else '❌'}")
print("\n2. Wilcoxon signed-rank test:")
print(f" W = {results['wilcoxon_test']['statistic']:.3f}, p = {results['wilcoxon_test']['p_value']:.5f}")
print(f" Significant: {'✅' if results['wilcoxon_test']['significant'] else '❌'}")
print("\n=== Effect Sizes ===")
print(f"Cohen's d: {results['cohens_d']:.3f} ({results['cohens_d_interpret']} effect)")
print(f"Rank-biserial correlation: {results['rank_biserial']:.3f} ({results['rank_biserial_interpret']} effect)")
---------- GenBooster + LinearRegression
0it [00:00, ?it/s]
---------- 1 / 72 : kr-vs-kp
1it [00:00, 2.21it/s]
Training Balanced Accuracy: 0.9480928346328172
Testing Balanced Accuracy: 0.9182957393483708
---------- 2 / 72 : letter
2it [00:08, 4.76s/it]
Training Balanced Accuracy: 0.6532232216708022
Testing Balanced Accuracy: 0.6037087912087912
---------- 3 / 72 : balance-scale
3it [00:08, 2.76s/it]
Training Balanced Accuracy: 0.6579835623313884
Testing Balanced Accuracy: 0.6492236337971365
---------- 4 / 72 : mfeat-factors
4it [00:10, 2.50s/it]
Training Balanced Accuracy: 0.8712500000000001
Testing Balanced Accuracy: 0.8649999999999999
---------- 5 / 72 : mfeat-fourier
5it [00:12, 2.35s/it]
Training Balanced Accuracy: 0.8174999999999999
Testing Balanced Accuracy: 0.7849999999999999
---------- 6 / 72 : breast-w
6it [00:13, 1.64s/it]
Training Balanced Accuracy: 0.9810158838019196
Testing Balanced Accuracy: 0.962409420289855
---------- 7 / 72 : mfeat-karhunen
7it [00:15, 1.80s/it]
Training Balanced Accuracy: 0.8699999999999999
Testing Balanced Accuracy: 0.865
---------- 8 / 72 : mfeat-morphological
8it [00:16, 1.70s/it]
Training Balanced Accuracy: 0.75
Testing Balanced Accuracy: 0.7250000000000001
---------- 9 / 72 : mfeat-zernike
9it [00:21, 2.61s/it]
Training Balanced Accuracy: 0.7987499999999998
Testing Balanced Accuracy: 0.74
---------- 10 / 72 : cmc
10it [00:21, 2.00s/it]
Training Balanced Accuracy: 0.539420682309971
Testing Balanced Accuracy: 0.48957420514548927
---------- 11 / 72 : optdigits
11it [00:23, 2.01s/it]
Training Balanced Accuracy: 0.8338839752605576
Testing Balanced Accuracy: 0.865
---------- 12 / 72 : credit-approval
12it [00:24, 1.49s/it]
Training Balanced Accuracy: 0.8800414474732983
Testing Balanced Accuracy: 0.7980625931445604
---------- 13 / 72 : credit-g
13it [00:24, 1.17s/it]
Training Balanced Accuracy: 0.6815476190476191
Testing Balanced Accuracy: 0.7059523809523809
---------- 14 / 72 : pendigits
14it [00:26, 1.46s/it]
Training Balanced Accuracy: 0.9082342205793651
Testing Balanced Accuracy: 0.8939849624060152
---------- 15 / 72 : diabetes
15it [00:27, 1.10s/it]
Training Balanced Accuracy: 0.7434696261682243
Testing Balanced Accuracy: 0.6457407407407407
---------- 16 / 72 : spambase
16it [00:27, 1.09it/s]
Training Balanced Accuracy: 0.8971008789190608
Testing Balanced Accuracy: 0.8953865467099069
---------- 17 / 72 : splice
17it [00:28, 1.19it/s]
Training Balanced Accuracy: 0.8904200133868808
Testing Balanced Accuracy: 0.8835470085470085
---------- 18 / 72 : tic-tac-toe
18it [00:28, 1.40it/s]
Training Balanced Accuracy: 0.7092305954129476
Testing Balanced Accuracy: 0.7153432835820895
---------- 19 / 72 : vehicle
19it [00:29, 1.33it/s]
Training Balanced Accuracy: 0.7512680047291669
Testing Balanced Accuracy: 0.7388257575757575
---------- 20 / 72 : electricity
20it [00:29, 1.54it/s]
Training Balanced Accuracy: 0.7804471490091439
Testing Balanced Accuracy: 0.7631713554987212
---------- 21 / 72 : satimage
21it [00:33, 1.45s/it]
Training Balanced Accuracy: 0.7904065610735302
Testing Balanced Accuracy: 0.7254887702816034
---------- 22 / 72 : eucalyptus
22it [00:34, 1.36s/it]
Training Balanced Accuracy: 0.5992921311574563
Testing Balanced Accuracy: 0.5444643212085073
---------- 23 / 72 : sick
23it [00:34, 1.09s/it]
Training Balanced Accuracy: 0.6421904761904762
Testing Balanced Accuracy: 0.5416666666666666
---------- 24 / 72 : vowel
24it [00:37, 1.45s/it]
Training Balanced Accuracy: 0.7588383838383838
Testing Balanced Accuracy: 0.6565656565656567
---------- 25 / 72 : isolet
25it [00:42, 2.61s/it]
Training Balanced Accuracy: 0.5353598014888339
Testing Balanced Accuracy: 0.4539835164835164
---------- 26 / 72 : analcatdata_authorship
26it [00:43, 2.07s/it]
Training Balanced Accuracy: 0.9691163839829222
Testing Balanced Accuracy: 0.9847136047215497
---------- 27 / 72 : analcatdata_dmft
27it [00:44, 1.79s/it]
Training Balanced Accuracy: 0.31318315240756367
Testing Balanced Accuracy: 0.2114058144165671
---------- 28 / 72 : mnist_784
28it [00:47, 2.33s/it]
Training Balanced Accuracy: 0.6018919115939201
Testing Balanced Accuracy: 0.5142492099814755
---------- 29 / 72 : pc4
29it [00:48, 1.77s/it]
Training Balanced Accuracy: 0.6706163207080265
Testing Balanced Accuracy: 0.5482954545454546
---------- 30 / 72 : pc3
30it [00:48, 1.37s/it]
Training Balanced Accuracy: 0.5466884375956731
Testing Balanced Accuracy: 0.525
---------- 31 / 72 : jm1
31it [00:49, 1.09s/it]
Training Balanced Accuracy: 0.5669233866908285
Testing Balanced Accuracy: 0.5454690237298934
---------- 32 / 72 : kc2
32it [00:49, 1.17it/s]
Training Balanced Accuracy: 0.7185861091424521
Testing Balanced Accuracy: 0.651697699890471
---------- 33 / 72 : kc1
33it [00:50, 1.36it/s]
Training Balanced Accuracy: 0.5635553470919324
Testing Balanced Accuracy: 0.5556403893872877
---------- 34 / 72 : pc1
34it [00:50, 1.52it/s]
Training Balanced Accuracy: 0.5
Testing Balanced Accuracy: 0.5
---------- 35 / 72 : adult
35it [00:50, 1.69it/s]
Training Balanced Accuracy: 0.7400971341967484
Testing Balanced Accuracy: 0.6551535087719298
---------- 36 / 72 : Bioresponse
36it [00:51, 1.84it/s]
Training Balanced Accuracy: 0.7767986723709284
Testing Balanced Accuracy: 0.732331888295191
---------- 37 / 72 : wdbc
37it [00:51, 2.11it/s]
Training Balanced Accuracy: 0.9478328173374613
Testing Balanced Accuracy: 0.9623015873015872
---------- 38 / 72 : phoneme
38it [00:52, 2.38it/s]
Training Balanced Accuracy: 0.707737690038575
Testing Balanced Accuracy: 0.6642024281764636
---------- 39 / 72 : qsar-biodeg
39it [00:52, 2.37it/s]
Training Balanced Accuracy: 0.8324861723727508
Testing Balanced Accuracy: 0.816967792615868
---------- 40 / 72 : wall-robot-navigation
40it [00:53, 1.80it/s]
Training Balanced Accuracy: 0.653627904631567
Testing Balanced Accuracy: 0.5904361071027737
---------- 41 / 72 : semeion
41it [00:55, 1.02s/it]
Training Balanced Accuracy: 0.5926607850658483
Testing Balanced Accuracy: 0.6
---------- 42 / 72 : ilpd
42it [00:55, 1.24it/s]
Training Balanced Accuracy: 0.597766939872203
Testing Balanced Accuracy: 0.5581148121899362
---------- 43 / 72 : madelon
43it [00:56, 1.44it/s]
Training Balanced Accuracy: 0.73625
Testing Balanced Accuracy: 0.7
---------- 44 / 72 : nomao
44it [00:56, 1.53it/s]
Training Balanced Accuracy: 0.8829423602789812
Testing Balanced Accuracy: 0.9193963930806036
---------- 45 / 72 : ozone-level-8hr
45it [00:58, 1.18it/s]
Training Balanced Accuracy: 0.5
Testing Balanced Accuracy: 0.5
---------- 46 / 72 : cnae-9
46it [01:00, 1.45s/it]
Training Balanced Accuracy: 0.7333304959709455
Testing Balanced Accuracy: 0.699604743083004
---------- 47 / 72 : first-order-theorem-proving
47it [01:02, 1.39s/it]
Training Balanced Accuracy: 0.2568945598032055
Testing Balanced Accuracy: 0.27648809523809526
---------- 48 / 72 : banknote-authentication
48it [01:02, 1.06s/it]
Training Balanced Accuracy: 0.9954954954954955
Testing Balanced Accuracy: 1.0
---------- 49 / 72 : blood-transfusion-service-center
49it [01:02, 1.22it/s]
Training Balanced Accuracy: 0.5951321966889054
Testing Balanced Accuracy: 0.611842105263158
---------- 50 / 72 : PhishingWebsites
50it [01:03, 1.41it/s]
Training Balanced Accuracy: 0.9133593601218816
Testing Balanced Accuracy: 0.9122886931875696
---------- 51 / 72 : cylinder-bands
51it [01:03, 1.72it/s]
Training Balanced Accuracy: 0.6772527472527472
Testing Balanced Accuracy: 0.6669004207573632
---------- 52 / 72 : bank-marketing
52it [01:03, 1.86it/s]
Training Balanced Accuracy: 0.5602668372475554
Testing Balanced Accuracy: 0.5917464996315401
---------- 53 / 72 : GesturePhaseSegmentationProcessed
53it [01:04, 1.44it/s]
Training Balanced Accuracy: 0.4052738438032556
Testing Balanced Accuracy: 0.3567099567099567
---------- 54 / 72 : har
54it [01:06, 1.17it/s]
Training Balanced Accuracy: 0.6586891684653934
Testing Balanced Accuracy: 0.6022327522327522
---------- 55 / 72 : dresses-sales
55it [01:06, 1.46it/s]
Training Balanced Accuracy: 0.6180213464696224
Testing Balanced Accuracy: 0.5049261083743842
---------- 56 / 72 : texture
56it [01:08, 1.16s/it]
Training Balanced Accuracy: 0.9217171717171717
Testing Balanced Accuracy: 0.9090909090909093
---------- 57 / 72 : connect-4
57it [01:09, 1.00s/it]
Training Balanced Accuracy: 0.36527957383567194
Testing Balanced Accuracy: 0.36229643372500514
---------- 58 / 72 : MiceProtein
58it [01:13, 2.01s/it]
Training Balanced Accuracy: 0.7886130536130537
Testing Balanced Accuracy: 0.7212312030075188
---------- 59 / 72 : steel-plates-fault
59it [01:15, 1.85s/it]
Training Balanced Accuracy: 0.5668243559177928
Testing Balanced Accuracy: 0.5421214137829045
---------- 60 / 72 : climate-model-simulation-crashes
60it [01:15, 1.38s/it]
Training Balanced Accuracy: 0.6486486486486487
Testing Balanced Accuracy: 0.6111111111111112
---------- 61 / 72 : wilt
61it [01:15, 1.06s/it]
Training Balanced Accuracy: 0.6660061646851607
Testing Balanced Accuracy: 0.6818181818181819
---------- 62 / 72 : car
62it [01:16, 1.08it/s]
Training Balanced Accuracy: 0.5475360592569449
Testing Balanced Accuracy: 0.5132936507936507
---------- 63 / 72 : segment
63it [01:17, 1.09s/it]
Training Balanced Accuracy: 0.821012708763058
Testing Balanced Accuracy: 0.8258268824771289
---------- 64 / 72 : mfeat-pixel
64it [01:23, 2.42s/it]
Training Balanced Accuracy: 0.7024999999999999
Testing Balanced Accuracy: 0.7100000000000001
---------- 65 / 72 : Fashion-MNIST
65it [01:27, 2.94s/it]
Training Balanced Accuracy: 0.5262499999999999
Testing Balanced Accuracy: 0.43499999999999994
---------- 66 / 72 : jungle_chess_2pcs_raw_endgame_complete
66it [01:27, 2.19s/it]
Training Balanced Accuracy: 0.5535725243283519
Testing Balanced Accuracy: 0.5305783752385694
---------- 67 / 72 : numerai28.6
67it [01:28, 1.67s/it]
Training Balanced Accuracy: 0.6070716881814764
Testing Balanced Accuracy: 0.5743074307430743
---------- 68 / 72 : Devnagari-Script
68it [01:38, 4.07s/it]
Training Balanced Accuracy: 0.2718190537084399
Testing Balanced Accuracy: 0.15
---------- 69 / 72 : CIFAR_10
69it [01:40, 3.60s/it]
Training Balanced Accuracy: 0.29875
Testing Balanced Accuracy: 0.205
---------- 70 / 72 : Internet-Advertisements
70it [01:41, 2.66s/it]
Training Balanced Accuracy: 0.9168827257490049
Testing Balanced Accuracy: 0.9555647840531561
---------- 71 / 72 : dna
71it [01:41, 2.07s/it]
Training Balanced Accuracy: 0.8917670682730924
Testing Balanced Accuracy: 0.8936965811965812
---------- 72 / 72 : churn
72it [01:42, 1.42s/it]
Training Balanced Accuracy: 0.5900049020872572
Testing Balanced Accuracy: 0.5955149501661129
=== Overfitting Detection Report ===
Datasets: 72
Mean training score: 0.692
Mean test score: 0.661
Mean difference: 0.031 (SD=0.041)
Spearman rho: 0.972
=== Normality Test ===
Shapiro-Wilk p-value: 0.0153
Interpretation: Non-normal distribution
=== Statistical Tests ===
1. Paired t-test:
t = 6.302, p = 0.00000
Significant: ✅
2. Wilcoxon signed-rank test:
W = 2125.000, p = 0.00000
Significant: ✅
=== Effect Sizes ===
Cohen's d: 0.743 (medium effect)
Rank-biserial correlation: 0.191 (small effect)
But which one is best in this particular setting? With default hyperparameters, XGBoost reaches a higher mean test balanced accuracy (0.764 vs 0.661) but overfits markedly (mean train-test gap of 0.210, Spearman correlation between training and test scores of only 0.208), whereas GenBooster + LinearRegression keeps training and test scores closely aligned (gap of 0.031, Spearman correlation of 0.972), at the cost of lower accuracy. The better choice thus depends on whether you are willing to tune XGBoost, or whether you prefer an untuned model whose training score is a trustworthy proxy for its test score.
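To answer this more directly, one could keep both score dictionaries and count, dataset by dataset, which model reaches the higher test balanced accuracy. A minimal sketch follows (the names xgb_results and booster_results are hypothetical: the code above reuses a single results variable, so both dictionaries would have to be saved before being overwritten):

# Hypothetical side-by-side comparison of per-dataset test balanced accuracies,
# assuming the two score dictionaries were saved as xgb_results and booster_results
comparison = pd.DataFrame({
    'xgboost_test': {k: v['testing_score'] for k, v in xgb_results.items()},
    'genbooster_test': {k: v['testing_score'] for k, v in booster_results.items()},
}).dropna()
wins_xgb = (comparison['xgboost_test'] > comparison['genbooster_test']).sum()
wins_gb = (comparison['genbooster_test'] > comparison['xgboost_test']).sum()
print(f"XGBoost wins on {wins_xgb} datasets, GenBooster wins on {wins_gb} "
      f"(ties: {len(comparison) - wins_xgb - wins_gb})")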
For attribution, please cite this work as:
T. Moudiki (2025-06-14). An Overfitting dilemma: XGBoost Default Hyperparameters vs GenericBooster + LinearRegression Default Hyperparameters. Retrieved from https://thierrymoudiki.github.io/blog/2025/06/14/python/xgboost-default-overfitting
BibTeX citation:

@misc{tmoudiki20250614,
  author = {T. Moudiki},
  title = {An Overfitting dilemma: XGBoost Default Hyperparameters vs GenericBooster + LinearRegression Default Hyperparameters},
  url = {https://thierrymoudiki.github.io/blog/2025/06/14/python/xgboost-default-overfitting},
  year = {2025}
}