Feature Selection
Select the columns to use as predictors. All selected features must be numeric (continuous).
Compare five binary classification algorithms side-by-side on the same data. See how algorithm choice alone changes accuracy, precision, recall, and ROC curves — then examine the holdout set to spot which models overfit.
Binary classification assigns each observation to one of two classes (e.g., churn vs. retain, click vs. ignore). Different algorithms make different assumptions about the data, so no single classifier always wins. This tool lets you train multiple models on the same dataset and compare their performance on both training and holdout test data.
All five algorithms in this tool handle numeric (continuous) features only. In practice, most of these algorithms can also work with categorical features via encoding techniques such as one-hot or ordinal encoding; the restriction keeps this tool simple and is not an inherent constraint of the methods.
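As a concrete illustration of such encoding, here is a minimal pure-Python sketch (the `one_hot` helper and the toy rows are hypothetical examples, not part of this tool):

```python
# Minimal sketch of one-hot encoding a categorical column in a
# small list-of-dicts dataset (illustrative only).
def one_hot(rows, column):
    """Replace `column` in each row dict with 0/1 indicator columns."""
    categories = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        new_row = {k: v for k, v in row.items() if k != column}
        for cat in categories:
            new_row[f"{column}={cat}"] = 1 if row[column] == cat else 0
        encoded.append(new_row)
    return encoded

rows = [{"plan": "basic", "usage": 3.2}, {"plan": "pro", "usage": 7.9}]
print(one_hot(rows, "plan"))
# Each row now carries numeric indicator columns plan=basic / plan=pro.
```

After encoding, every column is numeric, so the result could be fed to any of the five classifiers.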
| Algorithm | Decision Boundary | Best When… | Watch Out For… |
|---|---|---|---|
| Logistic Regression | Linear (hyperplane) | Classes are roughly linearly separable | Non-linear patterns, multicollinearity |
| Decision Tree | Axis-aligned rectangular splits | Non-linear patterns, mixed feature types | Overfitting (high variance without pruning) |
| k-Nearest Neighbors | Complex / local | Non-linear boundaries, small-to-medium datasets | Curse of dimensionality, slow prediction on large data |
| Naive Bayes | Quadratic (Gaussian assumption) | Class-conditional features are roughly Gaussian | Correlated features violate independence assumption |
| Linear SVM | Linear (maximum-margin hyperplane) | High-dimensional data, clear margin between classes | Non-linear patterns (without kernel trick) |
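A side-by-side comparison like the one this tool performs can be sketched with scikit-learn (assumed here; the synthetic dataset and default hyperparameters are illustrative choices, not this tool's internals):

```python
# Train the five classifiers on the same data and compare
# holdout accuracy (a sketch, assuming scikit-learn).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import LinearSVC
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=500, n_features=4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

models = {
    "Logistic Regression": LogisticRegression(),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "k-Nearest Neighbors": KNeighborsClassifier(n_neighbors=5),
    "Naive Bayes": GaussianNB(),
    "Linear SVM": LinearSVC(),
}
results = {}
for name, model in models.items():
    model.fit(X_train, y_train)  # identical split for every model
    results[name] = accuracy_score(y_test, model.predict(X_test))
    print(f"{name:22s} holdout accuracy = {results[name]:.3f}")
```

Because every model sees the same train/test split, any accuracy difference comes from the algorithm alone.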
Each scenario has distinct distributional characteristics, chosen to show that no single algorithm excels on every problem. Try all three!
Provide a header row with numeric feature columns and a binary target column (0/1). Up to 2,000 rows recommended for responsive performance.
Drag & Drop raw data file (.csv, .tsv, .txt, .xls, .xlsx)
Include headers. All feature columns must be numeric. Target column must be binary (0/1).
These metrics reflect how well each model fits the data it was trained on.
| Metric |
|---|
Compare true/false positive and negative counts across all models side-by-side on holdout data.
The ROC curve plots the trade-off between true positive rate and false positive rate across all possible classification thresholds. A curve closer to the top-left corner indicates better discrimination.
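The threshold sweep behind an ROC curve can be sketched in pure Python (assuming distinct scores and both classes present; `roc_curve_points` and `auc` are illustrative helpers, not this tool's internals):

```python
# Sort by descending score; each score in turn acts as the
# classification threshold, tracing out (FPR, TPR) points.
# Assumes distinct scores (ties would need grouping).
def roc_curve_points(y_true, scores):
    pairs = sorted(zip(scores, y_true), key=lambda p: -p[0])
    pos = sum(y_true)
    neg = len(y_true) - pos
    tp = fp = 0
    points = [(0.0, 0.0)]  # (FPR, TPR), starting at the origin
    for _, label in pairs:
        if label == 1:
            tp += 1
        else:
            fp += 1
        points.append((fp / neg, tp / pos))
    return points

def auc(points):
    # Trapezoidal integration over the (FPR, TPR) step curve.
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area

y = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(auc(roc_curve_points(y, scores)))  # 0.75 for this toy example
```

A perfect ranker pushes every point to the top-left, giving an AUC of 1.0; random scoring hugs the diagonal at 0.5.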
When exactly 2 features are selected, we can visualize each algorithm’s decision regions on a 2D grid. Actual data points are overlaid (circles = class 0, triangles = class 1).
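The grid evaluation behind such a plot can be sketched as follows (assuming NumPy and scikit-learn; the logistic model and 100×100 resolution are illustrative choices):

```python
# Rasterize a 2-feature model's decision regions by predicting
# a label for every cell of a grid spanning the data.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)  # toy linearly separable labels

model = LogisticRegression().fit(X, y)

xs = np.linspace(X[:, 0].min(), X[:, 0].max(), 100)
ys = np.linspace(X[:, 1].min(), X[:, 1].max(), 100)
xx, yy = np.meshgrid(xs, ys)
grid = np.column_stack([xx.ravel(), yy.ravel()])
regions = model.predict(grid).reshape(xx.shape)  # 0/1 label per cell
print(regions.shape)  # (100, 100)
```

Coloring `regions` (e.g. with matplotlib's `contourf`) and scattering the actual points on top reproduces the overlay described above.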
Auto-generated observations based on your comparison results. These highlight the most important patterns to notice.
| Metric | Meaning | Range |
|---|---|---|
| Accuracy | Proportion of all predictions that are correct | 0 – 1 (higher is better) |
| Precision | Of predicted positives, how many are truly positive | 0 – 1 (higher = fewer false alarms) |
| Recall | Of actual positives, how many were detected | 0 – 1 (higher = fewer missed cases) |
| F1 Score | Harmonic mean of precision and recall | 0 – 1 (balanced measure) |
| Specificity | Of actual negatives, how many were correctly identified | 0 – 1 |
| AUC | Area under the ROC curve; overall discrimination ability | 0.5 (random) – 1.0 (perfect) |
| Log Loss | Penalizes confident wrong predictions heavily | 0+ (lower is better) |
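These definitions follow directly from the confusion counts and predicted probabilities; a pure-Python sketch (the `metrics` and `log_loss` helpers are illustrative, not this tool's internals):

```python
import math

# Compute the table's metrics from raw confusion counts
# (tp, fp, tn, fn) and from predicted probabilities.
def metrics(tp, fp, tn, fn):
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return accuracy, precision, recall, f1, specificity

def log_loss(y_true, probs, eps=1e-15):
    # Clip probabilities so log(0) never occurs; confident wrong
    # predictions (p near the wrong extreme) are penalized heavily.
    total = 0.0
    for y, p in zip(y_true, probs):
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

print(metrics(tp=40, fp=10, tn=45, fn=5))
print(log_loss([1, 0], [0.9, 0.1]))
```

With 40 true positives, 10 false positives, 45 true negatives, and 5 false negatives, accuracy is 0.85 and precision is 0.80, matching the definitions in the table.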
If a model scores much higher on the training set than the holdout set, it has memorized training noise rather than learned generalizable patterns. The gap column in the holdout table quantifies this.
Decision Trees are particularly prone to overfitting (try lowering max depth). Logistic Regression and Linear SVM typically show smaller gaps because their linear boundary can’t memorize complex noise.
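The effect can be sketched with scikit-learn (assumed here; `flip_y` injects label noise, so an unpruned tree can memorize the training set while a depth-limited one cannot; dataset parameters are illustrative):

```python
# Compare the train/holdout gap of an unpruned decision tree
# against a depth-limited one on noisy data.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=400, n_features=6,
                           flip_y=0.15, random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=1)

gaps = {}
for depth in (None, 3):  # None = grow until pure leaves (no pruning)
    tree = DecisionTreeClassifier(max_depth=depth, random_state=1)
    tree.fit(X_tr, y_tr)
    gaps[depth] = tree.score(X_tr, y_tr) - tree.score(X_te, y_te)
    print(f"max_depth={depth}: train-holdout gap = {gaps[depth]:.3f}")
```

The unpruned tree fits the flipped labels perfectly on the training set but pays for it on holdout; capping `max_depth` shrinks the gap.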