Total events: 405,586
Years: 2020–2025
Tornado-only: 8,106


Source data

Pipeline starts from data/processed/storm_events.csv, the merged output of 01_data_preparation.qmd and 01_merge.qmd. The upstream notebooks parse raw NOAA Storm Events records, collapse event locations to one centroid per EVENT_ID, convert damage suffixes such as K/M/B into numeric values, and attach the most recent ISD station observation within 100 km and 90 minutes before event start.
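The damage-suffix conversion mentioned above is easy to get subtly wrong (blank vs. "0" entries, fractional values). A minimal Python sketch for illustration only; the pipeline itself lives in R/Quarto notebooks, and the exact parsing rules in 01_data_preparation.qmd may differ:

```python
import re

# Illustrative sketch of the K/M/B damage-suffix conversion; the real
# rules in 01_data_preparation.qmd may handle edge cases differently.
SUFFIX = {"K": 1e3, "M": 1e6, "B": 1e9}

def parse_damage(raw):
    """Convert NOAA damage strings like '25K', '1.5M', '0' to numeric dollars."""
    if raw is None or str(raw).strip() in ("", "0"):
        return 0.0
    m = re.fullmatch(r"(\d+(?:\.\d+)?)([KMB])?", str(raw).strip().upper())
    if m is None:
        return float("nan")
    value = float(m.group(1))
    return value * SUFFIX.get(m.group(2), 1.0)
```

Here blank and "0" entries map to zero rather than missing; whether that matches the notebook's choice is an assumption of this sketch.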

Raw inputs are two public NOAA/NCEI sources. The Storm Events files provide timestamps, location, event type, magnitude, damage, injury/death counts, tornado-specific measurements, and narrative text. The ISD station files provide hourly surface weather fields such as wind, ceiling, visibility, temperature, dew point, and sea-level pressure.

The 100 km / 90-minute rule is a practical availability heuristic, not proof that the station fully represents the storm environment. It makes the start-time meteorological panel usable for S1 and for any at-event context variables. S2 is different: it is a retrospective EF-severity classification task, so later sections explicitly decide which post-event tornado descriptors may be used.
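The matching rule amounts to a filter-then-latest selection over candidate observations. This Python sketch uses hypothetical field names (`lat`, `lon`, `start`, `time`, with times in minutes) and is not the actual join in 01_merge.qmd:

```python
import math

# Illustrative sketch of the 100 km / 90-minute station-matching rule.
def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres."""
    r = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def latest_obs_before(event, observations, max_km=100.0, max_min=90.0):
    """Most recent observation within max_km and within max_min before event start."""
    candidates = [
        o for o in observations
        if 0 <= event["start"] - o["time"] <= max_min
        and haversine_km(event["lat"], event["lon"], o["lat"], o["lon"]) <= max_km
    ]
    return max(candidates, key=lambda o: o["time"], default=None)
```

A real implementation would use a spatial index or a keyed join rather than scanning every observation per event; the sketch only shows the eligibility logic.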


damage_reg — regression task
  Target: LOG_DAMAGE · Rows: 67,687 · Columns: 44

tornado_ef — classification task
  Target: TOR_F_SCALE_NUM (EF0/EF1/EF2/EF3+) · Rows: 8,097 · Columns: 37


Total events: 405,586
Tornado-only: 8,106
EF3+ share: 3.1%


General EDA overview

EDA summarizes the merged Storm Events data used to build the S1 and S2 task datasets.


  • Map — storm events (sampled)
  • Correlation heatmap
  • Distribution explorer
  • Temporal patterns
  • Monthly seasonality
  • Event-type distribution
  • Damage by top event types
  • Pearson top-15

Scenario 1 — overview

Scenario 1 predicts LOG_DAMAGE and compares statistical regressors with feature-partitioning models using test RMSE.
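Since the target is LOG_DAMAGE, RMSE here is measured in units of the log-dollar scale, so the winning 1.393 means a typical error of about 1.4 units on that scale. A minimal sketch of the metric (the canonical values come from the modeling notebooks, not this code):

```python
import math

# Root mean squared error on the already log-transformed target.
def rmse(actual, predicted):
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )
```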


Task rows: 67,687
Primary metric: RMSE (lower = better)
Best model: rf — 1.393
H1 verdict: PASS


Side-by-side family comparison
  • Statistical (linear · polynomial · ridge): best = polynomial, RMSE 1.716
  • Feature-partitioning (random forest · xgboost): best = rf, RMSE 1.393

H1 PASS — feature-partitioning best RMSE 1.393 < statistical best RMSE 1.716.

Statistical — engine RMSEs
Feature-partitioning — engine RMSEs
Winner hyperparameters

Saved tuning choice for the damage_reg winner (rf). Engines without tunable parameters are reported as such.


H1 margin + median baseline
Metric explorer — all S1 models

Full leaderboard — all metrics

Canonical S1 prediction diagnostics

Actual vs predicted
Error slices

Residuals
Top errors

Live training sandbox — S1 (damage_reg)
Local sandbox — not canonical.
Trains the chosen engine on a subsample with the chosen hyperparameters and reports k-fold CV. Results are isolated from canonical artifacts and never affect verdicts shown elsewhere.
Sandbox controls



Pick which predictors to include. Defaults to all curated features for the selected task.


Scenario 2 — classification metrics

Scenario 2 predicts ordinal TOR_F_SCALE_NUM and compares parametric with non-parametric classifiers using test QWK.
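Quadratic weighted kappa penalizes each prediction by the squared distance between the predicted and actual EF class and normalizes against chance agreement, which suits an ordinal, imbalanced target like this one. A self-contained sketch for illustration; the dashboard's own numbers come from the modeling notebooks, which likely use an R implementation:

```python
# Minimal quadratic weighted kappa (QWK) for ordinal labels 0..k-1.
def qwk(actual, predicted, k):
    O = [[0] * k for _ in range(k)]          # observed confusion counts
    for a, p in zip(actual, predicted):
        O[a][p] += 1
    n = len(actual)
    hist_a = [sum(row) for row in O]
    hist_p = [sum(O[i][j] for i in range(k)) for j in range(k)]
    num = den = 0.0
    for i in range(k):
        for j in range(k):
            w = (i - j) ** 2 / (k - 1) ** 2  # quadratic penalty per cell
            E = hist_a[i] * hist_p[j] / n    # expected count under independence
            num += w * O[i][j]
            den += w * E
    return 1.0 - num / den
```

Perfect agreement gives 1.0 and chance-level agreement gives 0.0, while misses of several EF classes are penalized quadratically, which is what the per-cell "penalty × count" disagreement view visualizes.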


Task rows: 8,097
Primary metric: QWK (higher = better)
EF3+ share: 3.1%
H2 verdict: PASS


Side-by-side family comparison
  • Parametric (logistic · LDA · naive Bayes): best = lda, QWK 0.523
  • Non-parametric (CART · random forest · k-NN): best = rf, QWK 0.581

H2 PASS — non-parametric best QWK 0.581 > parametric best QWK 0.523.

Parametric — engine QWKs
Non-parametric — engine QWKs
Winner hyperparameters

Saved tuning choice for the tornado_ef winner (rf). Engines without tunable parameters are reported as such.


Full leaderboard — all metrics

QWK disagreement — penalty × count per cell
QWK by model

Confusion matrix (test fold) — counts + row %
ROC curves (one-vs-rest) + AUC

Threshold trade-off (per class)

Live training sandbox — S2 (tornado_ef)
Local sandbox — not canonical.
Trains the chosen engine on a subsample with the chosen hyperparameters and reports k-fold CV. Results are isolated from canonical artifacts and never affect verdicts shown elsewhere.
Sandbox controls



Pick which predictors to include. Defaults to all curated features for the selected task.


Scenario 3 — feature selection

Scenario 3 applies feature selection to the S1/S2 winners and checks whether RFE keeps fewer than half of candidate predictors.



Three feature-selection methods
  • RFE (wrapper, backward): backward wrapper search; drops the weakest features each round. Display-only here; the full search is in 04_feature_selection.qmd.
  • Lasso (embedded, alpha = 1): L1 selection; coefficients shrink to exactly zero at the validation-optimal lambda.
  • Elastic Net (embedded, alpha = 0.5): combined L1 and L2 penalties; retains correlated groups instead of dropping all but one.
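The L1 mechanics behind the two embedded methods reduce to soft-thresholding each coefficient: shrink by lambda, with an exact zero (a dropped feature) whenever the unpenalized update is smaller than lambda in magnitude. A minimal sketch of the operator, separate from whatever R implementation the notebooks use:

```python
def soft_threshold(z, lam):
    """Lasso's soft-thresholding operator: shrinks z toward zero by lam,
    returning exactly zero when |z| <= lam, which is how L1 selection
    drops features."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0
```

Elastic Net applies the same thresholding after an additional L2 shrink, which is why correlated features tend to survive together rather than one arbitrarily winning.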

Feature-selection comparison — retained share
Test-set primary metric per method

Run feature optimization


Comparison table — saved fs_comparison.rds

Summary from 05_summary.qmd

Verdicts and summary tables are loaded from the notebook summary artifact.


Scenario summary table

Verdicts
  • H1 — S1 damage regression PASS — rf RMSE 1.393 vs polynomial RMSE 1.716; lower wins. Selected hparams: min_n = 2, trees = 300.
  • H2 — S2 EF classification PASS — rf QWK 0.581 vs lda QWK 0.523; higher wins. Selected hparams: min_n = 2, trees = 500, imbalance_method = weighted.
  • H3 — S3 feature selection PASS — damage_reg kept 5/43 features (11.6%); tornado_ef kept 11/36 features (30.6%).