How Projections Work

Translating pitch-level skill into next-season K%, BB%, and kwERA

Contents
  1. From Skills to Outcomes
  2. What Drives K% and BB%?
  3. Why Linear, Not XGBoost?
  4. Aging Curves
  5. Interpreting Projections

1. From Skills to Outcomes

Pitch grades like Stuff+, Location+, and Pitching+ measure current skill on a per-pitch basis. But front offices and fantasy managers need to know: what will this pitcher actually do next season? Projections bridge that gap.

The Projection Pipeline
[Figure: three input groups feed an ElasticNet model. Pitch grades (Stuff+ per pitch type, Location+, Pitching+), historical stats (prior-year K%, prior-year BB%, 2-year ERA average, usage and workload), and context (age / career arc, role SP/RP, arsenal mix) make up 231 input features, of which ~30-50 survive the L1 penalty. The outputs are K%, BB%, and kwERA (5.40 - 12*K% + 16*BB%).]
Projections are a stable skill expectation: they do not predict hot streaks, they capture where a pitcher's talent sits on a full-season arc.
INPUT 1
Pitch-level grades
Stuff+, Location+, and Pitching+ for each pitch type tell us how good the pitcher's raw tools are right now.
INPUT 2
Historical performance
Prior-year K%, BB%, ERA, and 2-year averages. Skills are sticky year-to-year, so the best predictor of future K% is past K%.
INPUT 3
Context & aging
Age, role (starter vs. reliever), workload, and arsenal composition all shift the baseline expectation.
OUTPUT
K%, BB%, and kwERA
Three numbers that distill a pitcher's projected run-prevention ability based on the things most within their control.
🎯
Why kwERA? ERA is noisy -- it depends on defense, sequencing luck, and park factors. kwERA (5.40 - 12*K% + 16*BB%) strips all that away and focuses on the two outcomes a pitcher controls most: strikeouts and walks. Lower is better -- an elite pitcher might project for a 3.00 kwERA, while a replacement-level arm sits around 5.00.
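The kwERA formula quoted above is simple enough to compute by hand, but a small sketch makes the scale concrete. The function name `kwera` and the two pitcher lines are illustrative assumptions, not values from the model; rates are expressed as fractions of batters faced.

```python
def kwera(k_pct: float, bb_pct: float) -> float:
    """kwERA as defined in this explainer: 5.40 - 12*K% + 16*BB%.

    Rates are fractions of batters faced (0.25 means 25%).
    """
    return 5.40 - 12.0 * k_pct + 16.0 * bb_pct


# Hypothetical pitcher lines, chosen only for illustration.
strikeout_arm = kwera(k_pct=0.30, bb_pct=0.06)
fringe_arm = kwera(k_pct=0.17, bb_pct=0.10)

print(f"{strikeout_arm:.2f} {fringe_arm:.2f}")  # prints "2.76 4.96"
```

Under this formula a one-point rise in BB% costs more than a one-point rise in K% gains, which is why the two rates are projected separately before being combined.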

2. What Drives K% and BB%?

The model has 231 features, but most of the predictive power concentrates in a handful of categories.

Top Predictors of Projected Strikeout Rate
Relative importance:
  • Historical K% (prior year + 2-yr avg): Very High
  • Stuff+ on breaking balls (slider, curve, sweeper): High
  • Arsenal synergy (tunnel quality, pitch mix): Moderate
  • Pitch usage mix (more offspeed = more K): Moderate
  • Age / career arc (peak at 26-28): Low

Strikeouts are driven by a combination of historical track record (the strongest signal) and current pitch quality. A pitcher with elite Stuff+ on breaking balls and a deep arsenal that tunnels well will generate whiffs. But even the best raw stuff can't overcome a career-long inability to miss bats.

Top Predictors of Projected Walk Rate
Relative importance:
  • Historical BB% (prior year + 2-yr avg): Very High
  • Location+ (command quality): High
  • Count tendencies (first-pitch strike rate): Moderate
  • Age / role (BB% is very stable with age): Low

Walk rate is one of the stickiest skills in baseball. A pitcher who walked 8% of batters last year will most likely walk close to 8% next year. Location+ (command quality) adds incremental signal -- pitchers who consistently hit their spots walk fewer batters. Count tendencies matter too: pitchers who get ahead 0-1 walk far fewer of those batters.

3. Why Linear, Not XGBoost?

Stuff+, Location+, and Pitching+ all use XGBoost. So why does the Projections model use a simple linear model? The answer comes down to sample size and feature count.

The Dimensionality Problem
Stuff+ (XGBoost)
  • Training samples: ~1.5 million pitches
  • Features: 24
  • Samples per feature: ~62,500
  • Plenty of data to learn complex interactions
Projections (ElasticNet)
  • Training samples: ~500 pitcher-seasons
  • Features: 231
  • Samples per feature: ~2.2
  • XGBoost would memorize, not learn
ElasticNet's Built-In Feature Selection (L1 Penalty)
Of the 231 input features, ~30-50 survive with non-zero weights and the remaining ~180-200 are driven to zero. The L1 penalty automatically zeros out noisy, redundant, or uninformative features.
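The samples-per-feature figures in the comparison above come from straightforward division. The sketch below reproduces them from the approximate counts quoted in the text; the helper name `samples_per_feature` is an assumption made for illustration.

```python
def samples_per_feature(n_samples: int, n_features: int) -> float:
    """Rough data-density measure: training rows available per input feature."""
    return n_samples / n_features


# Approximate counts quoted in the text.
stuff_ratio = samples_per_feature(1_500_000, 24)  # pitches per feature
proj_ratio = samples_per_feature(500, 231)        # pitcher-seasons per feature

print(f"Stuff+: {stuff_ratio:,.0f}  Projections: {proj_ratio:.1f}")
# prints "Stuff+: 62,500  Projections: 2.2"
```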
XGBoost with 231 features & ~500 rows

Would overfit catastrophically

  • Trees can memorize individual pitcher-seasons
  • Can find "patterns" in 2 data points
  • No built-in way to ignore irrelevant features
  • Would look great on training data, terrible on new seasons
ElasticNet (L1 + L2 regularization)

Built for this exact scenario

  • L1 (Lasso): Drives weak feature weights to exactly zero
  • L2 (Ridge): Shrinks remaining weights to prevent overfit
  • Only ~30-50 features end up mattering
  • Stable predictions that hold up on future seasons
🔎
Analogy: XGBoost is like giving a detective unlimited time and a massive corkboard -- great when there are millions of clues to cross-reference. ElasticNet is like telling the detective: "You have 500 case files and 231 possible leads. Most are dead ends. Find the 30 that actually matter and ignore the rest." When evidence is scarce, disciplined focus beats elaborate theories.
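To make the L1 and L2 roles concrete, here is a minimal pure-Python sketch of an elastic net fitted by coordinate descent. It is not the production model: the function names, the penalty values, and the four-row toy data set are all assumptions chosen so the arithmetic can be checked by hand. The objective minimized is 1/(2n)*||y - Xw||^2 + l1*||w||_1 + (l2/2)*||w||^2, with no intercept.

```python
def soft_threshold(value: float, threshold: float) -> float:
    """Shrink value toward zero by threshold; anything inside the band becomes 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def elastic_net_fit(X, y, l1, l2, n_passes=50):
    """Coordinate descent for 1/(2n)*||y - Xw||^2 + l1*||w||_1 + (l2/2)*||w||^2.

    X is a list of rows. No intercept is fitted, so inputs should be centered.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_passes):
        for j in range(p):
            rho = 0.0  # average of feature j times the residual that ignores feature j
            z = 0.0    # average squared value of feature j
            for i in range(n):
                partial = sum(w[k] * X[i][k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - partial)
                z += X[i][j] ** 2
            rho /= n
            z /= n
            # L1 sets weak coefficients to exactly zero; L2 shrinks the survivors.
            w[j] = soft_threshold(rho, l1) / (z + l2)
    return w


# Toy data (made up): feature 0 carries real signal, feature 1 barely matters.
X = [[1, 1], [1, -1], [-1, 1], [-1, -1]]
y = [2.1, 1.9, -1.9, -2.1]  # generated as y = 2.0*x0 + 0.1*x1

weights = elastic_net_fit(X, y, l1=0.3, l2=0.1)
print(weights)  # first weight is about 1.545, second is exactly 0.0
```

An unpenalized least-squares fit on the same data would return weights of 2.0 and 0.1. The elastic net drops the weak feature entirely and pulls the strong one below 2.0, which is the same behavior that trims 231 inputs down to a few dozen.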

4. Aging Curves

Pitcher performance isn't static. The model accounts for where a pitcher sits on their career arc, adjusting projections based on age-related trends across the entire MLB population.

Aging Effects on K% and BB%


[Figure: aging curves for K% (strikeout rate) and BB% (walk rate). X-axis: pitcher age, 21 to 39. Y-axis: rate as a percentage of league average, 90% to 110%, with the league average line and the peak zone marked.]
K% AGING
Sharp peak, steady decline
Strikeout ability peaks at age 26-28 when velocity and stuff are at their best. The decline accelerates after 32 as velocity fades and pitch sharpness erodes.
BB% AGING
Remarkably flat
Command is a learned skill that doesn't require elite athleticism. Walk rates barely change from age 25-35, only rising modestly in late career. This is why BB% is so predictable.
What this means for projections: A 27-year-old with a 24% K-rate gets a small downward age adjustment -- he's near peak. A 33-year-old with the same K-rate gets a larger downward adjustment because the model expects his K% to keep declining. Neither pitcher's BB% moves much from aging alone.
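The age comparison above can be sketched as a simple lookup. This is only an illustration of the direction and relative size of the adjustments: the age breakpoints follow the text (peak at 26-28, faster decline after 32), but every numeric adjustment is a placeholder, and the real model handles age through its fitted features rather than a hand-written step function like this one.

```python
def k_age_adjustment(age: int) -> float:
    """Year-over-year K% change (percentage points) attributed to aging alone.

    Breakpoints follow the text; all magnitudes are hypothetical placeholders.
    """
    if age < 26:
        return 0.3   # still climbing toward peak (placeholder)
    if age <= 28:
        return -0.1  # near peak: small downward nudge (placeholder)
    if age <= 32:
        return -0.4  # steady decline (placeholder)
    return -0.8      # decline accelerates after 32 (placeholder)


def age_adjusted_k(k_pct_points: float, age: int) -> float:
    """Apply the aging-only adjustment to a K% given in percentage points."""
    return k_pct_points + k_age_adjustment(age)


for age in (27, 33):
    print(age, round(age_adjusted_k(24.0, age), 1))
# prints "27 23.9" then "33 23.2"
```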

5. Interpreting Projections

A projection is a center of gravity, not a promise. It represents the most likely outcome given typical health and workload, but real seasons involve variance.

Distribution of Possible Outcomes
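[Figure: distribution of actual season kwERA around the projection, x-axis from 2.80 to 4.40. About 68% of outcomes fall in the central band. The low-kwERA tail is the best case (everything breaks right); the high-kwERA tail is the worst case (injuries, bad luck).]
A 3.60 kwERA projection means 3.20-4.00 is the expected range, not that 3.60 is guaranteed.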
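One way to read the 3.20-4.00 band is as one standard deviation on either side of the projection. The sketch below assumes outcomes are normally distributed with a standard deviation of 0.40; both the normal shape and that value are inferred from the ~68% band for illustration, not stated properties of the model.

```python
from statistics import NormalDist

projection = 3.60
sd = 0.40  # assumption: half the width of the 3.20-4.00 band

# Assumed normal spread of actual-season kwERA around the projection.
outcomes = NormalDist(mu=projection, sigma=sd)

# Share of seasons expected to land within one standard deviation.
inside = outcomes.cdf(projection + sd) - outcomes.cdf(projection - sd)

# Chance of a season at or below a 3.00 kwERA under the same assumption.
p_sub_3 = outcomes.cdf(3.00)

print(f"{inside:.3f}")  # prints "0.683"
```

Under these assumptions a sub-3.00 season still happens roughly one time in fifteen, which is why a single outlier year does not by itself contradict the projection.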

Important Caveats

CAVEAT 1
Assumes typical health
Projections don't predict injuries. A torn UCL or shoulder issue resets everything. The projection is what we'd expect if the pitcher stays healthy and throws a full workload.
CAVEAT 2
Role changes shift the baseline
A starter who moves to the bullpen typically gains 1-2 mph of velocity and 2-3 points of K%. Projections assume the pitcher stays in their current role unless specified otherwise.
CAVEAT 3
Not capturing hot streaks
If a pitcher had an incredible April, the projection doesn't spike up. Projections estimate true talent over a full season, smoothing out short-term variance.
CAVEAT 4
Uncertain for young pitchers
With limited MLB track record, projections for rookies and 2nd-year pitchers carry wider uncertainty bands. The model relies more on pitch grades and less on historical stats for these arms.
What Projections Are Good At
Ranking pitchers
Even if absolute values are off by 0.3 kwERA, relative rankings are stable. If we project Pitcher A as better than Pitcher B, that usually holds.
Identifying skill changes
When a pitcher's projection shifts significantly year-to-year, it flags a real underlying change -- new pitch, velocity loss, or mechanical adjustment.
Separating skill from luck
A 2.50 ERA pitcher who projects for a 4.00 kwERA probably benefited from great defense or strand rate luck. Projections cut through the noise.
Long-term accuracy
Over many pitchers and seasons, the projected K% and BB% closely track actual results. The model is calibrated -- it doesn't systematically over- or under-predict.
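A basic check of the calibration claim is to compare projected and actual rates across many pitchers and look at the average signed error (bias) alongside the average absolute error. The sketch below shows the calculation only; the K% values are made up for illustration and are not model results, and `calibration_summary` is a hypothetical helper name.

```python
from statistics import mean


def calibration_summary(projected, actual):
    """Return average signed error (bias) and mean absolute error (mae)."""
    errors = [a - p for p, a in zip(projected, actual)]
    return {"bias": mean(errors), "mae": mean(abs(e) for e in errors)}


# Made-up K% values in percentage points, for illustration only.
projected_k = [22.0, 27.5, 18.0, 30.0, 24.5]
actual_k = [23.0, 26.0, 19.5, 29.0, 24.5]

summary = calibration_summary(projected_k, actual_k)
print(summary)  # bias is 0.0 and mae is 1.0 for these toy numbers
```

A bias near zero with a non-zero mean absolute error is what "calibrated but not perfect" looks like: individual misses go both ways and cancel out on average.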

Stockyard Baseball -- Model Explainers