Why Tree Depth Matters for Location+

The Problem

Sliders vs RHB: A Broken Sub-Model

Five of six Location+ sub-models predict run value well. But breaking balls vs right-handed batters has a seasonal correlation near zero — the model can't rank slider command at all.

Seasonal correlation by sub-model:

Fastball R (best): 0.37

Offspeed L: 0.31

Breaking L: 0.27

Breaking R: 0.035

A breaking_R correlation of 0.035 means that knowing a pitcher's average Location+ grade for their slider tells you almost nothing about how much run value that slider actually prevented. Sliders specifically show r = −0.050: if anything, the grades are weakly inversely related to outcomes.
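As a rough sketch of what a seasonal correlation measures: average each pitcher's per-pitch grades and run values over a season, then take the Pearson correlation across pitchers. The tuple layout, the function names, and the toy numbers below are all assumptions made for illustration; this is not the actual Location+ pipeline or its data.

```python
from collections import defaultdict
from math import sqrt


def pearson_r(xs, ys):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def seasonal_correlation(pitches):
    """pitches: iterable of (pitcher_id, grade, run_value_prevented).

    Averages grade and run value per pitcher, then correlates the averages.
    """
    grades = defaultdict(list)
    values = defaultdict(list)
    for pitcher_id, grade, run_value in pitches:
        grades[pitcher_id].append(grade)
        values[pitcher_id].append(run_value)
    ids = sorted(grades)
    avg_grade = [sum(grades[p]) / len(grades[p]) for p in ids]
    avg_value = [sum(values[p]) / len(values[p]) for p in ids]
    return pearson_r(avg_grade, avg_value)


# Hypothetical toy season: three pitchers whose average grade tracks
# their average run value exactly, so r comes out at (numerically) 1.
toy_pitches = [
    ("A", 90, -0.01), ("A", 110, 0.01),
    ("B", 100, 0.02), ("B", 120, 0.04),
    ("C", 120, 0.05), ("C", 120, 0.07),
]
r = seasonal_correlation(toy_pitches)
print(round(r, 3))
```

A value near 0.035 on real data would mean the per-pitcher averages barely move together, which is the failure described above.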

Root Cause

The Slider Value Surface Is Non-Convex

Unlike fastballs (good up, bad down center) or changeups (good low), sliders vs same-side batters have two distinct "good" zones separated by the worst possible location.

Fastball / Changeup: Convex

One good region, one bad region. A single split on plate_z gets you 80% of the way there. The value surface is a smooth gradient — easy for shallow trees.

Slider vs RHB: Non-Convex

Two separated good regions with the worst zone in between. No single split on any feature isolates "good." The tree must carve two disconnected pockets in 2D space — an XOR-like pattern.
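To make the XOR-like claim concrete, here is a minimal sketch on a toy 4x4 checkerboard that stands in for the slider surface. The grid, the labels, and the function name are illustrative assumptions, and the search is exhaustive over axis-aligned splits rather than the greedy fitting a boosted model would use. It reports the best number of points a depth-limited tree can classify correctly.

```python
from itertools import product


def best_correct(points, depth):
    """Max number of points an axis-aligned tree of this depth can get right.

    points: list of (x, z, label) with label 1 = good location, 0 = bad.
    Each leaf predicts its majority label.
    """
    goods = sum(label for _, _, label in points)
    majority = max(goods, len(points) - goods)
    if depth == 0 or majority == len(points):
        return majority
    best = majority
    for axis in (0, 1):
        # Candidate thresholds that leave both sides non-empty.
        for threshold in sorted({p[axis] for p in points})[1:]:
            left = [p for p in points if p[axis] < threshold]
            right = [p for p in points if p[axis] >= threshold]
            score = best_correct(left, depth - 1) + best_correct(right, depth - 1)
            best = max(best, score)
    return best


# Toy XOR surface: "good" in two opposite quadrants, "bad" in the other two.
grid = [(x, z, int((x >= 2) != (z >= 2))) for x, z in product(range(4), repeat=2)]

stump = best_correct(grid, depth=1)      # one split on either feature
two_level = best_correct(grid, depth=2)  # one split on each feature
print(stump, two_level, len(grid))       # prints: 8 16 16
```

No single split does better than a coin flip on this surface, while two levels recover it exactly. On the real problem those levels come out of the same fixed budget that every other feature also needs.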

The Mechanism

Split Budget: Where Do 4 Levels Go?

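Each root-to-leaf path has at most depth splits, where depth is the tree's maximum depth. With 11 features competing for those splits, the budget is tight. At depth 4, the spatial interaction needed to capture slider value consumes three of the four splits on a path by itself.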
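The budget arithmetic can be written out in a few lines. This is only a sketch of the counting argument: `path_budget` is a hypothetical helper, and the only inputs taken from the text are the depths (4 and 5) and the feature count (11).

```python
def path_budget(depth, n_features):
    """Upper bounds implied by a maximum tree depth.

    A binary tree of this depth has at most 2**depth leaves, and any single
    root-to-leaf path can condition on at most `depth` distinct features.
    """
    max_features_per_path = min(depth, n_features)
    return {
        "max_leaves": 2 ** depth,
        "max_features_per_path": max_features_per_path,
        "share_of_features": max_features_per_path / n_features,
    }


for depth in (4, 5):
    print(depth, path_budget(depth, n_features=11))
```

At depth 4 a path can touch at most 4 of the 11 features; at depth 5 it can touch 5, and the leaf ceiling doubles from 16 to 32.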

Visualization

What the Tree Actually Looks Like

[Interactive figure: the same tree drawn at depth 4 and at depth 5, showing how one extra level changes what the tree can represent. The leaf nodes (colored boxes at the bottom) go from blurry to precise.]

The Effect

How Blurry Grades Kill Seasonal Correlation

Per-pitch noise washes out over a season. Systematic bias doesn't. When the model assigns similar grades to "chase zone slider" and "hanging slider," averaging over 100+ pitches preserves the error.
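A small simulation illustrates the difference between the two kinds of error. It is a sketch under stated assumptions: errors are in arbitrary grade units, noise is Gaussian, and the bias is a constant offset, none of which comes from the actual model.

```python
import random
from statistics import mean


def seasonal_error(n_pitches, noise_sd, bias, rng):
    """Average per-pitch grading error over a season of n_pitches.

    Each pitch's error is a constant bias plus independent Gaussian noise.
    """
    errors = [bias + rng.gauss(0.0, noise_sd) for _ in range(n_pitches)]
    return mean(errors)


rng = random.Random(0)  # fixed seed so the run is repeatable

# Noise only: the average error shrinks roughly like noise_sd / sqrt(n).
noise_only = seasonal_error(400, noise_sd=1.0, bias=0.0, rng=rng)

# Same noise plus a systematic offset: averaging leaves the offset intact.
biased = seasonal_error(400, noise_sd=1.0, bias=1.0, rng=rng)

print(f"noise only: {noise_only:+.3f}   with bias: {biased:+.3f}")
```

The noise-only average lands close to zero, while the biased average stays close to the offset no matter how many pitches are added. A model that cannot tell a chase-zone slider from a hanging one is in the second situation.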

The Verdict

This Is Underfit, Not Overfit

Four signs that depth 4 was too shallow — and depth 5 is the right fix.

1. Systematic failure, not random

Overfitting = great training, bad test. Here the model fails everywhere — it never captures the pattern. That's textbook underfit.

2. The fix generalizes

Depth 5 improves held-out 2024 test seasonal correlations. Overfit would collapse on test data; this doesn't.

3. Depth hierarchy makes sense

Location+ (11 features) at depth 4 was the shallowest model in the family. Stuff+ (24 features, depth 5) and Pitching+ (40 features, depth 6) were already deeper.

4. Still conservative capacity

At depth 5 with 11 features, each path can use at most 5 features, just under half. That is nowhere near memorizing individual pitches, and 32 leaf nodes for a continuous 2D surface is modest.

Depth Change: 4 → 5

Max Leaf Nodes: 16 → 32

Pitch-Type r Gain: +0.003

Pitcher r Gain: +0.007

Summary

One Extra Split Changes Everything

The Problem

Sliders have a non-convex value surface: two good zones separated by the worst zone. Depth 4 can't carve both.

The Mechanism

Three of four splits are consumed by the spatial interaction, leaving essentially no budget for count context. Trees face an impossible tradeoff.

The Fix

Depth 5 gives one extra split per path. Now each tree can resolve the spatial pattern and condition on count.
