How Stuff+ Works

Measuring pure pitch nastiness through physics alone

Contents

1. The Residual Target
2. The Physics of Nastiness
3. Why Residuals, Not Raw xRV?
4. Early Stopping Protects the Signal
5. What Makes Great Stuff?
6. Stuff+ Is the Raw Pitch Grade

1. The Residual Target

Stuff+ doesn't train on raw pitch outcomes. It trains on what Location+ couldn't explain — the residual. If a pitch prevented more runs than its location alone should have, that leftover is the "stuff" effect.

The Residual Pipeline

Step 1: Location+ predicts first

Location+ uses 11 features (plate_x, plate_z, zone metrics, count) to predict xRV from location alone. It answers: "how much was this pitch worth based on where it was thrown?"

Step 2: Subtract Location+ prediction

residual = raw xRV - Location+ prediction. Whatever the location model couldn't explain becomes the training target for Stuff+.
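The subtraction in Step 2 can be sketched in a few lines. This is a minimal illustration, assuming per-pitch xRV values and Location+ predictions are already available as parallel lists; the function name and the sample numbers are hypothetical and not taken from the actual pipeline.

```python
def compute_residuals(raw_xrv, location_pred):
    """Return raw xRV minus the Location+ prediction, one value per pitch."""
    if len(raw_xrv) != len(location_pred):
        raise ValueError("inputs must have one entry per pitch")
    return [xrv - pred for xrv, pred in zip(raw_xrv, location_pred)]


# Hypothetical values for three pitches (not real data).
raw_xrv = [0.05, -0.02, 0.00]
location_pred = [0.02, -0.02, 0.03]

# These residuals would become the training target for Stuff+.
residuals = compute_residuals(raw_xrv, location_pred)
```

The second pitch has a residual of zero: location explained its outcome completely, so it carries no stuff signal in either direction.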

Step 3: Stuff+ trains on residuals

XGBoost sees 24 physics features and tries to predict the residual. It learns which physical properties make pitches harder to hit, independent of where they're thrown.

Step 4: Scale to 100 = average

Raw predictions are converted to a grade where 100 is MLB average. A Stuff+ of 120 means the pitch's physics are 20% nastier than average. 80 means 20% less nasty.
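The exact scaling formula is not given here. As a rough sketch, one common convention for plus-style grades centers predictions on the league mean and spreads them by the league standard deviation. The code below assumes that convention, assumes a lower predicted run value is better for the pitcher, and uses a hypothetical 10 points per standard deviation; the real Stuff+ scaling may use different constants or a different formula entirely.

```python
from statistics import mean, pstdev


def to_plus_scale(preds, points_per_sd=10.0):
    """Map raw model predictions to a grade where 100 is the sample average."""
    mu = mean(preds)
    sd = pstdev(preds)
    if sd == 0:
        # Every pitch predicted identically: all grades are average.
        return [100.0 for _ in preds]
    # Assumption: lower predicted run value = nastier pitch, so flip the sign.
    return [100.0 - points_per_sd * (p - mu) / sd for p in preds]


# Hypothetical raw predictions for three pitches (not real data).
grades = to_plus_scale([-0.02, 0.00, 0.02])
```

By construction the grades average to 100, and the pitch with the lowest predicted run value receives the highest grade.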

2. The Physics of Nastiness

Stuff+ uses 24 physics-only features, grouped into five categories. No location, no count, no batter quality: just the raw physical properties of the pitch.

Feature Explorer

effective_speed

Perceived velocity accounting for extension. A pitcher releasing closer to the plate has effectively faster pitches.

plate_speed

Actual speed when the ball crosses the plate. The gap between release and plate speed reveals how much the pitch decelerates.

Note: raw release_speed was removed because it is redundant with effective_speed and plate_speed, which together capture both perceived velocity and deceleration.

pfx_x

Horizontal movement (inches). How much the pitch breaks left/right vs. a spinless pitch.

pfx_z

Vertical movement (inches). How much the pitch rises or drops vs. gravity alone.

pfx_total

Total movement magnitude. The combined horizontal + vertical break distance.
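One plausible reading of "combined" is the Euclidean length of the movement vector. The sketch below assumes that definition; the function signature is illustrative, and the actual feature may be computed differently.

```python
import math


def pfx_total(pfx_x, pfx_z):
    """Total movement in inches, assuming a Euclidean combination of the two axes."""
    return math.hypot(pfx_x, pfx_z)


# A pitch with 3 inches of horizontal and 4 inches of vertical movement.
total = pfx_total(3.0, 4.0)
```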

pfx_z_vs_avg

Vertical movement vs. the average for that pitch type. Unusual drop/rise catches batters off-guard.

vmov_diff_from_ff

Vertical movement gap from the pitcher's own fastball. Bigger gap = harder for batters to distinguish.
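A sketch of how this gap might be derived for one pitcher, assuming four-seam fastballs are tagged with the pitch type code "FF" and that the gap is the pitch's pfx_z minus the pitcher's mean fastball pfx_z. Both the sign convention and the dictionary layout are assumptions made for illustration.

```python
from statistics import mean


def vmov_diff_from_ff(pitches):
    """For one pitcher's pitches, return pfx_z minus his average fastball pfx_z."""
    ff_values = [p["pfx_z"] for p in pitches if p["pitch_type"] == "FF"]
    if not ff_values:
        # No fastball baseline available for this pitcher.
        return [None for _ in pitches]
    ff_avg = mean(ff_values)
    return [p["pfx_z"] - ff_avg for p in pitches]


# Hypothetical pitches from a single pitcher (not real data).
sample = [
    {"pitch_type": "FF", "pfx_z": 16.0},
    {"pitch_type": "FF", "pfx_z": 18.0},
    {"pitch_type": "CU", "pfx_z": -8.0},
]
diffs = vmov_diff_from_ff(sample)
```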

ivb_adjusted

Induced vertical break adjusted for arm slot. Isolates spin-driven rise from mechanical tilt.

release_speed_drop

Speed lost between release and plate. Reveals pitch deceleration profile and "heaviness."
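As a small illustration, the drop is a simple difference. The sketch assumes both speeds are in the same unit (mph here) and that release speed is still available upstream for deriving this feature, even though it is not a model input itself.

```python
def release_speed_drop(release_speed, plate_speed):
    """Speed lost in flight, in the same unit as the inputs."""
    return release_speed - plate_speed


# Hypothetical fastball: 95.0 mph at release, 87.5 mph at the plate.
drop = release_speed_drop(95.0, 87.5)
```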

pfx_z_vs_avg

Context for how a pitch's vertical break compares to its pitch type average across the league.

spin_rate

Total revolutions per minute. Higher spin creates more movement potential, but not all spin translates to break.

spin_axis

The angle of the spin axis in degrees. Determines the direction of movement the spin creates.

spin_efficiency

Percentage of spin that generates movement (vs. gyroscopic spin that doesn't). A bullet-spin pitch, such as a gyro slider, has low efficiency.

active_spin

spin_rate × spin_efficiency. The actual RPM contributing to pitch movement. The "useful" spin.
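The product above can be written out directly. This sketch assumes spin_efficiency is stored as a percentage from 0 to 100; if the data stores it as a fraction, the division by 100 should be dropped.

```python
def active_spin(spin_rate_rpm, spin_efficiency_pct):
    """RPM that contributes to movement, given efficiency as a percentage."""
    if not 0.0 <= spin_efficiency_pct <= 100.0:
        raise ValueError("spin_efficiency_pct must be between 0 and 100")
    return spin_rate_rpm * spin_efficiency_pct / 100.0


# Hypothetical fastball: 2400 RPM at 90% efficiency.
useful_rpm = active_spin(2400, 90)
```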

release_pos_x

Horizontal release position. Arm-side location affects pitch angle into the zone.

release_pos_y

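Release point along the line from the mound to the plate. In raw Statcast data this is measured in feet from home plate, so a smaller value means more extension and a release effectively closer to the batter.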

release_pos_z

Vertical release height. Determines the natural plane of the pitch and perceived angle.

release_extension

How far in front of the rubber the ball is released. More extension = less reaction time for the batter.

approach_angle

Overall angle at which the pitch approaches the plate. Combines vertical and horizontal components.

haa (horizontal approach angle)

Side-to-side angle at the plate. A sweeper arrives at a very different HAA than a cutter, fooling timing.

VAA (vertical approach angle)

The downward angle as the pitch crosses the plate. Flatter VAA on fastballs = more "rise" illusion = more swings under.
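VAA is not a raw tracking field, so it has to be derived. The sketch below uses a widely shared constant-acceleration approximation with Statcast-style inputs (vy0, vz0 in ft/s and ay, az in ft/s² measured at y = 50 ft, with the front of the plate at 17/12 ft). Whether the Stuff+ pipeline computes VAA this way is an assumption.

```python
import math


def vertical_approach_angle(vy0, vz0, ay, az, y0=50.0, y_plate=17.0 / 12.0):
    """Approximate VAA in degrees; negative means the ball is descending.

    Assumes constant acceleration between y0 and the plate and a nonzero ay.
    """
    # Velocity toward the plate when the ball reaches the front of home plate.
    vy_f = -math.sqrt(vy0 ** 2 - 2.0 * ay * (y0 - y_plate))
    # Flight time from y0 to the plate.
    t = (vy_f - vy0) / ay
    # Vertical velocity at the plate.
    vz_f = vz0 + az * t
    return -math.degrees(math.atan(vz_f / vy_f))


# Hypothetical fastball kinematics (not real data).
vaa = vertical_approach_angle(vy0=-135.0, vz0=-5.0, ay=28.0, az=-15.0)
```

With these inputs the result is a little under -5 degrees, which is in the range usually quoted for four-seam fastballs.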

tunnel_diff_x

Horizontal separation from other pitches at the tunnel point (where batters commit). Less separation = more deception.

tunnel_diff_z

Vertical separation at the tunnel point. Pitches that look identical until late create the most whiffs.

tunnel_diff_speed

Speed difference at the tunnel point. If two pitches reach the commit-point at the same time, batter timing breaks.

Think of it this way: These 24 features describe everything about a pitch's flight path that a batter has to react to: how fast it's coming, how it's spinning, which way it breaks, and how deceptive the trajectory is. Stuff+ compresses all of that into a single number: "how hard is this pitch to hit, based purely on its physics?"

3. Why Residuals, Not Raw xRV?

This is the design choice that makes Stuff+ actually measure stuff instead of accidentally crediting location. Without residuals, the model conflates where the pitch was thrown with how nasty it was.

Without Residuals (Bad)

When training on raw xRV, a perfectly located slow pitch looks good. The model can't separate why it worked — was it the location or the stuff? It learns the wrong lesson.

With Residuals (Correct)

With residuals, location credit is already removed. The model only sees: "given where this was thrown, did the physics help or hurt?" Slow + flat = bad residual = bad stuff. Correct.

The Key Insight

Residuals let us correctly identify elite stuff even when outcomes are average (because location was bad), and correctly penalize bad stuff even when outcomes are good (because location bailed it out).
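A worked example of the two cases, using made-up numbers. It assumes xRV is expressed in runs with negative values favoring the pitcher; the pitch labels, values, and the helper name are all hypothetical.

```python
# Hypothetical per-pitch values in runs; negative = good for the pitcher (assumed).
pitches = {
    "elite stuff, poor location": {"raw_xrv": 0.00, "location_pred": 0.03},
    "weak stuff, great location": {"raw_xrv": -0.01, "location_pred": -0.04},
}


def stuff_signal(raw_xrv, location_pred):
    """Say whether the physics helped or hurt relative to the location baseline."""
    residual = raw_xrv - location_pred
    if residual < 0:
        return "helped"
    if residual > 0:
        return "hurt"
    return "neutral"


verdicts = {name: stuff_signal(**vals) for name, vals in pitches.items()}
```

The first pitch had an average outcome, but location alone predicted a bad one, so the physics get credit. The second pitch had a good outcome, but location predicted an even better one, so the physics are penalized.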

4. Early Stopping Protects the Signal

The residual target is extremely noisy at the per-pitch level. A single pitch outcome depends on count, batter, timing, luck — the physics signal is real but tiny. XGBoost's early stopping is critical here. (See the XGBoost Trees explainer for the full story.)
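The stopping rule itself is simple. The sketch below is a plain-Python illustration of the general technique, not XGBoost's internal code: keep adding trees while validation loss improves, and stop once it has failed to improve for a set number of rounds. The patience value and the loss sequence are hypothetical.

```python
def best_round(val_losses, patience):
    """Return how many trees to keep, given validation loss after each tree."""
    best_idx = 0
    best_loss = float("inf")
    for i, loss in enumerate(val_losses):
        if loss < best_loss:
            # New best: remember it and keep training.
            best_loss = loss
            best_idx = i
        elif i - best_idx >= patience:
            # No improvement for `patience` rounds: stop here.
            break
    return best_idx + 1


# Hypothetical validation losses: improvement stalls after the third tree.
losses = [1.00, 0.90, 0.85, 0.86, 0.87, 0.88, 0.80]
kept = best_round(losses, patience=3)
```

Training halts before the final value is ever evaluated, so only the first three trees are kept. On a noisy target, this is what prevents later trees from fitting luck instead of physics.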

Stuff+ Model Configuration

Max trees allowed: 600

Where training actually stops: 50–200 trees

Max tree depth: (value missing)

Learning rate: 0.08

Min child weight: 150
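For reference, the stated values map onto XGBoost's scikit-learn-style parameter names roughly as follows. This is a config sketch only: the max tree depth is left out because no value is given, and the early stopping patience is a hypothetical placeholder, not a documented setting.

```python
# Hyperparameters as they might be passed to xgboost.XGBRegressor(**stuff_plus_params).
stuff_plus_params = {
    "n_estimators": 600,          # max trees allowed
    "learning_rate": 0.08,
    "min_child_weight": 150,
    "early_stopping_rounds": 50,  # hypothetical patience; not stated in this document
}
```

A min child weight this high forces every leaf to cover a large number of pitches, which fits the goal of ignoring per-pitch noise.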

6-Way Split: Different Models for Different Contexts

Fastballs have the most data and strongest signal, so their models run longer. Breaking balls are noisier and stop earlier. By splitting into 6 models, each one learns the physics that matter most for that specific pitch-batter combination.

5. What Makes Great Stuff?

Different pitch types earn elite Stuff+ grades through different physics.

The unifying theme: Great stuff means the pitch does something the batter's brain doesn't expect. For fastballs, it's arriving faster and flatter than predicted. For breaking balls, it's breaking harder and further than predicted. For changeups, it's looking like a fastball until it's too late. Stuff+ quantifies all of this.

6. Stuff+ Is the Raw Pitch Grade

Stuff+ Measures Pitch Design, Not Game Outcomes

Use Stuff+ to evaluate a pitch's physical quality — the raw material. A pitcher can have elite Stuff+ and still get shelled if their command is bad (that's what Location+ measures). Conversely, mediocre stuff with elite command can succeed.

Use for: Pitch design decisions

"Should this pitcher develop a sweeper or a cutter?" Stuff+ tells you which pitch shapes their mechanics produce most effectively.

Use for: Scouting raw talent

A reliever with 130 Stuff+ and 70 Location+ has elite raw material but needs to refine command. The ceiling is very high.

Don't use for: Predicting ERA alone

Stuff+ is only half the equation. Pitching+ (which combines Stuff+ and Location+) is the full picture for predicting outcomes.

Don't use for: Ignoring context

A 120 Stuff+ changeup is useless if the pitcher can't locate their fastball. Stuff grades individual pitches; the arsenal matters too.

Stockyard Baseball — Model Explainers