A full-stack web app for analyzing MLB pitching data, built on pitch-level Statcast data with custom-computed statistics and machine learning pitch grades.
The goal of this project is to get information into the hands of players, coaches, and fans. For players and coaches, that means answering the what and the why, so the only remaining question is how. The how is the focus of stockyardbaseball.com, a coaching service that helps players develop what they need to reach the big leagues.
8.9M+ pitches tracked · 3,400+ pitchers · 11 seasons · 180+ statistics
Three per-pitch machine learning models (Stuff+, Location+, Pitching+) grade every pitch thrown since 2015, plus a pitcher-season deception model (2024+). All grades use a 100-based scale where 100 is league average and higher is better. Per-pitch models score each pitch separately against left-handed and right-handed batters.
Stuff+
How nasty is the pitch? Grades based on physical characteristics: velocity, movement, spin, release point, extension, arm angle, tunnel deception, and trajectory.
XGBoost · 24 features
Location+
How well is the pitch located? Grades based on plate location, count awareness, inning context, pitch deflection, induced vertical break, and trajectory break features.
XGBoost · 23 features
Pitching+
The full picture. Combines stuff and location features with sequencing context (pitch mix, fastball differentials, tunnel rotation) for an overall pitch effectiveness grade.
XGBoost · 37 features · Highest predictive power of the three
Deception
How much harder is the pitcher to hit than expected? Trained on bat tracking residuals (bat speed suppression, swing length shortening) to isolate deception independent of Stuff+/Location+/Pitching+. Measures delivery deception and timing disruption.
Ridge regression · ~15 features · r = −0.160 with FIP · Available 2024+
Arsenal Synergy
A season-level model that measures how well a pitcher's pitch types work together as a unit. Trained on the residual between a pitcher's actual CSW rate and the CSW predicted by their individual Stockyard Stuff+ grades, which isolates the interaction effects between pitches.
Captures pairwise velocity and movement contrasts between pitch types, tunnel consistency, and specialist vs. diverse arsenal dynamics. A pitcher with two elite pitches that tunnel well together scores higher than one with three average pitches.
RidgeCV · 26 features · r = 0.75 year-over-year stability · Independent of Stuff+ (r = −0.05)
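The residual target described above can be sketched roughly as follows. This is a minimal illustration, not the site's pipeline: it assumes a one-variable least-squares fit of CSW on a usage-weighted Stuff+ grade stands in for "CSW predicted by Stuff+", and the function names and sample numbers are hypothetical.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least squares for y = a + b*x with a single predictor."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b

def synergy_targets(stuff_plus, csw):
    """Residual = actual CSW minus the CSW predicted from Stuff+ alone."""
    a, b = fit_line(stuff_plus, csw)
    return [c - (a + b * s) for s, c in zip(stuff_plus, csw)]

# Made-up pitcher-seasons: usage-weighted Stuff+ and actual CSW rate.
stuff = [95.0, 100.0, 105.0, 110.0]
csw = [0.26, 0.29, 0.28, 0.33]
residuals = synergy_targets(stuff, csw)
```

A positive residual means the arsenal earned more called strikes and whiffs than its individual pitch grades predict. That leftover is what a synergy model would then try to explain from pitch-pair features.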
Next-season predictions for pitcher performance. ElasticNet models trained on 10 years of pitcher-season transitions (1,700+ data points) with 200+ features from Statcast, FanGraphs, and Stockyard grades.
Direct predictions of strikeout rate and walk rate for the following season, the two stats most within a pitcher's control.
K% R² = 0.52 · BB% R² = 0.34 · Beats Marcel-style baselines
Derived from projected K% and BB% using the formula 5.40 − 12 × (K% − BB%). A simple ERA estimator that's nearly as predictive as SIERA with no fitting required.
Public formula · r = 0.60 year-over-year · r = 0.25 vs next-year ERA
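As a worked example of the formula above, here is a minimal sketch. It assumes K% and BB% are expressed as fractions of batters faced (0.25 rather than 25), and the function name is hypothetical.

```python
def era_from_k_bb(k_pct: float, bb_pct: float) -> float:
    """ERA estimate from 5.40 - 12 * (K% - BB%), with rates as fractions."""
    return 5.40 - 12.0 * (k_pct - bb_pct)

# A pitcher projected for a 25% strikeout rate and an 8% walk rate.
projected = era_from_k_bb(k_pct=0.25, bb_pct=0.08)
print(round(projected, 2))  # 3.36
```

Each additional point of K-BB% lowers the estimate by 0.12 runs.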
Statcast (Primary Source)
Every pitch thrown since 2015 — velocity, spin, movement, location, outcomes, and extended tracking data (arm angle, bat speed, game situation). The majority of stats on this site are computed directly from this pitch-level data, including K%, BB%, FIP, xFIP, BABIP, batted ball stats, plate discipline, contact quality, pitch movement, and more.
MLB Stats API
Official scorer data that can't be derived from pitch tracking alone — ERA (requires earned/unearned run distinction), games started, wins, losses, saves, and holds.
FanGraphs
A small set of advanced metrics that require park factors or complex modeling beyond what pitch data alone can produce: WAR, SIERA, xERA, park-adjusted stats (ERA-, FIP-), leverage/clutch metrics (WPA, RE24, Clutch), and run values per pitch type.
Three-tier merge priority
FanGraphs provides the baseline, self-computed Statcast aggregations override it wherever a stat can be derived from pitch data, and MLB official stats override both. Each stat therefore comes from the most authoritative source available for it.
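The merge priority can be illustrated with plain dictionaries. This is a sketch under assumptions, not the project's code: the function name, the stat keys, and the rule that a missing (None) value never overrides an existing one are all hypothetical.

```python
def merge_stats(fangraphs: dict, statcast: dict, mlb_official: dict) -> dict:
    """Merge one pitcher-season from three sources, lowest priority first."""
    merged = dict(fangraphs)  # tier 1: baseline
    for source in (statcast, mlb_official):  # tiers 2 and 3 override
        # Skip None so a missing value never wipes out a lower-tier value.
        merged.update({k: v for k, v in source.items() if v is not None})
    return merged

row = merge_stats(
    fangraphs={"WAR": 4.1, "K%": 0.271, "ERA": 3.12},
    statcast={"K%": 0.273, "FIP": 3.05, "ERA": None},
    mlb_official={"ERA": 3.10, "W": 14},
)
```

Here `row["K%"]` comes from the Statcast aggregation, `row["ERA"]` from the official record, and `row["WAR"]` stays at the FanGraphs value because no higher tier supplies it.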
Frontend
Next.js 14, React, Tailwind CSS, TanStack Query, Recharts
Backend
Python FastAPI, SQLAlchemy, SQLite
ML Models
XGBoost for pitch grades (Stuff+, Location+, Pitching+), RidgeCV for Arsenal Synergy, ElasticNet for projections
Data Ingestion
pybaseball for Statcast, MLB Stats API for official records, FanGraphs API for advanced metrics
Every line of code in this project — frontend, backend, data pipelines, machine learning models, and deployment infrastructure — was written by AI using Claude by Anthropic, with code review assistance from Codex by OpenAI. A human guides the direction, designs the models, and makes product decisions, but the implementation is 100% AI-generated. No hand-written code.
Grade Scale
100 = league average. 110 = good. 120 = elite. 80 = well below average. One standard deviation is roughly 10 points.
Signal Strength
Trust season grades (100+ pitches). Question monthly grades. Ignore single-game grades. As sample size shrinks, noise dominates.
Location+ vs Stuff+
Decorrelated by design. A pitcher can have elite Stuff+ and average Location+ — that tells you the pitches are nasty but command needs work.
Run Value Sign
Raw xRV: negative = runs prevented = good for pitchers. The 100 scale flips this so higher is always better, regardless of the underlying metric.
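One way to picture the scale and the sign flip is a z-score rescaling. The sketch below assumes a plain population mean and standard deviation over the values passed in; the site's actual normalization is not described here, and the function name and sample values are hypothetical.

```python
from statistics import mean, pstdev

def to_plus_scale(xrv_values):
    """Map raw xRV to a 100-based grade: 100 = average, 10 points per SD.

    The minus sign flips the direction, so negative xRV (runs prevented)
    becomes a grade above 100.
    """
    mu = mean(xrv_values)
    sigma = pstdev(xrv_values)
    return [100.0 - 10.0 * (x - mu) / sigma for x in xrv_values]

# Made-up raw xRV values, from best (most negative) to worst.
raw = [-0.02, -0.01, 0.00, 0.01, 0.02]
grades = to_plus_scale(raw)
```

The most negative xRV gets the highest grade, the average value lands at 100, and the grades keep a spread of 10 points per standard deviation.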
Rather than relying on third-party leaderboards, the majority of statistics on this site are aggregated directly from raw Statcast pitch data. This gives us full control over methodology and lets us offer stats that aren't available elsewhere.
Rate Stats
K%, BB%, K-BB%, HR%, TTO%, BABIP, AVG, OBP, SLG, WHIP, K/9, BB/9, HR/9
ERA Estimators
FIP and xFIP computed with year-specific constants and league HR/FB rates
Batted Ball
GB%, FB%, LD%, IFFB%, Pull%, Cent%, Oppo%, FB Pull% — spray angles from hit coordinates
Contact Quality
Hard Hit%, Soft%, Barrel%, Sweet Spot%, Avg/Max Exit Velo, Avg Launch Angle
Plate Discipline
O-Swing%, Z-Swing%, Swing%, O-Contact%, Z-Contact%, Contact%, SwStr%, CStr%, CSW%
Pitch Movement
iVB and iHB from raw Statcast pfx data — more accurate than FanGraphs, which measures at a shorter distance
Arm-Slot Adjusted
Movement and acceleration adjusted for arm angle via regression, with submarine outlier detection
Per-Pitch-Type
Velocity, usage, CSW%, Barrel%, and xRV per 100 for each of 7 pitch types
Stockyard Originals
xRV (BABIP-neutral expected run value), Stockyard Stuff+, Location+, Pitching+ grades for every pitch, Arsenal Synergy per season, count-normalized variants
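For reference, the ERA estimators listed above follow the standard public FIP and xFIP formulas, sketched below. The constant and league HR/FB rate in the example are illustrative placeholders rather than the site's year-specific values, and innings are assumed to be decimal (180.333, not the 180.1 box-score notation).

```python
def fip(hr, bb, hbp, k, ip, constant):
    """Fielding Independent Pitching from counting stats."""
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant

def xfip(fly_balls, league_hr_per_fb, bb, hbp, k, ip, constant):
    """FIP with actual HR replaced by fly balls * league HR/FB rate."""
    expected_hr = fly_balls * league_hr_per_fb
    return fip(expected_hr, bb, hbp, k, ip, constant)

# Illustrative season line; 3.10 and 0.12 are placeholder league values.
season_fip = fip(hr=20, bb=50, hbp=5, k=200, ip=180.0, constant=3.10)
season_xfip = xfip(fly_balls=200, league_hr_per_fb=0.12,
                   bb=50, hbp=5, k=200, ip=180.0, constant=3.10)
```

In this example the pitcher allowed 20 home runs on 200 fly balls, fewer than the 24 the placeholder league rate would expect, so xFIP comes out higher than FIP.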
Key terms used across the site.
The angle of the pitcher’s arm slot at release, measured in degrees from horizontal. Used to adjust movement expectations and calculate arm-slot-adjusted metrics.
Season-level model measuring how a pitcher’s pitch types work together as a unit. Trained on the gap between actual CSW and CSW predicted by individual Stuff+ grades. Independent of Stuff+ (r = −0.05).
Batting Average on Balls In Play. Measures the rate at which batted balls (excluding HR) fall for hits. Used to identify luck — league average is ~.300.
Called Strike + Whiff Rate. Percentage of pitches resulting in a called strike or swinging strike. A measure of pitch dominance.
Season-level model trained on bat tracking residuals (bat speed suppression, swing length shortening) to isolate deception independent of Stuff+/Location+/Pitching+. Available 2024+.
Fielding Independent Pitching. ERA estimator using only strikeouts, walks, HBP, and home runs — the outcomes most within a pitcher’s control.
Induced Horizontal Break. Gravity-adjusted horizontal movement, computed from raw Statcast pfx data. Measured from the catcher’s perspective.
Induced Vertical Break. Gravity-adjusted vertical movement of a pitch, computed from raw Statcast pfx data. Positive = rise relative to gravity.
ERA estimator derived from K% and BB%: 5.40 − 12 × (K% − BB%). Simple, no fitting required, nearly as predictive as SIERA for next-season ERA.
Command-only pitch grade measuring where the pitch crosses the plate relative to count context. Intentionally blind to velocity and movement.
Combined pitch grade merging physics, command, sequencing, and batter quality into one holistic effectiveness score. Highest predictive power of the three.
Count-normalized Pitching+ variant. Same concept as Stuff+ CountNorm: it removes the bias that makes pitchers who work more often in favorable counts appear better than they are.
Skill-Interactive ERA. FanGraphs advanced ERA estimator that adjusts for batted ball profile, park factors, and interaction effects.
Physics-only pitch grade measuring velocity, movement, spin, and release traits. 100 = league average, higher is better. Intentionally blind to location.
Count-normalized Stuff+ variant that strips count-selection bias from grade averages. Correlates better with season outcomes because it removes the effect of pitching in favorable vs unfavorable counts.
How much a pitch’s trajectory diverges from the fastball after the decision point where the batter must commit. Greater late divergence means the pitch looked like the fastball until the batter committed and then ended up somewhere else, making it harder to read.
Vertical Approach Angle. The downward angle (in degrees) at which the pitch arrives at the plate. Flatter VAA on fastballs correlates with more swings and misses.
Wins Above Replacement. Estimates total value of a player compared to a replacement-level player. Sourced from FanGraphs (fWAR) on this site.
Statcast ERA estimator based on expected wOBA (xwOBA) allowed. Adjusts for quality of contact rather than actual outcomes.
Expected FIP. Same as FIP but replaces actual home runs with expected home runs, computed as fly balls allowed × league HR/FB rate. More stable year-over-year than FIP.
Expected Run Value. BABIP-neutral estimate of each pitch’s run impact. Negative = runs prevented = good for pitchers. The training target for all pitch grade models.
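The Vertical Approach Angle entry above can be made concrete with the commonly used public calculation from Statcast's constant-acceleration fit. This sketch assumes the Statcast fields vy0, vz0, ay, and az (velocities in ft/s and accelerations in ft/s², reported at y = 50 ft) and the front of the plate at 17/12 ft. It is not necessarily how this site computes VAA, and the sample pitch is made up.

```python
import math

Y_START = 50.0        # ft: distance at which Statcast reports vy0 and vz0
Y_PLATE = 17.0 / 12   # ft: front edge of home plate

def vertical_approach_angle(vy0, vz0, ay, az):
    """VAA in degrees; negative values mean the pitch is descending."""
    # Forward velocity at the plate (negative: moving toward the catcher).
    vy_f = -math.sqrt(vy0 ** 2 - 2.0 * ay * (Y_START - Y_PLATE))
    # Flight time from y = 50 ft to the plate.
    t = (vy_f - vy0) / ay
    # Vertical velocity at the plate.
    vz_f = vz0 + az * t
    return -math.degrees(math.atan(vz_f / vy_f))

# Made-up four-seam fastball.
vaa = vertical_approach_angle(vy0=-135.0, vz0=-5.0, ay=28.0, az=-15.0)
```

With these inputs the result is a little under five degrees of descent. Under this sign convention, a "flatter" fastball has a VAA closer to zero.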