MLB ELO RATING


Note: Team ELO is calculated from individual player ELOs, weighted by plate appearances (batters) and batters faced (pitchers). It does not account for defense or stolen bases, so speed-oriented teams (KC, SEA) may rank slightly lower, and poor defensive teams (CHW, DET) slightly higher, than their true value. Batting ELO correlates with Runs Scored (r = 0.90) and Pitching ELO with Runs Against (r = -0.91).


📚 Project Abstract

This system applies the FanGraphs-style playerElo model (Jacob Richey) to MLB plate appearances. Each PA is scored by its run value, which is compared against an expected run value based on batter and pitcher Elo and the current base-out state. Ratings move according to how much the outcome outperforms (or underperforms) expectation. The baseline is 1,500 (league average).

⚙ Key Adjustments

RE24 run value: Play RV = RE_after − RE_before + runs_scored
Expected RV (state matrix): Quadratic expected RV by base-out state and Elo
Park factors: Park run environment adjustment (FanGraphs method)
Home advantage: Applied to batter expected RV when batting at home
Field error handling: Batters do not gain Elo from errors

🔢 ELO Formula (FanGraphs / Jacob Richey)

All players start at 1,500. Per PA:

Play RV: rv = RE_after − RE_before + runs_scored
Expected RV: xRV = A·Elo² + B·Elo + C (state-specific)
RV diff: rv_diff = rv − ((xRV_b + xRV_p)/2) − park_factor
Elo update: bΔ = (921.675 + 6046.370·rv_diff − Elo_b)/502,
                pΔ = (965.754 − 4762.089·rv_diff − Elo_p)/502

Parameter	Value
INITIAL_ELO	1500.0
MIN_ELO	500.0
DIVISOR	502.0
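
As a rough illustration, the per-PA update above can be sketched in Python. The regression constants and parameters come from the formula and table above; the `StateCoeffs` class, the use of separate coefficient sets for batter and pitcher, and applying MIN_ELO as a simple floor are assumptions made for this sketch. Home advantage and error handling are left out.

```python
from dataclasses import dataclass

INITIAL_ELO = 1500.0
MIN_ELO = 500.0
DIVISOR = 502.0


@dataclass
class StateCoeffs:
    """Quadratic coefficients for one base-out state (hypothetical container)."""
    a: float
    b: float
    c: float

    def expected_rv(self, elo: float) -> float:
        # xRV = A*Elo^2 + B*Elo + C
        return self.a * elo ** 2 + self.b * elo + self.c


def play_rv(re_before: float, re_after: float, runs_scored: int) -> float:
    """RE24 run value of a single play."""
    return re_after - re_before + runs_scored


def update_elos(elo_b: float, elo_p: float, rv: float,
                bat_coeffs: StateCoeffs, pit_coeffs: StateCoeffs,
                park_factor: float = 0.0) -> tuple[float, float]:
    """Return (new batter Elo, new pitcher Elo) after one plate appearance."""
    xrv_b = bat_coeffs.expected_rv(elo_b)
    xrv_p = pit_coeffs.expected_rv(elo_p)
    rv_diff = rv - (xrv_b + xrv_p) / 2 - park_factor
    delta_b = (921.675 + 6046.370 * rv_diff - elo_b) / DIVISOR
    delta_p = (965.754 - 4762.089 * rv_diff - elo_p) / DIVISOR
    # Assumption: MIN_ELO acts as a hard floor on the updated rating.
    return max(MIN_ELO, elo_b + delta_b), max(MIN_ELO, elo_p + delta_p)
```

The real state matrix supplies A, B, and C per base-out state; any values passed to `StateCoeffs` here are placeholders.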

📊 Data Source

Statcast pitch-level data aggregated into plate appearances
Per-PA base/out state, score change, and runner movement
Run value computed from RE24 (2016–2018 Retrosheet baseline)

🎯 What Moves Elo

High RV plays in leverage states (bases loaded, fewer outs)
Outperforming a strong opponent in a difficult state
Run-suppressing outcomes by pitchers in high-expectancy states

The same event can produce very different Elo changes depending on state and opponent quality.

🏆 ELO Tiers

Tier	Range	Description
Elite	1,800+	MVP-caliber
High	1,650-1,799	All-Star level
Above Avg	1,550-1,649	Above-average starter
Average	1,450-1,549	League average
Below Avg	1,350-1,449	Below-average
Low	1,200-1,349	Struggling
Cold	<1,200	Significant slump
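
The tier table can be read as a simple threshold lookup. The sketch below assumes each tier starts at its lower bound and runs up to the next tier's lower bound, so a fractional rating such as 1,799.5 counts as High; the function name is illustrative.

```python
# Lower bound of each tier, highest first (from the table above).
TIERS = [
    (1800, "Elite"),
    (1650, "High"),
    (1550, "Above Avg"),
    (1450, "Average"),
    (1350, "Below Avg"),
    (1200, "Low"),
]


def elo_tier(elo: float) -> str:
    """Map an ELO rating to its tier name."""
    for floor, name in TIERS:
        if elo >= floor:
            return name
    return "Cold"  # anything below 1,200
```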

🏡 Park Factor Examples

Stadium	Park Factor	Effect
Coors Field (COL)	+0.0172	Run-friendly, RV adjusted down
T-Mobile Park (SEA)	-0.0175	Run-suppressing, RV adjusted up
Yankee Stadium (NYY)	+0.0119	Slightly run-friendly

Park factors are applied as a run-value adjustment to the RV diff.

📚 What is ELO?

The ELO Rating System is a method originally designed to calculate the relative skill levels of chess players. Players start at 1,500 (league average). In this model, each plate appearance is scored by run value, compared to expected run value given the base-out state and player quality. Elite performers rise above 1,800; struggling players fall below 1,200.

⚖ Expectation-Based Updates

Elo changes are driven by how much the outcome beat expectations. The model computes batter and pitcher updates separately using FanGraphs regression formulas. A player's rating only rises by outperforming expectation.

⚡ Expected RV & Run Value

Each PA gets a true run value from RE24 and an expected run value from the state matrix. The difference between these, adjusted for park factor, drives Elo changes.

Component	Description
Play RV	RE_after − RE_before + runs_scored
Expected RV	Quadratic by state and Elo
Park Factor	Run environment adjustment

In run-friendly parks, positive outcomes are discounted to avoid inflation.

⚖ Pitcher-Batter Context

Batter and pitcher Elo are on the same 1,500-based scale, but updates are computed independently. High-Elo pitchers consistently suppress run value in high-expectancy states; high-Elo batters do the opposite.

💡 Interpreting Ratings Across Roles

Ratings are most meaningful within role (batters vs pitchers). Use percentiles within each role when comparing talent tiers, especially for edge cases like relievers vs everyday hitters.

Key Takeaway: When comparing players across roles, use role-relative rankings (percentile within batters or within pitchers) rather than raw ELO values.

⚾ Two-Way Players

Players who both bat and pitch have two independent ELO ratings, one for batting and one for pitching. Each rating changes only when the player acts in that role. Switch between the Batting and Pitching tabs to see separate OHLC charts. Two-way players are marked with a TWP badge in search results and leaderboards.

📊 Reading OHLC Charts

Open: ELO at start of day (first PA)
High: Highest ELO during day
Low: Lowest ELO during day
Close: ELO at end of day (last PA)

■ Green candles indicate ELO rose that day (close > open)
■ Red candles indicate decline (close < open)

Moving averages (MA5, MA15) smooth daily noise.

🎯 What is Talent ELO?

While the main ELO captures overall performance, Talent ELO decomposes it into specific skill dimensions, each tracked independently using a binary matchup model. There is no composite talent score: each dimension stands alone. A player might be Elite in Power but Average in Contact.

⚾ Batter Dimensions (4)

Dimension	Description	Positive Events	Negative Events
Contact	Avoid strikeouts, make contact	1B, 2B, 3B	K, Out
Power	Extra-base hit and HR ability	HR (1.0), 2B (0.7)	GIDP (-0.7), Out (-0.2)
Discipline	Plate discipline, walk rate	BB (1.0), HBP (0.8)	—
Clutch	High-leverage with RISP	Hits with runners on 2B/3B	Outs, K with RISP

⚽ Pitcher Dimensions (4)

Dimension	Description	Positive Events	Negative Events
Stuff	Strikeout-inducing ability	K (1.0)	HR (-0.8)
BIP Suppression	Batted-ball suppression/BABIP	Out (0.4), GIDP (0.5)	1B (-0.6), 2B (-0.8), 3B (-0.9)
Command	Walk prevention	K (0.3), Out (0.15)	BB (-1.0), HBP (-0.8)
Clutch	Clutch pitching with runners on	Same events amplified in RISP	—

🔢 Calculation Method

Talent ELO uses a binary matchup model: each PA pits the batter's skill against the pitcher's skill in the relevant dimension, and the outcome triggers an independent update for that dimension. The ELO expected score is computed from the rating difference between the two players.

Reliability scaling: Update size scales from 0.3 to 1.0 as a player accumulates qualifying events, so new players get dampened updates that strengthen as the sample grows.
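
A minimal sketch of one dimension update, under several assumptions not stated in this guide: the classic 400-point logistic expected score, a linear reliability ramp that reaches 1.0 after `full_at` qualifying events, and a K-factor `k`. All three values and the function names are hypothetical; only the 0.3 to 1.0 range is taken from the text.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Classic Elo expected score of A against B (400-point scale assumed)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def reliability(n_events: int, full_at: int = 200) -> float:
    """Scale from 0.3 up to 1.0 as qualifying events accumulate (linear ramp assumed)."""
    ramp = min(max(n_events, 0) / full_at, 1.0)
    return 0.3 + 0.7 * ramp


def update_dimension(batter: float, pitcher: float, batter_won: bool,
                     n_bat: int, n_pit: int, k: float = 4.0) -> tuple[float, float]:
    """Return updated (batter, pitcher) ratings for one dimension after one PA."""
    surprise = (1.0 if batter_won else 0.0) - expected_score(batter, pitcher)
    delta = k * surprise
    # Each side's move is dampened by its own sample-size reliability.
    return batter + reliability(n_bat) * delta, pitcher - reliability(n_pit) * delta
```

The event weights listed in the dimension tables (for example HR 1.0, 2B 0.7 for Power) would enter as the outcome score in a fuller version; this sketch uses a plain win/loss outcome.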

💡 Important Notes

Uses same tier system (Elite 1,800+, etc.)
Dimensions are independent (can be Elite in Power, Average in Contact)
Batter Clutch and Pitcher Clutch are separate
BIP Suppression has low sensitivity (consistent with DIPS theory)
Speed dimension currently disabled due to insufficient stolen base data

🔭 What is the Matchup Predictor?

The Matchup Predictor forecasts the outcome distribution of a single plate appearance between a specific batter and pitcher. It uses Talent ELO dimensions rather than sparse head-to-head data, and it runs entirely in the browser (no backend calls except fetching Talent ELO). It is a port of the V2.1 z-score matchup predictor.

Output: 7-outcome probability distribution (BB, K, OUT, 1B, 2B, 3B, HR) plus derived metrics (expected wOBA, on-base %, expected slugging).

⚙ Input: 6 Talent Dimensions

Batter	Pitcher
Discipline (Stage 1: BB)	Command (Stage 1: BB)
Contact (Stage 1: K; Stage 2: Hit)	Stuff (Stage 1: K)
Power (Stage 3: XBH)	BIP Suppression (Stage 2: Hit)

Clutch dimensions are not used, because the predictor models a context-neutral PA.

📊 Z-Score Normalization

Raw ELO values sit on different scales in each dimension. The predictor converts them to z-scores using each dimension's own distribution:

              z = (ELO − mean) / std
            

After normalization, z=+1.0 means "one standard deviation above average" regardless of dimension.

Dimension	Mean	Std
Batter Contact	1504.5	33.9
Batter Power	1468.6	61.6
Batter Discipline	1700.3	139.0
Pitcher Stuff	1587.3	56.6
Pitcher BIP Supp.	1513.3	18.2
Pitcher Command	1681.1	126.5
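
For illustration, the normalization can be written as a lookup over the table above. The dictionary keys and function name are hypothetical; the means and standard deviations are the ones listed.

```python
# (mean, std) per talent dimension, copied from the table above.
DIMENSION_STATS = {
    "batter_contact": (1504.5, 33.9),
    "batter_power": (1468.6, 61.6),
    "batter_discipline": (1700.3, 139.0),
    "pitcher_stuff": (1587.3, 56.6),
    "pitcher_bip_suppression": (1513.3, 18.2),
    "pitcher_command": (1681.1, 126.5),
}


def z_score(dimension: str, elo: float) -> float:
    """Convert a raw Talent ELO into standard deviations from the dimension mean."""
    mean, std = DIMENSION_STATS[dimension]
    return (elo - mean) / std
```

Because BIP Suppression has a much narrower spread than Discipline, the same 20-point ELO gap means more than a full standard deviation in the former and well under a fifth of one in the latter.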

🛠 3-Stage Decision Tree

Sequential structure mirrors natural PA progression:

Stage 1: BB / K / BIP (3-way Softmax)

                z_disc_cmd = z(Discipline) − z(Command)

                z_stuff_contact = z(Stuff) − z(Contact)

                logit_BB = base_logit_BB + z_disc_cmd / 3.5

                logit_K = base_logit_K + z_stuff_contact / 3.5

                Softmax → P(BB), P(K), P(BIP)

Stage 2: Hit / Out given BIP (Logistic)

                z_contact_bip = z(Contact) − z(BIP_Suppression)

                logit = base_logit + z_contact_bip / 5.0

                P(Hit|BIP) = 1 / (1 + e^(−logit))

                P(Hit) = P(BIP) × P(Hit|BIP)

Stage 3: XBH / Single given Hit (Logistic)

                logit = base_logit + z_power / 5.0

                P(XBH|Hit) = 1 / (1 + e^(−logit))

                P(XBH) = P(Hit) × P(XBH|Hit)

                P(1B) = P(Hit) − P(XBH)

                P(2B) = P(XBH) × 0.552

                P(3B) = P(XBH) × 0.045

                P(HR) = P(XBH) × 0.403

📈 League Average Base Rates (2025)

Metric	Value
BB rate	9.49%
K rate	22.18%
BIP rate	68.34%
BABIP	.321
XBH rate	34.93%
2B ratio (of XBH)	55.2%
3B ratio	4.5%
HR ratio	40.3%
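
The three stages and the base rates above can be combined into a short sketch. It assumes the base logits are derived directly from the league rates (with BIP as the softmax reference class at logit 0), which is one way to make an average-vs-average matchup return the league rates; the function and argument names are illustrative, not the predictor's actual code.

```python
import math

# League-average base rates (2025) from the table above.
BB_RATE, K_RATE, BIP_RATE = 0.0949, 0.2218, 0.6834
BABIP, XBH_RATE = 0.321, 0.3493
XBH_SPLIT = {"2B": 0.552, "3B": 0.045, "HR": 0.403}


def _logit(p: float) -> float:
    return math.log(p / (1.0 - p))


def _sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def predict_pa(z_disc=0.0, z_cmd=0.0, z_contact=0.0,
               z_stuff=0.0, z_power=0.0, z_bip=0.0) -> dict[str, float]:
    """Return a 7-outcome probability distribution for one plate appearance."""
    # Stage 1: BB / K / BIP via 3-way softmax (BIP logit fixed at 0 by assumption).
    logit_bb = math.log(BB_RATE / BIP_RATE) + (z_disc - z_cmd) / 3.5
    logit_k = math.log(K_RATE / BIP_RATE) + (z_stuff - z_contact) / 3.5
    e_bb, e_k, e_bip = math.exp(logit_bb), math.exp(logit_k), 1.0
    total = e_bb + e_k + e_bip
    p_bb, p_k, p_bip = e_bb / total, e_k / total, e_bip / total

    # Stage 2: hit vs out, given a ball in play.
    p_hit = p_bip * _sigmoid(_logit(BABIP) + (z_contact - z_bip) / 5.0)

    # Stage 3: extra-base hit vs single, given a hit.
    p_xbh = p_hit * _sigmoid(_logit(XBH_RATE) + z_power / 5.0)

    dist = {"BB": p_bb, "K": p_k, "OUT": p_bip - p_hit, "1B": p_hit - p_xbh}
    dist.update({name: p_xbh * share for name, share in XBH_SPLIT.items()})
    return dist
```

Calling `predict_pa()` with all z-scores at zero gives the average-vs-average case; passing, say, `z_stuff=1.0` shifts probability toward strikeouts.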

⚖ Expected wOBA Calculation

7-outcome distribution converted using linear weights:

Outcome	wOBA Weight
BB	0.69
K / OUT	0.00
1B	0.88
2B	1.24
3B	1.56
HR	2.00

              expected_wOBA = Σ P(outcome) × wOBA_weight(outcome)
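
As a worked example, the weighted sum can be applied to a league-average outcome distribution built from the base rates listed earlier. The weights are the ones in the table above; the helper names are illustrative.

```python
WOBA_WEIGHTS = {"BB": 0.69, "K": 0.0, "OUT": 0.0,
                "1B": 0.88, "2B": 1.24, "3B": 1.56, "HR": 2.00}


def expected_woba(dist: dict[str, float]) -> float:
    """Sum of P(outcome) x wOBA weight over the 7 outcomes."""
    return sum(p * WOBA_WEIGHTS[outcome] for outcome, p in dist.items())


def league_average_distribution() -> dict[str, float]:
    """Outcome distribution implied by the league base rates (2025)."""
    bb, k, bip = 0.0949, 0.2218, 0.6834
    hit = bip * 0.321          # BABIP
    xbh = hit * 0.3493         # XBH rate among hits
    return {
        "BB": bb, "K": k, "OUT": bip - hit, "1B": hit - xbh,
        "2B": xbh * 0.552, "3B": xbh * 0.045, "HR": xbh * 0.403,
    }


league_woba = expected_woba(league_average_distribution())
```

Under these inputs `league_woba` comes out at about .31.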
            

💡 Interpreting Results

Expected wOBA is the best single summary (league average ~.310; above .350 favors the batter, below .280 favors the pitcher)
The outcome bar shows the probability distribution
Stage cards show which skill matchups drive the prediction
An average batter against an average pitcher recovers the league base rates
The model is context-neutral (no base-out state, leverage, platoon splits, or park factors)

⚠ Limitations

No platoon splits (L/R matchups significant)
No situational context (base-out state, inning, score, leverage)
Static talent ELO (no recent form, fatigue, injury adjustment)
No batted-ball quality modeling
Fixed XBH split (league averages, not player-specific)
Calibration assumes normality (z-scores work for bell-shaped distributions)

📈 ELO vs Market Divergence

Compares aggregated team ELO ratings against Kalshi prediction market prices to identify undervalued and overvalued teams. Useful for pre-season analysis and weekly re-ranking decisions.

⚙ How It Works

Aggregate Team ELO: Average the batter and pitcher ELOs on a team roster and blend them into a composite
Convert to Win Probability: Use logistic formula to get expected win rate
Compare to Market: Subtract market probability from ELO probability
Signal Generation: Classify divergence into undervalued/overvalued

🔢 Team ELO Aggregation

Player ELOs are aggregated to team level using weighted composite:

batting_elo_avg = AVG(batting_elo) for all batters on team
pitching_elo_avg = AVG(pitching_elo) for all pitchers on team
composite = (batting_elo_avg × 0.55) + (pitching_elo_avg × 0.45)

Batting weighted slightly higher (55%) because offense has more impact on fantasy production.
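
A small sketch of the aggregation, assuming plain (unweighted) averages within each role as the AVG notation above suggests. The function name and list-based inputs are illustrative.

```python
from statistics import mean


def team_composite(batting_elos: list[float], pitching_elos: list[float],
                   bat_weight: float = 0.55) -> float:
    """Blend average batting and pitching ELO into one team composite."""
    batting_avg = mean(batting_elos)
    pitching_avg = mean(pitching_elos)
    return bat_weight * batting_avg + (1.0 - bat_weight) * pitching_avg
```

For example, batters at 1600 and 1500 with a single 1500 pitcher give 0.55 × 1550 + 0.45 × 1500 = 1527.5.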

📉 Win Probability Conversion

Team composite ELO is converted to win probability using standard ELO formula:

              win_prob = 1 / (1 + 10^((opponent_elo - team_elo) / 400))
            

Default opponent ELO is 1500 (league average). A team with 1600 composite ELO has ~64% expected win rate against an average team.
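
The conversion is a one-liner in code. This sketch uses the formula above with the stated default opponent of 1500; the function name is illustrative.

```python
def win_prob(team_elo: float, opponent_elo: float = 1500.0) -> float:
    """Expected win rate from the standard ELO logistic formula."""
    return 1.0 / (1.0 + 10 ** ((opponent_elo - team_elo) / 400.0))


average_team = win_prob(1500.0)   # evenly matched
strong_team = win_prob(1600.0)    # 100 points above an average opponent
```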

🔍 Divergence Calculation

              divergence = elo_win_prob - market_win_prob
            

Divergence	Signal	Interpretation
> +0.10	Strong Undervalued	ELO says much better than market thinks
+0.05 to +0.10	Moderate Undervalued	ELO sees value market is missing
+0.03 to +0.05	Weak Undervalued	Slight edge vs market
-0.03 to +0.03	Fair	Market and ELO roughly agree
-0.03 to -0.05	Weak Overvalued	Market slightly bullish vs ELO
-0.05 to -0.10	Moderate Overvalued	Market overpricing this team
< -0.10	Strong Overvalued	Market significantly overpricing
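
The signal table can be expressed as a threshold check on the magnitude of the divergence. How values exactly on a boundary (0.03, 0.05, 0.10) are classified is not specified above; this sketch assigns them to the weaker band, which is an assumption.

```python
def classify_divergence(elo_win_prob: float, market_win_prob: float) -> tuple[float, str]:
    """Return (divergence, signal label) for one team."""
    divergence = elo_win_prob - market_win_prob
    magnitude = abs(divergence)
    if magnitude > 0.10:
        strength = "Strong"
    elif magnitude > 0.05:
        strength = "Moderate"
    elif magnitude > 0.03:
        strength = "Weak"
    else:
        return divergence, "Fair"
    direction = "Undervalued" if divergence > 0 else "Overvalued"
    return divergence, f"{strength} {direction}"
```

A team with a 0.64 ELO-implied win rate priced at 0.50 by the market has a divergence of +0.14 and is labeled Strong Undervalued.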

🎯 Primary Use Cases

Pre-season rankings: Compare team ELO to Vegas win totals and season futures
Weekly re-ranking: Track which teams market is over/underrating
Daily moneylines: Compare ELO-implied odds to daily game lines on Kalshi
Trade targets: Identify players on undervalued teams for buy-low opportunities

💻 API Endpoints

Endpoint	Description
`GET /divergence/teams`	All teams with Kalshi market divergence signals
`GET /divergence/team/{team}`	Detailed breakdown for single team
`GET /divergence/rankings`	Teams ranked by composite ELO only

⚾ Stream-Score (Pitcher Streaming Edge)

Calculates a 0-100 streaming score for pitchers based on their ELO, opponent strikeout tendency, and park factors. Produces a daily top-10 streamer board with risk tags for waiver wire and spot-start decisions.

⚙ Input Factors

Factor	Weight	Description	Why It Matters
Pitcher ELO	50%	Pitcher's current ELO rating	Most important - quality of the pitcher
Opponent K%	30%	Opposing team's strikeout rate	Higher K% = easier matchup for pitcher
Park Factor	20%	Venue's run-scoring environment	Lower factor = pitcher-friendly park

🔢 Score Calculation

Each factor is normalized to 0-100 scale, then weighted:

elo_score = normalize(pitcher_elo, min=1200, max=1800)
k_score = normalize(opp_k_pct, min=0.16, max=0.32)
park_score = invert_normalize(park_factor, min=0.88, max=1.15)

                stream_score = (0.50 × elo_score) + (0.30 × k_score) + (0.20 × park_score)
              

Park factor is inverted so that pitcher-friendly parks (low factor) score higher.

🚨 Risk Tags

Tag	Score Range	Recommendation
Green	60+	Strong stream - high confidence play
Yellow	40-59	Moderate stream - acceptable risk
Red	<40	Risky stream - consider alternatives
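
The score calculation and risk tags above can be sketched as follows. Clipping inputs to the normalization range is an assumption (the text does not say how out-of-range values are handled), and the function names are illustrative.

```python
def normalize(value: float, lo: float, hi: float) -> float:
    """Map value from [lo, hi] onto 0-100, clipping out-of-range inputs (assumed)."""
    clipped = min(max(value, lo), hi)
    return (clipped - lo) / (hi - lo) * 100.0


def stream_score(pitcher_elo: float, opp_k_pct: float, park_factor: float) -> float:
    """Weighted 0-100 streaming score: 50% ELO, 30% opponent K%, 20% park."""
    elo_score = normalize(pitcher_elo, 1200.0, 1800.0)
    k_score = normalize(opp_k_pct, 0.16, 0.32)
    # Inverted: pitcher-friendly parks (low factor) score higher.
    park_score = 100.0 - normalize(park_factor, 0.88, 1.15)
    return 0.50 * elo_score + 0.30 * k_score + 0.20 * park_score


def risk_tag(score: float) -> str:
    """Map a stream score to its risk tag."""
    if score >= 60:
        return "Green"
    if score >= 40:
        return "Yellow"
    return "Red"
```

A call such as `risk_tag(stream_score(1650, 0.28, 0.91))` returns the tag for a single matchup.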

📊 Normalization Ranges

Factor	Min	Max	Scale	Note
Pitcher ELO	1200	1800	0-100	Straight scale
Opponent K%	16%	32%	0-100	Higher K% = higher score
Park Factor	0.88	1.15	0-100	Inverted (low PF = high score)

🎯 Primary Use Cases

Waiver wire adds: Identify streamers for two-start weeks
Spot starts: Evaluate streaming options for specific game days
Matchup exploitation: Target pitchers facing high-K% teams
Park-aware decisions: Factor in venue when choosing streamers

💡 Example Calculation

Pitcher: 1650 ELO, pitching at SEA (PF 0.91) against COL (28% K%)

                elo_score = (1650-1200)/(1800-1200) × 100 = 75.0

                k_score = (0.28-0.16)/(0.32-0.16) × 100 = 75.0

                park_score = (1 - (0.91-0.88)/(1.15-0.88)) × 100 = 88.9

                stream_score = (0.50×75) + (0.30×75) + (0.20×88.9) = 77.8 (Green)

💻 API Endpoints

Endpoint	Description
`GET /stream/board`	Daily top-10 streaming board with risk tags
`GET /stream/score/{player_id}?opponent=NYY&venue=away`	Score single pitcher for specific matchup
`GET /stream/team-k-rates`	All team strikeout rates

⚠ Limitations

Does not include umpire tendencies or weather
Travel/rest factors not yet implemented
K% is team-level, not lineup-specific
Park factors are season-level, not adjusted for weather

🛠 System Architecture

Simple pipeline architecture:

              [Statcast Parquet] → ETL → [DB plate_appearances] → ELO Engine → [player_elo / daily_ohlc]
            

Uses FanGraphs-style playerElo with: RE24 run value, state-matrix expected RV, park factor adjustments, home advantage, and field error handling.

🔢 ELO Calculation (FanGraphs / Jacob Richey)

Per-PA process:

              1. Play RV: rv = RE_after − RE_before + runs_scored

              2. Expected RV: xRV = A·Elo² + B·Elo + C (state-specific)

              3. RV diff: rv_diff = rv − ((xRV_b + xRV_p)/2) − park_factor

              4. Elo update: bΔ = (921.675 + 6046.370·rv_diff − Elo_b)/502

              5. Elo update: pΔ = (965.754 − 4762.089·rv_diff − Elo_p)/502

Parameters: INITIAL_ELO=1500.0, MIN_ELO=500.0, DIVISOR=502.0

📊 OHLC Tracking

Daily OHLC is computed for each player with at least one PA that day. This enables candlestick visualization of daily fluctuations, making hot/cold streaks and turning points easy to spot.

Field	Definition
Open	ELO before first PA of day
High	Maximum ELO during day
Low	Minimum ELO during day
Close	ELO after last PA of day
Delta	Close − Open
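
A minimal sketch of how one day's candle could be derived from the per-PA ratings. It assumes the opening value counts toward High and Low, as in a standard candlestick; the `DailyOHLC` type and function name are hypothetical.

```python
from typing import NamedTuple, Sequence


class DailyOHLC(NamedTuple):
    open: float
    high: float
    low: float
    close: float
    delta: float


def daily_ohlc(elo_before_first_pa: float,
               elo_after_each_pa: Sequence[float]) -> DailyOHLC:
    """Build one day's OHLC record from the rating after each PA, in order."""
    if not elo_after_each_pa:
        raise ValueError("OHLC needs at least one PA in the day")
    series = [elo_before_first_pa, *elo_after_each_pa]
    return DailyOHLC(
        open=series[0],
        high=max(series),
        low=min(series),
        close=series[-1],
        delta=series[-1] - series[0],
    )
```

A day that opens at 1500 and moves through 1502, 1499, and 1501 yields open 1500, high 1502, low 1499, close 1501, and delta +1.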

⚾ Two-Way Player Implementation

Engine tracks separate batting and pitching ELO via PlayerEloState dataclass:

batting_elo: updated only when batting
pitching_elo: updated only when pitching
batting_pa: plate appearances as batter
pitching_pa: batters faced as pitcher

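Pure batters keep a dormant pitching_elo at 1,500, and pure pitchers a dormant batting_elo. OHLC is keyed by (player_id, role), so a two-way player generates two OHLC entries per game day. A composite rating weights the two roles by playing time, where total_pa = batting_pa + pitching_pa: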

              composite = (batting_elo × batting_pa + pitching_elo × pitching_pa) / total_pa
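
The state object and composite might look roughly like this. The four field names follow the list above; the defaults, the `composite` method, and returning 1,500 for a player with no PAs are assumptions made for the sketch.

```python
from dataclasses import dataclass


@dataclass
class PlayerEloState:
    batting_elo: float = 1500.0   # updated only when batting
    pitching_elo: float = 1500.0  # updated only when pitching
    batting_pa: int = 0           # plate appearances as batter
    pitching_pa: int = 0          # batters faced as pitcher

    def composite(self) -> float:
        """PA-weighted blend of the two role ratings."""
        total_pa = self.batting_pa + self.pitching_pa
        if total_pa == 0:
            return 1500.0  # assumption: no data falls back to the baseline
        weighted = (self.batting_elo * self.batting_pa
                    + self.pitching_elo * self.pitching_pa)
        return weighted / total_pa
```

For a pure batter, `pitching_pa` is 0, so the composite reduces to the batting rating and the dormant pitching value has no effect.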
            

🗃 Database Schema

Table	Rows	Description
players	Varies	Player metadata
plate_appearances	Varies	All PAs (base/out state, score change)
player_elo	Varies	Current Elo per player
elo_pa_detail	Varies	Per-PA Elo change records
daily_ohlc	Varies	Daily OHLC per player per role

💻 Data Pipeline

Three stages:

Raw Data: Statcast pitch data and MLB play-by-play
ETL: Aggregate to PA, compute base/out states, runs scored, play RV
ELO Engine: Load RE24 baseline + state matrix + park factors, process PAs chronologically

⚠ Disclaimer

This project is for demonstration and analytical purposes. ELO ratings represent one approach to player evaluation and do not capture defense, baserunning, or game context beyond run expectancy. Data is sourced from MLB Statcast via Baseball Savant. Ratings should not be considered definitive skill assessments.

ELO methodology inspired by FanGraphs PlayerELO
