Daily Performance
Player ELO fluctuations for the selected date. Daily + Weekly Movers. 2025 Season
Hot Players
High Gainers
Cold Players
Risk Alert
Weekly Hot Players
Weekly Cold Players
Last 7 Days
League Summary
Leaderboard
Top players ranked by ELO rating. 2025 Season
| # | Player | Team | ELO | Tier | PA | EV | HH% | Last Game |
|---|---|---|---|---|---|---|---|---|
Team Power Rankings
Aggregated team ELO from player ratings weighted by PA/BF. 2025 Season
Note: Team ELO is calculated from individual player ELOs weighted by plate appearances (batters) and batters faced (pitchers). Does not account for defense or stolen bases. Speedster teams (KC, SEA) may rank slightly lower; poor defensive teams (CHW, DET) may rank slightly higher than true value. Batting ELO correlates with Runs Scored (r = 0.90); Pitching ELO with Runs Against (r = -0.91).
| # | Team | Aggregate ELO | Batting ELO | Bat Tier | Pitching ELO | Pitch Tier | RS | RA |
|---|---|---|---|---|---|---|---|---|
Talent Leaderboard
Players ranked by skill dimension ELO. 2025 Season
| # | Player | Team | Talent ELO | Tier | Base ELO | PA |
|---|---|---|---|---|---|---|
Matchup Predictor
Predict plate appearance outcomes using talent ELO z-scores. 2025 Season
⚾ Select Batter
⚽ Select Pitcher
Pitcher vs Team
Analyze how a pitcher matches up against every batter on a team.
⚽ Select Pitcher
⚾ Select Team
Guide
Understanding the MLB ELO Rating System
📚 Project Abstract
This system applies the FanGraphs-style playerElo model (Jacob Richey) to MLB plate appearances. Each PA is scored by run value, compared against an expected run value based on batter/pitcher Elo and the current base-out state. Ratings move according to how much the outcome outperforms (or underperforms) expectation. The baseline is 1,500 (league average).
⚙ Key Adjustments
- RE24 run value: Play RV = RE_after − RE_before + runs_scored
- Expected RV (state matrix): Quadratic expected RV by base-out state and Elo
- Park factors: Park run environment adjustment (FanGraphs method)
- Home advantage: Applied to batter expected RV when batting at home
- Field error handling: Batters do not gain Elo from errors
🔢 ELO Formula (FanGraphs / Jacob Richey)
All players start at 1,500. Per PA:
rv = RE_after − RE_before + runs_scored
xRV = A·Elo² + B·Elo + C (state-specific)
rv_diff = rv − ((xRV_b + xRV_p)/2) − park_factor
bΔ = (921.675 + 6046.370·rv_diff − Elo_b)/502
pΔ = (965.754 − 4762.089·rv_diff − Elo_p)/502

| Parameter | Value |
|---|---|
| INITIAL_ELO | 1500.0 |
| MIN_ELO | 500.0 |
| DIVISOR | 502.0 |
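The per-PA update above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual engine: the function names are hypothetical, the state-specific coefficients (A, B, C) are not published on this page and must be supplied by the caller, and home advantage and error handling are left out.

```python
INITIAL_ELO = 1500.0
MIN_ELO = 500.0
DIVISOR = 502.0


def expected_rv(elo, coeffs):
    """Quadratic expected run value: A*Elo^2 + B*Elo + C for one base-out state."""
    a, b, c = coeffs  # hypothetical placeholders; real values are state-specific
    return a * elo ** 2 + b * elo + c


def update_elos(elo_b, elo_p, re_before, re_after, runs_scored,
                coeffs_b, coeffs_p, park_factor=0.0):
    """Return (new_batter_elo, new_pitcher_elo) after one plate appearance."""
    rv = re_after - re_before + runs_scored
    x_b = expected_rv(elo_b, coeffs_b)
    x_p = expected_rv(elo_p, coeffs_p)
    rv_diff = rv - (x_b + x_p) / 2 - park_factor
    delta_b = (921.675 + 6046.370 * rv_diff - elo_b) / DIVISOR
    delta_p = (965.754 - 4762.089 * rv_diff - elo_p) / DIVISOR
    # Assumption: MIN_ELO acts as a floor on the updated rating.
    return max(MIN_ELO, elo_b + delta_b), max(MIN_ELO, elo_p + delta_p)


# Usage: two 1,500-rated players, a play worth +0.5 runs, flat expectation.
new_b, new_p = update_elos(INITIAL_ELO, INITIAL_ELO, 0.5, 1.0, 0,
                           coeffs_b=(0.0, 0.0, 0.0), coeffs_p=(0.0, 0.0, 0.0))
print(round(new_b, 2), round(new_p, 2))
```

Because the batter and pitcher deltas use different intercepts and slopes, the two updates are not mirror images of each other, which matches the note that updates are computed independently.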
📊 Data Source
- Statcast pitch-level data aggregated into plate appearances
- Per-PA base/out state, score change, and runner movement
- Run value computed from RE24 (2016–2018 Retrosheet baseline)
🎯 What Moves Elo
- High RV plays in leverage states (bases loaded, fewer outs)
- Outperforming a strong opponent in a difficult state
- Run-suppressing outcomes by pitchers in high-expectancy states
The same event can produce very different Elo changes depending on state and opponent quality.
🏆 ELO Tiers
| Tier | Range | Description |
|---|---|---|
| Elite | 1,800+ | MVP-caliber |
| High | 1,650-1,799 | All-Star level |
| Above Avg | 1,550-1,649 | Above-average starter |
| Average | 1,450-1,549 | League average |
| Below Avg | 1,350-1,449 | Below-average |
| Low | 1,200-1,349 | Struggling |
| Cold | <1,200 | Significant slump |
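As a small illustration, the tier table can be read as a list of lower bounds. The helper below is a hypothetical sketch; it assumes each tier starts at its listed lower bound, so a fractional rating such as 1,799.5 falls in the lower tier.

```python
# (lower bound, tier name), highest first; values taken from the table above.
TIERS = [
    (1800, "Elite"),
    (1650, "High"),
    (1550, "Above Avg"),
    (1450, "Average"),
    (1350, "Below Avg"),
    (1200, "Low"),
]


def tier_for(elo):
    """Return the tier name for a rating; anything under 1,200 is 'Cold'."""
    for floor, name in TIERS:
        if elo >= floor:
            return name
    return "Cold"


print(tier_for(1500.0))  # a league-average rating
```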
🏡 Park Factor Examples
| Stadium | Park Factor | Effect |
|---|---|---|
| Coors Field (COL) | +0.0172 | Run-friendly, RV adjusted down |
| T-Mobile Park (SEA) | -0.0175 | Run-suppressing, RV adjusted up |
| Yankee Stadium (NYY) | +0.0119 | Slightly run-friendly |
Park factors are applied as a run-value adjustment to the RV diff.
📚 What is ELO?
The ELO Rating System is a method originally designed to calculate the relative skill levels of chess players. Players start at 1,500 (league average). In this model, each plate appearance is scored by run value, compared to expected run value given the base-out state and player quality. Elite performers rise above 1,800; struggling players fall below 1,200.
⚖ Expectation-Based Updates
Elo changes are driven by how much the outcome beat expectations. The model computes batter and pitcher updates separately using FanGraphs regression formulas. A player's rating only rises by outperforming expectation.
⚡ Expected RV & Run Value
Each PA gets a true run value from RE24 and an expected run value from the state matrix. The difference between these, adjusted for park factor, drives Elo changes.
| Component | Description |
|---|---|
| Play RV | RE_after − RE_before + runs_scored |
| Expected RV | Quadratic by state and Elo |
| Park Factor | Run environment adjustment |
In run-friendly parks, positive outcomes are discounted to avoid inflation.
⚖ Pitcher-Batter Context
Batter and pitcher Elo are on the same 1,500-based scale, but updates are computed independently. High-Elo pitchers consistently suppress run value in high-expectancy states; high-Elo batters do the opposite.
💡 Interpreting Ratings Across Roles
Ratings are most meaningful within role (batters vs pitchers). Use percentiles within each role when comparing talent tiers, especially for edge cases like relievers vs everyday hitters.
⚾ Two-Way Players
Players who both bat and pitch have two independent ELO ratings: one for batting and one for pitching. Each rating only changes when the player acts in that role. Switch between the Batting and Pitching tabs to see separate OHLC charts. Two-way players are marked with a TWP badge in search results and leaderboards.
📊 Reading OHLC Charts
- Open: ELO at start of day (first PA)
- High: Highest ELO during day
- Low: Lowest ELO during day
- Close: ELO at end of day (last PA)
■ Green candles indicate ELO rose that day (close > open)
■ Red candles indicate decline (close < open)
Moving averages (MA5, MA15) smooth daily noise.
🎯 What is Talent ELO?
While the main ELO captures overall performance, Talent ELO decomposes it into specific skill dimensions, each tracked independently using a binary matchup model. There is no composite talent score; each dimension stands alone. A player might be Elite in Power but Average in Contact.
⚾ Batter Dimensions (4)
| Dimension | Description | Positive Events | Negative Events |
|---|---|---|---|
| Contact | Avoid strikeouts, make contact | 1B, 2B, 3B | K, Out |
| Power | Extra-base hit and HR ability | HR (1.0), 2B (0.7) | GIDP (-0.7), Out (-0.2) |
| Discipline | Plate discipline, walk rate | BB (1.0), HBP (0.8) | — |
| Clutch | High-leverage with RISP | Hits with runners on 2B/3B | Outs, K with RISP |
⚽ Pitcher Dimensions (4)
| Dimension | Description | Positive Events | Negative Events |
|---|---|---|---|
| Stuff | Strikeout-inducing ability | K (1.0) | HR (-0.8) |
| BIP Suppression | Batted-ball suppression/BABIP | Out (0.4), GIDP (0.5) | 1B (-0.6), 2B (-0.8), 3B (-0.9) |
| Command | Walk prevention | K (0.3), Out (0.15) | BB (-1.0), HBP (-0.8) |
| Clutch | Clutch pitching with runners on | Same events amplified in RISP | — |
🔢 Calculation Method
Talent ELO uses a binary matchup model: each PA pits a batter skill against a pitcher skill in the relevant dimension, and each outcome triggers an independent update for that dimension. The ELO expected score is computed from the rating difference between the two players.
Reliability scaling: A multiplier scales from 0.3 to 1.0 as players accumulate qualifying events, so new players get dampened updates that strengthen as the sample grows.
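The sketch below shows one way a binary matchup update with reliability scaling could look. Only the 0.3 to 1.0 reliability range comes from this page; the 400-point logistic scale, the K-factor `k`, the linear ramp, and the `full_at` threshold are assumptions chosen for illustration, and the per-event weights from the dimension tables are not modeled.

```python
def expected_score(rating_a, rating_b):
    """Standard Elo expected score for A against B (400-point scale assumed)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def reliability(n_events, full_at=200):
    """Ramp from 0.3 to 1.0 with sample size; linear shape and full_at are assumed."""
    return 0.3 + 0.7 * min(n_events / full_at, 1.0)


def update_dimension(bat, pit, batter_won, n_bat, n_pit, k=8.0):
    """Update one batter dimension and the opposing pitcher dimension.

    batter_won is 1 if the event counts for the batter, 0 if for the pitcher.
    k is a hypothetical K-factor, not a value from the source.
    """
    exp_bat = expected_score(bat, pit)
    new_bat = bat + k * reliability(n_bat) * (batter_won - exp_bat)
    new_pit = pit + k * reliability(n_pit) * ((1 - batter_won) - (1 - exp_bat))
    return new_bat, new_pit


# A rookie batter (10 events) beats an established pitcher (500 events):
# the batter's gain is dampened, the pitcher's loss is applied in full.
print(update_dimension(1500.0, 1500.0, 1, n_bat=10, n_pit=500))
```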
💡 Important Notes
- Uses same tier system (Elite 1,800+, etc.)
- Dimensions are independent (can be Elite in Power, Average in Contact)
- Batter Clutch and Pitcher Clutch are separate
- BIP Suppression has low sensitivity (consistent with DIPS theory)
- Speed dimension currently disabled due to insufficient stolen base data
🔭 What is the Matchup Predictor?
Forecasts the outcome distribution of a single plate appearance between a specific batter and pitcher. It uses Talent ELO dimensions rather than sparse head-to-head data, and runs entirely in the browser (no backend calls except fetching talent ELO). This is a port of the V2.1 z-score matchup predictor.
Output: 7-outcome probability distribution (BB, K, OUT, 1B, 2B, 3B, HR) plus derived metrics (expected wOBA, on-base %, expected slugging).
⚙ Input: 6 Talent Dimensions
| Batter | Pitcher |
|---|---|
| Discipline (Stage 1: BB) | Command (Stage 1: BB) |
| Contact (Stage 1: K; Stage 2: Hit) | Stuff (Stage 1: K) |
| Power (Stage 3: XBH) | BIP Suppression (Stage 2: Hit) |
Clutch dimensions are not used, because the predictor models a context-neutral PA.
📊 Z-Score Normalization
Raw ELO values are on different scales in each dimension. The predictor converts them to z-scores using each dimension's own distribution:
z = (ELO − Mean) / Std
After normalization, z=+1.0 means "one standard deviation above average" regardless of dimension.
| Dimension | Mean | Std |
|---|---|---|
| Batter Contact | 1504.5 | 33.9 |
| Batter Power | 1468.6 | 61.6 |
| Batter Discipline | 1700.3 | 139.0 |
| Pitcher Stuff | 1587.3 | 56.6 |
| Pitcher BIP Supp. | 1513.3 | 18.2 |
| Pitcher Command | 1681.1 | 126.5 |
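For illustration, the normalization can be written as a small lookup plus one subtraction and division. The means and standard deviations are copied from the table above; the dictionary keys and function name are hypothetical.

```python
# (mean, std) per dimension, copied from the table above.
DIMENSION_STATS = {
    "batter_contact": (1504.5, 33.9),
    "batter_power": (1468.6, 61.6),
    "batter_discipline": (1700.3, 139.0),
    "pitcher_stuff": (1587.3, 56.6),
    "pitcher_bip": (1513.3, 18.2),
    "pitcher_command": (1681.1, 126.5),
}


def z_score(dimension, elo):
    """Convert a raw talent ELO into standard deviations from the dimension mean."""
    mean, std = DIMENSION_STATS[dimension]
    return (elo - mean) / std


# The same raw gap means very different things in different dimensions:
print(z_score("pitcher_bip", 1531.5))       # 18.2 points above mean
print(z_score("pitcher_command", 1699.3))   # 18.2 points above mean
```

The narrow spread of BIP Suppression (std 18.2) is why a small raw difference there is a large z-score, while the same raw difference in Command is minor.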
🛠 3-Stage Decision Tree
Sequential structure mirrors natural PA progression:
Stage 1: BB / K / BIP
z_stuff_contact = z(Stuff) − z(Contact)
logit_BB = base_logit_BB + z_disc_cmd / 3.5
logit_K = base_logit_K + z_stuff_contact / 3.5
Softmax → P(BB), P(K), P(BIP)
Stage 2: Hit given BIP
logit = base_logit + z_contact_bip / 5.0
P(Hit|BIP) = 1 / (1 + e^(−logit))
P(Hit) = P(BIP) × P(Hit|BIP)
Stage 3: XBH given Hit
P(XBH|Hit) = 1 / (1 + e^(−logit))
P(2B) = P(XBH) × 0.552
P(3B) = P(XBH) × 0.045
P(HR) = P(XBH) × 0.403
📈 League Average Base Rates (2025)
| Metric | Value |
|---|---|
| BB rate | 9.49% |
| K rate | 22.18% |
| BIP rate | 68.34% |
| BABIP | .321 |
| XBH rate | 34.93% |
| 2B ratio (of XBH) | 55.2% |
| 3B ratio | 4.5% |
| HR ratio | 40.3% |
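Putting the three stages and the base rates together, a rough Python sketch of the predictor might look like this. Several pieces are assumptions rather than details stated on this page: the base logits are taken as the log (Stage 1) or logit (Stages 2 and 3) of the league rates, `z_disc_cmd` is taken as z(Discipline) − z(Command), `z_contact_bip` as z(Contact) − z(BIP Suppression), and the Stage 3 divisor `power_scale` is a placeholder.

```python
import math

# League base rates from the table above (2025).
BB_RATE, K_RATE, BIP_RATE = 0.0949, 0.2218, 0.6834
BABIP, XBH_RATE = 0.321, 0.3493
XBH_SPLIT = {"2B": 0.552, "3B": 0.045, "HR": 0.403}


def logit(p):
    return math.log(p / (1.0 - p))


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def predict_pa(z_bat, z_pit, power_scale=5.0):
    """Return a 7-outcome distribution from batter and pitcher z-scores."""
    z_disc_cmd = z_bat["discipline"] - z_pit["command"]   # assumed definition
    z_stuff_contact = z_pit["stuff"] - z_bat["contact"]
    z_contact_bip = z_bat["contact"] - z_pit["bip"]       # assumed definition

    # Stage 1: softmax over BB / K / BIP (BIP is the unadjusted reference).
    weights = {
        "BB": math.exp(math.log(BB_RATE) + z_disc_cmd / 3.5),
        "K": math.exp(math.log(K_RATE) + z_stuff_contact / 3.5),
        "BIP": BIP_RATE,
    }
    total = sum(weights.values())
    p_bb, p_k, p_bip = (weights[o] / total for o in ("BB", "K", "BIP"))

    # Stage 2: hit given ball in play.
    p_hit = p_bip * sigmoid(logit(BABIP) + z_contact_bip / 5.0)

    # Stage 3: extra-base hit given hit (power_scale is a placeholder).
    p_xbh = p_hit * sigmoid(logit(XBH_RATE) + z_bat["power"] / power_scale)

    dist = {"BB": p_bb, "K": p_k, "OUT": p_bip - p_hit, "1B": p_hit - p_xbh}
    dist.update({o: p_xbh * share for o, share in XBH_SPLIT.items()})
    return dist


average_batter = {"discipline": 0.0, "contact": 0.0, "power": 0.0}
average_pitcher = {"command": 0.0, "stuff": 0.0, "bip": 0.0}
print(predict_pa(average_batter, average_pitcher))
```

With all z-scores at zero, every adjustment term vanishes, so the sketch returns the league base rates, as expected for an average-versus-average matchup.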
⚖ Expected wOBA Calculation
7-outcome distribution converted using linear weights:
| Outcome | wOBA Weight |
|---|---|
| BB | 0.69 |
| K / OUT | 0.00 |
| 1B | 0.88 |
| 2B | 1.24 |
| 3B | 1.56 |
| HR | 2.00 |
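As a sketch, the conversion is a probability-weighted sum over the outcomes. The weights come from the table above; the function name and the dictionary-based input format are assumptions.

```python
# Linear weights from the table above; K and OUT contribute nothing.
WOBA_WEIGHTS = {
    "BB": 0.69, "K": 0.00, "OUT": 0.00,
    "1B": 0.88, "2B": 1.24, "3B": 1.56, "HR": 2.00,
}


def expected_woba(distribution):
    """Probability-weighted sum of wOBA weights over an outcome distribution."""
    return sum(WOBA_WEIGHTS[outcome] * p for outcome, p in distribution.items())


# Usage: a toy distribution with a 10% walk, 20% single, 5% home run split.
toy = {"BB": 0.10, "K": 0.25, "OUT": 0.40, "1B": 0.20, "HR": 0.05}
print(round(expected_woba(toy), 3))
```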
💡 Interpreting Results
- Expected wOBA is best summary (league avg ~.310; >.350 batter advantage, <.280 pitcher advantage)
- Outcome bar shows probability distribution
- Stage cards show which skill matchups drive prediction
- Average vs. average recovers league base rates
- Model is context-neutral (no base-out state, leverage, platoon splits, park factors)
⚠ Limitations
- No platoon splits (L/R matchups significant)
- No situational context (base-out state, inning, score, leverage)
- Static talent ELO (no recent form, fatigue, injury adjustment)
- No batted-ball quality modeling
- Fixed XBH split (league averages, not player-specific)
- Calibration assumes normality (z-scores work for bell-shaped distributions)
📈 ELO vs Market Divergence
Compares aggregated team ELO ratings against Kalshi prediction market prices to identify undervalued and overvalued teams. Useful for pre-season analysis and weekly re-ranking decisions.
⚙ How It Works
- Aggregate Team ELO: Average all player ELOs on a team roster
- Convert to Win Probability: Use logistic formula to get expected win rate
- Compare to Market: Subtract market probability from ELO probability
- Signal Generation: Classify divergence into undervalued/overvalued
🔢 Team ELO Aggregation
Player ELOs are aggregated to team level using a weighted composite: 55% batting ELO, with the remainder on pitching ELO.
Batting is weighted slightly higher because offense has more impact on fantasy production.
📉 Win Probability Conversion
Team composite ELO is converted to win probability using the standard ELO formula:
P(win) = 1 / (1 + 10^((ELO_opp − ELO_team) / 400))
Default opponent ELO is 1500 (league average). A team with 1600 composite ELO has ~64% expected win rate against an average team.
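A minimal sketch of the composite and the win-probability conversion follows. The function names are hypothetical, and putting the full remainder (45%) on pitching is an assumption based on the 55% batting weight.

```python
def composite_elo(batting_elo, pitching_elo, batting_weight=0.55):
    """Weighted team composite; the pitching share is assumed to be the remainder."""
    return batting_weight * batting_elo + (1.0 - batting_weight) * pitching_elo


def win_probability(team_elo, opponent_elo=1500.0):
    """Standard Elo logistic on a 400-point scale."""
    return 1.0 / (1.0 + 10 ** ((opponent_elo - team_elo) / 400.0))


# Usage: a team with 1,620 batting and 1,575.56 pitching ELO lands at 1,600.
team = composite_elo(1620.0, 1575.56)
print(round(team, 1), round(win_probability(team), 3))
```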
🔍 Divergence Calculation
| Divergence | Signal | Interpretation |
|---|---|---|
| > +0.10 | Strong Undervalued | ELO says much better than market thinks |
| +0.05 to +0.10 | Moderate Undervalued | ELO sees value market is missing |
| +0.03 to +0.05 | Weak Undervalued | Slight edge vs market |
| -0.03 to +0.03 | Fair | Market and ELO roughly agree |
| -0.03 to -0.05 | Weak Overvalued | Market slightly bullish vs ELO |
| -0.05 to -0.10 | Moderate Overvalued | Market overpricing this team |
| < -0.10 | Strong Overvalued | Market significantly overpricing |
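The signal table can be expressed as a short classifier. This is a sketch with hypothetical names; it takes divergence as ELO probability minus market probability, both on a 0 to 1 scale, and how exact boundary values (such as exactly 0.05) are assigned is an assumption.

```python
def classify_divergence(elo_prob, market_prob):
    """Return (divergence, signal) using the thresholds in the table above."""
    divergence = elo_prob - market_prob
    magnitude = abs(divergence)
    if magnitude <= 0.03:
        return divergence, "Fair"
    if magnitude > 0.10:
        strength = "Strong"
    elif magnitude >= 0.05:
        strength = "Moderate"
    else:
        strength = "Weak"
    direction = "Undervalued" if divergence > 0 else "Overvalued"
    return divergence, f"{strength} {direction}"


# Usage: ELO implies 64% but the market prices the team at 50%.
print(classify_divergence(0.64, 0.50))
```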
🎯 Primary Use Cases
- Pre-season rankings: Compare team ELO to Vegas win totals and season futures
- Weekly re-ranking: Track which teams market is over/underrating
- Daily moneylines: Compare ELO-implied odds to daily game lines on Kalshi
- Trade targets: Identify players on undervalued teams for buy-low opportunities
💻 API Endpoints
| Endpoint | Description |
|---|---|
| GET /divergence/teams | All teams with Kalshi market divergence signals |
| GET /divergence/team/{team} | Detailed breakdown for single team |
| GET /divergence/rankings | Teams ranked by composite ELO only |
⚾ Stream-Score (Pitcher Streaming Edge)
Calculates a 0-100 streaming score for pitchers based on their ELO, opponent strikeout tendency, and park factors. Produces a daily top-10 streamer board with risk tags for waiver wire and spot-start decisions.
⚙ Input Factors
| Factor | Weight | Description | Why It Matters |
|---|---|---|---|
| Pitcher ELO | 50% | Pitcher's current ELO rating | Most important - quality of the pitcher |
| Opponent K% | 30% | Opposing team's strikeout rate | Higher K% = easier matchup for pitcher |
| Park Factor | 20% | Venue's run-scoring environment | Lower factor = pitcher-friendly park |
🔢 Score Calculation
Each factor is normalized to 0-100 scale, then weighted:
stream_score = (0.50 × elo_score) + (0.30 × k_score) + (0.20 × park_score)
Park factor is inverted so that pitcher-friendly parks (low factor) score higher.
🚨 Risk Tags
| Tag | Score Range | Recommendation |
|---|---|---|
| Green | 60+ | Strong stream - high confidence play |
| Yellow | 40-59 | Moderate stream - acceptable risk |
| Red | <40 | Risky stream - consider alternatives |
📊 Normalization Ranges
| Factor | Min | Max | Scale | Note |
|---|---|---|---|---|
| Pitcher ELO | 1200 | 1800 | 0-100 | Straight scale |
| Opponent K% | 16% | 32% | 0-100 | Higher K% = higher score |
| Park Factor | 0.88 | 1.15 | 0-100 | Inverted (low PF = high score) |
🎯 Primary Use Cases
- Waiver wire adds: Identify streamers for two-start weeks
- Spot starts: Evaluate streaming options for specific game days
- Matchup exploitation: Target pitchers facing high-K% teams
- Park-aware decisions: Factor in venue when choosing streamers
💡 Example Calculation
elo_score = (1650-1200)/(1800-1200) × 100 = 75.0
k_score = (0.28-0.16)/(0.32-0.16) × 100 = 75.0
park_score = (1 - (0.91-0.88)/(1.15-0.88)) × 100 = 88.9
stream_score = (0.50×75) + (0.30×75) + (0.20×88.9) = 77.8 (Green)
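The same calculation can be sketched as a few small functions using the weights, normalization ranges, and risk-tag thresholds listed above. Function names are hypothetical, and clamping out-of-range inputs to the 0-100 scale is an assumption.

```python
def scale(value, low, high, invert=False):
    """Normalize to 0-100 within [low, high]; out-of-range values are clamped."""
    fraction = min(max((value - low) / (high - low), 0.0), 1.0)
    return 100.0 * (1.0 - fraction if invert else fraction)


def stream_score(pitcher_elo, opponent_k_rate, park_factor):
    """Weighted 0-100 streaming score: 50% ELO, 30% opponent K%, 20% park."""
    elo_score = scale(pitcher_elo, 1200.0, 1800.0)
    k_score = scale(opponent_k_rate, 0.16, 0.32)
    park_score = scale(park_factor, 0.88, 1.15, invert=True)  # low PF scores high
    return 0.50 * elo_score + 0.30 * k_score + 0.20 * park_score


def risk_tag(score):
    """Map a score to the Green / Yellow / Red tags."""
    if score >= 60:
        return "Green"
    if score >= 40:
        return "Yellow"
    return "Red"


# Usage: a 1,650-rated pitcher against a 28% K team in a 0.91 park.
score = stream_score(1650.0, 0.28, 0.91)
print(round(score, 1), risk_tag(score))
```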
💻 API Endpoints
| Endpoint | Description |
|---|---|
| GET /stream/board | Daily top-10 streaming board with risk tags |
| GET /stream/score/{player_id}?opponent=NYY&venue=away | Score single pitcher for specific matchup |
| GET /stream/team-k-rates | All team strikeout rates |
⚠ Limitations
- Does not include umpire tendencies or weather
- Travel/rest factors not yet implemented
- K% is team-level, not lineup-specific
- Park factors are season-level, not adjusted for weather
🛠 System Architecture
Simple pipeline architecture: raw data feeds an ETL step, which feeds the ELO engine.
Uses FanGraphs-style playerElo with: RE24 run value, state-matrix expected RV, park factor adjustments, home advantage, and field error handling.
🔢 ELO Calculation (FanGraphs / Jacob Richey)
Per-PA process:
1. Run value: rv = RE_after − RE_before + runs_scored
2. Expected RV: xRV = A·Elo² + B·Elo + C (state-specific)
3. RV diff: rv_diff = rv − ((xRV_b + xRV_p)/2) − park_factor
4. Elo update: bΔ = (921.675 + 6046.370·rv_diff − Elo_b)/502
5. Elo update: pΔ = (965.754 − 4762.089·rv_diff − Elo_p)/502
Parameters: INITIAL_ELO=1500.0, MIN_ELO=500.0, DIVISOR=502.0
📊 OHLC Tracking
Daily OHLC computed for each player with ≥1 PA per day. Enables candlestick visualization of daily fluctuations—easy to spot hot/cold streaks and turning points.
| Field | Definition |
|---|---|
| Open | ELO before first PA of day |
| High | Maximum ELO during day |
| Low | Minimum ELO during day |
| Close | ELO after last PA of day |
| Delta | Close − Open |
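A daily candle can be derived from the rating before the first PA and the rating after each PA that day. The sketch below uses hypothetical names and assumes the opening value counts toward High and Low, as in a conventional candlestick.

```python
from dataclasses import dataclass


@dataclass
class DailyOhlc:
    open: float
    high: float
    low: float
    close: float

    @property
    def delta(self):
        """Close minus Open, as defined in the table above."""
        return self.close - self.open


def daily_ohlc(elo_before_first_pa, elo_after_each_pa):
    """Build one candle from the opening rating and the per-PA ratings of a day."""
    if not elo_after_each_pa:
        raise ValueError("OHLC requires at least one PA on the day")
    path = [elo_before_first_pa, *elo_after_each_pa]
    return DailyOhlc(open=path[0], high=max(path), low=min(path), close=path[-1])


# Usage: a three-PA day that dips below the open before recovering.
candle = daily_ohlc(1500.0, [1502.0, 1498.5, 1501.0])
print(candle, candle.delta)
```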
⚾ Two-Way Player Implementation
Engine tracks separate batting and pitching ELO via PlayerEloState dataclass:
- batting_elo: updated only when batting
- pitching_elo: updated only when pitching
- batting_pa: plate appearances as batter
- pitching_pa: batters faced as pitcher
Pure batters keep a dormant pitching_elo at 1,500, and pure pitchers keep a dormant batting_elo at 1,500. OHLC is keyed by (player_id, role), so a two-way player generates two OHLC entries per game day.
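A minimal sketch of such a dataclass is shown below. The class name and the four fields come from the description above; the `record` method, the `is_two_way` property, and the use of MIN_ELO as a floor are illustrative assumptions, not the engine's actual code.

```python
from dataclasses import dataclass

INITIAL_ELO = 1500.0
MIN_ELO = 500.0


@dataclass
class PlayerEloState:
    batting_elo: float = INITIAL_ELO
    pitching_elo: float = INITIAL_ELO
    batting_pa: int = 0
    pitching_pa: int = 0

    def record(self, role, delta):
        """Apply one PA's Elo change to the given role only (hypothetical helper)."""
        if role == "batting":
            self.batting_elo = max(MIN_ELO, self.batting_elo + delta)
            self.batting_pa += 1
        elif role == "pitching":
            self.pitching_elo = max(MIN_ELO, self.pitching_elo + delta)
            self.pitching_pa += 1
        else:
            raise ValueError(f"unknown role: {role!r}")

    @property
    def is_two_way(self):
        """True once the player has appeared in both roles."""
        return self.batting_pa > 0 and self.pitching_pa > 0


# Usage: a batting PA leaves the dormant pitching rating untouched.
state = PlayerEloState()
state.record("batting", 4.0)
print(state, state.is_two_way)
```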
🗃 Database Schema
| Table | Rows | Description |
|---|---|---|
| players | Varies | Player metadata |
| plate_appearances | Varies | All PAs (base/out state, score change) |
| player_elo | Varies | Current Elo per player |
| elo_pa_detail | Varies | Per-PA Elo change records |
| daily_ohlc | Varies | Daily OHLC per player per role |
💻 Data Pipeline
Three stages:
- Raw Data: Statcast pitch data and MLB play-by-play
- ETL: Aggregate to PA, compute base/out states, runs scored, play RV
- ELO Engine: Load RE24 baseline + state matrix + park factors, process PAs chronologically
⚠ Disclaimer
This project is for demonstration and analytical purposes. ELO ratings represent one approach to player evaluation and do not capture defense, baserunning, or game context beyond run expectancy. They should not be considered definitive skill assessments. Data sourced from MLB Statcast via Baseball Savant.
ELO methodology inspired by FanGraphs PlayerELO