Stolen Goods: Updating Sample Size! Sample Size! Sample Size!

Courtesy of Derek Carty of The Hardball Times, here is an elaborate update to an older GOI post cataloging the PA thresholds for hitters (and BF thresholds for pitchers) at which certain statistics become statistically significant, here meaning they reach an r-squared of 0.50:
HITTERS
  • Strikeout rate/Contact rate: 150 PA
  • LD%: 150 PA
  • Walk rate: 200 PA
  • GB%: 200 PA
  • GB/FB: 200 PA
  • FB%: 250 PA
  • Home run rate: 300 PA
  • HR/FB: 300 PA
  • BABIP: Doesn't reach a 0.50 r-squared at 650 or below.
  • Batting average: Doesn't reach a 0.50 r-squared at 650 or below.
PITCHERS
  • K/PA: 150 BF
  • GB%: 150 BF
  • LD%: 150 BF
  • FB%: 200 BF
  • GB/FB: 200 BF
  • K/BB: 500 BF
  • IF FB%: 500 BF
  • BB/PA: 550 BF
  • BABIP: Doesn't reach a 0.50 r-squared at 650 or below.
  • HR/FB: Doesn't reach a 0.50 r-squared at 650 or below.
Enjoy.

4 comments:

Sky said...

"Statistically significant" is a bit vague here. An R^2 of .5 means R is about .7. So these thresholds tell you at what point you should regress 30% towards league average.

In other words, there's no magic cutoff point, just a range where in-season data becomes more and more meaningful.
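
To make that arithmetic concrete, here is a minimal sketch of the regression step. The function name and the sample rates are made up for illustration (they are not from Carty's study), and it assumes the simple rule that you weight the observed rate by r and league average by 1 - r.

```python
import math

def regress_to_mean(observed, league_avg, r_squared=0.5):
    """Shrink an observed rate toward league average.

    Assumes the observed rate gets weight r and the league
    average gets weight (1 - r), with r = sqrt(r_squared).
    """
    r = math.sqrt(r_squared)
    return r * observed + (1 - r) * league_avg

# At the threshold, r-squared is 0.50, so r is about 0.707
# and the shrink toward league average is about 29%.
r = math.sqrt(0.5)
shrink = 1 - r

# Hypothetical hitter: 25% strikeout rate through 150 PA,
# against a hypothetical league average of 18%.
estimate = regress_to_mean(0.250, 0.180)

print(f"r = {r:.3f}, regress {shrink:.1%} toward league average")
print(f"estimated true strikeout rate: {estimate:.3f}")
```

Below the threshold, r-squared is smaller and the shrink is larger; above it, the shrink is smaller. That is the sliding scale, rather than a cutoff.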

David "MVP" Eckstein said...

That's a fair statement. I'm just trying to illustrate the notion that you shouldn't panic on a player 2 weeks into the season, let alone 100 or 200 AB later. Past production is a sunk cost.

Jon Peltier said...

"Past production is a sunk cost." So is that long-term multi-million-dollar contract.

David "MVP" Eckstein said...

If player P produces X and costs Y, whereas replacement player R produces T, where T<X, but costs U, where U<Y, then it's only a GET RID OF HIM sunk cost if you can get salary relief and comparable production. Otherwise it's just a silly expense you should endure.
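
For what it's worth, that rule can be written out as a rough sketch. Everything here is an assumption for illustration: the function name, the `salary_relief` argument, and especially the 10% tolerance used to decide what counts as "comparable" production.

```python
def should_cut(x, y, t, u, salary_relief, tolerance=0.10):
    """Return True only if cutting player P for replacement R makes sense.

    x, y: production and cost of player P
    t, u: production and cost of replacement R
    salary_relief: how much of P's cost actually comes off the books
    tolerance: hypothetical cushion defining "comparable" production
    """
    # Comparable production: R is within the tolerance of P.
    comparable = t >= x * (1 - tolerance)
    # Real savings: some of Y goes away and R is cheaper than P.
    saves_money = salary_relief > 0 and u < y
    return comparable and saves_money

# Made-up numbers: production in wins, cost in millions.
print(should_cut(x=3.0, y=12.0, t=2.8, u=1.0, salary_relief=8.0))
print(should_cut(x=3.0, y=12.0, t=2.8, u=1.0, salary_relief=0.0))
print(should_cut(x=3.0, y=12.0, t=1.0, u=1.0, salary_relief=8.0))
```

Only the first case meets both conditions. In the other two, either the money is owed anyway or the replacement falls well short, so you endure the expense.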