Showing posts with label BB%. Show all posts

Is walking in baseball an old man's skill? (part 2)

As I showed in part one, there was a pretty strong relationship between walking and age last season (R-squared was .587). I took the idea from part one (is walking an old man's skill?) a step further. According to Fangraphs, there have been 3,636 players in the history of baseball who have accumulated 1000+ PAs. Over the past several weeks, largely because I did not know that Excel already had a feature (the pivot table) to do it for me, I organized and sorted every one of those 3,636 players by BB% and took the average PA (and wOBA, for good measure) at each walk rate, grouping walk rates in bins of 1/10 of a percent. From there, I graphed the data and calculated the correlations. The lowest walk rate in the sample (0.40%) belonged to a man by the name of George Zettlein (1108 PAs, .209 wOBA). The highest walk rate in the sample (20.80%) belonged not to Barry Bonds (20.60%), but to Ted Williams (9791 PAs, .493 wOBA).
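For anyone without a pivot table handy, the grouping step described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `average_by_walk_rate` is made up, walk rates are rounded to the nearest 0.1% (the post does not say whether rates were rounded or truncated), and only the Zettlein and Williams rows use figures quoted in the post. The "Player A/B/C" rows are invented.

```python
from collections import defaultdict

# Sample rows: (name, BB% as a percentage, career PA, wOBA).
# Only the Zettlein and Williams figures come from the post;
# Players A, B and C are hypothetical, made up for illustration.
players = [
    ("George Zettlein", 0.40, 1108, 0.209),
    ("Ted Williams", 20.80, 9791, 0.493),
    ("Player A", 8.21, 3000, 0.320),
    ("Player B", 8.24, 4000, 0.330),
    ("Player C", 8.30, 2500, 0.310),
]


def average_by_walk_rate(rows, min_pa=1000):
    """Group players into 0.1% BB% bins and average PA and wOBA per bin.

    Assumption: bins are formed by rounding BB% to one decimal place.
    """
    bins = defaultdict(list)
    for _name, bb_pct, pa, woba in rows:
        if pa < min_pa:  # enforce the 1000+ PA cutoff used in the post
            continue
        bins[round(bb_pct, 1)].append((pa, woba))

    result = {}
    for bb_bin in sorted(bins):
        members = bins[bb_bin]
        avg_pa = sum(pa for pa, _ in members) / len(members)
        avg_woba = sum(w for _, w in members) / len(members)
        result[bb_bin] = (avg_pa, avg_woba)
    return result


averages = average_by_walk_rate(players)
for bb_bin, (avg_pa, avg_woba) in averages.items():
    print(f"{bb_bin:5.1f}%  avg PA {avg_pa:7.1f}  avg wOBA {avg_woba:.3f}")
```

Each bin then contributes one point to the charts: the bin's walk rate on the x-axis and the average PA (or average wOBA) on the y-axis.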

The basic questions behind the two data sets were as follows:
1) Do players who walk more get more PAs (in other words, do players who walk more have longer careers)?
2) Are players who walk more generally better at baseball (is BB% correlated to wOBA)?*
*note: wOBA is an OPS-like metric, which is skewed towards OBP rather than SLG and accounts for other factors such as SB%.

Chart #1: BB% (x-variable) to PAs (y-variable)

The R-Squared for this data set is .551.

Chart #2: BB% (x-variable) to wOBA (y-variable)

The R-Squared for this data set is .639 (no surprise).

Other tidbits of data:
-The average wOBA over this sample was .321
-The average BB% of the 3,636 players was 8.21%
-The average PA/career in the data set was 3596.

Conclusions:
-No surprise here, but the backbone of production is walking. The R-squared between BB% and wOBA is high (approx. .64), which for a straight-line fit works out to a correlation of roughly +.80.
-Walking is an old man's skill. The coefficient of determination between a player's walk rate and the number of PAs he gets over the course of his career is strong. I feel that the strength of the relationship is understated by two factors. First, guys like Albert Pujols are still young. In other words, a lot of very good players are still very young and haven't gotten the chance to accumulate 6000+ PAs yet. Second, and more importantly, many players with great talent are derailed by injury. There are plenty of players with fantastic peripherals who just cannot stay healthy and thus get fewer PAs across their careers, which distorts the lower ends of the data. When calculating the average PAs for players with better walk rates within the data (13+%), I noticed many players whose careers ended before accumulating even 2500 PAs.
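A quick note on reading these numbers: R-squared and the correlation coefficient are not the same thing. Assuming each chart was fit with a single straight line with a positive slope (an assumption, since the post does not describe the fit), the correlation is just the square root of the reported R-squared. A minimal sketch:

```python
import math

# R-squared values reported in this post for the two charts.
r_squared = {"BB% vs PA": 0.551, "BB% vs wOBA": 0.639}

# For a simple one-variable linear fit, r = sqrt(R^2), taking the sign of the slope.
# Assumption: both slopes are positive, as the conclusions suggest.
correlation = {label: math.sqrt(r2) for label, r2 in r_squared.items()}

for label, r in correlation.items():
    print(f"{label}: R^2 = {r_squared[label]:.3f}, r = {r:.3f}")
```

So an R-squared of about .64 corresponds to a correlation of about .80, and .551 corresponds to about .74.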

This data should be pretty reliable. The minimum sample sizes necessary to draw statistically relevant inferences are 200 PA for BB% and 500 PA for OPS. wOBA is somewhat related to OPS (though a much better representation of a player's offensive value), so I make the logical but mathematically unproven assumption that 1000 PAs is a good minimum from which I can draw statistically relevant conclusions about wOBA (in relation to BB%).

All in all, the answer seems to be yes, walking is an old man's skill.

Is walking in baseball an old man's skill? (part 1)

There is an old adage in baseball that walking is an old man's skill. The hypothesis is that as players get older, they need to "learn" to walk more in order to compensate for their deteriorating ability to get on base via hitting. As Fangraphs previously noted, the sample size necessary to draw inferences about a player's BB% is 200 PA. For K%, it is 150 PA. With this in mind, I decided to analyze the above hypothesis by taking two approaches. The first approach was to look at data from 2009: to group players by age and average out their BB% (and K% for fun). Of course, there is always the notion that players who walk more are inherently better and have skills that age better, and thus the sample will be biased to reflect this. Such is the topic of my second approach of analyzing BB% (and some other stats) based on playing time in the majors -- I've organized all 3,636 players in the history of baseball who have accumulated 1000+ PAs (thanks to Fangraphs) and plotted BB% to PA to see if this hypothesis has any credence.

The latter approach is going to be more difficult to sort through, average, and make presentable and thus will come out at a later date (sometime after I write Memo II, unless TBO wants to organize all the data for me). This post will deal with my first approach to the data -- average BB% and K% by age group in 2009 (200 min. PA).

Before I present the data, let me present my initial hypotheses. I made two predictions. The first prediction was that BB% would be higher at the extremes (the youngest and oldest players). I based this on the presumptions that 1) players who came into the majors earlier would need more "raw talent" to survive 200+ PA and 2) guys with poor BB%'s would be weeded out of the majors as they got older and had lower bat speed and slower feet. My second prediction was that K% would dramatically increase as players got older than 30 because of declining bat speeds.

Here is the actual data:


Age   Avg BB%   Avg K%
21     9.20%    24.20%
22     8.20%    22.43%
23     9.23%    22.23%
24     7.58%    21.49%
25     9.00%    19.32%
26     8.90%    19.53%
27     9.09%    19.46%
28     8.90%    20.01%
29     9.78%    20.38%
30    10.20%    23.03%
31     9.99%    19.90%
32     8.88%    18.93%
33     8.44%    16.78%
34     9.39%    17.35%
35     8.94%    15.44%
36    10.03%    19.21%
37    11.89%    18.73%
38    10.60%    20.67%
39    12.70%    23.70%
40    14.00%    20.70%
41    13.00%    17.20%

Surprisingly enough, K% decreased among the age 30 to 35 groups. Walk rate had a slight upward tendency, which I feel is understated by the graphic because the degree of change is so small (a 1% change in BB% is more dramatic than it appears, considering that BB% ranged only from 7.58% to 14.00%). Players tend to be "in their prime" from ages 25-34, so it is not very shocking that the BB% data is pretty flat across that age range. After age 35, you can see the BB% spike, although the number of players within an age group shrinks as the groups get older than 35 (which is expected, considering that veterans (aka old guys) have to be worth their price tag to accumulate 200+ PAs). It is worth noting that only four players older than 38 accumulated 200+ PAs in 2009. I should probably also mention that only two MLB players under 22 (Travis Snider and Elvis Andrus) accumulated 200+ PAs. Furthermore, the entire sample of MLB players who accumulated 200 PAs in 2009 was 346. It seems that players begin to lose their bat speed at 35. If any MLB teams are reading this blog, you should probably note that Vlad and Miguel Tejada are each 35 (or older, depending on which birth certificate you consult).

Do you find this data shocking? Reinforcing?

When I have time, I'll run the numbers using PA as the X-factor and average BB% and K% as the variables. That chart should be even more interesting...

________

Update: I just calculated the coefficient of determination for both the strikeout and walk data. For walks to age, the R-Squared is .587. For strikeouts, it is .178.
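For readers who want to check the update, here is a minimal sketch of the calculation using the age-group averages from the table in this post. It assumes an ordinary least-squares straight line through the 21 age-group averages, with every age group weighted equally regardless of how many players are in it. The post does not spell out the method, so treat that as an assumption; `r_squared` is just an illustrative helper name.

```python
# Age-group averages copied from the table in this post (ages 21 through 41).
ages = list(range(21, 42))
bb_pct = [9.20, 8.20, 9.23, 7.58, 9.00, 8.90, 9.09, 8.90, 9.78, 10.20, 9.99,
          8.88, 8.44, 9.39, 8.94, 10.03, 11.89, 10.60, 12.70, 14.00, 13.00]
k_pct = [24.20, 22.43, 22.23, 21.49, 19.32, 19.53, 19.46, 20.01, 20.38, 23.03,
         19.90, 18.93, 16.78, 17.35, 15.44, 19.21, 18.73, 20.67, 23.70, 20.70,
         17.20]


def r_squared(xs, ys):
    """Coefficient of determination for a simple (one-variable) linear fit.

    Assumption: unweighted least squares, so R^2 = Sxy^2 / (Sxx * Syy).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy ** 2 / (sxx * syy)


r2_bb = r_squared(ages, bb_pct)
r2_k = r_squared(ages, k_pct)
print(f"BB% vs age: R^2 = {r2_bb:.3f}")
print(f"K%  vs age: R^2 = {r2_k:.3f}")
```

Run on the table as printed, this should land close to the .587 and .178 quoted above. Because each age group counts equally, the handful of players aged 39 and up pull on the fit as hard as the much larger groups in their late twenties.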