Evaluating xWHIP 2.0 (and Quick xWHIP)

Editor's Note: The following article was written by xWHIP 2.0 Calculator co-collaborator and creator of Simple/Quick xWHIP, Martin Alex Hambrick. You can download the xWHIP 2.0 Calculator (requires Excel or Open Office) by clicking here.

About six months ago, I began tinkering with the concept of predicting WHIP. After all, WHIP is one of the most widely used indicators of pitcher skill, and in most fantasy leagues, it's a category. Additionally, the components should, in theory, be relatively easy to predict. Walk rate tends to remain consistent from year to year, and the number of hits a pitcher allows should regress, much like a pitcher's BABIP.

I set about predicting WHIP by regressing hits against various standards. While I was looking at the numbers, I noticed an interesting trend: WHIP correlated extremely strongly with K/BB ratio. I was interested in turning correlation into something useful for predicting WHIP, but I wasn't sure how to. So, I took the extremely scientific (sarcasm) path of using Excel to generate the trend line equation and then worked backwards from that.

I was left with the following equation: 1.54 - .512K = xH/IP. I rounded off the numbers, threw bases on balls into the mix, and was left with the following: 1.5 + (BB - .5K)/IP = xWHIP

This correlated extremely strongly with career WHIP, with R-squared and R values of .6758 and .822, respectively. However, it consistently estimated WHIP far higher than the actual results. So despite the strong correlation, the accuracy wasn't quite there. I tinkered with the variables involved and tried to standardize it based on factors such as league hits/out, and was left with the following formula, which worked well and was quite accurate. But I didn't have any explanation for why it worked. "It just works" was my motto:

1.375 + (BB - .5K)/IP = xWHIP
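As a quick illustration, the formula fits in a few lines of Python. This is only a sketch: the function name `simple_xwhip` and the sample stat line are made up for the example, and BB, K and IP are assumed to be totals over the same span (a season or a career).

```python
def simple_xwhip(bb: float, k: float, ip: float) -> float:
    """Simple xWHIP as given in the article: 1.375 + (BB - 0.5 * K) / IP."""
    if ip <= 0:
        raise ValueError("innings pitched must be positive")
    return 1.375 + (bb - 0.5 * k) / ip


# Hypothetical stat line, not a real pitcher: 60 BB, 180 K, 200 IP.
print(round(simple_xwhip(bb=60, k=180, ip=200), 3))
```

One practical caveat: box scores write a third of an inning as ".1" and two thirds as ".2", so an IP figure in that notation would need converting to true thirds before going into the formula.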

I sat on this formula for a while until I met another stat-head, Jeff Gross, who had independently come up with the idea of xWHIP. However, he was using a far more scientific means of generating xWHIP, regressing against real game data averages for various events: ground balls, fly balls, pop-ups, line drives, etc. All of these tend to produce outs at a specific rate that remains constant from year to year. So if you have the number of batted ball types a pitcher allows, you should be able to predict the number of hits he will allow and thus predict his WHIP.

We worked together with the formula for expected hits, and came up with an additional statistic: expected outs, which would be used in place of innings pitched. While Jeff worked on further refining the formula, I reexamined the "Simple xWHIP" formula.

Using Jeff's formula, I plugged in the statistics of an "average" pitcher, one who throws an average number of ground balls, fly balls, line drives, and pop-ups. The only two variables I left unknown were strikeouts and walks. After simplifying the equation, I was left with the following: 1.3747 + (BB - .496*K)/IP. Since the goal is simplicity, and these numbers are close enough, we can round up and get exactly what I got before:

1.375 + (BB - .5K)/IP = xWHIP
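To get a feel for how little the rounding costs, here is a small Python sketch that runs both sets of coefficients side by side. The function names and the three stat lines are invented for the illustration; only the coefficients come from the formulas above.

```python
def xwhip_unrounded(bb, k, ip):
    """Coefficients as they fell out of the average-pitcher simplification."""
    return 1.3747 + (bb - 0.496 * k) / ip


def xwhip_rounded(bb, k, ip):
    """Simple xWHIP with the rounded coefficients."""
    return 1.375 + (bb - 0.5 * k) / ip


# Hypothetical stat lines (BB, K, IP); none of these are real pitchers.
samples = [(40, 220, 210), (60, 150, 190), (80, 100, 170)]

for bb, k, ip in samples:
    gap = xwhip_rounded(bb, k, ip) - xwhip_unrounded(bb, k, ip)
    print(f"BB={bb} K={k} IP={ip}: rounding shifts xWHIP by {gap:+.4f}")
```

Algebraically the shift is 0.0003 - 0.004 * K/IP, so even a pitcher who strikes out a batter per inning moves by less than four thousandths of a point of WHIP.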

Finally, there was a valid basis for my formula that "just worked." After discovering that, I went to work checking the statistical validity of both xWHIP (using the expected innings formula, not actual innings pitched) and "Simple xWHIP." I did two calculations for each year: predictive and evaluative. Evaluative is simply comparing 2010 xWHIP vs. 2010 WHIP. Predictive is taking a pitcher's career xWHIP until a given year and using that to predict that year's WHIP.
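For readers who want to see the two calculations spelled out, here is a rough Python sketch using Simple xWHIP with actual innings. It rests on assumptions: "career xWHIP until a given year" is read as the formula applied to summed BB, K and IP from all earlier seasons, and the season lines belong to a made-up pitcher. The function names are hypothetical too.

```python
def simple_xwhip(bb, k, ip):
    """Simple xWHIP: 1.375 + (BB - 0.5 * K) / IP."""
    return 1.375 + (bb - 0.5 * k) / ip


# Hypothetical season lines for one made-up pitcher:
# year -> (BB, K, IP, actual WHIP)
seasons = {
    2008: (55, 160, 190.0, 1.28),
    2009: (50, 175, 200.0, 1.21),
    2010: (48, 185, 205.0, 1.15),
}


def evaluative(seasons, year):
    """Same-season pair: (xWHIP for `year`, actual WHIP for `year`)."""
    bb, k, ip, whip = seasons[year]
    return simple_xwhip(bb, k, ip), whip


def predictive(seasons, year):
    """Career xWHIP from seasons before `year`, paired with that year's WHIP.

    Assumption: career xWHIP is computed from summed totals, not by
    averaging each season's xWHIP.
    """
    prior = [line for y, line in seasons.items() if y < year]
    if not prior:
        raise ValueError("no seasons before the target year")
    bb = sum(line[0] for line in prior)
    k = sum(line[1] for line in prior)
    ip = sum(line[2] for line in prior)
    return simple_xwhip(bb, k, ip), seasons[year][3]


ev_x, ev_whip = evaluative(seasons, 2010)
pr_x, pr_whip = predictive(seasons, 2010)
print(f"evaluative: xWHIP {ev_x:.3f} vs WHIP {ev_whip:.2f}")
print(f"predictive: xWHIP {pr_x:.3f} vs WHIP {pr_whip:.2f}")
```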

The results were as follows:

As you can see, xWHIP does a pretty good job as both a predictive and evaluative stat. Stat-heads can look at the R^2 value. For the layperson, just look at "Accuracy." What that says is, if you picked a pitcher at random, his xWHIP would be, on average, that much higher or lower than his real WHIP.
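The two measures can be sketched in Python as follows. Two assumptions are baked in: "Accuracy" is read as the mean absolute difference between xWHIP and WHIP, and R^2 is taken as the squared Pearson correlation (the figure a spreadsheet trend line reports). The function names and the five data points are invented purely to make the example run.

```python
def mean_abs_error(predicted, actual):
    """Average size of the miss, ignoring direction ("Accuracy" as read here)."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two equal-length, non-empty lists")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov * cov / (var_x * var_y)


# Hypothetical values for five made-up pitchers.
xwhip = [1.10, 1.25, 1.32, 1.41, 1.18]
whip = [1.05, 1.30, 1.28, 1.50, 1.20]

print(f"accuracy (mean abs. difference): {mean_abs_error(xwhip, whip):.3f}")
print(f"R^2: {r_squared(xwhip, whip):.3f}")
```

The two numbers answer different questions: R^2 says how well xWHIP tracks the ordering and spread of WHIP, while the accuracy figure says how far off a typical estimate is. A formula can score well on the first and still run consistently high or low.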

As you can also see, xWHIP and "Simple xWHIP" are remarkably similar in their predictive and evaluative power. Full xWHIP is clearly the better choice when you have all the data in front of you, and it will more accurately reflect the abilities of extreme groundball and flyball pitchers. But, when it comes to simplicity, you can't beat Simple xWHIP. In many cases, you don't even need a calculator to figure it out.

So, using Simple xWHIP, let's examine pitchers in 2010. First, we'll look at the 10 pitchers whose xWHIP was significantly higher or significantly lower than their WHIP.

This gives a pretty good indication of people to steer clear of. Most of these are pretty obvious, but Matt Cain and Ubaldo Jimenez both strike me as pitchers a lot of people would overpay for.

On the flip side, some interesting names top the list of people poised for a bounce-back year. Of course, some of these you still want to avoid like the plague—a .13 dropoff from a 1.56 WHIP is still pretty bad. I'm looking at you, Paul Maholm.
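Lists like these come from sorting pitchers by the gap between xWHIP and WHIP. A minimal Python sketch of that sort is below. The pitcher names and numbers are placeholders, not 2010 data, apart from one entry that mirrors the 1.56 WHIP with a .13 drop mentioned above. The helper name `by_gap` is hypothetical.

```python
# (name, actual WHIP, xWHIP); placeholder entries, not real 2010 lines.
pitchers = [
    ("Pitcher A", 1.08, 1.27),
    ("Pitcher B", 1.56, 1.43),  # the 1.56 WHIP / .13 drop case
    ("Pitcher C", 1.22, 1.21),
    ("Pitcher D", 1.35, 1.18),
]


def by_gap(pitchers):
    """Sort by xWHIP minus WHIP, largest positive gap first.

    A positive gap means xWHIP is higher than WHIP (likely to get worse);
    a negative gap means xWHIP is lower than WHIP (bounce-back candidate).
    """
    return sorted(pitchers, key=lambda p: p[2] - p[1], reverse=True)


for name, whip, xwhip in by_gap(pitchers):
    print(f"{name}: WHIP {whip:.2f}, xWHIP {xwhip:.2f}, gap {xwhip - whip:+.2f}")
```

The top of the sorted list is the steer-clear group and the bottom is the bounce-back group. As the 1.56 example shows, the gap is only half the story, since the xWHIP itself still has to be good.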

Now, let's look at the people with the best WHIP, and what you can expect from them next year:

Cliff Lee, Jered Weaver, Roy Halladay, Mat Latos, Josh Johnson, Shaun Marcum and Justin Verlander are all poised to remain at or below a 1.2 WHIP, and you should pay face value for them. A few guys, like Adam Wainwright, Ted Lilly and Roy Oswalt, should still be high-end WHIP pitchers, but be aware that they pitched far beyond their abilities last season.

Finally, let's take a look at the pitchers with the best xWHIP and see if we can find any high value pickups.

Most of the pitchers with very low xWHIP also had very low WHIP, so you will have a hard time getting them for a good value. James Shields, Scott Baker and Francisco Liriano seem to represent the best values here. Dan Haren and Tim Lincecum are good values in theory, but more than likely you will see people pick them up simply because they have the marquee name, so you may still have to overpay to grab them.

So there you have it. xWHIP and Simple xWHIP. Simple xWHIP, like FIP, is easy to calculate, yet still very powerful. xWHIP is a bit more complex, but accounts for much more and will give you a more accurate view of any specific pitcher, especially an extreme groundball or flyball pitcher.