Measuring pitch variability through PITCHf/x

Trip Somers • January 3, 2011 • Analysis

For a while, I've been wondering what can be measured and analyzed using PITCHf/x data that hasn't already been measured and analyzed. A few things crossed my mind, but the most interesting thought was about the degree of variability of a pitcher's pitches.

It would be relatively easy to measure how much variability a pitcher has in velocity and movement if all things were equal. Of course, they aren't.

The two biggest problems for analyzing this type of variability are, as I see them, pitch type identification and park-to-park measurement error. Variability would mean little if half of a pitcher's "two-seam fastballs" are actually change-ups. Variability also runs into problems when parks like Kansas City -- whose radar gun readings are notoriously high -- are included in a data set with other ballparks.

Fortunately, if we only look at a single ballpark -- usually the pitcher's home ballpark because it has the greatest sample size -- park-to-park measurement error should be less of a factor. Without some form of park-to-park normalization, though, interpark comparisons shouldn't necessarily be taken at face value.

Additionally, 2010 saw a huge improvement in pitch type identification. While it still isn't 100% accurate, it is close enough for many pitchers to give me confidence while playing around with my ideas.

I haven't really dug into the numbers yet, but I will be looking to see if variability within a pitch type helps or hurts a pitcher. My gut feeling is that the number itself won't have much meaning.

To calculate the variability, I plan to capture the 95% window by measuring two standard deviations in either direction from the mean (which covers roughly 95% of pitches if the values are close to normally distributed). By definition, this excludes the outliers, but it will take some study to determine if that's really the measurement to use.
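As a rough illustration of that window, here is a minimal Python sketch. The function names and the velocity readings are made up for the example, and it assumes the pitches have already been filtered to one pitch type at one ballpark. It uses the sample standard deviation, which is only one possible choice.

```python
from statistics import mean, stdev

def two_sigma_window(values):
    """Return (low, high): the mean minus and plus two sample standard deviations."""
    center = mean(values)
    spread = stdev(values)  # sample standard deviation (n - 1 denominator)
    return center - 2 * spread, center + 2 * spread

def share_inside(values, window):
    """Fraction of values that fall inside the window, endpoints included."""
    low, high = window
    return sum(low <= v <= high for v in values) / len(values)

# Made-up four-seam fastball velocities (mph) from a single ballpark.
velocities = [92.1, 93.4, 91.8, 92.9, 93.0, 92.4, 94.1, 91.5, 92.7, 93.2]

window = two_sigma_window(velocities)
print("window: %.2f to %.2f mph" % window)
print("share inside: %.2f" % share_inside(velocities, window))
```

Checking the share of pitches that actually lands inside the window is a quick way to see whether the 95% assumption holds for a given pitcher and pitch type.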

It may be beneficial to use a Pythagorean measure to combine horizontal and vertical variability into a single number for pitch movement; however, a single number would not distinguish pitches that have greater variability vertically than horizontally (and vice versa, of course).
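One reading of the Pythagorean idea is to take the standard deviation on each movement axis and combine the two as the square root of the sum of their squares. The sketch below assumes that interpretation. The function name and the movement values (in inches) are hypothetical.

```python
from math import hypot
from statistics import stdev

def movement_spread(horizontal, vertical):
    """Return (sd_horizontal, sd_vertical, combined) for one pitch type.

    combined is sqrt(sd_horizontal**2 + sd_vertical**2), an assumed
    interpretation of the Pythagorean measure.
    """
    sd_x = stdev(horizontal)
    sd_z = stdev(vertical)
    return sd_x, sd_z, hypot(sd_x, sd_z)

# Made-up movement values in inches for two imaginary pitchers.
# Pitcher A varies mostly vertically; pitcher B varies mostly horizontally.
pitcher_a = movement_spread([3, 4, 5], [0, 4, 8])
pitcher_b = movement_spread([0, 4, 8], [3, 4, 5])

print("A: horizontal %.2f, vertical %.2f, combined %.2f" % pitcher_a)
print("B: horizontal %.2f, vertical %.2f, combined %.2f" % pitcher_b)
```

Both imaginary pitchers end up with the same combined value even though their spreads point in different directions, so keeping the per-axis numbers alongside the combined one preserves that distinction.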

Look for a follow-up after I play around with this idea.