Measuring pitch variability through PITCHf/x

January 3, 2011 • Analysis

For a while, I've been wondering what can be measured and analyzed using PITCHf/x data that hasn't already been measured and analyzed. A few things crossed my mind, but the most interesting thought was about the degree of variability of a pitcher's pitches.

It would be relatively easy to measure how much variability a pitcher has in velocity and movement if all things were equal. Of course, they aren't.

The two biggest problems for analyzing this type of variability are, as I see them, pitch type identification and park-to-park measurement error. Variability would mean little if half of a pitcher's "two-seam fastballs" are actually change-ups. Variability also runs into problems when parks like Kansas City -- whose radar gun readings are notoriously high -- are included in a data set with other ballparks.

Fortunately, if we only look at a single ballpark -- usually the pitcher's home ballpark because it has the greatest sample size -- park-to-park measurement error should be less of a factor. Without some form of park-to-park normalization, though, interpark comparisons shouldn't necessarily be taken at face value.

Additionally, 2010 saw a huge improvement in pitch type identification. While it still isn't 100% accurate, it is close enough on many pitchers to give me confidence while playing around with my ideas.

I haven't really dug into the numbers yet, but I will be looking to see if variability within a pitch type helps or hurts a pitcher. My gut feeling is that the number itself won't have much meaning.

To calculate the variability, I plan to capture the 95% window by extending two standard deviations in both directions from the mean. By definition, this eliminates the outliers, but it will take some study to determine if that's really the measurement to use.
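As a rough sketch of that window, here is the calculation in Python. The velocity readings are made up for illustration, and the function name is my own. Keep in mind that two standard deviations only covers about 95% of pitches if the readings are roughly normally distributed, which is an assumption that would need checking.

```python
import statistics

def two_sigma_window(values):
    """Return (low, high) at the mean minus/plus two sample standard deviations."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample standard deviation
    return mean - 2 * sd, mean + 2 * sd

# Hypothetical fastball velocities in mph from a single ballpark (not real data).
velocities = [92.1, 93.4, 91.8, 92.9, 93.0, 92.5, 94.1, 91.5]

low, high = two_sigma_window(velocities)
inside = [v for v in velocities if low <= v <= high]

print(f"window: {low:.2f} to {high:.2f} mph, width {high - low:.2f} mph")
print(f"{len(inside)} of {len(velocities)} pitches fall inside the window")
```

The width of the window (four standard deviations) is the number I would compare from pitcher to pitcher within a single pitch type.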

It may be beneficial to use a Pythagorean measure to find the variability for pitch movement; however, this would not appropriately model pitches that have greater variability vertically than horizontally (and vice versa, of course).
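To illustrate the concern, here is a minimal sketch that combines horizontal and vertical spread into a single Pythagorean number. The function name and the spread values (in inches) are hypothetical. Two pitches with very different shapes of variability come out identical.

```python
import math

def combined_spread(sd_horizontal, sd_vertical):
    """Pythagorean combination of horizontal and vertical standard deviations."""
    return math.hypot(sd_horizontal, sd_vertical)

# Hypothetical spreads in inches: one pitch varies mostly side to side,
# the other mostly up and down.
wide = combined_spread(3.0, 1.0)
tall = combined_spread(1.0, 3.0)

print(f"wide: {wide:.2f}, tall: {tall:.2f}")
```

Both pitches get the same single number, so the direction of the variability is lost. Reporting the horizontal and vertical spreads separately would preserve it.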

Look for a follow-up after I play around with this idea.


Thinking about run values

December 30, 2010 • Analysis

For some time, I've been looking for a way to appropriately integrate run values into the PITCHf/x database. I have read articles at Beyond the Boxscore, Inside The Book, and Cubs f/x, but I am no closer to getting what I want. Unfortunately, I lack the resources and time to find the answers myself.

Many tables have been published with run expectancies for the 12 ball/strike count states for various time periods. Tables have also been published for the 24 base/out states. Because the two tables contain different representations of the same data, there's no way to combine them. What I would like to see -- and I'm sure this makes me a sadist -- is a run expectancy table for the 288 ball/strike/base/out states.

Yes, that's one hell of a matrix to process, but there are two thoughts that seem to be the beginning of arguments against the two relatively simple approaches:

  • The thought against only using the 12 ball/strike count states table: a first-pitch strike in a bases loaded, no out situation has to affect the run expectancy more than a first-pitch strike in a bases empty, two out situation, right?
  • The thought against only using the 24 base/out states table: an 0-2 single with a runner on first base has to affect the run expectancy more than a 3-0 single with a runner on first base, right?

Admittedly, I don't have the knowledge or skills necessary to issue either of those thoughts as facts, so I have posed them as questions. It seems logical, though, doesn't it?
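For a sense of scale, the 288 states are simply every combination of the 12 counts and the 24 base/out situations. The sketch below only enumerates the state keys; it contains no run expectancy values, since those are exactly what I don't have. The variable names and the tuple layout are my own choices.

```python
from itertools import product

BALLS = range(4)    # 0-3 balls
STRIKES = range(3)  # 0-2 strikes
OUTS = range(3)     # 0-2 outs
# Each base is either empty (False) or occupied (True): (first, second, third).
BASES = list(product((False, True), repeat=3))

counts = list(product(BALLS, STRIKES))
base_out = list(product(BASES, OUTS))
states = list(product(BALLS, STRIKES, BASES, OUTS))

print(len(counts), len(base_out), len(states))

# The two situations from the first bullet, before the first pitch is thrown.
bases_loaded_no_out = (0, 0, (True, True, True), 0)
bases_empty_two_out = (0, 0, (False, False, False), 2)
print(bases_loaded_no_out in states, bases_empty_two_out in states)
```

A full table would map each of these keys to the average runs scored through the end of the inning from that state.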

I think an appropriate time period for the analysis to cover is 1998-present -- since the last expansion.

Does anyone know if anyone has tackled this subject, successfully or otherwise? Is this covered in a book that I have not yet read -- possibly even one that I have read?

Consider this an open call for help in this matter.

[UPDATE: Tom Tango finally calculated the 288 states after the 2018 season, but his website was not working when I tried to grab the link.]


2010 Texas Rangers Win-Curve Revisited

December 28, 2010 • Analysis

In 2009, I published a win-curve that predicted Texas Rangers attendance for a given win level. The Rangers won 87 games, and my win-curve predicted 27,958 attendees per game for that win level. Actual attendance was only 27,641. The difference was 317, only a 1.15% difference.

This season, I updated my data and published another win-curve. The yellow dot on the graph marks the 2009 attendance level, and the red dot marks the 2009 win level.

[Figure: Texas Rangers, Wins vs. Estimated Attendance, 2010]
2010 Attendance Prediction. For a full description, read the original article (link above).

In 2010, the Texas Rangers won 90 games. My win-curve predicted an average home attendance of 31,202. According to ESPN's numbers, the actual average home attendance was 30,928.

The difference of 274 attendees per game translates to only a 0.89% overshoot. The model was more accurate this year than last year.
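The arithmetic behind both comparisons can be checked with a few lines of Python. The function name is my own; the attendance figures are the ones quoted in this post. Dividing by actual attendance reproduces the percentages given.

```python
def overshoot_pct(predicted, actual):
    """Percentage by which the prediction exceeded actual attendance."""
    return (predicted - actual) / actual * 100

# Predicted vs. actual average home attendance, from the figures in this post.
pct_2009 = overshoot_pct(27958, 27641)
pct_2010 = overshoot_pct(31202, 30928)

print(f"2009: off by {27958 - 27641} per game ({pct_2009:.2f}%)")
print(f"2010: off by {31202 - 30928} per game ({pct_2010:.2f}%)")
```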

As the season approaches, I will update the data and issue a new prediction.


A collection of thoughts on Stephen Strasburg

September 8, 2010 • Analysis

Yeah, I'm late to the party on this one, but I wanted to share some of what has been written in the blogosphere about Stephen Strasburg's elbow injury.

To start this post off, here are two quotes from my March 2009 analysis of his mechanics after watching him pitch against TCU:

His flexed elbow moves well behind his back and reaches shoulder height before the ball. From there, he must forcefully externally rotate his arm to get the ball to driveline height. This causes late forearm turnover and increases the valgus torque that occurs during reverse forearm bounce. This is a risk factor for his ulnar collateral ligament.

Strasburg has some of the common flaws of traditional pitching mechanics and carries with him the associated risks. These risks will almost certainly not affect his draft status because it could be 10 years before anything goes wrong.

The second paragraph is included to give context for my analysis.

Around the same time as my analysis, Kyle Boddy (then writing for Driveline Mechanics - the now-defunct SBN blog) compared Strasburg's mechanics to those of Pedro Martinez and Mark Prior. The three pitchers demonstrated striking mechanical similarities.

Notably, Pedro Martinez pitched relatively injury-free for most of his career until his age 34 season, the one exception being rather severe shoulder inflammation in 2001.

Mark Prior, of course, was not as lucky. After initially injuring his shoulder in a baserunning collision, Prior suffered from a string of elbow and shoulder injuries. Some people blame the collision for his problems, and while it seems like a possibility, it is impossible to know for sure.

After Strasburg's injury, Kyle wrote two articles concerning Strasburg and elbow injuries in general.

His first article (Elbow Injuries and What Causes Them (Stephen Strasburg Bonus Content!)) is a lengthy discussion of how horizontal shoulder abduction -- referred to as "scap loading" or "scapular loading" by some -- leads to increased horizontal adduction velocities that increase valgus stress in the elbow. He notes that while this clearly can't be labeled as the sole contributor to Strasburg's injury, it certainly played a role.

Kyle's second article (Strasburg, The Inverted W, and Pitching Mechanics) attacks some misconceptions and naysaying about the reputation of the inverted W position. In his discussion, he brings it back to Mark Prior by comparing Prior's peak horizontal shoulder abduction position to Strasburg's peak horizontal shoulder abduction position.

Finally, Eric Cressey offered his thoughts -- The Skinny on Stephen Strasburg’s Injury. Much of the article explains how important the health of the anterior forearm musculature (flexor-pronator mass) is in helping take valgus stress off the UCL. He briefly tackles overall tissue quality and links back to the great series he wrote on elbow pain.

Cressey puts some of the blame on the inverted W, but he is quick to mention that mechanical quirks like that aren't always a sign of impending injury.

A lot of people subscribe to the idea that a pitcher "only has so many bullets" in his arm. Cressey quotes J.P. Ricciardi and seems to agree with him. The idea is hard to argue with, since "so many bullets" could be 1,000 or 1,000,000 or even 1,000,000,000.

As a stand-alone theory, it leaves a lot to be desired, and leads to a series of questions:

  • How many bullets do I have?
  • What's the best way to conserve my bullets?
  • Can I get more bullets? If so, how?

With a boiled-down, unexplained idea like this, people are likely to misapply it by any number of means. That could include keeping strict pitch counts to protect the arm but still pitching year-round without rest. Alternatively, some people may wind up thinking that there's nothing they can do to extend the life of their arms and then neglect appropriate strength and conditioning.

Cressey, however, applies the idea very well in a brief discussion of how to save those bullets. If you haven't read his thoughts, you should.

I have some of my own thoughts to share about Strasburg, but it may take me some time to pull them all together. Stay tuned.


McCarthy suffers another stress fracture

April 27, 2010 • Analysis

Jeff Wilson has reported that Brandon McCarthy has been placed on the 7-day DL in Oklahoma City with a stress fracture of his right scapula. Unbelievable.

Seriously unbelievable. Bones get stronger after stress fractures. It's part of the healing process sometimes referred to as overcompensation (or supercompensation). Bones respond to stress and stress fractures by growing thicker, stronger, and more dense.

This is the third diagnosis of a stress fracture in McCarthy's shoulder. Having been through this twice before, McCarthy's shoulder blade should be plenty strong enough to withstand two months of pitching, but it apparently isn't.

Unbelievable.

What is believable, though? I see a couple of possible explanations.

The original stress fracture from 2007 simply may not be healed. If this is the case, the cause is likely dietary, but it could be that the injury has never been given sufficient time to heal. Stress fractures often become pain-free well before they are actually healed.

Another explanation is that the problem is not actually a stress fracture. Soft tissue is much more susceptible to re-injury than is bony tissue, and the location of McCarthy's injury is a confluence of soft tissue that literally encapsulates the glenohumeral joint.

The recommendations here are running short.

McCarthy attempted a mechanical overhaul, but it doesn't seem to have accomplished its chief goal despite leading to a sparkling ground ball rate at Oklahoma City, where McCarthy has been excellent.

At this point, it looks like mechanics aren't McCarthy's real problem. If it isn't his mechanics, the culprit is one of the following: diet, strength/conditioning, or genetics.

Genetics, of course, cannot be changed, but the other two can be addressed.

In addressing the diet, there are three things to watch for, and they all go hand-in-hand. The goal is improved bone density, so the main focal points are calcium, vitamin D, and pH balance. I am not a dietician or a nutritionist, so I will stop short of making specific recommendations.

In addressing potential strength and conditioning issues that may be contributing to McCarthy's problems, a recently published DVD set contains just about everything anyone would ever need to know ranging from prehab and diagnosis to rehab and high performance.

You (and Brandon McCarthy) should check out Optimal Shoulder Performance.

[UPDATE: The evidence is apparently quite clear. This is, in fact, a scapular stress fracture. Someone who has seen recent video of McCarthy believes that McCarthy had fallen back into old mechanical habits.]