2011 Texas Rangers: Wins, Attendance, and Playoffs

March 10, 2011 • Analysis

For two years in a row, my attendance prediction model has come extremely close to predicting the actual attendance at the team's win level.

In 2009, my model overshot actual attendance by 1.15%. In 2010, it overshot actual attendance by 0.89%.

The model has been updated to include data from 2010.

Texas Rangers, Wins vs Estimated Attendance, 2011

Again, the model predicts an increase in attendance. At last year's win level -- 90 wins -- the model predicts an average attendance of 33,645 per home game. To fall below last year's attendance level, the model says that the Rangers would have to win fewer than 77 games.

Coming off a World Series appearance, it will be interesting to see how the model holds up for 2011.

Regression Notes

The standard error is down from last year's 2,602 attendees per game to 2,560. The R-Square and Adjusted R-Square values are nearly identical to the previous year's -- all three years have been right around 0.90 for both values.

Thanks to the accuracy of last year's prediction, the t Stat and P-value numbers for all three independent variables improved. The growth factor variable (inflation) is still the least significant of the three with a t Stat of 1.415, but again, removing it from the calculations results in larger errors.

Playoff Probability

There were no significant changes to the playoff probabilities for each win level in the AL West. The 90% barrier is crossed at 95 wins, and the 50% barrier is crossed at 91 wins.

I'm keeping it short and sweet this time to avoid repeating what I've said in the past. If you'd like to read about how I constructed my model, check out my previous articles at the links below:

Texas Rangers Win-Curve Part I: Wins vs Attendance

Texas Rangers Win-Curve Part II: Playoff Probability

2010 Texas Rangers: Wins, Attendance, and Playoffs

If you're fascinated by this stuff and haven't read Vince Gennaro's book Diamond Dollars, I strongly encourage you to take a look at it.


Measuring pitch variability through PITCHf/x

January 3, 2011 • Analysis

For a while, I've been wondering what can be measured and analyzed using PITCHf/x data that hasn't already been measured and analyzed. A few things crossed my mind, but the most interesting thought was about the degree of variability of a pitcher's pitches.

It would be relatively easy to measure how much variability a pitcher has in velocity and movement if all things were equal. Of course, they aren't.

The two biggest problems for analyzing this type of variability are, as I see them, pitch type identification and park-to-park measurement error. Variability would mean little if half of a pitcher's "two-seam fastballs" are actually change-ups. Variability also runs into problems when parks like Kansas City -- whose radar gun readings are notoriously high -- are included in a data set with other ballparks.

Fortunately, if we only look at a single ballpark -- usually the pitcher's home ballpark because it has the greatest sample size -- park-to-park measurement error should be less of a factor. Without some form of park-to-park normalization, though, interpark comparisons shouldn't necessarily be taken at face value.

Additionally, 2010 saw a huge improvement in pitch type identification. While it still isn't 100% accurate, it is close enough on many pitchers to give me confidence while playing around with my ideas.

I haven't really dug into the numbers yet, but I will be looking to see if variability within a pitch type helps or hurts a pitcher. My gut feeling is that the number itself won't have much meaning.

To calculate the variability, I plan to capture the 95% window by going two standard deviations in both directions from the mean. By definition, this excludes the outliers, but it will take some study to determine if that's really the measurement to use.
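As a rough sketch of that calculation, here is how the two-standard-deviation window might look in Python. The function names and the sample velocities are made up for illustration, and the "95%" figure only holds if the readings are roughly normally distributed.

```python
from statistics import mean, stdev


def two_sigma_window(values):
    """Return (low, high): the mean minus and plus two sample standard deviations."""
    center = mean(values)
    spread = stdev(values)  # sample standard deviation (n - 1 denominator)
    return center - 2 * spread, center + 2 * spread


def window_width(values):
    """Width of the two-sigma window; a single number describing variability."""
    low, high = two_sigma_window(values)
    return high - low


# Hypothetical fastball velocities (mph) for one pitcher in one ballpark.
velocities = [93.1, 94.0, 92.7, 93.5, 94.2, 93.8, 92.9, 93.3]

low, high = two_sigma_window(velocities)
print(f"95% window: {low:.2f} to {high:.2f} mph (width {window_width(velocities):.2f})")
```

The same function could be applied to horizontal or vertical movement instead of velocity, one pitch type and one ballpark at a time.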

It may be beneficial to use a Pythagorean measure to find the variability for pitch movement; however, this would not appropriately model pitches that have greater variability vertically than horizontally (and vice versa, of course).
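To illustrate that concern, here is a small sketch that combines the horizontal and vertical standard deviations into one number with the Pythagorean formula. The function name and the movement values (in inches) are invented for the example, and this is only one of several ways such a measure could be defined.

```python
from math import hypot
from statistics import stdev


def movement_spread(horizontal, vertical):
    """Per-axis sample standard deviations plus their Pythagorean combination."""
    horizontal_sd = stdev(horizontal)
    vertical_sd = stdev(vertical)
    return {
        "horizontal_sd": horizontal_sd,
        "vertical_sd": vertical_sd,
        "combined": hypot(horizontal_sd, vertical_sd),  # sqrt(h^2 + v^2)
    }


# Two hypothetical pitches: one varies more vertically, the other horizontally.
pitch_a = movement_spread([-3, 0, 3], [-4, 0, 4])
pitch_b = movement_spread([-4, 0, 4], [-3, 0, 3])

print(pitch_a)
print(pitch_b)
```

Both pitches end up with the same combined value even though their variability has a different shape, which is why keeping the per-axis numbers alongside the combined one seems safer.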

Look for a follow-up after I play around with this idea.


Thinking about run values

December 30, 2010 • Analysis

For some time, I've been looking for a way to appropriately integrate run values into the PITCHf/x database. I have read articles at Beyond the Boxscore, Inside The Book, and Cubs f/x, but I am no closer to getting what I want. Unfortunately, I lack the resources and time to find the answers myself.

Many tables have been published with run expectancies for the 12 ball/strike count states for various time periods. Tables have also been published for the 24 base/out states. Because the two tables contain different representations of the same data, there's no way to combine them. What I would like to see -- and I'm sure this makes me a sadist -- is a run expectancy table for the 288 ball/strike/base/out states.

Yes, that's one hell of a matrix to process, but there are two thoughts that seem to be the beginning of arguments against the two relatively simple approaches:

  • The thought against only using the 12 ball/strike count states table: a first-pitch strike in a bases loaded, no out situation has to affect the run expectancy more than a first-pitch strike in a bases empty, two out situation, right?
  • The thought against only using the 24 base/out states table: an 0-2 single with a runner on first base has to affect the run expectancy more than a 3-0 single with a runner on first base, right?

Admittedly, I don't have the knowledge or skills necessary to issue either of those thoughts as facts, so I have posed them as questions. It seems logical, though, doesn't it?
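For a sense of scale, the 288 states are just the 12 counts crossed with the 24 base/out states. The sketch below enumerates them and averages runs scored through the end of the inning for each state, which is the usual empirical definition of run expectancy. The state layout, the function name, and the sample observations are my own assumptions; real input would come from play-by-play data.

```python
from collections import defaultdict
from itertools import product

# 12 ball/strike counts: 0-3 balls, 0-2 strikes.
COUNTS = [(balls, strikes) for balls in range(4) for strikes in range(3)]

# 8 base configurations as (first, second, third) occupied flags.
BASES = list(product((False, True), repeat=3))

OUTS = range(3)

# Each state is (balls, strikes, bases, outs): 12 * 8 * 3 = 288 states.
STATES = [
    (balls, strikes, bases, outs)
    for (balls, strikes), bases, outs in product(COUNTS, BASES, OUTS)
]


def run_expectancy(observations):
    """Average runs scored from each state through the end of the inning.

    observations: iterable of (state, runs_scored_rest_of_inning) pairs.
    States that never appear in the data are left out of the result.
    """
    totals = defaultdict(lambda: [0, 0])  # state -> [run total, occurrences]
    for state, runs in observations:
        totals[state][0] += runs
        totals[state][1] += 1
    return {state: total / count for state, (total, count) in totals.items()}


# Hypothetical observations: bases empty, no outs, 0-0 count, seen three times.
leadoff = (0, 0, (False, False, False), 0)
table = run_expectancy([(leadoff, 0), (leadoff, 1), (leadoff, 2)])

print(len(STATES))
print(table[leadoff])
```

With 288 cells instead of 24, some states (3-0 with the bases loaded, for example) will show up far less often than others, so sample size per cell is worth checking before trusting any single value.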

I think an appropriate time period for the analysis to cover is 1998-present -- since the last expansion.

Does anyone know whether this subject has been tackled, successfully or otherwise? Is this covered in a book that I have not yet read -- possibly even one that I have read?

Consider this an open call for help in this matter.

[UPDATE: Tom Tango finally calculated the 288 states after the 2018 season, but his website was not working when I tried to grab the link.]


2010 Texas Rangers Win-Curve Revisited

December 28, 2010 • Analysis

In 2009, I published a win-curve that predicted Texas Rangers attendance for a given win level. The Rangers won 87 games, and my win-curve predicted 27,958 attendees per game for that win level. Actual attendance was only 27,641. The difference was 317, only a 1.15% difference.

This season, I updated my data and published another win-curve. The yellow dot on the graph marks the 2009 attendance level, and the red dot marks the 2009 win level.

Texas Rangers, Wins vs Estimated Attendance, 2010
2010 Attendance Prediction. For a full description, read the original article (link above).

In 2010, the Texas Rangers won 90 games. My win-curve predicted an average home attendance of 31,202. According to ESPN's numbers, the actual average home attendance was 30,928.

The difference of 274 attendees per game translates to only a 0.89% overshoot. The model was more accurate this year than last year.
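For anyone who wants to check the arithmetic, the overshoot is the predicted figure minus the actual figure, divided by the actual figure. The function name below is just for illustration; the attendance numbers are the ones quoted above.

```python
def overshoot_pct(predicted, actual):
    """Percentage by which the prediction exceeded actual attendance."""
    return (predicted - actual) / actual * 100


seasons = [
    (2009, 27958, 27641),
    (2010, 31202, 30928),
]

for season, predicted, actual in seasons:
    difference = predicted - actual
    print(f"{season}: off by {difference} per game, {overshoot_pct(predicted, actual):.2f}%")

# 2009: off by 317 per game, 1.15%
# 2010: off by 274 per game, 0.89%
```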

As the season approaches, I will update the data and issue a new prediction.


A collection of thoughts on Stephen Strasburg

September 8, 2010 • Analysis

Yeah, I'm late to the party on this one, but I wanted to share some of what has been written in the blogosphere about Stephen Strasburg's elbow injury.

To start this post off, here are two quotes from my March 2009 analysis of his mechanics after watching him pitch against TCU:

His flexed elbow moves well behind his back and reaches shoulder height before the ball. From there, he must forcefully externally rotate his arm to get the ball to driveline height. This causes late forearm turnover and increases the valgus torque that occurs during reverse forearm bounce. This is a risk factor for his ulnar collateral ligament.

Strasburg has some of the common flaws of traditional pitching mechanics and carries with him the associated risks. These risks will almost certainly not affect his draft status because it could be 10 years before anything goes wrong.

The second paragraph is included to give context for my analysis.

Around the same time as my analysis, Kyle Boddy (then writing for Driveline Mechanics, the now-defunct SBN blog) compared Strasburg's mechanics to those of Pedro Martinez and Mark Prior. The three pitchers demonstrated striking mechanical similarities.

Notably, Pedro Martinez pitched relatively injury-free for most of his career until his age-34 season, the one exception being rather severe shoulder inflammation in 2001.

Mark Prior, of course, was not as lucky. After initially injuring his shoulder in a baserunning collision, Prior suffered from a string of elbow and shoulder injuries. Some people blame the collision for his problems, and while it seems like a possibility, it is impossible to know for sure.

After Strasburg's injury, Kyle wrote two articles concerning Strasburg and elbow injuries in general.

His first article (Elbow Injuries and What Causes Them (Stephen Strasburg Bonus Content!)) is a lengthy discussion of how horizontal shoulder abduction -- referred to as "scap loading" or "scapular loading" by some -- leads to increased horizontal adduction velocities that increase valgus stress in the elbow. He notes that while this clearly can't be labeled as the sole contributor to Strasburg's injury, it certainly played a role.

Kyle's second article (Strasburg, The Inverted W, and Pitching Mechanics) attacks some misconceptions and naysaying about the reputation of the inverted W position. In his discussion, he brings it back to Mark Prior by comparing Prior's peak horizontal shoulder abduction position to Strasburg's peak horizontal shoulder abduction position.

Finally, Eric Cressey offered his thoughts -- The Skinny on Stephen Strasburg’s Injury. Much of the article explains how important the health of the anterior forearm musculature (flexor-pronator mass) is in helping take valgus stress off the UCL. He briefly tackles overall tissue quality and links back to the great series he wrote on elbow pain.

Cressey puts some of the blame on the inverted W, but he is quick to mention that mechanical quirks like that aren't always a sign of impending injury.

A lot of people subscribe to the idea that a pitcher "only has so many bullets" in his arm. Cressey quotes J.P. Ricciardi and seems to agree with him. The idea is hard to argue with, since "so many bullets" could be 1,000 or 1,000,000 or even 1,000,000,000.

As a stand-alone theory, it leaves a lot to be desired, and leads to a series of questions:

  • How many bullets do I have?
  • What's the best way to conserve my bullets?
  • Can I get more bullets? If so, how?

With a boiled-down, unexplained idea like this, people are likely to misapply it by any number of means. That could include keeping strict pitch counts to protect the arm but still pitching year-round without rest. Alternatively, some people may wind up thinking that there's nothing they can do to extend the life of their arms and then neglect appropriate strength and conditioning.

Cressey, however, applies the idea very well in a brief discussion of how to save those bullets. If you haven't read his thoughts, you should.

I have some of my own thoughts to share about Strasburg, but it may take me some time to pull them all together. Stay tuned.