Thinking about run values

December 30, 2010 • Analysis

For some time, I've been looking for a way to appropriately integrate run values into the PITCHf/x database. I have read articles at Beyond the Boxscore, Inside The Book, and Cubs f/x, but I am no closer to getting what I want. Unfortunately, I lack the resources and time to find the answers myself.

Many tables have been published with run expectancies for the 12 ball/strike count states for various time periods. Tables have also been published for the 24 base/out states. Because the two tables contain different representations of the same data, there's no way to combine them. What I would like to see -- and I'm sure this makes me a sadist -- is a run expectancy table for the 288 ball/strike/base/out states.

Yes, that's one hell of a matrix to process, but there are two thoughts that seem to be the beginning of arguments against the two relatively simple approaches:

  • The thought against only using the 12 ball/strike count states table: a first-pitch strike in a bases loaded, no out situation has to affect the run expectancy more than a first-pitch strike in a bases empty, two out situation, right?
  • The thought against only using the 24 base/out states table: an 0-2 single with a runner on first base has to affect the run expectancy more than a 3-0 single with a runner on first base, right?

Admittedly, I don't have the knowledge or skills necessary to issue either of those thoughts as facts, so I have posed them as questions. It seems logical, though, doesn't it?
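For anyone who wants to take a crack at it, here is a minimal sketch of how a 288-state table could be built from pitch-by-pitch data. It is only an illustration under assumptions: the input layout (one `(state, runs_rest_of_inning)` pair per pitch), the bitmask encoding of the bases, and the function names are all hypothetical and are not part of the PITCHf/x schema.

```python
from collections import defaultdict
from itertools import product

# All 288 states: 4 ball counts x 3 strike counts x 8 base states x 3 out counts.
BALLS = range(4)
STRIKES = range(3)
BASES = range(8)   # assumed bitmask: 1 = runner on first, 2 = second, 4 = third
OUTS = range(3)
ALL_STATES = list(product(BALLS, STRIKES, BASES, OUTS))


def run_expectancy(pitches):
    """Average runs scored from each state to the end of the half-inning.

    `pitches` is an iterable of (state, runs_rest_of_inning) pairs, one per
    pitch, where state = (balls, strikes, bases, outs). This layout is a
    hypothetical input format, not a real database schema.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for state, runs in pitches:
        totals[state] += runs
        counts[state] += 1
    # States that were never observed are left out rather than guessed at.
    return {state: totals[state] / counts[state] for state in counts}


def run_value(table, before, after, runs_scored=0):
    """Run value of an event: change in run expectancy plus runs scored on it."""
    return table[after] - table[before] + runs_scored
```

With a table like this, the run value of a first-pitch strike would be `run_value(table, (0, 0, bases, outs), (0, 1, bases, outs))`, computed separately for each base/out situation. That is exactly the comparison the two thoughts above are asking about. Rare states will have small samples, so some smoothing would probably be needed in practice.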

I think an appropriate time period for the analysis to cover is 1998-present -- since the last expansion.

Does anyone know if anyone has tackled this subject, successfully or otherwise? Is this covered in a book that I have not yet read -- possibly even one that I have read?

Consider this an open call for help in this matter.

[UPDATE: Tom Tango finally calculated the 288 states after the 2018 season, but his website was not working when I tried to grab the link.]


2010 Texas Rangers Win-Curve Revisited

December 28, 2010 • Analysis

In 2009, I published a win-curve that predicted Texas Rangers attendance for a given win level. The Rangers won 87 games, and my win-curve predicted 27,958 attendees per game for that win level. Actual attendance was only 27,641. The difference was 317 attendees per game, or only 1.15%.

This season, I updated my data and published another win-curve. The yellow dot on the graph marks the 2009 attendance level, and the red dot marks the 2009 win level.

Texas Rangers, Wins vs Estimated Attendance, 2010
2010 Attendance Prediction. For a full description, read the original article (link above).

In 2010, the Texas Rangers won 90 games. My win-curve predicted an average home attendance of 31,202. According to ESPN's numbers, the actual average home attendance was 30,928.

The difference of 274 attendees per game translates to only a 0.89% overshoot. The model was more accurate this year than last year.
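The arithmetic behind those two comparisons can be checked with a few lines of Python. This is a small sketch that assumes the percentage is taken relative to actual attendance, which is the convention that reproduces the figures quoted in this post; the function name is just for illustration.

```python
def prediction_error(predicted, actual):
    """Return (difference, difference as a percent of actual attendance)."""
    diff = predicted - actual
    return diff, 100.0 * diff / actual


# Predicted and actual per-game attendance figures quoted in this post.
for season, predicted, actual in [(2009, 27958, 27641), (2010, 31202, 30928)]:
    diff, pct = prediction_error(predicted, actual)
    print(f"{season}: off by {diff} per game ({pct:.2f}% of actual)")
```

This gives 317 per game (1.15%) for 2009 and 274 per game (0.89%) for 2010.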

As the season approaches, I will update the data and issue a new prediction.


A collection of thoughts on Stephen Strasburg

September 8, 2010 • Analysis

Yeah, I'm late to the party on this one, but I wanted to share some of what has been written in the blogosphere about Stephen Strasburg's elbow injury.

To start this post off, here are two quotes from my March 2009 analysis of his mechanics after watching him pitch against TCU:

His flexed elbow moves well behind his back and reaches shoulder height before the ball. From there, he must forcefully externally rotate his arm to get the ball to driveline height. This causes late forearm turnover and increases the valgus torque that occurs during reverse forearm bounce. This is a risk factor for his ulnar collateral ligament.

Strasburg has some of the common flaws of traditional pitching mechanics and carries with him the associated risks. These risks will almost certainly not affect his draft status because it could be 10 years before anything goes wrong.

The second paragraph is included to give context for my analysis.

Around the same time as my analysis, Kyle Boddy (then writing for Driveline Mechanics - the now-defunct SBN blog) compared Strasburg's mechanics to those of Pedro Martinez and Mark Prior. The three pitchers demonstrated striking mechanical similarities.

Notably, Pedro Martinez pitched relatively injury-free for most of his career until his age-34 season, the one exception being rather severe shoulder inflammation in 2001.

Mark Prior, of course, was not as lucky. After initially injuring his shoulder in a baserunning collision, Prior suffered from a string of elbow and shoulder injuries. Some people blame the collision for his problems, and while it seems like a possibility, it is impossible to know for sure.

After Strasburg's injury, Kyle wrote two articles concerning Strasburg and elbow injuries in general.

His first article (Elbow Injuries and What Causes Them (Stephen Strasburg Bonus Content!)) is a lengthy discussion of how horizontal shoulder abduction -- referred to as "scap loading" or "scapular loading" by some -- leads to increased horizontal adduction velocities that increase valgus stress in the elbow. He notes that while this clearly can't be labeled as the sole contributor to Strasburg's injury, it certainly played a role.

Kyle's second article (Strasburg, The Inverted W, and Pitching Mechanics) attacks some misconceptions and naysaying about the reputation of the inverted W position. In his discussion, he brings it back to Mark Prior by comparing Prior's peak horizontal shoulder abduction position to Strasburg's peak horizontal shoulder abduction position.

Finally, Eric Cressey offered his thoughts -- The Skinny on Stephen Strasburg’s Injury. Much of the article explains how important the health of the anterior forearm musculature (flexor-pronator mass) is in helping take valgus stress off the UCL. He briefly tackles overall tissue quality and links back to the great series he wrote on elbow pain.

Cressey puts some of the blame on the inverted W, but he is quick to mention that mechanical quirks like that aren't always a sign of impending injury.

A lot of people subscribe to the idea that a pitcher "only has so many bullets" in his arm. Cressey quotes J.P. Ricciardi and seems to agree with him. The idea is hard to argue with, since "so many bullets" could be 1,000 or 1,000,000 or even 1,000,000,000.

As a stand-alone theory, it leaves a lot to be desired, and leads to a series of questions:

  • How many bullets do I have?
  • What's the best way to conserve my bullets?
  • Can I get more bullets? If so, how?

With a boiled-down, unexplained idea like this, people are likely to misapply it by any number of means. That could include keeping strict pitch counts to protect the arm but still pitching year-round without rest. Alternatively, some people may wind up thinking that there's nothing they can do to extend the life of their arms and then neglect appropriate strength and conditioning.

Cressey, however, applies the idea very well in a brief discussion of how to save those bullets. If you haven't read his thoughts, you should.

I have some of my own thoughts to share about Strasburg, but it may take me some time to pull them all together. Stay tuned.


McCarthy suffers another stress fracture

April 27, 2010 • Analysis

Jeff Wilson has reported that Brandon McCarthy has been placed on the 7-day DL in Oklahoma City with a stress fracture of his right scapula. Unbelievable.

Seriously unbelievable. Bones get stronger after stress fractures. It's part of the healing process sometimes referred to as overcompensation (or supercompensation). Bones respond to stress and stress fractures by growing thicker, stronger, and more dense.

This is the third diagnosis of a stress fracture in McCarthy's shoulder. Having been through this twice before, McCarthy's shoulder blade should be plenty strong enough to withstand two months of pitching, but it apparently isn't.

Unbelievable.

What is believable, though? I see a couple of possible explanations.

The original stress fracture from 2007 simply may not be healed. If this is the case, the cause is likely dietary, but it could be that the injury has never been given sufficient time to heal. Stress fractures often become pain-free well before they are actually healed.

Another explanation is that the problem is not actually a stress fracture. Soft tissue is much more susceptible to re-injury than is bony tissue, and the location of McCarthy's injury is a confluence of soft tissue that literally encapsulates the glenohumeral joint.

The recommendations here are running short.

McCarthy attempted a mechanical overhaul, but it doesn't seem to have accomplished its chief goal despite leading to a sparkling ground ball rate at Oklahoma City, where McCarthy has been excellent.

At this point, it looks like mechanics aren't McCarthy's real problem. If it isn't his mechanics, the culprit is one of the following: diet, strength/conditioning, or genetics.

Genetics, of course, cannot be changed, but the other two can be addressed.

In addressing the diet, there are three things to watch for, and they all go hand-in-hand. The goal is improved bone density, so the main focal points are calcium, vitamin D, and pH balance. I am not a dietician or a nutritionist, so I will stop short of making specific recommendations.

In addressing potential strength and conditioning issues that may be contributing to McCarthy's problems, a recently published DVD set contains just about everything anyone would ever need to know ranging from prehab and diagnosis to rehab and high performance.

You (and Brandon McCarthy) should check out Optimal Shoulder Performance.

[Update: The evidence is apparently quite clear. This is, in fact, a scapular stress fracture. Someone who has seen recent video of McCarthy believes that McCarthy had fallen back into old mechanical habits.]


2010 Texas Rangers: Wins, Attendance, and Playoffs

April 5, 2010 • Analysis

In winning 87 games last season, the Texas Rangers drew an average attendance that was nearly what my model predicted for that win level -- predicted attendance: 27,958 per game; actual attendance: 27,641 per game.

For this year's model, there have been no tweaks to the methodology. I have simply added last year's data to the model. For details on my wins-attendance model, click here. It is based on the model presented by Vince Gennaro in his book Diamond Dollars: The Economics of Winning in Baseball.

Here's this year's model of Attendance versus Wins:

Texas Rangers, Wins vs Estimated Attendance, 2010
2010 Attendance Prediction. For a full description, read the original article (link above).

At 2009's level of 87 wins -- represented by the red dot -- my model predicts the Rangers to crack the 30,000 mark for average attendance at 30,593 per game. The model also predicts the Rangers to maintain last year's attendance level with as few as 73 wins -- represented by the yellow dot.

Regression Notes

The standard error is down from last year's 2,646 attendees per game to 2,602. The R-Square and Adjusted R-Square values are nearly identical.

The growth factor variable is slightly more significant than last season, but still seems more significant to the calculations than its relatively low t Stat value (1.326) suggests. Removing it from the regression results in smaller R-Square values and a larger standard error.

Playoff Chances

Using a logistic regression for the past 12 seasons (since the Tampa Bay Rays franchise came into existence), I took a look at the odds of making the playoffs for a given win level. This is based on historical probability rather than a super complex mathematical system. For a more in-depth explanation of this process, click here.

Josh Hamilton predicted that the Rangers would win 96 games. Historically, 96 wins gives an American League West team a 94.54% chance of making the playoffs (94.50% across the entire American League).

Team president Nolan Ryan predicted 92 wins. Those four wins dramatically change the team's playoff chances. 92-win AL West teams can expect to make the playoffs 62.77% of the time, while a 92-win team from any AL division can expect to make it 68.44% of the time.

Various projection systems predict the Rangers to win between 81 and 87 games. This represents quite a wide range of playoff chances -- AL West: < 0.50% to 8.39%; AL overall: 0.73% to 14.02%.
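For readers curious about what a regression like this looks like in code, here is a rough sketch of fitting a one-variable logistic curve of playoff odds against wins. It is not the exact procedure behind the percentages above: the gradient-ascent fitting, the function names, and especially the sample data are assumptions for illustration. The `wins` and `made_playoffs` lists are made-up placeholders, not historical records.

```python
import math


def fit_logistic(wins, made_playoffs, steps=5000, rate=0.1):
    """Fit P(playoffs) = 1 / (1 + exp(-(a + b * z))) by gradient ascent.

    z is the win total standardized to mean 0 and standard deviation 1,
    which keeps the fixed step size well behaved.
    """
    n = len(wins)
    mean = sum(wins) / n
    sd = math.sqrt(sum((w - mean) ** 2 for w in wins) / n)
    z = [(w - mean) / sd for w in wins]
    a = b = 0.0
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for zi, yi in zip(z, made_playoffs):
            p = 1.0 / (1.0 + math.exp(-(a + b * zi)))
            grad_a += yi - p          # gradient of the log-likelihood wrt a
            grad_b += (yi - p) * zi   # gradient of the log-likelihood wrt b
        a += rate * grad_a / n
        b += rate * grad_b / n

    def playoff_probability(w):
        return 1.0 / (1.0 + math.exp(-(a + b * (w - mean) / sd)))

    return playoff_probability


# Hypothetical team-seasons: win totals and whether that team made the playoffs.
wins = [70, 75, 80, 84, 86, 88, 89, 90, 91, 92, 93, 94, 95, 97, 100, 103]
made_playoffs = [0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 1, 1, 1]

playoff_probability = fit_logistic(wins, made_playoffs)
for w in (81, 87, 92, 96):
    print(f"{w} wins: {100 * playoff_probability(w):.1f}% (toy data)")
```

Fitting the same curve separately on AL West team-seasons and on all American League team-seasons is one way to get a division number and a league number for each win level.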

After about the halfway point in a season, the results from such a logistic regression become fairly meaningless for that season. At that point, the division and wild-card races are taking firm shape, and a daily look at the standings tells a much more complete story.

[Note: When properly applied during the off-season (or at the trade deadline), though, playoff probability added can be used to more accurately estimate a player's true dollar value to an organization. This was to be explained in Part III of my Texas Rangers win-curve series, but I stopped at Part II. I may take another crack at finishing that series this year.]