There are certain milestones in the baseball universe that can ice a player’s immortality. Among these milestones for hitters are 500 home runs and 3000 hits. The key milestone for pitchers is 300 wins, and to a lesser extent 3000 strikeouts. Year in and year out, fans closely monitor their favorite player’s progression toward these hallowed milestones and guess whether he will ever make it. As a player nears a milestone, the world takes notice to such an extreme that every at bat is televised on ESPN. This post is devoted to determining the probability that players will reach a certain milestone. Before going further, I have to acknowledge that a much smarter man than me came up with these calculations. This particular method is one that Bill James calls “career assessments.” However, the numbers presented represent my own work utilizing James’ method. Also of note is that for this exercise I am assuming that the 2009 season is complete, meaning that the probabilities shown will be slightly lower than what they will be when the season ends at the end of the month.
I am going to start with who I consider to be the best hitter in the game. A book could be written about the contributions that Albert Pujols has made to the baseball community. Over the next few years there is the potential that Alex Rodriguez will surpass Barry Bonds’ home run record of 762 first. To be exact, his chances are 27.4%. However, the projected number of home runs that Alex Rodriguez will hit in his career is 721, which would be good for third all time. Therefore we will assume that the record of 762 is still standing if Pujols approaches that many. So, what are Pujols’ chances of setting the new home run record? According to James’ formula, they are 12.5%. It pains me to say that Barry Bonds’ record just might stand a little longer than I want it to.
Below I am going to post a few more career assessment numbers that I ran and found interesting, and then I will share the method with you so that you can try some out for yourself.
Rickey Henderson’s career stolen base record looks safe. The two most legitimate competitors to the throne that I could think of are Carl Crawford and Jose Reyes. Both of them came up with a zero percent chance of surpassing Henderson. Yes, that is how good Henderson was.
It seems likely that we will see the fourth player to reach the 2000 RBI plateau. Currently, Alex Rodriguez has an 83.5% chance and Manny Ramirez has a 37.9% chance.
Derek Jeter is all but a lock for 3000 hits with a 95.3% chance. Vladimir Guerrero will need to pick it up a little bit with only a 16.2% chance.
At first glance, C.C. Sabathia is probably the pitcher most likely to reach 300 wins, and his chances are 18.1%.
The two best closers ever were both part of this generation in Mariano Rivera and Trevor Hoffman. Hoffman holds the record for career saves, but what are the chances that Rivera surpasses him? According to the formulas Hoffman will finish his career with 599 saves. Rivera needs 80 saves to beat that mark, and the formulas say that he has a 19.4% chance of making that happen.
So now hopefully you are adequately curious to find out where these numbers came from. I will walk you through the method to show you how this is done. For this example I will calculate Mark Reynolds’ chances of beating Reggie Jackson’s mark of 2597 strikeouts.
Step One: Calculate how many he needs to do so. Currently Reynolds has 513 career strikeouts, and he needs 2598 (one more than Jackson’s 2597) to take the record. 2598 - 513 = 2085
Step Two: Estimate the years remaining in Reynolds’ career. Use the age that Reynolds was as of June 30 of the current year (in this case 2009, when he was 25). The estimated time remaining in a player’s career is (42 - age) / 2. (42 - 25) / 2 = 8.5
Step Three: Use a weighted average of the last three years to determine the level at which he is playing. Weight the oldest year as one, the middle year as two, and the most current year as three. Divide this result by 6. Reynolds has 180 strikeouts this year, 204 last year, and 129 the year before. (180*3 + 204*2 + 129) / 6 = 179.5 If the result for this number is less than 75% of the most recent year, then use 75% of the most recent year.
Step Four: Multiply the results from steps two and three together. This is the projected amount of the statistic remaining in the player’s career. 8.5 * 179.5 = 1525.75
Step Five: Divide step four by step one and subtract .5. (1525.75 / 2085) - .5 = .2318, or a 23.18% chance. Note that if this number is less than zero, then your answer is zero. It is possible for this number to be greater than one. In that case use this formula: .97 ^ (step one result / step three result)
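For anyone who would rather not do this by hand, the five steps above can be sketched in a few lines of Python. This is only my reading of the steps as written here, not an official implementation of James’ method. The function and parameter names are made up for illustration, and the target is assumed to be the record plus one when the goal is to beat a mark rather than tie it.

```python
def established_level(oldest, middle, most_recent):
    """Step three: weighted average of the last three seasons (weights 1, 2, 3)."""
    weighted = (oldest * 1 + middle * 2 + most_recent * 3) / 6
    # Floor the level at 75% of the most recent season.
    return max(weighted, 0.75 * most_recent)


def years_remaining(age):
    """Step two: estimated seasons left, using age as of June 30."""
    return (42 - age) / 2


def milestone_chance(current_total, target, age, last_three):
    """Steps one through five. last_three is (oldest, middle, most_recent)."""
    need = target - current_total                          # step one
    if need <= 0:
        return 1.0                                         # assumption: already there
    level = established_level(*last_three)                 # step three
    projected_remaining = years_remaining(age) * level     # step four
    chance = projected_remaining / need - 0.5              # step five
    if chance < 0:
        return 0.0
    if chance > 1:
        # Replacement formula from step five for results above one.
        return 0.97 ** (need / level)
    return chance


# Mark Reynolds chasing 2598 strikeouts: 513 so far, age 25,
# with 129, 204, and 180 strikeouts over the last three seasons.
reynolds = milestone_chance(513, 2598, 25, (129, 204, 180))
print(f"{reynolds:.2%}")
```

Plugging in another player only requires his career total, the target, his age, and his last three season totals in oldest-to-newest order.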
There you have it. Not very difficult, and an excellent basis for discussion. The only reservation that I have with this formula is that I think it underestimates the amount of service time that a player has left in him. I suspect, but do not know, that players who make the pros and play for a year or two but don’t stick bring the average career length down. If I could devote a full-time job to baseball statistics, I would try to come up with a conditional probability that extends the expected length of a player’s career given that he has already played five seasons in the majors. Alas, there are flaws in everything, but until you create a better system this is what we have to work with.