Friday, May 31, 2013

Pitcher Hitting 1871-2012: An Analysis

As I mentioned in my wrap-up of the May 30th, 2013 games, after Cubs pitcher Travis Wood hit a grand slam, I received this tweet:
30 May
has there ever been a pitching staff that has had this much success as the pitchers have had this season?
It's an excellent question, and I was well on my way to researching it when I lost power in the house for six hours. I went back at it today, and here's what I found. This first graph shows pitcher batting averages from 1871-2012:

I typically only go back to 1901 whenever I research topics, but in this case went back to the beginning of organized baseball in 1871 because the notion of the pitcher as a useless hitter was not then prevalent. Bill James discussed this idea in his Historical Baseball Abstract (2001) when he introduced the concept of "Peripheral Quality Indicia" (PQI), in which he argued that unrelated items can be used as proxy measures of the quality of play, and #1 on his list of 16 items was pitcher hitting (the full list is at the end of this post if you're curious). The higher the quality of play, the less hitting success pitchers would have because there's no connection between pitcher hitting and team success. After a slow but steady decline that began in the 1930s, the current standard was reached around 1960 or so and has remained steady through today.

Through Thursday's game, Cubs pitchers had 4 home runs and 19 RBI:
Player G PA AB R H 2B 3B HR RBI BB SO BA OBP SLG OPS OPS+
Travis Wood 14 26 24 5 7 1 0 2 7 1 8 .292 .320 .583 .903 141
Scott Feldman* 10 25 24 2 4 2 0 1 6 0 8 .167 .167 .375 .542 43
Jeff Samardzija 11 23 17 3 2 1 0 1 2 1 10 .118 .167 .353 .520 37
Carlos Villanueva 11 19 17 1 3 0 0 0 0 0 9 .176 .176 .176 .353 -3
Edwin Jackson 10 17 15 1 1 1 0 0 2 0 6 .067 .067 .133 .200 -47
Matt Garza 2 3 3 0 1 1 0 0 2 0 1 .333 .333 .667 1.000 165
Generated 5/31/2013.
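
As a quick sanity check on the table, here is a minimal Python sketch that re-adds the home run and RBI columns and scales them to a full season. The one big assumption is that exactly one third of the schedule has been played; the function names are mine, purely for illustration.

```python
# (player, HR, RBI) copied from the table above
cubs_pitchers = [
    ("Travis Wood", 2, 7),
    ("Scott Feldman", 1, 6),
    ("Jeff Samardzija", 1, 2),
    ("Carlos Villanueva", 0, 0),
    ("Edwin Jackson", 0, 2),
    ("Matt Garza", 0, 2),
]

# Assumption: exactly one third of the season is complete.
SEASON_FRACTION = 1 / 3


def totals(rows):
    """Return (total HR, total RBI) for a list of (player, HR, RBI) rows."""
    hr = sum(row[1] for row in rows)
    rbi = sum(row[2] for row in rows)
    return hr, rbi


def full_season_pace(count, fraction=SEASON_FRACTION):
    """Scale a partial-season count to a full season, rounded to a whole number."""
    return round(count / fraction)


hr, rbi = totals(cubs_pitchers)
print("So far:", hr, "HR,", rbi, "RBI")
print("Pace:  ", full_season_pace(hr), "HR,", full_season_pace(rbi), "RBI")
```

A straight-line pace like this is the crudest projection there is, which is exactly why it shouldn't be trusted over a full season.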

With the season roughly 1/3 complete, that would put them on pace to hit around 12 home runs and drive in around 57 runs. That is NOT going to happen, period. This chart shows every team whose pitchers combined hit 9 or more home runs in a season:

If Cubs pitchers were to maintain this torrid pace, they would be the BEST EVER in pitcher hitting, which makes the tweeter referenced above pretty darn smart. Other than that 2001 Rockies team (where half the games were played at Coors Field, which had a park factor of 122 that year), the age of the pitcher "helping his cause" is long gone, and for one reason:
                                   Pitchers simply don't bat as often as they used to.
As pitchers pitch far fewer innings than in the past, they have fewer plate appearances. Unless pitching patterns dramatically change, the teams listed in the chart above will be the best forever. This graph shows the number of pitcher plate appearances for the National League:

I'm frankly surprised pitchers bat even 2.5 times a game--over the course of a season, that's 80-90 fewer plate appearances than even 30-40 years ago, and that many fewer opportunities for hits, home runs and RBI.
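
To put those numbers on a per-game footing, here is a rough Python sketch of the arithmetic. It assumes a 162-game schedule (my assumption, not something read off the graph) and simply converts the 80-90 plate appearance gap back into a per-game rate.

```python
GAMES_PER_SEASON = 162  # assumption: a full modern schedule


def season_pa(pa_per_game, games=GAMES_PER_SEASON):
    """Season plate appearances implied by a per-game rate."""
    return pa_per_game * games


def per_game(season_total, games=GAMES_PER_SEASON):
    """Per-game rate implied by a season total."""
    return season_total / games


current = season_pa(2.5)  # pitcher PA per team-season at 2.5 per game
for gap in (80, 90):
    earlier = current + gap
    print(f"{gap} more PA per season -> {per_game(earlier):.2f} PA per game")
```

Under that assumption, the older figure works out to roughly three pitcher plate appearances per game, so the drop to 2.5 is about half a plate appearance a night.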

This rather large chart lists every pitcher with at least 15 career home runs:

Other than Carlos Zambrano and Mike Hampton, these are definitely old-school players. Look at Zambrano's career plate appearances--744 in a 12-year career, almost 100 fewer than Earl Wilson in a similar length of career and over 200 fewer than Don Newcombe. And that's the final answer to the question--due to many factors (specialization, fewer hitting opportunities, the DH and more), pitching staffs will be hard-pressed to even hit five home runs in a year--the last team to do that was the 2009 Cubs featuring...Carlos Zambrano. And so, to the tweeter: No, on a pro-rated basis, no team HAS ever had pitcher hitting success like the Cubs so far in 2013, but it won't last, and no team will ever approach the hitting feats of the pitchers of old--that day has simply passed.

BILL JAMES PERIPHERAL QUALITY INDICIA (from the New Bill James Historical Baseball Abstract, 2001, p876):
1. Pitcher hitting
2. The average distance of player age from 27
3. Percent of players less than six feet tall or taller than 6'3"
4. Fielding percent and passed balls
5. Double plays
6. Usage of pitchers at other positions
7. Percent of fielding plays made by pitchers
8. Percent of blowout games 
9. Average attendance and seating capacity
10. Field condition
11. Specialization of player roles
12. Average distance of teams from .500
13. Percent of games that go 9 innings
14. Standard deviation of offensive effectiveness (I have no idea what this means)
15. Record keeping
16. Percent of managers with 20 or more years in the game

The way he describes it makes sense when you consider his baseball progression:
1. Major league baseball
2. Minor league baseball
3. College baseball
4. High school baseball
5. Little League baseball
6. Teeball

In Little League, the pitcher is often the best hitter, player age is around 15-17 years off from 27, almost all players are under six feet tall, and you can go on--one can draw the reasonable conclusion that Little League baseball is not as good as major league baseball. Where these indicia take on greater meaning is when they are applied to the majors themselves and tested over time. Find the book and read this particular essay (it's about Bob Lemon, right up there with the best-hitting pitchers of all time); it's very interesting.

Monday, May 27, 2013

Baseball Streaks--Team Hitting

Some time back I wrote a post on individual hitting streaks in baseball. This post will use the same measures I used in that post (for the most part) but view team hitting streaks instead.
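
For anyone who wants to reproduce this kind of search, the core idea is just finding the longest run of consecutive games that meet some condition. Here is a minimal Python sketch of that idea; it is not the tool I actually used, and the game log in it is made up purely for illustration.

```python
from itertools import groupby


def longest_streak(values, condition):
    """Length of the longest run of consecutive items for which condition is true."""
    best = 0
    # groupby collects neighboring games that share the same True/False result
    for matches, run in groupby(values, key=condition):
        if matches:
            best = max(best, sum(1 for _ in run))
    return best


# Hypothetical runs scored per game, NOT real data
runs_scored = [3, 0, 0, 0, 5, 12, 10, 2, 1, 0, 0]

print("Consecutive shutouts:        ", longest_streak(runs_scored, lambda r: r == 0))
print("Consecutive 10+ run games:   ", longest_streak(runs_scored, lambda r: r >= 10))
print("Consecutive games of 2 or fewer:", longest_streak(runs_scored, lambda r: r <= 2))
```

Swapping in a different condition (home runs hit, strikeouts, and so on) covers every table in this post.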

I'll begin with some dubious ones, in all cases going back to 1916--these are the teams that were shut out in the most consecutive games:
With the organizational success the Braves have had beginning in 1991, it's hard to remember just how woeful they were prior to that. In this four-game stretch, they played relatively well, only giving up a combined 16 runs. They had a decent lineup featuring Bob Horner, Dale Murphy and Claudell Washington but still managed to finish the season at 66-96, only three seasons removed from their playoff appearance of 1982. 

Those are TWO different Senators teams that managed to have these streaks--the 1958 version went on to become the Twins and the 1964 became the Rangers.

These are the teams with the most consecutive one-run games:

The Twins made it twice, which must be a source of pride to them. The 2001 Tigers were not a good team--actually, they would be even worse in 2002 when they won only 55 games, but if you look at that roster, be sure you haven't eaten recently. It's hard to believe that the folks of Kansas City showed up to see the Royals play after what had passed for baseball (and a Yankees farm system) under the Athletics, but even Bill James went to those games. Of course, the Royals proved that good baseball will draw fans in any city.

I'll end the futility with teams that had the most games of scoring two or fewer runs:

How about that, there's the aforementioned 2002 Tigers. As a lifelong Cubs fan, I've never had any use for the Cardinals, which is probably one reason why I consider Lou Brock (and Bill Mazeroski) to be the worst Hall of Fame selections in modern times (honorable mention to Andre Dawson and Jim Rice). This Cardinals team is not a bad team on paper, featuring a young Keith Hernandez and Garry Templeton and halfway-decent pitching, but they weren't the team they would become in the 80s.

Now, some offensive juggernauts--these are the teams with the most consecutive games of scoring 10 or more runs:
Obviously, this is a bygone era, a relic of pre-draft days when the differences between the best teams and the worst were greater than today. The main reason I'm writing these posts is because of those 1929 Giants, who did manage to score 10 or more runs in a six-game stretch. What this chart DOESN'T show is that every single one of those games was played against the Phillies, and every single one of those games was AT Philadelphia. In addition, the Giants won these games at the tail-end of a 25-game road streak. There are other delicious tidbits buried here, but this will be revisited under team pitching streaks, so I'll save them for then.

The 2006 Braves were the team to break the playoff streak and clearly had some offensive firepower, but generally underperformed--they played six games below their Pythagorean Win total and had simply reached the end of a 15-year line of excellence.
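
For readers unfamiliar with the Pythagorean figure, it is Bill James' estimate of how many games a team "should" have won given its runs scored and allowed. Here is a minimal Python sketch using the classic exponent of 2 (some sites use a slightly different exponent); the run totals in the example are hypothetical and are not the 2006 Braves' actual numbers.

```python
def pythagorean_wins(runs_scored, runs_allowed, games=162, exponent=2.0):
    """Expected wins from runs scored and allowed (Bill James' Pythagorean estimate)."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return games * rs / (rs + ra)


def wins_vs_expected(actual_wins, runs_scored, runs_allowed, games=162):
    """Positive means the team beat its Pythagorean total, negative means it fell short."""
    return actual_wins - pythagorean_wins(runs_scored, runs_allowed, games)


# Hypothetical team: scores and allows 800 runs but wins only 75 games
print(pythagorean_wins(800, 800))      # expected wins
print(wins_vs_expected(75, 800, 800))  # games above or below expectation
```

A team that comes in six games under its expected total, as in this made-up example, is usually described as having underperformed.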

These are the teams with home runs in the most consecutive games:

This one is just as much a product of modern times as the prior was of the past--other than some token entries, they're very recent. The park effect for the Rockies is frequently commented upon, but less mentioned is the Rangers park effect--it's not as dramatic as the Rockies, but it's right up there. In 2002, the Rockies had a three-year park effect of 115 (which is ridiculous), but the Rangers had a 106. One factor in park effect is the players who actually hit in that park, and the Rangers had some big hitters that year in Alex Rodriguez (57), Rafael Palmeiro (43 at the age of 37) and five other players in double digits, but had woeful pitching. Even in the midst of one of the greatest offensive eras in baseball history, bad pitching was an albatross that teams couldn't overcome.

On the opposite end of the scale, these are the teams with the most games with no home runs:
The 1979 Astros were not a bad team--they finished 89-73 and would make the playoffs the next year. NOT ONE PLAYER on that team reached double digits in home runs--the team leader was Jose Cruz with 9 (on an odd note, pitcher J.R. Richard also had two home runs). As a team they hit 49 home runs, which is easily the lowest total in the modern era (not including the strike-shortened 1981 season). Words can't express the dampening effects that the Astrodome had on power figures--Baseball-Reference has a neat feature that allows individual players' and teams' statistics to be normalized to a league average, and if the 1979 Astros had just been average, they would have hit 64 home runs. Players like Jim Wynn, Joe Morgan, Cesar Cedeno, Bob Watson and others were simply robbed of power, which is how the Astros ended up on the top of this table. The 1955 Orioles were only in their second season in Baltimore, meaning they were essentially still the St. Louis Browns, who along with the Athletics and Senators effectively turned the AL into a 5-team race (at best) in most years.

The strikeout has become much more prevalent in the intervening years, a trend that began right around 1950 and that I discuss in a different post. This chart shows teams with the most games of 10 or more strikeouts:
In the modern era, I'm not sure this even means anything, but the attitude that a strikeout is just another out is beginning to wane. The extent to which the three-outcome hitter (homer, walk, strikeout) truly existed (other than Adam Dunn) will be for future researchers to determine--candidates like Mark Reynolds, Jack Cust and others in the high strikeout-high home run category are pretty much on their way out of baseball, and the ones that are left will NOT be signing big contracts in the future. In the White Sox game on Sunday, May 26th, 2013, Adam Dunn came up in the bottom of the 8th with the Sox ahead 5-3 over the Marlins and Alex Rios on 2nd with one out. Dunn struck out looking, not even advancing the runner, and this has been a disturbing trend for him going back to around 2010--prior to that, he was right around the league average in moving runners when they were in scoring position, but a strikeout won't do that. The cost of a Dunn strikeout is supposed to be counterbalanced by his home runs (and he had one yesterday for two of the Sox' five runs), but teams don't appear to be willing to make that tradeoff anymore, and certainly won't pay big money for it.

When I post next on streaks, it will be on individual pitching performances. Be sure to follow me on Twitter in order to be alerted to when I post new material.

WAR by Age--A Pitching Analysis

A couple of weeks ago, I posted on the career WAR trend for position players. I may revisit the topic as ideas have popped into my mind (separating power hitters from defensive specialists, etc.), but it was always my intention to discuss the same subject with respect to pitchers. It will be trickier, since the modern era has introduced specialization that is far beyond anything seen on the offensive side. As such, I'll break down pitching by two categories:
1. Starting--I just changed my criteria as I wrote this. I was initially going to include anyone who made at least 20 starts in a season, and had I done that, this would have created a compilation of around 9,000 seasons. I didn't particularly like this for any number of reasons and decided to expand it to ANY start between 1901-2012. This will probably lower the average WAR figures but give a more realistic picture.
2. Relief--I also just changed my criteria as I wrote this, but I'm less sure on this. I was initially going to create two relief categories (closer and middle) and define closer as any season with 10+ saves. I really didn't like that, and understand fully that relief pitching will only really matter beginning around 1975 or so, but I've decided again to widen it and include any pitching season with at least one save. Again, this will likely dilute the values, but it will be across the board. I may run this one also with the 10+ save criteria to see if any differences arise.

I've changed my mind again--this chart shows every pitcher who made at least one start in a year from 1901-2012:

This is a two-axis chart, with the number of pitchers who made at least one start at a given age on the left and the average WAR on the right. If you read the previous post on position players referenced above, the age distribution should look very familiar--the number of pitchers peaks at age 25, when about 2500 have made at least one start and begins a rather steady decline. I would absolutely LOVE to have pitch count data for this time frame, but reliable data only goes back to 1988. 
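
The two series in a chart like this come from one simple grouping: bucket every pitcher-season by age, then count the seasons and average the WAR in each bucket. Here is a minimal Python sketch of that step; the handful of seasons in it are made up for illustration and are not the data behind the chart.

```python
from collections import defaultdict
from statistics import mean


def war_by_age(seasons):
    """Map each age to (number of seasons, average WAR) from (age, WAR) pairs."""
    by_age = defaultdict(list)
    for age, war in seasons:
        by_age[age].append(war)
    return {age: (len(wars), mean(wars)) for age, wars in sorted(by_age.items())}


# Hypothetical (age, WAR) pitcher-seasons, NOT real data
seasons = [(24, 1.0), (25, 2.0), (25, 0.0), (25, 4.0), (33, 3.0)]

summary = war_by_age(seasons)
for age, (count, avg_war) in summary.items():
    print(age, count, round(avg_war, 2))

# Age with the most pitcher-seasons (the left-axis peak)
peak_age = max(summary, key=lambda age: summary[age][0])
print("Most seasons at age", peak_age)
```

The count goes on the left axis and the average on the right, which is why the two lines can tell such different stories.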

The red line is average WAR, and that does something very different than it does for position players--it gradually increases until somewhere around age 32-33 and then stays at a very steady plateau until around 37. I stop at that age because at that point, the effect I described among position players begins, which is that pitchers at 37 or more aren't pitching because teams need roster spots filled, but because they're effective--if they're not effective, THEY'RE NOT IN THE MAJOR LEAGUES. This is why there's no precipitous drop-off in performance--if there were, they'd simply be released. What is stunning is that gradual increase in performance that begins around 22-23.

I was going to show a second chart, but the story is simply told--of the games these pitchers were in, they started about 60% of them, a number that held remarkably steady across ages. Now I'll show the next chart, which is those seasons in which the pitcher made at least 20 starts:

I didn't do something similar with the position players, which would have been the equivalent of them having to have at least 300 plate appearances or something like that. In effect, I've selected out the best of the best, since a pitcher simply won't make 20 starts if they're not effective. These seasons peak at about the same age and begin the same precipitous decline afterward. The spikes at ages 40 and 45 are the Roger Clemens and Nolan Ryan effects, respectively, with a little love for Warren Spahn as well.

The very same pattern with position players is represented--if a pitcher in his 30s continues to perform, he'll still be pitching, but the numbers suggest that it's not easy to maintain. This is a selection of those 35-year-old pitchers from modern times:

Rk Player Year GS Age Tm G CG SHO W L W-L% IP H R ER BB SO ERA ERA+
1 Ryan Dempster 2012 28 35 TOT 28 0 0 12 8 .600 173.0 155 71 65 52 153 3.38 125
2 A.J. Burnett 2012 31 35 PIT 31 1 1 16 10 .615 202.1 189 86 79 62 180 3.51 106
3 Randy Wolf 2012 26 35 TOT 30 0 0 5 10 .333 157.2 196 103 99 52 104 5.65 73
4 Bruce Chen 2012 34 35 KCR 34 0 0 11 14 .440 191.2 215 114 108 47 140 5.07 82
5 Bronson Arroyo 2012 32 35 CIN 32 1 1 12 10 .545 202.0 209 86 84 35 129 3.74 112
6 Roy Halladay 2012 25 35 PHI 25 0 0 11 8 .579 156.1 155 78 78 36 132 4.49 90
7 Carl Pavano 2011 33 35 MIN 33 3 1 9 13 .409 222.0 262 123 106 40 102 4.30 94
8 Ted Lilly 2011 33 35 LAD 33 0 0 12 14 .462 192.2 172 88 85 51 158 3.97 93
9 Tim Hudson 2011 33 35 ATL 33 1 1 16 10 .615 215.0 189 86 77 56 158 3.22 119
10 Kevin Millwood 2010 31 35 BAL 31 1 0 4 16 .200 190.2 223 116 108 65 132 5.10 81
11 R.A. Dickey 2010 26 35 NYM 27 2 1 11 9 .550 174.1 165 62 55 42 104 2.84 138
12 Chris Carpenter 2010 35 35 STL 35 1 0 16 9 .640 235.0 214 99 84 63 179 3.22 120
13 Hiroki Kuroda 2010 31 35 LAD 31 0 0 11 13 .458 196.1 180 87 74 48 159 3.39 114
14 Livan Hernandez 2010 33 35 WSN 33 2 1 10 12 .455 211.2 216 93 86 64 114 3.66 110
15 Derek Lowe 2008 34 35 LAD 34 1 0 14 11 .560 211.0 194 84 76 45 147 3.24 129
16 Andy Pettitte 2007 34 35 NYY 36 0 0 15 9 .625 215.1 238 106 97 69 141 4.05 112
Not too many power pitchers in that group, but pitchers who were able to use location, guile and experience to have success. It's safe to say that if they weren't successful, they wouldn't be on major league rosters, let alone making 20 or more starts.
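
One note for anyone reading the table closely: the IP column uses box-score notation, where "202.1" means 202 1/3 innings, not 202.1. Here is a small Python sketch (helper names are mine) that converts the notation and recomputes ERA as nine times earned runs divided by innings, using a few rows copied from the table.

```python
def innings(ip):
    """Convert box-score innings notation ("202.1" = 202 1/3) to a float."""
    whole, _, thirds = ip.partition(".")
    return int(whole) + int(thirds or 0) / 3


def era(earned_runs, ip):
    """Earned run average: earned runs allowed per nine innings."""
    return 9 * earned_runs / innings(ip)


# (player, ER, IP, listed ERA) copied from the table above
rows = [
    ("Ryan Dempster", 65, "173.0", 3.38),
    ("A.J. Burnett", 79, "202.1", 3.51),
    ("Randy Wolf", 99, "157.2", 5.65),
    ("R.A. Dickey", 55, "174.1", 2.84),
]

for player, er, ip, listed in rows:
    print(player, round(era(er, ip), 2), "listed:", listed)
```

The recomputed figures line up with the listed ERA for these rows, which is a decent check that the columns are being read correctly.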

17 pitchers made 20 or more starts at the age of 19:

Rk Player Year GS Tm G CG SHO W L IP H R ER BB SO ERA ERA+
1 Dwight Gooden 1984 31 NYM 31 7 3 17 9 218.0 161 72 63 73 276 2.60 137
2 David Clyde 1974 21 TEX 28 4 0 3 9 117.0 129 64 57 47 52 4.38 81
3 Bert Blyleven 1970 25 MIN 27 5 1 10 9 164.0 143 66 58 47 135 3.18 119
4 Gary Nolan 1967 32 CIN 33 8 5 14 8 226.2 193 73 65 62 206 2.58 147
5 Larry Dierker 1966 28 HOU 29 8 2 10 8 187.0 173 73 66 45 108 3.18 108
6 Catfish Hunter 1965 20 KCA 32 3 2 8 8 133.0 124 68 63 46 82 4.26 82
7 Wally Bunker 1964 29 BAL 29 12 1 19 5 214.0 161 72 64 62 96 2.69 134
8 Ray Sadecki 1960 26 STL 26 7 1 9 9 157.1 148 76 66 86 95 3.78 108
9 Milt Pappas 1958 21 BAL 31 3 0 10 10 135.1 135 67 61 48 72 4.06 89
10 Mike McCormick 1958 28 SFG 42 8 2 11 8 178.1 192 103 91 60 82 4.59 84
11 Curt Simmons 1948 23 PHI 31 7 0 7 13 170.0 169 110 92 108 86 4.87 81
12 Hal Newhouser 1940 20 DET 28 7 0 9 9 133.1 149 81 72 76 89 4.86 98
13 Bob Feller 1938 36 CLE 39 20 2 17 11 277.2 225 136 126 208 240 4.08 113
14 Frank Shellenback 1918 21 CHW 28 10 2 9 12 182.2 180 77 54 74 47 2.66 103
15 Pete Schneider 1915 35 CIN 48 16 5 14 19 275.2 254 110 76 104 108 2.48 116
16 Ray Keating 1913 21 NYY 28 9 2 6 12 151.1 147 77 54 51 83 3.21 93
17 Chief Bender 1903 33 PHA 36 29 2 17 14 270.0 239 115 92 65 127 3.07 100
With very few exceptions (Clyde the obvious one and I know absolutely nothing about those three right after Bob Feller), these are solid-to-Hall of Fame pitchers. This is not unusual--anyone able to perform at that level at that age can be reasonably expected to have a stellar career, but it's a much more certain pattern with hitters than with pitchers.

These are the best seasons by age since 1950--go back any further and the list is dominated by early 20th Century pitchers:
I won't say it definitely, but I suspect that many will be surprised to see Wilbur Wood on this list in two spots--he was part of that iron-man renaissance that briefly arose in the 1970s when pitchers were throwing 275 or more innings. I just read it (again) in Bill James' Historical Abstract, but consider the following table of number of pitchers throwing 275 or more innings in a season by decade:

It's well known that NFL running backs lose effectiveness right around age 30, but since players enter the NFL at approximately the same age of 22-23, it's entirely possible that there is a NUMBER OF PLAYS wall that players hit, and that the wall just happens to sit right at the 30-year mark. It could be that pitchers have a "finite" number of pitches in their arms absent freaks of nature like Nolan Ryan, rubber-armed wonders like Wilbur Wood or simply once-in-a-generation talent like Randy Johnson or Greg Maddux. It's a little-known fact that Hall of Famers become so usually by defying traditional age trends--that's what makes them special.

So what will major league teams do with regard to starting pitching--will they replicate recent trends of locking in good young talent to avoid free agency? In some cases this is happening, with the White Sox doing so over the past couple of years with Chris Sale, John Danks and Gavin Floyd with decidedly mixed results--Sale just missed a start, Danks is scheduled to make his first start in over a year and Floyd is out for the year and potentially done with the Sox--that's around a combined $25 million commitment that hasn't produced.

It could be that baseball follows the trend with football--draft a running back, ANY running back, run him into the ground for four or five years, reload and go again. The running back has been de-emphasized and big dollar contracts, at least for now, are hard to find. Baseball might do that with pitchers--throw out young pitchers and get what can be had, keep the truly special ones that are able to maintain productivity past the age of 30 (Justin Verlander, Roy Halladay, even though he's clearly on the decline now, same for Cliff Lee, CC Sabathia, etc.). Spend the money on perceived future performance and stay away from the Barry Zito-like contracts that simply will not be repeated going forward. In every generation, there will be a Tom Glavine or Mike Mussina who is able to have a long, productive career with few discernible signs of productivity loss, but they are exceptions.