Postgame Spread
You guys hangin' out? I'll hang out.

Monday, October 30, 2006

MVP? Santana, I guess. Maybe.    

The MVP debate this year has been an interesting, if frustrating, one for me because it's forced me to wrestle with some different ways of looking at these things. If you take Bill James' Win Shares and Runs Created stats seriously, it really looks like Jeter is the guy. On the other hand, if you thought Ortiz deserved it last year, if you jump straight to HRs and RBIs, or are searching for a kind of transcendent quality about your hitter...well, he had a season that would be a trick to classify as inferior to last year's and pretty well nails those other criteria. If you think being the strongest link of a playoff team is the key criterion, you're looking pretty closely at the Twins and trying to figure out which of their players was the most crucial to their success, since they were the most top-heavy of the playoff teams. And it's kind of hard. The easy way out for a lot of people, myself included at times, has been to say, well, why not Santana? Waffling between hitters, why not take the most dominant starting pitcher and give starters their due?

The first and most important thing to get out of the way is that the only-every-fifth-day thing is nonsense. I was alerted to this by Tangotiger pointing out that as of 9/21 Santana had faced more batters than Morneau had faced pitchers. In fact, during the course of the season, Santana faced a total of 923 batters. Zito faced 945, but having pitched fewer innings that just means he sucks more. By contrast, the league leader among position players, Ichiro, had 752 plate appearances. Jeter had 715, Ortiz 686. So the fact that the situations in which Santana makes his impact are more concentrated doesn't actually mean that he's playing less (unless you strongly value defensive contributions, even mediocre ones, in which case you're probably sure Jeter is your guy already, or maybe Mauer if defense is really your thing). Moreover, since Eckersley won in 1992, it's worth pointing out that while I don't have 1992 data on hand, none of the top relievers this year faced much more than 300 batters. So if the playing time thing counts against starters, it should really count against relievers, usual caveats about high-leverage situations aside.
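
To put that workload comparison in numbers, here's a minimal Python sketch using only the totals quoted above. The names are mine, and treating one batter faced as equivalent to one plate appearance is an assumption for the sake of illustration, not a claim about value.

```python
# Batters faced (pitchers) and plate appearances (hitters), as quoted in the post.
batters_faced = {"Santana": 923, "Zito": 945}
plate_appearances = {"Ichiro": 752, "Jeter": 715, "Ortiz": 686}


def workload_ratio(pitcher: str, hitter: str) -> float:
    """Batter-pitcher confrontations for the pitcher per one for the hitter.

    Assumption: one batter faced counts the same as one plate appearance.
    """
    return batters_faced[pitcher] / plate_appearances[hitter]


for hitter in plate_appearances:
    ratio = workload_ratio("Santana", hitter)
    print(f"Santana vs {hitter}: {ratio:.2f}x as many confrontations")
```

Every ratio comes out above 1, which is the whole point: by this crude count the starter is involved in more confrontations than any of the everyday hitters.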

Jeff claims you can't take Santana because this isn't even his best season, let alone a season that stacks up against some of the best seasons of Pedro or Randy. If those guys never won, how do you give it to Santana not even quite at his best? Jim Caple's argument was fairly similar, asking people to compare the Twins this year with the 1995 Mariners trio of Edgar, Buhner, and Randy, each of whom arguably had a better and more important performance to lead an otherwise pretty crappy Mariners team to the playoffs. Caple had me for a while, and reminded me about the ridiculousness of trying to name an MVP in general. But it's easy to turn the tables and say, "Well, how can you give it to a DH after snubbing Edgar in 1995?" I don't think that kind of logic really leads anywhere. What is the point of not trying to correct biases? The fact that dominant seasons in the past by a pitcher or a DH have fallen short doesn't mean we should be prejudiced against them now. Plus, Clemens' 1986 can't be the gold standard, since we have Eckersley in '92. While I'm not trying to take away anything from Eckersley in particular or closers in general...well, the bottom line is that I think starters are more important. If anyone wants to take that up, I'll argue it in the comments. Eventually, anyway.

Jeff also claims that you can't take Santana because he only won 19 games. Obviously, he's talking more about MVP voters than himself, as none of us are really the type to overestimate wins in a pitcher's effectiveness. But it also reflects that in any realistic discussion of MVP, you do have to leave some definitions of "best" at the door. Contributions to wins in that sense may be the most important characteristic of a pitcher's value in an MVP debate, but we also know that Santana's contribution to team wins is not captured by his Wins total. We have a pretty good idea as to how much excellent pitching he contributed to a lot of no-decisions this year. Well, the Twins were 8-1 in the games in which Santana received a no-decision, and he pitched at least 7 full innings in 5 of those games, and left before the end of the 6th in just 2 of those cases (and he pitched 8 in the loss). So, if I were to try to more accurately depict the record Santana "should have," I would probably add another 5 wins and not fault him for the no-decision loss. 24-6 gets you a lot closer to an historic season, though of course people who had "real" historic seasons probably got wins sniped from them too.
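
Here's that thought experiment as a tiny Python sketch. The function name is made up, the 6 losses are implied by the 24-6 figure rather than stated separately, and the rule (credit a win for each no-decision where he went at least 7 innings) is just my back-of-the-envelope choice, not any official stat.

```python
def adjusted_record(wins: int, losses: int, credited_no_decisions: int) -> tuple[int, int]:
    """Add credited no-decisions to the win column; losses are left alone."""
    return wins + credited_no_decisions, losses


# 19 wins is from the post; 6 losses is implied by the 24-6 figure.
# 5 = the no-decisions in which Santana pitched at least 7 full innings.
wins, losses = adjusted_record(19, 6, 5)
win_pct = wins / (wins + losses)
print(f"{wins}-{losses} ({win_pct:.3f})")
```

Changing the cutoff changes the answer, of course, which is one more reminder of how squishy the "should have" record is.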

So let's not underestimate Santana's dominance this year. The Hardball Times stats page makes it pretty clear that 2006 Santana was the second-best AL pitcher of the past three years, after only the 2004 version of himself. And if you toss in the NL, you have to deal with Clemens' ridiculousness, sure. But while Clemens was definitely better that year, if you look at the total of innings contributed, more valuable is more of a stretch. I'd be reluctant to make a case either way. Let's stick to the AL, though, and wade through this a little. Take a look:

ERA
1. 2004 Santana 2.61
2. 2006 Santana 2.77
3. 2005 Millwood 2.86
4. 2005 Santana 2.87
5. 2005 Buehrle 3.12
6. 2006 Halladay 3.19

IP
1. 2004 Buehrle 245.3
2. 2005 Buehrle 236.7
3. 2006 Santana 233.7
4. 2005 Santana 231.7
5. 2005 Zito 228.3
6. 2004 Santana 228.0

Ks (BBs in parentheses)
1. 2004 Santana 265 (54)
2. 2006 Santana 245 (47)
3. 2005 Santana 238 (45)
4. 2004 Pedro 227 (61)
5. 2005 Johnson 211 (47)
6. 2004 Schilling 203 (35)
7. 2006 Bonderman 202 (64)

PRC
1. 2004 Santana 173
2. 2006 Santana 156
3. 2005 Santana 143
4. 2004 Schilling 140
5. 2006 Halladay 119
6. 2004 Pedro 115

ERA+
1. 2004 Santana 180
2. 2006 Santana 164
3. 2005 Santana 151
4. 2006 Halladay 148
5. 2005 Millwood 145
6. 2004 Schilling 145

FIP
1. 2005 Santana 2.80
2. 2004 Santana 3.02
3. 2005 Lackey 3.08
4. 2006 Santana 3.15
5. 2004 Schilling 3.21
6. 2006 Bonderman 3.31

So, Santana was notably better in 2004 than he was this year, but it's pretty close in a lot of the categories. He managed to walk fewer batters while throwing more innings this year, for example. And there's no real question that Santana has been the best pitcher in the American League for three years--with the occasional challenger, sure, but it's a damn impressive run. I don't think it approaches travesty territory, if you think about it, to put these seasons up against the all-time greats, especially considering the quality of hitters he's been facing (though Pedro probably gets a bigger bump in that respect for playing through what we can pretty fairly call now a 'roided era). And on the question of value to his team, there's no doubt that the Twins minus Santana equals bad. Yankees minus Jeter, much less clear. Sox minus Ortiz would be a huge hit, but they didn't make the playoffs anyway. So if that's a big factor, you're leaning toward a Twin, and I think Santana's the best of them.
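
If you want the walks-and-innings comparison as rates, here's a rough Python sketch built only from the IP and Ks (BBs) tables above. One assumption to flag: I'm using the innings figures exactly as listed (so 233.7 stands in for 233 2/3), which makes the rates very slightly approximate.

```python
# (IP, K, BB) for Santana, copied from the IP and Ks (BBs) tables above.
# Assumption: IP decimals are used as listed, e.g. 233.7 for 233 2/3 innings.
santana = {
    2004: (228.0, 265, 54),
    2005: (231.7, 238, 45),
    2006: (233.7, 245, 47),
}


def per_nine(count: int, innings: float) -> float:
    """Rate of an event per nine innings pitched."""
    return 9 * count / innings


rates = {
    year: {"K/9": per_nine(k, ip), "BB/9": per_nine(bb, ip), "K/BB": k / bb}
    for year, (ip, k, bb) in santana.items()
}

for year, r in rates.items():
    print(year, {name: round(value, 2) for name, value in r.items()})
```

By these numbers 2004 has the better strikeout rate and 2006 the better walk rate, which fits the "pretty close in a lot of the categories" read.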

To me, this has come down to trying to figure out what I think of Win Shares, Runs Created (RC) and its newer corollary Pitching Runs Created (PRC). If RC really does incorporate all the different ways you add to an offense better than looking at traditional stats, then Jeter's your guy. If PRC doesn't overstate the impact of pitchers, then you have to take Santana. If they're both pretty shaky, it's a whole lot easier to make a case for Ortiz. But it's not so hard to see that RC is flawed in some capacity. Take a look at RC for NL this year:

1. Pujols 150
2. Berkman 142
3. Cabrera 141
4. Howard 138
5. Beltran 125
6. Reyes 125

I love Cabrera and Berkman as much as the next guy (probably more in the case of Cabrera), but they're probably going to finish ahead of Howard on approximately zero MVP ballots, and that's probably appropriate. So, something's definitely up. Continuing, putting RC and PRC for both leagues together yields this list:

1. Santana 156
2. Pujols 150
3. Berkman 142
4. Cabrera 141
5. Howard 138
5. Jeter 138
6. Oswalt 128
7. Ortiz 127
8. Carpenter 125
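
For anyone who wants to redo the merge, here's a minimal Python sketch of how a combined list like this can be put together. It only includes the names and values that appear on the list above, the function name is mine, and lumping RC and PRC onto one scale is exactly the assumption under debate.

```python
# 2006 values from the lists in the post: RC for hitters, PRC for pitchers.
rc = {"Pujols": 150, "Berkman": 142, "Cabrera": 141, "Howard": 138, "Jeter": 138, "Ortiz": 127}
prc = {"Santana": 156, "Oswalt": 128, "Carpenter": 125}


def dense_rank(scores: dict[str, int]) -> list[tuple[int, str, int]]:
    """Sort high to low; tied values share a rank and the next rank is not skipped."""
    ranked = []
    rank = 0
    previous = None
    for name, value in sorted(scores.items(), key=lambda item: -item[1]):
        if value != previous:
            rank += 1
            previous = value
        ranked.append((rank, name, value))
    return ranked


# Assumption: RC and PRC are treated as directly comparable run totals.
combined = dense_rank({**rc, **prc})
for rank, name, value in combined:
    print(f"{rank}. {name} {value}")
```

The tie handling mirrors the list above, where Howard and Jeter share fifth and Oswalt follows at sixth.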

So, what does this really tell us? One thing that we can say pretty easily is that it probably overstates the importance of batting average for hitters. Or maybe it underestimates the impact of home runs by reducing them to 4 total bases when their impact is actually much greater. Some formula that awarded points for bases on a 6-point scale like Single=1, Double=3, Triple=4, and HR=6 might be closer to the truth (or it might be a further distortion, I just pulled it out of my ass, but you understand where I'm going with this). Valuing HRs a little bit more, though, would result in Howard leap-frogging Berkman and Cabrera in a heartbeat, and would quickly close the gap between Jeter and Ortiz, so I think something along those lines must be pretty key. But at the same time, it definitely identifies all the main guys pretty quick (remember, the AL list goes Jeter, Ortiz, Sizemore, Thome, Morneau, Hafner, Ibanez, Dye, A-Rod, Guerrero), so it's not so shabby as it is.
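
To see what that 6-point scale would do, here's a quick Python sketch comparing it with ordinary total bases. The two stat lines are hypothetical, not real players, and the scale itself is the made-up one from the paragraph above, so treat this as a toy rather than a proposal.

```python
# Ordinary total bases versus the made-up 6-point scale from the post.
TOTAL_BASES = {"1B": 1, "2B": 2, "3B": 3, "HR": 4}
SIX_POINT = {"1B": 1, "2B": 3, "3B": 4, "HR": 6}


def weighted_bases(hits: dict[str, int], weights: dict[str, int]) -> int:
    """Sum each hit type's count times its weight."""
    return sum(weights[kind] * count for kind, count in hits.items())


# Hypothetical stat lines for illustration only, NOT real players.
contact_hitter = {"1B": 150, "2B": 35, "3B": 5, "HR": 12}
slugger = {"1B": 80, "2B": 25, "3B": 1, "HR": 55}

for label, hits in (("contact hitter", contact_hitter), ("slugger", slugger)):
    tb = weighted_bases(hits, TOTAL_BASES)
    six = weighted_bases(hits, SIX_POINT)
    print(f"{label}: TB={tb}, six-point={six}, bump={six / tb:.2f}x")
```

The slugger's total gets the bigger proportional bump under the 6-point scale, which is the direction you'd need to move Howard past Berkman and Cabrera.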

But what about PRC? Does it actually give us a stat that allows us to directly compare the contributions of pitchers and hitters? I'm skeptical, but David Gassko's reasoning on the subject is pretty interesting. I'll leave it to you guys to decide whether it's worth a damn (first, you might want to read his follow-up article), but I think it makes a strong case. So, let's stack up the past three years in both leagues by RC + PRC, and see what it gets us:

2004 Bonds 174
2004 Santana 173
2006 Santana 156
2006 Pujols 150
2005 D. Lee 144
2005 Clemens 144
2004 Ichiro 143
2004 Pujols 143
2005 Santana 143
2004 Johnson 143
2005 Pujols 142
2006 Berkman 142
2004 Schilling 140
2006 Cabrera 141
2005 Teixeira 138
2005 A-Rod 138
2006 Jeter 138
2006 Howard 138
2004 Abreu 137
2005 Ortiz 136
2005 Bay 136
2004 Sheets 136
2005 Manny 134
2005 Sheffield 131
2005 Giles 130
2006 Oswalt 128
2005 Carpenter 128
2006 Ortiz 127
2004 Ortiz 127

That's still a list dominated by hitters, and the pitchers that even get on the list (Santana, Clemens, Johnson, Schilling, Sheets, Oswalt, Carpenter) are the cream of the crop. So, it looks to me like PRC does stack up reasonably well with RC, or at least defensibly well. And by that measure, Santana looks pretty good, but maybe a little too good. It looks to me like Win Shares probably overestimates defense (or at least position) to some extent (a Jeter-Mauer 1-2 punch certainly makes you wonder). But it's kind of hard for me to say it really underestimates pitching, since it gives Santana 25 WS and my little thought experiment pegged him at 24-6 for a record that more accurately reflected how good he was this year. Probably RC is too reliant on OBP and TB to account sufficiently for HR. I also think using TB without finding a way to add walks back into the TB category might account for the kind of bizarre discrepancy between, say, Ortiz's 2005 and his 2006. In 2006 he was better in OBP and slugging, so why fewer RC? Should he be penalized meaningfully for playing in 8 fewer games? Maybe, but I'm certainly not sure. It's not like the Red Sox had to replace him with an offensive black hole when he was out...how much of a difference should that make?

Again, all of this just points out to me how incredibly subjective all this inevitably is. Ortiz clearly contributed the most offense, whatever you think of RC and OBP, as he has a huge RBI lead and only a tiny runs scored deficit despite playing on an inferior offensive team. Jeter's defense has to count for something, though, so the debate is still meaningful. And I think Jeter's basestealing this year really is worth a second look. According to net statistics, he was as valuable in that regard as Jose Reyes, which I don't think that many people would realize. So, again, I say Santana, but the closer I look the easier it is for me to see it going a number of different ways. I would much rather the award go to Ortiz than Jeter, but I don't think a Jeter win would be as ridiculous as all that, really. He did a lot for his offense this year.

5 Comments:

  • I love the net SB stat.

    Of course, I love any metric that lands Eckstein on the shit list.

    By Blogger Alex, at 4:24 PM  

  • Lehr, I'm flattered by your shameless attempts to butter me up, but you can't win me over that easily. Yankee-lover!

    Jesse... careful with your arguments concerning historical precedent. Eck's win in '92 defends Santana's candidacy (2nd pp), while Pedro, Randy and Johan's past losses are irrelevant (3rd, 4th pp)? Either history is relevant or it ain't.

    Personally, I think voting history will go a long, long way to determining the outcome of this thing. Remember who's voting: sportswriters. These are people who are much more preoccupied with angles than evidence. You collect evidence to support the angle, not vice versa. Besides, the MVP race is an election, not a calculation, so political pros/cons are just as valuable as stats.

    Case in point... I think Ortiz mouthing off about Jeter really hurt his own bid. THAT is precisely the kind of angle that sportswriter douchebags are looking for. It's just more fuel on the "Jeter = consummate leader" fire, which is a lot easier to take. Precisely because there's no evidence required other than point-of-view. That's how Bush won both elections, and it's how Jeter will win this one. (Congratulations, Lehr, you root for the Republican way. Can't cut and run, Lehr!!!)

    Anyhow, on a different topic, now that I've seen those RC numbers, I'm a lot more skeptical of its value. Singles and doubles do not necessarily equal runs. Home runs and RBIs do. Runs scored are evidence; Jeter has 'em in spades. (As does Papi, in fact.)

    But we both have the same conclusion: it all comes back to personal opinion. I think Ortiz > Jeter > Johan. You think Johan > Ortiz = Jeter or somesuch. Lehr says Jeter > Giambi > Cairo > Ortiz. We can each find any number of stats to argue for any of them. And yet the only thing different between the two of us is this: you're completely wrong. So we're really only haggling over the relative value of each statistic, not the race itself. Just to be clear.

    By Blogger Jeff, at 4:48 PM  

  • I am saying historical precedent doesn't matter to me, that we should try to be clear about value when we as outsiders try to complain about who gets picked. Maybe I should leave the Eck thing out of there, but its purpose is to deflect your only-the-ultimate-best-pitcher-seasons-ever-should-be-considered point, not to indicate its importance to whether or not Santana should be eligible. But certainly the voters are thinking heavily about that; my post is in no way attempting to be predictive.

    For the record, my basic answers about the relative value of these players would be somewhere along the lines of:

    1) I don't know who is the most valuable.

    2) Virtually any choice will be unsatisfying in one way or another.

    3) I like pitching and pitchers better in general.

    4) The Twins are probably my second-favorite team. Or maybe the Marlins, for now.

    5) It's ridiculous that Manny's never been MVP.

    6) I wish I had better evidence for why Jeter shouldn't get it.

    7) Blah blah blah.

    Anyway, after all that, I mostly just left myself in a pretty confused state, without a clear way of knowing value. I'm glad we've started this debate, though, because you guys are the guys I want to be debating these stats with. I'm happy to be representing the stathead side in it, but I should be clear that imagining the kind of consensus that we can start to move toward has a lot more credence to me than the opinion that I bring to the table in these debates. So I'm more interested in where this goes than in my own case.

    By Blogger Jesse, at 6:26 PM  

  • Regarding the problems with Runs Created that you mentioned, Jeff, clearly, what it tries to achieve is a stat that counts total run production without relying on the context of the team on which the person is playing. I agree it's problematic, both in the ways I pointed out and in what you said about it really coming down to the runs actually scored. It's not predictive, so the formula of OBP and TB is a problem.

    Do you see another way to handle that? Are RBIs more important than Runs? If you value them equally, you could try to do something like (R + RBI)/Team OPS. Or maybe you should be dividing RBI/SLG of the 5 batters behind and R/OBP of the 5 batters in front. But team OPS would be a lot easier and not dependent on a shifting batting order.

    By Blogger Jesse, at 8:19 PM  
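
The (R + RBI)/Team OPS idea floated in the comment above is easy to try out. Here's a minimal Python sketch; the function name and the sample inputs are hypothetical, and nothing here says the ratio is a sound stat, only what it would do to two hitters with identical counting totals on different lineups.

```python
def context_adjusted_production(runs: int, rbi: int, team_ops: float) -> float:
    """(R + RBI) / team OPS, the rough idea floated in the comment above."""
    if team_ops <= 0:
        raise ValueError("team_ops must be positive")
    return (runs + rbi) / team_ops


# Hypothetical inputs for illustration only, not real season totals.
strong_lineup = context_adjusted_production(runs=100, rbi=120, team_ops=0.800)
weak_lineup = context_adjusted_production(runs=100, rbi=120, team_ops=0.720)

print(f"strong lineup: {strong_lineup:.1f}")
print(f"weak lineup:   {weak_lineup:.1f}")
```

Same raw totals, but the hitter on the weaker lineup scores higher, which is the kind of team-context correction the comment is reaching for.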

  • Re: Runs vs. RBIs...
    everyone knows runs scored are the most important. That's why Rickey Henderson is the Greatest Of All Time.

    And speaking of Rickey:
    Rickey was playing with John Olerud in Seattle. The story went that a few weeks into Henderson’s stint with the Mariners, he walked up to Olerud at the batting cage and asked him why he wore a batting helmet in the field. Olerud explained that he had an aneurysm at nine years old and he wore the helmet for protection. Legend goes that Henderson said, “Yeah, I used to play with a guy that had the same thing.” Legend also goes that Olerud said, “That was me, Rickey.” Henderson played with Olerud on the Blue Jays and the Mets.

    Incidentally, my current MVP votes go like this:

    Santana > Jeet > Ortiz

    By Blogger Alex, at 11:21 AM  
