When you come to grad school to learn how to be a scientist, the first thing that they teach you is how to be a skeptic. Every program has a class where you read papers and discuss them. What conclusions did the researchers come to in this paper? How did they get there? Most importantly, is there enough evidence to support that conclusion? Did they do enough controls? The right controls? Does the data say what they're telling us that it says?
I'm telling you this because this is also how I watch baseball. When I say that the Pirates need to get Kevin Correia out of the rotation, someone always immediately tells me that he's got a 4.12 ERA and that he's not pitching badly and that I hate Kevin Correia and that my bias against him keeps me from properly evaluating his performance. That's not true at all; the reality is that I don't believe his 4.12 ERA is sustainable. He only strikes out one batter every three innings and he's still got a home run problem and his BABIP is .246. Some people look at Correia's ERA this year and conclude that he's about an average pitcher. I think that's a conclusion built on false assumptions.
I do the same thing with the whole team, of course. You tell me that being 36-32 is a sign that they're going to contend this year, I'll tell you that they're being outscored by their opponents by ten runs. You tell me that their 3.44 ERA is a sign that the pitching is going to be different this year, I'll tell you that their FIP is 3.91 and their pitchers have a combined BABIP of .279 and so this great pitching staff doesn't necessarily strike me as something that's sustainable throughout the whole season. "Pedro Alvarez has 13 home runs!" you say. "And he's got 69 strikeouts in 226 at-bats," I respond. This isn't because I'm being contrarian or because I'm trying to rain on anyone's parade; it's because throughout my childhood I let myself be fooled by far too many bad baseball teams simply because I thought that it has to be time for the Pirates to catch a break. It never has to be time for any team that hasn't earned it. The universe doesn't owe us anything that we don't take for ourselves. The Pirates won't have a .500 record or win a division until they're a good baseball team. No amount of "just enjoy the ride" will change that fact.
Here's the thing, though: in June, the Pirates have been a good baseball team. They're 11-7. They've scored 92 runs in 18 games, which is an average of 5.1 a night. That's a run more than the NL average of 4.1 in 2012. That's allowed them to win games, despite the pitching staff's return to average (73 runs in 18 games is an average of 4.1 per night). The Pirates went 15-13 in May despite only scoring 89 runs all month. That sort of thing wasn't ever going to be sustainable. What they're doing in June? The way they've played in June? This is way more encouraging to me.
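For the skeptics in the audience, here's a quick sanity check on June using Bill James's Pythagorean expectation, which estimates winning percentage as runs scored squared divided by the sum of runs scored squared and runs allowed squared. This is only a sketch: the function name is mine, and I'm assuming the classic exponent of 2 (other versions use slightly different exponents).

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage from runs scored and runs allowed.

    exponent=2.0 is the classic form; it is an assumption here,
    not the only version in use.
    """
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


# June so far: 92 runs scored, 73 allowed, 18 games.
june_games = 18
june_pct = pythagorean_win_pct(92, 73)
june_expected_wins = june_pct * june_games

print(f"{june_pct:.3f} -> {june_expected_wins:.1f} expected wins in {june_games} games")
# 0.614 -> 11.0 expected wins in 18 games
```

In other words, the 11-7 June record is almost exactly what the run totals say it should be, which is not something you could say about May.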
I'm not going to stop being skeptical. The team's record in one-run games (17-11) makes me nervous. The fact that they might not be able to pile runs up all year the way they have this month makes me nervous. The tenuous pitching depth coupled with the age of AJ Burnett and the health of Erik Bedard, the way the bullpen is generally pitching over its head, the thin line between awesome Pedro Alvarez and awful Pedro Alvarez, the way that Andrew McCutchen's production has increased exponentially without a huge change in his peripherals, all of these things still stand out as bright red flags to me. As a scientist, I'd call June a promising preliminary result that needs more data.
Being a skeptic is no fun, though. Just like Fox Mulder, I want to believe. I want every single nine-run explosion this month to mean that it's something that's going to happen over and over. I want AJ Burnett and James McDonald in the Cy Young discussion in September. I want 35 home runs from Pedro Alvarez and I want the name Andrew McCutchen on the lips of every single baseball fan in America. I want 90 wins and a playoff spot. Remembering my default skeptical position, if I were to talk myself into the Pirates being a contender this year, how would I go about doing it? Let's take a second here and just dream.
1. The Pirates' schedule is favorable through the end of July
The Pirates have carved up bad and mediocre teams over the last month. Three wins against the Cubs, three against the Royals, two of three from the Twins, two of three from the Brewers, two of three from the Indians. Here's who the Pirates play between now and the end of July: the Tigers (34-35), at the Phillies (33-38), at the Cardinals (35-35), the Astros (28-41), the Giants (38-32), at the Brewers (32-37), at the Rockies (26-42), the Marlins (33-36), the Cubs (24-45), at the Astros, and at the Cubs. That's 17 games against three of the NL's worst teams (Rockies, Cubs, and Astros), 15 more against currently mediocre teams (Tigers, Phillies, Cardinals, Marlins, Brewers), and three against a good team (Giants). If the Pirates win, say, 20 of their next 35 games (11 against the bad teams, 8 against the mediocre, one against the Giants), that makes them 56-47 on August 1st. That's maybe an aggressive prediction, but it's not implausible by any means. And any team that's 56-47 is going to be in the thick of the playoff hunt with two months left. Even from there, the Pirates have six against the Padres and a bunch of games against the Cardinals and Brewers in August and a bunch of Cubs and Astros games in September. There's not really another super-tough stretch at any point from here on out.
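If you want to check my math or plug in your own guesses, the projection above is just bookkeeping. Here's a minimal sketch; the grouping labels and variable names are mine, and the win totals are the same guesses as in the paragraph, not a forecast from any model.

```python
# Record entering this stretch (from earlier in the post).
wins, losses = 36, 32

# Games through the end of July, grouped by opponent quality,
# and the guessed win totals against each group.
games = {"bad": 17, "mediocre": 15, "good": 3}
guessed_wins = {"bad": 11, "mediocre": 8, "good": 1}

stretch_games = sum(games.values())
stretch_wins = sum(guessed_wins.values())
stretch_losses = stretch_games - stretch_wins

final_wins = wins + stretch_wins
final_losses = losses + stretch_losses

print(f"{stretch_wins}-{stretch_losses} stretch -> {final_wins}-{final_losses}")
# 20-15 stretch -> 56-47
print(f"winning percentage: {final_wins / (final_wins + final_losses):.3f}")
# winning percentage: 0.544
```

Change the numbers in `guessed_wins` to be as optimistic or as pessimistic as you like; even an 18-17 stretch leaves them at 54-49.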
2. The pitching is much less likely to collapse this year
It's one thing to think that the pitchers aren't quite as good as they've looked to this point, but it's another to think they're going to collapse entirely. Even if they regress back towards their FIP/xFIP/SIERA, that still makes them an average or slightly below average pitching staff, not a terrible, apocalyptically bad one.
3. June is an awfully small sample size to judge the offense by, but so were April and May
Repeat after me: the truth always lies in the middle. The truth always lies in the middle. The Pirates probably aren't a five-runs-per-game team, but I sincerely doubt that they'll revert back to a 530-runs-per-season pace, simply because teams that bad at the plate only come around once per generation or so and the offensive talent on the Pirates just isn't that low.
4. Pedro Alvarez is making some progress with his strikeouts
Since Pedro Alvarez's first hot streak began in earnest, in the April 25th double-header against the Rockies, he's struck out 53 times in 188 plate appearances. That's 28.2%, which means that over an extended period, including a bad slump, Alvarez has struck out in fewer than 30% of his plate appearances. I'd still rather see him around 25% and his slumps do worry me, but he's definitely making some kind of progress at the plate this year.
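To put the gap between 28.2% and 25% in concrete terms, here's a small sketch of the rate calculation. The function name is mine, and the only inputs are the 53 strikeouts and 188 plate appearances quoted above.

```python
def strikeout_rate(strikeouts, plate_appearances):
    """Strikeouts per plate appearance, as a fraction."""
    return strikeouts / plate_appearances


strikeouts, plate_appearances = 53, 188
rate = strikeout_rate(strikeouts, plate_appearances)

# How many strikeouts a 25% rate would mean over the same 188 PA.
target_strikeouts = 0.25 * plate_appearances
extra_strikeouts = strikeouts - target_strikeouts

print(f"{rate:.1%}")        # 28.2%
print(target_strikeouts)    # 47.0
print(extra_strikeouts)     # 6.0
```

So the difference between where he is and where I'd like him to be is about six strikeouts over this stretch.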
5. It's time for the front office to shift its focus
Regardless of whether the Pirates end up contending this year or not, it's clear that the future for the Pirates is here. McCutchen is in Pittsburgh and locked down for the foreseeable future. Alvarez is in Pittsburgh. Starling Marte is in Triple-A. Gerrit Cole is in Double-A and Jameson Taillon probably isn't far behind him there. Before now, the focus of Neal Huntington and his team was purely talent acquisition. No matter how this season plays out, it's time for a more directed approach to fill needs, even if that requires giving up some minor league talent. If the Pirates are in contention at the deadline this year, a boost bigger than Derrek Lee and Ryan Ludwick could be coming.
6. Stranger things have actually happened
If you read between the lines above, you can see that I think the Pirates' true talent is somewhere in the ballpark of average or slightly below average in run prevention and probably below average -- but not ridiculously so -- in scoring runs. That makes them about what I thought they'd be before the year: a 75-win team. But even if they have the true talent of a 75-win team, they've already won 36 of their first 68 games, which means that if they play like a 75-win team over the remaining 94 games, they win 79 or 80 games. Adjust for what could be a potentially easy schedule, toss in a little bit of luck, account for a little bit of improvement over the expected performance level for guys like McDonald and Alvarez, and, hey, we're definitely in the "weirder things have happened" neck of the woods.
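The "banked wins" arithmetic is simple enough to sketch. This assumes a 162-game season and that "75-win true talent" just means winning 75/162 of the remaining games; the function and parameter names are mine.

```python
def project_final_wins(wins_so_far, games_played, talent_wins, season_games=162):
    """Wins already banked plus true-talent pace over the remaining games."""
    remaining = season_games - games_played
    talent_pct = talent_wins / season_games
    return wins_so_far + talent_pct * remaining


projected = project_final_wins(36, 68, 75)
print(f"{projected:.1f}")  # 79.5
```

Worked through exactly, a 75-win pace over the remaining 94 games adds about 43.5 wins to the 36 already in hand, for roughly 79.5. The point is that wins already on the board don't regress; only the future does.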
7. Andrew McCutchen
I'm not betting against him. Are you?
Look, I have as many questions about this team as anyone. I'm not buying plane tickets to Pittsburgh in October right now. I'm treading cautiously and I'm expecting them to come back to Earth. But I do want to believe in this team, and it's easier to do that right now than it's been in a long time.
I have been reading Pat's blog for quite a few years now. I'd argue that he got much more skeptical when he went to grad school. My brother and I discussed this (he being a scientist (skeptic) and me being in construction management (always waiting for the next debacle)). We agree with almost everything Pat writes. The stats, the sabermetrics, the averages, Correia's inevitable collapse... they don't lie. But they sure are an easy way to dilute my happiness after a Bucs win, and a shovel to dig my hole deeper after a Bucs loss. Pat makes me sad.
I don't think I just want a happier Pat, though. I think that I want a good Pirates team with good stats to back them up. And thus a happy Pat.
Regarding Pedro's Ks: It's generally better to use ABs as opposed to plate appearances if you're trying to determine if a player has trouble making contact. Since walks, sacrifices, HBPs and the like are taken out of ABs, and since there are better metrics to judge plate discipline on, generally when people refer to K-rates it's on an AB basis. And Mr. Alvarez doesn't look as good there, with a 32% K-rate. His plate discipline has indeed improved, but he still has some pretty significant trouble connecting the bat with the ball.
Regarding run differential, since that seems to be a meme in talks about the Bucs the last few days, it seems to me that everyone thinks that every team's actual record should match their Pythagorean record exactly and if not, it's a statistical anomaly that automatically means the team is over-performing. That seems wrong to me.
I'm by no means a statistician (I can barely spell it), but there must be a standard deviation from Pythag within which a team's record is not an outlier. So if the Pirates' Pythag is 33-35, but the standard deviation is 4, then the Pirates' record could be expected for a team with their run differential. Wouldn't that be a more nuanced way to look at run differential? (Again, I'm terrible at stats, so if someone can educate me, please do.)
Also, in general I think run differential is a pretty blunt instrument to base arguments on. It can swing too quickly during the season - in the middle of last week, the differential was close to thirty. There's definitely a correlation between record and differential at the end of the season, but I don't think it tells you much during the season unless there are huge differences between projected and actual records.
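One rough way to size that spread, offered only as a sketch under a strong simplifying assumption: treat each game as an independent coin flip whose win probability is the Pythagorean winning percentage, so that wins follow a binomial distribution with standard deviation sqrt(n * p * (1 - p)). Real games are not independent coin flips, and this is not a measured error rate for Pythagorean records; the function name and the 33-35 figure (the hypothetical from the comment) are illustrative.

```python
import math


def binomial_win_sd(games, win_prob):
    """Standard deviation of wins if each game is an independent coin flip."""
    return math.sqrt(games * win_prob * (1 - win_prob))


games = 68
expected_wins = 33   # hypothetical Pythagorean record of 33-35
actual_wins = 36     # actual record of 36-32

sd = binomial_win_sd(games, expected_wins / games)
z = (actual_wins - expected_wins) / sd

print(f"sd = {sd:.1f} wins, actual record is {z:.2f} sd above expectation")
# sd = 4.1 wins, actual record is 0.73 sd above expectation
```

Under that simple model, a guess of about 4 wins for the standard deviation is close to the mark after 68 games, and a three-win gap is well within ordinary noise.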
@Barry JT There are two basic ways you can look at the Pythagorean record:
--Regressive: Given this number of runs scored and runs allowed, you could expect to have won this percentage of your games.
--Predictive: If you continue to score and allow runs at the same rate, you would expect to win this percentage of your games.
As you and wkkortas and Pat have mentioned, it's pretty terrible for prediction because it doesn't tell you why you scored/allowed runs at that rate so far, or whether you can expect to do so in the future.
There is definitely a lot of misinterpretation out there. Too many people look at the Pythagorean record and say "we're a few games above the Pythagorean now, so we're going to lose a bunch later so that the record evens out at the end of the season". And there's the strange phenomenon of people celebrating 9-1 wins not because 9-1 wins are awesome, but because the predictive Pythagorean win percentage has moved a little higher.
@Barry JT If I remember correctly--and I'm doing this off the top of my head--the basic pythagorean run differential is, over the long term, accurate to within 2-3 wins, so a team that scored and gave up the same number of runs should finish somewhere between 78-84 and 84-78. I would agree that it's a bit of a blunt instrument, and interleague play and the extremely unbalanced schedule tend to muddy the waters a bit, at least as far as using it to project the Pirates' record going forward.
@wkkortas @Barry JT It's definitely a blunt instrument, but the basic idea (that it's hard to be a lot above .500 if you've been outscored by your opponents) generally holds. There are a lot of other things that should really go into estimating a team's "true" record for predictive purposes, for certain.
I think the schedule is a huge factor--really, the only non-mediocre teams the Pirates play from here on out are the Giants, Dodgers, and Braves, and that's maybe ten games. Plus, the only way the Pirates can see the O's again would be in the World Series, which is nice given the beat-down they suffered. The fact that the talent in the NL is heavily concentrated in the East really works to the Pirates' advantage.
See avatar. Well thought out, Pat. I liked point six. They got real lucky in May, but they don't have to give those wins back. There's no guarantee that equal and opposite luck is destined for the rest of the season, just a performance likely more in line with their true talent, which would still look ok given their record thus far.