MVN - a statistical and sabermetric baseball blog
Statistically Speaking
What run estimator would Batman use? (Part II)
If you haven’t already, I suggest you read Part I first, but it’s not strictly necessary, so long as you have a feel for how run estimators work. Part I goes into a lot of the background of how run estimators work, but there’s not a lot of technical detail.
Now, let’s go ahead and strap some run estimators down to the table, cut them open and see how they work.
Linear weights
First of all, when I refer to linear weights, I should clarify that I use the term to refer to any linear run estimator, not just Pete Palmer’s Linear Weights System. Onward, then.
Simply looking at a linear weights formula should be pretty straightforward. We’ll look at the reduced version of Extrapolated Runs, Jim Furtado’s version of a linear weights formula*:
(.50 * 1B) + (.72 * 2B) + (1.04 * 3B) + (1.44 * HR) + (.33 * (HP+TBB)) + (.18 * SB) + (-.32 * CS) + (-.098 * (AB - H))
Essentially, every event is multiplied by its average run value, based on a certain run context. (In the case of XR it’s team seasons from 1995 to 1997, but you could use any context you wanted. You could put together a linear weights formula for, say, Greg Maddux’s career if you wanted to.)
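For anyone who would rather see that as code, here is a minimal Python sketch of the reduced XR formula. The coefficients come straight from the formula above; the function name, the argument names, and the sample stat line are made up for illustration and don't belong to any real player.

```python
# Reduced Extrapolated Runs (XR) coefficients, copied from the formula in the post.
XR_WEIGHTS = {
    "1B": 0.50,
    "2B": 0.72,
    "3B": 1.04,
    "HR": 1.44,
    "BB_HBP": 0.33,   # HP + TBB
    "SB": 0.18,
    "CS": -0.32,
    "OUT": -0.098,    # AB - H
}


def xr_reduced(singles, doubles, triples, homers, walks_hbp, sb, cs, ab):
    """Estimate runs from a stat line using the reduced XR weights."""
    hits = singles + doubles + triples + homers
    counts = {
        "1B": singles,
        "2B": doubles,
        "3B": triples,
        "HR": homers,
        "BB_HBP": walks_hbp,
        "SB": sb,
        "CS": cs,
        "OUT": ab - hits,  # batting outs, as in the (AB - H) term
    }
    # Every event count is multiplied by its run value and the products are summed.
    return sum(XR_WEIGHTS[event] * counts[event] for event in XR_WEIGHTS)


# Hypothetical stat line, for illustration only.
example_runs = xr_reduced(
    singles=100, doubles=30, triples=3, homers=20,
    walks_hbp=60, sb=10, cs=4, ab=550,
)
print(round(example_runs, 1))
```

Swapping in weights from a different run context only means changing the dictionary; the function itself stays the same.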
This raises the question of how to determine the run value of an event. Looking simply at Runs Batted In won’t help – a single with the bases empty still provides value. So what do we do? Here’s where a concept called run expectancy comes in handy. Every base/out state has a certain run expectancy, which is how many runs, on average, a team scores from that point to the end of the inning. I’m using values from this table by Tango, because they’re already in a nice arrangement. Rows are the occupied bases and columns are the number of outs.
| Runners | 0 outs | 1 out | 2 outs |
| ___ | 0.555 | 0.297 | 0.117 |
| 1__ | 0.953 | 0.573 | 0.251 |
| _2_ | 1.189 | 0.725 | 0.344 |
| __3 | 1.482 | 0.983 | 0.387 |
| 12_ | 1.573 | 0.971 | 0.466 |
| 1_3 | 1.904 | 1.243 | 0.538 |
| _23 | 2.052 | 1.467 | 0.634 |
| 123 | 2.417 | 1.65 | 0.815 |
There’s one case not strictly defined on the table; three outs means a run expectancy of zero.
The linear weights value of an event is the average change in run expectancy produced by that event. Let’s say you have runners on first and second, no outs; that’s an RE of 1.573. A player hits a double, scoring the two runners in front of him:
2 + 1.189 = 3.189
The double scored two runs and leaves the game with an RE of 1.189, for a total RE of 3.189. Subtract 1.573, and you get 1.616, the run contribution of that double. Take the average RE change of every double available in your dataset, and there’s your linear weights value of a double.
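The same bookkeeping is easy to sketch in Python. The run expectancy values below are the ones from the table; the function names and the way a play is represented (state before, state after, runs scored) are my own assumptions for illustration, not a standard play-by-play format.

```python
# Run expectancy by base state (rows) and outs (tuple index 0, 1, 2),
# copied from the table in the post.
RUN_EXPECTANCY = {
    "___": (0.555, 0.297, 0.117),
    "1__": (0.953, 0.573, 0.251),
    "_2_": (1.189, 0.725, 0.344),
    "__3": (1.482, 0.983, 0.387),
    "12_": (1.573, 0.971, 0.466),
    "1_3": (1.904, 1.243, 0.538),
    "_23": (2.052, 1.467, 0.634),
    "123": (2.417, 1.65, 0.815),
}


def run_expectancy(bases, outs):
    """Look up RE for a base/out state; three outs is worth zero."""
    if outs >= 3:
        return 0.0
    return RUN_EXPECTANCY[bases][outs]


def run_value(before, after, runs_scored):
    """Runs scored on the play plus the change in run expectancy."""
    return runs_scored + run_expectancy(*after) - run_expectancy(*before)


def linear_weight(plays):
    """Average run value over a list of (before, after, runs_scored) plays."""
    values = [run_value(before, after, runs) for before, after, runs in plays]
    return sum(values) / len(values)


# The double from the example: runners on first and second, no outs,
# both runners score and the batter ends up on second.
double_value = run_value(("12_", 0), ("_2_", 0), runs_scored=2)
print(round(double_value, 3))
```

Feeding `linear_weight` every double in a play-by-play dataset would give the linear weights value of a double for that run context.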
(There are other ways to estimate linear weight values when you don’t have sufficient data to do the Run Expectancy analysis; an overview of the subject is available.)
Holds, Saves and Blown Saves
Francisco Rodriguez of the Angels, with 54 saves in 59 opportunities, is on his way to breaking the all-time single-season record of 57, set by Bobby Thigpen of the White Sox in 1990. Percentage-wise, the Phillies’ Brad Lidge is perfect, with 33 saves in 33 opportunities. On the opposite end, there are records such as those of Aaron Heilman of the Mets, 3 for 7 this year and 9 for 33 since 2004. It’s obvious Heilman can’t close games, with a record like that. No wonder Willie Randolph got fired. Right? Wrong!
Saves have become a statistic whose leaders are as well known to the casual fan as the home run leaders, and save percentage is one of the simplest computations in baseball statistics, but it has always contained an error that grossly distorts the value of middle relievers to the general public. It is easy to understand that the setup man isn’t in a position to get many saves, but save percentage has been held up by many, including the media, as evidence that certain pitchers routinely fail when handed a save situation, proof that they can’t handle the closer role.
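To make the "simplest computation" concrete, here is a small Python sketch of the conventional save percentage, saves divided by save opportunities, applied to the records quoted above. The function name and dictionary labels are mine; this is the naive calculation under discussion, not a corrected version.

```python
def save_percentage(saves, opportunities):
    """Conventional save percentage: saves divided by save opportunities."""
    if opportunities <= 0:
        raise ValueError("need at least one save opportunity")
    return saves / opportunities


# (saves, opportunities) as quoted in the post.
records = {
    "Rodriguez, this year": (54, 59),
    "Lidge, this year": (33, 33),
    "Heilman, this year": (3, 7),
    "Heilman, since 2004": (9, 33),
}

for label, (saves, opportunities) in records.items():
    print(f"{label}: {save_percentage(saves, opportunities):.3f}")
```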
World Famous StatSpeak Roundtable: September 3
Our humble round table welcomes a new guest knight. Please welcome to this week’s version of the roundtable, Will Carroll of Baseball Prospectus. Will has been kind enough to join us here on StatSpeak for a record-setting five-person roundtable. He joins us in a discussion of the ghosts of trade deadline deals past, injuries and Sabermetrics, C.C.’s sorta no-hitter, instant replay, and who will be looking in from the outside on the AL playoffs in October.
Question #1: When I started doing “Under the Knife” seven years ago, there were no stats and people didn’t think that injuries and sabermetrics went together. I’m still not sure they do, but to me, it’s about information. You guys are stats guys — how would you go about mixing the two?
Will Carroll: I think it comes down to a bit of luck. Is it someone getting hot and carrying the team? Is it an injury that costs them a premier player for a couple weeks or worse? I know that luck is probably the worst thing to say on a site like this but I think it’s the best way to say that small things make a huge difference and I’m not sure which ones. I think we get lost in this fog because we’re seeing quantifiable effects but in such small quantities that we don’t notice, things that amount to 0.1 runs or less, but enough of them that they add up.
Brian Cartwright: Well, my day job is in data processing, which includes designing methods of data collection. So one of my current projects is designing a comprehensive database that hopefully will include everything we can get our hands on, from season stats and play by play to transactions and injuries, as opposed to narrowly constructed ad hoc databases. I’d like to be able to look at the pre-injury data and see if there are any indicators, such as simple-to-derive stuff like lists of pitchers headed to a Verducci Effect (and then test how true it is). Post-injury, I’d like to be able to see how well players recover from various types of injuries.
I know Will has done much of this on his own, but I’d like to see the injury data married to the stats and projections to enable more of us to do these kinds of studies.
Colin Wyers: That’s sort of the unexplored frontier of sabermetrics - introducing traditional sorts of data into our models. What’s lacking right now is a good record of who got injured, where, and how. I don’t know if we’ll ever get to that point, but people like Tom Ruane of Retrosheet are working on that sort of data - and all of us who research baseball owe the folks of Retrosheet a huge debt.
Eric Seidman: A fusion of injuries and sabermetrics is something I have actually discussed with Will on numerous occasions because now, with Pitch F/X data in full bloom, there are certain avenues we can explore. For instance, one idea of Will’s (that I wholeheartedly support) is that pitchers that are on the verge of injury will have consistent release points with inconsistent results. Before, this really could not be studied, but now it can. We can run analyses to see which pitchers fit the bill. Or, if someone is experiencing a “dead arm” we can look to their movement. Stats cannot tell us everything about injuries, but just like all other aspects of analysis, the combo of numbers and scouting will ultimately prove to be key in this combination.
Pizza Cutter: I don’t think that the two are opposed at all. I do agree that injury analysis isn’t really something that fits nicely into any of the Sabermetric models that we have now, but that’s more of an engineering problem. To really pursue this line of study, one would have to be familiar with bio-mechanics and statistics, plus have a fairly extensive injury database handy. (So basically, you, Will.) Even at that point, there’s going to be a lot of statistical noise. Suppose that Larry has an elbow problem and goes on the 15 day DL. Even if we assume that we know exactly when he was hurt (and when it started hurting his performance), we’ll never really know how hurt he was. How can we tell if it’s not just him having a bad string of luck? Maybe with a big enough sample, we can detect a signal, but it’s going to be hard to find. Calculating the complete absence of a player is fairly easy. Calculating what it means to have a player at 80% is a lot harder.
The other side of the Sabermetric-injury nexus is predicting who’s an injury risk. My guess is that some team (or several) out there hired an actuary to study just that and they’re keeping it close to the vest. (Can’t blame them.) Plus, with many teams already insuring contracts, someone out there in the insurance industry must be running some sort of tables.
Surprise! Kelly Johnson has gotten better this year
Recently, there was a note at the ever-excellent MLB Trade Rumors which said that the Atlanta Braves were likely looking to shop second baseman Kelly Johnson in the off-season. The post noted that Johnson’s offensive production had declined this year, and the Braves do have fashion designer Martin Prado ready to play second next year. I don’t mind the thought that the Braves might think Prado the better option. He strikes out much less than does Johnson, although Prado seems to have a bit less power. The part that I object to is the thought that Kelly Johnson is actually “losing it” this year.
Certainly, Kelly Johnson’s performance has suffered. Last year, his slash line of .276/.375/.457 was rather nice for a second sacker. This year, Johnson has slipped a little with a slash line around .260/.335/.400. Not bad, but not what Braves fans were hoping for. So Johnson must be losing his mojo, right? Not necessarily. In fact, I’d say that Johnson has actually gotten better this year. How does a player drop 80-90 points worth of OPS and become better? Read on.
First, let’s look at Johnson’s swing and plate discipline profile. What’s important to know is that things involving plate discipline and swinging are the least given to variation over time. It makes sense, because players are the ones who decide whether or not to swing the bat. Hitting a home run requires cooperation of the pitcher, ball, and occasionally, wind. This year, Johnson, a man with a strikeout problem and a rather pedestrian contact percentage (around 80-81%, which is around the league median), actually started swinging more. And that’s a good thing. In 2007, on my twin measures of plate discipline, Johnson had a response bias rating of 0.84. Now, response bias is a measure of how likely a player is to swing. The ideal number is 1.00, because it minimizes the number of strikes that a player piles up, given whatever abilities he has on the other measure, sensitivity. A number over 1.00 means that a player is swinging too much. Under 1.00 means the player is swinging too little. In 2007, Johnson’s problem was that he was taking too many pitches. This year, Johnson took a step toward fixing that.
My measure suggested that Johnson would benefit from swinging more, and he has done so. Last year, Johnson swung at 39.3% of pitches. This year, he’s been up around 45%. (Maybe he reads StatSpeak?) His strikeout rate has dropped (although only about a percentage point) in response. Swinging more also drove down his walk total, but it meant that he was putting more balls into play. So, let’s look there.
In general, a batter has pretty good control over what type of batted ball he puts into play. The rates at which a batter hits grounders, flyballs, line drives, or popups have pretty good reliability, so changes in them are generally not random in nature, but reflect a change in either talent level or approach. What happens to those batted balls is another matter. More on that in a minute. Johnson’s LD/GB/FB profile went from 18.8%/42.7%/38.5% last year to something around 23%/38%/39% this year. His flyballs are staying steady, but he’s turning some of his ground balls into line drives. That’s good, because a line drive (which doesn’t leave the yard) has about a 73% chance of going for a base hit, while a grounder has a 24% chance. Line drives are good.
The fine folks over at FanGraphs are fond of using xBABIP for hitters. Given a batter’s batted ball profile, we can get some sort of idea of what we might expect his BABIP to be (hence xBABIP). The formula that FanGraphs uses is .15 * FB% + .24 * GB% + .73 * LD%. Last year, Kelly Johnson’s xBABIP was around .290. His actual BABIP was .330. Johnson did 40 points better than expected given his batted ball profile.
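As a quick check, here is the xBABIP formula quoted above as a Python sketch, fed with the LD/GB/FB profiles given earlier in the post. The function and variable names are my own, and the rates are entered as fractions rather than percentages.

```python
def xbabip(ld, gb, fb):
    """Expected BABIP from batted-ball rates, using the weights quoted in the post."""
    return 0.15 * fb + 0.24 * gb + 0.73 * ld


# Kelly Johnson's batted-ball profiles as quoted in the post (LD / GB / FB).
xbabip_last_year = xbabip(ld=0.188, gb=0.427, fb=0.385)
xbabip_this_year = xbabip(ld=0.23, gb=0.38, fb=0.39)

print(f"last year: {xbabip_last_year:.3f}")
print(f"this year: {xbabip_this_year:.3f}")
```

This works out to about .297 for last year’s profile and about .318 for this year’s, in line with the rounded figures used in the text.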
The next question is whether that ability to “outhit” the expectation is something that is luck or skill. As is my custom, I took four years’ worth of data (2004-2007) and calculated the xBABIP and the actual BABIP for all players, and found the difference between the two (whether they over- or under-performed). It’s possible that some players just hit line drives or ground balls that are harder to catch than others. If that’s the case, then we should see consistency over those four years in which players over-perform and which ones under-perform. To test this, I used my favorite device, the intra-class correlation (shot!). The result was an ICC of .27 or .28, depending on how much I restricted the sample by the minimum number of PA required.
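For readers who haven’t met the intra-class correlation, here is a rough Python sketch of one common version, the one-way random-effects ICC for a balanced table (one row per player, one column per season). The post doesn’t say which ICC variant was used, so treat this form as an assumption; the sample numbers are invented purely to show the call.

```python
from statistics import mean


def icc_oneway(rows):
    """One-way random-effects ICC for a balanced table.

    rows: one list per player, each holding the same number of seasons of
    (actual BABIP - xBABIP) differences. Undefined if every value is identical.
    """
    n = len(rows)        # number of players
    k = len(rows[0])     # seasons per player
    if n < 2 or k < 2 or any(len(r) != k for r in rows):
        raise ValueError("need a balanced table with at least 2 rows and 2 columns")

    grand_mean = mean([v for r in rows for v in r])
    row_means = [mean(r) for r in rows]

    # Between-player and within-player mean squares from a one-way ANOVA.
    ms_between = k * sum((m - grand_mean) ** 2 for m in row_means) / (n - 1)
    ms_within = sum(
        (v - m) ** 2 for r, m in zip(rows, row_means) for v in r
    ) / (n * (k - 1))

    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Invented BABIP-minus-xBABIP differences for three players over four seasons.
sample_diffs = [
    [0.040, 0.010, -0.005, 0.020],
    [-0.020, 0.015, -0.010, 0.000],
    [0.005, -0.015, 0.025, -0.030],
]
print(round(icc_oneway(sample_diffs), 3))
```

A value near 1 would mean the same players over-perform year after year; a value near 0 means the season-to-season differences look mostly like noise.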
That means that there is a little bit of skill involved in over- or under-performing one’s xBABIP, although there’s a good deal more luck in there than one might expect. Looking at it from an R-squared perspective, it’s more than 90% luck (or more properly, unexplained). It’s not quite the level of non-correlation found in BABIP for pitchers, but it’s closer to that area than to the “three true outcome” neighborhood. Perhaps it’s time for DIBS.
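The arithmetic behind the “more than 90%” figure can be sketched as below, assuming the ICC is read like a correlation coefficient and squared to get a share of variance explained. That reading is my assumption about how the number was reached, and the function name is made up.

```python
def unexplained_share(r):
    """Share of variance left unexplained when r is squared like a correlation."""
    return 1.0 - r ** 2


# The two ICC values reported in the post.
for icc in (0.27, 0.28):
    print(f"ICC {icc:.2f}: {unexplained_share(icc):.1%} unexplained")
```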
Going back to Johnson, it means that it’s likely that most of Johnson’s over-performance in the BABIP area was due to chance. I haven’t run the numbers, but I’m guessing that expected BABIP is going to be a better predictor of future results than is actual BABIP. Now, in 2007, Johnson’s expected BABIP was .290. This year, it’s around .315 (more line drives!), which is what his actual performance has been. All performance is talent plus luck. So in reality, Johnson’s numbers from last year, which were fueled mostly by that high BABIP, were mostly a matter of luck. This year, he hasn’t had good or bad luck, but the underlying talent seems to have improved. Atlanta’s management might be confusing luck with skill.
The one concerning piece about Johnson’s statline is the drop in HR/FB. HR/FB is a statistic that is mostly in the batter’s control, and his drop from 10.3% to around 7% is a little concerning. His flyball percentage hasn’t changed much from last year… they’re just not leaving the park as much, so perhaps there’s a power outage in there somewhere.
With that said, Johnson isn’t exactly a world-beater. Now that his luck has stabilized, we’re getting a pretty good idea of what he’s really capable of. According to VORP he’s in the bottom half of “regular” second basemen in all of baseball, among such luminaries as Joe Inglett, Clint Barmes, and Mark Grooz, Grudsil, oh, you know who I’m talking about. He strikes out way too much for a guy who doesn’t put up massive HR numbers. My OPA! fielding system has him rated as a boring old average second baseman in the field. So, while I can’t fault the Braves if they think they have a better option, I’d caution them to be a little more careful in how they make that decision. Kelly Johnson is a symptom of a much bigger problem: the need to understand the separation between talent and performance. He’s actually gotten better this year, despite what it looks like.







