Archive for the ‘Statistical Analysis’ Category

His apple stats say a lot about his oranges.

Friday, August 22nd, 2008

In Jayson Stark’s chat today, someone asks about Pujols’ MVP candidacy, and Stark says this:

Albert Pujols is having an amazing year, on every level. One stat I keep checking on Albert is: Percent of pitches thrown to him that are in the strike zone. He’s still under 50 percent — and he’s slugging over .600! No team wants to pitch to him in a big spot, or let him beat them, and he continues to find ways to keep that offense rolling.

Okay, to begin with, two disclaimers: 1) I support Albert’s MVP candidacy wholeheartedly (he’s meant more to his team’s success than any player since the ’01-’04 Bonds); and 2) his strike percentage is 54%, so I’m not sure what Stark is talking about. But in any case, citing strike percentage vs. slugging percentage as a measure of a hitter’s ability seems bizarre to me. They don’t have anything to do with each other.

Remember that slugging percentage is measured only in reference to a player’s At Bats—which always end in a strike—and not Plate Appearances, which can end in either a strike or a ball. In other words, every time Albert walks, it decreases the percentage of strikes he sees but has no impact at all on his SLG. Moreover, remember that the hitter has some control over how many strikes he sees (if he swings at a ball out of the zone, it’s still recorded as a strike), and that not all strikes are created equal (fouls on two-strike counts, balls put in play, etc.). If anything, Pujols does his part to “keep that offense rolling” by remaining patient and letting the opposing pitcher walk him—these walks often lead to runs later; see my earlier post on the impact of Pujols’ underappreciated walks.
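To make the At Bat versus Plate Appearance point concrete, here is a minimal Python sketch. The stat line (at bats, total bases, pitch counts) is invented purely for illustration, and the function names are hypothetical; the only rule taken from the paragraph above is that a walk adds pitches but no At Bat.

```python
def slugging(total_bases, at_bats):
    """SLG = total bases / at bats (walks are not at bats)."""
    return total_bases / at_bats


def strike_pct(strikes, pitches):
    """Share of all pitches seen that were recorded as strikes."""
    return 100 * strikes / pitches


# Hypothetical hitter before a walk: 10 AB, 6 total bases, 40 pitches, 24 strikes.
at_bats, total_bases, pitches, strikes = 10, 6, 40, 24
before = (slugging(total_bases, at_bats), strike_pct(strikes, pitches))

# A four-pitch walk adds 4 balls: pitches go up, at bats and strikes do not.
pitches += 4
after = (slugging(total_bases, at_bats), strike_pct(strikes, pitches))

print(f"SLG before: {before[0]:.3f}, after: {after[0]:.3f}")
print(f"Strike % before: {before[1]:.1f}, after: {after[1]:.1f}")
```

In this made-up line the walk pulls the strike percentage down while SLG does not move at all, which is why the two numbers cannot be read against each other.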

In the broader perspective, I’m not sure that percentage of strikes seen tells us much about a hitter’s ability. As an example, let’s take a look at a game-in-the-life of two hypothetical players A and B, both of whom go 1-3 with a double and a walk.

In his first plate appearance, Player A gets a two-strike count, fouls off a pitch, then strikes out looking. In his second PA, he strikes out on three pitches. In his third PA, he goes 0-2, then lines the next pitch for a double, and in his fourth PA, he walks on a full count.

By contrast, in his first PA, Player B swings on a 3-0 count and pops out. In his second PA, again on a 3-0 count, he grounds into a double play. Swinging on 3-0 yet again, in his third PA he doubles. Finally, he walks on four straight pitches in his last PA.

Here’s how their balls and strikes break down:

Player  PA 1   PA 2   PA 3   PA 4   Total strikes  Total pitches  % of strikes
A       0B/4S  0B/3S  0B/3S  4B/2S  12             16             75%
B       3B/1S  3B/1S  3B/1S  4B/0S  3              16             18.75%

Player A saw 75% strikes, while Player B saw 18.75%—but who had the better day? Sure, Player A was terribly impatient while Player B needs to learn to keep his bat on his shoulder on 3-0 counts, but I think it’s hard to argue either one of them did better than the other.
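For anyone who wants to check the arithmetic, here is a small Python sketch that recomputes the totals from the per-PA ball and strike counts in the table above. The function name and data layout are my own illustrative choices.

```python
# (balls, strikes) per plate appearance, copied from the table above.
games = {
    "A": [(0, 4), (0, 3), (0, 3), (4, 2)],
    "B": [(3, 1), (3, 1), (3, 1), (4, 0)],
}


def strike_summary(plate_appearances):
    """Return (total strikes, total pitches, strike percentage)."""
    strikes = sum(s for _, s in plate_appearances)
    pitches = sum(b + s for b, s in plate_appearances)
    return strikes, pitches, 100 * strikes / pitches


summary = {player: strike_summary(pas) for player, pas in games.items()}
for player, (strikes, pitches, pct) in summary.items():
    print(f"Player {player}: {strikes} strikes / {pitches} pitches = {pct:.2f}%")
```

Both players see the same 16 pitches and finish with the same 1-for-3, double-and-a-walk line, yet their strike percentages land at opposite ends of the scale.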

Now of course, this comparison is deliberately artificial, but I’m just trying to illustrate my point that strike percentage may not actually say much about a hitter’s performance. As another illustration, here are the top-10 Major League OPS leaders this year, together with their strike percentage:

  • Albert Pujols: 54%
  • Milton Bradley: 56%
  • Chipper Jones: 54%
  • Matt Holliday: 60%
  • Lance Berkman: 57%
  • Ryan Ludwick: 62%
  • Carlos Quentin: 61%
  • Alex Rodriguez: 61%
  • Manny Ramirez: 60%
  • Kevin Youkilis: 63%
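As a rough way to see how tightly these ten numbers cluster, the following Python sketch computes the group's average and range. The figures are copied from the list above; the variable names are my own.

```python
from statistics import mean

# Strike percentages for the ten OPS leaders, as listed above.
strike_pct = {
    "Albert Pujols": 54,
    "Milton Bradley": 56,
    "Chipper Jones": 54,
    "Matt Holliday": 60,
    "Lance Berkman": 57,
    "Ryan Ludwick": 62,
    "Carlos Quentin": 61,
    "Alex Rodriguez": 61,
    "Manny Ramirez": 60,
    "Kevin Youkilis": 63,
}

average = mean(strike_pct.values())
lowest = min(strike_pct.values())
highest = max(strike_pct.values())

print(f"Average: {average:.1f}%  Range: {lowest}% to {highest}%")
```

The whole group sits within nine percentage points, with an average just under 59%.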

As we see, Albert and Chipper have the lowest strike percentages in this group, but not dramatically so. So I remain unconvinced that Pujols’ strike percentage is a measure of his (immense) value as a hitter. But I’m open to hearing competing perspectives, so bring ’em on.

Unrecognized contributions

Monday, August 11th, 2008

In last night’s 4-2 victory over the Marlins, Albert Pujols went 0-for-3 with three strikeouts and one walk. Definitely not a great night, and one which hurt him in all of the major rate stats while contributing nothing to almost all of his counting stats (BB being the lone exception).

And yet the numbers don’t tell the whole story. His walk came in the first inning, after Felipe Lopez hit a one-out single. Lopez, now on second, advanced to third on Ryan Ludwick’s fly-out to center and then scored on Rick Ankiel’s infield single. In the box score, then, when distributing credit for contributing to this run, Lopez increased his run total, Ankiel added to his RBI, and Ludwick received recognition for having made a so-called “productive out”.

Pujols receives no credit at all for having contributed to that run, yet his efforts were crucial to its creation. If he had made an out, even if in doing so he advanced Lopez to second, Ludwick’s fly ball would have ended the inning with no runs scored (remember there was one out already when Lopez singled). But let’s even pretend for the moment that Pujols’ plate appearance simply didn’t happen, that the Cardinals somehow skipped over his turn in the order. Then Lopez would still have been on first when Ludwick flew out, in which case it’s very unlikely he (Lopez) would have advanced to second. And even supposing he did advance, he almost certainly would not have scored from second on an infield single, and then Yadier Molina’s pop-out would have ended the inning, again without Lopez scoring. In other words, if Pujols doesn’t walk (or get a hit), Lopez doesn’t score. Yet while Lopez, Ludwick, and Ankiel receive credit for having contributed to creating the run, Pujols gets none.

I need to draw a distinction here between stats which record a player’s purely individual performance and stats which mark his contribution to run production. In their stat lines, all four players receive their due for their individual performance: Lopez and Ankiel with singles (which are reflected in their Hits, Total Bases, AVG, OBP, and SLG), Ludwick for his fly-out (which affects his AVG, OBP, and SLG), and Pujols with his walk (as shown in his BB and OBP). But these stats (BB, Hits, Total Bases, AVG, OBP, SLG) do not directly address run production. Obviously, they correlate strongly with scoring runs, but they do not measure it directly; a player can, for example, improve all of them in a game in which his team is shut out. This is in contrast to Runs and RBI, which by definition are credited only when actual runs are scored.
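As a rough illustration of the individual rate stats named above, the sketch below computes them for Pujols’ line from this game (0-for-3 with one walk). The formulas are the standard definitions, hit-by-pitch and sacrifice flies are assumed to be zero, and the function name is hypothetical.

```python
def rate_stats(at_bats, hits, total_bases, walks, hbp=0, sac_flies=0):
    """Return (AVG, OBP, SLG) for a batting line using the standard formulas."""
    avg = hits / at_bats
    obp = (hits + walks + hbp) / (at_bats + walks + hbp + sac_flies)
    slg = total_bases / at_bats
    return avg, obp, slg


# Pujols' line from the game described above: 3 AB, 0 H, 1 BB.
avg, obp, slg = rate_stats(at_bats=3, hits=0, total_bases=0, walks=1)
print(f"AVG {avg:.3f}  OBP {obp:.3f}  SLG {slg:.3f}")
```

Only OBP registers the walk, and none of the three numbers records that the walk helped produce a run.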

Of course, this is precisely why sophisticated analysis of an individual player’s performance discards such stats as Runs and RBI: they are so heavily team-dependent. Instead, such analysis focuses exclusively on individual achievements (such as hits, walks, etc.) and then tries to correlate these individual achievements as closely as possible to run creation. Hence such stats as Runs Created, VORP (Value Over Replacement Player), and EqAvg (Equivalent Average).

These measures and others like them are wonderful advances in our understanding of baseball, and I do not in any way mean to malign them. Nevertheless, baseball is, at the end of the day, a team game, and not just an aggregate of individual performances. To the extent that the structure of baseball games allows individual achievement to correlate strongly with team achievement, it lends itself to the statistical analysis of individual performance much more than, say, football does. Yet despite these sophisticated advances, sometimes the contributions an individual makes to his team’s success can still slip beneath our notice.

To be sure, Albert Pujols and his contributions to the Cardinals’ success hardly slip beneath anyone’s notice; he certainly receives plenty of credit, and justly so. Still, in a game like last night’s, when Pujols seems to have done almost nothing but strike out, his lone walk in the first inning contributed vitally to the creation of a run, which, in a 4-2 victory, accounted for half of the margin of victory. I suspect that a closer analysis would reveal that he does even more for the Cardinals than we often recognize.