De civitate sabermetricarum: Oh, the Humanities

As I tweeted a week or so ago, this was a good season for the part of me that is a Tigers fan to miss. I have been dealing with a return of my wife's cancer (the outlook is not great but, as the last lines of the original theatrical release of Blade Runner go, "I didn't know how long we had together... Who does?"), in addition to moving house (and changing countries). However, I accumulated a few bookmarks and other ideas to work through, especially now that we can only watch other teams in the post-season.

While I was busy, a very important blog post was made back in May. Phil Birnbaum, who is nothing if not insightful in writing about sabermetrics, announced that dWAR, a measure of fielding value, seemed to him to have a significant problem. Birnbaum proposed that dWAR inherently overvalued fielding. Birnbaum's argument is rooted in mathematical accuracy, so I don't feel confident trying to explain it. If you haven't read the post already, you should go to his blog to read how he explains it.

However, his explanation boils down to three key points, if we focus on the effects:

a) The runs allocated to the fielders under dWAR are too high, by something on the order of fifty percent. (So a team dWAR of -40 is actually more like -20.)

b) The cause of this is that when one assumes "certain balls in play are the same" (as one has to do with older baseball statistics), the math sends all the credit to the fielders.

"Observations are a combination of talent and luck. If you want to divide the observed balls in play into observed pitching and observed fielding, you're also going to have to divide the luck properly."

Here, I think, we run into the problem of "All things being equal", or the distinction that the philosopher of history R.G. Collingwood made between meteorology and chemistry. It is an essential fact of human life that all things are NOT equal. People working in meteorology can collect observations of events, but cannot reproduce them at will, unlike people working in chemistry. By contrast, the historian can observe events, but they cannot create political or social crises at will, nor send qualified observers back into the past in order to collect the information needed to understand those events in the way scientists might send an expedition to view an eclipse or collect specimens. In scoring a baseball game, at best a sabermetrician can be a weatherman.

One can take issue with the statement "the assassination of Archduke Franz Ferdinand of Austria on 28 June 1914 triggered the First World War" as one of causality, but without doubt the shooting set off a diplomatic crisis that led to the war. More importantly, luck played a crucial role in the event because the Archduke's car came to a complete halt very close to where the "Yugoslav nationalist" Gavrilo Princip had stationed himself. An earlier attempt to kill the Archduke in a moving car had failed. We have no idea whether Princip could have been successful if his targets had been in a moving car. So, what percentage of responsibility for the war do we assign to Princip, to the driver, to the governor of Bosnia at whose orders the driver stopped, to the Serbian officers who conspired to arm Princip, to the Archduke or to the general diplomatic situation? Any formula that did allocate "responsibility shares" to these people would be essentially an act of faith.

Birnbaum went on to add some further details to his understanding in a thread on the blog of Tom Tango, the tremendously influential pseudonymous saberist. In the comments section of Tango's thread on the post by Birnbaum, Birnbaum suggested in one reply that it was just not possible for a system like Defensive Runs Saved or Ultimate Zone Rating to make distinctions about balls in play that could tell us something about the skill of the fielder. But before that he stated that he wanted to assign the luck to the pitcher. However, to read the comments there is to venture into a world where something like "responsibility shares" is thought to be possible. Possibly, with enough computing power, such things can be made for evaluating baseball players. But I can't help but think the effect will be small.

To boil Birnbaum's position down, what he thinks is that about half of the dWAR effects at the team level need to be transferred from the fielders to the pitchers. Another way to think about it is that he wants a cap on the amount of Runs Allowed value distributed to the fielders. But this would also have effects on how we value players. A quick-and-dirty method would be to halve the UZR assigned to any player when calculating their WAR, although I suspect Birnbaum would object on the grounds that something true at the team level may not be true at the level of the individual player.
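For anyone who wants to see the quick-and-dirty halving as arithmetic, here is a rough sketch in Python. It is an illustration under my own assumptions, not Birnbaum's method or any site's actual WAR formula: the function name `adjusted_war`, the `fielding_share` of 0.5, and the rule-of-thumb conversion of 10 runs per win are all hypothetical choices made for the example.

```python
def adjusted_war(war, uzr_runs, fielding_share=0.5, runs_per_win=10.0):
    """Return a WAR figure with only part of the fielding runs credited.

    war            -- the player's published WAR (assumed to include UZR in full)
    uzr_runs       -- fielding runs above/below average already counted in war
    fielding_share -- fraction of fielding runs to keep (0.5 = halve them; assumption)
    runs_per_win   -- runs-to-wins conversion (10 is a common rule of thumb; assumption)
    """
    # Runs we are taking away from (or giving back to) the fielder.
    removed_runs = uzr_runs * (1.0 - fielding_share)
    # Convert those runs to wins and subtract them from the published total.
    return war - removed_runs / runs_per_win


# A good fielder loses credit; a poor fielder is penalised less.
print(adjusted_war(5.0, 20.0))   # hypothetical 5-WAR player with +20 fielding runs
print(adjusted_war(2.0, -10.0))  # hypothetical 2-WAR player with -10 fielding runs
```

The sketch says nothing about where the removed runs should go. On Birnbaum's argument they would belong to the pitchers, and whether a flat halving is fair to any individual fielder is exactly the objection I would expect him to raise.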


Thursday 22 October 2015
