Blind Sight: January 2010

Friday, January 22, 2010

Grey is in...

Here's a great opinion piece from the New York Times that approximates what I've been driving at here. Must There Be a Bottom Line? is a review of a book by philosopher Barbara Herrnstein Smith, who argues that science and religion constitute distinct, but not necessarily incompatible, epistemological systems.

Insofar as we think of these systems as representing an "underneath-it-all" reality, they are certainly mutually exclusive. But Smith argues that this type of thinking is not particularly productive (or correct). Instead, it is worthwhile to think of science and religion as providing different kinds of services, equally valuable in their respective markets.

This sort of pragmatism appears to be gaining force these days. While Smith's argument is sure to upset a collection of stuffy folk committed to a black-and-white portrayal of reality, her book likely speaks to many others who recognize that such rigidity is futile. Our world is perpetually in flux; capturing its reality requires a flexible mind.

Incidentally, mental elasticity is also good for your health. Confronting one's own conceptions of reality bolsters new neuronal connections and helps individuals maintain a sharp mind throughout their golden years. Jack Mezirow from Columbia University offers that such mental confrontations, or "disorienting dilemmas," are the essential nutrients for cognitive development in aging (see more).

Friday, January 15, 2010

Halak vs Price: The Definitive Test…[pause]…NOT!

This post is a slight departure from the content of my usual murmurings on religious faith. But because hockey has, arguably, achieved religious status in Montreal, I feel that the full-blown quackery that often surrounds the Canadiens (and especially their goaltenders) warrants some attention here.

There’s been a lot of debate (nay, controversy) recently concerning the Habs’ goaltending: Who’s better, Halak or Price?

The prevailing view (at least, lately) seems to be that Halak is the better goaltender. His record speaks for itself. This season, he’s got more wins (12 vs. Price’s 10), has relinquished fewer goals per game (2.46 vs. Price’s 2.67), and sports a better save percentage (.927 vs. Price’s .915).

Others have argued that Price remains the better goalie, and that Halak’s success is owed to extraneous variables, such as the team’s improved performance when he takes to the crease. (For an extensive treatment of this argument, check out this blog.)

I intend to shed some light on the debate by following in the tradition of the authors of Freakonomics and The Wages of Wins, who challenge conventional wisdom by uncovering the hidden story buried in the data.

The conventional wisdom in this case is that the number of wins (W), goals-against average (GAA), and save percentage (SV%) are the best way to measure Price’s and Halak’s relative value to Nos Glorieux.

If this is the case, Halak definitely reigns supreme. Yet, as argued by the Jesus Price faithful, there are other variables that muddle the story. For one, Price appears to have faced tougher teams.

To test this theory, I decided to run a regression analysis, examining the performance of both goaltenders this season as a function of the strength of their opponents. (I will describe the details and rationale of this analysis shortly.) What I discovered was that, indeed, Halak has been the better netminder this season, but not to the extent that the number differentials suggest. Had the roles been reversed – Price playing Halak’s games and vice versa – Halak would have emerged only marginally better.

The challenge of uncovering the “hidden story behind the data” is considerable, perhaps even insurmountable. Indeed, after extensive discussions with my brother Dave (Halak supporter) and roommate Joey (Price supporter), I discovered many limitations in the analysis that I am about to waste my time describing. But I believe that this analysis (albeit oversimplified) offers a glimpse of the hypothetical. In order to appreciate the relative merit of two goaltenders, one must get a sense of how things might have been had their roles been reversed. This is what regression provides. Whether I’ve chosen the right variables to go into the analysis remains an open question.

Regression Analysis

Rationale

Goaltender performance. I decided to evaluate the goalies’ performance with save percentage alone, because I believe that it is the most direct measure of a goaltender’s strength. Both Ws and GAA are contingent on the other players on the ice. Also, rather than looking at SV% (which is calculated by totaling the saves a goalie makes throughout the season and dividing that by the total shots faced), I will use SV%/G, which is the average of the SV% recorded in each game (not the total SV% accrued throughout the season). These two values are different. I chose the latter stat (SV%/G) because I am interested in examining how each goalie’s performance changes from one game to the next (SV% obscures this information).
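To see why the two stats differ, here is a minimal sketch in Python. The game logs are made up purely for illustration (they are not Price’s or Halak’s numbers), and the function names are my own.

```python
# Hypothetical game logs as (saves, shots faced). Illustrative only, not real data.
games = [(28, 30), (15, 20), (40, 42)]

def pooled_sv_pct(games):
    """Season SV%: total saves divided by total shots faced."""
    total_saves = sum(saves for saves, _ in games)
    total_shots = sum(shots for _, shots in games)
    return total_saves / total_shots

def per_game_sv_pct(games):
    """SV%/G: compute the save percentage of each game, then average the games."""
    return sum(saves / shots for saves, shots in games) / len(games)

pooled = pooled_sv_pct(games)
per_game = per_game_sv_pct(games)
print(f"SV%   = {pooled:.3f}")
print(f"SV%/G = {per_game:.3f}")
```

In this made-up example the rough 20-shot night counts as much as the two busier games under SV%/G, so SV%/G comes out lower than the pooled SV%.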

Opponent formidability. Similarly, I decided to evaluate the opponent team’s strength based on their scoring percentage (the number of goals scored per shots taken) to date (January 14, 2010). This is generally a bad measure of a team’s overall strength, but I contend that it is the stat that best characterizes the threat a team presents to a goalie. To illustrate this point, the #1 team in the NHL right now is New Jersey, and the 23rd ranked team is Atlanta. While New Jersey has a scoring percentage of .092 (scoring on 9.2% of their shots), Atlanta has a scoring percentage of .11 (scoring on 11% of their shots). This season, New Jersey has scored 2 goals per game against the Habs, while Atlanta has scored 3.25. So while NJ is the better team, Atlanta carved a greater dent in the Montreal crease.

Research questions.

To what extent are Halak’s and Price’s performances associated with the formidability of their opponents? Does Price tend to play better against a team with a weaker scoring %? Does Halak?
Given these associations, would Price have a better SV% if he played the teams that Halak faced and vice versa?

Results

The table below depicts the variables of interest, including goaltender performance (SV%/G) and opponent formidability (Scoring %).

It is immediately apparent that Price has faced the tougher opponents. Halak’s opponents have scored on 8.9% of their shots this season, while the teams Price faced have scored on 9.3% of their shots this season. Furthermore, Halak’s teams’ average rank is 18th (rounded) in the NHL, while Price’s is 15th. While these differences are small, they may actually confer an appreciable advantage to Halak.

In order to evaluate whether Halak has enjoyed the advantage of the weaker opponent, we must determine how well Price would have performed against Halak’s teams (and vice versa), which can be assessed using regression analysis.

I’m including an explanation of how regression works next. But feel free to skip ahead if the technicalities don’t interest you.

[What is regression? A regression analysis measures the degree of association between two variables and provides a “line of best fit” that characterizes this association. The line depicts how much change in one variable (e.g., feeling full) occurs as a function of change in another (e.g., amount of consumption). These two variables (amount of consumption and feeling full) are presumably highly correlated in a positive direction. As the amount of food consumed rises, so too will the sensation of being full. Variables can also be negatively correlated, such as the association between consumption and hunger – the more we eat, the less hungry we become. Once you have measured two variables, you can determine how strongly they are correlated (whether positively or negatively) and you can use the line of best fit, and its corresponding equation, to determine a score on variable Y (e.g., feeling full) based on a score on variable X (e.g., consumption). Thus, by knowing how much food a person has consumed, we can use the regression equation to predict how full and hungry the person feels.]

Regression analysis will allow us to determine the equation of the line that best characterizes the degree to which the opposing team’s scoring percentage (SC%) is associated with a goalie’s SV%/G. Once measured, we can use the equation to predict what Price’s SV%/G would have been had he faced Halak’s teams (and vice versa).
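For the curious, a line of best fit can be computed by hand with ordinary least squares. The sketch below is only an illustration of the technique: the data points are invented, the function name fit_line is mine, and it is not necessarily how the numbers in this post were produced.

```python
def fit_line(xs, ys):
    """Ordinary least squares with one predictor.

    Returns (slope, intercept, r_squared) for the line y = slope * x + intercept.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)  # share of the variation in y explained by x
    return slope, intercept, r_squared

# Invented values: opponent SC% (x) and a goalie's single-game SV% (y).
opp_sc = [0.080, 0.085, 0.090, 0.095, 0.100, 0.110]
sv_per_game = [0.940, 0.900, 0.930, 0.890, 0.910, 0.870]

slope, intercept, r_squared = fit_line(opp_sc, sv_per_game)
print(f"SV%/G = {slope:.4f} * SC% + {intercept:.4f}  (R^2 = {r_squared:.3f})")
```

One property worth knowing: a least-squares line always passes through the point of averages, so plugging in the average opponent SC% returns the goalie’s average SV%/G.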

The graph below reveals the results of the regression analysis. Each red dot represents one game and corresponds to the opponent’s SC% (on the X axis) and Price’s save percentage for that game (Y axis). If you trace a line from one data point to each axis, you can determine the values of each variable for that specific game.

The first thing to notice is that there is a reliable negative correlation between opponent SC% (axis X) and Price’s save % (axis Y). This means that the weaker the opponent, the better Price’s performance was. Incidentally, opponent SC% explains 20.2% of the variation in Price’s performance – a considerable contribution. Thus, while there are certainly other variables at play, opponent strength appears to be an important predictor of Price’s performance.

The graph also shows the equation of the line (top right), which will allow us to predict Price’s save % had he hypothetically faced Halak’s opponents.

First, if you plug the SC% of the teams that Price faced into the regression formula, you get Price’s actual SV%/G.

-2.1295(.093) + 1.1073 = .909

Follow the red lines below to see how the regression line guides you from values on the X axis (opponent SC% = .093) to values on the Y axis (SV%/G = .909).

Now, using the same procedure, we determine how Price’s SV%/G changes by assuming Halak’s meeker opponents.

On average, Halak’s opponents had a .0898 scoring %. If you plug that into the equation, Price’s SV%/G becomes …

-2.1295(.0898) + 1.1073 = .916.

Follow the dotted blue lines above to observe this change. Compared to Halak’s SV%/G of .919, Price’s SV%/G emerges only marginally smaller.

I performed the same analysis with Halak.
Strangely, Halak’s SV%/G barely changes as a function of the teams he faces. If anything, his SV%/G improves slightly as his opponents’ SC% rises. But this association was not statistically significant, with SC% explaining less than 1% of the variation in his SV%/G. Evidently, for Halak other variables would be better predictors of his performance.

Because of this slightly positive association, Halak’s predicted SV%/G actually improves slightly when facing Price’s opponents (from .919 to .921). Here is a summary of the findings.

In the final analysis, both goalies’ performance improves in the other’s net. But this improvement is greater for Price. Ultimately, the advantage that Halak’s numbers exhibit over Price’s is cut in half from .01 to .005 when their roles are reversed (incidentally, an overall better state of affairs for the Montreal Canadiens).
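The role-reversal arithmetic can be checked with a few lines of Python. This is a rough sketch: Price’s slope and intercept are the ones reported above, while Halak’s two values are copied in as reported, because his regression equation is not given in this post. The function and variable names are mine.

```python
# Price's regression line, as reported in this post.
PRICE_SLOPE = -2.1295
PRICE_INTERCEPT = 1.1073

def predict_price(opponent_sc):
    """Predicted SV%/G for Price against opponents with the given scoring %."""
    return PRICE_SLOPE * opponent_sc + PRICE_INTERCEPT

price_own = predict_price(0.093)       # Price against his own opponents
price_swapped = predict_price(0.0898)  # Price against Halak's opponents

# Halak's equation is not given, so these are the reported values, not computed ones.
halak_own = 0.919
halak_swapped = 0.921

gap_actual = halak_own - price_own
gap_swapped = halak_swapped - price_swapped

print(f"Price: {price_own:.3f} -> {price_swapped:.3f}")
print(f"Halak: {halak_own:.3f} -> {halak_swapped:.3f}")
print(f"Halak's edge: {gap_actual:.3f} -> {gap_swapped:.3f}")
```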

Conclusions

Opponent SC% is NOT the definitive predictor of a goalie’s SV%/G, but in Price’s case it is an important factor.
On the basis of this (severely limited) analysis, Halak emerges the better goalie, but his supremacy is exaggerated by the fact that he has faced weaker opponents.
If the analysis is correct, Martin would be well advised to play Halak against the stronger teams and Price against the weaker, as this should produce better performance from both goalies.

Thursday, January 7, 2010

The Blunders of Belief: Part I - Confirmation Bias

This is the first of a series of posts that will examine beliefs from a psychological perspective. There are numerous studies demonstrating that (1) belief is a fundamental part of human experience (even for notorious skeptics) and (2) humans are hard-wired to cling to their beliefs regardless of their inherent value. In this series, I will explore research that reveals the feebleness of the human mind, especially with regard to the cognitive blunders associated with belief.

Part I: Confirmation Bias

Consider the following sequence of numbers: 2, 4, 6.

Now, take a moment before reading on and try to think of a pattern that governs the sequence.

. . .

One possibility that immediately presents itself is 'consecutive even numbers.'

Now, suppose that you were able to test this hypothesis, by proposing another sequence of three numbers, and learning whether or not your sequence fits the pattern.

Go ahead. What three numbers would you choose to test the hypothesis?

. . .

If you are like most people, you probably came up with something like 8, 10, 12.

Good job, this is absolutely correct! This fits the pattern perfectly. Now, test the hypothesis once more and propose another three numbers.

. . .

If you thought of 14, 16, 18, or any other consecutive sequence of even numbers, you'd be correct once again.

By this point, you are probably quite confident that you have solved the puzzle. After all, the data is perfectly consistent with your hypothesis!

But in fact, 'consecutive even numbers' is not the pattern I had in mind. The pattern I was thinking of was 'any sequence of ascending numbers,' like 1, 2, 3; 12, 52, 54; or 7, 231, 4378.

But don't despair; nearly everyone makes the same mistake. In a study conducted 50 years ago (Wason, 1960), the vast majority of participants approached the puzzle exactly the same way - by selectively searching for affirming evidence. That is, if their hypothesis was 'consecutive even numbers,' the participants only proposed sequences of consecutive even numbers. And after a sufficient number of affirmative trials, they tried to solve the puzzle.

A small number of participants chose a more effective strategy. Instead of trying to confirm their hypotheses, they immediately proposed a disconfirming sequence like 10, 13, 21. The result was that these participants reached the correct answer in a few short trials. The majority, however, continued their futile quest, only changing their strategy once they learned that their hypothesis was wrong. Some participants gave up altogether.
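The logic of the puzzle can be sketched in a few lines of Python. This is only an illustration of why confirming tests teach you nothing here; the function names are my own and it is not a reconstruction of the original experiment.

```python
def hidden_rule(seq):
    """The rule actually in play: any strictly ascending sequence."""
    return all(a < b for a, b in zip(seq, seq[1:]))

def my_hypothesis(seq):
    """The tempting guess: consecutive even numbers."""
    return seq[0] % 2 == 0 and all(b - a == 2 for a, b in zip(seq, seq[1:]))

def can_refute(seq):
    """A test can expose a wrong hypothesis only when the hypothesis's
    prediction differs from the feedback the real rule gives."""
    return my_hypothesis(seq) != hidden_rule(seq)

for seq in [(8, 10, 12), (14, 16, 18), (10, 13, 21)]:
    predicted = my_hypothesis(seq)
    feedback = hidden_rule(seq)
    print(seq, "predicted:", predicted, "feedback:", feedback,
          "informative:", can_refute(seq))
```

The two confirming sequences get a "yes" that the hypothesis already predicted, so nothing is learned. Only 10, 13, 21 produces a "yes" where the hypothesis predicted a "no," which is what exposes the mistake.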

The tendency to test one's beliefs by considering confirming evidence alone (while ignoring disconfirming evidence) is called the confirmation bias, and it is a folly we all commit. For any given position in a dispute (God vs. the big bang, Roe vs. Wade, The Beatles vs. The Rolling Stones), advocates tend to be sensitive to information that supports their position and blind to anything else. The tragedy is that there is an abundance of information consistent with any particular view, leaving us vulnerable to believing anything we so desire.

The daily horoscope takes advantage of its readership's bias for confirming information, offering vague insights that can be corroborated by practically any occurrence. "Expect difficult challenges at work today" is just as valid whether you sign a new client, forget your lunch at home, or get fired.

The lesson is this: Whether you are a God-fearing believer or a staunch atheist, you've probably amassed oodles of confirming evidence for your position. But that won't get you any closer to the truth. Search instead for that one devastating fact that brings down your entire worldview. Now, you're getting somewhere.