Post #5

Original poster
Posted on 2013-7-8 08:55:03
Examples to teach: Correlation does not mean causation
We all know the old saying "Correlation does not mean causation". When I'm teaching I tend to use these standard examples to illustrate this point:
Number of storks and birth rate in Denmark;
Number of priests in America and alcoholism;
At the start of the 20th century it was noted that there was a strong correlation between 'number of radios' and 'number of people in insane asylums';
and my favourite: pirates cause global warming.
However, I don't have any references for these examples and, whilst amusing, they are obviously false.
Does anyone have any other good examples?
It might be useful to explain that "causes" is an asymmetric relation (X causes Y is different from Y causes X), whereas "is correlated with" is a symmetric relation.
For instance, homeless population and crime rate might be correlated, in that both tend to be high or low in the same locations. It is equally valid to say that homeless population is correlated with crime rate, or that crime rate is correlated with homeless population. To say that crime causes homelessness, or that homeless populations cause crime, is to make two different statements, and correlation does not imply that either is true. For instance, the underlying cause could be a third variable such as drug abuse or unemployment.
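The symmetry and third-variable points can be checked with a small simulation. This is only a sketch under made-up assumptions: `z` stands in for a hypothetical hidden variable (say, a local unemployment index), and `x` and `y` are each built from `z` plus independent noise, so neither one influences the other. The `pearson` helper is written out by hand so the example needs nothing beyond the standard library.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical hidden variable; illustrative only, not real data.
z = [rng.gauss(0, 1) for _ in range(5000)]
# X and Y each depend on Z plus their own noise; neither affects the other.
x = [zi + rng.gauss(0, 1) for zi in z]
y = [zi + rng.gauss(0, 1) for zi in z]

r_xy = pearson(x, y)  # correlation is symmetric ...
r_yx = pearson(y, x)  # ... so swapping the arguments changes nothing

# Subtracting the hidden variable leaves only the independent noise terms.
r_residual = pearson([xi - zi for xi, zi in zip(x, z)],
                     [yi - zi for yi, zi in zip(y, z)])

print(f"corr(X, Y) = {r_xy:.2f}, corr(Y, X) = {r_yx:.2f}")
print(f"corr after removing Z = {r_residual:.2f}")
```

Under these assumptions the population correlation between `x` and `y` is 0.5 even though there is no causal arrow between them, and it should drop to roughly zero once `z` is accounted for. In real data the hidden variable is usually not observed, which is exactly the difficulty.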
The mathematics of statistics is not good at identifying underlying causes, which requires some other form of judgement.
Sometimes correlation is enough. For example, in car insurance, being a male driver is correlated with having more accidents, so insurance companies charge male drivers more. There is no way you could actually test this for causation, because you cannot change the genders of the drivers experimentally. Google has made hundreds of billions of dollars not caring about causation.
To find causation, you generally need experimental data, not observational data. In economics, though, researchers often use observed "shocks" to the system to test for causation: for example, if a CEO dies suddenly and the stock price goes up, you can assume causation.
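To see why experimental data helps, here is a rough sketch (all names and numbers are invented for illustration). A hidden trait `z` drives the outcome, and the "treatment" has no effect at all. In the observational setup, units with high `z` are more likely to take the treatment; in the experimental setup, treatment is assigned by coin flip.

```python
import random

def mean(values):
    return sum(values) / len(values)

def outcome_gap(treated, outcomes):
    """Mean outcome of treated units minus mean outcome of untreated units."""
    on = [o for t, o in zip(treated, outcomes) if t]
    off = [o for t, o in zip(treated, outcomes) if not t]
    return mean(on) - mean(off)

rng = random.Random(1)  # fixed seed so the run is repeatable
n = 10000

# Hypothetical hidden trait; the outcome depends on it and on nothing else.
z = [rng.gauss(0, 1) for _ in range(n)]
outcomes = [zi + rng.gauss(0, 1) for zi in z]  # treatment has zero true effect

# Observational data: units with a high hidden trait tend to choose treatment.
observed = [zi + rng.gauss(0, 1) > 0 for zi in z]
# Experimental data: treatment is assigned by a fair coin, independent of z.
randomized = [rng.random() < 0.5 for _ in range(n)]

gap_observed = outcome_gap(observed, outcomes)
gap_randomized = outcome_gap(randomized, outcomes)

print(f"observational gap: {gap_observed:.2f}")
print(f"randomized gap:    {gap_randomized:.2f}")
```

In this toy setup the observational comparison should show a sizeable gap that is due entirely to `z`, while the randomized comparison should be close to zero, which matches the true (null) effect. Random assignment works because it cuts the link between the hidden trait and who gets treated.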
Correlation is a necessary but not sufficient condition for causation. To show causation requires a counterfactual.
I have a few examples I like to use.
When investigating the cause of crime in New York City in the 80s, when they were trying to clean up the city, an academic found a strong correlation between the amount of serious crime committed and the amount of ice cream sold by street vendors! (Which is the cause and which is the effect?) Obviously, there was an unobserved variable causing both. Summers are when crime is the greatest and when the most ice cream is sold.
The size of your palm is negatively correlated with how long you will live (really!). In fact, women tend to have smaller palms and live longer.
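The palm example can be mimicked with simulated data. The figures below (mean palm widths, lifespans, and spreads) are invented purely for illustration and are not taken from any study; the only structure assumed is that sex shifts both palm size and lifespan, and that within each sex the two are unrelated.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(2)  # fixed seed so the run is repeatable

palms, lifespans, is_female = [], [], []
for _ in range(10000):
    female = rng.random() < 0.5
    # Made-up numbers: women get smaller palms and longer lives on average,
    # but palm size and lifespan are drawn independently within each sex.
    palms.append(rng.gauss(7.5 if female else 8.5, 0.4))
    lifespans.append(rng.gauss(81.0 if female else 76.0, 8.0))
    is_female.append(female)

def corr_where(keep):
    """Correlation of palm size and lifespan within one sex."""
    xs = [p for p, f in zip(palms, is_female) if f == keep]
    ys = [l for l, f in zip(lifespans, is_female) if f == keep]
    return pearson(xs, ys)

r_all = pearson(palms, lifespans)
r_women = corr_where(True)
r_men = corr_where(False)

print(f"everyone: {r_all:.2f}, women only: {r_women:.2f}, men only: {r_men:.2f}")
```

With these assumptions the pooled correlation should come out negative, while the correlation within each sex should be near zero. Splitting the data by the suspected third variable is a simple first check, though it only works when that variable has actually been recorded.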
[My favorite] I heard of a study a few years ago that found the amount of soda a person drinks is positively correlated with the likelihood of obesity. (I said to myself: that makes sense, since it must be due to people drinking the sugary soda and getting all those empty calories.) A few days later more details came out. Almost all the correlation was due to an increased consumption of diet soft drinks. (That blew my theory!) So, which way is the causation? Do the diet soft drinks cause one to gain weight, or does a gain in weight cause an increased consumption of diet soft drinks? (Before you conclude it is the latter, see the study where a controlled experiment with rats showed that the group fed a yogurt with artificial sweetener gained more weight than the group fed the normal yogurt.)
The number of Nobel prizes won by a country (adjusting for population) correlates well with per capita chocolate consumption.
A correlation on its own can never establish a causal link. David Hume (1711-1776) argued quite effectively that we cannot obtain certain knowledge of causality by purely empirical means. Kant attempted to address this; the Wikipedia page for Kant seems to sum it up quite nicely:
Kant believed himself to be creating a compromise between the empiricists and the rationalists. The empiricists believed that knowledge is acquired through experience alone, but the rationalists maintained that such knowledge is open to Cartesian doubt and that reason alone provides us with knowledge. Kant argues, however, that using reason without applying it to experience will only lead to illusions, while experience will be purely subjective without first being subsumed under pure reason.
In other words, Hume tells us that we can never know a causal relationship exists just by observing a correlation, but Kant suggests that we may be able to use our reason to distinguish correlations that do imply a causal link from those that don't. I don't think Hume would have disagreed, as long as Kant were writing in terms of plausibility rather than certain knowledge.
In short, a correlation provides circumstantial evidence implying a causal link, but the weight of the evidence depends greatly on the particular circumstances involved, and we can never be absolutely sure. The ability to predict the effects of interventions is one way to gain confidence (we can't prove anything, but we can disprove by observational evidence, so we have then at least attempted to falsify the theory of a causal link). Having a simple model that explains why we should observe a correlation, and that also explains other forms of evidence, is another way we can apply our reasoning as Kant suggests.
Caveat emptor: It is entirely possible I have misunderstood the philosophy; however, it remains the case that a correlation can never provide proof of a causal link.