Statistics: Association And Causation
Association and Causation
Statistics is the science of collecting and analyzing data. It is the refinement of the ambiguous, the distilling of truth from the crudest of resources. For this reason, it is necessary to discern the simplest path from Point A to Point B, disregarding any unnecessary data that may lie along the way. This, however, is easier in theory than in practice, and statisticians have developed various techniques to help differentiate between causation, in which a variable directly produces a phenomenon, and association, in which a variable's changes occur concurrently with the phenomenon and may be causal or non-causal.
Every morning around sunrise, the rooster crows. Does the rooster cause the sun to rise? Not likely. But if we were to track and plot the sunrises and rooster crows every morning for a year, it would be evident that they occur concurrently. This is an association. So, then, does the sunrise cause the rooster to crow? Not necessarily. A spurious association is an association between two variables that is better explained by a third, lurking (or hidden) variable. In this case, the lurking variable is territorial advertising. “Roosters crow every hour, on the hour - saying, ‘This is my coop, get the heck out of my way, don't mess with my women.’ If a truck goes by, the rooster interprets the noise as that of an intrusion by another rooster. If most of the crowing takes place in early morning, it is because that is when there is the most activity.” (Feldman, 1990, pp. 51-52). As you can see, association can be easily confused with causation.
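The effect of a lurking variable can be illustrated with a small simulation. The following is a minimal sketch, not a standard procedure from the source: the variable names (`z` for the hidden variable, `x` and `y` for the two observed variables), the sample size, and the noise levels are all assumptions chosen for illustration. Both observed variables depend on `z` but not on each other, yet they appear correlated until `z` is accounted for with a first-order partial correlation.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    var_x = sum((a - mean_x) ** 2 for a in xs)
    var_y = sum((b - mean_y) ** 2 for b in ys)
    return cov / math.sqrt(var_x * var_y)


def partial_corr(r_xy, r_xz, r_yz):
    """Correlation of x and y after removing the linear effect of z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def simulate(n=2000, seed=0):
    """Hypothetical data: z drives both x and y; x and y never interact."""
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(n)]      # lurking variable
    x = [zi + rng.gauss(0, 1) for zi in z]       # depends only on z plus noise
    y = [zi + rng.gauss(0, 1) for zi in z]       # depends only on z plus noise
    r_xy = pearson(x, y)
    r_partial = partial_corr(r_xy, pearson(x, z), pearson(y, z))
    return r_xy, r_partial


if __name__ == "__main__":
    r_xy, r_partial = simulate()
    print(f"raw correlation of x and y:       {r_xy:.3f}")
    print(f"correlation after controlling z:  {r_partial:.3f}")
```

In this setup the raw correlation between `x` and `y` is clearly positive, while the partial correlation is close to zero, which is the signature of a spurious association. In practice the lurking variable is often unmeasured, so this correction is only possible when it has been identified and recorded.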
Several methods have been developed to determine and analyze association. Just as the rooster and the sunrise can be charted, so can other variables. The strength of the linear relationship between two such variables is called correlation. Basically, if two variables move together on the chart, a correlation exists. A high degree of correlation is evidence of an association. This association can be positive, represented by movement in the same direction on the chart, or negative, represented by concurrent movement in opposite directions. There are many types of tests meant to weed out anomalies in the movement of variables, measure the true correlation, and thus give evidence of association. Sometimes, if several variables are involved, a statistician may average them to fit the particular test being performed. This is acceptable, but we must keep in mind that averaging may remove the peaks and valleys of the lines and force them into conformity with the majority. If we are trying to accurately track correlations with other lines, using averages may be like shooting ourselves in the foot. And even if the measured correlations were perfectly accurate, they still would not guarantee causation.
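A short sketch may make the ideas of positive correlation, negative correlation, and the flattening effect of averaging more concrete. It uses the Pearson correlation coefficient, one common measure of linear correlation; the essay does not name a specific measure, so that choice is an assumption, and all of the series below are made-up illustrative data.

```python
import math


def pearson(xs, ys):
    """Pearson r, or None when a series is constant (r is undefined)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    var_x = sum((a - mean_x) ** 2 for a in xs)
    var_y = sum((b - mean_y) ** 2 for b in ys)
    if var_x == 0 or var_y == 0:
        return None  # a flat line has no movement to correlate
    return cov / math.sqrt(var_x * var_y)


# Made-up series: 'target' is the line we want to track.
target = [1, 5, 1, 5, 1, 5]
series_a = [1, 5, 1, 5, 1, 5]   # moves with the target
series_b = [5, 1, 5, 1, 5, 1]   # moves against the target

r_a = pearson(series_a, target)  # positive correlation
r_b = pearson(series_b, target)  # negative correlation

# Averaging the two series cancels their peaks and valleys completely.
averaged = [(a + b) / 2 for a, b in zip(series_a, series_b)]
r_avg = pearson(averaged, target)

if __name__ == "__main__":
    print("series_a vs target:", r_a)
    print("series_b vs target:", r_b)
    print("averaged series:", averaged)
    print("averaged vs target:", r_avg)
```

Here each series is perfectly correlated with the target on its own, one positively and one negatively, but their average is a flat line with no measurable correlation at all. This is a deliberately extreme case; with real data, averaging usually weakens or distorts the relationships rather than erasing them outright.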
Causation, the cause-and-effect relationship between variables, cannot, strictly speaking, be proven using statistics. The various tests, techniques, and methods used in statistics can, however, indicate whether an association is purely random, and therefore non-causal, or non-random, which would imply a possible causal relationship. For instance,