Comments on this scatterplot post contained a discussion about when one-tailed statistical significance tests are appropriate. I’d say that one-tailed tests are appropriate only for a certain type of applied research. Let me explain…
Statistical significance tests attempt to assess the probability that we mistake noise for signal. The conventional 0.05 level of statistical significance in social science represents a willingness to mistake noise for signal 5% of the time when there is no true signal.
Two-tailed tests presume that these errors can occur because we mistake noise for signal in the positive direction or because we mistake noise for signal in the negative direction: therefore, for two-tailed tests we typically allocate half of the acceptable error to the left tail and half of the acceptable error to the right tail.
One-tailed tests presume either that: (1) we will never mistake noise for signal in one of the directions because it is impossible to have a signal in that direction, which permits us to place all of the acceptable error in the other direction’s tail; or (2) we are interested only in whether there is an effect in a particular direction, which permits us to place all of the acceptable error in that direction’s tail.
Notice that, in the direction that we are interested in, it is easier to mistake noise for signal with a one-tailed test than with a two-tailed test, because the one-tailed test places all of the acceptable error in that tail (5% rather than 2.5% at the conventional level).
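To make the difference concrete, here is a minimal sketch of the cutoffs involved. It assumes a test statistic that follows a standard normal distribution (a z test) and a predicted effect in the positive direction; the variable names are mine and purely illustrative.

```python
from statistics import NormalDist

ALPHA = 0.05                 # conventional level of acceptable error
std_normal = NormalDist()    # assumption: the test statistic is a z score

# Two-tailed test: half of the acceptable error goes in each tail.
two_tailed_cutoff = std_normal.inv_cdf(1 - ALPHA / 2)

# One-tailed test: all of the acceptable error goes in the predicted tail.
one_tailed_cutoff = std_normal.inv_cdf(1 - ALPHA)

# Chance, when there is no true effect, that noise alone crosses the
# cutoff in the predicted (positive) direction.
noise_as_signal_two_tailed = 1 - std_normal.cdf(two_tailed_cutoff)
noise_as_signal_one_tailed = 1 - std_normal.cdf(one_tailed_cutoff)

print(f"two-tailed cutoff: {two_tailed_cutoff:.3f}")
print(f"one-tailed cutoff: {one_tailed_cutoff:.3f}")
print(f"error in predicted tail, two-tailed: {noise_as_signal_two_tailed:.3f}")
print(f"error in predicted tail, one-tailed: {noise_as_signal_one_tailed:.3f}")
```

Under these assumptions the one-tailed cutoff is lower than the two-tailed cutoff, so a smaller positive estimate is enough to count as statistically significant.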
So let’s say that we want to test the hypothesis that X has a particular directional effect on Y. Use of a one-tailed test would mean either that: (1) it is impossible that the true direction is the opposite of the direction predicted by the hypothesis or (2) we don’t care whether the true direction is the opposite of the direction predicted by the hypothesis.
I’m not sure that we can ever declare things impossible in social science research, so (1) is not justified. The problem with (2) is that — for social science conducted to understand the world — we should always want to differentiate between “no evidence of an effect at a statistically significant level” and “evidence of an effect at a statistically significant level, but in the direction opposite to what we expected.”
To illustrate a problem with (2), let’s say that we commit before the study to a one-tailed test for whether X has a positive effect on Y, but the results of the study indicate that the effect of X on Y is negative at a statistically significant level, at least if we had used a two-tailed test. Now we are in a bind: if we report only that there is no evidence that X has a positive effect on Y at a statistically significant level, then we have omitted important information about the results; but if we report that the effect of X on Y is negative at a statistically significant level with a two-tailed test, then we have abandoned our original commitment to a one-tailed test in the hypothesized direction.
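Here is a rough numerical sketch of that bind. The observed z score of -2.5 is a made-up value, and the sketch again assumes a standard normal test statistic; nothing here comes from a real study.

```python
from statistics import NormalDist

std_normal = NormalDist()   # assumption: the test statistic is a z score
z_observed = -2.5           # hypothetical result: effect opposite to the prediction

# One-tailed p-value for the pre-registered hypothesis that the effect is positive.
p_one_tailed_positive = 1 - std_normal.cdf(z_observed)

# Two-tailed p-value, which counts extreme results in either direction.
p_two_tailed = 2 * (1 - std_normal.cdf(abs(z_observed)))

print(f"one-tailed p (positive effect): {p_one_tailed_positive:.3f}")
print(f"two-tailed p:                   {p_two_tailed:.3f}")
```

With this hypothetical result, the one-tailed test reports nothing close to statistical significance, while the two-tailed test would have flagged a negative effect at the 0.05 level.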
Now, when is a one-tailed test justified? The best justification that I have encountered is the scenario in which the same decision will be made whether X has no effect on Y or X has a particular directional effect on Y, such as “we will switch to a new program if the new program is equal to or better than our current program.” But that scenario belongs to applied science, not to social science conducted to understand the world: social scientists interested in understanding the world should care about the difference between a new program that is equal to the current program and a new program that is better than the current program.
In cases of strong theory or a clear prediction from the literature supporting a directional hypothesis, it might be acceptable — before the study — to allocate 1% of the acceptable error to the opposite direction and 4% of the acceptable error to the predicted direction, or some other unequal allocation of acceptable error. That unequal allocation of acceptable error would provide a degree of protection against unexpected effects that is lacking in a one-tailed test.
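A minimal sketch of the 1%/4% allocation, assuming once more a standard normal test statistic and a prediction in the positive direction. The function name `classify` and its labels are hypothetical, chosen only for illustration.

```python
from statistics import NormalDist


def classify(z, alpha_opposite=0.01, alpha_predicted=0.04):
    """Classify a z score under an unequal split of the acceptable error.

    Assumes the predicted direction is positive, so alpha_predicted sits in
    the upper tail and alpha_opposite sits in the lower tail.
    """
    std_normal = NormalDist()
    lower_cutoff = std_normal.inv_cdf(alpha_opposite)        # opposite direction
    upper_cutoff = std_normal.inv_cdf(1 - alpha_predicted)   # predicted direction
    if z >= upper_cutoff:
        return "significant in the predicted direction"
    if z <= lower_cutoff:
        return "significant in the opposite direction"
    return "not significant"


# Hypothetical z scores, chosen to land on each side of the two cutoffs.
for z in (1.8, 1.7, -2.0, -2.5):
    print(z, classify(z))
```

The cutoff in the predicted direction lands between the one-tailed and two-tailed cutoffs, and a sufficiently strong effect in the opposite direction can still be reported without abandoning the commitment made before the study.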