Hypothesis testing
The two types of errors that can be made in hypothesis testing are described: type I (false positive) and type II (false negative).
p is one of the most cherished values when reading scientific papers. Very often we desperately seek it, especially when the paper is long and cumbersome, and we are flooded with joy and happiness when we find it just as we were getting a bit lost and on the verge of throwing the paper away: thank God, p is significant! It seems to us then that our effort has been worthwhile… doesn't it?
Well, sometimes it has and sometimes it hasn't. To know which, we have to understand what the p value is and what it means. Usually, a statistical test analyzes data collected from a sample in order to calculate the probability that a certain hypothesis holds in the population. Typically there are two mutually exclusive hypotheses: the null hypothesis (do you remember? The one with the misleading name), which states that there is no association or difference between the two study variables, and the alternative hypothesis, which states that there is indeed a difference or an association between them.
Hypothesis testing
Suppose we want to measure the lipid-lowering effects of two drugs in a sample of patients with hypertriglyceridemia. The usual situation is that we get two different average effects in the two study groups, but we won't know a priori whether this difference reflects the true population value (which is out of our reach) or is due to chance (with two different samples we would surely obtain different values). The steps we have to follow are:
- To state the null hypothesis (H0): there's no difference in the lipid-lowering effect between the two groups. The alternative hypothesis is the opposite: the effect actually differs between the two groups.
- To decide which statistical test is most appropriate to compare the results, and to calculate the p value.
- Assuming that the null hypothesis is true, the p value represents the probability of obtaining by chance a difference between the groups at least as large as the one observed. Put another way, it measures how likely such a difference is to arise by chance alone. If p < 0.05 (5%), we consider that the probability that the observed difference is due to chance is very low, so we assume that this difference probably reflects the actual population value and we reject the null hypothesis. But don't get things wrong: the p value is not the probability that H0 is true, but an estimate of the degree of uncertainty with which we can accept or reject it.
If p > 0.05, the probability of obtaining such a difference by chance is too high to rule it out, so we cannot be sure the difference is real and we cannot reject H0. This doesn't mean that H0 is true, only that the study doesn't have enough power to reject it.
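To make these steps concrete, here is a minimal sketch in Python using scipy's two-sample t-test. The sample sizes, effect sizes and the simulated data are all hypothetical, invented only to illustrate the mechanics; this is not the analysis of any real trial.

```python
# A minimal sketch of the steps above; all numbers are made up for illustration.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Simulated triglyceride reductions (mg/dL) in two hypothetical treatment groups.
drug_a = rng.normal(loc=50, scale=20, size=30)  # hypothetical drug A
drug_b = rng.normal(loc=60, scale=20, size=30)  # hypothetical drug B

# H0: the mean lipid-lowering effect is the same in both groups.
t_stat, p_value = stats.ttest_ind(drug_a, drug_b)

print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
if p_value < 0.05:
    print("p < 0.05: we reject H0 (the difference is unlikely to be due to chance).")
else:
    print("p >= 0.05: we cannot reject H0 (not enough evidence of a real difference).")
```

Note that the test itself only returns the p value; the decision to reject or not rests on the 0.05 threshold we chose beforehand.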
In this difficult and crucial decision, we can blunder in two elegant ways (simulated in the sketch after this list):
- Rejecting the null hypothesis when it’s actually true (type I error).
- Failing to obtain a statistically significant p value and thus being unable to reject H0 when it's actually false (type II error).
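The following sketch, again with invented numbers, simulates many repeated trials to show how often each blunder occurs: when the two groups truly come from the same population, about 5% of trials still yield p < 0.05 (type I errors), and when a real difference exists, some trials still fail to reach significance (type II errors).

```python
# A sketch simulating both error types; all numbers are illustrative.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_trials, alpha, n = 5_000, 0.05, 30

# Type I error: H0 is true (both groups drawn from the same distribution),
# yet we sometimes get p < alpha and wrongly reject it.
type1 = sum(
    stats.ttest_ind(rng.normal(50, 20, n), rng.normal(50, 20, n)).pvalue < alpha
    for _ in range(n_trials)
)

# Type II error: H0 is false (the true means differ by 10 mg/dL),
# yet we sometimes get p >= alpha and fail to reject it.
type2 = sum(
    stats.ttest_ind(rng.normal(50, 20, n), rng.normal(60, 20, n)).pvalue >= alpha
    for _ in range(n_trials)
)

print(f"Type I error rate:  {type1 / n_trials:.3f}  (close to alpha = {alpha})")
print(f"Type II error rate: {type2 / n_trials:.3f}  (this is 1 minus the power)")
```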
We’re leaving…
And is it good or bad to reject the null hypothesis? It depends. To know what p tells us in a particular situation we'll have to consider it together with its confidence interval and the specific clinical setting because, incredibly enough, results that are non-significant from the statistical point of view may have a much greater clinical impact than others that are statistically significant. But that's another story…