Retire statistical significance! | LARS P. SYLL


4 Nov, 2025 at 11:31 | Posted in Statistics & Econometrics | 3 Comments

The rigid focus on statistical significance encourages researchers to choose data and methods that yield statistical significance for some desired (or simply publishable) result, or that yield statistical non-significance for an undesired result, such as potential side effects of drugs — thereby invalidating conclusions …

Again, we are not advocating a ban on P values, confidence intervals or other statistical measures — only that we should not treat them categorically. This includes dichotomization as statistically significant or not, as well as categorization based on other statistical measures such as Bayes factors.

One reason to avoid such ‘dichotomania’ is that all statistics, including P values and confidence intervals, naturally vary from study to study, and often do so to a surprising degree. In fact, random variation alone can easily lead to large disparities in P values, far beyond falling just to either side of the 0.05 threshold …
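The claim that random variation alone produces large disparities in p-values is easy to demonstrate. The sketch below is a hypothetical simulation (not from the article): it replicates the same two-arm study design twenty times, with the true effect held fixed at 0.3 standard deviations, and shows how widely the resulting p-values scatter around the 0.05 threshold.

```python
import math
import random

def two_sample_p(n, mu, sigma, rng):
    """Simulate one study: n observations per arm, true mean difference mu.
    Returns the two-sided p-value of a known-sigma z-test of 'difference = 0'."""
    a = [rng.gauss(0.0, sigma) for _ in range(n)]
    b = [rng.gauss(mu, sigma) for _ in range(n)]
    diff = sum(b) / n - sum(a) / n
    se = sigma * math.sqrt(2.0 / n)            # standard error of the difference
    z = diff / se
    return math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value

rng = random.Random(42)
# 20 replications of the *same* design: identical n, identical true effect
ps = sorted(two_sample_p(n=50, mu=0.3, sigma=1.0, rng=rng) for _ in range(20))
print(f"smallest p = {ps[0]:.4f}, largest p = {ps[-1]:.4f}")
print("significant at 0.05:", sum(p < 0.05 for p in ps), "of 20")
```

Nothing varies between replications except sampling error, yet some runs land well below 0.05 and others far above it — exactly the 'dichotomania' problem the authors describe.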

We must learn to embrace uncertainty. One practical way to do so is to rename confidence intervals as ‘compatibility intervals’ and interpret them in a way that avoids overconfidence. Specifically, we recommend that authors describe the practical implications of all values inside the interval, especially the observed effect (or point estimate) and the limits. In doing so, they should remember that all the values between the interval’s limits are reasonably compatible with the data, given the statistical assumptions used to compute the interval. Therefore, singling out one particular value (such as the null value) in the interval as ‘shown’ makes no sense.
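As a minimal illustration of reading an interval this way, the snippet below uses made-up summary numbers (the point estimate and standard error are assumptions for illustration, not from the article) to compute a 95% interval and check which values it covers — including the null value, which the authors warn against singling out.

```python
import math

# Hypothetical summary statistics, assumed for illustration only:
estimate = 0.8   # observed effect (point estimate)
se = 0.5         # standard error of the estimate

z = 1.96         # two-sided 95% level under a normal approximation
lower, upper = estimate - z * se, estimate + z * se
print(f"95% compatibility interval: ({lower:.2f}, {upper:.2f})")

# All values between the limits are reasonably compatible with the data
# under the model's assumptions. Here the interval covers 0, so 'no effect'
# is compatible -- but so is every other value in the interval, so neither
# 'no effect' nor the point estimate is 'shown'.
print("0 inside interval:", lower < 0.0 < upper)
```

The design choice to report and discuss both limits, rather than the single yes/no verdict 'crosses zero', is the practical habit the quoted passage recommends.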

Valentin Amrhein, Sander Greenland, Blake McShane

An article by one of my favourite statisticians, Sander Greenland — well worth reading. His work consistently challenges conventional statistical thinking, especially our overreliance on significance testing and simplistic interpretations of p-values. This particular piece is a great example of his ability to combine deep theoretical insight with practical relevance, offering a thoughtful critique that every empirical researcher can learn from.

In its standard form, a significance test is not the kind of severe test we seek when attempting to confirm or disconfirm empirical scientific hypotheses. This is problematic for several reasons, one of which is the strong tendency to accept the null hypothesis simply because it cannot be rejected at the conventional 5% significance level. In practice, significance tests in their standard formulation bias against new hypotheses by making it unduly difficult to disconfirm the null.

As has been repeatedly demonstrated in applied work, people often interpret 'not disconfirmed' as 'probably confirmed'. Standard scientific methodology suggests that if there is, say, only a 10% probability that pure sampling error could explain the observed difference between the data and the null hypothesis, it would be more reasonable to regard the null as disconfirmed. Especially when multiple independent tests of the same hypothesis yield similar results — around the same 10% level — most researchers would likely consider the hypothesis even more strongly disconfirmed.
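One standard way to make 'even more strongly disconfirmed' precise — not mentioned in the post, but a natural companion to the argument — is Fisher's method for combining independent p-values. Under the null, −2·Σ ln pᵢ follows a chi-square distribution with 2k degrees of freedom, whose survival function has a closed form for even degrees of freedom, so no external libraries are needed:

```python
import math

def fisher_combined_p(pvalues):
    """Fisher's method: under the null, -2 * sum(ln p_i) is chi-square
    distributed with 2k degrees of freedom. For even df the survival
    function is exp(-x/2) * sum_{j<k} (x/2)^j / j!."""
    x = -2.0 * sum(math.log(p) for p in pvalues)
    k = len(pvalues)      # df = 2k, so the series has k terms
    half = x / 2.0
    return math.exp(-half) * sum(half ** j / math.factorial(j) for j in range(k))

# Three independent tests of the same hypothesis, each individually
# 'non-significant' at the 5% level but all near the 10% level:
combined = fisher_combined_p([0.10, 0.10, 0.10])
print(f"combined p = {combined:.4f}")  # about 0.032
```

Three results that each fail the conventional threshold combine to evidence well below it — which is exactly why treating each p = 0.10 as 'null confirmed' discards information.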

Most importantly, we should never forget that the underlying parameters used in significance testing are model constructs. Our p-values mean very little if the model itself is wrong.



