# Acceptance of null hypothesis

This is a discussion question on the intersection of statistics and other sciences. I often face the same problem: researchers in my field tend to say that there is no effect when the p-value is not less than the significance level. At first, I often replied that this is not how hypothesis testing works. Given how often this question arises, I would like to discuss the issue with more experienced statisticians.

Let us consider a recent paper in a scientific journal from “the best publishing group”, Nature Communications Biology (there are multiple examples, but let’s focus on one).

The researchers interpret a statistically non-significant result in the following way:

> Thus chronic moderate caloric restriction can extend lifespan and enhance health of a primate, but it affects brain grey matter integrity without affecting cognitive performances.

Proof:

> However, performances in the Barnes maze task were not different between control and calorie-restricted animals (LME: F = 0.05, p = 0.82; Fig. 2a). Similarly, the spontaneous alternation task did not reveal any difference between control and calorie-restricted animals (LME: F = 1.63, p = 0.22; Fig. 2b).

The authors also suggest an explanation for the absence of the effect, but the key point is not the explanation, it is the claim itself. The provided plots look clearly different “by eye” to me (Figure 2).

Moreover, the authors ignore prior knowledge:

> deleterious effects of caloric restriction on cognitive performance have been reported for rats and for cerebral and emotional functions in humans

I could understand such a claim for huge sample sizes (there, “no effect” means no practically significant effect), but in this particular situation complex tests were used, and it is not obvious to me how to perform power calculations.
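For a simple two-sample comparison, the kind of power calculation I have in mind can be sketched as follows. This is a normal-approximation sketch with invented numbers (effect size, SD, and group size are hypothetical), not the authors’ actual mixed-model design:

```python
from statistics import NormalDist

def power_two_sample(delta, sd, n_per_group, alpha=0.05):
    """Approximate power of a two-sided two-sample z-test.

    delta: true difference in means we hope to detect
    sd: common standard deviation within each group
    n_per_group: sample size per group
    """
    z = NormalDist()
    se = sd * (2 / n_per_group) ** 0.5      # standard error of the mean difference
    z_crit = z.inv_cdf(1 - alpha / 2)       # two-sided critical value
    # probability the test statistic falls in the rejection region
    # when the true difference is delta
    return 1 - z.cdf(z_crit - delta / se) + z.cdf(-z_crit - delta / se)

# Hypothetical numbers: with ~17 animals per arm, even a moderate effect
# (half a standard deviation) leaves the comparison badly underpowered,
# so "p > alpha" tells us very little about the absence of an effect.
p = power_two_sample(delta=0.5, sd=1.0, n_per_group=17)
```

With small groups this comes out around 0.3, which is exactly why a non-significant p-value alone cannot support “no effect”.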

Questions:

1. Did I overlook any details that make their conclusions valid?

2. Taking into account the need to report negative results in science, how can one show, using statistics, that we have not merely “the absence of a result” (which is all that $p > \alpha$ gives us) but a genuine “negative result” (e.g., there is no difference between the groups)? I understand that for huge sample sizes even small deviations from the null cause rejection, but let’s assume that we have ideal data and still need to show that the null is practically true.

3. Should statisticians always insist on mathematically correct conclusions like “with this power, we were not able to detect an effect of meaningful size”? Researchers from other fields strongly dislike such formulations of negative results.
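The standard statistical answer to question 2 is an equivalence test, for example the two one-sided tests (TOST) procedure: instead of testing $H_0\colon \delta = 0$, you pre-specify a margin of practical irrelevance and test whether the true difference lies inside it. A minimal stdlib-only sketch using a large-sample normal approximation (the margin, means, and sample sizes below are invented for illustration):

```python
import random
from statistics import NormalDist, mean, stdev

def tost_equivalence(x, y, margin, alpha=0.05):
    """Two one-sided tests (TOST) for equivalence of two means,
    using a large-sample normal approximation.

    Rejecting both one-sided nulls at level alpha lets us conclude
    |mean(x) - mean(y)| < margin, i.e. a genuine negative result
    with respect to a pre-specified margin of practical irrelevance.
    """
    z = NormalDist()
    diff = mean(x) - mean(y)
    se = (stdev(x) ** 2 / len(x) + stdev(y) ** 2 / len(y)) ** 0.5
    # H0a: diff <= -margin  vs  H1a: diff > -margin
    p_lower = 1 - z.cdf((diff + margin) / se)
    # H0b: diff >= +margin  vs  H1b: diff < +margin
    p_upper = z.cdf((diff - margin) / se)
    p = max(p_lower, p_upper)       # TOST p-value
    return p, p < alpha             # equivalent at level alpha?

# Simulated data: two groups drawn from the same distribution.
random.seed(0)
control = [random.gauss(10.0, 1.0) for _ in range(200)]
treated = [random.gauss(10.0, 1.0) for _ in range(200)]
p, equivalent = tost_equivalence(control, treated, margin=0.5)
```

The crucial design choice is that the margin must be justified on subject-matter grounds before looking at the data; statsmodels offers a t-based version of this procedure (`statsmodels.stats.weightstats.ttost_ind`) for real analyses.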

I would be glad to hear any thoughts on the problem. I have read and understood the related questions on this site; there is a clear answer to questions 2 and 3 from the point of view of statistics, but I would like to understand how these questions should be answered in an interdisciplinary dialogue.

UPD: I think a good example of a negative result is the first stage of medical trials: safety. When can scientists decide that a drug is safe? I guess they compare two groups and do statistics on the data. Is there a way to say that a drug is safe? Cochrane carefully says “no side effects were found”, but doctors say that the drug is safe. Where do accuracy and simplicity of description meet, so that we can say “there are no consequences for health”?