How to Read a Study
Before You Believe the Headline
If you are living with memory symptoms, caring for someone with Alzheimer’s disease, or carrying a high-risk gene such as APOE4, you are not reading science as a hobby.
You are reading because the stakes are personal.
You want to know what to eat, which medications to take or avoid, which supplements might help, whether you should be on anti-amyloid treatments, whether to join a trial, and what you can do now. That urgency is understandable. It also makes people vulnerable to exaggerated claims.
A supplement guru can cite a case report. An influencer can cite a breakthrough observational study. A podcast guest can quote an alarming FDA warning. A headline can promise “the truth,” “what doctors won’t tell you,” or “the end of Alzheimer’s.”
The problem is not curiosity. Patients and families should read, ask questions, debate, challenge claims, and participate in research when possible. That is how the field improves.
The problem is certainty with weak evidence, or worse, certainty without evidence.
Science is cautious because biology is complicated. Breakthroughs are rare. A patient story is not a clinical trial. An association is not causation. A statistically significant result is not always meaningful. A warning label is not proof that a drug causes disease. And one study is rarely the final word.
1. Words Designed to Catch Your Attention, but That Should Make You Slow Down
The first warning sign is often the headline.
Be careful when a health claim begins with phrases like:
“The truth about…”
“Breaking news…”
“The cure”
“The end of disease”
“The number one cause…”
“Medication risks nobody warns you about”
“Why doing this changes everything”
“What doctors won’t tell you”
“Doctors have it all wrong”
“Big Pharma has been lying to you”
“The real driver of disease”
“The study they don’t want you to see”
“Clinically proven”
“Reverses Alzheimer’s”
“Targets the root cause”
“Ancient remedy validated by science”
These phrases are designed to create a sense of urgency, distrust, or certainty before the evidence is discussed.
Scientific language usually sounds more modest: “is associated with,” “may reduce,” “suggests,” “requires replication,” “hypothesis-generating,” “not powered to detect,” “exploratory analysis,” or “clinical significance remains unclear.”
That language may sound less exciting, but it is usually more honest.
The late physicist Richard Feynman captured this spirit well: “I think it’s much more interesting to live not knowing than to have answers which might be wrong.”
That is the heart of science. It is better to admit uncertainty than to be certain about something false.
A good scientific claim does not project certainty before the evidence has been weighed. It shows you the evidence and the uncertainty, and it helps you make an informed decision.
2. Case Reports and Abstracts: Useful Clues, Weak Proof
Some of the most persuasive health claims begin with a story.
A patient was declining. Then they started a protocol, supplement, diet, or program. Their memory improved. Their family noticed. Their test score looked better.
Stories matter, and social media platforms are filled with them.
Case reports can alert doctors to rare side effects, unusual symptoms, unexpected harms, or new disease mechanisms. Some major discoveries began with careful observation of one patient, one family, or one rare genetic variant.
In Alzheimer’s disease, rare families with inherited mutations helped scientists understand amyloid biology long before amyloid PET scans and blood biomarkers existed. In other cases, rare mutations eventually led to new medications.
So the problem is not the case report.
The problem is using a case report to claim that a treatment works.
A case report can say:
“This happened.”
It cannot say:
“This works.”
To know whether something works, we need comparison groups, clear methods, fair outcome measurement, and replication.
This is especially important in cognitive disorders. Memory test scores can fluctuate. People can improve because they have taken the same test before, which is called a practice effect. Symptoms can improve when poor sleep, mood problems, medication side effects, thyroid disease, vitamin B12 deficiency, or sleep apnea are addressed. Cases may also be selected for publication because they improved, while non-responders are left out, so we never hear about the patients who did not improve.
This is also why we need to be fair about popular “Alzheimer’s reversal” programs.
Some advice in these programs is reasonable: exercise, better sleep, a healthier diet, treatment of sleep apnea, attention to vascular risk, hearing care, and checking for reversible causes of cognitive symptoms. These are not fringe ideas. Many dementia clinics already recommend them.
The concern is what gets added around them.
Helpful advice can be bundled with expensive supplements, large panels of blood tests, coaching programs, branded protocols, and a price tag that many families can barely afford. The scientific evidence for those added layers is often much weaker than the evidence for the basic lifestyle and medical care recommendations.
That distinction matters.
Exercise is not the same as an expensive supplement stack.
Sleep apnea treatment is not the same as a branded protocol.
Checking B12 or thyroid function is not the same as ordering unconventional tests on toxins or hormones without clear evidence that acting on them changes dementia outcomes.
A program can feel scientific because it includes lab tests, supplements, diet rules, coaching, and personalized reports. But complexity is not the same as evidence.
A common pattern is to market a protocol with words such as “prevent,” “reverse,” “restore,” or “end,” while the published evidence consists mainly of uncontrolled case reports, case series, or small, limited trials. These reports may lack clear methods, inclusion criteria, dosing, duration, blinded outcome assessment, an adequate sample size, prespecified outcomes, complete reporting of non-responders, validated cognitive measures, or transparent financial disclosures.
Those omissions matter.
Without them, readers cannot know whether improvement reflects the intervention, natural fluctuation, practice effects, better sleep, treatment of another medical condition, extra attention, placebo response, or selective reporting.
Participating in such programs can be reasonable, as long as we are well informed about the costs and the limits of the evidence.
It is prudent to separate reasonable, low-risk health practices from expensive claims that have not yet been proven.
Conference abstracts have similar limitations.
A conference abstract is a summary presented at a scientific meeting. It may describe interesting preliminary findings, but it is not the same as a full peer-reviewed paper. Abstracts often do not provide enough detail to judge the methods, missing data, statistical plan, subgroup analyses, limitations, or conflicts of interest.
Peer review is not perfect. Flawed papers can pass peer review, and good papers can be rejected. Reviewers can miss errors. Journals can favor exciting findings. Scientists have biases. Science is a human process, and humans make mistakes.
But science also has tools for correction: replication, criticism, meta-analysis, reanalysis, corrections, and sometimes retractions.
My personal hero, Carl Sagan, captured this well: “Science is far from a perfect instrument of knowledge. It’s just the best we have.”
That is why we should not treat science as a single paper, a single p-value, or a single headline. Science is the process that allows wrong ideas to be tested, challenged, corrected, or replaced.
3. Observational Studies: Association Is Not Causation
Observational studies are essential.
They can follow large groups of people over time and identify patterns that would be impossible, unethical, or too expensive to test in a randomized trial. We cannot randomize people for decades to poverty, air pollution, poor education, diabetes, hypertension, poor access to healthy food, or social isolation.
But observational studies have limits.
They can show that two things occur together. They cannot automatically prove that one caused the other.
One problem is confounding by indication.
This means the reason someone receives a treatment is also related to the outcome being studied.
A simple example is insulin.
People who use insulin are often sicker than people with early or mild diabetes. They may have had diabetes longer, have higher blood sugar, or have more complications. If insulin users later have more kidney disease, heart disease, or dementia, we cannot conclude that insulin caused those problems.
More often, insulin use is a marker of more advanced diabetes.
The same problem can occur with statins.
People prescribed statins often have high LDL, diabetes, hypertension, obesity, vascular disease, inflammation, or prior cardiovascular events. Those same conditions also increase dementia risk. So if statin users later develop dementia at higher rates, the statin may not be the cause. The statin may identify people who were already at higher vascular and metabolic risk.
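Confounding by indication can be made concrete with a small simulation. The sketch below is illustrative only: it invents a drug that has no effect at all on dementia, and every number in it (the share of high-risk people, the prescribing rates, the dementia risks) is an assumption chosen for the example, not data from any study.

```python
import random

def simulate(n=100_000, seed=0):
    """Simulate a drug with NO effect on dementia that is prescribed more often to sicker people."""
    rng = random.Random(seed)
    treated = [0, 0]    # [number of people, number of dementia cases]
    untreated = [0, 0]
    for _ in range(n):
        high_risk = rng.random() < 0.3             # assumed: 30% have high vascular risk
        p_drug = 0.7 if high_risk else 0.1         # assumed: sicker people get the drug more often
        p_dementia = 0.20 if high_risk else 0.05   # assumed: risk depends on health only, never on the drug
        on_drug = rng.random() < p_drug
        dementia = rng.random() < p_dementia
        group = treated if on_drug else untreated
        group[0] += 1
        group[1] += dementia
    return treated[1] / treated[0], untreated[1] / untreated[0]

rate_treated, rate_untreated = simulate()
print(f"dementia rate on the drug:  {rate_treated:.3f}")
print(f"dementia rate off the drug: {rate_untreated:.3f}")
```

Under these assumptions, people on the drug develop dementia at roughly 16 in 100 and people off the drug at roughly 7 in 100, even though the drug does nothing. The gap comes entirely from who was prescribed the drug in the first place.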
Another problem is reverse causation.
Reverse causation means the arrow points in the opposite direction from what we first assume. The early disease process may change the exposure, instead of the exposure causing the disease.
In Alzheimer’s disease, brain and metabolic changes can begin many years before dementia is diagnosed. During that period, people may lose weight, change eating patterns, become less active, start or stop medications, or show changes in cholesterol levels.
So if a study finds that people with lower cholesterol are more likely to develop dementia, one possibility is that low cholesterol contributed to dementia.
But another possibility is the reverse: early disease, frailty, weight loss, or inflammation lowered cholesterol before dementia was diagnosed.
The arrow may point in the opposite direction.
4. The Statin Example: When a Signal Spreads Faster Than the Correction
Statins are a good example of how scientific information can be distorted and how easily misinformation can go viral.
In 2012, the FDA added language to statin labels about reports of memory loss, forgetfulness, and confusion. These reports were largely based on patient complaints after statins were already in use.
Such reports matter. They can alert regulators and clinicians to possible side effects and prompt further study. But they are not proof of causation.
This does not mean patient symptoms are imaginary.
Some people do experience side effects from statins. Muscle symptoms, medication intolerance, and individual reactions, including cognitive symptoms, can be real. A large study may show no evidence of harm at the population level, but that does not mean every individual will tolerate the medication perfectly.
Population evidence answers one question: “Does this drug increase risk on average?”
A patient’s experience raises another question: “Is this drug right for this person?”
Both questions matter.
What population studies and randomized trials do not support is the stronger claim that statins cause Alzheimer’s disease or progressive dementia.
A warning label is not a randomized trial.
A complaint is not proof.
A reversible symptom report is not Alzheimer’s disease.
The warning was real, but it was heavily misused, and that may have misled many patients. Online, it was often presented as proof that statins cause dementia or Alzheimer’s disease.
A later conference abstract added fuel. It reported that among people with early mild cognitive impairment and lower baseline cholesterol, users of lipophilic statins converted to dementia at a higher rate over eight years than non-users.
At first glance, this sounds alarming.
But this was not a randomized trial. It was an observational subgroup analysis presented as an abstract. Participants were not assigned to statins by chance. They were taking statins because clinicians had already judged them to have cardiovascular or lipid risk.
The analysis also focused on people with early mild cognitive impairment, exactly the group where reverse causation is a concern. Early Alzheimer’s biology may already be changing weight, diet, frailty, medical care, medication patterns, and cholesterol levels.
The careful conclusion is not:
“Statins cause dementia.”
The careful conclusion is:
“This subgroup analysis raised a concern that needed replication and better-controlled evidence.”
Later, more careful studies and meta-analyses did not support the claim that statins cause dementia. But the viral message had already spread on social media platforms.
This is how misinformation works: a signal spreads faster than the studies that later correct it.
5. Clinical Trials: Better Evidence, Still Not Perfect
Randomized clinical trials are among the most reliable ways to test whether a treatment works. Randomization makes groups more similar at the start. Blinding reduces expectation effects. A control group shows what might have happened without the treatment.
But even trials need careful reading.
A trial should tell us its primary outcome before it starts. This is the main question. Secondary outcomes are supportive. Exploratory outcomes and subgroup findings are clues, not proof.
This matters when the primary outcome is missed. For example, the ALZ-801/APOLLOE4 trial in APOE4 homozygotes did not meet its primary cognitive endpoint, although secondary and subgroup findings suggested possible signals in milder participants and imaging outcomes. Those signals may guide another trial, but they do not replace the missed primary endpoint.
Subgroups need the same caution. The BROADWAY obicetrapib study reported a promising p-tau217 biomarker signal, especially in APOE4 carriers. But this was a biomarker substudy, not proof that the drug prevents Alzheimer’s disease or slows cognitive decline. The right conclusion is: promising signal, needs a dedicated trial.
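One reason subgroup and exploratory findings count as clues rather than proof is simple arithmetic: the more comparisons a study makes, the more likely at least one looks "significant" by chance. The sketch below assumes independent tests and the conventional 0.05 threshold. Real subgroup analyses are usually not independent, so treat the numbers as a rough guide; the function name is made up for this example.

```python
def chance_of_false_alarm(n_tests, alpha=0.05):
    """Probability that at least one of n independent tests crosses the threshold by chance alone."""
    return 1 - (1 - alpha) ** n_tests

for k in (1, 5, 10, 20):
    print(f"{k:>2} comparisons: {chance_of_false_alarm(k):.2f}")
```

With one comparison the chance of a false alarm is 0.05. With 20 comparisons it rises to about 0.64 under these assumptions, which is why a surprising subgroup result needs its own dedicated trial.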
So when reading a trial, ask:
Was this the primary outcome?
Was the study large enough?
Was the finding replicated?
Was it a subgroup or an exploratory result?
Did the outcome matter to patients?
A signal is something worth testing.
A finding is something that survives testing.
6. P-Values, Effect Size, and the Word “Breakthrough”
A p-value tells us how surprising a result would be if there were truly no difference between two groups.
A small p-value means that a result this large would be unusual if chance alone were at work. But a p-value does not tell us whether the effect is large, important, or meaningful to a patient.
That is why we also need the effect size.
An effect size tells us how big the difference is.
A treatment can have a statistically significant effect that is still very small. With enough people in a study, even a tiny difference can produce a low p-value.
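The link between study size and p-value can be sketched in a few lines. The example below assumes a tiny true difference of 0.1 standard deviations, two equal-sized groups, and a simple normal approximation (a z-test). The function name and the numbers are illustrative assumptions, not taken from any trial.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_p(diff_in_sd, n_per_group):
    """Approximate two-sided p-value for a difference measured in standard deviations."""
    z = diff_in_sd * sqrt(n_per_group / 2)   # test statistic for two equal groups
    return 2 * (1 - NormalDist().cdf(abs(z)))

for n in (100, 1_000, 10_000):
    print(f"{n:>6} people per group: p = {two_sided_p(0.1, n):.4f}")
```

The same small difference gives a p-value near 0.48 with 100 people per group, about 0.025 with 1,000, and essentially zero with 10,000. The effect did not grow. Only the study did.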
One common effect size is Cohen’s d. It expresses the difference between two groups in standard deviation units.
By convention, a Cohen’s d of about 0.8 or more is large.
A Cohen’s d of 0.5 is moderate.
A Cohen’s d of 0.2 is small.
A Cohen’s d of around 0.1 is very small.
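For readers who want to see where the number comes from, here is a minimal sketch of one common way to compute Cohen’s d: the difference in group averages divided by the pooled standard deviation. The test scores are made up for illustration, and published trials often use more elaborate statistical models.

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(treatment, control):
    """Difference in means divided by the pooled standard deviation."""
    n1, n2 = len(treatment), len(control)
    pooled_var = ((n1 - 1) * variance(treatment) + (n2 - 1) * variance(control)) / (n1 + n2 - 2)
    return (mean(treatment) - mean(control)) / sqrt(pooled_var)

# Made-up scores, for illustration only
treatment = [52, 55, 49, 58, 51, 54]
control = [50, 53, 47, 56, 49, 52]
print(f"Cohen's d = {cohens_d(treatment, control):.2f}")
```

In this toy example the groups differ by 2 points on average and the scores vary by about 3 points within each group, so the difference is a little under two-thirds of a standard deviation.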
There is a simple way to think about this.
If two groups are identical, and you randomly pick one person from each group, the chance that the person from the treatment group scores higher is about 50 in 100.
If Cohen’s d is 0.13, that chance rises to about 54 in 100.
That is not nothing.
But it is modest.
And modest effects need modest language.
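The 50-in-100 and 54-in-100 figures above can be reproduced with a standard conversion, sometimes called the probability of superiority. The sketch assumes scores in both groups follow a bell curve with the same spread, and the function name is made up for this example.

```python
from math import sqrt
from statistics import NormalDist

def chance_treated_scores_higher(d):
    """Chance that a random person from the treatment group outscores a random person from the control group."""
    return NormalDist().cdf(d / sqrt(2))

for d in (0.0, 0.13, 0.2, 0.5, 1.0):
    print(f"d = {d:<4}: about {round(100 * chance_treated_scores_higher(d))} in 100")
```

Under these assumptions, a d of 0.2 corresponds to about 56 in 100, a d of 0.5 to about 64 in 100, and a d of 1.0 to about 76 in 100.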
Multidomain lifestyle trials such as FINGER and US POINTER, and the recent anti-amyloid trials, illustrate how effect sizes and p-values work together.
FINGER was an interesting experiment. It was an intensive multidomain intervention compared with a less intensive control condition that also included health advice. The intervention group did better, but the standardized effect size was small, with a reported Cohen’s d ≈ 0.13.
US POINTER, a FINGER-style trial in the United States, also needs careful framing. It did not compare lifestyle intervention with doing nothing. It compared a structured, intensive lifestyle program with a self-guided lifestyle program. Both groups received lifestyle support. Both groups improved. The structured arm did somewhat better, but, as in FINGER, the effect was very modest (Cohen’s d ≈ 0.06).
Here is why interpreting these small effect sizes matters.
Participants in these trials were often motivated and highly educated. An expensive program that works modestly in a highly educated research population may not automatically translate to communities with limited access to healthy food, safe exercise spaces, preventive medical care, hearing care, sleep treatment, or social support, where cheaper “standard” programs can be good enough.
The same caution applies to anti-amyloid treatments.
Lecanemab and donanemab are scientifically important because they showed that removing amyloid can slow clinical decline in early Alzheimer’s disease. But the effect sizes are modest.
In CLARITY AD, lecanemab slowed decline on the Clinical Dementia Rating–Sum of Boxes, or CDR-SB, by 0.45 points over 18 months compared with placebo. The CDR-SB ranges from 0 to 18, with higher scores meaning worse impairment. In standardized terms, this is roughly a small effect size, around Cohen’s d = 0.18 to 0.20.
In TRAILBLAZER-ALZ 2, donanemab slowed decline by about 3.25 points on the Integrated Alzheimer’s Disease Rating Scale, or iADRS, in the primary analysis population. The iADRS ranges from 0 to 144, with lower scores meaning worse cognition and function. In standardized terms, this is also a small effect size, roughly Cohen’s d = 0.20 to 0.25. On CDR-SB, donanemab’s treatment difference was about 0.7 points, again a modest effect.
These numbers matter.
They show that anti-amyloid drugs are not “nothing.” They changed the trajectory of disease in carefully selected patients with early Alzheimer’s disease and confirmed amyloid pathology.
But they are not cures. They have not yet been shown to reverse Alzheimer’s disease. They modestly slow decline, while also carrying risks and burdens such as amyloid-related imaging abnormalities, infusion reactions, frequent monitoring, cost, and access barriers.
This is the same lesson as the lifestyle trials.
A result can be real and still modest.
A result can be scientifically important and still require careful language.
Good science does not need exaggeration.
7. A Simple Checklist for Reading Health Claims
Conclusion: Ask for Better Evidence
The goal is not to be dismissive.
The goal is to ask better questions.
A patient's story can raise a question.
An observational study can show a pattern.
A clinical trial can test a treatment.
Replication tells us whether the finding holds up.
If the claim is big — “reverses Alzheimer’s,” “statins cause dementia,” “this supplement prevents disease,” or “this protocol ends cognitive decline” — the evidence needs to be big too.
Patients and families should not be told to stop asking questions.
They should be encouraged to ask sharper ones.
Challenge claims. Ask for the primary outcome. Ask about the effect size. Ask whether the finding was replicated. Ask whether someone is selling something.
Good science does not ask us to believe harder.
It asks us to test better and to embrace uncertainty over false claims.
References
Hellmuth J. Can we trust The End of Alzheimer’s? Lancet Neurology. 2020;19(5):389–390.
Ioannidis JPA. Why most published research findings are false. PLoS Medicine. 2005;2(8):e124.
Jager LR, Leek JT. An estimate of the science-wise false discovery rate and application to the top medical literature. Biostatistics. 2014;15(1):1–12.
U.S. Food and Drug Administration. FDA Drug Safety Communication: Important safety label changes to cholesterol-lowering statin drugs. 2012.
Swiger KJ, Manalac RJ, Blumenthal RS, Blaha MJ, Martin SS. Statins and cognition: a systematic review and meta-analysis of short- and long-term cognitive effects. Mayo Clinic Proceedings. 2013;88(11):1213–1221.
Journal of Nuclear Medicine. Statin use, cholesterol level, and dementia conversion in early mild cognitive impairment. J Nucl Med. 2021;62(Supplement 1):102.
Li G, Mayer CL, Morelli D, et al. Effect of simvastatin on CSF Alzheimer disease biomarkers in cognitively normal adults. Neurology. 2017;89:1251–1255.
Abushakra S, et al. Clinical efficacy, safety and imaging effects of oral valiltramiprosate in APOE4/4 homozygous individuals with early Alzheimer’s disease: the APOLLOE4 Phase III randomized controlled trial. CNS Drugs. 2025.
Davidson MH, Szarek M, Scheltens P, et al. Effect of obicetrapib, a potent cholesteryl ester transfer protein inhibitor, on p-tau217 levels in patients with cardiovascular disease. The Journal of Prevention of Alzheimer’s Disease. 2025. doi:10.1016/j.tjpad.2025.100394.
Ngandu T, Lehtisalo J, Solomon A, et al. A 2-year multidomain intervention of diet, exercise, cognitive training, and vascular risk monitoring versus control to prevent cognitive decline in at-risk elderly people: the FINGER randomized controlled trial. Lancet. 2015;385(9984):2255–2263.
Baker LD, Snyder HM, et al. U.S. POINTER Study: multidomain lifestyle intervention in older adults at risk for cognitive decline. JAMA. 2025.
van Dyck CH, Swanson CJ, Aisen P, et al. Lecanemab in early Alzheimer’s disease. New England Journal of Medicine. 2023;388:9–21.
Sims JR, Zimmer JA, Evans CD, et al. Donanemab in early symptomatic Alzheimer disease: the TRAILBLAZER-ALZ 2 randomized clinical trial. JAMA. 2023;330(6):512–527.
Feynman RP. The Pleasure of Finding Things Out. Basic Books; 1999.
Sagan C. The Demon-Haunted World: Science as a Candle in the Dark. Random House; 1995.



