Median Watch

Eyes on statistics

An apology to the public

Sorry state. A few years ago I thought about writing an article that apologised to the public about the poor state of health and medical research. Their taxes pay for this research and they give their time and data, and yet far too often the final results are totally unreliable. In the end I bottled it, too worried about the potential harm to my career. But today I’ve read this important paper from a group of statistical colleagues and it’s given me the nerve to apologise.

I humbly present the novel c-index

Show me the shortcut. Scientists are busy people. Busy people love a shortcut because it gives them more time to be busy. The p-value is a well-used scientific shortcut. It can decide for us whether something is important or not, and it’s based on an equation so it must be right. Another heaven-sent shortcut is the h-index, which allows us to decide the careers of researchers based on just one number (also made by an equation).
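For anyone unfamiliar with that career-deciding number: the h-index is the largest h such that a researcher has h papers with at least h citations each. A minimal sketch in Python, with made-up citation counts:

```python
def h_index(citations):
    """The h-index: the largest h such that an author has
    at least h papers with h or more citations each."""
    counts = sorted(citations, reverse=True)  # most-cited papers first
    h = 0
    for rank, count in enumerate(counts, start=1):
        if count >= rank:
            h = rank  # this paper still clears the bar
        else:
            break
    return h

# Made-up citation counts for one researcher's papers
print(h_index([25, 8, 5, 3, 3, 1, 0]))  # 3: three papers with >= 3 citations
```

One sorted list and one loop: an entire career, decided.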

A change to judging career disruption

Re-posted from this 2016 AusHSI blog because this is still an issue. Let’s start with the obvious. Winning funding for health and medical research is soul-crushingly hard. Success rates for major schemes are under 20%, so failure is the norm. Your application will be judged by a panel of 6 to 12 senior researchers. A key marker of success is your track record, which may simply mean the number and quality of your papers, and your previous research funding (a very circular measure).

Dear p-values, it’s not me, it’s not you, it’s everyone else

Yet another p-value run-in. For a recent observational study I tried to limit the use of p-values in the paper. My colleagues wanted more p-values and I had to politely push back. During one team meeting I even offered to put the p-values in if someone could accurately tell me what they meant … silence. Predicting that the reviewers would also want to see more p-values, I added this sentence to the paper’s methods: “We have tried to limit the use of p-values, as they are often misunderstood or misinterpreted, and elected to discuss clinically meaningful differences.”
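For the record, here is the answer I was hoping to hear at that meeting, as a minimal sketch using SciPy on simulated data (the groups and numbers are invented for illustration):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
# Simulated outcomes for two hypothetical groups with a small true difference
control = rng.normal(loc=50, scale=10, size=100)
treated = rng.normal(loc=52, scale=10, size=100)

t_stat, p_value = stats.ttest_ind(treated, control)
# The p-value is the probability of observing a difference at least this
# extreme IF the null hypothesis of no difference were true. It is NOT the
# probability that the null is true, nor that the result is important.
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
# A clinically meaningful difference is a separate judgement entirely
print(f"mean difference = {np.mean(treated) - np.mean(control):.1f}")
```

The p-value says nothing about whether a difference of two units matters to patients; that judgement has to come from clinical knowledge, not an equation.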

Not waving but drowning in data

Our paper examining trends in acronym use in abstracts was recently published in eLife. We examined over 26 million abstracts from the PubMed database, which is easily the largest data set I’ve ever used. In this post I talk about some of the challenges and benefits of dealing with such a massive data set. Data greed. One of the most common mistakes I see researchers make, new and experienced alike, is to collect too much data.
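The excerpt above doesn’t spell out how the paper defined an acronym, but a minimal sketch of the kind of pattern matching involved might look like this (the regex and the example abstract are my own illustration, not the published method):

```python
import re
from collections import Counter

# An illustrative pattern: runs of 2 to 10 capital letters,
# optionally with digits, e.g. "DNA", "RCT", "COVID19"
ACRONYM_PATTERN = re.compile(r"\b[A-Z][A-Z0-9]{1,9}\b")

def find_acronyms(abstract: str) -> Counter:
    """Count candidate acronyms in one abstract."""
    return Counter(ACRONYM_PATTERN.findall(abstract))

# Hypothetical abstract text for illustration
example = ("We ran an RCT comparing MRI with CT. "
           "The RCT protocol was registered.")
print(find_acronyms(example))  # Counter({'RCT': 2, 'MRI': 1, 'CT': 1})
```

At the scale of 26 million abstracts you would stream records from disk one at a time and merge the counts as you go, rather than trying to hold everything in memory.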