March 28, 2014

A Distorted and Incomplete Picture

When I saw Ben Goldacre's latest book Bad Pharma: How Drug Companies Mislead Doctors and Harm Patients on the new releases bookshelf of my local library, I borrowed it immediately. Not because I thought it would inform my data visualization work but because I really enjoyed Ben's previous book Bad Science.

So I was pleased to find, when I read Bad Pharma, that it focuses on data, the raw material we work with when creating visualizations. I was also deeply disturbed by the book, given that it details how modern evidence-based medicine is broken. Ben provides a useful summary of Bad Pharma in the book's introduction:
Drugs are tested by the people who manufacture them, in poorly designed trials, on hopelessly small numbers of weird, unrepresentative patients, and analysed using techniques which are flawed by design, in such a way that they exaggerate the benefits of treatments. Unsurprisingly, these trials tend to produce results that favour the manufacturer. When trials throw up results that companies don't like, they are perfectly entitled to hide them from doctors and patients, so we only ever see a distorted picture of any drug's true effects. Regulators see most of the trial data, but only from early on in a drug's life, and even then they don't give this data to doctors or patients, or even to other parts of government. This distorted evidence is then communicated and applied in a distorted fashion. In their forty years of practice after leaving medical school, doctors hear about what works through ad hoc oral traditions, from sales reps, colleagues or journals. But those colleagues can be in the pay of drug companies – often undisclosed – and the journals are too. And so are the patient groups. And finally, academic papers, which everyone thinks of as objective, are often covertly planned and written by people who work directly for the companies, without disclosure. Sometimes whole academic journals are even owned outright by one drug company. Aside from all this, for several of the most important and enduring problems in medicine, we have no idea what the best treatment is, because it's not in anyone's financial interest to conduct any trials at all. These are ongoing problems, and although people have claimed to fix many of them, for the most part they have failed; so all these problems persist, but worse than ever, because now people can pretend that everything is fine after all.
But enough about medicine. What makes Bad Pharma interesting to a data visualization practitioner is not its charts and graphs (there are only a few in the book) but its discussion of data. The book's first chapter, Missing Data, describes how drug trials performed by pharmaceutical companies overwhelmingly produce results that are favourable to the companies. Goldacre argues that this arises for several reasons:
  • flawed experimental design: trials are designed in ways likely to produce a favourable outcome
  • flawed data analysis: see my post on Alex Reinhart's Statistics Done Wrong
  • publication bias: trials that produce unfavourable outcomes are simply not published, skewing published data towards favourable results
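The effect of publication bias is easy to see in a small simulation. What follows is a minimal sketch, not anything taken from the book: it assumes a drug with no real effect, invents a thousand small trials whose results differ only by chance, and then "publishes" only the trials that happened to come out favourable. The function names, trial sizes, and the publish-if-positive rule are all my own assumptions for illustration.

```python
import random
import statistics


def simulate_trials(n_trials=1000, patients_per_trial=30,
                    true_effect=0.0, noise_sd=1.0, seed=42):
    """Return the observed average effect of each simulated trial.

    All parameters are hypothetical; true_effect=0.0 means the drug
    does nothing, so any apparent benefit is chance alone.
    """
    rng = random.Random(seed)
    results = []
    for _ in range(n_trials):
        outcomes = [rng.gauss(true_effect, noise_sd)
                    for _ in range(patients_per_trial)]
        results.append(statistics.mean(outcomes))
    return results


def published_only(results, threshold=0.0):
    """Assumed publication rule: keep only trials with a favourable result."""
    return [r for r in results if r > threshold]


all_results = simulate_trials()
published = published_only(all_results)

mean_all = statistics.mean(all_results)
mean_published = statistics.mean(published)

print(f"trials run:       {len(all_results)}")
print(f"trials published: {len(published)}")
print(f"mean effect, all trials:       {mean_all:+.3f}")
print(f"mean effect, published trials: {mean_published:+.3f}")
```

Averaged over every trial, the effect sits close to zero, as it should. Averaged over the published trials alone, the same useless drug appears to help. A chart drawn from the published subset would be accurate to its data and still misleading.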
This reminds us to be circumspect about the data we visualize. We should ask:
  • How was the data collected?
  • How has the data been transformed or processed?
  • Is the data complete?
The answers to these questions are metadata that we need to communicate as part of any visualization we create. Without it, we risk painting a distorted and incomplete picture of the data we are visualizing.
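One way to keep those answers attached to a chart is to record them alongside the data and turn them into a caption. The sketch below is only an illustration under my own assumptions: the `DataProvenance` class, its field names, and the example values are hypothetical, not part of any charting library.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DataProvenance:
    """Hypothetical record of the three questions asked of a dataset."""
    collection_method: str                                      # How was the data collected?
    transformations: List[str] = field(default_factory=list)    # How was it processed?
    is_complete: bool = False                                   # Is the data complete?
    known_gaps: str = ""                                        # What is missing, if known

    def caption(self) -> str:
        """Build a one-line caption to display beneath a visualization."""
        steps = "; ".join(self.transformations) if self.transformations else "none"
        if self.is_complete:
            coverage = "complete"
        else:
            coverage = f"incomplete ({self.known_gaps or 'gaps not documented'})"
        return (f"Source: {self.collection_method}. "
                f"Processing: {steps}. Coverage: {coverage}.")


# Example values are invented for illustration only.
provenance = DataProvenance(
    collection_method="published trial reports only",
    transformations=["averaged effect size per trial"],
    is_complete=False,
    known_gaps="unpublished trials missing",
)
caption_text = provenance.caption()
print(caption_text)
```

The caption string can then be passed to whatever charting tool is in use as a footnote or subtitle, so the reader sees the caveats next to the picture.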