Conveying the Right Message

Tim Brock / Monday, February 8, 2016

We communicate data and the conclusions we've drawn from our datasets in numerous ways: conversations, presentations, reports, charts shared on Twitter... Frequently, the results will be compressed. After all, we can't write a 200-page thesis on every bit of research. Instead we might publish a short article with a headline. And busy people may only read the headline. The end result can be something like a game of telephone, where the message received is a much distorted (or plain wrong) version of what the underlying data is actually telling us.

Take the recent announcement from the World Health Organization's (WHO) "specialized cancer agency", the International Agency for Research on Cancer (IARC), regarding consumption of processed and red meats. The UK press came out with headlines like "Processed meats rank alongside smoking as cancer causes – WHO" (Guardian), and "Drop the bacon roll - processed meats including sausages 'as bad for you as SMOKING'" (Daily Express). Is this really what the data tells us? Should we be worried? The article "Processed meat and cancer – what you need to know" by Casey Dunlop of Cancer Research UK avoids the sensationalist, attention-grabbing headlines. The result is something much more informative.

The IARC did conclude that there is sufficient evidence to say that processed meats do cause cancer, and placed it in the same group as smoking (group 1). Red meats were placed in group 2A. That doesn't mean a lot without group definitions, so here are the five groups alongside bars illustrating the number of entities in each group (taken from the IARC website on 4th November 2015).

It seems the headline writers in the newspapers didn't take the time to understand IARC's classification system. As Dunlop explains, the group placements show "how confident IARC is that red and processed meat cause cancer, not how much cancer they cause". Moreover, the IARC are explicit about this in the Q&A document that accompanied the press release: "processed meat has been classified in the same category as causes of cancer such as tobacco smoking and asbestos (IARC Group 1, carcinogenic to humans), but this does NOT mean that they are all equally dangerous".

It's easy to blame the confusion on newspapers going for eye-catching headlines. Perhaps IARC should take some responsibility too? From the chart above we can see that most things have either been classified as "Possibly carcinogenic to humans" or as not classifiable (!?). But do such classifications even make sense? And why not include in the press release the clear statement that the classification of processed meat doesn't mean it's as dangerous as smoking?

So how do we make sure we convey results accurately? There are a number of steps we could take; here are four I think are particularly important.

Be clear as to who collected the data and how

I'm more inclined to trust research results from a multinational group of scientists who are experts in their field than from a group of politicians with an agenda. That doesn't mean we should give scientists a free pass, though; we all make mistakes from time to time. Full disclosure of how the data was collected is vital for assessing the reliability of results. It's not unreasonable to expect that of others, so expect others to expect it of you.

State how much data your conclusions are based on

"Seven out of ten people preferred product A to product B". There's a massive difference in the strength of that last statement depending on whether that was the result of asking 10,000 people, with 7,000 preferring product A, or if that's literally the result of asking ten people. Also, if you asked 10,000 people, seven preferred product A, three preferred product B and 9,990 had no preference then don't forget to mention the latter group! It probably says more about your products or the formulation of your survey than the other numbers do.

Express numbers in a format that is easy to understand

It's important to consider how we express the critical numbers that pop up. As with chart design, there may be effective and ineffective ways of doing so. For example, saying that action X (like eating processed meat) increases our chances of getting disease Y (eg cancer) by Z% makes for a dramatic headline but tells us nothing about our chances of actually getting the disease, which is what we probably want to know. To calculate that we need to know what the chance is in the absence of action X and then do some math.

There's a strong case for using natural frequencies instead. For example, you could state the expected number of people getting disease Y in a group of 1,000 processed-meat eaters and the expected number in a group of 1,000 people who don't eat processed meat. From that kind of information most people should be able to make an informed choice about whether the increased risk is worth it. This is something the IARC failed to do in their press release, stating only that there was an 18% increase in risk of colorectal cancer for each "50 gram portion of processed meat eaten daily". That probably sounds scarier than it actually is.
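
To make the natural-frequency framing concrete, here's a minimal sketch. The 18% relative increase is the figure the IARC quoted; the baseline of 60 cases per 1,000 people is an assumed, purely illustrative number, not one taken from the press release:

    # Illustrative only: the baseline below is an assumed figure, not IARC data.
    baseline_per_1000 = 60        # assumed: ~60 in 1,000 non-eaters develop disease Y
    relative_increase = 0.18      # the 18% relative increase quoted by the IARC

    eaters_per_1000 = baseline_per_1000 * (1 + relative_increase)

    print(f"Out of 1,000 people who don't eat processed meat: ~{baseline_per_1000} cases")
    print(f"Out of 1,000 people who eat 50g daily:            ~{eaters_per_1000:.0f} cases")

Framed this way, the same "18% increase" reads as roughly 60 versus 71 cases per 1,000 people, which most readers will find far easier to weigh up than a bare percentage increase.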

Check with other people

If you can, take the time to check how your presentation of results comes across with at least one person not directly involved with your work. If they find your terminology confusing or misinterpret what you say, there's a good chance other people will too.
