Category Archives: Coffee Breaks with Data Viz

Embrace diversity – how to design data visualizations for people with visual impairments.

Have you ever thought that it is possible to discriminate against people through data visualization design? Several years ago it sounded strange to me too, but it can indeed happen unintentionally if you are not aware of the topic.

Discrimination is most often associated with skin colour, gender, age, religious beliefs, or nationality. However, this negative social phenomenon can have a much broader spectrum. One of its forms, not at all intuitive, is data visualization practice. The topic is gaining importance as more and more data is used to explain global processes, and people who have difficulty reading visualizations are being left behind. It may not be simple, but the onus is on the data community and data visualization practitioners to develop new best practices that communicate data in a more democratic way, with those people in mind.

To make data visualization more accessible to a wider audience, three dimensions can be improved: vision, cognitive and learning difficulties, and motor capabilities. The basic, obvious difficulty is related to vision impairments, but the degree of impairment is key. I will not discuss the most severe degree, which is blindness (this is a topic for a different post), but I will take a closer look at colour blindness and low vision.

COLOUR BLINDNESS

In data visualization, colour is the most important communication channel. The ability to see and understand the meaning of colours helped our ancestors to survive in deep jungles or on savannas. Colour informed them about non-toxic food or allowed them to spot predators in the forest.

Today, we are still sensitive to colours, and these natural reactions are used in many ways. For instance, most warning signals use red, because we naturally associate it with danger or action (red is the colour of blood)[1]. Studies show that prolonged exposure to red can cause the heart rate to accelerate as a result of activating the “fight or flight” instinct[2]. In contrast, blue has a calming effect.

However, not everyone can see colours. About 10% of the human population has trouble seeing colours correctly. If you would like to deepen your knowledge about types of colour blindness, please check the website. There you can learn about the causes of colour blindness, test yourself, and find a tool to check whether a prepared visualization is in line with best practices.

Several basic principles improve your colour palette and open your visualization to a broader audience. To understand them we need two important colour properties: hue and saturation. Hue defines a colour in terms such as pink, blue, yellow, or magenta. Saturation is nothing more than the intensity of the colour. By juggling these two properties we can improve or worsen the results of our work.

RED-GREEN

First of all, stop using the red-green palette, which is confusing or even unrecognizable to colour-blind people. This is my humble recommendation. For most people with colour vision difficulties, red and green look the same (see Picture 1).

Picture 1

Most modern data visualization tools, such as Tableau or Power BI, already offer colour palettes that handle this. Both tools also let you create custom compositions and upload them to the application (custom colour palettes for Tableau and Power BI).

If you are wondering about the right colour palettes, check out the ones presented in Picture 2 and Picture 3. They are nice, clean, and fancy, and will work for any report.

Picture 2 – Vivid & Energetic
Picture 3 – Elegant & Sophisticated

CONFUSING COLOUR PAIRS

Even if we avoid the red-green colour range, there are other pairs that produce a similar effect. In recent years I have been observing the dizzying career of the grey-blue duet. I like this combination as well; however, it is essential to match the two colours wisely (see Picture 4).

Picture 4

MONOCHROMATIC SCALE

Sometimes the best option is to simply stick with one colour and play with its saturation to differentiate specific categories or data ranges (see Picture 5). This approach can be used in most visualizations.
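As a rough illustration of the single-hue idea, here is a minimal Python sketch that builds a palette by keeping the hue fixed and stepping only the saturation. The function name `monochrome_scale` and its default values are my own assumptions for this example; they do not come from Tableau, Power BI, or any other tool mentioned here.

```python
import colorsys


def monochrome_scale(hue_degrees, steps, value=0.85, min_sat=0.15, max_sat=1.0):
    """Return `steps` hex colours that share one hue, from pale to saturated.

    Hypothetical helper: hue is given in degrees (0-360), while brightness
    (`value`) stays constant so that only the saturation changes.
    """
    if steps < 2:
        raise ValueError("steps must be at least 2")
    hue = (hue_degrees % 360) / 360.0
    colours = []
    for i in range(steps):
        # Spread the saturation evenly between min_sat and max_sat.
        sat = min_sat + (max_sat - min_sat) * i / (steps - 1)
        r, g, b = colorsys.hsv_to_rgb(hue, sat, value)
        colours.append(
            "#{:02x}{:02x}{:02x}".format(round(r * 255), round(g * 255), round(b * 255))
        )
    return colours


# Five blues (hue 210 degrees), ordered from the palest to the most saturated.
palette = monochrome_scale(210, 5)
```

The resulting hex codes can be pasted into a custom palette. It is still worth checking the result in a colour-blindness simulator, because saturation steps that look distinct to one viewer may be too close for another.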

Picture 5

You can find more practical colour ranges here, and if you would like to test your composition on specific charts, use this website.

SHAPE

Another channel that helps visually impaired people distinguish between encoded data is shape: assign a different shape to each data category. A good example of how introducing shapes can make a difference is the well-known RAG.

RAG stands for RED-AMBER-GREEN and is widely used in business environments to communicate performance, risks, or the status of activities. It is most commonly used in project management to report the status of tasks, but due to its simplicity it is also used in data visualization, for instance to highlight the performance of KPIs (key performance indicators). Red indicates underperformance, amber that something is an issue and needs to be monitored, and green that everything is fine.

But as you already know, RED-GREEN can be very confusing for colour-blind people. So my suggestion is to use shape as another visual communication channel to make sure everyone is on the same page. Instead of cells formatted with a coloured background, it is better to introduce icons that have different shapes and are coloured red, amber, or green (see Picture 6).
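To make the idea of doubling colour with shape concrete, here is a small Python sketch. Everything in it is an assumption made for illustration: the 10% amber tolerance, the choice of glyphs, and the function names are hypothetical, not a standard RAG definition.

```python
# Hypothetical glyphs: each status gets its own shape, so the status
# remains readable even when the colours cannot be told apart.
RAG_ICONS = {
    "red": "▼",    # downward triangle
    "amber": "◆",  # diamond
    "green": "●",  # circle
}


def rag_status(actual, target, amber_tolerance=0.10):
    """Classify a result against its target as 'red', 'amber', or 'green'.

    Assumed rule: at or above target is green, within `amber_tolerance`
    below target is amber, anything lower is red.
    """
    if target <= 0:
        raise ValueError("target must be positive")
    ratio = actual / target
    if ratio >= 1.0:
        return "green"
    if ratio >= 1.0 - amber_tolerance:
        return "amber"
    return "red"


def rag_label(actual, target):
    """Return the shape plus the status name, e.g. for a table cell."""
    status = rag_status(actual, target)
    return f"{RAG_ICONS[status]} {status.upper()}"
```

With these assumed thresholds, `rag_label(95, 100)` gives a diamond followed by AMBER. The colour can still be applied on top of the icon; the point is that the shape alone already carries the message.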

Picture 6

But what about charts such as line charts or bar charts? How can we improve the distinction between specific lines or bars? We can use different patterns to distinguish one bar from the rest or to present several lines on one chart (see Picture 7).

Picture 7

WRITTEN INSIGHTS

Written descriptions, recommendations, or insights can be tricky, especially when you want to use colour names to emphasise certain points, data categories, or issues. How can someone who does not see green (see Picture 1) understand the message “All departments represented by green bars have exceeded their sales targets this year”? This message must be rewritten as “Departments A, B, and C have exceeded their sales targets this year” to ensure that all stakeholders understand it.

LOW VISION

In addition to colour blindness, the most recognizable challenge in data visualization design, there is another one related to vision loss due to age, accidents, or genetics. For those who suffer from low vision, we must remember that the size and contrast of displayed text matter. This applies especially when we display materials on screens in conference rooms, but size also matters when you present via communicators such as Teams or Zoom. You can read more about the topic here.

SIZE

When it comes to font size, there is no single good recommendation. It depends on the purpose. If you are going to display materials in a large conference room, it is better not to use fonts smaller than 18 for axes or legends, and to put less information on each slide. There is nothing wrong with having more slides that are readable rather than fewer that are cluttered.

A different approach can be taken when creating reports. I would use font size 9 or 10 for axis or legend descriptions, but nowhere else should you go below 12. In reports the crucial thing is to group information together or display it in close proximity, which makes it easier to interpret and to make decisions. That is why optimizing space is so important. Report screens can always be enlarged, and anyone can take advantage of that.

Picture 8

CONTRAST

The general rule is to maintain high contrast between background and foreground (e.g. white – black, black – white). A typical accessibility barrier for people with low contrast sensitivity is grey text or figures on a light background. However, for some people a combination with lower contrast works better, because a bright background is uncomfortable for them (e.g. they have to switch the screen to a darker background to be able to read what is on it).

As you can see, there is no single best answer to this challenge. A good practice is to give people the option to switch the display mode from light (light background and dark foreground) to dark (dark background and light foreground).
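One way to put a number on “high contrast” is the contrast ratio defined in the Web Content Accessibility Guidelines (WCAG 2.x). That standard is my own addition here, not something the rules above depend on. The sketch below follows the published formula; the function names are hypothetical.

```python
def _linearise(channel_8bit):
    """Convert one 8-bit sRGB channel to linear light (WCAG 2.x formula)."""
    c = channel_8bit / 255.0
    # 0.03928 is the threshold as published in WCAG 2.x.
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_colour):
    """Relative luminance of a colour given as '#rrggbb'."""
    h = hex_colour.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linearise(r) + 0.7152 * _linearise(g) + 0.0722 * _linearise(b)


def contrast_ratio(foreground, background):
    """WCAG contrast ratio, from 1 (identical colours) to 21 (black on white)."""
    l1 = relative_luminance(foreground)
    l2 = relative_luminance(background)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)
```

Black on white gives the maximum ratio of 21, while mid-grey text such as `#999999` on white comes out below 3. WCAG level AA asks for at least 4.5 for normal-size text, which is a useful sanity check for the grey-on-light combination. As noted above, though, the highest possible ratio is not the most comfortable choice for every reader.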

Picture 9

With these small changes we bring a better user experience to our organizations, or to a wider public if we prepare data visualizations for the media or other public use.

[1] https://rochester.edu/news/show.php?id=3856

[2] https://journals.sagepub.com/doi/full/10.1177/2158244014525423

A confusing tram trip in Cracow – How humans read information.

It was several years ago. I was in Cracow and had arranged to meet my friends at a nice vegetarian restaurant. I took my nine-year-old daughter with me. To get there we took a tram. My daughter was very excited about the event and, as is typical for a child her age, the minute we boarded the tram she started asking where we were getting off.

Fortunately, we sat down right next to an information board where all the tram stops were displayed. So I told her the stop’s name, pointed to the board, explained how it works, and proposed that she count the stops on her own. I didn’t pay too much attention to it because Cracow is my hometown, and I knew it wasn’t far.

What a surprise it was to hear: “Mom, we’re on the 12th stop”. Knowing that could not be right (the correct answer was the 3rd), I asked her to count again, but the response was the same. This situation repeated a few times until she eventually exploded and yelled at me. I swiftly considered the hypothesis that she must have been swapped in the hospital (obviously my own child would be smart enough to correctly count to 5!) and rejected it. Finally, I looked at the information board, and everything became clear.

The culprit was the way the information board with tram stops had to be read. You see, western civilization uses the left-to-right reading pattern, so this reading order seems natural to us[1]. Linguistic and reading patterns affect how we reason about time and space, as well as the relations between the two. My daughter subconsciously assumed that the tram stops on the board were displayed in the “normal” order. Her assumption was entirely reasonable.

However, the board displayed the stops according to the tram’s direction of travel (right to left), not in line with the left-to-right perception of time (an unexpected design choice). Even though it included a direction arrow, the names of the stops, and a moving ball pointing to the current stop, my daughter’s brain was still searching for the familiar left-to-right pattern.

And that is why her answer was the 12th: counting from the left, that is the position of the stop marked with the yellow circle, but the trip starts on the right side.


This story is a great example of how people consume information embedded in time and how they expect it to be displayed. It’s worth remembering that, in our culture, information is read from left to right and from top to bottom. When people read reports, dashboards, or any data visuals, the brain relies on built-in patterns that help it store information and save energy. Following this simple rule significantly improves the user experience.

Check out my other posts about the importance of time orientation in data visualization:

[1] https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3322406/

Circle Charts – when design meets data

  • Circle charts are better suited to entertainment or informational purposes. They are not the best choice for a business environment.
  • Circle charts are attractive to the audience and can pull them into your story.
  • Using multiple layers demands a well-defined legend.

Humans have always had a special attitude to the sun. In prehistoric and ancient times, in some cultures the sun had the status of a god. Without any scientific theories, people just knew that the sun is unique and plays a crucial role for our planet and every living creature. Even in cultures where ancient humans did not worship the sun, the sun motif was commonly used to decorate buildings, everyday items, or apparel.

Nowadays, we still willingly use the image of the sun, especially in art and architecture. There is something appealing in this figure. The central circle with rays around it somehow reminds me of the wheel of life, with the rays as special moments.

Maybe that is why the pie chart and all its variations are so popular and well liked. The father of the best-known data visualizations is William Playfair. He invented the pie chart in 1801, and it is still commonly used to depict data.

My personal relationship with the pie chart is… complicated. I do not use pie charts often in a business environment. It is hard to present data accurately on a pie chart, especially with a large number of categories. When it comes to presenting information for decision-making, it is better to go for more readable visualizations such as bar charts (check my post: “PIES ARE FOR EATING NOT FOR DISPLAYING DATA”).

However, data journalism is a different story, since the purpose there is to entertain or inform the audience. In that case, I would give the green light to anybody who would like to present complicated data on any variation of a circle chart, such as a sunburst, radial chart, or spiral chart.

Those charts give you an opportunity to present complex hierarchical information on one chart, so even though they may not be ideally readable, the information is still concentrated within one visualization, which is an advantage for the audience. Do not forget that data journalism has a different purpose. The main goal is to pull readers into the story. Surprisingly, more complex visualizations with a huge number of details, colours, and shapes can serve that mission better than simple ones, because readers must spend more time decoding the visualization and retrieving information from it. Another aspect that increases reader involvement is chart interactivity. Of course, that applies only to online media.

EXAMPLES

The infographics below are good examples of complexity versus reader engagement. It is hard to understand them at a glance. You need to rest your eyes on them for longer and go deeper to take these images in.

A huge advantage is the ability to add further layers or rings to the image. Thanks to that technique, additional data are introduced into the chart and we can read the information from different angles or levels. Looking at the same image with several layers of information helps us find interesting patterns and observations. It would be much harder to achieve that effect with separate charts.

Global statistics

Our Mother Earth is round, so it is naturally associated with a round object such as a circle. Why not use that to strengthen the message? The chart combines several charts placed on a circular x-axis: life expectancy and average hours of sunshine are bar charts, and life satisfaction is a heatmap.

https://www.designboom.com/design/sunshine-and-happiness-infographic/
https://www.visualcapitalist.com/visualizing-all-of-earths-satellites/

Time

Time in western culture is perceived as linear from the perspective of years. When we present years, a line chart or bar chart would be our first choice. However, when it comes to the elements of a single year, we perceive them as a cycle. What I definitely admire in circle charts is the possibility of presenting any periodic phenomenon connected with time:

  • Seasons: Summer, Autumn, Winter, Spring
  • Months
  • Weeks
  • Hours
  • Minutes
https://www.digitalartsonline.co.uk/features/graphic-design/award-winning-infographic-designer-nadieh-bremer-on-how-create-powerful-data-visualisations/

Hierarchical information

Presenting hierarchical data is challenging. However, sunburst charts can handle that. A sunburst chart consists of rings, each of which represents a separate level of the hierarchy. This visualization gives us an opportunity to present very complex information in one view.

Note that hierarchical information can be presented as qualitative or quantitative.

The example below presents types of cheese categorized by type of milk and hardness. This information is qualitative. Another visualization we could use is a treemap. However, a treemap does not look as good as a circle chart.

https://stackoverflow.com/questions/17069436/hierarchical-multilevel-pie-chart

DOS & DONTS

  • Use colours to catch attention, but remember to choose them in accordance with best practices for colour blindness. Studies show that around 10% of the population has some difficulty distinguishing colours.
  • Always provide the legend. The legend should explain the meaning of colours, shapes, sizes, and even positions of objects in your visualization.
  • Add short text to the visualisation. If there are points that should be emphasised, place additional text with an explanation near them. Well-balanced text provides context for a particular point.
  • Plan the size of objects with the available space and readability in mind.
  • Do not use fonts that are too small.
  • Do not use decorative fonts, as they are hard to read.
  • Remember the title and a short description of the data visualization.
  • Leave whitespace around the visualization so as not to clutter the page.

Time orientation

Time orientation is crucial for the modern world to understand events and draw the correct conclusions.

Pre-industrial culture was not so tied to time, and most often people perceived time in cycles, such as the day cycle or the cycle of seasons. However, industrialization forced us to create precise time systems and turned this circularity into a linear phenomenon.

Currently, the majority of people live within time, and for most of us this time has one orientation, from left to right, and cannot be reversed. It is one of the human heuristics, a mental shortcut that helps us understand the world.

The example

Data visualization best practices tell us to display time on the x-axis with a left-to-right orientation (as in most cultures, with exceptions such as Middle Eastern ones) and not to play with it, especially when charts are going to be displayed only briefly. At the end of August, Polish public TV presented a chart of unemployment rates (see image below) with all possible misleading characteristics. I cannot tell whether it was intentional, and politics is not the topic of this post, but let’s have a closer look at how this chart is designed and why the design is wrong.

I have mentioned above that the human mind craves mental shortcuts. A quite possible scenario in this case is that the receiver reads only the label of the first bar from the left on the x-axis, and understands and remembers that the x-axis shows months of 2020 starting from July (Lipiec 2020). The automatic interpretation would be that the next two bars represent data for the two following months, August 2020 and September 2020. Of course, someone could object here, “We don’t have data for September yet”, but my question is: what is the level of general data literacy and competency within society? I will go even further and ask: is it ethical to show a data visualization for a short time without a proper explanation of the graph? But that is a topic for another post. Going back to our example, the conclusion that can be drawn is that the unemployment rate has decreased, whereas the truth is exactly the opposite.

However, let’s play devil’s advocate and consider whether we can take a creative approach to presenting a timeline. As I mentioned above, human eyes are used to interpreting a timeline from left to right, so it is good to keep that order. Sometimes we are tempted to change it because, for example, we would like to compare year-over-year change and use last year’s data as a benchmark. However, that way of presenting data will not be intuitive for receivers. We must be very careful when we are dealing with data associated with time.

How to fix it?

So how can we fix this visualisation?

First of all, let’s break the years into two separate columns and put time in the proper order. By adding columns for years, we clearly indicate that we are dealing with two different points in time. A title or subtitle can help emphasise that we are presenting a comparison between time points (July 2019 to July – June 2020), so don’t hesitate to include one. Also, I decluttered the visualisation by removing the background colour and 3D effects, which helps receivers focus only on the data. To highlight the most recent bar, I changed its colour to orange.

All those changes make it possible to present the data story professionally and properly. Apart from all the aesthetic aspects, data visualisation designers need to remember ethics. As in other professions, data visualisation designers have their code of conduct.
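Putting time in the proper order is easy to automate before anything is drawn. The sketch below is only an illustration: the dates mimic a newest-first layout, the values are placeholders rather than real unemployment figures, and the function names are hypothetical.

```python
from datetime import date


def chronological(points):
    """Return (period, value) pairs sorted so that time runs left to right."""
    return sorted(points, key=lambda point: point[0])


def is_left_to_right(points):
    """True when every period is later than the one drawn just before it."""
    periods = [period for period, _ in points]
    return all(earlier < later for earlier, later in zip(periods, periods[1:]))


# Placeholder values listed newest-first, purely for illustration.
as_drawn = [
    (date(2020, 7, 1), 3.0),
    (date(2020, 6, 1), 2.0),
    (date(2019, 7, 1), 1.0),
]
fixed = chronological(as_drawn)
```

A check such as `is_left_to_right` can run as a guard before a chart is published. It does not decide how the years are grouped or labelled, but it catches a reversed axis.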

Use the force of tables, but choose wisely.

“You must unlearn what you have learned”, said Master Yoda. “Tables are not visuals!” Have you ever heard that?

Nothing could be more wrong. Tables are a very powerful tool for visualizing data if you use them wisely. The main advantage of tables is the ability to present several measures for the same category in one row. This allows your audience to make quicker decisions because all important information is “on the table”.

However, the human brain READS a table. It goes back and forth many times to understand the table’s content. So, to make understanding easier, some additional elements should be introduced into tables. After all, we don’t want to overload the lazy brains of our audience. Let’s see how we can improve tables to make them more accessible to people.

What makes the bottom table better than the one at the top? There are several points, which I’m going to address. You should have already noticed the titles. Titles by themselves make a huge difference.

Flat table

This table is simply flat. All information is at the same level, which means that every element attracts your attention equally. Nothing is highlighted except every second row… which is unnecessary. Well, it’s hard to read, right? There are more sins: small fonts, cluttering elements such as lines, grey backgrounds, and unformatted values.

Meaningful table

In this table, I’ve introduced an information hierarchy by using different font colours. Row and column headers are in the background. Values have a darker, bold font. What is more, visual elements are added: bars differentiate revenue volume, RAG icons convey the message about target realization, and arrows indicate the direction of the year-over-year change. Column headers clearly describe the column content, and the column order leads the reader through the information by importance.

Spaghetti Monster. Visualising multicategories.

When I say “multicategories”, I mean more than 4 categories. Sometimes the challenge of visualizing multicategories is like the old Polish proverb “eat a cookie and have a cookie”, which is hard to put into practice. I often observe how data analysts approach this challenge. Common scenarios involve products, countries, businesses, departments, teams, agents, or cost centres. For all these data, they try to find meaningful insights by depicting patterns and highlighting interesting points… mostly on one chart. That visual decision creates a beautiful piece of abstract art rich in colours, shapes, different sizes of objects, patterns, and crossing lines.

Can you imagine how determined and persistent someone must be to look at and try to understand a column chart with 15 categories presented over a 5-year horizon? Which categories have an upward or downward trend? Which one is leading? And, above all, which one is which?

When I think about “multicategories”, the first association that pops into my head is “clutter”. Clutter is one of the greatest factors of cognitive overload. To understand the impact of clutter, imagine that you are trying to talk to a friend in a crowded space like a bar. You are all ears trying to follow her or him, and even then you are not successful. Your brain makes the same effort when it is exposed to a flashy visualization.

So how to overcome this challenge?

Both visualisations present the same information:

  • trends over time,
  • product comparison,
  • the best and the worst-selling products.
Spaghetti Monster
Clear view

Doesn’t it look like a Spaghetti Monster? You rummage with a fork to find a juicy bit of meat. Decoding information from this visualisation (a line chart) is similar: it costs a lot of effort and time. Our brain decodes one piece of information, e.g. line colours, stores it in memory, compares the positions of the lines on the chart, looks for the trend of each line, and goes back and forth through the chart to make sense of it.

In the second approach, I show a different strategy for presenting the same data set. Splitting the information into two visualisations gives clarity and makes it possible to draw a conclusion. The left chart enables the receiver to compare sales amounts between products and memorise them easily. The right chart shows each product’s trend in comparison to average sales. In this approach, we guide the audience through the data. We draw their attention to important points. We don’t leave them alone, hoping that they will draw a conclusion on their own.
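The data preparation behind such a split view can be sketched in a few lines of Python. This is only my guess at the calculation, with hypothetical function names and made-up numbers: one helper ranks products by total sales for the comparison chart, the other builds the average series that each product’s trend is drawn against.

```python
def rank_products(sales_by_product):
    """Product names ordered from best-selling to worst-selling (by total)."""
    totals = {name: sum(values) for name, values in sales_by_product.items()}
    return sorted(totals, key=totals.get, reverse=True)


def average_series(sales_by_product):
    """Per-period average across all products, used as the benchmark line."""
    series = list(sales_by_product.values())
    if not series:
        raise ValueError("at least one product is required")
    if len({len(values) for values in series}) != 1:
        raise ValueError("all products must cover the same periods")
    return [sum(period) / len(series) for period in zip(*series)]


# Made-up sales for three products over three periods.
sales = {
    "A": [10, 20, 30],
    "B": [30, 30, 30],
    "C": [20, 10, 0],
}
ranking = rank_products(sales)      # feeds the comparison chart
benchmark = average_series(sales)   # feeds each product's trend panel
```

Each small trend panel then needs only two lines, the product itself and `benchmark`, instead of every product at once.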

We are the data storytellers.

Let’s start from 0

I came up with the idea for this article during the last webinar, which I had the pleasure of conducting with my coworkers. One of the participants noticed the starting point of the line chart I presented: the axis did not start at “0”. He referred to the famous book by Alberto Cairo, “How Charts Lie”, and commented that the line chart should have started at 0.

There is no doubt that a bar chart should ALWAYS start at 0. Bar charts encode data by length. Over thousands of years of evolution, people have developed the ability to compare objects in terms of length. Thanks to that, they were able to estimate how high food hangs on a tree branch, or size themselves up against an enemy to decide whether to fight or escape. Placing the starting point at a non-zero value skews the data and misleads our audience because, unconsciously, the first thing they do is compare the lengths of the bars.

Of course, we can label bars and axes properly. The real crime (in such a case) would be to switch off the Y-axis, which I observe from time to time. But even then there is cognitive dissonance in our brain: the numbers don’t reflect the lengths and proportions. Lengths and proportions are what our brain will remember, because numbers are quite a fresh phenomenon for it.

Let’s compare the examples below for bar and line charts with zero and non-zero starting points and check what consequences each might have for the interpretation of data.

Skewed Y-axis & Bar Chart

To avoid a heart attack in the near future and stay fit, a commonly cited goal (often attributed to the WHO, the World Health Organization) is 10 000 steps per day. There are plenty of apps that can track your daily physical activity. The charts above present my recent results for the same range of dates. On the left chart, a proper baseline at 0 is applied. All daily results fluctuate around the daily goal. Within a second, the level of dopamine in my blood pumps up as I look at the bars reaching the daily target.

The right chart doesn’t give me a reason to be proud of myself at first glance. First, my brain notices the gaps between the bars and the target line. And OMG, twice I took almost no steps! If you don’t notice the Y-axis labels, you can interpret this chart that dramatically. Worse, if you only had a chance to see it for a few seconds, you would probably come to exactly that conclusion. Your brain wouldn’t have time to notice the Y-axis labelling. But twice I exceeded the target by more than double. Awesome! All of it WRONG.
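The distortion can be expressed as simple arithmetic: a bar’s drawn length is its value minus the baseline, so the ratio the eye sees changes as soon as the baseline moves away from 0. The step counts and the 8 000 baseline below are hypothetical numbers chosen for illustration, and `visual_ratio` is a name I made up.

```python
def visual_ratio(value_a, value_b, baseline=0.0):
    """Ratio of the drawn bar lengths for two values on a given baseline."""
    length_b = value_b - baseline
    if length_b <= 0:
        raise ValueError("value_b must lie above the baseline")
    return (value_a - baseline) / length_b


# Hypothetical days: 12 000 steps versus 9 000 steps.
true_ratio = visual_ratio(12_000, 9_000)                   # about 1.33
skewed_ratio = visual_ratio(12_000, 9_000, baseline=8_000)  # exactly 4.0
```

On a zero baseline the better day looks about a third taller, which matches the data. On a baseline of 8 000 the same bar is drawn four times as tall, so the picture tells a story the numbers do not support.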

Skewed Y-axis & Line Chart

Line charts are a different situation. There is no length to compare. There are only slopes and positions. In this case, context and narration play first fiddle.

These line charts present the same data set. From the chart on the left, we can take a similar story: the performance is almost aligned with the target. However, looking at the right chart, our brain doesn’t make automatic assumptions about lengths because there are no lengths. We see connected dots.

And now the question: does a non-zero axis skew data on a line chart or not?

There is a discussion around it. A non-zero baseline, even though there are no bars to compare, can still mislead the audience by presenting tiny hills as steep mountains. However, in some cases, with a particular purpose in mind, it can be the best option. A non-zero axis on line charts is good for presenting minor fluctuations or changes in phenomena such as stock exchange rates or product quality tracking (production series), and especially for tracking performance within companies, where even small changes can have a huge impact.

In our scenario: well, to pat myself on the back, I would choose the bar chart with the “0” baseline, but to be able to monitor my daily results in detail, I would definitely choose the line chart with a non-zero baseline.

Key Ingredient of Compelling Stories. The Power of Context.

Seneca said, “We are more often frightened than hurt, and we suffer more from imagination than from reality.” Imagination is a powerful weapon. Designing compelling data visualizations that sell stories can put the human imagination to work.

To make it happen, context is the key player. Without context, it is hard to understand the presented numbers or outcomes. The human brain always seeks comparisons to create a meaningful picture of the world. In this article, I would like to talk about how we can add context when presenting the behaviour of a phenomenon over time.

In my experience, I often see a single line of e.g. revenue, sales, costs, or number of claims presented on a line chart. However, without a properly highlighted background, it’s hard to say whether what we see is positive or negative, whether the change is for better or worse. With additional information, the message is strengthened and a thoughtfully crafted story can be told.

This approach is especially important when the report supports the decision-making process. Quick business insights are easier to reveal when the decision-maker can benchmark the presented data against thresholds.

Let’s check how different stories can be told. On this chart, we can see a single line representing the revenue of company X. Analytical eyes will see the downward trend over time. However, this observation may not be so clear to people whose skills are other than analytical.

The first story can be about a decline in revenue over the last two years. The decline can be depicted with an added trend line. In a real scenario, it would be good to highlight the specific points in time that caused this change.

The second one can focus on now and then. Comparing two time periods, the current and the last year, helps to see the magnitude of change. However, it’s good to remember that in such a visualization the trend over the longer period is lost.

The last one doesn’t emphasize changes over a longer time at all. It just presents performance vs. budget and directs the audience’s attention to “here and now”.

In conclusion, there are three different contexts for the same dataset, each of which changes the perspective on the data. Frankly speaking, combining these three perspectives gives an insightful story of the revenue condition.
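Each of the three contexts boils down to one small calculation. The Python sketch below shows them side by side under my own assumptions: the trend is an ordinary least-squares slope over equally spaced periods, and the function names are hypothetical, not taken from any reporting tool.

```python
def trend_slope(values):
    """Least-squares slope of the values against their position (per period)."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two periods are required")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    numerator = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    denominator = sum((i - mean_x) ** 2 for i in range(n))
    return numerator / denominator


def percent_change(current, previous):
    """Change of the current period relative to the previous one, in percent."""
    if previous == 0:
        raise ValueError("previous value must be non-zero")
    return (current - previous) * 100 / previous


def budget_variance(actual, budget):
    """Positive when the actual result is above budget, negative when below."""
    return actual - budget
```

For a made-up revenue series of 10, 8, 6, 4 the slope is -2 per period (the trend story), a drop from 100 to 90 is a change of -10% (the now-and-then story), and an actual of 90 against a budget of 100 gives a variance of -10 (the here-and-now story).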

Numbers with Human Face

Recently, I’ve taken part in a discussion about how to present numbers to convey a message about true people stories.

We often forget that these are not only numbers. Each number represents a human being, his/her tragedy and tragedy of her/his relatives.

Statistics often show numbers, percentages of populations, rises, falls and trends. Depicting the context and telling the story behind datasets is a huge challenge and takes effort, especially when we try to depict in numbers a phenomenon such as #COVID-19. We have to remember that “confirmed cases” are real people who have been diagnosed with coronavirus. The number of deaths is the number of people who lost their lives because of this disease.

Every day, we are exposed to numerous statistics in the media, workplaces and schools. They describe current situations and local and global events, such as car accidents, infant mortality or unemployment. Most of them are expressed as a ratio or a percentage. These formats are not intuitive, and most people find them hard to interpret. However, there are some methods that connect numbers with people. Maybe not with individuals, but with countable human beings, with whom we can empathize.

KPI approach

A good example is the unemployment rate, which is one of the most important economic indicators. In governmental statistics, unemployment is presented as the ratio of unemployed people to all people who can work.

An unemployment rate expressed as a percentage does not stir any emotions in most of us. Most of us understand what we see, but … it is nothing personal. Percentages are abstract objects. They describe some indefinite part of the population somewhere across the country.

As studies show, we can transform this message in a way that evokes people’s feelings and makes them take a more human perspective. Instead of an abstract 20%, we can say that 1 out of 5 people is unemployed. Each of us can count to five. Each of us can easily list five people. Behind this number, people’s faces may stand. In such a small group of people, our neighbour or a family member may be out of work. This is no longer an abstraction but a very real threat.
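
The conversion itself is simple arithmetic. Here is a small sketch in Python; the function name `one_in_n` is hypothetical, and the rounding means the result is only an approximation for rates that do not divide evenly (30% becomes roughly 1 out of 3).

```python
def one_in_n(rate):
    """Rephrase a rate between 0 and 1 as an approximate '1 out of N people'."""
    if not 0 < rate <= 1:
        raise ValueError("rate must be greater than 0 and at most 1")
    n = round(1 / rate)  # nearest whole group size
    return f"1 out of {n} people"


print(one_in_n(0.20))  # the 20% example from the text
```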

Human approach

The Nature of the Phenomenon. Linear vs. logarithmic scale

One dataset, two charts, two opposite stories.

The chosen scale has a huge impact on how we digest and interpret the presented data. The linear scale shows numbers as they are, so we can easily compare them. The logarithmic scale is not intuitive for us. It’s a mathematical concept, which we can use when we want to describe multiplicative factors or when the data is heavily skewed towards large numbers. We need to use brainpower to understand it. What is more, we are so used to the linear scale that we can easily overlook the fact that a visual is drawn on a logarithmic one. We should inform our audience that a logarithmic scale is used… and make sure that they understand how to read it.
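
To make the idea of multiplicative factors more tangible, here is a minimal sketch in Python. It uses only the standard `math.log10` function, and the list of values is an arbitrary illustration. On a linear scale, the distance between consecutive values keeps growing; on a base-10 logarithmic scale, each tenfold increase takes the same distance.

```python
import math

# Arbitrary example values, each ten times larger than the previous one.
values = [1, 10, 100, 1000, 10000]

# Distance between neighbours on a linear axis: the plain difference.
linear_steps = [b - a for a, b in zip(values, values[1:])]

# Position on a base-10 logarithmic axis, and the distance between neighbours.
log_positions = [math.log10(v) for v in values]
log_steps = [b - a for a, b in zip(log_positions, log_positions[1:])]

print("Linear steps:", linear_steps)
print("Log steps:", log_steps)
```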

Because of COVID-19, a huge amount of statistics is generated and published across the internet. Those statistics try to tell a story about the COVID-19 phenomenon. Most of them focus on the number of confirmed cases and deaths. I have noticed two data visualisation trends in presenting data about this virus. The first one concentrates on the growth of the total number of confirmed cases and the second one on the pace at which the disease spreads.

Let’s feel the difference.

“PANIC chart”: I saw this name for such a linear chart somewhere, and I couldn’t agree more. Tell me, what feelings does this chart evoke in you?

This is an exponential chart (another mathematical concept), which depicts the growth of the phenomenon. Very rapid growth, to be specific.

Below we can see the same data, this time plotted on a different scale. Please look carefully. Each gridline represents a power of 10 (10 to the power of n). Don’t you think that the chart below is less scary?

What stories do these two charts tell us?

Let’s compare them around two dates, 18th of March and 4th of April. The linear chart tells us that nothing spectacular happened until 18th of March. The logarithmic one says the opposite: this is where we can see the fastest growth of confirmed cases. Between 18th of March and 4th of April, the linear chart shows huge growth, while on the logarithmic one the pace of growth decelerates. After 4th of April, the linear chart continues to show the same pace of growth (a steep hill), but on the logarithmic chart it is plain to see that the curve flattens.
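
The same contrast can be reproduced with a few numbers. The sketch below is written in Python and uses an invented series of cumulative cases, not real COVID-19 data. The absolute increases (what the linear chart shows as steepness) grow and then stay high, while the increases in log10 (what the logarithmic chart shows as steepness) shrink as the growth factor falls.

```python
import math

# Invented cumulative case counts at equal time intervals (not real data).
cases = [100, 400, 1600, 4800, 9600, 14400]

pairs = list(zip(cases, cases[1:]))

# Steepness on the linear chart: absolute increase per interval.
absolute_increase = [b - a for a, b in pairs]

# Growth factor per interval: how many times larger the count became.
growth_factor = [b / a for a, b in pairs]

# Steepness on the logarithmic chart: increase in log10 per interval.
log_increase = [math.log10(b) - math.log10(a) for a, b in pairs]

for inc, factor, log_inc in zip(absolute_increase, growth_factor, log_increase):
    print(f"+{inc:>5} cases | x{factor:.1f} | log10 step {log_inc:.2f}")
```

In the last two intervals the linear chart would keep the same steep slope, while the logarithmic chart would already be flattening.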