Nov 26, 2017
 

November 25, 2017

Special thanks to Jeffrey Shaffer, Andy Cotgreave, and Rody Zakovich for feedback that helped improve the dashboard that appears at the end of this post.

Overview

It seems I’m not the only person who has been thinking about stacked bar charts (see posts from Cole Nussbaumer Knaflic, Jonathan Schwabish, and Andy Cotgreave).

My problem with these charts, and their first cousin, the area chart, is that many of the people who design them don’t understand the pitfalls and end up creating charts that are attractive but that don’t convey much useful information.

In this blog post we’ll see examples of where stacked bar and area charts work, where they fail, and what you can do to add some functionality to your dashboards so that if you do use stacked bar and area charts they will work better.

The people who market data viz tools love these charts

Some of the chief culprits are the data visualization vendors themselves, who sometimes fashion “screaming cat” visualizations like these for their marketing materials and promotions.

Figure 1 — Sample dashboard Tableau uses to showcase its extensions API.

Figure 2 -- Microsoft PowerBI dashboard.

Figure 3 -- Area chart from Tableau's home page

I’ll admit the last one looks particularly cool, but do you have any inkling what it’s trying to show you?

Before we get into exactly what’s wrong with the charts (and how to fix them) let’s look at a couple of examples that work very well.

Some good examples

Here’s an example from The Big Book of Dashboards’ Complaints dashboard.

Figure 4 -- A portion of the Complaints Dashboard showing open, closed, and overall complaints (Dashboard by Jeffrey Shaffer).

With this chart it’s very easy for me to see the total number of complaints (the overall length of the blue and red bars combined) and to compare the number of open complaints (red bars). That’s because there are only two colors, and the two things I want to compare, open complaints (red) and total complaints (red plus blue), share a common baseline.

Another example comes from Matt Chambers’ Mayweather vs. McGregor fight analysis dashboard.

Figure 5 -- Stacked bar chart comparing overall punches and punches that landed between Mayweather and McGregor

You should check out the complete dashboard, but this stacked bar chart gets to the heart of why Mayweather won the fight: McGregor exerted more effort in launching 430 punches vs. Mayweather’s 320, but far fewer of McGregor’s punches landed (111 to 170).

As you consider why this chart is so effective notice that we only care about two things — the punches that landed and the total number of punches.

So, why do I like these two examples, but cite the earlier dashboards as “screaming cats”? It has to do with how many segments there are, and which segment is along the baseline.

Let’s explore a bit.

Understanding the strengths and weaknesses of stacked bar charts

Consider the chart shown below.

Figure 6 -- Typical stacked bar chart. We can make accurate comparisons of overall and of the first category (Central), but nothing else.

I can see that Phones has more sales overall (1), that Chairs is the biggest seller in the Central region (2), and that Bookcases is the lowest seller in the Central region (3). If that’s all that is important then we may be all done here (although it is hard to see that the Bookcases segment is in fact narrower than the Machines segment… more on that in a moment).

But suppose I want to know what the three lowest-selling categories in the Central region were, or I want to easily compare sales in the East or West? In these cases this visualization isn’t much help, and *that’s* the biggest problem with stacked bar and area charts: you can only accurately compare overall values and the one region that hugs the baseline.
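
To make the baseline problem concrete, here is a minimal Python sketch (not part of the Tableau workbook) that computes where each segment of a stacked bar starts. The sales figures and the names `sales`, `regions`, and `segment_starts` are invented for illustration.

```python
# Hypothetical sales figures, invented for illustration only.
sales = {
    "Phones":    {"Central": 70, "East": 100, "South": 60, "West": 100},
    "Chairs":    {"Central": 85, "East": 95,  "South": 45, "West": 100},
    "Bookcases": {"Central": 25, "East": 45,  "South": 10, "West": 35},
}
regions = ["Central", "East", "South", "West"]  # stacking order, baseline first


def segment_starts(row, order):
    """Return {region: position where that region's segment begins}."""
    starts, running = {}, 0
    for region in order:
        starts[region] = running
        running += row[region]
    return starts


starts = {cat: segment_starts(row, regions) for cat, row in sales.items()}
totals = {cat: sum(row.values()) for cat, row in sales.items()}

for cat in sales:
    print(cat, starts[cat], "total:", totals[cat])
```

Every Central segment starts at 0, so those segments can be compared by length alone. The East, South, and West segments start at a different position in every bar, which is why they are hard to compare by eye.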

Adding functionality — sorting and focus

Let’s address the “what were the three lowest sellers in the Central region?” question first. One way to do this would be to have a widget on your dashboard that allows you to sort by both total sales and by sales for a particular region. Here’s what the sort would look like for the Central region.

Figure 7 -- Bars sorted by Central region. Now it's easy to see which were the top and bottom sellers in that region.

Ah, now we can easily answer the question “what were the bottom three sellers in the Central region?” They are Accessories, Machines, and Bookcases.

This is great if all you care about is the Central region, but suppose you wanted to compare sales in the South? With the chart configured as above this is very difficult, but if you add a “widget” that allows your audience to select a region to focus on, the chart can easily answer the question.

Figure 8 -- Adding some functionality to the visualization so the audience can move a selected region to the baseline and sort by that region.

The “Focus on” parameter allows the user to select which region gets placed along the baseline, and the “Sort Bar Chart by” parameter allows the user to sort either by the selected region or by overall sales.
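
The logic behind the two parameters can be sketched in a few lines of Python. This only illustrates the idea and is not the set of calculated fields used in the workbook; the function name, argument names, and sales figures are all hypothetical.

```python
# Hypothetical sales figures, invented for illustration only.
sales = {
    "Phones":    {"Central": 70, "East": 100, "South": 40, "West": 120},
    "Chairs":    {"Central": 85, "East": 95,  "South": 45, "West": 100},
    "Bookcases": {"Central": 25, "East": 45,  "South": 10, "West": 35},
}
regions = ["Central", "East", "South", "West"]


def focus_and_sort(sales, regions, focus_on, sort_by="selected"):
    """Return (stacking order, category order) for the chart.

    focus_on: the region to place along the baseline.
    sort_by:  "selected" sorts by the focused region, "overall" by the total.
    """
    if focus_on not in regions:
        raise ValueError(f"unknown region: {focus_on}")
    # The focused region goes first; the others keep their original order.
    order = [focus_on] + [r for r in regions if r != focus_on]

    def selected_value(category):
        return sales[category][focus_on]

    def overall_value(category):
        return sum(sales[category].values())

    keys = {"selected": selected_value, "overall": overall_value}
    if sort_by not in keys:
        raise ValueError(f"unknown sort option: {sort_by}")
    categories = sorted(sales, key=keys[sort_by], reverse=True)
    return order, categories


print(focus_and_sort(sales, regions, "South", "selected"))
print(focus_and_sort(sales, regions, "South", "overall"))
```

Because the region list is passed in rather than written into the function, the same logic would work if a fifth region showed up in the data.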

But, if what we’re interested in is showing how one region compares with itself and overall, why bother to have the other regions as different colored bars?  That is, why not make the two things we care about — overall and the region in question — stand out more?

Highlighting the selected region

My fellow author Jeff Shaffer suggested I add this functionality to the visualization and I think it’s a terrific addition. Let’s see how much easier it is to focus on the two main questions (overall and the Selected region) when we mute the colors that aren’t stacked along the baseline.

Here are the results when we sort by the selected region.

Figure 9 -- Stacked bar chart with muted colors sorted by Selected region.

And here are the results when we sort by overall sales.

Figure 10 -- Stacked bar chart with muted colors sorted by overall sales.

What about Area charts?

You’ll need to address the same issues with area charts: you can only make accurate comparisons for totals and for the segment that hugs the baseline, as shown below.

Figure 11 -- Area chart showing sales over time. Note that we can compare overall sales and sales in the West as there is a common baseline.

Note that because we are not including the product sub-categories the sorting feature is not needed.

100% stacked bar charts

Whereas the regular stacked bar chart allows you to make accurate comparisons of overall sales and one region at a time, a 100% stacked bar chart will allow you to accurately compare the two outer regions, as we can see below.

Figure 12 -- 100% stacked bar chart. We can compare the outer regions (Central and West) because there is a common baseline

But we can’t accurately compare the inner regions:

Figure 13 -- 100% stacked bar chart. We cannot compare the inner regions (East and South) because the elements are floating; there isn’t a common baseline.
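
The same point can be checked numerically. Below is a small Python sketch, using made-up numbers and a hypothetical `percent_segments` helper, that converts each bar to percentages and reports where every segment starts and ends.

```python
# Hypothetical sales by region for two categories, invented for illustration.
bars = {
    "Phones": {"Central": 50, "East": 50, "South": 50, "West": 50},
    "Chairs": {"Central": 10, "East": 30, "South": 40, "West": 20},
}
regions = ["Central", "East", "South", "West"]  # stacking order, left to right


def percent_segments(row, order):
    """Return {region: (start, end)} as percentages of the bar's total."""
    total = sum(row[r] for r in order)
    spans, running = {}, 0.0
    for region in order:
        share = 100.0 * row[region] / total
        spans[region] = (running, running + share)
        running += share
    return spans


spans = {cat: percent_segments(row, regions) for cat, row in bars.items()}
for cat, segs in spans.items():
    print(cat, segs)
```

Central always starts at 0 and West always ends at 100, so each of those two has a fixed edge to compare against. East and South start and end at different positions in each bar.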

Give the dashboard a try

The dashboard below allows you to explore the functionality discussed in this post. Please note that I’m not suggesting you should include all the widgets in the dashboard. Indeed, maybe this is something you use on your own to help curate interesting findings in your data that you then highlight in a presentation or using Storypoints.

As for how to build all this functionality into Tableau, if you download the workbook and look under the hood you’ll see there’s nothing terribly complicated going on (indeed, there isn’t one LOD calc). That said, my solution is not very robust: it’s hard-coded to only show the four known regions that are currently in the data set. I’m sure with a bit more effort one could fashion something extensible, but for this blog post I wanted to prototype the functionality, not craft a robust solution.

Parting thoughts: Do make sure to check out this post where Rody Zakovich applies a different approach to looking at overall and segmented sales for individual customers.

 

Jan 11, 2016
 

Overview

I spend a lot of time with survey data, and much of this data revolves around gauging people’s sentiments and tendencies using either a Likert scale or a Net Promoter Score (NPS) style question.

Examples

Here’s an example of gauging sentiment using a 5-point Likert scale.

Indicate how satisfied you are with the following:

00_Grid1

Here’s an example of measuring tendencies, using a 4-point Likert scale.

How often do you use the following learning modalities?

00_Grid2

So, what’s a good way to visualize responses to these types of questions?

Over the past ten years I’ve spent thousands of hours working on the best ways to show how opinions and tendencies skew one way or another. I have found that in most cases a divergent stacked bar chart helps me (and more importantly, my clients) best see what’s going on with the survey responses.

In this blog post we’ll

  • See an example of a divergent stacked bar chart (also called a staggered stacked bar chart)
  • Work through a data visualization improvement process
  • Show how to visualize different scales (e.g., NPS, Top 3/Bottom 3, 5-point Likert, etc.)
  • Show sentiment and tendencies over time
  • Present a dashboard that will allow you to experiment with different visualization approaches

Note: for step-by-step instructions on how to build a Likert-scale divergent stacked bar chart in Tableau, click here.

Divergent Stacked Bar vs. 100% Stacked Bar

Readers of my newsletter and folks visiting the web site may have seen my redesign of a New York Times infographic that showed the tendencies of politicians to lie or tell the truth.  Here’s the 100% Stacked Bar chart that appeared in the New York Times.

Figure 1 -- 100% stacked bar chart.

Here’s the redesign using a divergent stacked bar chart.

Figure 2 -- Divergent stacked bar chart.

With both the 100% stacked bar chart and the divergent stacked bar chart the overall length of the bars is the same, but with the divergent approach the bars are shifted left or right to show which way a candidate leans. I, and others I’ve polled, find that shifting the bars makes the chart easier to understand.
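
Here is one way the shift could be computed, as a minimal Python sketch. It assumes a scale with no neutral category, so the boundary between the negative and positive responses sits exactly at zero; the function name and the percentages are hypothetical, and this is not necessarily how the charts in this post were built.

```python
def divergent_segments(shares, negative, positive):
    """Place segments so the negative/positive boundary sits at zero.

    shares:   {category: percent of responses}
    negative: negative categories, ordered from most to least extreme
    positive: positive categories, ordered from least to most extreme
    Returns {category: (start, end)} on an axis centered on zero.
    """
    # Shift the whole bar left by the combined size of the negative responses.
    start = -sum(shares[c] for c in negative)
    spans = {}
    for category in list(negative) + list(positive):
        spans[category] = (start, start + shares[category])
        start += shares[category]
    return spans


# Hypothetical response percentages for one survey item.
shares = {"Never": 10, "Rarely": 20, "Sometimes": 30, "Often": 40}
spans = divergent_segments(shares, ["Never", "Rarely"], ["Sometimes", "Often"])
print(spans)
```

The bar is still 100 units long, just as in the 100% stacked version, but it now runs from -30 to 70, so the lean toward the positive side is visible at a glance.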

How We Got Here — Likert Scale Improvement Process

Consider the table below that shows the results from a fictitious poll on the use of various learning modalities.

Figure 3 -- Table with survey results.

I can’t glean anything meaningful from this.

What about a bar chart?

Figure 4 -- Likert scale questions using a bar chart. Yikes.

Wow, that’s really bad.

What about a 100% stacked bar chart?

Figure 5 -- 100% stacked bar chart using default colors.

Okay, that’s better, but it’s still pretty bad as Tableau’s default colors do nothing to help us see tendencies that are adjacent. That is, “Often” and “Sometimes” should have similar colors, as should “Rarely” and “Never.”

So, let’s try using better colors…

(…and don’t even think about using red and green.)

Figure 6 -- 100% stacked bar chart using a more appropriate color scheme.

This is certainly an improvement, but the modalities are listed alphabetically and not by how often they’re used. Let’s see what happens when we sort the bars.

Figure 7 -- Sorted 100% stacked bar chart with good colors.

It’s taken us several tries, but it’s now easier to see which modalities are more popular.

But we can do better.

Here’s the same data rendered as a divergent stacked bar chart.

Figure 8 -- Sorted divergent stacked bar chart with good colors.

Of course, we can also take a coarser view and just compare Sometimes/Often with Rarely/Never, as shown here.

Figure 9 – Divergent stacked bar chart with only two levels of sentiment.

I find that the divergent approach “speaks” to me and it resonates with my colleagues and clients.

Experiments using Different Scales

A while back Helen Lindsey was kind enough to send me some data that contained responses to some Net Promoter Score questions.  Specifically, folks were asked to rate companies/products on a 0 to 10 or 1 to 10 scale.

Figure 10 -- The classic Net Promoter Score (NPS) question

We compute NPS by taking the fraction of folks that are promoters (i.e., people who responded with a 9 or a 10), subtracting the fraction of folks that are detractors (i.e., people who responded with a 0 through 6), and multiplying by 100.
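
As a quick check of the arithmetic, here is a minimal Python sketch of the NPS calculation. It assumes a 0 to 10 scale where 9 and 10 count as promoters and 0 through 6 count as detractors; the function name and the sample ratings are made up.

```python
def net_promoter_score(ratings):
    """Return NPS: percent promoters minus percent detractors (-100 to 100)."""
    if not ratings:
        raise ValueError("need at least one rating")
    promoters = sum(1 for r in ratings if r >= 9)   # responded 9 or 10
    detractors = sum(1 for r in ratings if r <= 6)  # responded 0 through 6
    return 100.0 * (promoters - detractors) / len(ratings)


# Hypothetical responses: 5 promoters, 2 passives (7 or 8), 3 detractors.
ratings = [10, 9, 9, 8, 7, 6, 3, 10, 5, 9]
print(net_promoter_score(ratings))
```

With these ten responses, 50% are promoters and 30% are detractors, so the score is 20.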

But sometimes my clients have questions on a 10- or 11-point scale and instead want to compute the percentage of folks that responded with one of the top three boxes minus the percentage that responded with one of the bottom three boxes.

I realized that the Lindsey data set could provide a type of “sandbox” where we could experiment with different sentiment scales including NPS, Top 3 minus Bottom 3, 5-point Likert, 3-point Likert, and 2-point Likert.

Let’s look at the results of some of these experiments.

NPS

Here are two ways we can visualize NPS data.  The first shows the percentages of people that fall into the three categories.

Figure 11 -- NPS showing percentages

Here’s the same view, but with the NPS score superimposed over the divergent stacked bars.

Figure 12 -- NPS with score superimposed

NPS over Time

It turns out that divergent stacked bars are great at showing NPS trends over time.  Here’s a view using percentages.

Figure 13 -- Divergent stacked bar showing NPS over time with percentages

Here’s the same view but with the score superimposed.

Figure 14 -- Divergent stacked bar showing NPS over time with scores

Note – for some other interesting treatments of showing sentiment over time, see Joe Mako’s visualization on banker honesty.

Net = Top 3 minus Bottom 3

Let’s take the same data but divide it into the following buckets:

  • Positive = Top 3 Boxes
  • Neutral = Middle 4 Boxes
  • Negative = Bottom 3 Boxes

Here are the associated visualizations.

Figure 15 -- Top 3 / Bottom 3 with percentages

Figure 16 -- Top 3 / Bottom 3 with scores
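
The Top 3 minus Bottom 3 bucketing can be sketched in Python as follows. This assumes a 1 to 10 scale, so that the three buckets cover 3, 4, and 3 boxes; the function names and sample ratings are hypothetical.

```python
from collections import Counter


def bucket(rating):
    """Map a 1-10 rating to Positive (8-10), Neutral (4-7), or Negative (1-3)."""
    if rating >= 8:
        return "Positive"
    if rating <= 3:
        return "Negative"
    return "Neutral"


def top3_minus_bottom3(ratings):
    """Return (percent per bucket, net score = % Positive minus % Negative)."""
    if not ratings:
        raise ValueError("need at least one rating")
    counts = Counter(bucket(r) for r in ratings)
    pct = {
        name: 100.0 * counts[name] / len(ratings)
        for name in ("Positive", "Neutral", "Negative")
    }
    return pct, pct["Positive"] - pct["Negative"]


# Hypothetical responses on a 1 to 10 scale.
ratings = [10, 9, 9, 8, 7, 6, 3, 10, 5, 9]
pct, net = top3_minus_bottom3(ratings)
print(pct, net)
```

The percentages feed the bar segments and the net value is the score that gets superimposed. On a 0 to 10 scale the middle bucket would cover five boxes instead of four.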

Five, Three, and Two-Point Likert Scale Renderings

Let’s suppose that instead of asking a question on a 1 through 10 scale we instead asked folks to select one of the following five responses:

  • Strongly disagree
  • Disagree
  • Neutral
  • Agree
  • Strongly agree

Here’s the same NPS data but rendered using a five-point Likert scale.

Figure 17 -- Divergent stacked bar chart showing all responses

And here’s the same data, but divided into positive, neutral, and negative sentiments (3-point Likert).

Figure 18 -- Divergent stacked bar showing positive, neutral, and negative

Finally, here’s the same data, but only showing positive and negative sentiments (2-point Likert).

Figure 19 -- Divergent stacked bar showing just positive and negative
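
Collapsing the five responses into three or two levels is a simple regrouping. The Python sketch below uses made-up percentages and a hypothetical `collapse` helper. For the two-level view it simply drops the neutral share without rescaling the rest, which is an assumption on my part about how such a view might be built.

```python
# Which 3-point bucket each 5-point response falls into.
FIVE_TO_THREE = {
    "Strongly disagree": "Negative",
    "Disagree": "Negative",
    "Neutral": "Neutral",
    "Agree": "Positive",
    "Strongly agree": "Positive",
}


def collapse(shares, keep_neutral=True):
    """Collapse 5-point shares to 3 levels, or 2 if keep_neutral is False."""
    out = {"Negative": 0, "Neutral": 0, "Positive": 0}
    for response, share in shares.items():
        out[FIVE_TO_THREE[response]] += share
    if not keep_neutral:
        del out["Neutral"]  # dropped, not redistributed
    return out


# Hypothetical percentages for one survey item.
shares = {
    "Strongly disagree": 10,
    "Disagree": 15,
    "Neutral": 25,
    "Agree": 30,
    "Strongly agree": 20,
}
three_point = collapse(shares)
two_point = collapse(shares, keep_neutral=False)
print(three_point, two_point)
```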

Try it yourself

Below you will find a dashboard that allows you to explore different combinations of the 1 to 10 scale.

I strongly recommend you do NOT give your audience all these scaling options;  these are here for you to experiment and see how the visualizations and ranking change based on what scales you use.  The only option I would present to your audience is the ability to toggle back and forth between percentages and scores.

Jan 31, 2013
 

I spend half my time as a musician and the other half as a data visualization “scientist”. I love both professions, but one downside they share is that I cannot listen to music or glance at a chart without trying to figure out what is going on inside the music and inside the chart.

Consider this snippet from a recent NY Times / CBS Poll on Americans’ Views on Gun Control:

I was able to interpret this and all the other charts in the article quickly, but I found myself wondering if the information would read or “sing” better with a divergent stacked bar chart instead of a standard stacked bar chart.  Here’s a version I created using Gantt bars in Tableau:

I like how the divergent (or “staggered”) approach shows the skew in sentiment.

For information on how to create this type of chart, see Likert Scales: The Final Word and Masie’s Mobile Pulse Survey.

Note: I’m not able to post the workbook as I created it using Tableau 8 and I do not have access to Tableau 8 Public yet (it is in restricted beta).  As per Joe Mako’s comments below, you can find a downloadable solution at http://public.tableausoftware.com/views/firearmownership/Dashboard.