Oct 13, 2015
 

Overview

In writing about visualizing survey data using Tableau I’ve found that the number one impediment to success is getting the data “just so”, that is, in the right format. In accompanying posts I’ll explain how to get this done using Alteryx, Tableau 10.x, the Tableau Excel add-in, and the Tableau 9.x pivot feature (you can come close with 9.x, but can’t get it perfect).

What do I mean by “just so”?

When I deal with survey data there are usually four different elements that need to fit together:

  1. The demographic information (e.g., age of respondents, gender, etc.)
  2. Survey responses in text format
  3. Survey responses in numeric format
  4. Meta data that describes the survey data.

Let’s see what the four elements look like using an Excel sample data set (click here to download).

Demographic data

Here’s what the demographic data looks like.

Figure 1 — Demographic data

Survey responses in text format

Here are several columns of survey responses in text format. Column F contains data for a Yes / No / Don’t know question. Column G contains responses for a question about salary. Columns H through P are responses for check-all-that-apply questions, and columns Q and R contain Likert scale responses.

Figure 2 — Survey responses in text format

Survey responses in numeric format

Here are the same responses but in numeric format.

Figure 3 — Survey responses in numeric format

I’ll explain why it’s so useful to have the survey responses in both text and numeric format in a bit.

Meta Data (the “helper” file)

Here’s some data that I usually prepare by hand, as most survey tools won’t produce it for me automatically. Having this helps me understand the data and will greatly streamline my work in Tableau.

Figure 4 — Survey data meta data. This doesn’t take long to create and will be a huge time saver once we get the data into Tableau.

What does “just so” look like?

Our goal is to combine and reshape the various elements so that they look like this.

Figure 5 — Reshaped data joined with meta data. Survey data in this format is very easy to use with Tableau.

As I’ve written previously, the key thing is that I no longer have a separate column for each survey response.  Indeed, I’ve reduced the number of columns from 45 to just 11, but I’ve also increased the number of rows from 845 to over 25,000. That is a good thing.
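To make the "fewer columns, more rows" idea concrete, here is a minimal sketch of the reshaping step in plain Python. The column names (RespID, Gender, Q1, Q2) and the two sample rows are hypothetical and are not taken from the sample file; the tools covered in the accompanying posts do the same thing at scale.

```python
# Hypothetical wide-format rows: one row per respondent, one column per question.
# Column names and values are illustrative only.
wide_rows = [
    {"RespID": 1, "Gender": "F", "Q1": 5, "Q2": 3},
    {"RespID": 2, "Gender": "M", "Q1": 4, "Q2": 1},
]
id_columns = ["RespID", "Gender"]


def reshape_to_long(rows, id_cols):
    """Turn every non-ID column into its own row with Question ID and Value fields."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            if column in id_cols:
                continue  # demographic fields are repeated on each output row instead
            record = {key: row[key] for key in id_cols}
            record["Question ID"] = column
            record["Value"] = value
            long_rows.append(record)
    return long_rows


long_rows = reshape_to_long(wide_rows, id_columns)
for record in long_rows:
    print(record)
```

Two respondents with two questions each become four rows, which is the same trade of columns for rows described above.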

Why this works so well with Tableau

Our goal is to see how to use Alteryx to get the data into this format, not to actually use the data. But if you need convincing about why the meta data is so helpful, consider the following example.

Let’s say that in your survey you ask people to indicate the importance of, and their satisfaction with, certain services, as shown here.

Figure 6 — Question comparing importance with satisfaction

With the data set up “just so”, conducting this comparison in Tableau becomes easy. First we can drag Question Grouping into Filters and indicate that we just want to look at Importance and Satisfaction questions.

Figure 7 — Using the Question Grouping field to just focus on Importance and Satisfaction questions

Then we can drag Wording and Question Grouping onto the Rows shelf which gives us the framework for comparing importance and satisfaction across ten different questions.  No more having to “look up” which questions we want to explore and no more having to alias question IDs.  I love this!

Figure 8 – The helper file meta data provides the framework for comparing questions and building visualizations.
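Outside of Tableau, the filter-and-group steps above amount to a join against the helper file followed by a filter. Here is a rough sketch in plain Python; the question IDs, wordings, and values are hypothetical, and only the field names Question Grouping and Wording come from the post.

```python
# Hypothetical helper-file entries keyed by question ID.
helper = {
    "Q10": {"Question Grouping": "Importance", "Wording": "Ability to access data"},
    "Q20": {"Question Grouping": "Satisfaction", "Wording": "Ability to access data"},
    "Q30": {"Question Grouping": "Demeanor", "Wording": "Good listener"},
}

# Hypothetical reshaped (long) survey rows.
long_rows = [
    {"RespID": 1, "Question ID": "Q10", "Value": 4},
    {"RespID": 1, "Question ID": "Q20", "Value": 2},
    {"RespID": 1, "Question ID": "Q30", "Value": 5},
]

# Join: attach the helper fields to every survey row.
joined = [{**row, **helper[row["Question ID"]]} for row in long_rows]

# Filter: keep only the Importance and Satisfaction questions.
wanted = {"Importance", "Satisfaction"}
filtered = [row for row in joined if row["Question Grouping"] in wanted]

for row in filtered:
    print(row["Wording"], row["Question Grouping"], row["Value"])
```

Because the wording travels with each row, there is nothing to look up or alias by hand.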

Why do we need both text and numeric results?

We don’t really need them, but I know I certainly want them.

Consider all of the Likert scale question results. The universe of possible values is

1
2
3
4
5

Suppose we want to know what each of the values (1, 2, 3, 4, and 5) stands for. The problem is that it depends on the question being asked: sometimes a 5 means “Strongly agree”, for other questions it means “Critical”, and for others it means “Extremely satisfied”.

Without having both numeric and text results we will have to write A LOT of IF / CASE statements and I, for one, do not want to do that.
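As a small illustration of the point, here is a sketch in plain Python in which each row simply carries its own label next to its number. The question IDs are hypothetical; the labels are the ones mentioned above.

```python
# Hypothetical answers from one respondent to two Likert questions,
# once in numeric form and once in text form.
numeric = {"Q1": 5, "Q2": 5}
text = {"Q1": "Strongly agree", "Q2": "Extremely satisfied"}

# Pair the two forms so that every row knows what its number means.
rows = [
    {"Question ID": qid, "Value": numeric[qid], "Label": text[qid]}
    for qid in numeric
]

for row in rows:
    print(row["Question ID"], row["Value"], row["Label"])
```

Both rows hold the value 5, yet no per-question IF / CASE logic is needed to say what that 5 stands for.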

So, now that we understand how and why we want the data “just so” we’ll see how to get it that way using Alteryx, Tableau 10.x, the Tableau add-in for Excel, and Tableau 9.x.

Posted on October 13, 2015 in 2) Visualizing Survey Data, Blog
Sep 23, 2015
 

Overview

I recently wrote about emotional vs. accurate comparisons and several people questioned whether the word “emotional” was appropriate.  (Several people questioned my assertions, too.  You can read their comments here.)

For this discussion I’ll use the term “engagement” in place of “emotion” and we’ll look into the challenges of creating public-facing visualizations that attract and engage, are clear and accurate, and do these things without “dumbing down” the subject matter.

Time Magazine and a cumbersome infographic

Stephen Few recently wrote a great post about the following infographic that appeared in Time Magazine in August, 2015.

Figure 1 — Time Magazine’s “Why we still need women’s equality day” infographic. See http://time.com/4010645/womens-equality-day/.

I have three major problems with this treatment.

  1. This is an important subject but the cutesy approach trivializes it.
  2. With so many chart types I have to work very hard to make comparisons among the different areas (Federal, Congressional, etc.). In addition, the chart is very long and requires a lot of scrolling.
  3. I strongly suspect that most people thought this was a dashboard having to do with Republicans and Democrats. I know that for me, whenever I see red and blue in a political context I think Republicans and Democrats and I had to fight this expectation to see that this was about men and women.

Stephen Few’s redesign

Here is Few’s redesign.

Figure 2 – Stephen Few’s clear and compact redesign.

The collection of stacked bars makes it very simple to compare across the various categories and treats an important subject with the seriousness that is warranted.

But…

Few’s treatment is rather clinical and may be a little too dry for Time Magazine.

So, is there a way to fashion a graphic that is clear and accurate, like Few’s, but does more to draw the reader in?

Alberto Cairo’s redesign

Stephen Few asked Alberto Cairo to have a look at the source graphic and Cairo was able to turn out the following in a matter of minutes.

Figure 3 — Cairo’s redesign of Few’s redesign.

Here are Stephen Few’s comments upon seeing the redesign:

“Alberto,

You’re the man! I love your improvements to the graphic.

You described your version as middle ground between my position and that of the embellishers, but I don’t see it that way. I’m an advocate of the kinds of embellishments that you added to the graphic for journalistic purposes, for they don’t detract from the information in any way. I’ve always said that journalistic infographics can be both informative and beautiful without compromising either. Doing this takes skill, however, that relatively few of the folks producing infographics possess. It also takes graphic design skill that I don’t possess, which is why I don’t design journalistic infographics. You’ve illustrated what it takes to do this well. As I said, you’re the man.”

I think Cairo would be the first to agree that there are many shortcomings to his rendering (e.g., colors, the guy on the right looks like he’s holding a boomerang and not reading a book, etc.) but remember, Cairo put this together in a few minutes simply to show that it is in fact possible to create something that is beautiful and emotionally engaging without sacrificing one pixel of analytic integrity.

 

Sep 21, 2015
 

Overview

I’ve conducted a lot of Tableau training classes and have found three things that confuse students simply because of the nomenclature Tableau uses for these things.  These three terms are

  • Headers
  • Table Calculations
  • Quick Filters

Headers

Consider the chart below that has both mark labels and an axis along the bottom.

Figure 1 — Bar chart with visible axis.

Because each bar has a label we don’t need to see the axis.  We can hide the axis by right-clicking it and selecting…

Figure 2 — Turning off the header turns off the… footer.

… Show Header.

Yes, indicating that we don’t want to display a header will make Tableau hide…

the footer!

As I explain to students, in Tableau anything that surrounds a chart is called a Header.  If it’s along the top of a chart, it’s a Header.  Left side of the chart?  Header.  Bottom?  Header.  Right side?

Header.

Table Calculations

I know the first time I saw this I thought “Table Calculations” pertained to a visualization that used text tables. As I explain to students, I think of table calculations as Tableau having the ability to do math in its head.

Consider the example below where we show the raw vote count for each candidate from the 2012 US presidential election.

Figure 3 — Bar chart based on query to the source database

Here, Tableau has queried the underlying database and is displaying the results based on that query.

With a table calculation, Tableau looks at the results that are already on display, as it were, and then does some additional internal calculations.  In the case of asking Tableau to show the percent of total, Tableau adds up the total for all three candidates and then divides the tally for each candidate by that total.

As I said, I find it helpful to think of Table Calculations as Tableau doing math in its head.
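For readers who like to see the arithmetic, here is a minimal sketch of the percent-of-total idea in plain Python. The candidate names and vote counts are made up for illustration and stand in for the query result that is already on display; this mimics the calculation, not Tableau's internals.

```python
# Hypothetical vote counts standing in for the result of the database query.
votes = {"Candidate A": 600, "Candidate B": 300, "Candidate C": 100}

# The "math in its head" step: total the displayed values, then divide each by that total.
total = sum(votes.values())
percent_of_total = {name: count / total for name, count in votes.items()}

for name, share in percent_of_total.items():
    print(f"{name}: {share:.1%}")
```

Note that nothing new is requested from the database; the second step works only with the numbers already returned.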

Quick Filters

To filter results in Tableau, you drag dimensions and measures from the Data window to the Filters card and then apply the settings you want for the various filters.

If you want easier access to the filter settings you can right-click a filter and select Show Quick Filter.

The problem with this term is that people new to Tableau think this pertains to speeding up the filter when it in fact means that you just want the filter control to be visible on a worksheet or a dashboard.  It has nothing to do with making filters quick.  In fact, having lots of quick filters on a worksheet can slow Tableau down because Tableau has to calculate what selections should appear in each of the quick filters.

The only rationale I can see for the name is that it allows you to access the settings quickly rather than having to go through the Filters dialog box.  Still, it’s quite confusing for those first learning Tableau.

Summary of confusing terms

Here’s a summary of the terms that often confuse people new to Tableau.

Term | What students think it means | What it actually means
Header | Something at the top of a chart | Anything that surrounds a chart
Table Calculation | Something having to do with text tables / cross tabs | The ability for Tableau to do math “in its head”
Quick Filters | Some setting that makes filters work faster | Make the filter control visible

What should we call these things and should Tableau rename them?

Given just how entrenched Tableau is it may be too late to change these terms, but if it’s not too late…

In the case of Show Quick Filters I would change it to Show Filter Control.

What about Table Calculations and Headers?  Got any ideas?

 

Sep 15, 2015
 

Overview

Figure 1 – Bar charts are better than pie charts are better than donut charts.  Most of the time.

As anyone who has read this blog knows I’m definitely a “bar charts are better than pie charts are better than donut charts” kind of guy, at least when you need to make an accurate comparison.

But in my classes, as I rearticulate the case against pies and donuts, I find myself wondering if there are in fact times when a pie chart might be a better choice.

Most of my data visualization work is for internal purposes so I focus on making it easy for people to make an accurate comparison.

But as my clients and I make occasional forays into public-facing visualizations I think about how to make it easy for people to make an emotional comparison.  By this I mean that I want people viewing the visualization to just “get it”.

Better yet, I want people to get it, be engaged by it, and in some cases, “feel” it.

With this in mind, in this post we’ll explore cases where

  • a pie chart is in fact as good as, if not better than, a bar chart.
  • circles and spheres do a better job conveying magnitude than do bars.
  • a waffle chart produces an emotional wallop without compromising analytic integrity.

Where a pie chart trumps a bar chart

So, it’s the year 2034 and in this somewhat dystopian future there’s a movement afoot to add an amendment to the US Constitution banning the use of pie charts.

Those of you familiar with the United States Constitution know that three-quarters of the states need to approve an amendment for said amendment to become law.  In 2034 it turns out that 39 of 50 states will in fact ratify the amendment.

Does that get us the needed 75%?  Here’s a simple, compact chart that lets us know immediately.

Figure 2 — The amendment banning pie charts passes as I can see that the “Yes” votes fill more than three quarters of the circle.

It’s so easy to see that the “Yes” votes fill more than three-quarters of the pie that I don’t need labels indicating the large slice is 78% and small slice is 22%.
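The arithmetic behind the pie is short enough to check by hand, but here it is as a quick Python sketch using the numbers from the scenario (39 of 50 states, three-quarters needed).

```python
import math

states_total = 50
states_ratifying = 39  # figure from the scenario above

# Three-quarters of 50 is 37.5, so at least 38 states are needed.
states_needed = math.ceil(states_total * 3 / 4)
yes_share = states_ratifying / states_total

print(states_needed, f"{yes_share:.0%}", states_ratifying >= states_needed)
```

The "Yes" share of 78% clears the 75% bar with one state to spare, which is exactly what the pie shows at a glance.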

Compare this with a bar chart.

Figure 3 — Did the “Yes” exceed 75%?  Without labels it’s very hard to tell.

Without labels showing the percentages I cannot tell for sure if the “Yes” bar is more than three times larger than the “No” bar.

Okay, Okay, Okay!  I know that a simplified bullet chart would work, too.

Figure 4 — A bullet chart shows that we’ve exceeded the goal.

Yes, the bullet chart makes it clear that I’ve exceeded my goal but I need to know that the goal was 75%.  I don’t need the goal line with the pie chart.

So, does this mean that it’s okay to use pie charts instead of bar charts?

No.  Based on this example it’s only okay to use a pie chart (singular).  In addition, your pie chart (singular) needs to meet the following conditions:

  • One of the slices has to make up at least 50% of the pie.
  • If your pie has more than two slices, you don’t ask people to compare the smaller slices.

Where circles and spheres do better than bars

As we all know Jupiter is big, really big.

Just how much bigger is it than Earth?

Should I create a bar chart to show this? If I were to create one should I compare the radius or the surface area of each planet?

Or should I really go nuts and compare the volume of the planets?

I don’t think the dashboard shown above is nearly as effective as the visualization shown below.

Figure 5  — “Size planets comparison” by Lsmpascal – Own work. Licensed under CC BY-SA 3.0 via Commons – https://commons.wikimedia.org/wiki/File:Size_planets_comparison.jpg#/media/File:Size_planets_comparison.jpg

Jupiter and Saturn – and even Neptune and Uranus – really dwarf Earth and the other planets and with this visualization I feel it.

Even the simple chart comparing the area of the cross section of the planets gives me a better feel for the data than does the bar chart.

Figure 6 — Circles comparing cross-section area of the planets.  Yup, I can tell that Jupiter is way bigger than Earth.

Is it essential that I can tell exactly how much larger one planet is than another?  I don’t think it is and I much prefer the emotional pull of the circles and the spheres.
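For reference, the three measures mentioned earlier (radius, area, volume) give very different answers to "how much bigger?". The sketch below assumes commonly cited mean radii of roughly 6,371 km for Earth and 69,911 km for Jupiter; treat both as approximate and the planets as perfect spheres.

```python
# Approximate mean radii in kilometres (assumed values, not from the post).
EARTH_RADIUS_KM = 6371.0
JUPITER_RADIUS_KM = 69911.0

radius_ratio = JUPITER_RADIUS_KM / EARTH_RADIUS_KM
area_ratio = radius_ratio ** 2    # cross-section and surface area both scale with r squared
volume_ratio = radius_ratio ** 3  # volume scales with r cubed

print(f"radius: about {radius_ratio:.0f}x")
print(f"area:   about {area_ratio:.0f}x")
print(f"volume: about {volume_ratio:.0f}x")
```

A bar chart has to pick one of these ratios, which is part of why the choice of measure matters so much for this comparison.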

A Fun Tangent

One thing that’s very hard to express in a static chart is how much space there is between the sun and the planets.  To get a sense of just how incredibly vast the distances are check out this fascinating, albeit somewhat tedious, interactive visualization from Josh Worth.

Getting an emotional wallop with waffles

A few weeks ago Cole Nussbaumer posted a tweet asking people what they thought of this chart from The Economist:

Figure 7 – A waffle chart from the article “Teens in Syria”.  See http://www.economist.com/blogs/graphicdetail/2015/08/daily-chart-6?fsrc=rss.

The first thing that surprises me about this is that The Economist went with a waffle chart and not a bar chart, like the one below.

Figure 8 — The type of chart I would have expected to see in The Economist.

The second thing that surprised me was that I preferred the waffle chart.  Yes, as Jeffrey Shaffer correctly points out, the dots are so tightly packed that you literally see stars between the circles, but  this can easily be remedied.  The question on my mind is why do I prefer waffles?

My answer is that having each dot represent one of the 120 people surveyed connected with me in a way that the bar chart did not. Combined with the percentage labels (which are critical to the success of the visualization) the waffle chart hit me hard and it did so without dumbing down the importance of the discussion one bit.

So, are bar charts always boring?

No!  In my next blog post I’ll show you an example of a bar chart embedded inside a “come hither” graphic that

  • attracts and engages
  • does not trivialize an important issue
  • represents the data clearly and accurately

Stay tuned.

Sep 01, 2015
 

Overview

I’ll admit that I have a problem with treemaps in Tableau, but it’s not because the chart type is in some way inferior. My problem is with how people use – and misuse – treemaps.

Here’s a good example of misuse.  Instead of displaying something straightforward that looks like this…

Figure 1 — The humble, but accurate bar chart

… some people feel compelled to add “visual variety” to their dashboards and instead create something that looks like this.

Figure 2 – Look, Ma! I made a Mondrian!

Except for the “it looks cool” factor there’s no good reason to use a treemap in this situation.

So, when should you use a treemap?

What’s in a treemap and why it can be useful

With a treemap you have two attributes at your disposal:

  1. The size (area) of rectangles, and
  2. The color of the rectangles

A treemap consists of packed rectangles where the area of a rectangle corresponds to the size of a particular measure.  In the example above the size of the rectangle is based on the number of people that come from a particular region.  North America has the largest value so it’s represented with the largest rectangle. Europe has a smaller value so its rectangle is proportionally smaller.

Treemaps really come in handy when you have A LOT of marks to plot and you need to show all of the marks in a compact area.

So, this sounds like a great chart – we’ve got rectangles to show how big and small stuff is, color to group related rectangles intelligently, and we can fit a lot of stuff in a small space.  Why not use this chart all the time?
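To show what "area corresponds to the measure" means in practice, here is a deliberately simplified layout sketch in Python. It places rectangles side by side in a single strip, which is not the packing algorithm Tableau or other tools use, and the region counts are hypothetical.

```python
def slice_layout(values, width, height):
    """Place rectangles left to right; each gets the full height and a width
    proportional to its value, so its area is proportional to the value too."""
    total = sum(values.values())
    x = 0.0
    rectangles = {}
    for name, value in values.items():
        rect_width = width * value / total
        rectangles[name] = (x, 0.0, rect_width, height)  # (x, y, width, height)
        x += rect_width
    return rectangles


# Hypothetical number of people per region.
people_by_region = {"North America": 50, "Europe": 30, "Asia": 20}
layout = slice_layout(people_by_region, width=100.0, height=50.0)

for name, (x, y, w, h) in layout.items():
    print(f"{name}: x={x:.0f}, width={w:.0f}, area={w * h:.0f}")
```

Whatever packing scheme is used, the invariant is the same: a region with half the people gets half the area.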

The downside is that we are comparing the area of rectangles and with rectangles it is difficult to make an accurate comparison. People may be very good at comparing the length of bars but as a species we are not particularly good at comparing the area of rectangles (and we’re downright awful at comparing the area of circles.)

So, given the advantages and shortcomings, just when should you use it?  Let’s look at a particular scenario.

Showing Presidential Electoral Results

A Filled Map

Consider the electoral map below showing electoral votes by state for Barack Obama and Mitt Romney in 2012.

Figure 3 — Filled map showing electoral votes for the 2012 presidential election (displaying 48 out of 50 states)

Our Electoral College system is fairly confusing and I can only imagine how somebody from outside the US would look at this as there appears to be more red on the map than blue… but the blue guy won!

This discrepancy becomes even more pronounced when we include Alaska and Hawaii in the map.

Figure 4 — Filled map showing electoral vote winners for the 2012 presidential election (displaying 50 states)

Clearly, a map designed to show how much area there is in a state fails with Electoral College results where the numbers are based on population not land mass.  In the example above there’s A LOT more red than blue, but again, the blue guy won the election.

Perhaps a different type of chart will do a better job?

Symbol Map

Here’s a symbol map of the same data.

Figure 5 — Symbol map showing electoral vote winners for the 2012 presidential election (displaying 48 out of 50 states)

I think this is more accurate as there’s clearly more blue than red, but it’s still a tough read.  What else might work?

Cartogram

Here’s a cartogram from Professor Mark Newman of the University of Michigan showing the same data, except the polygon for each state has been adjusted to reflect the population of the state.

Figure 6 — Cartogram showing election results where the shape of the state is based on its population and not land mass.

While it’s very clear that there is more blue than red on this map there are two problems with this approach:

  1. There aren’t many tools that will support this type of distortion; and,
  2. This map will frighten small children.

Summary Bar Chart

Why not just display a simple bar chart showing the total number of electoral votes, like the one shown here?

Figure 7 — Electoral vote count by candidate

This is certainly very clear and we can see easily by how much Obama won, but we’re missing an important part of the story.

In US presidential elections a winner is chosen by tallying the electoral votes from each state and the summary bar chart doesn’t show us how each state contributes to the total for each candidate.

And the Winner is… ? The Treemap!

Here’s a treemap showing the exact same data.

Figure 8 — Treemap showing 2012 electoral vote results

Of all the single visualizations I think this treemap tells the most complete story.  We can see just how much states like California, Texas, Florida, and New York contribute to the total as well as gauge —  to some degree  — just how many more electoral votes Obama received than did Romney.

One shortcoming, however, is that we can’t see the names of all the states as some of the rectangles are too small.

One way to address this is by adding a tool tip, as shown here.

Figure 9 — Hovering over a mark allows me to see the name of the state and number of electoral votes.

While this works, a problem we should address is that the small states are not easily searchable.  That is, if I want to know the results for Alaska, Hawaii, Delaware, etc., I have to go hunting for them.

At this point we’ve gotten about as far as we can get with a single chart.  To tell the complete story – and to make it easy for people to find results for a particular state – we should create a dashboard.

The Electoral Vote Dashboard

Here’s a dashboard that puts two of the views together and that allows the user to find a particular state’s rectangle by selecting the state from a list.

Figure 10 — Electoral votes dashboard.  Selecting a state from the list will display that state’s rectangle in the treemap.

While the “star” of the dashboard is the treemap, the summary bar chart and the selectable list make the story complete and we get a solid understanding of the 2012 Electoral College results.

And we achieved this without using an actual map.

Click here to interact with dashboard.

Aug 11, 2015
 

Overview

So, here’s why until recently I’ve recommended that my clients avoid large dashboards.

We’ve been working on a collection of killer dashboards and we’re all set to make a big presentation to the CEO. This thing is so high profile we get to use the executive conference room with the super bright projector and the 120-inch screen.

Our dashboards are all 1,325 x 1,000 pixels, but they’re going to look fantastic on that giant screen.

We’re incredibly well prepared.

At least we think we’re incredibly well prepared because when we arrive an hour early we discover the top resolution of that ever-so-fancy projector is 1,280 x 800 and our ever-so-well-crafted dashboards won’t fit on the screen.

It doesn’t fit! Tableau Desktop and Reader will not scale the dashboard intelligently and we end up with the dreaded scroll bars.

Yikes, we have scroll bars! What are we going to do?

And don’t suggest using Tableau’s “Automatic” dashboard setting as it will just squish the different visualizations and won’t scale the fonts.

Let your browser scale the dashboard

While Tableau Desktop and Reader cannot scale your dashboard, Tableau Public, Tableau Online, and Tableau Server — with the help of your browser — can scale the dashboard, and scale it intelligently.

For example, using Tableau Public with the  “Zoom” feature in Google Chrome…

Using Google Chrome’s “Zoom” Setting

… allows us to “fit” the dashboard on our large, but relatively low-resolution, screen.

It fits! Thank you, browser.
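If you want a starting point for the zoom setting, the scale factor follows from the two sizes mentioned above (a 1,325 x 1,000 dashboard on a 1,280 x 800 projector). This Python sketch ignores the space taken by browser toolbars and tabs, so in practice a somewhat smaller zoom may be needed.

```python
dashboard_width, dashboard_height = 1325, 1000  # dashboard size in pixels
screen_width, screen_height = 1280, 800         # projector resolution in pixels

# The dashboard must fit in both directions, so the tighter dimension wins.
zoom = min(screen_width / dashboard_width, screen_height / dashboard_height)

print(f"Largest zoom that fits: {zoom:.0%}")
```

Here the height is the limiting dimension: the width alone would allow about 97%, but the height allows only 80%.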

Conclusion

If you are presenting your work using Tableau Desktop or Tableau Reader then you either have to compose for the lowest-common-denominator screen or live with scroll bars.

If, however, Tableau Public, Tableau Online, or Tableau Server are an option, you should be able to use your browser’s zoom feature to make sure your dashboards fit on the screen.

Jun 04, 2015
 

Showing Differences between Periods and Statistical Significance in Tableau

Overview

Addressing this scenario has been the most popular request I’ve received over the past year. Here’s a summary of what my clients and students have asked:

  • How do I show the change in Sales, Percentage of Promoters, Number of Visits, etc., between this month / quarter / year, and the previous month / quarter / year?
  • How do I make it easy to see which areas of the organization had an increase this period and which had a decrease?
  • How do I make it easy to see how much greater / less this period’s numbers are than the previous period?
  • How do I determine and show if this change is statistically significant? That is, how do I apply the stat test we like to use in our organization?
  • If the change is statistically significant, is it a one-time thing or should I start hyperventilating?

This is a LOT to take on and we won’t be able to fit all of it into a single visualization.

But we can fit it into a compact dashboard.

Important Ground Rules

In the example that follows I look at the percentage of people that responded with a “9” or “10” to a survey question. That is, I am only looking at the percentage of people that selected one of the top two boxes.  I am NOT trying to see if there is statistical significance or calculate the margin of error in the change in Net Promoter Score over time.

The concepts I explore are not just for survey data; I just happen to have some good longitudinal survey data that is well-suited for seeing how to build a stat test formula in Tableau.

I hope you will indulge me and accept that “the company stat guru” has a fine reason for applying a particular statistical test to the data we’ll be analyzing. That said, you should push back on “business-as-usual” assumptions to determine if what you are visualizing and testing really is important (this is the focus of the work Stacey Barr is doing with her Measure Up blog and is the foundation for Stephen Few’s most recent book Signal.)

So, with the assumption that the particular stat test we want to apply – or any stat test, for that matter – is warranted, how do you show it and how do you build it?

Let’s first explore the working dashboard then see how to build it with Tableau.

Note: A very heartfelt thanks to Kelly Martin, Joe Mako, Vicki Reinhard, Susan Ferarri, and Tiffany Spaulding who helped vet the dashboard.  I went through many different approaches before settling on the one shown below.

A very special thanks to Jeffrey Shaffer who reviewed the blog post and asked some very good questions, and also to Helen Lindsay who provided sample data.

The data and what we want to show

The data below contains the first few rows of Net Promoter Score survey data with fields for date and role.

Figure 1 — Net Promoter Score survey data with dates and roles

For the dashboard I built I only focused on the percentage of people that were Promoters; that is, people who responded with a 9 or 10 when asked if they would recommend a product or service.
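As a rough illustration of the measure itself, here is a plain-Python sketch that computes the percentage of Promoters (scores of 9 or 10) per quarter and role. The rows are hypothetical and the field layout is an assumption; it is not the Tableau calculation used in the dashboard.

```python
from collections import defaultdict

# Hypothetical survey rows: (quarter, role, score from 0 to 10).
responses = [
    ("2013 Q1", "Nurse", 9),
    ("2013 Q1", "Nurse", 6),
    ("2013 Q1", "Doctor", 10),
    ("2013 Q2", "Nurse", 10),
    ("2013 Q2", "Nurse", 9),
    ("2013 Q2", "Doctor", 7),
]

# For each (quarter, role) pair keep [number of promoters, number of respondents].
counts = defaultdict(lambda: [0, 0])
for quarter, role, score in responses:
    counts[(quarter, role)][1] += 1
    if score >= 9:  # a Promoter answered 9 or 10
        counts[(quarter, role)][0] += 1

percent_promoters = {
    key: promoters / respondents for key, (promoters, respondents) in counts.items()
}

for (quarter, role), share in sorted(percent_promoters.items()):
    print(quarter, role, f"{share:.0%}")
```

Keeping the respondent count alongside the percentage matters later, because any significance test needs the sample size as well as the proportion.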

I decided to look at the data broken down by quarters as this particular data set didn’t lend itself to month over month comparison.  Note that the techniques we’ll see will work for any time period.

Here’s the top portion of the interactive dashboard.

Figure 2 — Top portion of dashboard.  Notice that you can change the selected period, the confidence percentage, and filter by company.

Understanding the chart

Figure 3 — The key features of the chart

Let’s review what we can glean from the chart.  We can see:

  • The percentage of promoters for a particular period, sorted by role, using a bar chart.
  • Which roles have a percentage of promoters that is greater than the previous period and which have less, using color to distinguish (blue for greater, brown for less).
  • Just how much more or less the percentage for this period is compared to the previous, using a reference line (the bar is the current period; the vertical line is the previous period).
  • Which roles showed a statistically significant increase or decrease (the red dot).

Note that the chart uses “Cotgreavian” tooltips that allow you to glean more detail for a particular role when you hover over a bar:

Figure 4 — Hover over a bar for in-depth information about the role for the current period and the previous period

So, we can see from the red dot that something is up with Lawyers, Doctors and Nurses; that is, the percent increase from the previous period for Doctors and Lawyers is statistically significant and the percent decrease for Nurses is also statistically significant.  Is this a one-time thing or a trend?

Looking at changes over time

Clicking a role or roles will display trends for that role / roles.  For example, if we select Nurse in the top chart a second chart showing percentage of promoters over time will appear, as shown here.

Figure 5 — Percentage of nurses that are promoters, over time.

The big takeaway for me is that up until the first quarter of 2013 there were very few responses, and after that there was both a consistent number of responses and a consistent decline in the percentage of nurses that were promoters.

Should you be hyperventilating because of the four-month downward trend?  That discussion is beyond this blog post, but I again encourage you to check out the work Stacey Barr is doing at her Measure Up blog as well as Stephen Few’s most recent book, Signal.

How the This Quarter vs. That Quarter Chart is Built

Let’s dig into how to build this in Tableau, starting with the top viz in the dashboard.

Figure 6 — What’s under the hood.

  1. Promoters – Current Quarter. This is the measure that drives the bars.  It’s also driving what appears on the labels.
  2. Promoters – Previous Quarter. This measure is on the Level of Detail and drives the reference lines.
  3. Greater / Less. This is a discrete measure that determines the color of the bar.

Promoters – Current Quarter

What we want is the percentage of people that were promoters for the selected quarter, the “selected” quarter being determined by a parameter that the user can control.

Specifically, we want to add up everybody that responded with a 9 or 10 for the selected quarter and divide by the total number of people that responded in that quarter.  Here’s the calculation that handles this.

SUM(

IF [Value]>=9 and DATETRUNC('quarter', [Select Period])==DATETRUNC('quarter',[Date])
then 1 else 0
END)

/

SUM(

IF DATETRUNC('quarter', [Select Period])==DATETRUNC('quarter',[Date])
then 1 else 0
END)

The translation into English is

Take the sum of

If the value from a respondent is greater than or equal to 9 and the date from the [Select Period] parameter drop-down, truncated to the start of its quarter, is the same as [Date] truncated to the start of its quarter, then 1, else 0.

Divide this by the sum of

If the date from [Select Period], truncated to the start of its quarter, is the same as [Date] truncated to the start of its quarter, then 1, else 0.
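If it helps to see the same logic outside Tableau, here is a minimal Python sketch of the calculation. The function names and the sample rows are made up for illustration; only the rule itself (a 9 or 10 counts as a promoter, and both dates are truncated to the start of their quarter before comparing) comes from the calculated field above.

```python
from datetime import date

def quarter_start(d: date) -> date:
    """Mimic DATETRUNC('quarter', d): the first day of the quarter containing d."""
    first_month = 3 * ((d.month - 1) // 3) + 1
    return date(d.year, first_month, 1)

def promoter_share(responses, selected_period):
    """Share of responses in the selected quarter with a value of 9 or 10.

    responses is an iterable of (value, response_date) pairs.
    Returns None when nobody responded in that quarter.
    """
    target = quarter_start(selected_period)
    in_quarter = [value for value, when in responses
                  if quarter_start(when) == target]
    if not in_quarter:
        return None
    promoters = sum(1 for value in in_quarter if value >= 9)
    return promoters / len(in_quarter)

# Made-up rows: four responses in Q2 2015 and one in Q1 2015.
sample = [
    (10, date(2015, 4, 3)),
    (9, date(2015, 5, 20)),
    (6, date(2015, 6, 30)),
    (8, date(2015, 6, 1)),
    (10, date(2015, 2, 14)),
]

print(promoter_share(sample, date(2015, 4, 1)))  # 0.5 (2 promoters out of 4)
```

Any date inside the quarter works as the selected period, because both sides of the comparison are truncated first.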

Not sure about the [DATETRUNC] function vs. the [DATEPART] function?  Have a look at Joshua Milligan’s excellent post explaining date values vs. date parts.

Promoters – Previous Quarter

This calculation is very similar to the calculation for the Current Quarter, except we want to find results for the quarter that occurred just prior to the selected quarter.  Here’s the calculation.

SUM(

IF [Value]>=9 and DATETRUNC('quarter', [Select Period])==DATETRUNC('quarter',DATEADD('quarter',1,[Date]))
then 1 else 0
END)

/

SUM(

IF DATETRUNC('quarter', [Select Period])==DATETRUNC('quarter',DATEADD('quarter',1,[Date]))
then 1 else 0
END)

The formula is the same except we use the DATEADD function to add an additional quarter; that is, we’re saying that we only want to find results where, when we add an additional quarter, we get a value equal to the current quarter; i.e., the previous quarter, plus one quarter, gives us the current quarter.
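The “shift the response date forward one quarter, then compare” trick can feel backwards at first, so here is a small Python sketch of it. The helper names are mine, not Tableau’s; the sketch assumes that truncating a date after adding one quarter lands on the start of the following quarter, which is what the calculated field relies on.

```python
from datetime import date

def quarter_start(d: date) -> date:
    """Mimic DATETRUNC('quarter', d): the first day of the quarter containing d."""
    first_month = 3 * ((d.month - 1) // 3) + 1
    return date(d.year, first_month, 1)

def shifted_quarter_start(d: date, quarters: int) -> date:
    """Start of the quarter that is `quarters` quarters after d's quarter."""
    start = quarter_start(d)
    months = start.year * 12 + (start.month - 1) + 3 * quarters
    return date(months // 12, months % 12 + 1, 1)

def in_previous_quarter(response_date: date, selected_period: date) -> bool:
    """True when response_date falls in the quarter just before the selected one."""
    return shifted_quarter_start(response_date, 1) == quarter_start(selected_period)

# With Q1 2015 selected, a late-December 2014 response belongs to the previous
# quarter, while a January 2015 response does not.
print(in_previous_quarter(date(2014, 12, 31), date(2015, 2, 10)))  # True
print(in_previous_quarter(date(2015, 1, 5), date(2015, 2, 10)))    # False
```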

Greater / Less

The color of the bars is determined by this discrete measure:

IF [Promoters — Current Quarter] > [Promoters — Previous Quarter] then "Greater than previous"
else "Less than previous"
END

Yes, I suppose we should have a contingency for when the percentage of promoters for the current period is the same as the previous period; I leave it as an exercise for the reader to add this functionality.

So, we’ve explained everything except … The Red Dot.

The Red Dot – Computing Statistical Significance on the Fly

Most of my clients and students are surprised to find out that you can fashion a test for statistical significance inside Tableau and it can test for statistical significance “on the fly”; e.g., you can apply filters and Tableau will recalculate based on the filter settings.

The first step is determining just how the client wants to test for statistical significance. This usually entails sending an inquiry to “the stats person” who responds with something that looks like this:

Figure 7 — Z-test formula for statistical significance

I hope your eyes aren’t glazing over, as this really isn’t very complicated; it just might look complicated if you’re not used to seeing stat formulas with square root symbols.  Here are the critical things you need to know:

p1            Percentage of promoters for the current period

p2            Percentage of promoters for the previous period

n1            Number of respondents for the current period

n2            Number of respondents for the previous period

If the absolute value of z1 is greater than or equal to 1.96, the difference between the two periods is statistically significant at the 95% confidence level.
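Written out with the symbols above, and read off the calculated fields built in the sections that follow, the test appears to be a two-proportion z-test with unpooled variance. Treat this as my reconstruction from those fields rather than a transcription of the figure:

```latex
z_1 = \frac{p_1 - p_2}{\sqrt{\dfrac{p_1\,(1 - p_1)}{n_1} + \dfrac{p_2\,(1 - p_2)}{n_2}}}
```

The numerator is the change in the percentage of promoters, and each term under the square root is one period’s contribution to the uncertainty of that change.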

So, how do we build this formula?

Slowly, and in easy-to-digest pieces.

The Dot Itself

Figure 8 — The discrete measure Z-Test Significance Dot is responsible for displaying the dot

The calculation that produces the dot is called Z-Test Significance Dot and it is defined as follows.

IF ABS([Promoters — Z-Score Quarter])>=[Confidence] THEN "•"
ELSE ""
END

This translates as

If the absolute value of [Promoters – Z-Score Quarter] is greater than or equal to the confidence parameter (currently set to 1.96, or 95%) then display a dot; otherwise, display a null string.
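Where does 1.96 come from, and what would the threshold be for other confidence percentages? Here is a short sketch using Python’s standard library. It assumes a two-sided test, which is consistent with the ABS() in the calculation; the function name is hypothetical, and the actual values stored in the [Confidence] parameter may be rounded differently.

```python
from statistics import NormalDist

def z_threshold(confidence: float) -> float:
    """Two-sided critical z value for a confidence level such as 0.95."""
    # Half of the leftover probability sits in each tail of the normal curve.
    upper_tail = (1 - confidence) / 2
    return NormalDist().inv_cdf(1 - upper_tail)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} -> {z_threshold(level):.3f}")
# 90% -> 1.645
# 95% -> 1.960
# 99% -> 2.576
```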

And just how is [Promoters – Z-Score Quarter] defined?  Let’s explore the next layer of the onion.

Promoters – Z-Score Quarter

This is defined as follows:

[Promoters — Z-Score Quarter Numerator] /

SQRT(

([Promoters — Z-Score Quarter Denom – Current] +
[Promoters — Z-Score Quarter Denom – Previous])
)

Here’s how it maps to the stat formula we saw earlier:

Figure 9 — Mapping the components of the formula to different calculated fields

So now we just need to understand the three different pieces that go into the stat function.

Promoters – Z-Score Quarter Numerator

This is very simple and refers to calculations we’ve already used.

[Promoters — Current Quarter] -
[Promoters — Previous Quarter]

Promoters — Z-Score Quarter Denom – Current

This is fairly straightforward given what we’ve already explored.

([Promoters — Current Quarter]*(1-[Promoters — Current Quarter]))
/SUM([Promoters — Current Quarter Count])

Where [Promoters – Current Quarter Count] is defined as follows.

IF DATETRUNC('quarter', [Select Period])==DATETRUNC('quarter',[Date])
THEN 1 END

So SUM([Promoters — Current Quarter Count]) is just adding up all the people that responded during the selected quarter.

Promoters — Z-Score Quarter Denom – Previous

([Promoters — Previous Quarter]*(1-[Promoters — Previous Quarter]))/
SUM([Promoters — Previous Quarter Count])

This uses the same logic as [Promoters – Z-Score Quarter Denom – Current] but instead aggregates results from the previous quarter.
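To check that the pieces fit together, here is a minimal Python sketch that mirrors the three calculated fields (numerator, current denominator term, previous denominator term). The function names and the sample numbers are made up; the arithmetic follows the fields above.

```python
from math import sqrt

def variance_term(p: float, n: int) -> float:
    """One denominator piece: p * (1 - p) / n."""
    return p * (1 - p) / n

def z_score(p_current, n_current, p_previous, n_previous):
    """Numerator divided by the square root of the two denominator pieces.

    Returns None when both pieces are zero (for example, every respondent in
    both periods was a promoter), because the ratio is then undefined.
    """
    numerator = p_current - p_previous
    spread = variance_term(p_current, n_current) + variance_term(p_previous, n_previous)
    if spread == 0:
        return None
    return numerator / sqrt(spread)

# Made-up example: 60% promoters this quarter vs. 50% last quarter,
# with 200 respondents in each.
z = z_score(0.60, 200, 0.50, 200)
print(round(z, 2))        # 2.02
print(abs(z) >= 1.96)     # True, so this role would get the dot at 95%
```

With only 100 respondents per quarter the same ten-point jump gives a z-score of about 1.43, which is why roles with few responses rarely earn a dot.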

Putting it all together

In addition to building the components in a piecemeal fashion I will often build a crosstab of all these components to see if they are working as I would expect.  Consider the crosstab shown here.

Figure 10 — Crosstab showing all the pieces that contribute to the red dot

The crosstab allows us to examine all the intermediate calculations to see how they contribute to the determining calculation in the last column.

What about the secondary chart?

So we’ve now seen how to build the top chart that shows current and previous quarters broken down by role.  How does the secondary chart – the chart that appears when you click a role or roles in the first chart – work?

Figure 11 — Percentage of promoters for Nurses over time

Here we have a dual axis chart so that we can have both a line (gray) and a circle (colored based on whether the change from the previous period is statistically significant).

In this case we have to construct all of the pieces using table calculations, but the process of putting together the different components is identical to what we saw earlier.  For example, the calculation that determines the color of the circle, [LONG_Z-Test Significance], is defined as follows.

IF ABS([LONG_Z-Score])>=[Confidence] then "Significant"
else "Not significant"
end

And [LONG_Z-Score] is defined this way:

[LONG_Z-ScoreNumerator] /

SQRT(

([LONG_Z-Score Denom Current] +
[LONG_Z-Score Denom Previous])

)
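The idea behind the over-time version is that each quarter gets compared with the one immediately before it. Here is a rough Python sketch of that pairing. It is not how Tableau evaluates table calculations, the function names are hypothetical, and the quarterly counts are made up for illustration.

```python
from math import sqrt

def z_score(p_current, n_current, p_previous, n_previous):
    """Two-proportion z-score; None when the denominator would be zero."""
    spread = (p_current * (1 - p_current) / n_current
              + p_previous * (1 - p_previous) / n_previous)
    if spread == 0:
        return None
    return (p_current - p_previous) / sqrt(spread)

def flag_changes(quarters, threshold=1.96):
    """Label each quarter by comparing it with the quarter just before it.

    quarters is a chronological list of (label, promoters, respondents).
    The first quarter has nothing to compare against, so it is skipped.
    """
    labels = []
    for previous, current in zip(quarters, quarters[1:]):
        _, prev_promoters, prev_n = previous
        label, promoters, n = current
        z = z_score(promoters / n, n, prev_promoters / prev_n, prev_n)
        significant = z is not None and abs(z) >= threshold
        labels.append((label, "Significant" if significant else "Not significant"))
    return labels

# Made-up counts: a small dip followed by a large one.
made_up = [("2013 Q1", 60, 100), ("2013 Q2", 58, 100), ("2013 Q3", 40, 100)]
print(flag_changes(made_up))
# [('2013 Q2', 'Not significant'), ('2013 Q3', 'Significant')]
```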

I also built a crosstab to see how all the pieces fit together, as shown below.

Figure 12 — Crosstab to help put together a z-test calculation for values shown over time

Conclusion

The dashboard in this blog post shows the percentage of promoters, sorted by role, for a particular quarter, compared with the percentage of promoters for the previous quarter.  Roles where the percentage difference is statistically significant are marked with a red dot. You can drill down on a particular role (or roles) and see how scores have changed over time.

While the critical visual component was showing bars and reference lines, most of the “heavy lifting” went into determining if a change was statistically significant.  The key here was to not be intimidated by a statistical formula and to build the calculations in small pieces, using crosstabs to check the work.