Aug 27, 2014


I was reading a very interesting blog post this weekend from which I learned that counseling may do little to help young people with drinking problems.  Specifically, counseling reduces the average number of drinks consumed from 13.7 drinks per week to 12.2 drinks per week.

I wondered if a visual might drive this point home, and then wondered how different media outlets would display the chart if the article were to appear in that outlet.

The Economist

Here’s what an accompanying chart might look like in The Economist.

The Economist

Actually, in The Economist the axis would appear along the top of the chart, but there’s no easy way to do that in Tableau.

USA Today

Here’s how USA Today might handle the same information.

USA Today

Special thanks to Joe Mako who helped me with data densification / padding and the masking so I would not have to resort to Excel to create the pictogram.

Fox News

Depending on whether the editorial board wanted to slant the results, here’s how the chart might appear at Fox News.

Fox News

Think I’m kidding? Check out this link.  The chart does not start at zero and the axis is hidden.
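To put a rough number on how much a truncated axis can distort this comparison, here is a minimal Python sketch. The 13.7 and 12.2 figures come from the post above; the axis baseline of 12 is purely an assumed value for illustration, not what any outlet actually used.

```python
def bar_ratio(before, after, baseline=0.0):
    """Ratio of the two bar heights when the axis starts at `baseline`."""
    return (before - baseline) / (after - baseline)

before, after = 13.7, 12.2  # drinks per week, from the post

actual_drop = (before - after) / before       # true relative reduction
honest = bar_ratio(before, after)             # axis starts at zero
truncated = bar_ratio(before, after, 12.0)    # hypothetical axis starting at 12

print(f"Actual reduction: {actual_drop:.1%}")
print(f"Bar height ratio, zero baseline: {honest:.2f}x")
print(f"Bar height ratio, baseline of 12: {truncated:.2f}x")
```

With these assumed numbers, a reduction of roughly 11% would be drawn as one bar about 8.5 times taller than the other.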


Jun 25, 2014


The catalyst for this post was a recent Tableau user group meeting where the presenter demonstrated a dashboard that featured a packed bubble chart.  I spent a lot of time shaking my head, not only because this was a very poor visualization choice, but because the presenter was in a position of authority and there were people in attendance who were new to Tableau and to data visualization.  These people would likely come away from the presentation thinking that they should, when presented with similar data, use a packed bubble chart.

I then recalled something that I had written previously:

If I see a visualization that is poorly designed or worse, misleading, I’m going to say something about it. I hope you will do the same.

The culprit visualization

I do not have the data that drove the Tableau user group visualization so I will use Superstore Sales data to illustrate my point.

For whatever reason, the presenter eschewed creating a clear and simple bar chart, like this one…


Figure 1 — A simple but abundantly clear bar chart.

… and instead built a difficult-to-interpret packed bubble chart that looked like this:


Figure 2 – A cool, but analytically-bereft packed bubble chart.

With the packed bubbles I have to work to determine which bubbles belong in which category and I have to work especially hard to determine how much larger a particular bubble is than another bubble.  In addition, in some cases the bubble is too small for the supporting sub-category and measure labels.

A good rule of thumb – Ask yourself these three questions

As I considered the flaws in this chart type I began to codify some simple principles that I use when building visualizations.  Specifically, before I go live with a visualization, I ask these three questions:

  1. Do I need different colors?
  2. Do I need a legend?
  3. Do I need measure labels?

In the case of the bar chart I don’t need to use color, I don’t need a legend and I don’t need to show the numbers next to the bars. I might want to show the numbers, but I don’t need to show them.  With the packed bubble chart I need all three items in order to make sense of the viz.

I’m not saying that you should never use color, legends, labels, or circles; I just suggest that you ask yourself if there’s a way to build a clear visualization that doesn’t need one or more of these elements.  The more of these elements you need, the harder your audience will need to work.

Let’s see how this triumvirate of questions exposes some of the flaws in pie charts, circle charts, 100% stacked bar charts, and “snakey” Sankey diagrams.

The problem with pies

Many people have written articles about this, my favorite being Stephen Few’s white paper on the subject.  Indeed, if you need ammunition to move your organization away from pie charts I encourage you to download Few’s paper.

I’ll present an abbreviated discussion of the problems with pies to show how it underscores the utility of the three questions.

Consider the chart below which shows poll results to the question “what is your favorite beverage”?


Figure 3 — Simple pie chart showing poll results.

I can see that Chateau Lafite Rothschild comes in first, but I can’t tell if it’s Coffee or Dogfish IPA that comes in second, and I really can’t tell how much larger one segment is than another.

Here’s an alternative pie chart that adds color, a legend, and measure labels.


Figure 4 — Pie chart with added stuff so you can make sense of the pie chart.

Well, I can now determine the ranking and relative magnitude, but I have to spend a lot of time going back and forth between the legend and the chart.  Consider how much simpler it is to understand the poll results using a bar chart:


Figure 5 — Poll results displayed in a bar chart.

So, just why is the pie chart harder to understand?  In addition to requiring a legend, it also has to do with people’s inability to compare the area of circles.

The Problem with Circles

As with pies, Stephen Few has written about this subject, as has Alberto Cairo in his book The Functional Art.  (Do you own a copy of Cairo’s book?  If the answer is “no” you should buy it now.  Really.  Stop reading this and buy it.)

Now that you’ve bought the book…

Consider the collection of bar charts below.  Two of the groups have measure values that are labeled incorrectly while one of the groups is correctly labeled.


Figure 6 — One of the groups is labeled correctly and two are mislabeled. Can you tell which one is correct?

Can you tell which one of the three groups is labeled correctly?

Now have a look at the same data presented with a packed bubble chart where again one group is correctly labeled and the other two are not.


Figure 7 — One of the groups is labeled correctly and two are mislabeled. Can you tell which one is correct?

If you are like most people you’ll solve the bar chart example very quickly and probably won’t have a clue with the packed bubble charts.

Note – The answers may be found at the end of this blog post.

Incidentally, the differences are pretty significant, but I could magnify the errors quite a bit in the circle charts and people still wouldn’t be able to tell which group was correct as people are just horrible with comparing the area of differently-sized circles.
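Part of the difficulty is geometric, and a minimal Python sketch can show it. The sketch assumes the circles encode the measure by area (a common convention for bubble charts); the two measure values are hypothetical.

```python
import math

def radius_for_value(value, scale=1.0):
    """Radius of a circle whose AREA is proportional to `value`."""
    return math.sqrt(scale * value / math.pi)

small, large = 100.0, 200.0   # hypothetical measure values
r_small = radius_for_value(small)
r_large = radius_for_value(large)

value_ratio = large / small          # what the data says
diameter_ratio = r_large / r_small   # how much wider the circle gets

print(f"Value ratio: {value_ratio:.2f}x")
print(f"Diameter ratio: {diameter_ratio:.2f}x")
```

A bar that represents twice the value is twice as long, whereas the corresponding circle is only about 1.41 times as wide, so the viewer has to judge area rather than a single length.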

There are certainly places where circles are useful and most welcome, but they don’t work well here and they don’t work well in the example I discuss below.

Substituting a bad chart type with another bad chart type

I’ve recently read a collection of blog posts where the author suggests ways in which people can avoid the stranglehold of pie charts by using other chart types.  I liked the promise of this blog series and was pleased to see that the first example presented a bar chart similar to the one shown in Figure 1.  I was, however, surprised at some of the other approaches as I did not think they presented data clearly.

One of the questionable alternatives was a panel chart like the one shown below.

Figure 8 — A panel chart comprising circles that makes me have to work harder than I would like.

I have to work very hard to “grok” this viz and that’s because I cannot make sense of the data without reading and interpreting the measure labels.  In addition, because the items were not grouped I had to refer to the color legend to see which circles represented Technology products, which were for Office Supplies, etc.

I grant that there are cases where you may want to present the product sub categories from largest to smallest without grouping them into a hierarchy, but I still maintain that it’s much easier to interpret the data with a bar chart like the one shown below, which does not require measure labels.

Figure 9 — Bar chart with hierarchy removed. We need a legend but don’t need measure labels.

Note: I am not saying that you should not use measure labels; I am saying that if the visualization requires measure labels then there is a good chance you’ll be able to craft a better visualization.

Does this mean you should never use circles?

There are of course myriad instances where circles would be very welcome.  Consider the following map that shows the number of orders by location.

Figure 10 — Symbol map

I can see very easily that the number of orders on the West Coast (Washington, Oregon, and California) is considerably larger than the rest of the country.  In this case it’s seeing the circles on top of a map that helps me conclude that there’s a lot of activity happening in one area of the country.  If I wanted to know just how much activity, and if I wanted to be able to make quantitative comparisons, I would need an accompanying chart that helped me sort and determine the relative magnitude of orders for each state.  That is, if I needed to know more than “whoa, look at the number of orders on the West Coast!” then I would probably craft a dashboard that would also contain a bar chart showing orders by state in descending order.

100% Stacked Bar Charts

I try to avoid 100% stacked bar charts as they absolutely require that I use color and they can be somewhat difficult to interpret without measure labels.  Consider the visualization below that compares % of total shipping costs by product category, broken down by ship mode.

Figure 11 – A collection of 100% stacked bar charts. I think of this as being a “cubist” pie chart

It’s pretty easy to determine the Regular Air values as the axis starts at zero.  It’s a bit harder to glean the Express Air and Delivery Truck values without displaying the mark labels.

Still, it’s an easier read than three pie charts.

Figure 12 — Trying to understand the breakdown of shipping costs by Ship Mode across categories using pie charts (yuck).

While I try to avoid 100% stacked bar charts, I am a very big fan of divergent stacked bar charts.  Here’s an example from a recent blog post.  While I do need a color legend I can get by without measure labels.


Figure 13 — Divergent stacked bar chart.

Stacked bar charts also play a supporting role in Sankey diagrams which we explore below.

Where “snakey” Sankey diagrams work

Consider this snippet from Jeffrey Shaffer’s winning entry in the Tableau Quantified Self visualization competition.


Figure 14 – Jeffrey Shaffer’s Sankey chart shows how one stacked bar chart maps to another stacked bar chart.

At the bell of the trumpet we find a stacked bar chart where we can hover over items to see to what they refer:


Figure 15 — Hovering over a bar shows info about the bar and shows how the item is mapped to a different set of measures.

From this action I can see that Shaffer performed at five weddings where he played music by Bach, Clarke, Mouret, Reiche, Vivaldi, and Purcell.

Within this context, this very creative chart works as it’s not essential that I know the exact details of Shaffer’s performances.  Instead, I can explore this and other portions of what is a very fun and playful dashboard and get a sense of who Shaffer is and whether I’d like to hang out with him (and I would as I know for a fact that musician / data visualization consultants are among the most interesting people on the planet.  You can look it up.)

Where Sankey diagrams don’t work

I first saw this type of chart in a visualization Shaffer published earlier this year where he took a stab at redesigning his utility company’s fuel usage bill.  Here’s what his redesign looks like:


Figure 16 — Shaffer’s energy bill redesign.

The chart is very decorative, but does it help me understand energy expenditures?  There’s a really big story sitting in the data but I don’t think this chart helps me see it.  Indeed, I would argue that while the chart is pretty it in fact obfuscates what is the big story.

Consider this visualization of the same data.


Figure 17 — A redesign of the redesign.

So what’s the big story?  Almost half of the total energy expenditures (44%) go towards heating the home.

My reaction to the Sankey diagram is “cool!”  My reaction to the stacked bar chart is “crap!”

In the Quantified Self dashboard “cool” is the desired reaction.  With the fuel bill “crap” is better as it may lead to better decisions and behavior changes (e.g., replace the windows, add insulation, or wear a sweater).


The goal of good data visualization is to elucidate, not decorate.  If your visualization requires color, legends, and measure labels you should at least consider an approach that does not ask your viewers to work hard to see and understand what is important in the data.


Answer key:

Bar Chart – Group 2 is labeled correctly

Packed Bubble Chart – Group 1 is labeled correctly.




Jun 10, 2014

… and some thoughts on the evolving art and science of visualizing data

I tend to gravitate towards occupations that are hard to explain.  I started my professional life as, and continue to be, a music arranger and orchestrator.  I can tell by people’s perplexed looks that they are wondering if I’m the guy who decides where the brass section should sit in the pit.

I run into similar problems when I tell people I’m a data visualization consultant.  I was trying to come up with a concise way to explain what that is when I came across an excellent blog post from Stephen Few.

I take this and turn it into that

I do encourage you to read the full post (you can do it now if you like; I’ll wait).

I was struck by the first example where Few shows how hard it is to glean any meaning from a text table.  Here’s his example of poll results published on the PBS website from a 2004 study by the Pew Center for Research.


Figure 1 — Favorable and Unfavorable views of the U.S.A.

I have to work very hard to get a sense of which countries have the most positive sentiments towards the U.S.A. and which have the most negative.

Few proposes a different way to present the data that makes it much easier to see, rank, and understand the findings.


Figure 2 — Few’s alternative to presenting the findings

Yes!  This exercise encapsulates what it is that I do!  I take “this” and turn it into “that”, thereby allowing companies to better see, understand, and glean insights into their data.

An alternative to the alternative and how the industry keeps evolving

I cannot just listen to music.  My training and proclivities force me to dissect the music I hear so that I can understand what’s going on inside the music.

A similar thing happens when I see a data visualization.  After taking in the presentation I stop and wonder if there is an alternative approach that would allow me to better understand what’s going on and thereby draw better conclusions that in turn allow me to make better decisions.

In a moment I’m going to suggest an alternative to Few’s approach, but I do want to emphasize that the data visualization field is very new and it’s the free exchange of ideas that’s pushing people to create new ways to visualize data.  A perfect example of this is my own evolution in displaying Likert scale data (see Likert Scales – The Final Word).  It was discussions with friends and colleagues Naomi Robbins and Joe Mako that resulted in what I think is a better way to explore and glean insights from the World Opinions data.

The divergent (or staggered) stacked bar chart

Consider the screenshot of a dashboard below where we skew the stacked bars right and left based on overall positive and negative sentiment. Note that you will find a working dashboard at the end of this post.


Figure 3 — Conveying sentiment using a divergent stacked bar chart.

If you split the neutral responses evenly you see that, overall, Poland has the most positive sentiment and Egypt the most negative.
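The idea of splitting the neutral responses evenly can be sketched in a few lines of Python. This is only an illustration of the geometry, not how the Tableau dashboard is built; the function name and the percentages are hypothetical.

```python
def diverging_segments(negative, neutral, positive):
    """Return (start, end) x-positions for each segment so that the
    neutral segment is centered on zero."""
    left = -(negative + neutral / 2)   # bar starts this far left of zero
    neg = (left, left + negative)
    neu = (neg[1], neg[1] + neutral)
    pos = (neu[1], neu[1] + positive)
    return {"negative": neg, "neutral": neu, "positive": pos}

# Hypothetical percentages, not the actual poll values.
segments = diverging_segments(negative=30, neutral=20, positive=50)
for name, (start, end) in segments.items():
    print(f"{name:>8}: {start:6.1f} to {end:6.1f}")
```

Because half of the neutral segment sits on each side of zero, a bar that extends further to the right reflects more positive sentiment overall.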

But what happens if you eliminate the neutrals?  If you sort by least negative you see certain things pop out.


Figure 4 – Neutral responses are hidden; results are sorted by least negative

Here Poland is ranked first and Jordan is last (and notice how polarized Jordan is).

Compare this with the view when you remove the neutral responses and sort by most positive.

Figure 5 – Neutral responses are hidden; results are sorted by most positive

In this case Kenya is ranked first and Egypt is last.
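Why the two sort orders can crown different leaders is easy to see with a small Python sketch. The country labels and percentages below are hypothetical placeholders, not the poll data.

```python
# Hypothetical (negative, neutral, positive) percentages; not the poll data.
responses = {
    "Country A": (10, 50, 40),
    "Country B": (25, 5, 70),
    "Country C": (45, 5, 50),
}

# Ignore the neutral column and sort two different ways.
by_least_negative = sorted(responses, key=lambda c: responses[c][0])
by_most_positive = sorted(responses, key=lambda c: responses[c][2], reverse=True)

print("Least negative first:", by_least_negative)
print("Most positive first: ", by_most_positive)
```

A country with a large neutral share can have the fewest negative responses without having the most positive ones, which is why the ranking shifts between the two views.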


The divergent stacked bar is my “go to” viz type whenever I deal with Likert scale data.  The only downside is that it takes a bit more time to create in Tableau and it warrants using a color legend, something I try to avoid where possible.

But this divergent stacked bar chart is my Likert-scale viz of choice today.  Who knows what people will create in the coming years that does an even better job of helping people understand their data.

Oh, and I now have a compact explanation of just what it is I do.  I turn a this into a that.

Postscript: I’ve been thinking about this and want to modify my explanation… let’s change it to “I take this and I try to turn it into the best that that’s possible”.

May 23, 2014


I’ve had a spate of requests from clients to show how survey responses rank across different categories and I’ve come up with a way that makes it very easy to see where the big stories are.

Note that this approach works for any measure that can be ranked, not just survey responses.

Let’s see what I mean…

Consider the bar chart below that shows the results to a survey question “indicate which of the following that you measure; check all that apply”.

Figure 1 — Percentage of respondents that measure selected items, ranked from highest to lowest.

Traditional approach to showing rank within a category

Now, suppose you wanted to see the percentages and rankings broken down by different demographic components (e.g., location, gender, age, etc.).  There are myriad Tableau knowledge base articles and blog posts on how to do this and they lead to results that look like the one shown in Figure 2.

Note: Pretty much all of those articles and blog posts are now obsolete as they make clever use of the INDEX() function.  With Tableau 8.1 you can use RANK(), or one of its variations, and not have to jump through as many hoops.
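For readers who want to see what ranking within a category amounts to outside of Tableau, here is a minimal Python sketch. It is not Tableau’s implementation; the group names, item names, and percentages are made up, and ties share a rank (1, 2, 2, 4), which is an assumption about the ranking style you want.

```python
from collections import defaultdict

# Hypothetical (category, item, percent) rows; names and values are made up.
rows = [
    ("Group 1", "Item X", 0.62), ("Group 1", "Item Y", 0.48), ("Group 1", "Item Z", 0.55),
    ("Group 2", "Item X", 0.40), ("Group 2", "Item Y", 0.71), ("Group 2", "Item Z", 0.40),
]

def rank_within_category(rows):
    """Rank items inside each category, highest percent first; ties share a rank."""
    by_category = defaultdict(list)
    for category, item, pct in rows:
        by_category[category].append((item, pct))
    ranks = {}
    for category, items in by_category.items():
        for item, pct in items:
            # Rank = 1 + number of items in the category with a strictly higher value.
            ranks[(category, item)] = 1 + sum(other > pct for _, other in items)
    return ranks

ranks = rank_within_category(rows)
for key in sorted(ranks):
    print(key, ranks[key])
```

The same item can land at a different rank in each category, which is exactly the kind of difference the rest of this post tries to make visible.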

Figure 2 — Traditional approach to showing ranking within a category.

I find this a tough read.  Even if I add a highlight action it’s still hard for me to see where a particular item ranks across the four categories.

Figure 3 — Ranking within a category with highlighting.

Don’t try to show everything at once

My solution is to place Generation on the Columns shelf and not show everything at once, but instead allow the user to explore each of the possible responses and see how these responses rank across the different categories.

Consider the dashboard shown below where the top worksheet shows the responses across all categories.

Figure 4 — Dashboard with no item selected.

Now see what happens when we select one of the items in the list.

Figure 5 – Dashboard with an item selected shows that item’s rank and percentage across different generations.

Okay, not much to report here – Adrenaline Production is ranked first in three categories and second among Traditionalists, although Traditionalists measure it quite a bit lower than the other three groups.  Still, we’re not seeing any wide swings.

But look what happens when we select Breathing…

Figure 6 — Breathing: our first big story.

Now that’s a big story!  And it pops out so clearly.

Reporting vs. interacting

This is all fine and good if you publish this as an interactive dashboard and you expect people to, well, interact; but what happens if you want to publish this as a static graphic in a magazine?

The solution is to find where the big stories are and show those in the magazine; that is, do the work for your reader and show him / her where the big differences are.  In fact, that is exactly what I’ve done in Figure 7.

How the dashboard works

Here’s how the top part of the dashboard is set up.

Figure 7 — Configuration of top worksheet.

Rank is defined as


Note that we’re addressing the table calculation using Wording.

Notice also that Wording is on the Rows shelf.

The bottom part of the dashboard is set up like this.

Figure 8 — Configuration of the bottom worksheet.

Goodness, we can’t tell what any of the bars mean because Generation is on the Columns shelf and Wording is on the Level of Detail and not Rows.  If you put it on Rows you get something that looks like this.

Figure 9 — Placing Wording on the rows shelf tells a different and harder-to-understand story.

The key takeaway is that we cannot make a single visualization that tells the story.  You need both the first and second visualizations working together.

A Filter and a Highlight Action

We use both a Highlight and a Filter action to make the two visualizations work well.  The Filter action is there to make the second worksheet disappear once you clear the selection in the first worksheet; the Highlight action highlights where the item appears in the second worksheet.

Here are the two actions:

Figure 10 — Two actions tied to the same mouse click.

The Filter action is defined as follows.

Figure 11 – Definition of the Filter action.

This tells Tableau that when a user selects something from the first worksheet (Percent that Measure-Overall) it should filter the second worksheet (Percent that measure-by Generation) by the field Temp.  Temp is just a string constant that I’ve placed on the color shelf; its only use is that we have to filter by something in order for the Exclude all values setting to work (and that is critical for the behavior of the dashboard).

Here’s how the Highlight action is defined.

Figure 12 — Definition of the Highlight action.

This tells Tableau that when a user selects something from the worksheet on top, Tableau should highlight items in the second worksheet using Wording as the selected field (where Wording is the dimension we placed on the level of detail rather than on the Rows shelf.)


I’ve found this approach to showing rank across categories very useful and it’s been a very big hit with my clients.  By placing the categories across columns and using highlight actions we make it very easy to see where the big differences are among different respondent groups.

Mar 18, 2014

Note: Since writing this post in 2014, I have, in fact, become a fan of sparklines. That said, I continue to see many instances where I think the dashboard author could present data more clearly using a different approach. Make sure to read the comments at the end of the post.

I’ve never been a big fan of sparklines and I’m a bit concerned with how often they are cropping up in dashboards.  While I appreciate that this chart type provides a compact mechanism for showing how a collection of measures wax and wane over time, I believe there are many cases where other chart types will do a better job getting the message across.

Stephen Few’s Dashboard Design Competition

I’ve been reading the second edition of Stephen Few’s Information Dashboard Design and was drawn to a discussion of the design competition Few ran in 2012.

Consider this data snippet from the competition where we see student test performance over time:

Student test results

The winning entry, the runner up, and Few’s own solution rely heavily on sparklines to present this and similar data.

My Attempt at Sparklines

I’ll be honest: I have a very difficult time understanding any of the sparkline renderings from any of the design entries. Perhaps if I took a stab at it myself…?

Consider my attempt below:

Student test results rendered using sparklines

I ask you if you can see — at a glance — that the best performing students are at the top and the lowest performing students are at the bottom?  Can you see that Regan Petrero (about 60% of the way down the list) received “C”s for his first three assignments, a “B” for the fourth assignment, and a “D” for the fifth assignment?

Granted, I can try to make certain things stand out better by adding banding and not having the axis start at zero, but even with these additions I’m not able to come up with anything that tells as clear a story as what I get with a simple highlight table.

Student Data, Take Two – A Highlight Table

Here’s the same data rendered using a highlight table:

Student test results rendered using a highlight table

I can see immediately that Holly Norton is a straight “A” student, that Donald Chase just missed being a straight “A” student, and that Xu Mei has had some wide fluctuations.  The chart is compact, easy-to-read, and I can discern both comparative performance and relative performance with very little effort.

What about Frederick Chandler?

If you look at my sparklines rendering you will see that there may be an interesting story with respect to Frederick Chandler and the third assignment.  In the sparkline you can see there was a big dip; in the highlight table you can only see that Mr. Chandler received an “F”.

It turns out that Mr. Chandler received a zero on the assignment.  Is it important to show this, versus just showing a failing grade?  I don’t know the answer, but if it is important then we can create a six point color scale, as shown here:

Mr. Chandler’s zero, for all the world to see
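The logic behind a six-point scale can be sketched as a simple bucketing function. This is a minimal Python illustration, not the calculation used in the workbook; the 90/80/70/60 grade cutoffs and the bucket labels are assumptions.

```python
def grade_bucket(score):
    """Map a 0-100 score to one of six color buckets (cutoffs are assumed)."""
    if score == 0:
        return "Zero"   # the sixth bucket: a missing or zero score stands apart from an F
    if score >= 90:
        return "A"
    if score >= 80:
        return "B"
    if score >= 70:
        return "C"
    if score >= 60:
        return "D"
    return "F"

for score in (95, 72, 35, 0):
    print(score, "->", grade_bucket(score))
```

Giving zero its own bucket lets the highlight table distinguish “failed the assignment” from “scored nothing at all” without switching chart types.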


See For Yourself

I present the sparklines and highlight table side-by-side in the dashboard below. Have a look and let me know what you think.  If you have a way to make the sparklines “sing” better by all means please share it.

Please realize that I’m not suggesting that you should never use sparklines; I only ask that you consider whether sparklines are the best way to show what is important about the data before you publish. I very much encourage you to explore other options.

Jan 16, 2014


One of the new features in Tableau 8.1 that Tableau Software is trumpeting quite a bit is one-click Box and Whisker Plot generation.  While I appreciate the new functionality, this chart type doesn’t “sing” to me as much as jittering does.  Indeed, this “jittering” capability was the BIG discovery for me in 2013.

Let’s see how a box and whisker plot compares with jittering using a simple example.

Note: Interactive dashboards that illustrate jittering techniques may be found at the end of this blog post.  Feel free to download and explore.

Salary and Age Bins – Default

Consider the following pre-Tableau 8.1 salary chart that shows how salaries are distributed across age bins.


Figure 1 — Default Salary Distribution by Age Bins


While we can see that the top salaries are enjoyed by people in their 50s, there’s nothing that gives us concrete percentiles nor shows us where the outliers are.  We also can’t tell that there are in fact thousands of dots in the visualization as so many marks are sitting on top of each other.

Salary and Age Bins – Box and Whisker Plot

To see percentiles and outliers we can use Tableau’s Show Me feature and click the Box-and-Whisker Plot button.


Figure 2 — Salary Distribution by Age Bins with Box and Whisker Overlay


This is definitely an improvement, but I really don’t “feel” the data as I can’t see how the dots are distributed; they are all stacked on top of each other.
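For reference, the numbers a box and whisker plot summarizes can be computed in a few lines of Python. This sketch assumes the common convention of whiskers at 1.5 times the interquartile range, which may not match the setting used in the figure; the salary values are hypothetical.

```python
import statistics

def box_plot_stats(values):
    """Quartiles, 1.5 * IQR whiskers, and the points that fall beyond them."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [v for v in values if low_fence <= v <= high_fence]
    return {
        "q1": q1, "median": median, "q3": q3,
        "whisker_low": min(inside), "whisker_high": max(inside),
        "outliers": sorted(v for v in values if v < low_fence or v > high_fence),
    }

# Hypothetical salaries (in thousands) for one age bin.
salaries = [42, 45, 47, 50, 52, 55, 58, 60, 63, 120]
stats = box_plot_stats(salaries)
print(stats)
```

The summary reduces every mark to five numbers plus a list of outliers, which is useful but says nothing about how the individual marks cluster.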

Salary and Age Bins – Jitters

Here’s the original chart, but with the marks “jittered” using a modified version of Tableau’s built-in INDEX() function.


Figure 3 — Salary Distribution by Age Bins with the marks “jittered”

This gives me a much better feel for the data as I can see how the thousands of marks cluster.  Of course, I can still superimpose the box plot, as shown here.


Figure 4 — Salary Distribution by Age Bins with the marks “jittered” and box plot overlay

Getting Jitters Using INDEX()

To “jitter” the marks I create a calculated field called “Index” that uses Tableau’s INDEX() function.  I put this on the Columns shelf and compute using ID, as shown here.


Figure 5 – First attempt using Tableau’s INDEX() function

It turns out that for this particular example INDEX() by itself works because there is an equal distribution of IDs across each of the age bins.  Consider the example below where we show a distribution of Superstore Sales across different customer segments.


Figure 6 – Shortcomings of using INDEX() by itself.

Notice that the strip of dots within “Corporate” is much wider than the other segments because there are more orders within “Corporate” than there are in the other segments.
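The effect can be reproduced outside Tableau with a running counter per partition. This is a minimal Python sketch of the idea behind a partitioned index, not Tableau’s implementation; the segment counts are hypothetical.

```python
from collections import Counter

def index_within_partition(partitions):
    """Give each mark a running 1-based position inside its partition."""
    seen = Counter()
    positions = []
    for part in partitions:
        seen[part] += 1
        positions.append(seen[part])
    return positions

# Hypothetical marks: "Corporate" has more orders than "Consumer".
segments = ["Corporate"] * 6 + ["Consumer"] * 3
positions = index_within_partition(segments)

# The widest position reached in each segment determines the strip width.
width = {}
for seg, pos in zip(segments, positions):
    width[seg] = max(width.get(seg, 0), pos)
print(width)
```

Because the index keeps counting up, the strip for a partition is as wide as the number of marks it contains.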

The easiest way to fix this is to edit the axis and select “Independent ranges for each row or column” from the Edit Axis dialog box.  While this will work fine we’ll look at a different technique that will allow us to control the degree of jittering.

Using Modulus to Control Jittering

When I first blogged about this technique last year, Alex Kerin of Data Driven suggested a simple and elegant solution to different-sized partitions using Tableau’s Mod function.   For those of you who forgot your high school mathematics, we use a modulus to determine the remainder when you divide one number by another.  Here’s an example:

14 ≡ 30 Mod 8

Translation: 14 is equivalent to 30 Mod 8 because you get the same remainder when you divide 14 by 8 as when you divide 30 by 8 (both remainders are equal to 6).

So, how do we use this capability in our visualization?  We want each segment’s strip of dots to be the same width, so instead of using INDEX() we will use INDEX()%25.

This will create 25 “rows” of dots within each segment.

Specifically, when

INDEX()=1, INDEX()%25 will be mapped to 1
INDEX()=2, INDEX()%25 will be mapped to 2

INDEX()=26, INDEX()%25 will be mapped to 1
INDEX()=27, INDEX()%25 will be mapped to 2

Note that 25 is not a magic number.  For this example anything above 15 will do the trick (and in the demo workbook I have a parameter slider that controls the MOD setting).
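The wrap-around behavior described above is easy to verify with a short Python sketch. The function name is hypothetical, and the sketch only mimics the arithmetic of INDEX()%25 for a 1-based index.

```python
def jitter_position(index, mod=25):
    """Mimic INDEX() % mod for a 1-based index."""
    return index % mod

for i in (1, 2, 25, 26, 27):
    print(i, "->", jitter_position(i))

# However many marks a segment has, only `mod` distinct positions are used.
distinct = {jitter_position(i) for i in range(1, 1001)}
print(len(distinct))
```

Note that an index of 25 maps to 0, so the positions run from 0 through 24; the strip is capped at 25 positions regardless of how many marks the segment holds.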


Jittering is a very simple technique and it helps overcome the problem of marks being stacked atop each other when plotting a distribution within a dimension.  It only takes up a little more screen real estate and it packs a terrific visual wallop.


Oct 31, 2013

If I see a visualization that is poorly designed or worse, misleading, I’m going to say something about it. I hope you will do the same.

In March of 2013 Stephen Few published a scathing review of Tableau 8. Few’s thesis was that Tableau had caved to marketing pressure and its new product would encourage users to craft “analytically impoverished” visualizations.

At the time I thought that Few’s screed was unfair (see my blog post), but a recent post from Emily Kund about a company’s internal “Iron Viz” competition made me wonder if perhaps Few was right.

Before I get into what deeply troubles me about the aftermath of the contest, I do want to applaud Kund and her colleagues for fostering interest in Tableau and data visualization best practices.  Clearly, I have a fondness for these types of contests and like the excitement they generate about visualization.  I also believe strongly in making interactive visualizations that are fun and inviting.

My problem is that while everybody is rightfully patting Kund on the back for having the contest, nobody in the Tableau data visualization community (and it is an amazing community) has pointed out what is wrong with the dashboard — and there is a lot that is wrong with the dashboard.

Too Much Sugar

Let’s have a look at the winning entry from the Halloween data visualization competition.

Winning entry

This winning viz epitomizes the type of creation that Stephen Few, in his now infamous review, feared people would construct: the dashboard sacrifices clarity and accuracy for whimsy. Why have the stacked bubble chart, and why have the pumpkins representing annual spending? Humans are absolutely horrible at comparing areas of circles, so why use them here? I also don’t buy the size of the pumpkins at all, as the $4.7 billion pumpkin for 2009 is considerably smaller than the $5.0 billion pumpkin for 2006.  It looks to me like the author exaggerated the differences in pumpkin size.

More importantly, by fighting Tableau’s own default settings the author has hidden the biggest story the data is trying to tell us.

Why Didn’t You Let Tableau Make a Line Chart?

Let’s focus on the pumpkin chart along the left side of the dashboard:

Unreliably-sized pumpkin chart

Here we see annual sales by year.  Using the same data, if we simply select the two fields and click the Show Me button, Tableau will automatically generate the following visualization.


The default chart Tableau creates

Now, tell me you didn’t just think “whoa… what happened in 2009?”

THAT’S the big story.

Have Your Candy and Eat It, Too…

I “get” that nobody is going to get very excited about the viz Tableau creates by default.  Without something to capture the viewer’s interest he/she may not bother with the viz (see Ben Jones’ excellent posts on this subject).

So, if we must add some “viz candy” why not start with the line chart and dress it up, like the one below?

Line chart with pumpkins

A “fun” chart. 10 seconds to build the default line chart and five minutes to apply some graphic design.

Are Stacked Bubbles Inherently Bad?

I don’t think the stacked bubbles work in the dashboard.  I have to work too hard to see that “Candy” at $22.37 is slightly larger than “Decorations” at $20.99.  With a bar chart I could see the differences immediately.
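A quick back-of-the-envelope calculation suggests why the differences are so hard to see in circles. The Python sketch below assumes the circles are drawn so that area is proportional to the value (an assumption about how the marks are sized, not something confirmed by the dashboard), and compares that with a bar whose length is proportional to the value. The function names are hypothetical; the dollar figures are the ones quoted in this post.

```python
import math

def diameter_ratio(smaller: float, larger: float) -> float:
    """Ratio of circle diameters when circle AREA is proportional to the value."""
    return math.sqrt(smaller / larger)

def length_ratio(smaller: float, larger: float) -> float:
    """Ratio of bar lengths when bar LENGTH is proportional to the value."""
    return smaller / larger

# Values quoted in the post.
pairs = {
    "Decorations vs. Candy (dollars)": (20.99, 22.37),
    "2009 vs. 2006 spending (billions)": (4.7, 5.0),
}

for label, (small, large) in pairs.items():
    print(f"{label}: bar is {length_ratio(small, large):.1%} as long, "
          f"circle is {diameter_ratio(small, large):.1%} as wide")
```

Under that assumption, a value that is about 6 percent smaller produces a bar that is about 6 percent shorter but a circle that is only about 3 percent narrower. By the same arithmetic, an honestly area-scaled 2009 pumpkin would be only slightly narrower than the 2006 one.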

That said, there are some good examples where bubbles elicit an emotional response and just fit with the design flow (see this example from Kelly Martin).

I also like having this chart type in my quiver.  I welcome any chart type that helps me better understand the data, even if I never use it on a published dashboard.

Getting People to Use The Tools Correctly

I still don’t agree with Few — I don’t think Tableau should remove features for fear that people will use them incorrectly.

But I am very concerned that visualizations that are poorly rendered are being presented as examples to emulate.  As a community we need to do our best to prevent this from happening, so if you see something that is poorly designed — or worse, misleading — point out the problem and show the person a better way to get the desired result.

I have tried to do that here.





Sep 302013

I recently attended the Tableau Customer Conference.  It was a great conference and if you are into Tableau you should definitely go to next year’s event (see

In any case, during the myriad networking opportunities I was very pleased by the number of people who told me how much they had gotten out of a blog post I had written over two years ago.  I decided I should revisit that post and see what, if anything, I would write differently.

So, before you read this post, make sure you read that post.

Did you read it?  The comments, too?

I think it holds up quite well but there are some things I would change.

Let’s go through the major points one by one.

Size Matters

I think I’m ready to retract this recommendation for two reasons:

  • The world has gotten a lot smarter about embedding Tableau Public visualizations.  Either folks make sure to get the size right or they have a link to where people can view the full-sized visualization.  Indeed, it’s been a while since I’ve seen anything that was really mangled.
  • I would hate to handcuff people from doing work that warrants a larger canvas.  For example, have you looked at any of the things Kelly Martin is doing over at her VizCandy blog?  Go ahead, click here (but do come back when you’re finished).

Some great stuff going on there — in fact my reaction when I first saw what she’s producing was “damn, I’m really going to need to ‘up my game'” — but I’ll blog about this at a later date.  In any case, while the constraints of a 650 pixel-wide canvas can in fact be very useful as it forces you to pare down the fluff, by all means use a wider canvas if your viz warrants it.

Never Use Red and Green as Contrasting Colors

I got some grief about this one, but my rationale is still spot on (more about this in a moment).  That said, I will modify the rule so that it reads as follows:

Never use red and green as contrasting colors without an affordance

And just what do I mean by an affordance?

Consider your typical traffic light.  How do people with red-green color blindness deal with traffic lights?  The answer is that they look at the position of the light: red is on top, yellow is in the middle, and green is on the bottom. If the light is configured sideways, red is on the left, yellow is in the middle, and green is on the right.  If there were only a single light that changed color there would be many more accidents, and not just because of confusion among the color-blind; people who are not color-blind find the positioning very helpful as well.

So, if you like red and green or feel you must use them, the following two visualizations are acceptable as they both contain an affordance.  Specifically, the first contains arrows that point up or down and the second has bars that ascend or descend.




The following view is not acceptable as there is nothing besides the red and green to telegraph high values vs. low values.


As for having to worry about this at all, I attended Maureen Stone’s session on Best Practices for Using Color: How and Why and came away with two key findings.

1) This woman is brilliant; and,

2) Yes, you need to worry about this. So do yourself and your interactors / viewers a favor and do as I suggest.


There’s nothing I would remove from this section, but there is so much more that I should add.

I’m not going to do that right now (sorry).


… as I re-read my post there’s one idea I really want to underscore and that is to ask at least one other person to “fly” your dashboard before you publish it.  That person will both break the thing immediately and find the major stumbling points.  Really, it’s amazing how quickly a set of fresh eyes can find all the flaws.

Hover Help

I still completely endorse this practice but will provide one caveat about creating the ersatz calculated field called “Help”: adding this custom field can, in some cases, really slow down performance if you have a large database and are executing a big query against that database.

The very simple workaround is to create a very sparse additional data source (you can do it in Excel) and create the custom calculation in that additional data source.  Indeed, I’ve gotten in the habit of placing my help and navigation control scaffolding in a secondary (and very small) data source.

Navigation – Dealing with Multiple Tabs

My major takeaway in reviewing this is that I can’t wait for Tableau 8.2 to be available as it will have a new feature called Story Points that will be WAY better than the click forward / click back navigation buttons I recommend when publishing multi-tabbed workbooks.  I only saw a very brief demo of Story Points but what I saw was absolutely beautiful.

In the meantime I still recommend you hand-chisel these forward and back buttons and, should you run into performance problems, put any calculated fields you use in a secondary data source as described in the “Hover Help” section above.


I agree with everything I wrote but think my examples are dated, as the state of the art has improved enormously in the past two years.  In particular, Tableau’s ability to support floating elements has made it easy for the design-savvy among us to do some great things.

By the way, now would be a good time to revisit Kelly Martin’s VizCandy blog.  And when you’re done check out this post at Anya A’Hearn’s DataBlick web site.

They draw you in, don’t they?

As fun and inviting as all this stuff is, I have found the best way to get people engaged is to somehow make the visualization about the person who is viewing it.  One of my favorite examples is the one below, a much scaled-down version of an interactive salary dashboard I built a while back.  The reason this works is that people want to know (and know immediately) how they compare with their peers.  So, if you really want people to use your stuff, your “stuff” should be able to answer questions like these:

  • How does my department compare with other departments?
  • How does our company compare with others?
  • How does my performance compare with my peers?

If you do this people will interact with your dashboards, I guarantee it.

Aug 202013

In this installment we’ll look at Utah State University’s publication of student engagement results.  Utah State is one of many collegiate institutions that have participated in NSSE’s national survey of student engagement (see and

Special thanks to Allan Walker for making the underlying data available to me.

Note: I’ve published four sets of questions from the survey as interactive dashboards that you can find at the end of this blog post.

The Good

Utah State University should be lauded for making its survey results available in an interactive format.  This is a great way to foster engagement from students, faculty, administration, and other interested parties.

The Bad and The Ugly

It’s almost impossible to glean anything useful from the published results.

The “Before” Picture

Here’s a screenshot of the analysis of the first set of questions in the survey (see

Five of the ten questions in the group -- this requires lots of scrolling and makes it impossible to compare results across questions

Note that there are a total of ten Likert scale questions in this set and they are presented in the same order that they appeared in the survey.

Here are the things I would like to know, but cannot at all glean from the visualizations:

  • Which activities were done most often and which were done least often?
  • Are there any significant differences when you compare results by gender?
  • Are there any significant differences when you compare results by ethnicity?

The “After” Picture

I’ve written extensively on the best ways to visualize Likert Scale data (see and

Here’s what happens if we apply this approach to the Utah State University NSSE data.

Divergent stacked bars showing all responses

And if we apply a parameter setting to only show extremes (e.g., “very often/often” vs. “sometimes/never”) the results are even easier to sort and grok.

Divergent stacked bars combining responses
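For readers who want to see the arithmetic behind a divergent stacked bar, here is a minimal Python sketch. It is only an illustration under stated assumptions, not the calculation used in the Tableau workbook: the function name, the four response labels, the choice of which responses count as negative, and the counts are all hypothetical and are not taken from the survey.

```python
from typing import Dict, List, Tuple

# Response order for a four-point frequency scale, least to most frequent.
SCALE = ["never", "sometimes", "often", "very often"]
NEGATIVE = {"never", "sometimes"}  # drawn to the left of the zero line

def diverging_segments(counts: Dict[str, int]) -> List[Tuple[str, float, float]]:
    """Return (response, start, end) positions in percent of respondents.

    Negative responses end at 0 and positive responses begin at 0, so the
    bars for different questions share a common baseline.
    """
    total = sum(counts[r] for r in SCALE)
    shares = {r: 100.0 * counts[r] / total for r in SCALE}
    # Start far enough left that the negative segments end exactly at zero.
    cursor = -sum(shares[r] for r in SCALE if r in NEGATIVE)
    segments = []
    for response in SCALE:
        segments.append((response, cursor, cursor + shares[response]))
        cursor += shares[response]
    return segments

# Hypothetical counts for a single question; NOT taken from the survey.
example = {"never": 10, "sometimes": 30, "often": 40, "very often": 20}
for response, start, end in diverging_segments(example):
    print(f"{response:>10}: {start:6.1f} to {end:6.1f}")
```

Because every bar is anchored at the same zero line, sorting the questions by the end position of the last segment (the combined positive share) orders them from most to least frequent. Showing only the extremes amounts to drawing one segment per side instead of two.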

This approach also allows us to break the data down by gender and see if there are any questions where there are major differences (and there are major differences).

Comparing results by gender

We can likewise see major differences between Caucasian and non-Caucasian respondents when we look at the results from Question 14.

Comparing results by ethnicity

Seven-Point Likert Scale Examples

Here’s another set of results for questions where the students could provide seven possible responses.

Impossible-to-compare seven-point Likert scale questions

I can’t make any sense of the data when it’s presented as a bunch of bars, but when I use divergent stacked bars it becomes very easy to compare and sort the results.

Combined values for seven-point Likert scale questions

Recommendations to Utah State University

  1. Continue to make these results public, but make the results usable.  You can do this by…
  2. Reshaping the data to make it much easier to manage in Tableau (see
  3. Using divergent stacked bar charts to display Likert scale data.

Click HERE to see interactive dashboard.

Jul 112013

I love Tableau and I articulate that love through consulting, training, evangelizing, and blogging.

But there are some things about the product that just drive me nuts.

Here’s a flaw that’s been in the software since at least version 2.0 that I know has tripped up EVERYONE that uses Tableau.

And just what is this problem?

This incredibly intelligent product with so much built-in smarts becomes positively brain-dead when you click File | Save as.  Specifically, when you attempt to save a workbook under a new name, Tableau defaults to whatever folder you most recently opened a file from, not the folder where the previous version of the workbook was saved.  Let me illustrate.

Let’s say I have two customers, Coke and Pepsi, and I’m currently working with a workbook called SodaWorkbook_Coke.twbx that is saved in the Coke folder.


A file saved in the Coke folder

Now let’s say I want to look at something that is in another workbook and that workbook is in a different folder, so I open that workbook from the other folder, in this case the Pepsi folder.

Open a file from a different folder, in this case the Pepsi folder

Now let’s go back to the first file and perform some low-tech version control; specifically, saving the file under the name SodaWorkbook_Coke_B.twbx.


Doh! The file gets saved into the wrong folder!

Unless I override Tableau, Tableau will save the Coke file to the Pepsi folder.


As you may have gathered, I’ve been using Tableau for a long time and have gotten used to this “ill-behaved-Windows-application” anomaly.  There have been occasions, however, when I’ve forgotten about this “gotcha” and I now have random files littered across myriad folders because when it comes to saving files I had expected Tableau to behave like EVERY OTHER WINDOWS APPLICATION ON THE PLANET!

(Oops, caps lock problem… my bad.)

This issue seems so fundamental.  Why the delay in fixing it?

Want to see this fixed?  Chime in at






Posted on July 11, 2013 in 1) General Discussions, Blog