Mar 18 2014
 

Note: Since writing this post in 2014, I have, in fact, become a fan of sparklines. That said, I continue to see many instances where I think the dashboard author could present data more clearly using a different approach. Make sure to read the comments at the end of the post.

I’ve never been a big fan of sparklines and I’m a bit concerned about how often they are cropping up in dashboards. While I appreciate that this chart type provides a compact mechanism for showing how a collection of measures waxes and wanes over time, I believe there are many cases where other chart types will do a better job of getting the message across.

Stephen Few’s Dashboard Design Competition

I’ve been reading the second edition of Stephen Few’s Information Dashboard Design and was drawn to a discussion of the design competition Few ran in 2012.

Consider this data snippet from the competition where we see student test performance over time:

Student test results


The winning entry, the runner up, and Few’s own solution rely heavily on sparklines to present this and similar data.

My Attempt at Sparklines

I’ll be honest: I have a very difficult time understanding the sparkline renderings in any of the design entries. Perhaps if I took a stab at it myself…?

Consider my attempt below:

Student test results rendered using sparklines


I ask you if you can see — at a glance — that the best performing students are at the top and the lowest performing students are at the bottom?  Can you see that Regan Petrero (about 60% of the way down the list) received “C”s for his first three assignments, a “B” for the fourth assignment, and a “D” for the fifth assignment?

Granted, I can try to make certain things stand out better by adding banding and not having the axis start at zero, but even with these additions I’m not able to come up with anything that tells as clear a story as what I get with a simple highlight table.

Student Data, Take Two – A Highlight Table

Here’s the same data rendered using a highlight table:

Student test results rendered using a highlight table


I can see immediately that Holly Norton is a straight “A” student, that Donald Chase just missed being a straight “A” student, and that Xu Mei has had some wide fluctuations. The chart is compact and easy to read, and I can discern both comparative performance and relative performance with very little effort.

What about Frederick Chandler?

If you look at my sparklines rendering, you will see that there may be an interesting story with respect to Frederick Chandler and the third assignment. In the sparkline you can see there was a big dip; in the highlight table you can only see that Mr. Chandler received an “F”.

It turns out that Mr. Chandler received a zero on the assignment. Is it important to show this, versus just showing a failing grade? I don’t know the answer, but if it is important then we can create a six-point color scale, as shown here:


Mr. Chandler’s zero, for all the world to see
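For readers who want to see what a six-point scale might look like as logic, here is a minimal sketch in Python. The post does not state the grading cut-offs, so the 90/80/70/60 thresholds, the function name `grade_bucket`, and the sample scores are all assumptions made for illustration. The only idea taken from the post is that a zero gets its own bucket, separate from an ordinary “F”.

```python
# Hypothetical cut-offs: the post does not state the grading scale.
def grade_bucket(score: float) -> str:
    """Map a 0-100 score to one of six color buckets."""
    if score == 0:
        return "Zero"  # a zero is singled out from other failing grades
    if score >= 90:
        return "A"
    if score >= 80:
        return "B"
    if score >= 70:
        return "C"
    if score >= 60:
        return "D"
    return "F"


# Made-up scores, one per bucket.
scores = [95, 84, 0, 72, 61, 45]
buckets = [grade_bucket(s) for s in scores]
print(buckets)
```

Each bucket would then be assigned its own color in the highlight table.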

 

See For Yourself

I present the sparklines and highlight table side-by-side in the dashboard below. Have a look and let me know what you think. If you have a way to make the sparklines “sing” better, by all means please share it.

Please realize that I’m not suggesting that you should never use sparklines; I only ask that you consider whether sparklines are the best way to show what is important about the data before you publish. I very much encourage you to explore other options.

Jan 16 2014
 

Overview

One of the new features in Tableau 8.1 that Tableau Software is trumpeting quite a bit is one-click Box and Whisker Plot generation. While I appreciate the new functionality, this chart type doesn’t “sing” to me as much as jittering does. Indeed, this “jittering” capability was the BIG discovery for me in 2013.

Let’s see how a box and whisker plot compares with jittering using a simple example.

Note: Interactive dashboards that illustrate jittering techniques may be found at the end of this blog post.  Feel free to download and explore.

Salary and Age Bins – Default

Consider the following pre-Tableau 8.1 salary chart that shows how salaries are distributed across age bins.


Figure 1 — Default Salary Distribution by Age Bins

 

While we can see that the top salaries are enjoyed by people in their 50s, there’s nothing that gives us concrete percentiles or shows us where the outliers are. We also can’t tell that there are in fact thousands of dots in the visualization, as so many marks are sitting on top of each other.

Salary and Age Bins – Box and Whisker Plot

To see percentiles and outliers we can use Tableau’s Show Me feature and click the Box-and-Whisker Plot button.


Figure 2 — Salary Distribution by Age Bins with Box and Whisker Overlay

 

This is definitely an improvement, but I really don’t “feel” the data as I can’t see how the dots are distributed; they are all stacked on top of each other.
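To make concrete what a box and whisker overlay summarizes, here is a rough Python sketch of the underlying numbers for a single age bin. The salaries are made up, and the 1.5 × IQR whisker rule is a common convention that I am assuming here; the exact quartile interpolation Tableau uses may differ from that of Python’s `statistics.quantiles`.

```python
import statistics

# Made-up salaries (in thousands) for one age bin; not data from the post.
salaries = [30, 35, 40, 45, 50, 55, 60, 200]

# Three cut points that split the data into quarters.
q1, median, q3 = statistics.quantiles(salaries, n=4)

# Assumed whisker rule: marks beyond 1.5 * IQR from the box count as outliers.
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [s for s in salaries if s < lower_fence or s > upper_fence]

print(q1, median, q3, outliers)
```

The box plot reports these summary values but, as noted above, it says nothing about how the individual marks between them are spread.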

Salary and Age Bins – Jitters

Here’s the original chart, but with the marks “jittered” using a modified version of Tableau’s built-in INDEX() function.


Figure 3 — Salary Distribution by Age Bins with the marks “jittered”

This gives me a much better feel for the data as I can see how the thousands of marks cluster. Of course, I can still superimpose the box plot, as shown here.


Figure 4 — Salary Distribution by Age Bins with the marks “jittered” and box plot overlay

Getting Jitters Using INDEX()

To “jitter” the marks I create a calculated field called “Index” that uses Tableau’s INDEX() function.  I put this on the Columns shelf and compute using ID, as shown here.


Figure 5 – First attempt using Tableau’s INDEX() function

It turns out that for this particular example INDEX() by itself works because there is an equal distribution of IDs across each of the age bins.  Consider the example below where we show a distribution of Superstore Sales across different customer segments.


Figure 6 – Shortcomings of using INDEX() by itself.

Notice that the strip of dots within “Corporate” is much wider than the other segments because there are more orders within “Corporate” than there are in the other segments.
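The effect is easy to reproduce outside Tableau. The Python sketch below mimics an index that restarts at 1 within each segment, which is how I am assuming INDEX() behaves when computed along ID inside each partition. The order counts are made up and are not the actual Superstore figures, and `index_within_partition` is a hypothetical helper name.

```python
from collections import Counter

# Made-up order counts per segment; not the actual Superstore figures.
orders = ["Consumer"] * 40 + ["Corporate"] * 90 + ["Home Office"] * 30


def index_within_partition(segments):
    """Mimic an index that restarts in each segment: 1, 2, 3, ..."""
    seen = Counter()
    positions = []
    for segment in segments:
        seen[segment] += 1
        positions.append((segment, seen[segment]))
    return positions


positions = index_within_partition(orders)

# The strip is as wide as the largest index reached in that segment.
strip_width = {}
for segment, idx in positions:
    strip_width[segment] = max(strip_width.get(segment, 0), idx)

print(strip_width)
```

Because the largest index equals the number of marks in the partition, the segment with the most orders gets the widest strip.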

The easiest way to fix this is to edit the axis and select “Independent ranges for each row or column” from the Edit Axis dialog box. While this will work fine, we’ll look at a different technique that will allow us to control the degree of jittering.

Using Modulus to Control Jittering

When I first blogged about this technique last year, Alex Kerin of Data Driven suggested a simple and elegant solution to different-sized partitions using Tableau’s Mod function. For those of you who forgot your high school mathematics, we use a modulus to determine the remainder when you divide one number by another. Here’s an example:

14 ≡ 30 Mod 8

Translation: 14 is equivalent to 30 Mod 8 because you get the same remainder when you divide 14 by 8 as when you divide 30 by 8 (both remainders are equal to 6).
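The arithmetic can be checked in a couple of lines of Python, whose `%` operator returns the remainder for positive integers. The variable names are mine and serve only as illustration.

```python
a, b, modulus = 14, 30, 8

# Both numbers leave the same remainder when divided by 8.
print(a % modulus, b % modulus)

# Two equivalent ways to state that a and b are congruent modulo 8.
congruent = (a % modulus) == (b % modulus)
difference_divisible = (b - a) % modulus == 0
print(congruent, difference_divisible)
```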

So, how do we use this capability in our visualization? We want the strip of dots in each segment to be the same width, so instead of using INDEX() we will use INDEX()%25.

This will create 25 “rows” of dots within each segment.

Specifically, when

INDEX()=1, INDEX()%25 will be mapped to 1
INDEX()=2, INDEX()%25 will be mapped to 2


INDEX()=26, INDEX()%25 will be mapped to 1
INDEX()=27, INDEX()%25 will be mapped to 2
etc.
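The mapping above can be sketched in Python to confirm that every segment ends up with the same set of positions, no matter how many marks it holds. The function name `jitter_position` and the partition sizes are hypothetical; only the `index % 25` expression comes from the technique described here.

```python
def jitter_position(index: int, mod: int = 25) -> int:
    """Position for a mark, mirroring INDEX() % mod."""
    return index % mod


# Hypothetical partition sizes, chosen only to show unequal segments.
partition_sizes = {"Consumer": 40, "Corporate": 90, "Home Office": 30}

for name, size in partition_sizes.items():
    # The index runs from 1 to the number of marks in the partition.
    used = {jitter_position(i) for i in range(1, size + 1)}
    print(name, len(used), min(used), max(used))
```

Note that an index of 25 maps to 0, so the positions run from 0 through 24, which gives the 25 “rows” in each segment.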

Note that 25 is not a magic number.  For this example anything above 15 will do the trick (and in the demo workbook I have a parameter slider that controls the MOD setting).

Conclusion

Jittering is a very simple technique and it helps overcome the problem of marks being stacked atop each other when plotting a distribution within a dimension.  It only takes up a little more screen real estate and it packs a terrific visual wallop.