Dec 09, 2014

In Part One of this series I discussed why the Tableau support community is unique and why you should care. In Part Two I shared my thoughts on the early years of the community and how one person in particular set the tone for sharing knowledge and expertise.  In this final post I make recommendations on the things you can do to ensure that the community continues to thrive.

What you can, and should, do to ensure the community thrives

I rely on this community to inspire me, cheer me on, and help me when I need it.

I don’t want to lose this invaluable asset, so I’m going to enlist you to contribute to its wellbeing, assuming you are not already doing so. Here are some things you can do.

Ask for help

If you can’t find what you need through a web search, ask for help as it will help the community as a whole. While counterintuitive, asking for help will generate a discussion that will lead to solutions that will help not just you but others that are having or will have the same problem you have.

And just where should you ask for help?  Tableau’s community forum is a great place to start.  If you look you will see a lot of Zen Masters who have posted questions, not just answers, through the years.

In addition to asking, if you want to observe noteworthy Tableau activity, make sure to check out the Twitter hashtag #tableau and also check out the list of Tableau-related tweeters Andy Cotgreave has assembled here.

Show the love

If someone has helped you or something has inspired you, send them a “thank you” e-mail, launch a tweet, comment on that person’s blog, but above all please let the person who helped you know you appreciate what he / she has done (and in my case feel free to send dark chocolate and / or red wine).





Figure 1 — Beers were free at The 2014 Tableau Conference. But I appreciate the sentiment.

Cheering each other on is a big reason the community thrives.

Post your work to Tableau Public.

If you create something worthwhile, share it with the world.  Tableau Public makes it easy, and it’s free.

If you recall from Part One, I stated that Tableau Public is a masterstroke in fostering community and visualization excellence in that it provides a free service for people to post their work.  The public will in turn remark on the work, but the really amazing thing is that a Tableau user can download packaged workbooks to see how they work.

Consider this great “how-to” example from Josh Milligan.


Figure 2 — A great “how to” example from Josh Milligan that anybody can download and dissect.

Notice the “Download” button in the bottom right corner.  With Tableau Public I can do more than just interact with the viz; I can download the workbook and see how the person built it.

Help others whenever and wherever you can.

You may not be able to pay back the person or people that helped you, but you can help others.  Do not feel pressure to change the world or have the same impact as a Joe Mako or a Jonathan Drummey, but there’s a lot you can do including participating on the Tableau forum, writing a blog, attending a user group meeting (live or virtual), helping a non-profit understand their data, or just commenting on someone else’s work.

With respect to the forum, try “lurking” (just hanging out and observing the various conversations) to see if this might be an outlet for your abilities. If nothing else you’ll learn a great deal.

With respect to blogging, the barrier to entry has never been lower and this is a great way to find your voice and contribute to the community.  Indeed, Andy Cotgreave maintains that if you have a Google account you can create a blog and publish a post in fewer than three minutes.


Figure 3 — Dan Montgomery, Paul Banoub, Anya A’Hearn, and Jewel Loree stopping traffic and evangelizing blogging at the 2014 Tableau Conference.

Do not celebrate or reward mediocre work.

We should, as a community, be working to improve the art and should not reward stuff that isn’t good.  I’m not saying that you should be a jerk (remember, there are no jerks in this community, at least not yet) but if you see something that you know can be better, please let the person – and the world – know what you would do to make it better.

I’ve written about this on several occasions (please see My Problems with a Company’s Iron Viz Competition and Ask These Three Questions.)

Incidentally, people critique my work all the time and I’m grateful for the feedback.  Indeed, if I have any “high-stakes” dashboards I want to publish I will always ask both colleagues and laypeople to review the work before it goes live (please see the “Usability” section of Your Tableau Public Viz is Ugly *and* Confusing.)

Don’t be too hard on yourself

I remember something Joe Mako told me several years ago:

I like Tableau because it allows me to fail faster.

Do not be afraid to fail, and to fail easily and often. It takes time, study, and practice to get good at data visualization and Tableau.  Do not be afraid to post something on Tableau Public and ask for help or criticism.  Most people will offer constructive help and you’ll get better, fast.

I look forward to seeing your work, reading your tweets, and pondering your questions.


Dec 01, 2014

In this continuation from Part One I share my thoughts on the early years of the community and how one person in particular set the tone for sharing knowledge and expertise.  

How did this start?

There were a lot of really great people contributing to the community in Tableau’s early years.  I’ve already mentioned Jonathan Drummey and Richard Leeke.  Others I recall include Alex Kerin, Andy Cotgreave, James Baker, and Russell Christopher.

But there’s one person in particular that I think set the tone and established the precedent for discovering and sharing Tableau knowledge.

Meet Joe Mako

At the 2011 customer conference in Las Vegas, Tableau singled out Joe Mako for responding to an unfathomable number of Tableau forum posts that directly helped hundreds and indirectly helped thousands of people.

And what prompted Tableau to do this? This is a great example of the community raising its voice and recognizing contributions of one of its own.  Several people, including Matt Shoemaker, Richard Leeke, Dan Murray, Mel Stephenson, Tim Costello, and Tom Brown, petitioned Tableau, urging the company to recognize Joe for his incredible contributions.

This recognition by Tableau led to the creation of the Tableau Zen Master program.

For the first two years of the program Tableau used the following Venn diagram to show the confluence of skills and temperament that comprise a Zen Master:


Figure 7 — Generic Zen Master Venn Diagram

Here’s what I think would be the appropriate diagram for Joe:


Figure 8 — Joe Mako Zen Master diagram

Anyone who has been on a screen sharing call with Joe (and there are probably hundreds of people who have availed themselves of Joe’s generosity) can attest to stunning, fully-fleshed solutions emerging from Joe’s brain.

I’ve spent enough time with Joe to realize it isn’t just Robin Williams-like brilliance but Bruce Lee-type discipline Joe has applied to really understanding Tableau.

Joe is also remarkably kind and reassuring, offering a soothing, Mr. Rogers-like “don’t worry, we’ll figure this out together” encouragement whenever I’ve been stuck and needed help.

The only downside of a screen sharing session with Joe is that you get off the phone and think that “jeez, this guy is smarter than I am, more disciplined than I am, and … he’s nicer than I am” (and I’m a very nice guy.)

We are very lucky to have him in our community.

How did these seeds produce so much fruit?

How did the contributions of Joe and a handful of others lead to such a large, rich community?

I can’t speak for others, but my contributing stemmed from a desire to repay those people (especially Joe) that had given me so much help.

The problem was that I could not pay these people back directly as there was not much I had to offer them, save appreciation and gratitude.*

But I could give back to the community as a whole.  In my case I don’t attempt to answer forum posts in near real time.  This skill is best left to folks like Shawn Wallwork, Mathew Lutton, Grayson Deal, Joe Oppelt, Noah Salvaterra, Joshua Milligan, KK Molugu, and many others that do an amazing job.  I give back through pro bono work and blog posts.  Specifically, in addition to helping out non-profit organizations I try to publish a useful “here’s how you do this” blog post at least once a month.  The posts can take hours to write, but that’s a small price to pay for what the community gives me in return (and I will confess that they do generate interest in my work).

If you are like me you rely on this community to help and inspire you.  I, for one, love having the safety net of knowing that there are literally dozens of great minds that I can tap for help and inspiration.

I’ve already told you what I do to contribute to the community.  In Part Three of this series I’ll provide ideas on what others can do to ensure the community thrives.

* Note: Expressing appreciation and gratitude is essential to the community.  I’ll discuss this more in Part Three of this series.

Nov 24, 2014

This is the first in a series of three blog posts that explores what makes Tableau’s support community so special, why you should care, and what you can do to ensure that the community thrives.

My sincere thanks go to Andy Cotgreave and Jonathan Drummey who reviewed an early draft and provided invaluable feedback.  Their actions exemplify what this community is about.


Anybody recognize that form of user ID?  It’s my CompuServe ID, circa 1985.


Figure 1 — What getting help from a support forum looked like in the Australopithecus era of personal computing.

I bring this up because over the past 30 years I’ve been involved with dozens of different software communities and I’ve never seen anything that matches the quality, creativity, generosity, and, well, love, that surrounds Tableau.

And just what makes me say that, and why should you care?

Let me start with the second question first.

Why you should care

When people ask me why I like Tableau and why I recommend the product I tell them the following:

  • Tableau more often than not allows “mere mortals” to discover what is important in their data, create compelling visualizations, and share these visualizations with others. By “mere mortals” I mean people not steeped in data science or graphic design.
  • If, however, you are not able to figure out what to do there is a community of people at your beck and call that will help you. The community will NOT allow you to fail.
  • This same community will inspire you to do better work.

Examples of the community at work and what makes it unique

Forum Posts

I was at a client earlier this year and we had our collective heads down trying to find a visualization that would really make the underlying data sing.

The client asked about displaying pie marks on a scatterplot. Although I was almost certain that this would not yield useful insights (my polite way of saying it would look really dumb), I figured we might as well give it a try as it’s so easy to just try stuff with Tableau and see if it yields good results.

Well, it’s usually really easy to just try stuff.

In this instance I was spinning my wheels for a bit so I invoked what I call the Tableau ten-minute rule:

If, after ten minutes of working in Tableau you come to the conclusion that you are not making any progress, see if anybody else has already solved the problem.

So, I typed the following into Google:

“scatterplot with pie marks Tableau”

My first hit was this:



Figure 2 — A snippet of an amazingly cogent response, typical of Jonathan Drummey and so many others who share their expertise freely on the Tableau support forum.

Holy Venn Diagram, Batman!  Not only was this exactly what I needed, but the post contained a wonderfully-written discourse on how Tableau works.  Who takes the time to write stuff like this?

Ah, of course.  Jonathan Drummey:  Tableau Zen Master, Tableau forum contributor extraordinaire, and author of Drawing with Numbers, one of the “must read” data visualization and Tableau blogs.

This experience reminded me of many times that I’ve been flummoxed and I’ve sought help on the forum.  One “career-critical” incident occurred just as I was starting out as an independent consultant and I was in a quandary.  I managed to find my post and the attendant response from Richard Leeke (another generous-with-his-time-and-expertise Zen Master) at http://community.tableausoftware.com/thread/111283.


Figure 3 — Richard Leeke, saving my hide at a critical juncture in my career.

I can still remember my relief when I tried Richard’s solution and it worked.  Perfectly.


While the forum posts make the community so helpful, it’s the blog posts that make it so rich.  There are scores of people sharing their solutions and creativity and their influence on me has been profound.

I’ll cite one example.

In mid-2013 I decided to check out the “viz of the day” and see if there was anything good.  I’ll confess that I can be an insufferable snob when it comes to dashboard design and my previous visits to this site had left me unimpressed.

But then I saw this http://www.tableausoftware.com/public/gallery/socialworld


Figure 4 — An example of Kelly Martin’s work

F&*k!  While I wasn’t paying attention a bunch of people really raised the bar on Tableau dashboard design.  Now aware that I was not quite as bad-ass as I thought, I decided to find out who built this particular dashboard and discovered Kelly Martin of VizCandy.

There is so much great work here, and so many useful blog posts.  This combination of quality and quantity is indicative of dozens of the Tableau bloggers that contribute to the community by creating something either useful or beautiful (or both), posting for the world to see using Tableau Public, and then explaining how they did it.

(I won’t attempt to list all of the great blogs and attendant posts, but in early 2014 Andy Cotgreave assembled a list of influential Tableau-related blog posts.)

Tableau Public

Tableau Public is a masterstroke in fostering community and visualization excellence in that it provides a free service for people to post their work.  The public will in turn remark on the work (both praise and brickbats), but the really amazing thing is that a Tableau user can download packaged workbooks to see how they work.

Consider this great example from Mark Jackson.


Figure 5 — A great example of Tableau’s Story Points feature from Mark Jackson.

Notice the “Download” button in the bottom right corner.  With Tableau Public I can do more than just interact with the viz; I can download the workbook and see how the person built it.

I can cite dozens of times where somebody posts something cool that somebody else downloads, dissects, improves, and then re-posts, only to inspire somebody else to repeat the process.

I’ve been part of this “cycle-of-improvement” on several occasions and have downloaded hundreds of workbooks to see for myself how a fellow Tableau author built something wonderful.  I’ve gotten much better at what I do because of this free exchange.

Note: I encourage you to read Andy Cotgreave’s post about why a chart should start, not end, a conversation.  In this article you will find a great example of how three Tableau users found different and important truths in the same data set.

Tableau Employees

When I first started working on this essay in the spring something had gone horribly wrong with the forum: when I conducted a search through Google – or even on the forum itself – I could not find any forum posts.

This was not a good thing; my entire thesis of “if you can’t figure it out yourself, conduct a search on Google” was not going to withstand any scrutiny if none of the forum posts would show up in the search.

I won’t get into the details of what happened but it was an unintended consequence of making a change to the way the underlying support system (Jive Software) had been implemented.

I was irate over what had happened but as much as it was bothering me, it was bothering the folks tasked with making the forums work even more.  These are Tableau employees that care deeply about nurturing the community and I suspect they weren’t sleeping well while this was going on. It took a while, but they fixed it, so major kudos to Tracy Fitzgerald, Dustin Smith, Ross Perez, and Patrick Van Der Hyde for their efforts in remedying the problem and in nurturing the online community.

Note: There’s so much more I should write about Tableau employees but I’ve decided to mention just those employees of which I am aware whose work focuses mostly on nurturing the support community.

No Jerks

I mentioned at the beginning of this essay that I’ve been involved with a lot of software communities and they have had more than their fair share of dysfunctional, bordering on toxic, personalities.


Figure 6 –For whatever reason, there don’t seem to be any jerks (at least of which I’m aware) in the Tableau community.

I have yet to meet any jerks, either in person or online, within the Tableau community.  They may exist, but so far I’ve only met people that were smart, eager to learn, eager to share, and remarkably well adjusted.

I have some ideas on why the community is as functional as it is, and it has a lot to do with the temperament of some of the earliest contributors (and one contributor in particular).

I will share my thoughts in Part Two of this series.

Sep 18, 2014


A dashboard from Radio Free Europe / Radio Liberty has received a lot of views since it was published earlier this week and for good reason: There’s a lot of important information packed into a compelling story.

There’s a lot I like about the dashboard, but there are two things that I believe desperately need to be corrected.

Before going any further you can see the dashboard here.

What I Like and Don’t Like

Here’s a re-sized snapshot of the dashboard.


What I Like:

  • Colors
  • Hovering over a country provides more information about the country.
  • Syria and Iraq are labelled so I can find the focus of the story quickly.
  • The author has presented normalized data in the bottom chart  to show proportion of fighters from a particular country. This is brilliant and important.

What I Do Not Like:

  • There are two different axes at the bottom of each bar chart with very different values. If I don’t look at the axes I would think that Belgium has the same number of fighters per million as Tunisia. This defeats the brilliance of presenting the data in a normalized form.
  • The fighter icon takes away from the gravity of the story as this should not be a frivolous visualization. The visualization should skew more towards The Economist and less towards USA Today (see this post.)
  • The bars are in alphabetical order.

The Makeover

Here’s what I’ve changed.

  • Clicking a country name in one of the visualizations will highlight that country in the other visualization making it easy to find.
  • The axes for the two sets of bar charts at the bottom are consistent.
  • I’ve replaced the fighter icons with labeled bars.
  • You can switch between displaying normalized data (fighters per million) and the number of fighters.
  • The bars are color coded to reflect the same color legend in the top chart.
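The normalization and the shared axis are simple arithmetic, and a rough Python sketch may make the idea concrete. The country names and figures below are made up for illustration (they are not the dashboard's data), and the function name is my own.

```python
# Hypothetical figures for illustration only; NOT the dashboard's data.
countries = {
    "Country A": {"fighters": 1000, "population": 50_000_000},
    "Country B": {"fighters": 2000, "population": 10_000_000},
    "Country C": {"fighters": 300, "population": 2_000_000},
}


def fighters_per_million(fighters, population):
    """Normalize a raw count to a rate per one million inhabitants."""
    return fighters / (population / 1_000_000)


rates = {
    name: fighters_per_million(d["fighters"], d["population"])
    for name, d in countries.items()
}

# Sort by the normalized value (largest first) instead of alphabetically.
ranked = sorted(rates.items(), key=lambda item: item[1], reverse=True)

# Use ONE axis maximum for every bar chart so bar lengths are comparable.
shared_axis_max = max(rates.values())

for name, rate in ranked:
    print(f"{name}: {rate:.1f} per million (axis 0 to {shared_axis_max:.0f})")
```

Note how the ranking changes once the counts are normalized: in this toy data Country C has the fewest fighters but the second-highest rate.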

Click here or the image below to access the interactive dashboard.


Sep 16, 2014

Finally, a good use for packed bubbles!

The Problem

I recently received a query from a client on how to compare responses to one question with responses to another question when both questions have possible Likert values of 1, 2, 3, 4, and 5.  That is, if you have a collection of questions like this:


How would you show response clusters when you compare “Good Job Skills” against “Likes the Beatles”?

This question is particularly applicable if you are a provider of goods and services and you want to see if there is alignment or misalignment between “how important is this feature” and “how satisfied are you with this feature”.

Note: There’s a Tableau forum thread that has been looking into this issue as well.  Please see http://community.tableausoftware.com/thread/137719.

So, how can we fashion something that helps us understand the data?

Before we get into the nitty gritty here’s a screen shot of one of the approaches I favor.  Have a look to determine if reading the rest of the blog post is worth the effort.


Still reading?  Well, I guess it’s worth the effort.

The Traditional Scatterplot Approach

Consider the set up below where we see how Tableau would present the Likert vs. Likert results in a standard scatterplot.


So, what is going on here?

There are a total of nine Likert questions available from the X-Question and Y-Question parameter drop down list boxes.  Our desire here is to allow us to compare any two of the nine at any time.

The “meat” of the visualization comes from the SUM(X-Value) on the columns shelf and SUM(Y-Value) on the rows shelf. X-Value and Y-Value are defined the same way, each against its own parameter; X-Value, for example, is defined as

IF [Wording]=[X-Question] THEN [Value]+1 END

This translates into “if the selected item from the list is the same as one of the questions you want to analyze, use the [Value] for that question”. Note that [Wording] is the same as [QuestionID] but with human readable values (e.g., “Likes the Beatles” instead of “Q52”).

We use [Value]+1 because the Likert values are set to go from 0 to 4 instead of 1 to 5, and most people expect 1 to 5.
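For readers who think better in code, here is a rough Python analogue of what the calculated field and the SUM per respondent are doing. The field names mirror the ones in this post, but the rows are made up and the helper functions are my own illustration, not anything Tableau runs internally.

```python
# Tall survey data: one row per respondent per question (made-up rows).
rows = [
    {"Resp ID": 1, "Wording": "Good Job Skills", "Value": 3},
    {"Resp ID": 1, "Wording": "Likes the Beatles", "Value": 4},
    {"Resp ID": 2, "Wording": "Good Job Skills", "Value": 0},
    {"Resp ID": 2, "Wording": "Likes the Beatles", "Value": 2},
]


def axis_value(row, selected_question):
    """Analogue of: IF [Wording]=[X-Question] THEN [Value]+1 END"""
    if row["Wording"] == selected_question:
        return row["Value"] + 1  # shift the stored 0..4 scale to 1..5
    return None  # no ELSE branch, so the calculation yields NULL


def likert_points(rows, x_question, y_question):
    """Sum the X and Y values per respondent, then drop incomplete points."""
    points = {}
    for row in rows:
        for axis, question in (("x", x_question), ("y", y_question)):
            value = axis_value(row, question)
            if value is None:
                continue
            point = points.setdefault(row["Resp ID"], {"x": None, "y": None})
            point[axis] = (point[axis] or 0) + value
    # Keep only respondents who answered both selected questions.
    return {
        resp_id: p
        for resp_id, p in points.items()
        if p["x"] is not None and p["y"] is not None
    }


print(likert_points(rows, "Good Job Skills", "Likes the Beatles"))
```

Because each respondent has only one row per question, the "sum" is really just picking out that respondent's single answer for each axis.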

We can use SUM(X-Value) and SUM(Y-Value) because we have Resp ID on the Detail shelf.  This forces Tableau to draw a circle for every respondent.  The problem is that we have overlapping circles and even with transparency you don’t get a sense of where responses cluster. Yes, it is possible with a table calculation to change the size of the circle based on count but I’ll provide what I think is a better approach below.

A note about the filters: The Question filter is there to constrain our view so that we only concern ourselves with Likert Scale questions.  It isn’t necessary but is useful should we be experimenting with different approaches.  The SUM(X-Value) and SUM(Y-Value) filters remove nulls from the view.

Packed Bubbles to the Rescue

I’m not a big fan of packed bubbles (see this post) but for this situation we can use them and get some great results, as shown below.


I’ve made a couple of changes to the traditional scatterplot visualization, the most important being that SUM(X-Value) and SUM(Y-Value) are now discrete, so we get a trellised visualization instead of continuous axes.  Note that I had to change the sort order of the Y-axis elements so that they appear in reverse order (5 down to 1).

I got the packed bubbles by placing CNTD(Resp ID) on the size button. This assures that each bubble is the same size and triggers Tableau’s packing algorithm.

Note that I also added an on-demand “Drill down” so that you can color the circles by different demographic dimensions.

I’ve experimented with this with some large data sets and Tableau does a great job with packing the bubbles intelligently.
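If you want a quick sanity check of what the trellis should show, counting respondents per cell is enough. This is only a sketch with made-up responses; it says nothing about how Tableau's packing algorithm actually positions the bubbles.

```python
from collections import Counter

# Made-up (x, y) Likert pairs, one per respondent, already on the 1-5 scale.
responses = [(4, 5), (4, 5), (1, 3), (4, 5), (2, 2), (1, 3)]

# Number of respondents (and therefore bubbles) in each trellis cell.
cell_counts = Counter(responses)

# Print the grid with Y running from 5 down to 1, like the reversed axis.
for y in range(5, 0, -1):
    print(y, [cell_counts[(x, y)] for x in range(1, 6)])
```

A cell with a count of three will hold three equally sized bubbles, which is exactly the clustering that overlapping circles hide in the standard scatterplot.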

What About Trend Lines?

Since we are using discrete measures on the rows and columns shelves we cannot produce trend lines.  When I first started this project I experimented with more traditional jittering and was able, with a fair amount of fuss and bother, to produce this.


A special thanks to Jeffrey Shaffer who provided a link on how to create pseudo-random numbers in Tableau (thank you, Josh Milligan).

I prefer the example that doesn’t require the jittering, but if you need trend lines or if you prefer the jittered look I’ve included the example in the downloaded packaged workbook (see below).

It also occurred to me that the trend line would be based on the jittered values and not the actual values.  The same workbook contains a “home grown” trend line based on the actual values (courtesy of Joe Mako). It turns out the jittered trend line is almost identical to the non-jittered trend line so I suspect you won’t need to take the “home grown” approach.
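Here is a small Python sketch of the same comparison: fit a straight line to the actual values, fit another to jittered copies, and compare. The data, the jitter width of 0.3, and the function names are all assumptions of mine; the workbook does this with Tableau calculations, not with this code.

```python
import random


def jitter(value, spread=0.3, rng=random):
    """Nudge a value by a uniform random amount in [-spread, +spread]."""
    return value + rng.uniform(-spread, spread)


def least_squares(points):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Made-up Likert pairs on the 1-5 scale.
actual = [(1, 2), (2, 2), (3, 3), (4, 5), (5, 4), (4, 4), (2, 1), (5, 5)]

rng = random.Random(42)  # seeded so the jitter is repeatable
jittered = [(jitter(x, rng=rng), jitter(y, rng=rng)) for x, y in actual]

print("actual   slope, intercept:", least_squares(actual))
print("jittered slope, intercept:", least_squares(jittered))
```

Since the jitter is small and averages out to roughly zero, the two fitted lines should usually land close to each other, which matches what I saw in the workbook.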


I received a number of comments here and on LinkedIn about the “drill down / break down” capability and that it is hard to see the percentage of dots by category.  For example, if you break down by generation do the dots for one generation cluster more in one part of the trellis than in others?

I thought that in this case having a different-colored bubble per category where the size of the bubble was proportionate to the percentage of responses within that category made sense.
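The sizing logic itself is easy to state in code, even if building it in Tableau was not. In this sketch each bubble's size is the share of a category's respondents that fall in a given cell. The generation labels and responses are made up, and this is my own illustration rather than the calculation used in the workbook.

```python
from collections import Counter

# Made-up responses: (category, x, y), one per respondent.
responses = [
    ("Boomer", 4, 5), ("Boomer", 4, 5), ("Boomer", 1, 3), ("Boomer", 2, 2),
    ("Millennial", 4, 5), ("Millennial", 1, 3),
]

# Total respondents per category (the denominator).
totals = Counter(category for category, _, _ in responses)

# Respondents per category per trellis cell (the numerator).
cells = Counter(responses)

# Bubble size: share of the category's responses that land in the cell.
share = {
    (category, x, y): count / totals[category]
    for (category, x, y), count in cells.items()
}

for key in sorted(share):
    print(key, f"{share[key]:.0%}")
```

Dividing by the category total, not the overall total, is what lets you compare a small category with a large one on equal footing.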

Size by Category

I thought building this would be easy, but I needed to call in the heavy artillery (Joe Mako).

I’ll blog about the solution later. In the meantime the packaged workbook below contains this additional approach.

Sep 03, 2014


In my experience the number one impediment to success with Tableau is getting data in a format that plays nicely with Tableau. Alteryx is a combination ETL (extract, transform, load), geospatial, and statistical modeling solution that just may solve this “getting-the-data-right” problem.

And it plays very nicely with Tableau.

In this blog I will recount some experiences I’ve had with Alteryx and some thoughts on what the future might hold for the two companies.

Client One

In April of 2013 I was working with one of my favorite clients and we ran into a roadblock in that they needed to blend a lot of data from disparate sources and this confluence of data was an absolute monstrosity. I was on the precipice of recommending that they hire a data warehousing consultant when I happened to attend a Tableau road show in New York City. Alteryx sponsored the lunch that day and I paid attention to their presentation.

Fast-forwarding a bit, I received a very compelling demo and product roadmap from Dean Stoecker, Alteryx’s CEO.

I visited my client the next day and told them to hold off on hiring the consultant and look into Alteryx as a better short-, mid-, and long-term solution.  It proved to be a good recommendation as the client is now blending data from a lot of sources and gleaning insights that would have taken much longer had we gone the consultant / consolidated warehouse route.

Client Two

About six months ago I was working with another client that was swimming in data, but that data was missing some key elements.  The client was tracking hundreds of service calls throughout the New York metropolitan area and although they had street addresses for every incident they were only able to produce a map that showed results at the county level, and this didn’t reveal very much.

I asked them to look into seeing if the incidents clustered in particular neighborhoods.  For this we would need latitude and longitude coordinates for each street address.

Six weeks later the client triumphantly called and told me their IT department had finally added zip code information to the database query they used to drive the visualization.  I sighed and politely told them that while having zip code-level data was better than county-level data, zip codes would not give us the granularity we needed.

At this point I asked them to send me a copy of the data and, armed with Alteryx and the TomTom US maps, I generated latitude and longitude for 99% of the addresses.

And I did this in about 15 minutes.

The next day I presented the client with a symbol map that contained a different color-coded circle for every incident in the database.  I wish I could tell you that we discovered something truly amazing once we had this (we didn’t) but the critical point is that tools like Alteryx and Tableau allowed us to pursue a hunch in a matter of minutes. The next hunch might yield an incredible insight and with Alteryx and Tableau we can investigate these notions without having to tax an already over-burdened IT department.

Client Three – Me

Any followers of this blog know that I do a lot of work with survey data. To get Tableau to do what it does so well, survey data as downloaded from a survey tool needs to be parsed, pummeled, and browbeaten into submission.

For years I’ve been using either Tableau’s free Excel add-in (when the data is in that format) or relying on the kindness of DBAs to render the data in the format I need.

The problem with the Excel approach is that it requires a lot of hand manipulation and if the client decides they want to include responses after they have sworn the survey is closed, well, I end up having to go through the whole error-prone process a second or third time.

Enter Alteryx, which allows me to set up the process once, automate it, and then run it whenever I need.    The icing on the cake is that it generates a ready-to-use Tableau .TDE file. In addition to the process being faster and safer, I can start visualizing survey data way before the survey is closed. This has been a huge time saver for me and I will never go back to hand-massaging the data. Plus, if the data source is a database (vs. downloaded files) I can apply the same tool and the same process without needing the DBA to fashion anything special for me.
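To give a sense of the kind of reshaping involved, here is a minimal Python sketch that turns "wide" survey data (one row per respondent, one column per question) into the "tall" layout (one row per respondent per question). I am assuming that tall layout is the target, and the column names and sample rows are made up. This is not how Alteryx does it internally; it just shows why a scripted, repeatable process beats hand manipulation.

```python
def reshape_wide_to_tall(wide_rows, id_field, question_fields):
    """Emit one output row per respondent per answered question."""
    tall_rows = []
    for row in wide_rows:
        for question in question_fields:
            value = row.get(question)
            if value is None or value == "":
                continue  # skip questions the respondent did not answer
            tall_rows.append(
                {"Resp ID": row[id_field], "QuestionID": question, "Value": value}
            )
    return tall_rows


# Made-up download from a survey tool: one row per respondent.
wide = [
    {"Resp ID": 1, "Q1": 3, "Q2": 4},
    {"Resp ID": 2, "Q1": 0, "Q2": ""},  # respondent 2 skipped Q2
]

for tall_row in reshape_wide_to_tall(wide, "Resp ID", ["Q1", "Q2"]):
    print(tall_row)
```

When late responses arrive, you rerun the function on the new download instead of repeating the manual steps.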

Will Tableau Acquire Alteryx?

Given that Alteryx fills a gap that currently exists with Tableau and that the two products play so nicely together, in September 2013 I predicted that within a year Tableau would acquire Alteryx.

So I was wrong.  But will it happen down the road?  I do like how the two tools work together but there are some things about Alteryx that Tableau users may find off-putting, including:

Alteryx is a “heavy lunch”

Alteryx is an ETL, geospatial, predictive modeling, breath mint, candy mint, floor wax, all-in-one tool.  This cornucopia of options can be intimidating.

Alteryx assumes more knowledge

Alteryx assumes a greater level of programming sophistication than does Tableau. For example, Alteryx makes a distinction between equivalence and assignment. In Alteryx you would write a single equal sign

X=7

to assign the value 7 to the variable X.  But if you were performing a comparison in an IF statement you would write

IF X==7 Then [what to do] ENDIF

Tableau does not make the distinction between the single and double equal sign.  Granted, if you attempt to use one equal sign in Alteryx you’ll get an error message with the suggestion that you should use the double equal sign, but there appears to be an assumption that the user comes into Alteryx with an appreciation for standard programming syntax.
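If you have never worked in a language that makes this distinction, Python is a handy place to see it, since it follows the same convention. The snippet below is an analogy only; it is Python, not Alteryx's expression syntax.

```python
x = 7  # single equal sign: assign the value 7 to x

if x == 7:  # double equal sign: compare x with 7
    result = "x is seven"
else:
    result = "x is something else"

# Using a single equal sign inside the comparison is rejected outright.
try:
    compile("if x = 7:\n    pass\n", "<example>", "exec")
    rejected = False
except SyntaxError:
    rejected = True

print(result, "| single '=' in an if rejected:", rejected)
```

As with Alteryx, the language catches the mistake for you, but it does assume you know why the two symbols differ.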

It’s Easier to Break Things in Alteryx

If you change the name of a field in Tableau everything that refers to that field also changes.  This is not the case with Alteryx and modifying your Alteryx module to address this field name change can be a pain.

Indeed, I think this example epitomizes the difference in refinement between the two tools.  Don’t get me wrong, Alteryx is a terrific tool and I am very happy to have it in my quiver, but there is a higher degree of user affordance in Tableau, and users accustomed to Tableau’s luxury car feel may find Alteryx a bit of a bumpy ride.

So, while I don’t see Tableau acquiring Alteryx and supporting the tool in its current form, who knows what will happen as both tools and companies mature?

So, Should You Buy Alteryx?

I can’t answer this question but you should at least download an evaluation copy and try it out.  I will tell you that for my survey data practice the product has been a godsend.

Aug 272014


I was reading a very interesting blog post this weekend from which I learned that counseling may do little to help young people with drinking problems.  Specifically, counseling reduces the average number of drinks consumed from 13.7 drinks per week to 12.2 drinks per week.

I wondered if a visual might drive this point home, and then wondered how different media outlets would display the chart if the article were to appear in that outlet.

The Economist

Here’s what an accompanying chart might look like in The Economist.

Actually, in The Economist the axis would appear along the top of the chart, but there’s no easy way to do that in Tableau.

USA Today

Here’s how USA Today might handle the same information.

Special thanks to Joe Mako who helped me with data densification / padding and the masking so I would not have to resort to Excel to create the pictogram.

Fox News

Depending on whether the editorial board wanted to slant the results, here’s how the chart might appear at Fox News.

Think I’m kidding? Check out this link.  The chart does not start at zero and the axis is hidden.
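To put a number on how much a truncated axis can exaggerate a difference, here is a small Python sketch using the two values from the study (13.7 and 12.2 drinks per week). The baseline of 12 is my own hypothetical choice for where a slanted chart might start its axis; the function name is also mine.

```python
def drawn_ratio(before, after, baseline=0.0):
    """Ratio of the two bar lengths as drawn when the axis starts at `baseline`."""
    if baseline >= min(before, after):
        raise ValueError("baseline must sit below both values")
    return (before - baseline) / (after - baseline)

before, after = 13.7, 12.2   # drinks per week, from the post

honest = drawn_ratio(before, after)            # axis starts at zero
truncated = drawn_ratio(before, after, 12.0)   # hypothetical axis starting at 12

print(f"zero baseline: first bar is {honest:.2f}x the second")
print(f"baseline 12:   first bar is {truncated:.2f}x the second")
```

With a zero baseline the first bar is only about 12% longer than the second; with the hypothetical baseline of 12 it is drawn several times longer, even though the data are the same.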


Jun 252014


The catalyst for this post comes from my recent attendance at a Tableau user group where the presenter demonstrated a dashboard that featured a packed bubble chart.  I spent a lot of time shaking my head, not just because this was a very poor visualization choice, but because the presenter was in a position of authority and there were people in attendance who were new to Tableau and to data visualization.  These people would likely come away from the presentation thinking that they should, when presented with similar data, use a packed bubble chart.

I then recalled something that I had written previously:

If I see a visualization that is poorly designed or worse, misleading, I’m going to say something about it. I hope you will do the same.

The culprit visualization

I do not have the data that drove the Tableau user group visualization so I will use Superstore Sales data to illustrate my point.

For whatever reason, the presenter eschewed creating a clear and simple bar chart, like this one…


Figure 1 — A simple but abundantly clear bar chart.

… and instead built a difficult-to-interpret packed bubble chart that looked like this:


Figure 2 – A cool, but analytically-bereft packed bubble chart.

With the packed bubbles I have to work to determine which bubbles belong in which category and I have to work especially hard to determine how much larger a particular bubble is than another bubble.  In addition, in some cases the bubble is too small for the supporting sub-category and measure labels.

A good rule of thumb – Ask yourself these three questions

As I considered the flaws in this chart type I began to codify some simple principles that I use when building visualizations.  Specifically, before I go live with a visualization, I ask these three questions:

  1. Do I need different colors?
  2. Do I need a legend?
  3. Do I need measure labels?

In the case of the bar chart I don’t need to use color, I don’t need a legend and I don’t need to show the numbers next to the bars. I might want to show the numbers, but I don’t need to show them.  With the packed bubble chart I need all three items in order to make sense of the viz.

I’m not saying that you should never use color, legends, labels, or circles; I just suggest that you ask yourself if there’s a way to build a clear visualization that doesn’t need one or more of these elements, because the more of these elements you need, the harder your audience will have to work.

Let’s see how this triumvirate of questions exposes some of the flaws in pie charts, circle charts, 100% stacked bar charts, and “snakey” diagrams.

The problem with pies

Many people have written articles about this, my favorite being Stephen Few’s white paper on the subject.  Indeed, if you need ammunition to move your organization away from pie charts I encourage you to download Few’s paper.

I’ll present an abbreviated discussion of the problems with pies to show how it underscores the utility of the three questions.

Consider the chart below, which shows poll results for the question “What is your favorite beverage?”


Figure 3 — Simple pie chart showing poll results.

I can see that Chateau Lafite Rothschild comes in first, but I can’t tell if it’s Coffee or Dogfish IPA that comes in second, and I really can’t tell how much larger one segment is than another.

Here’s an alternative pie chart that adds color, a legend, and measure labels.


Figure 4 — Pie chart with added stuff so you can make sense of the pie chart.

Well, I can now determine the ranking and relative magnitude, but I have to spend a lot of time going back and forth between the legend and the chart.  Consider how much simpler it is to understand the poll results using a bar chart:


Figure 5 — Poll results displayed in a bar chart.

So, just why is the pie chart harder to understand?  In addition to requiring a legend, it also has to do with people’s inability to compare the area of circles.

The Problem with Circles

As with pies, Stephen Few has written about this subject, as has Alberto Cairo in his book The Functional Art.  (Do you own a copy of Cairo’s book?  If the answer is “no” you should buy it now.  Really.  Stop reading this and buy it).

Now that you’ve bought the book…

Consider the collection of bar charts below.  Two of the groups have measure values that are labeled incorrectly while one of the groups is correctly labeled.


Figure 6 — One of the groups is labeled correctly and two are mislabeled. Can you tell which one is correct?

Can you tell which one of the three groups is labeled correctly?

Now have a look at the same data presented with a packed bubble chart where again one group is correctly labeled and the other two are not.


Figure 7 — One of the groups is labeled correctly and two are mislabeled. Can you tell which one is correct?

If you are like most people you’ll solve the bar chart example very quickly and probably won’t have a clue with the packed bubble charts.

Note – The answers may be found at the end of this blog post.

Incidentally, the differences are pretty significant, but I could magnify the errors quite a bit in the circle charts and people still wouldn’t be able to tell which group was correct, as people are just horrible at comparing the areas of differently-sized circles.
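Part of the reason is simple geometry. The Python sketch below assumes circles are sized so that their area is proportional to the value (the usual convention for bubble charts; I have not checked how any particular tool does it), and the two values are made up. It compares how much longer a bar gets with how much wider a circle gets when the value doubles.

```python
import math

def circle_diameter(value, scale=1.0):
    """Diameter of a circle whose AREA is proportional to `value`."""
    return 2.0 * math.sqrt(scale * value / math.pi)

small, large = 100.0, 200.0   # made-up values; the second is twice the first

bar_ratio = large / small                                          # bar lengths
diameter_ratio = circle_diameter(large) / circle_diameter(small)   # circle widths

print(f"bar length ratio:      {bar_ratio:.2f}")
print(f"circle diameter ratio: {diameter_ratio:.2f}")
```

The bar doubles in length, while the circle's diameter grows only by the square root of two, so the visual cue for "twice as big" is much weaker with circles.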

There are certainly places where circles are useful and most welcome, but they don’t work well here and they don’t work well in the example I discuss below.

Substituting a bad chart type with another bad chart type

I’ve recently read a collection of blog posts where the author suggests ways in which people can avoid the stranglehold of pie charts by using other chart types.  I liked the promise of this blog series and was pleased to see that the first example presented a bar chart similar to the one shown in Figure 1.  I was, however, surprised at some of the other approaches as I did not think they presented data clearly.

One of the questionable alternatives was a panel chart like the one shown below.

Figure 8 — A panel chart comprising circles that makes me have to work harder than I would like.

I have to work very hard to “grok” this viz and that’s because I cannot make sense of the data without reading and interpreting the measure labels.  In addition, because the items were not grouped I had to refer to the color legend to see which circles represented Technology products, which were for Office Supplies, etc.

I grant that there are cases where you may want to present the product sub categories from largest to smallest without grouping them into a hierarchy, but I still maintain that it’s much easier to interpret the data with a bar chart like the one shown below, which does not require measure labels.

Figure 9 — Bar chart with hierarchy removed. We need a legend but don’t need measure labels.

Note: I am not saying that you should not use measure labels; I am saying that if the visualization requires measure labels then there is a good chance you’ll be able to craft a better visualization.

Does this mean you should never use circles?

There are of course myriad instances where circles would be very welcome.  Consider the following map that shows the number of orders by location.

Figure 10 — Symbol map

I can see very easily that the number of orders on the West Coast (Washington, Oregon, and California) is considerably larger than the rest of the country.  In this case it’s seeing the circles on top of a map that helps me conclude that there’s a lot of activity happening in one area of the country.  If I wanted to know just how much activity, and if I wanted to be able to make quantitative comparisons, I would need an accompanying chart that helped me sort and determine the relative magnitude of orders for each state.  That is, if I needed to know more than “whoa, look at the number of orders on the West Coast!” then I would probably craft a dashboard that would also contain a bar chart showing orders by state in descending order.

100% Stacked Bar Charts

I try to avoid 100% stacked bar charts as they absolutely require that I use color and they can be somewhat difficult to interpret without measure labels.  Consider the visualization below that compares % of total shipping costs by product category, broken down by ship mode.

Figure 11 – A collection of 100% stacked bar charts. I think of this as being a “cubist” pie chart

It’s pretty easy to determine the Regular Air values as the axis starts at zero.  It’s a bit harder to glean the Express Air and Delivery Truck values without displaying the mark labels.
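To see why only the first segment is easy to read, here is a small Python sketch that computes where each segment of a single 100% stacked bar starts and ends. The costs are made up (they are not the Superstore figures) and the function name is mine; only the ship mode names come from the chart.

```python
def stacked_segments(costs):
    """Return (label, start, end) positions on a 0-100% axis, in input order."""
    total = sum(costs.values())
    segments, start = [], 0.0
    for label, cost in costs.items():
        end = start + 100.0 * cost / total
        segments.append((label, start, end))
        start = end
    return segments

# Made-up shipping costs for one product category (not the Superstore figures).
costs = {"Regular Air": 600.0, "Express Air": 150.0, "Delivery Truck": 250.0}

segments = stacked_segments(costs)
for label, start, end in segments:
    print(f"{label:15s} starts at {start:5.1f}%, ends at {end:5.1f}%, share {end - start:5.1f}%")
```

Only the first segment starts at zero, so its end point on the axis is also its share. For every other segment the reader has to subtract one axis position from another, which is exactly the work a mark label saves.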

Still, it’s an easier read than three pie charts.

Figure 12 — Trying to understand the breakdown of shipping costs by Ship Mode across categories using pie charts (yuck).

While I try to avoid 100% stacked bar charts, I am a very big fan of divergent stacked bar charts.  Here’s an example from a recent blog post.  While I do need a color legend I can get by without measure labels.


Figure 13 — Divergent stacked bar chart.

Stacked bar charts also play a supporting role in Sankey diagrams which we explore below.

Where “snakey” Sankey diagrams work

Consider this snippet from Jeffrey Shaffer’s winning entry in the Tableau Quantified Self visualization competition.


Figure 14 — Jeffrey Shaffer’s Sankey chart maps how one stacked bar chart maps to another stacked bar chart.

At the bell of the trumpet we find a stacked bar chart where we can hover over items to see to what they refer:


Figure 15 — Hovering over a bar shows info about the bar and shows how the item is mapped to a different set of measures.

From this action I can see that Shaffer performed at five weddings where he played music by Bach, Clarke, Mouret, Reiche, Vivaldi, and Purcell.

Within this context, this very creative chart works as it’s not essential that I know the exact details of Shaffer’s performances.  Instead, I can explore this and other portions of what is a very fun and playful dashboard and get a sense of who Shaffer is and whether I’d like to hang out with him (and I would as I know for a fact that musician / data visualization consultants are among the most interesting people on the planet.  You can look it up.)

Where Sankey diagrams don’t work

I first saw this type of chart in a visualization Shaffer published earlier this year where he took a stab at redesigning his utility company’s fuel usage bill.  Here’s what his redesign looks like:


Figure 16 — Shaffer’s energy bill redesign.

The chart is very decorative, but does it help me understand energy expenditures?  There’s a really big story sitting in the data but I don’t think this chart helps me see it.  Indeed, I would argue that while the chart is pretty it in fact obfuscates what is the big story.

Consider this visualization of the same data.


Figure 17 — A redesign of the redesign.

So what’s the big story?  Almost half of the total energy expenditures (44%) go towards heating the home.

My reaction to the Sankey diagram is “cool!”  My reaction to the stacked bar chart is “crap!”

In the Quantified Self dashboard “cool” is the desired reaction.  With the fuel bill “crap” is better as it may lead to better decisions and behavior changes (e.g., replace the windows, add insulation, or wear a sweater).


The goal of good data visualization is to elucidate, not decorate.  If your visualization requires color, legends, and measure labels you should at least consider an approach that does not ask your viewers to work hard to see and understand what is important in the data.


Answer key:

Bar Chart – Group 2 is labeled correctly

Packed Bubble Chart – Group 1 is labeled correctly.




Posted on June 25, 2014 in 1) General Discussions, Blog
Jun 102014

… and some thoughts on the evolving art and science of visualizing data

I tend to gravitate towards occupations that are hard to explain.  I started my professional life as, and continue to be, a music arranger and orchestrator.  I can tell by people’s perplexed looks that they are wondering if I’m the guy that decides where the brass section should sit in the pit.

I run into similar problems when I tell people I’m a data visualization consultant.  I was trying to come up with a concise way to explain what that is when I came across an excellent blog post from Stephen Few.

I take this and turn it into that

I do encourage you to read the full post (you can do it now if you like; I’ll wait).

I was struck by the first example where Few shows how hard it is to glean any meaning from a text table.  Here’s his example of poll results published on the PBS website from a 2004 study by the Pew Research Center.


Figure 1 — Favorable and Unfavorable views of the U.S.A.

I have to work very hard to get a sense of which countries have the most positive sentiments towards the U.S.A. and which have the most negative.

Few proposes a different way to present the data that makes it much easier to see, rank, and understand the findings.


Figure 2 — Few’s alternative to presenting the findings

Yes!  This exercise encapsulates what it is that I do!  I take “this” and turn it into “that”, thereby allowing companies to better see, understand, and glean insights into their data.

An alternative to the alternative and how the industry keeps evolving

I cannot just listen to music.  My training and proclivities force me to dissect the music I hear so that I can understand what’s going on inside the music.

A similar thing happens when I see a data visualization.  After taking in the presentation I stop and wonder if there is an alternative approach that would allow me to better understand what’s going on and thereby draw better conclusions that in turn allow me to make better decisions.

In a moment I’m going to suggest an alternative to Few’s approach but I do want to emphasize that the data visualization field is very new and it’s the free exchange of ideas that’s pushing people to create new ways to visualize data.  A perfect example of this is my own evolution in displaying Likert scale data (see Likert Scales – The Final Word).  It was discussions with friends and colleagues Naomi Robbins and Joe Mako that resulted in what I think is a better way to explore and glean insights from the World Opinions data.

The divergent (or staggered) stacked bar chart

Consider the screenshot of a dashboard below where we skew the stacked bars right and left based on overall positive and negative sentiment. Note that you will find a working dashboard at the end of this post.


Figure 3 — Conveying sentiment using a divergent stacked bar chart.

If you split the neutral responses evenly you see that, overall, Poland has the most positive sentiment and Egypt the most negative.
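For anyone curious about the arithmetic behind "splitting the neutrals evenly," here is a minimal Python sketch. It assumes just three response buckets (negative, neutral, positive), which is a simplification of mine, and the country names and percentages are invented; it is not how the dashboard itself is built.

```python
def diverging_extents(negative, neutral, positive, split_neutral=True):
    """Left and right extents of one bar around a zero line, in percentage points.

    With split_neutral=True, half of the neutral share is drawn on each side.
    With split_neutral=False, neutrals are dropped entirely.
    """
    half = neutral / 2.0 if split_neutral else 0.0
    left = -(negative + half)    # extends to the left of zero
    right = positive + half      # extends to the right of zero
    return left, right

# Made-up percentages for two hypothetical countries: (negative, neutral, positive).
responses = {"Country A": (20.0, 10.0, 70.0), "Country B": (55.0, 30.0, 15.0)}

for country, shares in responses.items():
    left, right = diverging_extents(*shares)
    print(f"{country}: bar runs from {left:.1f} to {right:.1f}")

# Sorting by the right-hand extent puts the most positive country first.
ranked = sorted(responses, key=lambda c: diverging_extents(*responses[c])[1], reverse=True)
print(ranked)
```

Passing `split_neutral=False` drops the neutrals, which is the situation the next two views explore.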

But what happens if you eliminate the neutrals?  If you sort by least negative you see certain things pop out.


Figure 4 — Neutral responses are hidden; results are sorted by least negative

Here Poland is ranked first and Jordan is last (and notice how polarized Jordan is).

Compare this with the view when you remove the neutral responses and sort by most positive.

Figure 5 — Neutral responses are hidden; results are sorted by most positive

In this case Kenya is ranked first and Egypt is last.


The divergent stacked bar is my “go to” viz type whenever I deal with Likert scale data.  The only downside is that it takes a bit more time to create in Tableau and it warrants using a color legend, something I try to avoid where possible.

But this divergent stacked bar chart is my Likert-scale viz of choice today.  Who knows what people will create in the coming years that does an even better job of helping people understand their data?

Oh, and I now have a compact explanation of just what it is I do.  I turn a this into a that.

Postscript: I’ve been thinking about this and want to modify my explanation… let’s change it to “I take this and I try to turn it into the best that that’s possible”.

May 232014


I’ve had a spate of requests from clients to show how survey responses rank across different categories and I’ve come up with a way that makes it very easy to see where the big stories are.

Note that this approach works for any measure that can be ranked, not just survey responses.

Let’s see what I mean…

Consider the bar chart below that shows the results to a survey question “indicate which of the following that you measure; check all that apply”.

Figure 1 — Percentage of respondents that measure selected items, ranked from highest to lowest.

Traditional approach to showing rank within a category

Now, suppose you wanted to see the percentages and rankings broken down by different demographic components (e.g., location, gender, age, etc.).  There are myriad Tableau knowledge base articles and blog posts on how to do this and they lead to results that look like the one shown in Figure 2.

Note: Pretty much all of those articles and blog posts are now obsolete as they make clever use of the INDEX() function.  With Tableau 8.1 you can use RANK(), or one of its variations, and not have to jump through as many hoops.

Figure 2 — Traditional approach to showing ranking within a category.

I find this a tough read.  Even if I add a highlight action it’s still hard for me to see where a particular item ranks across the four categories.

Figure 3 — Ranking within a category with highlighting.

Don’t try to show everything at once

My solution is to place Generation on the Columns shelf and not show everything at once, but instead allow the user to explore each of the possible responses and see how these responses rank across the different categories.

Consider the dashboard shown below where the top worksheet shows the responses across all categories.

Figure 4 — Dashboard with no item selected.

Now see what happens when we select one of the items in the list.

Figure 5 — Dashboard with an item selected shows that item’s rank and percentage across different generations.

Okay, not much to report here – Adrenaline Production is ranked first in three categories and second among Traditionalists, although Traditionalists measure it quite a bit less than the other three groups.  Still, we’re not seeing any wide swings.

But look what happens when we select Breathing…

Figure 6 — Breathing: our first big story.

Now that’s a big story!  And it pops out so clearly.

Reporting vs. interacting

This is all fine and good if you publish this as an interactive dashboard and you expect people to, well, interact; but what happens if you want to publish this as a static graphic in a magazine?

The solution is to find where the big stories are and show those in the magazine; that is, do the work for your reader and show him / her where the big differences are.  In fact, that is exactly what I’ve done in Figure 7.

How the dashboard works

Here’s how the top part of the dashboard is set up.

Figure 7 — Configuration of top worksheet.

Rank is defined as


Note that we’re addressing the table calculation using Wording.

Notice also that Wording is on the Rows shelf.
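As a rough sketch of what a rank computed along Wording amounts to, here is the idea in Python rather than as a Tableau calculation. The percentages, "Item C", and "Other group" are invented for illustration, and the tie handling (tied items share a rank) is my own assumption. I show the per-generation case, where the ranking restarts for each generation, since that is where the idea matters most.

```python
def rank_descending(values):
    """Competition ranking: highest value gets rank 1, ties share a rank."""
    ordered = sorted(values.values(), reverse=True)
    return {item: ordered.index(value) + 1 for item, value in values.items()}

# Made-up percentages; "Item C" and "Other group" are hypothetical names.
percent_by_generation = {
    "Traditionalists": {"Adrenaline Production": 0.40, "Breathing": 0.55, "Item C": 0.30},
    "Other group": {"Adrenaline Production": 0.70, "Breathing": 0.20, "Item C": 0.45},
}

# Rank is computed along the items (Wording) and restarts for each generation.
ranks = {gen: rank_descending(items) for gen, items in percent_by_generation.items()}

for gen, item_ranks in ranks.items():
    print(gen, item_ranks)
```

Because each generation is ranked on its own, the same item can be first in one group and last in another, which is the kind of swing the dashboard is designed to surface.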

The bottom part of the dashboard is set up like this.

Figure 8 — Configuration of the bottom worksheet.

Goodness, we can’t tell what any of the bars mean because Generation is on the Columns shelf and Wording is on the Level of Detail and not Rows.  If you put it on Rows you get something that looks like this.

Figure 9 — Placing Wording on the rows shelf tells a different and harder-to-understand story.

The key takeaway is that we cannot make a single visualization that tells the story.  You need both the first and second visualizations working together.

A Filter and a Highlight Action

We use both a Highlight and a Filter action to make the two visualizations work well.  The Filter action is there to make the second worksheet disappear once you clear the selection in the first worksheet; the Highlight action highlights where the item appears in the second worksheet.

Here are the two actions:

Figure 10 — Two actions tied to the same mouse click.

The Filter action is defined as follows.

Figure 11 – Definition of the Filter action.

This tells Tableau that when a user selects something from the first worksheet (Percent that Measure-Overall) it should filter the second worksheet (Percent that measure-by Generation) by the field Temp.  Temp is just a string constant that I’ve placed on the color shelf; its only use is that we have to filter by something in order for the Exclude all values setting to work (and that is critical for the behavior of the dashboard).

Here’s how the Highlight action is defined.

Figure 12 — Definition of the Highlight action.

This tells Tableau that when a user selects something from the worksheet on top, Tableau should highlight items in the second worksheet using Wording as the selected field (where Wording is the dimension we placed on the level of detail rather than on the Rows shelf).
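If it helps to think of the two actions as plain logic, here is a toy Python model of how the bottom worksheet behaves. It is only my mental model of the behavior described above, not anything Tableau exposes; the function name and the sample rows are invented, and only the field names Wording and Generation come from the dashboard.

```python
def second_sheet_view(rows, selected_wording=None):
    """Model the two dashboard actions on the bottom worksheet.

    Filter action: with nothing selected, every row is excluded.
    Highlight action: with a selection, all rows stay visible and the
    rows whose Wording matches the selection are flagged as highlighted.
    """
    if selected_wording is None:
        return []                      # "Exclude all values": sheet disappears
    return [
        {**row, "highlighted": row["Wording"] == selected_wording}
        for row in rows
    ]

# Made-up rows; only the field names Wording and Generation come from the post.
rows = [
    {"Generation": "Traditionalists", "Wording": "Breathing", "Percent": 0.55},
    {"Generation": "Traditionalists", "Wording": "Adrenaline Production", "Percent": 0.40},
]

nothing_selected = second_sheet_view(rows)
breathing_selected = second_sheet_view(rows, "Breathing")
print(nothing_selected)
print(breathing_selected)
```

Because the filter is on a constant, a selection lets every row through, and it is the highlight that singles out the chosen item.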


I’ve found this approach to showing rank across categories very useful and it’s been a very big hit with my clients.  By placing the categories across columns and using highlight actions we make it very easy to see where the big differences are among different respondent groups.