Updated on September 11, 2018.
Note: Special thanks to Joe Mako who suggested this approach.
Overview
I’m fortunate that when I work with survey data that is not well-coded I can take out my trusty copy of Tableau Prep, EasyMoprh, or Alteryx Designer and get my data “just so.”
But what if you don’t have a copy of one of these data prep tools? Or what if you want to use Qualtrics’ or SurveyMonkey’s Web Data Connector and don’t have the ability to insert a data prep tool within your workflow?
In this blog post we’ll see how you can get Tableau to address the problem. Specifically, we’ll see how to work with check-all-that-apply questions where the data has been coded as 1s and blanks instead of 1s and 0s.
So, What’s Wrong with Blanks?
Consider the data snippet below that shows responses to a group of check-all-that-apply questions. This is what I consider to be badly-coded data.
Now here’s how I would prefer the data to be coded.
So, why is this better?
By having 1s and 0s we can see, easily, if some respondents didn’t answer the question group at all. In the figure below, we can see that respondents 15 and 29 did not make any selection for any of the check-all-that-apply questions. More importantly, with data coded with 1s and 0s, Tableau can just filter out the Nulls and we’ll get accurate results.
So, given our constraints, how are we going to come up with a way to distinguish between a blank that means a respondent didn’t select a particular option and a blank that, when combined with the other related blanks, means the person didn’t respond to the group of questions at all?
Before going any further, note that there’s no way we’re going to deal with the data in this wide format. That is, instead of the data looking like it does in the previous example, the data will instead look like this.
If this format is foreign to you, stop reading this now and have a look at Getting Survey Data “Just So.”
So, with this tall format in mind, and the ability to easily indicate that you just want to look at a group of questions, let’s see how we can get a handle on who among the respondents selected something.
NOTE: Qualtrics does in fact have a setting that will allow you to download the data in a more Tableau-friendly format. Look for these settings for both the text and numeric downloads:
How Many Things Did Each Respondent Select?
Consider the visualization below which shows the SUM([Value)]) for each Resp ID where [Value] can be 1, 0, or blank (null). Note that we have a filter set so we only see the responses to the group of related check-all-that-apply questions.
We can filter the results so that we only see respondents that answered a question by dragging [Value] to the Filters shelf, selecting All Values, and indicating that we should show non-null values.Note that when we do this we see that only 342 responded to this group of questions, as shown below.
Okay, this is great, but the visualization we want to create shouldn’t look anything like this. Indeed, we have too many things on the level of detail at this point.So what are we going to do?
Level-of-Detail Expressions to the Rescue
Anytime you think “I need Tableau to consider some dimensions when making a calculation, but those dimensions can’t be in the visualization” a giant alarm bell should go off in your head — this is precisely the use case for a LoD expression!
Here’s the LoD calc, that when placed on the Filters shelf, will do the trick:
A translation into English would be:
Even though [Question Grouping] and [Resp ID] may not be on the viz detail, take that combination of these two fields and find the SUM([Value]). Then find only the Null values from this and exclude them by setting the filter to True.
Note that we are using FIXED instead of INCLUDE as the former will automatically create a dimension and will work no matter what other filter settings may be in place.
Putting this Altogether
First, we need to create a calculation that gives us the percentage of people that selected an option from the group of check-all-that-apply questions. For this instance, the following calculation will work well.
A quick translation into English would be:
Add up everybody that answered “1” (meaning they selected an option) and divide by the number of people that answered the question.
Let’s see if this works. Consider the visualization shown below.
Does this pass the “smell” test? These percentages seem very low for a check-all-that-apply question that had nine things people could select. Indeed, if we place SUM([Number of Records]) as a discrete measure on the rows shelf we see that Tableau believes 845 people responded to the question, but we know only 342 responded.
We need to put our LoD calc on the Filters shelf and set it to True.
Conclusion
How check-all-that-apply questions are coded is a problem with many survey tools including Qualtrics and Survey Gizmo. If you have a good data preparation tools like Tableau Prep, EasyMorph, and Alteryx you can fix the problem, but if you don’t have such a tool or don’t want to insert something into your workflow then you can get Tableau to address the problem with a Level-of-Detail expression.