In writing about visualizing survey data using Tableau I’ve found that the number one impediment to success is getting the data in the right format. In accompanying posts I’ll explain how to get this done using Tableau Prep, Alteryx, Easymorph, and Tableau 10.x.
What do I mean by “just so”?
When I deal with survey data there are usually four different elements that need to fit together:
- The demographic information (e.g., age of respondents, gender, etc.)
- Survey responses in text format
- Survey responses in numeric format
- Meta data that describes the survey data.
Let’s see what the four elements look like using an Excel sample data set (click here to download).
Here’s what the demographic data looks like.
Survey responses in text format
Here are several columns of survey responses in text format. Column F contains data for a Yes / No / Don’t know question. Column G contains responses for a question about salary. Columns H through P are responses for check-all-that apply questions and columns Q and R contain Likert scale responses.
Survey responses in numeric format
Here are the same responses but in numeric format.
I’ll explain why it’s so useful to have the survey responses in both text and numeric format in a bit.
Meta Data (the “helper” file)
Here’s some data that I usually prepare by hand as most survey tools won’t produce it for me automatically. Having this helps me understand the data and will greatly streamline my work in Tableau.
What does “just so” look like?
Our goal is to combine and reshape the various elements so that they look like this.
As I’ve written previously, the key thing is that I no longer have a separate column for each survey response. Indeed, I’ve reduced the number of columns from 45 to just 11, but I’ve also increased the number of rows from 845 to over 25,000. That is a good thing.
Why this works so well with Tableau
Our goal is to see how to use our preparation tool to get the data in this format, not to actually use the data, but if you need convincing on why the meta data is so helpful, consider the following example.
Let’s say that in your survey you ask people to indicate the importance and satisfaction about certain services, as shown here.
With the data set up “just so” conducting this comparison in Tableau becomes easy. First we can drag Question Grouping into Filters and indicate that we just want to look at Importance and Satisfaction questions.
Then we can drag Wording and Question Grouping onto the Rows shelf which gives us the framework for comparing importance and satisfaction across ten different questions. No more having to “look up” which questions we want to explore and no more having to alias question IDs. I love this!
Why do we need both text and numeric results?
We don’t really need them, but I know I certainly want them.
Consider all of the Likert scale question results. The universe of possible values are
Suppose we want to know just what each of the values (1, 2, 3, 4 and 5) stand for? The problem is that it depends on the question being asked as sometimes a 5 means “Strongly agree”, for other questions it means “Critical” and for others it means “Extremely satisfied”.
Without having both numeric and text results we will have to write A LOT of IF / CASE statements and I, for one, do not want to do that.