May 2, 2018
Don’t have Tableau Prep? You can download it here.
Want the packaged Tableau Prep Flow explored in this post? Click here to download it.
Overview
As anyone who has read anything on my blog that relates to survey data knows, the number one impediment to success with Tableau is getting your survey data “just so.”
In this post I’ll show you how to take the same data set I use in all my classes and transform it using Tableau Prep so that it works perfectly with Tableau.
Working with Tableau Prep
Using data similar to what I discuss in the “here’s how your data needs to be setup” blog post, let’s see how we’re going to get the data-as-numbers, the data-as-labels, and the helper file to all play nicely in Tableau Prep.
Importing and Reshaping the Label Responses
Here’s the Tableau Prep home page.
Let’s begin by importing the Label responses from Excel.
We’ll start by clicking Connect to Data so we can see all the sources to which we can connect.
In this example we’ll choose Microsoft Excel and select DataRevelations_SurveyData_V3.xlsx.
Tableau Prep recognizes that there are three tabs in this file. We’ll start by dragging Data Labels onto the canvas.
Inspecting the Merchandise
Let’s see what we have here by clicking the Plus sign next to our newly imported data source and selecting Add Step.
This will change the view so that we can see both the data profile and the data grid, as shown below.
Woah! Look at the stuff in the middle! You can see a histogram of all the responses to each question, before you even analyze things in Tableau! For example, we can se that more men than women took this survey (1) and that there were a handful of people that did not specify where they live (2).
Tableau Prep also allows for some interactive exploration. For example, if instead of showing bucketed values for RespID we instead indicate we want to see details…
We can select individual items in the profile view and see the responses for those selections, as shown below.
Pivoting the Data
We’re now ready to leave the “demographic” columns (Q0_Gender, Q0_Location, etc.) intact and “pivot” the question columns. I place the word “pivot” in quotes because many other tools refer to the action of taking columns and turning them into rows as “unpivoting”. No matter; we are going to “reshape” the data.
We do this by clicking the Plus sign next to the “Clean 1” step and selecting Add Pivot.
Now we need to specify which fields we want to reshape and which we want to leave intact. Below we drag all the fields except the demographic fields and RespID into the Pivot Values area.
Renaming the Pivoted Fields
The only thing left to do is to rename the reshaped fields (Pivot1 Name and Pivot1 Values). We can do that by using the ever-so-Tableau approach of right-clicking the field name and selecting Rename Field.
Here we’ll change the first field to Question ID and the second to Text Responses.
At this point if we want to inspect the results we can click the Plus sign and selecting Add a Step, the results of which are shown below.
Importing and Reshaping the Numeric Responses
We’re now ready to import numeric-encoded survey responses.
We’ll start by dragging the Data Numbers source into the canvas.
It turns out there are some fields we won’t need as already have all the demographics fields we need from the Data Labels source. Below we deselect the fields. Note that Q0_Weight can only have a numeric value and is encoded as such in the Data Labels data source.
Pivoting the Data
We’re now ready to leave RespID intact and “pivot” the question columns
We do this by clicking the Plus sign in the “Clean 1” step and selecting Add Pivot.
As with the Data Label responses, we now need to specify which fields we want to reshape and which we want to leave intact. Below we drag all the fields except the RespID into the Pivot Values area.
Renaming the Pivoted Fields
As before, the only thing left to do is to rename the reshaped fields (Pivot1 Name and Pivot1 Values). We can do that by right-clicking the field name and selecting Rename Field.
Here we’ll change the first field to Question ID and the second to Numeric Responses.
Joining the Text and Numeric Responses
Now that we have both the text and numeric results pivoted / reshaped we need to merge the data so that all the numeric responses line up with all the text responses. That is, for every Resp ID we want to make sure the text and numeric responses for each Question ID line up properly.
We’ll do this in Tableau Prep by joining the two reshaped data sources.
Tableau Prep will see that there’s a match on RespID and create a join using just that field.
We need to indicate that we also want to join using the Question ID field. We do that by clicking the plus and specifying the additional field, as shown below.
If we inspect our results by adding a new step we see that everything lines up perfectly, but we also have redundant RespID and Question ID fields.
We can take care of the redundant fields by right-clicking and selecting Remove fields.
Renaming a Step and Seeing Changes
You can change the wording below each of the steps in the Tableau Prep flow to convey something a little more descriptive than “Pivot 1” and “Clean 2.” For example, we can right-click Clean 3 and rename it “Remove fields” as shown below.
You may also be wondering how you can see those changes should you want to edit the flow. Notice below that there is a strip along the left of the profile pane that contains the text “Changes (2)”.
Clicking the “>” icon will expand the changes pane so you review and edit changes.
Adding the Question Helper Metadata
The only thing remaining is to merge the Question Helper data into the flow. We do this by dragging the Question Helper onto the canvas and joining this data source with the Remove fields step as shown below.
Note that even though the field names are not exactly the same (“Question ID” vs “QuestionID”) Tableau Prep infers that these are the fields to join, as shown here.
Let’s see the results by adding a new step.
Note that at this point the data has morphed from containing 45 columns and 845 rows into 12 columns and over 34,000 rows (1).
Note also that by selecting Q2_3 in the profile pane we can see the results for just that question in the data grid (2).
Finally, we may want to remove the redundant QuestionID field (3).
Removing the Null Values
Assuming our survey tool encodes check-all-that-apply questions correctly (coding them as 1s and 0s instead of 1s and blanks) we can reduce the size of our data set by removing Null values. Note that this should not be a cavalier assumption as many survey tools do not do this correctly (e.g., Survey Monkey and Qualtrics). Make sure to check out the companion blog post Using Tableau Prep to fix problems with Check-All-That-Apply questions.
So, assuming it’s safe to do so, here’s how to remove the Nulls.
In the Profile pane, right-click the null bar and select Exclude from the context menu.
Exporting the Results
Now that everything is “just so” we can export our results to either a CSV, TDE, or Hyper format.
Here we’ll create a Hyper data extract by clicking the Plus sign next to the Clean 3 step and select Add Output.
Below we specify that we want to save the results to a file (rather than publishing to a server).
The last step is to click Run Flow which will run our process and output the results to a Hyper file. We’re now ready to use our “just so” data source with Tableau.
Conclusion
Tableau Prep was very easy to learn, and you can’t beat the price (free with your Tableau subscription or part of your current maintenance). The Profile Pane also provides insights to your data that I’ve not seen in any other ETL tool, and Prep’s ability to handle badly coded survey data is elegant and straightforward (see related blog post.)
Tableau Prep also allows you to save your work as a packaged workflow so that you can share the flow and the source data in a single file.
The first release certainly has its downsides; e.g., you cannot connect to SPSS files and you cannot automate when to run an update, and you cannot export to anything except CSV and Tableau extract formats. But even with these shortcomings Tableau Prep is now my tool of choice for getting survey data “just so.”
Don’t have Tableau Prep? You can download it here.
Want the packaged Tableau Prep Flow explored in this post? Click here to download it.