Jan 112012
 

Note: This post is very dated. Please see the YouTube video from the Tableau 2014 conference and the Visualing Survey Data page for more up-to-date information.

Overview

In my previous blog post I discussed how to reshape your survey data so that it plays nicely with Tableau.  In this post we’ll explore how to visualize three types of survey questions:

  1. Yes / No / Maybe
  2. Check-All-That-Apply
  3. Likert Scale

Note: A fully working collection of interactive visualization may be found at the end of this blog post.

Connecting to the Data

We need to connect to the reshaped data source we created in Part 1.  For those of you that are following along and downloaded the sample data, there are several tabs in the source Excel file.  Make sure you connect to the one marked “Use This One” as shown below.

Connecting to the right data source

When asked how you want to connect to the data, select “Import all data”.

You’ll be happier if you import all the data and let Tableau create an extract

There are two reasons for doing this:

  1. With larger data sets your performance will be much faster; and,
  2. You will need Tableau’s COUNTD() function and that is not available when you connect directly to Excel (or Access, for that matter.)

Your dimensions and measures shelves should look like the one shown here.

Default Dimensions and Measures

Note that the ID “Measure” should in fact be a Dimension, so when you get a chance, drag the field up to the Dimensions shelf.

So, just who are these people?

Before visualizing any questions I like to get a sense of who participated in the survey.  I’ll start by looking at Gender.

Start by getting a feel for who answered the survey questions

Notice that I place COUNTD(ID) on the columns shelf indicating that I want to count the number of unique.  Had I used the regular COUNT() function I would count all of the rows that contain a respondent’s ID, and that is a large number given how we’ve reshaped the data.

The “Null” value indicates that some people chose not to respond to this demographic question.  I suggest creating an alias called “Did not respond” to handle these non-responses.

From here I will create views for any other demographic groups (e.g., Generation, Location, etc.).

Visualizing a Yes / No / Maybe question

Let’s start by seeing what questions are available by placing Question on the Rows shelf.

With reshaped data there’s only one field, but many questions

As you may recall from the previous blog post, our sample survey contained the question “Do you plan to vote in the upcoming election?” for which there were three possible responses (“Yes”, “No”, and “I don’t know.”)

As I’m not interested in all the questions in the survey I need to add a filter that just keeps the first question in the list.

To see preliminary results for this question I will place “Response” on the rows shelf and COUNTD(ID) on the columns shelf.  While not particularly pretty, it does show us something useful.

Preliminary results

I will next alias the Question (and Responses), sort the Responses, and place Response on the color shelf.  I will also add a table calculation to the Columns shelf so that instead of showing a count we show the percentage of the total.

A simple, clear visualization

Now we’re talking!

The nice thing about the way we have this set up is that it’s easy to break down the results by our demographic segments simply by dragging a dimension to either the rows or column shelf, as shown below.

Results broken down by generation

Check-All-That-Apply Questions

To get a sense of what we have going on for this question I duplicated the sheet from the previous visualization and then changed the Question filter so that I’m looking at the check-all-that-apply questions.  Here are the filter settings.

Selecting questions using a filter

Do you now see the major advantage of having reshaped the data?  Had I kept the one column / one question approach I would have to use Tableau’s Measure Name / Measure Names feature and I would likely have to build a separate calculated field for every question I want to analyze.  With reshaped data I just put Question on the Rows shelf and filter by the questions I want to analyze.

Tip: I will usually create a set for each question grouping I want to review, sort of a “named filter”, as it were. In the accompanying workbook you will find two sets, one called “Check All That Apply” and the other called “LikertQuestions”.  You can then drag either of these sets to the filter shelf.

Here’s the resulting visualization.

A bad way to show check-all-that-apply results

Well, it’s a start, but it’s a not a a very good visualization.  For one thing, I only care about the percentage of people that selected “yes”; I don’t even need to consider the “no” folks.

I need to fashion a formula that counts of how many “yes” responses there were for each question, then divide by the number of responses.  This would be simple of the responses for these questions were 1 and 0, but unfortunately the responses are labels that look like this:

1: Yes
0: No

So, I need to perform a little string arithmetic.  The calculated field that does what we want is shown here.

SUM(
IIF (Left([Response],1)="1" ,1,0)
)
/COUNTD([ID])

And a visualization that employs this formula is shown here.

A better way to show check-all-that-apply results

That’s much better.  And of course, I can add a demographic dimension to glean additional insights.  In the screen below I show a comparison between male and female respondents, using an overlapping bar chart.

Check-all-that-apply question, broken down by gender

Likert Scale Questions

Note: Before continuing, you must promise that upon reading the rest of this blog post you read the following two blog posts:

http://www.datarevelations.com/the-likert-question-question.html

http://www.datarevelations.com/likert-scales-the-final-word.html

You promise?

Okay, since you promised…

As with the previous two question types, the first things I need to do is place “Question” on the filters shelf and select the group of questions we want to analyze in this visualization:

Using a filter to select the Likert scale questions

Now I’ll create a stacked bar chart, by placing “Question” on the Rows shelf, CNTD(ID) on the Columns shelf, and response on the Color shelf.

First pass at Likert Scale viz

Next I need to replace the CNTD(ID) measure with a table calculation that shows the percent of the total based on Response.

Adding percent of total to the viz

Since the questions are sorted in alphabetical order it’s hard to gauge which questions garner the greatest amount of agreement.  This is where using the Likert scores (4 for “Very high degree”, 3 for “High degree”, etc.) can help add clarity.

While I go into much greater detail in the two blog posts YOU PROMISED TO READ, I’ll walk you through how to calculate the average Likert score and superimpose if atop the stacked bars.

In our example, the value I need is the first character in the Response string, so we just need to grab the left-most character and convert it into a number.  I can do that using the following calculated field called LikertValue.

Float(Left([Response],1))

Next, I’ll place the Average of this calculated field on the Columns shelf next to the existing table calculation (don’t worry that the result will not look good.)  I now need to indicate that we have multiple marks and make the first chart a Bar chart and the second a Circle chart, as shown here.

As is often the case, as you first make improvements you can get some wacky results

This still looks pretty dumb, but if I remove “Response” from the Color shelf, then right-click AVG(LikertValue) on the columns shelf and select “Dual Axis” we get something that starts to be useful.

Getting close…

Finally, if I show labels for the circle chart and sort the questions by Average LikertValue we get something that tells a clear story.

Not too shabby

Conclusion

In this and the accompanying blog post we’ve explored how to reshape survey data and visualize three different types of survey question.  Granted, there are other question types (e.g., “add up to 100”, “rank these in order”, etc.) as well as intra-question analysis (e.g., create a scatterplot showing respondent’s age vs. annual income) that are a bit trickier to implement, but what we’ve delved into here should cover a wide variety of needs.

Question – any interest in a one or two day course devoted to visualizing survey data in Tableau?

[suffusion-the-author]

[suffusion-the-author display='description']
 Posted by on January 11, 2012 2) Visualizing Survey Data, Blog  Add comments

  52 Responses to “Using Tableau to Visualize Survey Data — Part 2”

Comments (51) Pingbacks (1)
  1. Very nice. Hopefully people look at this second post to get why reshaping is so powerful. Nice look on the final view. Now I need to go read the other two blog posts.

  2. Great information, Steve!   

  3. Steve,

    This is one of the most useful blog posts I have read in a long time.  It was very timely for me as I was asked to create a report/dashboard of a recent customer survey.

    The dashboard I have created is nearly identical to the example you’ve provided.

    Thanks for all of your great work!

    Andy

  4. Steve,

    Agreeing with Andy.   Super informative.  My mother would ask you if your head hurts from all the smarts.

    Peace and All Good!
    Michael

  5. Hi Steve, the problem I have with this approach is the ability to cross tab data….you can’t put different questions in the column and row areas…and you cannot easy filter on responses if you’re using dissimilar questions (e.g., country and likert). How do you handle that?

    • Ah, Tony, I’ve been waiting for somebody to bring this up.
      What you describe is intra-question analysis where you want to compare responses to one question against another (vs. just filtering by gender, education, ethnicity, etc.) In this case, you will need to join the question table to itself. There are a bunch of different ways you can approach this, including using parameters to allow the user to choose which two questions he/she wants to compare.

      Steve

      • Steve,  excellent series of posts.  Very useful as an end-to-end survey analysis process.

        Can you possibly elaborate on intra-question analysis further?  I’ve been experiencing a number of difficulties lately in dealing with multi-choice questions that behave like demographics.  Using your example above, I need to know across the select-many options, how did they rate the “Grace Under Pressure” likert question?  In a single-select multi-choice like Gender, this is easy… just drop the dimension on the shelf and, vioala!  Not quite as easy in a select-many because the select-many answers and the likert answers are on separate rows.

        I’m not sure parameters get you there, but I am curious about “joining a question table to itself”.  Do you have any other posts that discuss this concept, or could you elaborate a bit further?

        Thanks!

        • Matt,

          Hmm. Sounds like a need to write “Using Tableau to Visualize Survey Data — Part 3”.

          Tony and Matt, give me a few days to get to this, but if you can’t wait you just add another table — and that table is the question table — the field names will be renamed with a different prefix.

          Sorry I can’t hit it any sooner…

          Steve

          • Looking forward to part 3! I’ll play with joining the question table in the meantime. Thanks! 

  6. awesome stuff, thanks a bunch

  7. Hi Steve,
    We generally do a weighting on survey data to represent the sample to the population – e.g. age group, gender ect. In this process every respondent will be counted with a weight factor instead of counting “1”. It is easy to handle the weight factors for single response questions but how do we handle these factors for multi response questions? Do you have any sample which can be shared?
    Kind Regards,
    Sajeevan

    • Sajeeven,

      As I believe you are looking for the percentage of people that selected a particular choice from a multi-select question, I think the following formula (or something similar) will do the trick for you:

      sum(if [Value]=1 then 1*[Weight]
      else 0
      end)/sum([Weight])

      Where [Value] is either 0 or 1 depending on whether a person selected the item and [Weight] is the weighting for the particular question.

      Steve

  8. We are new Tableau users and struggling to create a good visualization of results from a ranking exercise. In essence, with <50 respondents and 16 rankable concepts we believe a nice visualization would be a vertical scatter plot for each concept displaying the number of persons (typically 0 to 4) who gave a particular rank (0 to 16)  to each concept. We actually are close ( I can make tableau show the number if persons assigning a given rank with a number if dots). But ideally I want that count if persons translated into the size of a single circle, rather than a number of equally sized circles.

    With all that said I was curious if you have posted anything on using tableau to dismally ranking exercises?

    • SG,

      If you send me the data I’d be willing to see what I can do. You may also want to post to the Tableau forum and see what others develop. It’s a little hard for me make suggestions without better understanding the underlying data.

      Steve

  9. As the dual axes for the average Likert and the Gantt bars cannot be synced, do you have a method for choosing the axis range to get a reasonable visualization of the circles vs. Gantt?

  10. Alex, good question.  In the non-Gantt version above I have the axis fixed to go from 0 to 4, which are the extreme Likert values.  In the Gantt version at http://www.datarevelations.com/likert-scales-the-final-word.html I’m doing something needlessly complicated as I can again just mirror the Likert extremes (which are 1 through 5 in this case) and I get something that works well.  I’m not pleased that I have to enter fixed axis numbers vs. having something that is automatically generated.  A way to get something generated automatically does not immediately come to mind…

    • I came up with this – it seems to work pretty well by scaling the average over the same range that the likert gantt operates on:

      (avg([LikertValue])-[Likert Center Parameter])/([Likert Center Parameter]-1)

      and then syncing the axis. It operates on the principle that if the average Likert score is exactly in the middle of the responses, the mark should lie on the 0 of the Gantt axis.

  11. I am trying to present some survey data and I’ve been following through your blog posts. I am a Tableau beginner and I’m finding myself stuck on this bit: “Notice that I place COUNTD(ID) on the columns shelf indicating that I want to count the number of unique.” Can you explain where you get the COUNTD(ID) field from? I tried creating a count formula but when I dragged it to the columns section it turns into AGG(CountFormula) which doesn’t work. (I have demographic data from another source which I have linked with my survey responses table, which might be complicating things.) 

    Many thanks! 

    • Kirsty,

      There should be some type of identifier for each survey respondent (e.g., “ID”, “RespID”, “Respondent ID”, etc.) If the field name contains the letter “ID” in it and you are using Tableau 8.1 Tableau will know this is a dimension and not a measure. If Tableau think’s the field is a measure you should drag it from the Measures area to the Dimensions area.

      Next, you get the COUNTD([Field Name]) by right-draging the field onto the Columns shelf and selecting COUNTD([Field Name]) from the dialog box that appears.

      Steve

  12. Thank you!
    I’m a new tableau user and was starting to worry that I wasn’t going to be able to use it for surveys at all, but this series of posts (and the other two which I have read and will read again) is saving me. 

  13. Hi Steve,

    Your posts and ebook are my bible. As you already know.

    I have complex multi-level choose-all-that-apply questions with answers spread across many columns. For instance:

    3. Which channels do your customers use now / which do you expect to support in three years / in five years?Please select all that apply in each column. Channel – brokers and agents

    Answers would be something like:

    Now
    Now

    Now
    Now

    etc.

    So this question has about 30 columns –

    Channels 1-5 (learn about products)
    Channels 1-5 (buy products)
    For now, in 3 years, in five years?

    My question is this: How would you shape this data so it’s easy to make comparisons between the answers of interest? Probably now, in three years, in five years, and for selected channels.

    Your posts are very clear but my survey writers are out of control.

    Thanks, Steve.

    Dan

    • Dan,

      Funny you bring this up as I will be addressing this exact issue (and many others) when I present at the Tableau Conference in Seattle in September (see http://tcc14.tableauconference.com/schedule/tuesday).

      I’ve not blogged about this yet but I now recommend that people create a simple “helper” spreadsheet that contains the following columns

      QuestionID
      Wording (The human-readable version of the question)
      Grouping (this makes it easy to grab all the questions that are related to each other)

      You then join QuestionID from your reshaped data to this simple “helper” spreadsheet.

      So, how does this address your situation where you have these two similar question groups, the first might be called “what things do you measure today” and the other might be called “what things do you plan to measure next year”?

      In both cases the “wording” for the QuestionID is the same, but there is a different Question Group. It might look something like this:

      QuestionID Wording Question Grouping
      Q0 Vote in the upcoming election? Vote
      Q1 Pulse Rate What do you measure — Today
      Q2 Metabolism What do you measure — Today
      Q3 Blood Pressure What do you measure — Today
      Q4 Temperature What do you measure — Today
      Q5 Galvanic Skin Response What do you measure — Today
      Q6 Breathing What do you measure — Today
      Q7 Perspiration What do you measure — Today
      Q8 Pupil Dilation What do you measure — Today
      Q9 Adrenaline Production What do you measure — Today
      Q1n Pulse Rate What do you measure — Next Year
      Q2n Metabolism What do you measure — Next Year
      Q3n Blood Pressure What do you measure — Next Year
      Q4n Temperature What do you measure — Next Year
      … you get the idea

      You would then filter by Grouping and include “What do you measure — Today” and “What do you measure — Next Year”.

      You would then place “Wording” on rows and “Grouping” on columns and you would be off to the races, as it were.

      I do plan to blog about this, but probably not until after the conference. If you want me to walk you through it feel free to give me a call.

      Steve

      • Steve,

        Yeah, I’ve been forced to concatenate my labels too, but I the filtering and grouping piece is where I fell down. Sometimes the labels go three deep, though (as in the example I gave in my query). Very helpful response, especially the suggestion about “wording” on rows and “group” in columns. I’m not quite there, but this helps me move forward. It’s a matter of putting in the hours. Sorry I won’t be able to hear you in Seattle.

        Dan

  14. I am trying to reshape and visualize survey data where the demographics are “check all that apply”. I end up with 6 race columns filled with (1,0, -99 or blank). If I ‘combine fields’ the 6 race questions turn into 19 combinations across questions that represent the single or multiracial configurations.

    I am trying to figure out the best way to organize and reshape my data to more easily look at survey questions responses by demographic responses.

    Thanks!

  15. Can someone point me to the “accompanying workbook” with the “Check all that apply” and “Likert questions” sets? Can’t find them in this series of posts.

    These would be very helpful to use as templates for my survey questions.

    Thanks,

    –TE

    • Tom, at the bottom of the pst (but before the comments) is an interactive dashboard. Along the bottom of the dashboard is a “download” link that when clicked will download the packaged workbook (.twbx file).

      I also think you will find the materials that I presented at the Tableau conference in September 2014 useful (indeed, they cover more material and employ some new techniques). Here’s a link: http://www.datarevelations.com/DataRevelations_TC2014.zip

      Kindest regards,

      Steve Wexler

  16. Perfect. Thanks

  17. Steve-
    Thank you for these posts. Being redundant (but who doesn’t like praise every so often?), these are really helpful as I start to wade through the survey data waters. Thanks!

  18. Hi Steve,

    How would you handle a ranking question where they have a list of items and were asked to rank their top three reasons. Ex: What were your top three reasons for shopping at XYZ store today? (Price, Product Quality, Good Customer Service, Location, etc.) Sorry if you already have blogged about this, I might have missed it.

    Thanks,
    Trudy

    • Trudy,

      Good question. While I have vizzed this I haven’t done it in years and would have to think about it a bit. It will be a variation of a check-all-that-apply question where you would have the percentage of people that ranked something first, second, third, etc.

      Do make sure to look at the blog post on visual ranking within a category as well.

      I’ll see if I can dig up that particular viz with the “thank these first, second, third”.

      Steve

  19. good post….love it.

    i need to design a dashboard for digital marketing analytics.

    and i had one dimension called resolution rating …how can i design a scorecard which consist of following data :
    Column name : Resolution Rating

    1. Yes but it was somewhat difficult
    2. No
    3.Yes it was easy

    i need total count of yes and no in percentage in my resolution rating scorecard.

    and if user will select particular geo ( country ) the resolution rating will change

  20. I feel like an idiot but when I go to open my reshaped data, I do not get the Count(ID) variable.

    • Ashley, the variable may not be called “ID”. Do you have something in the source data that’s called “Response ID” or “Respondent Number”, or something like that?

      For this to work, each respondent needs some type of unique identifier.

      Steve

  21. Dear Steve, great job!
    Just one question, How Can I reverse the Likert scale scores for questions denoting adverse feelings?
    Thanks a lot for Your information.

    • Giuseppe,

      I’m not sure what you mean be reversing the Likert scale scores. Do you mean that instead of

      5 = Excellent

      it should be

      1 = Excellent?

  22. This blog post has been incredibly useful! I am new to Tableau, and will be using it often for survey data analysis.

    I have been successful in following your guidelines all the way up to creating the average Likert score to superimpose over my stacked bars. For some reason, the numerical variable I created won’t “average”- it seems to calculate for every unique ID, rather than the question as a whole.

    I hope that makes sense, and I hope you can help me.

    Many thanks,
    Katie

    • Katie,

      It sounds like you have a unique ID on the level of detail so to get a average you would have to use either a table calculation or a LoD expression.

      Would you like to send me the packaged workbook and I can have a look?

      Steve

  23. Very useful, used this with the new Qualtrics web-connector and used the “Numeric Answer” to calculate the mean. Worked great. Thanks!

  24. Hi Steve, this blog post is very useful. I have a question, how can I visualize two types of questions “Single response” and “check all that apply” together, knowing that both questions have the same response choices. I need them in one chart showing the total percentage?

    Thanks :)

    • Abeer,

      I think you answered your question — the single response question is a single “yes / no” question and the check-all-that-apply is just a collection of “yes / no” questions. Both can use the same formula to determine the percentage:

      SUM(If response =”Yes” then 1 else 0 end) / SUM([Number of Records])

      Steve

  25. Hi Steve,

    I encountered a problem while calculating the average in the likert scale question.

    The calculated average by Tableau differed from the right manually-calculated one & from excel. By digging deeper, I found that Tableau multiply “the score answered by each respondent” by “the number of questions answered by that respondent”. And in my case not all respondents answered all the survey questions (due to skips).

    Is there any way to fix this issue?

    Thank You

    • Yehia,

      I suspect excluding Null responses by using a filter will solve the problem.

      The Excel add-in removes the Nulls when it reshapes the data. The 9.0 Pivot feature leaves the Nulls.

      Steve

  26. “any interest in a one or two day course devoted to visualizing survey data in Tableau?”

    YES!!!!! Please.

  27. hi steve,
    i just encountered a problem like this..
    Could you teach me a little?

    https://s-media-cache-ak0.pinimg.com/originals/89/84/05/898405936511c007c148c2f34135e90c.png

    Thanks, Jason

  28. How can I get an comparission (cross table) between two questions using this data format?
    e.g. No of users who answered question 1 with yes and questions 2 with no etc.?

    Thanks

  29. I want to say thanks so much for this post. You have saved me hours of work as I attempt to make hundreds of descriptive tables and figures for my thesis, which use longitudinal health data from three linked surveys. Thanks so much for adding this clarity to the task at hand. This is the first time I’ve ever posted a ‘thank you’ on a site, but this was amazingly helpful.

 Leave a Reply

(required)

(required)