Apr 16, 2012
 

Note: While the information in this blog is useful,  I’ve discovered some better ways to perform intra-question analysis; I just haven’t blogged about them yet.  Feel free to nag me.

Overview

So, I thought I was all done with this subject (see http://www.datarevelations.com/using-tableau-to-visualize-survey-data-part-2.html) but a couple of blog readers (Matt and Tony) “busted” me and pointed out that my approach did not address a way to conduct intra-question analysis.  That is, we could cut Likert scale and check-all-that-apply questions by gender, location, etc., but there was not an easy mechanism to compare responses for people that answered “yes” to one question with their Likert scale opinions on another question.  To put it another way, we did not have a way to see whether or not folks that answered “no” to voting in the next election strongly agree with the statement that a candidate is good at playing jazz.

I’ll confess that I don’t perform this type of analysis that often, but when I do, in addition to drinking Dos Equis, I find that the most interesting insights usually come from a scatterplot analysis where, for example, we compare the responses for annual household income with how much respondents are willing to pay per month for Internet service.

Note: If you have not already done so, please read Part 1 and Part 2.

Further note: click here to download the sample survey data.

Final note: The interactive visualization may be found at the end of this blog post (you’ll find the download link there as well).

 

What it Looks Like When it’s Done

Consider the example below where I compare responses to the question “Will you vote in the upcoming election?” with respondents’ Likert scale opinions on a candidate’s ability to play jazz, grace under pressure, consistency in eating all his/her vegetables, and intelligence.

Comparing plans to vote in next election with several Likert-scale questions

Notice that I’m using a staggered Likert scale visualization to offset negative from positive responses.  Joe Mako busted my chops last time for not doing this in my survey analysis visualization.

That said, I am going to build a simplified visualization (the one below) as I want our focus to be on intra-question analysis, not the best way to fashion Likert-scale visualizations.

A simpler visualization

 

Getting Two Sets of Questions by Joining a Table to Itself

(Or joining a table that’s really similar to the primary table)

If you downloaded the data source workbook you will see there is a tab called “Use This One” that contains data that looks like this:

Data from the “Use This One” table

There’s another tab called “Reshaped Questions” and it contains the exact same information, but without the demographic columns:

Data from “Reshaped Questions” table.

To achieve our goals we will need to join these two sheets (or tables) using ID as a common field.

 

Editing the Data Connection and Joining the Tables

We need to edit the Tableau data connection, indicate that we want to use multiple tables, and then add a table (in this case the table called “Reshaped Questions”).  We then need to fashion an inner join that looks like this:

Joining the two tables

Note: This approach will generate A LOT of additional rows (180K vs 9K).  Once you figure out just which questions you want to include in your intra-question analysis you may want to refine your extract and exclude questions you don’t need.

Tableau will add two new dimensions to the mix, as shown below.

New dimensions giving us two sets of data
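
For readers who want to see what this join does outside Tableau, here is a minimal Python sketch. The data is a made-up miniature of the reshaped survey table, and the field names Reshaped_Question and Reshaped_Response are hypothetical stand-ins for the two new dimensions Tableau adds. It is only an illustration of an inner join on ID, not a description of how Tableau performs the join internally.

```python
from collections import defaultdict

# Hypothetical miniature of the survey data: one row per respondent per question.
rows = [
    {"ID": 1, "Question": "Will vote", "Response": "Yes"},
    {"ID": 1, "Question": "Can play jazz", "Response": "Agree"},
    {"ID": 1, "Question": "Is intelligent", "Response": "Strongly agree"},
    {"ID": 2, "Question": "Will vote", "Response": "No"},
    {"ID": 2, "Question": "Can play jazz", "Response": "Disagree"},
    {"ID": 2, "Question": "Is intelligent", "Response": "Neutral"},
]


def self_join_on_id(left, right):
    """Inner join two lists of rows on ID, keeping both sets of question fields."""
    right_by_id = defaultdict(list)
    for r in right:
        right_by_id[r["ID"]].append(r)

    joined = []
    for l in left:
        # Every left row pairs with every right row that shares its ID.
        for r in right_by_id[l["ID"]]:
            joined.append({
                "ID": l["ID"],
                "Question": l["Question"],
                "Response": l["Response"],
                # Hypothetical names for the second set of fields.
                "Reshaped_Question": r["Question"],
                "Reshaped_Response": r["Response"],
            })
    return joined


joined = self_join_on_id(rows, rows)
print(len(rows), "rows before the join,", len(joined), "rows after")
```

Each respondent with n answered questions contributes n × n rows to the joined table, which is why the row count balloons. A 20-fold increase like the 9K to 180K jump noted above is what you would expect if every respondent has about 20 rows, though that figure is only an inference from the ratio.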

 

So, Let’s See What We Got

If we place ‘Reshaped Questions$’_Question and ‘Reshaped Questions$’_Response on the columns shelf, Question and Response on the rows shelf, and CNT(Number of Records) on the Text mark shelf, we’ll see a complete cross tab of all questions and responses plotted against all questions and responses.

Cross tab of all questions
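
The same cross tab can be sketched in a few lines of Python. The respondents and answers below are invented for illustration; the point is only that each cell is a count of respondents who gave one particular answer to one question and another particular answer to a second question.

```python
from collections import Counter
from itertools import product

# Hypothetical answers keyed by respondent ID.
answers = {
    1: {"Will vote": "Yes", "Can play jazz": "Agree"},
    2: {"Will vote": "No", "Can play jazz": "Disagree"},
    3: {"Will vote": "Yes", "Can play jazz": "Agree"},
}

# Count every (row question, row response, column question, column response)
# combination, which mirrors counting records after the join on ID.
crosstab = Counter()
for respondent in answers.values():
    for (q_row, r_row), (q_col, r_col) in product(respondent.items(), repeat=2):
        crosstab[(q_row, r_row, q_col, r_col)] += 1

for cell, count in sorted(crosstab.items()):
    print(cell, count)
```

Note that the full cross tab also pairs each question with itself, so in practice you would filter down to the combinations you care about.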

Now we can refine our view by looking at just one question from the Reshaped Questions table and a handful of questions from the “Use This One” table, as shown here.

Comparing results from one question with responses from several Likert-scale questions

Next, by moving things around a little bit and by changing the CNT formula to show a percentage of the total, we start to glean some interesting insights.

Refining the cross tab
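
Here is a rough Python sketch of the percentage idea. It assumes the percentage is taken within each answer to the voting question (so the Likert responses for the “Yes” group sum to 100%, and likewise for “No”); the post does not spell out the scope of the table calculation, so treat that as an assumption. The data is made up.

```python
from collections import Counter

# Hypothetical (vote answer, Likert response) pairs for a single Likert question.
pairs = [
    ("Yes", "Agree"), ("Yes", "Agree"), ("Yes", "Disagree"), ("Yes", "Neutral"),
    ("No", "Disagree"), ("No", "Agree"),
]

counts = Counter(pairs)                      # respondents per cell
totals = Counter(vote for vote, _ in pairs)  # respondents per vote answer

# Share of each Likert response within its vote-answer group.
percent = {
    (vote, likert): n / totals[vote]
    for (vote, likert), n in counts.items()
}

for (vote, likert), share in sorted(percent.items()):
    print(f"{vote:>3} | {likert:<8} | {share:.0%}")
```

Dividing by the group total rather than the grand total is what makes groups of different sizes comparable.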

And finally, if we apply some of the Likert scale visualization techniques we explored in several earlier blog posts, we start to see some interesting patterns (e.g., folks that don’t know if they will vote in the upcoming election offer more positive responses to the Likert-scale questions.)

Further refinement combining stacked bars and circle marks

 

So, Why is this Part 2 ½ and not Part 3?

There is so much more we could explore with respect to intra-question analysis, but I have a backlog of posts I need to write.  I did, however, want to get people heading in the right direction.

One thing I will encourage people to explore is using parameters to allow consumers to select which questions they want to see displayed, especially if you build scatterplots.

Matt and Tony, I hope this addresses some of the missing pieces in visualizing survey responses in Tableau.

Posted on April 16, 2012. Categories: 2) Visualizing Survey Data, Blog

  17 Responses to “Using Tableau to Visualize Survey Data — Part 2 ½”

  1. For another example see
    http://dl.dropbox.com/u/32015088/NJTUGJan31.twbx

    I presented this example to the NJ Tableau Users Group in January.

  2. A few quick things.

    Firstly, I don’t agree with showing neutrals either side of the zero on the offset stacked view.  I had a discussion with Andy Cotgreave on my site when I showed this a year or so ago (http://www.organizationview.com/net-stacked-distribution-a-better-way-to-visualize-likert-data).  The summary is that I believe neutrals should be treated as effectively ‘zeros’ when looking at positive responses vs negative responses.  You might have lots of zeros but you can’t split them 50/50

    Second, you shouldn’t take an average (mean) of a likert value as it’s ordinal data.  Median and mode are appropriate but don’t help you much.  My preferred method is to create a ‘net’ score (i.e. (positive responses – negative responses)/100 and shown as a percentage).

    Finally, when comparing questions I use heat maps in Tableau.  They act like small multiples.  As you say, use parameters to select questions.

    • Andrew,

      This blog post was really more about the mechanics of setting up survey data than the different approaches to displaying Likert scale responses. I have written three separate blog posts on the subject, the most recent of which may be found here.

      You’ll notice that I have a “kitchen sink” dashboard that allows you to control whether or not you want to see Likert scores, neutral responses, and so on. Personally, I want to see the neutrals and want to see the average Likert score, but I respect the arguments not to see these things.

      Thanks for taking the time to post.

      Steve

  3. Very nice job! It’s too bad I didn’t discover this post 3 days ago just after you wrote it. I participated in the Tableau Student challenge in association with CARE and it dealt with a survey about people in Lesotho. This Likert scale would have made my dashboard much more impressive! I am glad I found your site.

    Thanks for sharing! I look forward to more posts like this one.

    Arturo

  4. Steve, any reason you don’t just do a self join instead of the extra worksheet? When selecting multiple tables you can choose the same worksheet and join on Responder ID. You do get two copies of the demographic data but it saves a step. Am I missing something?

    • I have also been experimenting with using data blending instead of a join to reduce the cross-product effect. By blending on responder ID, but not question it seems you can set up at least some straightforward question by question comparisons. I still haven’t decided if this is easier than a self-join. 

  5. Alex, you can indeed do a self join.

  6. Hi I am a beginner, I am trying to apply the same example you did. can you tell me what does the dollar sign $ mean in [‘Reshaped Questions$’]?

    • Yehia,

      The screen shots are from an older version of Tableau. [Reshaped Questions$] is the name of the tab in the Excel workbook. We are joining two tables and this is just the name of one of the tables.

      Steve

  7. Hi Steve! I see only counts and no % in the tab ‘Cross tab for all responses’. What’s the reason for your not including %?
    When my clients ask for this type of cross-tab-for-all, they always want percentages as well as counts. In my data the layout is slightly different from yours. I have Question Section/Question/Response in rows & my columns show only 1 question at a time. Question Section is mainly to help keep likert/grid questions together, as opposed to their being separate like in your example of Can Play Jazz, Eat all his/her vegetables, etc. I’ve figured out a calculation that counts respondent IDs by Question Section, and I then use this in a Percent of Total, Pane Down calculation. It seems to work for now…although I have a feeling I haven’t discovered a mistake yet.

    Any thoughts?

    • Hanh,

      The numeric counts are for us to get a sense of the data, not something that I would use to present the data.

      Not sure what problems you are running into. Remember, this post is about intra-question analysis (breaking down the results of one question by another question) so the approach will be a bit different than for just breaking down by a demographic.

      Steve

  8. Hi!

    Great stuff as usual!

    Any plans to write up “better ways to perform intra-question analysis”?

    Thanks!

  9. Steve,

    Great post!

    I applied the same methodology to my weighted survey data set.
    However, I am running into issues while calculating the total %.

    I would like to look at (Sum of weights of responses for question)/(total sum of weights of responses for reshaped question).
    For example, (Sum of weights of “Can play jazz”) / (sum of weights of “yes”)

    Thanks!

  10. Hi Steve:

    At the top of this blog you mentioned finding better ways of conducting intra-question analysis. Since your original blog looks like it was written before Tableau 9 was released, I’m curious if there is a way to incorporate both sets of questions (i.e. the “Use this One” and “Reshaped Questions”) using the internal Pivot tool instead of the Excel reshaper. I also wonder if LOD calculations might also be a way to go.

    Thanks!

    • Ron,

      So many things on my to-do list, including this.

      I’m sure there are wonderful things we can do… just need a few days to sit down and do them!

      Steve
