Apr 05, 2017
 

More thoughts on the Marimekko chart and in particular how to build one in Tableau.

April 4, 2017

Overview

Given my reluctance to embrace odd chart types and my conviction that I would find something better I was surprised to find myself last month writing about — and endorsing — the Marimekko chart.

If I was surprised then I’m absolutely gobsmacked to be writing about it again.

What precipitated all this was another very good example of the chart in the wild. After admiring it I couldn’t help but “look under the hood” (hey, we are talking about Tableau Public and people sharing this stuff freely) and I thought that the dashboard designer was working harder than necessary to build the visualization.

So, if people are going to use these things I thought I would share an alternative, and I think easier, technique for building them.

The Great Example from Neil Richards

Here’s the terrific Makeover Monday dashboard from Neil Richards where we see the likelihood of certain jobs being replaced by automation.

Image: Neil Richards’s Makeover Monday dashboard.

Neil does a great job highlighting some of the more interesting findings, but if you want to know more than what Neil highlights you’ll need to explore the dashboard on your own.

Notice that in both this case and in Emma Whyte’s we are dealing with only two data segments; e.g., male vs. female and at-risk vs. not at-risk jobs. Having only two colors is one of the main reasons why the chart works well.

Okay! Uncle! I agree that under the right conditions this is a useful chart and I can see why you may want to make one.

But is there an easier way to make one?

An Easier Way to Create a Marimekko Chart in Tableau

It turns out the same technique Joe Mako showed me six years ago for building a divergent stacked bar chart works great for fashioning a Marimekko. Let’s see how to do this using Superstore data with fields similar to what was available in both Emma’s and Neil’s dashboards.

Let’s say I want to compare the magnitude of sales with the profitability of items by region.  Figure 2 shows the overall magnitude of sales but makes comparing profitability difficult.

Figure 2 — Overall sales is easy to see but comparing profitability across regions is difficult.

Here’s another attempt using a 100% stacked bar chart.

Figure 3 — Showing profitability with a 100% stacked bar chart.

Yes, this does a much better job of allowing us to compare the profitability of each region, but there’s no way to easily glean that sales in the West are almost double sales in the South (which is easy to see in Figure 2).

So, how can we make the regions that have large sales be wide and the regions that have small sales be narrow?

Understanding the Fields

Before going much further let’s make sure we understand the following three fields:

  • Percentage Profitable Sales
  • Percentage Unprofitable Sales
  • Sales Percentage of

[Percentage Profitable Sales]

This is defined as

SUM(IF [Profit]>=0 THEN [Sales] END)/SUM([Sales])

… and translates as “if an item within a partition is profitable (its profit is zero or more), add up its sales, then divide by the total sales within the partition.”

This is the field that gives us the 90%, 77%, 76%, and 72% results shown in Figure 3.

[Percentage Unprofitable Sales]

This is defined as

1 - [Percentage Profitable Sales]

… and gives us the 10%, 23%, 24%, and 28% shown in Figure 3.
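
If it helps to see the arithmetic outside of Tableau, here is a rough Python sketch of the two calculations for a single partition. The rows and the function names are hypothetical (they are not Superstore values or Tableau functions); the point is only that rows failing the IF test contribute nothing to the numerator, which is how I understand SUM to treat the resulting NULLs.

```python
# Hypothetical item-level rows for one partition (say, one region).
# The numbers are made up for illustration; they are not Superstore values.
rows = [
    {"sales": 400.0, "profit": 50.0},
    {"sales": 100.0, "profit": -20.0},
    {"sales": 300.0, "profit": 0.0},   # zero profit counts as profitable (>= 0)
    {"sales": 200.0, "profit": -5.0},
]


def percentage_profitable_sales(rows):
    """Mirror SUM(IF [Profit]>=0 THEN [Sales] END)/SUM([Sales])."""
    profitable = sum(r["sales"] for r in rows if r["profit"] >= 0)
    total = sum(r["sales"] for r in rows)
    return profitable / total


def percentage_unprofitable_sales(rows):
    """Mirror 1 - [Percentage Profitable Sales]."""
    return 1 - percentage_profitable_sales(rows)


print(f"{percentage_profitable_sales(rows):.0%} profitable")
print(f"{percentage_unprofitable_sales(rows):.0%} unprofitable")
```

Because the second field is just one minus the first, the two segments always stack to exactly 100%.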

[Sales Percentage of]

This is defined as

SUM([Sales]) /TOTAL(SUM([Sales]))

… and we will use it to compute the percentage of sales across the four regions (i.e., show me the sales for one region divided by the sales for all the regions). Here’s how we might use it in a visualization.

Figure 4 — Using the calculation to figure out how wide each region should be.

So, in Figure 4 we can see that the West segment is a lot thicker than the South segment.
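
For anyone who wants to check the share-of-total logic by hand, the following Python sketch does the same division. The regional totals are invented round numbers, not the Superstore figures, and the function name is simply my label for the Tableau field.

```python
# Hypothetical regional sales totals (made up; not the Superstore figures).
sales_by_region = {"West": 320.0, "East": 300.0, "Central": 220.0, "South": 160.0}


def sales_percentage_of(sales_by_region):
    """Mirror SUM([Sales]) / TOTAL(SUM([Sales])) computed across regions."""
    grand_total = sum(sales_by_region.values())
    return {region: sales / grand_total for region, sales in sales_by_region.items()}


shares = sales_percentage_of(sales_by_region)
for region, share in shares.items():
    print(f"{region:8s} {share:.1%}")
```

The shares add up to 100%, which is what lets us treat them as widths along a single axis.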

How can we apply this additional depth to what we had in Figure 3?

Make it Easy to See if the Math is Correct

At this point it will be helpful to see the interplay of the various measures and dimensions using a cross tab like the one shown in Figure 5.

Figure 5 — Cross tab showing the relationship among the different measures and dimensions.

The first four columns are easy to interpret:

“I see that sales in the West is $725,458 of which 10% is unprofitable and 90% is profitable.  That $725,458 represents 31.6% of the total sales.”

But how is the field called [Start at] defined and how are we going to use it?

Understanding [Start at]

[Start at] is defined as

PREVIOUS_VALUE(0)+ZN(LOOKUP([Sales Percentage of],-1))

This is the calculation that figures out where the mark should start while [Sales Percentage of] will later determine how thick the mark should be.  Let’s see how this all works together.

Figure 6 — How [Start at] and [Sales Percentage of] will work together.  Note that “Compute Using” for the two table calculations is set to [Region].

For the West region we want to start at 0% and have a bar that is 31.6% wide. The function

PREVIOUS_VALUE(0)

tells Tableau to look at the value of [Start at] in the row above and, if there is no row above, to use 0 (see Item 1 in Figure 6, above).

Add to this the value for [Sales Percentage of] in the previous row (Item 2 which is also not present) and you get 0 + 0 (Item 3).

For the East region we want to start wherever West left off (Item 3 plus Item 4, which gives us item 5) and make the mark 29.5% wide (item 6).

For the Central region we want to start wherever the previous region left off (Item 5 plus item 6, which gives us item 7) and make the mark 21.8% wide (Item 8).
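
The row-by-row logic above can be sketched in Python. This is my emulation of what the table calculation does, not Tableau code. The West, East, and Central widths come from the walkthrough; the South width is an assumption (the remainder, so that the four widths sum to 100%).

```python
# Widths in table order. West, East and Central come from the walkthrough;
# the South value is an assumption (the remainder, so the four sum to 100%).
widths = [("West", 0.316), ("East", 0.295), ("Central", 0.218), ("South", 0.171)]


def start_at(widths):
    """Emulate PREVIOUS_VALUE(0) + ZN(LOOKUP([Sales Percentage of], -1))."""
    starts = []
    previous_start = 0.0    # PREVIOUS_VALUE(0): use 0 when there is no row above
    previous_width = None   # LOOKUP(..., -1): nothing exists above the first row
    for region, width in widths:
        # ZN() turns the missing (NULL) lookup on the first row into 0.
        lookup = previous_width if previous_width is not None else 0.0
        current = previous_start + lookup
        starts.append((region, current))
        previous_start, previous_width = current, width
    return starts


for region, start in start_at(widths):
    print(f"{region:8s} starts at {start:.1%}")
```

Put another way, each region starts at the running total of the widths of the regions that come before it.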

Let’s see how this all fits together into the Marimekko visualization in Figure 7.

Figure 7 — Using [Start at ] and [Sales Percentage of] to make the Marimekko work.

There are three things to keep in mind.

  1. [Start at] is on columns and determines the starting point (how far to the right) for each of the regions.
  2. [Sales Percentage of] is on Size and determines how thick the bars should be.
  3. Size is set to Fixed width, left aligned, where Fixed means the measure on the Size shelf is determining the thickness.

Figure 8 — Size must be fixed and left-aligned.
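
To make the three settings concrete, here is a hedged Python sketch of the rectangles they should produce. The starts and widths follow the walkthrough, but South's width, the pairing of 77%, 76%, and 72% with East, Central, and South, and the choice to put the profitable segment on the bottom are all assumptions made for illustration.

```python
# (region, start, width, share_profitable). Starts and widths follow the
# walkthrough; South's width and the pairing of 77%/76%/72% with East,
# Central and South are assumptions made for illustration.
regions = [
    ("West",    0.000, 0.316, 0.90),
    ("East",    0.316, 0.295, 0.77),
    ("Central", 0.611, 0.218, 0.76),
    ("South",   0.829, 0.171, 0.72),
]


def marimekko_rectangles(regions):
    """Return (region, segment, x0, x1, y0, y1) for each mark.

    Left-aligned fixed sizing means a bar begins at its start value and
    extends to the right by its width; the two segments stack from 0 to 1.
    Drawing the profitable segment at the bottom is an arbitrary choice here.
    """
    rects = []
    for region, start, width, profitable in regions:
        x0, x1 = start, start + width
        rects.append((region, "Profitable", x0, x1, 0.0, profitable))
        rects.append((region, "Unprofitable", x0, x1, profitable, 1.0))
    return rects


for region, segment, x0, x1, y0, y1 in marimekko_rectangles(regions):
    print(f"{region:8s} {segment:13s} x: {x0:.3f}-{x1:.3f}  y: {y0:.2f}-{y1:.2f}")
```

If the alignment were centered rather than left, each bar would straddle its start value and the bars would overlap, which is why the left-aligned setting matters.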

Some Interesting Findings

I built a parameter-driven version of the Marimekko (embedded at the end of this blog post) that allows the viewer to select different dimensions and different ways to sort. Here’s what happens when we look at Sub-Category sorted by Profitability.

Figure 9 — Profitability by Sub-Category.

Okay, not a big surprise here given how many visualizations we’ve all seen showing that Tables are problematic.

That said, I was in for a surprise when I broke this down by state and sorted by the magnitude of sales, as shown below.

Figure 10 — Profitability by state, sorted by Sales.

Wow, after 11 years of living with this data set I never realized that 60% of the items sold in Texas were unprofitable.  Who knew?

To be honest I’m not convinced we need a Marimekko to see this clearly.  A simple sorted bar chart will do the trick, as shown in Figure 11.

Figure 11 — Sorted bar chart.

Indeed, I think this very simple view is better than the Marimekko in many respects.

I guess it depends on what you’re trying to get across.

See for Yourself

I’ve included an embedded workbook that has the Superstore example as well as versions of the visualizations Emma Whyte and Neil Richards built, but using this alternative technique.

I encourage you to think long and hard before deploying a Marimekko.  But if you do decide to build one I hope the techniques I explored here will prove useful.

 

Mar 20, 2017
 

Or

How I stopped worrying and learned to love appreciate the Marimekko

March 19, 2017

Overview

Readers of my blog know that I suffer from what Maarten Lambrechts calls xenographobia, the fear of unusual graphics. I’ll encounter a chart type that I’ve not seen before, purse my lips, and think (smugly) that there is undoubtedly a better way to show the data than in this novel and, to me, unusual chart.

That was certainly my reaction to “Marimekko Mania” when Tableau 10.0 was first released. I didn’t see a solid use case for this chart. There were some wonderful blog posts from Jonathan Drummey and Bridget Cogley on the subject, but I just wasn’t buying the need for the chart type.

Note: It turns out that for many situations you can make a perfectly fine Marimekko just using table calculations. I’ll weigh in on this later.

Enter Emma Whyte and Workout Wednesday

My “I’ll never need to use that” arrogance was disrupted a few weeks ago when I read this blog post from Emma Whyte.  The backstory is that Emma reviewed a Junk Charts makeover of a Wall Street Journal graphic, really liked the makeover, and decided to recreate it in Tableau.

Here’s the Wall Street Journal graphic.

Figure 1 — Source of inspiration for Junk Charts  and Emma Whyte. From a 2016 survey by LeanIn.org and McKinsey & Co.

There are two important things the data is trying to tell us:

  1. The percentage of women decreases, a lot, the higher up you go in the corporate hierarchy; and,
  2. There are far more entry-level positions than there are managers than there are VPs, etc.

The chart does a good job on the first point but only uses text to convey the second point.

Contrast this with Emma Whyte’s visualization:

Figure 2 — Emma Whyte’s makeover.

Whoa.

I immediately “grokked” this.  There are way more men than women among VPs, Senior VPs, and in the C-Suite, but look how much narrower those bars are!  True, I cannot easily compare how much wider the Entry Level column is than the VP column, but is that really important?

Is the Marimekko in fact the “right” way to show this?

Being a little bit stubborn I was not ready to declare a Marimekko victory so I decided to see if I could build something that worked as well, if not better, using more common chart types.

Anything You Can Do, I Can Do…

I won’t go through all ten iterations I came up with but I will show some of my attempts to convey the data accurately and with the visceral wallop I get from Emma’s makeover.

100% Stacked Bar with Marginal Histogram

Putting a histogram in the margin has become a “go to” technique when I’m dealing with highlight tables and scatterplots so I thought that might work in this situation. Here’s a 100% stacked bar chart combined with a histogram.

Figure 3 — 100% stacked bar with marginal histogram.

I was so convinced this would just smoke the Marimekko. I mean just look how easy it is to make accurate comparisons!

That may be true, but I think the Marimekko in question does a better job.

Connected Dot Plot

Here’s another attempt using a connected dot plot.

Figure 4 — Connected dot plot where the size of the circles reflects the percentage of the workforce.

Here the lines separating the circles show the gender gap and the size of the circles reflects the percentage of the workforce.

OK, I think the gap is well represented but the spacing between job levels is a fixed width. In my pursuit of accuracy I needed to find a way to spread the circles based on percentage of the workforce.

Diverging Lines with Bands

Figure 5 shows two diverging lines with circles and bands that are proportionate to the percentage of the workforce (Entry level is 52 units wide, Manager is 28 units wide, and so on).

Figure 5 — Diverging lines with dots and correctly-sized circles and bands

But why are the lines sloping?  Shouldn’t the lines be flat for each job level?

Flat Lines

Here’s a similar approach but where the lines stay flat for each job level.

Figure 6 — Flat lines and accurate circles and bands.

More Approaches and the Graphic from the Actual Report

All told I made ten attempts. The calculation I came up with for Figure 5 also made it possible to create a Marimekko just using a simple table calculation.

Note: I asked Jonathan Drummey to have a look at the Marimekko-with-table-calc approach and he points out that in both my example and Emma Whyte’s example the data isn’t “dense” so you can break the visualization simply by right-clicking a mark and selecting Exclude. That said, the technique is fine for static images and dashboards where you disable the Exclude functionality.

I also reviewed the full Women in the Workplace report and saw they used an interesting pipeline chart to relate the data.

Figure 7 — “Pipeline” chart from Women in Workplace report (LeanIn.Org and McKinsey & Co.)

I applaud the creativity but have a lot of problems with the inaccurate proportions. Notice that this chart also has a sloping line suggesting a continuous decrease as you go from one level to another.

And The Winner is…

For me, Emma Whyte’s Marimekko does the best job of showing the data in a compelling and accurate format and I thank Emma for presenting such a worthwhile example.

Will I use this chart type in my practice?

It depends.

If the situation calls for it, I would try it along with other approaches and see what works best for the intended audience.

Here’s a link to the Tableau workbook that contains a copy of Emma Whyte’s original approach and many of my attempts to improve upon it. If you come up with an alternative approach that you think works well, please let me know.

Postscript

Big Book of Dashboards co-author Jeff Shaffer encouraged me to make some more attempts. Here’s a work in progress using jittering.

Jitter with bands

I think this looks promising.