How Data Helps Tell the Story: A COVID-19 Example

April 14, 2020

In uncertain times, like those we are in right now, data is everywhere. Bar graphs and line charts are flooding our news feeds and taking over our television screens, but what does it all mean?

Analysts continue to be empowered with business intelligence (BI) and “Data Viz” software upgrades. However, the tools alone can’t tell the story contained in your data. It’s the responsibility of the person behind the data to take the default “out of the box” story and fine-tune it to reveal the real story hidden deep inside the data.

The Case for Data Storytelling: COVID-19

With all the COVID-19 dashboards out there, how do you make your graph stand out from the rest? The quick answer: data storytelling.

To explain how to tell a better story with your data, let’s walk through a possible scenario in this time of COVID-19. Put yourself in the position of someone who is helping distribute emergency COVID-19 supplies to the states and counties that need them most. You have been given a typical COVID-19 dataset that has counts of cases and deaths at the U.S. county level. Ultimately, you have to identify which counties need these medical supplies first.

Here’s what the dataset looks like.

 

[Image: the county-level COVID-19 dataset]

Where do we start? How can we make our dashboard tell us the story of which counties need our help?

Before we proceed – does this look familiar?

 

[Image: red bubble map of COVID-19 cases]

It seems to be the default map view for rendering the data in our extract above. Visually jarring, it certainly grabs our attention. But what exactly is this map telling us? If we are trying to identify counties most in need of supplies, then it fails miserably.

Luckily, there is a better way to look at this same data… Let’s start building out our story.

First up, let’s see if we can make the bubble map more useful. I’m going to rework this data in a MicroStrategy Dossier because that is my preferred tool, but it is equally easy to do in any other BI tool you prefer.

  • First, let’s pull in the COVID-19 dataset from Johns Hopkins University
  • Select the Geospatial Services Visualization
  • Drop in the geographic longitude and latitude fields required to map the county-level data
  • To give the map more meaning, drop in a metric, “COVID-19 Cases Reported”
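
Outside a BI tool, the same first pass starts with reading the extract into records that carry a latitude, a longitude, and a case count. The sketch below is one possible way to do that in Python. The column names (`Admin2`, `Province_State`, `Lat`, `Long_`, `Confirmed`) are assumed to match the Johns Hopkins daily report layout and may differ in your extract; `CountyRecord` and `load_counties` are hypothetical names used only for illustration.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class CountyRecord:
    county: str
    state: str
    lat: float
    lon: float
    cases: int


def load_counties(csv_text: str) -> list[CountyRecord]:
    """Parse county rows, skipping any row without usable coordinates."""
    records = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Column names are assumptions; adjust them to your own extract.
        if not row["Lat"] or not row["Long_"]:
            continue  # a row with no coordinates cannot be placed on the map
        records.append(CountyRecord(
            county=row["Admin2"],
            state=row["Province_State"],
            lat=float(row["Lat"]),
            lon=float(row["Long_"]),
            cases=int(row["Confirmed"]),
        ))
    return records
```

Dropping rows without coordinates mirrors what a map visualization does anyway: a record with no latitude or longitude has nowhere to be drawn.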

Pass 1 done. Let’s look at what MicroStrategy has generated for us…

 

[Image: first-pass county map shaded by COVID-19 cases reported]

For this first pass, it’s not too bad. For certain, it’s much better than the red bubble map we looked at above. But we still have a problem. While it is interesting to look at, what’s the story being told here? Just like the red bubble map, it’s quite difficult to see exactly what’s going on. Darker colors appear to indicate more cases of COVID-19, and lighter colors fewer.

Unfortunately, this view still does a poor job of identifying the high-impact counties, which is the aim of this exercise after all. How can we improve on the default out-of-the-box view? While the out-of-the-box blue color buckets are not that bad, they are not set up to tell the story that we need.

The out-of-the-box default buckets look like this…

[Image: default color bucket settings]

 

What we need to do is take these defaults and start tweaking the color buckets until we start seeing the story appear. For an analysis like this, I usually start by changing the colors being rendered on the map. A good question to keep asking yourself is “What exactly am I trying to see on the map?” In this case, we are trying to identify the states and counties with the most recorded cases of COVID-19.

With this in mind, let’s update the buckets as follows:

 

[Image: updated color bucket settings for the Top 5%]

This will allow us to “see” the Top 5%, or 159 counties, with recorded COVID-19 cases while sending all the “less impacted” counties to the background in a muted grey color.
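
The arithmetic behind a “Top N%” bucket is simple enough to sketch. The Python below is an illustration only, not how MicroStrategy computes its buckets. It assumes roughly 3,180 counties in the dataset, a figure inferred from “5% or 159” rather than stated anywhere; `top_n_count` and `split_hot_spots` are hypothetical helper names.

```python
def top_n_count(total_counties: int, percent: int) -> int:
    """Number of counties that fall in the top `percent` of the ranking."""
    return round(total_counties * percent / 100)


def split_hot_spots(cases_by_county: dict[str, int], percent: int):
    """Return (highlighted, muted) county names, ranked by case count."""
    ranked = sorted(cases_by_county, key=cases_by_county.get, reverse=True)
    cutoff = top_n_count(len(ranked), percent)
    return ranked[:cutoff], ranked[cutoff:]


# Assumption: about 3,180 counties, inferred from "5% or 159".
TOTAL_COUNTIES = 3180
print(top_n_count(TOTAL_COUNTIES, 5))  # 159
print(top_n_count(TOTAL_COUNTIES, 2))  # 64
print(top_n_count(TOTAL_COUNTIES, 1))  # 32
```

Under that assumed total, the same calculation reproduces the 64 and 32 counties used for the Top 2% and Top 1% views in this walkthrough.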

Here’s what this looks like…

 

[Image: county map highlighting the Top 5% of counties]

Much better, wouldn’t you agree? We can already see the “hot spots” take shape, and our story is starting to be revealed. The muted grey colors and pops of color draw your attention to the counties that are most impacted. Basically, we are taking the noise out of our data on the map and are starting to reveal the story of “Which Counties Need Medical Supplies Now”.

Next, we need to fine-tune the story a bit more and look at just the Top 2%, or 64 counties, in need of help. Update the buckets to show three grey buckets and a single 2% color pop bucket.
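
For readers working outside a BI tool, here is a rough Python sketch of a “three greys plus one pop” color assignment. The article does not say where the grey bucket boundaries sit, so this version simply splits the remaining counties into three equal bands by rank; that split, the hex colors, and the name `bucket_colors` are all assumptions made for illustration.

```python
GREYS = ["#f0f0f0", "#d9d9d9", "#bdbdbd"]  # light to dark; hypothetical shades
POP = "#d7301f"                            # hypothetical highlight color


def bucket_colors(cases_by_county: dict[str, int], top_percent: int) -> dict[str, str]:
    """Color the top `top_percent` of counties POP and the rest in three greys."""
    ranked = sorted(cases_by_county, key=cases_by_county.get, reverse=True)
    n_pop = round(len(ranked) * top_percent / 100)
    rest = ranked[n_pop:]
    colors = {name: POP for name in ranked[:n_pop]}
    for i, name in enumerate(rest):
        # `rest` runs from most to fewest cases, so the first band is darkest.
        band = i * len(GREYS) // len(rest)  # 0, 1 or 2
        colors[name] = GREYS[len(GREYS) - 1 - band]
    return colors
```

Calling `bucket_colors(cases, 2)` gives the Top 2% view, and changing the second argument to `1` gives the Top 1% view. Keeping the greys close together in lightness is what pushes the lower-ranked counties into the background.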

[Image: color bucket settings for the Top 2%]

 

Let’s take a look…

 

[Image: county map highlighting the Top 2% of counties]

This time, we see even fewer distractions on the map, and only the top counties catch our eye. We can go a step further and identify the Top 1%, or 32, most impacted counties.

Update the buckets as follows – three grey buckets and a single 1% color pop bucket.

 

[Image: color bucket settings for the Top 1%]

Now, we have really fine-tuned our map to only show the counties that are most impacted. These counties are the ones that we need to get supplies to.

 

[Image: county map highlighting the Top 1% of counties]

Compare this map to the original “Red Bubble” map. Which map is telling us the best story? Which one would allow us to take action faster?

Now that we have identified the top impacted counties in the United States, we can take this story even further. How about superimposing an additional layer of data on top of our county data to add even more value to the insights we have already discovered?

Adding a second layer of medical supply warehouse locations and current stock levels at each could add even more value to our analysis. Let’s do that.

  • Add a new “map layer”
  • Set it to appear “on top” of the county level layer
  • Format the warehouse locations as equal-sized circles
  • Finally, add some color thresholding for the circles so that we can see the health of the stock levels at each warehouse at a glance
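
Color thresholding for the warehouse circles boils down to mapping a stock level to a small set of colors. The Python sketch below shows one way to express that rule. The 25% and 60% cutoffs, the traffic-light palette, and the `stock_color` name are illustrative assumptions, not values from the dashboard.

```python
def stock_color(units_in_stock: int, capacity: int) -> str:
    """Map a warehouse's fill level to a traffic-light color."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    fill = units_in_stock / capacity
    # Thresholds are illustrative assumptions; tune them to your supply chain.
    if fill < 0.25:
        return "red"    # critically low stock
    if fill < 0.60:
        return "amber"  # running down
    return "green"      # healthy
```

Because the circles are all the same size, color is the only thing that varies between warehouses, which keeps the stock signal from competing with the county shading underneath.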

Render the map and it should look like this:

 

[Image: county map with the warehouse layer on top]

Now, we can easily see our “Most impacted COVID-19 counties” along with visual signals identifying the closest supply warehouse and its current in-stock supply level.
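
On the map, “closest” is judged by eye. If you want to back that up with a number, a great-circle distance is a reasonable first approximation. The sketch below uses the haversine formula in Python; it measures straight-line distance over the Earth’s surface rather than driving distance, and the function names and input shapes are assumptions made for illustration.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points in kilometers."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def nearest_warehouse(county, warehouses):
    """Return (name, distance_km) of the warehouse closest to `county`.

    `county` is a (lat, lon) pair; `warehouses` maps a name to a (lat, lon) pair.
    """
    name = min(warehouses, key=lambda w: haversine_km(*county, *warehouses[w]))
    return name, haversine_km(*county, *warehouses[name])
```

Pairing this distance with the stock level at each warehouse would let you rank candidate warehouses for every highlighted county instead of relying on the visual alone.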

The Importance of Data Storytelling

We know that data can be an extremely powerful communication tool; however, too much data can be dangerous if it is not organized. Data storytelling, defined as communicating actionable insights through visual and narrative stories, does more than just put pretty pictures and graphs on a page. Instead, it operationalizes the data to reduce time to insight and ensure more meaningful outcomes for users.

Using data storytelling in the example above, we were able to transform complex information into something easier to understand. Without much effort, we identified a story in a complex dataset and used it to inform action by revealing how close the warehouses are to the counties most in need of COVID-19 aid.

 


 

Questions about where to start? Sign up for a design session, and one of our analytics experts will consult with your company about your goals with data science.