…Or when two half charts are more than one, or maybe ‘The love between the streamgraph and the stacked column chart’… In this post we’ll review some ideas behind the chart used by the Financial Times in its Coronavirus tracker, and how we can replicate it in R using the highcharter package.
A week ago I saw a tweet from Steven Bernard (@sdbernard) of the Financial Times showing a streamgraph on top of a stacked column chart. I looked at it for a few seconds and then boom: what a combination! Why?
I like the streamgraph, but it is hard to see how the distribution between categories changes when the total changes suddenly. So having this auxiliary chart is a nice addition that keeps the distribution in sight.
Data
In this post we will use the Our World in Data Covid deaths data (link here) because I’m not sure which data the Financial Times team used.
We’ll load the data, check its structure and keep only what we need to replicate the chart:
# A tibble: 7 × 2
continent n
<chr> <int>
1 Africa 86641
2 Asia 75850
3 Europe 82907
4 North America 62373
5 Oceania 36492
6 South America 21284
7 <NA> 18423
FT used the 7-day rolling average in the chart, so we’ll use the {RcppRoll} package to get that series for each continent. The next code shows how the function roll_meanr works.
[1] NA NA 2 3 4 5 6 7 8 9
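The call that produced this output is not shown, but the values are consistent with a right-aligned rolling mean of window 3 over 1:10. A minimal base R sketch of that computation follows; `roll_mean_right` is a hypothetical helper used here as a stand-in for `roll_meanr`, not part of {RcppRoll}.

```r
# Right-aligned rolling mean: each value is the mean of the current
# observation and the (n - 1) previous ones; the first (n - 1) values are NA.
# `roll_mean_right` is a hypothetical helper, not part of {RcppRoll}.
roll_mean_right <- function(x, n) {
  as.numeric(stats::filter(x, rep(1 / n, n), sides = 1))
}

x <- 1:10
roll_mean_right(x, n = 3)

# With {RcppRoll} installed, the analogous call would be:
# RcppRoll::roll_meanr(x, n = 3)
```

The "r" in roll_meanr stands for right alignment, which is why the missing values sit at the start of the series.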
Now we need to group the data to calculate the rolling mean for every country/location and then filter to reduce some noise.
The chart shows continents, so we’ll then group by date and continent.
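These two steps could look roughly like this with {dplyr}. The data frame is made up for illustration, and the column names (location, continent, date, new_deaths) and the helper `roll_mean_right` are assumptions; the real file and the post's actual noise filter may differ.

```r
library(dplyr)

# Hypothetical helper: right-aligned rolling mean (stand-in for RcppRoll::roll_meanr).
roll_mean_right <- function(x, n) {
  as.numeric(stats::filter(x, rep(1 / n, n), sides = 1))
}

# Toy data, made up for illustration only.
toy <- data.frame(
  location   = rep(c("Chile", "Brazil", "India"), each = 10),
  continent  = rep(c("South America", "South America", "Asia"), each = 10),
  date       = rep(as.Date("2021-01-01") + 0:9, times = 3),
  new_deaths = c(1:10, 11:20, 21:30),
  stringsAsFactors = FALSE
)

# Step 1: 7-day rolling mean within each location.
by_location <- toy %>%
  group_by(location) %>%
  arrange(date, .by_group = TRUE) %>%
  mutate(value = roll_mean_right(new_deaths, n = 7)) %>%
  ungroup() %>%
  filter(!is.na(value)) # here we only drop days without a full 7-day window

# Step 2: one value per date and continent.
dcont <- by_location %>%
  group_by(date, continent) %>%
  summarise(value = sum(value), .groups = "drop")
```

Computing the rolling mean per location before summing keeps one country's reporting gaps from distorting the window of another.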
Before combining the two charts we need to know how to build each one independently. Let’s start with the main one:
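A minimal sketch of a first streamgraph with {highcharter}, using made-up data in place of the real series. The object names `dcont` and `hc1` are assumptions.

```r
library(highcharter)

# Toy data, made up for illustration: one value per date and continent.
dcont <- data.frame(
  date      = rep(as.Date("2021-01-01") + 0:4, times = 2),
  continent = rep(c("Asia", "Europe"), each = 5),
  value     = c(5, 8, 12, 9, 6, 3, 4, 10, 14, 7),
  stringsAsFactors = FALSE
)

# One streamgraph series per continent; print `hc1` to render it.
hc1 <- hchart(dcont, "streamgraph", hcaes(x = date, y = value, group = continent))
```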
A good start, but we can do better. So, some considerations:
The yAxis doesn’t have a meaning in the streamgraph, so we’ll remove it.
We can set endOnTick and startOnTick to FALSE in the yAxis to gain some extra vertical space.
Remove the vertical lines to get a cleaner chart.
Get a better tooltip (table = TRUE).
In this case we can try adding labels to each series instead of using a legend, same as the FT chart.
This one is not about the chart itself but about what it represents: in the original FT chart some countries, like the UK and the US, are separated from their continent because they are relevant, and the color used is similar to their continent’s to keep the visual association.
To separate some countries from their continent we’ll create a grp variable:
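One way to build such a variable, sketched on a toy data frame. The list of countries is read off the output that follows; the column name `location` and the object names are assumptions.

```r
library(dplyr)

# Countries shown on their own (taken from the grp values in the output below).
countries <- c(
  "India", "Russia", "United Kingdom", "Mexico",
  "United States", "Brazil", "Chile"
)

# Toy data, made up for illustration only.
toy <- data.frame(
  location  = c("India", "Japan", "Russia", "France", "Chile", "Peru"),
  continent = c("Asia", "Asia", "Europe", "Europe", "South America", "South America"),
  stringsAsFactors = FALSE
)

# grp is the country itself when it is singled out, otherwise its continent.
toy <- toy %>%
  mutate(grp = if_else(location %in% countries, location, continent))
```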
# A tibble: 13 × 3
continent grp n
<chr> <chr> <int>
1 Africa Africa 8322
2 Asia Asia 7205
3 Asia India 146
4 Europe Europe 7734
5 Europe Russia 146
6 Europe United Kingdom 146
7 North America Mexico 146
8 North America North America 5694
9 North America United States 146
10 Oceania Oceania 3504
11 South America Brazil 146
12 South America Chile 146
13 South America South America 1752
Fun part #1: For the continents that have separated countries we’ll add the "Rest of " prefix, to make explicit that this is not the whole continent.
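A sketch of that renaming, assuming a continent counts as "split" when it has more than one distinct grp value. The toy data and the `is_split` helper column are made up for illustration.

```r
library(dplyr)

# Toy data, made up for illustration only.
toy <- data.frame(
  continent = c("Africa", "Asia", "Asia", "Europe", "Europe"),
  grp       = c("Africa", "Asia", "India", "Europe", "Russia"),
  stringsAsFactors = FALSE
)

toy <- toy %>%
  group_by(continent) %>%
  mutate(
    # TRUE for continents that have at least one country singled out.
    is_split = n_distinct(grp) > 1,
    # Only the remainder row of a split continent gets the prefix.
    grp = if_else(is_split & grp == continent, paste("Rest of", continent), grp)
  ) %>%
  ungroup() %>%
  select(-is_split)
```

Continents without separated countries, such as Africa here, keep their plain name.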
# A tibble: 13 × 3
continent grp n
<chr> <chr> <int>
1 Africa Africa 8322
2 Asia India 146
3 Asia Rest of Asia 7205
4 Europe Rest of Europe 7734
5 Europe Russia 146
6 Europe United Kingdom 146
7 North America Mexico 146
8 North America Rest of North America 5694
9 North America United States 146
10 Oceania Oceania 3504
11 South America Brazil 146
12 South America Chile 146
13 South America Rest of South America 1752
Fun part #2: We’ll use a specific color for each continent, and a brightened variation for the separated countries. For this task the {shades} package offers the brightness function.
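To illustrate the idea without extra dependencies, here is a base R sketch that raises the HSV value (brightness) of a colour by a fixed amount. `brighten` is a hypothetical helper, not the {shades} API, and its rounding may differ slightly from the colours in the table that follows.

```r
# Hypothetical helper: raise the HSV value (brightness) of hex colours by `delta`.
# A base R stand-in for illustration; not the {shades} implementation.
brighten <- function(color, delta) {
  hsv_vals <- grDevices::rgb2hsv(grDevices::col2rgb(color))
  grDevices::hsv(
    h = hsv_vals["h", ],
    s = hsv_vals["s", ],
    v = pmin(hsv_vals["v", ] + delta, 1) # cap brightness at 1
  )
}

continent_color <- "#d35400" # base colour used for Asia in the table below

# delta = 0 keeps the continent colour; delta = 0.10 gives a brighter variant.
grp_colors <- brighten(rep(continent_color, 2), delta = c(0, 0.10))
grp_colors
```

The fct column in the table plays the role of `delta` here: 0 for the "Rest of" rows and 0.05 or 0.10 for the separated countries.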
continent      grp                    n       aux    continent_color  fct   grp_color  continent_cln
Africa         Africa                 36659   TRUE   #f1c40f          0.00  #F1C40F    africa
Asia           India                  75811   FALSE  #d35400          0.10  #EC5E00    asia
Asia           Rest of Asia           140569  TRUE   #d35400          0.00  #D35400    asia
Europe         Russia                 56198   FALSE  #2980b9          0.05  #2C89C6    europe
Europe         United Kingdom         30761   FALSE  #2980b9          0.10  #2F92D2    europe
Europe         Rest of Europe         199806  TRUE   #2980b9          0.00  #2980B9    europe
North America  United States          154283  FALSE  #2c3e50          0.05  #33485D    north_america
North America  Mexico                 47339   FALSE  #2c3e50          0.10  #3A526A    north_america
North America  Rest of North America  19174   TRUE   #2c3e50          0.00  #2C3E50    north_america
Oceania        Oceania                3304    TRUE   #7f8c8d          0.00  #7F8C8D    oceania
South America  Brazil                 98961   FALSE  #2ecc71          0.05  #31D978    south_america
South America  Chile                  9002    FALSE  #2ecc71          0.10  #34E67F    south_america
South America  Rest of South America  83471   TRUE   #2ecc71          0.00  #2ECC71    south_america
Then extract some vectors:
Before continuing, let’s compare the original colors with the shades obtained with the {shades} package.
The colors and levels are ready so let’s regroup the data using this new grp variable:
Then plot the previous chart, now taking into account all the considerations made before.
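A sketch of how those considerations might translate into {highcharter} options, on made-up data. It assumes the vertical lines are x-axis grid lines and that the series-label module is available for the in-chart labels; the colours are the grp_color values from the table above.

```r
library(highcharter)

# Toy data, made up for illustration; groups are listed in alphabetical order.
dgrp <- data.frame(
  date  = rep(as.Date("2021-01-01") + 0:4, times = 3),
  grp   = rep(c("India", "Oceania", "Rest of Asia"), each = 5),
  value = c(5, 8, 12, 9, 6, 1, 1, 2, 1, 1, 3, 4, 10, 14, 7),
  stringsAsFactors = FALSE
)

hc2 <- hchart(
  dgrp, "streamgraph", hcaes(x = date, y = value, group = grp),
  label = list(enabled = TRUE) # labels inside each series instead of a legend
) %>%
  hc_colors(c("#EC5E00", "#7F8C8D", "#D35400")) %>% # follows the series order
  hc_yAxis(visible = FALSE, startOnTick = FALSE, endOnTick = FALSE) %>%
  hc_xAxis(gridLineWidth = 0) %>% # assumption: removes the vertical lines
  hc_tooltip(table = TRUE, sort = TRUE) %>%
  hc_legend(enabled = FALSE)
```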
The stacked column chart
For the stacked column chart we’ll use the data with deaths by continent (no grp). This is a simple chart, so the only important part is setting borderWidth, groupPadding and pointPadding to 0 to remove the space between columns.
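A minimal sketch of such a column chart on made-up data. Using `stacking = "percent"` is an assumption, chosen because the chart is meant to show the distribution; `"normal"` would stack absolute values instead.

```r
library(highcharter)

# Toy data, made up for illustration: one value per date and continent.
dcont <- data.frame(
  date      = rep(as.Date("2021-01-01") + 0:4, times = 2),
  continent = rep(c("Asia", "Europe"), each = 5),
  value     = c(8, 12, 22, 23, 13, 6, 8, 13, 17, 10),
  stringsAsFactors = FALSE
)

hc3 <- hchart(dcont, "column", hcaes(x = date, y = value, group = continent)) %>%
  hc_plotOptions(
    column = list(
      stacking     = "percent", # assumption: share of the total per date
      borderWidth  = 0,         # no border around each column
      groupPadding = 0,         # no gap between groups of columns
      pointPadding = 0          # no gap between individual columns
    )
  )
```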
The final chart
There are some important things to do before coding the final chart:
Create and add two yAxis using the hc_yAxis_multiples and create_yaxis functions, one for each type of series. Both types of series will share the same xAxis.
For the column series we’ll use the id parameter with the unique(cont) value, then in the streamgraph we’ll use the linkedTo parameter to link the series. With this, the Russia, UK and Rest of Europe series from the streamgraph are linked to the Europe series from the stacked column chart, so if the user clicks the Europe legend item all those series will hide.
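Putting both points together, a sketch on made-up data could look like this. The ids, the axis heights and the percent stacking are assumptions, and the id and linkedTo vectors are written in the same (alphabetical) order as the groups in the toy data. Depending on the installed {highcharter} version, create_yaxis may emit a deprecation warning.

```r
library(highcharter)

# Toy data, made up for illustration; dcont holds the continent totals of dgrp.
dates <- as.Date("2021-01-01") + 0:4
dcont <- data.frame(
  date      = rep(dates, times = 2),
  continent = rep(c("Asia", "Europe"), each = 5),
  value     = c(8, 12, 22, 23, 13, 6, 8, 13, 17, 10),
  stringsAsFactors = FALSE
)
dgrp <- data.frame(
  date  = rep(dates, times = 4),
  grp   = rep(c("India", "Rest of Asia", "Rest of Europe", "Russia"), each = 5),
  value = c(5, 8, 12, 9, 6, 3, 4, 10, 14, 7, 4, 5, 9, 12, 7, 2, 3, 4, 5, 3),
  stringsAsFactors = FALSE
)

hc <- highchart() %>%
  hc_xAxis(type = "datetime") %>%
  # Two stacked panes: axis 0 (taller) for the streamgraph, axis 1 for the columns.
  hc_yAxis_multiples(create_yaxis(naxis = 2, heights = c(2, 1))) %>%
  hc_add_series(
    dcont, type = "column", hcaes(x = date, y = value, group = continent),
    id = c("asia", "europe"), # one id per continent series
    yAxis = 1, stacking = "percent",
    borderWidth = 0, groupPadding = 0, pointPadding = 0
  ) %>%
  hc_add_series(
    dgrp, type = "streamgraph", hcaes(x = date, y = value, group = grp),
    linkedTo = c("asia", "asia", "europe", "europe"), # parent id per grp series
    yAxis = 0
  )
```

Because linked series have no legend item of their own, the legend only lists the continents, and toggling one hides its column series together with every linked streamgraph series.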
The tale of two charts combined – Data, Code and Visualization