Replicating NYT Weather App

So much time since my last post so I want to post something, no matter what it is, but I hope this will be somehow helpful.
data-visualization
highcharts
Author

Joshua Kunst Fuentes

Published

December 2, 2016

So much time since my last post so I want to post something, no matter what it is, but I hope this will be somehow helpfull

In this post I will show some new features for the next version of highcharter package. The main feature added is hc_add_series now is a generic function! This mean you can add the data argument can be numeric, data frame, time series (ts, xts, ohlc) amonth others so the syntaxis will be a little cleaner.

What we’ll do here? We’ll make an interactive version of the well-well-know-and-a-little-repeated Tufte weather chart.

There are good ggplot versions if you can start https://rpubs.com/tyshynsk/133318 and https://rpubs.com/bradleyboehmke/weather_graphic.

But our focus will be replicate the New York Time App: How Much Warmer Was Your City in 2015? where you can choose among over 3K cities!. So let’s start. So we need a interactive charting library and shiny.

Data

If you search/explore in the devTools in the previous link you can know where is the path of the used data. So to be clear:

All the data used in this post is from http://www.nytimes.com – Me.

We’ll load the tidyverse, download the data, and create an auxiliar variable dt to store the date time in numeric format.

Code
library(tidyverse)
library(highcharter)
library(lubridate)

url_base <- "http://graphics8.nytimes.com/newsgraphics/2016/01/01/weather/assets"
file <- "new-york_ny.csv" # "san-francisco_ca.csv"
url_file <- file.path(url_base, file)

data <- read_csv(url_file)
data <- mutate(data, dt = datetime_to_timestamp(date))

data
# A tibble: 365 × 20
   date       month temp_max temp_min temp_rec…¹ temp_…² temp_…³ temp_…⁴ temp_…⁵
   <date>     <dbl>    <dbl>    <dbl>      <dbl>   <dbl>   <dbl>   <dbl> <chr>  
 1 2015-01-01     1       39       27         62      -4      39      28 NULL   
 2 2015-01-02     1       42       35         68       2      39      28 NULL   
 3 2015-01-03     1       42       33         64      -4      39      28 NULL   
 4 2015-01-04     1       56       41         66      -3      39      27 NULL   
 5 2015-01-05     1       49       21         64      -4      38      27 NULL   
 6 2015-01-06     1       22       19         72      -2      38      27 NULL   
 7 2015-01-07     1       23        9         64       4      38      27 NULL   
 8 2015-01-08     1       21        8         65       2      38      27 NULL   
 9 2015-01-09     1       33       19         64      -1      38      27 NULL   
10 2015-01-10     1       23       16         60      -6      38      27 NULL   
# … with 355 more rows, 11 more variables: temp_rec_low <chr>,
#   precip_value <dbl>, precip_actual <dbl>, precip_normal <dbl>,
#   precip_rec <chr>, snow_rec <chr>, annual_average_temperature <dbl>,
#   departure_from_normal <dbl>, total_precipitation <dbl>,
#   precipitation_departure_from_normal <dbl>, dt <dbl>, and abbreviated
#   variable names ¹​temp_rec_max, ²​temp_rec_min, ³​temp_avg_max, ⁴​temp_avg_min,
#   ⁵​temp_rec_high

Setup

Due the data is ready we’ll start to create the chart (a highchart object):

Code
hc <- highchart() %>%
  hc_xAxis(
    type = "datetime",
    showLastLabel = FALSE,
    dateTimeLabelFormats = list(month = "%B")
  ) %>%
  hc_tooltip(
    shared = TRUE,
    useHTML = TRUE,
    headerFormat = as.character(tags$small("{point.x: %b %d}", tags$br()))
  ) %>%
  hc_plotOptions(series = list(borderWidth = 0, pointWidth = 4))

hc

Nothing. All acording to the plan XD.

Temperatures

We’ll select the temperature columns from the data and do some wrangling, gather, spread, separate and recodes to get a nice tidy data frame.

Code
dtempgather <- data %>%
  select(dt, starts_with("temp")) %>%
  select(-temp_rec_high,-temp_rec_low) %>%
  rename(temp_actual_max = temp_max,
         temp_actual_min = temp_min) %>%
  gather(key, value,-dt) %>%
  mutate(key = str_replace(key, "temp_", ""))

dtempspread <- dtempgather %>%
  separate(key, c("serie", "type"), sep = "_") %>%
  spread(type, value)

temps <- dtempspread %>%
  mutate(
    serie = factor(serie, levels = c("rec", "avg", "actual")),
    serie = fct_recode(
      serie,
      Record = "rec",
      Normal = "avg",
      Observed = "actual"
    )
  )

temps
# A tibble: 1,095 × 4
              dt serie      max   min
           <dbl> <fct>    <dbl> <dbl>
 1 1420070400000 Observed    39    27
 2 1420070400000 Normal      39    28
 3 1420070400000 Record      62    -4
 4 1420156800000 Observed    42    35
 5 1420156800000 Normal      39    28
 6 1420156800000 Record      68     2
 7 1420243200000 Observed    42    33
 8 1420243200000 Normal      39    28
 9 1420243200000 Record      64    -4
10 1420329600000 Observed    56    41
# … with 1,085 more rows

Now whe can add this data to the highchart object using hc_add_series:

Code
hc <- hc %>%
  hc_add_series(
    temps,
    type = "columnrange",
    hcaes(dt, low = min, high = max, group = serie),
    color = c("#ECEBE3", "#C8B8B9", "#A90048")
  )

hc

A really similar chart of what we want!

The original chart show records of temprerature. So we need to filter the days with temperature records using the columns temp_rec_high and temp_rec_low, then some gathers and tweaks. Then set some options to show the points, like use fill color and some longer radius.

Code
records <- data %>%
  select(dt, temp_rec_high, temp_rec_low) %>%
  filter(temp_rec_high != "NULL" | temp_rec_low != "NULL") %>%
  mutate_if(is.character, str_extract, "\\d+") %>%
  mutate_if(is.character, as.numeric) %>%
  gather(type, value,-dt) %>%
  filter(!is.na(value)) %>%
  mutate(type = str_replace(type, "temp_rec_", ""),
         type = paste("This year record", type))

pointsyles <- list(
  symbol = "circle",
  lineWidth = 1,
  radius = 4,
  fillColor = "#FFFFFF",
  lineColor = NULL
)

records
# A tibble: 9 × 3
             dt type                  value
          <dbl> <chr>                 <dbl>
1 1439769600000 This year record high    95
2 1441670400000 This year record high    97
3 1446768000000 This year record high    74
4 1449964800000 This year record high    67
5 1450051200000 This year record high    67
6 1450137600000 This year record high    68
7 1450915200000 This year record high    72
8 1451001600000 This year record high    66
9 1424390400000 This year record low      2
Code
hc <- hc %>%
  hc_add_series(records, "point", hcaes(x = dt, y = value, group = type),
                marker = pointsyles)

hc

We’re good.

Precipitation

A nice feture of the NYTs app is and the chart is show the precipitaion by month. This data is in other axis. So we need to create a list with 2 axis using the create_yaxis helper and the adding this axis to the chart.

Code
axis <- create_yaxis(
  naxis = 2,
  heights = c(3, 1),
  sep = 0.05,
  turnopposite = FALSE,
  showLastLabel = FALSE,
  startOnTick = FALSE
)

Manually add titles (I know this can be more elegant) and options.

Code
axis[[1]]$title <- list(text = "Temperature")
axis[[1]]$labels <- list(format = "{value}ºF")

axis[[2]]$title <- list(text = "Precipitation")
axis[[2]]$min <- 0

hc <- hc_yAxis_multiples(hc, axis)

hc

The 2 axis are ready, now we need add the data. We will add 12 series -one for each month- but we want to asociate 1 legend for all these 12 series, so we need to use id and linkedTo parameters and obviously. That’s why the id will be a 'p' for the firt element and then NA to the other 11. And then linked this 11 to the first series (id = 'p').

Code
precip <- select(data, dt, precip_value, month)

hc <- hc %>%
  hc_add_series(
    precip,
    type = "area",
    hcaes(dt, precip_value, group = month),
    name = "Precipitation",
    color = "#008ED0",
    lineWidth = 1,
    yAxis = 1,
    fillColor = "#EBEAE2",
    id = c("p", rep(NA, 11)),
    linkedTo = c(NA, rep("p", 11))
  )

The same way we’ll add the normal precipitations by month.

Code
precipnormal <- data %>%
  select(dt, precip_normal, month) %>%
  group_by(month) %>%
  filter(row_number() %in% c(1, n())) %>%
  ungroup() %>%
  fill(precip_normal)

hc <- hc %>%
  hc_add_series(
    precipnormal,
    "line",
    hcaes(x = dt, y = precip_normal, group = month),
    name = "Normal Precipitation",
    color = "#008ED0",
    yAxis = 1,
    id = c("np", rep(NA, 11)),
    linkedTo = c(NA, rep("np", 11)),
    lineWidth = 1
  )

Final Result

Curious how the chart looks? Me too! Nah, I saw the chart before this post.

Code
hc

Shiny App

With R you can create a press style chart with some wrangling and charting. Now with a little of love we can make the code resuable to make a shiny app.

Someone put the grid lines for the 2 axis as the original NYT app please to these charts! I will grateful if someone code that details.

See you :B!

Reuse

Citation

BibTeX citation:
@online{kunstfuentes2016,
  author = {Joshua Kunst Fuentes},
  title = {Replicating {NYT} {Weather} {App}},
  date = {2016-12-02},
  url = {https://jkunst.com/blog/posts/2016-12-02-replicating-nyt-weather-app},
  langid = {en}
}
For attribution, please cite this work as:
Joshua Kunst Fuentes. 2016. “Replicating NYT Weather App.” December 2, 2016. https://jkunst.com/blog/posts/2016-12-02-replicating-nyt-weather-app.