Milos Popovic

Map rivers with sf and ggplot2 in R

March 27, 2022 | Milos Popovic

Last week on March 22 we had the World Water Day, which focuses on the importance of freshwater and brings attention to the 2.2 billion people living without access to safe water. This year the focus was groundwater, a valuable source that feeds our springs, rivers, lakes and wetlands.

I celebrated the World Water Day with a map of European rivers, which highlights the most important streams on the old continent.

In this tutorial, we will dive into the nuts and bolts of reconstructing this map. We will use the Global River Classification (GloRiC). GloRiC applies supervised classification to global stream network from World Wildlife Fund’s HydroSHEDS to create over river reach types at the global level. The dataset includes over 35 million kilometers of rivers and streams classified into more than 8 individual river reaches. Impressive 😲!

Inspired by this cool blog post, I decided to write a short and sweet tutorial on making a crisp river map with different widths based on a long-term average discharge. Let’s get started!

Get river data

Let’s import a few libraries, shall we? This time we’ll use only 3 libraries: httr to retrieve the data via GET function; tidyverse and sf for spatial analysis and data wrangling.

windowsFonts(georg = windowsFont('Georgia'))

libs <- c("httr", "tidyverse", "sf")

installed_libs <- libs %in% rownames(installed.packages())
if (any(installed_libs == F)) {
  install.packages(libs[!installed_libs])
}

invisible(lapply(libs, library, character.only = T))

As a warm up, we write and send a request to the GloRiC database via this link. The url leads to the Sync.com depository where the shapefile is located in a zipped folder. Below, we download the zipped folder under the name “eu_rivers.zip”, set the progress bar (feel free to omit the latter if you dislike verbose outputs), and unzip the folder. Finally, we list all the shapefiles that include the name of the downloaded file.

# 1. GET RIVERS DATA
#---------

get_data <- function(url, res, filenames, riv_list) {

  url <- "https://m201.syncusercontent1.com/mfs-60:cb0f370cfc9aad110e47af0c14c6d6a0=============================/p/HydroRIVERS_v10_eu_shp.zip?allowdd=0&datakey=ge8BrffNb5ff90cF/3Dr8CGlfSRTS2g+I3poOJEOpd65Nwv0agBsI6yzqEFKLIDb/9iORBIKcrW+jONfWXU+Dm2WNnWn2KcP3R2WQMtiHUpSXuIg9e3gkGUd+nMSCIDRVvkSLQfFifZAeYjNKnNMSWPKn5mJIC/0RZ3kWGVVfDgZRYHLVgMMtxn+47XRsMbvz2WOE8R8J4n0RB7xVI5tlpmFR8Bv3nZB0HYc43cdyYSW5DqyKGji5CkoYTBNzE/jow5oBri2TOZWNxWVV9ET/9/udwd8KuOH++GoRzpgVJZ5WH1LgW7IBaHnDPTHdYENxRe/4JNlXanhauLfqOMNMQ&engine=ln-1.11.18&errurl=GLK4RA2KpWnRCSNSKOrDC8T6XxhjDIuUGGVvUo/zJAGGma48px5C67R8o6en+Qmegmceb/EJkpCEYm4SRBZDNMs+CgNFjFgEYdN4byFeygcAnvSAclpxktxsJc18rczzZtNuL4oyVr7SxjPeDdyHsveNgnibmLp1UWA45i5K12NRlbfK55BDYGOrwLu2RRJUm4AWUNeYpo4f8yqZ8zxQGUuLcKGx68bKP5I7ri8Lf8nkGsxRvT/axtcNmUiaZ706TofwP6mkzzaDZs+xin46q3P//GOuLornz7vN+q/d89qYsgpm5KsqyqEV+DePj2K6/vMDT4BLYWtxLeY1ESZgyg==&header1=Q29udGVudC1UeXBlOiBhcHBsaWNhdGlvbi96aXA&header2=Q29udGVudC1EaXNwb3NpdGlvbjogYXR0YWNobWVudDsgZmlsZW5hbWU9Ikh5ZHJvUklWRVJTX3YxMF9ldV9zaHAuemlwIjtmaWxlbmFtZSo9VVRGLTgnJ0h5ZHJvUklWRVJTX3YxMF9ldV9zaHAuemlwOw&ipaddress=1398165091&linkcachekey=89159c590&linkoid=51000013&mode=101&sharelink_id=3443504720013&timestamp=1649430246943&uagent=0c3c442ba67c66fa3227e09aeb8c763c79676604&signature=1cdeb688c90dc709eb9e652f6ccaf3beaa64197c&cachekey=60:cb0f370cfc9aad110e47af0c14c6d6a0============================="

  res <- GET(url,
             write_disk("eu_rivers.zip"),
             progress())
  unzip("eu_rivers.zip")
  filenames <- list.files("HydroRIVERS_v10_eu_shp", pattern="*.shp", full.names=T)

  riv_list <- lapply(filenames, st_read)

  return(riv_list)
}

We read the Europe rivers shapefile into R by calling the previous function to get the list of files for import. Then we apply st_read to it and retrieve a list object. Since we want sf object, it’s sufficient to take the first component of the list and our wishes will be fulfilled.

get_rivers <- function(filenames, list_riv, eu_riv) {

  filenames <- get_data()
  list_riv <- lapply(filenames, st_read)
  eu_riv <- list_riv[[1]] %>% 
  st_cast("MULTILINESTRING")

  return(eu_riv)
}

In the next step, we make sure that our lines are regarded not as single LINESTRING objects but rather as interconnected MULTILINESTRING object. Here is how our river object looks in tabular format.

Simple feature collection with 938544 features and 14 fields
Geometry type: MULTILINESTRING
Dimension:     XY
Bounding box:  xmin: -24.47292 ymin: 12.60208 xmax: 69.51667 ymax: 81.78958
Geodetic CRS:  WGS 84
First 10 features:
   HYRIV_ID NEXT_DOWN MAIN_RIV LENGTH_KM DIST_DN_KM DIST_UP_KM CATCH_SKM
1  20000001         0 20000001      0.69        0.0        5.6     13.23
2  20000002         0 20000002      4.09        0.0        7.7     30.58
3  20000003         0 20000003      5.09        0.0        9.2     20.30
4  20000004         0 20000004      3.95        0.0        9.8      8.42
5  20000005         0 20000005      1.77        0.0       12.0      6.01
   UPLAND_SKM ENDORHEIC DIS_AV_CMS ORD_STRA ORD_CLAS ORD_FLOW  HYBAS_L12
1        13.2         0      0.175        1        1        7 2120062410
2        30.6         0      0.401        1        1        7 2120062410
3        20.3         0      0.262        1        1        7 2120062380
4         8.4         0      0.107        1        1        7 2120062380
5        33.7         0      0.420        2        1        7 2120062370
                         geometry
1  MULTILINESTRING ((59.2625 8...
2  MULTILINESTRING ((58.24167 ...
3  MULTILINESTRING ((57.50833 ...
4  MULTILINESTRING ((57.07917 ...
5  MULTILINESTRING ((56.55208 ...

Create river widths

European rivers and catchments are an intricate network of millions of lines. It would be worthwhile to distinguish among the most prominent rivers. We could assign different widths based on a size class from the GloRiC database.

Luckily, the creators of the database have already classified the rivers into ordered categories. One of them is ORD_FLOW, a logarithmic size classes of rivers based on their long-term average discharge. There are 8 such classes (3-10) organized in a descending order. So we just need to assign different widths based on these classes. We do this below using mutate to create width and case_when to assign widths. The latter is well-known among the users of SQL. In this context, it assigns a value to our width column based if the condition is met.

# 2. CREATE RIVER WIDTHS
#---------

get_river_widths <- function(eu_riv) {

  eu_riv <- get_rivers() %>% 
  mutate(width = as.numeric(ORD_FLOW),
         width = case_when(width == 3 ~ 1,
                           width == 4 ~ 0.8,
                           width == 5 ~ 0.6,
                           width == 6 ~ 0.4,
                           width == 7 ~ 0.2,
                           width == 8 ~ 0.2,
                           width == 9 ~ 0.1,
                           width == 10 ~ 0.1,
                           TRUE ~ 0)) %>%
  st_as_sf()

  eu_riv$geometry <- eu_riv$geometry %>%
    s2::s2_rebuild() %>%
    sf::st_as_sfc()

  return(eu_riv)
}

Newer versions of sf don’t use the flat Earth model. Instead the package uses the spherical geometry operators from s2 library. In our case, this would break the code because some river lines have invalid spherical geometry.

A quick fix is to turn off this feature via sf::sf_use_s2(FALSE). Ideally, we want to fix the features with invalid spherical geometry, so that s2 can process it. That’s why we apply s2::s2_rebuild() in the chunk above.

Bounding box

Just a few more steps before we do magic with ggplot2. Our object includes both Europe and the Middle East so we want to make sure that we capture mostly Europe. We do so by making a bounding box. Let’s define the parameters for our bounding box with WGS84 coordinates. In this tutorial, we’ll be using the World Equidistant Cylindrical projection to flatten our map. So we first define this projection and then transform the coordinates.

# 3. MAKE BOUNDING BOX
#---------

get_bounding_box <- function(crsLONGLAT, bbox, new_prj, bb) {

  crsLONGLAT <- "+proj=longlat +datum=WGS84 +no_defs"

  bbox <- st_sfc(
  st_polygon(list(cbind(
    c(-10.5, 48.5, 48.5, -10.5, -10.5),
    c(35.000, 35.000, 69.5, 69.5, 35.000)
    ))),
  crs = crsLONGLAT)

  new_prj <- st_transform(bbox, crs = 4087)
  bb <- st_bbox(new_prj)

  return(bb)
}

Mapping European rivers

Alright, folks, we’re ready to map European rivers. We first plot the river lines and assign specific colors based on the classes and width based on the defines widths.

Since we aim to narrow our view to Europe, we use coord_sf to set our latitude and longitude limits based on the pre-defined bounding box.

We will use shades of blue to paint our river classes. Also, we will define the size limits as a range of numeric values from 0 to 0.3. I encourage you to play with this range and see what you get. Finally, we use a sequence of alpha values to make larger rivers stand out on our map.

# 4. MAP
#---------

get_river_map <- function(eu_riv, bbox, p) {

  eu_riv <- get_rivers()
  bbox <- get_bounding_box()

  p <- 
    ggplot() +
    geom_sf(data=eu_riv, aes(color=factor(ORD_FLOW), size=width)) +
    coord_sf(crs = 4087,
      xlim = c(bbox["xmin"], bbox["xmax"]), 
      ylim = c(bbox["ymin"], bbox["ymax"])) +
    labs(y="", subtitle="",
         x = "",
         title="Rivers of Europe",
         caption="©2022 Milos Popovic https://milospopovic.net\nSource: ©World Wildlife Fund, Inc. (2006-2013)\n HydroSHEDS database http://www.hydrosheds.org") +
    scale_color_manual(
        name = "",
        values = c('#08306b', '#08519c', '#2171b5', '#4292c6', '#6baed6', '#9ecae1', '#c6dbef', '#deebf7')) +
    scale_size(range=c(0, .3)) +
    scale_alpha_manual(values=c("3" = 1, "4" = 1, "5" = .7, "6" = .6, "7" = .4, "8" = .3, "9" = .2, "10" = .1)) +
    theme_minimal() +
    theme(text = element_text(family = "georg"),
      panel.background = element_blank(), 
      legend.background = element_blank(),
      legend.position = "none",
      panel.border = element_blank(),
      panel.grid.minor = element_blank(),
      panel.grid.major = element_blank(),
      plot.title = element_text(size=40, color="#2171b5", hjust=0.5, vjust=0),
      plot.subtitle = element_text(size=14, color="#ac63a0", hjust=0.5, vjust=0),
      plot.caption = element_text(size=10, color="grey60", hjust=0.5, vjust=10),
      axis.title.x = element_text(size=10, color="grey20", hjust=0.5, vjust=-6),
      legend.text = element_text(size=9, color="grey20"),
      legend.title = element_text(size=10, color="grey20"),
      strip.text = element_text(size=12),
      plot.margin = unit(c(t=1, r=-2, b=-1, l=-2),"lines"),
      axis.title.y = element_blank(),
      axis.ticks = element_blank(),
      axis.text.x = element_blank(),
      axis.text.y = element_blank())

  return(p)
}


p1 <- get_river_map()
ggsave(filename="european_rivers_new.png", width=7, height=8.5, dpi = 600, device='png', p1)

Aaaaand, voila! 🤩

photo1

Alright, that’s all ladies and gents! In this tutorial, you learned how to import river spatial files and make a cool river map of Europe in R. Feel free to check the full code here, clone the repo and reproduce, reuse and modify the code as you see fit.

This tutorial opens the doors for you to map other river networks from the GloRiC database. In fact, you can tweak my code a bit and make river map of Africa, Americas or Asia. For example, here is the link to the Africa shapefile. Try it out and let me know how it goes!

I’d be happy to hear your view on how this map could be improved or extended to other geographic realms. To do so, please follow me on Twitter, Instagram or Facebook! Also, feel free to support my work by buying me a coffee here!