
Introduction

In this practical, we will replicate a koala population sustainability study conducted in the Lower Hunter region of NSW for the Department of Sustainability, Environment, Water, Population and Communities by the consultancy Eco Logical.

This study serves as an excellent example of applying geospatial analysis to discrete entities. It involves intersecting multiple thematic data layers, each assigned weights based on attribute values, feature geometry, or proximity to other layers. To ensure the practical is manageable within the three-hour timeframe, some layers have been pre-prepared. Additionally, the study area has been reduced to three of the five Lower Hunter LGAs analysed in the original Eco Logical study.

The goal of this practical is to prepare the remaining data layers and conduct the final overlay operation to generate a koala habitat value index.

Further details can be found in the Eco Logical Final Report, which is included in this practical’s data package and available for download from the Australian Government website.

R packages

The practical will primarily utilise R functions from packages explored in previous sessions:

  1. The tidyverse: is a collection of R packages designed for data science, providing a cohesive set of tools for data manipulation, visualization, and analysis. It includes core packages such as ggplot2 for visualization, dplyr for data manipulation, tidyr for data tidying, and readr for data import, all following a consistent and intuitive syntax.

  2. The sf package: provides a standardised framework for handling spatial vector data in R based on the simple features standard (OGC). It enables users to easily work with points, lines, and polygons by integrating robust geospatial libraries such as GDAL, GEOS, and PROJ.

  3. The terra package: is designed for efficient spatial data analysis, with a particular emphasis on raster data processing and vector data support. As a modern successor to the raster package, it utilises improved performance through C++ implementations to manage large datasets and offers a suite of functions for data import, manipulation, and analysis.

  4. The tmap package: provides a flexible and powerful system for creating thematic maps in R, drawing on a layered grammar of graphics approach similar to that of ggplot2. It supports both static and interactive map visualizations, making it easy to tailor maps to various presentation and analysis needs.

Resources

Study methodology

The methodology behind the koala habitat study is described in detail in the Eco Logical Final Report. The following section reproduces information from Table 3 of the report, describing the various data layers used in Eco Logical's modelling.

Data layers

The data layers that are provided as part of this practical were obtained from Eco Logical and represent the layers used in their Final Report. The data layers derived as part of the practical use raw data that differ from the data used by Eco Logical in their study (e.g., roads, rail corridors, and built areas are taken from a different source), so the calculated final layers may differ from those produced by Eco Logical.

All data layers were cropped to the extent of the three Lower Hunter LGAs considered here: Lake Macquarie, Newcastle, and Port Stephens.

Data layers provided

1. Primary feed trees

  • Description:
    Selection of vegetation types known to have primary preferred koala tree species, with the proportional abundance of the specific tree species within the vegetation type identified as nil, low, moderate, or high.

  • Details:
    An initial multiplier of \(\times 3\) was applied to these data to distinguish them from secondary feed and supplementary tree species.

  • Values:
    0: Outside of any vegetation classification containing identified primary feed trees.
    60: Vegetation types with a low proportion of primary feed trees.
    150: Vegetation types with a moderate proportion of primary feed trees.
    300: Vegetation types with a high proportion of primary feed trees.

  • Weighting: \(\times 3\)


2. Secondary feed trees

  • Description:
    Selection of vegetation types known to have secondary preferred koala tree species, with the proportional abundance of the specific tree species within the vegetation type identified as nil, low, moderate, or high.

  • Details:
    An initial multiplier of \(\times 2\) was applied to these data to distinguish them from primary feed and supplementary tree species.

  • Values:
    0: Outside of any vegetation classification containing identified secondary feed trees.
    40: Vegetation types with a low proportion of secondary feed trees.
    100: Vegetation types with a moderate proportion of secondary feed trees.
    200: Vegetation types with a high proportion of secondary feed trees.

  • Weighting: \(\times 2\)


3. Supplementary trees

  • Description:
    Selection of vegetation types known to have supplementary tree species supporting shelter and roosting for koala, with the proportional abundance of the specific tree species within the vegetation type identified as nil, low, moderate, or high.

  • Values:
    0: Outside of any vegetation classification containing identified supplementary trees.
    20: Vegetation types with a low proportion of supplementary trees.
    50: Vegetation types with a moderate proportion of supplementary trees.
    100: Vegetation types with a high proportion of supplementary trees.

  • Weighting: \(\times 1\)


4. Vegetation proximity to water

  • Description:
    Extant vegetation within 50 m of 3rd order or higher drainage lines.

  • Values:
    0: Outside identified buffer.
    100: Within identified buffer.

  • Weighting: \(\times 2\)


5. Vegetation patch size

  • Description:
    The size of a patch of vegetation. A patch is defined as an area of consolidated vegetation separated from other patches by more than 100 m or by mapped infrastructure, and represents consolidated koala habitat.

  • Values:
    0: Smallest patch size.
    100: Patch at least 100 ha in size.

  • Weighting: \(\times 3\)


Data layers derived in practical

1. Soil fertility

  • Description:
    Identification of areas with high soil fertility.

  • Values:
    0: Outside of identified fertile soils.
    100: Within identified fertile soils.

  • Weighting: \(\times 2\)


2. Buffered koala sightings

  • Description:
    100 m buffer around known recorded sightings of koala.
    NB: This is not a comprehensive dataset; it is based only on areas where sightings have been recorded.

  • Values:
    0: Not within 100 m of a known record.
    75: Within 100 m of a known record dated 1985 or earlier.
    100: Within 100 m of a known record dated 1986 or later.

  • Weighting: \(\times 1\)


3. Distance to infrastructure

  • Description:
    300 m buffer around identified and mapped major barriers to koala movement.

  • Values:
    0: Within 300 m of identified infrastructure.
    100: 300 m or greater from identified infrastructure.

  • Weighting: \(\times3\)


Koala habitat value index

The final koala habitat suitability map is calculated by combining all the scores assigned to each data layer (see values above) with each layer further multiplied by a weight (see weighting above).

The calculation is as follows:

\[khv = [(veg_{prim} \times 3) + (veg_{sec} \times 2) + veg_{supp}] + (soil \times 2) + (veg_{water} \times 2) + koala + (urb \times 3) + (veg_{patch} \times 3)\]

where:

  • \(veg_{prim}\) : primary feed trees
  • \(veg_{sec}\) : secondary feed trees
  • \(veg_{supp}\) : supplementary trees
  • \(soil\) : soil fertility
  • \(veg_{water}\) : vegetation proximity to water
  • \(koala\) : buffered koala sightings
  • \(urb\) : distance to infrastructure
  • \(veg_{patch}\) : vegetation patch size

The final score is then normalised to provide a habitat value index from 0 to 100:

  • Lower koala habitat value (0 – 19) – Areas that have little or no identified mapped values for koala habitat within the landscape. The majority of these areas are highly disturbed, fragmented or urbanised.

  • Moderate koala habitat value (20 – 33) – Areas that have some mapped koala habitat values within the landscape. In most cases the values within this category will provide supporting habitat for koala in the area.

  • High koala habitat value (34 – 48) – Areas with priority koala habitat values. These areas will include a large proportion of the criteria for koala habitat and provide an important resource for koalas in the area.

  • Very High koala habitat value (49 – 100) – An accumulation of priority values for koala habitat. These areas contain most, or even all, of the values identified as criteria for priority koala habitat within the Lower Hunter area.
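
To make the scoring concrete, below is a small worked example of the khv calculation described above, using hypothetical cell values drawn from the value tables (the particular combination of scores is illustrative only):

# Hypothetical scores for a single grid cell, taken from the value tables above
veg_prim  <- 300   # high proportion of primary feed trees
veg_sec   <- 100   # moderate proportion of secondary feed trees
veg_supp  <- 0     # no supplementary trees
soil      <- 100   # within identified fertile soils
veg_water <- 100   # vegetation within 50 m of a 3rd order or higher drainage line
koala     <- 100   # within 100 m of a koala record dated 1986 or later
urb       <- 100   # 300 m or more from identified infrastructure
veg_patch <- 100   # patch at least 100 ha in size

# Apply the layer weights and sum the scores
khv <- (veg_prim * 3) + (veg_sec * 2) + veg_supp + (soil * 2) +
       (veg_water * 2) + koala + (urb * 3) + (veg_patch * 3)
khv
## [1] 2200

Once normalised against the maximum khv value encountered later in the practical (3000), this cell scores roughly 73 and therefore falls in the Very High class.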

Derive missing data layers

Load necessary R packages

All necessary packages should already be installed. However, if any are missing, you can install them using the install.packages() function in R:

# Main spatial packages
install.packages("sf")
install.packages("terra")

# tmap dev version
install.packages("tmap", repos = c("https://r-tmap.r-universe.dev",
                                   "https://cloud.r-project.org"))
# leaflet for dynamic maps
install.packages("leaflet"

# tidyverse for additional data wrangling
install.packages("tidyverse")

Load packages:

library(sf)
library(terra)
library(tmap)
library(leaflet)
library(tidyverse)

# Set tmap to static maps
tmap_mode("plot")

Data files provided

The following data files are provided:

Vector data:

  • built_areas: a polygon ESRI shapefile representing various ‘built’ land uses in the study area. Data was obtained from OpenStreetMap.

  • koala_sightings: a point ESRI shapefile representing koala sightings in the study area. Data was obtained from the BioNet Atlas of NSW Wildlife hosted by the NSW Office of Environment and Heritage.

  • railways : a line ESRI shapefile representing railway tracks in the study area. Data was obtained from OpenStreetMap.

  • roads : a line ESRI shapefile representing roads in the study area. Data was obtained from the Geoscience Australia GEODATA TOPO 2.5M 2003 data product.

  • soil_landscapes : a polygon ESRI shapefile representing soil landscapes. The data is based on digital maps from the NSW Office of Environment and Heritage. The following map sheets were combined together: Soil Landscapes of the Gosford-Lake Macquarie (1:100,000), Soil Landscapes of the Newcastle (1:100,000), and Soil Landscapes of the Port Stephens (1:100,000).

  • study_area_bnd : a polygon ESRI shapefile representing the study area, obtained by merging the administrative boundaries of the three LGAs considered here: Lake Macquarie, Newcastle, and Port Stephens.

Raster data:

  • allveg_patch_size: a geoTIFF raster representing vegetation patch size. Data obtained from Eco Logical.

  • allveg_prim, allveg_sec, and allveg_supp: geoTIFF rasters representing koala feed trees. Data obtained from Eco Logical.

  • dist_to_waters: a geoTIFF raster representing vegetation proximity to water. Data obtained from Eco Logical.

Layer 1: Soil fertility

Load the soil landscapes dataset (soil_landscapes) along with the polygon shapefile representing the study area boundary (study_area_bnd):

# Read in shapefiles
soil_landscapes <- read_sf("./data/data_koala/vector/soil_landscapes.shp")
hunter_bnd <- read_sf("./data/data_koala/vector/study_area_bnd.shp")

# Print the first rows of the soil landscapes layer
head(soil_landscapes)

The soil landscapes layer includes four attribute columns, the fourth (PROCESS) representing the geomorphic type of landscape on which the soils have formed.

Plot the soil_landscapes layer, colouring each polygon according to the value of the PROCESS column:

# Plot data for quick visual inspection
soils_map <- tm_shape(soil_landscapes) + 
        tm_fill("PROCESS") + 
        tm_layout(frame = FALSE, legend.text.size = 0.5, legend.title.size = 0.5)

# Plot study area bnd
soils_map + tm_shape(hunter_bnd) +  tm_borders("black") 

To clean up the data, we will first crop soil_landscapes to the extent of the study area (i.e., hunter_bnd). Next, we will isolate those soil landscapes that can sustain koala feed tree species: soils formed on Quaternary aeolian, alluvial, and colluvial deposits, residual and estuarine soils, and those formed in swampy areas.
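
Before intersecting the two layers, it is worth confirming that they share the same projected CRS, since st_intersection() and the metre-based buffer distances used later assume matching projected coordinates. A minimal check could look like this:

# Check that both layers use the same CRS
st_crs(soil_landscapes) == st_crs(hunter_bnd)

# Inspect the CRS details (expect a projected CRS with units in metres)
st_crs(hunter_bnd)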

To crop the soil_landscapes layer we will use the sf function st_intersection():

# Intersect the two polygon layers
soil_landscapes_bnd <- st_intersection(soil_landscapes, hunter_bnd)

# Plot data for quick visual inspection
tm_shape(soil_landscapes_bnd) + 
  tm_fill("PROCESS") + 
  tm_layout(frame = FALSE, legend.text.size = 0.6, legend.title.size = 0.6)

Next we extract the soil landscapes of interest. We use the dplyr function filter():

# Create list of PROCESS values of interest
soils <- c("AEOLIAN", "ALLUVIAL", "COLLUVIAL", "SWAMP", "RESIDUAL", "ESTUARINE")

# Filter data and extract polygons of interest
soil_landscapes_filter <- filter(soil_landscapes_bnd, PROCESS %in% soils)

# Plot study area bnd first as it has larger extent
soils_map2 <- tm_shape(hunter_bnd) +  tm_borders("gray")

# Add soils  
soils_map2 + tm_shape(soil_landscapes_filter) + 
      tm_fill("PROCESS") + 
      tm_layout(frame = FALSE, legend.text.size = 0.6, legend.title.size = 0.6)

The next chunk of code adds a new attribute column (called SCORE) and uses the dplyr function mutate() to assign values based on another column (in this case PROCESS). We assign the value 100 to each polygon where PROCESS equals “AEOLIAN”, “ALLUVIAL”, “COLLUVIAL”, “SWAMP”, “RESIDUAL”, or “ESTUARINE”.

# Add a new numeric field 'SCORE' based on conditions applied to 'PROCESS'
# Here, we assign:
#   100 if 'PROCESS' equals "AEOLIAN", "ALLUVIAL", "COLLUVIAL", ...
#   0 for all other cases
soil_landscapes_filter <- soil_landscapes_filter %>%
  mutate(SCORE = case_when(
    PROCESS == "AEOLIAN" ~ 100,
    PROCESS == "ALLUVIAL" ~ 100,
    PROCESS == "COLLUVIAL" ~ 100,
    PROCESS == "SWAMP" ~ 100,
    PROCESS == "RESIDUAL" ~ 100,
    PROCESS == "ESTUARINE" ~ 100,
    TRUE ~ 0  # default numeric value for unmatched cases
  ))

# Plot study area bnd first as it has larger extent
soils_map3 <- tm_shape(hunter_bnd) +  tm_borders("gray")

# Add soils  
soils_map3 +  tm_shape(soil_landscapes_filter) + 
        tm_fill("SCORE") + 
        tm_layout(frame = FALSE, legend.text.size = 0.6, legend.title.size = 0.6)

The last step will convert soil_landscapes_filter to a raster. This will be our final soil fertility layer:

# Convert the sf object to a terra SpatVector
soil_landscapes_terra <- vect(soil_landscapes_filter)

# Read in the target SpatRaster that defines the extent and resolution
# Here we read in one of the Eco Logical geoTIFFs provided
target_raster <- rast("./data/data_koala/raster/allveg_patch_size.tif")

# Rasterise the polygons using an attribute field
soil_landscapes_rast <- rasterize(soil_landscapes_terra, target_raster, field = "SCORE")

# Convert all NA (no data) values in the new raster to 0
soil_landscapes_rast[is.na(soil_landscapes_rast)] <- 0

# Plot data for quick visual inspection
soils_map4 <- tm_shape(soil_landscapes_rast) + 
       tm_raster(col.scale = tm_scale_categorical()) + 
       tm_layout(frame = FALSE, legend.text.size = 0.6, legend.title.size = 0.6)

# Plot study area bnd on top
soils_map4 + tm_shape(hunter_bnd) +  tm_borders("white") 
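
As an optional sanity check (not part of the original workflow), we can confirm that the rasterised layer lines up with the target raster and contains only the two expected scores:

# Check that the new raster matches the target raster's extent, dimensions, and CRS
# (returns TRUE, or throws an error if they differ)
compareGeom(soil_landscapes_rast, target_raster)

# Tabulate cell values; only the scores 0 and 100 should be present
freq(soil_landscapes_rast)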

Layer 2: Koala sightings

Load the koala sightings dataset (koala_sightings):

# Read in shapefile
koala_sightings <- read_sf("./data/data_koala/vector/koala_sightings.shp")

# Print the first rows of the koala sightings layer
head(koala_sightings)

The koala sightings layer contains 20 attribute fields, with the last one (DateRank) containing two values: ‘75’ for all sightings dated 1985 or earlier, and ‘100’ for all sightings dated 1986 or later. Older sightings are considered less reliable than more recent ones, hence the lower rank.
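
If you want to see how the sightings split between the two reliability classes, a quick tabulation (dropping the geometry column first) does the job:

# Count the number of sightings in each DateRank class
koala_sightings %>%
  st_drop_geometry() %>%
  count(DateRank)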

Plot the koala_sightings layer, colouring each point according to the value of the DateRank column:

# Plot study area bnd first as it has larger extent
koala_map <- tm_shape(hunter_bnd) +  tm_borders("black")

# Add koalas
koala_map + tm_shape(koala_sightings) + 
        tm_dots(size = 0.2, fill = "DateRank", fill.scale = tm_scale_categorical()) + 
        tm_layout(frame = FALSE, legend.text.size = 0.5, legend.title.size = 0.5)

In the next step, we drop all columns except DateRank and create a 100 m buffer around each point using st_buffer(). Next, we dissolve the layer using st_union() so that all features are grouped together based on their value for DateRank. Lastly, we rename DateRank to SCORE.

# Retain only the "DateRank" attribute column (geometry is preserved automatically)
koala_sightings <- koala_sightings %>% 
  select(DateRank)

# Create a 100m buffer around each point
koala_buffers <- st_buffer(koala_sightings, dist = 100)

# Dissolve (union) buffers by the DateRank field and rename field to SCORE
koala_buffers_dissolved <- koala_buffers %>%
  group_by(DateRank) %>%
  summarise(geometry = st_union(geometry), .groups = "drop") %>%
  rename(SCORE = DateRank)

# Plot study area bnd first as it has larger extent
koala_map2 <- tm_shape(hunter_bnd) +  tm_borders("black")

# Add koalas
koala_map2 + tm_shape(koala_buffers_dissolved) + 
        tm_fill(fill = "SCORE", fill.scale = tm_scale_categorical()) + 
        tm_layout(frame = FALSE, legend.text.size = 0.5, legend.title.size = 0.5)

We are now ready to convert the koala buffer layer (koala_buffers_dissolved) to a raster with scores:

# Convert the sf object to a terra SpatVector
koala_buffers_terra <- vect(koala_buffers_dissolved)

# Read in the target SpatRaster that defines the extent and resolution
# Here we read in one of the Eco Logical geoTIFFs provided
target_raster <- rast("./data/data_koala/raster/allveg_patch_size.tif")

# Rasterise the polygons using an attribute field
koala_buffers_rast <- rasterize(koala_buffers_terra, target_raster, field = "SCORE")

# Convert all NA (no data) values in the new raster to 0
koala_buffers_rast[is.na(koala_buffers_rast)] <- 0

# Plot data for quick visual inspection
koala_map3 <- tm_shape(koala_buffers_rast) + 
       tm_raster(col.scale = tm_scale_categorical()) + 
       tm_layout(frame = FALSE, legend.text.size = 0.6, legend.title.size = 0.6)

# Plot study area bnd on top
koala_map3 + tm_shape(hunter_bnd) +  tm_borders("white") 

Layer 3: Distance to infrastructure

For our final layer, we need to calculate a 300 m buffer around infrastructure that may represent a barrier for koala movement, and assign a score of 100 to all areas that are beyond this 300 m buffer. Here, we will consider the following types of infrastructure as barriers to koala movement: roads, railway corridors, and built-up areas, including those classified as developed land even if they lack actual buildings.

Load the relevant datasets, namely, built_areas, roads, and railways:

# Read in shapefiles
built_areas <- read_sf("./data/data_koala/vector/built_areas.shp")
roads <- read_sf("./data/data_koala/vector/roads.shp")
railways <- read_sf("./data/data_koala/vector/railways.shp")

# Plot bnd first as it has the larger extent
bnd_map <- tm_shape(hunter_bnd) +  tm_borders("black") + tm_layout(frame = FALSE)

# Plot the three layers on the same map

built_map <- bnd_map + 
      tm_shape(built_areas) + tm_fill("lightgray")
roads_map <- built_map + 
      tm_shape(roads) + tm_lines(col = "red", lwd = 1, lty = "solid")
roads_map + tm_shape(railways) + tm_lines(col = "black", lwd = 2, lty = "solid")

The roads layer contains all road types, including minor roads and tracks, and not all of these will constitute a barrier for koala movement. For this reason we need to clean up this layer before proceeding with the calculation of the buffers.

Print a list of the roads layer attribute table columns:

names(roads)
##  [1] "OBJECTID"   "FEATTYPE"   "NAME"       "CLASS"      "FORMATION" 
##  [6] "NRN"        "SRN"        "FEATREL"    "ATTRREL"    "PLANACC"   
## [11] "SOURCE"     "CREATED"    "RETIRED"    "PID"        "SYMBOL"    
## [16] "FEATWIDTH"  "TEXTNOTE"   "SHAPE_Leng" "geometry"

There is a column called CLASS, which contains the various road types. Print the unique entries in this column:

unique(roads$CLASS)
## [1] "Minor Road"       "Track"            "Secondary Road"   "Principal Road"  
## [5] "Dual Carriageway"

From the above, we can see that we have five road types, including “Minor Road” and “Track”. The next chunk of code drops all roads with these two classifications:

# Create list of CLASS values of interest
road_class <- c("Secondary Road", "Principal Road", "Dual Carriageway")

# Filter data and extract roads of interest
roads_filtered <- filter(roads, CLASS %in% road_class)

# Repeat the previous map to see roads remaining
# Plot bnd first as it has the larger extent
bnd_map <- tm_shape(hunter_bnd) +  tm_borders("black") + tm_layout(frame = FALSE)

# Plot the three layers on the same map

built_map <- bnd_map + 
  tm_shape(built_areas) + tm_fill("lightgray")
roads_map <- built_map + 
  tm_shape(roads_filtered) + tm_lines(col = "red", lwd = 1, lty = "solid")
roads_map + tm_shape(railways) + tm_lines(col = "black", lwd = 2, lty = "solid")

Next, drop all attribute fields, create a new SCORE field for each layer, and populate with the value zero:

# Create new sf objects that retain only the geometry column
# This effectively drops all other attribute fields
built_areas_clean <- st_sf(geometry = st_geometry(built_areas))
roads_clean <- st_sf(geometry = st_geometry(roads_filtered))
railways_clean <- st_sf(geometry = st_geometry(railways))

# Add a new attribute field called SCORE and populate it with 0 for each feature
built_areas_clean$SCORE <- 0
roads_clean$SCORE <- 0
railways_clean$SCORE <- 0

# Optionally, reorder the columns so that SCORE comes before the geometry column
built_areas_clean <- built_areas_clean[, c("SCORE", "geometry")]
roads_clean <- roads_clean[, c("SCORE", "geometry")]
railways_clean <- railways_clean[, c("SCORE", "geometry")]

Create the 300 m buffer and dissolve features:

# Create a 300m buffer around each feature
built_buffers <- st_buffer(built_areas_clean, dist = 300)
roads_buffers <- st_buffer(roads_clean, dist = 300)
railways_buffers <- st_buffer(railways_clean, dist = 300)

# Dissolve (union) buffers by the SCORE field
built_buffers_dissolved <- built_buffers %>%
  group_by(SCORE) %>%
  summarise(geometry = st_union(geometry))
roads_buffers_dissolved <- roads_buffers %>%
  group_by(SCORE) %>%
  summarise(geometry = st_union(geometry))
railways_buffers_dissolved <- railways_buffers %>%
  group_by(SCORE) %>%
  summarise(geometry = st_union(geometry))

# Join the three layers together and dissolve again
# Combine the sf objects using rbind 
# (works if they have identical column names)
infrastructure <- 
    rbind(built_buffers_dissolved,
          roads_buffers_dissolved,
          railways_buffers_dissolved
          )

# Dissolve (union)
infrastructure_dissolved <- st_union(infrastructure)

# Convert the unioned geometry back into an sf object
infrastructure_sf <- st_sf(geometry = infrastructure_dissolved)

# Add SCORE field and sort attribute columns
infrastructure_sf$SCORE <- 0
infrastructure_sf <- infrastructure_sf[, c("SCORE", "geometry")]

# Plot bnd
bnd_map <- tm_shape(hunter_bnd) + 
            tm_borders("black") + 
            tm_layout(frame = FALSE)

# Add infrastructure
bnd_map + tm_shape(infrastructure_sf) + 
            tm_fill("gray") + 
            tm_layout(frame = FALSE)

We are now ready to convert the infrastructure buffer layer (infrastructure_sf) to a raster with scores:

# Convert the sf object to a terra SpatVector
infrastructure_terra <- vect(infrastructure_sf)

# Read in the target SpatRaster that defines the extent and resolution
# Here we read in one of the Eco Logical GeoTIFFs provided
target_raster <- rast("./data/data_koala/raster/allveg_patch_size.tif")

# Rasterise the polygons using an attribute field
infrastructure_rast <- rasterize(infrastructure_terra, target_raster, field = "SCORE")

# Convert all NA (no data) values in the new raster to 300
infrastructure_rast[is.na(infrastructure_rast)] <- 300

# Plot data for quick visual inspection
infra_map <- tm_shape(infrastructure_rast) + 
       tm_raster(col.scale = tm_scale_categorical()) + 
       tm_layout(frame = FALSE, legend.text.size = 0.6, legend.title.size = 0.6)

# Plot study area bnd on top
infra_map + tm_shape(hunter_bnd) +  tm_borders("white") 

We have now calculated the three missing final layers:

  • \(soil\) : soil fertility. This is layer soil_landscapes_rast
  • \(koala\) : buffered koala sightings. This is layer koala_buffers_rast
  • \(urb\) : distance to infrastructure. This is layer infrastructure_rast
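
If you would like to keep these derived layers, for example to avoid re-running the vector processing later, they can be written to disk with terra's writeRaster(). The output file names below are suggestions only:

# Save the three derived rasters to disk (file names are suggestions only)
writeRaster(soil_landscapes_rast, "./data/data_koala/raster/soil_fertility.tif", overwrite = TRUE)
writeRaster(koala_buffers_rast, "./data/data_koala/raster/koala_sightings_buffered.tif", overwrite = TRUE)
writeRaster(infrastructure_rast, "./data/data_koala/raster/dist_to_infrastructure.tif", overwrite = TRUE)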

Calculate koala habitat value

To repeat from above, the final koala habitat suitability map is calculated by combining all the scores assigned to each data layer, with each layer further multiplied by a weight:

The calculation is as follows:

\[khv = [(veg_{prim} \times 3) + (veg_{sec} \times 2) + veg_{supp}] + (soil \times 2) + (veg_{water} \times 2) + koala + (urb \times 3) + (veg_{patch} \times 3)\]

where:

  • \(veg_{prim}\) : primary feed trees
  • \(veg_{sec}\) : secondary feed trees
  • \(veg_{supp}\) : supplementary trees
  • \(soil\) : soil fertility
  • \(veg_{water}\) : vegetation proximity to water
  • \(koala\) : buffered koala sightings
  • \(urb\) : distance to infrastructure
  • \(veg_{patch}\) : vegetation patch size

Calculating \(khv\) using map algebra is straightforward:

# Read in Eco Logical rasters
veg_prim <- rast("./data/data_koala/raster/allveg_prim.tif")
veg_sec <- rast("./data/data_koala/raster/allveg_sec.tif")
veg_supp <- rast("./data/data_koala/raster/allveg_supp.tif")
veg_water <- rast("./data/data_koala/raster/dist_to_water.tif")
veg_patch <- rast("./data/data_koala/raster/allveg_patch_size.tif")

# Map algebra
koala_habitat <- (3 * veg_prim) + (2 * veg_sec) + veg_supp + 
                 (2 * soil_landscapes_rast) + 
                 (2 * veg_water) + 
                 koala_buffers_rast + 
                 (3 * infrastructure_rast) + 
                 (3 * veg_patch)

# Change the name of the layer to "khv"
names(koala_habitat) <- "khv"

# Plot bnd
khv_map <- tm_shape(hunter_bnd) +  tm_borders("black") 

# Add khv
khv_map + tm_shape(koala_habitat) + 
    tm_raster("khv", 
          col.scale = tm_scale_continuous_sqrt(
            values = "viridis", 
              ), 
          col.legend = tm_legend(
            title = "Habitat \nSuitability \nIndex",
            position = tm_pos_out(cell.h = "right", cell.v = "center"),
            orientation = "portrait"
              )
    ) +
    tm_layout(frame = FALSE, 
              legend.text.size = 0.6, 
              legend.title.size = 0.6) 

To calculate the normalised \(khv\), first we check the maximum value:

print(koala_habitat)
## class       : SpatRaster 
## dimensions  : 3793, 4165, 1  (nrow, ncol, nlyr)
## resolution  : 25, 25  (x, y)
## extent      : 332811.5, 436936.5, 6312635, 6407460  (xmin, xmax, ymin, ymax)
## coord. ref. : GDA_1994_Transverse_Mercator 
## source(s)   : memory
## varname     : allveg_prim 
## name        :  khv 
## min value   :    0 
## max value   : 3000

The maximum \(khv\) value is 3000, and so to normalise the \(khv\) values, we divide all grid cells of the koala_habitat raster by 3000 and then multiply by 100:

koala_habitat_norm <- (koala_habitat / 3000) * 100
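
As a quick check that the normalisation behaved as expected, the minimum and maximum of the new raster should now be 0 and 100:

# Check the value range of the normalised raster
minmax(koala_habitat_norm)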

We are now ready to make our final map:

# Draw bnd
khvnorm_map <- tm_shape(hunter_bnd) +  tm_borders("black") 

# Add normalised khv  
khvnorm_map + tm_shape(koala_habitat_norm) + 
       tm_raster("khv",
          col.scale = tm_scale_intervals(                               # Custom breaks
            values = c("#584053", "#8dc6bf", "#fcbc66", "#f97b4f"),     # Define colours
            breaks = c(0, 20, 34, 49, 100),           
            labels = c("low", "moderate", "high", "very high")          # Legend labels
            ), 
          col.legend = tm_legend(
            title = "HSV Class",
            position = tm_pos_out(cell.h = "right", cell.v = "center"),
            orientation = "portrait"
            )
       )+
       tm_layout(frame = FALSE, 
                 legend.text.size = 0.6, 
                 legend.title.size = 0.6)
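
Optionally, the final products can be exported for use outside R. In the sketch below, the output file names and the simplified map object are assumptions rather than part of the original practical; it writes the normalised raster to a GeoTIFF and saves a map to an image file with tmap_save():

# Save the normalised khv raster as a GeoTIFF (file name is a suggestion only)
writeRaster(koala_habitat_norm, "./khv_norm.tif", overwrite = TRUE)

# Assign a simplified version of the final map to an object and save it as a PNG
final_map <- khvnorm_map + 
    tm_shape(koala_habitat_norm) + 
    tm_raster("khv") +
    tm_layout(frame = FALSE)
tmap_save(final_map, "./khv_norm_map.png", width = 2000, height = 1600)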

Summary

The analyses above replicated a koala population sustainability study conducted in the Lower Hunter region of NSW for the Department of Sustainability, Environment, Water, Population and Communities by the consultancy Eco Logical. While some data sources differed from those used in the original study, the resulting koala habitat suitability maps closely align with the original findings, highlighting the same areas of high and very high suitability.

List of R functions

This practical showcased a diverse range of R functions. A summary of these is presented below:

base R functions

  • c(): concatenates its arguments to create a vector, which is one of the most fundamental data structures in R. It is used to combine numbers, strings, and other objects into a single vector.

  • head(): returns the first few elements or rows of an object, such as a vector, matrix, or data frame. It is a convenient tool for quickly inspecting the structure and content of data.

  • NA: represents missing or undefined values in R. It is integral to data analysis as it denotes the absence of a value and requires careful handling in computations and data manipulations.

  • names(): retrieves or sets the names attribute of an R object, such as a vector, list, or data frame. It is essential for labeling elements to improve readability and facilitate data management.

  • print(): outputs the value of an R object to the console. It is particularly useful for debugging or for ensuring that outputs are displayed correctly within functions.

  • rbind(): combines R objects by rows, stacking them vertically to form a new matrix or data frame. It is especially useful for merging datasets that share the same column structure.

  • unique(): extracts distinct elements from a vector, data frame, or other R object by removing duplicate entries. It preserves the order of first occurrences and is useful for identifying unique values in a dataset.

dplyr (tidyverse) functions

  • case_when(): acts as a vectorised if-else statement by evaluating multiple conditions and returning corresponding values. It is useful for creating new variables based on complex conditional logic.

  • filter(): subsets rows of a data frame based on specified logical conditions. It is essential for extracting relevant data for analysis.

  • group_by(): organises data into groups based on one or more variables, setting the stage for group-wise operations. It is often used together with summarise or mutate.

  • mutate(): adds new columns or transforms existing ones within a data frame. It is essential for creating derived variables and modifying data within the tidyverse workflow.

  • rename(): changes the names of columns in a data frame to make them more descriptive or consistent. It provides a clear syntax for renaming variables during data cleaning.

  • select(): extracts a subset of columns from a data frame based on specified criteria. It simplifies data frames by keeping only the variables needed for analysis.

  • summarise(): reduces multiple values down to a single summary statistic per group in a data frame. When used with group_by(), it facilitates the computation of aggregates such as mean, sum, or count.

sf functions

  • read_sf(): reads spatial data from various file formats into an sf object. It simplifies the import of geographic data for spatial analysis.

  • st_buffer(): creates a buffer around spatial features at a specified distance. It is useful for proximity analyses and for creating zones of influence.

  • st_geometry(): extracts or assigns the geometry component of an sf object. It allows you to isolate and manipulate the spatial aspects of the data separately from the attributes.

  • st_intersection(): computes the geometric intersection between spatial features. It is used to identify overlapping areas or shared boundaries between different spatial objects.

  • st_sf(): converts a regular data frame into an sf object by attaching spatial geometry information. It is the foundational step for performing spatial analysis using the simple features framework.

  • st_union(): merges multiple spatial features into a single combined geometry. It is useful for dissolving boundaries and creating aggregate spatial features.

terra functions

  • rast(): creates a raster object from files, matrices, or arrays, and is used for spatial analyses and modeling. It is the entry point for working with raster data in terra.

  • rasterize(): converts vector data (points, lines, or polygons) into a raster format. It is key for aligning vector data with a raster grid for subsequent analysis.

  • vect(): reads and creates vector objects from various spatial data formats. It handles points, lines, and polygons for versatile spatial data manipulation.

tmap functions

  • tm_shape(): initialises a tmap object with a spatial dataset, setting up the base layer for thematic maps. It establishes the spatial context for subsequent map layers.

  • tm_borders(): adds border lines to spatial objects on a map, highlighting boundaries. It enhances map readability by clearly delineating regions.

  • tm_fill(): fills polygon areas with specified colors based on attribute data, providing visual context to thematic maps. It is commonly used to represent different categories or intensities.

  • tm_dots(): plots point features on a tmap, useful for visualizing locations or events. It supports customization of size, color, and shape to enhance map interpretation.

  • tm_lines(): draws line features on a map, ideal for representing roads, rivers, or connections. It provides options for styling, such as line width and color.

  • tm_raster(): displays raster data layers on a map, allowing integration of continuous data like elevation or temperature. It supports various styling and transparency options.

  • tm_layout(): customises the overall appearance and layout of the map, including margins, titles, and backgrounds. It helps fine-tune the visual presentation of the map.

  • tm_legend(): adjusts the properties and placement of the map legend, improving clarity and context. It offers options to modify text size, symbols, and arrangement.

  • tm_pos_out(): positions map elements such as legends or scale bars outside the main map panel. It helps reduce clutter within the map area.

  • tm_scale_categorical(): adds a categorical scale bar to represent discrete variable classes on the map. It is tailored for visualizing grouped data.

  • tm_scale_continuous_sqrt(): applies a square root transformation to a continuous color scale in tmap. It is useful for visualizing data with wide-ranging values by reducing skew.

  • tm_scale_intervals(): creates scale intervals to categorise continuous data into bins. It enhances interpretability by breaking down data into meaningful segments.