Accessibility Analysis
Accessibility is often described as the ease with which individuals can reach places and opportunities, such as employment, public services, and cultural activities. We can utilise transport network data to quantify accessibility and characterise areas based on their accessibility levels. This week, we will use the dodgr
R library to measure accessibility between different points of interest by calculating the network distances between them.
Lecture slides
You can download the slides of this week’s lecture here: [Link].
Reading list
Essential readings
- Geurs, K., Van Wee, B. 2004. Accessibility evaluation of land-use and transport strategies: review and research directions. Journal of Transport Geography 12(2): 127-140. [Link]
- Higgins, C., Palm, M. DeJohn, A. et al. 2022. Calculating place-based transit accessibility: Methods, tools and algorithmic dependence. Journal of Transport and Land Use 15(1): 95-116. [Link]
Suggested readings
- Van Dijk, J., Krygsman, S. and De Jong, T. 2015. Toward spatial justice: The spatial equity effects of a toll road in Cape Town, South Africa. Journal of Transport and Land Use 8(3): 95-114. [Link]
- Van Dijk, J. and De Jong, T. 2017. Post-processing GPS-tracks in reconstructing travelled routes in a GIS-environment: network subset selection and attribute adjustment. Annals of GIS 23(3): 203-217. [Link]
Accessibility in Lambeth
This week, we will analyse the accessibility of fast-food outlets in the London Borough of Lambeth. Specifically, we will examine how closely these outlets are located within walking distance of primary and secondary schools, and explore any potential relationships between their proximity and the relative levels of deprivation in the area.
We will extract the points of interest that we will use for this analysis from the Point of Interest (POI) data for the United Kingdom, obtained from the Overture Maps Foundation and pre-processed by the Consumer Data Research Centre to provide users with easy access.
You can download a subset of the POI dataset via the link provided below. A copy of the 2011 London LSOAs spatial boundaries, the boundaries of the London Boroughs, and the 2019 English Index of Multiple Deprivation. Save these files in your project folder under data
.
File | Type | Link |
---|---|---|
Lambeth Overture Points of Interest 2024 | GeoPackage |
Download |
London LSOA 2011 Spatial Boundaries | GeoPackage |
Download |
London Borough Spatial Boundaries | GeoPackage |
Download |
England 2019 Index of Multiple Deprivation | csv |
Download |
To download a csv
file that is hosted on GitHub, click on the Download raw file button on the top right of your screen and it should download directly to your computer.
To extract the Lambeth Overture Points of Interest data, a 2-kilometre buffer was applied around the boundaries of Lambeth. This approach ensures that points just outside the study area are included, as locations beyond the borough boundary may still be accessible to residents and could represent the nearest available options.
Open a new script and save this as w08-accessibility-analysis.r
.
We will start by loading the libraries that we will need:
R code
# load libraries
library(tidyverse)
library(sf)
library(tmap)
library(osmdata)
library(dodgr)
You may have to install some of these libraries if you have not used these before.
Next, we can load the spatial data into R.
R code
# read poi data
<- st_read("data/Lambeth-POI-2024.gpkg") poi24
Reading layer `Lambeth-POI-2024' from data source
`/Users/justinvandijk/Library/CloudStorage/Dropbox/UCL/Web/jtvandijk.github.io/GEOG0114/data/Lambeth-POI-2024.gpkg'
using driver `GPKG'
Simple feature collection with 65060 features and 11 fields
Geometry type: MULTIPOINT
Dimension: XY
Bounding box: xmin: 526556.6 ymin: 167827 xmax: 535640.4 ymax: 182673.8
Projected CRS: OSGB36 / British National Grid
# read lsoa dataset
<- st_read("data/London-LSOA-2011.gpkg") lsoa11
Reading layer `London-LSOA-2011' from data source
`/Users/justinvandijk/Library/CloudStorage/Dropbox/UCL/Web/jtvandijk.github.io/GEOG0114/data/London-LSOA-2011.gpkg'
using driver `GPKG'
Simple feature collection with 4835 features and 10 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: 503574.2 ymin: 155850.8 xmax: 561956.7 ymax: 200933.6
Projected CRS: OSGB36 / British National Grid
# read borough dataset
<- st_read("data/London-Boroughs.gpkg") borough
Reading layer `london_boroughs' from data source
`/Users/justinvandijk/Library/CloudStorage/Dropbox/UCL/Web/jtvandijk.github.io/GEOG0114/data/London-Boroughs.gpkg'
using driver `GPKG'
Simple feature collection with 33 features and 7 fields
Geometry type: POLYGON
Dimension: XY
Bounding box: xmin: 503568.2 ymin: 155850.8 xmax: 561957.5 ymax: 200933.9
Projected CRS: OSGB36 / British National Grid
Now, carefully examine each individual dataframe to understand how the data is structured and what information it contains.
R code
# inspect poi data
head(poi24)
Simple feature collection with 6 features and 11 fields
Geometry type: MULTIPOINT
Dimension: XY
Bounding box: xmin: 526913.4 ymin: 169695.2 xmax: 526945.5 ymax: 169970.8
Projected CRS: OSGB36 / British National Grid
id primary_name main_category
1 08f194ada9716b86030eab41bbd4207e "Gorgeous Grub" "burger_restaurant"
2 08f194ada9715a1903d73f4aef170602 "TLC Direct" "wholesale_store"
3 08f194ada944cba203fa613de4f5e6d5 "JD Sports" "sports_wear"
4 08f194ada9449a8a0345a466a0a6ece9 "Lidl GB" "supermarket"
alternate_category address
1 eat_and_drink|fast_food_restaurant "1 Prince Georges Road"
2 professional_services|lighting_store "280 Western Road"
3 sporting_goods|shoe_store "Unit 2 Tandem Centre Top Of Church Rd"
4 retail|fast_food_restaurant "Colliers Wood"
locality postcode region country source source_record_id
1 "London" "SW19 2" "ENG" "GB" "meta" "232538816864698"
2 "London" "SW19 2QA" "ENG" "GB" "meta" "1959707454355017"
3 "Colliers Wood" "SW19 2TY" <NA> "GB" "meta" "644899945690935"
4 "London" "SW19 2TY" <NA> "GB" "meta" "111430837210163"
geom
1 MULTIPOINT ((526913.4 16984...
2 MULTIPOINT ((526921.1 16969...
3 MULTIPOINT ((526915.7 16997...
4 MULTIPOINT ((526922.2 16988...
[ reached 'max' / getOption("max.print") -- omitted 2 rows ]
# inspect lsoa dataset
head(lsoa11)
Simple feature collection with 6 features and 10 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: 531948.3 ymin: 180733.9 xmax: 545296.2 ymax: 184700.6
Projected CRS: OSGB36 / British National Grid
lsoa11cd lsoa11nm lsoa11nmw bng_e bng_n long
1 E01000001 City of London 001A City of London 001A 532129 181625 -0.09706
2 E01000002 City of London 001B City of London 001B 532480 181699 -0.09197
3 E01000003 City of London 001C City of London 001C 532245 182036 -0.09523
4 E01000005 City of London 001E City of London 001E 533581 181265 -0.07628
lat globalid lsoa11_name pop2011
1 51.51810 {283B0EAD-F8FC-40B6-9A79-1DDD7E5C0758} City of London 001A 1465
2 51.51868 {DDCE266B-7825-428C-9E0A-DF66B0179A55} City of London 001B 1436
3 51.52176 {C45E358E-A794-485A-BF76-D96E5D458EA4} City of London 001C 1346
4 51.51452 {4DDAF5E4-E47F-4312-89A0-923FFEC028A6} City of London 001E 985
geom
1 MULTIPOLYGON (((532105.1 18...
2 MULTIPOLYGON (((532634.5 18...
3 MULTIPOLYGON (((532135.1 18...
4 MULTIPOLYGON (((533808 1807...
[ reached 'max' / getOption("max.print") -- omitted 2 rows ]
# inspect borough dataset
head(borough)
Simple feature collection with 6 features and 7 fields
Geometry type: POLYGON
Dimension: XY
Bounding box: xmin: 507007.4 ymin: 155850.8 xmax: 561957.5 ymax: 194889.3
Projected CRS: OSGB36 / British National Grid
objectid name gss_code hectares nonld_area ons_inner
1 1 Kingston upon Thames E09000021 3726.117 0.000 F
2 2 Croydon E09000008 8649.441 0.000 F
3 3 Bromley E09000006 15013.487 0.000 F
4 4 Hounslow E09000018 5658.541 60.755 F
5 5 Ealing E09000009 5554.428 0.000 F
6 6 Havering E09000016 11445.735 210.763 F
sub_2011 geom
1 South POLYGON ((516401.6 160201.8...
2 South POLYGON ((535009.2 159504.7...
3 South POLYGON ((540373.6 157530.4...
4 West POLYGON ((509703.4 175356.6...
5 West POLYGON ((515647.2 178787.8...
6 East POLYGON ((553564 179127.1, ...
POI data
The inspection shows that the POI dataset contains a wide variety of location types, with each point tagged under a main and alternative category, as provided by the Overture Maps Foundation via Meta and Microsoft. However, these tags may not be consistent across the dataset, so we will need to identify specific keywords to filter the main_category
and alternate_category
columns.
We will start by filtering out all POIs where the word school
features in the main_category
column:
R code
# filter school poi data
<- poi24 |>
poi_schools filter(str_detect(main_category, "school"))
# inspect
head(unique(poi_schools$main_category), n = 50)
[1] "\"day_care_preschool\"" "\"driving_school\""
[3] "\"elementary_school\"" "\"school\""
[5] "\"language_school\"" "\"music_school\""
[7] "\"specialty_school\"" "\"preschool\""
[9] "\"dance_school\"" "\"high_school\""
[11] "\"drama_school\"" "\"cooking_school\""
[13] "\"middle_school\"" "\"vocational_and_technical_school\""
[15] "\"art_school\"" "\"private_school\""
[17] "\"religious_school\"" "\"nursing_school\""
[19] "\"montessori_school\"" "\"public_school\""
[21] "\"cosmetology_school\"" "\"medical_school\""
[23] "\"engineering_schools\"" "\"massage_school\""
[25] "\"business_schools\"" "\"law_schools\""
[27] "\"medical_sciences_schools\"" "\"sports_school\""
[29] "\"flight_school\""
You can further inspect the results using the View()
function.
This is still a very large list, and looking at the categories not all POIs containing the string school
should be included. However, this initial selection has given us a more manageable list from which we can choose the relevant tags. We can now further filter the dataset as well as clip the dataset to the administrative boundaries of Lambeth.
R code
# remove quotes for easier processing
<- poi_schools |>
poi_schools mutate(main_category = str_replace_all(main_category, "\"", ""))
# filter school poi data
<- poi_schools |>
poi_schools filter(main_category == "elementary_school" | main_category == "high_school" |
== "middle_school" | main_category == "private_school" | main_category ==
main_category "public_school" | main_category == "school")
# filter school poi data
<- borough |>
lambeth filter(name == "Lambeth")
<- poi_schools |>
poi_schools st_intersection(lambeth) |>
select(1:11)
# inspect
poi_schools
Simple feature collection with 141 features and 11 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: 528635.7 ymin: 169846.4 xmax: 533065.9 ymax: 180398
Projected CRS: OSGB36 / British National Grid
First 10 features:
id
6 08f194ad1a394235035f3ab7c2e4721d
7 08f194ad1a8da734035945d69c357ddd
8 08f194ad1abb648603defd9d76b4c314
27 08f194ad130f0cd303c1c9f9b42438f8
primary_name main_category
6 "Woodmansterne Children's Centre" elementary_school
7 "Immanuel & St Andrew Church of England Primary School" school
8 "Monkey Puzzle Day Nursery & Preschool Streatham Common" school
27 "Campsbourne School" school
alternate_category address locality
6 school|education "Stockport Rd" <NA>
7 elementary_school|education "Northanger Rd" <NA>
8 education|public_service_and_government "496 Streatham High Road" "London"
27 education <NA> "London"
postcode region country source source_record_id
6 "SW16 5XE" <NA> "GB" "meta" "114577088601307"
7 "SW16 5SL" <NA> "GB" "meta" "128479257200832"
8 "SW16 3QB" "ENG" "GB" "meta" "1092187950854118"
27 <NA> <NA> "GB" "meta" "114411542481619"
geom
6 POINT (529701.5 169846.4)
7 POINT (530016.4 170574.1)
8 POINT (530208.6 170587.9)
27 POINT (528819.8 174228.7)
[ reached 'max' / getOption("max.print") -- omitted 6 rows ]
This is still a rather long list and likely inaccurate. According to Lambeth Council Education Statistics, there should be 80 primary and secondary schools across the borough. We can use the alternate_category
column to further narrow down our results.
You can inspect the different tags and their frequencies easily by creating a frequency table: table(poi_schools$alternate_category)
.
R code
# filter school poi data
<- poi_schools |>
poi_schools filter(str_detect(alternate_category, "elementary_school") | str_detect(alternate_category,
"high_school") | str_detect(alternate_category, "middle_school") | str_detect(alternate_category,
"private_school") | str_detect(alternate_category, "public_school"))
# inspect
poi_schools
Simple feature collection with 58 features and 11 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: 528635.7 ymin: 170025.3 xmax: 532897.2 ymax: 179678.2
Projected CRS: OSGB36 / British National Grid
First 10 features:
id
1 08f194ad1a8da734035945d69c357ddd
2 08f194ad1a70460d037da737c256001b
3 08f194ad1c2dc81c032e9e0aa296a8d1
4 08f194ad1e4cec5903fafb7496a2d2f3
primary_name main_category
1 "Immanuel & St Andrew Church of England Primary School" school
2 "Granton Primary School" elementary_school
3 "Kingswood Primary School (Upper Site)" elementary_school
4 "Battersea Grammar School" school
alternate_category address locality postcode region
1 elementary_school|education "Northanger Rd" <NA> "SW16 5SL" <NA>
2 school|public_school "Granton Rd" <NA> "SW16 5AN" <NA>
3 school|high_school "193 Gipsy Road" "London" "SE27 9" "ENG"
4 high_school|education <NA> "London" <NA> <NA>
country source source_record_id geom
1 "GB" "meta" "128479257200832" POINT (530016.4 170574.1)
2 "GB" "meta" "235737420093504" POINT (529299.7 170025.3)
3 "GB" "meta" "110066125723254" POINT (532897.2 171498.4)
4 "GB" "meta" "103107239728950" POINT (529523.9 172310.9)
[ reached 'max' / getOption("max.print") -- omitted 6 rows ]
Since the POI dataset is compiled from various open sources, the data quality is not guaranteed. Some schools may be missing, while others could be duplicated, perhaps under slightly different names or because different buildings have been assigned separate point locations. However, it is unlikely that more than one school would share the same postcode. Therefore, we will use postcode information (where available) to finalise our school selection and remove any likely duplicates.
R code
# identify duplicate postcodes
<- poi_schools |>
poi_schools group_by(postcode) |>
mutate(rank = rank(primary_name)) |>
ungroup()
# filter school poi data
<- poi_schools |>
poi_schools filter(is.na(postcode) | rank == 1) |>
select(-rank)
# inspect
poi_schools
Simple feature collection with 54 features and 11 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: 528635.7 ymin: 170025.3 xmax: 532897.2 ymax: 179678.2
Projected CRS: OSGB36 / British National Grid
# A tibble: 54 × 12
id primary_name main_category alternate_category address locality postcode
<chr> <chr> <chr> <chr> <chr> <chr> <chr>
1 08f1… "\"Immanuel… school elementary_school… "\"Nor… <NA> "\"SW16…
2 08f1… "\"Granton … elementary_s… school|public_sch… "\"Gra… <NA> "\"SW16…
3 08f1… "\"Kingswoo… elementary_s… school|high_school "\"193… "\"Lond… "\"SE27…
4 08f1… "\"Batterse… school high_school|educa… <NA> "\"Lond… <NA>
5 08f1… "\"St Bede'… school elementary_school… "\"St … <NA> "\"SW12…
6 08f1… "\"St Leona… school elementary_school… "\"42 … "\"Lond… "\"SW16…
7 08f1… "\"Richard … elementary_s… college_universit… "\"New… <NA> "\"SW2 …
8 08f1… "\"Henry Ca… school high_school|eleme… "\"Hyd… <NA> "\"SW12…
9 08f1… "\"South Ba… school high_school|b2b_s… "\"56 … "\"Lond… "\"SW2 …
10 08f1… "\"Glenbroo… elementary_s… school|public_sch… "\"Cla… <NA> "\"SW4 …
# ℹ 44 more rows
# ℹ 5 more variables: region <chr>, country <chr>, source <chr>,
# source_record_id <chr>, geom <POINT [m]>
Although we now have fewer schools than we had expected, either due to overly restrictive filtering of tags or because some school locations are not recorded in the dataset, we will proceed with the current data.
Variable preparation can be a time-consuming process that often necessitates a more extensive exploratory analysis to ensure sufficient data quality. This may involve sourcing additional data to supplement your existing dataset.
We can use a similar approach to approximate the locations of fast food outlets in the Borough.
R code
# filter fast food poi data
<- poi24 |>
poi_fastfood filter(str_detect(main_category, "fast_food_restaurant") | str_detect(alternate_category,
"fast_food_restaurant") | str_detect(alternate_category, "chicken_restaurant") |
str_detect(alternate_category, "burger_restaurant"))
# inspect
poi_fastfood
Simple feature collection with 1444 features and 11 fields
Geometry type: MULTIPOINT
Dimension: XY
Bounding box: xmin: 526666.3 ymin: 168272.9 xmax: 535546.9 ymax: 182554
Projected CRS: OSGB36 / British National Grid
First 10 features:
id primary_name main_category
1 08f194ada9716b86030eab41bbd4207e "Gorgeous Grub" "burger_restaurant"
2 08f194ada9449a8a0345a466a0a6ece9 "Lidl GB" "supermarket"
3 08f194ada944daa80328c6604dab3503 "Moss Bros." "men's_clothing_store"
4 08f194ada932ad8603db11bbb7f953a7 "Livi's Cuisine" "african_restaurant"
alternate_category address locality
1 eat_and_drink|fast_food_restaurant "1 Prince Georges Road" "London"
2 retail|fast_food_restaurant "Colliers Wood" "London"
3 fast_food_restaurant "Unit 5, Tandem Shopping Centre" "London"
4 caterer|fast_food_restaurant "1 Locks Lane" "Mitcham"
postcode region country source source_record_id
1 "SW19 2" "ENG" "GB" "meta" "232538816864698"
2 "SW19 2TY" <NA> "GB" "meta" "111430837210163"
3 "SW19 2TY" <NA> "GB" "meta" "478090646011341"
4 "CR4 2" "ENG" "GB" "meta" "231745500530140"
geom
1 MULTIPOINT ((526913.4 16984...
2 MULTIPOINT ((526922.2 16988...
3 MULTIPOINT ((526945.5 16992...
4 MULTIPOINT ((527970.3 16955...
[ reached 'max' / getOption("max.print") -- omitted 6 rows ]
Let’s map both datasets to get an idea of how the data look like:
R code
# combine for mapping
<- poi_schools |>
poi_schools mutate(type = "School")
<- poi_fastfood |>
poi_fastfood mutate(type = "Fast food")
<- rbind(poi_schools, poi_fastfood)
poi_lambeth
# shape, polygon
tm_shape(lambeth) +
# specify column, classes
tm_polygons(
col = "#f0f0f0",
+
)
# shape, points
tm_shape(poi_lambeth) +
# specify column, colours
tm_dots(
col = "type",
size = 0.05,
palette = c("#beaed4", "#fdc086"),
title = ""
+
)
# set layout
tm_layout(
legend.outside = TRUE,
legend.position = c("right", "bottom"),
frame = FALSE
)
Network data
In addition to the locations of interest, we need network data to assess the accessibility of schools in relation to fast food outlets. We will use OpenStreetMap to extract road segment data. Similar to the POI dataset, OSM uses key
and value
tags to categorise the features within its dataset.
OpenStreetMap (OSM) is a free, editable map of the world, but its coverage is uneven globally. However, the accuracy and quality of the data can at times be questionable, with details such as road types and speed limits missing. The OpenStreetMap Wiki provides more details on the tagging system.
To download the Lambeth road network dataset, we first need to define our bounding box coordinates. We will then use these coordinates in our OSM query to extract specific types of road segments within the defined search area. Our focus will be on selecting all OSM features with the highway
tag that are likely to be used by pedestrians (e.g. excluding motorways
).
R code
# define our bbox coordinates, use WGS84
<- poi24 |>
bbox_lambeth st_transform(4326) |>
st_bbox()
# osm query
<- opq(bbox = bbox_lambeth) |>
osm_network add_osm_feature(key = "highway", value = c("primary", "secondary", "tertiary",
"residential", "path", "footway", "unclassified", "living_street", "pedestrian")) |>
osmdata_sf()
In some cases, the OSM query may return an error, particularly when multiple users from the same location are executing the exact same query. If so, you can download a prepared copy of the data here: [Download]. You can load this copy into R through load('data/London-OSM-Roads.RData')
The returned osm_network
object contains a variety of elements with the specified tags. Our next step is to extract the spatial data from this object to create our road network dataset. Specifically, we will extract the edges of the network, which represent the lines of the roads, as well as the nodes, which represent the points where the roads start, end, or intersect.
R code
# extract the nodes, with their osm_id
<- osm_network$osm_points[, "osm_id"]
osm_network_nodes
# extract the edges, with their osm_id and relevant columns
<- osm_network$osm_lines[, c("osm_id", "name", "highway", "maxspeed",
osm_network_edges "oneway")]
# inspect
head(osm_network_nodes)
Simple feature collection with 6 features and 1 field
Geometry type: POINT
Dimension: XY
Bounding box: xmin: -0.1541499 ymin: 51.52434 xmax: -0.1457924 ymax: 51.52698
Geodetic CRS: WGS 84
osm_id geometry
78112 78112 POINT (-0.1457924 51.52698)
99878 99878 POINT (-0.1529787 51.52434)
99879 99879 POINT (-0.1532934 51.52482)
99880 99880 POINT (-0.1535802 51.52508)
99882 99882 POINT (-0.1541499 51.52567)
99883 99883 POINT (-0.1541362 51.52598)
# inspect
head(osm_network_edges)
Simple feature collection with 6 features and 5 fields
Geometry type: LINESTRING
Dimension: XY
Bounding box: xmin: -0.1398347 ymin: 51.50608 xmax: -0.0821093 ymax: 51.5246
Geodetic CRS: WGS 84
osm_id name highway maxspeed oneway
31030 31030 Grafton Way primary 20 mph yes
31039 31039 Tottenham Court Road primary 20 mph <NA>
31959 31959 Cleveland Street residential 20 mph yes
554369 554369 King William Street tertiary 20 mph yes
554526 554526 Fenchurch Street tertiary 20 mph <NA>
1530592 1530592 Borough High Street primary 30 mph yes
geometry
31030 LINESTRING (-0.1349153 51.5...
31039 LINESTRING (-0.1303693 51.5...
31959 LINESTRING (-0.139512 51.52...
554369 LINESTRING (-0.08745 51.511...
554526 LINESTRING (-0.085135 51.51...
1530592 LINESTRING (-0.0882957 51.5...
We can quickly map the network edges to see how the road network looks like:
R code
# shape, polygon
tm_shape(osm_network_edges) +
# specify column, classes
tm_lines(
col = "#bdbdbd",
lwd = 0.2,
+
)
# shape, polygon
tm_shape(lambeth) +
# specify column, classes
tm_borders(
col = "#252525",
lwd = 2
+
)
# set legend
tm_add_legend(
type = "line",
labels = "Road segments",
col = "#bdbdbd"
+
)
tm_add_legend(
type = "line",
labels = "Outline Lambeth",
col = "#252525"
+
)
# set layout
tm_layout(
frame = FALSE,
legend.outside = TRUE,
legend.position = c("right", "bottom"),
)
Network preparation
Since our focus is on schoolchildren and walking distances, we will overwrite the oneway
variable to assume that none of the road segments are restricted to one-way traffic. This adjustment will ensure our analysis is not skewed by such restrictions and will help maintain a more accurate representation of the general connectivity of the network.
R code
# overwrite one-way default
$oneway <- "no" osm_network_edges
Now we have the network edges, we can turn this into a graph-representation that allows for the calculation of network-based accessibility statistics with our prepared point of interest data.
In any network analysis, the primary data structure is a graph composed of nodes and edges. The dodgr
library utilises weighting profiles to assign weights based on road types, tailored to the mode of transport that each profile is designed to model. In this instance, we will use the foot
weighting profile, as our focus is on modelling walking accessibility. To prevent errors related to the weighting profile, we will replace any NA
values in the highway tag with the value unclassified
.
R code
# replace missing highway tags with unclassified
<- osm_network_edges |>
osm_network_edges mutate(highway = if_else(is.na(highway), "unclassified", highway))
# create network graph
<- weight_streetnet(osm_network_edges, wt_profile = "foot") osm_network_graph
Once we have constructed our graph, we can use it to calculate network distances between our points of interest. One important consideration is that not all individual components in the extracted network may be connected. This can occur, for example, if the bounding box cuts off access to the road of a cul-de-sac. To ensure that our entire extracted network is connected, we will therefore extract the largest connected component of the graph.
The dodgr
package documentation explains that components are numbered in order of decreasing size, with $component = 1
always representing the largest component. It is essential to inspect the resulting subgraph to ensure that its coverage is adequate for analysis.
R code
# extract the largest connected graph component
<- osm_network_graph[osm_network_graph$component == 1, ]
netx_connected
# inspect number of remaining road segments
nrow(netx_connected)
[1] 407970
OpenStreetMap is a dynamic dataset, meaning that changes are made on a continuous basis. As a result, it is quite possible that the number of remaining road segments, as shown above, may differ slightly when you run this analysis.
Accessibility analysis
Now that we have our connected subgraph, we can use the dodgr_distances()
function to calculate the network distances between every possible origin (i.e. school) and destination (i.e. fast food outlet). For all combinations, the function will map the point of interest locations to the nearest point on the network and return the corresponding shortest-path distances.
The dodgr
package requires data to be projected in WGS84, so we need to reproject our point of interest data accordingly.
R code
# reproject
<- poi_schools |>
poi_schools st_transform(4326)
<- poi_fastfood |>
poi_fastfood st_transform(4326)
# distance matrix
<- dodgr_distances(netx_connected, from = st_coordinates(poi_schools),
distance_matrix to = st_coordinates(poi_fastfood), shortest = FALSE, pairwise = FALSE, quiet = FALSE)
The result of this computation is a distance matrix that contains the network distances between all origins (i.e. schools) and all destinations (i.e. fast-food outlets):
R code
# inspect
1:5, 1:5] distance_matrix[
6807494201 7110321980 7110321980 11371586827 33148215
8796433764 4623.917 4624.095 4624.095 3127.402 3085.486
8820889464 3611.298 3752.922 3752.922 1962.068 1920.151
11479633279 8460.668 8460.846 8460.846 6938.918 6897.002
292521291 4917.554 4917.732 4917.732 4233.726 4280.690
6625498885 6279.569 6279.747 6279.747 5595.741 5642.705
The above output displays the distance (in metres) between the first five schools and the first five fast-food outlets. The row and column IDs refer to the nearest nodes on the OSM network to which the schools and fast-food outlets were mapped.
Now that we have the distance matrix, we can aggregate the data and perform accessibility analysis. For example, we can count the number of fast-food outlets within 500 or 1,000 metres walking distance from each school:
R code
# fast-food outlets within 500m
$fastfood_500m <- rowSums(distance_matrix <= 500)
poi_schools
# fast-food outlets within 1000m
$fastfood_1000m <- rowSums(distance_matrix <= 1000) poi_schools
You can further inspect the results using the View()
function.
In the final step, we can investigate whether there is a relationship between the proximity of fast-food outlets and the relative levels of deprivation in the area. One approach is to calculate the average number of fast-food outlets within 1,000 metres of a school for each LSOA, and then compare these figures to their corresponding IMD deciles.
R code
# read imd dataset
<- read_csv("data/England-IMD-2019.csv") imd19
Rows: 32844 Columns: 3
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (1): lsoa11cd
dbl (2): imd_rank, imd_dec
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# join imd
<- lsoa11 |>
lsoa11 left_join(imd19, by = c("lsoa11cd" = "lsoa11cd"))
# join schools to their parent lsoa
<- poi_schools |>
poi_schools st_transform(27700) |>
st_join(lsoa11)
We can use this approach to derive the average number of fast-food by IMD decile:
R code
# average counts by imd decile
<- poi_schools |>
fastfood_imd group_by(imd_dec) |>
mutate(avg_cnt = mean(fastfood_1000m)) |>
distinct(imd_dec, avg_cnt) |>
arrange(imd_dec)
# inspect
fastfood_imd
# A tibble: 7 × 2
# Groups: imd_dec [7]
imd_dec avg_cnt
<dbl> <dbl>
1 2 20.1
2 3 14.4
3 4 17.8
4 5 9.83
5 6 3
6 7 8.2
7 8 23.5
There appears to be a weak relationship, with schools in more deprived areas having, on average, a higher number of fast-food outlets within a 1,000-metre walking distance. However, this trend is not consistent, as schools in the least deprived areas of Lambeth show the highest accessibility on average.
Assignment
Accessibility analysis involves evaluating how easily people can reach essential services, destinations, or opportunities, such as schools, healthcare facilities, or workplaces, from a given location. It considers factors like distance, travel time, and transport networks, providing valuable insights for urban planning, transport policies, and social equity. The CDRC Access to Healthy Assets & Hazards (AHAH) dataset, for instance, uses accessibility analysis to quantify how easy it is to reach ‘unhealthy’ places, such as pubs and gambling outlets, for each neighbourhood in Great Britain.
Having run through all the steps during the tutorial, we can recreate this analysis ourselves. Using Lambeth as a case study, try to complete the following tasks:
- Extract all
pubs
from the Point of Interest dataset. - For each LSOA within Lambeth, calculate the average walking distance to the nearest pub.
- Create a map of the results.
Unlike before, LSOAs are now the unit of analysis. This means you will need to use the LSOA centroids as inputs for your distance matrix.
If you want to take a deep dive into accessibility analysis, there is a great resource that got published recently: Introduction to urban accessibility: a practical guide in R.
Before you leave
This brings us to the end of the tutorial. You should now have a basic understanding of the concepts behind accessibility analysis, how it can be executed in R, and some of the challenges you may encounter when conducting your own research. With this being said, you have now reached the end of this week’s content. Onwards and upwards!