Introduction
Analyzing public transit feeds is key to understanding a network's
territorial coverage and dynamics, in both the spatial and temporal
dimensions. GTFShift provides several methods that
encapsulate predefined methodologies for this purpose. This document explores
their applicability with simple examples.
This article uses a GTFS feed from the library's GTFS database for Portugal as an example. Refer to vignette("download") for more details.
# Get GTFS from library GTFS database for Portugal
data = read.csv(system.file("extdata", "gtfs_sources_pt.csv", package = "GTFShift"))
gtfs_id = "lisboa"
gtfs = GTFShift::load_feed(data$URL[data$ID == gtfs_id], create_transfers=FALSE)
Analyse hourly frequency per stop
To analyse frequencies at stops, use
GTFShift::get_stop_frequency_hourly(), which produces, for each stop,
an aggregated count of the buses serving it per hour.
By default, the analysis is performed for the next business Wednesday in Portugal. Refer to
GTFShift::calendar_nextBusinessWednesday() for more details. You can override this using the date parameter.
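To picture what an hourly count per stop means, the following base R sketch counts departures per stop and hour on a toy stop_times table. It is only an illustration under stated assumptions, not the package's implementation: the values are made up, the filtering by service date is left out, and wrapping hours of 24 or more back to 0-23 is a choice made here for simplicity.

```r
# Toy stop_times table (made-up values) with GTFS-style HH:MM:SS departures
stop_times <- data.frame(
  stop_id = c("A", "A", "A", "B", "B"),
  departure_time = c("08:05:00", "08:35:00", "09:10:00", "08:20:00", "25:15:00"),
  stringsAsFactors = FALSE
)

# Extract the hour; GTFS allows hours >= 24 for trips running past midnight,
# which this sketch wraps back to 0-23 (an assumption, see the lead-in)
stop_times$hour <- as.integer(substr(stop_times$departure_time, 1, 2)) %% 24

# Count departures per stop and hour
counts <- aggregate(
  list(frequency = rep(1L, nrow(stop_times))),
  by = list(stop_id = stop_times$stop_id, hour = stop_times$hour),
  FUN = sum
)
counts <- counts[order(counts$stop_id, counts$hour), ]
print(counts)
```

Stop A ends up with two departures in hour 8 and one in hour 9, which is the kind of value the frequency column reports.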
# Perform frequency analysis
frequencies_stop = GTFShift::get_stop_frequency_hourly(gtfs)
summary(frequencies_stop)
#>    stop_id               hour         frequency               geometry    
#>  Length:39208       Min.   : 6.00   Min.   : 1.000   POINT        :39208  
#>  Class :character   1st Qu.:10.00   1st Qu.: 3.000   epsg:4326    :    0  
#>  Mode  :character   Median :14.00   Median : 6.000   +proj=long...:    0  
#>                     Mean   :14.21   Mean   : 6.893                        
#>                     3rd Qu.:18.00   3rd Qu.: 9.000                        
#>                     Max.   :23.00   Max.   :40.000
It returns an sf data.frame that can be
displayed using mapview or stored in GeoPackage format.
# Display map
mapview::mapview(
  frequencies_stop |>
    # dplyr::filter is namespaced so the call does not fall back to stats::filter
    dplyr::filter(hour == 8, frequency > 2),
  zcol = "frequency",
  legend = TRUE,
  cex = 4,
  layer.name = "Frequency (hour)"
)
# Store in GeoPackage format
# st_write(frequencies_stop, "database/transit/bus_stop_frequency.gpkg", append=FALSE, quiet = TRUE)
Analyse hourly frequency per route
The frequency analysis can also be performed route-wise. For this
purpose, use GTFShift::get_route_frequency_hourly(),
which returns results aggregated per hour and route,
so that each route can be analysed individually.
By default, the analysis is performed for the next business Wednesday in Portugal. Refer to
GTFShift::calendar_nextBusinessWednesday() for more details. You can override this using the date parameter.
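As a rough illustration of how a reference date could be picked by hand, the sketch below computes the first Wednesday strictly after a given date in base R. The helper next_wednesday() is hypothetical and is not part of GTFShift; unlike GTFShift::calendar_nextBusinessWednesday(), it does not check public holidays. Whether the date parameter accepts a Date object is also an assumption, so the last call is left commented out.

```r
# Hypothetical helper (not part of GTFShift): first Wednesday strictly after `from`.
# It ignores public holidays, so it is not a "business" Wednesday check.
next_wednesday <- function(from = Sys.Date()) {
  from <- as.Date(from)
  # POSIXlt weekdays: 0 = Sunday, 1 = Monday, ..., 3 = Wednesday
  wday <- as.POSIXlt(from)$wday
  days_ahead <- (3 - wday) %% 7
  # If `from` is already a Wednesday, move to the following week
  if (days_ahead == 0) days_ahead <- 7
  from + days_ahead
}

# 2024-01-01 is a Monday, so this gives 2024-01-03
next_wednesday(as.Date("2024-01-01"))

# Assumption: `date` accepts a Date object; check the function documentation first
# frequencies_route = GTFShift::get_route_frequency_hourly(gtfs, date = next_wednesday())
```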
frequencies_route = GTFShift::get_route_frequency_hourly(gtfs)
summary(frequencies_route)
#>    route_id         route_short_name    direction_id        hour      
#>  Length:3363        Length:3363        Min.   :0.000   Min.   : 0.00  
#>  Class :character   Class :character   1st Qu.:0.000   1st Qu.: 9.00  
#>  Mode  :character   Mode  :character   Median :0.000   Median :13.00  
#>                                        Mean   :0.449   Mean   :13.15  
#>                                        3rd Qu.:1.000   3rd Qu.:18.00  
#>                                        Max.   :1.000   Max.   :23.00  
#>    frequency        shape_id                  geometry   
#>  Min.   : 1.000   Length:3363        LINESTRING   :3363  
#>  1st Qu.: 2.000   Class :character   epsg:4326    :   0  
#>  Median : 3.000   Mode  :character   +proj=long...:   0  
#>  Mean   : 3.125                                          
#>  3rd Qu.: 4.000                                          
#>  Max.   :11.000
quantile(frequencies_route$frequency)
#>   0%  25%  50%  75% 100% 
#>    1    2    3    4   11
The overline parameter allows for an even more
aggregated view of the operation, merging routes that overlap
into a single route network. This gives a
clearer picture of the frequency volume on each segment of
the network and can help prioritize interventions.
frequencies_route_overline = GTFShift::get_route_frequency_hourly(gtfs, overline = TRUE)
summary(frequencies_route_overline)
#>    frequency            hour                geometry     
#>  Min.   :  1.000   Min.   : 0.00   LINESTRING   :135774  
#>  1st Qu.:  3.000   1st Qu.: 9.00   epsg:4326    :     0  
#>  Median :  6.000   Median :13.00   +proj=long...:     0  
#>  Mean   :  9.134   Mean   :13.02                         
#>  3rd Qu.: 12.000   3rd Qu.:18.00                         
#>  Max.   :111.000   Max.   :23.00
quantile(frequencies_route_overline$frequency)
#>   0%  25%  50%  75% 100% 
#>    1    3    6   12  111
Improve visualization
Using the overline parameter in
GTFShift::get_route_frequency_hourly() might not be the
best option if the GTFS shapes do not share exactly the same geometry.
In those cases, the overlapping lines might not be merged correctly,
causing inconsistent results, such as a street whose frequency changes
along its length even though there is no bus stop between the points
where it changes.
This is a known and still unsolved issue of
stplanr::overline2(), the method used for the network
aggregation.
As an alternative, GTFShift provides methods to
work around this issue, either by correcting the route geometry or by aggregating
the network with open data.
Correcting geometry with OSM open data
GTFShift offers several methods to retrieve route geometry from OpenStreetMap. Refer to vignette("osm") for more details.
Aggregating the network with OSM open data
There are several ways to aggregate a transit network. One approach is to determine the centerlines of the roads on which the vehicles operate. GTFShift provides a method that wraps the Python neatnet package for this purpose. Refer to vignette("osm") for more details.
During the development of this project, no R package suited to this purpose was found. The centerline package has this feature on its roadmap. Currently, solutions are available for Python and ArcGIS.
Aggregating frequencies over a target network
As an alternative to calling
GTFShift::get_route_frequency_hourly() with
overline=TRUE,
GTFShift::network_overline() provides a different way of
aggregating frequencies.
Given a target network, it identifies the segments corresponding to each route and uses them to aggregate the attribute defined in the parameters.
The example below uses the centerlines of the Carris network, generated with ArcGIS, as the target network.
network = sf::st_read(
  system.file("extdata", "centerline_carris.gpkg", package = "GTFShift"), 
  quiet = TRUE
)
frequencies_route_overline_improved = GTFShift::network_overline(
  network, 
  frequencies_route |> dplyr::filter(hour == 8),
  attr = "frequency"
)
quantile(frequencies_route_overline_improved$frequency)
#>   0%  25%  50%  75% 100% 
#>    1    5   10   18  111
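The idea of aggregating an attribute over a target network can be pictured with a toy example in base R. This is a simplified sketch and not how GTFShift::network_overline() is implemented: the route-to-segment table is hypothetical and stands in for the spatial matching step, and all values are made up.

```r
# Made-up route frequencies for a single hour
routes <- data.frame(
  route_id = c("R1", "R2", "R3"),
  frequency = c(4, 6, 2)
)

# Hypothetical result of matching each route to target network segments
# (in a real workflow this would come from a spatial operation)
route_segments <- data.frame(
  route_id = c("R1", "R1", "R2", "R2", "R3"),
  segment_id = c("s1", "s2", "s2", "s3", "s3")
)

# Attach each route's frequency to the segments it uses,
# then sum the frequencies of all routes sharing a segment
matched <- merge(route_segments, routes, by = "route_id")
segment_frequency <- aggregate(frequency ~ segment_id, data = matched, FUN = sum)
print(segment_frequency)
```

Segment s2 is used by R1 and R2, so it receives 4 + 6 = 10, while s1 keeps the frequency of R1 alone.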