Introduction

Analysing public transit requires data on both its operations and its infrastructure. GTFS (General Transit Feed Specification) is an open standard that provides this information in a simple and comprehensive way.

GTFShift includes built-in methods that help gather this data; this article explores them.

Read a GTFS file

GTFShift relies on several R packages to implement its features, but most of it is built on tidytransit. As a result, every method that accepts a GTFS feed expects it to be a tidygtfs object.

To create one, use either GTFShift::load_feed() or the tidytransit method tidytransit::read_gtfs(). GTFShift::load_feed() is an extension of the latter: it additionally scans the feed for integrity errors and fixes them automatically, and can optionally store the feed locally.

# Using tidytransit
gtfs = tidytransit::read_gtfs("https://operator.com/gtfs.zip")

# Using GTFShift
gtfs = GTFShift::load_feed("https://operator.com/gtfs.zip")
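
Because downstream methods expect a tidygtfs object, it can help to check the class of a feed before passing it on. The following is a minimal sketch, not part of GTFShift or tidytransit: assert_tidygtfs() is a hypothetical helper, and it assumes that feeds read with tidytransit carry the S3 class "tidygtfs". The mock_feed object is a stand-in used only so the example runs without downloading anything.

```r
# Hypothetical helper (not part of GTFShift or tidytransit):
# stop early when an object is not a tidygtfs feed.
assert_tidygtfs <- function(x) {
  if (!inherits(x, "tidygtfs")) {
    stop("Expected a tidygtfs object, got: ", paste(class(x), collapse = ", "))
  }
  invisible(x)
}

# Stand-in object for illustration only; a real feed would come from
# GTFShift::load_feed() or tidytransit::read_gtfs().
mock_feed <- structure(list(), class = c("tidygtfs", "gtfs", "list"))
assert_tidygtfs(mock_feed) # passes silently

# A plain data frame is rejected with an error, caught here for the demo
is_ok <- tryCatch(
  {
    assert_tidygtfs(data.frame())
    TRUE
  },
  error = function(e) FALSE
)
```

A guard like this could be called at the top of any custom function that receives a feed, so that a wrong input fails with a clear message.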

Find GTFS feeds

Using open catalogues

Gathering GTFS files can be simplified by using public catalogues such as mobilitydatabase.org or transit.land.

GTFShift provides a method to query Mobility Database: GTFShift::query_mobilitydatabase(). It queries the /v1/gtfs_feeds API endpoint and returns a list of GTFS feeds with information about their providers, the area they cover and a URL to download them.

To use it, an access token must be provided. One can be obtained for free on the Mobility Database website.
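
Since the query depends on the token being available, it may be worth checking for it before calling the API. The sketch below assumes the token is stored in an environment variable named MOBILITY_DATABASE, as in the rest of this article; has_token() is a hypothetical helper introduced only for illustration.

```r
# Read the token from the environment; Sys.getenv() returns "" when unset
token <- Sys.getenv("MOBILITY_DATABASE")

# Hypothetical helper: TRUE when a non-empty token string is available
has_token <- function(value) nzchar(value)

if (!has_token(token)) {
  message("MOBILITY_DATABASE is not set; add it to .Renviron and restart R.")
}
```

Keeping the token in .Renviron rather than in the script avoids committing it to version control by accident.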

aml = sf::st_read("https://github.com/U-Shift/MQAT/raw/refs/heads/main/geo/MUNICIPIOSgeo.gpkg", quiet = TRUE) |> sf::st_bbox()

# usethis::edit_r_environ() # to set MOBILITY_DATABASE variable for this code chunk to work
feeds = GTFShift::query_mobilitydatabase(
  refresh_token = Sys.getenv("MOBILITY_DATABASE"),
  bounding_filter_method = "completely_enclosed",
  bbox = aml
)

feeds |>
  dplyr::filter(status == "active") |>
  dplyr::select(provider, status, producer_url) |>
  head()
#>                                      provider status
#> 1                                      Carris active
#> 2                 Cascais Próxima, E.M., S.A. active
#> 3                                    Fertagus active
#> 4                     Metro de Lisboa (Metro) active
#> 5 Metro Transportes do Sul, Metro Sul do Tejo active
#> 6           Transportes Coletivos do Barreiro active
#>                                                                       producer_url
#> 1                             https://gateway.carris.pt/gateway/gtfs/api/v2.8/GTFS
#> 2 https://drive.google.com/uc?export=download&id=13ucYiAJRtu-gXsLa02qKJrGOgDjbnUWX
#> 3                             https://www.fertagus.pt/GTFSTMLzip/Fertagus_GTFS.zip
#> 4                      https://www.metrolisboa.pt/google_transit/googleTransit.zip
#> 5                                              https://mts.pt/imt/MTS-20240129.zip
#> 6                https://www.tcbarreiro.pt/front/files/sample_gtfs/GTFS-TCB_24.zip
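
Picking a feed by row position depends on the order in which results are returned. A possible alternative is to select it by provider name. The sketch below uses a small data frame copied from the output above so that it runs without an API token; feeds_example and cascais_url are illustrative names, not part of GTFShift.

```r
# Rows copied from the output above, reduced to two columns for illustration
feeds_example <- data.frame(
  provider = c("Carris", "Cascais Próxima, E.M., S.A.", "Fertagus"),
  producer_url = c(
    "https://gateway.carris.pt/gateway/gtfs/api/v2.8/GTFS",
    "https://drive.google.com/uc?export=download&id=13ucYiAJRtu-gXsLa02qKJrGOgDjbnUWX",
    "https://www.fertagus.pt/GTFSTMLzip/Fertagus_GTFS.zip"
  ),
  stringsAsFactors = FALSE
)

# Match on the provider name rather than on the row position
cascais_url <- feeds_example$producer_url[
  grepl("Cascais", feeds_example$provider, fixed = TRUE)
]
cascais_url
```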

gtfs = GTFShift::load_feed(feeds$producer_url[2], create_transfers = FALSE)
summary(gtfs)
#> tidygtfs object
#> files        agency, routes, stop_times, trips, fare_attributes, fare_rules, shapes, vehicles, calendar, calendar_dates, feed_info, stops
#> agency       Cascais Próxima
#> service      from 2024-08-01 to 2025-08-31
#> uses         stop_times (no frequencies)
#> # routes       94
#> # trips      3784
#> # stop_ids   1067
#> # stop_names  588
#> # shapes      139

Using the GTFShift built-in database for Portugal

GTFShift ships with a small database compiling GTFS files for Portuguese operators. It is a CSV file, available at extdata/gtfs_sources_pt.csv, with the following attributes:

  • ID, a unique string identifier of the region/city the GTFS file applies to;
  • LastUpdate, the date on which this database entry was last updated;
  • ReferenceDate, a representative Wednesday that falls within the GTFS calendar;
  • URL, the URL at which the GTFS file is available;
  • Type, the type of service covered (e.g. Urban, Inter-urban or Long distance);
  • GTFSDocs, the URL of the page that documents the operator's GTFS.

data = read.csv(system.file("extdata", "gtfs_sources_pt.csv", package = "GTFShift"))
head(data)
#>         ID LastUpdate ReferenceDate
#> 1       cp 2025-02-01    2025-02-05
#> 2    autna 2025-02-01    2025-02-05
#> 3      AML 2025-02-01    2025-02-05
#> 4 barreiro 2025-02-01    2025-02-05
#> 5  cascais 2025-02-01    2025-02-05
#> 6   lisboa 2025-02-01    2025-02-05
#>                                                                                    URL
#> 1                                                  https://publico.cp.pt/gtfs/gtfs.zip
#> 2     https://drive.google.com/uc?export=download&id=1gah1x10RyFu7gJPweBcCXPd9vcFJFQ7c
#> 3                                              https://api.carrismetropolitana.pt/gtfs
#> 4                    https://www.tcbarreiro.pt/front/files/sample_gtfs/GTFS-TCB_24.zip
#> 5 https://drive.google.com/u/0/uc?id=13ucYiAJRtu-gXsLa02qKJrGOgDjbnUWX&export=download
#> 6                                 https://gateway.carris.pt/gateway/gtfs/api/v2.8/GTFS
#>            Type
#> 1 Long distance
#> 2 Long distance
#> 3   Inter-urban
#> 4         Urban
#> 5         Urban
#> 6         Urban
#>                                                                                                                  GTFSDocs
#> 1                                                                             https://www.transit.land/operators/o-eyc-cp
#> 2                                                                https://www.transit.land/operators/o-ez-autnatransportes
#> 3                                                                              https://github.com/carrismetropolitana/api
#> 4                                                                                              https://www.tcbarreiro.pt/
#> 5                                                          https://dadosabertos.cascais.pt/pt_PT/dataset/gtfs-mobicascais
#> 6 https://gateway.carris.pt/apiui/#!/apis/2c05b837-1c8e-4b34-85b8-371c8edb344b/pages/d7f1c190-908d-4615-b1c1-90908d4615f5
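
Entries can also be looked up by ID instead of row position, and the ReferenceDate can be parsed for later use. The sketch below uses a few rows copied from the output above so that it runs without the package installed; sources_example, entry, reference_date and weekday_number are illustrative names.

```r
# Rows copied from the output above, reduced to three columns for illustration
sources_example <- data.frame(
  ID = c("cp", "cascais", "lisboa"),
  ReferenceDate = c("2025-02-05", "2025-02-05", "2025-02-05"),
  URL = c(
    "https://publico.cp.pt/gtfs/gtfs.zip",
    "https://drive.google.com/u/0/uc?id=13ucYiAJRtu-gXsLa02qKJrGOgDjbnUWX&export=download",
    "https://gateway.carris.pt/gateway/gtfs/api/v2.8/GTFS"
  ),
  stringsAsFactors = FALSE
)

# Look the entry up by its identifier
entry <- sources_example[sources_example$ID == "lisboa", ]

# Parse the reference date and confirm that it falls on a Wednesday
reference_date <- as.Date(entry$ReferenceDate)
# ISO weekday number: 1 = Monday, ..., 3 = Wednesday
weekday_number <- as.integer(format(reference_date, "%u"))
```

Using the ISO weekday number rather than weekdays() keeps the check independent of the session locale.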

gtfs = GTFShift::load_feed(data$URL[1], create_transfers = FALSE) # example with CP - trains
summary(gtfs)
#> tidygtfs object
#> files        agency, routes, stop_times, trips, calendar, calendar_dates, stops
#> agency       CP - Comboios de Portugal
#> service      from 2024-12-15 to 2025-12-13
#> uses         stop_times (no frequencies)
#> # routes      187
#> # trips      1869
#> # stop_ids    457
#> # stop_names  457
#> # shapes      266