National Transit Equity Research Audit

Minimal public report + dashboard. Uses planned service supply (GTFS Schedule) and neighborhood income (2021 Census Profile).

Headline indicators: transit desert (overall, Q1, Q5) and neighborhood rows.

How to read this page

What does “wait gap” mean?

Wait gap = median(wait in lower-income group) − median(wait in higher-income group), in minutes. Positive means scheduled waits are longer in lower-income neighborhoods (worse). Negative means shorter (better).

What does “density gap” mean?

Density gap = median(trips/km² in lower-income group) − median(trips/km² in higher-income group), in trips/km². Negative means lower-income neighborhoods have lower scheduled service density (worse). Positive means higher (better).
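Both gaps follow the same "low minus high" arithmetic, which a short Python sketch can make concrete. The function name median_gap and the sample values are illustrative assumptions, not taken from the audit code or its data.

```python
from statistics import median


def median_gap(low_values, high_values):
    """Return median(low) - median(high), or None if either group is empty."""
    if not low_values or not high_values:
        return None
    return median(low_values) - median(high_values)


# Hypothetical neighborhood values for one transit system.
wait_low = [6.0, 8.0, 10.0]       # scheduled wait, minutes, lower-income group
wait_high = [5.0, 7.0, 9.0]       # scheduled wait, minutes, higher-income group
density_low = [40.0, 55.0, 70.0]  # trips/km², lower-income group
density_high = [60.0, 65.0, 90.0] # trips/km², higher-income group

wait_gap = median_gap(wait_low, wait_high)           # 8.0 - 7.0
density_gap = median_gap(density_low, density_high)  # 55.0 - 65.0

print(f"wait gap: {wait_gap:+.1f} min")
print(f"density gap: {density_gap:+.1f} trips/km²")
```

In this made-up example both signs point the same way: a positive wait gap and a negative density gap each indicate weaker planned service in the lower-income group.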

Why “planned service” and not real delays?

This site uses GTFS Schedule (timetables). It does not measure delays or cancellations. That requires GTFS‑Realtime.

Report (summary)

This section is designed to be readable. Numbers update automatically from /data/summary.json.

Transit deserts
  • Overall, Q1, and Q5
Coverage (weekday)
  • Wait measurable
  • Density measurable
Primary comparison: Bottom50 vs Top50
  • Systems analyzed
  • Median weekday wait gap
  • Median weekday density gap
Robustness check: Q1 vs Q5
  • Systems analyzed
  • Median weekday wait gap
  • Median weekday density gap
Key findings (plain English)
  1. National pattern (schedule-based): Lower-income neighborhoods do not consistently have worse planned service in this audit.
  2. Where equity risk concentrates: The clearest risk signal is transit deserts and coverage gaps, especially in the lowest-income quintile.
  3. How to use this: Prioritize “desert reduction” and coverage improvements. Treat “significant city” lists cautiously, because testing many systems at once will flag some by chance (multiple testing).

Findings and interpretation

Longer-form takeaways for each key finding.

Finding 1 — Not uniformly “low-income worse” for schedule-based supply

Many systems show equal or stronger scheduled service supply in lower-income neighborhoods (often reflecting dense corridors and core areas). Equity concerns still exist, but may concentrate in deserts, in specific outliers, or in reliability (not measured by static schedules).

Finding 2 — Desert risk is highest in the lowest-income quintile

Deserts are the easiest equity signal to communicate: “places with very weak or absent service.” Use this as the primary KPI for public reporting.

Finding 3 — Rankings should be paired with uncertainty

When many systems are tested, some will appear significant by chance. For public-facing lists, apply multiple-testing control and report confidence intervals.

Methods

Short, reproducible, and readable.

Data sources
Metrics
  • Scheduled wait: derived from scheduled headways in a defined time window (weekday/weekend).
  • Scheduled density: scheduled supply normalized by neighborhood area (trips/km²).
  • Transit deserts: rule-based flag for very weak/absent service.
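The three metrics above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the audit's implementation: it assumes wait is half the average headway in the window (a common convention for frequent service), and DESERT_MIN_TRIPS is a hypothetical threshold, since the audit's actual desert rule is not given here.

```python
# Hypothetical threshold: fewer scheduled trips than this in the window
# counts as "very weak or absent service". The real rule may differ.
DESERT_MIN_TRIPS = 4


def scheduled_wait_minutes(trips_in_window, window_minutes):
    """Half the average headway; None when no trips are scheduled."""
    if trips_in_window <= 0:
        return None  # wait is undefined: a coverage gap, not a zero
    headway = window_minutes / trips_in_window
    return headway / 2.0


def scheduled_density(trips_in_window, area_km2):
    """Scheduled trips per km² of neighborhood area."""
    if area_km2 <= 0:
        return None
    return trips_in_window / area_km2


def is_transit_desert(trips_in_window, min_trips=DESERT_MIN_TRIPS):
    """Rule-based flag for very weak or absent planned service."""
    return trips_in_window < min_trips


# Example: 12 scheduled trips in a 180-minute weekday window, 2 km² area.
print(scheduled_wait_minutes(12, 180))  # headway 15 min, wait 7.5 min
print(scheduled_density(12, 2.0))       # 6.0 trips/km²
print(is_transit_desert(12))            # False
```

Returning None for zero trips keeps "no service" distinct from "very short wait", so it can be reported as a coverage gap.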
Equity comparisons + statistics
  • Primary: Bottom50 vs Top50 (median split within system).
  • Robustness: Q1 vs Q5 (quintiles within system).
  • Effect: median gap (low − high). Wait gap > 0 = worse waits in low-income; density gap < 0 = worse density in low-income.
  • Inference: Mann–Whitney U p-values (two-sided) and bootstrap confidence intervals.
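As a rough illustration of the comparisons above, the sketch below splits one system's neighborhoods by income and computes a percentile bootstrap confidence interval for the median wait gap. The helper names, the sample data, and the choice of a percentile bootstrap with 2000 resamples are assumptions for illustration. The Mann–Whitney U test is not reproduced here.

```python
import random
from statistics import median


def split_by_income(neighborhoods, share):
    """Rank by income within one system; return (lowest share, highest share).

    share=0.5 approximates Bottom50 vs Top50; share=0.2 approximates Q1 vs Q5.
    """
    ranked = sorted(neighborhoods, key=lambda n: n["income"])
    k = int(len(ranked) * share)
    return ranked[:k], ranked[len(ranked) - k:]


def bootstrap_gap_ci(low_values, high_values, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for median(low) - median(high)."""
    rng = random.Random(seed)
    gaps = []
    for _ in range(n_boot):
        low_sample = rng.choices(low_values, k=len(low_values))
        high_sample = rng.choices(high_values, k=len(high_values))
        gaps.append(median(low_sample) - median(high_sample))
    gaps.sort()
    lo_idx = int((alpha / 2) * n_boot)
    hi_idx = min(n_boot - 1, int((1 - alpha / 2) * n_boot))
    return gaps[lo_idx], gaps[hi_idx]


# Hypothetical system: 10 neighborhoods with income and scheduled wait (min).
incomes = [30, 40, 50, 60, 70, 80, 90, 100, 110, 120]
waits = [12.0, 10.0, 11.0, 9.0, 10.0, 8.0, 9.0, 7.0, 8.0, 6.0]
system = [{"income": i, "wait": w} for i, w in zip(incomes, waits)]

low, high = split_by_income(system, 0.5)  # primary comparison
low_waits = [n["wait"] for n in low]
high_waits = [n["wait"] for n in high]

gap = median(low_waits) - median(high_waits)
ci_low, ci_high = bootstrap_gap_ci(low_waits, high_waits)
print(f"wait gap = {gap:+.1f} min, 95% CI [{ci_low:+.1f}, {ci_high:+.1f}]")

q1, q5 = split_by_income(system, 0.2)  # robustness comparison
```

A fixed seed keeps the interval reproducible between runs, which matters when the outputs are committed as static files.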
What this does not measure

This is a planned-service audit. It does not capture delays/cancellations/crowding. For that, add GTFS‑Realtime where available.

Data quality and coverage

Not every neighborhood has measurable scheduled wait time in the analysis window. This section reports coverage so results are interpreted responsibly.

Neighborhood coverage (weekday)
  • Neighborhoods with measurable scheduled wait
  • Neighborhoods with measurable scheduled density
System coverage (primary): Bottom50 vs Top50
  • Systems analyzed
  • Usable wait gaps (weekday)
  • Usable density gaps (weekday)
System coverage (robustness): Q1 vs Q5
  • Systems analyzed
  • Usable wait gaps (weekday)
  • Usable density gaps (weekday)
How to interpret missing wait
In schedule-based evaluation, missing wait often means zero scheduled trips during the analysis window. Treat it as a coverage gap.
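One way to report missing wait as a coverage gap is to count it explicitly instead of dropping it silently. The sketch below assumes missing waits are stored as None; the function name and sample values are hypothetical.

```python
def coverage_summary(waits):
    """Count neighborhoods with and without a measurable scheduled wait."""
    total = len(waits)
    measurable = sum(1 for w in waits if w is not None)
    return {
        "total": total,
        "measurable": measurable,
        "coverage_gaps": total - measurable,
        "share_measurable": measurable / total if total else None,
    }


# Hypothetical weekday waits (minutes); None = no scheduled trips in window.
weekday_waits = [7.5, None, 12.0, None, 5.0]
summary = coverage_summary(weekday_waits)
print(summary)
```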
Quality rules used in the audit
  • Minimum sample sizes are required per system and per income group so that statistics are robust.
  • “Report tables” prefer rows with complete, finite statistics (gap, CI, and p-value) to avoid NaN outputs.
  • For “significant city” lists, apply multiple-testing control (FDR) and require practical effect size thresholds.
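The last rule can be sketched with the Benjamini–Hochberg procedure, one standard way to control the false discovery rate. Whether the audit uses this exact procedure is an assumption, and MIN_ABS_WAIT_GAP, the system names, and the p-values below are hypothetical.

```python
# Hypothetical practical-effect threshold, in minutes of wait gap.
MIN_ABS_WAIT_GAP = 1.0


def benjamini_hochberg(p_values, q=0.05):
    """Return booleans: True where the test is rejected at FDR level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank whose p-value is under its stepped threshold.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= q * rank / m:
            max_rank = rank
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected


# Hypothetical per-system results: (name, wait gap in minutes, p-value).
results = [
    ("System A", 3.0, 0.001),
    ("System B", 0.4, 0.012),
    ("System C", -2.0, 0.030),
    ("System D", 5.0, 0.200),
]

flags = benjamini_hochberg([p for _, _, p in results], q=0.05)

# List a system only if it passes FDR control AND the effect is large enough.
listed = [
    name
    for (name, gap, _), significant in zip(results, flags)
    if significant and abs(gap) >= MIN_ABS_WAIT_GAP
]
print(listed)
```

Here System B passes the FDR step but is dropped for a small gap, and System D has a large gap but is not statistically distinguishable, so neither is listed.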

Policy implications and recommendations

Prioritized for actionability and credibility.

Recommendation 1 — Make desert reduction the primary equity KPI

Track desert rate overall and by income group; prioritize where the lowest-income neighborhoods have the highest desert burden.
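Tracking this KPI amounts to a grouped share. The sketch below assumes each neighborhood record carries an income group label and a boolean desert flag; the field names and sample records are hypothetical.

```python
from collections import defaultdict


def desert_rates(neighborhoods):
    """Share of neighborhoods flagged as deserts, overall and per group."""
    counts = defaultdict(lambda: [0, 0])  # group -> [deserts, total]
    for n in neighborhoods:
        for key in ("overall", n["income_group"]):
            counts[key][0] += int(n["is_desert"])
            counts[key][1] += 1
    return {group: d / t for group, (d, t) in counts.items()}


# Hypothetical records: 4 in Q1 (2 deserts), 2 in Q3 (0), 4 in Q5 (1).
records = (
    [{"income_group": "Q1", "is_desert": flag} for flag in (True, True, False, False)]
    + [{"income_group": "Q3", "is_desert": flag} for flag in (False, False)]
    + [{"income_group": "Q5", "is_desert": flag} for flag in (True, False, False, False)]
)

rates = desert_rates(records)
for group in ("overall", "Q1", "Q5"):
    print(f"{group}: {rates[group]:.0%}")
```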

Recommendation 2 — Treat “no service” as an outcome

Where there are zero scheduled trips in the time window, wait is undefined—treat this as a coverage gap and report it explicitly.

Recommendation 3 — Add reliability when feasible

When GTFS‑Realtime is available, measure delay/cancellation patterns to capture reliability inequity.

Recommendation 4 — Use multiple-testing control for “significant city” lists

Apply FDR and require minimum practical effect sizes before listing systems publicly.

Limitations and ethics

Limitations
  • Schedule ≠ reliability: planned service only.
  • Wait assumptions: headway-based scheduled wait is most interpretable for frequent service.
  • Geography: census tracts (CT) and dissemination areas (DA) differ in size and coverage.
  • System vs city: GTFS feeds may span multiple municipalities.
Ethics and responsible communication
  • Privacy: aggregated neighborhood statistics only; no personal data.
  • Avoid stigma: frame as service gaps and investment opportunities.
  • Accessibility: plain language + readable tables; avoid color-only meaning.
  • Licensing: respect open-data licenses and provide attribution.

Province summary

Medians of system gaps by province.

City/system results

Wait gap = median(wait_low) − median(wait_high) (minutes). Positive = worse waits in lower-income. Density gap = median(density_low) − median(density_high) (trips/km²). Negative = worse density in lower-income.

Sources, licensing, and attribution

This project uses open data. You can reuse these results, but you should keep attribution and respect each source’s license terms.

Primary sources
  • GTFS Schedule feeds (planned service): stops, trips, stop times, calendars. Reference: GTFS Schedule.
  • Statistics Canada — Census Profile (2021) for neighborhood income and geography-level characteristics. Terms: Statistics Canada Open Licence.
Attribution template (copy/paste)

“This analysis uses GTFS Schedule feeds published by Canadian transit agencies (see agency portals for license details) and Statistics Canada Census Profile data (2021) licensed under the Statistics Canada Open Licence. Analysis and visualizations: NTERA (ntera.ca).”

Important notes about transit feed licenses

GTFS feeds are usually distributed via municipal or provincial open-data portals. License terms can differ by agency. If you publish a downstream dataset, include a short “Data & Licensing” section and link to each portal’s license page when possible.

Data freshness

Income values come from the 2021 Census Profile (income year 2020). GTFS feeds represent the schedules available at the time they were collected. If you refresh GTFS inputs, regenerate the outputs and replace the files in /data.

Reproducibility and updates

This site is fully static. To update results, replace the files in /data and commit.

How to update the site
  1. Regenerate outputs and export updated CSVs.
  2. Replace the CSVs in /data (same filenames).
  3. Commit and push — GitHub Pages updates automatically.
View code / repository

https://github.com/jasonpark26/ntera

Glossary

Wait gap

Median scheduled wait (low-income) − median scheduled wait (high-income), minutes. Positive = worse waits in low-income areas.

Density gap

Median trips/km² (low-income) − median trips/km² (high-income). Negative = worse density in low-income areas.

Transit desert

A rule-based flag for neighborhoods with very weak or absent planned service in the analysis window.

Planned service vs reliability

GTFS Schedule measures planned service. Reliability (delays/cancellations) requires GTFS‑Realtime or agency performance data.

FAQ

Why is wait missing for some neighborhoods?

Most commonly: no scheduled trips serve that neighborhood in the analysis window. Treat it as a coverage gap.

Does this measure delays/cancellations?

No. This is planned-service (GTFS Schedule). Reliability requires GTFS‑Realtime where available.

Why can higher-income areas look underserved?

Many higher-income neighborhoods are low-density suburbs with lower frequency and fewer routes.

Changelog

Recent updates
  • v5: Added Data Quality/Coverage, Last Updated timestamp, Reproducibility, Glossary, SEO tags, and a reliability disclaimer banner.
  • v4: Added Sources/Licensing and “How to read” definitions.