BusTr: Predicting Bus Travel Times from Real-Time Traffic

KDD '20: The 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining Virtual Event CA USA July, 2020, pp. 3243-3251, 2020.

Keywords:
transit system, multilayer perceptron, spatial input ablation, bus arrival time prediction, traffic congestion
Weibo:
We have described a new model, BusTr, for predicting how long it will take public transit buses to travel between points on their routes based on contextual features such as location and time as well as estimates of current traffic conditions.

Abstract:

We present BusTr, a machine-learned model for translating road traffic forecasts into predictions of bus delays, used by Google Maps to serve the majority of the world's public transit systems where no official real-time bus tracking is provided. We demonstrate that our neural sequence model improves over DeepTTE, the state-of-the-art bas...

Introduction
  • The authors present BusTr, a real-time delay forecasting system for public buses, which is used by Google Maps to expand the availability of real-time data for transit users around the world [9].

    Public transit systems are vital to human mobility in the rapidly urbanizing world.
  • A transit user wants to know (1) what the transit system is supposed to do: the system’s routes, stops, and schedules, and (2) what the transit system is doing right now: the current locations and delays of the transit trips, which often deviate significantly from published schedules [40].
  • Of these two modalities, the real-time state is disproportionately important for the routine trips that dominate most people’s transportation needs.
  • Transit variability is a source of rider anxiety and a barrier to increasing ridership [4, 5, 10, 34, 39, 42], and users place significant value on commute time reliability [21].
Highlights
  • We present BusTr, a real-time delay forecasting system for public buses, which is used by Google Maps to expand the availability of real-time data for transit users around the world [9]
  • Public transit systems are vital to human mobility in our rapidly urbanizing world
  • Transit variability is a source of rider anxiety and a barrier to increasing ridership [4, 5, 10, 34, 39, 42], and users place significant value on commute time reliability [21]
  • We have described a new model, BusTr, for predicting how long it will take public transit buses to travel between points on their routes based on contextual features such as location and time as well as estimates of current traffic conditions
  • Our model demonstrates excellent generalization to test data that differs both spatially and temporally from the training examples we use, allowing our model to cope gracefully with the ever-changing world
Methods
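  • The authors adopt per-shingle MAPE (mean absolute percentage error) as the target metric. A review of the ETA prediction literature [27] notes that inconsistencies in reporting standards prevent the inter-comparison of approaches. MAPE, they report, is the most common metric used by the studies reviewed (13 of 40 studies), and is thus the authors' choice here, too.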

    [Table 2: test MAPE by training step count (25K, 50K, 100K, 200K, 400K), with p-values]

    Except where otherwise stated, all the experiments train a model 20 times and evaluate its performance on 100,000 examples sampled from the test data set. The test data comes from a week of calendar data that is not used during training or validation.

    The authors show the mean and standard deviation of per-run test MAPEs.
  • This approach, tested on 20 trials of 100k examples each from the test dataset, produces a mean MAPE of 35.616.
  • Another natural baseline is a linear regression of the trajectory-scale features, without context, using just three per-trajectory numerical features: number of stops, distance traversed, and car traffic time estimates.
  • With 20 trials of linear regression trained on disjoint data slices from the training week, evaluated on 100k disjoint slices of the test week, the per-trial mean MAPE is 22.944.
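The evaluation protocol described above (MAPE per run, then mean and standard deviation over 20 runs) and the three-feature linear regression baseline can be illustrated with a small sketch. Everything here is an assumption made for illustration: the helper names (`mape`, `fit_linear`, `synthetic_trip`, `run_trial`) are hypothetical, the data is synthetic and invented, and ordinary least squares is only one plausible way to fit such a baseline; the source does not say how the regression was fitted.

```python
import random
import statistics


def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(p - a) / abs(a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def fit_linear(rows, targets):
    """Ordinary least squares with an intercept, via the normal equations."""
    xs = [[1.0] + list(r) for r in rows]  # prepend the bias term
    k = len(xs[0])
    # Augmented matrix [X^T X | X^T y].
    aug = [[sum(x[i] * x[j] for x in xs) for j in range(k)]
           + [sum(x[i] * y for x, y in zip(xs, targets))] for i in range(k)]
    for col in range(k):  # Gauss-Jordan elimination with partial pivoting
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]


def predict(weights, row):
    return weights[0] + sum(w * x for w, x in zip(weights[1:], row))


def synthetic_trip(rng):
    """Invented trajectory: (stops, distance_km, car_time_s) -> bus_time_s."""
    stops = rng.randint(2, 15)
    distance = rng.uniform(0.5, 8.0)
    car_time = distance * rng.uniform(90.0, 200.0)
    bus_time = (60.0 + 1.3 * car_time + 25.0 * stops
                + 10.0 * distance + rng.gauss(0.0, 30.0))
    return (stops, distance, car_time), bus_time


def run_trial(seed, n_train=500, n_test=500):
    """Fit on one synthetic slice, report MAPE on a disjoint one."""
    rng = random.Random(seed)
    train = [synthetic_trip(rng) for _ in range(n_train)]
    test = [synthetic_trip(rng) for _ in range(n_test)]
    weights = fit_linear([f for f, _ in train], [t for _, t in train])
    return mape([t for _, t in test], [predict(weights, f) for f, _ in test])


per_run = [run_trial(seed) for seed in range(20)]
print(f"MAPE over 20 runs: mean={statistics.mean(per_run):.3f}, "
      f"std={statistics.stdev(per_run):.3f}")
```

The numbers this prints describe only the synthetic generator and are not comparable to the MAPEs reported above.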
Conclusion
  • Acknowledgements: The authors thank Cayden Meyer for directing them toward this problem space; Da-Cheng Juan for his ML modeling insights; and Neha Arora, Anthony Bertuca, Matt Deeds, Julian Gibbons, Reuben Kan, Ivan Kuznetsov, Oliver Lange, David Lattimore, Thierry Le Boulenge, Ramesh Nagarajan, Marc Nunkesser, Anatoli Plotnikov, Ivan Volosyuk, and the greater Google Transit and Road Traffic teams for support, helpful discussions, and assistance with bringing this system to the world at large.
  • The authors are indebted to the partner agencies for providing the GTFS transit data feeds the system is trained on.
  • The authors have described a new model, BusTr, for predicting how long it will take public transit buses to travel between points on their routes based on contextual features such as location and time as well as estimates of current traffic conditions.
  • The authors' model demonstrates excellent generalization to test data that differs both spatially and temporally from the training examples the authors use, allowing the model to cope gracefully with the ever-changing world.
Tables
  • Table 1: Model hyperparameters. Vizier was used to select the red ones over the black ones where given; others were set manually
  • Table 2: Test MAPE by training step count and p-value for 100K being optimal, with one-tailed t-test with 4× Bonferroni correction
  • Table 3: BusTr vs DeepTTE, tuned for 10K-step training. We substantially outperform DeepTTE, even if we discard the runs where DeepTTE MAPEs don’t converge. One-tailed t-test p-values are for losses compared to BusTr
  • Table 4: Feature ablations, with one-tailed t-test p-values for losses compared to the full model, over n = 20 runs
  • Table 5: Effect of generalization features on test data soon after training, and on novel data over 9-week span. Test MAPE with standard deviation in parentheses. One-tailed t-test p-values given where the full model’s mean improves over the ablation (n=20 trials); ∗ - trials where the system under test outperformed the full model
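The table captions rely on one-tailed t-tests over per-run MAPEs, in one case with a 4× Bonferroni correction. The sketch below shows how such a comparison could be computed with only the Python standard library. It is an illustration under stated assumptions: the captions do not say which t-test variant was used, so Welch's unequal-variance form is an assumption; the helper names (`t_sf`, `one_tailed_welch`, `bonferroni`) are hypothetical; and the MAPE lists are invented numbers, not results from the paper.

```python
import math
import statistics


def t_sf(t, df, steps=20000):
    """Upper-tail probability P(T > t) of Student's t, via Simpson's rule."""
    coef = math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2))
    coef /= math.sqrt(df * math.pi)

    def pdf(x):
        return coef * (1 + x * x / df) ** (-(df + 1) / 2)

    upper = abs(t)
    h = upper / steps  # steps is even, as Simpson's rule requires
    area = pdf(0.0) + pdf(upper)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * pdf(i * h)
    area *= h / 3  # integral of the density from 0 to |t|
    tail = max(0.0, 0.5 - area)
    return tail if t >= 0 else 1.0 - tail


def one_tailed_welch(worse, better):
    """p-value for H1: mean(worse) > mean(better), Welch's t-test (assumed)."""
    n1, n2 = len(worse), len(better)
    v1 = statistics.variance(worse) / n1
    v2 = statistics.variance(better) / n2
    t = (statistics.mean(worse) - statistics.mean(better)) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t_sf(t, df)


def bonferroni(p_values):
    """Multiply each p-value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Invented per-run test MAPEs for a reference setting and four alternatives.
reference = [18.1, 18.3, 18.0, 18.2, 18.4, 18.1]
alternatives = {
    "A": [18.9, 19.1, 18.8, 19.0, 19.2, 18.7],
    "B": [18.3, 18.2, 18.5, 18.1, 18.4, 18.3],
    "C": [18.2, 18.0, 18.3, 18.1, 18.4, 18.2],
    "D": [18.6, 18.8, 18.5, 18.9, 18.7, 18.6],
}
raw = [one_tailed_welch(runs, reference) for runs in alternatives.values()]
for name, p in zip(alternatives, bonferroni(raw)):
    print(f"{name}: corrected one-tailed p = {p:.3g}")
```

Because the tail is computed as 0.5 minus a numerically integrated area, extremely small p-values (far below about 1e-12) are not resolved reliably by this sketch; a statistics library would be the better choice in practice.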
Reference
  • [1] Anne Aguilera and Jean Grebert. Passenger transport mode share in cities: exploration of actual and future trends with a worldwide survey. International Journal of Automotive Technology and Management, 14(3-4):203–216, 2014.
  • [2] Michael L Anderson. Subways, strikes, and slowdowns: The impacts of public transit on traffic congestion. American Economic Review, 104(9):2763–96, 2014.
  • [3] Richard Barnes. Optimal orientations of discrete global grids and the poles of inaccessibility. International Journal of Digital Earth, 0(0):1–14, 2019.
  • [4] Candace Brakewood, Sean Barbeau, and Kari Watkins. An experiment evaluating the impacts of real-time transit information on bus riders in Tampa, Florida. Transportation Research Part A: Policy and Practice, 69:409–422, 2014.
  • [5] Sandip Chakrabarti and Genevieve Giuliano. Does service reliability determine transit patronage? Insights from the Los Angeles Metro bus system. Transport Policy, 42:12–20, 2015. ISSN 0967-070X. URL http://www.sciencedirect.com/science/article/pii/S0967070X15300068.
  • [6] Mei Chen, Jason Yaw, Steven I. Chien, and Xiaobo Liu. Using automatic passenger counter data in bus arrival time prediction. Journal of Advanced Transportation, 41(3):267–283, 2007. URL https://onlinelibrary.wiley.com/doi/abs/10.1002/atr.5670410304.
  • [7] Raj Chetty and Nathaniel Hendren. The impacts of neighborhoods on intergenerational mobility I: Childhood exposure effects. The Quarterly Journal of Economics, 133(3):1107–1162, 2018.
  • [8] B. Dhivyabharathi, B. Anil Kumar, Avinash Achar, and Lelitha Vanajakshi. Bus travel time prediction: A lognormal autoregressive (AR) modeling approach. arXiv: 1904.03444, 2019.
  • [9] Alex Fabrikant. Google AI Blog, 2019. https://ai.googleblog.com/2019/06/
  • [10] Brian Ferris, Kari Watkins, and Alan Borning. OneBusAway: results from providing real-time arrival information for public transit. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pages 1807–1816. ACM, 2010.
  • [11] Daniel Golovin, Benjamin Solnik, Subhodeep Moitra, Greg Kochanski, John Karro, and D. Sculley. Google Vizier: A Service for Black-Box Optimization. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining - KDD ’17, pages 1487–1495, Halifax, NS, Canada, 2017. ACM Press. ISBN 978-1-4503-4887-4.
  • [12] gtfs-realtime/reference/, 2020.
  • [13] GTFS. GTFS static overview. https://developers.google.com/transit/gtfs, 2020.
  • [14] M. Amac Guvensan, Burak Dusun, Baris Can, and H. Irem Turkmen. A novel segment-based approach for improving classification performance of transport mode detection. Sensors, 18(1), 2018.
  • [15] Cristina Heghedus. PhD Forum: Forecasting Public Transit Using Neural Network Models. In 2017 IEEE International Conference on Smart Computing (SMARTCOMP), pages 1–2, Hong Kong, China, May 2017. IEEE. ISBN 978-1-5090-6517-2.
  • [16] Cristina Heghedus, Antorweep Chakravorty, and Chunming Rong. Neural Network Frameworks. Comparison on Public Transportation Prediction. In 2019 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pages 842–849, Rio de Janeiro, Brazil, May 2019. IEEE. ISBN 978-172813-510-6.
  • [17] IPCC. Climate Change 2014: Mitigation of Climate Change. Cambridge University Press, 2014. ISBN 978-1-107-05821-7.
  • [18] Ranhee Jeong and R Rilett. Bus arrival time prediction using artificial neural network model. In Proceedings. The 7th International IEEE Conference on Intelligent Transportation Systems (IEEE Cat. No. 04TH8749), pages 988–993. IEEE, 2004.
  • [19] Nikolas Julio, Ricardo Giesen, and Pedro Lizana. Real-time prediction of bus travel speeds using traffic shockwaves and machine learning algorithms. Research in Transportation Economics, 59:250–257, 2016. ISSN 0739-8859. Competition and Ownership in Land Passenger Transport (selected papers from the Thredbo 14 conference).
  • [20] Diederik P. Kingma and Jimmy Ba. Adam: A method for stochastic optimization, 2014.
  • [21] Terence C. Lam and Kenneth A. Small. The value of time and reliability: measurement from a value pricing experiment. Transportation Research Part E: Logistics and Transportation Review, 37(2):231–251, 2001. ISSN 1366-5545. Advances in the Valuation of Travel Time Savings.
  • [22] Ehsan Mazloumi, Geoff Rose, Graham Currie, and Majid Sarvi. An integrated framework to predict bus travel time and its variability using traffic flow data. Journal of Intelligent Transportation Systems, 15(2):75–90, 2011.
  • [23] Claire McKnight, Herbert Levinson, Kaan Ozbay, Camille Kamga, and Robert Paaswell. Impact of traffic congestion on bus travel time in northern New Jersey. Transportation Research Record, 1884:27–35, 01 2004.
  • [24] Daniel L Mendoza, Martin P Buchert, and John C Lin. Modeling net effects of transit operations on vehicle miles traveled, fuel consumption, carbon dioxide, and criteria air pollutant emissions in a mid-size US metro area: findings from Salt Lake City, UT. Environmental Research Communications, 1(9):091002, Sep 2019.
  • [25] Georg Osang, James Cook, Alex Fabrikant, and Marco Gruteser. Livetravel: Real-time matching of transit vehicle trajectories to transit routes at scale. In Proceedings of 2019 IEEE ITSC, pages 2244–2251, 2019.
  • [26] Rahul Pathak, Christopher K. Wyczalkowski, and Xi Huang. Public transit access and the changing spatial distribution of poverty. Regional Science and Urban Economics, 66:198–212, 2017. ISSN 0166-0462.
  • [27] Thilo Reich, Marcin Budka, Derek Robbins, and David Hulbert. Survey of ETA prediction methods in public transport networks. arXiv: 1904.05037, 2019.
  • [28] Kevin Sahr, Denis White, and A. Jon Kimerling. Geodesic discrete global grid systems. Cartography and Geographic Information Science, 30(2):121–134, 2003.
  • [29] G. Salvo, G. Amato, and Pietro Zito. Bus speed estimation by neural networks to improve the automatic fleet management. European Transport, 37:93–104, 2007.
  • [30] Benjamin Solnik, Daniel Golovin, Greg Kochanski, John Elliot Karro, Subhodeep Moitra, and D. Sculley. Bayesian optimization for a better dessert. In Proceedings of the 2017 NIPS Workshop on Bayesian Optimization, December 9, 2017, Long Beach, USA, 2017.
  • [31] F. Sun, Y. Pan, J. White, and A. Dubey. Real-time and predictive analytics for smart public transportation decision support system. In 2016 IEEE International Conference on Smart Computing (SMARTCOMP), May 2016.
  • [32] Yidan Sun, Guiyuan Jiang, Siew-Kei Lam, Shicheng Chen, and Peilan He. Bus Travel Speed Prediction using Attention Network of Heterogeneous Correlation Features. In Proceedings of ICDM. Society for Industrial and Applied Mathematics, May 2019. ISBN 978-1-61197-567-3. URL https://epubs.siam.org/doi/book/10.1137/1.9781611975673.
  • [33] Transit App. “how we mapped the world’s weirdest streets”, 2015. URL https://medium.com/transit-app/hello-nairobi-cc27bb5a73b7.
  • [34] Transit Center. Who’s on board. Technical report, Transit Center, 2016. URL http://transitcenter.org/wp-content/uploads/2016/07/Whos-On-Board-2016-7 12 2016.pdf.
  • [35] W. Treethidtaphat, W. Pattara-Atikom, and S. Khaimook. Bus arrival time prediction at any distance of bus route using deep neural network model. In 2017 IEEE 20th International Conference on Intelligent Transportation Systems (ITSC), pages 988–992, Oct 2017.
  • [36] William Vincent and Lisa Callaghan Jerram. The potential for bus rapid transit to reduce transportation-related CO2 emissions. Journal of Public Transportation, 9(3):12, 2006.
  • [37] Jiafu Wan, Jianqi Liu, Zehui Shao, Athanasios V. Vasilakos, Muhammad Imran, and Keliang Zhou. Mobile crowd sensing for traffic prediction in internet of vehicles. Sensors (Basel), 16(1), 2016.
  • [38] Dong Wang, Junbo Zhang, Wei Cao, Jian Li, and Yu Zheng. When will you arrive? Estimating travel time based on deep neural networks. In Thirty-Second AAAI Conference on Artificial Intelligence, 2018.
  • [39] Kari Edison Watkins, Brian Ferris, Alan Borning, G Scott Rutherford, and David Layton. Where is my bus? Impact of mobile realtime information on the perceived and actual wait time of transit riders. Transportation Research Part A: Policy and Practice, 45(8):839–848, 2011.
  • [40] Nate Wessel, Jeff Allen, and Steven Farber. Constructing a routable retrospective transit timetable from a real-time vehicle location feed and GTFS. Journal of Transport Geography, 62:92–97, 2017.
  • [41] Haitao Xu and Jing Ying. Bus arrival time prediction with realtime and historic data. Cluster Computing, 20(4):3099–3106, December 2017. ISSN 1573-7543.
  • [42] Feng Zhang, Qing Shen, and Kelly J. Clifton. Examination of traveler responses to real-time information about bus arrivals using panel data. Transportation Research Record, 2082(1):107–115, 2008.
  • [43] Chang-Jiang Zheng, Yi-Hua Zhang, and Xue-Jun Feng. Improved iterative prediction for multiple stop arrival time using a support vector machine. Transport, 27(2):158–164, 2012.