Geospatially Partitioning Public Transit Networks for Open Data Publishing
Keywords:Linked Data, Open Data, Mobility, Maintainability, Web API engineering
Public transit operators often publish their open data in a data dump, but developers with limited computational resources may not have the means to process all this data efficiently. In our prior work we have shown that geospatially partitioning an operator’s network can improve query times for client-side route planning applications by a factor of 2.4. However, it remains unclear whether this works for all network types, or other kinds of applications. To answer these questions, we must evaluate the same method on more networks and analyze the effect of geospatial partitioning on each network separately. In this paper we process three networks in Belgium: (i) the national railways, (ii) the regional operator in Flanders, and (iii) the network of the city of Brussels, using both real and artificially generated query sets. Our findings show that on the regional network, we can make query processing 4 times more efficient, but we could not improve the performance over the city network by more than 12%. Both the network’s topography, and to a lesser extent how users interact with the network, determine how suitable the network is for partitioning. Thus, we come to a negative answer to our question: our method does not work equally well for all networks. Moreover, since the network’s topography is the main determining factor, we expect this finding to apply to other graph-based geospatial data, as well as other Link Traversal-based applications.
T. Berners-Lee, “Linked data-design issues,” 2006. [Online]. Available: https://www.w3.org/DesignIssues/LinkedData.html.
P. Colpaert, A. Llaves, R. Verborgh, O. Corcho, E. Mannens and R. Van de Walle, “Intermodal public transit routing using Liked Connections,” in International Semantic Web Conference: Posters and Demos, 2015.
H. Delva, J. A. Rojas, P.-J. Vandenberghe, P. Colpaert and R. Verborgh, “Geospatial Partitioning of Open Transit Data.,” in International Conference on Web Engineering, Helsinki, 2020.
E. Miller, “An introduction to the resource description framework,” Bulletin of the American Society for Information Science and Technology, vol. 25, pp. 15–19, 1998.
R. Verborgh, O. Hartig, B. De Meester, G. Haesendonck, L. De Vocht, M. Vander Sande, R. Cyganiak, P. Colpaert, E. Mannens and R. Van de Walle, “Querying datasets on the web with high availability,” in International Semantic Web Conference, 2014.
M. Lanthaler and C. Gutl, “Hydra: A Vocabulary for Hypermedia-Driven Web APIs.,” LDOW, vol. 996, 2013.
O. Hartig, C. Bizer and J.-C. Freytag, “Executing SPARQL queries over the web of linked data,” in International Semantic Web Conference, 2009.
O. Hartig, “Zero-knowledge query planning for an iterator implementation of link traversal based query execution,” in Extended Semantic Web Conference, 2011.
O. Hartig and M. T. Ozsu, Walking without a map: optimizing response times of traversal-based linked data queries, 2016.
A. Antrim and S. J. Barbeau, “The many uses of GTFS data - opening the door to transit and multimodal applications,” Location-Aware Information Systems Laboratory at the University of South Florida, 2013.
A. Berger, D. Delling, A. Gebhardt and M. Müller-Hannemann, “Accelerating time-dependent multi-criteria timetable information is harder than expected,” in 9th Workshop on Algorithmic Approaches for Transportation Modeling, Optimization, and Systems (ATMOS’09), 2009.
R. Bauer, D. Delling and D. Wagner, “Experimental study of speed up techniques for timetable information systems,” Networks, 2011.
D. Delling, J. Dibbelt, T. Pajor and T. Zündorf, “Faster transit routing by hyper partitioning,” in 17th Workshop on Algorithmic Approaches for Transportation Modelling, Optimization, and Systems (ATMOS 2017), 2017.
H. Bast, M. Hertel and S. Storandt, “Scalable transfer patterns,” in 2016 Proceedings of the Eighteenth Workshop on Algorithm Engineering and Experiments (ALENEX), 2016.
H. Bast, E. Carlsson, A. Eigenwillig, R. Geisberger, C. Harrelson, V. Raychev and F. Viger, “Fast routing in very large public transportation networks using transfer patterns,” in European Symposium on Algorithms, 2010.
G. Karypis and V. Kumar, “A fast and high quality multilevel scheme for partitioning irregular graphs,” SIAM Journal on scientific Computing, vol. 20, pp. 359–392, 1998.
D. Delling, A. V. Goldberg, I. Razenshteyn and R. F. Werneck, “Graph partitioning with natural cuts,” in 2011 IEEE International Parallel & Distributed Processing Symposium, 2011.
R. Battle and D. Kolas, “Geosparql: enabling a geospatial semantic web,” Semantic Web Journal, vol. 3, pp. 355–370, 2011.
J. Dibbelt, T. Pajor, B. Strasser and D. Wagner, “Connection Scan Algorithm,” 2017.
H. H. Wickham, K. Kafadar and Hadley, “Letter-value plots: Boxplots for large data,” 2011.