Benchmarking Web API Quality – Revisited


  • David Bermbach TU Berlin & Einstein Center Digital Future, Mobile Cloud Computing Research Group, Berlin, Germany
  • Erik Wittern IBM, Hybrid Cloud Integration, Hamburg, Germany



Web APIs, Benchmarking, Quality of Service


Modern applications increasingly interact with web APIs – reusable components, deployed and operated outside the application, and accessed over the network. Their existence, arguably, spurs application innovations, making it easy to integrate data or functionalities. While previous work has analyzed the ecosystem of web APIs and their design, little is known about web API quality at runtime. This gap is critical, as qualities including availability, latency, or provider security preferences can severely impact applications and user experience.

In this paper, we revisit a 3-month, geo-distributed benchmark of popular web APIs, originally performed in 2015. We repeat this benchmark in 2018 and compare results from these two benchmarks regarding availability and latency. We furthermore introduce new results from assessing provider security preferences, collected both in 2015 and 2018, and results from our attempts to reach out to API providers with the results from our 2015 experiments. Our extensive experiments show that web API qualities vary (1) based on the geo-distribution of clients, (2) during our individual experiments, and (3) between the two experiments. Our findings provide evidence to foster the discussion around web API quality, and can act as a basis for the creation of tools and approaches to mitigate quality issues.



Author Biographies

David Bermbach, TU Berlin & Einstein Center Digital Future, Mobile Cloud Computing Research Group, Berlin, Germany

David Bermbach is an Assistant Professor at TU Berlin and is heading the Mobile Cloud Computing research group at the Einstein Center Digital Future in Berlin, Germany. In his research, he focuses on benchmarking as well as platforms and applications for cloud, edge, and fog computing. He holds a PhD in computer science and a diploma in business engineering, both from Karlsruhe Institute of Technology.

Erik Wittern, IBM, Hybrid Cloud Integration, Hamburg, Germany

Erik Wittern is IBM’s GraphQL Lead Architect and works on bringing GraphQL support to IBM’s API Management products. Prior to his current role, Erik spent five years as a Research Staff Member at the IBM T.J. Watson Research Center in New York. His research in the field of software engineering focuses on web APIs, their discovery and use, and the evolution of new API models like GraphQL. Erik holds a PhD in computer science from Karlsruhe Institute of Technology.

