Leveraging Conceptual Data Models to Ensure the Integrity of Cassandra Databases
The use of NoSQL databases in cloud environments has been increasing because of their performance advantages when working with big data. One of the most popular NoSQL databases used for cloud services is Cassandra, in which each table is designed to satisfy a single query. Because the same data may be retrieved by several queries, these data may be repeated across several tables. The integrity of the repeated data must be maintained by the application that works with the database, rather than by the database itself as in relational databases. In this paper, we propose a method to ensure data integrity when data are modified. The method uses a conceptual model that is directly connected to the logical model representing the Cassandra tables. It identifies which tables are affected by a modification of the data and proposes how the data integrity of the database may be ensured. We detail the process of this method with two examples in which we apply it to insertions of tuples in a conceptual model. We also apply the method to a case study in which we insert several tuples in the conceptual model, and we discuss the results. We observed that, in most cases, ensuring data integrity requires several insertions, as well as looking up values already stored in the database.
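The idea of using the conceptual model to find the tables affected by an insertion can be illustrated with a minimal Python sketch. It is a simplified illustration and not the method proposed in the paper: the table names, the column-to-attribute mapping, and the functions `affected_tables` and `missing_values` are all hypothetical. The sketch assumes that each column of a Cassandra table is mapped to one attribute of the conceptual model, and that a table is affected when at least one of its columns maps to an attribute of the inserted tuple.

```python
from dataclasses import dataclass


@dataclass
class Table:
    """A Cassandra table in the logical model (hypothetical representation)."""
    name: str
    # Maps each column name to the conceptual attribute it stores,
    # written as "Entity.attribute".
    columns: dict


def affected_tables(tables, inserted_attrs):
    """Return the names of the tables that store any inserted attribute."""
    return [
        t.name
        for t in tables
        if any(attr in inserted_attrs for attr in t.columns.values())
    ]


def missing_values(table, inserted_attrs):
    """Return the columns whose values the inserted tuple does not supply.

    These values would have to be looked up in the database before
    a complete row can be inserted into the table.
    """
    return sorted(
        col for col, attr in table.columns.items() if attr not in inserted_attrs
    )


# Hypothetical logical model: each table serves one query.
TABLES = [
    Table("authors", {"author_id": "Author.id", "author_name": "Author.name"}),
    Table("books", {"book_id": "Book.id", "book_title": "Book.title"}),
    Table(
        "books_by_author",
        {
            "author_id": "Author.id",
            "author_name": "Author.name",
            "book_id": "Book.id",
            "book_title": "Book.title",
        },
    ),
]

# Insertion of a tuple of the entity Book in the conceptual model.
inserted = {"Book.id", "Book.title"}

for name in affected_tables(TABLES, inserted):
    table = next(t for t in TABLES if t.name == name)
    print(name, "needs lookups for:", missing_values(table, inserted))
```

In this sketch, inserting a Book tuple affects both `books` and `books_by_author`. The second table also stores author data, so its missing values would have to be retrieved from the database before the insertion, which corresponds to the lookups mentioned in the abstract.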