Transcript

Lakehead University
NoSQL Databases
by Abed Alkhateeb, 2023

NoSQL
• NoSQL, originally meaning "non-SQL" or "non-relational", refers to a database that provides a mechanism for storing and retrieving data.
• The data is modeled by means other than the tabular relations used in relational databases.
• NoSQL databases are used in real-time web applications and big data, and their use is increasing over time.
• NoSQL systems are also sometimes called "Not only SQL" to emphasize that they may support SQL-like query languages.

NoSQL Database
• A NoSQL database offers simplicity of design, simpler horizontal scaling to clusters of machines, and finer control over availability.
• The data structures used by NoSQL databases differ from those used by default in relational databases, which makes some operations faster in NoSQL.
• The suitability of a given NoSQL database depends on the problem it should solve.
• Data structures used by NoSQL databases are sometimes also viewed as more flexible than relational database tables.

NoSQL
[slide content not captured in the transcript]

What is / is not a NoSQL database?
• NoSQL does not mean the absence of SQL; instead it means Not Only SQL. In other words, we are not limited to a single option for storing and retrieving data.
• As its name implies, there is no fixed definition of what NoSQL is or does, but the following concepts can help us understand what NoSQL is (and isn't):
  • No relational model
  • Data can be clustered
  • Most solutions are open source
  • Built for the Web
  • Lack of a schema
  • The connections between data are growing, which calls for an architecture that can handle them.

Taxonomy of NoSQL
• Key-value
• Graph database
• Document-oriented

Why NoSQL?
• Since everyone is on the Web these days, it doesn't make a lot of sense to keep the database separate from the application servers.
• Think about sites like Facebook: their data storage needs are completely different from those of a large organization's administrative systems.
• The relational model does not work as well for data at that scale. In part, this is because the relational model is fairly rigid, so changing an existing structure can be time-consuming.

Why NoSQL?
• Structured data is arranged in neat rows and columns and can be very tidy, even when it is huge and complex.
• Unstructured data, on the other hand, isn't organized; it doesn't follow the strict patterns found in relational databases.
• Relational databases can also have very large tables, leaving them with records bloated by huge numbers of columns.
• NoSQL, as the visuals in this lesson will show, isn't tied to this rigid nature. It offers things like a clustered approach that can actually improve performance.

SQL vs NoSQL

SQL
• Relational database management system (RDBMS)
• Fixed, static, or predefined schema
• Not suited for hierarchical data storage
• Best suited for complex queries
• Vertically scalable

NoSQL
• Non-relational or distributed database system
• Dynamic schema
• Best suited for hierarchical data storage
• Not so good for complex queries
• Horizontally scalable

Aggregate Data Models
Relational Data Models
Limitations of NoSQL
[the content of these slides was not captured in the transcript]

MongoDB
• MongoDB is an open source NoSQL database.
• It stores data in JSON-like documents.
• Unlike a relational database, the data model is flexible, using documents (in JSON format).
• It provides high performance, high availability, and automatic scaling, since it is a distributed database.

MongoDB
• Since they are distributed, NoSQL databases like MongoDB often compromise consistency in favor of availability and partition tolerance.
• In the NoSQL world, "eventual consistency" is often used to achieve speed and scalability.
• MongoDB is a cross-platform, document-oriented database that provides high performance, high availability, and easy scalability. MongoDB works on the concepts of collection and document.

MongoDB Design
• A database is a physical container for collections. Each database gets its own set of files on the file system.
• A single MongoDB server typically has multiple databases.
• MongoDB uses name-value pairs, or fields, which make up documents, which in turn make up collections.

MongoDB Data Structure Organization
Create Database
[the content of these slides was not captured in the transcript]

Query language
• SQL has Structured Query Language; MongoDB has the MongoDB Query Language. Sometimes referred to as MQL, it uses BSON (Binary JSON), which is based on JSON (JavaScript Object Notation).

Arrays
• Because data is stored flat in a MongoDB document, MongoDB can have fields with arrays (lists of values) as values.
• For example, a field marinemammals can hold an array of three string values (manatee, walrus, seal) as its value.

Find()
• To query on the last object of an array, use aggregate(). Let us create a collection with documents:

    db.demo103.insertOne(
      {
        "Details" : [
          { "StudentId" : 101, "Details" : "MongoDB" },
          { "StudentId" : 102, "Details" : "MySQL" },
          { "StudentId" : 103, "Details" : "Java" }
        ],
        "Details1" : [ { "StudentId" : 104, "Number" : 3 } ]
      }
    );
    { "acknowledged" : true, "insertedId" : ObjectId("5e2ed2dd9fd5fd66da21446e") }

• Display all documents from a collection with the help of the find() method:

    db.demo103.find();

• This will produce the following output:

    { "_id" : ObjectId("5e2ed2dd9fd5fd66da21446e"), "Details" : [ { "StudentId" : 101, "Details" : "MongoDB" }, { "StudentId" : 102, "Details" : "MySQL" }, { "StudentId" : 103, "Details" : "Java" } ], "Details1" : [ { "StudentId" : 104, "Number" : 3 } ] }

Regular Expression
Regular expressions are widely used across programming languages to search for a pattern or word in a string.
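
As a small illustration of searching strings for a pattern, the following plain JavaScript sketch filters a list of strings with regular expressions. It does not talk to MongoDB, and JavaScript's regular expressions are not identical to PCRE; the sample values and the helper name matching are assumptions made up for this example.

```javascript
// Sample strings to search; these values are illustrative only.
const skus = ["abc123", "abc789", "xyz456", "xyz789", "Abc789"];

// Hypothetical helper: return the strings that match a pattern.
function matching(values, pattern) {
  return values.filter((value) => pattern.test(value));
}

// "$" anchors the pattern to the end of the string.
const endsWith789 = matching(skus, /789$/);

// "^" anchors the pattern to the start; the "i" flag ignores case.
const startsWithAbc = matching(skus, /^abc/i);

console.log(endsWith789);   // [ 'abc789', 'xyz789', 'Abc789' ]
console.log(startsWithAbc); // [ 'abc123', 'abc789', 'Abc789' ]
```

The anchors and the case-insensitive flag used here are the same ideas that appear in database-side pattern matching, only applied to an in-memory array.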
• MongoDB also provides regular expression functionality for string pattern matching through the $regex operator.
• MongoDB uses PCRE (Perl Compatible Regular Expressions) as its regular expression language.
• Unlike text search, no configuration or command is needed to use regular expressions.

Regular Expression Example
• To retrieve the names that match either of the following words using a regular expression: "acme", "Acme"

    { name: { $regex: "(?i)a(?-i)cme" } }

Regular Expression Example
• After the following insertion:

    db.products.insertMany( [
      { _id: 100, sku: "abc123", description: "Single line description." },
      { _id: 101, sku: "abc789", description: "First line\nSecond line" },
      { _id: 102, sku: "xyz456", description: "Many spaces before line" },
      { _id: 103, sku: "xyz789", description: "Multiple\nline description" },
      { _id: 104, sku: "Abc789", description: "SKU starts with A" }
    ] )

• What will the following find?

    db.products.find( { sku: { $regex: /789$/ } } )

• Result:

    [
      { _id: 101, sku: 'abc789', description: 'First line\nSecond line' },
      { _id: 103, sku: 'xyz789', description: 'Multiple\nline description' },
      { _id: 104, sku: 'Abc789', description: 'SKU starts with A' }
    ]

Regular Expression Example
• After the same insertion as in the previous example, what will the following find?

    db.products.find( { sku: { $regex: "3$" } } )

• Result:

    [
      { _id: 100, sku: "abc123", description: "Single line description." }
    ]

Schema Free
• MongoDB does not need any pre-defined data schema.
• Every document in a collection could have different data.
• This addresses NULL data fields: a document can simply omit a field that has no value.

Client-Server
• MongoDB uses a client-server architecture.
• After installation, you need to start the MongoDB server, mongod, from the command line.
• This file and other executables are in C:\Program Files\MongoDB\Server\3.6\bin on Windows.
• You need to add the above folder to your PATH environment variable.
• You can connect to the MongoDB server (mongod) with the mongo client from another command prompt (see next slide).
• The server (mongod) must be started before you can connect using the client (mongo).

Starting server (mongod) and the client (mongo)
[slide content not captured in the transcript]

Creating and using a database: the use command
• To create a new database or switch to a previously created database, use the use command:

    > use weatherdb

• To see the list of collections in this database, use the show collections command:

    > show collections

• This will not list any collections, since the database is new and does not have any collections yet.

Inserting data into a collection
• Suppose our collection is called todays_weather.
• We use the insert() function on the collection to insert a document. (In current MongoDB shells, insert() is deprecated in favor of insertOne() and insertMany().)

    > db.todays_weather.insert({city:"Dubai",max:32,min: 17})
    WriteResult({ "nInserted" : 1 })

• Note how the data has to be in JSON format.
• The collection todays_weather is created on first insert.
• db is used to refer to the current database.
• Now run the show collections command again and we will see a new collection.

Doing multiple inserts
• To insert multiple documents in a collection, use the insertMany() function with a JSON array:

    db.todays_weather.insertMany([{city:"Ajman",max:31, min:21}, {city:"Sharjah", max:32, min:10}])

• The above command inserts two documents, one for Ajman and one for Sharjah.
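
The behaviour described on these slides, where a collection appears only after its first insert and insertMany() takes an array of documents, can be modelled in a few lines of plain JavaScript. This is a toy in-memory sketch and not MongoDB code: the class name ToyDatabase and its methods are hypothetical and only mimic the shell commands shown above.

```javascript
// Toy in-memory model (hypothetical, not the MongoDB API).
class ToyDatabase {
  constructor(name) {
    this.name = name;
    this.collections = new Map(); // collection name -> array of documents
  }

  // Mimics "show collections": list the collections that exist so far.
  showCollections() {
    return [...this.collections.keys()];
  }

  // Mimics insert(): the collection is created on first insert.
  insert(collectionName, document) {
    if (!this.collections.has(collectionName)) {
      this.collections.set(collectionName, []);
    }
    this.collections.get(collectionName).push({ ...document });
    return { nInserted: 1 };
  }

  // Mimics insertMany(): the documents must arrive inside an array.
  insertMany(collectionName, documents) {
    if (!Array.isArray(documents)) {
      throw new TypeError("insertMany expects an array of documents");
    }
    documents.forEach((document) => this.insert(collectionName, document));
    return { nInserted: documents.length };
  }
}

const weatherdb = new ToyDatabase("weatherdb");
const before = weatherdb.showCollections(); // no collections yet

weatherdb.insert("todays_weather", { city: "Dubai", max: 32, min: 17 });
weatherdb.insertMany("todays_weather", [
  { city: "Ajman", max: 31, min: 21 },
  { city: "Sharjah", max: 32, min: 10 },
]);

const after = weatherdb.showCollections();
const count = weatherdb.collections.get("todays_weather").length;

console.log(before); // []
console.log(after);  // [ 'todays_weather' ]
console.log(count);  // 3
```

The real server also assigns an _id to every document and returns richer result objects; the sketch leaves those details out to keep the focus on lazy collection creation.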
• With insertMany(), the JSON for each document has to be inside a JSON array.

Update value of a key using $set
• To update the value of a key, pass a criteria and the key-value pair to update, both as JSON, to the update() function. (In current MongoDB shells, update() is deprecated in favor of updateOne() and updateMany().)
• For example, to update the min for the city Sharjah to 19:

    > db.todays_weather.update({city:"Sharjah"}, {$set: {min:19}})

• The first JSON, {city:"Sharjah"}, is the criteria; the second JSON is the update to the key min using $set.

Deleting from a collection
• To delete a document from a collection, use the remove() function or the deleteOne() function with a criteria.
• NOTE: calling the remove() function with an empty criteria JSON, {}, will remove all documents in the collection.
• To remove Fujairah from the todays_weather collection, the criteria is {city:"Fujairah"}:

    > db.todays_weather.remove({city: "Fujairah"})

Reading from a collection using the find() function
• Use the find() function to query or read from a collection.
• When the find() function is used without any filter or criteria, it retrieves all the documents in the collection:

    > db.todays_weather.find()

• When the find() function is used with a filter or criteria, it retrieves only the documents that match the criteria.
• For example, to retrieve the weather for Dubai from the collection todays_weather, the query is as follows:

    > db.todays_weather.find({city:"Dubai"})

Relationship with Hadoop
Many organizations are harnessing the power of Hadoop and MongoDB together to create complete big data applications:
• MongoDB powers the online, real-time operational application, serving business processes and end users, and exposing analytics models created by Hadoop to operational processes.
• Hadoop consumes data from MongoDB, blending it with data from other sources to generate sophisticated analytics and machine learning models.
• Results are loaded back to MongoDB to serve smarter and contextually aware operational processes, e.g., delivering more relevant offers, faster identification of fraud, and better prediction of failure rates from manufacturing processes.

Relationship with Hadoop
• Building on the Apache Hadoop project, a number of companies have built commercial Hadoop distributions.
• Leading providers include MongoDB partners Cloudera, Hortonworks and MapR. All have certified the MongoDB Connector for Hadoop with their respective distributions.

Relationship with Hadoop
[slide content not captured in the transcript]

MongoDB limitations
• Joins not supported: MongoDB doesn't support joins like a relational database. Join functionality can be added by coding it manually, but this may slow execution and affect performance.
• High memory usage: MongoDB stores the key names for each value pair. Also, because there is no join functionality, there is data redundancy. This results in unnecessary memory usage.
• Limited data size: a document cannot be larger than 16 MB.
• Limited nesting: documents cannot be nested more than 100 levels deep.

References
• MongoDB official page, https://www.mongodb.com/hadoop-andmongodb, last accessed Oct 15, 2023.
• A MongoDB White Paper, Unlocking Operational Intelligence from the Data Lake, 2018, last accessed Oct 15, 2023.

Q&A
