Introduction to NoSQL
Jake Roach
Summary
This document introduces NoSQL database systems. It describes the main types of NoSQL data stores: tabular (column-oriented) databases and non-tabular stores such as document, key-value, and graph databases. It also shows how to connect to Snowflake and Postgres databases from Python, and how to write and execute queries against them.
Full Transcript
Introduction to NoSQL
INTRODUCTION TO NOSQL
Jake Roach, AI Engineer

Traditional relational data stores (RDBMS)
- Organize data in tables, using columns and rows
- Leverage SQL to manage and query data
- Enforce integrity through constraints on databases and tables

What is NoSQL?
Definition: NoSQL stands for "not only SQL", and is a set of data storage tools and techniques that allows for structured, semi-structured, and unstructured data to be stored and retrieved.
Characteristics:
- Allows for wide variety of data to be stored and retrieved
- Less rigid schema
- Better scaling and performance

NoSQL data stores
Tabular:
- "Rectangular"
- Using columns and rows
Non-tabular:
- Semi-structured format
- More flexible schema
- Examples: { "title": "Python for Data Analysis", "price": 53.99,... } and "weather": "sunny"

NoSQL data stores
Column-oriented databases:
- A NoSQL data store that stores data by column, rather than row, and can be queried with SQL-like syntax
- Allows for faster querying of data, especially when running analytical queries
- Use case: big data, analytics workflows
Document databases:
- NoSQL data storage tool used to store semi-structured "documents"
- JSON format
- Less rigid schema
- Use case: user-generated data (reviews) and real-time analytics

More NoSQL data stores
Key-value:
- A NoSQL data storage tool that stores data as a collection of key-value pairs
- Simple data that is written and read at a high frequency
- Use cases: IoT (Internet of Things) data, mobile applications
Graph:
- A NoSQL data store that persists data in a network of nodes and edges
- Nodes represent entities
- Edges represent relationships between entities
- Use cases: social networks

Let's practice!
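To make "stores data by column, rather than row" more concrete, here is a minimal sketch in plain Python. It is not how any particular column-oriented database is implemented; the book records, the `pages` column, and the function name `select_under_price` are all made up for illustration. The point it tries to show is that an analytical filter only has to touch the columns it names.

```python
# Made-up sample data: three books in a row-oriented layout
# (one dict per record, every field stored together).
rows = [
    {"title": "Book A", "price": 53.99, "pages": 550},
    {"title": "Book B", "price": 39.50, "pages": 320},
    {"title": "Book C", "price": 45.00, "pages": 410},
]

# The same data in a column-oriented layout: one list per column.
# Position i in every list belongs to record i.
columns = {
    "title": [r["title"] for r in rows],
    "price": [r["price"] for r in rows],
    "pages": [r["pages"] for r in rows],
}


def select_under_price(columns, max_price):
    """Return (title, price) pairs for records with price < max_price."""
    # Step 1: scan only the price column and remember matching positions.
    positions = [i for i, p in enumerate(columns["price"]) if p < max_price]
    # Step 2: read the title values at those positions.
    # The "pages" column is never read at all.
    return [(columns["title"][i], columns["price"][i]) for i in positions]


result = select_under_price(columns, 50.00)
print(result)
```

In the row-oriented `rows` list, the same filter would have to visit every full record, including fields it does not need. Skipping unused columns is one reason column-oriented stores suit analytical queries.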
Tabular NoSQL data stores
Jake Roach, Data Engineer

Tabular data stores
Column-oriented databases:
- Store data in columns, rather than rows
- Allow for selective column read and retrieval
- Easier schema changes
- Better data compression, query performance

Querying a column-oriented database

    SELECT title, price
    FROM books
    WHERE price < 50.00;

- SQL-like syntax
- Column-elimination and selective reads/retrieval
- Automatic data clustering

Query execution in column-oriented data stores

    SELECT title, price
    FROM books
    WHERE price < 50.00;

This query executes by:
- Accessing the price column and identifying records with price < 50.00
- Retrieving the corresponding values from the title column
Later, we'll look at:
- Optimizing data loads and deletes
- Creating performant JOINs
- Working with semi-structured data

Connecting to a Snowflake database

    import snowflake.connector

    conn = snowflake.connector.connect(
        user="",
        password="",
        account="",
        database="",
        schema="",
        warehouse=""
    )

The conn variable will be created for you before each exercise.

Writing and executing Snowflake queries

    # Build a query in a string (or multi-line string)
    query = """
    SELECT title, price
    FROM books
    WHERE price < 50.00;
    """

    # Execute the query, print the results
    results = conn.cursor().execute(query).fetch_pandas_all()
    print(results)

Let's practice!

Non-tabular NoSQL data stores
Jake Roach, Data Engineer

Document databases
Definition: A NoSQL data storage tool that stores data in a flexible, semi-structured format, made up of key-value, key-array, and key-object pairs (similar to JSON).

    {
        "title": "Python for Data Analysis",
        "price": 53.99,
        "topics": [ "Data Science", "Data Analytics",... ],
        "author": {
            "first": "William"...
        }
    }

Querying JSON data with Postgres JSON

    SELECT
        books -> 'title' AS title,
        books -> 'price' AS price
    FROM data_science_resources
    WHERE books -> 'author' ->> 'last' = 'Viafore';

Resulting in the following output:

Connecting to a Postgres database

    import sqlalchemy

    # Create a connection string, and an engine
    connection_string = "postgresql+psycopg2://:@:/"
    db_engine = sqlalchemy.create_engine(connection_string)

To create a connection to a Postgres database:
- Form a connection string
- Create an engine using sqlalchemy.create_engine
- db_engine variable will be created, pre-exercise

Writing and executing Postgres JSON queries

To write and execute a query:
- Build a query in a string
- Pass query and db_engine to the pd.read_sql() function
- Print the resulting DataFrame

    import pandas as pd

    # Build the query
    query = """
    SELECT
        books -> 'title' AS title,
        books -> 'price' AS price
    FROM data_science_resources;
    """

    # Execute the query
    result = pd.read_sql(query, db_engine)
    print(result)

Other non-tabular NoSQL data stores
Key-value, Graph

    {
        "name": "Jane Doe",
        "age": 25,
        "email": "[email protected]"
    }

Let's practice!
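For readers without a Postgres instance at hand, the logic of the JSON query filtered on the author's last name can be approximated in plain Python. This is only a sketch of what the query selects, not of how Postgres evaluates it. The two sample documents (titles, prices, first names) and the helper name `select_by_author_last` are made up for illustration. In Postgres, `->` returns a JSON value and `->>` returns text, which is why the comparison against 'Viafore' uses `->>`.

```python
import json

# Made-up stand-in for the "books" column of data_science_resources:
# each row holds one JSON document, stored here as text.
raw_rows = [
    '{"title": "Book A", "price": 53.99,'
    ' "author": {"first": "Ann", "last": "Example"}}',
    '{"title": "Book B", "price": 41.25,'
    ' "author": {"first": "Pat", "last": "Viafore"}}',
]


def select_by_author_last(raw_rows, last_name):
    """Return title and price for documents whose author.last matches."""
    selected = []
    for raw in raw_rows:
        books = json.loads(raw)  # parse the JSON text into a dict
        # Roughly: books -> 'author' ->> 'last'
        # (step into the nested object, then read "last" as a string).
        author = books.get("author", {})
        if author.get("last") == last_name:
            # Roughly: books -> 'title' AS title, books -> 'price' AS price
            selected.append({"title": books["title"], "price": books["price"]})
    return selected


matches = select_by_author_last(raw_rows, "Viafore")
print(matches)
```

Using `.get()` on the parsed document means a record with no "author" key is skipped rather than raising an error, loosely mirroring how the Postgres operators yield NULL for a missing key.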