Data Visualization Final Project PDF
Document Details
Uploaded by EducatedHeliotrope4922
Summary
This document appears to be a set of notes on SQL, database management and data visualization. The notes cover topics such as data types, database descriptions, relational databases, and SQL statements.
Full Transcript
1. Explain and discuss data and types of data
   Data: a recorded description or measurement of something
   3 types of data
   1. Structured: tabular format
   2. Semistructured: categories, tags (identifiers)
   3. Unstructured: text heavy
2. Explain why thinking before coding is important
   - accurate results
   - efficient
   - less time reworking
3. Describe a database and data modeling
   Database: a container to store organized data
   Data modeling: organizes and structures information into multiple related tables
   - represents a business process or shows relationships between business processes
4. Describe relational database
   A relational database is a type of database that organizes data into related tables and provides access to data that are related to one another
5. Define entities, attributes, and relationships
   Entities: person, place, thing, event
   Attributes: a characteristic, property, or trait of an entity
   Relationships: how entities interact with each other (entity relationship visual)
6. Explain how entity relationship (ER) diagrams are used to document and illustrate relationships
   Entity relationship diagrams are used as a blueprint that shows how the entity sets stored in a database relate to one another
   The 3 components of an ERD are
   1. Entities
   2. Attributes
   3. Relationships

1. Define SQL
   Structured Query Language
2. Discuss how SQL differs from other computer languages
   SQL is a domain-specific language that allows users to communicate with a database and to edit and extract data.
   SQL is a nonprocedural language, unlike most other computer languages (it cannot be used to write complete applications)
3. Explain how SQL is used in a database
   SQL is used to query, insert, update, and modify data
   3 ways SQL is used in a database
   1. Read/retrieve data
   2. Write data
   3. Update data
   SQL and database management systems (DBMS): each DBMS has its own dialect of SQL, but the core statements translate between them
4. Learn how to write a basic SELECT statement
   The SELECT statement is used to select data from a database.
   The data returned is stored in a result table, called the result set.
5. Tell a database which table your data will come FROM
6. Learn how to LIMIT the amount of data which is returned in a query

1. Understand the benefits of creating new tables
   Creating new tables allows for
   1. Building models and predictions
   2. Creating dashboards
   3. Visualizing the data
   4. Extracting data from other sources
2. Learn how to create tables within an existing database
   Creating blank tables: use the CREATE TABLE statement to create a blank table
3. Learn how to write data to a new table
   The INSERT INTO statement adds data to a table
4. Understand the basic data types of SQL
   The 4 basic data types of SQL are
   1. Numeric, e.g. integer, real, numeric, decimal
   2. Character, e.g. varchar(n), char(n)
   3. Boolean, e.g. yes/no, 1/0
   4. Date/time
5. Learn how to create temporary tables
   Use the SQL statement CREATE TEMPORARY TABLE
6. Know the limitations of temporary tables
   - Temporary tables are deleted when the current session is terminated
   - Faster than creating a real table
   - Useful for complex queries using subsets and joins

1. Describe the basics of filtering your data
   Using the WHERE clause: the WHERE clause is used to filter records
2. Use advanced filtering techniques on your data
   a. IN operator
      i. allows you to specify multiple values
   b. OR operator
   c. OR with AND operator
      i. AND and OR filter records based on more than one condition
   d. NOT operator
      i. excludes records that match a condition
3. Explain the concepts of wildcards
   A wildcard character is used to substitute one or more characters in a string
4. Discuss the importance of sorting data for analysis purposes
   Sorting data logically helps keep the information you want on top

1. Learn about subqueries
   - A query within a query
   - Subqueries merge data from multiple sources together
2. Discuss advantages and disadvantages of subqueries
   Advantage: helps with adding other filtering criteria or requirements
   Disadvantage: a subquery SELECT used as a filter (for example with IN) can only retrieve a single column
3. Learn how to write subqueries within subqueries
4. Learn the best practices for subqueries
   Too many subqueries slow down performance
5. Learn about SQL JOINS
   A JOIN clause is used to combine rows from two or more tables, based on a related column between them
   Joins are temporary: they exist only for the duration of the query
   Cartesian / CROSS JOIN: each row from the first table joins with all the rows of the other table
6. Explain when and how to use inner JOINS
   The INNER JOIN keyword selects records that have matching values in both tables
7. Learn about self-joins within a SQL database
   Self joins are used to compare rows within the same table as though you were joining two different tables
8. Explain how left, right, and full outer JOINS work

1. Describe what a UNION is and how it works
   The UNION operator is used to combine the result sets of two or more SELECT statements
2. Describe a UNION ALL operator
   The UNION ALL operator allows duplicate values
3. Describe an INTERSECT operator

1. Define pre-attentive attributes
2. Explain how pre-attentive attributes associated with color, form, spatial positioning, and movement are used in data visualizations
3. Explain how the Gestalt principles of similarity, proximity, enclosure, and connection can be used to create effective data visualizations
4. Understand data-ink ratios

1. Describe hue, saturation, and luminance and differentiate between them
2. Describe the differences between color psychology and color symbolism
3. Explain how color psychology and color symbolism can each be used effectively
4. Define colors in data visualization software using the hue, saturation, luminance (HSL) system
5. Create data visualizations using colors that are easier for the audience to interpret
6. List common mistakes made when using color in data visualizations and how to avoid them

1. Explain the importance of knowing your audience’s needs and analytical comfort level to create an effective data visualization or presentation.
   - High analytical level
   - Low analytical level
2. Explain how to create empathy in the audience with the data to create the most effective message possible in your data visualization or presentation.
3. List the types of data visualizations that are most appropriate to communicate specific insights and for audiences with different needs and different levels of analytical comfort.
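
As a recap of the SQL portion of these notes, the statements named above (CREATE TABLE, INSERT INTO, SELECT ... FROM with WHERE, ORDER BY and LIMIT, a wildcard, a subquery with IN, INNER JOIN, UNION / UNION ALL, and CREATE TEMPORARY TABLE) can be sketched in one small script. This is only an illustration, not part of the original notes: it assumes SQLite's dialect through Python's built-in sqlite3 module, and the tables (customers, orders), their columns, and every row are made up.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory SQLite database with two small, made-up tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customers (
            id   INTEGER PRIMARY KEY,
            name VARCHAR(50),
            city VARCHAR(50)
        );
        CREATE TABLE orders (
            id          INTEGER PRIMARY KEY,
            customer_id INTEGER,
            amount      REAL
        );
        INSERT INTO customers (id, name, city) VALUES
            (1, 'Ana', 'Austin'),
            (2, 'Ben', 'Boston'),
            (3, 'Cal', 'Austin');
        INSERT INTO orders (id, customer_id, amount) VALUES
            (1, 1, 20.0),
            (2, 1, 35.5),
            (3, 2, 12.0);
    """)
    return conn


def run(conn, sql, params=()):
    """Run one query and return its result set as a list of tuples."""
    return conn.execute(sql, params).fetchall()


conn = build_demo_db()

# SELECT ... FROM, filtered with WHERE, sorted with ORDER BY, capped with LIMIT.
austin = run(
    conn,
    "SELECT name FROM customers WHERE city = ? ORDER BY name LIMIT 1",
    ("Austin",),
)

# Wildcard: % stands in for any run of characters after the leading 'B'.
b_names = run(conn, "SELECT name FROM customers WHERE name LIKE 'B%'")

# Subquery used as a filter: the inner SELECT returns a single column.
buyers = run(
    conn,
    "SELECT name FROM customers "
    "WHERE id IN (SELECT customer_id FROM orders) ORDER BY name",
)

# INNER JOIN: only customers with a matching order appear (Cal has none).
joined = run(
    conn,
    "SELECT c.name, o.amount FROM customers AS c "
    "INNER JOIN orders AS o ON o.customer_id = c.id ORDER BY o.id",
)

# UNION removes duplicate rows; UNION ALL keeps them.
union_cities = run(
    conn,
    "SELECT city FROM customers UNION SELECT city FROM customers ORDER BY city",
)
union_all = run(
    conn, "SELECT city FROM customers UNION ALL SELECT city FROM customers"
)

# Temporary table: dropped automatically when this connection closes.
conn.execute(
    "CREATE TEMPORARY TABLE big_orders AS SELECT * FROM orders WHERE amount > 15"
)
big_count = run(conn, "SELECT COUNT(*) FROM big_orders")[0][0]

print(austin, b_names, buyers, joined, union_cities, len(union_all), big_count)
```

Other database systems accept the same core statements, but details such as the LIMIT keyword and the available data types vary by dialect, so check the documentation of the DBMS in use before reusing these queries.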