Data Visualization Final Project PDF

Document Details

Uploaded by EducatedHeliotrope4922

Tags

SQL, database management, Data Visualization, computer science

Summary

This document appears to be a set of notes on SQL, database management and data visualization. The notes cover topics such as data types, database descriptions, relational databases, and SQL statements.

Full Transcript

1. Explain and discuss data and types of data
   Data: recorded description or measurement of something
   3 types of data
   1. Structured: tabular format
   2. Semistructured: categories, tags (identifiers)
   3. Unstructured (text heavy)
2. Explain why thinking before coding is important
   - accurate results
   - efficient
   - less time reworking
3. Describe a database and data modeling
   Database: container to store organized data
   Data modeling: organizes & structures info into multiple related tables
   - represents a business process or shows relationships between business processes
4. Describe relational database
   A relational database is used to organize data into related tables
   It is a type of database that stores and provides access to data that are related to one another in the form of tables
5. Define entities, attributes, and relationships
   Entities: person, place, thing, event
   Attributes: characteristics / properties / traits of an entity
   Relationships: how entities interact with each other (entity relationship visual)
6. Explain how entity relationship (ER) diagrams are used to document and illustrate relationships
   Entity relationship diagrams are used as a blueprint that shows how the entity sets stored in a database relate to one another
   The 3 components of an ERD are
   1. Entities
   2. Attributes
   3. Relationships

1. Define SQL
   Structured Query Language
2. Discuss how SQL differs from other computer languages
   SQL is a domain-specific language that allows users to communicate with a database and to edit and extract data.
   Unlike many other computer languages, SQL is non-procedural (it cannot be used to write complete applications).
3. Explain how SQL is used in a database
   SQL is used to query, insert, update, and modify data
   3 ways SQL is used in a database
   1. Read/retrieve data
   2. Write data
   3. Update data
   SQL and database management systems (DBMS): each DBMS has its own dialect of SQL, but the core statements translate between them
4. Learn how to write a basic SELECT statement
   The SELECT statement is used to select data from a database. The data returned is stored in a result table, called the result set
5. Tell a database which table your data will come FROM
6. Learn how to LIMIT the amount of data which is returned in a query

1. Understand the benefits of creating new tables
   Creating new tables allows for
   1. Building models & predictions
   2. Creating dashboards
   3. Visualizing the data
   4. Extracting data from other sources
2. Learn how to create tables within an existing database
   Creating blank tables: we use the CREATE TABLE statement to create a blank table
3. Learn how to write data to a new table
   The SQL statement INSERT INTO adds data to our table
4. Understand the basic data types of SQL
   What are the 4 basic data types of SQL?
   1. Numeric, e.g. integer, real, numeric, decimal
   2. Character, e.g. varchar(n), char(n)
   3. Boolean, e.g. yes/no, 1/0
   4. Date/time
5. Learn how to create temporary tables
   Use the SQL statement CREATE TEMPORARY TABLE
6. Know the limitations of temporary tables
   Temporary tables are deleted when the current session is terminated
   Faster than creating a real table
   Useful for complex queries using subsets and joins

1. Describe the basics of filtering your data
   Using the WHERE clause
   The WHERE clause is used to filter records
2. Use advanced filtering techniques on your data
   a. IN operator
      i. Allows you to specify multiple values
   b. OR operator
   c. OR with AND operator
      i. Filter records based on more than one condition
   d. NOT operator
      i. Excludes records that match a condition
3. Explain the concepts of wildcards
   A wildcard character is used to substitute one or more characters in a string
4. Discuss the importance of sorting data for analysis purposes
   Sorting data logically helps keep the information you want on top

1. Learn about subqueries
   Query within a query
   Subqueries merge data from multiple sources together
2. Discuss advantages and disadvantages of subqueries
   Advantage: helps with adding other filtering criteria or requirements
   Disadvantage: subquery SELECTs can only retrieve a single column
3. Learn how to write subqueries within subqueries
4. Learn the best practices for subqueries
   Too many subqueries slow down performance
5. Learn about SQL JOINs
   A JOIN clause is used to combine rows from two or more tables, based on a related column between them
   Joins are temporary (they last only for the duration of the query)
   Cartesian / CROSS JOIN: each row from the first table joins with all the rows of the other table
6. Explain when and how to use INNER JOINs
   The INNER JOIN keyword selects records that have matching values in both tables
7. Learn about self-joins within a SQL database
   Self-joins are used to compare rows within the same table as though you were joining two different tables
8. Explain how left, right, and full outer JOINs work

1. Describe what a UNION is and how it works
   The UNION operator is used to combine the result sets of two or more SELECT statements
2. Describe a UNION ALL operator
   The UNION ALL operator allows duplicate values
3. Describe an INTERSECT operator

1. Define pre-attentive attributes
2. Explain how pre-attentive attributes associated with color, form, spatial positioning, and movement are used in data visualizations
3. Explain how the Gestalt principles of similarity, proximity, enclosure, and connection can be used to create effective data visualizations
4. Understand data-ink ratios

1. Describe hue, saturation, and luminance and differentiate between them
2. Describe the differences between color psychology and color symbolism
3. Explain how color psychology and color symbolism can each be used effectively
4. Define colors in data visualization software using the hue, saturation, luminance (HSL) system
5. Create data visualizations using colors that are easier for the audience to interpret
6. List common mistakes made when using color in data visualizations and how to avoid them

1. Explain the importance of knowing your audience's needs and analytical comfort level to create an effective data visualization or presentation
   - High analytical level
   - Low analytical level
2. Explain how to create empathy in the audience with the data to create the most effective message possible in your data visualization or presentation
3. List the types of data visualizations that are most appropriate to communicate specific insights and for audiences with different needs and different levels of analytical comfort
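
As a rough illustration of the SQL statements named in the notes above (CREATE TABLE, INSERT INTO, CREATE TEMPORARY TABLE, SELECT ... FROM, WHERE, IN, wildcards, ORDER BY, LIMIT, subqueries, JOINs, UNION and UNION ALL), the sketch below runs them through Python's built-in sqlite3 module. The customers and orders tables, their columns, and the sample rows are hypothetical and are not part of the original notes. The syntax is SQLite's dialect, so details such as data type names may differ in another DBMS.

```python
import sqlite3

# In-memory database; all table names, columns, and rows are made-up examples.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# CREATE TABLE makes a blank table; each column is given a basic data type.
cur.execute("""
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        name        VARCHAR(50),
        city        VARCHAR(50)
    )
""")
cur.execute("""
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER,
        amount      REAL,
        order_date  DATE
    )
""")

# INSERT INTO writes rows to the new tables.
cur.executemany(
    "INSERT INTO customers (customer_id, name, city) VALUES (?, ?, ?)",
    [(1, "Ana", "Austin"), (2, "Ben", "Boston"), (3, "Cy", "Austin")],
)
cur.executemany(
    "INSERT INTO orders (order_id, customer_id, amount, order_date) "
    "VALUES (?, ?, ?, ?)",
    [(10, 1, 25.0, "2024-01-05"),
     (11, 1, 40.0, "2024-02-01"),
     (12, 2, 15.5, "2024-02-03")],
)

# SELECT ... FROM with WHERE (filter), ORDER BY (sort), and LIMIT (cap rows).
top_orders = cur.execute("""
    SELECT order_id, amount
    FROM orders
    WHERE amount > 20
    ORDER BY amount DESC
    LIMIT 1
""").fetchall()

# IN lists several accepted values; LIKE with % is a wildcard match.
in_rows = cur.execute(
    "SELECT name FROM customers WHERE city IN ('Austin', 'Denver') ORDER BY name"
).fetchall()
like_rows = cur.execute(
    "SELECT name FROM customers WHERE name LIKE 'B%'"
).fetchall()

# Subquery: the inner SELECT returns a single column used by the outer filter.
buyers = cur.execute("""
    SELECT name FROM customers
    WHERE customer_id IN (SELECT customer_id FROM orders)
    ORDER BY name
""").fetchall()

# INNER JOIN keeps only rows that have a match in both tables.
inner = cur.execute("""
    SELECT c.name, o.amount
    FROM customers AS c
    INNER JOIN orders AS o ON o.customer_id = c.customer_id
    ORDER BY o.order_id
""").fetchall()

# LEFT JOIN keeps every customer; unmatched order columns come back as None.
left = cur.execute("""
    SELECT c.name, o.order_id
    FROM customers AS c
    LEFT JOIN orders AS o ON o.customer_id = c.customer_id
    ORDER BY c.customer_id, o.order_id
""").fetchall()

# UNION drops duplicate rows; UNION ALL keeps them.
union_rows = cur.execute(
    "SELECT city FROM customers UNION SELECT city FROM customers"
).fetchall()
union_all_rows = cur.execute(
    "SELECT city FROM customers UNION ALL SELECT city FROM customers"
).fetchall()

# A temporary table exists only until this connection (session) is closed.
cur.execute(
    "CREATE TEMPORARY TABLE big_orders AS SELECT * FROM orders WHERE amount > 20"
)
big_count = cur.execute("SELECT COUNT(*) FROM big_orders").fetchone()[0]

conn.close()
```

Each query result is an ordinary Python list of tuples, so printing a variable such as `inner` or `left` shows how the join type changes which rows come back.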