Questions and Answers
What does DBMS primarily focus on?
- Developing web applications.
- Designing computer hardware.
- Creating and managing databases. (correct)
- Managing computer networks.
Data warehousing involves storing data normalized in the most efficient manner to reduce redundancy.
False (B)
What is data mining used for in marketing?
Data mining is used to improve market segmentation by extracting and analyzing customer data to tailor marketing campaigns.
A ______ is a field in a database table that uniquely identifies each record.
Match the following data quality characteristics with their descriptions:
What does data validation primarily ensure?
Logging changes in a database directly prevents unauthorized access.
How do parallel data sets protect against data loss and corruption?
[Blank] control limits the number of people who can change a database and what changes they can make.
Match each type of SQL key with its description:
What is the main goal of normalisation in databases?
In first normal form (1NF), a table can have multiple values in a single column.
What is transitive dependency in database design, and which normal form addresses it?
A database with missing information and inaccuracies suffers from low data ______.
Match the types of anomalies with their effects in databases:
What is a key aspect of data independence in databases?
Data redundancy always improves the efficiency and reduces the size of a database.
Name one advantage that data warehousing offers over regular databases in terms of data visibility.
An ______ trail records exactly who made changes, what the user changed, and when the changes were made in a database.
Match the person involved with a database with their primary responsibility:
Which of the following correctly describes 'Data Mining'?
RFID (Radio Frequency Identification) technology can only be used for tracking products in warehouses and retail stores.
List three GUI components commonly found in web forms that help users input data correctly and limit errors.
Unlike most databases, data ______ takes the data from these databases and stores it in a non-normalised way.
Match each term related to digital data with its function:
Flashcards
What is Data?
Unprocessed numbers including facts or signals.
What are Databases?
A collection of organised data, often used to store a wide range of information by programmers and web developers.
What is DBMS?
Software responsible for creating and managing databases, including managing data security.
What are Forms?
What are Tags?
What is RFID?
What is a Digital Sensor?
What role do databases play on the internet?
What is a Cookie?
Give an example of a type of information stored in databases?
What is Location Based Data?
What are Location Based Services?
What is Data Warehousing?
What is the difference between data warehousing and databases?
What is Data Mining?
What is the Data mining process?
What is SQL?
What is the purpose of Data Mining?
What is Data Integrity?
What does data independence refer to?
Data Redundancy
What is Quality Data?
What does data mining allow you to do with patterns?
How to Protect Data?
What is Logging Changes?
Study Notes
Chapter Overview
- This chapter covers data collection, data warehousing, data mining, and caring for and managing data
Learning Outcomes
- Provide an overview of data collection
- Provide examples of data collection
- Describe data warehousing
- Compare data warehousing with databases
- Describe data mining and provide examples
- Learn how data should be cared for and managed
Databases in a Nutshell
- Computers store several kinds of data: program instructions, application data held in RAM, and users' application files
- Files and databases are the common structures used to store data
- While a user operates an application, the data is saved in the computer's memory
- Data intended for later use persists in a database or file on more permanent storage
Files and Databases
- Data is unprocessed information
- To be usable, data requires processing and organization into meaningful information
Databases
- Databases consist of organized data
- Databases are among the most important tools that programmers and web developers use to store data
- They may store application settings, website text, graphics, status updates, messages, and social network comments
Database Management Software (DBMS)
- DBMS is the software responsible for managing databases, including creation, table construction, and security
- Popular database management software examples: Microsoft SQL Server, Microsoft Access, MySQL, and SQLite
Data Collection
- Manually adding data to a database is inefficient and only suitable for small databases
- Most databases use automatic techniques to capture data
Forms
- Web forms are interactive online pages for user input
- Web forms contain GUI components like checkboxes, combo boxes, spinners, drop-down lists, and text boxes
- Web forms streamline business by limiting paperwork and documentation, favoring online documentation
Tags
- Electronic tags transmit radio frequency data to a tag reader and vice versa
- Tags track or identify items and are common in merchandising warehouses, vehicle tracking, and pet tracking
RFID (Radio Frequency Identification)
- RFID uses tiny chips that store kilobytes of information; the chips can be scanned so that the information is displayed and added to a database
- Thousands of businesses use RFID to tag products in warehouses
RFID Examples
- Products in a warehouse are automatically scanned and removed from the database when removed from the warehouse
- Tools are tracked to see who is using them and when
- Tickets at events open gates automatically and add data to the database
- Public transport cards record trips on a database to deduct costs
- Products sold in shops are scanned and their details are added to the bill, updating inventory
E-Tolls and RFID
- In December 2013, SANRAL launched an e-toll system in Gauteng to fund a R20 billion highway project
- Motorists purchased e-tags read by toll gantries
- Cameras with RFID readers recorded vehicle data to generate monthly invoices
Digital Sensors
- Digital sensors are electronic or electrochemical devices where data conversion and transmission are done digitally
- Examples of data sensed include temperature, distance, humidity, and light
- Wireless sensor tags report events in the physical world, such as motion, door or window status, and temperature, to devices such as smartphones
Invisible Online Data Collection
- Databases are critical for storing website information, especially on user-generated content platforms (YouTube, Facebook, Wikipedia)
- These sites automatically store user-entered data in databases, including status updates, likes, tweets, and uploaded media
- Personal information like email addresses, usernames, and passwords are also stored
Cookies
- A cookie is a message sent from a web server to a web browser, which the browser stores in a text file
- The browser returns the message to the server each time the browser requests a page
- Cookies identify users and let websites customize pages for them; the identifying details are often collected through a form that asks for personal information
- Online advertising companies use big databases to track users and activity across web pages
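The cookie round trip described above can be sketched with Python's standard `http.cookies` module. This is only an illustration: the cookie name `visitor_id` and its value are made up, and a real browser and server exchange these headers over HTTP rather than through variables.

```python
from http.cookies import SimpleCookie

# Server side: build a Set-Cookie header for a visitor's first request.
# "visitor_id" and "abc123" are made-up illustration values.
server_cookie = SimpleCookie()
server_cookie["visitor_id"] = "abc123"
set_cookie_header = server_cookie.output(header="Set-Cookie:")

# Browser side: store the cookie, then send it back with the next page request.
browser_jar = SimpleCookie()
browser_jar.load(set_cookie_header.split(":", 1)[1].strip())
cookie_header = "; ".join(
    f"{name}={morsel.value}" for name, morsel in browser_jar.items()
)

# Server side again: read the returned cookie to recognise the visitor.
returned = SimpleCookie()
returned.load(cookie_header)
visitor_id = returned["visitor_id"].value
print(visitor_id)
```

Once the server recognises the returned identifier, it can look up that visitor's stored preferences in a database and customize the page.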
Database Usage
- Databases are used for credit card payments, automatic toll gates, cookies, and cell phone calls
- Software is made to read the information and record it in a database automatically
- Automatic reports can be generated, such as credit card statements
Transaction Tracking
- Transaction data, such as the transaction type, store location, employee, customer information and payment details, is sent to the corporate database
- Data is stored on credit cards, store cards and store loyalty cards
- Transaction tracking offers benefits like consumer safety, improved user experience, fraud detection, tracking browsing history and demographic profiles
- A downside of data tracking is the possible misuse of personal information
Location Based Data
- Location-based data provides mappable data, including static data like roads and buildings, and dynamic data like vehicles or traffic
- The data comes from GPS and other geographic positioning systems
Location Based Services (LBS)
- Location-based services use software applications and location-based databases to provide services such as finding the best route, locating stolen vehicles, or finding nearby services
- Smartphones and tablets are well suited to location-based computing, for example weather applications, food ordering applications, and car sharing services
- Companies mine databases to improve their decision making
Data Warehousing
- Data warehousing stores data from multiple databases in a non-normalized way, using more storage space
Data Warehousing Details
- Data warehousing helps in reporting, analytics, and data mining
- A data warehouse does not contain copies of the original databases; it is a new database
- A data warehouse makes data available and ready for analysis by users in different departments, who can create graphs and reports
Data Warehousing vs Database
- A data warehouse stores large amounts of historical data, but a database stores current transactions
- Normalization refines a database's structure to minimize redundancy and improve integrity
Data Mining
- Data mining identifies trends and patterns between different sets of data in large databases
- Selecting the right data reveals trends and patterns that can dramatically improve decision making
Data Mining Examples
- Data mining helps improve market segmentation using customer data to direct personalized loyalty campaigns
- In marketing, data mining predicts which users are likely to unsubscribe from a service, what they will search for, and which campaigns will achieve a successful response rate
Data Mining Process
- To mine a database, you will extract relevant data, look for patterns in the data, and discover knowledge from the patterns
Extracting the Relevant Data
- Select only the useful data from a large database
- The data can be extracted using SQL by specifying the fields to extract, the table to use, and the conditions the data must meet
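As a minimal sketch of this extraction step, the example below runs an SQL `SELECT` through Python's built-in `sqlite3` module. The `sales` table, its fields and its rows are hypothetical; they only illustrate how the fields, the table and the conditions appear in a query.

```python
import sqlite3

# Build a small in-memory database with a hypothetical "sales" table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (customer TEXT, product TEXT, amount REAL, city TEXT)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("Thandi", "Laptop", 12000.0, "Pretoria"),
        ("Sipho", "Mouse", 250.0, "Durban"),
        ("Lerato", "Monitor", 3500.0, "Pretoria"),
    ],
)

# Fields to extract: customer, amount. Table: sales.
# Conditions: the sale happened in Pretoria and was worth more than 1000.
rows = conn.execute(
    "SELECT customer, amount FROM sales "
    "WHERE city = ? AND amount > ? ORDER BY amount DESC",
    ("Pretoria", 1000),
).fetchall()
print(rows)
```

Only the two Pretoria sales above the threshold are returned, so the later steps work with a much smaller, more relevant data set.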
Look For Patterns in the Data
- Working with large amounts of data requires looking for patterns to understand the dataset
- These patterns can result in knowledge, used to make better decisions and develop strategies
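One very simple kind of pattern is a frequency count. The sketch below assumes the extracted data is a list of made-up `(day, product)` sales records and finds the busiest day; real data mining tools use far more sophisticated techniques, so treat this only as an illustration of the idea.

```python
from collections import Counter

# Made-up extracted rows: (day of week, product sold).
sales = [
    ("Mon", "bread"), ("Fri", "bread"), ("Fri", "milk"),
    ("Fri", "bread"), ("Sat", "milk"), ("Fri", "eggs"),
]

# Count how many sales happened on each day, then pick the busiest day.
sales_per_day = Counter(day for day, _ in sales)
busiest_day, busiest_count = sales_per_day.most_common(1)[0]
print(busiest_day, busiest_count)
```

A shop owner could use a pattern like this to decide when to schedule extra staff or order more stock.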
Discover Knowledge
- Once you have identified patterns, you have turned an overwhelming amount of disorganised data into a few useful facts
- Once the situation is confirmed, informed decisions can be made and strategies developed
Example Data Mining
- Data mining is used for government social grants
- Social grants are administered by the South African Social Security Agency (SASSA)
- SASSA is mandated to provide comprehensive social security services against vulnerability and poverty within the constitutional legislative framework
- Most social grants are means tested to assess the value of assets and income
- The Government conducts an annual General Household Survey (GHS) to measure the living circumstances of South African households to collect big data
- Using data mining techniques, the relevant data is extracted from this big data to obtain information and knowledge
Data Mining in Facebook
- Facebook accumulates personal data over time; data is collected in more dimensions than most users ever realise
- Using data integration, this data is then mixed with other data sources that end users will never be aware of
- Apps that use data analytics analyse comments by friends of friends, text, online behaviour, and so on, to compile data about users
- The resulting information and knowledge is then used to determine a user's current emotional state, estimate how sad or depressed someone might be, suggest possible friends, and so on
Value of Data
- Online shopping websites can charge product owners a fee to advertise their products on the website, provided the site's database already contains many other products
- To gather the data needed to sell products, the website's creator can ask sellers to enter the important data for their products on the website, from where it is added to the database
Database Usefulness
- For a database to be useful, it needs to record and store valuable and useful data
- To decide whether data is valuable enough to record and store in your database, ask:
- Will I ever use the data in this field?
- Will anyone else use the data in this field?
- What fields do I need specifically for my application?
- What fields would I need for my application in the future?
Characteristics of Quality Data
- Accurate: The data needs to be both correct and precise
- Consistent: The data in one part of your database should not contradict or differ from the data in another part of your database
- Current: For information to be of high quality, it must be up to date
- Complete: In a database, incomplete data is almost as bad as inaccurate data
- Relevant: Good-quality data is relevant to the people who are using it
Data Protection
- Databases need to be protected from several different threats, including incorrect data entry, data corruption, data loss, accidental data deletion, purposeful data deletion and unauthorised access
- Multiple tools and techniques protect large databases
Data Validation
- Data validation is the process in which you check whether the data is accurate, in the correct format or of the correct type before allowing your database to record it
- It ensures that the data in your database is consistent and accurate
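A validation check of this kind might look like the sketch below. The field names (`name`, `age`, `email`) and the rules applied to them are assumptions chosen for illustration; a real database would use rules that match its own fields.

```python
import re


def validate_record(record: dict) -> list[str]:
    """Return a list of validation errors; an empty list means the record passes."""
    errors = []

    # Type and presence check: the name must be a non-empty string.
    name = record.get("name")
    if not isinstance(name, str) or not name.strip():
        errors.append("name must be a non-empty string")

    # Type and range check: the age must be a whole number between 0 and 120.
    age = record.get("age")
    if not isinstance(age, int) or isinstance(age, bool) or not 0 <= age <= 120:
        errors.append("age must be a whole number between 0 and 120")

    # Format check: a deliberately simple pattern, not a full email validator.
    email = record.get("email")
    if not isinstance(email, str) or not re.fullmatch(
        r"[^@\s]+@[^@\s]+\.[^@\s]+", email
    ):
        errors.append("email is not in a valid format")

    return errors


good = {"name": "Thandi", "age": 17, "email": "thandi@example.com"}
bad = {"name": "", "age": "17", "email": "not-an-email"}
print(validate_record(good))
print(len(validate_record(bad)))
```

The record is only written to the database when the returned list is empty; otherwise the user is asked to correct the listed problems first.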
Data Verification
- Data verification is a manual technique that can be used to make sure that the data on a database is correct and accurate
Data Verification Types
- Full verification requires that someone reads and checks every piece of data entered into a database, which can be very time consuming
- Sample verification checks a randomly selected sample of the data to ensure there are no systematic errors, although it can miss small mistakes
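The random selection behind sample verification can be sketched as follows. The helper name `pick_sample` and the record IDs are hypothetical; the person doing the verification would then check each selected record by hand against its original source.

```python
import random


def pick_sample(record_ids, sample_size, seed=None):
    """Randomly choose which records a person should check by hand."""
    ids = list(record_ids)
    rng = random.Random(seed)  # a fixed seed makes the selection repeatable
    return sorted(rng.sample(ids, min(sample_size, len(ids))))


# Choose 10 of 100 hypothetical record IDs for manual checking.
to_check = pick_sample(range(1, 101), 10, seed=1)
print(to_check)
```

Because the records are chosen at random, an error that affects many records is likely to show up in the sample, while a single isolated mistake may not.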
Data Integrity
- Data integrity refers to the reliability, accuracy and trustworthiness of data over its entire lifecycle
- Uncorrupted data (data with integrity) is considered to be 'clean data' that stays unchanged throughout its lifecycle
- Many DBMSs have built-in integrity controls that help to maintain the data integrity
Logging changes
- Logging is the process of recording any changes made by users to a database
- This is called creating an audit trail. The audit trail records exactly:
- who made the changes
- what the user changed
- when they made the changes
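A minimal sketch of an audit trail, using Python's built-in `sqlite3` module, is shown below. The `learners` and `audit_log` tables, the `update_grade` helper and the username are all hypothetical; many DBMSs offer built-in logging features that would be used instead.

```python
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE learners (id INTEGER PRIMARY KEY, name TEXT, grade INTEGER)"
)
# The audit trail: who made the change, what changed, and when.
conn.execute(
    "CREATE TABLE audit_log (username TEXT, change_made TEXT, changed_at TEXT)"
)
conn.execute("INSERT INTO learners VALUES (1, 'Thandi', 10)")


def update_grade(conn, username, learner_id, new_grade):
    """Change a learner's grade and log the change in the audit trail."""
    old_grade = conn.execute(
        "SELECT grade FROM learners WHERE id = ?", (learner_id,)
    ).fetchone()[0]
    conn.execute(
        "UPDATE learners SET grade = ? WHERE id = ?", (new_grade, learner_id)
    )
    conn.execute(
        "INSERT INTO audit_log VALUES (?, ?, ?)",
        (
            username,
            f"learners.grade id={learner_id}: {old_grade} -> {new_grade}",
            datetime.now(timezone.utc).isoformat(),
        ),
    )
    conn.commit()


update_grade(conn, "mr_naidoo", 1, 11)
log = conn.execute("SELECT username, change_made FROM audit_log").fetchall()
print(log)
```

Note that the log does not stop a change from happening; it makes every change traceable so that mistakes and misuse can be found and corrected afterwards.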
Data Warehousing
- Data warehousing is a technique for storing data from more than one database; the data is usually stored in a way that is secure, reliable and easy to retrieve
Data Warehouses Importance
- Help improve data integrity by allowing data analysis, which makes incorrect data entries and data corruption more visible
- Help improve data integrity by making data loss more visible, so that the problem can be fixed
- Can be used to recover critical data if it is deleted or corrupted
Access Control
- Access control refers to managing and controlling the parts of a database that users have access to
- By limiting the number of people who can change a database, and by limiting what changes each user can make, you can reduce the damage that any single user can do to a database
- Three important ways to control access to your data:
- Passwords: ensure that only the owner of a username can log in with that username
- User rights: determine which tables and fields every username can access, and what changes (if any) the user can make to these tables
- Good database security: ensures that the data is secure and that outside people cannot find other ways to access the database
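User rights can be pictured as a lookup from username to table to permitted actions. The sketch below is an assumption-laden illustration: the usernames, table names and the `is_allowed` helper are made up, and a real DBMS manages rights through its own permission system rather than a dictionary.

```python
# Hypothetical user rights: username -> table -> actions the user may perform.
USER_RIGHTS = {
    "admin": {"learners": {"read", "write"}, "marks": {"read", "write"}},
    "teacher": {"learners": {"read"}, "marks": {"read", "write"}},
    "learner": {"marks": {"read"}},
}


def is_allowed(username, table, action):
    """Return True only if the user has been granted this action on this table."""
    return action in USER_RIGHTS.get(username, {}).get(table, set())


print(is_allowed("teacher", "marks", "write"))
print(is_allowed("learner", "marks", "write"))
```

Anything that has not been explicitly granted is refused, including requests from unknown usernames, which keeps the damage any single user can do small.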
Parallel Data Sets
- Backups are the most important tool to protect databases from data loss and data corruption
- To ensure that data has not been corrupted or deleted, the database is checked at intervals against a perfect copy of it, called a parallel data set
- If there are differences between the database and its parallel data set, it means that data was either corrupted or deleted
- Database backups should be protected as securely as the database itself
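The comparison against a parallel data set can be sketched as below. The rows, the `fingerprint` and `find_differences` helpers, and the use of a SHA-256 hash are all illustrative assumptions; the source does not say how the comparison is carried out in practice.

```python
import hashlib


def fingerprint(rows):
    """Hash all rows so two data sets can be compared quickly."""
    digest = hashlib.sha256()
    for row in sorted(rows):
        digest.update(repr(row).encode("utf-8"))
    return digest.hexdigest()


def find_differences(live_rows, parallel_rows):
    """List rows that are missing or altered in the live data, and rows the copy lacks."""
    live, parallel = set(live_rows), set(parallel_rows)
    return {
        "missing_or_changed": sorted(parallel - live),
        "not_in_copy": sorted(live - parallel),
    }


# Made-up rows: the live database has lost one row and corrupted another.
parallel_copy = [(1, "Thandi", 10), (2, "Sipho", 11), (3, "Lerato", 12)]
live_database = [(1, "Thandi", 10), (2, "Sipho", 99)]

data_matches = fingerprint(live_database) == fingerprint(parallel_copy)
differences = find_differences(live_database, parallel_copy)
print(data_matches)
print(differences)
```

When the fingerprints differ, the listed rows show what needs to be restored from the parallel data set or from a backup.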