DAB 202 IT Service Management Week 6


Questions and Answers

Which of the following is considered a 'related job role' in the field of data engineering?

  • Data Scientist (correct)
  • Network Security Engineer
  • Database Administrator specializing in backups
  • Frontend Web Developer

Organizations are projected to increase their investment in data and analytics services by what percentage through 2026?

  • 45 percent (correct)
  • 25 percent
  • 15 percent
  • 35 percent

Which of the following best describes the use of data analytics?

  • Applying mathematical models to handle unstructured data in real-time.
  • Systematic analysis of large datasets (big data) to find patterns and trends to produce actionable insights. (correct)
  • Analyzing large and complex datasets to uncover intricate relationships between variables.
  • Making predictions from data at a scale impossible for humans.

What is the primary function of AI/ML in contrast to data analytics?

  • To make predictions from data at a scale that is difficult or impossible for humans. (correct)

A retail store analyzes historical total revenue per customer and categorizes customers based on their spending habits. Which type of analytics is the store primarily using?

  • Descriptive analytics to segment customers. (correct)

A healthcare provider uses machine learning to forecast which patients are most likely to be readmitted within 30 days. Which category of analytics does this fall under?

  • Predictive (correct)

A manufacturing plant analyzes sensor data to determine the root causes of recent equipment failures. Which category of analytics are they employing?

  • Diagnostic (correct)

A city uses traffic pattern analysis to recommend adjustments to signal timings aiming to reduce congestion. Which type of data analytics are they applying?

  • Prescriptive analytics (correct)

What is the crucial initial step in designing a data pipeline for data-driven decisions?

  • Identifying the business problem to be solved. (correct)

Which of the following best describes data wrangling in the context of a data pipeline?

  • The tasks of discovering, cleaning, normalizing, transforming, and augmenting data as it passes through the pipeline. (correct)

In a data-driven organization, what is the primary focus of data engineering?

  • Building and maintaining the data infrastructure. (correct)

Which of the following practices aligns with the 'Unify' aspect of modern data strategies?

  • Breaking down data silos to create a single source of truth. (correct)

What is the primary goal of 'Modernizing' data infrastructure in data-driven organizations?

  • To increase agility and reduce undifferentiated heavy lifting. (correct)

Which action exemplifies the 'Innovate' pillar of a modern data strategy?

  • Applying AI/ML to uncover new insights in unstructured data. (correct)

Which of the following is NOT one of the 'Five Vs' of data?

  • Volatility (correct)

Which 'V' of data is most closely associated with the trustworthiness and accuracy of data?

  • Veracity (correct)

How do volume and velocity most directly impact data pipeline design?

  • They drive decisions about the data's required processing power and storage capacity. (correct)

Which scenario exemplifies streaming ingestion?

  • Clickstream data analyzed continuously from a retailer's website (correct)

Which data type is characterized by elements and attributes, has a self-describing structure, and is exemplified by formats like JSON and XML?

  • Semistructured (correct)

Why is it important to consider data variety when designing a data pipeline?

  • Different data types and sources may require different processing and transformation techniques. (correct)

Which of the following is generally true about unstructured data compared to structured data?

  • It is harder to query but more flexible. (correct)

In data analytics, what is the significance of 'data lineage'?

  • It traces the transformations applied to the data and its origins. (correct)

A data engineer discovers inconsistencies in how customer addresses are formatted across two merged datasets. Which action best addresses this issue?

  • Normalizing the address format to a consistent standard. (correct)

What is the advantage of storing timestamped details instead of aggregated values in a data analytics system?

  • Allows for detailed analysis to find and debug errors. (correct)

Which of the following strategies would most effectively enhance the 'Veracity' of data in a data pipeline?

  • Securing all layers of the pipeline and preventing unwanted data changes. (correct)

A sensor network is established at a wind farm which sends constant streams of data about wind speed, direction, and temperature. Which category of data source best describes this situation?

  • Events, IoT devices, and sensors (correct)

A data scientist is tasked with building a predictive model for customer churn. During the data exploration phase, they notice a significant number of missing values in a key demographic field. Which action is the MOST appropriate first step to address this issue?

  • Investigate the reasons for the missing data and assess potential biases. (correct)

A company has traditionally relied on nightly batch processing of sales data to generate reports. However, business stakeholders are now demanding real-time insights into sales performance. Which data architecture change would best enable this capability?

  • Implement a streaming data ingestion and processing pipeline alongside the batch pipeline. (correct)

A financial institution wants to detect fraudulent transactions as quickly as possible. Which type of data analysis approach is MOST suitable for this purpose?

  • Streaming analytics (correct)

Which of the following is a critical consideration when choosing a data storage solution for a high-volume, high-velocity data stream, such as sensor data from industrial equipment?

  • Scalability to handle increasing data volumes and ingestion rates. (correct)

A large e-commerce company captures user browsing behavior, product views, purchases, and free-text comments to personalize recommendations. Which data type best categorizes the user comments?

  • Unstructured (correct)

A health tech company is developing a personalized medicine app that combines patient medical history, genomic data, and real-time data from wearable sensors. What challenge is MOST likely to arise due to the variety of data sources?

  • Integrating different data types and formats effectively. (correct)

A retail company implements a new data governance program. Which activity would directly support maintaining data integrity and consistency?

  • Securing all layers of the data pipeline. (correct)

After a systems upgrade, a data analyst notices that customer birthdates are being incorrectly recorded in the database, resulting in many customers appearing to be born on January 1, 1900. Which step would be the MOST effective in addressing this data quality issue?

  • Identify the root cause of the data entry issue and implement measures to prevent recurrence. (correct)

A data team is designing a new data warehouse. Which approach is typically recommended to best support traceability and debugging in data analytics?

  • Save all the raw, unmodified data. (correct)

A company captures data about user interactions with their web application. Over time, they transition from capturing simple page view events to more complex events including mouse movements and form inputs. Which 'V' of data does this change primarily reflect?

  • Variety (correct)

Flashcards

Data analytics

A systematic analysis of large datasets to find patterns and trends, producing actionable insights.

AI/ML

Mathematical models used to make predictions from data at a scale that is difficult for humans.

Descriptive Analytics

Analyzing past performance, history, and trends.

Diagnostic Analytics

Describing the reasons behind trends in data.

Predictive Analytics

Forecasting future trends based on data.

Prescriptive Analytics

Recommending decisions or courses of action based on data analysis.

Data Pipeline

The infrastructure for data-driven decision-making.

Data Wrangling

How data is acted upon as it passes through the pipeline.

Data Volume

The amount of data that needs to be processed.

Data Velocity

How quickly data enters and moves through the pipeline.

Data Variety

The range of data types and sources, including structured, semistructured, and unstructured data.

Data Veracity

The extent to which data can be trusted for analysis.

Data Value

The value or insight that can be pulled from data.

Data Consistency

How closely different metrics, derived from the same dataset, agree.

Structured Data

Data organized in a predefined format, typically rows and columns.

Semi-structured Data

Data with some organizational properties, making it easier to analyze than unstructured data.

Unstructured Data

Data that has no predefined format.

On-Premises Databases

Application data that is stored and managed directly within the organization.

Public Datasets

Data aggregated about a topic, such as census data, health data and population data.

Time-series Data

Data generated continually by events and includes a time-based element.

Data Veracity

Ensuring the trustworthiness of source data to prevent bad data from entering the pipeline.

Data Integrity

The assurance that data remains accurate, complete, and consistent throughout the pipeline.

Discover Data Errors

The action of discovering data issues that decrease veracity.

Immutable data

Data that is not changed after it is written; transformations produce new versions so the source data is preserved.

Study Notes

  • DAB 202 is IT Service Management for Week 6

Job Roles

  • Job roles related to the data engineer include data analyst, data scientist, extract, transform, and load (ETL) developer, and machine learning (ML) practitioner

Data-Driven Decisions

  • Through 2026, organizations plan to increase investments in data and analytics services by 45% to become more data-driven and digital (Gartner, March 2022)

Making Decisions

  • Individuals use data to decide on restaurants, where to buy items, fantasy football picks, job choices, and home prices
  • Organizations apply data to identify fraudulent transactions, measure the impact of webpage designs, find at-risk patients, detect security issues, and determine harvest times
  • Large greenhouse tomato growers, for example, can make data-driven decisions

Data Analytics vs AI/ML

  • Data analytics systematically analyzes large datasets (big data) to identify patterns and trends that produce actionable insights
  • Data analytics uses programming logic to answer questions from data and works well with structured data and a limited number of variables
  • AI/ML uses mathematical models to make predictions from data at a scale that is difficult or impossible for humans
  • AI/ML learns from examples in large amounts of data to answer questions
  • AI/ML works well when data is complex and unstructured

Customer Relationship Management Example

  • A retail business uses data analytics to analyze total revenue per customer and segment customers by spending
  • Segmentation allows the business to provide a higher level of customer service to high-spending customers
  • The same business can use AI/ML to analyze customer churn, that is, how often customers come and go
  • AI/ML helps discover what influences churn, enabling changes that better retain customers (a descriptive-segmentation sketch follows this list)
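
To make the descriptive-analytics half of this example concrete, here is a minimal sketch in pandas, assuming a hypothetical order history with customer_id and order_total columns; the column names and spend thresholds are illustrative, not part of the course material.

```python
import pandas as pd

# Hypothetical order history; column names and values are assumptions for illustration.
orders = pd.DataFrame({
    "customer_id": [1, 1, 2, 3, 3, 3],
    "order_total": [120.0, 80.0, 15.0, 300.0, 250.0, 400.0],
})

# Descriptive analytics: summarize past behavior (total revenue per customer).
revenue = (
    orders.groupby("customer_id")["order_total"]
    .sum()
    .reset_index(name="total_revenue")
)

# Segment customers by spending so high spenders can receive a higher service level.
revenue["segment"] = pd.cut(
    revenue["total_revenue"],
    bins=[0, 100, 500, float("inf")],
    labels=["low", "medium", "high"],
)
print(revenue)
```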

Categories of Analytics

  • Descriptive analytics describes past performance, history and trends to answer "What happened?"
  • Diagnostic analytics describes the reasons behind trends and helps answer "Why did it happen?"
  • Predictive analytics forecasts future trends to answer "What will happen?"
  • Prescriptive analytics recommends a course of action and helps answer "How can we make it happen?"

Analytics Categories and Questions

  • "Which customers churned?" is Descriptive.
  • "Why did the customer churn?" is Diagnostic.
  • "Which customers will churn?" is Predictive.
  • "What can I do to change the outcome of customer churn?" is Prescriptive.

Importance of data value and insights

  • More valuable insights are more difficult to derive
  • Descriptive analytics produces lower-value insights that are easy to derive
  • Diagnostic analytics produces medium-value insights that are more difficult to derive
  • Predictive analytics produces higher-value insights that are even more difficult to derive
  • Prescriptive analytics provides the highest value and is the hardest to derive
  • More data and fewer barriers equate to more data-driven decisions

Data Science in Daily Life

  • Everyday experiences include seeing shoe ads online after shopping, streaming recommendations and pizza delivery tracking
  • Further examples are fraud alerts when using a credit card and navigation apps alerting of traffic jams
  • More data doesn't always equate to more value
  • More collected data also brings higher data costs, more unstructured data, greater security risks and slower query processing
  • Data becomes less valuable for decision-making over time

Data Value

  • The scale from most to least valuable is preventive or predictive, actionable, reactive, and then historical
  • Data that is near real time is more valuable than data from days or months ago
  • Cost, speed, and accuracy are the trade-offs taken when dealing with data driven decisions
  • Organizations use data to make informed decisions

Data Analytics vs AI/ML usage

  • Data analytics relies on programming logic and tools to derive answers from data and works better with a limited number of variables
  • AI/ML can learn to make predictions from examples in data and works best when data is unstructured and the variables are complex

The Data Engineering Pipeline

  • Data pipelines provide infrastructure for data-driven decision-making
  • The simplest data pipeline has stages for data collection, storage and processing, and then building something practical with data
  • Design the infrastructure by first identifying the decision to be made and then determining what data is essential to support that decision
  • The infrastructure should be designed weighing cost, speed and accuracy
  • Pipeline infrastructure contains layers for data sources, ingestion, storage, processing, visualization and then, finally, predictions and decisions
  • Data wrangling is how data is transformed as it moves through the pipeline and includes steps such as discovery, cleaning, normalization, enrichment, and transformation
  • Data is often processed iteratively to evaluate and improve the results

Key Pipeline actions

  • In the pipeline, data is ingested, stored, processed, analyzed, and visualized
  • When designing a pipeline, begin by determining the problem that needs solving and then decide what data is required
  • Data wrangling acts on data passing through the pipeline through steps such as discovery, cleaning, normalization, transformation, and augmentation (a minimal wrangling sketch follows this list)
  • Data is iteratively processed to evaluate and refine results
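
A minimal sketch of these wrangling steps, assuming a hypothetical customer extract with inconsistent formatting; the column names and cleanup rules are illustrative only, not a prescribed process.

```python
import pandas as pd

# Hypothetical raw extract; column names and values are assumptions for illustration.
raw = pd.DataFrame({
    "email": ["A@Example.com", "b@example.com", None, "b@example.com"],
    "country": ["usa", "US", "Canada", "US"],
    "signup_date": ["2024-01-03", "2024/01/05", "2024-02-10", "2024/01/05"],
})

wrangled = (
    raw
    .dropna(subset=["email"])      # clean: drop records missing a key field
    .drop_duplicates()             # clean: remove duplicate records
    .assign(
        # normalize: consistent casing and country codes
        email=lambda df: df["email"].str.lower(),
        country=lambda df: df["country"].replace({"usa": "US"}),
        # transform: parse dates after normalizing the separator
        signup_date=lambda df: pd.to_datetime(df["signup_date"].str.replace("/", "-")),
    )
    # enrich: add a derived attribute for downstream analysis
    .assign(signup_month=lambda df: df["signup_date"].dt.to_period("M"))
)
print(wrangled)
```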

Roles of the Data Scientist and Data Engineer

  • Data scientists and data engineers work with the data pipeline and certain tasks can be fulfilled by either
  • Data engineering primarily concerns the infrastructure while a data scientist works with the data inside the pipeline
  • Building the best pipeline requires asking questions about the required outcomes and the data, and then iterating

Data Strategies

  • A three-pronged strategy to build data infrastructure involves modernizing, unifying, and innovating
  • Modernizing means moving to cloud-based, purpose-built services to reduce administrative and operational overhead
  • Unifying means creating a single source of truth for data, making it available across the organization
  • Innovating means seeking new value from data, specifically by applying AI and ML

The Five Vs of Data

  • The module lists the five Vs of data: volume, velocity, variety, veracity, and value
  • The module will describe the impact of volume and velocity on a data pipeline
  • The module will compare and contrast structured, semi-structured and unstructured data types
  • The module will identify data sources that are commonly used to feed data pipelines and also questions to assess data veracity
  • The module will suggest methods to improve data veracity in a pipeline

Data Pipeline Q&A

  • Common data pipeline questions include asking if the organization has the needed data, where it is stored and its format
  • Additional common questions are whether the data requires combining from multiple sources and what its quality and security requirements are
  • Questions to ask include what mechanisms are in place to move the data into a pipeline, how much data there is, how frequently it is updated, and how important speed is
  • Further questions to ask are what the data can determine, how to evaluate results, visualization specifics, formats, tools, and whether a big data solution is required
  • Final questions are which AI/ML models are best suited and what the simplest way is to implement AI/ML

Data Characteristics

  • Data characteristics drive infrastructure decisions in respect to volume, velocity, variety, veracity and value
  • Volume relates to dataset size and how much new data is being generated
  • Velocity considers new data being generated, how often and when
  • Variety evaluates the types, formats and amount of data sources
  • Veracity looks at data accuracy, precision and trustworthiness
  • Value reviews what insights can be assembled from the data
  • Strategies to get the best value from data include confirming that available data meets the need and evaluating the feasibility of acquiring additional data
  • Additional strategies include matching pipeline design to the data, cataloging data, focusing on the business need, and implementing governance policies

Volume and Velocity

  • Scaling a pipeline considers data volume and velocity
  • Data volume and pace drive design choices and all pipeline layers are impacted
  • Each pipeline layer must be evaluated for its individual requirements
  • Balance the costs of throughput and storage against answer time and accuracy

Ingestion, Volume and Velocity

  • Analyze the best ingestion method, the amount of data to be ingested and the frequency of new data to be ingested and processed
  • An example of streaming ingestion is clickstream data from a retailer's website, which arrives as a continuous flow of small records and requires immediate analysis
  • An example of batch ingestion is sales transaction data from retail locations that is sent to a central location periodically
  • Batch data is typically analyzed overnight, with reports sent the following morning (a sketch contrasting the two styles follows this list)
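
A minimal sketch contrasting the two ingestion styles, assuming an AWS environment with a hypothetical Kinesis stream named clickstream and a hypothetical S3 bucket named sales-landing; the names, fields, and paths are illustrative.

```python
import json
import boto3

kinesis = boto3.client("kinesis")
s3 = boto3.client("s3")

# Streaming ingestion: each click event is sent to the stream as it happens.
def send_click_event(event: dict) -> None:
    kinesis.put_record(
        StreamName="clickstream",               # hypothetical stream name
        Data=json.dumps(event).encode("utf-8"),
        PartitionKey=str(event["session_id"]),
    )

# Batch ingestion: a day's worth of sales transactions is uploaded periodically.
def upload_daily_sales(local_path: str, business_date: str) -> None:
    s3.upload_file(
        Filename=local_path,
        Bucket="sales-landing",                 # hypothetical bucket name
        Key=f"sales/date={business_date}/transactions.csv",
    )
```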

Storage and Volume

  • Storage must scale with the volume of data being ingested and keep data accessible for processing and analysis when required
  • An example of long-term, reporting-style access is storing five years of sales data for trend analysis on a monthly cadence
  • An example of short-term, very fast access is incoming ecommerce transaction data used to suggest additional purchases within the current session

Processing, Volume and Velocity

  • Assess the volume of data that must be processed in a single iteration and whether a distributed solution is needed
  • Also address how quickly and how frequently the processing needs to occur
  • An example of big data processing is analyzing all US-based credit card transactions from the past week
  • Streaming analytics can produce real-time alerts on log data to identify potential fraud as it occurs (a simple sliding-window sketch follows this list)
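
As a toy illustration of the streaming-analytics idea (not the course's fraud method), the sketch below flags a card when more than a threshold number of transactions arrive within a short sliding window; the window size, threshold, and event fields are assumptions.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 60          # illustrative sliding window
MAX_TXNS_PER_WINDOW = 5      # illustrative threshold

recent = defaultdict(deque)  # card_id -> timestamps of recent transactions

def process_transaction(card_id: str, timestamp: float) -> bool:
    """Return True if this transaction looks suspicious (too many in the window)."""
    window = recent[card_id]
    window.append(timestamp)
    # Drop events that fell out of the sliding window.
    while window and timestamp - window[0] > WINDOW_SECONDS:
        window.popleft()
    return len(window) > MAX_TXNS_PER_WINDOW

# Example: the sixth transaction within one minute triggers an alert.
for i in range(7):
    if process_transaction("card-123", timestamp=float(i)):
        print(f"ALERT: suspicious activity on card-123 at t={i}")
```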

Analysis, Visualization, Volume and Velocity

  • Historical analysis involves visualizing a year’s worth of sales data, enabling users to drill down by region and salesperson
  • A streaming example is Internet of Things (IoT) sensor data from a plant, analyzed and reported on at the rate it arrives

Data Volume

  • Volume relates to the amount of data being processed
  • Velocity is how quickly data enters and moves through a pipeline
  • Volume and velocity decide the expected throughput and scaling needs of a pipeline
  • Every pipeline layer requires evaluation for volume and velocity
  • Balance costs and throughput needs

Data Variety and Pipelines

  • Pipeline design is influenced by the type and source of data
  • Each data type lends itself to certain types of processing and analysis
  • Each data source requires different amounts of discovery and transformation work
  • The data source type determines the type and scope of the ingestion layer

Types of Data

  • Data types include structured, semistructured and unstructured
  • Structured data is easy to use, involving rows and columns and a well-defined schema, like relational databases
  • Semistructured data has a self-describing structure of elements and attributes, as in CSV, JSON, and XML
  • Unstructured data consists of files with no predefined structure, such as images, videos, and clickstream data
  • Unstructured data is harder to query but more flexible, whereas structured data is easier to query but less flexible
  • 80% or more of available data is unstructured

How data types are used

  • A structured example is querying a relational database to report on customer service within a specified time period
  • A semistructured example is extracting and analyzing customer comments from an online chat application that saves conversations as JSON (a JSON-parsing sketch follows this list)
  • An unstructured example is analyzing the sentiment of customer service emails
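
A minimal sketch of the semistructured case, assuming a hypothetical chat export whose session and messages fields are made up for illustration; only the Python standard library is used.

```python
import json

# Hypothetical JSON export from a chat application; field names are assumptions.
raw = """
{
  "session": "abc-123",
  "messages": [
    {"sender": "customer", "text": "My order arrived damaged."},
    {"sender": "agent", "text": "Sorry to hear that, let me help."},
    {"sender": "customer", "text": "Thanks, that was quick!"}
  ]
}
"""

conversation = json.loads(raw)

# The self-describing structure lets us pull out just the customer comments.
customer_comments = [
    m["text"] for m in conversation["messages"] if m["sender"] == "customer"
]
print(customer_comments)
```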

Data Source Types

  • Data includes structured, semi-structured and unstructured
  • Structured is the easiest to query but least flexible
  • Unstructured is hardest to query but more flexible
  • Most of the growth in available data is unstructured
  • Data source types include organizational stores, public datasets, and time-series data
  • Combining datasets enriches analysis but can complicate processing

Common Data Sources

  • On-premises databases or file stores hold application data that is owned and managed by the organization
  • Public datasets are aggregated data about a topic, such as census, health, or population data
  • Events, IoT devices, and sensors generate data continuously, including a time-based component

Pipeline Considerations

  • On-premises databases are controlled by the organization and may contain private, structured information
  • Public datasets might not be in the required format, may need transformation or merging, and are often semistructured
  • Events, IoT devices, and sensors require streaming ingestion and time-series storage, requiring real-time processing

Data Veracity Challenges

  • An application in healthcare runs analysis on customer data to determine which patients have not received proper care
  • An application in healthcare combines public health data with customer data for more accurate and improved mobile alerts
  • A mobile application provides real-time heart rate monitoring and alerts given risk patterns
  • Inconsistent data formatting can impact the ability to analyze the data
  • Data can become difficult to maintain given multiple data types and merges

Veracity and Value

  • Data veracity is easier to maintain when data integrity is enforced and there are fewer data challenges
  • The data's type, sources, and governance influence the data issues that arise
  • Trustworthy data enables insights that are more relevant to the business
  • Volume means having the right amount of data of the type needed
  • The nature of the data helps you derive more business insights with the appropriate speed
  • Data source types range from organizational stores to public datasets and time-series data
  • Combining datasets enriches analysis but adds complexity to processing

Data and Pipeline Security

  • Bad data is worse than no data; veracity is the extent to which data can be trusted
  • To maintain trustworthy data, clean the data and implement controls
  • Prevent unwanted changes to stored data and ensure consistency

Data Integrity

  • The ingestion process must track data state and validity
  • Cleaning processes must preserve data integrity
  • Processing must preserve this validity as the data is analyzed
  • Data issues include duplicates, outdated data, and missing records, among other concerns (a minimal validation sketch follows this list)
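
A minimal sketch of such integrity checks, assuming a hypothetical customer table with customer_id, birthdate, and updated_at columns; the checks, placeholder date, and staleness cutoff are illustrative.

```python
import pandas as pd

# Hypothetical customer extract; column names and values are assumptions for illustration.
customers = pd.DataFrame({
    "customer_id": [1, 2, 2, 4],
    "birthdate": ["1985-06-01", "1900-01-01", "1900-01-01", None],
    "updated_at": ["2024-05-01", "2019-01-15", "2019-01-15", "2024-04-20"],
})
customers["birthdate"] = pd.to_datetime(customers["birthdate"])
customers["updated_at"] = pd.to_datetime(customers["updated_at"])

issues = {
    # Duplicate records (same customer_id appears more than once).
    "duplicates": int(customers.duplicated(subset="customer_id").sum()),
    # Missing values in a key field.
    "missing_birthdate": int(customers["birthdate"].isna().sum()),
    # Placeholder birthdates that suggest a data-entry defect (see the quiz question above).
    "placeholder_birthdate": int((customers["birthdate"] == pd.Timestamp("1900-01-01")).sum()),
    # Stale records not updated recently (illustrative cutoff).
    "stale_records": int((customers["updated_at"] < pd.Timestamp("2023-01-01")).sum()),
}
print(issues)
```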

Maintaining Consistency

  • Keep new sources in mind to prevent data integrity issues
  • Ask how new information will be managed in relation to the processes already in place
  • Establish ways to discover, fix, and prevent data issues
  • Secure every layer of the pipeline
  • Assign permissions based on least privilege
  • Design processes with data usage and integrity in mind when implementing new steps
  • Keep audit trails (records of past actions) to support data integrity

Data Management Tips

  • Ask questions about data trustworthiness and data lineage
  • Work from known, trusted facts and avoid designs that produce inconsistent details about the same facts
  • Be conscious of when conversions add or change details in the data
  • Store timestamped details to keep track of values over time
  • Follow best practices for maintaining data integrity and keeping data safe
  • Data management plans are a must (a lineage and timestamping sketch follows this list)
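
A minimal sketch of the timestamping and lineage idea, assuming a small ingestion helper that tags each record with its source and ingestion time; the field names and source label are illustrative.

```python
from datetime import datetime, timezone

def tag_with_lineage(record: dict, source: str) -> dict:
    """Attach illustrative lineage metadata: where the record came from and when it was ingested."""
    return {
        **record,
        "_source": source,                                       # data lineage: origin of the record
        "_ingested_at": datetime.now(timezone.utc).isoformat(),  # timestamp for traceability
    }

# Example usage: raw values are preserved; metadata supports later debugging and audits.
raw_event = {"customer_id": 42, "order_total": 120.0}
print(tag_with_lineage(raw_event, source="on_prem_sales_db"))
```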
