Questions and Answers
Which of the following is NOT a typical data quality dimension considered in literature?
- Accuracy
- Relevance (correct)
- Completeness
- Timeliness
In forecasting, fitting errors are generally larger than forecast errors.
False (B)
What preliminary step is crucial before selecting a forecasting model, involving the construction and visual inspection of time series plots?
Data analysis
The forecasting process step that involves assessing how well the forecasting model is likely to perform in its intended application is known as model ______.
Match the following imputation techniques with their descriptions:
What is the primary purpose of Data Cleaning in the context of data warehousing for forecasting?
Qualitative forecasting techniques rely primarily on historical data patterns.
In the context of forecasting, what is the Forecast Horizon?
The component of a time series that repeats on a regular basis, such as yearly, is known as ______.
Which of the following represents the correct chronological order of steps in the forecasting process?
Flashcards
What is a process?
A series of connected activities transforming inputs into outputs.
Problem Definition in Forecasting
Understanding forecast usage and customer expectations.
Data Collection
Obtaining relevant historical data for variables to be forecast, including predictor variables.
Data Analysis
Model Selection and Fitting
Model Validation
Forecasting Model Deployment
Monitoring Forecasting Model Performance
Data Cleaning
Data Imputation
Study Notes
- A process is a series of connected activities transforming inputs into outputs.
Forecasting Process
- Problem Definition
- Data Collection
- Data Analysis
- Model Selection and Fitting
- Model Validation
- Forecasting Model Deployment
- Monitoring Forecasting Model Performance
Problem Definition
- Includes understanding how the forecast will be used and the user's expectations.
- Key questions to address: desired forecast form, forecast horizon, forecast interval, and forecast accuracy level.
Data Collection
- Involves obtaining the relevant historical data for the variables to be forecast, including potential predictor variables.
- The key is relevance: information collection and storage methods and systems change over time, and not all historical data is useful for the current problem.
- During this phase, planning for future data collection and storage to preserve reliability and integrity is important.
Data Analysis
- An important preliminary step in selecting a forecasting model.
- Time series plots should be constructed and visually inspected for patterns like trends, seasonality, or cyclicity.
- Trend is the evolutionary movement (upward or downward) in the variable's value and can be long-term, dynamic, or of short duration.
- Seasonality is a time series component that repeats at a regular, fixed interval (e.g., yearly); cyclic patterns also rise and fall but do not repeat at a fixed interval.
- Numerical summaries, such as standard deviation, percentiles, and autocorrelation, should be computed and evaluated.
- Unusual data points or potential outliers should be identified and flagged for further study.
- The purpose of data analysis is to gain a "feel" for the data and to understand underlying patterns such as trends and seasonality; this in turn suggests initial quantitative forecasting methods and models to explore.
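The numerical summaries mentioned above can be sketched in a few lines. This is a minimal illustration using only the Python standard library; the series `y` is invented for the example, and the `autocorrelation` helper is a hypothetical name implementing the usual sample autocorrelation formula.

```python
import statistics

def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at a positive lag."""
    n = len(series)
    mean = sum(series) / n
    # Denominator: total squared deviation from the mean.
    denom = sum((x - mean) ** 2 for x in series)
    # Numerator: products of deviations `lag` periods apart.
    numer = sum(
        (series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag)
    )
    return numer / denom

# Invented series with an upward trend and a pattern repeating every 4 periods.
y = [10, 14, 9, 12, 13, 17, 12, 15, 16, 20, 15, 18]

summary = {
    "std_dev": statistics.stdev(y),             # sample standard deviation
    "quartiles": statistics.quantiles(y, n=4),  # 25th, 50th, 75th percentiles
    "acf_lag1": autocorrelation(y, 1),
    "acf_lag4": autocorrelation(y, 4),
}
for name, value in summary.items():
    print(name, value)
```

Comparing autocorrelations at several lags is one way to check whether a repeating pattern suspected from the plot shows up numerically.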
Model Selection and Fitting
- Involves choosing one or more forecasting models and fitting them to the data.
- Fitting refers to estimating the unknown parameters, usually by least squares methods.
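As an illustration of fitting by least squares, the sketch below estimates the two unknown parameters of a straight-line trend model, y_t = b0 + b1 * t. The linear trend model, the function name `fit_linear_trend`, and the data are assumptions chosen for the example; the notes do not prescribe a particular model.

```python
def fit_linear_trend(y):
    """Least squares estimates (intercept, slope) for y_t = b0 + b1 * t, t = 1..n."""
    n = len(y)
    t = list(range(1, n + 1))
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    # Slope = covariance of (t, y) divided by variance of t.
    sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return intercept, slope

# Invented data lying exactly on the line y = 1 + 2t.
y = [3.0, 5.0, 7.0, 9.0, 11.0]
b0, b1 = fit_linear_trend(y)
print(b0, b1)
```

Because the invented data lie exactly on a line, the estimates recover the intercept 1 and slope 2; real data would leave nonzero fitting errors.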
Model Validation
- Consists of evaluating the forecasting model to determine its likely performance in the intended application.
- Examination of the "fit" to historical data and the magnitude of forecast errors when the model is used on fresh data is required.
- Fitting errors are generally smaller than forecast errors, because the model parameters are estimated from the fitting data.
- Data Splitting: A method for validating a forecasting model before use. The data are split into two segments: a fitting segment, on which the model is fit, and a forecasting segment, on which forecasts are simulated.
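Data splitting can be sketched as follows. The straight-line model, the 8/4 split, the mean absolute error measure, and the series itself are all assumptions made for illustration, not part of the notes.

```python
def fit_line(t, y):
    """Least squares (intercept, slope) for y = b0 + b1 * t."""
    n = len(t)
    t_mean, y_mean = sum(t) / n, sum(y) / n
    slope = sum((a - t_mean) * (b - y_mean) for a, b in zip(t, y)) / sum(
        (a - t_mean) ** 2 for a in t
    )
    return y_mean - slope * t_mean, slope

def mean_abs_error(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Invented series of 12 observations.
y = [12, 15, 14, 18, 19, 21, 20, 24, 27, 25, 30, 33]
split = 8  # first 8 points form the fitting segment, the rest the forecasting segment

fit_t, fit_y = list(range(1, split + 1)), y[:split]
hold_t, hold_y = list(range(split + 1, len(y) + 1)), y[split:]

# Estimate parameters on the fitting segment only.
b0, b1 = fit_line(fit_t, fit_y)

# Fitting errors: model versus the data it was fit to.
fitting_error = mean_abs_error(fit_y, [b0 + b1 * t for t in fit_t])
# Forecast errors: model versus data it has not seen.
forecast_error = mean_abs_error(hold_y, [b0 + b1 * t for t in hold_t])

print(fitting_error, forecast_error)
```

For this invented series the error on the forecasting segment comes out larger than the fitting error, which is the pattern the notes describe.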
Forecasting Model Deployment
- Involves getting the fitted model and the resulting forecasts into use by the customer.
- Ensuring that the customer understands how to use the model and that timely forecasts are generated routinely is vital.
Monitoring Forecasting Model Performance
- Should be ongoing after model deployment to ensure it performs satisfactorily.
Data for Forecasting
- Data are the raw materials for modeling and forecasting.
- Data vs. Information: The term "data" is preferred because it better describes what the forecasting process needs; information is extracted or synthesized from data.
- The output of forecasting can be viewed as information produced from data as the input.
- A data warehouse is a repository for an organization's data (sales, transactions, etc.).
- Data warehouses store data, often in the cloud, and organize, manipulate, and integrate data from multiple sources/systems.
Basic Functionality of Data Warehouse
- Data Extraction: Obtaining data from external sources.
- Data Transformation: Applying rules to prevent duplication and address issues like missing information (sometimes called data cleaning).
- Data Loading: Loading transformed data into the warehouse for modeling and analysis.
Data Quality Dimensions
- Accuracy
- Timeliness
- Completeness
- Representativeness
- Consistency
Data Cleaning
- Examines data to detect errors, missing values, outliers, or inconsistencies, and then corrects them.
- Errors can result from recording or transmission problems.
- Errors can often be corrected by working with the original data source; doing so improves the forecasting process.
Checks Before Using Data to Develop Models
- Check for missing data.
- Confirm if the data falls within an expected range.
- Check for potential outliers or unusual values.
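The three checks above might look like this in practice. This is a rough sketch: representing missing values as `None`, the expected range of 0 to 200, and the z-score cutoff of 2 for flagging outliers are assumptions for the example, and `check_series` is a hypothetical helper name.

```python
import statistics

def check_series(values, low, high, z_limit=2.0):
    """Return index lists for missing, out-of-range, and outlying observations."""
    missing = [i for i, v in enumerate(values) if v is None]
    present = [(i, v) for i, v in enumerate(values) if v is not None]
    out_of_range = [i for i, v in present if not (low <= v <= high)]

    observed = [v for _, v in present]
    mean = statistics.mean(observed)
    sd = statistics.stdev(observed)
    # Flag points whose distance from the mean exceeds z_limit standard deviations.
    outliers = [i for i, v in present if sd > 0 and abs(v - mean) / sd > z_limit]
    return {"missing": missing, "out_of_range": out_of_range, "outliers": outliers}

# Invented data: one missing value, one negative reading, one very large reading.
data = [102, 98, None, 101, 99, 100, 250, 97, -5, 103]
report = check_series(data, low=0, high=200)
print(report)
```

Flagged indices are candidates for further study rather than automatic deletions; a z-score rule is only one of several possible outlier screens.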
Imputation
- Data Imputation: Correcting missing data or replacing outliers using an estimation process. Missing or erroneous values are replaced with "likely" values based on available information, enabling analysis with complete datasets.
- Mean value imputation: Replacing a missing value with the average of nonmissing observations.
- Stochastic mean value imputation: Adding a random variable to the mean value to capture noise or variability.
- Regression imputation: Computing the imputed value from a model that predicts the missing value; the prediction model does not have to be linear.
- Hot deck imputation: Imputing values using values from similar complete observations.
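- Cold deck imputation: Imputing values using information from a data set not currently in use. The "deck" terminology comes from decks of computer punch cards: a hot deck was the one currently in use.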
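The first four techniques above can be compared on a toy example. In this sketch, missing values are represented as `None`; the Gaussian noise in the stochastic variant, the straight-line regression model, and the nearest-predictor rule for choosing a hot deck donor are assumptions made for illustration, and all function names are hypothetical.

```python
import random
import statistics

def mean_impute(values):
    """Replace each missing value with the mean of the nonmissing observations."""
    fill = statistics.mean(v for v in values if v is not None)
    return [fill if v is None else v for v in values]

def stochastic_mean_impute(values, rng):
    """Mean imputation plus random noise scaled to the observed variability."""
    observed = [v for v in values if v is not None]
    fill, sd = statistics.mean(observed), statistics.stdev(observed)
    return [fill + rng.gauss(0, sd) if v is None else v for v in values]

def regression_impute(x, y):
    """Predict missing y from x with a least squares line fit to complete pairs."""
    pairs = [(a, b) for a, b in zip(x, y) if b is not None]
    x_mean = statistics.mean(a for a, _ in pairs)
    y_mean = statistics.mean(b for _, b in pairs)
    slope = sum((a - x_mean) * (b - y_mean) for a, b in pairs) / sum(
        (a - x_mean) ** 2 for a, _ in pairs
    )
    intercept = y_mean - slope * x_mean
    return [intercept + slope * a if b is None else b for a, b in zip(x, y)]

def hot_deck_impute(x, y):
    """Copy y from the complete observation whose x is closest (the donor)."""
    donors = [(a, b) for a, b in zip(x, y) if b is not None]
    result = []
    for a, b in zip(x, y):
        if b is None:
            _, b = min(donors, key=lambda d: abs(d[0] - a))
        result.append(b)
    return result

# Invented data: y is missing for the third observation.
x = [1, 2, 3, 4, 5]
y = [2.0, 4.0, None, 8.0, 10.0]

by_mean = mean_impute(y)
by_noise = stochastic_mean_impute(y, random.Random(0))
by_regression = regression_impute(x, y)
by_hot_deck = hot_deck_impute(x, y)
print(by_mean[2], by_noise[2], by_regression[2], by_hot_deck[2])
```

The methods can give different answers for the same gap: here the mean and the regression line agree, while the hot deck copies the value of a neighboring observation.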
Resources for Forecasting
- Professional forecasting journals: Journal of Forecasting, International Journal of Forecasting, and Journal of Business Forecasting Methods and Systems
- These journals publish new methodology, evaluation studies, case studies, and applications.
- Mainstream statistics and operations research journals that publish papers on forecasting: Journal of Business and Economic Statistics, Management Science, Naval Research Logistics, Operations Research, International Journal of Production Research, and Journal of Applied Statistics
Two Broad Types of Forecasting Techniques
- Qualitative Forecasting Techniques (e.g., the Delphi method): Subjective, require expert judgment, and are used when there is limited or no historical data.
- Example: Forecasting demand for a new product using marketing tests, surveys, and analysis of the sales performance of similar products.
- Quantitative Forecasting Techniques: Use historical data and forecasting models to summarize data patterns and express statistical relationships.
- Regression Models: Use relationships between the variable of interest and related predictor variables.
- Smoothing Models: Employ a simple function of previous observations.
- General Time Series Models: Employ statistical properties of historical data to specify a formal model and estimate parameters.
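Two common smoothing models illustrate what "a simple function of previous observations" can mean. The moving average and simple exponential smoothing shown here are chosen as examples and are not named in the notes; the window of 2, the smoothing constant `alpha = 0.5`, and the data are assumptions.

```python
def moving_average_forecast(history, window):
    """Forecast the next value as the mean of the last `window` observations."""
    return sum(history[-window:]) / window

def exponential_smoothing_forecast(history, alpha):
    """Forecast the next value as an exponentially weighted average of the past."""
    level = history[0]  # initialize the level at the first observation
    for obs in history[1:]:
        # New level = weighted blend of the newest observation and the old level.
        level = alpha * obs + (1 - alpha) * level
    return level

history = [10, 12, 14, 16]  # invented observations
ma = moving_average_forecast(history, window=2)
es = exponential_smoothing_forecast(history, alpha=0.5)
print(ma, es)  # 15.0 14.25
```

Both forecasts lag behind the upward trend in this series, which is typical of smoothing methods that do not model trend explicitly.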
Form of the Forecast
- Point Estimates: A point forecast is almost always wrong, so it is good practice to accompany it with an estimate of how large a forecast error might be experienced.
- Prediction Interval: A range useful in decision-making.
- Point Estimate: A single numerical value used to approximate an unknown population parameter, commonly used with statistics such as mean, proportion, or variance.
- Point Forecast: A single predicted value for a future observation that uses time series analysis.
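One simple way to turn a point forecast into a prediction interval is sketched below. It assumes that forecast errors are roughly normally distributed with mean zero and that past errors are representative of future ones; these assumptions, the function name, and the numbers are illustrative and not taken from the notes.

```python
from statistics import NormalDist, stdev

def prediction_interval(point_forecast, past_errors, coverage=0.95):
    """Symmetric interval: point forecast +/- z * (std. dev. of past errors)."""
    # z is the standard normal quantile for the requested two-sided coverage.
    z = NormalDist().inv_cdf(0.5 + coverage / 2)
    half_width = z * stdev(past_errors)
    return point_forecast - half_width, point_forecast + half_width

# Invented point forecast and past one-step-ahead forecast errors.
point = 100.0
errors = [-2.0, 1.0, 0.5, -1.5, 2.0]
lower, upper = prediction_interval(point, errors)
print(lower, upper)
```

The interval widens as past errors become more variable or as the requested coverage increases, which makes the uncertainty behind the single number explicit to the decision maker.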
Forecasting Problem Features
- Forecast Horizon: Number of future periods for which a forecast must be produced.
- Forecast Interval: Frequency with which new forecasts are prepared.
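- Rolling (or moving) horizon forecasting: Forecasts are revised each time new forecasts are prepared, so the horizon always covers the same number of periods ahead.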
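The relationship between forecast horizon, forecast interval, and a rolling horizon can be made concrete with a small schedule. The function name `rolling_horizon_schedule` and the numbers used (forecasts prepared every period, three periods ahead, starting after period 12) are hypothetical.

```python
def rolling_horizon_schedule(first_origin, horizon, interval, updates):
    """List (origin, target_periods) pairs for successive forecast updates.

    horizon  -- number of future periods covered by each forecast
    interval -- number of periods between successive forecast updates
    """
    schedule = []
    for k in range(updates):
        origin = first_origin + k * interval
        # Each update forecasts the `horizon` periods following its origin.
        targets = list(range(origin + 1, origin + horizon + 1))
        schedule.append((origin, targets))
    return schedule

schedule = rolling_horizon_schedule(first_origin=12, horizon=3, interval=1, updates=3)
for origin, targets in schedule:
    print(origin, targets)
# 12 [13, 14, 15]
# 13 [14, 15, 16]
# 14 [15, 16, 17]
```

Each update drops the period that has just been observed and adds one new period at the far end, so the window of forecast periods rolls forward.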
Time series plots reveal patterns
- Trends
- Level Shifts
- Periods or cycles
- Unusual observations
- Combinations of patterns