Questions and Answers
What is the primary goal of the discord detection process?
How is the time series subsequence defined in discord detection?
What does the subset Tk represent in the context of discord detection?
What does the notation $\hat{T}_k$ signify in the given context?
In the formula provided, what does the expression $\arg\max_{T_i} \min_{T_j} d(x_{T_i}, x_{T_j})$ indicate?
What defines the limitation of discord detection?
Which algorithm employs heuristics and pruning for discord detection?
In distance-based approaches, what does the notation $d(x_{T_i}, \hat{x}) > \tau$ represent?
How can a reference for subsequence comparison be obtained in discord detection?
Which of the following best describes the brute-force search method in discord detection?
Which method involves the use of patterns from external time series for analysis?
What approach is generally used for collective outlier detection in time series?
In distance-based approaches, what is primarily compared to identify patterns?
When analyzing time series, what does clustering from the same time series refer to?
Which of the following correctly describes a potential reference for seasonal averages?
Which aspect of time series analysis does feature-based classification primarily focus on?
What is an important characteristic of collective outliers in time series?
What type of outlier detection is highlighted in distance-based approaches?
Which of the following is NOT an approach to detect collective outliers?
What does a reference value signify in distance-based approaches?
What defines the periodicity of subsequence outliers?
Which representation method is considered for collective outliers?
What is a key property of length in subsequences of collective outliers?
Which type of sequences can exhibit collective outliers?
Which approach focuses on measuring the similarity between data points to find outliers?
What is the significance of understanding subsequence representation in collective outlier detection?
What does the expression $\arg\max_{T_i} \min_{T_j} d(x_{T_i}, x_{T_j})$ primarily indicate?
In the context of discord detection, what is the role of $d(\cdot,\cdot)$?
What does the parameter 'Ti' represent in the context of the algorithm?
Which of the following describes '1-nearest neighbor distance' in this context?
What type of subsets does the algorithm focus on when measuring distance?
Which graphical representation reflects the process of measuring distance over time?
What can be inferred about subsets 'Ti' and 'Tj' based on their relationship?
Why is $\min_{T_j} d(x_{T_i}, x_{T_j})$ considered significant in this algorithm?
What is the primary purpose of training a model in the context of model-based approaches?
In model-based approaches, what does the symbol $\tau$ typically represent?
What does the inequality $|x_i - \tilde{x}_i| > \tau$ signify in model-based approaches?
What type of data does a model-based approach primarily rely on for its predictions?
The collective outlier detection method in model-based approaches is exemplified by which of the following?
Which step is NOT part of the forecasting process in model-based approaches?
What does training a predictive model involve in the context described?
What key aspect does the model-based approach focus on measuring?
Study Notes
Norwegian University of Life Sciences
- The institution offers courses in life sciences
DAT320: Outlier detection
- Course title: Outlier detection
- Course topic: Collective outliers in time series data
- Instructor: Kristian Hovde Liland
Collective outliers in time series
- Key properties of collective outliers (subsequences):
- Length of the subsequence: fixed-length versus variable-length
- Representation of the subsequence: models, transformations
- Periodicity of subsequence outliers: non-periodic sequences, periodic sequences
- Based on Blázquez-García et al., 2021, and Gupta et al., 2014
Approaches to detecting collective outliers (subsequences)
- Discord detection
- Distance-based approaches
- Model-based approaches
Discord detection
- Concept: determine "most unusual subsequence" (discord)
- Define time series subsequences by a sliding window $T_k = [t_k - \delta/2, t_k + \delta/2)$ for some $t_k$ and fixed $\delta$
- Discord subset $\hat{T}_k$: the subset that is most different from its most similar subset
- $\hat{T}_k = \arg\max_{T_i} \min_{T_j} d(x_{T_i}, x_{T_j})$
- $\min_{T_j} d(x_{T_i}, x_{T_j})$ acts as a 1-nearest neighbor distance (single linkage between $T_i$ and all possible non-overlapping subsets $T_j$)
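The brute-force search for the discord can be sketched as follows. This is an illustration in Python, not the course's R code: it assumes Euclidean distance and a toy periodic series, the names `brute_force_discord` and `series` are hypothetical, and it does not include the pruning used by HOT-SAX.

```python
import math

def brute_force_discord(x, delta):
    """Return (start, 1-NN distance) of the most unusual length-delta window."""
    starts = range(len(x) - delta + 1)
    best_start, best_dist = None, -1.0
    for i in starts:
        # 1-nearest-neighbour distance to all windows that do not overlap window i
        candidates = [
            math.dist(x[i:i + delta], x[j:j + delta])
            for j in starts
            if abs(i - j) >= delta
        ]
        if candidates and min(candidates) > best_dist:
            best_start, best_dist = i, min(candidates)
    return best_start, best_dist

# Toy data (assumed): six identical cycles, with the fourth cycle distorted
series = [0, 1, 0, -1] * 6
series[12:16] = [2, 3, 2, 1]
start, dist = brute_force_discord(series, delta=4)
print(start, dist)
```

The double loop compares every window with every non-overlapping window, so the cost grows quadratically with the series length.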
Distance-based approaches
- Discord detection: pairwise comparison between subsequences
- Concept of distance-based outlier detection: comparison of a subsequence to a reference ("normal") subsequence $\hat{x}$
- $d(x_{T_i}, \hat{x}) > \tau$
- How to obtain the reference $\hat{x}$?
- Reference from the same time series → clustering
- Reference from an external time series → feature-based classification/clustering, or dictionary of patterns
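A minimal Python sketch of the comparison against a reference subsequence. The reference here is the position-wise median over non-overlapping segments of the same series. That is an assumed, simplified stand-in for the clustering step named above, and `reference_outliers`, `segment`, `series` and the threshold value are hypothetical.

```python
import math
from statistics import median

def segment(x, delta):
    """Split x into consecutive non-overlapping subsequences of length delta."""
    return [x[i:i + delta] for i in range(0, len(x) - delta + 1, delta)]

def reference_outliers(x, delta, tau):
    """Return start indices of segments farther than tau from the reference."""
    segs = segment(x, delta)
    # Assumed reference: position-wise median across all segments
    ref = [median(col) for col in zip(*segs)]
    dists = [math.dist(s, ref) for s in segs]
    flagged = [i * delta for i, d in enumerate(dists) if d > tau]
    return flagged, ref

# Toy data (assumed): six identical cycles, with the fourth cycle distorted
series = [0, 1, 0, -1] * 6
series[12:16] = [2, 3, 2, 1]
flagged, ref = reference_outliers(series, delta=4, tau=1.0)
print(flagged, ref)
```

Unlike discord detection, each subsequence is compared once against the reference instead of against every other subsequence.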
Model-based approaches
- Related to distance-based approach with reference from same time series (history)
- Concept: train predictive model and measure distance between prediction and reference in interval
- Train model on $\{1, \dots, t\}$
- Predict $\delta$ steps into the future ($x_{t+1}, \dots, x_{t+\delta}$)
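The train-then-predict steps can be sketched in Python as follows. The predictive model is an assumption chosen for brevity: it repeats the per-phase mean of the history, with a known period. The names `seasonal_mean_forecast`, `forecast_distance`, `series` and `tau` are hypothetical.

```python
import math

def seasonal_mean_forecast(history, period, steps):
    """Forecast by repeating the per-phase mean of the history."""
    phase_means = []
    for p in range(period):
        vals = history[p::period]
        phase_means.append(sum(vals) / len(vals))
    t = len(history)
    return [phase_means[(t + h) % period] for h in range(steps)]

def forecast_distance(x, t, period, delta):
    """Distance between the delta-step forecast and the observed interval."""
    pred = seasonal_mean_forecast(x[:t], period, delta)  # train on the first t values
    return math.dist(pred, x[t:t + delta])               # compare with the next delta values

# Toy data (assumed): six identical cycles, with the fourth cycle distorted
series = [0, 1, 0, -1] * 6
series[12:16] = [2, 3, 2, 1]
tau = 1.0  # assumed threshold
d_anomalous = forecast_distance(series, t=12, period=4, delta=4)
d_normal = forecast_distance(series, t=8, period=4, delta=4)
print(d_anomalous > tau, d_normal > tau)
```

An interval is reported as a collective outlier when its forecast distance exceeds the threshold.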
R code
- The code provides approaches for detecting collective outliers, specifically for the ecg0606 dataset
- Various options are listed for brute-force search and HOT-SAX algorithm
- Other options for distance-based and model-based approaches are provided
Literature
- Blázquez-García et al. (2021), review of outliers/anomalies in time series data
- Gupta et al. (2014), outlier detection for temporal data: a survey
DAT320: Point outliers in time series data
- Course topic: Point outliers in time series data
- Instructor: Kristian Hovde Liland
Approaches for point outliers in time series
- Temporal windowing
- Model-based approaches
- Distribution-based approaches
- Multivariate time series
Outliers in time series
- Approaches for point outliers in time series: ignore temporal component, temporal windowing, customized outlier detectors, univariate vs multivariate time series, global vs local (contextual) outlier
- Based on [Blázquez-García et al., 2021]
Time series as random samples
- Baseline concept: ignore temporal dependencies
- Works well for global outliers
- Easy to implement
- Is not capable of detecting contextual outliers
- Seasonality & trending may distort results
- Same methodology as for random samples (uni-/multivariate data)
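A small Python sketch of this baseline, assuming a plain z-score rule as the non-temporal detector. The rule, the cutoff `k` and the toy series are illustrative assumptions. The global outlier is found, while the contextual outlier, a normal value at the wrong point in the cycle, is missed.

```python
from statistics import mean, stdev

def zscore_outliers(x, k=3.0):
    """Indices whose value lies more than k sample standard deviations from the mean."""
    m, s = mean(x), stdev(x)
    return [i for i, v in enumerate(x) if abs(v - m) > k * s]

# Seasonal toy series (assumed) with one global and one contextual outlier
series = [0, 5, 10, 5] * 6
series[7] = 40   # global outlier: far outside the overall value range
series[12] = 10  # contextual outlier: a common value, but at the wrong phase
flagged = zscore_outliers(series)
print(flagged)
```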
Temporal windowing
- Concept: divide time axis into windows
- $[t - \delta/2, t + \delta/2] \subset [1, T]$, $\delta > 0$
- Apply non-temporal method to each window
- Works well for seasonal patterns
- Same methodology as for random samples
- Hard to determine the window size δ
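A minimal Python sketch of windowing, assuming a median/MAD rule as the non-temporal method applied inside each window. The rule, the cutoff `k` and the toy trend series are illustrative assumptions. The spike stays inside the global value range, so it is only visible relative to its window.

```python
from statistics import median

def windowed_outliers(x, delta, k=3.0):
    """Flag x[t] if it deviates from its window median by more than k * MAD."""
    half = delta // 2
    flagged = []
    for t in range(len(x)):
        window = x[max(0, t - half):t + half + 1]  # truncated at the series ends
        med = median(window)
        mad = median(abs(v - med) for v in window)
        if abs(x[t] - med) > k * mad:
            flagged.append(t)
    return flagged

series = list(range(40))  # linear trend (assumed toy data)
series[20] = 35           # local outlier, still inside the global value range
flagged = windowed_outliers(series, delta=8)
print(flagged)
```

The result depends on `delta`: a window that is too short gives unstable statistics, and one that is too long behaves like the global method.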
Density-based approaches
- Concept: outliers have only few neighbors
- Applied in temporal window
- Alternatives:
- KNN distance
- DBSCAN
- LOF
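The KNN-distance alternative can be sketched in Python as follows, applied inside a temporal window. The choices of `k`, the threshold `tau` and the toy series are assumptions, and DBSCAN and LOF are not shown.

```python
def knn_distance_outliers(x, delta, k=2, tau=3.0):
    """Flag x[t] if its k-th nearest value within the window is farther than tau."""
    half = delta // 2
    flagged = []
    for t in range(len(x)):
        lo, hi = max(0, t - half), min(len(x), t + half + 1)
        # Distances from x[t] to every other value in its temporal window
        dists = sorted(abs(x[t] - x[j]) for j in range(lo, hi) if j != t)
        if len(dists) >= k and dists[k - 1] > tau:
            flagged.append(t)
    return flagged

series = list(range(40))  # linear trend (assumed toy data)
series[20] = 35           # isolated value with no close neighbors in its window
flagged = knn_distance_outliers(series, delta=8)
print(flagged)
```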
Model-based approaches
- Concept: time series models are not able to describe outliers well
- Estimation-based: the model is trained on all values; outliers produce large residuals
- Prediction-based: the model is trained only on the history; outliers produce inaccurate predictions
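A Python sketch contrasting the two variants under the rule $|x_i - \tilde{x}_i| > \tau$. Both models are deliberately simple assumptions: the estimation-based model is a constant level fitted to all values, and the prediction-based model is the median of the `w` preceding values. Function names, `w` and `tau` are hypothetical.

```python
from statistics import median

def estimation_outliers(x, tau):
    """Estimation-based: the model (a constant level) is fitted to all values."""
    level = median(x)
    return [i for i, v in enumerate(x) if abs(v - level) > tau]

def prediction_outliers(x, w, tau):
    """Prediction-based: x[i] is predicted from the w preceding values only."""
    flagged = []
    for i in range(w, len(x)):
        predicted = median(x[i - w:i])
        if abs(x[i] - predicted) > tau:
            flagged.append(i)
    return flagged

# Toy data (assumed): small regular fluctuation with one spike
series = [10, 11, 10, 9] * 6
series[13] = 25
est = estimation_outliers(series, tau=5.0)
pred = prediction_outliers(series, w=4, tau=5.0)
print(est, pred)
```

The prediction-based variant never looks at future values, so it can run as the series arrives, but it cannot score the first `w` points.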
R code
- Provides R code for various methods of outlier detection and analysis.
- Example of using different parameters with the code.
Other topics
- Distance-based approaches
- Model-based approaches
- R code
- Further reading
- The slides contain information about different methods for finding outliers
Other
- The slides cover different types of outliers, including point outliers, collective outliers, and contextual outliers.
- R-code examples illustrate use of functions in R for different statistical tests.
- Further details about all topics are available through the given resources.
Description
This quiz covers the key concepts and methodologies involved in discord detection within time series analysis. It explores definitions, algorithms, limitations, and approaches utilized in detecting anomalies in time series data. Test your understanding of the fundamental principles of this advanced topic.