Questions and Answers
What is the primary purpose of time series analysis?
- To calculate descriptive statistics such as mean and standard deviation.
- To create data visualizations of static data.
- To perform data cleaning and preprocessing.
- To identify trends, patterns, and seasonality in data. (correct)
Which component of a time series represents long-term movements or patterns occurring at regular intervals?
- Trend (correct)
- Irregular (Noise)
- Cyclic
- Seasonality
Which of the following time series components represents periodic fluctuations occurring at regular intervals, typically within a year?
- Irregular (Noise)
- Seasonality (correct)
- Trend
- Cyclic
Which time series component is characterized by long-term fluctuations that do not have a fixed period?
Which component in a time series accounts for unpredictable or random fluctuations that cannot be attributed to trend, seasonality, or cyclic effects?
What distinguishes a multiplicative time series model from an additive time series model?
What is the primary purpose of line plots in time series visualization?
What information do seasonal plots primarily convey in time series analysis?
What is the main purpose of histograms in the context of time series visualization?
Which plots are used to show the correlation between a time series and its lagged values?
When preprocessing time series data, which is the most common strategy for handling missing values?
In time series preprocessing, what is the typical approach for dealing with outliers?
What does it mean for time series data to be stationary?
What characteristic defines a stochastic time series?
What is a key characteristic of a deterministic time series?
What is differencing used for in time series analysis?
What is the purpose of the Augmented Dickey-Fuller (ADF) test?
What do Autocorrelation Function (ACF) plots primarily help to identify in time series analysis?
In the context of ACF plots, what does a slow decrease in the ACF as the lags increase indicate?
What does a 'scalloped' shape in the ACF plot usually indicate?
What type of model is suggested when the Partial Autocorrelation Function (PACF) plot drops sharply after lag p?
What characteristic of the Autocorrelation Function (ACF) suggests an AR model?
In the context of time series models, what does the term 'white noise' refer to?
What is the purpose of the Ljung-Box test?
Flashcards
Time Series Analysis
A statistical technique used to analyze data points collected or recorded at successive, evenly-spaced time intervals.
Trend (Tt)
Long-term movement or patterns in the data. Can be linear or non-linear.
Seasonality (St)
Periodic fluctuations in data that occur at regular intervals, such as daily, monthly, or yearly.
Cyclic (Ct)
Long-term fluctuations without a fixed period, such as economic or business cycles.
Random/Irregular (It)
Unpredictable fluctuations that cannot be attributed to trend, seasonality, or cyclic behavior.
Additive Time Series
A model where the components are added together: Yt = Tt + St + Ct + It.
Multiplicative Time Series
A model where the components are multiplied together: Yt = Tt * St * Ct * It.
Line Plots (Time Series)
Show data points over time, allowing easy observation of trends and fluctuations.
Seasonal Plots
Break the data down by seasonal period to compare patterns across seasons.
Histograms (Time Series)
Show the distribution of the series' values.
ACF/PACF (Time Series)
Show the correlation between a time series and its lagged values.
Handling Missing Values
Remove missing observations or impute them (e.g., with the mean or by interpolation).
Dealing with Outliers
Remove outliers if appropriate, or replace them with the mean/median.
Stationarity & Transformations
Data should be stationary; apply transformations (differencing, log, etc.) if it is not.
Stochastic Time Series
Contains a random/probabilistic component, so its behavior cannot be described exactly.
Deterministic Time Series
Has no random components; its past and future behavior can be described exactly.
Deterministic Trend
The trend can be removed to leave a stationary series (trend-stationary).
Stochastic Trend
The trend is difference-stationary, as in a random walk.
Stationarity
Statistical properties such as mean, variance, and autocorrelation do not change over time.
Differencing
Subtracting the previous value from the current value to remove trend.
Making Stationary
Techniques such as differencing, transformation (log, square root, power), and de-trending.
Autocorrelation Function (ACF)
The correlation of a time series with itself at different lags.
AR model
Expresses a time series as a linear combination of its past values plus white noise.
MA model
Expresses a time series as a function of past error terms (white noise).
ARIMA model
Combines AR and MA components with differencing to handle non-stationarity.
Study Notes
- Time series analysis is a statistical technique used to analyze data points collected or recorded at successive, evenly-spaced time intervals
- It helps to identify trends, patterns, and seasonality in data, crucial for forecasting future values and making decisions
Components of Time Series
- Trend: Gradual long-term movement in the data, which can be linear or non-linear
- Seasonality: Periodic fluctuations occurring at regular intervals (daily/monthly/yearly) within a year
- Cyclic: Long-term fluctuations that don't have a fixed period like seasonality, examples include economic or business cycles
- Random or Irregular (Noise): Unpredictable/random fluctuations in the data that can't be attributed to trend, seasonality, or cyclic behavior, examples include natural disasters
- Time series can be represented as an additive model e.g. Yt = Tt + St + Ct + It, or a multiplicative model e.g. Yt = Tt * St * Ct * It
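A quick way to see the difference between the two models: taking logs of a multiplicative series turns it into an additive one. A minimal NumPy sketch with a synthetic series (the trend and seasonal shapes are made up for illustration, and the Ct and It components are omitted):

```python
import numpy as np

t = np.arange(24)
trend = 100 + 2.0 * t                             # Tt: linear trend (illustrative)
seasonal = 1 + 0.1 * np.sin(2 * np.pi * t / 12)   # St: period-12 seasonal factor

multiplicative = trend * seasonal                 # Yt = Tt * St

# log(Yt) = log(Tt) + log(St): the multiplicative model becomes additive
log_y = np.log(multiplicative)
```

This is why a log transform is often applied before decomposing a series whose seasonal swings grow with the trend.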
Time Series Visualization
- Line plots: Show data points over time, allowing for easy observation of trends and fluctuations
- Seasonal plots: Break the data down by seasonal period to compare patterns across seasons
- Histograms: Show the distribution of the series' values
- ACF/PACF: Show the correlation between time series data points and their lagged values
Preprocessing
- Handling Missing Values: Remove them, or impute them (e.g., with the mean or by interpolation)
- Dealing with outliers: Remove them if appropriate, or replace them with the mean/median in most cases
- Stationarity & Transformations: Data should be stationary
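The missing-value and outlier steps can be sketched in pandas. The series values and the 5×MAD outlier threshold below are illustrative assumptions, not a prescribed method:

```python
import numpy as np
import pandas as pd

# Hypothetical series with one missing value and one obvious outlier
s = pd.Series([10.0, 12.0, np.nan, 11.0, 95.0, 13.0])

# Missing values: replace with the mean, or interpolate between neighbors
filled_mean = s.fillna(s.mean())
filled_interp = s.interpolate()          # (12 + 11) / 2 at the missing index

# Outliers: flag points far from the median, then replace them with the median
median = s.median()
mad = (s - median).abs().median()
is_outlier = (s - median).abs() > 5 * mad
cleaned = s.mask(is_outlier, median)
```

Interpolation usually suits time series better than a global mean because it respects the local level of the series.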
Stochastic Time Series
- Contains a random/probabilistic component, preventing its behavior from being explicitly described
Deterministic Time Series
- Has no random/probabilistic components
- It's always possible to predict its future behavior and describe how it behaved in the past
Deterministic Trend
- The series is trend-stationary: removing the deterministic trend leaves a stationary series
Stochastic Trend
- The trend is difference-stationary
- Exemplified by the random walk process
Random Walk
- A particular time series process in which each value equals the previous value plus a random step
- Described by the equation X(t) = X(t-1) + W(t), where W(t) is a random component
Stationarity
- If a time series is stationary, its statistical properties like mean, variance, and autocorrelation do not change over time
- To check for stationarity:
- Plot the data for visualization
- Perform the Augmented Dickey-Fuller (ADF) test to check for stationarity
- Analyze rolling mean & variance to make sure they remain constant
Making a Time Series Stationary
- Differencing: Subtracting the previous value from the current value
- Transformation: Using mathematical functions like Logarithm, Square Root, or Power Transformation
- De-trending: Removing the trend component
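Differencing can be demonstrated on a simulated random walk: since the walk is a cumulative sum of white-noise steps, the first difference exactly recovers the stationary steps (a sketch; the seed and length are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
steps = rng.normal(0, 1, 500)   # stationary white noise W(t)
walk = np.cumsum(steps)         # random walk X(t) = X(t-1) + W(t): non-stationary

diffed = np.diff(walk)          # first difference: X(t) - X(t-1)
```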
ACF & PACF Plots
- Autocorrelation Function (ACF): A graphical representation of the correlation of a time series with itself at different lags
- Correlation Coefficient: A measure of how closely two variables are related, ranging from -1 to 1:
- 1: Positive (perfect)
- 0: No relationship
- -1: Negative (perfect)
- ACF plot: Used to identify the order of an MA model
- Order of MA model (q): The number of lagged error terms included in the model
- The ACF plot shows significant spikes up to lag q, then cuts off
- Partial Autocorrelation Function (PACF): A graphical representation of the correlation of a time series with itself at different lags, after removing the effects of the previous lags
- PACF Plot: Used to identify the order of an AR model
- Trend & Seasonality in ACF Plots: The slow decrease in the ACF as the lags increase is due to the trend, while the 'scalloped' shape is due to the seasonality
Calculate ACF
- Used to find the value of the correlation coefficient (rk) for lag k
- rk = Σ_{t=k+1..N} (Yt − Ȳ)(Y(t−k) − Ȳ) / Σ_{t=1..N} (Yt − Ȳ)²
- Yt = value of the time series at time t
- Ȳ = mean of the time series
- N = total number of observations
- rk = correlation coefficient at lag k
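The formula translates directly into NumPy. This is a small sketch (the `acf` helper and the toy series are illustrative, not a library function); for the series 1..5 it gives r1 = 0.4:

```python
import numpy as np

def acf(y, k):
    """Correlation coefficient r_k for lag k >= 1, per the formula above."""
    y = np.asarray(y, dtype=float)
    d = y - y.mean()                          # deviations (Yt - Ybar)
    return (d[k:] * d[:-k]).sum() / (d ** 2).sum()

r1 = acf([1, 2, 3, 4, 5], 1)   # 0.4 for this toy series
```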
Calculate PACF
- It measures the direct correlation between a time series and its lagged values, removing the influence of intermediate lags
- Yule-Walker (Durbin-Levinson) recursion: φ(k,k) = (rk − Σ_{j=1..k−1} φ(k−1,j)·r(k−j)) / (1 − Σ_{j=1..k−1} φ(k−1,j)·rj)
- φ(k,k) = PACF value at lag k
- rk = autocorrelation at lag k
- φ(k−1,j) = previous PACF values
- At lag 1, the PACF is r1
- At lag 2: φ(2,2) = (r2 − φ(1,1)·r1) / (1 − φ(1,1)·r1)
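The recursion above can be implemented directly; this sketch (a hand-rolled helper, not a library call) computes the PACF for lags 1..max_lag and reproduces the lag-1 and lag-2 special cases:

```python
import numpy as np

def pacf(y, max_lag):
    """PACF via the Durbin-Levinson recursion shown above."""
    y = np.asarray(y, dtype=float)
    d = y - y.mean()
    denom = (d ** 2).sum()
    # Autocorrelations r_0..r_max_lag from the ACF formula
    r = np.array([1.0] + [(d[k:] * d[:-k]).sum() / denom
                          for k in range(1, max_lag + 1)])

    phi = np.zeros((max_lag + 1, max_lag + 1))
    phi[1, 1] = r[1]                              # lag 1: PACF is r1
    for k in range(2, max_lag + 1):
        num = r[k] - sum(phi[k - 1, j] * r[k - j] for j in range(1, k))
        den = 1.0 - sum(phi[k - 1, j] * r[j] for j in range(1, k))
        phi[k, k] = num / den
        for j in range(1, k):                     # update intermediate coefficients
            phi[k, j] = phi[k - 1, j] - phi[k, k] * phi[k - 1, k - j]
    return phi[np.arange(1, max_lag + 1), np.arange(1, max_lag + 1)]
```

For the toy series 1..5 this gives φ(1,1) = 0.4 and φ(2,2) = (−0.1 − 0.16)/0.84.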
White Noise
- A time series that is completely random and lacks any structure
- It shows no trends, cycles, or autocorrelation, making it unpredictable
- If all ACF bars are close to zero (within the bounds), the series is white noise (no exploitable pattern)
- If some ACF bars are significantly different from zero, there is an exploitable structure
- White noise is stationary, mean and variance must remain constant over time (check with ADF test)
- Ljung-Box test: H0 → the series is white noise (no autocorrelation); H1 → the series is not white noise
- p > 0.05 → fail to reject H0 (consistent with white noise)
- p < 0.05 → reject H0 (the series is not white noise)
- Also check the distribution: white-noise values fluctuate randomly around zero
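The Ljung-Box Q statistic itself is Q = N(N+2)·Σ_{k=1..h} r_k²/(N−k); large Q means the autocorrelations are jointly too big for white noise. A sketch (the seed and h = 10 are arbitrary choices; a real test would compare Q to a chi-squared critical value or use a library routine):

```python
import numpy as np

def ljung_box_q(y, h):
    """Ljung-Box statistic Q = N(N+2) * sum_{k=1..h} r_k^2 / (N - k)."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    d = y - y.mean()
    denom = (d ** 2).sum()
    q = sum(((d[k:] * d[:-k]).sum() / denom) ** 2 / (n - k)
            for k in range(1, h + 1))
    return n * (n + 2) * q

rng = np.random.default_rng(42)
q_noise = ljung_box_q(rng.normal(size=200), 10)   # should be small: white noise
q_trend = ljung_box_q(np.arange(200.0), 10)       # should be huge: strong structure
```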
AR Model (Auto-Regressive Model)
- It expresses a time series as a linear combination of its past values
- The current value depends on previous values plus some random noise
- Y(t) = φ1·Y(t−1) + φ2·Y(t−2) + ... + φp·Y(t−p) + ε(t)
- Y(t) = current value of time series
- φ1, ..., φp = autoregressive coefficients
- Y(t-1) = previous value
- p = number of past values used
- ε(t) = white noise
- To identify AR models:
- Check PACF: If it drops sharply after lag p, it suggests an AR(p) model
- Check ACF: A slow decay suggests an AR model
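These definitions can be checked by simulating an AR(1) process and recovering its coefficient. The least-squares regression of Y(t) on Y(t−1) below is a sketch (the true φ = 0.7 and the seed are assumptions), not a full AR-fitting routine:

```python
import numpy as np

# Simulate AR(1): Y(t) = 0.7 * Y(t-1) + eps(t)
rng = np.random.default_rng(1)
n, phi = 2000, 0.7
y = np.zeros(n)
for t in range(1, n):
    y[t] = phi * y[t - 1] + rng.normal()

# Least-squares fit: regress Y(t) on Y(t-1) to estimate phi
est = np.linalg.lstsq(y[:-1].reshape(-1, 1), y[1:], rcond=None)[0][0]
```

With 2000 observations the estimate should land close to the true 0.7.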
MA Model (Moving Average)
- Expresses a time series as a function of past error terms (white noise)
- MA models don't depend directly on past values; instead, they model the dependencies through past forecast errors
- Y(t) = ε(t) + θ1·ε(t−1) + θ2·ε(t−2) + ... + θq·ε(t−q)
- Y(t) = current value of time series
- θ1, θ2, ..., θq = MA coefficients
- q = order of MA
- ε(t) = random error (white noise)
- To identify MA models:
- Check ACF: If it cuts off sharply after lag q, it suggests an MA(q) model
- Check PACF: A slow decay suggests an MA model
ARIMA Model (Autoregressive Integrated Moving Average)
- A statistical method used for time series forecasting that captures both AR and MA components (p, q)
- It also addresses non-stationarity through differencing (d)
- p -> AR (number of time lags)
- q -> MA (order)
- d -> degree of differencing (number of times the data has been differenced)
- AR → models the current value from past values in the series
- I → differencing to remove trend and seasonality
- MA → models the relationship between an observation and past residual errors
- To find p, d, q values: Analyze ACF and PACF functions
- If the parameters are identified, the model can be fitted to forecast future value
- Can be optimized using Akaike Information Criterion (AIC) / Bayesian Information Criterion (BIC)
- (1 − φ1B − ... − φpB^p)(1 − B)^d Yt = c + (1 + θ1B + ... + θqB^q) εt, where B is the backshift operator
SARIMA Model (Seasonal Autoregressive Integrated Moving Average)
- An extension of the ARIMA model, designed to handle seasonal data
- Combines the concepts of autoregressive (AR), integrated (I), and moving average (MA) models with seasonal components
- Seasonal differencing: Process of subtracting the time series data by a lag that equals the seasonality
- Helps to remove the seasonality and make data stationary, Seasonal differencing -> D
- Notation
- SARIMA(p, d, q) (P, D, Q, S)
- AR(p) -> order of AR
- MA(q) -> order of MA
- I(d) -> Integrated component of d
- Seasonal AR(P) -> order of Seasonal AR component
- MA(Q) -> seasonal MA order.
- Seasonal I(D) -> Seasonal Integrated component of D
- S -> seasonal period
- Mathematical formula:
- (1 − φ1B)(1 − Φ1B^S)(1 − B)^d (1 − B^S)^D Yt = (1 + θ1B)(1 + Θ1B^S) εt, where
- φ → non-seasonal autoregressive coefficient
- Φ → seasonal autoregressive coefficient
- θ → non-seasonal moving average coefficient
- Θ → seasonal moving average coefficient
ETS Model (Error, Trend and Seasonality model) or Exponential Smoothing
- Used to decompose a time series into its components: Error, Trend, Seasonality
- Helps to understand the patterns & behaviors
- Models combine additive or multiplicative error with additive or multiplicative trend and seasonal components
- Additive: Yt = Tt + St + et
- Trend, seasonality and error are added together
- Seasonal variance and error remains constant
- Multiplicative: Yt = Tt * St * et
- Trend, seasonality, error multiplied together
- Variance (seasonal) and error are proportional to the trend
- When forecasting, the model uses a weighted average of past observations, giving more weight to recent values and less weight to older ones
- Mathematical formula:
- Yt = l(t−1) + b(t−1) + s(t−m) + εt
- l(t−1) → level at time (t−1)
- b(t−1) → trend at time (t−1)
- s(t−m) → seasonal component at time (t−m)
- εt → error term
ETS Models
- ETS(A,N,N): Simple exponential smoothing (no trend, no seasonality)
- ETS(A,A,N): Holt's linear trend model (trend but no seasonality)
- ETS(A,A,A): Holt-Winters additive model (seasonal variations remain constant)
- ETS(M,A,M): Holt-Winters multiplicative model (seasonal variations grow over time)
Process for ETS
- Visualize the data
- Check whether trend/seasonality exist
- Try candidate ETS models and compare them
- Use AIC to find the best model
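The simplest candidate, ETS(A,N,N), is just the level recursion l_t = α·y_t + (1−α)·l_{t−1}, and fits in a few lines. A minimal sketch (the series and α values are illustrative):

```python
def ses_forecast(y, alpha):
    """ETS(A,N,N) / simple exponential smoothing.
    Level recursion: l_t = alpha * y_t + (1 - alpha) * l_{t-1}.
    The one-step-ahead forecast is the final level."""
    level = y[0]
    for value in y[1:]:
        level = alpha * value + (1 - alpha) * level
    return level
```

With α = 1 the forecast is the last observation; with α = 0 it never moves off the first one, which is exactly the "more weight to recent values" trade-off the smoothing parameter controls.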
Multivariate Time Series Analysis
- Statistical technique used to analyze multiple time series datasets to identify patterns/relationships between them
Common Models
- VAR (Vector Autoregression)
- VARIMA (Vector Autoregressive Integrated Moving Average)
- State-space models
- Exponential smoothing
- A multivariate time series has more than one time series variable
- Each variable depends not only on its own past values, but also on the past values of the other variables
VAR method
- To compute Y1(t), we use past values of both Y1 and Y2
- To compute Y2(t), we also use past values of both Y1 and Y2
- Mathematical relations:
- Y1(t) = a1 + w11·Y1(t−1) + w12·Y2(t−1) + e1(t)
- Y2(t) = a2 + w21·Y1(t−1) + w22·Y2(t−1) + e2(t)
- where a1, a2 are constants, w11, w12, w21, w22 are coefficients, and e1, e2 are error terms
- Compare the univariate AR(1) process, which depends only on its own past value: Y(t) = a + w·Y(t−1) + e(t)
- The VAR model generalizes this: each equation uses the lagged values of all variables, stacked as vectors
VAR process
- Yt = a + W1·Y(t−1) + ... + Wp·Y(t−p) + et, where Yt = [y1t, y2t, ..., ykt] is the vector of all k variables at time t
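The VAR(1) equation can be verified numerically: generate data from a known coefficient matrix W and recover it by least squares. Noise is omitted here so the recovery is exact (W and the starting vector are made-up values, and a real fit would include intercepts and error terms):

```python
import numpy as np

# Hypothetical 2-variable VAR(1) coefficient matrix
W = np.array([[0.5, 0.1],
              [0.2, 0.4]])
Y = [np.array([1.0, 0.0])]
for _ in range(10):
    Y.append(W @ Y[-1])          # Y(t) = W @ Y(t-1), noiseless for illustration
Y = np.array(Y)

# Least squares: solve X @ B = Z with X = Y(t-1) rows, Z = Y(t) rows
X, Z = Y[:-1], Y[1:]
W_hat = np.linalg.lstsq(X, Z, rcond=None)[0].T
```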
Johansen's test
- Johansen's test is used to check for cointegration among multiple non-stationary time series
- For a univariate time series, the Augmented Dickey-Fuller (ADF) test is used to check stationarity
Granger's Causality test
- Used to identify relationships between variables before building the model
- If there is no relationship between variables, they are excluded and modeled separately
- If they share a relationship, they are modeled together
- p > 0.05 → fail to reject the null (no Granger causality)
- p < 0.05 → reject the null (Granger causality present)
Random Walk
- Random steps occur one after another without following any pattern
- In time series terms, a stochastic process in which the next value equals the current value plus a white-noise term
- Yt = Y(t−1) + εt
Random walk with Drift
- D → drift constant; it adds an upward/downward trend to the predictions if one exists
- Yt = Y(t−1) + εt + D
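A quick simulation of a random walk with drift; averaging the steps recovers the drift constant D up to noise (the seed, drift value, and length are arbitrary assumptions):

```python
import numpy as np

rng = np.random.default_rng(3)
n, drift = 1000, 0.5

steps = drift + rng.normal(0, 1, n)   # each step is D plus white noise
walk = np.cumsum(steps)               # Yt = Y(t-1) + Et + D

# The mean step estimates the drift D
mean_step = np.diff(walk, prepend=0.0).mean()
```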
Correlated Random Walk
- Similar to the basic random walk equation, with a correlation coefficient added
- Yt = ρ·Y(t−1) + εt
- ρ → correlation coefficient term
Characteristics of a Random Walk
- Non-stationary: μ and σ² are not constant over time
- Unpredictable: future values cannot be forecast reliably
- Variance grows without bound over time
- A series that follows a random walk has a unit root, which means it is non-stationary; the ADF test detects this
Detecting a random walk:
- Plot the data: it will appear to have a trend
- Perform a stationarity test (ADF test or KPSS test)
- Differencing makes the data stationary
- Autocorrelation (ACF): high autocorrelation persists across many lags
Simple Smoothing Methods
- Moving Average
- Simple moving average (SMA)
- To forecast, we assume the next value will equal the average of the past n values
- Useful in modeling random series: it smooths the most recent actual values, reducing randomness
Example with a 2-period SMA (forecast = average of the two previous months):
- Jan: actual 120
- Feb: actual 124
- Mar: actual 122, forecast (120 + 124)/2 = 122, error 0
- Apr: actual 123, forecast (124 + 122)/2 = 123, error 0
- May: actual 125, forecast (122 + 123)/2 = 122.5, error 2.5
- Jun: actual 128, forecast (123 + 125)/2 = 124, error 4
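The worked example can be reproduced with pandas: a rolling 2-period mean shifted forward one step gives each month's SMA(2) forecast from the two months before it:

```python
import pandas as pd

actual = pd.Series([120, 124, 122, 123, 125, 128],
                   index=["Jan", "Feb", "Mar", "Apr", "May", "Jun"], dtype=float)

# SMA(2) forecast for each month = mean of the two previous months
forecast = actual.rolling(window=2).mean().shift(1)
error = (actual - forecast).abs()
```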
Weighted Moving Average (WMA)
- Unlike the SMA, which weights all observations equally, it assigns a different weight to each observation, with the highest weights on the most recent values
- The weights should sum to 1; the forecast is the weighted sum of the most recent observations
- Example for a WMA(4): weights 0.4, 0.3, 0.2, 0.1 applied from most recent to oldest
Exponential Moving Averages (EMA)
- Gives exponentially more weight to the most recent data points, so it reacts quickly to changes
- EMA(t) = (P(t) − EMA(t−1)) × k + EMA(t−1), where k = 2/(N+1)
- P(t) = current value; N = length (period) of the EMA
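The EMA recursion matches pandas' `ewm(span=N, adjust=False)`, which uses the same smoothing factor 2/(N+1); a sketch with made-up prices:

```python
import pandas as pd

prices = pd.Series([10.0, 11.0, 12.0, 11.5, 13.0])   # illustrative values
n = 3
k = 2 / (n + 1)                                      # smoothing factor k = 2/(N+1)

# Manual recursion: EMA_t = (P_t - EMA_{t-1}) * k + EMA_{t-1}, seeded with P_0
ema = [prices.iloc[0]]
for p in prices.iloc[1:]:
    ema.append((p - ema[-1]) * k + ema[-1])

# pandas implements the same recursion via ewm(span=n, adjust=False)
pandas_ema = prices.ewm(span=n, adjust=False).mean()
```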