Concise notes for forecasting exams, covering key concepts, methods, and models based on "Forecasting: Principles and Practice (3rd ed)".
Forecast Error: Difference between observed and predicted values.
Residuals: Forecast errors for the training data.
Goals: Minimize forecast error, avoid bias, and capture patterns.
Bias: Systematic difference between forecasts and actual values, indicated by a non-zero mean error.
Forecast Distribution: Use simulations of possible futures to develop forecast distributions.
Overfitting: Fitting the model too closely to the training data, resulting in poor performance on new data.
Underfitting: Model is too simple to capture the patterns in the data.
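A minimal sketch of the bias check in Python (the arrays below are hypothetical values, purely for illustration):

```python
import numpy as np

# Hypothetical observed and predicted values.
actual = np.array([112.0, 118.0, 132.0, 129.0, 121.0])
predicted = np.array([110.0, 120.0, 128.0, 131.0, 119.0])

errors = actual - predicted  # forecast errors (residuals on training data)
print("errors:", errors)
print("mean error (bias check):", errors.mean())  # far from zero => biased forecasts
```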
Additive Decomposition: y_t = S_t + T_t + R_t, where S_t is the seasonal component, T_t the trend-cycle, and R_t the remainder.
Multiplicative Decomposition: y_t = S_t \times T_t \times R_t. Used when seasonal variation grows with the level of the series.
Trend: Long-term direction of the series.
Seasonal: Regular, predictable variations that recur over a fixed period.
Cyclic: Fluctuations around the trend, not of a fixed period and usually longer than seasonal variation.
Random: Irregular, unpredictable variations.
Classical Decomposition: Method for decomposing a time series into its components (trend, seasonal, and irregular).
STL Decomposition: Versatile and robust method for decomposing time series data; the seasonal component may change over time, and multiplicative decompositions can be obtained by first taking logs of the series.
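A short sketch of both decompositions in Python, assuming statsmodels is available; the monthly series below is synthetic and purely illustrative:

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import STL, seasonal_decompose

# Synthetic monthly series: trend + seasonality + noise (hypothetical).
rng = np.random.default_rng(0)
idx = pd.date_range("2015-01", periods=96, freq="MS")
y = pd.Series(50 + 0.3 * np.arange(96)
              + 10 * np.sin(2 * np.pi * np.arange(96) / 12)
              + rng.normal(0, 2, 96), index=idx)

classical = seasonal_decompose(y, model="additive")  # classical additive decomposition
stl = STL(y, period=12).fit()                        # robust STL decomposition
print(stl.trend.tail(), stl.seasonal.tail(), stl.resid.tail(), sep="\n")
```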
Seasonal Plot: Data are grouped by season (e.g., months or quarters) and plotted to highlight seasonal patterns, showing how the series varies within each season.
Time Plot: Time series data are plotted against time, revealing trends, seasonality, and cyclical patterns.
Scatter Plot: Pairs of values are plotted as individual points to visualize the relationship between two variables, such as the series and its lagged values, helping to identify autocorrelation.
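A sketch of a time plot and a lagged scatter plot with pandas/matplotlib (synthetic monthly data, as in the decomposition sketch above):

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Synthetic monthly series (hypothetical).
rng = np.random.default_rng(0)
idx = pd.date_range("2015-01", periods=96, freq="MS")
y = pd.Series(50 + 0.3 * np.arange(96)
              + 10 * np.sin(2 * np.pi * np.arange(96) / 12)
              + rng.normal(0, 2, 96), index=idx)

fig, axes = plt.subplots(1, 2, figsize=(10, 4))
y.plot(ax=axes[0], title="Time plot")        # reveals trend and seasonality
pd.plotting.lag_plot(y, lag=12, ax=axes[1])  # y_t vs y_{t-12}: seasonal autocorrelation
axes[1].set_title("Lag-12 scatter plot")
plt.tight_layout()
plt.show()
```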
Autocorrelation Function (ACF): Measures the correlation between a time series and its lagged values, revealing the strength and significance of autocorrelation at different lags.
Partial Autocorrelation Function (PACF): Measures the correlation between a time series and its lagged values after removing the effects of intermediate lags, isolating the direct relationship between the series and each lag.
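A sketch using statsmodels' plotting helpers on a synthetic AR(1) series, whose ACF decays geometrically while its PACF cuts off after lag 1:

```python
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

# Synthetic AR(1) series (hypothetical).
rng = np.random.default_rng(42)
y = np.zeros(200)
for t in range(1, 200):
    y[t] = 0.7 * y[t - 1] + rng.normal()

fig, axes = plt.subplots(1, 2, figsize=(10, 4))
plot_acf(y, lags=20, ax=axes[0])   # correlation of y_t with y_{t-k}
plot_pacf(y, lags=20, ax=axes[1])  # direct effect of lag k only
plt.show()
```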
Average Method: Forecast all future values using the average of historical data. Formula: \hat{y}_{T+h|T} = \bar{y} = (y_1 + y_2 + ... + y_T) / T
Naive Method: Forecast equals the last observed value. Formula: \hat{y}_{T+h|T} = y_T
Seasonal Naive Method: Forecast equals the last observed value from the same season. Formula: \hat{y}_{T+h|T} = y_{T+h-m(k+1)}, where m is the seasonal period and k is the integer part of (h-1)/m
Drift Method: Forecast is the last value plus an average change over time. Formula: \hat{y}_{T+h|T} = y_T + h \frac{y_T - y_1}{T-1}
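All four benchmarks reduce to one-line computations; a sketch with hypothetical quarterly data (m = 4):

```python
import numpy as np

# Hypothetical quarterly history; forecast horizon h.
y = np.array([10.0, 12.0, 14.0, 11.0, 13.0, 15.0, 17.0, 14.0])
T, m, h = len(y), 4, 3

mean_fc = y.mean()                               # average method
naive_fc = y[-1]                                 # naive method
k = (h - 1) // m                                 # seasonal naive: same season, last cycle
snaive_fc = y[T + h - m * (k + 1) - 1]           # -1 converts to 0-based indexing
drift_fc = y[-1] + h * (y[-1] - y[0]) / (T - 1)  # drift method
print(mean_fc, naive_fc, snaive_fc, drift_fc)
```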
Assumptions about residuals: Uncorrelated, mean zero, constant variance, normally distributed.
Plots: Time series plot, histogram, ACF plot. Tests: Ljung-Box test.
Ljung-Box Test: Tests whether a group of autocorrelations of a time series are different from zero. Q^* = T(T+2) \sum_{k=1}^{\ell} \frac{r_k^2}{T-k}, where r_k is the autocorrelation of the residuals at lag k and \ell is the number of lags tested. A large Q^* (small p-value) suggests the residuals are not white noise.
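In Python, statsmodels exposes this as `acorr_ljungbox`; a sketch on hypothetical white-noise residuals, which should yield large p-values:

```python
import numpy as np
from statsmodels.stats.diagnostic import acorr_ljungbox

# Hypothetical residuals: white noise should pass the test.
rng = np.random.default_rng(0)
resid = rng.normal(size=100)

lb = acorr_ljungbox(resid, lags=[10])
print(lb)  # lb_stat is Q*, lb_pvalue its p-value; p < 0.05 => autocorrelation remains
```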
Simple Exponential Smoothing (SES). Suitable for: Data with no trend or seasonality. Formula: \hat{y}_{t+1|t} = \alpha y_t + (1 - \alpha) \hat{y}_{t|t-1}
\alpha: Smoothing constant (0 < \alpha < 1). Higher values give more weight to recent observations.
Initialization: \hat{y}_{1|0} can be set to y_1 or the average of the first few observations.
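A sketch with statsmodels' `SimpleExpSmoothing` (hypothetical data; note the forecast is flat at all horizons):

```python
import numpy as np
from statsmodels.tsa.holtwinters import SimpleExpSmoothing

# Hypothetical series with no clear trend or seasonality.
y = np.array([3.0, 5.0, 4.0, 6.0, 5.0, 7.0, 6.0, 5.0])

fit = SimpleExpSmoothing(y).fit(smoothing_level=0.3, optimized=False)  # fixed alpha
print(fit.forecast(3))  # flat: every horizon equals the final smoothed level
```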
Holt's Linear Trend Method. Suitable for: Data with a trend but no seasonality. Equations:
Forecast: \hat{y}_{t+h|t} = \ell_t + h b_t
Level: \ell_t = \alpha y_t + (1 - \alpha)(\ell_{t-1} + b_{t-1})
Trend: b_t = \beta^* (\ell_t - \ell_{t-1}) + (1 - \beta^*) b_{t-1}
\alpha: Smoothing constant for the level. \beta^*: Smoothing constant for the trend.
Initialization: \ell_0 and b_0 can be estimated using linear regression on the historical data.
Damped Trend Methods: Similar to Holt's method, but the trend is damped over time. Formula: \hat{y}_{t+h|t} = \ell_t + (\phi + \phi^2 + ... + \phi^h) b_t
\phi: Damping parameter (0 < \phi < 1). As h increases, the forecast approaches \ell_T + \frac{\phi}{1-\phi} b_T.
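A sketch contrasting Holt's trend with its damped variant via statsmodels' `ExponentialSmoothing` (the `damped_trend` flag exists in recent statsmodels versions; data is synthetic):

```python
import numpy as np
from statsmodels.tsa.holtwinters import ExponentialSmoothing

# Synthetic trending series (hypothetical).
rng = np.random.default_rng(7)
y = np.arange(1.0, 31.0) + rng.normal(0, 0.5, 30)

holt = ExponentialSmoothing(y, trend="add").fit()                       # Holt's linear trend
damped = ExponentialSmoothing(y, trend="add", damped_trend=True).fit()  # damped trend
print(holt.forecast(10))    # keeps climbing linearly in h
print(damped.forecast(10))  # levels off toward ell_T + (phi/(1-phi)) b_T
```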
Holt-Winters Seasonal Method. Suitable for: Data with both trend and seasonality. Can be additive or multiplicative.
Additive: \hat{y}_{t+h|t} = \ell_t + h b_t + s_{t+h-m(k+1)}
Multiplicative: \hat{y}_{t+h|t} = (\ell_t + h b_t) s_{t+h-m(k+1)}
Parameters: \alpha, \beta^*, \gamma are smoothing constants (0 < \alpha, \beta^*, \gamma < 1).
Initialization: Requires initial estimates for level, trend, and seasonal components.
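A sketch fitting additive Holt-Winters with statsmodels (synthetic monthly data; `seasonal="mul"` would give the multiplicative variant):

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.holtwinters import ExponentialSmoothing

# Synthetic monthly series with trend and additive seasonality (hypothetical).
rng = np.random.default_rng(5)
idx = pd.date_range("2018-01", periods=48, freq="MS")
y = pd.Series(100 + 0.5 * np.arange(48)
              + 8 * np.sin(2 * np.pi * np.arange(48) / 12)
              + rng.normal(0, 1, 48), index=idx)

hw = ExponentialSmoothing(y, trend="add", seasonal="add", seasonal_periods=12).fit()
print(hw.forecast(12))  # use seasonal="mul" when seasonal swings grow with the level
```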
AR(p): Autoregressive model of order p. Uses past values to predict future values.
MA(q): Moving average model of order q. Uses past forecast errors to predict future values.
I(d): Integrated component of order d. Represents the number of differences required to make the time series stationary.
Stationarity: A time series is stationary if its statistical properties (mean, variance, autocorrelation) do not change over time.
Differencing: Used to make a time series stationary. First difference: y'_t = y_t - y_{t-1}. Seasonal difference: y'_t = y_t - y_{t-m}.
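A sketch of both differences with pandas, plus an ADF unit-root check from statsmodels (placeholder data):

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.stattools import adfuller

# Placeholder trending series: a random walk with drift (hypothetical).
rng = np.random.default_rng(3)
y = pd.Series(np.cumsum(rng.normal(0.5, 1.0, 120)))

first_diff = y.diff()        # y'_t = y_t - y_{t-1}, removes the stochastic trend
seasonal_diff = y.diff(12)   # y'_t = y_t - y_{t-12}, for stable seasonal patterns
print(adfuller(first_diff.dropna())[1])  # ADF p-value; small => stationary
```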
ACF and PACF Plots: Used to identify the order of AR and MA components. Information Criteria: AIC, AICc, BIC. Lower values indicate better model fit.
AIC (Akaike Information Criterion): AIC = -2 \log(L) + 2k, where L is the likelihood and k is the number of parameters.
AICc (Corrected AIC): AICc = AIC + \frac{2k(k+1)}{T-k-1}, where T is the number of observations.
BIC (Bayesian Information Criterion): BIC = -2 \log(L) + k \log(T)
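The AICc correction is simple arithmetic; a hypothetical worked example:

```python
def aicc(aic: float, k: int, T: int) -> float:
    """AICc = AIC + 2k(k+1)/(T-k-1); the penalty vanishes as T grows."""
    return aic + 2 * k * (k + 1) / (T - k - 1)

print(aicc(aic=512.3, k=4, T=100))  # hypothetical values -> 512.72
```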
AR(p) Model: y_t = c + \phi_1 y_{t-1} + \phi_2 y_{t-2} + ... + \phi_p y_{t-p} + e_t
MA(q) Model: y_t = c + e_t + \theta_1 e_{t-1} + \theta_2 e_{t-2} + ... + \theta_q e_{t-q}
ARIMA(p,d,q) Model: Combines AR(p), I(d), and MA(q) components. Requires differencing the series d times to achieve stationarity.
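A closing sketch with statsmodels' ARIMA on a synthetic random walk with drift (so d = 1 is appropriate); in practice the order would come from ACF/PACF inspection or an AICc search:

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

# Synthetic random walk with drift (hypothetical).
rng = np.random.default_rng(1)
y = np.cumsum(rng.normal(0.2, 1.0, 200))

res = ARIMA(y, order=(1, 1, 1)).fit()
print(res.aic, res.bic)       # information criteria for model comparison
print(res.forecast(steps=5))  # point forecasts for h = 1..5
```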