Business Forecasting Cookbook
Types of Data:
- Time series data: quarterly sales, daily stock prices, etc.
- Cross-sectional data: store sales for 10 branches across the U.S., etc.
- Pooled data: U.S. egg production in 50 states in 1990 and 1991
Patterns of Data (time series data only):
- Trend: long-term component that represents growth or decline over a period of time
- Seasonal: pattern of change that repeats itself year after year
- Cyclical: wavelike fluctuation around the trend
- Stationary: a series whose basic statistical properties, such as mean and variance, remain constant over time
- Random (cannot be forecasted)
Since different forecasting models fit different data patterns, we need to identify
the data pattern before formulating the model:
- Eyeball observation: not appropriate if the pattern is mixed and the length of seasonality is hard to tell
- Autocorrelation analysis: measures the correlation between a variable and itself lagged one or more periods
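To make autocorrelation analysis concrete, here is a minimal sketch of the lag-k autocorrelation coefficient in plain Python. The function name `autocorrelation` and the `sales` figures are hypothetical and are not the case-study data. A coefficient that stands out at lag 4 in quarterly data would point to a seasonal pattern, while coefficients that start high and decline slowly would point to a trend.

```python
def autocorrelation(series, lag):
    """Lag-k autocorrelation coefficient r_k of a list of numbers."""
    n = len(series)
    if not 0 < lag < n:
        raise ValueError("lag must be between 1 and len(series) - 1")
    mean = sum(series) / n
    # Denominator: total squared deviation from the mean.
    denom = sum((y - mean) ** 2 for y in series)
    # Numerator: co-movement between the series and itself shifted by `lag`.
    numer = sum(
        (series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n)
    )
    return numer / denom


# Hypothetical quarterly data with a fourth-quarter peak (illustration only).
sales = [10, 12, 11, 18, 12, 14, 13, 21, 14, 16, 15, 24]
for k in range(1, 6):
    print(f"lag {k}: r = {autocorrelation(sales, k):.3f}")
```

In practice the coefficients are usually plotted as a correlogram and compared against significance limits, which the statistical packages named in this outline produce automatically.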
Case Study: Consolidated Edison Company's quarterly sales from 1985 through 1999.
(This requires statistical software such as SPSS, Minitab, or SAS.)
Once we identify the data pattern, we can fit the data with an appropriate forecasting
model:
- Naive I: the forecast is based solely on the most recent information available. Best suited to stationary data.
- Moving Average: the forecast is the mean of the most recent observations. Best suited to stationary data.
- Simple Exponential Smoothing: the forecast is based on averaging (smoothing) past values of a series with exponentially decreasing weights. Best suited to stationary data.
- Holt's Linear Exponential Smoothing: smooths the level (data) and the slope (trend) using a different smoothing constant for each. Best suited to trend data.
- Winters' Seasonal Linear Exponential Smoothing: exponential smoothing adjusted for both trend and seasonal variation. Best suited to seasonal or trend-seasonal data.
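The first four models above can be sketched in a few lines of plain Python. This is an illustrative sketch, not the procedure used by any particular statistical package. The function names are hypothetical, and the initial values (first observation as the starting level, first difference as the starting trend) are assumptions, since the outline does not specify them. Winters' method is left out to keep the example short.

```python
def naive_forecast(series):
    """Naive I: next period's forecast equals the latest observation."""
    return series[-1]


def moving_average_forecast(series, k):
    """Forecast is the mean of the k most recent observations."""
    if not 0 < k <= len(series):
        raise ValueError("k must be between 1 and len(series)")
    return sum(series[-k:]) / k


def simple_exp_smoothing_forecast(series, alpha):
    """Simple exponential smoothing with smoothing constant alpha in (0, 1)."""
    level = series[0]  # assumed initialization: first observation
    for y in series[1:]:
        level = alpha * y + (1 - alpha) * level
    return level


def holt_forecast(series, alpha, beta, horizon=1):
    """Holt's linear method: alpha smooths the level, beta smooths the trend."""
    level = series[0]              # assumed initialization
    trend = series[1] - series[0]  # assumed initialization
    for y in series[1:]:
        prev_level = level
        level = alpha * y + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return level + horizon * trend


# Hypothetical data, for illustration only.
stationary = [50, 52, 49, 51, 50, 53]
trending = [20, 23, 25, 28, 31, 33]
print(naive_forecast(stationary))
print(moving_average_forecast(stationary, 3))
print(simple_exp_smoothing_forecast(stationary, alpha=0.3))
print(holt_forecast(trending, alpha=0.5, beta=0.3))
```

A larger smoothing constant puts more weight on recent observations; the constants are typically chosen to minimize forecast error on the historical data.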
Regression models are set aside for now and discussed later.
Case Study: We compare the error terms of the forecasting models above to determine
the best-fitting model, then use that model to forecast Consolidated Edison Company's
sales for the first quarter of 2000. (This requires statistical software such as SPSS, Minitab, or SAS.)
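Comparing error terms can be illustrated with a small sketch. The outline does not say which error measures the case study uses, so the mean absolute deviation (MAD) and mean squared error (MSE) below are assumed as common choices. The helper `one_step_errors` and the `demand` series are hypothetical. Each candidate is scored on the same periods, using only the data available before each period.

```python
def one_step_errors(series, forecaster, start):
    """Errors (actual minus forecast) for periods start..end.

    `forecaster` receives only the observations before the period
    being forecast, so every forecast is one step ahead.
    """
    return [series[t] - forecaster(series[:t]) for t in range(start, len(series))]


def mad(errors):
    """Mean absolute deviation of the forecast errors."""
    return sum(abs(e) for e in errors) / len(errors)


def mse(errors):
    """Mean squared error of the forecast errors."""
    return sum(e ** 2 for e in errors) / len(errors)


# Hypothetical stationary series, for illustration only.
demand = [50, 52, 49, 51, 50, 53, 48, 51, 50, 52]

forecasters = {
    "naive": lambda history: history[-1],
    "moving average (k=3)": lambda history: sum(history[-3:]) / 3,
}

# Start at period 3 so both models have enough history on every period.
errors = {name: one_step_errors(demand, f, start=3) for name, f in forecasters.items()}
scores = {name: (mad(e), mse(e)) for name, e in errors.items()}
best = min(scores, key=lambda name: scores[name][1])  # lowest MSE

for name, (m_abs, m_sq) in scores.items():
    print(f"{name}: MAD = {m_abs:.3f}, MSE = {m_sq:.3f}")
print("best by MSE:", best)
```

MSE penalizes large errors more heavily than MAD, so the two measures can rank models differently; it is worth checking both before settling on a model.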
For cross-sectional data, we can use regression analysis for forecasting. Below is the
procedure for finding the best-fitting regression equation:
- Define the independent and dependent variables
- Decide a priori (before the fact) what relationship (sign of the correlation) you expect between the independent and dependent variables
- Gather the data
- Plot each independent variable against the dependent variable to determine the linearity (or nonlinearity) of the relationship
- Inspect the correlation matrix for multicollinearity (independent variables that are highly correlated with one another) and for any dominant independent variable
- Estimate the best-fitting model using stepwise regression (it eliminates any independent variables that are insignificant)
- Test the model for accuracy
- Check the correlation sign(s) against our a priori expectations
- Test the significance of the coefficients (t and p values) and of the regression equation (standard error of the estimate, adjusted R-squared, F test)
- Check the randomness of the error terms: plot the residuals against each independent variable to detect any heteroscedasticity (the variance of the error terms is not constant, for example increasing as the independent variable increases)
- Forecast using the model
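Several steps of the procedure above can be sketched for the one-variable case in plain Python. This is an illustration under simplifying assumptions: a single independent variable, no stepwise selection, and no p values (which need a t distribution table or a statistics library). The variable names and figures are hypothetical and are not the case-study data.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simple_regression(x, y):
    """Least-squares fit of y = b0 + b1 * x with basic accuracy statistics."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    b1 = sxy / sxx
    b0 = my - b1 * mx
    residuals = [b - (b0 + b1 * a) for a, b in zip(x, y)]
    sse = sum(e ** 2 for e in residuals)
    sst = sum((b - my) ** 2 for b in y)
    std_err_est = math.sqrt(sse / (n - 2))          # standard error of the estimate
    t_slope = b1 / (std_err_est / math.sqrt(sxx))   # t statistic for H0: b1 = 0
    return {
        "b0": b0,
        "b1": b1,
        "r2": 1 - sse / sst,
        "std_err_est": std_err_est,
        "t_slope": t_slope,
        "residuals": residuals,
    }


# Hypothetical data: advertising spend (x) and store sales (y).
advertising = [1, 2, 3, 4, 5]
store_sales = [2.1, 3.9, 6.2, 7.8, 10.1]

print("correlation:", round(pearson(advertising, store_sales), 4))
fit = simple_regression(advertising, store_sales)
print("slope sign matches a positive a priori expectation:", fit["b1"] > 0)
print("forecast at x = 6:", fit["b0"] + fit["b1"] * 6)
```

With several independent variables, `pearson` can be applied to each pair to build the correlation matrix, and the returned `residuals` can be plotted against each independent variable to look for heteroscedasticity.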
Case Study: We determine the best-fitting regression equation to explain sales at
40 Home Depot stores, and use it to choose the better of two possible locations for a
new store. (This requires statistical software such as SPSS, Minitab, or SAS.)
For time series data, we can also use regression analysis. In addition to the
procedure above, we perform one more randomness check on the error terms:
Durbin-Watson test: detects serial correlation (error terms that are correlated
with themselves lagged by one period).
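The Durbin-Watson statistic itself is simple to compute from the regression residuals. It is the sum of squared differences between successive residuals divided by the sum of squared residuals. The sketch below uses hypothetical residuals. The statistic ranges from 0 to 4: values near 2 suggest no first-order serial correlation, values toward 0 suggest positive serial correlation, and values toward 4 suggest negative serial correlation. A formal test compares the statistic against tabulated lower and upper bounds, which are not reproduced here.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for a list of regression residuals in time order."""
    numer = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denom = sum(e ** 2 for e in residuals)
    return numer / denom


# Hypothetical residuals, for illustration only.
runs_of_same_sign = [1, 1, 1, -1, -1, -1]   # positive serial correlation
alternating_sign = [1, -1, 1, -1]           # negative serial correlation

print(durbin_watson(runs_of_same_sign))
print(durbin_watson(alternating_sign))
```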
Case Study: We now use time series regression analysis, with trend and seasonal
variables, to forecast Consolidated Edison's quarterly revenues for 2000, and compare
the result with the forecasting models used earlier. (This requires statistical
software such as SPSS, Minitab, or SAS.)
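A regression on trend and seasonal variables can be sketched as follows. This assumes a linear time index plus dummy variables for quarters 1 to 3 (quarter 4 as the base), fitted by ordinary least squares through the normal equations. The outline does not say how the case study codes its variables, so this setup is an assumption, and all function names and data are hypothetical. The series is assumed to start in the first quarter at t = 0.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def design_row(t):
    """Regressors for period t: intercept, time index, dummies for Q1 to Q3."""
    quarter = t % 4 + 1  # assumes the series starts in Q1 at t = 0
    return [1.0, float(t)] + [1.0 if quarter == q else 0.0 for q in (1, 2, 3)]


def fit_trend_seasonal(series):
    """Least-squares coefficients [intercept, trend, Q1, Q2, Q3]."""
    rows = [design_row(t) for t in range(len(series))]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, series)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def forecast(coefs, t):
    """Fitted or forecast value for period t."""
    return sum(c * x for c, x in zip(coefs, design_row(t)))


# Synthetic series built from a known trend and seasonal effects, so the
# fit can be checked against the values used to generate it.
effects = (5.0, -3.0, 1.0, 0.0)
revenue = [100.0 + 2.0 * t + effects[t % 4] for t in range(12)]

coefs = fit_trend_seasonal(revenue)
next_four = [forecast(coefs, t) for t in range(12, 16)]
print([round(c, 4) for c in coefs])
print([round(v, 2) for v in next_four])
```

Only three dummies are used for four quarters because including all four alongside the intercept would make the regressors perfectly collinear. The residuals from such a fit are what the Durbin-Watson check is applied to.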