Business Forecasting Cookbook

 

I. Data Types

Types of Data:

  • Time Series data: quarterly sales, daily stock price, etc.
  • Cross Sectional data: store sales for 10 branches across the U.S. at one point in time, etc.
  • Pooled data: U.S. egg production in 50 states in 1990 and 1991

 

II. Data Pattern

Pattern of Data (only for Time Series data):

  • Trend: long-term component that represents growth or decline over a period of time
  • Seasonal: pattern of change that repeats itself year after year
  • Cyclical: wavelike fluctuation around the trend
  • Stationary: series whose basic statistical properties, such as mean and variance, remain constant over time
  • Random: irregular variation that cannot be forecasted

Since different forecasting models fit different data patterns, we need to identify the data pattern before formulating the model:

  • Eye-ball observation: not appropriate if the pattern is mixed or the length of the seasonality is hard to tell
  • Autocorrelation Analysis: the correlation between a variable and itself lagged one or more periods
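
As a rough illustration of autocorrelation analysis, the sketch below computes the usual lag-k autocorrelation coefficient in plain Python. The function name `autocorrelation` and the `sales` numbers are made up for the example (they are not the case-study data); the series is built to repeat every four periods, as a quarterly seasonal series would.

```python
def autocorrelation(series, lag):
    """Lag-k autocorrelation coefficient r_k of a series."""
    n = len(series)
    mean = sum(series) / n
    # Denominator: total sum of squared deviations from the mean.
    denom = sum((y - mean) ** 2 for y in series)
    # Numerator: cross-products of deviations that are `lag` periods apart.
    num = sum((series[t] - mean) * (series[t - lag] - mean)
              for t in range(lag, n))
    return num / denom


# Illustrative quarterly series (made-up numbers) with a four-quarter pattern.
sales = [20, 10, 12, 14, 20, 10, 12, 14, 20, 10, 12, 14]

r = {k: autocorrelation(sales, k) for k in range(1, 5)}
for k, value in r.items():
    print(f"lag {k}: r = {value:+.3f}")
```

For this series the coefficient at lag 4 is clearly the largest, which is the kind of spike that suggests quarterly seasonality. A trending series would instead tend to show large coefficients at the first few lags that die down slowly.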

Case Study: Consolidated Edison Company's quarterly sales from 1985 through 1999. (Requires statistical software such as SPSS, Minitab, or SAS.)

 

III. Forecasting Models

Once we identify the data pattern, we can fit the data with an appropriate forecasting model:

  • Naive I: the forecast is based solely on the most recent observation available. Best suited to stationary data.
  • Moving Average: the forecast is the mean of a fixed number of the most recent observations. Best suited to stationary data.
  • Simple Exponential Smoothing: the forecast is a weighted average of past values of the series, with weights that decrease exponentially as the observations get older. Best suited to stationary data.
  • Holt's Linear Exponential Smoothing: smooths the level (data) and the slope (trend), using a different smoothing constant for each. Best suited to trend data.
  • Winters' Seasonal Linear Exponential Smoothing: exponential smoothing adjusted for both trend and seasonal variation. Best suited to seasonal or trend-seasonal data.
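
The first four models in the list can be sketched in a few lines of plain Python. This is a minimal illustration, not the output of any statistical package: the function names, the choice of initial level and slope, and the smoothing constants are assumptions made for the example, and Winters' method is left out for brevity.

```python
def naive_forecast(series):
    """Naive I: the next value equals the most recent observation."""
    return series[-1]


def moving_average_forecast(series, k):
    """Mean of the k most recent observations."""
    return sum(series[-k:]) / k


def ses_forecast(series, alpha):
    """Simple exponential smoothing; the first observation seeds the level."""
    level = series[0]
    for y in series[1:]:
        level = alpha * y + (1 - alpha) * level
    return level


def holt_forecast(series, alpha, beta, horizon=1):
    """Holt's linear method: separately smoothed level and trend."""
    level = series[0]
    trend = series[1] - series[0]  # simple initial slope (an assumption)
    for y in series[1:]:
        prev_level = level
        level = alpha * y + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return level + horizon * trend


# Illustrative trending series (made-up numbers): rises by 2 each period.
trend_data = [2, 4, 6, 8, 10]

print("Naive I:        ", naive_forecast(trend_data))
print("Moving average: ", moving_average_forecast(trend_data, k=3))
print("Simple exp.:    ", ses_forecast(trend_data, alpha=0.5))
print("Holt:           ", holt_forecast(trend_data, alpha=0.5, beta=0.3))
```

On this steadily rising series the true next value would be 12. The three models meant for stationary data forecast 10 or less, while Holt's method extrapolates the slope and lands at about 12, which illustrates why the data pattern should drive the choice of model.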

We set regression models aside for now and discuss them in the next section.

Case Study: We compare the forecast errors of the models above to determine the best-fitting model, then use that model to forecast Consolidated Edison Company's sales for the first quarter of 2000. (Requires statistical software such as SPSS, Minitab, or SAS.)
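
One common way to compare error terms is to forecast each period from the data before it and summarize the errors with measures such as the mean absolute deviation (MAD) and mean squared error (MSE). The sketch below does this for two of the models; the helper names, the three-period warm-up, and the series are assumptions made for illustration, not the case-study figures.

```python
def one_step_errors(series, forecaster, warmup):
    """Errors from forecasting each point using only the data before it."""
    return [series[t] - forecaster(series[:t])
            for t in range(warmup, len(series))]


def mad(errors):
    """Mean absolute deviation of the forecast errors."""
    return sum(abs(e) for e in errors) / len(errors)


def mse(errors):
    """Mean squared error of the forecast errors."""
    return sum(e * e for e in errors) / len(errors)


def naive(history):
    return history[-1]


def moving_average_3(history):
    return sum(history[-3:]) / 3


# Illustrative stationary series (made-up numbers, not the case-study data).
series = [50, 52, 49, 51, 50, 53, 48, 51, 50, 52]

models = {"Naive I": naive, "Moving average (k=3)": moving_average_3}
scores = {}
for name, model in models.items():
    errs = one_step_errors(series, model, warmup=3)
    scores[name] = (mad(errs), mse(errs))
    print(f"{name:22s} MAD={scores[name][0]:.3f}  MSE={scores[name][1]:.3f}")

# Pick the model with the lowest MSE over the comparison window.
best = min(scores, key=lambda name: scores[name][1])
print("Lowest MSE:", best)
```

Both models are scored on the same periods (the warm-up is the same for each), so the error measures are directly comparable. On this series the moving average has the lower MAD and MSE.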

 

IV. Regression Analysis

For cross sectional data, we can use regression analysis for forecasting. The procedure for finding the best-fitting regression equation is:

  1. Define the independent and dependent variables
  2. Decide a priori (before the fact) what relationship (sign of correlation) you expect between each independent variable and the dependent variable
  3. Gather the data
  4. Plot each independent variable against the dependent variable to judge whether the relationship is linear or nonlinear
  5. Examine the correlation matrix for multicollinearity (independent variables that are highly correlated with one another) and for any dominant independent variable
  6. Estimate the best-fitting model using stepwise regression (it eliminates independent variables that are insignificant)
  7. Test the model for accuracy
     • Check the sign of each estimated relationship against our a priori expectation
     • Test the significance of the coefficients (t and p values) and of the regression equation (standard error of the estimate, adjusted R2, F test)
     • Check the randomness of the error terms: plot the residuals against each independent variable to detect heteroscedasticity (the variance of the error terms is not constant, for example it increases as the independent variable increases)
  8. Forecast using the model
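
The fitting, testing, and forecasting steps can be sketched for the simplest case of one independent variable. This is a minimal illustration in plain Python: the function `simple_regression`, the variables `x` and `y`, and their values are hypothetical, and a real analysis with several independent variables would rely on a statistical package.

```python
import math


def simple_regression(x, y):
    """Ordinary least squares for y = b0 + b1 * x with basic diagnostics."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx                      # slope
    b0 = mean_y - b1 * mean_x           # intercept

    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    sse = sum(e * e for e in residuals)             # unexplained variation
    sst = sum((yi - mean_y) ** 2 for yi in y)       # total variation

    r2 = 1 - sse / sst
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - 2)       # one predictor
    std_err_est = math.sqrt(sse / (n - 2))          # std error of estimate
    t_slope = b1 / (std_err_est / math.sqrt(sxx))   # t statistic for b1
    return {"b0": b0, "b1": b1, "r2": r2, "adj_r2": adj_r2,
            "std_err_est": std_err_est, "t_slope": t_slope,
            "residuals": residuals}


# Hypothetical data: x is an independent variable, y the dependent variable.
x = [1, 2, 3, 4, 5, 6]
y = [3.1, 4.9, 7.2, 8.8, 11.1, 12.9]

fit = simple_regression(x, y)
print(f"y = {fit['b0']:.3f} + {fit['b1']:.3f} x")
print(f"adj R2 = {fit['adj_r2']:.4f}, t(slope) = {fit['t_slope']:.1f}")

# Step 8: forecast for a new value of the independent variable.
forecast = fit["b0"] + fit["b1"] * 7
print(f"forecast at x = 7: {forecast:.2f}")
```

Here the a priori expectation would be a positive relationship, and the fitted slope is positive, so the sign check passes. The returned residuals are what would be plotted against `x` to look for heteroscedasticity.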

Case Study: We determine the best-fitting regression equation to explain sales at 40 Home Depot stores, then use it to choose the better of two possible locations for a new store. (Requires statistical software such as SPSS, Minitab, or SAS.)

For time series data, we can also use regression analysis. In addition to the procedure above, we perform one more check on the randomness of the error terms:

Durbin-Watson test: detects serial correlation (each error term is correlated with the error term of the previous period).
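
The Durbin-Watson statistic itself is simple to compute from the residuals: the sum of squared changes between successive residuals divided by the sum of squared residuals. The sketch below uses two made-up residual sequences to show the two extremes; the function name is an assumption for the example.

```python
def durbin_watson(residuals):
    """DW = sum((e_t - e_{t-1})^2) / sum(e_t^2), for t = 2..n."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e * e for e in residuals)
    return num / den


# Made-up residuals that stay on one side of zero for several periods
# (positive serial correlation) versus residuals that flip sign every period
# (negative serial correlation).
runs = [1, 1, 1, -1, -1, -1]
alternating = [1, -1, 1, -1, 1, -1]

dw_runs = durbin_watson(runs)
dw_alternating = durbin_watson(alternating)
print(f"runs:        DW = {dw_runs:.3f}")
print(f"alternating: DW = {dw_alternating:.3f}")
```

The statistic ranges from 0 to 4. Values near 2 indicate no first-order serial correlation, values well below 2 point to positive serial correlation, and values well above 2 point to negative serial correlation. The formal test compares the statistic with tabulated lower and upper critical values.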

Case Study: We now use time series regression analysis, with trend and seasonal variables, to forecast Edison’s quarterly revenues for 2000, and compare the result with the forecasting models used earlier. (Requires statistical software such as SPSS, Minitab, or SAS.)
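
A regression with trend and seasonal variables can be sketched as an ordinary least squares fit on a time index plus dummy variables for three of the four quarters. Everything below is an illustrative assumption: the helper names, the choice of quarter 4 as the base quarter, and the noise-free made-up revenues (which are not Edison's figures). The normal equations are solved with a small Gaussian elimination so that the sketch needs no external library.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):                     # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def design_row(t):
    """Intercept, trend, and dummies for quarters 1-3 (quarter 4 is the base)."""
    quarter = t % 4                                    # 0 -> Q1, ..., 3 -> Q4
    return [1.0, float(t)] + [1.0 if quarter == q else 0.0 for q in range(3)]


def fit_ols(rows, y):
    """Least squares via the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


# Made-up quarterly revenues: a trend of 2 per quarter plus a fixed seasonal
# effect, with no noise, so the fit should recover the inputs.
seasonal = [5.0, -3.0, 1.0, 0.0]
revenue = [100.0 + 2.0 * t + seasonal[t % 4] for t in range(12)]

coeffs = fit_ols([design_row(t) for t in range(12)], revenue)
forecast = [sum(c * v for c, v in zip(coeffs, design_row(t)))
            for t in range(12, 16)]

print("coefficients:", [round(c, 3) for c in coeffs])
print("next four quarters:", [round(f, 1) for f in forecast])
```

Only three dummies are used because a fourth would be perfectly collinear with the intercept. Each dummy coefficient is then read as the difference between that quarter and the base quarter.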

 
