Forecasting Notes

 

Dummy Variable:  a dummy variable is a categorical variable that takes the value one when a given characteristic applies and zero when it does not.  It is a way of introducing “states of nature” that are important in understanding the movement of the dependent variable but cannot be quantified as continuous variables.

 

Seasonal Effects:  use of dummy variables (within a simple linear trend case)

           1.     Suppose you want to estimate an equation for the sales of grass seed over some period, and you believe that the basic relationship is

                                                                                               

Sales = f(time)

          

           Say you have quarterly data, as shown in this table:

 

                                         Sales

               time period 1:  1996-Q1    100

                    period 2:  1996-Q2    120

                    period 3:  1996-Q3    110

                    period 4:  1996-Q4    105

                 

                    period 5:  1997-Q1    104

                    period 6:  1997-Q2    124  

                    period 7:  1997-Q3    114

                    period 8:  1997-Q4    109

 

           2.     Here is a graph of the 8 observations:

                  [Figure: the 8 quarterly sales observations plotted against time]


           3.     Without seasonal dummies, a simple linear trend regression estimates a straight line:

 

                        Sales = 106.8 + .881 t    R square = .07
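
The trend line above can be checked with a short least-squares fit. This is a minimal sketch, assuming Python with NumPy is available; the variable names are illustrative and are not part of the original notes. The printed values should agree with the equation above up to rounding.

```python
import numpy as np

# Quarterly grass seed sales from the table above (periods 1..8).
sales = np.array([100, 120, 110, 105, 104, 124, 114, 109], dtype=float)
t = np.arange(1, len(sales) + 1, dtype=float)

# Ordinary least squares fit of Sales = a + b*t.
# np.polyfit returns coefficients from the highest power down: [b, a].
b, a = np.polyfit(t, sales, 1)

# R square = 1 - (sum of squared residuals) / (total sum of squares).
fitted = a + b * t
sse = np.sum((sales - fitted) ** 2)
sst = np.sum((sales - sales.mean()) ** 2)
r_square = 1.0 - sse / sst

print(f"Sales = {a:.1f} + {b:.3f} t    R square = {r_square:.2f}")
```

The low R square reflects that a straight line cannot follow the quarter-to-quarter swings, even though the underlying upward trend is real.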

 

 

 

           4.     By including dummies, however, you can pick up the regular ups and downs that appear to repeat on a seasonal basis.

          

                 

           5.     Since four seasons have to be introduced, you will need more than one dummy.  One dummy variable is good for a two-part breakdown: the variable takes the value one if the characteristic applies and zero if it does not.  For example, to break the year into two parts, you could use summer (dummy = 1) and winter (dummy = 0).

              a.  A subtle feature is that three dummies, not four, are what you need to handle a four-part breakdown.  Thus, dummy 1 can represent spring, dummy 2 summer, and dummy 3 fall, and the left-out category is winter.  In other words, if the observation is for winter, all three dummies have the value zero and the impact of the winter season is captured in the constant term.
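
One way to see why three dummies are enough is to compare the regressor matrix built with four dummies against the one built with three. The sketch below assumes Python with NumPy; the names `X_four` and `X_three` are hypothetical. With a constant term plus all four seasonal dummies, the dummy columns add up to the constant column, so the matrix has five columns but only rank four and the coefficients cannot be separately estimated.

```python
import numpy as np

# Quarter of the year for each of 8 periods (1 = winter, 2 = spring, 3 = summer, 4 = fall).
quarter = np.array([1, 2, 3, 4, 1, 2, 3, 4])
constant = np.ones(len(quarter))

# One 0/1 column per season: column j is 1 when the observation falls in quarter j+1.
all_four = (quarter[:, None] == np.array([1, 2, 3, 4])).astype(float)

# Constant plus all four dummies: the dummy columns sum to the constant column.
X_four = np.column_stack([constant, all_four])

# Constant plus three dummies (winter is the left-out category).
X_three = np.column_stack([constant, all_four[:, 1:]])

rank_four = np.linalg.matrix_rank(X_four)
rank_three = np.linalg.matrix_rank(X_three)

print(X_four.shape[1], "columns, rank", rank_four)
print(X_three.shape[1], "columns, rank", rank_three)
```

Dropping one season removes the redundancy, and the left-out season's effect is absorbed by the constant term as described above.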

 

                                               

           6.     So the grass seed data would be set up like this:

 

                                           (Spring)  (Summer)  (Fall)

                                      Sales   D1        D2       D3   

            time  period 1:  1996-Q1    100    0         0        0

                  period 2:  1996-Q2    120    1         0        0

                  period 3:  1996-Q3    110    0         1        0

                  period 4:  1996-Q4    105    0         0        1

                  period 5:  1997-Q1    104    0         0        0

                  period 6:  1997-Q2    124    1         0        0

                  period 7:  1997-Q3    114    0         1        0

                  period 8:  1997-Q4    109    0         0        1

                 .

                 .

                etc.

            The observations would be entered like this:

                         for time period 1:

                        S=100   and    T=1   D1=0   D2=0   D3=0

                         for time period 2:

                        S=120   and    T=2   D1=1   D2=0   D3=0

                         etc.
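
The data entry described above can also be generated rather than typed by hand. The following is a minimal sketch in plain Python, assuming the series starts in the first quarter (winter) and that Q2, Q3 and Q4 correspond to D1, D2 and D3 as in the table; the variable names are illustrative.

```python
# Quarterly sales in time order, starting with 1996-Q1.
sales = [100, 120, 110, 105, 104, 124, 114, 109]

rows = []
for i, s in enumerate(sales):
    t = i + 1                       # time period, 1..8
    quarter = i % 4 + 1             # 1 = Q1 (winter), 2 = Q2 (spring), 3 = Q3 (summer), 4 = Q4 (fall)
    d1 = 1 if quarter == 2 else 0   # spring dummy
    d2 = 1 if quarter == 3 else 0   # summer dummy
    d3 = 1 if quarter == 4 else 0   # fall dummy
    rows.append((s, t, d1, d2, d3))

# Print one line per observation in the same layout as the entries above.
for s, t, d1, d2, d3 in rows:
    print(f"S={s}   and    T={t}   D1={d1}   D2={d2}   D3={d3}")
```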

                  The resulting estimated equation would be

 

                        Sales = 99 + 1.0(t) + 19(D1) + 8(D2) + 2(D3)

                            R square = 1.0
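
The estimated equation can be reproduced with a least-squares fit on the trend and the three dummies. This is a sketch under the assumption that Python with NumPy is used; the notes do not say which software produced the original estimates.

```python
import numpy as np

# Data from the table: sales, time index, and the three seasonal dummies.
sales = np.array([100, 120, 110, 105, 104, 124, 114, 109], dtype=float)
t = np.arange(1, 9, dtype=float)
quarter = (np.arange(8) % 4) + 1          # 1 = winter (left-out category)
d1 = (quarter == 2).astype(float)         # spring
d2 = (quarter == 3).astype(float)         # summer
d3 = (quarter == 4).astype(float)         # fall

# Regressor matrix: constant, t, D1, D2, D3.
X = np.column_stack([np.ones(8), t, d1, d2, d3])

# Least-squares coefficients in the same order as the columns of X.
coef, _, _, _ = np.linalg.lstsq(X, sales, rcond=None)

# R square = 1 - (sum of squared residuals) / (total sum of squares).
fitted = X @ coef
sse = np.sum((sales - fitted) ** 2)
sst = np.sum((sales - sales.mean()) ** 2)
r_square = 1.0 - sse / sst

print("constant, t, D1, D2, D3:", np.round(coef, 3))
print("R square:", round(r_square, 4))
```

Each seasonal coefficient is read relative to winter: for example, the D1 coefficient is how far spring sales sit above the winter level at the same point on the trend.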

 

           

     

Note the impressive improvement in R square, from .07 to 1.0: much more accurate forecasts appear possible with this improved model.  (Of course, the data above were hand-picked so that a very precise underlying relationship was present.)
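
To illustrate how the estimated equation would be used for forecasting, the sketch below plugs future time periods into it. The function name `forecast_sales` and the choice of periods 9 through 12 (the four quarters following the sample) are assumptions made for illustration; the coefficients are the ones from the estimated equation above.

```python
def forecast_sales(t, quarter):
    """Forecast from Sales = 99 + 1.0(t) + 19(D1) + 8(D2) + 2(D3).

    quarter: 1 = winter (left out), 2 = spring, 3 = summer, 4 = fall.
    """
    d1 = 1 if quarter == 2 else 0
    d2 = 1 if quarter == 3 else 0
    d3 = 1 if quarter == 4 else 0
    return 99 + 1.0 * t + 19 * d1 + 8 * d2 + 2 * d3


# Periods 9..12 continue the time index past the 8 observed quarters.
forecasts = {}
for t in range(9, 13):
    quarter = (t - 1) % 4 + 1
    forecasts[t] = forecast_sales(t, quarter)
    print(f"period {t} (Q{quarter}): forecast sales = {forecasts[t]:.0f}")
```

Because the sample data fit the model exactly, these forecasts simply extend the same pattern; real data would carry forecast error.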

 

 

 

 

 

 
