Regression Analysis is the study of relationships between variables. Because of its generality and applicability, Regression Analysis is one of the most pervasive of all statistical methods in the business world.

 

The data for the given problem present three variables (price, age, and number of bidders) for 32 recently auctioned comparable items.

 

Objective: Determine the validity of an antique collector's assumption about the variables that influence the price of goods sold at auction, by building a regression model and providing adequate analysis.

 

To examine the relationships between the dependent variable (an item's price) and the independent variables (its age and the number of people bidding), we use scatterplots. The higher the correlation between an explanatory variable and the response variable, the closer the relationship is to linear, and the more reliably a change in the independent variable is reflected in the dependent variable. In this case the correlations of price with age and with number of bidders were 0.73 and 0.43 respectively; they are presented graphically below.

[Scatterplots: price vs. age of item; price vs. number of bidders]
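As an illustration of how such a correlation is obtained, here is a minimal sketch of the Pearson correlation coefficient in plain Python. The function name `pearson_r` and the sample values are assumptions made for this example only; they are not the 32 auction records, which are not reproduced in this report.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares around the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical values for illustration only (NOT the auction data).
age = [110, 125, 140, 155, 170, 185]
price = [900, 1150, 1100, 1400, 1650, 1600]

r = pearson_r(age, price)
print(f"correlation(age, price) = {r:.2f}")
```

The same function applied to two explanatory variables gives the kind of check used to decide whether both can enter the model together.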

 

 

 

        

 

It can be concluded that the age of an item plays a more decisive role in its price than the number of people bidding for it. Furthermore, for more than one explanatory variable to be worth including (here two: age and number of bidders), the independent variables should have low correlation with each other. A scatterplot of items' age against number of bidders gave a correlation of -0.24, low enough to consider both variables when building the regression model.

[Scatterplot: age of item vs. number of bidders]

 

StatPro's multiple regression procedure is used to estimate the equation for the price of an item as a function of its age and the number of people bidding for it. It uses the least squares method to estimate the regression.
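To make the least squares idea concrete, the sketch below fits an equation with a constant and two explanatory variables by solving the normal equations. It is not StatPro's implementation; the function name `fit_two_predictors` and the small data set are assumptions for illustration, with the data constructed so that the true coefficients are known.

```python
def fit_two_predictors(x1, x2, y):
    """Least squares fit of y = b0 + b1*x1 + b2*x2 via the normal equations."""
    rows = [[1.0, a, b] for a, b in zip(x1, x2)]
    # Augmented matrix [X'X | X'y] for the 3 unknown coefficients.
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(3)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(3)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda k: abs(aug[k][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for k in range(3):
            if k != col:
                factor = aug[k][col] / aug[col][col]
                aug[k] = [a - factor * b for a, b in zip(aug[k], aug[col])]
    return [aug[i][3] / aug[i][i] for i in range(3)]

# Hypothetical data generated exactly from y = 5 + 2*x1 + 3*x2.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 6]
y = [5 + 2 * a + 3 * b for a, b in zip(x1, x2)]

b0, b1, b2 = fit_two_predictors(x1, x2, y)
print(round(b0, 6), round(b1, 6), round(b2, 6))
```

Because the sample values lie exactly on a plane, the fit recovers the coefficients used to generate them; with real data the same procedure returns the coefficients that minimize the sum of squared residuals.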

  

Results of multiple regression for Auction_Price

Summary measures
  Multiple R       0.9448
  R-Square         0.8927
  Adj R-Square     0.8853
  StErr of Est   133.1365

ANOVA Table
  Source         df             SS             MS          F    p-value
  Explained       2   4277159.7188   2138579.8594   120.6511     0.0000
  Unexplained    29    514034.5000     17725.3276

Regression coefficients
                    Coefficient    Std Err    t-value    p-value    Lower limit    Upper limit
  Constant           -1336.7220   173.3561    -7.7108     0.0000     -1691.2753      -982.1688
  Age_of_Item           12.7362     0.9024    14.1140     0.0000        10.8906        14.5818
  Number_Bidders        85.8151     8.7058     9.8573     0.0000        68.0098       103.6204

 

The multiple regression output above is for the recently auctioned comparable items.

Estimated Price of an Item = -1336.72 + 12.74 × Age of Item + 85.82 × Number of Bidders

The interpretation of the equation above is that if the number of bidders is held constant, the price of the item is expected to increase by 12.74 for each additional year of age, and if the age is held constant, the price is expected to rise by 85.82 for each additional bidder. The constant, -1336.72, is the intercept of the equation rather than a fixed component of an item's price: a negative price has no practical meaning, so the constant only positions the fitted equation.
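A short sketch of how the estimated equation is applied follows. The coefficients are taken from the regression output; the constant is entered as negative, which is what its reported confidence limits (-1691.28 to -982.17) imply. The function name `predict_price` and the example item (150 years old, 10 bidders) are hypothetical.

```python
# Coefficients from the regression output.
# Assumption: the constant is negative, consistent with its confidence limits.
B0 = -1336.7220
B_AGE = 12.7362
B_BIDDERS = 85.8151

def predict_price(age_years, n_bidders):
    """Fitted auction price from the estimated regression equation."""
    return B0 + B_AGE * age_years + B_BIDDERS * n_bidders

# Hypothetical item, chosen only to illustrate the calculation.
base = predict_price(150, 10)
one_year_older = predict_price(151, 10)
one_more_bidder = predict_price(150, 11)

print(f"fitted price:              {base:.2f}")
print(f"effect of one extra year:  {one_year_older - base:.2f}")
print(f"effect of one more bidder: {one_more_bidder - base:.2f}")
```

Holding one variable fixed and changing the other by one unit reproduces the corresponding coefficient, which is exactly the "held constant" reading of the equation.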

 

 

 Summary Measures:

 

R² (R-Square) measures the goodness of linear fit. It is the percentage of variation of the response variable explained by the combined set of explanatory variables. As the table above shows, R² is almost 0.9, which indicates a strong linear relationship between the dependent variable (price) and the independent variables (age of the item and number of bidders): an item's age and the number of bidders together explain 89% of the variation in price. The square root of R² is the correlation between the fitted values and the observed values of the response variable; for the given model it is 0.94. The scatterplot of fitted values versus observed values below presents this high correlation graphically.

[Scatterplot: fitted values vs. observed values of price]
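The R² and Multiple R figures can be recomputed from the sums of squares in the ANOVA table, as in the following sketch. The variable names are chosen for this example; the numbers are those reported in the output.

```python
import math

# Sums of squares from the ANOVA table.
ss_explained = 4277159.7188
ss_unexplained = 514034.5000

ss_total = ss_explained + ss_unexplained

# R-Square: share of total variation explained by the regression.
r_square = ss_explained / ss_total

# Multiple R: correlation between fitted and observed values.
multiple_r = math.sqrt(r_square)

print(f"R-Square   = {r_square:.4f}")
print(f"Multiple R = {multiple_r:.4f}")
```

Both values can be compared with the 0.8927 and 0.9448 listed under the summary measures.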

 

 

 

Since adding an explanatory variable to the equation can never decrease R², R² alone cannot tell us whether the additional variable actually improves the accuracy of the prediction. Adjusted R², which is also listed in the regression output, helps to judge the relevance of additional explanatory variables to the response variable. If adjusted R² decreases when an extra variable is added, that variable (or those variables) should be omitted.
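The usual formula for adjusted R² penalizes each extra explanatory variable through the degrees of freedom. The sketch below applies it to the reported R² with 32 observations and 2 explanatory variables; the function name `adjusted_r_square` is an assumption, and the "third variable" case uses an invented R² of 0.8930 purely to show the penalty at work.

```python
def adjusted_r_square(r_square, n_obs, n_predictors):
    """Adjusted R-Square: 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    return 1 - (1 - r_square) * (n_obs - 1) / (n_obs - n_predictors - 1)

# Values from the regression output: R-Square 0.8927, 32 items, 2 predictors.
adj_two = adjusted_r_square(0.8927, 32, 2)

# Hypothetical: a third variable that lifts R-Square only to 0.8930.
adj_three = adjusted_r_square(0.8930, 32, 3)

print(f"adjusted R-Square, 2 predictors: {adj_two:.4f}")
print(f"adjusted R-Square, 3 predictors (hypothetical): {adj_three:.4f}")
```

In the hypothetical case R² goes up slightly but adjusted R² goes down, which is the signal that the extra variable should be left out.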

 

The standard error of estimate, Se, measures the size of the error we are likely to make when using the multiple regression equation to predict the response variable. The smaller the standard error for a particular regression equation, the more accurate its predictions tend to be. The table above gives a standard error of 133.1, meaning that approximately 2/3 of the predicted prices should be within one standard error, or $133, of the actual item price.
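The standard error of estimate follows from the unexplained sum of squares and its degrees of freedom, as this sketch shows. The numbers come from the ANOVA table; the helper `one_se_band` and the fitted price passed to it are hypothetical.

```python
import math

# From the ANOVA table: unexplained sum of squares, 32 items, 2 predictors.
ss_unexplained = 514034.5000
n_obs, n_predictors = 32, 2

degrees_of_freedom = n_obs - n_predictors - 1  # 29
se = math.sqrt(ss_unexplained / degrees_of_freedom)

def one_se_band(fitted_price):
    """Range of one standard error either side of a fitted price."""
    return fitted_price - se, fitted_price + se

print(f"StErr of Est = {se:.4f}")

# Hypothetical fitted price, used only to show the width of the band.
low, high = one_se_band(1400.0)
print(f"about 2/3 of actual prices expected between {low:.0f} and {high:.0f}")
```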

 

Regression Coefficients:

 


 

P-value: the probability of obtaining an estimate at least this far from zero if there were in fact no relationship between the dependent and independent variable, that is, the risk of a type I error in concluding that a relationship exists. If it is above 0.05, we should not use the variable as a predictor. As the regression output above shows, the p-values for both explanatory variables are very low, so both independent variables are used in the equation.

 

t-value: the ratio of the estimate of a regression coefficient to its standard error, used to test whether the coefficient is 0.
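The t-values in the output can be reproduced by dividing each coefficient by its standard error, as sketched below. The constant is entered as negative, an assumption based on its reported confidence limits. The last step backs out the multiplier used for the confidence limits from the Age_of_Item row; small differences from the printed table are due to rounding of the reported figures.

```python
# (coefficient, standard error) pairs from the regression output.
# Assumption: the constant is negative, consistent with its confidence limits.
coefficients = {
    "Constant": (-1336.7220, 173.3561),
    "Age_of_Item": (12.7362, 0.9024),
    "Number_Bidders": (85.8151, 8.7058),
}

# t-value = estimate / standard error.
t_values = {name: est / se for name, (est, se) in coefficients.items()}

for name, t in t_values.items():
    print(f"{name:15s} t = {t:8.4f}")

# Implied multiplier behind the confidence limits for Age_of_Item:
# half the interval width divided by the standard error.
lower, upper = 10.8906, 14.5818
multiplier = (upper - lower) / (2 * coefficients["Age_of_Item"][1])
print(f"implied t multiplier = {multiplier:.3f}")
```

A t-value far from zero, as for both explanatory variables here, corresponds to a very small p-value.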
