This assignment draws on two sources: the USDA Economic Research Service's Food Availability System and the FRED economic database. The former supplied peanut butter consumption values for 1980 to 2012, and the latter supplied real disposable personal income per capita for the same years. This range was chosen to give the regression analysis a large enough sample, and 2012 is the year at which the research "cut off." The dependent variable, peanut butter consumption, and the explanatory variable, real disposable personal income, together form the Engel curve, as seen later in this post.
The regression analysis was run to measure the effect of income on peanut butter consumption and so determine what kind of good peanut butter is. The regression indicates that peanut butter is a normal good: as income rises, demand (consumption, in this case) rises as well. The regression output below shows this result. The equation is:
PBC_{i} = β_{0} + β_{1}setRDPIPC + errorterm_{i}
Real Disposable Income (in hundreds) per Capita on Peanut Butter Consumption per Capita
The REG Procedure
Model: MODEL1
Dependent Variable: PBC PBC
Number of Observations Read | 50 |
Number of Observations Used | 33 |
Number of Observations with Missing Values | 17 |
Analysis of Variance | |||||
Source | DF | Sum of Squares | Mean Square | F Value | Pr > F |
Model | 1 | 1.25309 | 1.25309 | 10.55 | 0.0028 |
Error | 31 | 3.68056 | 0.11873 | ||
Corrected Total | 32 | 4.93365 |
Root MSE | 0.34457 | R-Square | 0.2540 |
Dependent Mean | 3.15007 | Adj R-Sq | 0.2299 |
Coeff Var | 10.93846 |
Parameter Estimates | ||||||
Variable | Label | DF | Parameter Estimate | Standard Error | t Value | Pr > |t| |
Intercept | Intercept | 1 | 2.10576 | 0.32700 | 6.44 | <.0001 |
setRDPIPC | | 1 | 0.00363 | 0.00112 | 3.25 | 0.0028 |
With these estimates, we can fill in the coefficients of the simple linear regression equation, with income measured in hundreds of dollars.
PBC_{i} = 2.10576 + 0.00363 setRDPIPC + errorterm_{i}
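As a rough check on how the fitted line behaves, the sketch below evaluates it and backs out two quantities the output does not print directly: the sample mean of income implied by the estimates (an OLS line with an intercept passes through the point of means) and the income elasticity at that mean. The function and variable names are illustrative assumptions and are not part of the original SAS program.

```python
# Coefficients copied from the PROC REG output (income in hundreds of dollars).
INTERCEPT = 2.10576
SLOPE = 0.00363
MEAN_PBC = 3.15007  # "Dependent Mean" from the same output


def predict_pbc(income_hundreds):
    """Fitted peanut butter consumption per capita at a given income level."""
    return INTERCEPT + SLOPE * income_hundreds


# An OLS line with an intercept passes through (mean income, mean PBC),
# so the mean income can be recovered from the reported numbers.
implied_mean_income = (MEAN_PBC - INTERCEPT) / SLOPE

# Income elasticity at the means: slope * (mean income / mean consumption).
elasticity_at_mean = SLOPE * implied_mean_income / MEAN_PBC

print(f"Implied mean income: {implied_mean_income:.1f} hundred dollars")
print(f"Income elasticity at the mean: {elasticity_at_mean:.3f}")
```

With these rounded coefficients the elasticity comes out near 0.33: positive but below 1, which is what one would expect of a normal good whose consumption rises less than proportionally with income.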
The coefficient on real disposable personal income is significant at the 5% level (p = 0.0028). However, there are still problems with this model. As seen above, the R^{2} value is rather small (0.2540), which suggests that the model does not fit the data well.
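The fit statistics can be recomputed from the Analysis of Variance table alone. The sketch below does this using only the sums of squares and degrees of freedom reported above; the variable names are my own.

```python
import math

# Sums of squares and degrees of freedom from the Analysis of Variance table.
SS_MODEL, DF_MODEL = 1.25309, 1
SS_ERROR, DF_ERROR = 3.68056, 31
SS_TOTAL = 4.93365

# R-squared is the share of total variation captured by the model.
r_squared = SS_MODEL / SS_TOTAL

# The F statistic compares explained to unexplained variation per degree of freedom.
ms_model = SS_MODEL / DF_MODEL
ms_error = SS_ERROR / DF_ERROR
f_value = ms_model / ms_error

# Root MSE is the standard deviation of the residuals.
root_mse = math.sqrt(ms_error)

print(f"R-Square: {r_squared:.4f}")
print(f"F Value:  {f_value:.2f}")
print(f"Root MSE: {root_mse:.5f}")
```

These reproduce the reported R-Square, F Value, and Root MSE to the printed precision, so roughly three quarters of the variation in consumption is left unexplained by income alone.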
Autocorrelation / Durbin-Watson Test:
Real Disposable Income (in hundreds) per Capita on Peanut Butter Consumption per Capita
The AUTOREG Procedure
Ordinary Least Squares Estimates | |||
SSE | 3.68056122 | DFE | 31 |
MSE | 0.11873 | Root MSE | 0.34457 |
SBC | 28.2593619 | AIC | 25.2663468 |
MAE | 0.27955637 | AICC | 25.6663468 |
MAPE | 8.88405386 | HQC | 26.2734053 |
Durbin-Watson | 0.4287 | Total R-Square | 0.2540 |
Parameter Estimates | |||||
Variable | DF | Estimate | Standard Error | t Value | Approx Pr > |t| |
Intercept | 1 | 2.1058 | 0.3270 | 6.44 | <.0001 |
setRDPIPC | 1 | 0.0000363 | 0.0000112 | 3.25 | 0.0028 |
Here, we can see that the Durbin-Watson statistic (0.4287) is far below the value of 2 expected when there is no autocorrelation, suggesting heavy positive autocorrelation. Autocorrelation means that the error terms are correlated across time periods, in this case with earlier periods carrying over into later ones. We correct for this by including two lagged error terms (AR1 and AR2) in the model. The resulting Durbin-Watson value, 1.9626, is very close to 2. We can also see the parameter estimates for the lagged terms below. Additionally, the total R^{2} value rises to a respectable .7143, meaning that 71.43% of the variation is explained by the model once the autoregressive terms are included. Note, however, that the income coefficient is now significant only at the 10% level (p = 0.0952).
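To make the Durbin-Watson statistic less of a black box, here is a minimal sketch of how it is computed from a residual series, together with the common approximation DW ≈ 2(1 − ρ) that links it to first-order autocorrelation. The residual lists are made up for illustration only, since the actual residuals are not shown in this post.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences over the sum of squared residuals."""
    numerator = sum((residuals[t] - residuals[t - 1]) ** 2
                    for t in range(1, len(residuals)))
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


def implied_rho(dw):
    """First-order autocorrelation implied by the approximation DW ~ 2 * (1 - rho)."""
    return 1 - dw / 2


# Made-up residuals: a slowly drifting series (positive autocorrelation)
# versus one that flips sign every period (negative autocorrelation).
drifting = [1.0, 0.9, 0.7, 0.4, 0.1, -0.2, -0.5, -0.8]
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
print(durbin_watson(drifting))     # well below 2
print(durbin_watson(alternating))  # well above 2

# Values reported by PROC AUTOREG before and after the correction.
print(implied_rho(0.4287))
print(implied_rho(1.9626))
```

Under that approximation, the uncorrected statistic of 0.4287 corresponds to a first-order autocorrelation of roughly 0.79, while the corrected value of 1.9626 corresponds to roughly 0.02.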
Real Disposable Income (in hundreds) per Capita on Peanut Butter Consumption per Capita
The AUTOREG Procedure
Maximum Likelihood Estimates | |||
SSE | 1.40937588 | DFE | 29 |
MSE | 0.04860 | Root MSE | 0.22045 |
SBC | 4.54312068 | AIC | -1.4429096 |
MAE | 0.15374566 | AICC | -0.0143381 |
MAPE | 4.89358882 | HQC | 0.57120748 |
Log Likelihood | 4.72145479 | Transformed Regression R-Square | 0.1039 |
Durbin-Watson | 1.9626 | Total R-Square | 0.7143 |
Observations | 33 |
Parameter Estimates | |||||
Variable | DF | Estimate | Standard Error | t Value | Approx Pr > |t| |
Intercept | 1 | 1.8506 | 0.7859 | 2.35 | 0.0255 |
setRDPIPC | 1 | 0.0000467 | 0.0000271 | 1.72 | 0.0952 |
AR1 | 1 | -0.6606 | 0.1842 | -3.59 | 0.0012 |
AR2 | 1 | -0.1491 | 0.1933 | -0.77 | 0.4468 |
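A quick way to read the corrected table is to recompute each t value as estimate divided by standard error and compare the reported p-values against the usual thresholds. The sketch below does that; the dictionary layout and helper name are illustrative assumptions.

```python
# (estimate, standard error, approximate p-value) from the corrected model.
ESTIMATES = {
    "Intercept": (1.8506, 0.7859, 0.0255),
    "setRDPIPC": (0.0000467, 0.0000271, 0.0952),
    "AR1": (-0.6606, 0.1842, 0.0012),
    "AR2": (-0.1491, 0.1933, 0.4468),
}


def significant(p_value, alpha):
    """True when the reported p-value falls below the chosen threshold."""
    return p_value < alpha


for name, (estimate, std_error, p_value) in ESTIMATES.items():
    t_value = estimate / std_error
    print(f"{name:10s} t = {t_value:6.2f}  "
          f"5%: {significant(p_value, 0.05)}  10%: {significant(p_value, 0.10)}")
```

Read this way, the income coefficient passes a 10% test but not a 5% test, and the second lag is not significant at either level. If I read the AUTOREG convention correctly, the AR parameters are reported with reversed sign, so the negative AR1 estimate corresponds to positive autocorrelation in the errors; treat that reading as an assumption.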
Possible Omitted Variable:
Given the low R^{2} value and the small effect of income on peanut butter consumption, it is safe to say that there are numerous possible omitted variables that contribute to peanut butter consumption. Foremost among these is jam/jelly/preserve consumption, as jam is as close to a perfect substitute for peanut butter as possible. Unfortunately, the dataset for jelly, jam, and preserves over these years is not available in a complete and accessible form, so it can only be assumed that including it would make the model a better fit for the dependent variable. However, it must be noted that any variable added to this model will increase R^{2} without necessarily making the model a better fit for the data.
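The caution about extra variables can be made concrete with adjusted R^{2}, which penalizes each added regressor. The sketch below applies the standard formula to the reported model and to a hypothetical two-variable model; the R^{2} of 0.2600 for the second model is invented purely to illustrate the penalty and is not an estimate of anything.

```python
def adjusted_r_squared(r_squared, n_obs, n_regressors):
    """Adjusted R-squared for a model that includes an intercept."""
    return 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_regressors - 1)


N_OBS = 33  # observations used in the regression

# Reported model: one regressor, R-Square = 0.2540.
current = adjusted_r_squared(0.2540, N_OBS, 1)

# Hypothetical second regressor that lifts R-Square only to 0.2600.
# This figure is made up to show how a weak variable is penalized.
with_weak_extra = adjusted_r_squared(0.2600, N_OBS, 2)

print(f"One regressor:             {current:.4f}")
print(f"Two regressors (made up):  {with_weak_extra:.4f}")
```

In this illustration the plain R^{2} goes up while the adjusted value goes down, which is the sense in which a higher R^{2} alone does not show that an added variable belongs in the model.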
Conclusion and Engel Graph:
The conclusion that we can draw from this data is that a $1 rise in real disposable income per capita is associated with a rise of about .00005 pounds in peanut butter consumption per capita (.0000467 in the autocorrelation-corrected model), or roughly .005 pounds for every $100. As we have noted before, given the relatively small R^{2} value of the original regression and the small effect of real disposable income per capita on peanut butter consumption, we can assume that some significant variables are omitted from this regression and that including them would greatly help its credibility, even after the correction for autocorrelation.
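Because the size of the income effect depends on the units in which income is measured, it helps to scale the coefficient explicitly. The sketch below uses the per-dollar estimates from the two PROC AUTOREG tables; the helper name is an illustrative assumption, and the consumption unit (pounds per capita) is taken from the conclusion above.

```python
# Per-dollar income coefficients from the two PROC AUTOREG tables.
OLS_PER_DOLLAR = 0.0000363
AR2_PER_DOLLAR = 0.0000467


def consumption_change(coef_per_dollar, income_change_dollars):
    """Change in per-capita consumption implied by a linear income effect."""
    return coef_per_dollar * income_change_dollars


for rise in (1, 100, 1000):
    ols = consumption_change(OLS_PER_DOLLAR, rise)
    ar2 = consumption_change(AR2_PER_DOLLAR, rise)
    print(f"${rise:>5}: OLS {ols:.7f}, AR(2)-corrected {ar2:.7f}")
```

Scaled to $100, the OLS figure matches the 0.00363 reported by PROC REG, where income was entered in hundreds of dollars.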