This assignment required the use of the USDA’s Economic Research Service Food Availability System as well as the FRED Economic Database. The former was used to find peanut butter consumption values from 1980 to 2012, and the latter was used to find real disposable personal income per capita over the same years. This range was selected to give a large enough sample for the regression analysis, and 2012 is the year at which the research “cut off.” The dependent variable, peanut butter consumption, and the explanatory variable, real disposable personal income, will form the Engel Curve, as seen later in this post.

The regression analysis was done to measure income’s effect on peanut butter consumption and thereby determine what kind of good peanut butter is. The regression shows peanut butter to be a normal good: as income rises, demand (or consumption, in this case) rises as well. The result can be seen in the regression output below. The equation is:

PBC_i = β0 + β1 · setRDPIPC_i + errorterm_i

Real Disposable Income (in hundreds) per Capita on Peanut Butter Consumption per Capita

The REG Procedure

Model: MODEL1

Dependent Variable: PBC PBC

Number of Observations Read 50
Number of Observations Used 33
Number of Observations with Missing Values 17


Analysis of Variance
Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              1          1.25309       1.25309     10.55   0.0028
Error             31          3.68056       0.11873
Corrected Total   32          4.93365


Root MSE 0.34457 R-Square 0.2540
Dependent Mean 3.15007 Adj R-Sq 0.2299
Coeff Var 10.93846    


Parameter Estimates
Variable    Label       DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
Intercept   Intercept    1              2.10576          0.32700      6.44     <.0001
setRDPIPC                1              0.00363          0.00112      3.25     0.0028


With this information, we can fill in the coefficients of the simple linear regression equation.

PBC_i = 2.10576 + 0.00363 · setRDPIPC_i + errorterm_i
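To make the fitted equation concrete, here is a minimal Python sketch that evaluates it and backs out the income elasticity at the sample means. The function and variable names are hypothetical; the numbers are copied from the PROC REG output above, with income measured in hundreds of dollars. It relies on the fact that an OLS line with an intercept passes through the sample means, so the mean income is inferred from the output rather than taken from the raw data.

```python
# Reported PROC REG estimates; income is measured in hundreds of dollars.
INTERCEPT = 2.10576
SLOPE = 0.00363
MEAN_PBC = 3.15007  # "Dependent Mean" from the output


def predict_pbc(income_hundreds):
    """Fitted peanut butter consumption per capita at a given income level."""
    return INTERCEPT + SLOPE * income_hundreds


# An OLS line with an intercept passes through the sample means, so the
# mean income implied by the output can be backed out from the mean of PBC.
implied_mean_income = (MEAN_PBC - INTERCEPT) / SLOPE

# Income elasticity of consumption evaluated at the sample means.
elasticity_at_means = SLOPE * implied_mean_income / MEAN_PBC

print(f"Implied mean income (hundreds): {implied_mean_income:.1f}")
print(f"Income elasticity at the means: {elasticity_at_means:.3f}")
```

An elasticity between 0 and 1 is what we would expect for a normal good that is also a necessity, which is consistent with the positive slope in the output.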


The coefficient on real disposable personal income is significant at the 5% level (p = 0.0028). However, there are still problems with this model. As seen above, the R² value is rather small (0.2540), which suggests that the model does not fit the data well.
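As a quick sanity check, the summary statistics in the output can be recomputed from the Analysis of Variance table alone. The sketch below uses only numbers copied from that table; the variable names are my own.

```python
import math

# Values copied from the Analysis of Variance table.
ss_model, ss_error, ss_total = 1.25309, 3.68056, 4.93365
df_model, df_error, df_total = 1, 31, 32

# R-Square: share of total variation explained by the model.
r_squared = ss_model / ss_total

# Adjusted R-Square penalizes for the degrees of freedom used.
adj_r_squared = 1 - (1 - r_squared) * df_total / df_error

# Mean squared error, F statistic, and root MSE.
mse = ss_error / df_error
f_value = (ss_model / df_model) / mse
root_mse = math.sqrt(mse)

print(f"R-Square  {r_squared:.4f}")
print(f"Adj R-Sq  {adj_r_squared:.4f}")
print(f"F Value   {f_value:.2f}")
print(f"Root MSE  {root_mse:.5f}")
```

Up to rounding, these should line up with the R-Square, Adj R-Sq, F Value, and Root MSE figures that SAS reports.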

Autocorrelation / Durbin-Watson Test:

Real Disposable Income (in hundreds) per Capita on Peanut Butter Consumption per Capita

The AUTOREG Procedure

Ordinary Least Squares Estimates
SSE 3.68056122 DFE 31
MSE 0.11873 Root MSE 0.34457
SBC 28.2593619 AIC 25.2663468
MAE 0.27955637 AICC 25.6663468
MAPE 8.88405386 HQC 26.2734053
Durbin-Watson 0.4287 Total R-Square 0.2540


Parameter Estimates
Variable    DF    Estimate   Standard Error   t Value   Approx Pr > |t|
Intercept    1      2.1058           0.3270      6.44            <.0001
setRDPIPC    1   0.0000363        0.0000112      3.25            0.0028

Here, we can see that the Durbin-Watson statistic (0.4287) is far below its expected value of 2, suggesting heavy positive autocorrelation. Autocorrelation means that the error terms are correlated across time periods; in this case, errors from previous periods carry over into later ones. We correct for it by adding two autoregressive lags to the model (the AR1 and AR2 terms). The resulting Durbin-Watson value, 1.9626, is very close to 2. We can also see the parameter estimates for the lagged model below. Additionally, the Total R² value rises to a respectable .7143, meaning that 71.43% of the variation is explained by the model. Note, however, that the income coefficient is now significant only at the 10% level (p = 0.0952).
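For readers who have not met the Durbin-Watson statistic before, here is a minimal sketch of how it is computed from a series of residuals, along with the common rule of thumb DW ≈ 2(1 − ρ) for backing out the first-order autocorrelation ρ. The function names are hypothetical and the residual series are toy numbers made up for illustration, not the peanut butter data; only the two DW values at the end come from the SAS output.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences of the residuals,
    divided by the sum of squared residuals."""
    if len(residuals) < 2:
        raise ValueError("need at least two residuals")
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2
        for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


def implied_rho(dw):
    """Rough first-order autocorrelation implied by DW ~ 2 * (1 - rho)."""
    return 1 - dw / 2


# Toy residuals, made up for illustration only.
persistent = [1.0, 0.9, 0.8, 0.7, -0.6, -0.7, -0.8, -0.9]  # long runs of one sign
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]            # sign flips every period

print(durbin_watson(persistent))    # well below 2: positive autocorrelation
print(durbin_watson(alternating))   # well above 2: negative autocorrelation

# Durbin-Watson values reported by PROC AUTOREG before and after the correction.
print(implied_rho(0.4287))
print(implied_rho(1.9626))
```

By this rule of thumb, the original DW of 0.4287 corresponds to a first-order autocorrelation of roughly 0.79, while the corrected value of 1.9626 corresponds to one near zero.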

Real Disposable Income (in hundreds) per Capita on Peanut Butter Consumption per Capita

The AUTOREG Procedure

Maximum Likelihood Estimates
SSE 1.40937588 DFE 29
MSE 0.04860 Root MSE 0.22045
SBC 4.54312068 AIC -1.4429096
MAE 0.15374566 AICC -0.0143381
MAPE 4.89358882 HQC 0.57120748
Log Likelihood 4.72145479 Transformed Regression R-Square 0.1039
Durbin-Watson 1.9626 Total R-Square 0.7143
    Observations 33


Parameter Estimates
Variable    DF    Estimate   Standard Error   t Value   Approx Pr > |t|
Intercept    1      1.8506           0.7859      2.35            0.0255
setRDPIPC    1   0.0000467        0.0000271      1.72            0.0952
AR1          1     -0.6606           0.1842     -3.59            0.0012
AR2          1     -0.1491           0.1933     -0.77            0.4468
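The jump in fit can be checked by hand. The sketch below assumes that Total R-Square is computed as 1 − SSE/SST, where SST is the corrected total sum of squares of PBC from the PROC REG output; the helper names are my own. Keep in mind that much of the improvement comes from the lagged error terms rather than from income itself, which is why the Transformed Regression R-Square is only 0.1039.

```python
SST = 4.93365  # corrected total sum of squares of PBC (PROC REG output)

# SSE and error degrees of freedom copied from the two PROC AUTOREG tables.
fits = {
    "OLS": {"sse": 3.68056122, "dfe": 31},
    "AR(2)": {"sse": 1.40937588, "dfe": 29},
}


def total_r_square(sse, sst=SST):
    """One minus the share of variation in PBC left unexplained."""
    return 1 - sse / sst


def mean_squared_error(sse, dfe):
    """Sum of squared errors divided by the error degrees of freedom."""
    return sse / dfe


for name, fit in fits.items():
    r2 = total_r_square(fit["sse"])
    mse = mean_squared_error(fit["sse"], fit["dfe"])
    print(f"{name}: Total R-Square {r2:.4f}, MSE {mse:.5f}")
```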



Possible Omitted Variable:

Given the low R² value and the small effect of income on peanut butter consumption, it is safe to say that there are numerous possible omitted variables that contribute to peanut butter consumption. Foremost among these is jam/jelly/preserve consumption, as jam is about as close to a perfect substitute for peanut butter as possible. Unfortunately, the dataset for jelly, jam, and preserves over these years is not available in a complete and accessible form, so we can only assume that including it would make the model a better fit for the dependent variable. However, it must be noted that adding any variable to this model will increase R², while not necessarily making the model a better fit for the data.

Conclusion and Engel Graph:

The conclusion that we can draw from this data is that a $1 rise in real disposable income per capita is associated with a rise of roughly 0.00004 to 0.00005 pounds in peanut butter consumption per capita (about 0.004 to 0.005 pounds per $100), depending on whether the OLS estimate or the autocorrelation-corrected estimate is used. As we have noted before, given the relatively small R² value and the small effect of real disposable income per capita on peanut butter consumption, we can assume that some significant variables are omitted from this regression, and that including them would greatly help its credibility, even after correcting for autocorrelation.
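To put the size of the effect in perspective, the short sketch below scales the slope estimates to a few income changes. It assumes that the PROC AUTOREG estimates are per dollar of income, as the factor of 100 between them and the PROC REG estimate (which is per hundred dollars) suggests; the function name is hypothetical.

```python
# Slope estimates copied from the two PROC AUTOREG tables, assumed to be
# per $1 of real disposable income per capita.
SLOPE_OLS = 0.0000363
SLOPE_AR2 = 0.0000467


def consumption_change(slope_per_dollar, income_rise_dollars):
    """Predicted change in pounds of peanut butter consumed per capita."""
    return slope_per_dollar * income_rise_dollars


for rise in (1, 100, 1000):
    ols = consumption_change(SLOPE_OLS, rise)
    ar2 = consumption_change(SLOPE_AR2, rise)
    print(f"${rise:>5}: OLS {ols:.5f} lb, AR(2) {ar2:.5f} lb")
```

Even a $1,000 rise in income corresponds to well under a tenth of a pound of additional peanut butter per person under either estimate.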

Real Disposable Income Per Capita on Peanut Butter Consumption