To estimate an aggregate Engel curve, I modeled the effect of median household income on the consumption of food outside the home. Food consumed outside the home includes spending at eating and drinking places (including tips); hotels and motels (including tips); retail stores and direct selling (including vending machines); and recreational places such as movie theaters, bowling alleys, pool halls, sports arenas, camps, amusement parks, and golf and country clubs (including concessions). Food consumption data were sourced from the USDA Economic Research Service, and the income series were sourced from the St. Louis Federal Reserve Economic Database. All variables are annual time series in nominal terms, covering 1984 to 2014.
The initial model uses median household income (HHINC) as the sole explanatory variable and food consumption outside the home (FOODOUT) as the dependent variable. It suggests that median household income has a positive effect on food consumption outside the home.
For this sample, a $1 increase in median household income was associated with a $17.080 million increase in the consumption of food outside the home. The coefficient on median household income was highly statistically significant (p-value < .0001), and the regression as a whole fit very well (R^{2}=.955).
Ordinary Least Squares Estimates
SSE | 3.92426E10 | DFE | 29
MSE | 1353193700 | Root MSE | 36786
SBC | 744.572464 | AIC | 741.70449
MAE | 28496.0543 | AICC | 742.133061
MAPE | 7.82222584 | HQC | 742.639378
Durbin-Watson | 0.1352 | Total R-Square | 0.9554

Parameter Estimates
Variable | DF | Estimate | Standard Error | t Value | Approx Pr > |t| | Variable Label
Intercept | 1 | -266908 | 27543 | -9.69 | <.0001 |
HHINC | 1 | 17.0805 | 0.6849 | 24.94 | <.0001 | HHINC
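As an illustration of the single-regressor fit reported above, the following is a minimal sketch of simple OLS in plain Python. The data are synthetic stand-ins generated from the reported coefficients (they are not the USDA or FRED series), and the function name `simple_ols` and the noise level are assumptions made only for this example.

```python
import random

def simple_ols(x, y):
    """Return (intercept, slope, r_squared) for y = a + b*x by least squares."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return intercept, slope, 1 - sse / sst

# Synthetic stand-in data, NOT the real series: 31 annual observations.
random.seed(0)
hhinc = [22000 + 1000 * t for t in range(31)]            # hypothetical income, dollars
foodout = [-266908 + 17.08 * x + random.gauss(0, 20000)  # hypothetical spending, $ millions
           for x in hhinc]

intercept, slope, r2 = simple_ols(hhinc, foodout)
print(f"intercept={intercept:.0f}  slope={slope:.3f}  R^2={r2:.3f}")
```

With HHINC in dollars and FOODOUT in millions of dollars, the slope reads directly as millions of dollars of food spending per $1 of median household income, which is how the 17.0805 estimate is interpreted in the text.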
Although the model appears to fit well, Godfrey’s test for serial correlation strongly rejected the null hypothesis of no autocorrelation, and the very low Durbin-Watson statistic (0.1352) points the same way.
Godfrey’s Serial Correlation Test
Alternative | LM | Pr > LM
AR(1) | 25.7915 | <.0001
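The two diagnostics used here can be sketched in a few lines. The Durbin-Watson statistic is approximately 2(1 - ρ), where ρ is the lag-1 autocorrelation of the residuals, so the reported 0.1352 corresponds to ρ of roughly 1 - 0.1352/2 ≈ 0.93. The Godfrey LM statistic for AR(1) is n times the R-square from regressing the OLS residuals on the original regressors plus the lagged residual. The sketch below assumes the presample residual is set to zero and runs on synthetic data in hypothetical units (income in thousands of dollars, spending in billions of dollars); the helper names are illustrative, and it is not a reproduction of the SAS routine.

```python
import random

def solve(a, b):
    """Solve the linear system a z = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols_residuals(rows, y):
    """Regress y on the columns in rows (each row includes 1.0 for the intercept)."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    return [yi - sum(b * v for b, v in zip(beta, r)) for r, yi in zip(rows, y)]

def durbin_watson(e):
    """Sum of squared first differences of the residuals over their sum of squares."""
    return sum((e[t] - e[t - 1]) ** 2 for t in range(1, len(e))) / sum(v * v for v in e)

def godfrey_lm_ar1(rows, y):
    """LM = n * R^2 from regressing residuals on the regressors and the lagged residual."""
    e = ols_residuals(rows, y)
    lagged = [0.0] + e[:-1]                      # assumption: presample residual is zero
    u = ols_residuals([r + [l] for r, l in zip(rows, lagged)], e)
    # e has mean zero (intercept included), so sum(e^2) is its total sum of squares
    return len(e) * (1 - sum(v * v for v in u) / sum(v * v for v in e))

# Synthetic data with strongly autocorrelated errors, NOT the real series.
random.seed(1)
n = 31
x = [22.0 + t for t in range(n)]                 # hypothetical income, $ thousands
err, errors = 0.0, []
for _ in range(n):
    err = 0.9 * err + random.gauss(0, 10)        # AR(1) error with rho = 0.9
    errors.append(err)
y = [-267 + 17.08 * xi + ei for xi, ei in zip(x, errors)]  # hypothetical, $ billions

rows = [[1.0, xi] for xi in x]
dw = durbin_watson(ols_residuals(rows, y))
lm = godfrey_lm_ar1(rows, y)
print(f"Durbin-Watson={dw:.3f}  implied rho={1 - dw / 2:.3f}  Godfrey LM={lm:.3f}")
```

A Durbin-Watson value well below 2 together with a large LM statistic is the pattern seen in the output above.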
To correct for the autocorrelation, I lagged the model one period, that is, I re-estimated it with a first-order autoregressive (AR(1)) error term, which produces the Yule-Walker estimates below.
This time the coefficient on HHINC fell to 16.751, meaning a $1 increase in median household income was associated with a $16.751 million increase in the consumption of food outside the home; it remained highly statistically significant (p-value < .0001). The overall fit of the model increased as well (total R^{2}=.9925). However, Godfrey’s test again suggested autocorrelation (p-value = .0034), although less strongly than before.
Yule-Walker Estimates
SSE | 6592247328 | DFE | 28
MSE | 235437405 | Root MSE | 15344
SBC | 693.830079 | AIC | 689.528117
MAE | 11186.182 | AICC | 690.417006
MAPE | 3.03299095 | HQC | 690.93045
Durbin-Watson | 0.9637 | Transformed Regression R-Square | 0.8970
Total R-Square | 0.9925

Parameter Estimates
Variable | DF | Estimate | Standard Error | t Value | Approx Pr > |t| | Variable Label
Intercept | 1 | -239605 | 43793 | -5.47 | <.0001 |
HHINC | 1 | 16.7510 | 1.0730 | 15.61 | <.0001 | HHINC

Godfrey’s Serial Correlation Test
Alternative | LM | Pr > LM
AR(1) | 8.5665 | 0.0034
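The idea behind the Yule-Walker estimates can be sketched as a two-step procedure: fit OLS, estimate the AR(1) coefficient from the lag-1 autocorrelation of the residuals, then re-fit on quasi-differenced data. The version below is a simplified illustration under stated assumptions (a single regressor, a Prais-Winsten style weight on the first observation, synthetic data in hypothetical units); it is not claimed to match the SAS implementation step for step, and the function names are made up for the example.

```python
import math
import random

def ols_two_columns(c, x, y):
    """Least squares of y on columns c and x with no extra intercept; returns both coefficients."""
    scc = sum(v * v for v in c)
    sxx = sum(v * v for v in x)
    scx = sum(a * b for a, b in zip(c, x))
    scy = sum(a * b for a, b in zip(c, y))
    sxy = sum(a * b for a, b in zip(x, y))
    det = scc * sxx - scx * scx
    return (scy * sxx - scx * sxy) / det, (scc * sxy - scx * scy) / det

def ar1_two_step(x, y):
    """Return (rho, intercept, slope) from an AR(1) feasible-GLS re-fit."""
    n = len(x)
    a, b = ols_two_columns([1.0] * n, x, y)              # step 1: plain OLS
    e = [yi - a - b * xi for xi, yi in zip(x, y)]
    # Lag-1 autocorrelation of the residuals (the Yule-Walker solution for AR(1))
    rho = sum(e[t] * e[t - 1] for t in range(1, n)) / sum(v * v for v in e)
    w = math.sqrt(1 - rho * rho)                         # weight for the first observation
    c_star = [w] + [1 - rho] * (n - 1)                   # transformed intercept column
    x_star = [w * x[0]] + [x[t] - rho * x[t - 1] for t in range(1, n)]
    y_star = [w * y[0]] + [y[t] - rho * y[t - 1] for t in range(1, n)]
    a_gls, b_gls = ols_two_columns(c_star, x_star, y_star)  # step 2: OLS on transformed data
    return rho, a_gls, b_gls

# Synthetic data with AR(1) errors, NOT the real series.
random.seed(2)
n = 31
x = [22.0 + t for t in range(n)]                         # hypothetical income, $ thousands
err, errors = 0.0, []
for _ in range(n):
    err = 0.9 * err + random.gauss(0, 5)
    errors.append(err)
y = [-267 + 17.08 * xi + ei for xi, ei in zip(x, errors)]  # hypothetical, $ billions

rho, a_gls, b_gls = ar1_two_step(x, y)
print(f"rho={rho:.3f}  intercept={a_gls:.1f}  slope={b_gls:.3f}")
```

Because the transformation removes only first-order dependence, some autocorrelation can remain afterwards, which is consistent with the Godfrey test still rejecting in the output above.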
I then considered whether adding another explanatory variable could alleviate or eliminate these problems. I added disposable income (DSPI) to the model and ran it again.
The new model suggests that median household income has a negative effect on food consumption outside the home and that disposable income has a positive effect. For this sample, a $1 increase in median household income was associated with a $2.924 million decrease in the consumption of food outside the home, holding all else constant. The coefficient on median household income was statistically significant (p-value = .0193). A $1 billion increase in disposable income was associated with a $62.430 million increase in food consumption, holding all else constant, and was highly statistically significant (p-value < .0001). The regression as a whole fit very well (R^{2}=.996).
Ordinary Least Squares Estimates
SSE | 3377634499 | DFE | 28
MSE | 120629804 | Root MSE | 10983
SBC | 671.976232 | AIC | 667.674271
MAE | 8141.92129 | AICC | 668.56316
MAPE | 2.18470345 | HQC | 669.076603
Durbin-Watson | 0.8857 | Total R-Square | 0.9962

Parameter Estimates
Variable | DF | Estimate | Standard Error | t Value | Approx Pr > |t| | Variable Label
Intercept | 1 | 53135 | 20301 | 2.62 | 0.0141 |
HHINC | 1 | -2.9237 | 1.1780 | -2.48 | 0.0193 | HHINC
DSPI | 1 | 62.4304 | 3.6207 | 17.24 | <.0001 | DSPI
The model fit well; however, autocorrelation was again shown to be present.
Godfrey’s Serial Correlation Test
Alternative | LM | Pr > LM
AR(1) | 8.4047 | 0.0037
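Lagging the variables this time (the Yule-Walker estimates below) raised the Godfrey test p-value to .0589, just above the conventional .05 threshold, so the null hypothesis of no autocorrelation could no longer be rejected at the 5% level. However, median household income became insignificant (p-value = .5715), and the coefficient on DSPI decreased from $62.430 million to $56.541 million while remaining highly significant (p-value < .0001). The overall fit of the model also increased (R^{2}=.997).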
Yule-Walker Estimates
SSE | 2306449814 | DFE | 27
MSE | 85424067 | Root MSE | 9243
SBC | 663.842237 | AIC | 658.106288
MAE | 6325.5341 | AICC | 659.64475
MAPE | 1.63994926 | HQC | 659.976065
Durbin-Watson | 1.4618 | Transformed Regression R-Square | 0.9920
Total R-Square | 0.9974

Parameter Estimates
Variable | DF | Estimate | Standard Error | t Value | Approx Pr > |t| | Variable Label
Intercept | 1 | 16395 | 25810 | 0.64 | 0.5306 |
HHINC | 1 | -0.8509 | 1.4854 | -0.57 | 0.5715 | HHINC
DSPI | 1 | 56.5405 | 4.5768 | 12.35 | <.0001 | DSPI

Godfrey’s Serial Correlation Test
Alternative | LM | Pr > LM
AR(1) | 3.5674 | 0.0589
For my final model I used disposable income as the only explanatory variable, because the negative coefficient on median household income does not make economic sense and the variable became insignificant in the lagged model.
This model suggests a positive relationship between disposable income and food consumed outside the home. For this sample, a $1 billion increase in disposable income was associated with a $53.581 million increase in the consumption of food outside the home. The coefficient on disposable income was highly statistically significant (p-value < .0001), and the regression as a whole fit very well (R^{2}=.995). Predictably, autocorrelation was present, so the variables were lagged.
After lagging, the coefficient on DSPI increased to $54.228 million and remained highly significant (p-value < .0001). The lagged model also raised the Godfrey test p-value to .1337, so the null hypothesis of no autocorrelation is not rejected at conventional significance levels.
Yule-Walker Estimates
SSE | 2060477141 | DFE | 28
MSE | 73588469 | Root MSE | 8578
SBC | 657.178358 | AIC | 652.876397
MAE | 5956.08384 | AICC | 653.765285
MAPE | 1.57073324 | HQC | 654.278729
Durbin-Watson | 1.7321 | Transformed Regression R-Square | 0.9873
Total R-Square | 0.9977

Parameter Estimates
Variable | DF | Estimate | Standard Error | t Value | Approx Pr > |t| | Variable Label
Intercept | 1 | 1063 | 9537 | 0.11 | 0.9121 |
DSPI | 1 | 54.2278 | 1.1611 | 46.71 | <.0001 | DSPI

Godfrey’s Serial Correlation Test
Alternative | LM | Pr > LM
AR(1) | 2.2496 | 0.1337
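The "Pr > LM" values in the Godfrey tables can be checked by hand. Under the null hypothesis of no autocorrelation, the AR(1) LM statistic is asymptotically chi-square with one degree of freedom, and for one degree of freedom the upper-tail probability equals erfc(sqrt(LM / 2)). The sketch below applies that identity to the LM values reported in this paper; the function name is an assumption for the example.

```python
import math

def godfrey_p_value(lm):
    """Upper-tail probability of a chi-square(1) variable at the value lm."""
    return math.erfc(math.sqrt(lm / 2.0))

# LM statistics copied from the Godfrey tables in this paper
reported = {
    "HHINC, OLS": 25.7915,
    "HHINC, lagged": 8.5665,
    "HHINC + DSPI, OLS": 8.4047,
    "HHINC + DSPI, lagged": 3.5674,
    "DSPI, lagged": 2.2496,
}
for model, lm in reported.items():
    print(f"{model:22s} LM={lm:8.4f}  p={godfrey_p_value(lm):.4f}")
```

Only the last two models give p-values above .05, which matches the reading in the text that the lagged specifications with DSPI bring autocorrelation down to an acceptable level.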
In conclusion, I believe the lagged disposable income model is the best performer of the three in estimating an aggregate Engel curve. The fact that the model suggests consumption increases as income increases indicates that food consumed outside the home is a normal good. The output graph might show a very slight convex downward bend, which would suggest it might be a luxury good.
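One way to make the normal versus luxury distinction concrete is through the income elasticity implied by a linear Engel curve. The derivation below is a sketch that assumes the linear form used in the final model, with \alpha and \beta standing for its intercept and slope; it is an added illustration, not part of the estimated output.

```latex
\begin{align*}
\text{FOODOUT} &= \alpha + \beta\,\text{DSPI}, \qquad \beta > 0 \\
\eta &= \frac{d\,\text{FOODOUT}}{d\,\text{DSPI}} \cdot \frac{\text{DSPI}}{\text{FOODOUT}}
      = \frac{\beta\,\text{DSPI}}{\alpha + \beta\,\text{DSPI}} \\
\eta &> 0 \quad \text{(normal good, since } \beta > 0 \text{ and FOODOUT} > 0\text{)} \\
\eta &> 1 \iff \alpha < 0 \quad \text{(luxury)}, \qquad
\eta < 1 \iff \alpha > 0 \quad \text{(necessity)}
\end{align*}
```

In the final model the estimated intercept is 1063 with a standard error of 9537 (p-value = .9121), so it is statistically indistinguishable from zero and the implied elasticity is close to 1. On this reading the linear fit supports the normal-good conclusion but cannot by itself settle whether the good is a luxury; a specification that allows curvature, such as a log-log form, would be one way to examine that.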