To estimate an aggregate Engel curve, I modeled the effect of median household income on the consumption of food outside of the home. Food consumed outside the home includes purchases at eating and drinking places (including tips); hotels and motels (including tips); retail stores and direct selling (including vending machines); and recreational places such as movie theaters, bowling alleys, pool halls, sports arenas, camps, amusement parks, and golf and country clubs (including concessions). Food consumption data were sourced from the USDA Economic Research Service, and all income data were sourced from the St. Louis Federal Reserve Economic Database. All variables are annual time series in nominal terms covering 1984 to 2014.

The original model used median household income (HHINC) as the lone explanatory variable and food consumption outside the home (FOODOUT) as the dependent variable. It suggests that median household income has a positive effect on food consumption outside the home.

For this sample, a $1 increase in median household income was associated with a $17.080 million increase in the consumption of food outside of the home. The coefficient on median household income was highly statistically significant (p-value < .0001), and the regression as a whole fit very well (R2 = .955).

Ordinary Least Squares Estimates
SSE 3.92426E10 DFE 29
MSE 1353193700 Root MSE 36786
SBC 744.572464 AIC 741.70449
MAE 28496.0543 AICC 742.133061
MAPE 7.82222584 HQC 742.639378
Durbin-Watson 0.1352 Total R-Square 0.9554

 

Parameter Estimates
Variable   DF  Estimate  Standard Error  t Value  Approx Pr > |t|  Variable Label
Intercept   1   -266908           27543    -9.69           <.0001
HHINC       1   17.0805          0.6849    24.94           <.0001  HHINC
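
As a rough illustration of how estimates like those in the table above are produced, the following Python sketch fits a one-regressor model by ordinary least squares using NumPy. The data are synthetic stand-ins, not the USDA or St. Louis Fed series, and the function name ols_fit and the generating values are assumptions made only for this example.

```python
import numpy as np

def ols_fit(y, X):
    """Least-squares fit of y on X; X must already contain a constant column."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    n, k = X.shape
    sse = float(resid @ resid)                 # sum of squared errors
    mse = sse / (n - k)                        # error variance estimate
    se = np.sqrt(np.diag(mse * np.linalg.inv(X.T @ X)))
    sst = float(((y - y.mean()) ** 2).sum())   # total sum of squares
    return {"beta": beta, "se": se, "t": beta / se,
            "sse": sse, "dfe": n - k, "r2": 1.0 - sse / sst, "resid": resid}

# Synthetic data: 31 annual observations, as in a 1984 to 2014 sample.
rng = np.random.default_rng(0)
hhinc = np.linspace(22000.0, 54000.0, 31)                  # hypothetical income path
foodout = -260000.0 + 17.0 * hhinc + rng.normal(0.0, 30000.0, 31)

X = np.column_stack([np.ones_like(hhinc), hhinc])
fit = ols_fit(foodout, X)
print("coefficients:", fit["beta"])
print("t values:    ", fit["t"])
print("DFE:", fit["dfe"], "R-square:", round(fit["r2"], 4))
```

The t value in each row is simply the estimate divided by its standard error, which is why 17.0805 / 0.6849 gives the reported 24.94.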

While the model appears to fit well, the Durbin-Watson statistic (0.1352) is far below 2, and Godfrey's test for serial correlation indicated that autocorrelation was very likely present.

Godfrey’s Serial Correlation Test
Alternative LM Pr > LM
AR(1) 25.7915 <.0001
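
The two diagnostics used here can be sketched in a few lines of Python. This is one common textbook form of the Durbin-Watson statistic and of the Breusch-Godfrey LM test for AR(1) errors; it is not claimed to match the software's internal computation exactly. The data are synthetic, and the function names and the zero-filled pre-sample residual are assumptions of this sketch.

```python
import math
import numpy as np

def durbin_watson(resid):
    """Sum of squared first differences of the residuals over their sum of squares."""
    d = np.diff(resid)
    return float(d @ d / (resid @ resid))

def godfrey_lm_ar1(resid, X):
    """n * R^2 from regressing OLS residuals on X plus the residuals lagged once.
    X must include a constant; the missing pre-sample residual is set to zero."""
    lagged = np.concatenate([[0.0], resid[:-1]])
    Z = np.column_stack([X, lagged])
    gamma, *_ = np.linalg.lstsq(Z, resid, rcond=None)
    u = resid - Z @ gamma
    r2 = 1.0 - float(u @ u) / float(resid @ resid)
    return len(resid) * r2

def chi2_sf_1df(stat):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(max(stat, 0.0) / 2.0))

# Synthetic regression with strongly autocorrelated errors (rho = 0.9).
rng = np.random.default_rng(1)
n = 31
x = np.linspace(22000.0, 54000.0, n)
e = np.zeros(n)
for t in range(1, n):
    e[t] = 0.9 * e[t - 1] + rng.normal(0.0, 10000.0)
y = -260000.0 + 17.0 * x + e

X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta

dw = durbin_watson(resid)
lm = godfrey_lm_ar1(resid, X)
print("Durbin-Watson:", round(dw, 4))
print("LM:", round(lm, 4), "Pr > LM:", chi2_sf_1df(lm))

# The LM value reported above maps to a p-value well below .0001.
print(chi2_sf_1df(25.7915))
```

A Durbin-Watson value near 2 and a large Pr > LM both point to no first-order autocorrelation; values near 0 and a small Pr > LM point the other way.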

 

To try to correct the autocorrelation, I re-estimated the model with a one-period lag in the error term (an AR(1) correction, reported below as Yule-Walker estimates).

This time the coefficient on HHINC fell to 16.751, meaning a $1 increase in median household income was associated with a $16.751 million increase in the consumption of food outside of the home, and it was still highly statistically significant (p-value < .0001). The overall fit of the model increased as well (R2 = .992). However, Godfrey's test again suggested autocorrelation (p-value = .0034), although less strongly than before.
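
The idea behind a one-lag correction of this kind can be sketched as a two-step procedure: estimate the first-order autocorrelation of the OLS residuals, then re-run least squares on quasi-differenced data. The Python below is a minimal sketch under that assumption (a Prais-Winsten style transform that keeps the first observation). It uses synthetic data and hypothetical function names, and it is not presented as the exact routine that produced the Yule-Walker output in this report.

```python
import numpy as np

def ols_beta(y, X):
    """Least-squares coefficients of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def ar1_two_step(y, X):
    """Two-step AR(1) correction: estimate rho from OLS residuals,
    then fit OLS on quasi-differenced data, keeping the first observation."""
    e = y - X @ ols_beta(y, X)
    rho = float(e[1:] @ e[:-1] / (e @ e))      # lag-1 autocorrelation of residuals
    w = np.sqrt(1.0 - rho ** 2)                # scaling for the first observation
    y_star = np.concatenate([[w * y[0]], y[1:] - rho * y[:-1]])
    X_star = np.vstack([w * X[0], X[1:] - rho * X[:-1]])
    return rho, ols_beta(y_star, X_star)

# Synthetic data with AR(1) errors (rho = 0.8); values are illustrative only.
rng = np.random.default_rng(2)
n = 31
x = np.linspace(22000.0, 54000.0, n)
e = np.zeros(n)
for t in range(1, n):
    e[t] = 0.8 * e[t - 1] + rng.normal(0.0, 10000.0)
y = -260000.0 + 17.0 * x + e

X = np.column_stack([np.ones(n), x])
rho_hat, beta_gls = ar1_two_step(y, X)
print("estimated rho:", round(rho_hat, 4))
print("corrected coefficients:", beta_gls)
```

Because the constant column is transformed along with the regressor, the coefficients from the transformed regression are already on the original scale and can be compared directly with the OLS estimates.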

Yule-Walker Estimates
SSE 6592247328 DFE 28
MSE 235437405 Root MSE 15344
SBC 693.830079 AIC 689.528117
MAE 11186.182 AICC 690.417006
MAPE 3.03299095 HQC 690.93045
Durbin-Watson 0.9637 Transformed Regression R-Square 0.8970
  Total R-Square 0.9925

 

Parameter Estimates
Variable   DF  Estimate  Standard Error  t Value  Approx Pr > |t|  Variable Label
Intercept   1   -239605           43793    -5.47           <.0001
HHINC       1   16.7510          1.0730    15.61           <.0001  HHINC

 

Godfrey’s Serial Correlation Test
Alternative LM Pr > LM
AR(1) 8.5665 0.0034

 

I decided that adding another explanatory variable might alleviate or eliminate the remaining autocorrelation. I added Disposable Income (DSPI) to the model and ran it again.

The new model suggests that Median Household Income has a negative effect on food consumption outside the home and that Disposable Income has a positive effect. For this sample, a $1 increase in Median Household Income was associated with a $2.924 million decrease in the consumption of food outside of the home, holding all else constant; the coefficient was statistically significant (p-value = .0193). A $1 billion increase in Disposable Income was associated with a $62.430 million increase in food consumption outside the home and was highly statistically significant (p-value < .0001). The regression as a whole fit very well (R2 = .996).

Ordinary Least Squares Estimates
SSE 3377634499 DFE 28
MSE 120629804 Root MSE 10983
SBC 671.976232 AIC 667.674271
MAE 8141.92129 AICC 668.56316
MAPE 2.18470345 HQC 669.076603
Durbin-Watson 0.8857 Total R-Square 0.9962

 

Parameter Estimates
Variable   DF  Estimate  Standard Error  t Value  Approx Pr > |t|  Variable Label
Intercept   1     53135           20301     2.62           0.0141
HHINC       1   -2.9237          1.1780    -2.48           0.0193  HHINC
DSPI        1   62.4304          3.6207    17.24           <.0001  DSPI

 

The model fit well; however, autocorrelation was again shown to be present.

Godfrey’s Serial Correlation Test
Alternative LM Pr > LM
AR(1) 8.4047 0.0037

 

Applying the one-period lag correction this time brought the Godfrey test p-value to .0589, just above the .05 threshold. However, Median Household Income became insignificant (p-value = .5715), and the coefficient on DSPI decreased from 62.430 to 56.541 while remaining highly significant (p-value < .0001). The overall fit of the model also increased (R2 = .997).

Yule-Walker Estimates
SSE 2306449814 DFE 27
MSE 85424067 Root MSE 9243
SBC 663.842237 AIC 658.106288
MAE 6325.5341 AICC 659.64475
MAPE 1.63994926 HQC 659.976065
Durbin-Watson 1.4618 Transformed Regression R-Square 0.9920
  Total R-Square 0.9974

 

Parameter Estimates
Variable   DF  Estimate  Standard Error  t Value  Approx Pr > |t|  Variable Label
Intercept   1     16395           25810     0.64           0.5306
HHINC       1   -0.8509          1.4854    -0.57           0.5715  HHINC
DSPI        1   56.5405          4.5768    12.35           <.0001  DSPI

 

Godfrey’s Serial Correlation Test
Alternative LM Pr > LM
AR(1) 3.5674 0.0589

 

For my final model I used Disposable Income as the only explanatory variable, because the negative coefficient on Median Household Income does not make economic sense and the variable became insignificant once the lag correction was applied.

This model suggests that there is a positive relationship between Disposable Income and food consumed outside of the home. For this sample, a $1 billion increase in Disposable Income was associated with a $53.581 million increase in the consumption of food outside of the home. The coefficient on Disposable Income was highly statistically significant (p-value < .0001), and the regression as a whole fit very well (R2 = .995). Predictably, autocorrelation was present, so the model was lagged one period.

With the lag correction, the coefficient on DSPI increased to 54.228 and remained highly significant (p-value < .0001). The lagged model also raised the Godfrey test p-value to .1337, so the hypothesis of no autocorrelation can no longer be rejected at conventional significance levels.

 

Yule-Walker Estimates
SSE 2060477141 DFE 28
MSE 73588469 Root MSE 8578
SBC 657.178358 AIC 652.876397
MAE 5956.08384 AICC 653.765285
MAPE 1.57073324 HQC 654.278729
Durbin-Watson 1.7321 Transformed Regression R-Square 0.9873
  Total R-Square 0.9977

 

Parameter Estimates
Variable   DF  Estimate  Standard Error  t Value  Approx Pr > |t|  Variable Label
Intercept   1      1063            9537     0.11           0.9121
DSPI        1   54.2278          1.1611    46.71           <.0001  DSPI

 

Godfrey’s Serial Correlation Test
Alternative LM Pr > LM
AR(1) 2.2496 0.1337

 

In conclusion, I believe the lagged Disposable Income model is the best performer of the three in estimating an aggregate Engel curve. The fact that the model suggests consumption increases as income increases indicates that food consumed outside the home is a normal good. The output graph might show a very slight convex bend, which would suggest it might be a luxury good.
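
One way to probe the normal versus luxury question is the income elasticity implied by the fitted line: a value between 0 and 1 is usually read as a normal necessity, and a value above 1 as a luxury. The Python sketch below evaluates the point elasticity of the lagged Disposable Income model using the reported intercept and slope. The DSPI levels are hypothetical round numbers chosen only for illustration, not values from the dataset, and the function name is an assumption.

```python
def income_elasticity(intercept, slope, income):
    """Point elasticity of a linear Engel curve: (dQ/dY) * (Y / Q)."""
    fitted = intercept + slope * income
    return slope * income / fitted

# Coefficients from the lagged Disposable Income model reported above.
INTERCEPT = 1063.0
SLOPE = 54.2278

# Hypothetical DSPI levels, in billions of dollars (illustrative only).
levels = (3000.0, 8000.0, 13000.0)
elasticities = [income_elasticity(INTERCEPT, SLOPE, d) for d in levels]
for d, el in zip(levels, elasticities):
    print(d, round(el, 4))
```

With a positive intercept, a linear fit always implies an elasticity slightly below 1, but the intercept here is not statistically different from zero (p-value = .9121), so the linear model by itself is consistent with an elasticity close to 1. Any evidence of a luxury good would have to come from curvature that a straight line cannot capture.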

Aggregate Engel Curve Estimation Using Income and Food Consumed Outside of the Home