For this assignment I used per capita real disposable personal income from FRED as my measure of overall income for the United States. The good that I decided to examine is total alcohol consumption per capita, measured in gallons of ethanol. I obtained this information from Statista. I used a time series approach for my analysis of the Engel curve, with data running from 1990 through 2014.

I expected the curve to be upward sloping because I think that alcohol is a normal good. In general, people who are rich do not find alcohol less appealing than those with less money. However, certain brands or types of alcoholic beverages may be inferior goods. If people without much money suddenly became substantially wealthier, they might stop drinking cheap brands of beer such as Busch and switch to expensive wine instead. Because I do not differentiate between brands or types of alcoholic beverages in my regression analysis, I am unable to test for this. I put income on the Y-axis and ethanol on the X-axis, following the convention of drawing Engel curves with income on the Y-axis, even though this made income the dependent variable in the regression. Although ethanol should be the dependent variable in this situation, the results obtained with the two variables switched are still informative.

The year 1990 has a much higher value for ethanol consumption than any of the other years and appears to be an outlier. Outliers are a problem in least squares regression because a single extreme observation can pull the regression line substantially in its direction, even though it is only one observation among many. To refine my estimate of the Engel curve, I removed that data point and ran the regression again, using only the years 1991 through 2014. I expected that this would improve the fit of the model substantially. The output for the full sample (1990 through 2014) is shown first.
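The effect of a single outlier on a least squares line can be illustrated with a minimal sketch. The helper `simple_ols` and the toy numbers below are hypothetical and made up for illustration only; they are not the FRED or Statista data, and the regressions in this paper were run in SAS, not Python.

```python
def simple_ols(x, y):
    """Fit y = intercept + slope * x by least squares.

    Returns (intercept, slope, r_square).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    s_yy = sum((yi - mean_y) ** 2 for yi in y)
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    r_square = s_xy ** 2 / (s_xx * s_yy)
    return intercept, slope, r_square


# Made-up toy numbers (NOT the actual data): the last five points lie
# exactly on y = 1 + 2x, while the first point sits far above that line.
toy_x = [1, 2, 3, 4, 5, 6]
toy_y = [12, 5, 7, 9, 11, 13]

with_outlier = simple_ols(toy_x, toy_y)
without_outlier = simple_ols(toy_x[1:], toy_y[1:])  # drop the first point

print("with outlier    (intercept, slope, R^2):", with_outlier)
print("without outlier (intercept, slope, R^2):", without_outlier)
```

In this toy example, one unusual first observation is enough to flatten the fitted slope and lower R-Square, which is the same kind of distortion I suspected the 1990 value might cause.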

The REG Procedure

Model: MODEL1

Dependent Variable: Income

Number of Observations Read 41
Number of Observations Used 25
Number of Observations with Missing Values 16

 

Analysis of Variance
Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
Model               1          36283434       36283434       2.16    0.1549
Error              23         385782222       16773140
Corrected Total    24         422065656

 

Root MSE 4095.50242 R-Square 0.0860
Dependent Mean 31905 Adj R-Sq 0.0462
Coeff Var 12.83639  

 

Parameter Estimates
Variable     DF    Parameter Estimate    Standard Error    t Value    Pr > |t|
Intercept     1           -3508.74938             24092      -0.15      0.8855
Ethanol       1                 15754             10711       1.47      0.1549

Pearson Correlation Coefficients, N = 25
Prob > |r| under H0: Rho=0
             Ethanol     Income
Ethanol      1.00000     0.29320
                         0.1549
Income       0.29320     1.00000
             0.1549

As can be seen in the graph of the Engel curve, the fitted line is upward sloping, which is consistent with alcohol being a normal good as expected, although a number of data points below the regression line trend downward. The regression line does not fit the data very well, because many of the data points are scattered fairly far away from it. The R-Square value is also low, at 0.0860. Given the way I arranged the variables, this means that variation in ethanol consumption explains only 8.6 percent of the variation in income. I did check the R-Square with ethanol as the dependent variable, and it was exactly the same. This is expected, because in a regression with one explanatory variable R-Square equals the square of the correlation between the two variables, which does not depend on which one is treated as dependent. The t-statistic for ethanol is not significant, with a value of only 1.47 and a corresponding p-value of 0.1549. The F-value also indicates that the model as a whole is not significant. It has the same p-value as the t-statistic for ethanol, which is to be expected since ethanol is the only explanatory variable. The correlation between ethanol and income is 0.293, and this value stays the same when the variables are switched. The equation of this regression line is INCOME = -3508.7 + 15754 × ETHANOL. Because these are yearly per capita figures, this means that a year in which per capita ethanol consumption is one gallon higher is associated with per capita income that is $15,754 higher, although this estimate is not statistically different from zero.
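The relationships among the reported statistics can be checked by hand. The short sketch below uses only numbers copied from the output above; the variable names are my own. It illustrates two standard facts about a regression with a single explanatory variable: R-Square equals the squared correlation, and the F-value equals the squared t-value of the slope (up to rounding in the printed output).

```python
# Values copied from the first regression output (1990-2014, N = 25).
r = 0.29320           # Pearson correlation between Ethanol and Income
r_square = 0.0860     # reported R-Square
slope = 15754         # Ethanol parameter estimate
std_err = 10711       # standard error of the Ethanol estimate
ms_model = 36283434   # Model mean square
ms_error = 16773140   # Error mean square

# In a one-variable regression, R-Square is the squared correlation,
# so it is the same whichever variable is treated as dependent.
r_squared_from_r = r ** 2

# The slope's t-value is the estimate divided by its standard error,
# and the model F-value is the ratio of the two mean squares.
t_value = slope / std_err
f_value = ms_model / ms_error

print(f"r squared = {r_squared_from_r:.4f} (reported R-Square: {r_square})")
print(f"t = {t_value:.2f}, t squared = {t_value ** 2:.2f}")
print(f"F = {f_value:.2f}")
```

This is why the t-test and the F-test report the same p-value of 0.1549 here: with one explanatory variable they are the same test.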

The REG Procedure

Model: MODEL1

Dependent Variable: Income

Number of Observations Read 41
Number of Observations Used 24
Number of Observations with Missing Values 17

 

Analysis of Variance
Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
Model               1         127732279      127732279      11.14    0.0030
Error              22         252325481       11469340
Corrected Total    23         380057760

 

Root MSE 3386.64141 R-Square 0.3361
Dependent Mean 32170 Adj R-Sq 0.3059
Coeff Var 10.52733    

 

Parameter Estimates
Variable     DF    Parameter Estimate    Standard Error    t Value    Pr > |t|
Intercept     1                -46433             23564      -1.97      0.0615
Ethanol       1                 35097             10517       3.34      0.0030

Pearson Correlation Coefficients, N = 24
Prob > |r| under H0: Rho=0
             Ethanol     Income
Ethanol      1.00000     0.57973
                         0.0030
Income       0.57973     1.00000
             0.0030

 

The graph of my refined Engel curve is also upward sloping, which again suggests that alcohol is a normal good. Taking the 1990 value out of my analysis improved the model. The R-Square value increased substantially, from 0.0860 to 0.3361. This is still not a very high R-Square value, but it is clearly much better than the one before. The t-statistic is now significant, with a t-value of 3.34 and a p-value of 0.0030, which makes it significant at the 1% level (though not at the 0.1% level, since 0.0030 is greater than 0.001). The F-statistic also shows that the model is significant, with a value of 11.14 and the same p-value as the t-statistic. The correlation between ethanol and income also increased substantially, from 0.293 to 0.580. The new regression equation is INCOME = -46433 + 35097 × ETHANOL, meaning that, if the model is correct, a year in which per capita ethanol consumption is one gallon higher is associated with per capita income that is $35,097 higher. Because income is the dependent variable here, this slope cannot simply be inverted to give the change in ethanol consumption per dollar of income; that would require the regression with ethanol as the dependent variable.
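The slope of the switched regression (ethanol on income) can be recovered from the reported output without rerunning it. The sketch below relies on a standard property of regressions with one explanatory variable: the two slopes multiply to R-Square. It uses only values copied from the second output; the variable names are my own, and the result is approximate because the printed values are rounded.

```python
# Values copied from the second regression output (1991-2014, N = 24).
r_square = 0.3361                 # reported R-Square
slope_income_on_ethanol = 35097   # dollars of income per gallon of ethanol
p_value = 0.0030                  # p-value of the Ethanol t-test

# With one explanatory variable, the two regression slopes satisfy
# slope(income on ethanol) * slope(ethanol on income) = R-Square.
slope_ethanol_on_income = r_square / slope_income_on_ethanol

# Simply inverting the slope overstates the effect unless R-Square is 1.
naive_inverse = 1 / slope_income_on_ethanol

# Which conventional significance levels does the p-value clear?
significant_at = {level: p_value < level for level in (0.05, 0.01, 0.001)}

print(f"gallons per $1,000 of income (implied): {slope_ethanol_on_income * 1000:.4f}")
print(f"gallons per $1,000 of income (naive inverse): {naive_inverse * 1000:.4f}")
print(significant_at)
```

Under these assumptions, the implied ethanol-on-income slope is roughly one third of the naive inverse, which is the more natural slope to read as an Engel curve relationship, with consumption responding to income.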

Because these data measure the volume of ethanol consumed rather than spending on alcohol, they may not be very reliable for an Engel curve. As I mentioned earlier, the existence of many different types of alcoholic beverages with a wide range of prices calls the analysis into question. It is possible that poor people drink more than rich people but buy much cheaper brands, and in the end spend far less money on alcohol. People also consume alcohol for reasons unrelated to income, such as depression. A person who becomes clinically depressed may vastly increase their alcohol consumption without an increase in income, or even with a decrease in income if they are depressed because of losing a job. In addition, my data on ethanol cover all people 14 years and older, which is problematic. It may be that in some years more underage people, who do not have many financial obligations, decide to drink more and persuade others who are of age to buy the alcohol for them.

 

Engel Curve for Ethanol Consumption and Income