For this assignment I used per capita real disposable personal income from FRED as my measure of overall income for the United States. The good that I decided to examine is total alcohol consumption per capita, measured in gallons of ethanol, which I obtained from Statista. I used a time series approach for my analysis of the Engel curve, with data running from 1990 through 2014.
I expected the curve to be upward sloping because I think that alcohol is a normal good. In general, people who are rich do not find alcohol less appealing than people with less money do. However, certain brands or types of alcoholic beverages may be inferior goods. If people without much money suddenly became substantially wealthier, they might stop drinking cheap brands of beer, such as Busch, and switch to expensive wine instead. Because I do not differentiate between brands or types of alcoholic beverages in my regression analysis, I am unable to test for this.
I put income on the Y-axis and ethanol on the X-axis to follow the convention of drawing Engel curves with income on the Y-axis, even though this makes income the dependent variable in the regression. Although ethanol should be the dependent variable in this situation, the results obtained with the two variables switched are still important and useful.
The year 1990 has a much higher value for ethanol consumption than any of the other years and looks like an outlier. Outliers are a problem in statistical models because a single observation can pull the regression line substantially toward itself, even though it is only one observation among many. To refine my estimate of the Engel curve, I removed that data point and ran the regression again using only the years 1991 through 2014. I expected that this would improve the fit of the model substantially.
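The pull that one outlying observation can have on a regression line can be sketched in a few lines of Python. This is only an illustration: the numbers below are made up and are not the FRED or Statista series used in this assignment, and ols_fit is a hypothetical helper written for this example.

```python
# Hypothetical illustration only: these numbers are made up and are NOT the
# FRED or Statista data analyzed in this assignment.

def ols_fit(x, y):
    """Return (intercept, slope, r_squared) for a simple regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return intercept, slope, r_squared

# The first point sits far to the right of the others and off their trend,
# so it plays the role of the outlying year.
x = [2.9, 2.0, 2.1, 2.2, 2.3, 2.4]
y = [21.0, 20.0, 22.0, 23.0, 26.0, 27.0]

full = ols_fit(x, y)              # all six points
trimmed = ols_fit(x[1:], y[1:])   # outlier removed

print("all points:      slope=%.2f  R^2=%.3f" % (full[1], full[2]))
print("outlier dropped: slope=%.2f  R^2=%.3f" % (trimmed[1], trimmed[2]))
```

In this made-up sample the single off-trend point flattens the fitted slope and lowers R-Square, and dropping it restores both.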
The REG Procedure
Model: MODEL1
Dependent Variable: Income
Number of Observations Read  41 
Number of Observations Used  25 
Number of Observations with Missing Values  16 
Analysis of Variance
Source             DF   Sum of Squares   Mean Square   F Value   Pr > F
Model               1         36283434      36283434      2.16   0.1549
Error              23        385782222      16773140
Corrected Total    24        422065656
Root MSE         4095.50242    R-Square   0.0860
Dependent Mean   31905         Adj R-Sq   0.0462
Coeff Var        12.83639
Parameter Estimates
Variable    DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
Intercept    1          -3508.74938            24092     -0.15     0.8855
Ethanol      1                15754            10711      1.47     0.1549
Pearson Correlation Coefficients, N = 25
Prob > |r| under H0: Rho=0
[Correlation matrix of Ethanol and Income]
As can be seen in the above graph, the Engel curve here is upward sloping, which indicates that alcohol is a normal good, as was expected, although a number of data points lie below the regression line and trend downward. The regression line does not appear to fit the data very well, because many of the data points are scattered fairly far from it. The R-Square value is also low, at 0.0860. With the variables arranged as they are, this means that variation in ethanol consumption explains only 8.6 percent of the variation in income. I also checked the R-Square with ethanol as the dependent variable, and it was exactly the same, as it must be in a regression with a single explanatory variable. The t-statistic for ethanol is not significant here, with a value of only 1.47 and a corresponding p-value of 0.1549. The F-value also indicates that the model as a whole is not significant. It has the same p-value as ethanol, which is to be expected because ethanol is the only explanatory variable. The correlation between ethanol and income is 0.293, and this value stays the same when the variables are switched. The equation of the regression line is INCOME = -3508.7 + 15754 × ETHANOL. Because the data are national per capita averages over time, this indicates that a year in which per capita ethanol consumption is one gallon higher is associated with per capita income that is $15,754 higher, although this estimate is not statistically significant.
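The summary statistics discussed above can be cross-checked against the sums of squares in the SAS output. The following is a minimal sketch using only the figures printed in the tables. The variable names are my own, and because the slope and standard error are rounded in the output, the t-value is approximate.

```python
import math

# Values copied from the first regression output (Income on Ethanol, 1990-2014).
ss_model = 36283434
ss_error = 385782222
ss_total = 422065656
df_model = 1
df_error = 23
slope = 15754      # reported parameter estimate for Ethanol
slope_se = 10711   # reported standard error for Ethanol

# R-Square is the share of total variation explained by the model.
r_squared = ss_model / ss_total

# Adjusted R-Square penalizes for the number of parameters estimated.
adj_r_squared = 1 - (ss_error / df_error) / (ss_total / (df_model + df_error))

# F is the ratio of the model mean square to the error mean square.
f_value = (ss_model / df_model) / (ss_error / df_error)

# With one explanatory variable, t squared equals F (up to rounding).
t_value = slope / slope_se

# The correlation is the square root of R-Square, positive because the slope is positive.
correlation = math.sqrt(r_squared)

print("R-Square     %.4f" % r_squared)
print("Adj R-Sq     %.4f" % adj_r_squared)
print("F Value      %.2f" % f_value)
print("t Value      %.2f" % t_value)
print("Correlation  %.3f" % correlation)
```

None of these formulas depends on which variable is treated as dependent except the slope and its t-value, which is why R-Square and the correlation are unchanged when the two variables are switched.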
The REG Procedure
Model: MODEL1
Dependent Variable: Income
Number of Observations Read  41 
Number of Observations Used  24 
Number of Observations with Missing Values  17 
Analysis of Variance
Source             DF   Sum of Squares   Mean Square   F Value   Pr > F
Model               1        127732279     127732279     11.14   0.0030
Error              22        252325481      11469340
Corrected Total    23        380057760
Root MSE         3386.64141    R-Square   0.3361
Dependent Mean   32170         Adj R-Sq   0.3059
Coeff Var        10.52733
Parameter Estimates
Variable    DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
Intercept    1               -46433            23564     -1.97     0.0615
Ethanol      1                35097            10517      3.34     0.0030
Pearson Correlation Coefficients, N = 24
Prob > |r| under H0: Rho=0
[Correlation matrix of Ethanol and Income]
The graph of my refined Engel curve is also upward sloping, which again indicates that alcohol is a normal good. As can be seen, taking the 1990 value out of my analysis improved the model. The R-Square value increased substantially, from 0.0860 to 0.3361. This is still not a very high R-Square value, but it is clearly much better than before. The t-statistic for ethanol is now significant, with a t-value of 3.34 and a p-value of 0.003, making it significant at the 1% level. The F-statistic also shows that the model is significant, with a value of 11.14 and the same p-value as the t-statistic. The correlation between ethanol and income in this model increased substantially, from 0.293 to 0.580. The new regression equation is INCOME = -46433 + 35097 × ETHANOL, meaning that, if the model is correct, a one-gallon increase in per capita ethanol consumption per year is associated with per capita income that is $35,097 higher.
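Because an Engel curve is normally read as consumption responding to income, it can help to express the refined result with ethanol as the dependent variable. In a regression with a single explanatory variable, the product of the two slopes (income on ethanol and ethanol on income) equals R-Square, so the reverse slope can be recovered without rerunning the regression. The sketch below uses the rounded values printed in the output, so the result is approximate, and the variable names are my own.

```python
# Rounded values copied from the refined regression output
# (Income on Ethanol, 1991-2014).
r_squared = 0.3361
slope_income_on_ethanol = 35097  # dollars of income per gallon of ethanol

# For a one-regressor least squares fit:
#   slope(y on x) * slope(x on y) = R-Square
# so the reverse slope is R-Square divided by the reported slope.
slope_ethanol_on_income = r_squared / slope_income_on_ethanol  # gallons per dollar
gallons_per_thousand_dollars = 1000 * slope_ethanol_on_income

print("Implied slope of ethanol on income: %.4f gallons per $1,000"
      % gallons_per_thousand_dollars)
```

The implied slope is roughly one hundredth of a gallon of ethanol per additional $1,000 of per capita income. This is smaller than simply inverting the reported slope (1/35097) would suggest, because the two regression lines coincide only when the correlation is perfect.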
Because these data measure the quantity of ethanol consumed rather than spending on alcohol, they may not be very reliable for an Engel curve. As I mentioned earlier, the existence of many different types of alcoholic beverages with a wide range of prices calls the analysis into question. It is possible that poor people drink more than rich people but buy much cheaper brands and in the end spend far less money on alcohol. People also consume alcohol for reasons unrelated to income, such as depression. A person who becomes clinically depressed may vastly increase their alcohol consumption without an increase in income, or even with a decrease in income if they are depressed because of losing a job. In addition, my data on ethanol cover all people 14 years and older, which is problematic. It may be that in some years more underage people, who do not have many financial obligations, decide to drink more and persuade others who are of age to buy the alcohol for them.