Gasoline is a necessity for many people, so it is interesting to see how its consumption responds to changes in income. For this assignment, I analyze the effect of income on gasoline demand and plot an aggregate Engel curve to illustrate graphically how consumption varies as income increases.
Model Specification and Data Sources
The model takes a time series approach. The sample runs from Jan. 1975 to Nov. 2016, for a total of 503 observations. The dependent variable is per capita gasoline consumption (GASPC), based on the U.S. Product Supplied of Finished Motor Gasoline (measured in thousand barrels) reported by the U.S. Energy Information Administration. To obtain per capita consumption, I divided total consumption by the monthly population, which is obtained from the U.S. Bureau of Economic Analysis. Real disposable personal income per capita (RDPI00) is one explanatory variable; the real price of gasoline (RPGAS) is included as a second because gasoline consumption also depends heavily on its price. The real price of gasoline is obtained by deflating the nominal price of regular gasoline by the PCE price index, the same deflator used for real disposable income. The U.S. Bureau of Economic Analysis collects the disposable income data, and the gasoline price data come from the U.S. Bureau of Labor Statistics. I take the natural log of all variables so that the coefficients can be read as percentage changes. This double-log model lets me read both the price and income elasticities directly from the coefficient estimates.
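Written out, the double-log specification described above would look roughly as follows. The symbols \beta_0, \beta_1, \beta_2 and \varepsilon_t, and the time subscript t for the month, are notation assumed here for illustration; they do not come from the SAS output.

```latex
\ln(\mathrm{GASPC}_t) = \beta_0 + \beta_1 \ln(\mathrm{RPGAS}_t) + \beta_2 \ln(\mathrm{RDPI00}_t) + \varepsilon_t
```

In this form \beta_1 is the price elasticity and \beta_2 is the income elasticity of gasoline demand, because each coefficient measures the percentage change in GASPC associated with a one percent change in the corresponding regressor.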
SAS Outputs
The table below presents the regression output for the following model.
Model 1: Estimates Using Macroeconomic Data (1975 – 2016)
Intercept  3.98231*** (0.12853)
ln(RPGAS)  0.02300** (0.01055)
ln(RDPI00)  0.03380*** (0.01293)
R^{2}  0.0310
Adj. R^{2}  0.0271
S.E. of residuals  3.63061
Sum squared residuals  2.05720
Note: Figures in parentheses are standard errors, ** (p<0.05), *** (p<0.01)
As shown above, all of the coefficient estimates are statistically significant at the 5% level, but the R^{2} is very low, which suggests that the model omits important determinants of gasoline consumption and may suffer from omitted variable bias. To address this, I created 11 monthly dummy variables (December is the reference month) to account for the fact that people travel different distances in different months (for example, people are more likely to take road trips during the summer than in the winter). After adding the dummy variables, the model’s goodness of fit improved substantially. The output is presented below.
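The construction of the 11 monthly dummies can be sketched as below. This is a minimal illustration in Python rather than the SAS used for this assignment, and the helper names `monthly_dates` and `month_dummies` are hypothetical. It assumes each observation is dated by the first day of its month.

```python
from datetime import date

# January..November get a dummy; December (12) is the omitted reference month.
DUMMY_MONTHS = range(1, 12)


def month_dummies(d):
    """Return the 11 dummy values (Jan..Nov) for one observation date."""
    return [1 if d.month == m else 0 for m in DUMMY_MONTHS]


def monthly_dates(start_year, start_month, n_obs):
    """Generate n_obs consecutive month-start dates."""
    dates = []
    year, month = start_year, start_month
    for _ in range(n_obs):
        dates.append(date(year, month, 1))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return dates


# Sample period from the text: Jan. 1975 to Nov. 2016, 503 monthly observations.
dates = monthly_dates(1975, 1, 503)
dummies = [month_dummies(d) for d in dates]
```

Dropping December avoids perfect collinearity with the intercept, so each dummy coefficient is read as the difference in log consumption between that month and December.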
Model 2: Estimates After Adding Dummy Variables (1975 – 2016)
Intercept  3.98360*** (0.09155)
ln(RPGAS)  0.03602*** (0.00753)
ln(RDPI00)  -0.03139*** (0.00918)
R^{2}  0.5233
Adj. R^{2}  0.5107
S.E. of residuals  0.04549
Sum squared residuals  1.01197
Note: Figures in parentheses are standard errors, *** (p<0.01)
The coefficient estimates for the dummy variables are omitted here for the sake of space, but notice that the R^{2} increased substantially, from 0.03 to 0.52. Another thing that stood out to me is the negative coefficient estimate for real disposable income. This implies that gasoline is an inferior good, which does not fit the conventional view of gasoline as a normal good. I therefore ran separate regressions for different time periods to see whether the negative effect persists throughout the sample or the effect has reversed in the recent decade. The table below is a side-by-side comparison of the regression results for two time spans, Nov. 1990 – Nov. 2000 and Nov. 2006 – Nov. 2016.
Model 3: Comparison Between Two Time Spans (Dummy Variables Included)
Variable  Nov. 1990 – Nov. 2000  Nov. 2006 – Nov. 2016
Intercept  0.61261*** (0.21991)  10.65427*** (1.03656)
ln(RPGAS)  0.03564*** (0.01585)  0.09411*** (0.01510)
ln(RDPI00)  0.30235*** (0.02186)  -0.67451*** (0.09986)
R^{2}  0.9202  0.7027
Adj. R^{2}  0.9105  0.6666
S.E. of residuals  0.01631  0.03069
Sum squared residuals  0.02846  0.10081
Note: Figures in parentheses are standard errors, *** (p<0.01)
Again, the output for the dummy variables is not presented here. Look at the coefficient estimate for real disposable income: it was positive in the late 20^{th} century sample (Nov. 1990 – Nov. 2000) and turned negative in the more recent data. One possible explanation is the development of transportation in recent years. As people’s disposable income increases, they may be more likely to choose other forms of transportation, such as air travel, instead of driving.
Plotting the Engel Curve Based on Regression Results
Based on the results from Model 2, a one percent increase in real disposable income per capita leads to a 0.03 percent decrease in gasoline consumption. With price held fixed, the following Engel curve shows how gasoline consumption varies with income over 1975 – 2016. The negative slope suggests that gasoline is an inferior good.[1]
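The shape of this curve can be traced directly from the income elasticity. The following is a minimal sketch in Python (not the SAS used for this assignment). It assumes the Model 2 income coefficient of -0.03139, with the sign taken from the interpretation above, and it works in relative terms so that no price level or intercept has to be assumed. The function name `relative_consumption` is hypothetical.

```python
import math

# Income elasticity from Model 2; the negative sign follows the statement in
# the text that a 1% rise in income lowers consumption by about 0.03%.
INCOME_ELASTICITY = -0.03139


def relative_consumption(income_ratio, elasticity=INCOME_ELASTICITY):
    """Consumption relative to its base level when income is income_ratio
    times its base level, holding price and month fixed.

    In a double-log model Q = A * Y**elasticity, so Q1/Q0 = (Y1/Y0)**elasticity
    and the constant A (intercept, price, and month effects) cancels out.
    """
    return math.exp(elasticity * math.log(income_ratio))


# Trace the curve from the base income level up to double the base level.
for ratio in (1.0, 1.01, 1.25, 1.5, 2.0):
    change_pct = (relative_consumption(ratio) - 1.0) * 100.0
    print(f"income x{ratio:.2f}: consumption change {change_pct:+.3f}%")
```

Under these assumptions, even a doubling of income lowers predicted consumption by only about 2 percent, so a curve drawn to scale would be close to flat.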
Conclusion
Overall, it is still hard to explain why gasoline appears to have become an inferior good in recent years. The result could reflect errors in the model or omitted variable bias. Another interesting point is that the R^{2} increased by a large amount for the shorter time spans, even though the estimated model is exactly the same. This makes me wonder whether autocorrelation is present in the model, that is, whether some variables left in the error term over the full sample are correlated with the independent variables but play a smaller role over shorter periods. In short, these concerns remain unresolved with the current model, and further study is required for a more in-depth interpretation.
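One common first check for the autocorrelation concern raised above is the Durbin-Watson statistic computed from the regression residuals. The sketch below is in Python rather than the SAS used for this assignment. The function name `durbin_watson` is hypothetical, and the residual series are made up for illustration; they are not residuals from Models 1 to 3.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: the sum of squared first differences of the
    residuals divided by the sum of squared residuals."""
    if len(residuals) < 2:
        raise ValueError("need at least two residuals")
    num = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    den = sum(e ** 2 for e in residuals)
    if den == 0:
        raise ValueError("residuals are all zero")
    return num / den


# Made-up residuals: one series that drifts slowly (positive autocorrelation)
# and one that flips sign every period (negative autocorrelation).
persistent = [0.5, 0.4, 0.45, 0.5, -0.4, -0.5, -0.45, -0.5]
alternating = [0.5, -0.5, 0.5, -0.5, 0.5, -0.5, 0.5, -0.5]

print(durbin_watson(persistent))
print(durbin_watson(alternating))
```

A value near 2 is consistent with no first-order autocorrelation, values well below 2 point to positive autocorrelation, and values well above 2 point to negative autocorrelation.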
[1] This graph is slightly out of scale: according to the regression output, the effect of a one percent increase in disposable income is quite small, so the graph should be much flatter than the current one. Due to limitations in SAS, this is what I have so far.