Sam Felling

The consumption of water, and the effect income has on consumption levels across the 50 states, intrigued me, so I decided to focus my SAS assignment on estimating this Engel curve. Little data is available on water consumption, and I had to adjust accordingly. In 2010 the U.S. Department of the Interior and the U.S. Geological Survey produced a report on domestic and industrial water consumption across the 50 states. I used the data from this report as the dependent variable in my OLS regression. Because the data measured average daily water consumption in millions of gallons in 2010, my independent variables also had to be state-level data for 2010. The variables I focused on were “income,” the median household income in each state in 2010; “unemployment,” the average unemployment rate in each state in 2010; “population,” the total population of each state in 2010; and “bachelors,” the percent of each state's population aged 25 and older who held a bachelor’s degree or higher in 2010. After running the initial regression, I received the following output.

 

The SAS System

The REG Procedure

Model: MODEL1

Dependent Variable: consumtion consumtion

Number of Observations Read 51
Number of Observations Used 51
Analysis of Variance
Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
Model               4        1621962144      405490536      34.77    <.0001
Error              46         536483889       11662693
Corrected Total    50        2158446032

 

Root MSE 3415.06855 R-Square 0.7514
Dependent Mean 6888.80588 Adj R-Sq 0.7298
Coeff Var 49.57417    

 

Parameter Estimates
Variable        Label           DF    Parameter Estimate    Standard Error    t Value    Pr > |t|
Intercept       Intercept        1            7707.91450        4178.30478       1.84      0.0715
income          income           1              -0.11330           0.09324      -1.22      0.2305
unemployment    unemployment     1             103.92834         278.62980       0.37      0.7109
population      population       1            0.00081677        0.00007821      10.44      <.0001
Bachelors       Bachelors        1             -36.34565         132.01076      -0.28      0.7843
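
The fit statistics in the output above follow directly from the Analysis of Variance table. The short Python sketch below is only a cross-check and is not part of the original SAS run. It recomputes R-Square, Adj R-Sq, and the F value from the reported sums of squares, and the variable names are chosen here for illustration.

```python
# Figures copied from the Analysis of Variance table of the first model.
ss_model = 1621962144   # Sum of Squares, Model
ss_error = 536483889    # Sum of Squares, Error
ss_total = 2158446032   # Sum of Squares, Corrected Total
df_model = 4            # number of regressors
df_error = 46           # 51 observations - 4 regressors - 1 intercept
df_total = df_model + df_error

# Share of total variation explained by the model.
r_squared = ss_model / ss_total

# Adjusted R-squared penalizes each sum of squares by its degrees of freedom.
adj_r_squared = 1 - (ss_error / df_error) / (ss_total / df_total)

# Overall F statistic: mean square of the model over mean square of the error.
f_value = (ss_model / df_model) / (ss_error / df_error)

print(round(r_squared, 4), round(adj_r_squared, 4), round(f_value, 2))
```

The three printed values should match the R-Square, Adj R-Sq, and F Value lines of the SAS output after rounding.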
             

Population was the only significant parameter estimated in this model, which was to be expected. For each of the other variables we fail to reject the null hypothesis that its coefficient is zero. Population had a p value below .0001, so it is significant at the 1 percent level. The model as a whole was significant (F = 34.77, p < .0001), and approximately 75 percent of the variation in water consumption was explained by income, unemployment, education, and population. Below is a graph of the Engel curve of water consumption when income is the only explanatory variable.

 

As shown in the graph above, when income is the only variable the model predicts a non-significant downward slope, implying an inverse relationship between income and water consumption. This may be because values change as income rises, and because higher-income households can afford appliances and other items that are more efficient. However, the output from the bivariate regression, shown below, again indicates a non-significant relationship: the p value was .45, which is very weak evidence. In the regression with all of the variables (income, unemployment, population, and education) the p value on income was still high at .23 and still insignificant.

 

The SAS System

The REG Procedure

Model: MODEL1

Dependent Variable: consumtion consumtion

Number of Observations Read 51
Number of Observations Used 51

 

Analysis of Variance
Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
Model               1          25225530       25225530       0.58    0.4502
Error              49        2133220502       43535112
Corrected Total    50        2158446032

 

Root MSE 6598.11430 R-Square 0.0117
Dependent Mean 6888.80588 Adj R-Sq -0.0085
Coeff Var 95.78023    
Parameter Estimates
Variable        Label           DF    Parameter Estimate    Standard Error    t Value    Pr > |t|
Intercept       Intercept        1                 11256        5810.63794       1.94      0.0585
income          income           1              -0.08737           0.11478      -0.76      0.4502
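
Two features of the bivariate output can look odd at first: the adjusted R-squared is negative, and the F test and the t test on income report the same p value. The Python sketch below is a cross-check on the reported numbers, not part of the SAS run, and its variable names are illustrative. It shows that both features follow from the standard formulas.

```python
# Figures copied from the bivariate regression output.
ss_model = 25225530
ss_error = 2133220502
ss_total = 2158446032
n_obs = 51
n_regressors = 1

r_squared = ss_model / ss_total

# With a very small R-squared, the degrees-of-freedom penalty
# pushes the adjusted value below zero.
adj_r_squared = 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_regressors - 1)

# With a single regressor, the overall F statistic equals the
# squared t statistic on that regressor (up to rounding).
f_value = ss_model / (ss_error / (n_obs - n_regressors - 1))
t_income = -0.08737 / 0.11478   # parameter estimate / standard error

print(round(r_squared, 4), round(adj_r_squared, 4))
print(round(f_value, 2), round(t_income ** 2, 2))
```

Because F equals t squared here, the two tests are the same test, which is why both lines of the output report a p value of 0.4502.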

To correct my model, I first had to identify its potential problems. After reviewing these results, my first concern was the unusually high standard errors for unemployment and Bachelors. I believed the high standard errors may have been due to correlation between the explanatory variables. I therefore decided to estimate a double log model, which should help correct for some of this correlation. In this model “lncon” is the natural log of 2010 water consumption across states, “lnincome” is the natural log of 2010 median household income across states, “lnunem” is the natural log of the 2010 unemployment rate across states, “lnpop” is the natural log of 2010 population across states, and “lnbach” is the natural log of the 2010 percent of population 25 and over across the states who held a bachelor’s degree or higher. The model results are shown below.

Water consumption Engel Curve

The REG Procedure

Model: MODEL1

Dependent Variable: lncon lncon

Number of Observations Read 51
Number of Observations Used 51
Analysis of Variance
Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
Model               4         100.09811       25.02453      17.37    <.0001
Error              46          66.25850        1.44040
Corrected Total    50         166.35660

 

Root MSE 1.20017 R-Square 0.6017
Dependent Mean 8.22613 Adj R-Sq 0.5671
Coeff Var 14.58969    

 

Parameter Estimates
Variable        Label           DF    Parameter Estimate    Standard Error    t Value    Pr > |t|
Intercept       Intercept        1             -15.41774          16.03432      -0.96      0.3413
lnincome        lnincome         1               2.49133           1.78913       1.39      0.1705
lnunem          lnunem           1              -1.56430           0.85442      -1.83      0.0736
lnpop           lnpop            1               1.25736           0.19299       6.52      <.0001
lnbach          lnbach           1              -5.71094           1.46567      -3.90      0.0003

 

When using the double log model we find more significant results and much lower standard errors. In this model income is now upward sloping and still insignificant, but its p value is lower (.17 compared with .23). While the R squared value (.6017) was lower than in the first model, this seems to be a more realistic explanation of water consumption by the variables. In this model lnpop and lnbach are significant at the 1 percent level, and lnunem is significant at the 10 percent level. The following graphs are the fit plots and residual-by-regressor plots for the double log model.

 

 

Ultimately, though the double log model explains less, it makes more sense. Population is of course going to impact levels of consumption: if population increases, water consumption is expected to increase along with it. The model predicts that, holding all else equal, a 10 percent increase in population raises water consumption by about 12.6 percent. Education, by contrast, has a downward slope. People who are more educated may be more inclined to conserve water due to cultural values. If a state has higher unemployment, then the model predicts a decline in water use. This makes sense because when unemployment is higher, fewer people can afford to use more water. An interesting note is that in this model income is not significant, but it is estimated as upward sloping.
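
In a double log model each slope coefficient can be read as an elasticity: the approximate percent change in consumption for a 1 percent change in the regressor, holding the others fixed. The Python sketch below is illustrative only. The coefficients are copied from the double log output, while the function name and the 10 percent scenario are assumptions made here. It compares the usual linear approximation with the exact change implied by the log form.

```python
# Coefficients copied from the double log parameter estimates.
elasticities = {
    "lnincome": 2.49133,
    "lnunem": -1.56430,
    "lnpop": 1.25736,
    "lnbach": -5.71094,
}

def exact_pct_change(elasticity, pct_change_in_x):
    """Exact percent change in consumption implied by a log-log coefficient."""
    return ((1 + pct_change_in_x / 100) ** elasticity - 1) * 100

# Hypothetical scenario: population rises by 10 percent, all else equal.
approx = elasticities["lnpop"] * 10
exact = exact_pct_change(elasticities["lnpop"], 10)

print(round(approx, 2), round(exact, 2))
```

The linear approximation works well for small changes. For larger changes, or for large coefficients such as the one on lnbach, the exact formula is the safer reading.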

Ultimately, when it comes to determining the Engel curve of water consumption, the results of my study were inconclusive. Other variables were found to be significant, but income was not. This, however, is telling. Water consumption may be inelastic with respect to income, but one model found an upward sloping relationship and the other a downward sloping relationship (neither with a significant p value). One potential flaw in this study is that there is no variable for the price of water across the states. If a variable for the 2010 price of water in each state were added, the results could become more significant. However, I could not find a usable proxy for price. Only a few select cities have data available on the average price of water, and it seemed illogical to use a few select cities to represent all the states.

 

Data obtained from:

https://pubs.usgs.gov/circ/1405/pdf/circ1405.pdf

https://www.bls.gov/lau/lastrk10.htm

https://factfinder.census.gov/faces/tableservices/jsf/pages/productview.xhtml?pid=ACS_10_1YR_S1901&prodType=table

https://factfinder.census.gov/faces/tableservices/jsf/pages/productview.xhtml?pid=DEC_10_SF1_SF1DP1&prodType=table

Engel Curve for the Consumption of Water in 2010