Time-Series Study of Alcohol Sales
With this project I attempted to see if real median income in the United States had any sort of effect on alcohol sales. My thinking is that as income rises, people will party more, or at the very least by more expensive spirits, and thus increase alcohol sales. To ensure I had a good enough sample I also added the section of the population that identified as white. The nature of the assignment had it be that I only needed a single good and a single measure of income, so there is the explanation for that decision. Before I get too deep into the results, here is how the graphs turned out.
Then for funzies, I swapped independent and dependent variables just to see what it looked like in graph form:
Mean alcohol sales over the period of 1992-2015 were $87.4 billion. Mean real median income was $54,721.98, with mean income of whites being $57,496.60, and all three numbers have been reproduced below.
The data for income was provided by the US Census Bureau in their “Income and Poverty in the United States: 2015” and it broke down both mean and median income for the whole country into several different races. In regards to the data for the alcohol sales, that is also courtesy of the US Census Bureau in their “Monthly & Annual Wholesale Trade report” dataset. This awesome dataset selected nineteen goods across both durable and nondurable, and tracked their sales, inventories, and inventories/sales ratios. As the names implies, it also breaks the data into monthly segments, but I personally found it unnecessary for the work that I would be doing, considering my income data was only annual. The technical term for alcohol in terms of the dataset is Beer, Wine, & Distilled Alcoholic Beverages, but shortened for repeated use.
I used SAS to run OLS regressions for my data, and the above charts were done in MS Excel due to their ease of being put into a MS Word document. In order to make analyzing the results easier and more widely applicable, I took the natural log of all three variables. Without further ado, let us analyze some regressions!
Well that could have gone better… Both real median income and real median income of white people were insignificant at 5%, with both needing to reach a t value of roughly 2.074 (1.62 for median income, and 1.69 for white median income). Had they been significant, increasing income by 10% is predicted to increase spending on alcohol by 27%. In terms of whether the models are quality, they are not fully explanatory. The R-Squared of the whole population is .10, and the one for whites is .11. Not only are the models not super explanatory, they are not all that significant thanks to F values of 2.61 and 2.84 respectively. In order to have a shot at being a statistically significant model at 5%, it would need to have a F value of more than 3. I wish I could write more about the model, but at present, there is not a whole lot that can be said.
Just because no values came up significant, does not mean good work was not done. Sometimes just merely proving more work is needed is a success. To some, knowledge that income does not affect alcohol sales is a good thing. What if for instance you ran a distillery that specialized in domestic spirits. You would be pleased to know negative income shocks are not going to hurt sales, so when times are a little rough you can focus on attracting new customers or getting current ones to buy a bigger size. These results align more with a scientific mindset where they seek to disprove notions, concepts, and ideas not prove. No one can be wholly mad since we know for sure that the world works the way it does.
The SAS System 9 10:32 Monday, February 13, 2017
The REG Procedure
Model: MODEL1
Dependent Variable: lnalc
Number of Observations Read 25
Number of Observations Used 24
Number of Observations with Missing Values 1
Analysis of Variance
Sum of Mean
Source DF Squares Square F Value Pr > F
Model 1 0.27516 0.27516 2.61 0.1203
Error 22 2.31732 0.10533
Corrected Total 23 2.59248
Root MSE 0.32455 R-Square 0.1061
Dependent Mean 11.32570 Adj R-Sq 0.0655
Coeff Var 2.86560
Parameter Estimates
Parameter Standard
Variable DF Estimate Error t Value Pr > |t|
Intercept 1 -18.19236 18.26322 -1.00 0.3300
lninc 1 2.70578 1.67409 1.62 0.1203
==============================================================================
The SAS System 10 10:32 Monday, February 13, 2017
The REG Procedure
Model: MODEL1
Dependent Variable: lnalc
Number of Observations Read 25
Number of Observations Used 24
Number of Observations with Missing Values 1
Analysis of Variance
Sum of Mean
Source DF Squares Square F Value Pr > F
Model 1 0.29652 0.29652 2.84 0.1060
Error 22 2.29596 0.10436
Corrected Total 23 2.59248
Root MSE 0.32305 R-Square 0.1144
Dependent Mean 11.32570 Adj R-Sq 0.0741
Coeff Var 2.85237
Parameter Estimates
Parameter Standard
Variable DF Estimate Error t Value Pr > |t|
Intercept 1 -20.10195 18.64489 -1.08 0.2926
lnwinc 1 2.86782 1.70136 1.69 0.1060
The SAS System 11 10:32 Monday, February 13, 2017
The REG Procedure
Model: MODEL1
Dependent Variable: lnalc
Number of Observations Read 25
Number of Observations Used 24
Number of Observations with Missing Values 1
Analysis of Variance
Sum of Mean
Source DF Squares Square F Value Pr > F
Model 2 0.30230 0.15115 1.39 0.2720
Error 21 2.29017 0.10906
Corrected Total 23 2.59248
Root MSE 0.33024 R-Square 0.1166
Dependent Mean 11.32570 Adj R-Sq 0.0325
Coeff Var 2.91581
Parameter Estimates
Parameter Standard
Variable DF Estimate Error t Value Pr > |t|
Intercept 1 -20.40367 19.10457 -1.07 0.2976
lninc 1 -2.38088 10.33779 -0.23 0.8201
lnwinc 1 5.26548 10.55492 0.50 0.6231
The SAS System 12 10:32 Monday, February 13, 2017
The MEANS Procedure
Variable N Mean Std Dev Minimum Maximum
ƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒ
ALCSALES 24 87401.83 28249.52 50222.00 135021.00
RMEDINC 24 54721.98 2189.85 50478.50 57909.06
WINC 24 57496.60 2252.56 53256.02 60440.91
Ƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒ
SAS Code:
options ls =78 formdlim= ‘=’ nolabel;
ods listing;
proc import out= work.mydat
datafile= “H:E488_SASb.xlsx” replace;
run;
data two;
set work.mydat;
lnalc = log(ALCSALES);
lninc = log(RMEDINC);
lnwinc = log(WINC);
run;
proc reg data = two;
model lnalc = lninc;
run;
proc reg data = two;
model lnalc = lnwinc;
run;
proc sgplot data = two;
scatter x = RMEDINC y = ALCSALES;
run;
proc sgplot data = two;
scatter x = WINC y = ALCSALES;
run;
proc reg data = two;
model lnalc = lninc lnwinc;
run;
proc means data = two;
var ALCSALES RMEDINC WINC;
run;
-The Jiffy Riddler