Time-Series Study of the Proportion of Food in Personal Consumption Expenditure
Econ 385
This project models time-series (1967-2016) changes in the proportion of food expenditure in personal consumption expenditure as a function of personal income, consumer price inflation, and the proportion of health care expenditure in personal consumption expenditure. The dependent variable is the proportion of food expenditure in personal consumption expenditure (PFIA), and the explanatory variables are personal income (PI) and the proportion of health care expenditure in personal consumption expenditure (PHCIA). I collected the series for personal consumption expenditure on food, on health care, and in total, and calculated the food and health care proportions by dividing each component by total personal consumption expenditure. I expect the proportion of food expenditure to decrease as personal income increases, because people with higher incomes spend a larger share on other categories, such as health care, rather than on food. Scatter plot (1) shows this pattern. The proportion of health care expenditure in personal consumption expenditure shows a similar relationship and trend, as in scatter plot (2).
Scatter plot (1)
Scatter plot (2)
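The share calculation and the two scatter plots described above could be produced along the following lines. This is only a sketch: the data set names pce and shares, the columns food, health, and pce_all, and the three data rows are hypothetical placeholders, not the study data, and plotting PHCIA against PI for the second plot is an assumption about what scatter plot (2) shows.

```sas
/* Hypothetical input: one row per year. The rows below are made-up
   placeholder values used only so the step runs; they are NOT the study data. */
data pce;
    input year food health pce_all PI;
    datalines;
1 16 6 100 700
2 15 7 105 760
3 14 8 110 830
;
run;

/* Compute each category's share of total personal consumption expenditure. */
data shares;
    set pce;
    PFIA  = food   / pce_all;   /* food share */
    PHCIA = health / pce_all;   /* health care share */
run;

/* Scatter plot (1): food share against personal income. */
proc sgplot data=shares;
    scatter x=PI y=PFIA;
run;

/* Scatter plot (2), assumed axes: health care share against personal income. */
proc sgplot data=shares;
    scatter x=PI y=PHCIA;
run;
```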
The MEANS Procedure
Variable  Label  N  Mean  Std Dev  Minimum  Maximum
PFIA  PFIA  49  0.1073261  0.0307650  0.0733248  0.1627907
PI  PI  49  6151.41  4544.49  665.7000000  15458.50
PHCIA  PHCIA  49  0.1237509  0.0320319  0.0628695  0.1684346
proc means;
var PFIA PI PHCIA;
run;
From the MEANS procedure, the mean of PFIA is 0.107, the mean of PI is 6151.41, and the mean of PHCIA is 0.124. Over the sample period, PFIA decreases while personal income and PHCIA increase.
The Relationship PFIA with PI and PHCIA
The REG Procedure
Model: MODEL1
Dependent Variable: PFIA PFIA
Number of Observations Read  50 
Number of Observations Used  49 
Number of Observations with Missing Values  1 
Analysis of Variance
Source  DF  Sum of Squares  Mean Square  F Value  Pr > F
Model  2  0.04317  0.02159  439.86  <.0001
Error  46  0.00226  0.00004908
Corrected Total  48  0.04543
Root MSE  0.00701  R-Square  0.9503
Dependent Mean  0.10733  Adj R-Sq  0.9481
Coeff Var  6.52731
Parameter Estimates
Variable  Label  DF  Parameter Estimate  Standard Error  t Value  Pr > |t|
Intercept  Intercept  1  0.22309  0.00700  31.86  <.0001
PI  PI  1  -1.06236E-8  5.713825E-7  -0.02  0.9852
PHCIA  PHCIA  1  -0.93490  0.08106  -11.53  <.0001
proc reg;
title 'The Relationship PFIA with PI and PHCIA';
model PFIA=PI PHCIA;
run;
I used SAS to run an Ordinary Least Squares regression on my data set. The results are interpreted as follows. For this sample, a one-unit increase in personal income is associated with a change of -1.06236E-8 in the proportion of food expenditure in personal consumption expenditure, holding the proportion of health care expenditure in personal consumption expenditure constant. The coefficient on personal income is not statistically significant (p = 0.9852). The coefficient on the proportion of health care expenditure in personal consumption expenditure (-0.93490) is highly statistically significant (p < 0.0001). The regression as a whole fits well (R^{2} = 0.9503, adjusted R^{2} = 0.9481) and is highly statistically significant (F = 439.86, p < 0.0001).
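As a check on the interpretation above, the fitted equation can be written out and evaluated at the sample means. The negative signs on the two slopes are taken as implied by the negative relationship between PFIA and both regressors; with those signs the fitted line passes through the dependent mean, as OLS with an intercept requires.

```latex
\widehat{PFIA}_t = 0.22309 - 1.06236\times10^{-8}\,PI_t - 0.93490\,PHCIA_t

% Evaluated at the sample means (PI = 6151.41, PHCIA = 0.1237509):
\widehat{PFIA} \approx 0.22309 - 0.00007 - 0.11569 \approx 0.1073

% t value for PHCIA = estimate / standard error:
t_{PHCIA} = \frac{-0.93490}{0.08106} \approx -11.53
```

The value 0.1073 matches the Dependent Mean reported in the REG output.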
First, I suspect that my regression has a multicollinearity problem, so I run PROC CORR to detect it. The correlation coefficient between the two explanatory variables, personal income and the proportion of health care expenditure in personal consumption expenditure, is 0.92106, which is greater than 0.8. So I conclude that the regression has a multicollinearity problem.
The Relationship PFIA with PI and PHCIA
The CORR Procedure
3 Variables:  PFIA PI PHCIA 
Simple Statistics  
Variable  N  Mean  Std Dev  Sum  Minimum  Maximum  Label 
PFIA  49  0.10733  0.03077  5.25898  0.07332  0.16279  PFIA 
PI  49  6151  4544  301419  665.70000  15459  PI 
PHCIA  49  0.12375  0.03203  6.06379  0.06287  0.16843  PHCIA 
Pearson Correlation Coefficients, N = 49
Prob > |r| under H0: Rho=0
Variable  PFIA  PI  PHCIA
PFIA  1.00000  -0.89813 (<.0001)  -0.97484 (<.0001)
PI  -0.89813 (<.0001)  1.00000  0.92106 (<.0001)
PHCIA  -0.97484 (<.0001)  0.92106 (<.0001)  1.00000
proc corr;
var PFIA PI PHCIA;
run;
To confirm the multicollinearity, I also compute the variance inflation factor (VIF). The VIF of both explanatory variables is greater than 5: it is 6.59455 for PI and 6.59455 for PHCIA. I want to cure this problem.
The Relationship PFIA with PI and PHCIA
The REG Procedure
Model: MODEL1
Dependent Variable: PFIA PFIA
Number of Observations Read  50 
Number of Observations Used  49 
Number of Observations with Missing Values  1 
Analysis of Variance
Source  DF  Sum of Squares  Mean Square  F Value  Pr > F
Model  2  0.04317  0.02159  439.86  <.0001
Error  46  0.00226  0.00004908
Corrected Total  48  0.04543
Root MSE  0.00701  R-Square  0.9503
Dependent Mean  0.10733  Adj R-Sq  0.9481
Coeff Var  6.52731
Parameter Estimates
Variable  Label  DF  Parameter Estimate  Standard Error  t Value  Pr > |t|  Variance Inflation
Intercept  Intercept  1  0.22309  0.00700  31.86  <.0001  0
PI  PI  1  -1.06236E-8  5.713825E-7  -0.02  0.9852  6.59455
PHCIA  PHCIA  1  -0.93490  0.08106  -11.53  <.0001  6.59455
proc reg;
model PFIA = PI PHCIA / VIF;
RUN;
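With only two explanatory variables, the VIF can be cross-checked by hand from the PROC CORR output, since the auxiliary R^{2} of one regressor on the other is just the squared correlation between them. The small difference from the SAS value comes from rounding the correlation to five decimals.

```latex
VIF = \frac{1}{1 - r_{PI,PHCIA}^{2}}
    = \frac{1}{1 - (0.92106)^{2}}
    = \frac{1}{1 - 0.84835}
    \approx 6.594
```

This also explains why PI and PHCIA report the same VIF: in a two-regressor model both values depend on the same single correlation.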
To cure the multicollinearity, I drop one of the collinear variables, the proportion of health care expenditure (PHCIA). In the resulting regression, a one-unit increase in personal income is associated with a change of -0.00000608 in the proportion of food expenditure in personal consumption expenditure, and the coefficient on personal income is highly statistically significant (p < 0.0001). This result matches my expectation: according to the Engel curve, as personal income increases, the proportion of food expenditure in personal consumption expenditure decreases, because people spend a larger share of their income on other categories.
The Relationship PFIA with PI and PHCIA
The REG Procedure
Model: MODEL1
Dependent Variable: PFIA PFIA
Number of Observations Read  50 
Number of Observations Used  49 
Number of Observations with Missing Values  1 
Analysis of Variance
Source  DF  Sum of Squares  Mean Square  F Value  Pr > F
Model  1  0.03665  0.03665  196.06  <.0001
Error  47  0.00879  0.00018692
Corrected Total  48  0.04543
Root MSE  0.01367  R-Square  0.8066
Dependent Mean  0.10733  Adj R-Sq  0.8025
Coeff Var  12.73848
Parameter Estimates
Variable  Label  DF  Parameter Estimate  Standard Error  t Value  Pr > |t|
Intercept  Intercept  1  0.14473  0.00331  43.74  <.0001
PI  PI  1  -0.00000608  4.34228E-7  -14.00  <.0001
proc reg;
title 'The Relationship PFIA with PI';
model PFIA=PI;
run;
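As a consistency check between this regression and the earlier PROC CORR output: in a regression with a single explanatory variable, R-Square equals the squared Pearson correlation between the dependent and explanatory variables.

```latex
R^{2} = r_{PFIA,PI}^{2} = (0.89813)^{2} \approx 0.8066
```

This agrees with the R-Square of 0.8066 reported for the model PFIA = PI.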
Because this is a time-series project, I suspect autocorrelation and apply the Durbin-Watson (DW) test. With two explanatory variables and 49 observations, the 5% critical values are approximately d_{L} = 1.46 and d_{U} = 1.63 (tabulated values for n = 50). The Durbin-Watson statistic is d = 0.276, which is less than d_{L}, so I reject H_{0} of no positive autocorrelation and conclude that the errors are positively autocorrelated.
The Relationship PFIA with PI and PHCIA
The REG Procedure
Model: MODEL1
Dependent Variable: PFIA PFIA
Durbin-Watson D  0.276
Number of Observations  49 
1st Order Autocorrelation  0.848 
proc reg;
model PFIA=PI PHCIA/DW;
run;
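The two numbers in the DW output are linked by a standard large-sample approximation, which gives a quick sanity check on the test statistic. It is only an approximation, so the result is close to, not equal to, the reported d.

```latex
d \approx 2\,(1 - \hat{\rho}) = 2\,(1 - 0.848) = 0.304
```

A value of d near 2 would indicate no first-order autocorrelation. Both the approximation (0.304) and the reported statistic (0.276) are far below 2, which is consistent with strong positive autocorrelation.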
Fix the autocorrelation
Relationship PFIA with PI and PHCIA
The AUTOREG Procedure
Dependent Variable  PFIA
Ordinary Least Squares Estimates
SSE  0.00225755  DFE  46
MSE  0.0000491  Root MSE  0.00701
SBC  -338.54805  AIC  -344.22351
MAE  0.00572767  AICC  -343.69018
MAPE  6.22501397  HQC  -342.07025
Durbin-Watson  0.2765  Total R-Square  0.9503
Durbin-Watson Statistics
Order  DW  Pr < DW  Pr > DW
1  0.2765  <.0001  1.0000
NOTE: Pr<DW is the p-value for testing positive autocorrelation, and Pr>DW is the p-value for testing negative autocorrelation.
Parameter Estimates
Variable  DF  Estimate  Standard Error  t Value  Approx Pr > |t|  Variable Label
Intercept  1  0.2231  0.007003  31.86  <.0001
PI  1  -1.062E-8  5.7138E-7  -0.02  0.9852  PI
PHCIA  1  -0.9349  0.0811  -11.53  <.0001  PHCIA
Estimates of Autocorrelations
Lag  Covariance  Correlation  -1 9 8 7 6 5 4 3 2 1 0 1 2 3 4 5 6 7 8 9 1
0  0.000046  1.000000  ********************
1  0.000039  0.847528  *****************
Preliminary MSE  0.000013
Estimates of Autoregressive Parameters
Lag  Coefficient  Standard Error  t Value
1  -0.847528  0.079120  -10.71
Yule-Walker Estimates
SSE  0.00043337  DFE  45
MSE  9.63056E-6  Root MSE  0.00310
SBC  -414.26047  AIC  -421.82775
MAE  0.00248173  AICC  -420.91866
MAPE  2.40552271  HQC  -418.95673
Durbin-Watson  0.9490  Transformed Regression R-Square  0.7794
Total R-Square  0.9905
Durbin-Watson Statistics
Order  DW  Pr < DW  Pr > DW
1  0.9490  <.0001  1.0000
NOTE: Pr<DW is the p-value for testing positive autocorrelation, and Pr>DW is the p-value for testing negative autocorrelation.
Parameter Estimates
Variable  DF  Estimate  Standard Error  t Value  Approx Pr > |t|  Variable Label
Intercept  1  0.1867  0.009900  18.86  <.0001
PI  1  -2.519E-6  7.998E-7  -3.15  0.0029  PI
PHCIA  1  -0.4990  0.1085  -4.60  <.0001  PHCIA
proc autoreg;
title 'Relationship PFIA with PI and PHCIA';
MODEL PFIA=PI PHCIA/NLAG=1 DWPROB;
RUN;
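The AUTOREG step above corrects for first-order autocorrelation by modeling the regression error as an AR(1) process. A sketch of the model being fitted and of the resulting Yule-Walker estimates follows. The symbols \nu_t, \varepsilon_t, and \rho are notation introduced here for illustration, and the negative slope signs are taken as implied by the negative relationship between PFIA and both regressors.

```latex
PFIA_t = \beta_0 + \beta_1\,PI_t + \beta_2\,PHCIA_t + \nu_t,
\qquad
\nu_t = \rho\,\nu_{t-1} + \varepsilon_t,
\qquad
\hat{\rho} \approx 0.848

% Yule-Walker estimates from the AUTOREG output:
\widehat{PFIA}_t = 0.1867 - 2.519\times10^{-6}\,PI_t - 0.4990\,PHCIA_t
```

After the correction, the coefficient on PI becomes statistically significant (p = 0.0029) and the Durbin-Watson statistic rises from 0.2765 to 0.9490. Because Pr < DW is still below 0.0001, some positive autocorrelation appears to remain after the AR(1) correction.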
Conclusion:
I conclude that as personal income increases, the proportion of food expenditure in personal consumption expenditure decreases. This is consistent with the Engel curve: when personal income increases, people spend a larger share of their money on categories other than food, such as health care and sports activities. However, my project still has some limitations. For example, I should add more explanatory variables, because many factors affect the proportion of food expenditure in personal consumption expenditure. I should also add state-level data for comparison, which would help me analyze how the results differ across states and across families.