Time-Series Study of the Proportion of Food in Personal Consumption Expenditure
Econ 385
I want to model time-series (1967-2016) differences in the proportion of food expenditure in personal consumption expenditure based on personal income, consumer price inflation, and the proportion of health care expenditure in personal consumption expenditure. In the regressions reported below, the dependent variable is the proportion of food expenditure in personal consumption expenditure (PFIA), and the explanatory variables are personal income (PI) and the proportion of health care expenditure in personal consumption expenditure (PHCIA). I found the data for personal consumption expenditure on food and on health care, and calculated each as a proportion of total personal consumption expenditure.
I assume that the proportion of food expenditure in personal consumption expenditure will decrease as personal income increases, because when their income rises people spend more on other categories, such as health care, rather than on food. We can see this in scatter plot (1). The proportion of health care expenditure in personal consumption expenditure shows the same relationship and tendency, as in scatter plot (2).
(1)
(2)
The MEANS Procedure
Variable | Label | N | Mean | Std Dev | Minimum | Maximum |
PFIA | PFIA | 49 | 0.1073261 | 0.0307650 | 0.0733248 | 0.1627907 |
PI | PI | 49 | 6151.41 | 4544.49 | 665.7000000 | 15458.50 |
PHCIA | PHCIA | 49 | 0.1237509 | 0.0320319 | 0.0628695 | 0.1684346 |
proc means;
var PFIA PI PHCIA;
run;
From this MEANS Procedure, the mean of PFIA is 0.107, the mean of PI is 6151.41, and the mean of PHCIA is 0.124. The summary statistics alone do not show the direction of the relationship, but the scatter plots suggest that PFIA decreases as personal income and PHCIA increase.
The Relationship PFIA with PI and PHCIA
The REG Procedure
Model: MODEL1
Dependent Variable: PFIA PFIA
Number of Observations Read | 50 |
Number of Observations Used | 49 |
Number of Observations with Missing Values | 1 |
Analysis of Variance | |||||
Source | DF | Sum of Squares | Mean Square | F Value | Pr > F |
Model | 2 | 0.04317 | 0.02159 | 439.86 | <.0001 |
Error | 46 | 0.00226 | 0.00004908 | ||
Corrected Total | 48 | 0.04543 |
Root MSE | 0.00701 | R-Square | 0.9503 |
Dependent Mean | 0.10733 | Adj R-Sq | 0.9481 |
Coeff Var | 6.52731 |
Parameter Estimates | ||||||
Variable | Label | DF | Parameter Estimate | Standard Error | t Value | Pr > |t| |
Intercept | Intercept | 1 | 0.22309 | 0.00700 | 31.86 | <.0001 |
PI | PI | 1 | -1.06236E-8 | 5.713825E-7 | -0.02 | 0.9852 |
PHCIA | PHCIA | 1 | -0.93490 | 0.08106 | -11.53 | <.0001 |
proc reg;
title 'The Relationship PFIA with PI and PHCIA';
model PFIA=PI PHCIA;
run;
I used SAS to perform an Ordinary Least Squares regression on my data set. Below is the interpretation of my regression results. For this sample, a one-unit increase in personal income was associated with a change of -1.06236E-8 in the proportion of food expenditure in personal consumption expenditure, holding the proportion of health care expenditure in personal consumption expenditure constant. The coefficient on personal income is not statistically significant (p=0.9852). A 0.01 (one percentage point) increase in PHCIA was associated with a decrease of about 0.0093 in PFIA, holding personal income constant, and this coefficient is highly statistically significant (p<0.0001). The regression as a whole fits well (R2=0.9503, adjusted R2=0.9481) and is highly statistically significant (F=439.86, p<0.0001).
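The summary figures in the PROC REG output can be cross-checked by hand. The following is a minimal Python sketch, separate from the SAS analysis, that recomputes the overall F statistic, the adjusted R-square, and the t value for PHCIA from the rounded numbers printed in the output. The variable names are my own, and small differences from the SAS values are expected because the inputs are rounded.

```python
# Values copied from the PROC REG output (49 observations used, 2 regressors).
n, k = 49, 2
r_squared = 0.9503

# Overall F statistic implied by R-square: (R2 / k) / ((1 - R2) / (n - k - 1)).
f_stat = (r_squared / k) / ((1 - r_squared) / (n - k - 1))

# Adjusted R-square penalises R-square for the number of regressors.
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

# t value for PHCIA: parameter estimate divided by its standard error.
t_phcia = -0.93490 / 0.08106

print(round(f_stat, 1), round(adj_r_squared, 4), round(t_phcia, 2))
```

The recomputed F is close to the reported 439.86, and the adjusted R-square and t value match the output to the digits shown.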
First, I suspect that my regression has a multicollinearity problem, so I run PROC CORR to detect it. The correlation coefficient between the two explanatory variables, personal income and the proportion of health care expenditure in personal consumption expenditure, is 0.92106, which is bigger than 0.8. So I think that the regression has a multicollinearity problem.
The Relationship PFIA with PI and PHCIA
The CORR Procedure
3 Variables: | PFIA PI PHCIA |
Simple Statistics | |||||||
Variable | N | Mean | Std Dev | Sum | Minimum | Maximum | Label |
PFIA | 49 | 0.10733 | 0.03077 | 5.25898 | 0.07332 | 0.16279 | PFIA |
PI | 49 | 6151 | 4544 | 301419 | 665.70000 | 15459 | PI |
PHCIA | 49 | 0.12375 | 0.03203 | 6.06379 | 0.06287 | 0.16843 | PHCIA |
Pearson Correlation Coefficients, N = 49 Prob > |r| under H0: Rho=0 |
(each cell shows the correlation, with Prob > |r| in parentheses)
 | PFIA | PI | PHCIA |
PFIA | 1.00000 | -0.89813 (<.0001) | -0.97484 (<.0001) |
PI | -0.89813 (<.0001) | 1.00000 | 0.92106 (<.0001) |
PHCIA | -0.97484 (<.0001) | 0.92106 (<.0001) | 1.00000 |
proc corr;
var PFIA PI PHCIA;
run;
To confirm the multicollinearity, I also compute the variance inflation factor (VIF). The VIF of both explanatory variables is high, bigger than 5: the VIF of PI is 6.59455 and the VIF of PHCIA is 6.59455. I want to cure it.
The Relationship PFIA with PI and PHCIA
The REG Procedure
Model: MODEL1
Dependent Variable: PFIA PFIA
Number of Observations Read | 50 |
Number of Observations Used | 49 |
Number of Observations with Missing Values | 1 |
Analysis of Variance | |||||
Source | DF | Sum of Squares | Mean Square | F Value | Pr > F |
Model | 2 | 0.04317 | 0.02159 | 439.86 | <.0001 |
Error | 46 | 0.00226 | 0.00004908 | ||
Corrected Total | 48 | 0.04543 |
Root MSE | 0.00701 | R-Square | 0.9503 |
Dependent Mean | 0.10733 | Adj R-Sq | 0.9481 |
Coeff Var | 6.52731 |
Parameter Estimates | |||||||
Variable | Label | DF | Parameter Estimate | Standard Error | t Value | Pr > |t| | Variance Inflation |
Intercept | Intercept | 1 | 0.22309 | 0.00700 | 31.86 | <.0001 | 0 |
PI | PI | 1 | -1.06236E-8 | 5.713825E-7 | -0.02 | 0.9852 | 6.59455 |
PHCIA | PHCIA | 1 | -0.93490 | 0.08106 | -11.53 | <.0001 | 6.59455 |
proc reg;
model PFIA = PI PHCIA / VIF;
run;
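With exactly two explanatory variables, the VIF of each one is 1 / (1 - r^2), where r is the correlation between them, which is why PI and PHCIA share the same VIF. As a rough cross-check outside SAS, this short Python sketch recomputes it from the rounded correlation in the PROC CORR output (the variable names are my own).

```python
# Pearson correlation between PI and PHCIA, copied from the PROC CORR output.
r_pi_phcia = 0.92106

# With two regressors, the auxiliary R-square is simply r squared,
# so both variables have VIF = 1 / (1 - r^2).
vif = 1 / (1 - r_pi_phcia ** 2)

print(round(vif, 2))
```

The result agrees with the reported 6.59455 up to rounding of the correlation.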
To cure it, I drop one of the collinear variables, the proportion of health care expenditure. In the new regression, a one-unit increase in personal income was associated with a change of -0.00000608 in the proportion of food expenditure in personal consumption expenditure, and the coefficient on personal income was highly statistically significant (p<0.0001). The R-square falls from 0.9503 to 0.8066. The result matches my expectation: from the Engel curve, we know that as personal income increases, the proportion of food expenditure in personal consumption expenditure decreases, because people spend more on other categories.
The Relationship PFIA with PI and PHCIA
The REG Procedure
Model: MODEL1
Dependent Variable: PFIA PFIA
Number of Observations Read | 50 |
Number of Observations Used | 49 |
Number of Observations with Missing Values | 1 |
Analysis of Variance | |||||
Source | DF | Sum of Squares | Mean Square | F Value | Pr > F |
Model | 1 | 0.03665 | 0.03665 | 196.06 | <.0001 |
Error | 47 | 0.00879 | 0.00018692 | ||
Corrected Total | 48 | 0.04543 |
Root MSE | 0.01367 | R-Square | 0.8066 |
Dependent Mean | 0.10733 | Adj R-Sq | 0.8025 |
Coeff Var | 12.73848 |
Parameter Estimates | ||||||
Variable | Label | DF | Parameter Estimate | Standard Error | t Value | Pr > |t| |
Intercept | Intercept | 1 | 0.14473 | 0.00331 | 43.74 | <.0001 |
PI | PI | 1 | -0.00000608 | 4.34228E-7 | -14.00 | <.0001 |
proc reg;
title 'The Relationship PFIA with PI';
model PFIA=PI;
run;
Because this is a time-series project, I suspect autocorrelation in the errors, so I run a Durbin-Watson test. With two explanatory variables (PI and PHCIA) and 49 observations, the 5% critical values are approximately dL = 1.46 and dU = 1.63 (the tabulated values for n = 50). The Durbin-Watson d statistic is 0.276, which is less than dL, so I reject H0 of no positive autocorrelation and conclude that the errors have positive first-order autocorrelation.
The Relationship PFIA with PI and PHCIA
The REG Procedure
Model: MODEL1
Dependent Variable: PFIA PFIA
Durbin-Watson D | 0.276 |
Number of Observations | 49 |
1st Order Autocorrelation | 0.848 |
proc reg;
model PFIA=PI PHCIA/DW;
run;
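The Durbin-Watson decision rule can be written out explicitly. This is a minimal Python sketch, not part of the SAS run: the function name is my own, and the bounds 1.46 and 1.63 are assumed approximate 5% values for about 50 observations and two regressors, so the exact bounds should be looked up in a Durbin-Watson table. It also checks the usual approximation d ≈ 2(1 - rho), using the first-order autocorrelation of 0.848 from the output.

```python
def dw_decision(d, d_lower, d_upper):
    """Classify a Durbin-Watson d for the one-sided test of positive autocorrelation."""
    if d < d_lower:
        return "reject H0: positive autocorrelation"
    if d > d_upper:
        return "do not reject H0"
    return "inconclusive"

# First-order autocorrelation and d statistic copied from the PROC REG output.
rho_hat = 0.848
d_reported = 0.276

# Approximation d ~ 2 * (1 - rho); it will not match the reported d exactly.
d_approx = 2 * (1 - rho_hat)

# Assumed approximate 5% bounds (check a table for the exact n and k).
decision = dw_decision(d_reported, d_lower=1.46, d_upper=1.63)
print(round(d_approx, 3), decision)
```

Because d = 0.276 is far below any plausible lower bound, the conclusion does not depend on the exact critical values used.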
Fix the autocorrelation
Relationship PFTA with PI AND PHCIA
The AUTOREG Procedure
Dependent Variable | PFIA | PFIA |
Relationship PFTA with DSPIC CPIF and PHCIA
The AUTOREG Procedure
Ordinary Least Squares Estimates | |||
SSE | 0.00225755 | DFE | 46 |
MSE | 0.0000491 | Root MSE | 0.00701 |
SBC | -338.54805 | AIC | -344.22351 |
MAE | 0.00572767 | AICC | -343.69018 |
MAPE | 6.22501397 | HQC | -342.07025 |
Durbin-Watson | 0.2765 | Total R-Square | 0.9503 |
Durbin-Watson Statistics | |||
Order | DW | Pr < DW | Pr > DW |
1 | 0.2765 | <.0001 | 1.0000 |
NOTE: Pr<DW is the p-value for testing positive autocorrelation, and Pr>DW is the p-value for testing negative autocorrelation.
Parameter Estimates | ||||||
Variable | DF | Estimate | Standard Error | t Value | Approx Pr > |t| | Variable Label |
Intercept | 1 | 0.2231 | 0.007003 | 31.86 | <.0001 | |
PI | 1 | -1.062E-8 | 5.7138E-7 | -0.02 | 0.9852 | PI |
PHCIA | 1 | -0.9349 | 0.0811 | -11.53 | <.0001 | PHCIA |
Estimates of Autocorrelations | |||
Lag | Covariance | Correlation | -1 9 8 7 6 5 4 3 2 1 0 1 2 3 4 5 6 7 8 9 1 |
0 | 0.000046 | 1.000000 | | |********************| |
1 | 0.000039 | 0.847528 | | |***************** | |
Preliminary MSE | 0.000013 |
Estimates of Autoregressive Parameters | |||
Lag | Coefficient | Standard Error | t Value |
1 | -0.847528 | 0.079120 | -10.71 |
Yule-Walker Estimates | |||
SSE | 0.00043337 | DFE | 45 |
MSE | 9.63056E-6 | Root MSE | 0.00310 |
SBC | -414.26047 | AIC | -421.82775 |
MAE | 0.00248173 | AICC | -420.91866 |
MAPE | 2.40552271 | HQC | -418.95673 |
Durbin-Watson | 0.9490 | Transformed Regression R-Square | 0.7794 |
Total R-Square | 0.9905 |
Durbin-Watson Statistics | |||
Order | DW | Pr < DW | Pr > DW |
1 | 0.9490 | <.0001 | 1.0000 |
NOTE: Pr<DW is the p-value for testing positive autocorrelation, and Pr>DW is the p-value for testing negative autocorrelation.
Parameter Estimates | ||||||
Variable | DF | Estimate | Standard Error | t Value | Approx Pr > |t| | Variable Label |
Intercept | 1 | 0.1867 | 0.009900 | 18.86 | <.0001 | |
PI | 1 | -2.519E-6 | 7.998E-7 | -3.15 | 0.0029 | PI |
PHCIA | 1 | -0.4990 | 0.1085 | -4.60 | <.0001 | PHCIA |
proc autoreg;
title 'Relationship PFTA with DSPIC CPIF and PHCIA';
model PFIA = PI PHCIA / NLAG=1 DWPROB;
run;
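PROC AUTOREG reports the autoregressive parameter with the opposite sign convention, so the coefficient of -0.847528 corresponds to an estimated first-order autocorrelation of about 0.8475. The idea behind the correction can be illustrated with a simplified quasi-differencing step, in which each value has rho times the previous value subtracted from it. The Python sketch below only illustrates that idea: it is not the exact Yule-Walker procedure SAS uses, the function name is my own, and the toy series is hypothetical, not my data.

```python
def quasi_difference(series, rho):
    """Return y_t - rho * y_{t-1} for t = 2..n (the first observation is dropped)."""
    return [curr - rho * prev for prev, curr in zip(series, series[1:])]

# Estimated first-order autocorrelation (sign flipped from the AUTOREG coefficient).
rho_hat = 0.847528

# Hypothetical values standing in for a few years of PFIA; not the real series.
toy_pfia = [0.160, 0.158, 0.155, 0.151]

transformed = quasi_difference(toy_pfia, rho_hat)
print(transformed)
```

In a full correction the same transformation would be applied to every regressor and the intercept before re-estimating by OLS.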
Conclusion:
I estimate that as personal income increases, the proportion of food expenditure in personal consumption expenditure decreases. This is consistent with the Engel curve: when personal income increases, people spend more on other categories rather than on food, and may pay more attention to health care and sporting activities. After correcting for first-order autocorrelation, the coefficient on personal income is negative and statistically significant (-2.519E-6, p=0.0029). However, my project still has some limitations. The Durbin-Watson statistic after the correction (0.9490) still indicates positive autocorrelation. I should also add more explanatory variables, because many factors affect the proportion of food expenditure in personal consumption expenditure. In addition, I should add state-level data for comparison, which would help me analyze how the results differ across states and families.