Time-Series Study of the Proportion of Food in Personal Consumption Expenditure

Econ 385

This project models time-series differences (1967-2016) in the proportion of food expenditure in personal consumption expenditure, based on personal income, consumer price inflation, and the proportion of health care expenditure in personal consumption expenditure. The dependent variable is the proportion of food expenditure in personal consumption expenditure (PFIA). The explanatory variables in the regressions reported here are personal income (PI) and the proportion of health care expenditure in personal consumption expenditure (PHCIA). I collected the data series for personal consumption expenditure on food and on health care and divided each by total personal consumption expenditure to obtain the two proportions. I expect the proportion of food expenditure to decrease as personal income increases, because people spend more of their additional income on other categories, such as health care, than on food. Scatter plot (1) shows this pattern, and scatter plot (2) shows a similar relationship and trend for the proportion of health care expenditure.

(1)

(2)

The MEANS Procedure

Variable   Label   N    Mean        Std Dev     Minimum       Maximum
PFIA       PFIA    49   0.1073261   0.0307650   0.0733248     0.1627907
PI         PI      49   6151.41     4544.49     665.7000000   15458.50
PHCIA      PHCIA   49   0.1237509   0.0320319   0.0628695     0.1684346

proc means;

var PFIA PI PHCIA;

run;

From the MEANS procedure, the mean of PFIA is 0.107, the mean of PI is 6151.41, and the mean of PHCIA is 0.124. Over the sample, PFIA ranges from 0.073 to 0.163, while PHCIA ranges from 0.063 to 0.168. The scatter plots indicate that PFIA decreases as personal income and PHCIA increase.

The Relationship PFIA with PI and PHCIA

The REG Procedure

Model: MODEL1

Dependent Variable: PFIA PFIA

Number of Observations Read                    50
Number of Observations Used                    49
Number of Observations with Missing Values      1

Analysis of Variance

Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              2   0.04317          0.02159       439.86    <.0001
Error             46   0.00226          0.00004908
Corrected Total   48   0.04543

Root MSE         0.00701    R-Square   0.9503
Dependent Mean   0.10733    Adj R-Sq   0.9481
Coeff Var        6.52731

Parameter Estimates

Variable    Label       DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
Intercept   Intercept    1   0.22309              0.00700           31.86    <.0001
PI          PI           1   -1.06236E-8          5.713825E-7       -0.02    0.9852
PHCIA       PHCIA        1   -0.93490             0.08106          -11.53    <.0001

proc reg;

title 'The Relationship PFIA with PI and PHCIA';

model PFIA=PI PHCIA;

run;

I used SAS to perform an ordinary least squares (OLS) analysis of my data set. Below is the interpretation of my regression results. For this sample, a one-unit increase in personal income was associated with a change of -1.06236E-8 in the proportion of food expenditure in personal consumption expenditure, holding the proportion of health care expenditure in personal consumption expenditure constant. The coefficient on personal income is not statistically significant (p=0.9852). An increase of 0.01 (one percentage point) in the proportion of health care expenditure was associated with a decrease of about 0.0093 in the proportion of food expenditure, holding personal income constant, and this coefficient is highly statistically significant (p<0.0001). The regression as a whole fits well (R2=0.9503, adjusted R2=0.9481) and is highly statistically significant (F=439.86, p<0.0001).
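The summary statistics quoted above can be reproduced from the ANOVA and parameter tables as a quick arithmetic check. The following is a minimal Python sketch, not part of the SAS analysis; the function names are my own, and the inputs are the rounded values printed by PROC REG, so the results agree with the SAS output only up to rounding.

```python
# Values copied from the PROC REG output for PFIA = PI PHCIA (rounded as printed).
n, k = 49, 2                           # observations used, explanatory variables
ss_model, ss_total = 0.04317, 0.04543  # sums of squares
ms_model, ms_error = 0.02159, 0.00004908

def r_squared(ss_model, ss_total):
    """Share of total variation explained by the model."""
    return ss_model / ss_total

def adjusted_r_squared(r2, n, k):
    """R-square penalized for the number of explanatory variables."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

def f_statistic(ms_model, ms_error):
    """Overall F statistic: model mean square over error mean square."""
    return ms_model / ms_error

def t_statistic(estimate, std_error):
    """t value for a single coefficient."""
    return estimate / std_error

r2 = r_squared(ss_model, ss_total)
adj_r2 = adjusted_r_squared(r2, n, k)
f_value = f_statistic(ms_model, ms_error)
t_pi = t_statistic(-1.06236e-8, 5.713825e-7)
t_phcia = t_statistic(-0.93490, 0.08106)

print(round(r2, 4), round(adj_r2, 4), round(f_value, 2))
print(round(t_pi, 2), round(t_phcia, 2))
```

The very small t value for PI next to a large overall F value is the pattern that motivates the multicollinearity check in the next step.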

First, I suspect that my regression has a multicollinearity problem, so I ran PROC CORR to check for it. The correlation coefficient between the two explanatory variables, personal income and the proportion of health care expenditure in personal consumption expenditure, is 0.92106, which is greater than 0.8. I therefore conclude that the regression has a multicollinearity problem.

The Relationship PFIA with PI and PHCIA

The CORR Procedure

 3 Variables: PFIA PI PHCIA

Simple Statistics

Variable   N    Mean      Std Dev   Sum       Minimum     Maximum   Label
PFIA       49   0.10733   0.03077   5.25898   0.07332     0.16279   PFIA
PI         49   6151      4544      301419    665.70000   15459     PI
PHCIA      49   0.12375   0.03203   6.06379   0.06287     0.16843   PHCIA

Pearson Correlation Coefficients, N = 49
Prob > |r| under H0: Rho=0

           PFIA       PI         PHCIA
PFIA       1.00000    -0.89813   -0.97484
                      <.0001     <.0001
PI         -0.89813   1.00000    0.92106
           <.0001                <.0001
PHCIA      -0.97484   0.92106    1.00000
           <.0001     <.0001

proc corr;

var PFIA PI PHCIA;

run;

To confirm the multicollinearity, I also computed the variance inflation factor (VIF) for each explanatory variable. Both VIFs are greater than 5: the VIF is 6.59455 for PI and 6.59455 for PHCIA. I want to correct this problem.

The Relationship PFIA with PI and PHCIA

The REG Procedure

Model: MODEL1

Dependent Variable: PFIA PFIA

Number of Observations Read                    50
Number of Observations Used                    49
Number of Observations with Missing Values      1

Analysis of Variance

Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              2   0.04317          0.02159       439.86    <.0001
Error             46   0.00226          0.00004908
Corrected Total   48   0.04543

Root MSE         0.00701    R-Square   0.9503
Dependent Mean   0.10733    Adj R-Sq   0.9481
Coeff Var        6.52731

Parameter Estimates

Variable    Label       DF   Parameter Estimate   Standard Error   t Value   Pr > |t|   Variance Inflation
Intercept   Intercept    1   0.22309              0.00700           31.86    <.0001     0
PI          PI           1   -1.06236E-8          5.713825E-7       -0.02    0.9852     6.59455
PHCIA       PHCIA        1   -0.93490             0.08106          -11.53    <.0001     6.59455

proc reg;

model PFIA = PI PHCIA / VIF;

RUN;
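With exactly two explanatory variables, both share the same VIF, and it can be computed directly from their pairwise correlation as 1 / (1 - r^2). The Python sketch below is only an arithmetic check of the SAS output; the function name is my own, and the input is the rounded correlation from PROC CORR, so the result matches the reported VIF only approximately.

```python
def vif_two_regressors(r):
    """VIF shared by both regressors when the model has exactly two.

    r is the Pearson correlation between the two explanatory variables.
    """
    return 1.0 / (1.0 - r ** 2)

r_pi_phcia = 0.92106  # correlation between PI and PHCIA from PROC CORR
vif = vif_two_regressors(r_pi_phcia)
print(round(vif, 2))
```

This explains why PI and PHCIA report an identical VIF in the parameter table: each is regressed on the other, so the auxiliary R-square is the same squared correlation in both cases.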

To correct it, I dropped one of the collinear variables, the proportion of health care expenditure. In the new regression, a one-unit increase in personal income was associated with a change of -0.00000608 in the proportion of food expenditure in personal consumption expenditure, and the coefficient on personal income was highly statistically significant (p<0.0001). This result matches my expectation: according to the Engel curve, as personal income increases, the proportion of food expenditure in personal consumption expenditure decreases, because people spend more on other categories.

The Relationship PFIA with PI and PHCIA

The REG Procedure

Model: MODEL1

Dependent Variable: PFIA PFIA

Number of Observations Read                    50
Number of Observations Used                    49
Number of Observations with Missing Values      1

Analysis of Variance

Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              1   0.03665          0.03665       196.06    <.0001
Error             47   0.00879          0.00018692
Corrected Total   48   0.04543

Root MSE         0.01367    R-Square   0.8066
Dependent Mean   0.10733    Adj R-Sq   0.8025
Coeff Var        12.7385

Parameter Estimates

Variable    Label       DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
Intercept   Intercept    1   0.14473              0.00331           43.74    <.0001
PI          PI           1   -0.00000608          4.34228E-7       -14.00    <.0001

proc reg;

title 'The Relationship PFIA with PI';

model PFIA=PI;

run;

Because this is a time-series project, I suspected autocorrelation and ran the Durbin-Watson (DW) test. The model has two explanatory variables and 49 observations. At the 5% significance level, I used the critical values dL = 1.50 and dU = 1.59. The Durbin-Watson d statistic is 0.276, which is less than dL, so I reject H0 (no positive autocorrelation) and conclude that the errors are positively autocorrelated.

The Relationship PFIA with PI and PHCIA

The REG Procedure

Model: MODEL1

Dependent Variable: PFIA PFIA

Durbin-Watson D               0.276
Number of Observations        49
1st Order Autocorrelation     0.848

proc reg;

model PFIA=PI PHCIA/DW;

run;
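The decision rule used for the DW test can be written out explicitly. The Python sketch below is an illustration only, with function names of my own; it applies the critical values quoted in the text and also uses the common approximation d = 2(1 - rho), which gives a value close to, but not exactly equal to, the d statistic reported by SAS.

```python
def dw_positive_autocorrelation_test(d, d_lower, d_upper):
    """Durbin-Watson decision rule for H0: no positive autocorrelation."""
    if d < d_lower:
        return "reject H0: positive autocorrelation"
    if d > d_upper:
        return "do not reject H0"
    return "inconclusive"

def dw_from_rho(rho):
    """Approximate DW statistic implied by a first-order autocorrelation."""
    return 2.0 * (1.0 - rho)

# d and rho from the PROC REG output; dL and dU as quoted in the text.
decision = dw_positive_autocorrelation_test(0.276, 1.50, 1.59)
approx_d = dw_from_rho(0.848)
print(decision)
print(round(approx_d, 3))
```

Because d is far below dL, the conclusion does not depend on small differences in the critical values taken from the table.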

Fix the autocorrelation

Relationship PFTA with PI AND PHCIA

The AUTOREG Procedure

 Dependent Variable PFIA PFIA

Relationship PFTA with DSPIC CPIF and PHCIA

The AUTOREG Procedure

Ordinary Least Squares Estimates

SSE             0.00225755    DFE              46
MSE             0.0000491     Root MSE         0.00701
SBC             -338.54805    AIC              -344.22351
MAE             0.00572767    AICC             -343.69018
MAPE            6.22501397    HQC              -342.07025
Durbin-Watson   0.2765        Total R-Square   0.9503

Durbin-Watson Statistics

Order   DW       Pr < DW   Pr > DW
1       0.2765   <.0001    1.0000

NOTE: Pr<DW is the p-value for testing positive autocorrelation, and Pr>DW is the p-value for testing negative autocorrelation.

Parameter Estimates

Variable    DF   Estimate    Standard Error   t Value   Approx Pr > |t|   Variable Label
Intercept    1   0.2231      0.007003          31.86    <.0001
PI           1   -1.062E-8   5.7138E-7         -0.02    0.9852            PI
PHCIA        1   -0.9349     0.0811           -11.53    <.0001            PHCIA

Estimates of Autocorrelations

Lag   Covariance   Correlation
0     0.000046     1.000000      |********************|
1     0.000039     0.847528      |*****************   |

Preliminary MSE   1.3e-05

Estimates of Autoregressive Parameters

Lag   Coefficient   Standard Error   t Value
1     -0.847528     0.079120         -10.71

Relationship PFTA with DSPIC CPIF and PHCIA

The AUTOREG Procedure

Yule-Walker Estimates

SSE                               0.00043337    DFE              45
MSE                               9.63056E-6    Root MSE         0.00310
SBC                               -414.26047    AIC              -421.82775
MAE                               0.00248173    AICC             -420.91866
MAPE                              2.40552271    HQC              -418.95673
Durbin-Watson                     0.9490
Transformed Regression R-Square   0.7794        Total R-Square   0.9905

Durbin-Watson Statistics

Order   DW       Pr < DW   Pr > DW
1       0.9490   <.0001    1.0000

NOTE: Pr<DW is the p-value for testing positive autocorrelation, and Pr>DW is the p-value for testing negative autocorrelation.

Parameter Estimates

Variable    DF   Estimate    Standard Error   t Value   Approx Pr > |t|   Variable Label
Intercept    1   0.1867      0.009900          18.86    <.0001
PI           1   -2.519E-6   7.998E-7          -3.15    0.0029            PI
PHCIA        1   -0.4990     0.1085            -4.60    <.0001            PHCIA

proc autoreg;

title 'Relationship PFTA with DSPIC CPIF and PHCIA';

MODEL PFIA=PI PHCIA/NLAG=1 DWPROB;

RUN;
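The idea behind an AR(1) correction can be illustrated with a Prais-Winsten-style transformation, in which each variable is quasi-differenced using the estimated autocorrelation before the regression is re-estimated. The Python sketch below shows only this general idea and is not PROC AUTOREG's internal algorithm. The function name and the toy series are hypothetical, and the value of rho assumes that the reported autoregressive coefficient of -0.847528 corresponds to a first-order autocorrelation of 0.847528 under SAS's sign convention.

```python
import math

def ar1_transform(series, rho):
    """Prais-Winsten-style AR(1) transform of one data column.

    The first observation is scaled by sqrt(1 - rho^2) so it is kept;
    every later observation is quasi-differenced as x[t] - rho * x[t-1].
    """
    if not series:
        return []
    first = math.sqrt(1.0 - rho ** 2) * series[0]
    rest = [series[t] - rho * series[t - 1] for t in range(1, len(series))]
    return [first] + rest

rho_hat = 0.847528  # lag-1 autocorrelation of the OLS residuals (from the output)
toy_pfia = [0.16, 0.15, 0.14, 0.13]  # hypothetical values, not the project data
print(ar1_transform(toy_pfia, rho_hat))
```

Applying the same transform to the dependent variable, each explanatory variable, and the intercept column, and then running OLS on the transformed data, gives a feasible GLS estimate in the same spirit as the Yule-Walker estimates reported above.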

Conclusion:

I conclude that as personal income increases, the proportion of food expenditure in personal consumption expenditure decreases. According to the Engel curve, when personal income increases, people spend more on other categories than on food; for example, they may spend more on health care and sports activities. However, my project still has some limitations. For example, I should add more explanatory variables, because many factors affect the proportion of food expenditure in personal consumption expenditure. I should also add state-level data for comparison, which would help me analyze whether different states and different families show different results.
