Updates and Research from Steve Bronder

The purpose of this site is to give current information on my (Steve Bronder's) research and personal life.

Panel Analysis of Nonstationarity in Idiosyncratic and Common Components

The purpose of this package is to perform the Panel Analysis of Nonstationarity in Idiosyncratic and Common Components (PANIC) from Bai and Ng (2004, 2010). When working with large dimensional panels, standard pooled and aggregated nonstationarity tests tend to over-reject the null hypothesis due to:

1. The curse of dimensionality
2. Cross-correlation in the panel structure
3. Weak power for large N or large T

Instead of testing the data directly, PANIC estimates a factor model to derive the common and idiosyncratic components of the panel. By using the BIC3 from Bai and Ng (2002), it is possible to determine the number of common components in the panel, which reduces cross-correlation in the error term. In this vignette we will perform PANIC on three aggregate levels of the National Income and Product Accounts in order to test these aggregates for aggregation bias.

## Vignette Info

This vignette will use the functions panic10() and panic04(), available through library(PANICr). These functions estimate a factor model on each level of aggregation, derive the common and idiosyncratic components, and then compute several pooled test statistics. One of several benefits of PANIC is that reducing cross-correlation allows valid pooling of individual statistics, so panel tests can be run that have reasonable power. Performing the factor analysis using BIC3, the criterion for determining the number of factors in our approximate factor model, allows us to determine whether the nonstationarity is pervasive, variable specific, or both.

## Data

The data we use are gathered from the Price Indexes for Personal Consumption Expenditures by Type of Product, available from the BEA. The data are monthly from 1959 to 2013 [1]. At this point we run the data through X-13 [2]. After extracting each sector we divide them into three separate levels of aggregation, from the highest level of aggregation to the lowest. To turn this dataset into year-on-year inflation we compute $$\log(p_{t}/p_{t-12})$$. The data are available, already cleaned and transformed, as NIPAagg1, NIPAagg2, and NIPAagg3, respectively.
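As a minimal sketch of the year-on-year transformation, assuming a numeric vector of monthly price index values, the 12-month log difference can be computed in base R. The vector `p` below is made-up illustration data, not the BEA series.

```r
# Hypothetical monthly price index (illustration only, not BEA data):
# 36 months growing at a constant 0.2% per month.
p <- 100 * 1.002^(0:35)

# Year-on-year inflation: log(p_t / p_{t-12}).
# diff() on the log series with lag = 12 drops the first 12 observations.
yoy <- diff(log(p), lag = 12)

length(yoy)   # 24 observations remain
head(yoy, 3)
```

Because the transformation consumes the first 12 months, the usable sample is shorter than the raw monthly series.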

## Model

Consider a factor analytic model:

$$X_{it} = D_{it} + \lambda_{i}' F_{t} + e_{it}$$

where $$D_{it}$$ is a polynomial trend function, $$F_{t}$$ is an $$r\times{1}$$ vector of common factors, and $$\lambda_{i}$$ is a vector of factor loadings. The panel $$X_{it}$$ is the sum of a deterministic component $$D_{it}$$, a common component $$\lambda_{i}' F_{t}$$, and an error $$e_{it}$$ that is largely idiosyncratic. A factor model with $$N$$ variables has $$N$$ idiosyncratic components, but a smaller number of common factors. The form of $$D_{it}$$ is set by $$P$$. In PANIC 2004, when the number of common factors is greater than one, $$P=1$$ and the deterministic trend has an individual-specific fixed effect and a time trend. When the number of common factors is equal to one, $$P=0$$ and there is only an individual-specific fixed effect. When the number of common factors is zero, $$P=0$$ and there is neither.
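To make the decomposition concrete, here is a rough base R sketch on simulated data. It follows the general recipe of Bai and Ng (2004) for the intercept-only case as I understand it: difference the panel, extract principal components, then cumulate back to levels. It is not the PANICr implementation, and all names and simulated values are assumptions made for illustration.

```r
set.seed(1)
n_obs <- 200; n_series <- 20; r <- 1

# Simulated panel (illustration only): one random-walk common factor
# plus stationary idiosyncratic errors.
F_true <- cumsum(rnorm(n_obs))
lambda <- rnorm(n_series, mean = 1)
X <- outer(F_true, lambda) + matrix(rnorm(n_obs * n_series), n_obs, n_series)

# Step 1: first-difference each series so that principal components
# are applied to stationary data.
x <- diff(X)                                        # (n_obs - 1) x n_series

# Step 2: principal components of the differenced panel via SVD.
s <- svd(x)
f_hat <- sqrt(nrow(x)) * s$u[, 1:r, drop = FALSE]   # differenced factor(s)
lam_hat <- crossprod(x, f_hat) / nrow(x)            # n_series x r loadings

# Step 3: residuals in differences, then cumulate back to levels.
z_hat <- x - f_hat %*% t(lam_hat)
F_hat <- apply(f_hat, 2, cumsum)    # estimated common factor(s)
e_hat <- apply(z_hat, 2, cumsum)    # estimated idiosyncratic components

# The factor is identified only up to sign and scale, so compare it with
# the true factor through the absolute correlation.
abs(cor(F_hat[, 1], F_true[-1]))
```

Unit root tests can then be run separately on the columns of `F_hat` and `e_hat`, which is what lets the procedure distinguish pervasive from variable-specific nonstationarity.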

PANIC 2010 examines the data with ADF models A, B, and C. Model A assumes no deterministic component, B assumes a constant to allow for a fixed effect, and C allows a constant and a trend. Note that this is different from P: P describes the data generating process, while models A, B, and C impose these constraints inside the ADF test.
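The difference between the three specifications is easiest to see in a single-series ADF regression. The helper below is a hypothetical base R sketch (`adf_t` is not a PANICr function, and this is not how panic10() computes its pooled statistics); it only shows which deterministic terms enter under A, B, and C.

```r
# Hypothetical helper: ADF t-statistic on the lagged level, with the
# deterministic terms of model A (none), B (constant), or C (constant
# and linear trend).
adf_t <- function(y, lags = 1, model = c("A", "B", "C")) {
  model <- match.arg(model)
  dy <- diff(y)
  n <- length(dy)
  Z <- embed(dy, lags + 1)          # columns: dy_t, dy_{t-1}, ..., dy_{t-lags}
  dep <- Z[, 1]
  ylag <- y[(lags + 1):n]           # y_{t-1}, aligned with dep
  D <- cbind(ylag, Z[, -1, drop = FALSE])
  if (model %in% c("B", "C")) D <- cbind(D, const = 1)
  if (model == "C") D <- cbind(D, trend = seq_along(dep))
  fit <- lm(dep ~ D - 1)            # deterministic terms are already in D
  summary(fit)$coefficients[1, "t value"]
}

# Simulated stationary AR(1) series (illustration only).
set.seed(42)
y_stat <- as.numeric(arima.sim(list(ar = 0.5), n = 500))
res <- sapply(c("A", "B", "C"), function(m) adf_t(y_stat, lags = 4, model = m))
res
```

Each model has its own critical values, so the same statistic can lead to different conclusions depending on which deterministic terms are included.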

The benefit of this factor model is that, if the number of factors has been correctly determined, the error term will be largely idiosyncratic and the common components will explain the largest share of the variance of the data. To determine the approximate number of factors we use the BIC3 from Bai and Ng (2002) such that:

$$BIC3 = V(k,\hat{F}^k)+k\hat{\sigma}^2 \left(\frac{(N+T-k)\ln(NT)}{NT}\right)$$

$$V(k,\hat{F}^k)$$ is the average residual variance when $$k$$ factors are assumed for each cross-section unit. $$\hat{\sigma}^2$$ is the mean of the squared error term over $$N$$ and $$T$$.
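A small base R sketch may help show how such a criterion is evaluated. It assumes the Bai and Ng (2002) form of BIC3, in which $$k\hat{\sigma}^2$$ multiplies the log penalty, and it takes $$\hat{\sigma}^2$$ to be the residual variance at the largest number of factors considered. The function `bic3` and the simulated panel are hypothetical and are not part of PANICr.

```r
# Hypothetical helper: BIC3 for k = 1, ..., kmax on a T x N panel X.
bic3 <- function(X, kmax) {
  X <- scale(X)                     # assumption: standardize each series
  n_obs <- nrow(X); n_series <- ncol(X)
  NT <- n_obs * n_series
  s <- svd(X)
  # V(k): average squared residual after removing k principal components.
  V <- function(k) {
    idx <- seq_len(k)
    fit <- s$u[, idx, drop = FALSE] %*% (s$d[idx] * t(s$v[, idx, drop = FALSE]))
    sum((X - fit)^2) / NT
  }
  sigma2 <- V(kmax)                 # residual variance at the largest model
  sapply(seq_len(kmax), function(k) {
    V(k) + k * sigma2 * (n_series + n_obs - k) * log(NT) / NT
  })
}

# Simulated panel with two common factors (illustration only).
set.seed(123)
n_obs <- 300; n_series <- 40
Fac <- matrix(rnorm(n_obs * 2), n_obs, 2)
Lam <- matrix(rnorm(n_series * 2), n_series, 2)
X <- Fac %*% t(Lam) + matrix(rnorm(n_obs * n_series), n_obs, n_series)

crit <- bic3(X, kmax = 8)
which.min(crit)   # number of factors selected by the criterion
```

The penalty grows with $$k$$, so extra factors are kept only when they reduce the residual variance by more than the penalty they add.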

Once we have this model we perform ADF-style pooled tests on the idiosyncratic and common components. panic04() and panic10() ask for nfac, the number of estimated factors; k1, the maximum lag allowed in the ADF test; and jj, the criterion used to determine the number of factors in our approximate factor model. The procedure is sensitive to underestimating nfac, so it is better to overestimate the number of factors. To determine the lag of the ADF test, Bai and Ng (2004) suggest $$4(T/100)^{\frac{1}{4}}$$. jj is an integer from 1 through 8. Choices 1 through 7 are, respectively, IC(1), IC(2), IC(3), AIC(1), BIC(1), AIC(3), and BIC(3). Choosing 8 makes the number of factors equal to the number of columns whose sum of eigenvalues is less than or equal to 0.5. panic10() also has the option to run models on demeaned or non-demeaned data (TRUE or FALSE), which returns model C in the first case and models A and B in the second.
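As a quick worked example of the lag rule, using its fourth-root form $$4(T/100)^{\frac{1}{4}}$$ and the sample length T = 639 reported for these aggregates, the calculation in R is shown below. Rounding up is my assumption; it happens to reproduce the k1 = 7 passed to the functions in this vignette.

```r
# Sample length reported for the aggregates in this vignette.
T_obs <- 639

# Rule of thumb for the maximum ADF lag: 4 * (T / 100)^(1/4).
k1_raw <- 4 * (T_obs / 100)^(1 / 4)

# Assumption: round up to the next integer.
k1 <- ceiling(k1_raw)

k1_raw
k1
```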

With this information it is now appropriate to start running our tests.

library(PANICr)
library(MCMCpack)
## Loading required package: coda
## ##
## ## Markov Chain Monte Carlo Package (MCMCpack)
## ## Copyright (C) 2003-2014 Andrew D. Martin, Kevin M. Quinn, and Jong Hee Park
## ##
## ## Support provided by the U.S. National Science Foundation
## ## (Grants SES-0350646 and SES-0350613)
## ##
data(NIPAagg1)
data(NIPAagg2)
data(NIPAagg3)

agg1.04 <- panic04(NIPAagg1, 12, 7, 8)
## Test                        Values
## Pooled Demeaned             151.287625586434    18.3723862242089
## Pooled Idiosyncratic        131.157425663919    15.4668421381828
## Pooled Cointegration test   197.117010015667    24.9872880834621
##
## Common Test
## -5.719
## -4.345
agg2.04 <- panic04(NIPAagg2, 12, 7, 8)
## Common Test
## -6.282
## -5.915
## -4.684
##
## Test                        Values
## Pooled Demeaned             611.71868977349     38.8866151821883
## Pooled Idiosyncratic        522.753968510507    32.2555763707432
## Pooled Cointegration test   490.814016498057    29.8749129074792
agg3.04 <- panic04(NIPAagg3, 12, 7, 8)
## Test                        Values
## Pooled Demeaned             2115.1699605466     73.452641227304
## Pooled Idiosyncratic        1717.95911237424    57.3436141575353
## Pooled Cointegration test   2799.94347308076    101.223874316303
##
## Common Test
## -5.936
## -6.874
## -7.459
## -7.442
## -6.794
## -4.671
## -5.780
## -8.341
## -7.702
agg1.10.d <- panic10(NIPAagg1, 12, 7, 8, TRUE)
## Pool Test   P                     MP Test   Model C
## Pa          -7.65294028274482     ta        -0.44431132912738
## Pb          -4.06244302208989     tb        -0.300793409686763
##
## PMSB     rho1     04 Pool LM
## -2.146   0.9995   11.5
agg2.10.d <- panic10(NIPAagg2, 12, 7, 8, TRUE)
## Pool Test   P                     MP Test   Model C
## Pa          -23.428399842655      ta        -1.9833426704611
## Pb          -10.0827765202076     tb        -1.01402612949304
##
## PMSB     rho1     04 Pool LM
## -4.29    0.9987   27.72
agg3.10.d <- panic10(NIPAagg3, 12, 7, 8, TRUE)
## Pool Test   P                     MP Test   Model C
## Pa          -36.8676210042462     ta        -3.30706919021236
## Pb          -15.0224271151914     tb        -1.52334099307322
##
## PMSB     rho1     04 Pool LM
## -6.043   0.9985   52.06
agg1.10.nd <- panic10(NIPAagg1, 12, 7, 8, FALSE)
## MP   Model A             Model B
## ta   -18.7367009415592   0.518659001147791
## tb   -5.83521970527988   0.252226142119584
##
## -2.089   1   14.97
agg2.10.nd <- panic10(NIPAagg2, 12, 7, 8, FALSE)
## MP   Model A             Model B
## ta   -41.7213607853102   -0.602540832084791
## tb   -11.6746801440191   -0.212887822882118
##
## -3.754   0.9997   30.84
agg3.10.nd <- panic10(NIPAagg3, 12, 7, 8, FALSE)
## MP   Model A             Model B
## ta   -55.5196791341561   -2.16018368071093
## tb   -15.7322363951904   -0.670568755228224
##
## -5.148   0.9991   57.72

## Interpreting Results

For these tests, a statistic below the 5% critical value of -2.86 for the common unit root test $$ADF_{\hat{F}}^{c}$$ implies rejection. The idiosyncratic unit root test, $$P_{\hat{e}}^{c}$$, has a 5% critical value of 1.64. I use Bai and Ng’s (2002) third information criterion to determine the number of common factors. For critical values of the cointegration test see Bai and Ng (2004), as they vary with the number of common factors. Lag lengths are determined by the formula $$4(T/100)^{\frac{1}{4}}$$ as in Bai and Ng (2004). The dimensions of the aggregates are (N=12, T=639), (N=46, T=639), and (N=160, T=639), respectively.
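The decision rules can be checked directly against the panic04() output for the first aggregate. The sketch below hard-codes the reported values. It also assumes that the second number in each "Pooled" row is the Fisher-type standardization $$(P - 2N)/\sqrt{4N}$$ of the first, which is consistent with the reported figures when N = 12 but is my reading of the output, not something the package output states.

```r
# Values copied from the panic04() output for the first aggregate.
common_adf  <- c(-5.719, -4.345)     # common-factor ADF statistics
pooled_raw  <- 131.157425663919      # pooled idiosyncratic statistic
pooled_std  <- 15.4668421381828      # second reported value
n_series    <- 12                    # N for the first aggregate

# Assumed standardization of the pooled statistic: (P - 2N) / sqrt(4N).
recomputed <- (pooled_raw - 2 * n_series) / sqrt(4 * n_series)

reject_common <- common_adf < -2.86  # left-tailed: reject below -2.86
reject_idio   <- pooled_std > 1.64   # right-tailed: reject above 1.64

recomputed
reject_common
reject_idio
```

Both common factors and the pooled idiosyncratic component reject the unit root null for this aggregate under these rules.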

For PANIC (2010), use Tables 1 and 2 in Bai and Ng (2010) to find the critical values. These tests are applied only to the idiosyncratic component. Our PMSB test rejects at every level of aggregation (CV = 1.00), so we conclude that, accounting for a structural shift, there is no unit root in the idiosyncratic component. $$P_a$$ and $$P_b$$ are asymptotically standard normal, so we reject the null hypothesis for these tests as well. For the tests of models A, B, and C, we must reject with both test statistics or else conclude nonstationarity. For model C we reject the null hypothesis for aggregates 2 and 3, but conclude nonstationarity for aggregate 1. For model B, we do not reject the null hypothesis at any level of aggregation. For model A, we reject the null hypothesis at every level of aggregation.
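Since $$P_a$$ and $$P_b$$ are asymptotically standard normal, their rejection can be verified with a normal quantile. The sketch below uses the values reported by panic10() on the demeaned data and assumes a one-sided, left-tailed test at the 5% level.

```r
# Pa and Pb copied from the panic10(..., TRUE) output above.
pa <- c(agg1 = -7.65294028274482, agg2 = -23.428399842655,
        agg3 = -36.8676210042462)
pb <- c(agg1 = -4.06244302208989, agg2 = -10.0827765202076,
        agg3 = -15.0224271151914)

# Assumption: left-tailed test, so reject when the statistic falls
# below the 5% quantile of N(0, 1).
crit <- qnorm(0.05)

reject <- rbind(Pa = pa < crit, Pb = pb < crit)
crit
reject
```

The critical values for the model A, B, and C statistics are not standard normal quantiles and should be taken from the tables in Bai and Ng (2010).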

# MCMC version of PANIC (2004) for each level of aggregation
mcmcagg1.04 <- MCMCpanic04(NIPAagg1, 9, 7, 8, burn = 50000, mcmc = 100000, thin = 10,
                           verbose = 0, seed = NA, lambda.start = NA, psi.start = NA,
                           l0 = 0, L0 = 0, a0 = 0.001, b0 = 0.001, std.var = TRUE)
mcmcagg2.04 <- MCMCpanic04(NIPAagg2, 9, 7, 8, burn = 50000, mcmc = 100000, thin = 10,
                           verbose = 0, seed = NA, lambda.start = NA, psi.start = NA,
                           l0 = 0, L0 = 0, a0 = 0.001, b0 = 0.001, std.var = TRUE)
mcmcagg3.04 <- MCMCpanic04(NIPAagg3, 9, 7, 8, burn = 50000, mcmc = 100000, thin = 10,
                           verbose = 0, seed = NA, lambda.start = NA, psi.start = NA,
                           l0 = 0, L0 = 0, a0 = 0.001, b0 = 0.001, std.var = TRUE)

# Convert the sampled test statistics to coda mcmc objects for summary()
adf.mcmc1 <- as.mcmc(mcmcagg1.04$adf.mcmc)
adf.mcmc2 <- as.mcmc(mcmcagg2.04$adf.mcmc)

summary(adf.mcmc1)
##
## Iterations = 1:1e+05
## Thinning interval = 1
## Number of chains = 1
## Sample size per chain = 1e+05
##
## 1. Empirical mean and standard deviation for each variable,
##    plus standard error of the mean:
##
##          Mean       SD Naive SE Time-series SE
## adf50a 125.23  0.05799 1.83e-04       1.83e-04
## adf50b  14.61  0.00837 2.65e-05       2.65e-05
## adf30a  93.54 28.22350 8.93e-02       1.15e-01
## adf30b  10.04  4.07371 1.29e-02       1.66e-02
##         -2.88  0.73703 2.33e-03       9.06e-03
##         -2.87  0.72304 2.29e-03       6.69e-03
##
## 2. Quantiles for each variable:
##
##          2.5%    25%   50%    75% 97.5%
## adf50a 125.23 125.23 125.2 125.23 125.2
## adf50b  14.61  14.61  14.6  14.61  14.6
## adf30a  40.49  73.01  93.9 114.14 146.5
## adf30b   2.38   7.07  10.1  13.01  17.7
##         -4.56  -3.30  -2.8  -2.39  -1.6
##         -4.50  -3.30  -2.8  -2.39  -1.6
summary(adf.mcmc2)
##
## Iterations = 1:1e+05
## Thinning interval = 1
## Number of chains = 1
## Sample size per chain = 1e+05
##
## 1. Empirical mean and standard deviation for each variable,
##    plus standard error of the mean:
##
##          Mean      SD Naive SE Time-series SE
## adf50a 601.74  0.4854 0.001535       0.001535
## adf50b  38.14  0.0362 0.000114       0.000114
## adf30a 513.01 40.1592 0.126995       0.970909
## adf30b  31.53  2.9933 0.009466       0.072367
##         -3.94  0.9449 0.002988       0.034613
##         -3.57  0.8115 0.002566       0.026811
##         -3.92  0.9655 0.003053       0.053733
##
## 2. Quantiles for each variable:
##
##          2.5%    25%    50%    75%  97.5%
## adf50a 601.74 601.74 601.74 601.74 601.74
## adf50b  38.14  38.14  38.14  38.14  38.14
## adf30a 424.36 488.00 515.54 541.60 581.58
## adf30b  24.92  29.67  31.72  33.66  36.64
##         -6.08  -4.53  -3.79  -3.25  -2.48
##         -5.67  -3.92  -3.37  -3.03  -2.44
##         -6.16  -4.49  -3.68  -3.23  -2.57

1. T = 660

2. X-13 is a software program available from the U.S. Census Bureau that seasonally adjusts multiple time series using the X-13ARIMA-SEATS process.