Econometrics: HSE and NES Joint Bachelor's Programme (Совбак ВШЭ и РЭШ), 2021 final

Question 1

Part I: The trend toward later retirement — 20 points

The labor-force participation rate (LFPR) is the fraction of people who are employed or looking for work. From the mid-1990s through approximately 2009, the LFPR of Americans aged 55 and over, $LFPR55$, rose from roughly 30% to more than 40%. Since 2010 it has plateaued.

The four-quarter change is

$$\Delta_4 LFPR55_t = LFPR55_t - LFPR55_{t-4}.$$
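
As an illustration of this definition, the following minimal Python sketch computes the four-quarter change of a quarterly series and checks that it equals the sum of the four most recent one-quarter changes. The function names and the sample values are hypothetical and chosen only for illustration; they are not the exam's data.

```python
def four_quarter_change(series):
    """Return x_t - x_{t-4} for every t that has a value four quarters earlier."""
    return [series[t] - series[t - 4] for t in range(4, len(series))]


def one_quarter_change(series):
    """Return x_t - x_{t-1} for t = 1, ..., len(series) - 1."""
    return [series[t] - series[t - 1] for t in range(1, len(series))]


# Hypothetical quarterly LFPR values in percent (not the exam's data).
lfpr55 = [30.0, 30.1, 30.3, 30.2, 30.6, 30.9, 31.0, 31.4]

d4 = four_quarter_change(lfpr55)
d1 = one_quarter_change(lfpr55)

# d1[k] is the change ending in quarter k + 1, so the four one-quarter
# changes ending in quarter t are d1[t - 4:t]; their sum telescopes to d4.
rolling_sum = [sum(d1[t - 4:t]) for t in range(4, len(lfpr55))]

for a, b in zip(d4, rolling_sum):
    print(round(a, 6), round(b, 6))
```

The telescoping identity is why the mean of the four-quarter change in a long sample is close to four times the mean of the one-quarter change.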

This part examines whether the slowdown was caused by the 2008-2009 recession, using the unemployment rate, real disposable personal-income growth, and housing starts as predictors.

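Figure 1. Labor-force participation rate, ages 55+, quarterly (horizontal axis: time; vertical axis: LFPR, age 55+).

Figure 2. Four-quarter change in LFPR55 (horizontal axis: time; vertical axis: four-quarter change).

Figure 3. Unemployment rate (horizontal axis: time; vertical axis: unemployment rate).

Figure 4. Real disposable personal-income growth (horizontal axis: time; vertical axis: quarterly growth).

Figure 5. Housing starts (horizontal axis: time; vertical axis: housing starts).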

Table 1. Variable definitions and summary statistics

Unit: U.S. quarterly data, 1995Q1-2015Q3, $T=83$.

| Variable | Definition | Mean | Std. dev. |
|---|---|---|---|
| $LFPR55$ | LFPR for U.S. residents aged 55 and over | 36.2 | 3.8 |
| $\Delta LFPR55$ | $LFPR55_t-LFPR55_{t-1}$ | 0.12 | 0.22 |
| $\Delta_4 LFPR55$ | $LFPR55_t-LFPR55_{t-4}$ | 0.47 | 0.48 |
| Unemployment Rate | Unemployed as percentage of labor force | 6.0 | 1.7 |
| Real Income | After-tax personal income, billions of 2010 dollars | 9,989 | 1,616 |
| $\Delta\ln(RealIncome)$ | Quarterly log change | 0.0069 | 0.0099 |
| Housing Starts | New home construction started in the quarter, millions | 1.334 | 0.471 |
| $\Delta HousingStarts$ | First difference | -0.0038 | 0.084 |

Table 2. Forecasting regressions for $\Delta_4 LFPR55$

All regressions use quarterly data from 1995Q1 through 2007Q4. Standard errors are in parentheses. HR denotes heteroskedasticity-robust; HAC denotes heteroskedasticity- and autocorrelation-consistent.

| Regressor/statistic | (1) | (2) | (3) | (4) | (5) |
|---|---|---|---|---|---|
| Standard errors/tests | HR | HAC | HAC | HAC | HAC |
| $\Delta LFPR55_{t-4}$ | 0.138<br>(0.245) | 0.138<br>(0.211) | -0.059<br>(0.149) | 0.169<br>(0.244) | 0.024<br>(0.313) |
| $\Delta LFPR55_{t-5}$ | 0.380<br>(0.289) | 0.380<br>(0.234) | 0.216<br>(0.148) | 0.409<br>(0.245) | 0.212<br>(0.367) |
| $\Delta LFPR55_{t-6}$ | 0.430+<br>(0.255) | 0.430*<br>(0.166) | 0.269*<br>(0.135) | 0.458**<br>(0.154) | 0.270<br>(0.224) |
| Unemployment Rate$_{t-4}$ | | | 0.756**<br>(0.217) | | |
| Unemployment Rate$_{t-5}$ | | | -0.236<br>(0.332) | | |
| Unemployment Rate$_{t-6}$ | | | -0.675**<br>(0.215) | | |
| $\Delta\ln RealIncome_{t-4}$ | | | | -3.08<br>(3.16) | |
| $\Delta\ln RealIncome_{t-5}$ | | | | -0.205<br>(6.79) | |
| $\Delta\ln RealIncome_{t-6}$ | | | | 7.65<br>(9.64) | |
| $\Delta HousingStarts_{t-4}$ | | | | | 1.01**<br>(0.30) |
| $\Delta HousingStarts_{t-5}$ | | | | | -0.89<br>(0.61) |
| $\Delta HousingStarts_{t-6}$ | | | | | 0.51<br>(0.38) |
| Constant | 0.497**<br>(0.088) | 0.497**<br>(0.112) | 1.447**<br>(0.412) | 0.444<br>(0.291) | -0.460<br>(0.712) |
| BIC | -1.641 | -1.641 | -2.039 | -1.433 | -1.537 |
| Adjusted $R^2$ | 0.021 | 0.021 | 0.441 | 0.002 | 0.077 |
| RMSFE, 2008Q1-2015Q3 | 0.558 | 0.558 | 0.685 | 0.554 | 0.381 |
| Observations | 52 | 52 | 52 | 52 | 52 |
| $F$: three lagged LFPR changes | 1.51, $p=.224$ | 4.55, $p=.007$ | 3.33, $p=.028$ | 6.40, $p=.001$ | 2.33, $p=.087$ |
| $F$: three unemployment terms | | | 11.76, $p=.000$ | | |
| $F$: three income-growth terms | | | | 1.20, $p=.320$ | |
| $F$: three housing-start terms | | | | | 4.21, $p=.010$ |

+, *, and ** denote significance at the 10%, 5%, and 1% levels, respectively.
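
Table 2 reports a root mean squared forecast error (RMSFE) for 2008Q1-2015Q3, a period after the estimation sample ends. The sketch below shows one common way such a statistic is computed, assuming it is the square root of the average squared out-of-sample forecast error. The function name and the numbers are hypothetical placeholders, not the values behind Table 2.

```python
import math


def rmsfe(actual, forecast):
    """Square root of the mean squared forecast error over the evaluation period."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("actual and forecast must be non-empty and of equal length")
    squared_errors = [(a - f) ** 2 for a, f in zip(actual, forecast)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical actual and forecast four-quarter changes (illustration only).
actual = [0.5, 0.1, -0.2, 0.3]
forecast = [0.2, 0.5, 0.2, 0.3]

value = rmsfe(actual, forecast)
print(round(value, 4))
```

Because the evaluation quarters are not used in estimation, this measure can rank specifications differently from in-sample criteria such as BIC or adjusted $R^2$.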

Figure 6. Actual and forecast four-quarter change in LFPR55 (horizontal axis: time; vertical axis: four-quarter LFPR55 change).

1 — 4 points

Regressions (1) and (2) differ only in how standard errors are computed. Which method is preferred here, or does it not matter in theory? Explain.

2

You must forecast the change in LFPR55 from 2015Q3 to 2016Q3 using one of regressions (2)-(5).

  1. (4 points) Which specification would you choose, and why?
  2. (4 points) Give the standard error of the forecast. You need not compute the forecast itself.

3 — 8 points

Assess the hypothesis that the post-2010 plateau in LFPR55 was caused by the recession and slow recovery. Absent the recession, older workers might have continued postponing retirement; because of job losses and difficulty finding work, many may instead have retired earlier.

Using Figures 1-6 and Table 2, write one or two paragraphs evaluating whether the evidence supports, contradicts, or is insufficient to assess this hypothesis.

Question 2

Part II: Bank lending and small-business profitability — 21 points

The Bank of Russia wants to estimate the effect of bank lending on small-business profitability. It has data on one hundred businesses over five years, described in the source as 50,000 observations:

  • $P_{it}$: profitability;
  • $B_{it}$: total bank loans received;
  • $G_{it}$: total government loans received;
  • $R_{ji}$: region dummies, $j=1,\ldots,85$;
  • $L_{ki}$: line-of-business dummies, $k=1,\ldots,20$;
  • $U_t$: unemployment;
  • $D_i$: distance to the nearest bank branch;
  • $C_t$: number of business closures in the city;
  • $O_t$: number of business openings in the city.

Business and year dummies and a time trend can also be created. The baseline regression is

$$P_{it}=\beta_0+\beta_1B_{it}+\beta_2G_{it}+\sum_{j=2}^{85}\beta_{3,j}R_{ji}+\sum_{k=2}^{20}\beta_{4,k}L_{ki}+\beta_5t+\varepsilon_{it}.$$

For each concern, explain briefly how you would address it.

  1. (3 points) Only businesses surviving all five years are included.
  2. (3 points) Larger bank loans are usually granted to businesses more likely to be profitable.
  3. (3 points) A loan may have a positive long-run effect but initially be a burden because repayments begin before benefits are realized.
  4. (3 points) Government-loan recipients may use bank loans more effectively because the government-loan program includes business education.
  5. (3 points) Only businesses that received loans are in the data set.
  6. (3 points) Doubling loan size more than doubles the expected benefit.
  7. (3 points) Unobserved, time-invariant business characteristics, such as management quality, may affect both loan access and profitability.
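
Concern 7 describes a business-specific effect that does not change over time. One common device for that kind of problem is the within (fixed-effects) transformation, which subtracts each business's own time average from its observations. The sketch below is only an illustration with made-up numbers and hypothetical names (`demean_by_unit`, `alpha`); it shows that a business-specific constant drops out after demeaning, and it is not presented as the intended exam answer.

```python
from collections import defaultdict


def demean_by_unit(units, values):
    """Subtract each unit's own average from its observations."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for u, v in zip(units, values):
        sums[u] += v
        counts[u] += 1
    return [v - sums[u] / counts[u] for u, v in zip(units, values)]


# Two hypothetical businesses observed for three years each.
units = ["A", "A", "A", "B", "B", "B"]
loans = [1.0, 2.0, 3.0, 2.0, 4.0, 6.0]

# Hypothetical time-invariant effects and a true slope of 0.5, with no noise.
alpha = {"A": 5.0, "B": -3.0}
beta = 0.5
profit = [alpha[u] + beta * b for u, b in zip(units, loans)]

loans_w = demean_by_unit(units, loans)
profit_w = demean_by_unit(units, profit)

# OLS slope on the demeaned data (no intercept is needed after demeaning).
beta_hat = sum(x * y for x, y in zip(loans_w, profit_w)) / sum(x * x for x in loans_w)
print(round(beta_hat, 6))
```

In this noise-free example the within slope recovers the value of `beta` exactly, whatever values are placed in `alpha`.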

Question 3

Part III: True, false, or uncertain — 24 points

For each statement, answer true, false, or uncertain, with an explanation.

(a) (4 points) In a probit or logit model, marginal effects on the probability that the dependent variable equals one are functions of $X$. Therefore, it is difficult to give $\widehat\beta$ alone a useful interpretation.

(b) (4 points) You want to estimate the supply equation for snow boots

$$Q_i=\gamma+\delta P_i+\psi C_i+u_i,$$

where $Q_i$ is quantity sold, $P_i$ is price, and $C_i$ is transportation cost from a central production facility. Annual snowfall is proposed as an instrument for $P_i$. The fact that snowfall is correlated with $Q_i$ proves that the instrument is invalid.

(c) (4 points) In a panel model where regressors are correlated with individual fixed effects, the fixed effects should be omitted to avoid multicollinearity.

(d) (4 points) The same instrument can be used for two different endogenous regressors in one regression.

(e) (4 points) OLS with cross-sectional data is bad because heteroskedasticity biases both slope estimates and standard errors.

(f) (4 points) It is possible to construct standard errors robust to unknown heteroskedasticity, but not standard errors robust to both heteroskedasticity and autocorrelation.
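
Statement (a) refers to marginal effects in binary-response models. For a logit with a single regressor, $P(Y=1\mid X=x)=\Lambda(\beta_0+\beta_1x)$ with $\Lambda(z)=1/(1+e^{-z})$, and the derivative with respect to $x$ is $\beta_1\Lambda(z)\,(1-\Lambda(z))$. The sketch below evaluates that derivative at several values of $x$; the coefficient values are hypothetical and chosen only for illustration.

```python
import math


def logistic(z):
    """Logistic CDF, Lambda(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def logit_marginal_effect(beta0, beta1, x):
    """Derivative of P(Y = 1 | x) with respect to x in a one-regressor logit."""
    p = logistic(beta0 + beta1 * x)
    return beta1 * p * (1.0 - p)


# Hypothetical coefficients (illustration only).
beta0, beta1 = -1.0, 0.8

effects = {x: logit_marginal_effect(beta0, beta1, x) for x in (-2.0, 0.0, 1.25, 4.0)}
for x, effect in effects.items():
    print(x, round(effect, 4))
```

With these values the index is zero at $x=1.25$, where the marginal effect reaches its maximum of $\beta_1/4$.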

Question 4

Part IV: Miscellaneous problems — 35 points

1. Omitted variables and instruments — 15 points

Suppose

$$Y=\beta_0+\beta_1X_1+\beta_2X_2+\varepsilon,$$

and the main parameter of interest is $\beta_1$. Assume $X_1$ and $X_2$ are uncorrelated and, in fact,

$$\operatorname{plim}\frac{1}{n}X_1'X_2=0.$$

  1. (5 points) Without data on $X_2$, can you consistently estimate $\beta_1$? Explain.
  2. (5 points) Suppose $X_1$ is endogenous and $X_2$ remains unobserved. A proposed instrument $W_1$ is a noisy linear combination of $X_1$ and $X_2$, for example $W_1=X_1+0.1X_2+\eta$, $\eta\sim N(0,1)$. Discuss whether $W_1$ is a suitable instrument and the properties of the IV estimator.
  3. (5 points) If $X_2$ is observed, is there a way to obtain a consistent estimate of $\beta_1$? Give a specific procedure or explain why not.

2. Cigarette-demand IV estimates — 20 points

The observations are the 48 continental U.S. states. Let $q_i$ denote cigarette sales, $p_i$ a state price index, and $I_i$ average income.

Using log state sales tax as an instrument for log price gives

$$\widehat{\ln q_i}=9.43-1.14\ln p_i+0.21\ln I_i,$$

with standard errors

$$(0.37)\qquad(0.31)$$

under the coefficients on $\ln p_i$ and $\ln I_i$.

  1. (2 points) Construct a 95% confidence interval for the price elasticity.
  2. (2 points) Test the significance of the income coefficient.
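
The arithmetic behind an interval and a significance test of this kind can be sketched as follows. This is a minimal sketch that assumes the large-sample normal critical value 1.96 is appropriate with 48 observations; the function names are hypothetical, and the inputs are the coefficients and standard errors reported for the first specification.

```python
def confidence_interval(estimate, std_error, critical_value=1.96):
    """Two-sided interval: estimate +/- critical_value * std_error."""
    half_width = critical_value * std_error
    return estimate - half_width, estimate + half_width


def t_statistic(estimate, std_error, null_value=0.0):
    """t-ratio for the null hypothesis that the coefficient equals null_value."""
    return (estimate - null_value) / std_error


# Price elasticity: coefficient -1.14 with standard error 0.37.
price_ci = confidence_interval(-1.14, 0.37)

# Income coefficient: 0.21 with standard error 0.31.
income_t = t_statistic(0.21, 0.31)

print(tuple(round(b, 4) for b in price_ci))  # approximately (-1.8652, -0.4148)
print(round(income_t, 3))                    # approximately 0.677
```

Under the same assumption, the t-ratio is compared with 1.96 in absolute value for a two-sided test at the 5% level.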

Adding log state cigarette tax as a second instrument gives

$$\widehat{\ln q_i}=9.89-1.28\ln p_i+0.28\ln I_i,$$

with standard errors

$$(0.25)\qquad(0.25).$$

  1. (1 point) Construct a 95% confidence interval for the price elasticity.
  2. (5 points) Compare the $\ln p_i$ coefficients across the two specifications to test instrument validity. You may use the fact that the variance of their difference can be estimated as the difference of their estimated variances, with the second specification having the lower variance.
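
The comparison described in the last item can be sketched numerically. The code below follows the hint literally: it divides the difference between the two price coefficients by the square root of the difference of their estimated variances. The function name is hypothetical, and treating the resulting ratio as approximately standard normal is an assumption made for this illustration.

```python
import math


def coefficient_difference_statistic(b_less_precise, se_less_precise,
                                     b_more_precise, se_more_precise):
    """Difference of two estimates over the square root of the variance difference."""
    var_diff = se_less_precise ** 2 - se_more_precise ** 2
    if var_diff <= 0:
        raise ValueError("the variance difference must be positive")
    return (b_less_precise - b_more_precise) / math.sqrt(var_diff)


# One instrument:  price coefficient -1.14, standard error 0.37.
# Two instruments: price coefficient -1.28, standard error 0.25.
stat = coefficient_difference_statistic(-1.14, 0.37, -1.28, 0.25)
print(round(stat, 3))  # approximately 0.513
```

Here the variance difference is $0.37^2-0.25^2=0.0744$, so the standard error of the difference is about 0.273.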

Using a second year of data, first differences, and the differences of the same two instruments gives

$$\widehat{\ln(q_i/q_{i1})}=0.37-1.20\ln(p_i/p_{i1})+0.46\ln(I_i/I_{i1}),$$

with standard errors

$$(0.25)\qquad(0.25).$$

  1. (5 points) Write a model in which this differenced IV estimator is consistent but the previous level estimators are not.
  2. (5 points) Regressing $\ln(p_i/p_{i1})$ on a constant, $\ln(I_i/I_{i1})$, and the two instruments gives an $F$ statistic of 88.6, with 48 observations. Is it significant? Which of the two key IV conditions does this test confirm?