Econometrics: HSE and NES Joint Bachelor's Programme, 2022 final

Question 1

Part I: Health insurance and medical-care use — 32 points

The fraction of U.S. adults with medical insurance rose substantially after implementation of the Affordable Care Act. This part studies whether insurance causes greater use of medical services by exploiting Medicare eligibility at age 65.

The data are from the 2006 U.S. Health and Retirement Survey and are restricted to working, nonretired people aged 48-75.

[Figure: Health-insurance coverage by age. Axes: age, health-insurance coverage.]

[Figure: Average number of doctor visits by age. Axes: age, doctor visits.]

Table 1. Variable definitions

| Variable | Definition | Mean | Std. dev. |
|---|---|---|---|
| HI | 1 if the individual has any health insurance, 0 otherwise | 0.87 | 0.34 |
| Dr visits | Number of doctor visits during the year | 7.31 | 12.81 |
| hospital stays | Number of hospital stays during the year | 0.22 | 0.67 |
| prescription drugs | Number of prescription drugs currently taken | 0.68 | 0.47 |
| Age | Age in years | 59.62 | 6.71 |
| Age65+ | 1 if age is at least 65, 0 otherwise | 0.24 | 0.67 |
| assets | Total assets, dollars | 319,435 | 1,603,809 |
| income | Total income, dollars | 90,251 | 123,143 |
| married | 1 if married and spouse is alive | 0.77 | 0.42 |
| female | 1 if female | 0.52 | 0.50 |
| Hispanic | 1 if Hispanic | 0.09 | 0.29 |
| black | 1 if Black | 0.14 | 0.35 |

$N = 7{,}036$.

Table 2. Medical utilization and health insurance

Standard errors are in parentheses. All standard errors and $F$ tests are heteroskedasticity-robust. Regressions (4)-(6) use TSLS with Age65+ as the instrument for HI.

| Regressor/statistic | (1) OLS: Dr visits | (2) OLS: HI | (3) OLS: Dr visits | (4) TSLS: Dr visits | (5) TSLS: Dr visits | (6) TSLS: Dr visits |
|---|---|---|---|---|---|---|
| HI | 2.01**<br>(0.69) | | | 9.62**<br>(2.26) | 0.03<br>(3.70) | 0.59<br>(3.57) |
| Age65+ | | 0.153**<br>(0.006) | 1.47**<br>(0.34) | | | |
| Age | 1.06<br>(5.18) | | | | -0.03<br>(5.47) | 0.72<br>(5.43) |
| $Age^2$ | -0.02<br>(0.08) | | | | -0.003<br>(0.090) | -0.015<br>(0.088) |
| $Age^3$ | 0.00013<br>(0.00046) | | | | 0.00004<br>(0.00048) | 0.00011<br>(0.00048) |
| Other controls: income, assets, female, Black, Hispanic, married | Yes | No | No | No | No | Yes |
| Constant | -15.7<br>(104.3) | 0.834**<br>(0.005) | 6.96**<br>(0.18) | -1.06<br>(1.98) | 9.97<br>(111.2) | -7.54<br>(110.7) |
| $R^2$ | 0.010 | 0.038 | 0.002 | | | |
| $F$: Age, $Age^2$, $Age^3$ | 5.99, $p=.001$ | | | | 2.66, $p=.047$ | 2.71, $p=.044$ |
| $F$: other controls | 6.44, $p=.000$ | | | | | 6.30, $p=.000$ |

1

Using regression (1):

  1. (2 points) Interpret the coefficient on HI.
  2. (2 points) Construct a 95% confidence interval.
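As a rough illustration of the interval arithmetic in part 2, the sketch below applies the usual large-sample formula, estimate ± 1.96 × standard error, to the HI coefficient and robust standard error reported for regression (1). The helper name `normal_ci` is hypothetical, and the normal approximation is an assumption that seems reasonable with $N = 7{,}036$.

```python
def normal_ci(estimate, std_error, z=1.96):
    """Return (lower, upper) for estimate +/- z * std_error (large-sample normal CI)."""
    half_width = z * std_error
    return estimate - half_width, estimate + half_width


# Regression (1), Table 2: coefficient on HI and its robust standard error.
lower, upper = normal_ci(2.01, 0.69)
print(f"95% CI for the HI coefficient: ({lower:.2f}, {upper:.2f})")
```

The same helper works for any coefficient in Table 2 by swapping in its estimate and standard error.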

2

Using regression (3):

  1. (2 points) Interpret the intercept.
  2. (2 points) Interpret the coefficient on Age65+.

3 — 3 points

In regression (1), the individual $t$ statistics for Age, $Age^2$, and $Age^3$ are below one, yet their joint $F$ statistic is 5.99 and rejects at 1%. How is this possible?

4 — 3 points

People expecting high medical costs may be more likely to buy insurance. Explain why this produces simultaneous-causality bias in the coefficient on HI in regression (1).

5 — 3 points

The coefficient on HI in regression (4) is the ratio of the Age65+ coefficients in regressions (3) and (2):

$$\frac{1.47}{0.153}=9.62.$$

Explain why, intuitively or mathematically.
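One way to see the ratio property numerically is the simulation sketch below. It uses made-up data (not the HRS sample): the coefficients 0.5, 2.0, and 3.0, the confounder `v`, and the helper `ols_slope` are all hypothetical. With a single instrument, the second-stage slope on the first-stage fitted values should coincide with the reduced-form slope divided by the first-stage slope.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5_000

# Hypothetical data-generating process, chosen only for illustration.
z = (rng.uniform(48, 75, size=n) >= 65).astype(float)  # instrument: an Age65+ style dummy
v = rng.normal(size=n)                                  # unobserved confounder
x = 0.5 * z + v + rng.normal(size=n)                    # endogenous regressor
y = 2.0 * x + 3.0 * v + rng.normal(size=n)              # outcome


def ols_slope(dep, reg):
    """Slope from an OLS regression of dep on a constant and reg."""
    reg_c = reg - reg.mean()
    return float(reg_c @ (dep - dep.mean()) / (reg_c @ reg_c))


first_stage = ols_slope(x, z)    # analogue of regression (2)
reduced_form = ols_slope(y, z)   # analogue of regression (3)

# Second stage: regress y on the first-stage fitted values of x.
x_fitted = x.mean() + first_stage * (z - z.mean())
tsls = ols_slope(y, x_fitted)    # analogue of regression (4)

wald_ratio = reduced_form / first_stage
print(f"TSLS slope: {tsls:.4f}, reduced form / first stage: {wald_ratio:.4f}")
```

The two printed numbers agree up to floating-point error. Running the second stage by hand reproduces only the point estimate; its reported standard errors would not be valid TSLS standard errors.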

6. IV regression (4)

  1. (2 points) What is a weak instrument?
  2. (3 points) Is Age65+ weak or strong here? Explain.
  3. (2 points) What does instrument exogeneity mean?
  4. (3 points) Is Age65+ plausibly exogenous in regression (4)? Use judgment.
  5. (5 points) Regressions (1), (4), (5), and (6) estimate the effect of insurance on doctor visits. Which has the greatest internal validity? Explain.

Question 2

Part II: Performance-linked pay — 18 points

Table IV from Lemieux, MacLeod, and Parent (Quarterly Journal of Economics, 2009) reports regressions of log wages on a performance-pay dummy and other variables using worker panel data. The regressions also contain industry, occupation, and year dummies; county unemployment; marital status; race dummies; and union status. Standard errors are in parentheses.

Experience and tenure enter as quadratics. “Potential experience, effect at 20 years” and “Tenure, effect at ten years” report marginal effects evaluated at those values. Interaction rows report corresponding differences for performance-pay jobs.

Table IV. Skills-related wage differentials and performance-pay jobs

| Variable | PP jobs OLS (1) | Non-PP jobs OLS (2) | Sample OLS (3) | Sample FE (4) | All jobs OLS (5) | All jobs FE (6) |
|---|---|---|---|---|---|---|
| Performance-pay job dummy | | | -0.4526<br>(0.1019) | -0.2061<br>(0.0723) | -0.2406<br>(0.1251) | 0.1414<br>(0.0998) |
| Years of education | 0.0929<br>(0.0071) | 0.0665<br>(0.0039) | 0.0637<br>(0.0040) | 0.0167<br>(0.0091) | 0.0584<br>(0.0047) | 0.0040<br>(0.0096) |
| Education $\times$ performance-pay job | | | 0.0365<br>(0.0071) | 0.0169<br>(0.0048) | 0.0217<br>(0.0092) | -0.0079<br>(0.0071) |
| Education $\times$ 1990-1993 | | | | | 0.0161<br>(0.0085) | 0.0222<br>(0.0066) |
| Education $\times$ performance pay $\times$ 1990-1993 | | | | | 0.0190<br>(0.0137) | 0.0280<br>(0.0089) |
| Potential experience, effect at 20 years | 0.4259<br>(0.0635) | 0.2882<br>(0.0288) | 0.3010<br>(0.0294) | 0.4545<br>(0.1258) | 0.3002<br>(0.0294) | 0.4231<br>(0.1256) |
| Experience $\times$ performance-pay job | | | 0.1162<br>(0.0584) | 0.0149<br>(0.0501) | 0.1018<br>(0.0581) | -0.0278<br>(0.0509) |
| Tenure, effect at ten years | 0.1670<br>(0.0268) | 0.2197<br>(0.0154) | 0.2262<br>(0.0154) | 0.1158<br>(0.0128) | 0.2271<br>(0.0154) | 0.1191<br>(0.0129) |
| Tenure $\times$ performance-pay job | | | -0.0666<br>(0.0301) | 0.0278<br>(0.0237) | -0.0677<br>(0.0303) | 0.0196<br>(0.0239) |
| Observations | 9,680 | 16,466 | 26,146 | 26,146 | 26,146 | 26,146 |

Models (3)-(6) include interactions between performance pay and education, potential experience, and tenure. Models (5)-(6) also include the full set of interactions for 1980-1984, 1985-1989, and 1990-1993, but only the 1990-1993 estimates are reported. FE is the fixed-effect method at the individual-job level.

1 — 3 points

Using column (3), is the return to education higher in performance-pay or non-performance-pay jobs? What is the difference, and is it statistically significant?

2 — 3 points

Using column (3), what is the return to having a performance-pay job for a worker with 16 years of education, 20 years of experience, and 10 years of tenure?
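The arithmetic behind this question can be sketched as follows. The sketch assumes that the column (3) estimates are the first entry in each interaction row of Table IV, and that the experience and tenure interaction rows are already evaluated at 20 and 10 years respectively, as the table description states. The variable names are hypothetical.

```python
# Column (3) estimates from Table IV (assumed reading of the table, see lead-in).
pp_dummy = -0.4526      # performance-pay job dummy
educ_x_pp = 0.0365      # education x performance-pay job, per year of education
exper_x_pp = 0.1162     # experience x performance-pay job, evaluated at 20 years
tenure_x_pp = -0.0666   # tenure x performance-pay job, evaluated at 10 years

years_of_education = 16

# Log-wage differential between a performance-pay and a non-performance-pay job
# for this worker: dummy plus each interaction at the worker's characteristics.
pp_return = pp_dummy + educ_x_pp * years_of_education + exper_x_pp + tenure_x_pp
print(f"Estimated log-wage differential: {pp_return:.3f}")
```

Because the dependent variable is the log wage, the result is in log points, which is approximately a proportional wage difference when the number is small.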

3 — 3 points

Column (4) includes worker fixed effects. The education coefficient falls from 0.0637 in column (3) to 0.0167 in column (4). Is this a large economic change? Explain.

4 — 5 points

Give a concrete explanation for the difference between 0.0637 and 0.0167.

5 — 4 points

Choose among homoskedasticity-only, heteroskedasticity-robust, and individual-job clustered standard errors. Which is most appropriate, and why?

Question 3

Part III: Miscellaneous questions — 50 points

1. Measurement error — 8 points

The true model is

$$Y^*=\beta_0+\beta_1X^*+\varepsilon,$$

but only

$$Y=Y^*+\eta, \qquad X=X^*+\xi$$

are observed. The variables $\varepsilon$, $\eta$, and $\xi$ have zero means and are mutually independent and independent of $X^*$.

  1. (4 points) Show that the OLS slope from regressing $Y$ on $X$ is inconsistent.
  2. (4 points) Briefly describe one way to obtain a consistent estimator of $\beta_1$.
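A small simulation can make the inconsistency in part 1 concrete. The sketch below uses hypothetical parameter values (true slope 2, standard deviations 1.0, 0.5, and 0.7) and normal draws, none of which come from the exam. Under the stated independence assumptions, the OLS slope should settle near $\beta_1\,\sigma^2_{X^*}/(\sigma^2_{X^*}+\sigma^2_\xi)$ rather than $\beta_1$, while the measurement error in $Y$ adds noise without shifting the slope.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200_000

# Hypothetical parameter values, chosen only for illustration.
beta0, beta1 = 1.0, 2.0
sd_xstar, sd_xi, sd_eta = 1.0, 0.5, 0.7

x_star = rng.normal(scale=sd_xstar, size=n)            # true regressor X*
y_star = beta0 + beta1 * x_star + rng.normal(size=n)   # true outcome Y*
y = y_star + rng.normal(scale=sd_eta, size=n)          # observed Y = Y* + eta
x = x_star + rng.normal(scale=sd_xi, size=n)           # observed X = X* + xi


def ols_slope(dep, reg):
    """Slope from an OLS regression of dep on a constant and reg."""
    reg_c = reg - reg.mean()
    return float(reg_c @ (dep - dep.mean()) / (reg_c @ reg_c))


slope = ols_slope(y, x)
attenuation = sd_xstar**2 / (sd_xstar**2 + sd_xi**2)   # variance ratio, 0.8 here
predicted_limit = beta1 * attenuation

print(f"true slope: {beta1}, OLS slope: {slope:.3f}, predicted limit: {predicted_limit:.3f}")
```

Raising `n` does not move the estimate toward the true slope, which is the sense in which the estimator is inconsistent rather than merely noisy.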

2. Linear probability model — 13 points

Consider

$$E(y_i\mid x_i)=\beta_0+\beta_1x_i,$$

where $y_i=1$ if student $i$ receives an A in econometrics and 0 otherwise, and $x_i$ is hours spent studying.

  1. (3 points) In $y_i=\beta_0+\beta_1x_i+\varepsilon_i$, find $E(\varepsilon_i\mid x_i)$ and $\operatorname{Var}(\varepsilon_i\mid x_i)$.
  2. (2 points) Is OLS consistent for $\beta_1$?
  3. (4 points) What is the correct way to estimate standard errors of $\widehat\beta_1$?
  4. (4 points) Suggest a method that may produce narrower confidence intervals than OLS.
  4. (4 points) Suggest a method that may produce narrower confidence intervals than OLS.
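A possible starting point for parts 1 and 3, sketched under the assumption that the model is correctly specified, so that $P(y_i=1\mid x_i)=\beta_0+\beta_1x_i$ lies strictly between 0 and 1: because $y_i$ is a Bernoulli variable given $x_i$, its conditional mean and variance pin down the moments of the error term.

$$
\begin{aligned}
E(\varepsilon_i\mid x_i) &= E(y_i\mid x_i)-(\beta_0+\beta_1x_i)=0,\\
\operatorname{Var}(\varepsilon_i\mid x_i) &= \operatorname{Var}(y_i\mid x_i)=(\beta_0+\beta_1x_i)\,(1-\beta_0-\beta_1x_i).
\end{aligned}
$$

The conditional variance changes with $x_i$, so the errors are heteroskedastic by construction.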

3. Expectations hypothesis for Treasury bills — 14 points

Let $hy6_t$ be the three-month holding yield, in percent, from buying a six-month Treasury bill at $t-1$ and selling it at $t$ as a three-month bill. Let $hy3_{t-1}$ be the known three-month holding yield from buying a three-month bill at $t-1$.

The expectations hypothesis says

$$E(hy6_t\mid I_{t-1})=hy3_{t-1}.$$

This suggests

$$hy6_t=\beta_0+\beta_1hy3_{t-1}+u_t$$

and the null $H_0:\beta_1=1$.

Using 123 observations,

$$\widehat{hy6_t}=-0.058+1.104\,hy3_{t-1},$$

with heteroskedasticity-robust standard error 0.039 for the slope.

  1. (3 points) Test $H_0:\beta_1=1$ against a two-sided alternative at the 1% level. Is the estimate significantly different from 1?
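The mechanics of this test can be sketched with the standard library alone. The sketch assumes the large-sample normal approximation is adequate with 123 observations; the variable names are hypothetical, and the inputs are the slope estimate and robust standard error reported above.

```python
from statistics import NormalDist

beta_hat = 1.104      # estimated slope on hy3_{t-1}
std_error = 0.039     # heteroskedasticity-robust standard error
null_value = 1.0      # value of beta_1 under H0
alpha = 0.01          # two-sided significance level

# t statistic for H0: beta_1 = 1 (note the null value is 1, not 0).
t_stat = (beta_hat - null_value) / std_error

# Two-sided critical value and p-value from the standard normal distribution.
critical_value = NormalDist().inv_cdf(1 - alpha / 2)
p_value = 2 * (1 - NormalDist().cdf(abs(t_stat)))
reject = abs(t_stat) > critical_value

print(f"t = {t_stat:.3f}, critical value = {critical_value:.3f}, "
      f"p-value = {p_value:.4f}, reject H0: {reject}")
```

The key design point is that the statistic is centred at the hypothesized value 1, so the default regression-output $t$ statistic (which tests a zero coefficient) is not the relevant one here.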

Another implication is that no variable dated $t-1$ or earlier should help once $hy3_{t-1}$ is controlled for. Let $d_{t-1}$ be the price difference between six- and three-month bills. The expanded regression is

$$\widehat{hy6_t}=-0.123+1.053\,hy3_{t-1}+0.480\,d_{t-1},$$

with robust standard errors 0.039 and 0.109 for the two slopes.

  2. (3 points) Is the coefficient on $hy3_{t-1}$ significantly different from 1? Is the coefficient on $d_{t-1}$ significant?
  3. (3 points) Explain how to test jointly that the coefficient on $hy3_{t-1}$ is 1 and the coefficient on $d_{t-1}$ is zero.
  4. (5 points) A regression of $hy3_t$ on $hy3_{t-1}$ gives a coefficient of 0.914. Why may this be concerning? How would you modify the analysis?

4. Dynamic causal effects — 15 points

  1. (1 point) What specification is appropriate for estimating the dynamic causal effect of $X$ on $Y$?
  2. (2 points) What does it mean for $X$ to be exogenous? Strictly exogenous?
  3. For each example, assess whether exogeneity and strict exogeneity are plausible:
    • (4 points) $Y$: U.S. inflation; $X$: percentage change in world oil prices set by OPEC.
    • (4 points) $Y$: GDP growth; $X$: Federal Funds rate.
    • (4 points) $Y$: change in inflation; $X$: unemployment rate, as in a Phillips curve.