Econometrics: ICEF, 2025 midterm 1


Question 1

Multiple-choice test

Which of the following is a difference between least absolute deviations, LAD, and ordinary least squares, OLS?

  1. OLS and LAD are equally sensitive to outlying observations.

  2. OLS is more computationally intensive than LAD.

  3. OLS is more sensitive to outlying observations than LAD.

  4. OLS is justified for very large samples, while LAD is justified for smaller samples.

  5. OLS estimates the conditional median of the dependent variable, while LAD estimates the conditional mean.

Question 2

Multiple-choice test

Which of the following regression models is nonlinear in parameters?

  1. $y=\dfrac{1}{\beta_0+\beta_1x}+u$.

  2. $y=\beta_0+\beta_1x^{1/2}+u$.

  3. $\log y=\beta_0+\beta_1\log x+u$.

  4. $\log y=\beta_0+\beta_1x+u$.

  5. All regressions 1)-4) are nonlinear in parameters.

Question 3

Multiple-choice test

A linear transformation of the explanatory variable $X$ in the model

$$Y_i=\beta_1+\beta_2X_i+u_i$$

does not generally change:

  1. The estimate of the intercept.

  2. The estimate of the slope coefficient.

  3. The standard error of the slope coefficient estimate.

  4. The determination coefficient of the regression.

  5. The standard error of the intercept estimate.

Question 4

Multiple-choice test

A student investigates factors affecting GDP growth for 200 countries in 2023. GDP is first measured in constant 2021 US dollars at purchasing-power parity. He then measures GDP growth using national currencies at constant prices. All regressors remain the same, and the GDP growth rate remains the dependent variable. What can be said about the two regressions?

  1. All coefficients are the same.

  2. Only the intercept changes; all slope coefficients remain the same.

  3. All coefficients are generally different.

  4. Coefficients may differ, but $R^2$ is the same.

  5. Coefficients may differ, but $SSR$ is the same.

Question 5

Multiple-choice test

A student estimates the production function

$$y=\gamma_1+\alpha k+\beta l+u, \tag{1}$$

where $y$ is the output growth rate, $k$ is the capital growth rate, and $l$ is the labour growth rate. She then estimates

$$y-k=\gamma_2+\mu(l-k)+u. \tag{2}$$

Which statement is correct?

  1. Model (1) is a restricted version of (2).

  2. Model (2) is a restricted version of (1).

  3. Both statements 1) and 2) are incorrect.

  4. Models (1) and (2) are equivalent.

  5. There is perfect multicollinearity in model (2).

Question 6

Multiple-choice test

Suppose the following model is estimated by OLS and the Gauss-Markov conditions are satisfied:

$$y=\beta_0+\beta_1x_1+\beta_2x_2+(1+\beta_3-\beta_2)x_3+u.$$

Then:

  1. You can obtain an unbiased estimate of $\beta_3$.

  2. You cannot obtain an unbiased estimate of $\beta_3$, but can obtain a consistent estimate.

  3. You cannot obtain either an unbiased or a biased-but-consistent estimate of $\beta_3$.

  4. You cannot obtain any estimate of $\beta_3$ because of perfect multicollinearity.

  5. All the above statements are incorrect.

Question 7

Multiple-choice test

In the regression model

$$y=\alpha+\beta x+u,$$

where $u$ satisfies the Gauss-Markov conditions and is normally distributed, and $x$ contains random measurement errors that are independent, normally distributed, homoscedastic, not autocorrelated, and have zero expected values. Suppose $\beta>0$ and the mean of $x$ is negative. For large samples:

  1. The estimate of $\alpha$ is biased upwards.

  2. The estimate of $\alpha$ is biased downwards.

  3. The estimate of $\alpha$ is unbiased.

  4. The estimate of $\alpha$ may be biased upwards or downwards.

  5. The estimate of $\beta$ is biased upwards.

Question 8

Multiple-choice test

If OLS is used in a simple regression model with heteroscedasticity, the population variance of the slope estimator is

$$\operatorname{var}(b_2)=\frac{\sum_{i=1}^{n}x_i^2\sigma_i^2}{\left(\sum_{i=1}^{n}x_i^2\right)^2}. \tag{1}$$

Under homoscedasticity,

$$\operatorname{var}(b_2)=\frac{\sigma^2}{\sum_{i=1}^{n}x_i^2}. \tag{2}$$

Let $\sigma_i^2=\sigma^2k_i$, where $k_i$ are unknown nonnegative weights and $\sum k_i=n$. Then:

  1. Expression (1) is always greater than (2).

  2. Expression (1) is always less than (2).

  3. Expression (1) is greater than or equal to (2).

  4. Expression (1) is less than or equal to (2).

  5. Expression (1) can be greater than, less than, or equal to (2), depending on the relationship between $\sigma_i$ and $x_i$.
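
The two variance expressions in this question can be compared numerically. The sketch below is only an illustration under stated assumptions: $x_i$ is taken to be measured in deviations from the sample mean, $\sigma^2=1$, and the data and weight vectors (`x`, `k_up`, `k_down`, `k_flat`) are invented for the example and are not part of the exam.

```python
def var_hetero(x, k, sigma2=1.0):
    """Expression (1) with sigma_i^2 = sigma2 * k_i."""
    sxx = sum(xi * xi for xi in x)
    weighted = sum(xi * xi * sigma2 * ki for xi, ki in zip(x, k))
    return weighted / sxx ** 2


def var_homo(x, sigma2=1.0):
    """Expression (2): sigma2 / sum(x_i^2)."""
    return sigma2 / sum(xi * xi for xi in x)


# Hypothetical regressor values, already centred (they sum to zero).
x = [-2.0, -1.0, 0.0, 1.0, 2.0]

# Three hypothetical weight vectors, each summing to n = 5.
k_up = [2.0, 0.5, 0.0, 0.5, 2.0]    # variance is large where x_i^2 is large
k_down = [0.0, 1.5, 2.0, 1.5, 0.0]  # variance is large where x_i^2 is small
k_flat = [1.0, 1.0, 1.0, 1.0, 1.0]  # homoscedastic case

for name, k in [("k_up", k_up), ("k_down", k_down), ("k_flat", k_flat)]:
    print(name, var_hetero(x, k), var_homo(x))
```

Changing how the weights line up with $x_i^2$ changes the ordering of the two expressions, which is the relationship the question asks about.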

Question 9

Multiple-choice test

Which of the following is a difference between the White test and the Breusch-Pagan test?

  1. The White test detects heteroscedasticity of unknown form, while the Breusch-Pagan test detects heteroscedasticity of a specified form.

  2. The Breusch-Pagan test detects heteroscedasticity of unknown form, while the White test detects heteroscedasticity of a specified form.

  3. The number of regressors used in the White test is larger than the number used in the Breusch-Pagan test.

  4. The number of regressors used in the Breusch-Pagan test is larger than the number used in the White test.

  5. None of the above.

Question 10

Multiple-choice test

The Durbin-Wu-Hausman test can be used to detect:

I. Measurement-error bias.

II. Simultaneous-equations bias.

III. Endogeneity of explanatory variables.

  1. I, II, and III.

  2. II and III only.

  3. I and III only.

  4. I and II only.

  5. II only.

Question 11

Multiple-choice test

The economic model is

$$y_1=\alpha+\tau y_2+\pi x_1+\phi x_2+\varepsilon x_3+u_1, \tag{1}$$

$$y_2=\beta+\mu y_1+\gamma x_1+u_2. \tag{2}$$

Here $y_1$ and $y_2$ are endogenous variables; $x_1$, $x_2$, and $x_3$ are exogenous variables; and $u_1$, $u_2$ satisfy the Gauss-Markov conditions. Indicate the correct statement:

  1. You may apply TSLS to (1), but not to (2).

  2. You may apply TSLS to (2), but not to (1).

  3. You may apply TSLS to both (1) and (2).

  4. You may not apply TSLS to either (1) or (2).

  5. TSLS is not needed because OLS provides consistent estimates in both equations.

Question 12

Multiple-choice test

For a simultaneous equations model with 9 equations, 9 endogenous variables, and 8 exogenous variables, which statement is true for any equation?

  1. An equation is likely to be underidentified if 10 variables are missing from it.

  2. An equation is likely to be exactly identified if 9 variables are missing from it.

  3. An equation is likely to be overidentified if 9 variables are missing from it.

  4. An equation is likely to be exactly identified if 10 variables are missing from it.

  5. An equation is likely to be underidentified if 11 variables are missing from it.

Part 2. Free Response Questions — 2 hours.

Section A. Answer all questions from this section (original Questions 1-2).

Question 13

Written Question 1 — 25 marks

An ICEF student preparing her diploma examines the dependence of painting prices $P_i$, in thousands of dollars, on the painting's age $AGE_i$, measured in decades from the current year to the year in which the work was created, and canvas size $S_i$, measured in square feet. Data on 19 paintings sold at one auction are used to estimate:

$$\widehat P_i=2.35+0.028AGE_i+0.037S_i,\qquad R^2=0.325,\qquad SSR=0.417,$$

with standard errors

$$(0.57)\qquad(0.011)\qquad(0.020). \tag{1}$$

(a) (13 marks)

  • Interpret the coefficients of equation (1).
  • Evaluate the significance of the individual coefficients and of equation (1) as a whole.
  • $S_i$ is insignificant, and the student considers excluding it. What are the likely consequences, given that the correlation between $AGE$ and $S$ is $-0.93$?

The supervisor recommends adding provenance information. Let $PV_i=1$ if the painting has documented provenance and $PV_i=0$ otherwise. The student estimates

$$\widehat P_i=0.87+0.062AGE_i+0.082S_i+0.36PV_i,\qquad R^2=0.554,\qquad SSR=0.276,$$

with standard errors

$$(0.71)\qquad(0.016)\qquad(0.024)\qquad(0.13). \tag{2}$$

  • Why are the intercepts different in equations (1) and (2)?
  • What does the coefficient on $PV_i$ mean? Is it significant?
  • What assumptions about the coefficients on $AGE_i$ and $S_i$ are implicit in equation (2)?

(b) (12 marks) The supervisor also recommends adding $PV_iAGE_i$ and $PV_iS_i$:

$$\widehat P_i=-0.097+0.085AGE_i+0.11S_i+2.69PV_i-0.07PV_iAGE_i-0.08PV_iS_i,$$

$$R^2=0.691,\qquad SSR=0.191,$$

with standard errors

$$(0.75)\qquad(0.017)\qquad(0.025)\qquad(3.55)\qquad(0.05)\qquad(0.14). \tag{3}$$

  • What is the meaning of this recommendation? How do the marginal effects differ for paintings with and without provenance?
  • Is provenance statistically significant in equation (3)?
  • The alternative approach is a Chow test. Suppose equation (1), estimated separately for paintings without provenance, gives $SSR=0.1908$, while for paintings with provenance it gives $SSR=0.000122$. Explain the Chow test and compare its result with the previous test.
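
The test statistics this question calls for can be computed directly from the reported figures. The sketch below assumes the standard textbook formulas for a t ratio, the overall F statistic based on $R^2$, and an F statistic for linear restrictions based on restricted and unrestricted $SSR$; the function names are hypothetical. Critical values are not computed and would have to be read from t and F tables with the matching degrees of freedom.

```python
def t_stat(coef, se):
    """t ratio for H0: coefficient = 0."""
    return coef / se


def f_overall(r2, n, k):
    """Overall F statistic for k slope coefficients and n observations."""
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))


def f_restrictions(ssr_restricted, ssr_unrestricted, q, df_unrestricted):
    """F statistic for q restrictions, given the two residual sums of squares."""
    numerator = (ssr_restricted - ssr_unrestricted) / q
    return numerator / (ssr_unrestricted / df_unrestricted)


n = 19

# Equation (1): individual and overall significance.
t_age = t_stat(0.028, 0.011)
t_size = t_stat(0.037, 0.020)
f_eq1 = f_overall(0.325, n, 2)

# Equation (2): the provenance dummy.
t_pv = t_stat(0.36, 0.13)

# Equation (3) against equation (1): PV and both interactions jointly,
# that is 3 restrictions with n - 6 residual degrees of freedom.
f_pv_joint = f_restrictions(0.417, 0.191, 3, n - 6)

# Chow test: pooled SSR from (1) against the sum of the two subsample SSRs.
f_chow = f_restrictions(0.417, 0.1908 + 0.000122, 3, n - 6)

print(t_age, t_size, f_eq1, t_pv, f_pv_joint, f_chow)
```

The joint F statistic and the Chow statistic are built from nearly the same unrestricted $SSR$, so the two values can be compared side by side.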

Question 14

Written Question 2 — 25 marks

A student studies selling prices of 400 houses in the Moscow region, measured in thousands of roubles. The data include property size in square metres, number of bedrooms, and a dummy for air conditioning.

The estimated equation is

$$\log(price_i)=9.894+0.300\log(size_i)+0.078bedrooms_i+0.212airco_i,\qquad n=400. \tag{1}$$

Conventional standard errors are

$$(0.232)\qquad(0.028)\qquad(0.015)\qquad(0.024),$$

and heteroscedasticity-robust standard errors are

$$[0.233]\qquad[0.028]\qquad[0.018]\qquad[0.023].$$

(a) (13 marks)

  • What is the economic meaning of the coefficient on $\log(size)$?
  • Interpret the coefficient on $bedrooms$.
  • Interpret the coefficient on $airco$.
  • What is heteroscedasticity? Why is it likely in this sample? What are its consequences for econometric estimation?
  • What are heteroscedasticity-consistent standard errors and what are they used for? Comment on the differences between the two sets of standard errors. Are the coefficient estimates significant?

(b) (12 marks) The student observes that the variance of $\log(price)$ increases with $\log(size)$ and assumes

$$\sigma_{u_i}=k\log(size_i),$$

in addition to the Gauss-Markov assumptions and normality.

  • To test for heteroscedasticity, she orders the 400 observations by size. Estimating equation (1) on the 150 observations with the smallest $\log(size)$ gives $SSR_1=6.214$. Estimating it on the 100 observations with the largest $\log(size)$ gives $SSR_2=7.781$. The unequal sample sizes are used because there are fewer high-priced properties. Carry out the appropriate test, state the hypotheses, calculate the statistic, and draw a conclusion. What are the consequences of violating the equal-subsample-size rule?
  • A friend recommends White's test. For the auxiliary equation with cross terms, the student obtains $R^2=0.0433$. Complete the test, including its distribution, degrees of freedom, critical value, and conclusion.
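
The arithmetic for both tests in part (b) can be sketched as follows. This assumes the ordered-subsample test is the Goldfeld-Quandt test, with each $SSR$ divided by its own residual degrees of freedom because the subsamples differ in size, and that White's statistic is $nR^2$ from the auxiliary regression. The function names are hypothetical, and critical values are left to the F and chi-squared tables.

```python
def goldfeld_quandt_f(ssr_small, n_small, ssr_large, n_large, k):
    """Ratio of residual variance estimates from the two ordered subsamples.

    k is the number of estimated parameters, including the intercept.
    Under H0 the ratio follows F(n_large - k, n_small - k).
    """
    var_large = ssr_large / (n_large - k)
    var_small = ssr_small / (n_small - k)
    return var_large / var_small


def white_statistic(n, r2_aux):
    """n * R^2 from the auxiliary regression of squared residuals."""
    return n * r2_aux


# Equation (1) has four parameters: intercept, log(size), bedrooms, airco.
gq = goldfeld_quandt_f(ssr_small=6.214, n_small=150,
                       ssr_large=7.781, n_large=100, k=4)
white = white_statistic(n=400, r2_aux=0.0433)

print(gq, white)
```

The chi-squared degrees of freedom for White's test equal the number of non-constant regressors kept in the auxiliary equation; that count depends on which squares and cross products are retained, since the square of a dummy variable duplicates the dummy itself.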

Section B. Answer one question from this section (original Question 3 or Question 4).

Question 15

Written Question 3 — 25 marks

Consider two regression models without an intercept:

$$Y_i=\beta_1Z_i+\beta_2X_i+u_i, \tag{1}$$

$$Y_i=\beta_2X_i+u_i, \tag{2}$$

where

$$E(u_i)=0,\qquad E(u_i^2)=\sigma^2,\qquad E(u_iu_j)=0\ \text{for }i\ne j,$$

and $Z$ and $X$ are non-stochastic.

(a) (10 marks) Let (1) be the true model, but estimate (2) by OLS:

$$\widehat Y=\widehat\beta_2X.$$

  • Show that

$$\widehat\beta_2^*=\frac{\sum_iX_iY_i}{\sum_iX_i^2}$$

is biased.

  • Which factors determine the direction of the bias?
  • Under what conditions is there no bias?
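
The bias in part (a) can be checked numerically without simulation. Because $X$ and $Z$ are non-stochastic and the estimator is linear in $Y$, its expectation is obtained by applying it to $E(Y_i)=\beta_1Z_i+\beta_2X_i$. The sketch below compares that expectation with the usual omitted-variable expression $\beta_1\sum_iX_iZ_i/\sum_iX_i^2$, which is assumed here rather than derived; all data and parameter values are invented for illustration.

```python
def slope_no_intercept(x, y):
    """OLS slope in a regression through the origin."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)


def expected_short_slope(x, z, beta1, beta2):
    """E(beta2*) when model (1) is true but model (2) is estimated."""
    expected_y = [beta1 * zi + beta2 * xi for xi, zi in zip(x, z)]
    return slope_no_intercept(x, expected_y)


def bias_formula(x, z, beta1):
    """Assumed closed form of the bias: beta1 * sum(X*Z) / sum(X^2)."""
    sxz = sum(xi * zi for xi, zi in zip(x, z))
    return beta1 * sxz / sum(xi * xi for xi in x)


# Hypothetical non-stochastic regressors and true parameters.
x = [1.0, 2.0, 3.0, 4.0]
z = [2.0, 1.0, 0.0, 1.0]
z_orthogonal = [4.0, -2.0, 0.0, 0.0]  # chosen so that sum(X*Z) = 0
beta1, beta2 = 0.5, 2.0

bias = expected_short_slope(x, z, beta1, beta2) - beta2
bias_orthogonal = expected_short_slope(x, z_orthogonal, beta1, beta2) - beta2

print(bias, bias_formula(x, z, beta1), bias_orthogonal)
```

Flipping the sign of `beta1` or of the cross product of `x` and `z` flips the sign of the computed bias, and the orthogonal case gives zero.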

(b) (8 marks) Consider

$$Y_i=\beta_1+\beta_2X_i+u_i \tag{3}$$

and

$$Y_i=\beta_2X_i+u_i, \tag{4}$$

with the same disturbance assumptions, non-stochastic $X$, and $i=1,\ldots,n$.

  • Let (3) be true and estimate (4) by OLS. Rewrite (3) as $Y_i=\beta_1Z_i+\beta_2X_i+u_i$, where $Z_i=1$. Using part (a), find the omitted-variable bias. How does $\overline X$ affect its direction, and together with which other factors?
  • Now let (4) be true and estimate (3) by OLS. Briefly describe the consequences for the properties of $\widehat\beta_2$.

(c) (7 marks) Obtain the expression for

$$E\left(\widehat\beta_2^*-\widehat\beta_2\right)$$

by directly comparing

$$\widehat\beta_2^*=\frac{\sum_iX_iY_i}{\sum_iX_i^2}$$

from regression (4) with

$$\widehat\beta_2=\frac{\sum_i(X_i-\overline X)(Y_i-\overline Y)}{\sum_i(X_i-\overline X)^2}$$

from regression (3).

Question 16

Written Question 4 — 25 marks

Consider the following simultaneous-equations model of equilibrium wheat consumption:

$$wheat^D=\alpha_1price+\alpha_2GDP+u_D, \tag{1}$$

$$wheat^S=\beta_1price+\beta_2sunshine+\beta_3flood+u_S. \tag{2}$$

The errors $u_D$ and $u_S$ are i.i.d. with zero means and constant variances. Here:

  • $wheat$ is per-capita wheat consumption, in kilograms;
  • $price$ is the domestic price per kilogram;
  • $GDP$ is per-capita income;
  • $sunshine$ is sunshine hours or solar radiation;
  • $flood$ is the number of severe flooding events during the last wheat-growing season.

The variables $GDP$, $sunshine$, and $flood$ are exogenous; $wheat$ and $price$ are endogenous. The model is estimated using cross-sectional data.

(a) (10 marks)

  • Explain endogenous and exogenous variables.
  • Using the market-clearing condition $wheat^S=wheat^D$, derive reduced-form equations for $price$ and $wheat$.
  • Briefly show why estimating equations (1) and (2) by OLS is problematic. A derivation of the large-sample bias is not required.
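
The market-clearing step in part (a) can be illustrated numerically. The sketch below solves the demand and supply equations for $price$ by equating them, then checks that both structural equations return the same quantity. All parameter values, variable values, and disturbances are invented for the example, and the function name is hypothetical.

```python
def reduced_form(a1, a2, b1, b2, b3, gdp, sunshine, flood, u_d, u_s):
    """Solve demand = supply for price, then substitute back for wheat.

    Equating (1) and (2) gives
    price * (a1 - b1) = b2*sunshine + b3*flood - a2*gdp + u_s - u_d.
    """
    price = (b2 * sunshine + b3 * flood - a2 * gdp + u_s - u_d) / (a1 - b1)
    wheat = a1 * price + a2 * gdp + u_d
    return price, wheat


# Hypothetical structural parameters: downward-sloping demand, upward-sloping supply.
a1, a2 = -1.0, 0.5
b1, b2, b3 = 2.0, 0.3, -4.0

# Hypothetical observation and disturbances.
gdp, sunshine, flood = 10.0, 20.0, 1.0
u_d, u_s = 0.2, -0.1

price, wheat = reduced_form(a1, a2, b1, b2, b3, gdp, sunshine, flood, u_d, u_s)

# The supply equation evaluated at the equilibrium price must give the same quantity.
wheat_supply = b1 * price + b2 * sunshine + b3 * flood + u_s

print(price, wheat, wheat_supply)
```

In this solution the equilibrium price is a function of both disturbances, so changing `u_d` or `u_s` changes `price`.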

(b) (8 marks)

  • What does it mean for an econometric equation to be identified or not identified? The order condition is not required for this definition.
  • Determine whether equations (1) and (2) are exactly identified, underidentified, or overidentified using the order condition.
  • A seminar participant proposes adding

$$sunshine^2=(sunshine)^2,$$

so the supply equation becomes

$$wheat^S=\beta_1price+\beta_2sunshine+\beta_3sunshine^2+\beta_4flood+u_S. \tag{2*}$$

What is the rationale for this proposal? What signs should be expected for the coefficients on $sunshine$ and $sunshine^2$?

  • How does adding $sunshine^2$ change identification of equations (1) and (2*)?

(c) (7 marks) Which method can be used to obtain consistent estimates of equation (1)? Briefly describe how to apply it.