Econometrics, ICEF, 2023 final

Question 1

Multiple-choice test

For the model

$$Y_i=\beta_1+\beta_2X_i+u_i,$$

where the $X_i$ are non-stochastic and the Model A assumptions are satisfied, the following three estimators of $\beta_2$ are proposed:

$$b_1=\frac{\bar Y}{\bar X},\qquad b_2=\frac{\sum_i(X_i-\bar X)(Y_i-\bar Y)}{\sum_i(X_i-\bar X)^2},\qquad b_3=\frac{\sum_iX_iY_i}{\sum_iX_i^2}.$$

The following is correct for these estimators:

  1. All the estimators $b_1$, $b_2$, and $b_3$ are unbiased.

  2. All the estimators $b_1$, $b_2$, and $b_3$ are biased.

  3. The estimator $b_2$ is unbiased, while $b_1$ and $b_3$ are biased.

  4. The estimators $b_1$ and $b_2$ are unbiased, while $b_3$ is biased.

  5. The estimators $b_2$ and $b_3$ are unbiased, while $b_1$ is biased.
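
A numerical check can help when reasoning about bias here. Because each estimator is linear in the $Y_i$ and the $X_i$ are non-stochastic, its expectation equals the same formula applied to $E[Y_i]=\beta_1+\beta_2X_i$. The sketch below is only an illustration: the values of `beta1`, `beta2`, and `X` are hypothetical and do not come from the exam.

```python
# Hypothetical illustration values (not from the exam).
beta1, beta2 = 2.0, 0.5
X = [1.0, 2.0, 4.0, 7.0]

# Each estimator is linear in Y with fixed X, so E[b] is b evaluated at E[Y].
EY = [beta1 + beta2 * x for x in X]
n = len(X)
x_bar = sum(X) / n
y_bar = sum(EY) / n

E_b1 = y_bar / x_bar
E_b2 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(X, EY))
        / sum((x - x_bar) ** 2 for x in X))
E_b3 = sum(x * y for x, y in zip(X, EY)) / sum(x * x for x in X)

print(E_b1, E_b2, E_b3)
```

With a non-zero intercept, only `E_b2` coincides with `beta2` in this example; setting `beta1 = 0.0` makes all three expectations equal to `beta2`.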

Question 2

Multiple-choice test

Which of the following correctly identifies an advantage of using adjusted $R^2$ over $R^2$?

  1. Adjusted $R^2$ corrects the bias in $R^2$.

  2. Adjusted $R^2$ is easier to calculate than $R^2$.

  3. The penalty of adding new independent variables is better understood through adjusted $R^2$ than $R^2$.

  4. Adjusted $R^2$ can be calculated for models having logarithmic functions, while $R^2$ cannot be calculated for such models.

  5. None of the above is correct.

Question 3

Multiple-choice test

A student estimated by OLS the production function

$$y=\gamma_1+\alpha k+\beta l+u, \tag{1}$$

where $y$ is the output growth rate, $k$ is the capital growth rate, and $l$ is the labour growth rate. Then he decided to estimate by OLS the function

$$y-k-l=\gamma_2+\mu k+\rho l+u. \tag{2}$$

Which of the following statements is correct?

  1. $\hat\mu=\hat\alpha$.

  2. $\hat\rho=\hat\beta$.

  3. $R_1^2=R_2^2$.

  4. $SSR_1=SSR_2$.

  5. $SST_1=SST_2$.

Question 4

Multiple-choice test

If you have estimated the parameters of the following model using OLS directly, with the Gauss-Markov conditions satisfied,

$$y=\alpha+\beta_1x_1+\beta_2x_2+(\beta_2-\beta_3)x_3+u,$$

then:

  1. You can get an unbiased estimate of $\beta_3$.

  2. You cannot get an unbiased estimate of $\beta_3$, but can get a consistent estimate of it.

  3. You cannot get an unbiased, or biased but consistent, estimate of $\beta_3$.

  4. You cannot get any estimate of $\beta_3$.

  5. All the above statements are incorrect.

Question 5

Multiple-choice test

Which of the following correctly defines the $F$ statistic for testing linear restrictions if $R_r^2$ represents the coefficient of determination from the restricted model, $R_{ur}^2$ represents the coefficient of determination from the unrestricted model, and $q$ is the number of restrictions imposed?

  1. $F=\dfrac{(R_{ur}^2-R_r^2)/q}{(1-R_{ur}^2)/(n-k)}.$

  2. $F=\dfrac{(R_r^2-R_{ur}^2)/q}{(1-R_{ur}^2)/(n-k)}.$

  3. $F=\dfrac{(R_{ur}^2-R_r^2)/q}{(1-R_r^2)/(n-k)}.$

  4. $F=\dfrac{(R_r^2-R_{ur}^2)/q}{(1-R_r^2)/(n-k)}.$

  5. None of the above.

Question 6

Multiple-choice test

The following double-logarithmic model is estimated:

$$\log Y=\beta_1+\beta_2\log X_2+u.$$

The interpretation of the coefficient $\beta_2$ is the following:

  1. If $X_2$ increases by one unit, then $Y$ increases approximately by $100\beta_2$ percent.

  2. If $X_2$ increases by one unit, then $Y$ increases approximately by $\beta_2/100$ percent.

  3. If $X_2$ increases by one percent, then $Y$ increases approximately by $100\beta_2$ percent.

  4. If $X_2$ increases by one percent, then $Y$ increases approximately by $\beta_2$ percent.

  5. If $X_2$ increases by one percent, then $Y$ increases approximately by $\beta_2$ units.
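
The interpretation of a double-logarithmic coefficient can be checked numerically. The model implies $Y\propto X_2^{\beta_2}$ for a fixed disturbance, so the exact percentage change in $Y$ follows directly. The sketch below is an illustration only; the value `0.8` for $\beta_2$ is hypothetical.

```python
def pct_change_in_y(beta2, pct_change_in_x):
    """Exact % change in Y implied by log Y = b1 + b2 log X2, holding u fixed."""
    return ((1 + pct_change_in_x / 100) ** beta2 - 1) * 100

# Hypothetical elasticity of 0.8 and a one percent increase in X2.
exact = pct_change_in_y(0.8, 1.0)
print(round(exact, 4))
```

For small changes in $X_2$ the exact figure is close to $\beta_2$ itself, which is why the coefficient is read as an elasticity.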

Question 7

Multiple-choice test

An econometric model is described by the following three equations:

$$y_1=\alpha+\beta y_3+\gamma x_1+\sigma x_3+\pi x_4+u_1, \tag{1}$$

$$y_2=\delta+\varepsilon y_1+\lambda x_2+u_2, \tag{2}$$

$$y_3=\mu+\theta y_1+\omega y_2+\rho x_3+\chi x_4+u_3. \tag{3}$$

Here $y_1$, $y_2$, and $y_3$ are endogenous variables; $x_1$, $x_2$, $x_3$, and $x_4$ are exogenous variables; and $u_1$, $u_2$, and $u_3$ are disturbance terms, independent and satisfying the Gauss-Markov conditions. Choose the correct statement:

  1. Equation (2) is exactly identified.

  2. Equation (1) is overidentified.

  3. Equation (3) is underidentified.

  4. Equation (1) is exactly identified.

  5. Equation (2) is underidentified.

Question 8

Multiple-choice test

Consider a model in which the dependent variable $P_i$, the monthly pension, is a function of work experience $WE_i$ and average earnings $EARN_i$:

$$P_i=\beta_1+\beta_2WE_i+\beta_3EARN_i+u_i.$$

The pension is bounded above by $P_U$ and below by $P_L$, but there are no actual observations in the sample with $P_i=P_U$ or $P_i=P_L$. The student decided to estimate a Tobit model with the truncated sample, with all observations on the upper or lower bounds excluded. Indicate the correct statement among the following:

  1. The estimated coefficients are biased but consistent.

  2. The estimated coefficients are biased and inconsistent.

  3. The estimated coefficients are unbiased.

  4. For the truncated sample, OLS estimation would provide unbiased estimates.

  5. None of the above.

Question 9

Multiple-choice test

The following model of determination of the size of dividends is considered:

$$D_t^*=\gamma P_t+u_t, \tag{1}$$

$$\Delta D_t=\lambda(D_t^*-D_{t-1})+\rho(P_t-P_{t-1}), \tag{2}$$

where $D_t^*$ is the desirable size of the dividends, $P_t$ is the current profits, $D_t$ is the actual size of the dividends, and $\Delta D_t=D_t-D_{t-1}$. Choose the correct statement. The model is:

  1. The adaptive expectations model and can be consistently estimated in the form of the Koyck distribution model.

  2. The partial adjustment model and can be consistently estimated in the form of the ADL(1,0) model.

  3. The partial adjustment model and can be consistently estimated in the form of the ADL(0,1) model.

  4. The error correction model and can be consistently estimated in the form of the ADL(1,1) model.

  5. The error correction model and can be consistently estimated in the form of the ADL(1,0) model.

Question 10

Multiple-choice test

Refer to the following model:

$$Y_t=\alpha_0+\beta_0S_t+\beta_1S_{t-1}+\beta_2S_{t-2}+\beta_3S_{t-3}+u_t.$$

Here $(\beta_0+\beta_1)$ represents:

  1. The short-run change in $Y$ given a temporary increase in $S$.

  2. The short-run change in $Y$ given a permanent increase in $S$.

  3. The long-run change in $Y$ given a permanent increase in $S$.

  4. The long-run change in $Y$ given a temporary increase in $S$.

  5. None of the above.
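
To see what sums of distributed-lag coefficients measure, it can help to trace the response of $Y$ to a one-unit change in $S$ that is either temporary or permanent. The sketch below is an illustration only; the values in `betas` are hypothetical stand-ins for $\beta_0,\dots,\beta_3$.

```python
def response_path(betas, shock):
    """Change in Y in each period for a given path of changes in S."""
    return [
        sum(b * shock[t - j] for j, b in enumerate(betas) if 0 <= t - j < len(shock))
        for t in range(len(shock))
    ]

betas = [0.4, 0.3, 0.2, 0.1]      # hypothetical beta_0..beta_3
temporary = [1, 0, 0, 0, 0, 0]    # S rises by one unit in period 0 only
permanent = [1, 1, 1, 1, 1, 1]    # S rises by one unit and stays there

temp_path = response_path(betas, temporary)
perm_path = response_path(betas, permanent)
print(temp_path)
print(perm_path)
```

Under a temporary increase the response in period $j$ is the single coefficient $\beta_j$. Under a permanent increase the response in period $j$ is the cumulative sum $\beta_0+\dots+\beta_j$, which settles at $\beta_0+\beta_1+\beta_2+\beta_3$ from period 3 onwards.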

Question 11

Multiple-choice test

Indicate the incorrect statement among the following:

  1. If $X_t$ is a random walk with drift, the series of first differences $\Delta X_t=(X_t-X_{t-1})=\beta_1+\varepsilon_t$, where $\varepsilon_t$ is white noise, is stationary.

  2. The time trend $X_t=\beta_1+\beta_2t+\varepsilon_t$ is a non-stationary series.

  3. The MA(1) process $X_t=\varepsilon_t+\alpha_2\varepsilon_{t-1}$ is stationary.

  4. The AR(1) process $X_t=\beta_2X_{t-1}+\varepsilon_t$, with $-1<\beta_2<1$, is asymptotically stationary.

  5. The stationarity of an ARMA process is determined by its MA part.

Question 12

Multiple-choice test

In the model based on panel data

$$Y_{it}=\beta_1+\sum_{j=2}^{k}\beta_jX_{jit}+\sum_{p=1}^{s}\gamma_pZ_{pi}+\delta t+\varepsilon_{it},$$

random effects estimation is based on the following assumptions:

I. There are no $X$ variables that are fixed for each individual.

II. There is some unobserved heterogeneity in the model.

III. Each of the unobserved $Z_p$ variables is treated as being drawn randomly from a given distribution.

IV. The $Z_p$ variables are correlated with some of the $X_j$ variables.

V. The $Z_p$ variables are distributed independently of all of the $X_j$ variables.

  1. I, III and IV only.

  2. II, III and V only.

  3. II and III only.

  4. I, III and V only.

  5. III and IV only.

Question 13

Written part, Section A — original Question 1 — 25 marks

Part 2. Written examination. One session, 2 hours without break.

SECTION A. Answer all questions 1-2 from this section.

Working on her coursework, an ICEF student interviewed ICEF graduates of different graduation years working in Russia. She is interested in studying their current earnings, $earn_i$, in thousands of rubles per month. The explanatory variables are $age_i$, the age of the respondent in years; age squared, $age_i^2$; and some dummy variables: $msca_i$, equal to 1 for graduates who have received a master's degree abroad and 0 otherwise; $nfe_i$ (no further education), equal to 1 for graduates who have received a master's degree neither abroad nor in the country and 0 otherwise; and $male_i$, equal to 1 for male and 0 for female. She posted a questionnaire on the Internet and received answers from 41 graduates of different graduation years. The results of estimating two regressions with different sets of variables are given below. Standard errors are shown in brackets below the coefficients.

$$\widehat{earn}_i=\underset{(4.18)}{75.18}+\underset{(0.21)}{2.37}\,age_i-\underset{(0.01)}{0.02}\,age_i^2, \qquad R^2=0.48. \tag{1}$$

$$\widehat{earn}_i=\underset{(6.49)}{123.68}+\underset{(0.34)}{2.54}\,age_i-\underset{(0.007)}{0.03}\,age_i^2+\underset{(4.56)}{40.21}\,msca_i-\underset{(22.37)}{51.23}\,nfe_i+\underset{(0.16)}{0.25}\,male_i, \qquad R^2=0.63. \tag{2}$$

(a) (12 marks)

  • How many groups of dummy variables does equation (2) contain? How many categories of education level of ICEF graduates do the dummy variables $msca_i$ and $nfe_i$ describe? What is the reference category in each of the groups of dummy variables?

  • Help the student estimate the expected earnings of ICEF graduates of different categories presented in equations (1) and (2) for a person 25 years old. Why are the coefficients of equations (1) and (2) different, and what is the difference in the meaning of the estimates obtained from equations (1) and (2)?

  • Are the coefficients of the variables $msca_i$, $nfe_i$, and $male_i$ significant? Are they jointly significant?

(b) (13 marks) The student found that the variables $msca_i$, $nfe_i$, and $male_i$ are not correlated with $age_i$ or with $age_i^2$.

  • Can the effects of age be considered independently of the values of other variables?

  • What is the meaning of the coefficients of $age_i$ and $age_i^2$ in equation (2)? Evaluate the marginal effect of age for $age=25$, $age=42$, and $age=60$ and discuss the results. Is the influence of age on earnings significant?

  • What would be the consequences for evaluating equation (2) if the variable $age_i^2$ was excluded from it? Explain based on your knowledge of econometric theory.

  • Fearing the presence of heteroscedasticity, the student runs the Breusch-Pagan test for equation (2), obtaining the value of the statistic $\chi^2=17.5$. For equation (1), she performs a White test with cross-terms included, obtaining $\chi^2=10.2$. Help the student complete the tests for heteroscedasticity and draw conclusions. Explain your answer.
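
Two of the calculations requested in this question are plain arithmetic on the reported figures. The sketch below is a hedged illustration, not a model answer: it computes the $R^2$-based $F$ statistic for the joint exclusion of $msca_i$, $nfe_i$, and $male_i$ (treating equation (1) as the restricted version of equation (2)) and the marginal effect of age implied by the point estimates in equation (2). The variable names are introduced here for illustration.

```python
# Figures reported in equations (1) and (2).
n, k_unrestricted, q = 41, 6, 3
r2_restricted, r2_unrestricted = 0.48, 0.63

# F statistic for the joint exclusion of msca, nfe and male.
f_stat = ((r2_unrestricted - r2_restricted) / q) / (
    (1 - r2_unrestricted) / (n - k_unrestricted)
)

# Marginal effect of age in equation (2): d earn / d age = 2.54 - 2 * 0.03 * age.
def marginal_effect_of_age(age, b_age=2.54, b_age_sq=-0.03):
    return b_age + 2 * b_age_sq * age

effects = {age: marginal_effect_of_age(age) for age in (25, 42, 60)}

print(round(f_stat, 2))
print({age: round(me, 2) for age, me in effects.items()})
```

The $F$ statistic would then be compared with the critical value of $F(3, 35)$ from tables, and the marginal effects are in thousands of rubles per month per additional year of age.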

Question 14

Written part, Section A — original Question 2 — 25 marks

The student decided to investigate the factors that affect expenditures on air travel in the United States. To do this, she uses data for the 25 years 1994-2018, prior to the outbreak of the Covid-19 pandemic, on total expenditure $la_t$, total income $ld_t$, and the air travel relative price index $lp_t$, all taken in logarithms. She first estimates the following regressions using the OLS and Cochrane-Orcutt (C.O.) methods. Standard errors are in parentheses below the coefficients.

$$\widehat{la}_t=\underset{(0.68)}{-12.7}+\underset{(0.10)}{2.1}\,ld_t, \qquad R^2=0.47, \qquad DW=0.31 \qquad\text{OLS}. \tag{1}$$

$$\widehat{la}_t=\underset{(5.9)}{-7.5}+\underset{(0.84)}{1.3}\,ld_t, \qquad R^2=0.98, \qquad DW=1.40 \qquad\text{C.O.} \tag{2}$$

$$\widehat{la}_t=\underset{(0.40)}{-9.6}+\underset{(0.05)}{2.3}\,ld_t-\underset{(0.09)}{0.99}\,lp_t, \qquad R^2=0.99, \qquad DW=1.46 \qquad\text{OLS}. \tag{3}$$

$$\widehat{la}_t=\underset{(0.54)}{-9.4}+\underset{(0.07)}{2.2}\,ld_t-\underset{(0.11)}{0.97}\,lp_t, \qquad R^2=0.99, \qquad DW=1.88 \qquad\text{C.O.} \tag{4}$$

(a) (13 marks)

  • Why can one suggest the presence of autocorrelation in some of the equations listed above? Why is this question important when evaluating regression equations? Help the student explore this question using the Durbin-Watson test.

  • For what purpose does the student, along with equation (1), also calculate equations (2), (3), and (4)? Explain your opinion.

  • Has she been able to achieve her goals? Is there any reason to believe that there is no autocorrelation in equation (4), or should the student be advised to take an additional test? Which one?

The student's friend advised her to add a lagged variable as the best and simplest way to make the $DW$ statistic acceptable. The corresponding equation is

$$\widehat{la}_t=\underset{(1.31)}{-4.68}+\underset{(0.26)}{1.3}\,ld_t-\underset{(0.10)}{0.73}\,lp_t+\underset{(0.11)}{0.41}\,la_{t-1}, \qquad R^2=0.99, \qquad DW=2.32 \qquad\text{OLS}. \tag{5}$$

  • Do you agree with the advice of the student's friend? Help her to test equation (5) for autocorrelation.
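
One standard option when a lagged dependent variable is present is Durbin's $h$ test, $h=\hat\rho\sqrt{n/(1-n\,s^2)}$, where $\hat\rho\approx 1-DW/2$ and $s$ is the standard error of the coefficient of $la_{t-1}$. The sketch below is an illustration under stated assumptions: the function name `durbin_h` is introduced here, and $n=24$ assumes that one of the 25 annual observations is lost to the lag.

```python
import math

def durbin_h(dw, n, se_lagged_dep):
    """Durbin's h statistic; undefined when n * se^2 >= 1."""
    rho_hat = 1 - dw / 2                  # approximation rho ≈ 1 - DW/2
    denom = 1 - n * se_lagged_dep ** 2
    if denom <= 0:
        raise ValueError("h statistic is undefined: n * se^2 >= 1")
    return rho_hat * math.sqrt(n / denom)

# Equation (5): DW = 2.32, s.e. of the la_{t-1} coefficient = 0.11.
# n = 24 is an assumption (25 annual observations minus one lost to the lag).
h = durbin_h(dw=2.32, n=24, se_lagged_dep=0.11)
print(round(h, 2))
```

Under the null hypothesis of no autocorrelation, $h$ is asymptotically standard normal, so it would be compared with the usual normal critical values. With only 24 observations the large-sample approximation should be treated with caution.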

(b) (12 marks) The supervisor advised the student to consider the more general model ADL(1,1),

$$la_t=\beta_1+\beta_2ld_t+\beta_3ld_{t-1}+\beta_4lp_t+\beta_5lp_{t-1}+\beta_6la_{t-1}+u_t,$$

and conduct a Common Factor test for this model. The corresponding estimated models are as follows.

Unrestricted model

$$\widehat{la}_t=\underset{(1.9)}{-5.5}+\underset{(0.61)}{1.4}\,ld_t+\underset{(0.67)}{0.11}\,ld_{t-1}-\underset{(0.18)}{0.65}\,lp_t-\underset{(0.27)}{0.18}\,lp_{t-1}+\underset{(0.10)}{0.32}\,la_{t-1}, \qquad R^2=0.99, \qquad RSS=0.0354. \tag{6}$$

Restricted model

$$\widehat{la}_t=-7.0+2.24\,ld_t+0.11\,ld_{t-1}-0.96\,lp_t-0.18\,lp_{t-1}+0.97\,la_{t-1}, \qquad R^2=0.99, \qquad RSS=0.0583. \tag{7}$$

Standard errors shown in the source:

$$(2.3)\qquad(0.07)\qquad(0.11)\qquad(0.22)$$

  • Demonstrate how to obtain the restricted specification of the ADL(1,1) model from the multiple regression model

$$la_t=\alpha_1+\alpha_2ld_t+\alpha_3lp_t+u_t$$

with the autocorrelated disturbance term

$$u_t=\rho u_{t-1}+\varepsilon_t.$$

  • Help the student to run the Common Factor test, stating the restrictions and making a conclusion.

  • Under what conditions is the Common Factor test valid? Explain how the student can test that these conditions are met.
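
One common large-sample form of the Common Factor test statistic is $n\log(RSS_R/RSS_U)$, which is asymptotically $\chi^2$ with degrees of freedom equal to the number of restrictions. The sketch below applies that form to the reported residual sums of squares. It is an illustration under assumptions: $n=24$ assumes one observation is lost to the lag, and the two restrictions counted are $\beta_3=-\beta_2\beta_6$ and $\beta_5=-\beta_4\beta_6$, which follow from substituting the AR(1) disturbance into the static model.

```python
import math

rss_unrestricted = 0.0354   # equation (6)
rss_restricted = 0.0583     # equation (7)
n = 24                      # assumption: 25 years minus one lost to the lag
restrictions = 2            # beta_3 = -beta_2 * beta_6 and beta_5 = -beta_4 * beta_6

# Large-sample Common Factor test statistic.
stat = n * math.log(rss_restricted / rss_unrestricted)
print(round(stat, 2), "with", restrictions, "degrees of freedom")
```

The statistic would be compared with the $\chi^2$ critical value for two degrees of freedom taken from tables.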

Question 15

Written part, Section B — original Question 3 — 25 marks

SECTION B. Answer only ONE question from this section: Question 3 OR Question 4.

(a) (10 marks)

  • Explain what is meant by a stationary time series and a non-stationary time series. How can one determine whether a time series is stationary?

  • What is detrending of a time series? What is differencing of a time series? Explain what you understand by difference-stationary and trend-stationary time series. What is the difference in the impact of random shocks on difference-stationary and trend-stationary time series?

  • Demonstrate that the time trend

$$X_t=\alpha_0+\alpha_1t+u_t$$

is a trend-stationary time series. There is no need to prove that this series is non-stationary. We assume

$$E[u_t]=0, \qquad \operatorname{Var}(u_t)=\sigma^2, \qquad E[u_tu_s]=0\quad\forall s\ne t.$$

  • Demonstrate that the random walk

$$X_t=X_{t-1}+u_t$$

is a difference-stationary time series. There is no need to prove that this series is non-stationary. Use the same assumptions about $u_t$.

(b) (7 marks) Consider the following non-stationary process:

$$y_t=\gamma_0+\gamma_1t+u_t, \qquad u_t=\rho u_{t-1}+\varepsilon_t, \tag{1}$$

where $\varepsilon_t$ is i.i.d. $(0,\sigma^2)$.

  • Explain the source or sources of non-stationarity of $y_t$. Indicate at what values of the parameters process (1) turns out to be difference stationary and at what values it is trend stationary.

  • Investigate the implications of detrending process (1) under the assumption $|\rho|<1$.

  • Investigate the consequences of applying the differencing transformation to process (1) under the assumption $\rho=1$.

(c) (8 marks)

  • Show that you can rewrite model (1) as

$$\Delta y_t=\beta_0+\beta_1t+\beta_2y_{t-1}+\varepsilon_t. \tag{2}$$

Clearly indicate the one-to-one relation between $(\gamma_0,\gamma_1,\rho)$ and $(\beta_0,\beta_1,\beta_2)$.

  • How can equation (2) be used to test time series (1) for stationarity?
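
A candidate mapping between the two parameterisations can be checked numerically before it is written up. Substituting $u_{t-1}=y_{t-1}-\gamma_0-\gamma_1(t-1)$ into process (1) suggests $\beta_0=\gamma_0(1-\rho)+\gamma_1\rho$, $\beta_1=\gamma_1(1-\rho)$, and $\beta_2=\rho-1$. The sketch below simulates process (1) and confirms that equation (2) then holds period by period. The parameter values, the shocks in `eps`, and the starting value $u_0=0$ are hypothetical choices made only for illustration.

```python
# Hypothetical parameter values and shocks, chosen only for illustration.
gamma0, gamma1, rho = 1.0, 0.5, 0.6
eps = [0.3, -0.2, 0.1, 0.4, -0.5, 0.2]

# Simulate process (1) starting from u_0 = 0, so y_0 = gamma0.
u, y = [0.0], [gamma0]
for t in range(1, len(eps) + 1):
    u.append(rho * u[-1] + eps[t - 1])
    y.append(gamma0 + gamma1 * t + u[-1])

# Candidate mapping from (gamma0, gamma1, rho) to (beta0, beta1, beta2).
beta0 = gamma0 * (1 - rho) + gamma1 * rho
beta1 = gamma1 * (1 - rho)
beta2 = rho - 1

# Largest discrepancy between the two sides of equation (2).
max_gap = max(
    abs((y[t] - y[t - 1]) - (beta0 + beta1 * t + beta2 * y[t - 1] + eps[t - 1]))
    for t in range(1, len(y))
)
print(max_gap)
```

Setting `rho = 1.0` in the same sketch gives `beta1 = 0` and `beta2 = 0`, which is the case a unit root test on equation (2) is designed to detect.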

Question 16

Written part, Section B — original Question 4 — 25 marks

The researcher investigates the effect of having vocational training available in high school on the probability of currently living in poverty for the population of men who grew up with a disadvantaged background. Let $pov$ be a dummy variable equal to one if a man is currently living below the poverty line and zero otherwise. The variable $age$ is age, and $edu$ is total years of schooling. Let $voc$ be an indicator equal to unity if a man's high school offered vocational training. Using a random sample of 850 men, the researcher obtains

$$\widehat{\Pr}(pov=1\mid age,edu,voc)=F(0.453-0.016\,age-0.087\,edu-0.149\,voc), \tag{1}$$

where

$$F(z)=\frac{\exp(z)}{1+\exp(z)}$$

is the logistic function.

(a) (10 marks)

  • Why is model (1) estimated by maximum likelihood and not OLS? Explain the meaning of the maximum likelihood method. What properties do estimates obtained by the maximum likelihood method have?

  • Discuss the benefits and drawbacks of using the logit regression model when trying to explain a binary variable $pov$.

  • Equation (1) contains information only about the estimated coefficients of the model. What additional information is needed to be able to judge the statistical quality of econometric model (1)? What tests can be carried out for this purpose?

(b) (7 marks)

  • Use the direct comparison of two probabilities of living in poverty calculated by the logit function to evaluate the effect of having vocational training available in high school for a 40-year-old man with 12 years of education. Give details and interpret the results.

(c) (8 marks)

  • Now do the same estimation of the marginal effect of vocational education as in (b) using derivatives.

  • What percentage of the maximum possible marginal effect is the calculated marginal effect?
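
The calculations in parts (b) and (c) can be sketched directly from the coefficients in equation (1). The code below is a hedged illustration rather than a model answer. The helper names `logistic` and `index` are introduced here, and evaluating the derivative at $voc=0$ is an assumption, since other evaluation points are possible.

```python
import math

def logistic(z):
    return math.exp(z) / (1 + math.exp(z))

def index(age, edu, voc):
    # Coefficients from equation (1).
    return 0.453 - 0.016 * age - 0.087 * edu - 0.149 * voc

# Direct comparison for a 40-year-old man with 12 years of education.
p_without = logistic(index(40, 12, 0))
p_with = logistic(index(40, 12, 1))
difference = p_with - p_without

# Derivative-based approximation, evaluated at voc = 0 (an assumption):
# beta_voc * F(z) * (1 - F(z)).
derivative_effect = -0.149 * p_without * (1 - p_without)

# F(z)(1 - F(z)) peaks at 0.25 when z = 0, so the largest effect is 0.25 * beta_voc.
share_of_maximum = derivative_effect / (-0.149 * 0.25)

print(round(p_without, 3), round(p_with, 3), round(difference, 3))
print(round(derivative_effect, 4), round(share_of_maximum, 2))
```

The direct comparison and the derivative give similar but not identical figures, because the derivative is a linear approximation to a discrete change in a dummy variable.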