Econometrics, ICEF, 2024 final

Question 1

Multiple-choice test

In the model

$$Y_i=\beta_1+u_i,$$

the OLS estimator of $\beta_1$ equals:

  1. $b_1=\bar Y$.

  2. $b_1=0$.

  3. $b_1=\bar u$.

  4. $b_1=\dfrac{\sum_i Y_i^2}{\sum_i Y_i}$.

  5. $b_1=\dfrac{\sum_i Y_i}{n-1}$.
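As a quick numerical check, the sketch below evaluates the residual sum of squares of the intercept-only model at each candidate formula. The sample `y` is made up for illustration and is not part of the exam.

```python
# Intercept-only model Y_i = b1 + e_i: compare the residual sum of squares
# at each candidate value of b1. The data are hypothetical.
y = [2.0, 3.5, 1.0, 4.5, 4.0]

def ssr(b1, ys):
    """Residual sum of squares for a given intercept b1."""
    return sum((v - b1) ** 2 for v in ys)

n = len(y)
mean_y = sum(y) / n

# First-order condition: -2 * sum(Y_i - b1) = 0, so b1 = mean(Y).
candidates = {
    "mean(Y)": mean_y,
    "0": 0.0,
    "sum(Y^2)/sum(Y)": sum(v * v for v in y) / sum(y),
    "sum(Y)/(n-1)": sum(y) / (n - 1),
}
best = min(candidates, key=lambda name: ssr(candidates[name], y))
for name, b in candidates.items():
    print(f"{name:>16}: b1 = {b:.4f}, SSR = {ssr(b, y):.4f}")
print("smallest SSR:", best)
```

The option $b_1=\bar u$ is left out because the disturbances are unobservable, so it cannot be computed from the sample.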

Question 2

Multiple-choice test

The regression model

$$Y=\beta_1+\beta_2X_2+\beta_3X_3+u$$

was transformed into one in which the explanatory variables are replaced by their deviations from their means:

$$X_2^*=X_2-\bar X_2, \qquad X_3^*=X_3-\bar X_3.$$

In the transformed model, fitted for the same sample, generally speaking:

  1. The estimates of the slope coefficients will change, while the estimate of the intercept will stay the same.

  2. The estimates of both the slope coefficients and intercept will stay the same.

  3. The estimates of the slope coefficients will stay the same, while the estimate of the intercept will change.

  4. The estimates of both the slope coefficients and intercept will change.

  5. The transformed model cannot be estimated due to multicollinearity.
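A small numerical experiment can make the effect of demeaning concrete. The sketch below fits both versions of the model by solving the normal equations in plain Python. The data and the helper names `solve` and `ols` are hypothetical and chosen only for illustration.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * p for x, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(y, x2, x3):
    """OLS of y on a constant, x2 and x3 via the normal equations X'X b = X'y."""
    rows = [[1.0, a, b] for a, b in zip(x2, x3)]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * v for r, v in zip(rows, y)) for i in range(3)]
    return solve(xtx, xty)

def mean(v):
    return sum(v) / len(v)

# Hypothetical sample
x2 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x3 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.5]
y = [3.1, 3.9, 7.2, 7.8, 11.1, 11.4]

# Deviations of the regressors from their sample means
x2_star = [v - mean(x2) for v in x2]
x3_star = [v - mean(x3) for v in x3]

b = ols(y, x2, x3)                  # original model
b_star = ols(y, x2_star, x3_star)   # transformed model

print("original   :", [round(v, 4) for v in b])
print("transformed:", [round(v, 4) for v in b_star])
print("mean of Y  :", round(mean(y), 4))
```

In this sample the two slope estimates coincide across the fits, while the intercept of the transformed model equals the sample mean of $Y$.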

Question 3

Multiple-choice test

Which method can be used for detecting heteroskedasticity of the type

$$\sigma_i^2=\gamma X_i?$$

  1. Goldfeld-Quandt test.

  2. Durbin-Watson test.

  3. Dickey-Fuller test.

  4. Student's test.

  5. None of the above.

Question 4

Multiple-choice test

The researcher wants to examine the life expectancy level in the world and estimates the following model:

$$\widehat{LE}_i=\hat\beta_1+\hat\beta_2EUR_i, \qquad SSR=316.$$

Here $LE_i$ is life expectancy in country $i$ in years, and $EUR_i$ is a dummy variable equal to 1 if the country is European. It is known that there are 160 countries in the sample and 40 of them are European. What is the estimated covariance between $\hat\beta_1$ and $\hat\beta_2$?

  1. $-1.03\%$.

  2. $-1.67\%$.

  3. $-2.39\%$.

  4. $-2.89\%$.

  5. $-3.33\%$.
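One way to organise the arithmetic is sketched below, assuming the standard simple-regression result $\widehat{\operatorname{cov}}(\hat\beta_1,\hat\beta_2)=-s^2\bar X/\sum_i(X_i-\bar X)^2$ with $s^2=SSR/(n-2)$, and reading $SSR$ as the sum of squared residuals. The variable names are illustrative.

```python
n = 160          # countries in the sample (from the question)
n_eur = 40       # European countries (from the question)
ssr = 316.0      # sum of squared residuals (from the question)

s2 = ssr / (n - 2)        # estimated disturbance variance
x_bar = n_eur / n         # sample mean of the dummy EUR

# Sum of squared deviations of the dummy from its mean
sxx = n_eur * (1 - x_bar) ** 2 + (n - n_eur) * (0 - x_bar) ** 2

cov_b1_b2 = -s2 * x_bar / sxx
print(f"s^2 = {s2:.4f}, Sxx = {sxx:.1f}, cov(b1, b2) = {cov_b1_b2:.4f}")
# prints: s^2 = 2.0000, Sxx = 30.0, cov(b1, b2) = -0.0167
```

Under these assumptions the covariance is about $-0.0167$, which corresponds to $-1.67\%$ in the percent form used by the answer options.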

Question 5

Multiple-choice test

The researcher estimated a model of the wage rate, in thousands of rubles, earned by specialists in the banking sector:

$$\ln w_i=11.3+0.75EDUC_i+1.2MALE_i+0.22\ln EXPER_i+u_i.$$

Here $w$ is the wage rate received by an individual, $EDUC$ is the number of years of graduate study, $MALE$ is a binary variable equal to 1 if the person is male, and $EXPER$ is the number of years of working experience. Indicate the correct statement:

  1. One more year of working experience increases the wage rate by 0.22 thousand rubles, on average, others equal.

  2. One more year of graduate study increases the wage rate by 75%, on average, others equal.

  3. One more year of working experience increases the wage rate by 22%, on average, others equal.

  4. Males get 20% higher wages than females, on average, others equal.

  5. Males get higher wages than females by 1.2 thousand rubles, on average, others equal.

Question 6

Multiple-choice test

The Durbin-Wu-Hausman test can be used for detection of the following:

I. Heteroscedasticity.

II. Measurement errors.

III. Simultaneous equations bias.

IV. Endogeneity of explanatory variables.

  1. I, II and III only.

  2. III and IV only.

  3. II and IV only.

  4. II, III and IV only.

  5. All I-IV.

Question 7

Multiple-choice test

In econometrics, simultaneity bias arises when:

  1. Strictly exogenous explanatory variables determine the dependent variable through a step-by-step process.

  2. One or more of the explanatory variables is jointly determined with the dependent variable.

  3. The disturbance term is correlated with the dependent variable.

  4. Heteroscedasticity is present in the model.

  5. There is correlation between some explanatory variables.

Question 8

Multiple-choice test

To obtain consistent estimates of the parameters of the standard partial adjustment model

$$Y_t^*=\beta_1+\beta_2X_t+u_t, \qquad Y_t-Y_{t-1}=\lambda(Y_t^*-Y_{t-1}),$$

one should use:

  1. Koyck distribution.

  2. Koyck transformation.

  3. Error correction model.

  4. ADL(1,0) model.

  5. ADL(1,1) model.
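The estimable form of the partial adjustment model follows from substituting the unobservable target $Y_t^*$ into the adjustment equation. The short derivation below uses only the two equations given in the question.

```latex
\begin{aligned}
Y_t &= \lambda Y_t^* + (1-\lambda)Y_{t-1} \\
    &= \lambda(\beta_1+\beta_2X_t+u_t) + (1-\lambda)Y_{t-1} \\
    &= \lambda\beta_1 + \lambda\beta_2X_t + (1-\lambda)Y_{t-1} + \lambda u_t .
\end{aligned}
```

The result contains one lag of the dependent variable and only the current value of $X_t$. If $u_t$ is not autocorrelated, the disturbance $\lambda u_t$ is uncorrelated with $Y_{t-1}$, which is the usual argument for the consistency of OLS in this form.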

Question 9

Multiple-choice test

Which of the following statements is true?

  1. The expectation of a random walk with drift process is a linear function of time.

  2. The expectation of a random walk with drift process does not exist.

  3. The expectation of a random walk with drift process is zero.

  4. The expectation of a random walk with drift process is constant in time.

  5. The expectation of a random walk with drift process is a quadratic function of time.
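A simulation can be used to inspect the behaviour of the mean. The sketch below assumes the process $Y_t=\beta+Y_{t-1}+\varepsilon_t$ with a fixed starting value $Y_0$ and standard normal innovations. The drift, starting value, horizon and number of paths are hypothetical choices.

```python
import random

random.seed(0)
drift, y0, horizon, n_paths = 0.5, 1.0, 20, 20000  # hypothetical settings

# Accumulate the cross-path sum of Y_t for every t
sums = [0.0] * (horizon + 1)
for _ in range(n_paths):
    y = y0
    sums[0] += y
    for t in range(1, horizon + 1):
        y = drift + y + random.gauss(0.0, 1.0)
        sums[t] += y

sim_mean = [s / n_paths for s in sums]
# Repeated substitution gives E[Y_t] = Y_0 + drift * t
theory = [y0 + drift * t for t in range(horizon + 1)]

for t in (0, 5, 10, 20):
    print(f"t={t:2d}: simulated mean {sim_mean[t]:.3f}, Y0 + drift*t = {theory[t]:.3f}")
```

The simulated means should track the straight line $Y_0+\beta t$ up to sampling noise.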

Question 10

Multiple-choice test

Given the estimated regression

$$z=1.55+0.5X-0.3X^2,$$

find the marginal partial effect at $X=2$, applying the Probit model and its marginal probability function

$$f(z)=\frac{1}{\sqrt{2\pi}}e^{-z^2/2}.$$

You may use $f(z=1.35)=0.1604$.

  1. $0.1502$.

  2. $0.1604$.

  3. $-0.1123$.

  4. $-0.2016$.

  5. None of the above.
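The calculation can be sketched with the chain rule, $\partial p/\partial X=f(z)\,\partial z/\partial X$, where $\partial z/\partial X=0.5-0.6X$ follows from the quadratic index. The function name `phi` is illustrative.

```python
import math

def phi(z):
    """Standard normal density f(z)."""
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)

x = 2.0
z = 1.55 + 0.5 * x - 0.3 * x ** 2   # index evaluated at X = 2
dz_dx = 0.5 - 2 * 0.3 * x           # derivative of the index with respect to X
marginal_effect = phi(z) * dz_dx    # chain rule

print(f"z = {z:.2f}, f(z) = {phi(z):.4f}, dz/dX = {dz_dx:.2f}")
print(f"marginal effect = {marginal_effect:.4f}")
```

At $X=2$ the index is $z=1.35$, which matches the density value supplied in the question, and the quadratic term makes the derivative of the index negative.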

Question 11

Multiple-choice test

In a Random Effects (RE) model, the error term $u_{it}$ can be decomposed into:

  1. A time-specific component and an individual-specific component.

  2. An observation-specific component and a time-specific component.

  3. A time-specific component and a group-specific component.

  4. An individual-specific component and an observation-specific component.

  5. An individual-specific component and a group-specific component.

Question 12

Multiple-choice test

In the context of panel data, the term "unbalanced panel" refers to:

  1. A panel with more cross-sectional units than time periods.

  2. A panel with heteroskedastic errors.

  3. A panel where all cross-sectional units are observed for the same time periods.

  4. A panel with correlated random effects.

  5. A panel where the number of time periods differs across cross-sectional units.

Question 13

Written part, Section A — original Question 1 — 25 marks

Part 2. Written examination. One session, 2 hours without break.

SECTION A. Answer all questions 1-2 from this section.

A student has aggregate annual data on consumer expenditure on clothing $CLOT$ and disposable personal income $DPI$, both measured at current prices in billions of U.S. dollars, a price index for clothing $PCLOT$, and a price index for total personal expenditure $PTPE$, for the United States for the period 1998-2022 (25 observations). The student runs the following regressions. Standard errors are in parentheses; $SSR$ is the sum of squared residuals. When considering the equations, issues related to nonstationary time series should be ignored.

$$\widehat{\log CLOT}=-1.32+0.80\log DPI-0.61\log PCLOT+0.60\log PTPE, \qquad R^2=0.998. \tag{1}$$

Standard errors and additional statistics:

$$(0.12)\qquad(0.06)\qquad(0.11)\qquad(0.05), \qquad SSR=0.0045, \qquad DW=1.66.$$

$$\widehat{\log\left(\frac{CLOT}{PTPE}\right)}=-1.80+0.58\log\left(\frac{DPI}{PTPE}\right)+0.13\log\left(\frac{PCLOT}{PTPE}\right), \qquad R^2=0.937. \tag{2}$$

Standard errors and additional statistics:

$$(0.19)\qquad(0.09)\qquad(0.10), \qquad SSR=0.0172, \qquad DW=0.66.$$

(a) (12 marks)

  • Interpret the regression coefficients of equations (1) and (2), commenting on the meaning of the variables used in these models and on the values of the coefficients.

  • Are coefficients of equations (1) and (2) significant? What conclusions can be drawn from this? Are equations (1) and (2) significant in general?

(b) (13 marks)

  • Explain why the second regression may be considered as a restricted version of the first. What is the restriction? Test the restriction. What information (on $R^2$, on $SSR$, or both) can be used for this test?

  • What are the benefits and risks of imposing a restriction in a regression model? Consider the restricted and unrestricted equation in turn as correct specification. Which equation should be chosen for further study of the problem under consideration based on the analysis performed, and why?

  • The teacher advised the student to also pay attention to the Durbin-Watson statistics in equations (1) and (2). Perform the required test and draw conclusions.
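The restriction test in part (b) can be set up as an F test on the two sums of squared residuals. The sketch below reads the restriction as $\beta_2+\beta_3+\beta_4=1$ in equation (1), which is what dividing every nominal variable by $PTPE$ implies. That reading, and the variable names, are assumptions of this sketch rather than part of the question.

```python
ssr_u = 0.0045   # unrestricted model, equation (1)
ssr_r = 0.0172   # restricted model, equation (2)
n = 25           # observations
k_u = 4          # parameters in equation (1), including the intercept
q = 1            # number of restrictions (assumed: one linear restriction)

# F = [(SSR_R - SSR_U) / q] / [SSR_U / (n - k_U)]
f_stat = ((ssr_r - ssr_u) / q) / (ssr_u / (n - k_u))
print(f"F({q}, {n - k_u}) = {f_stat:.2f}")
```

The statistic is then compared with the tabulated critical value of $F(1, 21)$. The two equations have different dependent variables, so their $R^2$ values are not directly comparable, whereas the residuals of both are measured in units of $\log CLOT$.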

Question 14

Written part, Section A — original Question 2 — 25 marks

Working on her coursework, a student of ICEF collected data on a sample of 36 ICEF graduates from different years of graduation working in Russia. She is interested in studying their current salary, $SAL$, in thousands of roubles per month. Explanatory variables are $AGE$, age of respondent in years, age squared $AGE^2$, and also some dummy variables: $MSCA$, Master of Science Degree Abroad, equal to 1 for those graduates who have received a master's degree abroad and 0 otherwise; $NFE$, no further education, equal to 1 for those graduates who received a master's degree neither abroad nor in the country; and $MALE$, equal to 1 for male and 0 for female. Here are the results of estimation of two regressions using different sets of variables. Standard errors are in brackets.

$$\widehat{SAL}_i=15.18+0.254AGE_i-0.003AGE_i^2, \qquad R^2=0.48. \tag{1}$$

Standard errors:

$$(0.18)\qquad(0.021)\qquad(0.0015)$$

$$\widehat{SAL}_i=10.068+0.18AGE_i-0.002AGE_i^2+4.021MSCA_i-5.123NFE_i+0.025MALE_i, \qquad R^2=0.63. \tag{2}$$

Standard errors:

$$(0.049)\qquad(0.03)\qquad(0.001)\qquad(0.456)\qquad(2.237)\qquad(0.016)$$

(a) (12 marks)

  • Interpret the coefficients of equation (1). Evaluate the marginal effect of $AGE$ for $AGE=25$, $AGE=42$, and $AGE=60$.

  • Is this marginal effect of $AGE$ significant?

  • What is the role of the $AGE^2$ variable in equations (1)-(2)? Noticing that this variable was insignificant in both equations, the student decided to exclude it from the equations. Try to predict how this might affect the coefficients of model (1). Explain your reasoning.
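For the first bullet of part (a), the marginal effect in a quadratic specification is the derivative $\partial SAL/\partial AGE=\hat\beta_2+2\hat\beta_3AGE$. The sketch below evaluates it with the estimates of equation (1). The function name and the turning-point calculation are illustrative additions.

```python
b_age, b_age2 = 0.254, -0.003   # estimates from equation (1)

def marginal_effect(age):
    """d SAL / d AGE = b_age + 2 * b_age2 * AGE, in thousand roubles per month per year of age."""
    return b_age + 2 * b_age2 * age

effects = {age: marginal_effect(age) for age in (25, 42, 60)}
turning_point = -b_age / (2 * b_age2)   # age at which the marginal effect changes sign

for age, effect in effects.items():
    print(f"AGE = {age}: marginal effect = {effect:+.3f}")
print(f"turning point at AGE = {turning_point:.1f}")
```

A formal significance test of the marginal effect at a given age would also need the estimated covariance between the two coefficient estimates, which is not reported in the question.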

(b) (13 marks)

  • How many categories of education level of ICEF graduates do the dummy variables $MSCA$ and $NFE$ describe? What is the reference category for equation (2)? What do the coefficients of the dummy variables tell about salaries of ICEF graduates?

  • Are the coefficients of the variables $MSCA$, $NFE$, and $MALE$ significant? Are they jointly significant?

  • The student decided to replace the variable $NFE$, no further education, with $MSCR$, getting a master's degree in Russia, leaving all other variables in the equation. Indicate what changes will occur in equation (2), and how exactly they will change, and which coefficients will remain the same. Answer the same questions about the standard errors of the coefficients and the value of R-squared.

To answer these questions, simply write a new equation, writing out only those quantities whose values you can accurately predict and leaving blanks for those quantities whose value cannot be predicted. No explanation is required.

$$\widehat{SAL}_i=\_\_\_\_+\_\_\_\_AGE_i+\_\_\_\_AGE_i^2+\_\_\_\_MSCA_i+\_\_\_\_MSCR_i+\_\_\_\_MALE_i, \qquad R^2=\_\_\_\_. \tag{3}$$

Standard-error template:

$$(\_\_\_\_)\qquad(\_\_\_\_)\qquad(\_\_\_\_)\qquad(\_\_\_\_)\qquad(\_\_\_\_)\qquad(\_\_\_\_)$$

Question 15

Written part, Section B — original Question 3 — 25 marks

SECTION B. Answer only ONE question from this section: Question 3 OR Question 4.

A variable $Y$ is determined by a variable $X$, the relationship being

$$Y_i=\beta_1+\beta_2X_i+u_i,$$

where $u_i$ is a disturbance term that satisfies regression Model B assumptions. The values of $X$ are drawn randomly from a population with variance $\sigma_X^2$.

(a) (12 marks)

  • Assuming $X_i$ and $u_i$ are correlated, briefly show that the OLS estimator of $\beta_2$ is inconsistent.

  • Suppose that there exists a third variable $Z$ that is correlated with $X$ but independent of $u$. Demonstrate that if the researcher had regressed $Y$ on $X$ using $Z$ as an instrument for $X$, the slope coefficient $\hat\beta_2^{IV}$ would have been a consistent estimator of $\beta_2$.

  • What is the difference between an instrumental variable, instrument, and a proxy variable?

  • What are the rules for choosing valid instruments?

(b) (13 marks)

  • During discussion of these results, some of the participants suggested using the TSLS approach instead, saying that it has many advantages compared to the instrumental variable approach. What is the TSLS estimator here?

  • Show that in this case the TSLS estimator is the same as the instrumental variable estimator.
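The claims in parts (a) and (b) can be illustrated numerically. The sketch below simulates a data-generating process in which $X$ is correlated with both $u$ and an instrument $Z$, then computes the OLS, IV and TSLS slope estimates. The true parameters, the form of the process, and the helper names are hypothetical choices for illustration only.

```python
import random

random.seed(1)
n = 20000
beta1, beta2 = 1.0, 2.0   # hypothetical true parameters

z = [random.gauss(0, 1) for _ in range(n)]
u = [random.gauss(0, 1) for _ in range(n)]
v = [random.gauss(0, 1) for _ in range(n)]
x = [zi + 0.5 * ui + vi for zi, ui, vi in zip(z, u, v)]   # X depends on Z and on u
y = [beta1 + beta2 * xi + ui for xi, ui in zip(x, u)]

def mean(a):
    return sum(a) / len(a)

def cov(a, b):
    """Sample covariance (divisor n); cov(a, a) is the sample variance."""
    ma, mb = mean(a), mean(b)
    return sum((p - ma) * (q - mb) for p, q in zip(a, b)) / len(a)

b_ols = cov(x, y) / cov(x, x)   # OLS slope
b_iv = cov(z, y) / cov(z, x)    # IV slope with Z as the instrument

# TSLS: first stage regresses X on Z, second stage regresses Y on the fitted values
g2 = cov(z, x) / cov(z, z)
g1 = mean(x) - g2 * mean(z)
x_hat = [g1 + g2 * zi for zi in z]
b_tsls = cov(x_hat, y) / cov(x_hat, x_hat)

print(f"OLS  slope: {b_ols:.4f}")
print(f"IV   slope: {b_iv:.4f}")
print(f"TSLS slope: {b_tsls:.4f}")
```

In this design the OLS slope settles away from the true value of 2, while the IV and TSLS slopes agree with each other up to floating-point error and lie close to 2.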

Question 16

Written part, Section B — original Question 4 — 25 marks

A student is interested in the influence of the price level and income on tourists' expenditures on various types of entertainment in a large seaside resort, $E_{it}$, in thousands of dollars. Suppose that the time series of expenditures on diving, $D_t$; surfing, $S_t$; paragliding, $P_t$; yachting, $Y_t$; fishing, $F_t$; and spa treatment, $T_t$, for ten years, 2008-2017, form the panel as observations for particular units of some general type of entertainment, $E_{it}$. $DPI_t$ is disposable personal income, also in thousands of dollars, and $PRE_{it}$ is the relative price index for the corresponding $E_{it}$. Let the model under investigation be

$$LGE_{it}=\beta_1+\beta_2LGDPI_t+\beta_3LGPRE_{it}+u_{it},$$

where

$$LGE=\log(E), \qquad LGDPI=\log(DPI), \qquad LGPRE=\log(PRE).$$

She estimated three equations:

$$\widehat{LGE}_{it}=-4.66+0.93LGDPI_t-0.11LGPRE_{it}, \qquad R^2=0.048, \qquad RSS=15.18 \qquad\text{1 - Pooled}. \tag{1}$$

Standard errors:

$$(5.84)\qquad(0.66)\qquad(0.06)$$

$$\widehat{LGE}_{it}=-2.13+0.98LGDPI_t-0.68LGPRE_{it}, \qquad R^2=0.994, \qquad RSS=0.0957 \qquad\text{2 - Fixed}. \tag{2}$$

Standard errors:

$$(0.51)\qquad(0.06)\qquad(0.05)$$

$$\widehat{LGE}_{it}=-2.15+0.98LGDPI_t-0.67LGPRE_{it}, \qquad R^2=0.899, \qquad RSS=0.106 \qquad\text{3 - Random}. \tag{3}$$

Standard errors:

$$(0.57)\qquad(0.06)\qquad(0.05)$$

(a) (12 marks)

  • What are the advantages of panel data analysis compared with cross-section regression and time series analysis? What relative advantages of panel data models can be indicated based on the comparison of the estimated equations (1)-(3)?

  • The student decided to use the approach based on the fixed effects panel data model. Help the student to understand what the LSDV method is and how to use it for estimation of the panel data model under consideration.

  • The fixed effect for yachting is $0.53$, and for fishing $-0.85$; explain what this means for analysing equation (2).

(b) (13 marks)

  • How can one test for the presence of unobserved heterogeneity in the fixed effects model, or, to put it another way, test whether there are considerable differences between the pooled model and the model with fixed effects? Give some details: indicate the corresponding test and the data needed for it, the null hypothesis, the distribution of the test statistic, the number of degrees of freedom, and the decision rule.

  • What test can you use to choose between random effects and fixed effects panel model? The statistic of this test turned out to be 1.76; what conclusion can be drawn from this? What risks are associated with choosing models with fixed and random effects? What meaningful considerations can be used here when choosing the type of model? What is your final choice?
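The arithmetic behind both tests in part (b) can be sketched as follows. The F test compares the pooled and fixed effects residual sums of squares. The treatment of the statistic 1.76 as chi-square with 2 degrees of freedom (one per slope coefficient compared) is an assumption of this sketch, as are the variable names.

```python
import math

rss_pooled, rss_fe = 15.18, 0.0957      # from equations (1) and (2)
n_units, n_periods, k_slopes = 6, 10, 2  # six activities, ten years, two slopes

q = n_units - 1                                       # restrictions: equal unit effects
df_resid = n_units * n_periods - n_units - k_slopes   # residual df of the LSDV regression
f_stat = ((rss_pooled - rss_fe) / q) / (rss_fe / df_resid)
print(f"F({q}, {df_resid}) = {f_stat:.1f}")

# Statistic for the fixed vs random effects comparison, from the question.
# Assumption: chi-square with 2 degrees of freedom.
hausman, df_h = 1.76, 2
p_value = math.exp(-hausman / 2)    # chi-square(2) survival function is exp(-x/2)
crit_5pct = -2 * math.log(0.05)     # 5% critical value of chi-square(2)
print(f"chi2({df_h}) = {hausman}, p-value = {p_value:.3f}, 5% critical value = {crit_5pct:.3f}")
```

The closed-form survival function $e^{-x/2}$ holds only for 2 degrees of freedom; with a different number of compared coefficients a chi-square table would be needed instead.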