Econometrics, ICEF, 2023 midterm 1
Question 1
Multiple-choice test
Which of the following statements is true?
- If the calculated value of the statistic is higher than the critical value, we reject the alternative hypothesis in favor of the null hypothesis.
- The statistic is always nonnegative as is never smaller than .
- Degrees of freedom of a restricted model is always less than the degrees of freedom of an unrestricted model.
- The statistic is more flexible than the statistic to test a hypothesis with a single restriction.
- None of the above.
Question 2
Multiple-choice test
In a regression model, if variance of the dependent variable , conditional on an explanatory variable , is not constant, then:
- The statistics are invalid and confidence intervals are valid for small sample sizes.
- The statistics are valid and confidence intervals are invalid for small sample sizes.
- The statistics and confidence intervals are valid no matter how large the sample size is.
- The statistics and confidence intervals are both invalid no matter how large the sample size is.
- The OLS estimators are biased, so there is no need to discuss statistics and confidence intervals.
Question 3
Multiple-choice test
In econometrics, simultaneity bias arises when:
- Strictly exogenous explanatory variables determine the dependent variable through a step-by-step process.
- The disturbance term is correlated with the dependent variable.
- One or more of the explanatory variables is jointly determined with the dependent variable.
- Heteroscedasticity is present in the model.
- There is correlation between some explanatory variables.
Question 4
Multiple-choice test
For the model
where are non-stochastic and the Model A assumptions are satisfied, the estimator
is, generally speaking:
- An unbiased and efficient estimator of .
- An unbiased but inefficient estimator of .
- A biased estimator of .
- A non-linear estimator of .
- Non-stochastic.
Question 5
Multiple-choice test
For the sample of 55 observations, functions (1) and (2) were estimated:
The coefficients of determination for these models are and , respectively. The statistic for testing the hypothesis in (1) equals:
- .
- .
- .
- .
- You cannot test this hypothesis using (1) and (2).
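A minimal sketch of the F statistic computed from two coefficients of determination, assuming that (2) is a restricted version of (1) and that both have the same dependent variable. The function name, the number of restrictions `q`, the number of parameters `k`, and both R-squared values are illustrative assumptions; only the sample size of 55 comes from the question.

```python
def f_stat_from_r2(r2_unrestricted, r2_restricted, q, n, k):
    """F statistic for q linear restrictions, computed from R-squared values.

    n is the sample size and k the number of parameters (including the
    intercept) in the unrestricted model. The formula applies only when
    both models have the same dependent variable.
    """
    numerator = (r2_unrestricted - r2_restricted) / q
    denominator = (1.0 - r2_unrestricted) / (n - k)
    return numerator / denominator


# n = 55 is from the question; q, k and both R-squared values are made up.
f_value = f_stat_from_r2(r2_unrestricted=0.60, r2_restricted=0.50, q=2, n=55, k=5)
print(round(f_value, 3))
```

The result would be compared with the critical value of the F distribution with `q` and `n - k` degrees of freedom, taken from tables.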
Question 6
Multiple-choice test
An expenditure function for cosmetics, depending on disposable personal income, was estimated by OLS for a representative sample of people:
where is expenditure for cosmetics, is disposable personal income, for females and for males, and for males and for females.
For this regression the following is correct:
- The estimates of intercept are the same for male and female subsamples, while the estimates of slope coefficient, generally speaking, differ for them.
- The estimates of slope coefficient are the same for male and female subsamples, while the estimates of intercept, generally speaking, differ for them.
- Both the estimated intercepts and the estimated slope coefficients, generally speaking, differ for male and female subsamples.
- Both the estimated intercepts and the estimated slope coefficients are the same for male and female subsamples.
- The combination of intercept and slope dummies is incorrect, and the model cannot be estimated.
Question 7
Multiple-choice test
If you have estimated the parameters of the following model using OLS directly, with the Gauss-Markov conditions satisfied,
then:
- You can get an unbiased estimate of .
- You cannot get an unbiased estimate of , but can easily get a consistent estimate of it.
- You cannot get an unbiased, or biased but consistent, estimate of .
- You cannot get any estimate of .
- All the above statements are incorrect.
Question 8
Multiple-choice test
If OLS is used in a simple regression model in the case of heteroscedasticity, the population variance of the slope coefficient is
The formula for the homoscedasticity case is
Let , where are unknown weights and . Then:
- Expression (1) is always greater than (2).
- Expression (1) is always less than (2).
- Expression (1) is greater than or equal to (2).
- Expression (1) is less than or equal to (2).
- Expression (1) can be greater than, less than, or equal to (2), depending on the nature of the relationship between and .
Question 9
Multiple-choice test
In the regression model
where satisfies the Gauss-Markov conditions and is normally distributed, the explanatory variable includes random measurement errors that are independent, normally distributed, homoscedastic, not autocorrelated, and have zero expected values. Suppose and the mean value of is negative. When estimating the model using OLS, for large samples:
- The estimator of will be biased upwards.
- The estimator of will be biased downwards.
- The estimator of will be unbiased.
- The estimator of may be biased upwards or downwards.
- The OLS estimator of does not exist.
Question 10
Multiple-choice test
For a simultaneous equations model with 7 equations, 7 endogenous variables and 7 exogenous variables, the following statement is true:
- With that number of potential instruments, any equation is identified in the model.
- An equation in the model is identified if and only if only exogenous variables are available on its right-hand side.
- The number of potential instruments is insufficient to make all the equations identified.
- No equation can be overidentified in the model.
- None of the above.
Question 11
Multiple-choice test
The economic model is described by the following simultaneous equations:
\begin{aligned} y_1&=\delta+\tau y_2+\pi x_2+u_2, \tag{1}\\ y_2&=\alpha+\pi y_1+\gamma x_1+\phi x_2+u_1. \tag{2} \end{aligned}
Here $y_1$ and $y_2$ are endogenous variables; $x_1$ and $x_2$ are stochastic exogenous variables; and $u_1$ and $u_2$ are disturbance terms satisfying the Gauss-Markov conditions. Indicate the correct statement:
- You may apply TSLS in (1), but not in (2).
- You may apply TSLS in (2), but not in (1).
- You may apply TSLS in both (1) and (2).
- You may not apply TSLS in either (1) or (2).
- TSLS is not needed since OLS provides consistent estimates in (1) and (2).
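A short sketch of the order condition for identification, applied to the two equations as printed. The order condition is necessary but not sufficient, so this is only a first check. The function name and the string labels are illustrative assumptions.

```python
def order_condition(excluded_exogenous, included_endogenous_rhs):
    """Classify an equation by the order condition (necessary, not sufficient).

    excluded_exogenous: number of exogenous variables in the system that do
        not appear in this equation (the available instruments).
    included_endogenous_rhs: number of endogenous variables on the
        right-hand side of this equation.
    """
    if excluded_exogenous < included_endogenous_rhs:
        return "underidentified"
    if excluded_exogenous == included_endogenous_rhs:
        return "exactly identified"
    return "overidentified"


system_exogenous = {"x1", "x2"}

# Equation (1) as printed has y2 and x2 on the right-hand side.
eq1_exogenous = {"x2"}
# Equation (2) as printed has y1, x1 and x2 on the right-hand side.
eq2_exogenous = {"x1", "x2"}

result_eq1 = order_condition(len(system_exogenous - eq1_exogenous), 1)
result_eq2 = order_condition(len(system_exogenous - eq2_exogenous), 1)
print("(1):", result_eq1)
print("(2):", result_eq2)
```

TSLS needs at least one excluded exogenous variable per right-hand-side endogenous variable, which is what the classification checks.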
Question 12
Multiple-choice test
The model with the dependent variable (monthly pension), as a function of work experience and average earnings , is being considered:
The value of pension is restricted by the values and from the top and from the bottom, but there are no actual observations in the sample with or . The student decided to estimate a Tobit model for this sample. Indicate the correct statement:
- The Tobit estimators of the model coefficients are biased and inconsistent.
- The Tobit estimators of the model coefficients are biased but consistent.
- The Tobit model estimates will be the same as the OLS estimates here.
- The Tobit model cannot be estimated for this sample.
- None of the above.
Part 2. Free Response Questions — 1 hour 30 minutes.
Section A. Answer all questions from this section (original Questions 1-2).
Question 13
Written Question 1 — 25 marks
A student is investigating factors that affect schoolchildren's consumption of unhealthy food at fast-food restaurants, such as McDonald's. Let be the average number of hamburgers consumed per month in 2021 and let be age. The student wants to understand whether the dependence differs between boys and girls. She introduces a dummy variable equal to 1 for boys and 0 for girls.
Using a sample of 17 boys and 13 girls, for a total of 30 observations, she first runs the simple regression
with standard errors
Assuming that boys eat more frequently at a fast-food restaurant, she defines a slope dummy variable and fits the regression
with standard errors
(a)
- What is the meaning of the coefficients of regression (2)?
- Is there any difference in the influence of on between boys and girls? How should the significance of this difference be tested?
- Can the answer to the previous question be obtained using the Chow test? What additional information is needed, and how can it be obtained?
(b)
When the student showed her results to the supervisor, the supervisor advised her to evaluate a simplified regression of the form
This regression does not take into account the effect of age . The student did not have a computer with her to recalculate the coefficients. The supervisor noted that it is sufficient to know the average number of hamburgers consumed per month for girls, , and boys, , because it can be shown that
- Show that these statements are true for regression (3).
- Provide the intuition behind these statements.
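A numerical illustration of part (b), assuming regression (3) contains only an intercept and the dummy equal to 1 for boys. The hamburger counts are made up, and the names `ols_simple`, `d` and `h` are hypothetical. The sketch checks that the OLS intercept matches the girls' mean and the dummy coefficient matches the difference between the boys' and girls' means.

```python
def ols_simple(x, y):
    """Return (intercept, slope) of the OLS regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Made-up monthly hamburger counts, for illustration only.
girls = [2.0, 4.0, 3.0]
boys = [6.0, 5.0, 8.0, 9.0]

d = [0] * len(girls) + [1] * len(boys)   # dummy: 1 for boys, 0 for girls
h = girls + boys                         # dependent variable

intercept, slope = ols_simple(d, h)
mean_girls = sum(girls) / len(girls)
mean_boys = sum(boys) / len(boys)

print(intercept, mean_girls)             # intercept next to the girls' mean
print(slope, mean_boys - mean_girls)     # slope next to the difference in means
```

With a single dummy regressor the fitted value can take only two values, one per group, and least squares picks the group mean for each.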
Question 14
Written Question 2 — 25 marks
A student tries to determine how expenditure on education , in billions of dollars, relates to GDP , in billions of dollars, and population , in millions, using data on 34 developed and developing countries with high, medium, and low aggregate income for 2020. Here and below, denotes the regression residual.
She estimates
with standard errors
(a)
- Why may the student fear the presence of heteroscedasticity? Explain using your understanding of heteroscedasticity.
- How could heteroscedasticity influence the regression results?
- How can heteroscedasticity be detected using graphs? Specify the relevant graphs.
- The student arranges countries by and runs two regressions in specification (1). For the 10 countries with the highest values, she obtains . For the 14 countries with the lowest values, she obtains . Conduct an appropriate heteroscedasticity test using this information.
- The supervisor advises the student to use per-capita values and instead of the absolute values and . The student again arranges countries by . For the 10 countries with the highest , she obtains ; for the 14 countries with the lowest , she obtains . Explain the idea behind this advice and assess its usefulness using an appropriate test.
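A sketch of the Goldfeld-Quandt ratio for subsamples of unequal size, as in the two bullets above. The subsample sizes 10 and 14 come from the question; the residual sums of squares are hypothetical, and `k = 2` assumes that specification (1) has an intercept and one slope coefficient.

```python
def goldfeld_quandt(rss_high, n_high, rss_low, n_low, k):
    """Goldfeld-Quandt F ratio with unequal subsample sizes.

    Each residual sum of squares is divided by its own degrees of freedom
    (subsample size minus k estimated parameters). Returns the statistic
    and the (numerator, denominator) degrees of freedom.
    """
    df_high = n_high - k
    df_low = n_low - k
    f_ratio = (rss_high / df_high) / (rss_low / df_low)
    return f_ratio, (df_high, df_low)


# Sizes 10 and 14 are from the question; both RSS values are made up.
f_ratio, dfs = goldfeld_quandt(rss_high=800.0, n_high=10,
                               rss_low=120.0, n_low=14, k=2)
print(f_ratio, dfs)
```

The ratio would be compared with the critical value of the F distribution with the returned degrees of freedom, taken from tables. The same function can be reused for the per-capita specification.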
(b)
The student next estimates the multiple regression
with standard errors
- Compare the coefficients on in equations (1) and (2). How have the meaning, value, and significance of the coefficient changed? Which value seems more reasonable, and why?
- To test equation (2) for heteroscedasticity, the student uses the Breusch-Pagan test and obtains for the auxiliary regression. Complete the test, describe the procedure, and state the result.
On the advice of a friend, the student estimates the model in logarithms:
with standard errors
For equation (3), the auxiliary regression has for the Breusch-Pagan test and for the White test with cross terms.
- Why might using logarithms help eliminate heteroscedasticity? Did it help according to the Breusch-Pagan and White tests? Complete both tests. Why do their results differ, and which test should be trusted more?
- What would you advise the student to do to eliminate heteroscedasticity in equation (2)?
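A sketch of the test statistic used in part (b), assuming the version of the Breusch-Pagan test based on n times the R-squared of the auxiliary regression, and assuming that this auxiliary regression has two regressors, so that the statistic has 2 degrees of freedom. The sample size 34 comes from the question; the auxiliary R-squared is hypothetical.

```python
import math


def lm_statistic(n, r2_aux):
    """n * R-squared from the auxiliary regression of squared residuals."""
    return n * r2_aux


def chi2_sf_2df(x):
    """Upper-tail probability of a chi-square variable with exactly 2 df.

    For 2 degrees of freedom the survival function has the closed form
    exp(-x / 2); other degrees of freedom need tables or a statistics library.
    """
    return math.exp(-x / 2.0)


# n = 34 is from the question; the auxiliary R-squared is made up.
stat = lm_statistic(n=34, r2_aux=0.25)
p_value = chi2_sf_2df(stat)
print(stat, p_value)
```

For the White test with cross terms the auxiliary regression has more regressors, so the degrees of freedom are larger and the closed form above no longer applies.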
Section B. Answer one question from this section (original Question 3 or Question 4).
Question 15
Written Question 3 — 25 marks
A student in ICEF's econometrics course uses data on a sample of 100 students to study which factors determine the score , out of 100 points, on the winter econometrics exam. Since econometrics relies heavily on statistics, one possible factor is the student's knowledge of statistics :
Direct measurement of is not possible. The available variable is , the score, also out of 100 points, obtained in the second-year statistics exam. Because students were nervous during this exam, the student assumes a measurement error:
where is independent of and , with
Using OLS, she obtains
with standard errors
(a)
- What are the consequences of measurement error in the regressor when estimating by OLS?
- A friend points out that the statistics exam was graded very harshly, with students' grades lowered, so it may be more appropriate to assume . What additional consequences for estimation of by OLS does this assumption have?
(b)
- What are the consequences of measurement error when estimating the intercept by OLS under the assumption ?
- Illustrate graphically the result obtained in the previous question for the estimation of .
(c)
The student also has grades in other subjects: for mathematics, for banking, for linear algebra, and others. She assumes that these variables are not subject to measurement error. She regresses on all these variables, saves the residuals , and includes them in the equation
with standard errors
- Comment on the aim and logic of this procedure and on the obtained results.
- During the winter econometrics exam, students were also nervous, introducing measurement error into , with and . How does this assumption affect the properties of the estimate of in equation (2)?
- Would your conclusions change if, because the econometrics exam was just before New Year, graders were instructed to resolve all controversial cases in favor of students, so that ? No rigorous derivation is required.
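A small simulation of parts (a) and (b), under assumed parameter values. It generates an unobserved regressor, adds a measurement error with a nonzero mean, and compares the OLS slope with the large-sample prediction: the true slope multiplied by var(X) / (var(X) + var(w)). Every number and variable name in the sketch is hypothetical.

```python
import random


def ols_simple(x, y):
    """Return (intercept, slope) of the OLS regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


random.seed(0)
n = 20000
beta_1, beta_2 = 10.0, 0.8      # assumed true intercept and slope
mean_w = -5.0                   # assumed mean of the measurement error

# Unobserved knowledge: mean 60, standard deviation 10 (variance 100).
true_x = [random.gauss(60.0, 10.0) for _ in range(n)]
# Observed exam score: knowledge plus an error with variance 100.
observed = [x + random.gauss(mean_w, 10.0) for x in true_x]
# Outcome depends on true knowledge, not on the observed score.
y = [beta_1 + beta_2 * x + random.gauss(0.0, 5.0) for x in true_x]

intercept_hat, slope_hat = ols_simple(observed, y)

# Large-sample prediction for the slope: attenuation toward zero.
predicted_slope = beta_2 * 100.0 / (100.0 + 100.0)
print(slope_hat, predicted_slope)
print(intercept_hat, beta_1)
```

The estimated intercept drifts away from its true value as well, because the fitted line must still pass through the sample means of the observed score and the outcome.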
Question 16
Written Question 4 — 25 marks
After graduating from university, a student joins a consulting firm dealing with the promotion of candidates and the organization of online elections. Her first assignment is to advise potential candidate A for the position of head of the student organization.
Candidate A is young and inexperienced. He can spend only $2,000 on advertising, but respondents rank his attractiveness as 5. Candidate B is more experienced and plans to spend $5,000 on advertising, with an attractiveness rank of 2. Each candidate receives 3 free appearances on local television; additional appearances cost $799 each.
The student has data from 200 past elections:
- : number of votes cast for the candidate;
- : binary variable equal to 1 if the candidate was elected;
- : amount spent on promotion, in thousands of US dollars;
- : number of appearances on television special events;
- : personal appeal of the candidate, on a scale from 1 to 5.
Using these data, she estimates the following models. Standard errors, or their counterparts, are in parentheses. For the probit model,
is the standard normal probability density function.
with standard errors
with standard errors
with standard errors
(a)
- Explain the meaning of regression (1). Compare the candidates' chances based on model (1), assuming that a higher expected number of votes is an indicator of success.
- Explain the meaning of regression (2) and its coefficients.
- Explain the logic of model (3), including the mechanism used to obtain its regression results.
(b)
According to model (3), what are the chances for each candidate to be elected? Compare them with the results of model (2). Which model can be trusted more?
Recall the candidate data:
- Candidate A: , , ;
- Candidate B: , , .
(c)
Using the marginal effects of advertising and television appearances in model (3), advise Candidate A how to reallocate his funds between advertising and television appearances to close the gap with Candidate B or overtake him. One additional television appearance costs $799, and the candidate has no additional funds. Show with calculations that the proposed reallocation can improve Candidate A's chances.
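A sketch of the calculations behind parts (b) and (c) for a probit model: the predicted probability is the standard normal CDF at the fitted index, and the marginal effect of a regressor is the standard normal density at that index times the coefficient. The coefficients below are hypothetical, since the estimates of model (3) are not reproduced here, and the dictionary keys are illustrative names. The candidate values follow the description in the question, assuming each candidate uses only the 3 free television appearances.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def normal_pdf(z):
    """Standard normal probability density function."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def probit_probability(coefs, values):
    """Return (P(elected), z), where z = intercept + sum of coef * value."""
    z = coefs["const"] + sum(coefs[name] * values[name] for name in values)
    return normal_cdf(z), z


def marginal_effect(coefs, z, name):
    """Marginal effect of one regressor in a probit: pdf(z) * coefficient."""
    return normal_pdf(z) * coefs[name]


# Hypothetical coefficients, NOT the estimates of model (3).
coefs = {"const": -2.0, "spend": 0.2, "tv": 0.3, "appeal": 0.25}

# Spending in thousands of dollars, television appearances, appeal rank.
candidate_a = {"spend": 2.0, "tv": 3.0, "appeal": 5.0}
candidate_b = {"spend": 5.0, "tv": 3.0, "appeal": 2.0}

for label, values in (("A", candidate_a), ("B", candidate_b)):
    prob, z = probit_probability(coefs, values)
    print(label, round(prob, 3),
          round(marginal_effect(coefs, z, "spend"), 4),
          round(marginal_effect(coefs, z, "tv"), 4))
```

To evaluate a reallocation, the spending value can be reduced by 0.799 for each extra television appearance and the probability recomputed, which avoids relying on the linear approximation that marginal effects give.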