## Question 1

(a) Give an intuitive and non-technical explanation of what is meant by cointegration, illustrating your argument with at least one economic example.

(b) Taking as an example the determination of employment, which you may assume to be I(1), describe the steps in the construction, estimation and interpretation of an Error Correction Model (ECM) to explain employment.

## Answer 1

Consider a student's marks, which depend on several factors:

i) His attendance at school

ii) His concentration on his studies

iii) His access to books beyond the syllabus, for example through a good library

iv) Educated parents who encourage him

v) The educational background of siblings, cousins, etc.

Thus the dependent variable (the marks) depends on many independent variables.

This is a multivariate problem. If we took the factors one at a time, first attendance and marks, then concentration and marks, and so on, we would be left with a set of five equations, each with one independent variable and one dependent variable.

This normally does not serve any purpose. Instead, all the independent variables are brought together in a single equation, whose coefficients are chosen to give the least error:

Yt = α + β1(attendance) + β2(concentration) + β3(educated parents) + … and so on,

with an error term added at the end.

Cointegration is a statistical property of a collection of time series variables (x1, x2, …, xk).

Thus, instead of independent variables, here we have a set of time series variables, and with these we want to estimate yt.

Cointegration is a long-run equilibrium relationship between time series that individually wander without settling at any equilibrium level (that is, they are non-stationary).

For example, let Y be total household income. Then Y = all expenditures + all savings.

Here Y and expenditure may be correlated, and Y and savings may be correlated, although expenditure and savings need not be correlated with each other.

Thus we have

Yt = α+β1(expenditure) + β2(savings)+ε

where the last term ε is the error, which is to be kept to a minimum.

By surveying and analysing households over a series of time periods, β1 and β2 can be estimated with the least error, which makes it possible to estimate total household income.
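A minimal simulation sketch of this idea, in Python using only the standard library. The series names (`expenditure`, `income`) and all numerical values (starting level 100, intercept 2.0, slope 1.25, unit noise) are assumptions made up for illustration, not data from the answer above. Expenditure is generated as a random walk, income is tied to it by a linear relation plus a stationary error, and a least-squares fit recovers the link.

```python
import random
import statistics


def ols_simple(x, y):
    """Least-squares intercept and slope for y = a + b*x."""
    mean_x = statistics.fmean(x)
    mean_y = statistics.fmean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 2000

# Hypothetical expenditure series: a random walk, so it has no fixed level.
expenditure = [100.0]
for _ in range(n - 1):
    expenditure.append(expenditure[-1] + rng.gauss(0, 1))

# Hypothetical income series: tied to expenditure plus a stationary error.
income = [2.0 + 1.25 * e + rng.gauss(0, 1) for e in expenditure]

alpha_hat, beta_hat = ols_simple(expenditure, income)
residuals = [y - alpha_hat - beta_hat * x for x, y in zip(expenditure, income)]

print("estimated slope:", round(beta_hat, 3))
print("variance of income:", round(statistics.pvariance(income), 2))
print("variance of residuals:", round(statistics.pvariance(residuals), 2))
```

Both simulated series wander, but the residuals from the fitted relation stay close to zero, which is the sense in which the two series share an equilibrium relationship.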

## Question 2

(b) Consider the model given by:

## Answer 2

Consider the process

yt = yt-1 + εt

Here yt is a time series, and its current value yt is expressed in terms of its previous value yt-1,

i.e. yt is the previous value yt-1 plus an error term εt (also known as white noise).

The error term (white noise) has mean zero and constant variance, and is often assumed to be normally distributed, for example N(0,1).

A process of this type, where the current value of the variable is the past value plus an error term, is called a random walk.

A random walk with drift:

This is a modification of the simple random walk in which

yt = yt-1 + α + εt

i.e. the simple random walk consists of the past value plus an error term, while the random walk with drift consists of the past value plus an error term plus a constant α (the drift).

If α > 0, the series shows an upward trend; if α < 0, a downward trend.

If we write yt successively in terms of y0, we get

$$y_t = y_0 + \alpha t + \sum_{i=1}^{t} \varepsilon_i$$

Many economic time series follow a pattern that is a trend model or a random walk with a drift.

b) Let the initial value be y0. We know that y1 = y0 + α + ε1 and

y2 = y1 + α + ε2 = y0 + 2α + ε1 + ε2

and so on. Continuing in this way, we get

$$y_t = y_0 + \alpha t + \sum_{i=1}^{t} \varepsilon_i$$
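The substitution above can be checked numerically. The sketch below, in standard-library Python, builds a random walk with drift step by step and compares it with the closed form y0 + αt + (sum of the first t errors). The function name `random_walk_with_drift` and the values y0 = 10, α = 0.5 and 200 steps are arbitrary choices for illustration.

```python
import random
from itertools import accumulate


def random_walk_with_drift(y0, alpha, shocks):
    """Build y_t = y_{t-1} + alpha + eps_t recursively, starting from y0."""
    path = [y0]
    for eps in shocks:
        path.append(path[-1] + alpha + eps)
    return path


rng = random.Random(42)  # fixed seed so the run is repeatable
y0, alpha, n = 10.0, 0.5, 200
shocks = [rng.gauss(0, 1) for _ in range(n)]

# Recursive construction, one period at a time.
path = random_walk_with_drift(y0, alpha, shocks)

# Closed form: y_t = y0 + alpha*t + (eps_1 + ... + eps_t).
cumulative = list(accumulate(shocks))
closed_form = [y0] + [y0 + alpha * t + cumulative[t - 1] for t in range(1, n + 1)]

max_gap = max(abs(a - b) for a, b in zip(path, closed_form))
print("largest difference between the two constructions:", max_gap)

# Setting alpha = 0 gives the simple random walk: y_t = y0 + sum of errors.
simple_path = random_walk_with_drift(y0, 0.0, shocks)
```

The two constructions agree up to floating-point rounding, and setting `alpha` to zero reproduces the simple random walk from the same errors.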

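c) In the simple random walk we do not have the α term.

Hence for the simple random walk we have

$$y_t = y_0 + \sum_{i=1}^{t} \varepsilon_i$$

i.e. the value at any time is the previous value plus an error, so it equals the initial value plus all the errors accumulated so far.

There is no α term here, and hence the series has no systematic upward or downward trend.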
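As a worked step, and under the added assumption that the errors ε_i are independent with mean zero and common variance σ² (the answer only says they are white noise), taking the expectation and variance of the expression for yt with drift gives:

$$E[y_t] = y_0 + \alpha t, \qquad \operatorname{Var}(y_t) = t\,\sigma^2$$

With α = 0 the mean stays at y0 for every t, which is why the simple random walk has no systematic trend, while in both cases the variance grows with t.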

## Question 3

$$\ln \text{Wage}_i = \alpha + \beta_1\,\text{JobTraining}_i + \beta_2\,\text{UniversityDegree}_i + \varepsilon_i$$

where $\ln \text{Wage}_i$ is the log of hourly wages of individual i, $\text{JobTraining}_i$ is a dummy which has value 1 if individual i participated in job training in the last 12 months and 0 otherwise, and $\text{UniversityDegree}_i$ is a dummy which is 1 if individual i has a university degree and 0 otherwise. While $\varepsilon_i$ is the remaining error term, $\alpha$, $\beta_1$ and $\beta_2$ are the parameters to be estimated. Discuss:

a) Why in this type of model we cannot say with certainty that participating in job training has a causal impact on wages. What would we need to be able to say that the impact is causal?

b) Assuming that the only information we have are wages and whether the individual has participated in job training, and whether she has a university degree, how can panel data help in establishing causality?

c) Discuss an intervention/experiment that may allow you to estimate the causal impact of job training on wages.

## Answer 3

$$\ln \text{Wage}_i = \alpha + \beta_1\,\text{JobTraining}_i + \beta_2\,\text{UniversityDegree}_i + \varepsilon_i$$

a) Participating in job training is related to the log wage as follows.

Because job training is a dummy variable, β1 is the difference in expected ln wage between those who participated in training and those who did not,

when the other variable (university degree) is held constant.

Thus the direction of the difference depends on the value of β1. If β1 > 0, ln wage is higher for those who trained, and if β1 < 0 it is lower. If β1 = 0, there is no relation between training and ln wage.

However, even a known, non-zero β1 only measures an association. We cannot say with certainty that it is causal, because people who take job training may differ from those who do not in ways that also affect wages (for example ability or motivation), and these differences end up in the error term εi. To call the impact causal we would need job training to be unrelated to εi, as it would be if training were assigned at random.
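To illustrate question (a), here is a small simulation sketch in standard-library Python. Everything in it is hypothetical: the wage equation, the unobserved `ability` variable, the true effect of 0.10 and the other parameter values are invented for the example, and the university-degree dummy is left out for brevity. With a single 0/1 regressor the least-squares slope equals the difference in group means, so that is what `dummy_coefficient` computes.

```python
import random
import statistics


def dummy_coefficient(treated, outcome):
    """OLS slope on a single 0/1 regressor = difference in group means."""
    ones = [y for d, y in zip(treated, outcome) if d == 1]
    zeros = [y for d, y in zip(treated, outcome) if d == 0]
    return statistics.fmean(ones) - statistics.fmean(zeros)


def log_wage(trained, ability, rng, true_effect=0.10):
    # Hypothetical wage equation: ability raises wages but is unobserved.
    return 2.0 + true_effect * trained + 0.30 * ability + rng.gauss(0, 0.2)


rng = random.Random(1)  # fixed seed so the run is repeatable
n = 20000
ability = [rng.gauss(0, 1) for _ in range(n)]

# Case 1: self-selection, abler people are more likely to take training.
self_selected = [1 if a + rng.gauss(0, 1) > 0 else 0 for a in ability]
wage_selected = [log_wage(d, a, rng) for d, a in zip(self_selected, ability)]

# Case 2: training assigned by coin flip, independent of ability.
randomized = [rng.randint(0, 1) for _ in range(n)]
wage_randomized = [log_wage(d, a, rng) for d, a in zip(randomized, ability)]

beta1_selected = dummy_coefficient(self_selected, wage_selected)
beta1_randomized = dummy_coefficient(randomized, wage_randomized)
print("true effect: 0.10")
print("estimate with self-selection:", round(beta1_selected, 3))
print("estimate with random assignment:", round(beta1_randomized, 3))
```

Under self-selection the estimate picks up the wage effect of ability as well as training, while under random assignment it lands close to the assumed true effect.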

b) If we have information on wages, job training and university degree, we can find the linear correlation between ln wages and job training, and between ln wages and university degree. The correlation coefficient r, which lies between -1 and 1, indicates the strength of the relationship: if |r| is close to 1, there is a strong linear correlation between the two variables, and if |r| is close to zero, the linear relationship is very weak.

Using software or the least squares formulae, we can estimate β1 and β2 and obtain the equation

Ln wages = α + β1 Job training + β2 University degree

Here α is the expected value of ln wages when there is neither training nor a university degree.

Using this equation we can predict ln wages for any combination of job training and university degree. The data tell us whether the linear correlation is strong or weak; a strong correlation is consistent with a causal effect but does not by itself prove one.

The least squares method is the one that keeps the sum of squared deviations of the observed data from the estimated regression line to a minimum.

Thus we can find the relationship

Yt = α + β1 Job training

which has the least errors.

With panel data this is repeated for the same individuals at different times, so the slope β1 can be estimated from how each individual's wage changes when his or her training status changes.
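One common way to use repeated observations on the same individuals is a first-difference estimator, sketched below in standard-library Python. This is an illustration under stated assumptions, not the method given in the answer: the two-period setup, the time-constant unobserved `ability` term, the true effect of 0.10 and all other values are hypothetical, and the university-degree dummy is omitted for brevity.

```python
import random
import statistics


def mean_gap(flag, values):
    """Mean of values where flag == 1 minus mean where flag == 0."""
    ones = [v for f, v in zip(flag, values) if f == 1]
    zeros = [v for f, v in zip(flag, values) if f == 0]
    return statistics.fmean(ones) - statistics.fmean(zeros)


rng = random.Random(2)  # fixed seed so the run is repeatable
n = 20000
true_effect = 0.10  # hypothetical value of beta1

# Unobserved, time-constant individual trait (for example ability).
ability = [rng.gauss(0, 1) for _ in range(n)]

# Period 1: nobody has trained. Period 2: abler people tend to train.
train_2 = [1 if a + rng.gauss(0, 1) > 0 else 0 for a in ability]
wage_1 = [2.0 + 0.30 * a + rng.gauss(0, 0.2) for a in ability]
wage_2 = [2.0 + true_effect * d + 0.30 * a + rng.gauss(0, 0.2)
          for d, a in zip(train_2, ability)]

# Cross-section in period 2 only: ability is mixed into the estimate.
beta1_cross_section = mean_gap(train_2, wage_2)

# Panel: differencing each person's wage removes the constant ability term.
wage_change = [w2 - w1 for w1, w2 in zip(wage_1, wage_2)]
beta1_first_difference = mean_gap(train_2, wage_change)

print("cross-section estimate:", round(beta1_cross_section, 3))
print("first-difference estimate:", round(beta1_first_difference, 3))
```

The differenced estimate is close to the assumed true effect because anything that is fixed for a person over time cancels out; it would not remove influences that change over time together with training.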

## Question 4

a) Your estimate of average wages for men

b) Your estimate of average wages for women

c) The comparison between wages of men and women

## Answer 4

a) Your estimate of average wages for men:

The survey covers the workers of firm A.

Given that there is a large disparity in wages, lower-paid workers may not want to participate in the survey at all.

Most or all of the low-paid workers then go unrecorded, because job dissatisfaction makes them unwilling to take part. The survey may end up recording mainly high-paid workers, which pushes the average pay upward. Thus the estimate of average wages for men will be higher than the actual figure.

b) Your estimate of average wages for women:

The same reasoning applies here. In addition, if women are on average lower paid than men, and if many of them have household responsibilities after work that leave little time or interest for the survey, then mainly highly paid women will participate. This gives an average wage for women that is well above the actual figure, and the overstatement will be larger than for men.

c) The comparison between wages of men and women:

As explained in parts a) and b), wages of women in the firm may be lower than those of men. But because most low-paid women do not take part in the survey, the women's average is overstated by a large amount, while the men's average is only slightly overstated. This makes the comparison unreliable: the difference between men's and women's average wages measured by the survey would be smaller than the actual difference.
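The argument in parts a) to c) can be illustrated with a short simulation in standard-library Python. All figures are invented for the sketch: the wage distributions, the low-pay cutoffs and the response probabilities (low-paid workers respond less often, low-paid women least often) are assumptions, not information about firm A.

```python
import random
import statistics


def surveyed_wages(wages, low_pay_cutoff, p_low, p_high, rng):
    """Keep each worker with a response probability that depends on pay."""
    kept = []
    for w in wages:
        p = p_low if w < low_pay_cutoff else p_high
        if rng.random() < p:
            kept.append(w)
    return kept


rng = random.Random(3)  # fixed seed so the run is repeatable
n = 20000

# Hypothetical hourly wages in firm A (floor of 5 avoids negative draws).
men = [max(5.0, rng.gauss(30, 8)) for _ in range(n)]
women = [max(5.0, rng.gauss(25, 8)) for _ in range(n)]

# Assumed response rates: low-paid workers respond less, low-paid women least.
men_survey = surveyed_wages(men, 30, p_low=0.6, p_high=0.9, rng=rng)
women_survey = surveyed_wages(women, 25, p_low=0.3, p_high=0.9, rng=rng)

true_gap = statistics.fmean(men) - statistics.fmean(women)
survey_gap = statistics.fmean(men_survey) - statistics.fmean(women_survey)

print("men:   true", round(statistics.fmean(men), 2),
      "survey", round(statistics.fmean(men_survey), 2))
print("women: true", round(statistics.fmean(women), 2),
      "survey", round(statistics.fmean(women_survey), 2))
print("gap:   true", round(true_gap, 2), "survey", round(survey_gap, 2))
```

With these assumed response rates both surveyed averages sit above the true ones, the women's average is overstated by more, and the surveyed wage gap is smaller than the true gap.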