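How do I run 2SLS in Stata with instruments for multiple endogenous variables?

pq366

2010-06-01

How can I estimate a model with several endogenous variables and their instruments by 2SLS in Stata? Thanks.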

All replies

jiangbogz

2010-12-20 16:22:08

Same question here.

efei200x

2010-12-21 20:53:15

Could you give more detail? Do you mean several endogenous variables and a single instrument?

hanaoxue

2012-5-5 10:39:05

Hi, have you figured this out? Could you show me how? Thanks.

yarsuse

2013-8-19 16:47:16

Same question here.

捣蛋布叮

2013-11-20 15:54:11

Same question here.

蓝色

2013-11-20 19:12:19

ivregress 2sls y x1 x2 (x3=iv1 iv2) (x4=iv1 iv2)

捣蛋布叮

2013-11-26 09:44:40

ivregress 2sls y x1 x2 (x3=iv1 iv2) (x4=iv1 iv2) Good.

xiaoyuertutu

2018-5-2 15:56:36

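蓝色 posted on 2013-11-20 19:12
ivregress 2sls y x1 x2 (x3=iv1 iv2) (x4=iv1 iv2)

Hello. In this command, do the two endogenous variables x3 and x4 share the same instruments, namely iv1 and iv2?

I have another question:
x1 and x2 are exogenous; x3, x4, and x5 are endogenous, with instruments ivx3, ivx4, and ivx5 respectively. The 2SLS command would then be:
ivregress 2sls y x1 x2 (x3 x4 x5 = ivx3 ivx4 ivx5), r first
The problem: each of x3, x4, and x5 has exactly one instrument, but Stata seems to treat ivx3 ivx4 ivx5 as the instruments for x3, for x4, and for x5 alike.
Following your command, would it be valid to rewrite it as: ivregress 2sls y x1 x2 (x3=ivx3) (x4=ivx4) (x5=ivx5)?
I look forward to your answer.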

蓝色

2018-5-2 20:34:02

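xiaoyuertutu posted on 2018-5-2 15:56
Hello. In this command, do the two endogenous variables x3 and x4 share the same instruments, namely iv1 and iv2?

I have another question:

The command should be
ivregress 2sls y x1 x2 (x3 x4 x5 = ivx3 ivx4 ivx5), r first
Stata will probably not accept the other forms.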
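
As a rough illustration of that single-parenthesis syntax, here is a minimal sketch on Stata's bundled auto dataset. The variable roles are assumptions picked only so the command runs; nothing here suggests that displacement, turn, or gear_ratio are valid instruments for anything.

```stata
* Illustration only: the variable roles below are arbitrary assumptions
sysuse auto, clear

* Two endogenous regressors (mpg, headroom) share one instrument list.
* weight is treated as exogenous and enters every first stage as well.
* vce(robust) requests robust standard errors; first prints the first-stage regressions.
ivregress 2sls price weight (mpg headroom = displacement turn gear_ratio), vce(robust) first
```

Each endogenous regressor gets its own first-stage regression on the full instrument list plus the exogenous regressors, so the instruments are not paired one-to-one with the endogenous variables. The order condition only requires at least as many excluded instruments as endogenous regressors.
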
https://www.stata.com/support/faqs/statistics/instrumental-variables-regression/

Must I use all of my exogenous variables as instruments when estimating instrumental variables regression?

Title: Two-stage least-squares regression
Author: Vince Wiggins, StataCorp

Note: This model could also be fit with sem, using maximum likelihood instead of a two-step method.
You can find examples for recursive models fit with sem in the “Structural models: Dependencies between response variables” section of [SEM] intro 5 — Tour of models.

Someone posed the following question:

I am estimating an equation: Y = a + bX + cZ + dW. I then want to instrument W with Q. I know the first-stage regression is supposed to be W = e + fX + gZ + hQ (i.e., use all the exogenous variables in the first stage). Actually this is automatically done if I use the ivregress command. However, I only want to use Q to instrument W without using X and Z in the first stage. Is there a way I can do it in Stata? I can regress W on Q and get the predicted W, and then use it in the second-stage regression. The standard errors will, however, be incorrect.

ivregress will not let you do this and, moreover, if you believe W to be endogenous because it is part of a system, then you must include X and Z as instruments, or you will get biased estimates for b, c, and d.

Consider the system

Y1 = a0 + a1*Y2 + a2*X1 + a3*X2 + e1    (1)
Y2 = b0 + b1*Y1 + b2*X3 + b3*X4 + e2    (2)

Warning: Assume we are estimating structural equation (1); if X1 and X2 are exogenous, then they must be kept as instruments or your estimates will be biased. In a general system, such exogenous variables must be used as instruments for any endogenous variables when the instrumented value for the endogenous variables appears in an equation in which the exogenous variable also appears.

Consider the reduced forms of your two equations:

Y1 = e0 + e1*X1 + e2*X2 + e3*X3 + e4*X4 + u1    (1r)
Y2 = f0 + f1*X1 + f2*X2 + f3*X3 + f4*X4 + u2    (2r)

where e# and f# are combinations of the a# and b# coefficients from (1) and (2) and u1 and u2 are linear combinations of e1 and e2.

All exogenous variables appear in each equation for an endogenous variable. This is the nature of simultaneous systems, so efficiency argues that all exogenous variables be included as instruments for each endogenous variable.

Here is the real problem. Take (1): the reduced-form equation for Y2, (2r), clearly shows that Y2 is correlated with X2 (by the coefficient f2). If we do not include X2 among the instruments for Y2, then we will have failed to account for the correlation of Y2 with X2 in its instrumented values. Since we did not account for this correlation, when we estimate (1) with the instrumented values for Y2, the coefficient a3 will be forced to account for this correlation. This approach will lead to biased estimates of both a1 and a3.

For a brief reference, see Baltagi (2011). See the whole discussion of 2SLS, particularly the paragraph after equation 11.40, on page 265. (I have no idea why this issue is not emphasized in more books.)

Failing to include X4 affects only efficiency and not bias.

However, there is one case where it is not necessary to include X1 and X2 as instruments for Y2. That is when the system is triangular such that Y2 does not depend on Y1, but you believe it is weakly endogenous because the disturbances are correlated between the equations. You are still consistent here to do what ivregress does and retain X1 and X2 as instruments. They are, however, no longer required. Then you could do what you suggested and just regress on the predicted instruments from the first stage.

If you do use this method of indirect least squares, you will have to perform the adjustment to the covariance matrix yourself. Consider the structural equation

y1 = y2 + x1 + e

where you have an instrument z1 and you do not think that y2 is a function of y1.

The following example uses only z1 as an instrument for y2. Let’s begin by creating a dataset (containing made-up data) on y1, y2, x1, and z1:

. sysuse auto
(1978 Automobile Data)

. rename price y1
. rename mpg y2
. rename displacement z1
. rename turn x1

Now we perform the first-stage regression and get predictions for the instrumented variable, which we must do for each endogenous right-hand-side variable.

. regress y2 z1

      Source |       SS       df       MS              Number of obs =      74
-------------+------------------------------           F(  1,    72) =   71.41
       Model |  1216.67534     1  1216.67534           Prob > F      =  0.0000
    Residual |  1226.78412    72  17.0386683           R-squared     =  0.4979
-------------+------------------------------           Adj R-squared =  0.4910
       Total |  2443.45946    73  33.4720474           Root MSE      =  4.1278

------------------------------------------------------------------------------
          y2 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          z1 |  -.0444536   .0052606    -8.45   0.000    -.0549405   -.0339668
       _cons |   30.06788   1.143462    26.30   0.000     27.78843    32.34733
------------------------------------------------------------------------------

. predict double y2hat
(option xb assumed; fitted values)

* perform IV regression
. regress y1 y2hat x1

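      Source |       SS       df       MS              Number of obs =      74
-------------+------------------------------           F(  2,    71) =   12.41
       Model |   164538571     2  82269285.5           Prob > F      =  0.0000
    Residual |   470526825    71  6627138.38           R-squared     =  0.2591
-------------+------------------------------           Adj R-squared =  0.2382
       Total |   635065396    73  8699525.97           Root MSE      =  2574.3

------------------------------------------------------------------------------
          y1 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       y2hat |  -463.4688    117.187    -3.95   0.000    -697.1329   -229.8046
          x1 |  -126.4979   108.7468    -1.16   0.249    -343.3328    90.33697
       _cons |   21051.36   6451.837     3.26   0.002     8186.762    33915.96
------------------------------------------------------------------------------

Now we correct the variance–covariance by applying the correct mean squared error: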

. rename y2hat y2hold
. rename y2 y2hat
. predict double res, residual
. rename y2hat y2                /* put back real y2 */
. rename y2hold y2hat
. replace res = res^2
(74 real changes made)

. summarize res

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
         res |        74     7553657    1.43e+07   117.4375   1.06e+08

. scalar realmse = r(mean)*r(N)/e(df_r)    /* much ado about small sample */
. matrix bmatrix = e(b)
. matrix Vmatrix = e(V)
. matrix Vmatrix = e(V) * realmse / e(rmse)^2
. ereturn post bmatrix Vmatrix, noclear
. ereturn display


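------------------------------------------------------------------------------
             |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       y2hat |  -463.4688   127.7267    -3.63   0.001    -718.1485    -208.789
          x1 |  -126.4979   118.5274    -1.07   0.289    -362.8348    109.8389
       _cons |   21051.36   7032.111     2.99   0.004      7029.73    35072.99
------------------------------------------------------------------------------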
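
For comparison, the sketch below runs the built-in estimator on the same renamed variables. It is not part of the FAQ; it only shows what ivregress does by default, which is to include x1 alongside z1 in the first stage. Because the manual procedure in the FAQ uses z1 alone, the two sets of estimates should not be expected to match.

```stata
* Rebuild the renamed variables used in the FAQ example
sysuse auto, clear
rename price y1
rename mpg y2
rename displacement z1
rename turn x1

* Standard 2SLS: the first stage regresses y2 on z1 and the exogenous x1.
* small applies a degrees-of-freedom adjustment and reports t and F statistics;
* first prints the first-stage regression.
ivregress 2sls y1 x1 (y2 = z1), small first
```

Here the standard errors come out of the estimator directly, so no manual rescaling of the covariance matrix is needed.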

Reference

Baltagi, B. H. 2011. Econometrics. New York: Springer.

黃河泉

2018-5-3 10:00:34

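xiaoyuertutu posted on 2018-5-2 15:56
Hello. In this command, do the two endogenous variables x3 and x4 share the same instruments, namely iv1 and iv2?

I have another question:

"But Stata seems to treat ivx3 ivx4 ivx5 as the instruments for x3, for x4, and for x5 alike." That is your interpretation, not Stata's!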

xiaoyuertutu

2018-5-8 22:02:13

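蓝色 posted on 2018-5-2 20:34
The command should be
ivregress 2sls y x1 x2 (x3 x4 x5 = ivx3 ivx4 ivx5), r first
Stata will probably not accept the other forms.

Thank you, that was very thorough.
I have read the commands you post many times and learned a great deal from them; my sincere thanks. I have a few more questions:
1. Does r stand for robust?
2. What does first mean, and why add it?
3. Does each endogenous variable need two or more instruments, or is one enough?
Thanks again for your answers.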

xiaoyuertutu

2018-5-8 22:26:14

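黃河泉 posted on 2018-5-3 10:00
"But Stata seems to treat ivx3 ivx4 ivx5 as the instruments for x3, for x4, and for ...

Thank you, Professor Huang. My background is too thin and I have not yet understood this method.
I have read many threads but am still confused about how the instrumental-variables commands work, so I would like to ask:
Suppose
y = x1 x2 x3, where x1 is the endogenous variable, z1 and z2 are the instruments for x1, and x2 and x3 are control variables.
1. Is the single command ivregress 2sls y x2 x3 (x1 = z1 z2) enough to run both stages? Do I need to add anything after a comma, such as robust? Running the two stages separately gives incorrect standard errors, so does this command do everything in one step?
2. If the first stage is OLS and I want the second stage to be tobit, can I simply change the command to: ivtobit 2sls y x2 x3 (x1 = z1 z2)?
Thanks again.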
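
A minimal sketch of the two commands asked about, assuming the laborsup example dataset that Stata serves through webuse; the variable roles are those of that example data, not of the poster's model. One point worth noting is that ivtobit takes no estimator keyword such as 2sls: it uses maximum likelihood by default, and a two-step estimator is requested with the twostep option.

```stata
* Assumption: the laborsup example dataset is available through webuse
webuse laborsup, clear

* The poster's one-step 2SLS would be written as
*   ivregress 2sls y x2 x3 (x1 = z1 z2), vce(robust)
* with y, x1, x2, x3, z1, z2 being the poster's placeholder names.

* Tobit with an endogenous regressor, default maximum likelihood.
* ll without an argument uses the minimum of the dependent variable as the lower limit.
ivtobit fem_inc fem_educ kids (other_inc = male_educ), ll

* Same model with the two-step estimator
ivtobit fem_inc fem_educ kids (other_inc = male_educ), ll twostep
```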

努力发刊的小鱼

2021-5-19 14:40:39

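If there are several dependent variables (y), how should the instrumental-variables estimation be done? And how do I test for weak instruments?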
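
One possible way to approach both questions, sketched on the bundled auto dataset. The variable roles and the choice of price and length as outcomes are arbitrary assumptions made only so the commands run; estimating each outcome separately is one simple option, not the only one.

```stata
* Illustration only: the variable roles below are arbitrary assumptions
sysuse auto, clear

* Fit the IV model, then inspect instrument strength
ivregress 2sls price weight (mpg headroom = displacement turn gear_ratio)

* First-stage statistics; with several endogenous regressors this includes
* Shea's partial R-squared and the minimum eigenvalue statistic
estat firststage

* Test of overidentifying restrictions (three instruments, two endogenous regressors)
estat overid

* Several outcomes: run one IV regression per dependent variable
foreach y of varlist price length {
    ivregress 2sls `y' weight (mpg headroom = displacement turn gear_ratio)
}
```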
