如何用R语言做工具变量回归(未完工版本

zz/2024/3/1 14:47:06

在计量经济学中的回归中,可能会遇到遗漏变量偏误、测量误差、双向因果等问题,那么工具变量是解决此类内生性问题的几大利器之一。本文用Stock&Waston课本章节为例,展示如何在R语言中进行工具变量的回归。工具变量回归用到的两阶段最小二乘法(2SLS)

首先载入需要用到包,没安装的话需要提前安装

library(AER)
library(stargazer)

载入数据并且进行统计描述

data("CigarettesSW")
summary(CigarettesSW)
# compute real per capita prices
CigarettesSW$rprice <- with(CigarettesSW, price / cpi)
# compute the sales tax
CigarettesSW$salestax <- with(CigarettesSW, (taxs - tax) / cpi)
# check the correlation between sales tax and price
cor(CigarettesSW$salestax, CigarettesSW$price)
# generate a subset for the year 1995
c1995 <- subset(CigarettesSW, year == "1995")
# perform the first stage regression
cig_s1 <- lm(log(rprice) ~ salestax, data = c1995)
coeftest(cig_s1, vcov = vcovHC, type = "HC1")
# inspect the Rˆ2 of the first stage regression
summary(cig_s1)$r.squared
# store the predicted values
lcigp_pred <- cig_s1$fitted.values
# run the stage 2 regression
cig_s2 <- lm(log(c1995$packs) ~ lcigp_pred)
coeftest(cig_s2, vcov = vcovHC)

上面的方法其实很傻,并不需要这样。其实只需要执行一次就可以

# perform TSLS using 'ivreg()'
cig_ivreg <- ivreg(log(packs) ~ log(rprice) | salestax, data = c1995)
coeftest(cig_ivreg, vcov = vcovHC, type = "HC1")

那么还要有其他控制变量的话呢

# add rincome to the dataset
CigarettesSW$rincome <- with(CigarettesSW, income / population / cpi)
c1995 <- subset(CigarettesSW, year == "1995")
# estimate the model
cig_ivreg2 <- ivreg(log(packs) ~ log(rprice) + log(rincome) | log(rincome) +
salestax, data = c1995)
coeftest(cig_ivreg2, vcov = vcovHC, type = "HC1")
# add cigtax to the data set
CigarettesSW$cigtax <- with(CigarettesSW, tax/cpi)
c1995 <- subset(CigarettesSW, year == "1995")
# estimate the model
cig_ivreg3 <- ivreg(log(packs) ~ log(rprice) + log(rincome) |
log(rincome) + salestax + cigtax, data = c1995)
coeftest(cig_ivreg3, vcov = vcovHC, type = "HC1")

# subset data for year 1985
c1985 <- subset(CigarettesSW, year == "1985")
# define differences in variables
packsdiff <- log(c1995$packs) - log(c1985$packs)
pricediff <- log(c1995$price/c1995$cpi) - log(c1985$price/c1985$cpi)
incomediff <- log(c1995$income/c1995$population/c1995$cpi) -
log(c1985$income/c1985$population/c1985$cpi)
salestaxdiff <- (c1995$taxs - c1995$tax)/c1995$cpi - (c1985$taxs - c1985$tax)/c1985$cpi
cigtaxdiff <- c1995$tax/c1995$cpi - c1985$tax/c1985$cpi
# estimate the three models
cig_ivreg_diff1 <- ivreg(packsdiff ~ pricediff + incomediff | incomediff +
salestaxdiff)
cig_ivreg_diff2 <- ivreg(packsdiff ~ pricediff + incomediff | incomediff +
cigtaxdiff)
cig_ivreg_diff3 <- ivreg(packsdiff ~ pricediff + incomediff | incomediff +
salestaxdiff + cigtaxdiff)
# robust coefficient summary for 1.
coeftest(cig_ivreg_diff1, vcov = vcovHC, type = "HC1")
# robust coefficient summary for 2.
coeftest(cig_ivreg_diff2, vcov = vcovHC, type = "HC1")
# robust coefficient summary for 3.
coeftest(cig_ivreg_diff3, vcov = vcovHC, type = "HC1")
# gather robust standard errors in a list
rob_se <- list(sqrt(diag(vcovHC(cig_ivreg_diff1, type = "HC1"))),
sqrt(diag(vcovHC(cig_ivreg_diff2, type = "HC1"))),
sqrt(diag(vcovHC(cig_ivreg_diff3, type = "HC1"))))
# generate table
stargazer(cig_ivreg_diff1, cig_ivreg_diff2,cig_ivreg_diff3,
header = FALSE,
type = "html",
omit.table.layout = "n",
digits = 3,
column.labels = c("IV: salestax", "IV: cigtax", "IVs: salestax, cigtax"),
dep.var.labels.include = FALSE,
dep.var.caption = "Dependent Variable: 1985-1995 Difference in Log per Pack Price",
se = rob_se)
# first-stage regressions
mod_relevance1 <- lm(pricediff ~ salestaxdiff + incomediff)
mod_relevance2 <- lm(pricediff ~ cigtaxdiff + incomediff)
mod_relevance3 <- lm(pricediff ~ incomediff + salestaxdiff + cigtaxdiff)
# check instrument relevance for model (1)
linearHypothesis(mod_relevance1,
"salestaxdiff = 0",vcov = vcovHC, type = "HC1")
# check instrument relevance for model (2)
linearHypothesis(mod_relevance2,
"cigtaxdiff = 0",
vcov = vcovHC, type = "HC1")
# check instrument relevance for model (3)
linearHypothesis(mod_relevance3,
c("salestaxdiff = 0", "cigtaxdiff = 0"),
vcov = vcovHC, type = "HC1")
# compute the J-statistic
cig_iv_OR <- lm(residuals(cig_ivreg_diff3) ~ incomediff + salestaxdiff + cigtaxdiff)
cig_OR_test <- linearHypothesis(cig_iv_OR,
c("salestaxdiff = 0", "cigtaxdiff = 0"),
test = "Chisq")
cig_OR_test
# compute correct p-value for J-statistic
pchisq(cig_OR_test[2, 5], df = 1, lower.tail = FALSE)


http://www.ngui.cc/zz/1568978.html

相关文章

ISP一键下载电路分析+74HC1G66GW(信号图)

之前要不用的最小系统的开发板要不就是用的SWD调试口&#xff0c;没有注意过ISP一键下载电路是个怎么回事&#xff0c;因为需要就简单的看了一下野火的ISP一键下载电路。 默认情况一下&#xff0c;一般我们的程序都是从用户闪存也就是内部的FLASH启动的&#xff0c; 对于F103RC…

Python数据分析基础之描述性统计与建模(1)

葡萄酒质量数据集 葡萄酒质量数据集包括两个文件——红葡萄酒文件和白葡萄酒文件。红葡萄酒文件中包含1599条观测&#xff0c;白葡萄酒文件包含4898条观测。两个文件中都有1个输出变量和11个输入变量。输出变量是酒的质量&#xff0c;是一个从0&#xff08;低质量&#xff09;到…

html标记表示在,在HTML标记中,用于表示文件开头的标记是( )

在HTML标记中&#xff0c;用于表示文件开头的标记是( )更多相关问题[单选] 被高温分解的二噁英类前驱物在FeCl3、CuCl2等灰尘的作用下&#xff0c;与烟气中的HCl在()℃左右重新合成二噁英类物质。[单选] 相比而言&#xff0c;()在生物热能缓释中的利用价值最大。[单选] 堆肥处理…

6Lowpan报头压缩

IPv6 的地址分为两部分&#xff0c;前 64bit 为网络前缀&#xff0c;后 64bit 为接口标识。网络前缀的表示方式类似于 IPv4 中的 CIDR &#xff08;ClasslessInterDomain Routing&#xff09;无类域间路由地址的表示方式&#xff0c;IPv6 节点地址中指出了前缀长度&#xff0c;…

conda 安装 pytoch 用国内源则一直安装的 cpu版本?

最近用更改了 anaconda 的源&#xff0c;然后用 pytorch 官网安装方法&#xff0c;不加 -c pytorch forge 啥的&#xff0c;安装完&#xff0c;用以下命令&#xff0c;一直返回 False&#xff0c;就很崩溃了&#xff0c;无意中看到一个可以查看环境中安装的各个版本&#xff0c…

强网杯2021 pwn writeup by syclover

no_output 首先是利用strcpy把fd给覆盖为0 然后read hello_boy 通过检测 接着触发算数运算错误 就能进入栈溢出函数 最后就是直接用32位ret2dlresolve的模板了 from pwn import * arch 32 challenge "./test" local int(sys.argv[1]) context(log_level &…

【Python计量】异方差性的处理

文章目录一、使用“OLS稳健标准误”二、加权最小二乘法&#xff08;WLS)三、可行加权最小二乘法&#xff08;FWLS&#xff09;通过上篇内容&#xff0c;我们通过画残差图、BP检验、怀特检验、GQ检验等方法&#xff0c;发现模型存在异方差性。本篇文章主要介绍如何对异方差进行处…

医疗保健客户关系管理(CRM)-市场现状及未来发展趋势

本文研究全球及中国市场医疗保健客户关系管理(CRM)现状及未来发展趋势&#xff0c;侧重分析全球及中国市场的主要企业&#xff0c;同时对比北美、欧洲、中国、日本、东南亚和印度等地区的现状及未来发展趋势。 根据QYR&#xff08;恒州博智&#xff09;的统计及预测&#xff0c…

渗透之——使用Metasploit实现基于SEH的缓冲区溢出攻击

转载请注明出处&#xff1a;https://blog.csdn.net/l1028386804/article/details/86506457 攻击机&#xff1a;Kali 192.168.109.137 靶机&#xff1a; WinXP 192.168.109.141(其他Win系统亦可) 应用程序&#xff1a;Easy File Sharing Web Server 7.2 这里&#xff0c;我们…

使用Cypher获取指定结构的树

使用Cypher获取指定结构的树使用Cypher获取指定结构的树一、来自社区的问题链接二、编写查询实现数据封装2.1 创建样例数据2.2 Cypher实现Here’s the table of contents:使用Cypher获取指定结构的树 一、来自社区的问题链接 Neo4j 图数据库中文社区&#xff1a;如何获取指定…