
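2025-07-04 12:08
DID Series, Part 1: Basics of the Standard Two-Period DID Research Framework
Python and R implementations will be posted tomorrow and the day after, respectively.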
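2. Policy background

2.1 Policy release and evolution
In July 2012, Qingdao issued the "Opinions on Establishing a Long-Term Medical Care Insurance System (Trial)". It stipulated that long-term care insurance (LTCI) would first be rolled out in urban areas and extended to rural areas by 2015. Two key time points are worth noting: 2012 and 2015.

2.2 Who is insured under the Qingdao LTCI pilot
Everyone enrolled in urban employee medical insurance or in urban and rural resident medical insurance is uniformly included in LTCI.

2.3 Eligibility criteria for LTCI benefits
An insured person qualifies if, because of old age, illness, disability or similar causes, they have been bedridden for six months or more (or are expected to be), their condition is basically stable, and they score below 60 on the Activities of Daily Living assessment scale.

2.4 Benefits provided by LTCI
2.4.1 Four types of care
Dedicated medical care: dedicated care wards in designated inpatient medical institutions of level two or above provide insured persons with long-term, 24-hour continuous medical care.
Nursing-home medical care (institutional care): care institutions that combine medical and elderly care provide insured residents with 24-hour continuous medical care.
Home medical care (home care): care institutions send medical staff to the homes of insured employees to provide medical care.
Community visiting care (visiting care): care institutions (including village clinics) send medical staff to the homes of insured persons to provide services.
2.4.2 Coverage of medical care expenses that comply with the regulations

3. Data description
The original paper used three survey waves (2011, 2013 and 2015). Because this post introduces the standard two-period DID, only the 2011 and 2015 waves are used. The sample consists of households and individuals aged 45 and above. The data cover household characteristics, demographics, personal and household economic status, health status, use of and spending on medical services, lifestyle habits, disease history and so on.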
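4. Model specification

4.1 Control group and pre/post policy periods
In a DID model, the two most important variables to define are the following.
Treatment versus control group, that is, the sample affected by the policy and the sample not affected by it. From the policy background, the treatment group consists of middle-aged and elderly urban residents of Qingdao in the survey data, and the control group consists of urban residents of the other cities in the survey data.
Before versus after the policy. The policy started in 2012. Because this post covers only the standard two-period DID, the 2011 wave is the pre-policy sample and the 2015 wave is the post-policy sample.

4.2 Outcome variables
Following the paper, the outcome variables are: total outpatient spending in the past month, number of outpatient visits in the past month, total inpatient spending in the past year, and number of hospitalisations in the past year.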
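4.3 Control variables
Again following the paper, we consider demographic variables such as age, sex and marital status; socioeconomic variables such as education and household income per capita; and health variables such as self-rated health and the number of chronic diseases. These enter the model as controls.
[Image]
This leads to regression model (1).
In equation (1), the subscripts index the city, the individual and the time period. The dependent variable is replaced in turn by the four dependent variables in Table 1. The model also contains the control variables and a random error term. The core regression parameter is the coefficient on the interaction term, which measures the policy effect of the LTCI scheme implemented among Qingdao residents.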
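For reference, one specification that is consistent with the variables passed to regress in Section 5.3.1 can be written as below. The symbols are assumptions chosen here for illustration and are not necessarily the notation of the original paper: i indexes individuals, c cities and t survey waves, and X collects the control variables.

```latex
\begin{aligned}
Y_{ict} &= \beta_0 + \beta_1\, DID_{ct} + \beta_2\, Treat_c + \beta_3\, Post_t
           + X_{ict}'\gamma + \varepsilon_{ict}, \\
DID_{ct} &= Treat_c \times Post_t .
\end{aligned}
```

Under this reading, \beta_1 is the core parameter: the change in the outcome for the Qingdao group between 2011 and 2015, net of the change observed in the control group.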
5. Data analysis in Stata

5.1 Installing external command packages
ttable2 runs grouped t-tests, winsor2 winsorizes or trims, logout exports statistical results to Word/Excel, sum2docx exports descriptive statistics to Word, diff runs DID analysis, reghdfe runs regressions with high-dimensional fixed effects, and ftools is a Mata-based toolkit that reghdfe needs in order to run.
    ssc install ttable2, replace
    ssc install winsor2, replace
    ssc install logout, replace
    ssc install sum2docx, replace
    ssc install diff, replace
    ssc install reghdfe, replace
    ssc install ftools, replace
After each line runs successfully, Stata displays the following (**** stands in turn for ttable2, winsor2, logout and sum2docx):
    checking **** consistency and verifying not already installed...
    all files already exist and are up to date.
Within China the installation is sometimes very slow or fails. In that case, first run:
    ssc install cnssc, replace
and then run:
    cnssc install ttable2, replace
    cnssc install winsor2, replace
    cnssc install logout, replace
    cnssc install sum2docx, replace
    cnssc install diff, replace
    cnssc install ftools, replace

5.2 Descriptive statistics
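First set the working directory (for example, the data sit in the project folder on drive E, and the results are collected in the same folder), load the data, and winsorize the data to tame extreme values.
    cd "E:\project"
    // load the .dta data set
    use final.dta, clear
    // drop observations with a missing value in any variable
    foreach var in cost_clinic time_clinic cost_hos time_hos Post Treat DID Rural Age Gender married Edu_Group fainc lnfainc chro gh pain cesd {
        drop if `var' == .
    }
    // winsorize all continuous and multi-valued ordinal variables at the 1st and 99th percentiles
    winsor2 cost_clinic time_clinic cost_hos time_hos Age chro Edu_Group fainc lnfainc gh cesd, cut(1 99) replace
Observations with any missing value are dropped with a foreach loop. `var' is the loop variable, the variables between 'in' and '{' are the ones the loop runs over, and the code inside '{}' deletes the observations.
The variables listed after winsor2 are the ones to be winsorized. After the comma (the options), cut(1 99) sets the cut points at the 1st and 99th percentiles: values below the 1st percentile are set to the 1st percentile, and values above the 99th percentile are set to the 99th percentile. replace overwrites the original variables; without replace, the winsorized variables are created under new names with the suffix '_w'.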
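To make the effect of cut(1 99) concrete, here is a minimal Python sketch of winsorizing. It is an illustration, not the winsor2 implementation: the helper names and the toy data are hypothetical, and the linear-interpolation percentile rule used here is an assumption that may differ slightly from the rule Stata applies.

```python
def percentile(sorted_vals, p):
    """Linear-interpolation percentile (p in 0..100) of an already sorted list."""
    if not sorted_vals:
        raise ValueError("empty data")
    k = (len(sorted_vals) - 1) * p / 100.0
    lo = int(k)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (k - lo)


def winsorize(values, lower=1, upper=99):
    """Clamp values outside the [lower, upper] percentiles to those percentiles."""
    s = sorted(values)
    lo_cut, hi_cut = percentile(s, lower), percentile(s, upper)
    return [min(max(v, lo_cut), hi_cut) for v in values]


# Hypothetical monthly outpatient costs: mostly zero, one extreme value.
costs = [0] * 95 + [50, 120, 300, 800, 40000]
clipped = winsorize(costs)
print(max(costs), max(clipped))
```

The extreme value is pulled down to the 99th-percentile cut point while every other observation is left unchanged, so no observation is dropped.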
5.2.1 Descriptive statistics for the full sample
    tabstat cost_clinic time_clinic cost_hos time_hos Post Treat DID Rural Age Gender married Edu_Group fainc lnfainc chro gh pain cesd, s(n mean sd cv min median max k sk) c(s)
tabstat computes descriptive statistics for the variables that follow it. In the options, 's()' lists the statistics to compute and 'c(s)' places the statistics in columns (with 'c(v)' the variables are placed in columns instead, so the table below would be transposed).
This gives the descriptive statistics:
    Variable     N      Mean      SD        CV        Min       p50       Max       Kurtosis  Skewness
    ---------------------------------------------------------------------------------------------------
    cost_clinic  29391  120.0818  518.2586  4.31588   0         0         4000      41.89921  6.052325
    time_clinic  29391  .3545303  .9212584  2.598532  0         0         5         14.01949  3.235013
    cost_hos     29391  725.6395  3311.203  4.563151  0         0         25000     37.96861  5.737651
    time_hos     29391  .1069715  .3766619  3.521141  0         0         2         16.70562  3.728614
    Post         29391  .5180838  .4996814  .9644799  0         1         1         1.005239  -.0723824
    Treat        29391  .0038787  .0621597  16.02575  0         0         1         255.8197  15.96307
    DID          29391  .0022116  .046976   21.24111  0         0         1         450.1714  21.19367
    Rural        29391  .7937124  .4046463  .5098148  0         1         1         3.107502  -1.451724
    Age          29391  59.19077  9.369093  .1582864  45        58        83        2.485779  .4759141
    Gender       29391  .5103263  .4999019  .979573   0         1         1         1.001707  -.041314
    married      29391  .8776156  .3277348  .3734378  0         1         1         6.310427  -2.304436
    Edu_Group    29391  2.938689  1.338308  .45541    1         3         5         1.873939  -.112724
    fainc        29391  12178.41  15801.98  1.297541  1         6766      93714.33  11.91505  2.663377
    lnfainc      29391  8.155491  2.527126  .3098681  .6931472  8.819813  11.44802  6.145089  -1.879134
    chro         29391  1.488483  1.456642  .9786083  0         1         6         3.607613  1.028399
    gh           29391  2.156204  .5803246  .2691418  1         2         3         2.774461  -.0277315
    pain         29391  .2998537  .4582015  1.528083  0         0         1         1.763233  .8736319
    cesd         29391  8.079378  6.284093  .7777941  0         7         26        3.078649  .8498227
    ---------------------------------------------------------------------------------------------------
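Next, export the descriptive statistics to Word (table1c_descrip_all.rtf).
The first export method uses 'logout'. 'logout' exports statistical tables, and its main content sits in the options after the comma. 'save()' sets the folder and file name of the exported table (without a folder path the file goes to the working directory), 'word' sets the output format to rtf, and 'replace' overwrites an existing file of the same name. The command after ':' produces the table to be exported; here it is the descriptive statistics command, so the exported table holds the descriptive statistics.
    logout, save(table1c_descrip_all) word replace: tabstat cost_clinic time_clinic cost_hos time_hos Post Treat DID Rural Age Gender married Edu_Group fainc lnfainc chro gh pain cesd, s(n mean sd cv min median max k sk) c(s)
The second export method uses the sum2docx command. Unlike 'logout', 'sum2docx' does not print the descriptive statistics table in the Stata Results window but writes it directly to Word. The variables to describe follow 'sum2docx', the folder path and file name follow 'using', 'stats()' lists the statistics to output, and replace has the same meaning as above.
    sum2docx cost_clinic time_clinic cost_hos time_hos Post Treat DID Rural Age Gender married Edu_Group fainc lnfainc chro gh pain cesd using "table1c_descrip_all.docx", replace stats(N mean sd min p25 median p75 max skewness kurtosis)

5.2.2 Group comparisons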
Descriptive statistics give an overall picture of how each variable is distributed and how observations differ. We also need to compare the control and treatment groups before and after the policy, variable by variable.
Outcome variables of the control group before and after LTCI:
'ttable2' runs t-tests on the variables that follow it. 'if Treat==0' restricts the test to the control group. After the comma, 'by(Post)' compares the periods before and after the policy, and 'format' sets the means and mean differences to four decimal places.
    ttable2 cost_clinic time_clinic cost_hos time_hos Age Gender married Edu_Group fainc chro gh pain cesd if Treat==0, by(Post) format(%9.4f)
In the t-test table below, 'G1(0)' and 'Mean1' are the sample size and mean before the policy, 'G2(1)' and 'Mean2' are the sample size and mean after the policy, and 'MeanDiff' is the difference in means (Mean1 minus Mean2). '*', '**' and '***' mark significance at the 0.1, 0.05 and 0.01 levels.
    --------------------------------------------------------------------------
    Variables    G1(0)  Mean1      G2(1)  Mean2      MeanDiff
    --------------------------------------------------------------------------
    cost_clinic  14115  92.5326    15162  146.1172   -53.5846***
    time_clinic  14115  0.3544     15162  0.3561     -0.0016
    cost_hos     14115  464.8485   15162  962.3160   -497.4675***
    time_hos     14115  0.0808     15162  0.1308     -0.0500***
    Age          14115  58.7795    15162  59.5625    -0.7830***
    Gender       14115  0.5173     15162  0.5036     0.0136**
    married      14115  0.8748     15162  0.8804     -0.0055
    Edu_Group    14115  2.7601     15162  3.1027     -0.3426***
    fainc        14115  1.22e+04   15162  1.20e+04   138.3820
    chro         14115  1.3852     15162  1.5863     -0.2011***
    gh           14115  2.2072     15162  2.1103     0.0969***
    pain         14115  0.3170     15162  0.2843     0.0326***
    cesd         14115  8.3514     15162  7.8504     0.5010***
    --------------------------------------------------------------------------
The table shows that, after the policy, the control group's outpatient spending (cost_clinic), inpatient spending (cost_hos) and number of hospitalisations (time_hos) all rose significantly, by 53.5846 yuan, 497.4675 yuan and 0.05 respectively.
Export the control group's before/after comparison to Word, file name table1b_descrip_treat. The syntax is the same as above and is not repeated.
    logout, save(table1b_descrip_treat) word replace: ttable2 cost_clinic time_clinic cost_hos time_hos Age Gender married Edu_Group fainc chro gh pain cesd if Treat==0, by(Post) format(%9.4f)
Outcome variables of the treatment group (Qingdao) before and after LTCI:
    ttable2 cost_clinic time_clinic cost_hos time_hos Age Gender married Edu_Group fainc chro gh pain cesd if Treat==1, by(Post) format(%9.4f)
    --------------------------------------------------------------------------
    Variables    G1(0)  Mean1      G2(1)  Mean2      MeanDiff
    --------------------------------------------------------------------------
    cost_clinic  49     152.0000   65     5.3846     146.6154*
    time_clinic  49     0.3061     65     0.0462     0.2600**
    cost_hos     49     2067.3469  65     1138.4615  928.8854
    time_hos     49     0.2041     65     0.1538     0.0502
    Age          49     61.0408    65     60.3846    0.6562
    Gender       49     0.5510     65     0.5385     0.0126
    married      49     0.8367     65     0.8769     -0.0402
    Edu_Group    49     2.9592     65     3.4462     -0.4870**
    fainc        49     4.81e+04   65     1.97e+04   2.83e+04***
    chro         49     1.1020     65     1.4000     -0.2980
    gh           49     2.0000     65     1.9231     0.0769
    pain         49     0.4286     65     0.1077     0.3209***
    cesd         49     4.6939     65     4.9692     -0.2754
    --------------------------------------------------------------------------
For the treatment group, outpatient spending (cost_clinic) and outpatient visits (time_clinic) differ significantly before and after the policy (at the 0.1 and 0.05 levels, respectively). In 2015, after the policy, outpatient spending was 146.615 yuan lower than in 2011 and outpatient visits were 0.26 lower. Inpatient spending (cost_hos) and hospitalisations (time_hos) did not change significantly.
Taken together, the control group's outpatient spending (cost_clinic), outpatient visits (time_clinic), inpatient spending (cost_hos) and hospitalisations (time_hos) all rose to varying degrees after the policy, whereas the treatment group's all fell to varying degrees. We still cannot rashly conclude that LTCI caused this pattern. Other factors have to be ruled out as far as possible before a reasonably credible result can be claimed.
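The comparison of the two t-test tables can be condensed into a raw difference-in-differences. The short Python sketch below does this for cost_clinic using the four group means copied from those tables. The dictionary layout and the function name are hypothetical. No control variables are involved, so this is only the unadjusted counterpart of a regression-based estimate.

```python
# Group means of cost_clinic, copied from the two ttable2 tables.
means = {
    ("control", "pre"): 92.5326,
    ("control", "post"): 146.1172,
    ("treated", "pre"): 152.0000,
    ("treated", "post"): 5.3846,
}


def did_from_means(m):
    """(treated post - treated pre) - (control post - control pre)."""
    treated_change = m[("treated", "post")] - m[("treated", "pre")]
    control_change = m[("control", "post")] - m[("control", "pre")]
    return treated_change - control_change


raw_did = did_from_means(means)
print(round(raw_did, 4))
```

The treated group's change is -146.6154 and the control group's change is +53.5846, so the raw DID for outpatient spending is about -200.2 yuan.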
Export the treatment group's before/after comparison to Word, file name table1a_descrip_control:
    logout, save(table1a_descrip_control) word replace: ttable2 cost_clinic time_clinic cost_hos time_hos Age Gender married Edu_Group fainc chro gh pain cesd if Treat==1, by(Post) format(%9.4f)

5.3 Baseline regression analysis
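For the baseline regression we show four different commands (regress, reghdfe, diff, didregress). regress is Stata's built-in linear regression command, reghdfe and diff are external commands that must be installed, and didregress is an official DID command added in Stata 17.

5.3.1 Using regress
'reg' is short for 'regress'. It is followed, in order, by the outcome variable (the dependent variable), the core explanatory variable (DID), and the remaining regressors and control variables. 'vce(cluster city)' requests standard errors clustered at the city level. 'est store m*' stores the regression results of the previous line in memory under the name 'm*'.
    reg cost_clinic DID Post Treat Rural Age Gender married i.Edu_Group lnfainc chro gh pain cesd, vce(cluster city)
    est store m1
    reg time_clinic DID Post Treat Rural Age Gender married i.Edu_Group lnfainc chro gh pain cesd, vce(cluster city)
    est store m2
    reg cost_hos DID Post Treat Rural Age Gender married i.Edu_Group lnfainc chro gh pain cesd, vce(cluster city)
    est store m3
    reg time_hos DID Post Treat Rural Age Gender married i.Edu_Group lnfainc chro gh pain cesd, vce(cluster city)
    est store m4
Collect the regression results in the Results window. esttab is followed by the names of the stored results to report. nogap suppresses blank lines, and compress reduces the space between columns. 'r2' and 'ar2' request R-squared and adjusted R-squared, 'scalar(N)' reports the sample size, and t reports t statistics instead of standard errors. 'star(* 0.1 ** 0.05 *** 0.01)' marks p-values below 0.1 with "*", below 0.05 with "**" and below 0.01 with "***". 'mtitle()' sets the title of each model.
    esttab m1 m2 m3 m4, replace nogap compress r2 ar2 scalar(N) t star(* 0.1 ** 0.05 *** 0.01) mtitle("cost_clinic" "time_clinic" "cost_hos" "time_hos")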
Results:
    ------------------------------------------------------------------------
                  (1)           (2)           (3)           (4)
                  cost_clinic   time_clinic   cost_hos      time_hos
    ------------------------------------------------------------------------
    DID           -199.6***     -0.239***     -1471.3***    -0.103***
                  (-25.05)      (-16.30)      (-30.30)      (-15.76)
    Post          55.14***      0.0118        493.7***      0.0495***
                  (7.44)        (0.91)        (11.83)       (9.19)
    Treat         84.17***      0.00649       1774.0***     0.148***
                  (12.18)       (0.35)        (45.34)       (27.94)
    Rural         -30.53***     0.0232        -485.3***     -0.0297***
                  (-3.08)       (1.04)        (-7.94)       (-4.42)
    Age           0.0715        -0.000176     20.27***      0.00260***
                  (0.15)        (-0.21)       (6.15)        (8.25)
    Gender        6.256         0.0481***     -128.6***     -0.0132***
                  (0.99)        (3.81)        (-3.06)       (-2.66)
    married       7.620         0.0183        118.6         0.00458
                  (0.73)        (1.00)        (1.63)        (0.53)
    1.Edu_Gr~p    0             0             0             0
                  (.)           (.)           (.)           (.)
    2.Edu_Gr~p    12.52         0.0323        119.7**       0.0132*
                  (1.29)        (1.10)        (2.02)        (1.72)
    3.Edu_Gr~p    25.24**       0.0148        129.8**       0.00525
                  (2.56)        (0.62)        (2.41)        (0.74)
    4.Edu_Gr~p    37.24***      0.0116        258.9***      0.0219***
                  (3.19)        (0.48)        (4.10)        (2.74)
    5.Edu_Gr~p    43.91***      0.0191        212.3**       0.0174*
                  (3.41)        (0.70)        (2.46)        (1.72)
    lnfainc       4.152***      0.00495**     15.68**       0.00146*
                  (2.75)        (2.00)        (2.01)        (1.69)
    chro          34.83***      0.0795***     236.4***      0.0374***
                  (9.73)        (11.36)       (10.60)       (13.98)
    gh            82.58***      0.161***      575.4***      0.0748***
                  (11.45)       (12.26)       (11.87)       (14.47)
    pain          29.26***      0.132***      -5.509        0.00982
                  (3.53)        (7.30)        (-0.09)       (1.48)
    cesd          2.495***      0.00670***    6.527         0.00130**
                  (3.64)        (4.91)        (1.53)        (2.47)
    _cons         -214.4***     -0.313***     -2299.1***    -0.300***
                  (-4.93)       (-4.20)       (-7.60)       (-9.65)
    ------------------------------------------------------------------------
    N             29391         29391         29391         29391
    R-sq          0.037         0.064         0.047         0.067
    adj. R-sq     0.037         0.063         0.046         0.067
    ------------------------------------------------------------------------
    t statistics in parentheses
    * p<0.1, ** p<0.05, *** p<0.01
The table reports the baseline regression results. The main interest is the estimated DID coefficient, which is significant at the 1% level in all four models. After LTCI was implemented in Qingdao, the treatment group's outpatient spending fell by 199.6 yuan on average, outpatient visits fell by 0.239 on average, inpatient spending fell by 1471.3 yuan on average, and hospitalisations fell by 0.103 on average.
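Export the baseline regression results to a Word file, table2_basireg.rtf. To write the results to Word instead of displaying them in the Results window, add 'using' followed by the path and file name before the comma in the esttab command, as shown below.
    esttab m1 m2 m3 m4 using "table2_basireg.rtf", replace nogap compress r2 ar2 scalar(N) t star(* 0.1 ** 0.05 *** 0.01) mtitle("Monthly outpatient cost" "Monthly outpatient visits" "Annual inpatient cost" "Annual inpatient visits")

5.3.2 Using the external command reghdfe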
'reghdfe' runs regressions with high-dimensional fixed effects. Here the only difference from regress is the 'noabsorb' option, which states that no fixed effects are absorbed.
    reghdfe cost_clinic DID Post Treat Rural Age Gender married i.Edu_Group lnfainc chro gh pain cesd, vce(cluster city) noabsorb
    est store m1
    reghdfe time_clinic DID Post Treat Rural Age Gender married i.Edu_Group lnfainc chro gh pain cesd, vce(cluster city) noabsorb
    est store m2
    reghdfe cost_hos DID Post Treat Rural Age Gender married i.Edu_Group lnfainc chro gh pain cesd, vce(cluster city) noabsorb
    est store m3
    reghdfe time_hos DID Post Treat Rural Age Gender married i.Edu_Group lnfainc chro gh pain cesd, vce(cluster city) noabsorb
    est store m4
Collect the regression results in the Results window:
    esttab m1 m2 m3 m4, replace nogap compress r2 ar2 scalar(N) t star(* 0.1 ** 0.05 *** 0.01) mtitle("cost_clinic" "time_clinic" "cost_hos" "time_hos")
Results:
The reghdfe output matches the regress table in Section 5.3.1 row for row (coefficients, t statistics, N, R-sq and adj. R-sq), so the table is not repeated here.
Export the baseline regression results to a Word file, table2_basireg.rtf:
    esttab m1 m2 m3 m4 using "table2_basireg.rtf", replace nogap compress r2 ar2 scalar(N) t star(* 0.1 ** 0.05 *** 0.01) mtitle("Monthly outpatient cost" "Monthly outpatient visits" "Annual inpatient cost" "Annual inpatient visits")

5.3.3 Using the external command diff
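'diff' is one of many external commands for DID analysis. Its covariate list does not accept factor variables (the i.Edu_Group used above), so Edu_Group has to be converted to dummy variables first by running 'tab Edu_Group, gen(Edu)'.
With 'diff', the outcome variable follows the command, and the options identify the treatment and control groups, the pre and post periods, and the control variables. 't()' gives the group indicator, 'p()' gives the pre/post period indicator, 'cov()' lists the control variables, and 'cluster()' requests clustered standard errors (omit it for non-clustered standard errors). report asks for the full regression results.
Taking the analysis of cost_clinic (outpatient spending) as an example:
    // generate dummy variables for highest education level
    tab Edu_Group, gen(Edu)
    diff cost_clinic, t(Treat) p(Post) cov(Rural Age Gender married Edu2 Edu3 Edu4 Edu5 lnfainc chro gh pain cesd) cluster(city) report
Without the 'report' option, 'diff' reports, conditional on the control variables, the outcome means of the control and treatment groups before (Before) and after (After) the policy, together with the difference between the two groups and its significance test. The last row reports the double difference, that is, the estimated coefficient on the DID variable used earlier. The table also helps to see why the parallel-trends assumption matters. The Diff-in-Diff estimate is exactly the treated-minus-control difference after the policy minus the same difference before the policy (-115.395 - 84.173 = -199.568). For this quantity to represent the policy effect, one key assumption is needed: had the policy not been implemented, the post-policy gap between the treatment and control groups in the outcome would not differ significantly from the pre-policy gap, so that any change in the gap would be due to chance alone. If a statistical test then shows that the change is statistically significant, it is reasonable to attribute it to the policy.
    ---------------------------------------------------------------
    Outcome var.   | cost_~c  | S. Err. | |t|     | P>|t|
    ---------------+----------+---------+---------+----------------
    Before         |          |         |         |
       Control     | -214.368 |         |         |
       Treated     | -130.195 |         |         |
       Diff (T-C)  | 84.173   | 6.913   | 12.18   | 0.000***
    After          |          |         |         |
       Control     | -159.225 |         |         |
       Treated     | -274.620 |         |         |
       Diff (T-C)  | -115.395 | 6.212   | 18.58   | 0.000***
                   |          |         |         |
    Diff-in-Diff   | -199.568 | 7.967   | 25.05   | 0.000***
    ---------------------------------------------------------------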
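The arithmetic behind the diff table can be checked by hand. The Python sketch below recomputes the three difference rows and their |t| values from the group means and standard errors printed in the table. The variable names are hypothetical, and the only inputs are the numbers shown in the table.

```python
# Numbers copied from the diff output for cost_clinic.
before = {"control": -214.368, "treated": -130.195, "se": 6.913}
after = {"control": -159.225, "treated": -274.620, "se": 6.212}
did_se = 7.967

diff_before = before["treated"] - before["control"]  # Diff (T-C), Before
diff_after = after["treated"] - after["control"]     # Diff (T-C), After
did = diff_after - diff_before                       # Diff-in-Diff row

# |t| is the absolute estimate divided by its standard error.
t_before = abs(diff_before) / before["se"]
t_after = abs(diff_after) / after["se"]
t_did = abs(did) / did_se

print(round(diff_before, 3), round(diff_after, 3), round(did, 3))
print(round(t_before, 2), round(t_after, 2), round(t_did, 2))
```

The recomputed differences (84.173, -115.395, -199.568) and |t| values (12.18, 18.58, 25.05) agree with the table, which confirms that the last row is the after-policy gap minus the before-policy gap.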
The baseline regression results from diff can be collected in the same way:
    diff cost_clinic, t(Treat) p(Post) cov(Rural Age Gender married Edu2 Edu3 Edu4 Edu5 lnfainc chro gh pain cesd) cluster(city) report
    est store m1
    diff time_clinic, t(Treat) p(Post) cov(Rural Age Gender married Edu2 Edu3 Edu4 Edu5 lnfainc chro gh pain cesd) cluster(city)
    est store m2
    diff cost_hos, t(Treat) p(Post) cov(Rural Age Gender married Edu2 Edu3 Edu4 Edu5 lnfainc chro gh pain cesd) cluster(city)
    est store m3
    diff time_hos, t(Treat) p(Post) cov(Rural Age Gender married Edu2 Edu3 Edu4 Edu5 lnfainc chro gh pain cesd) cluster(city)
    est store m4
Collect the regression results in the Results window:
    esttab m1 m2 m3 m4, replace nogap compress r2 ar2 scalar(N) t star(* 0.1 ** 0.05 *** 0.01) mtitle("cost_clinic" "time_clinic" "cost_hos" "time_hos")
Results: the '_diff' variable is the DID variable used earlier, and the estimates agree with those from regress and reghdfe.
    ------------------------------------------------------------------------
                  (1)           (2)           (3)           (4)
                  cost_cl~c     time_cl~c     cost_hos      time_hos
    ------------------------------------------------------------------------
    Post          55.14***      0.0118        493.7***      0.0495***
                  (7.44)        (0.91)        (11.83)       (9.19)
    Treat         84.17***      0.00649       1774.0***     0.148***
                  (12.18)       (0.35)        (45.34)       (27.94)
    _diff         -199.6***     -0.239***     -1471.3***    -0.103***
                  (-25.05)      (-16.30)      (-30.30)      (-15.76)
    Rural         -30.53***     0.0232        -485.3***     -0.0297***
                  (-3.08)       (1.04)        (-7.94)       (-4.42)
    Age           0.0715        -0.000176     20.27***      0.00260***
                  (0.15)        (-0.21)       (6.15)        (8.25)
    Gender        6.256         0.0481***     -128.6***     -0.0132***
                  (0.99)        (3.81)        (-3.06)       (-2.66)
    married       7.620         0.0183        118.6         0.00458
                  (0.73)        (1.00)        (1.63)        (0.53)
    Edu2          12.52         0.0323        119.7**       0.0132*
                  (1.29)        (1.10)        (2.02)        (1.72)
    Edu3          25.24**       0.0148        129.8**       0.00525
                  (2.56)        (0.62)        (2.41)        (0.74)
    Edu4          37.24***      0.0116        258.9***      0.0219***
                  (3.19)        (0.48)        (4.10)        (2.74)
    Edu5          43.91***      0.0191        212.3**       0.0174*
                  (3.41)        (0.70)        (2.46)        (1.72)
    lnfainc       4.152***      0.00495**     15.68**       0.00146*
                  (2.75)        (2.00)        (2.01)        (1.69)
    chro          34.83***      0.0795***     236.4***      0.0374***
                  (9.73)        (11.36)       (10.60)       (13.98)
    gh            82.58***      0.161***      575.4***      0.0748***
                  (11.45)       (12.26)       (11.87)       (14.47)
    pain          29.26***      0.132***      -5.509        0.00982
                  (3.53)        (7.30)        (-0.09)       (1.48)
    cesd          2.495***      0.00670***    6.527         0.00130**
                  (3.64)        (4.91)        (1.53)        (2.47)
    _cons         -214.4***     -0.313***     -2299.1***    -0.300***
                  (-4.93)       (-4.20)       (-7.60)       (-9.65)
    ------------------------------------------------------------------------
    N             29391         29391         29391         29391
    R-sq          0.037         0.064         0.047         0.067
    adj. R-sq     0.037         0.063         0.046         0.067
    ------------------------------------------------------------------------
    t statistics in parentheses
    * p<0.1, ** p<0.05, *** p<0.01
The command for exporting to Word is the same as before.
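5.3.4 Using didregress (Stata 17 and later only)
'didregress' is an official command available only in Stata 17 and later. Applied to the standard two-period DID model, its basic syntax is:
    didregress (ovar omvarlist) (tvar), group(groupvar) time(timevar) options
'ovar' is the outcome variable and 'omvarlist' is a set of covariates/control variables. 'tvar' indicates which observations are affected by the policy. Note that it identifies observations, not individuals. Take an individual A from Qingdao (LTCI was introduced in Qingdao in 2012, so A belongs to the treatment group). A contributes two observations, one for 2011 and one for 2015, but only the 2015 observation, which comes after the policy, is affected by it. For individual A, 'tvar' therefore equals 0 for the 2011 observation and 1 for the 2015 observation. In our case, 'tvar' is DID.
'groupvar' is a dummy variable that identifies whether an individual belongs to the treatment or the control group. For individual A above, 'groupvar' equals 1 for both the 2011 and the 2015 observation. In our case 'Treat' is the 'groupvar'.
'timevar' can be viewed as a dummy variable that identifies the timing of the policy, corresponding to our 'Post'. 'options' stands for further optional settings. 'vce(cluster city)' has the same meaning as before, and aequations is used when the estimates for the covariates/control variables should be reported. Note that the example below again uses the Edu dummy variables instead of factor-variable notation, and that the resulting table contains no separate coefficient for 'Treat'.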
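The distinction between the observation-level 'tvar' and the individual-level 'groupvar' can be illustrated with a small Python sketch. The records, field names and the POLICY_YEAR constant are hypothetical. Only the rule itself (treated city, observed after the 2012 start) comes from the text above.

```python
# Hypothetical long-format records: two observations per individual.
records = [
    {"id": "A", "city": "Qingdao", "year": 2011},
    {"id": "A", "city": "Qingdao", "year": 2015},
    {"id": "B", "city": "Other", "year": 2011},
    {"id": "B", "city": "Other", "year": 2015},
]

POLICY_YEAR = 2012  # the scheme started in Qingdao in 2012

for r in records:
    r["Treat"] = 1 if r["city"] == "Qingdao" else 0   # groupvar: fixed per individual
    r["Post"] = 1 if r["year"] >= POLICY_YEAR else 0  # timevar: fixed per wave
    r["DID"] = r["Treat"] * r["Post"]                 # tvar: varies by observation

for r in records:
    print(r["id"], r["year"], r["Treat"], r["Post"], r["DID"])
```

Individual A has Treat equal to 1 in both waves, but DID equals 1 only for the 2015 observation. Individual B has DID equal to 0 throughout.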
    didregress (cost_clinic Rural Age Gender married Edu2 Edu3 Edu4 Edu5 lnfainc chro gh pain cesd) (DID), group(Treat) time(Post) vce(cluster city) aequations
    est store m1
    didregress (time_clinic Rural Age Gender married Edu2 Edu3 Edu4 Edu5 lnfainc chro gh pain cesd) (DID), group(Treat) time(Post) vce(cluster city) aequations
    est store m2
    didregress (cost_hos Rural Age Gender married Edu2 Edu3 Edu4 Edu5 lnfainc chro gh pain cesd) (DID), group(Treat) time(Post) vce(cluster city) aequations
    est store m3
    didregress (time_hos Rural Age Gender married Edu2 Edu3 Edu4 Edu5 lnfainc chro gh pain cesd) (DID), group(Treat) time(Post) vce(cluster city) aequations
    est store m4
Collect the regression results in the Results window:
    esttab m1 m2 m3 m4, replace nogap compress r2 ar2 scalar(N) t star(* 0.1 ** 0.05 *** 0.01) mtitle("cost_clinic" "time_clinic" "cost_hos" "time_hos")
Results:
    ------------------------------------------------------------------------
                  (1)           (2)           (3)           (4)
                  cost_cl~c     time_cl~c     cost_hos      time_hos
    ------------------------------------------------------------------------
    ATET
    r1vs0.DID     -199.6***     -0.239***     -1471.3***    -0.103***
                  (-25.05)      (-16.30)      (-30.30)      (-15.76)
    ------------------------------------------------------------------------
    Controls
    Rural         -30.53***     0.0232        -485.3***     -0.0297***
                  (-3.08)       (1.04)        (-7.94)       (-4.42)
    Age           0.0715        -0.000176     20.27***      0.00260***
                  (0.15)        (-0.21)       (6.15)        (8.25)
    Gender        6.256         0.0481***     -128.6***     -0.0132***
                  (0.99)        (3.81)        (-3.06)       (-2.66)
    married       7.620         0.0183        118.6         0.00458
                  (0.73)        (1.00)        (1.63)        (0.53)
    Edu2          12.52         0.0323        119.7**       0.0132*
                  (1.29)        (1.10)        (2.02)        (1.72)
    Edu3          25.24**       0.0148        129.8**       0.00525
                  (2.56)        (0.62)        (2.41)        (0.74)
    Edu4          37.24***      0.0116        258.9***      0.0219***
                  (3.19)        (0.48)        (4.10)        (2.74)
    Edu5          43.91***      0.0191        212.3**       0.0174*
                  (3.41)        (0.70)        (2.46)        (1.72)
    lnfainc       4.152***      0.00495**     15.68**       0.00146*
                  (2.75)        (2.00)        (2.01)        (1.69)
    chro          34.83***      0.0795***     236.4***      0.0374***
                  (9.73)        (11.36)       (10.60)       (13.98)
    gh            82.58***      0.161***      575.4***      0.0748***
                  (11.45)       (12.26)       (11.87)       (14.47)
    pain          29.26***      0.132***      -5.509        0.00982
                  (3.53)        (7.30)        (-0.09)       (1.48)
    cesd          2.495***      0.00670***    6.527         0.00130**
                  (3.64)        (4.91)        (1.53)        (2.47)
    0.Post        0             0             0             0
                  (.)           (.)           (.)           (.)
    1.Post        55.14***      0.0118        493.7***      0.0495***
                  (7.44)        (0.91)        (11.83)       (9.19)
    _cons         -214.0***     -0.313***     -2292.3***    -0.299***
                  (-4.93)       (-4.20)       (-7.58)       (-9.63)
    ------------------------------------------------------------------------
    N             29391         29391         29391         29391
    R-sq
    adj. R-sq
    ------------------------------------------------------------------------

5.4 Further analysis
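We still need to ask whether the baseline regression has shortcomings. The full-sample descriptive statistics show clearly that the minimum and the median of all four outcome variables are 0, that kurtosis is very high and that skewness is far from 0. Most observations of the four outcomes therefore sit at the lowest value, 0, with no obvious transition zone between 0 and the high values. This raises a key question: the relationship between the explanatory variables (the covariates and the DID terms) and the four outcomes may not be linear, or a linear relationship may describe it poorly, in which case the LTCI policy effect could be underestimated.
Returning to the descriptive analysis of the four outcome variables:
    tabstat cost_clinic time_clinic cost_hos time_hos, s(n mean sd min p25 p50 p75 max k sk) c(v)
The output is:
    Stats    | cost_c~c  time_c~c  cost_hos  time_hos
    ---------+-----------------------------------------
    N        | 29391     29391     29391     29391
    Mean     | 120.0818  .3545303  725.6395  .1069715
    SD       | 518.2586  .9212584  3311.203  .3766619
    Min      | 0         0         0         0
    p25      | 0         0         0         0
    p50      | 0         0         0         0
    p75      | 0         0         0         0
    Max      | 4000      5         25000     2
    Kurtosis | 41.89921  14.01949  37.96861  16.70562
    Skewness | 6.052325  3.235013  5.737651  3.728614
    ---------------------------------------------------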
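To see how a mass of zeros produces this pattern of quartiles, skewness and kurtosis, here is a minimal Python sketch on hypothetical data. The sample is invented for illustration and is not drawn from the survey. The moment formulas are the usual population-moment definitions, which are assumed here to match what tabstat reports.

```python
def moments(values):
    """Return (share of zeros, skewness, kurtosis) from population moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    zero_share = sum(1 for v in values if v == 0) / n
    return zero_share, m3 / m2 ** 1.5, m4 / m2 ** 2


# Hypothetical outcome: 80% zeros plus a thin right tail.
sample = [0] * 80 + [1] * 10 + [2] * 6 + [5] * 4
zero_share, skew, kurt = moments(sample)
print(zero_share, round(skew, 2), round(kurt, 2))
```

With 80% zeros, the 25th, 50th and 75th percentiles are all 0, skewness is strongly positive and kurtosis is well above the value of 3 that a normal distribution would give, which is the same qualitative picture as in the table.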
A kernel density plot shows the same thing visually.
'kdensity' draws a kernel density plot. ylabel sets the axis labels, legend sets the legend, and graphregion sets the colour of the graph region. 'graph save' saves a graph in gph format, 'graph combine' merges gph graphs into a single figure, and 'graph export' exports the figure to another format.
    kdensity cost_clinic, ylabel(#4, format(%9.4f)) legend(off) graphregion(fcolor(white) ifcolor(white))
    graph save "cost_clinic.gph", replace
    kdensity time_clinic, ylabel(, format(%9.1f)) legend(off) graphregion(fcolor(white) ifcolor(white))
    graph save "time_clinic.gph", replace
    kdensity cost_hos, ylabel(#3, format(%9.4f)) legend(off) graphregion(fcolor(white) ifcolor(white))
    graph save "cost_hos.gph", replace
    kdensity time_hos, ylabel(, format(%9.0f)) legend(off) graphregion(fcolor(white) ifcolor(white))
    graph save "time_hos.gph", replace
    graph combine "cost_clinic.gph" "time_clinic.gph" "cost_hos.gph" "time_hos.gph", row(2)
    graph export "figure1_kernal.png", width(1920) height(1500) replace
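Fig 1: Kernel density plots of the outcome variables

5.4.1 Tobit regression
Suppose we treat this feature of the outcome variables simply as a censored distribution. That is, assume a person goes to the outpatient clinic only when their physical discomfort passes a certain threshold, so the outcome has latent 'negative values' that are all recorded as 0 in practice. Put differently, if there were more continuous, intermediate medical services below outpatient care and between outpatient and inpatient care, the outcome variables would be close to a normal or t distribution. (In fact this assumption is not necessarily reasonable or realistic, because it idealises reality too much. We may devote a separate post to it later. As a learning example, it is worth a try here.)
Under this view, the observed data are only the right-hand part of a normal or other symmetric distribution, so a tobit model can be used. For present purposes, tobit can be thought of as fitting a piecewise equation that is linear over part of its range. In the tobit regression code, 'tobit' is the regression command and 'll(0)' states that the data are censored at a lower limit of 0.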
    tobit cost_clinic DID Post Treat Age Gender married i.Edu_Group lnfainc chro gh pain cesd, vce(cluster city) ll(0)
    est store m1
    tobit time_clinic DID Post Treat Age Gender married i.Edu_Group lnfainc chro gh pain cesd, vce(cluster city) ll(0)
    est store m2
    tobit cost_hos DID Post Treat Age Gender married i.Edu_Group lnfainc chro gh pain cesd, vce(cluster city) ll(0)
    est store m3
    tobit time_hos DID Post Treat Age Gender married i.Edu_Group lnfainc chro gh pain cesd, vce(cluster city) ll(0)
    est store m4
    esttab m1 m2 m3 m4, replace nogap compress scalar(N r2_p) t star(* 0.1 ** 0.05 *** 0.01) mtitle("cost_clinic" "time_clinic" "cost_hos" "time_hos")
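For readers who want to see what censoring at a lower limit means for estimation, the following Python sketch writes out the textbook log-likelihood of a tobit model that is left-censored at 0. It is an illustration under stated assumptions, not Stata's implementation: the function names, the toy data and the value of sigma are hypothetical, and no optimisation is performed.

```python
import math


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def norm_logpdf(z):
    """Log of the standard normal density."""
    return -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)


def tobit_loglik(y, xb, sigma, lower=0.0):
    """Log-likelihood of a tobit model left-censored at `lower`.

    y: observed outcomes; xb: linear predictions x'beta; sigma: latent std. dev.
    """
    total = 0.0
    for yi, mu in zip(y, xb):
        if yi <= lower:
            # Censored: we only know the latent outcome is at or below the limit.
            total += math.log(norm_cdf((lower - mu) / sigma))
        else:
            # Uncensored: ordinary normal density of the residual.
            total += norm_logpdf((yi - mu) / sigma) - math.log(sigma)
    return total


# Hypothetical data: three zeros (censored) and two positive costs.
y = [0.0, 0.0, 0.0, 120.0, 300.0]
xb = [-50.0, -10.0, 20.0, 100.0, 250.0]
print(round(tobit_loglik(y, xb, sigma=100.0), 4))
```

Zeros contribute the probability that the latent outcome falls at or below 0, while positive values contribute an ordinary normal density. This is why the fitted coefficients describe the latent outcome and can be much larger in magnitude than linear-regression coefficients on the observed outcome.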
The tobit results are as follows. According to the estimates, after LTCI was implemented the treatment group's outpatient spending fell by 1844.0 yuan on average, outpatient visits fell by 3.157 on average, inpatient spending fell by 8437.5 yuan on average, and hospitalisations fell by 0.848 on average. Note that these are coefficients on the latent outcome rather than average marginal effects on the observed outcome. They are large relative to each outcome's maximum (4000, 5, 25000 and 2), which runs against common sense, so there is reason to suspect that the tobit estimates overstate the LTCI policy effect.
    ------------------------------------------------------------------------
                  (1)           (2)           (3)           (4)
                  cost_cl~c     time_cl~c     cost_hos      time_hos
    ------------------------------------------------------------------------
    main
    DID           -1844.0***    -3.157***     -8437.5***    -0.848***
                  (-27.59)      (-37.83)      (-13.81)      (-10.99)
    Post          138.1***      0.0498        4056.8***     0.460***
                  (3.99)        (0.73)        (8.25)        (7.37)
    Treat         314.4***      0.248***      11949.1***    1.313***
                  (6.88)        (2.62)        (20.36)       (17.96)
    Age           -1.295        -0.00248      247.3***      0.0304***
                  (-0.61)       (-0.60)       (7.76)        (8.57)
    Gender        99.75***      0.252***      -897.3*       -0.111*
                  (3.18)        (3.98)        (-1.91)       (-1.89)
    married       39.57         0.0716        475.3         0.0198
                  (0.92)        (0.84)        (0.75)        (0.25)
    1.Edu_Gr~p    0             0             0             0
                  (.)           (.)           (.)           (.)
    2.Edu_Gr~p    80.33         0.146         1466.4**      0.166*
                  (1.49)        (1.15)        (2.05)        (1.82)
    3.Edu_Gr~p    110.7**       0.116         1072.7*       0.0814
                  (2.30)        (1.05)        (1.71)        (1.00)
    4.Edu_Gr~p    142.5**       0.105         2876.7***     0.293***
                  (2.52)        (0.89)        (4.07)        (3.28)
    5.Edu_Gr~p    190.4***      0.141         3366.5***     0.322***
                  (2.78)        (0.97)        (3.96)        (3.03)
    lnfainc       23.28***      0.0320***     228.2**       0.0240**
                  (3.59)        (2.66)        (2.36)        (2.05)
    chro          181.9***      0.365***      2440.6***     0.322***
                  (14.06)       (14.34)       (15.57)       (17.02)
    gh            476.6***      0.894***      6721.9***     0.838***
                  (15.34)       (15.48)       (13.11)       (15.78)
    pain          233.8***      0.564***      612.9         0.111*
                  (6.53)        (7.46)        (1.14)        (1.66)
    cesd          13.06***      0.0277***     62.72         0.00913*
                  (4.70)        (4.77)        (1.64)        (1.85)
    _cons         -3413.0***    -6.176***     -65473.2***   -8.054***
                  (-14.89)      (-17.47)      (-17.67)      (-22.85)
    ------------------------------------------------------------------------
    /
    var..cos~)    2329803.9***
                  (15.65)
    var..tim~)                  9.453***
                                (36.75)
    var..cos~)                                307714041.5***
                                              (17.28)
    var..tim~)                                              4.959***
                                                            (42.74)
    ------------------------------------------------------------------------
    N             29391         29391         29391         29391
    r2_p          0.0173        0.0438        0.0252        0.0759
    ------------------------------------------------------------------------
Export the tobit results to a Word file, table3a_tobit.rtf:
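    esttab m1 m2 m3 m4 using "$path/table/table3a_tobit.rtf", replace nogap compress scalar(N r2_p) t star(* 0.1 ** 0.05 *** 0.01) mtitle("Monthly outpatient cost" "Monthly outpatient visits" "Annual inpatient cost" "Annual inpatient visits")

5.4.2 Estimation with negative binomial regression
The descriptive statistics show that the outcome variables are overdispersed, that is, the variance is clearly larger than the mean, so we try a negative binomial regression. (Negative binomial regression is normally used when the dependent variable is a count. In our case spending is not a count, but because its distribution resembles that of count data, we treat it as such.) In the Stata implementation, 'nbreg' is the negative binomial regression command (the standard model, not a zero-inflated variant) and 'margins, dydx(*) post' estimates average marginal effects.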
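The overdispersion claim can be checked directly from the full-sample descriptive table. The Python sketch below computes the variance-to-mean ratio of each outcome from the Mean and SD columns. The dictionary name is hypothetical, and the benchmark used (variance equal to the mean, as under a Poisson model) is the standard reference point for count data.

```python
# (mean, sd) pairs copied from the full-sample descriptive table.
stats = {
    "cost_clinic": (120.0818, 518.2586),
    "time_clinic": (0.3545303, 0.9212584),
    "cost_hos": (725.6395, 3311.203),
    "time_hos": (0.1069715, 0.3766619),
}

# Under a Poisson model the variance equals the mean, so a ratio above 1
# signals overdispersion.
ratios = {name: sd ** 2 / mean for name, (mean, sd) in stats.items()}
for name, ratio in ratios.items():
    print(name, round(ratio, 2))
```

All four ratios exceed 1. The two spending variables are overdispersed by several orders of magnitude, while the two count variables are only moderately so (roughly 2.4 for outpatient visits and 1.3 for hospitalisations).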
    nbreg cost_clinic DID Post Treat Age Gender married i.Edu_Group lnfainc chro gh pain cesd, vce(cluster city)
    margins, dydx(*) post
    est store m1
    nbreg time_clinic DID Post Treat Age Gender married i.Edu_Group lnfainc chro gh pain cesd, vce(cluster city)
    margins, dydx(*) post
    est store m2
    nbreg cost_hos DID Post Treat Age Gender married i.Edu_Group lnfainc chro gh pain cesd, vce(cluster city)
    margins, dydx(*) post
    est store m3
    nbreg time_hos DID Post Treat Age Gender married i.Edu_Group lnfainc chro gh pain cesd, vce(cluster city)
    margins, dydx(*) post
    est store m4
    esttab m1 m2 m3 m4, replace nogap compress scalar(N r2_p) t star(* 0.1 ** 0.05 *** 0.01) mtitle("cost_clinic" "time_clinic" "cost_hos" "time_hos")
We obtain the following results:
    ------------------------------------------------------------------------
                  (1)           (2)           (3)           (4)
                  cost_cl~c     time_cl~c     cost_hos      time_hos
    ------------------------------------------------------------------------
    DID           -782.9***     -0.641***     -1844.4***    -0.0961***
                  (-19.55)      (-23.59)      (-11.79)      (-12.09)
    Post          54.40***      0.0158        491.3***      0.0453***
                  (5.64)        (1.12)        (7.09)        (7.46)
    Treat         239.8***      -0.0261       1529.1***     0.122***
                  (16.13)       (-1.31)       (14.72)       (21.38)
    Age           0.473         0.000691      23.07***      0.00299***
                  (0.91)        (0.79)        (5.98)        (9.29)
    Gender        6.967         0.0563***     -138.1**      -0.0105**
                  (0.80)        (4.20)        (-2.40)       (-2.03)
    married       -7.281        0.0127        37.27         0.00383
                  (-0.51)       (0.71)        (0.42)        (0.55)
    1.Edu_Gr~p    0             0             0             0
                  (.)           (.)           (.)           (.)
    2.Edu_Gr~p    27.14**       0.0469        136.7         0.0150**
                  (2.11)        (1.61)        (1.36)        (1.99)
    3.Edu_Gr~p    30.26***      0.0228        72.88         0.00603
                  (2.84)        (1.00)        (0.99)        (0.91)
    4.Edu_Gr~p    52.53***      0.00729       286.5***      0.0278***
                  (3.94)        (0.31)        (2.85)        (3.41)
    5.Edu_Gr~p    72.65***      0.0120        337.7***      0.0319***
                  (3.70)        (0.39)        (2.93)        (3.01)
    lnfainc       5.826***      0.00591**     19.74         0.00221**
                  (2.86)        (2.29)        (1.62)        (2.13)
    chro          34.99***      0.0703***     218.1***      0.0283***
                  (9.89)        (10.93)       (8.28)        (13.93)
    gh            89.03***      0.186***      600.6***      0.0833***
                  (10.29)       (12.10)       (8.86)        (14.29)
    pain          33.59***      0.123***      91.28         0.0103*
                  (4.21)        (7.48)        (1.16)        (1.73)
    cesd          2.101***      0.00542***    11.02**       0.000891**
                  (2.92)        (4.61)        (2.01)        (1.97)
    ------------------------------------------------------------------------
    N             29391         29391         29391         29391
    r2_p
    ------------------------------------------------------------------------
    t statistics in parentheses
    * p<0.1, ** p<0.05, *** p<0.01
According to the negative binomial results, after LTCI was implemented the treatment group's outpatient spending fell by 782.9 yuan on average, outpatient visits fell by 0.641 on average, inpatient spending fell by 1844.4 yuan on average, and hospitalisations fell by 0.0961 on average. For the first three outcomes the estimates lie between those of the linear regression and the tobit model. For hospitalisations the estimate is slightly below the linear one (0.103).
5.4.3 Log-transforming the outcome variables
This section only offers an idea. For this case study, taking logs does not actually solve the censoring problem: all four outcome variables are censored at their lowest value, 0, and because ln(0+1) = 0 the transformed data are censored at 0 as well. First take logs of the outcome variables:
    foreach var in cost_clinic time_clinic cost_hos time_hos {
        gen ln`var' = ln(`var' + 1)
    }
Then run the regressions:
    reg lncost_clinic DID Post Treat Age Gender married i.Edu_Group lnfainc chro gh pain cesd, vce(cluster city)
    est store m1
    reg lntime_clinic DID Post Treat Age Gender married i.Edu_Group lnfainc chro gh pain cesd, vce(cluster city)
    est store m2
    reg lncost_hos DID Post Treat Age Gender married i.Edu_Group lnfainc chro gh pain cesd, vce(cluster city)
    est store m3
    reg lntime_hos DID Post Treat Age Gender married i.Edu_Group lnfainc chro gh pain cesd, vce(cluster city)
    est store m4
    esttab m1 m2 m3 m4, replace nogap compress r2 ar2 scalar(N) t star(* 0.1 ** 0.05 *** 0.01) mtitle("Monthly outpatient cost" "Monthly outpatient visits" "Annual inpatient cost" "Annual inpatient visits")
The regression results are:
    ------------------------------------------------------------------------
                  (1)           (2)           (3)           (4)
                  lncost_~c     lntime_~c     lncost_~s     lntime_~s
    ------------------------------------------------------------------------
    DID           -0.900***     -0.130***     -0.455***     -0.0527***
                  (-27.01)      (-19.00)      (-11.63)      (-13.37)
    Post          0.135***      0.00635       0.296***      0.0281***
                  (4.38)        (1.03)        (8.76)        (8.58)
    Treat         0.190***      0.0103        0.901***      0.0841***
                  (4.84)        (1.22)        (26.82)       (26.13)
    Age           -0.00292      -0.000353     0.0169***     0.00169***
                  (-1.47)       (-0.92)       (7.74)        (8.41)
    Gender        0.119***      0.0220***     -0.0647*      -0.00665**
                  (4.22)        (3.79)        (-1.95)       (-2.16)
    married       0.0456        0.00743       0.0287        0.00196
                  (1.10)        (0.88)        (0.55)        (0.38)
    1.Edu_Gr~p    0             0             0             0
                  (.)           (.)           (.)           (.)
    2.Edu_Gr~p    0.0465        0.0131        0.0954*       0.00909*
                  (0.85)        (1.03)        (1.86)        (1.89)
    3.Edu_Gr~p    0.0861*       0.00839       0.0695        0.00518
                  (1.80)        (0.78)        (1.57)        (1.20)
    4.Edu_Gr~p    0.120**       0.00645       0.202***      0.0177***
                  (2.23)        (0.57)        (4.04)        (3.68)
    5.Edu_Gr~p    0.164**       0.00822       0.230***      0.0199***
                  (2.57)        (0.62)        (3.86)        (3.42)
    lnfainc       0.0197***     0.00252**     0.0146**      0.00124**
                  (3.42)        (2.32)        (2.42)        (2.27)
    chro          0.212***      0.0384***     0.231***      0.0232***
                  (14.29)       (12.37)       (14.13)       (14.41)
    gh            0.428***      0.0777***     0.460***      0.0449***
                  (14.42)       (12.99)       (14.11)       (14.38)
    pain          0.287***      0.0616***     0.0355        0.00526
                  (7.07)        (7.42)        (0.80)        (1.26)
    cesd          0.0152***     0.00319***    0.00549*      0.000664**
                  (5.12)        (5.23)        (1.80)        (2.13)
    _cons         -0.716***     -0.112***     -2.050***     -0.203***
                  (-4.21)       (-3.54)       (-10.05)      (-10.72)
    ------------------------------------------------------------------------
    N             29391         29391         29391         29391
    R-sq          0.071         0.070         0.060         0.065
    adj. R-sq     0.071         0.070         0.059         0.065
    ------------------------------------------------------------------------
    t statistics in parentheses
    * p<0.1, ** p<0.05, *** p<0.01

References
[1] Villa, J. M. (2016). diff: Simplifying the estimation of difference-in-differences treatment effects. Stata Journal, 16(1), 52-71.
[2] Greene, W. H. (2018). Econometric Analysis. Pearson Education India.
[3] Ma Chao, Yu Qinwen, Song Ze & Chen Hao. (2019). 长期护理保险、医疗费用控制与价值医疗 [Long-term care insurance, medical cost control and value-based healthcare]. 中国工业经济 (China Industrial Economics), (12), 42-59.