SAS菜鸟,想处理国际贸易数据现在有9张表,载有9类商品的贸易数据,每张表都有四个变量,即Reporter (出口国),Partner (进口国), Commodity Value(特定商品的贸易流量),Year (统计年份)。
Reporter是Group of Seven里面的七国,Partner是UN COMTRADE数据库里面能获取的所有国家。
9张表的区别在于Commodity Value这一项不一样,比如有的为pharmacy(药品)的贸易流量,有的为Aerospace (航空设备)的贸易流量。
现在我想9张表联合查询,弄出Reporter, Partner, Year, Pharmacy(Commodity1), Aerospace(Commodity2), Commodity3, ...... , Commodity9 的数据形式
编了一个初步的程序如下:
proc sql;
create table g7.hitech as
SELECT distinct
a.Reporter, a.Partner,a.Year, b.aerospace, c.office, a.telecom, e.pharmacy, f.sci, g.emachi, h.chem, i.nemachi, j.aa
FROM g7.telecom as a
LEFT JOIN g7.aerospace as b on a.Reporter = b.Reporter and a.Partner = b.Partner and a.Year = b.Year
LEFT JOIN g7.office as c on a.Reporter = c.Reporter and a.Partner = c.Partner and a.Year = c.Year
LEFT JOIN g7.pharmacy as e on a.Reporter = e.Reporter and a.Partner = e.Partner and a.Year = e.Year
LEFT JOIN g7.sci as f on a.Reporter = f.Reporter and a.Partner = f.Partner and a.Year = f.Year
LEFT JOIN g7.emachi as g on a.Reporter = g.Reporter and a.Partner = g.Partner and a.Year = g.Year
LEFT JOIN g7.chem as h on a.Reporter = h.Reporter and a.Partner = h.Partner and a.Year = h.Year
LEFT JOIN g7.nemachi as i on a.Reporter = i.Reporter and a.Partner = i.Partner and a.Year = i.Year
LEFT JOIN g7.aa as j on a.Reporter = j.Reporter and a.Partner = j.Partner and a.Year = j.Year
;
quit;
使用telecom这个表做LEFT JOIN的主表主要是因为其观测值最多。
不过这个程序有个问题,就是有的国家有Pharmacy或aerospace的贸易,但是没有telecom的贸易,因此这些观测值就被忽略了;若使用FULL JOIN,则这类Pharmacy或Aerospace的观测值会出现在新表中,但是这些值对应的Reporter, Partner, Year这三个指标就成了缺失值。
我希望保留这些观测值,同时也将其Reporter, Partner, Year的指标也放入新表中
求大侠帮忙解答,感激不尽。