2011-07-07
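I need to run a survey that uses PPS sampling on the variable GDP. The requirement is to draw a sample of 20 regions with probability proportional to GDP. How can this be done in SAS? Thanks.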
Region GDP
Beijing 11865.9
Tianjin 7500.8
Hebei 17026.6
Liaoning 15065.6
Shanghai 14900.9
Jiangsu 34061.2
Zhejiang 22832.4
Fujian 11949.5
Shandong 33805.3
Guangdong 39081.6
Hainan 1646.6
Shanxi 7365.7
Jilin 7203.2
Heilongjiang 8288.0
Anhui 10052.9
Jiangxi 7589.2
Henan 19367.3
Hubei 12831.5
Hunan 12930.7
Chongqing 6528.7
Sichuan 14151.3
Guizhou 3893.5
Yunnan 6168.2
Tibet 441.4
Shaanxi 8186.7
Gansu 3382.4
Qinghai 1081.3
Ningxia 1334.6
Xinjiang 4273.6
InnerMongolia 9725.8
Guangxi 7700.4
All replies
2011-7-8 08:37:37
A naive question: what does PPS mean?
2011-7-8 09:25:55
Use PROC SURVEYSELECT. (In reply to #1, wcguo94)
2011-7-9 08:40:57
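leedx posted on 2011-7-8 08:37
A naive question: what does PPS mean?
PPS stands for Probability Proportionate to Size sampling, usually just called PPS sampling: each unit's chance of being selected is proportional to its size measure.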
2011-7-9 08:54:49
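wcguo94 posted on 2011-7-7 10:45
I need to run a survey that uses PPS sampling on the variable GDP. The requirement is to draw a sample of 20 regions with probability proportional to GDP. How can this be done in SAS? Thanks.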
Try this:
data tmp2;
id=_n_;
input region :$20. gdp; /* :$20. keeps names longer than 8 characters, e.g. Heilongjiang */
cards;
Beijing 11865.9
Tianjin 7500.8
Hebei 17026.6
Liaoning 15065.6
Shanghai 14900.9
Jiangsu 34061.2
Zhejiang 22832.4
Fujian 11949.5
Shandong 33805.3
Guangdong 39081.6
Hainan 1646.6
Shanxi 7365.7
Jilin 7203.2
Heilongjiang 8288.0
Anhui 10052.9
Jiangxi 7589.2
Henan 19367.3
Hubei 12831.5
Hunan 12930.7
Chongqing 6528.7
Sichuan 14151.3
Guizhou 3893.5
Yunnan 6168.2
Tibet 441.4
Shaanxi 8186.7
Gansu 3382.4
Qinghai 1081.3
Ningxia 1334.6
Xinjiang 4273.6
InnerMongolia 9725.8
Guangxi 7700.4
;
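proc sort data=tmp2; by gdp; run;

/* ranselect: thin wrapper around PROC SURVEYSELECT.
   data      input data set; if blank, a frame with id=1..&total is built
   method=   SURVEYSELECT method, e.g. srs
   n=        sample size (or give samprate= instead)
   strata=   optional stratification variables
   var=      variables listed on the ID statement
   export=   0 prints the sample; any other value exports it to Excel */
%macro ranselect(data,method=,samprate=,n=,total=,out=,strata=,var=,export=0);
options nodate nonumber ps=100 ls=90;
%if &data ne %then %do;
  %if &strata ne %then %do; proc sort data=&data; by &strata; run; %end;
  %if &n ne %then %do;
    proc surveyselect data=&data seed=20060704 method=&method n=&n out=&out;
  %end;
  %if &samprate ne %then %do;
    proc surveyselect data=&data seed=20060704 method=&method samprate=&samprate out=&out;
  %end;
  /* emit STRATA and ID only when the matching parameter was supplied */
  %if &strata ne %then %do; strata &strata; %end;
  %if &var ne %then %do; id &var; %end;
  run;
%end;
%else %do;
  /* no input data set: sample from a generated list of ids */
  data aaaa; do id=1 to &total; output; end; run;
  %if &strata ne %then %do; proc sort data=aaaa; by &strata; run; %end;
  %if &n ne %then %do;
    proc surveyselect data=aaaa seed=20060704 method=&method n=&n out=&out;
  %end;
  %if &samprate ne %then %do;
    proc surveyselect data=aaaa seed=20060704 method=&method samprate=&samprate out=&out;
  %end;
  %if &strata ne %then %do; strata &strata; %end;
  %if &var ne %then %do; id &var; %end;
  run;
%end;
%if &export=0 %then %do; proc print data=&out noobs; run; %end;
%else %do;
  proc export data=&out outfile='d:\tmp\ranselect.xls' dbms=excel2000 replace; run;
%end;
/* clean up: delete the sample data set, with the listing suppressed */
ods select none;
proc datasets; delete &out; run; quit;
ods select all;
%mend ranselect;

/* Note: method=srs draws 20 of the 31 regions with equal probability;
   GDP is not used as a size measure in this call. */
%ranselect(tmp2,method=srs,samprate=,n=20,total=31,out=bbbb,strata=,var=id region gdp,export=1);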
2011-7-12 09:44:55
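wcguo94 posted on 2011-7-7 10:45
I need to run a survey that uses PPS sampling on the variable GDP. The requirement is to draw a sample of 20 regions with probability proportional to GDP. How can this be done in SAS? Thanks.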
The PPS method comes with a requirement: no unit's size may exceed 1/n of the total size. In your data the largest sample size that satisfies this is 9, so METHOD=PPS cannot draw n=20.
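
The limit of 9 can be checked by hand. This is only a sketch, assuming without-replacement PPS in which the inclusion probability of region i is n times its share of total GDP. The symbols \pi_i, x_i and n are notation introduced here and do not appear in the thread.

```latex
\pi_i = \frac{n\,x_i}{\sum_{j} x_j} \le 1
\quad\Longrightarrow\quad
n \le \frac{\sum_{j} x_j}{\max_i x_i}
  = \frac{362232.8}{39081.6} \approx 9.27
```

Here 362232.8 is the sum of the 31 GDP values and 39081.6 is Guangdong's GDP, the largest one, so the largest admissible whole-number sample size is 9.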

I hope the following example illustrates the idea clearly.

data TravelExpense;
length Region $20 GDP 8;
input Region GDP;
cards;
Beijing 11865.9
Tianjin 7500.8
Hebei 17026.6
Liaoning 15065.6
Shanghai 14900.9
Jiangsu 34061.2
Zhejiang 22832.4
Fujian 11949.5
Shandong 33805.3
Guangdong 39081.6
Hainan 1646.6
Shanxi 7365.7
Jilin 7203.2
Heilongjiang 8288.0
Anhui 10052.9
Jiangxi 7589.2
Henan 19367.3
Hubei 12831.5
Hunan 12930.7
Chongqing 6528.7
Sichuan 14151.3
Guizhou 3893.5
Yunnan 6168.2
Tibet 441.4
Shaanxi 8186.7
Gansu 3382.4
Qinghai 1081.3
Ningxia 1334.6
Xinjiang 4273.6
InnerMongolia 9725.8
Guangxi 7700.4
;
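/* Add the total size, each region's share of it, and the largest n for
   which that region alone still satisfies GDP/sum(GDP) <= 1/n. */
proc sql;
create table TravelExpense2 as
select
*, sum(GDP) as TotalSize, GDP/sum(GDP) as RelativeSize, int(sum(GDP)/GDP) as max
from TravelExpense;
quit;
proc print data=TravelExpense2;run;
/* The smallest value of max (9, for Guangdong) is the largest allowed n, so n=9 works. */
proc surveyselect data=TravelExpense2 method=pps n=9 seed=123
out=sample9;
size GDP;
run;
/* n=10 violates the requirement; expect SURVEYSELECT to stop with an error here. */
proc surveyselect data=TravelExpense2 method=pps n=10 seed=123
out=sample10;
size GDP;
run;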
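
If 20 draws are still needed, one possible workaround is PPS sampling with replacement (METHOD=PPS_WR), which does not impose the 1/n limit because a region may be drawn more than once. The step below is a minimal sketch, not the thread's own solution. It assumes the TravelExpense data set created above, and the output name sample_wr and the seed are arbitrary choices. Because large regions can be hit several times, the result will usually contain fewer than 20 distinct regions, so whether this meets the survey's needs is a design decision.

```sas
/* Sketch: PPS sampling WITH replacement, so n=20 is allowed even though
   Guangdong holds more than 1/20 of total GDP.
   Assumes the TravelExpense data set (Region, GDP) defined earlier. */
proc surveyselect data=TravelExpense method=pps_wr n=20 seed=123
                  out=sample_wr;
   size GDP;   /* GDP is the size measure */
run;

/* One row per selected region; NumberHits counts how often it was drawn. */
proc print data=sample_wr noobs;
   var Region GDP NumberHits;
run;
```

The NumberHits values add up to 20, the requested number of draws.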