2010-10-22
If data set A has 100 observations, how many observations does data set B have after the following program is submitted, and why?

data B;
    output;
    set A;
    output;
run;
All replies
2010-10-22 14:39:54
201 observations.
The first OUTPUT writes 101 records, the first of which is empty (all values missing).
The second OUTPUT writes 100 records.
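
One way to check the 201 count is to tag each row with the OUTPUT statement that wrote it. The sketch below assumes a hypothetical test data set A with a single variable x numbered 1 to 100; the marker variable src and the PROC FREQ step are illustrative additions, not part of the original program.

```sas
/* Hypothetical test data: 100 observations, one variable x */
data A;
    do x = 1 to 100;
        output;
    end;
run;

data B;
    src = 1;   /* marks rows written by the first OUTPUT */
    output;
    set A;
    src = 2;   /* marks rows written by the second OUTPUT */
    output;
run;

/* Count the rows written by each OUTPUT statement */
proc freq data=B;
    tables src;
run;
```

Under these assumptions, src=1 should appear 101 times and src=2 100 times. The first row has x missing; each later src=1 row repeats the x value of the previous observation, because variables read by SET are retained across iterations.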

2010-10-22 14:46:52
This DATA step does not end at the RUN statement;
it ends when the SET statement tries to read from the data set and finds nothing left to read.
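
A small sketch can make the stopping point visible. It assumes a hypothetical three-row data set t1 (x=1, y=2 in every row) and writes a message to the log before and after the SET statement.

```sas
/* Hypothetical 3-row input */
data t1;
    x = 1;
    y = 2;
    output;
    output;
    output;
run;

/* No output data set is needed; only the log messages matter */
data _null_;
    put 'before SET: ' _n_=;
    set t1;
    put 'after SET: ' _n_=;
run;
```

The log should show the "before SET" message four times (_N_=1 through 4) but the "after SET" message only three times: the fourth iteration starts, reaches SET, finds nothing to read, and the step ends there.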

2010-10-23 10:55:00
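pobel posted on 2010-10-22 14:46
This DATA step does not end at the RUN statement;
it ends when the SET statement tries to read from the data set and finds nothing left to read.
"This DATA step does not end at the RUN statement" is a statement I don't like. I would even call it wrong.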

Data a b c;
  ...;
run;

This is a clear, well-defined DATA step block. The RUN statement is one of the most important statements in SAS, even though SAS's own examples unfortunately often omit it.

When the SAS supervisor sees

data a b c;
run;

it starts to compile the DATA step. It may issue a compile-time error and stop.

Unlike other languages, a regular SAS DATA step has many implicit statements. For example, you do not need to write the OUTPUT statement: an implicit one sits 'just' above the RUN statement. SAS also has an internal loop that goes through all rows (observations).

"It ends when the SET statement tries to read from the data set and finds nothing left to read." True.


data t2;
  set t1;
  put _all_;
run;

x=1 y=2 _ERROR_=0 _N_=1
x=1 y=2 _ERROR_=0 _N_=2
x=1 y=2 _ERROR_=0 _N_=3
NOTE: There were 3 observations read from the data set WORK.T1.
NOTE: The data set WORK.T2 has 3 observations and 2 variables.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds

SAS read all three observations from data set t1 and wrote them to data set t2. The _N_ variable counts the iterations of the implicit loop.
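
The implicit OUTPUT at work in that log can be compared against an explicit one. The sketch below assumes a hypothetical t1 with three rows (x=1, y=2), matching the log; the data set names implicit and explicit are illustrative.

```sas
/* Hypothetical 3-row input matching the log above */
data t1;
    x = 1;
    y = 2;
    output;
    output;
    output;
run;

/* Relies on the implicit OUTPUT at the bottom of the step */
data implicit;
    set t1;
run;

/* Same step with the OUTPUT written out explicitly */
data explicit;
    set t1;
    output;
run;

/* The two data sets should be identical */
proc compare base=implicit compare=explicit;
run;
```

Both steps should write three observations, and PROC COMPARE should report no differences.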



data t3;
  do i = 1 to nobs;
    set t1 nobs=nobs;
    put _all_;
    output;
  end;
run;

i=1 nobs=3 x=1 y=2 _ERROR_=0 _N_=1
i=2 nobs=3 x=1 y=2 _ERROR_=0 _N_=1
i=3 nobs=3 x=1 y=2 _ERROR_=0 _N_=1
NOTE: There were 3 observations read from the data set WORK.T1.
NOTE: The data set WORK.T3 has 3 observations and 3 variables.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds

You can put the SET statement inside an explicit DO loop. Now _N_ stays at 1, because all three reads happen within a single iteration of the implicit DATA step loop.

data t3;
  do i = 1 to nobs;
    set t1 nobs=nobs;
    put _all_;
    *output;
  end;
run;

i=1 nobs=3 x=1 y=2 _ERROR_=0 _N_=1
i=2 nobs=3 x=1 y=2 _ERROR_=0 _N_=1
i=3 nobs=3 x=1 y=2 _ERROR_=0 _N_=1
NOTE: There were 3 observations read from the data set WORK.T1.
NOTE: The data set WORK.T3 has 1 observations and 3 variables.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds

Without an OUTPUT statement inside the loop, only the last observation is written. Remember, an implicit OUTPUT sits 'just' above the RUN statement when no other OUTPUT statement is present.
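
The same behavior can be put to use, for example to collapse a data set into one summary row. This is only a sketch: t1 is a hypothetical three-row input matching the logs above, and the names total and total_x are made up for illustration.

```sas
/* Hypothetical 3-row input matching the logs above */
data t1;
    x = 1;
    y = 2;
    output;
    output;
    output;
run;

data total;
    do i = 1 to nobs;
        set t1 nobs=nobs;
        total_x + x;      /* sum statement: total_x is retained and starts at 0 */
    end;
    /* the implicit OUTPUT runs here, once, after the loop has read every row */
    keep total_x;
run;
```

Under these assumptions, total should contain a single observation with total_x = 3. The second iteration of the DATA step stops at SET, before another implicit OUTPUT can run.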