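pobel posted on 2010-10-22 14:46
This DATA step does not end after the RUN statement;
it ends when the SET statement tries to read from the data set and finds no more data to read.
"This DATA step does not end after the RUN statement": I don't like this statement. It is actually wrong.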
Data a b c;
...;
run;
This is a clear, well-defined data step block. The RUN; statement is one of the most important statements in SAS, even though the examples SAS itself provides unfortunately often omit it.
When the SAS supervisor sees
data a b c;
run;
it starts to compile the data step. It may issue a compilation error and stop.
Unlike other languages, a regular SAS data step has many implicit statements. For example, you need not write the OUTPUT statement yourself: an implicit one sits 'just' above the RUN statement. SAS also has an internal loop that goes through all rows/observations.
"It ends when the SET statement tries to read from the data set and finds no more data to read." True.
data t2;
  set t1;
  put _all_;
run;
x=1 y=2 _ERROR_=0 _N_=1
x=1 y=2 _ERROR_=0 _N_=2
x=1 y=2 _ERROR_=0 _N_=3
NOTE: There were 3 observations read from the data set WORK.T1.
NOTE: The data set WORK.T2 has 3 observations and 2 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
SAS read all three observations from data set t1 and output them to data set t2. The _N_ variable counts the iterations of the internal loop.
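To make the implicit OUTPUT visible, here is a minimal sketch. The step that builds t1 is an assumption (the post does not show how t1 was created), and the data set names implicit_out and explicit_out are made up for illustration. Both steps should write the same three observations.

```sas
/* Assumed test data: three rows with x=1 and y=2, as in the log above. */
data t1;
  x = 1;
  y = 2;
  output; output; output;
run;

/* Relies on the implicit OUTPUT at the bottom of the step. */
data implicit_out;
  set t1;
run;

/* The same step with the OUTPUT written out explicitly. */
data explicit_out;
  set t1;
  output;
run;

/* Compare the two results; no differences are expected. */
proc compare base=implicit_out compare=explicit_out;
run;
```

Once an explicit OUTPUT appears anywhere in the step, the implicit one is no longer added, so the second step does not write each row twice.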
data t3;
  do i = 1 to nobs;
    set t1 nobs=nobs;
    put _all_;
    output;
  end;
run;
i=1 nobs=3 x=1 y=2 _ERROR_=0 _N_=1
i=2 nobs=3 x=1 y=2 _ERROR_=0 _N_=1
i=3 nobs=3 x=1 y=2 _ERROR_=0 _N_=1
NOTE: There were 3 observations read from the data set WORK.T1.
NOTE: The data set WORK.T3 has 3 observations and 3 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
You can put the SET statement within an explicit DO loop. Now _N_ is 1 on every line, because all three observations are read during the first pass of the internal loop.
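A small sketch of where such a step actually stops, assuming a t1 with three rows like the one in the logs (the step that builds t1 here is my assumption, as are the message texts). The PUT statements mark the top and bottom of each pass of the internal loop.

```sas
/* Assumed test data: three rows with x=1 and y=2. */
data t1;
  x = 1;
  y = 2;
  output; output; output;
run;

data _null_;
  /* Written at the start of every pass of the internal loop. */
  put 'Top of internal loop: ' _n_=;
  do i = 1 to nobs;
    /* NOBS= is filled in at compile time, so nobs is known here. */
    set t1 nobs=nobs;
  end;
  /* Reached only if SET did not run out of data during this pass. */
  put 'Bottom of internal loop: ' _n_=;
run;
```

The log should show the "Top" message for _N_=1 and _N_=2 but the "Bottom" message only for _N_=1: on the second pass the SET statement tries to read a fourth observation, finds none, and the step ends right there.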
data t3;
  do i = 1 to nobs;
    set t1 nobs=nobs;
    put _all_;
    *output;
  end;
run;
i=1 nobs=3 x=1 y=2 _ERROR_=0 _N_=1
i=2 nobs=3 x=1 y=2 _ERROR_=0 _N_=1
i=3 nobs=3 x=1 y=2 _ERROR_=0 _N_=1
NOTE: There were 3 observations read from the data set WORK.T1.
NOTE: The data set WORK.T3 has 1 observations and 3 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
Without an OUTPUT statement, only the last one is output. Remember, the OUTPUT statement is 'just' above the RUN if no other OUTPUT statement is defined.
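Because every row of t1 is identical, the log cannot show which row survives. A sketch with distinct values makes it visible; the data sets demo and last_only are hypothetical names used only for this illustration.

```sas
/* Hypothetical data: x takes the values 1, 2, 3. */
data demo;
  do x = 1 to 3;
    output;
  end;
run;

data last_only;
  do i = 1 to nobs;
    set demo nobs=nobs;
  end;
  /* No explicit OUTPUT: the implicit one runs once, here, */
  /* after the loop has already read all three rows.       */
run;

proc print data=last_only;
run;
```

The listing should contain a single row with x=3. Note that i is 4 in that row rather than 3, since the DO loop increments i once more before it stops.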