This concerns statements such as the following:
data three;
  set one two;
  by var;
run;
The SAS help states:
The values of the variables in the program data vector are set to missing each time SAS starts to read a new
data set and when the BY group changes. (SAS Language Reference 9.2, p. 362)
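However, in my tests I found that, apart from the initial setting of the PDV to missing, the values were not set to missing either when SAS started to read another data set or when the BY group changed.
Here is my test:
Log:
148  data three;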
149  put "before set:" _all_;
150    set one two;
151    by x;
152  put "after set:" _all_;
153  run;
before set:x=. FIRST.x=1 LAST.x=1 _ERROR_=0 _N_=1
after set:x=1 FIRST.x=1 LAST.x=0 _ERROR_=0 _N_=1
before set:x=1 FIRST.x=1 LAST.x=0 _ERROR_=0 _N_=2
after set:x=1 FIRST.x=0 LAST.x=1 _ERROR_=0 _N_=2
before set:x=1 FIRST.x=0 LAST.x=1 _ERROR_=0 _N_=3
after set:x=2 FIRST.x=1 LAST.x=0 _ERROR_=0 _N_=3
before set:x=2 FIRST.x=1 LAST.x=0 _ERROR_=0 _N_=4
after set:x=2 FIRST.x=0 LAST.x=1 _ERROR_=0 _N_=4
before set:x=2 FIRST.x=0 LAST.x=1 _ERROR_=0 _N_=5
after set:x=3 FIRST.x=1 LAST.x=1 _ERROR_=0 _N_=5
before set:x=3 FIRST.x=1 LAST.x=1 _ERROR_=0 _N_=6
after set:x=4 FIRST.x=1 LAST.x=1 _ERROR_=0 _N_=6
before set:x=4 FIRST.x=1 LAST.x=1 _ERROR_=0 _N_=7
NOTE: There were 3 observations read from the data set WORK.ONE.
NOTE: There were 3 observations read from the data set WORK.TWO.
NOTE: The data set WORK.THREE has 6 observations and 1 variables.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds
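The parts marked in red are all places where the BY group changes and SAS has to switch to the other data set to read an observation. Yet the values in the PDV are retained; they are not set to missing.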
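The post does not show the contents of ONE and TWO. The sketch below is one way to reproduce the test, assuming ONE holds x = 1, 2, 3 and TWO holds x = 1, 2, 4. These values are only inferred from the log (three observations each, interleaved order 1, 1, 2, 2, 3, 4) and may differ from the original data. The second part is a hypothetical variant (the names one_a, two_b, three_ab and the variables a and b are not from the post) that gives each data set a variable of its own, which may make any reset easier to observe than the BY variable alone.

```sas
/* Assumed input data: values inferred from the log, not given in the post */
data one;
  input x;
  datalines;
1
2
3
;
run;

data two;
  input x;
  datalines;
1
2
4
;
run;

/* The same test as in the log above */
data three;
  put "before set:" _all_;
  set one two;
  by x;
  put "after set:" _all_;
run;

/* Hypothetical variant: each input carries one variable the other lacks */
data one_a;
  set one;
  a = "one";
run;

data two_b;
  set two;
  b = "two";
run;

data three_ab;
  put "before set:" _all_;
  set one_a two_b;
  by x;
  put "after set:" _all_;
run;
```

Comparing the "after set:" lines for a and b in the variant log shows what each variable holds right after an observation is read from the other data set.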
Similarly, for MERGE with a BY statement:
The SAS help states:
When SAS has read all observations in a
BY group from all data sets, it sets all variables in the program data vector
(except those created by SAS) to missing (SAS Language Reference 9.2, p. 373)
Log:
198  data c;
199  put "before set:" _all_;
200    merge a b;
201    by x;
202  put "after set:" _all_;
203  run;
before set:x=. y=  z=  FIRST.x=1 LAST.x=1 _ERROR_=0 _N_=1
after set:x=1 y=a1 z=b1 FIRST.x=1 LAST.x=1 _ERROR_=0 _N_=1
before set:x=1 y=a1 z=b1 FIRST.x=1 LAST.x=1 _ERROR_=0 _N_=2
after set:x=2 y=a2 z=b2 FIRST.x=1 LAST.x=1 _ERROR_=0 _N_=2
before set:x=2 y=a2 z=b2 FIRST.x=1 LAST.x=1 _ERROR_=0 _N_=3
after set:x=3 y=a3 z=b3 FIRST.x=1 LAST.x=1 _ERROR_=0 _N_=3
before set:x=3 y=a3 z=b3 FIRST.x=1 LAST.x=1 _ERROR_=0 _N_=4
NOTE: There were 3 observations read from the data set WORK.A.
NOTE: There were 3 observations read from the data set WORK.B.
NOTE: The data set WORK.C has 3 observations and 3 variables.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds
As above, the parts marked in red are also places where the BY group changes, and the values are still retained rather than set to missing.
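For anyone who wants to rerun the MERGE test, here is a minimal sketch of the input data. The values are taken from the log (x = 1, 2, 3 with y = a1..a3 and z = b1..b3). That y comes from A and z from B is an assumption based on the value prefixes; the post does not show the two data sets.

```sas
/* Assumed layout: A contributes y, B contributes z (not shown in the post) */
data a;
  input x y $;
  datalines;
1 a1
2 a2
3 a3
;
run;

data b;
  input x z $;
  datalines;
1 b1
2 b2
3 b3
;
run;

/* The same test as in the log above */
data c;
  put "before set:" _all_;
  merge a b;
  by x;
  put "after set:" _all_;
run;
```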