Asking the experts: how do I extract data?

joskow

2009-12-17

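I have two datasets, a and b.
a contains two variables, v1 and v2; b is in effect an 800*800 symmetric matrix.
I want to create a variable v3 in a whose value is the number in row v1, column v2 of b.
How can I do this in SAS?
Many thanks.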

All replies

bobguy

2009-12-17 12:05:39

In reply to joskow's post of 2009-12-17 10:11:

Dataset a stores the index information, and dataset b holds the actual data.

You want to pull values out of b according to the indices in a. That b is symmetric is not relevant here, I suppose.

Here is a solution.

1) Load the data into a temporary array (matrix).
2) Grab the data according to v1 and v2.

Here is a sample program. I define b as a 10*10 matrix, so the index range will be [1,10]. mydata is your v3.

Hope this helps.

data a;

do i=1 to 5;
v1=ceil(10*ranuni(101));
v2=ceil(10*ranuni(101));
output;
end;
drop i;
run;

data b;
array x(10);
do k=1 to 10;
do i=1 to 10;
   x(i)=ceil(100*ranuni(101));
end;
output;
end;
drop i k;
run;

title 'a=index of Row and Column of b';
proc print data=a; run;
title 'b is a 10*10 square matrix';
proc print data=b; run;

data need;
array datmt (10,10) _temporary_;
do until (end);
   set b end=end;
      array x(10);
      n+1;
      do i = 1 to dim(x);
         datmt (n,i)=x(i);
      end;
end;
do until (end2);
   set a end=end2;
      mydata=datmt (v1,v2);
      output;
end;
keep v1 v2 mydata;
run;

proc print; run;
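
A minimal sketch of how the same program might scale to the 800*800 case from the question. It assumes b has exactly 800 rows and that its columns are named x1-x800; those column names, the dataset names a800, b800 and need800, and the test values are illustrative assumptions, not something stated in the question. The test matrix is deliberately not symmetric, so a row/column mix-up would be visible.

```sas
/* Hypothetical test matrix: the value in row k, column i is 1000*k + i. */
data b800;
   array x(800);
   do k = 1 to 800;
      do i = 1 to 800;
         x(i) = 1000*k + i;
      end;
      output;
   end;
   drop i k;
run;

/* Hypothetical index pairs (row, column). */
data a800;
   input v1 v2;
   datalines;
1 1
3 800
800 2
;
run;

data need800;
   array datmt (800,800) _temporary_;
   /* Pass 1: copy every row of b800 into the temporary 2-D array. */
   do until (end);
      set b800 end=end;
      array x(800);
      n+1;
      do i = 1 to dim(x);
         datmt(n,i) = x(i);
      end;
   end;
   /* Pass 2: read the index pairs and look up row v1, column v2. */
   do until (end2);
      set a800 end=end2;
      v3 = datmt(v1,v2);
      output;
   end;
   keep v1 v2 v3;
run;

proc print data=need800; run;
```

With this test data the three rows of need800 should show v3 = 1001, 3800 and 800002. The temporary array holds 640,000 numeric values (about 5 MB), so memory should not be an issue at this size.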

joskow

2009-12-17 15:09:38

2# bobguy

Thank you so much. I went through the manual and the help files. Is my understanding of the logic right?

data need;
array datmt (10,10) _temporary_;
do until (end);
   set b end=end;
      array x(10);
      n+1;
      do i = 1 to dim(x);
         datmt (n,i)=x(i);
      end;
end;   /* This block converts dataset b into a two-dimensional array, going through a one-dimensional array as an intermediate step */
do until (end2);
   set a end=end2;
      mydata=datmt (v1,v2);
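      output;   /* Assign the new variable from the two-dimensional array. What I don't understand is why I can't just write mydata=datmt(v1,v2) directly, and have to use a loop statement instead? */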
end;
keep v1 v2 mydata;
run;

I have no programming background and have only just started learning SAS programming. Any guidance would be much appreciated.

lwien007

2009-12-17 18:26:52

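/* a: every (v1, v2) pair from 1 to 10 */
data a;
  do v1=1 to 10;
    do v2=1 to 10;
      output;
    end;
  end;
run;
/* b: 10 rows of x1-x10 filled with rounded random normal values */
data b(drop=i j);
  array t x1-x10;
  do i=1 to 10;
    do j=1 to 10;
      t[j]=round(rannor(12345)*2.5+10,1);
    end;
    output;
  end;
run;
/* For each row of a, read row v1 of b directly with POINT= and take column v2 */
data a(keep=v1 v2 v3);
  set a;
  temp=v1;
  array tmp x1-x10;
  set b point=temp;
  v3=tmp[v2];
run;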
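
A hedged variant of the POINT= approach above, in case the indices might fall outside the matrix. It writes to a new dataset instead of overwriting the index dataset, and checks v1 against the row count from NOBS= before reading. The dataset names mat3, idx3 and idx3_v3, the 3*3 size, and the test values are assumptions made for illustration; v1 and v2 are assumed to be integers.

```sas
/* Hypothetical 3*3 test matrix: value = 10*row + column. */
data mat3(drop=i j);
   array t x1-x3;
   do i = 1 to 3;
      do j = 1 to 3;
         t[j] = 10*i + j;
      end;
      output;
   end;
run;

/* Index pairs; the last one points outside the matrix on purpose. */
data idx3;
   input v1 v2;
   datalines;
2 3
3 1
5 2
;
run;

data idx3_v3(keep=v1 v2 v3);
   set idx3;
   array tmp x1-x3;
   /* nrows is set from NOBS= at compile time, so it is usable before the SET runs. */
   if 1 <= v1 <= nrows and 1 <= v2 <= dim(tmp) then do;
      rownum = v1;                       /* POINT= takes a variable name */
      set mat3 point=rownum nobs=nrows;  /* direct read of row v1 */
      v3 = tmp[v2];
   end;
   else v3 = .;                          /* out-of-range index: v3 stays missing */
run;

proc print data=idx3_v3; run;
```

With this test data v3 should be 23, 31 and missing. The step ends normally because the first SET statement reaches the end of idx3; POINT= on its own never signals end of file.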

bobguy

2009-12-17 22:15:14

In reply to joskow's post of 2009-12-17 15:09:

1) "Is my understanding of the logic right?"
You should be able to design test cases to check whether it is right or wrong. It should not be a problem.

2) /* This block converts dataset b into a two-dimensional array, going through a one-dimensional array as an intermediate step */
Think of a SAS table as two-dimensional. The first SET loop just dumps the table into a two-dimensional temporary array so that it can be referenced at any time in the program.

The following programs are equivalent
*1);
data b2;
set b;
run;
*2);
data b3;
  do until(end);
set b end=end;
output;
  end;
run;

Type 1) is what you usually see in SAS books. SAS actually does a lot of things behind a SET statement: for example, it runs an internal loop over the DATA step and puts in an OUTPUT statement implicitly. The most important point is that SAS accesses its data set sequentially (unless the POINT= option is used in SET).

Type 2) just writes this out more explicitly. There are certain advantages. In this example, loading data into a temporary array for look-up is much simpler than the conventional ways.

I see quite a few veteran programmers who like that style because it is closer to COBOL. And some people believe it is faster or more efficient than type 1).

3) /* Assign the new variable from the two-dimensional array. What I don't understand is why I can't just write mydata=datmt(v1,v2) directly, and have to use a loop statement instead? */

I think the answer is in 2), if you understand type 2).

Beyond the SAS manuals, Rick Aster wrote a Base SAS book, "Professional SAS Programming Secrets", many years ago, based on SAS 6.6. This is the only book I would recommend. Unfortunately it is quite old. He may update it ...

enjoy!

HTH.

jingju11

2009-12-24 05:40:56

5# bobguy

That is a model program. So admirable!

To Joskow:

3) /* Assign the new variable from the two-dimensional array. What I don't understand is why I can't just write mydata=datmt(v1,v2) directly, and have to use a loop statement instead? */

I have a few more words on this question.
The loop you mention, with its explicit OUTPUT statement, forces the step to keep reading data set A and writing out each record with the variable 'mydata' until A ends. Without the explicit loop, after the first record of A is read, control passes back to the top of the DATA step, where the SET statement tries to read data set B again. B is already at its end, so SAS terminates the step and only one record is created, as you may have seen.

Only my personal understanding.
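
To see this behavior directly, here is a small sketch. The dataset names mat3, idx3, need_one and need_all and the 3*3 test values are assumptions made for illustration. The first step drops the second loop and should stop after one record; the second shows one common alternative, loading the matrix only when _n_ = 1 and letting the implicit DATA step loop read the index dataset.

```sas
/* Hypothetical 3*3 test matrix: value = 10*row + column. */
data mat3(drop=i j);
   array t x1-x3;
   do i = 1 to 3;
      do j = 1 to 3;
         t[j] = 10*i + j;
      end;
      output;
   end;
run;

/* Hypothetical index pairs. */
data idx3;
   input v1 v2;
   datalines;
2 3
3 1
1 2
;
run;

/* No second loop: on the second iteration SET mat3 reads past the end
   of mat3, so the step stops after writing a single record. */
data need_one;
   array datmt (3,3) _temporary_;
   do until (end);
      set mat3 end=end;
      array x(3);
      n+1;
      do i = 1 to dim(x);
         datmt(n,i) = x(i);
      end;
   end;
   set idx3;
   mydata = datmt(v1,v2);
   keep v1 v2 mydata;
run;

/* Load the matrix on the first iteration only; the implicit loop then
   reads idx3 one record per iteration until idx3 ends. */
data need_all;
   array datmt (3,3) _temporary_;
   if _n_ = 1 then do until (end);
      set mat3 end=end;
      array x(3);
      n+1;
      do i = 1 to dim(x);
         datmt(n,i) = x(i);
      end;
   end;
   set idx3;
   mydata = datmt(v1,v2);
   keep v1 v2 mydata;
run;
```

The log should report 1 observation for need_one and 3 for need_all. Elements of a _TEMPORARY_ array keep their values across iterations, which is why the second version can still look up values after the first iteration.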
