2007-03-19
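A beginner's question: in the MODIFY statement, what does the index named in the KEY= option mean? And what is the purpose of an index on a data set in general?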

All replies
2007-3-20 09:03:00

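A data set index exists to make lookups faster. It works like the pinyin or stroke-count table in a dictionary, which lets you jump straight to a character instead of paging through the whole book.

An index does not help in every situation, though. If you need to read the data set sequentially from start to finish, there is no point in building one. Indexes are usually created on the variables that appear in WHERE conditions. Once such an index exists, queries and updates that select observations by those variables can run considerably faster.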
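As a rough illustration of the point above, here is a minimal sketch of creating an index on an existing data set and then filtering on the indexed variable. The data set `sales` and its variables `region` and `amount` are made up for this example and do not come from the thread. On a table this small SAS may well decide that a sequential read is cheaper, so treat it as a syntax sketch rather than a performance demo.

```sas
/* Hypothetical data set WORK.SALES, used only for illustration */
data sales;
   input region $ amount @@;
   datalines;
east 10 west 20 east 30 north 40 west 50
;

/* Create a simple index on REGION for the existing data set */
proc datasets library=work nolist;
   modify sales;
   index create region;
quit;

/* MSGLEVEL=I asks SAS to write a log message when it uses an index */
options msglevel=i;

/* A WHERE condition on the indexed variable lets SAS consider the index */
proc print data=sales;
   where region = 'east';
run;
```

The `index=(locate)` data set option used later in this thread does the same job at creation time instead of afterwards.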

2007-3-20 11:36:00

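Many thanks to the poster above. Below is a MODIFY statement that uses an index. I don't really understand how it runs or what result it produces. Any pointers would be appreciated.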

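/* Master data set, with a simple (non-unique) index on LOCATE */
data master (index=(locate));
   input locate $ code @@;
   cards;
a 200 a 201 b 100 a 202 a 203 b 101 c 600 d 700 d 701
;

/* Transaction data set: new CODE values keyed by LOCATE */
data keyvals;
   input locate $ newcode @@;
   cards;
b 1 a 2 a 3 a 4 b 11 a 12 c 6 d 16 d 7
;

/* For each transaction, look up a MASTER observation through the
   LOCATE index and overwrite its CODE in place */
data master;
   set keyvals;
   modify master key=locate;
   code=newcode;
run;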

2007-3-20 16:38:00

The SAS help describes using an index with the MODIFY statement as follows:

If there are duplicate values of the indexed variable in the master data set, only the first occurrence is retrieved, modified, or replaced. Use a DO LOOP to execute a SET statement with the KEY= option multiple times to update all duplicates with the transaction value.
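The loop idea in the paragraph above might look roughly like the sketch below. It uses MODIFY with KEY= inside a DO UNTIL loop, a commonly seen variant of the pattern, and not necessarily the exact form the help text has in mind. It assumes the `master` data set from the earlier post (indexed on `locate`) already exists. The transaction data set `onekey`, with a single row per key value, is hypothetical and chosen to keep the case simple.

```sas
/* Hypothetical transaction data set: one row per key value */
data onekey;
   input locate $ newcode;
   datalines;
a 999
;

data master;
   set onekey;
   /* Repeat the keyed lookup until the index has no more matches */
   do until (_iorc_ = %sysrc(_dsenom));
      modify master key=locate;
      if _iorc_ = %sysrc(_sok) then do;
         code = newcode;   /* update this matching observation */
         replace;          /* write it back to MASTER in place */
      end;
      else if _iorc_ = %sysrc(_dsenom) then
         _error_ = 0;      /* running out of matches is expected here */
      else do;
         /* Any other return code: report it and stop instead of looping */
         put 'ERROR: unexpected return code ' _iorc_=;
         stop;
      end;
   end;
run;
```

Without the UNIQUE option, each repeated lookup for the same key value continues from the current position in the index, which is what lets the loop walk through all the duplicates. Resetting `_error_` keeps the expected "no more matches" condition from being reported as an error in the log.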

If there are duplicate, nonconsecutive values in the like-named variable in the data source, MODIFY applies each transaction cumulatively to the first observation in the master data set whose index value matches the values from the data source. Therefore, only the value in the last duplicate transaction is the result in the master observation unless you write an accumulation statement to accumulate each duplicate transaction value in the master observation.

If there are duplicate, consecutive values in the variable in the data source, the values from the first observation in the data source are applied to the master data set, but the DATA step terminates with an error when it tries to locate an observation in the master data set for the second duplicate from the data source. To avoid this error, use the UNIQUE option in the MODIFY statement. The UNIQUE option causes SAS to return to the top of the master data set before retrieving a match for the index value. You must write an accumulation statement to accumulate the values from all the duplicates. If you do not, only the last one applied is the result in the master observation.
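A minimal sketch of the UNIQUE option combined with an accumulation statement, applied to the `master` and `keyvals` data sets from the earlier post. Adding `newcode` to `code` is only an assumption about what "accumulate" should mean here, since the original program assigns `code=newcode` instead.

```sas
data master;
   set keyvals;
   /* UNIQUE restarts every lookup at the top of the index, so each
      transaction reaches the first MASTER observation for its key */
   modify master key=locate / unique;
   if _iorc_ = %sysrc(_sok) then do;
      code = code + newcode;   /* accumulation statement (assumed to be a sum) */
      replace;
   end;
   else if _iorc_ = %sysrc(_dsenom) then do;
      _error_ = 0;             /* no match: clear the error flag */
      put 'No master observation found for ' locate=;
   end;
run;
```

With this form, only the first observation for each `locate` value in `master` is ever changed. The remaining duplicates keep their original `code`.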

If there are duplicate index values in both data sets, you can use SQL to apply the duplicates in the transaction data set to the duplicates in the master data set in a one-to-one correspondence.
