2020-02-01

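Urgently looking for expert help. Below is a data frame with 3 columns (taken from TCGA clinical data); the first row holds the column names. I want to merge these 3 columns into a new 4th column named Outcome, using the following rules:
1. Unknown and Not Available are treated as synonyms and have the lowest rank. If the three columns of a row differ, the 4th column takes a value other than Unknown or Not Available.
2. If Unknown or Not Available appears in the same cell as Complete Remission/Response, Partial Remission/Response, Stable Disease, or Progressive Disease, this is treated as an error in the original record, and the Unknown or Not Available is ignored.
3. If several of Complete Remission/Response, Partial Remission/Response, Stable Disease, and Progressive Disease appear together, whether within a single column (a recording error) or across the 3 columns, keep a single value according to the priority Complete Remission/Response > Partial Remission/Response > Stable Disease > Progressive Disease.
Any pointers would be much appreciated. Thank you!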

follow_ups.follow_up.followup_treatment_success | follow_ups.follow_up.primary_therapy_outcome_success | primary_therapy_outcome_success
Complete Remission/Response | Complete Remission/Response | Complete Remission/Response
Complete Remission/Response;Complete Remission/Response | Complete Remission/Response;Complete Remission/Response | Not Available
Not Available
Complete Remission/Response;Complete Remission/Response | Complete Remission/Response;Complete Remission/Response | Not Applicable
Progressive Disease | Progressive Disease | Not Available
Complete Remission/Response | Complete Remission/Response | Unknown
Unknown | Unknown | Not Available
Complete Remission/Response | Complete Remission/Response | Complete Remission/Response
Unknown
Not Available;Complete Remission/Response | Complete Remission/Response;Complete Remission/Response | Not Available
Stable Disease | Stable Disease | Not Available
Not Available | Progressive Disease | Not Available
Complete Remission/Response;Complete Remission/Response | Complete Remission/Response;Complete Remission/Response | Not Available
Complete Remission/Response | Complete Remission/Response | Complete Remission/Response
Not Applicable | Not Available | Not Available
Not Available | Complete Remission/Response | Not Available
Complete Remission/Response | Not Applicable | Not Available
Not Available | Not Available | Not Available
Complete Remission/Response;Complete Remission/Response;Progressive Disease | Complete Remission/Response;Not Available;Complete Remission/Response | Not Available
Progressive Disease | Complete Remission/Response | Complete Remission/Response
Complete Remission/Response | Complete Remission/Response | Complete Remission/Response
Not Available | Progressive Disease | Not Available
Complete Remission/Response
Complete Remission/Response | Complete Remission/Response | Not Available
Complete Remission/Response;Complete Remission/Response | Complete Remission/Response;Complete Remission/Response | Not Available
Not Available | Not Available | Not Available
Not Available;Unknown | Not Available;Complete Remission/Response | Not Available
Not Available | Not Available | Not Available
Stable Disease | Stable Disease | Stable Disease
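
The three rules in the question can be sketched as follows. This is only an illustration of the logic in Python, not the code from the reply in this thread (which is not preserved here), and the same steps could be written in R. The names PRIORITY, FALLBACK, split_cell and merge_outcome are made up for this sketch. Two assumptions go beyond the question: any value outside the four outcomes (including Not Applicable, which appears in the data but not in the rules) is treated as uninformative, and a row with no informative value gets "Not Available".

```python
# The four outcomes in the priority order stated in rule 3 (highest first).
PRIORITY = [
    "Complete Remission/Response",
    "Partial Remission/Response",
    "Stable Disease",
    "Progressive Disease",
]

# Assumption: value written when a row contains none of the four outcomes.
FALLBACK = "Not Available"


def split_cell(cell):
    """Split one cell on ';' into its individual values (None counts as empty)."""
    return [value.strip() for value in (cell or "").split(";")]


def merge_outcome(row):
    """Merge the cells of one row into a single Outcome value.

    Rules 1 and 2: Unknown / Not Available (and, by assumption, anything
    else outside PRIORITY) are ignored whenever a real outcome is present.
    Rule 3: if several outcomes appear, within a cell or across columns,
    the one that comes first in PRIORITY wins.
    """
    values = {value for cell in row for value in split_cell(cell)}
    for label in PRIORITY:
        if label in values:
            return label
    return FALLBACK


if __name__ == "__main__":
    # Three rows taken from the sample data in the question.
    rows = [
        ["Not Available", "Progressive Disease", "Not Available"],
        ["Not Available;Unknown", "Not Available;Complete Remission/Response", "Not Available"],
        ["Unknown", "Unknown", "Not Available"],
    ]
    for row in rows:
        print(merge_outcome(row))
```

Pooling all values of a row into one set means rule 3 is applied the same way inside a cell and across columns, so no separate per-cell step is needed for these three rules.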

All replies
2020-2-2 01:56:02
My code is a bit clumsy, but it does what you need.

[code block not preserved in the source page]

2020-2-2 11:58:05
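caozhaowen posted on 2020-2-2 01:56
My code is a bit clumsy, but it does what you need.
Thank you for the pointers. After I ran your code, every value in the last column (the Outcome column) was unknown. Could you take another look?
In addition, I think one more rule should be added, taking precedence over the earlier ones: if two of the three columns agree and the third differs, the differing value is assumed to be a recording error, and the value shared by the two agreeing columns is used (unless that value is Unknown or Not Available). For example: Complete Remission/Response, Not Applicable, Not Available gives Complete Remission/Response; Partial Remission/Response, Complete Remission/Response, Partial Remission/Response gives Partial Remission/Response.
I would like you to:
1. Redesign the code to take both the new rule and the earlier rules into account.
2. If all the rules can be satisfied at the same time, please also write code that considers the new rule on its own. (The earlier priority rule is likely to introduce errors.)
3. If the priority rule cannot be satisfied together with the new rule, please write code that considers the new rule on its own.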
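
One way the added "two columns agree" rule could be combined with the earlier priority rule is sketched below in Python. It is a hedged illustration rather than the code discussed in this thread, and the names PRIORITY, reduce_cell and merge_outcome_majority are made up for it. Assumptions not stated in the post: each cell is first collapsed to its highest-priority outcome, values outside the four outcomes (including Not Applicable) count as uninformative, and a row with no informative value gets "Not Available".

```python
from collections import Counter

# The four outcomes in the priority order stated in the original question.
PRIORITY = [
    "Complete Remission/Response",
    "Partial Remission/Response",
    "Stable Disease",
    "Progressive Disease",
]


def reduce_cell(cell):
    """Collapse one cell to its highest-priority outcome, or None if it has none."""
    values = {value.strip() for value in (cell or "").split(";")}
    for label in PRIORITY:
        if label in values:
            return label
    return None


def merge_outcome_majority(row, fallback="Not Available"):
    """Apply the majority rule first, then fall back to the priority rule."""
    per_column = [reduce_cell(cell) for cell in row]
    informative = [value for value in per_column if value is not None]
    if not informative:
        return fallback  # assumption: no usable value anywhere in the row
    label, count = Counter(informative).most_common(1)[0]
    if count >= 2:
        return label  # at least two columns agree on a real outcome
    # No agreement between columns: keep the highest-priority outcome.
    return min(informative, key=PRIORITY.index)


if __name__ == "__main__":
    # The two examples given in the post above.
    print(merge_outcome_majority(
        ["Complete Remission/Response", "Not Applicable", "Not Available"]))
    print(merge_outcome_majority(
        ["Partial Remission/Response", "Complete Remission/Response",
         "Partial Remission/Response"]))
```

With this ordering, a row such as Progressive Disease, Progressive Disease, Complete Remission/Response yields Progressive Disease, because agreement between two columns outranks the priority order. That is the behaviour the post asks for, and it differs from what the priority rule alone would return.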

2020-2-2 12:12:52
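EveIOU posted on 2020-2-2 11:58
Thank you for the pointers. After I ran your code, every value in the last column (the Outcome column) was unknown. Could you take another look?
In addition, I think ...
That shouldn't happen. Read the code carefully until you understand it, then run it again.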

QQ图片20200202121034.png

2020-2-2 12:14:37
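EveIOU posted on 2020-2-2 11:58
Thank you for the pointers. After I ran your code, every value in the last column (the Outcome column) was unknown. Could you take another look?
In addition, I think ...
To add more rules, just add code following the same pattern.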

2020-2-2 12:15:27
See the screenshot for the result after running the code.
Attachments
QQ图片20200202121034.png

Original image size: 91.23 KB
