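Here is a (simplified) set of data I have on hand.
Column 3 is the feature type, columns 4 and 5 are the start and end coordinates, and column 9 is the feature ID; the other columns can be ignored.
1. First, I want to keep only the rows of Table 1 whose third column is A, dropping all the A1, A2 and A3 rows, so that it looks like Table 2.
2. Next, taking EF as the starting point, I want to extract the IDs of the features within 200 upstream and 200 downstream of it, based on the coordinates in columns 4 and 5. For example, EF spans 301 to 400, so I want the IDs of AB and CD above it (100-300) and of GH and IJ below it (400-600).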
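3. Going a step further, starting from CD and IJ at the same time, I want to extract the IDs of AB and EF (on either side of CD) and of GH and KL (on either side of IJ).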
How should I write this in R? Thanks, everyone.
| chr1 | HAVANA | A | 100 | 200 | . | + | . | ID = AB |
| chr1 | HAVANA | A1 | 100 | 120 | . | + | . | ID = AB |
| chr1 | HAVANA | A2 | 121 | 156 | . | + | . | ID = AB |
| chr1 | HAVANA | A3 | 157 | 200 | . | + | . | ID = AB |
| chr1 | HAVANA | A | 201 | 300 | . | + | . | ID = CD |
| chr1 | HAVANA | A1 | 201 | 267 | . | + | . | ID = CD |
| chr1 | HAVANA | A2 | 268 | 289 | . | + | . | ID = CD |
| chr1 | HAVANA | A3 | 289 | 300 | . | + | . | ID = CD |
| chr1 | HAVANA | A | 301 | 400 | . | - | . | ID = EF |
| chr1 | HAVANA | A1 | 301 | 345 | . | - | . | ID = EF |
| chr1 | HAVANA | A2 | 346 | 378 | . | - | . | ID = EF |
| chr1 | HAVANA | A3 | 378 | 400 | . | - | . | ID = EF |
| chr1 | HAVANA | A | 401 | 500 | . | - | . | ID = GH |
| chr1 | HAVANA | A1 | 401 | 434 | . | - | . | ID = GH |
| chr1 | HAVANA | A2 | 434 | 477 | . | - | . | ID = GH |
| chr1 | HAVANA | A3 | 477 | 500 | . | - | . | ID = GH |
| chr1 | HAVANA | A | 501 | 600 | . | - | . | ID = IJ |
| chr1 | HAVANA | A1 | 501 | 524 | . | - | . | ID = IJ |
| chr1 | HAVANA | A2 | 524 | 568 | . | - | . | ID = IJ |
| chr1 | HAVANA | A3 | 568 | 600 | . | - | . | ID = IJ |
| chr1 | HAVANA | A | 601 | 700 | . | - | . | ID = KL |
| chr1 | HAVANA | A1 | 601 | 647 | . | - | . | ID = KL |
| chr1 | HAVANA | A2 | 647 | 700 | . | - | . | ID = KL |
Table 1
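Step 1 (keeping only the rows whose third column is exactly A) could be sketched in base R roughly as follows. This is only a sketch under assumptions: the column names (`seqid`, `source`, `type`, and so on) are made up for illustration because the table has no header, and only the AB and CD rows are typed inline. For a real file one would presumably load it with `read.table()` or `read.delim()` first.

```r
# The AB and CD rows of Table 1, typed inline so the example is self-contained.
# Column names are assumptions; the original table has no header.
tab1 <- data.frame(
  seqid  = "chr1",
  source = "HAVANA",
  type   = c("A", "A1", "A2", "A3", "A", "A1", "A2", "A3"),
  start  = c(100, 100, 121, 157, 201, 201, 268, 289),
  end    = c(200, 120, 156, 200, 300, 267, 289, 300),
  score  = ".",
  strand = "+",
  phase  = ".",
  attr   = rep(c("ID = AB", "ID = CD"), each = 4),
  stringsAsFactors = FALSE
)

# Keep only rows whose type is exactly "A"; the A1/A2/A3 rows are dropped.
tab2 <- tab1[tab1$type == "A", ]
rownames(tab2) <- NULL

# Pull the bare ID out of the ninth column ("ID = AB" becomes "AB").
tab2$id <- sub("^ID *= *", "", tab2$attr)

print(tab2)
```

The comparison `type == "A"` is an exact match, so `A1`, `A2` and `A3` are excluded without any pattern matching.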
| chr1 | HAVANA | A | 100 | 200 | . | + | . | ID = AB |
| chr1 | HAVANA | A | 201 | 300 | . | + | . | ID = CD |
| chr1 | HAVANA | A | 301 | 400 | . | - | . | ID = EF |
| chr1 | HAVANA | A | 401 | 500 | . | - | . | ID = GH |
| chr1 | HAVANA | A | 501 | 600 | . | - | . | ID = IJ |
| chr1 | HAVANA | A | 601 | 700 | . | - | . | ID = KL |
Table 2
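Steps 2 and 3 look like an interval-overlap query on Table 2, and could be sketched in base R as below. The assumptions are mine, not the poster's: `neighbor_ids` is a hypothetical helper, "within the flank" is taken to mean that a feature overlaps the window from `start - flank` to `end + flank`, and only the ID, start and end columns are typed inline. Under that definition a flank of 200 reproduces the EF example, while the CD and IJ example from step 3 corresponds to a flank of 100.

```r
# ID, start and end of the Table 2 rows (IDs already stripped of "ID = ").
tab2 <- data.frame(
  id    = c("AB", "CD", "EF", "GH", "IJ", "KL"),
  start = c(100, 201, 301, 401, 501, 601),
  end   = c(200, 300, 400, 500, 600, 700),
  stringsAsFactors = FALSE
)

# Hypothetical helper: IDs of features overlapping the window
# [start - flank, end + flank] around `query`, excluding `query` itself.
neighbor_ids <- function(tab, query, flank = 200) {
  q <- tab[tab$id == query, ]
  stopifnot(nrow(q) == 1)          # the query ID must be unique
  lo <- q$start - flank
  hi <- q$end + flank
  hit <- tab$start <= hi & tab$end >= lo & tab$id != query
  tab$id[hit]
}

# Step 2: features within 200 of EF.
neighbor_ids(tab2, "EF", flank = 200)
# "AB" "CD" "GH" "IJ"

# Step 3: several starting points at once, returned as a named list.
queries <- c("CD", "IJ")
res <- lapply(setNames(queries, queries),
              function(q) neighbor_ids(tab2, q, flank = 100))
res$CD   # "AB" "EF"
res$IJ   # "GH" "KL"
```

The sample data sits on a single chromosome, so the sketch ignores column 1; with several chromosomes the comparison would also need to match on that column. For genome-scale files, the Bioconductor package GenomicRanges provides overlap queries such as `findOverlaps()`, which may be a better fit than this hand-rolled filter.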