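The original data has one variable recorded across multiple years (year) and multiple age groups (age-group). The variable contains many NA values, like this: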
| surveyperiod_year | age_group | variable |
|---|---|---|
| 2008 | 2 | 10 |
| 2008 | 2 | 7 |
| 2008 | 2 | NA |
| 2008 | 1 | 8 |
| 2009 | 2 | 5 |
| 2009 | 1 | 8 |
| 2009 | 3 | NA |
| 2009 | 4 | 7 |
| 2010 | 1 | 8 |
| 2010 | 1 | 8 |
| 2010 | 2 | 8 |
| 2010 | 4 | NA |
| 2010 | 3 | 10 |
| 2011 | 3 | NA |
| 2011 | 1 | 10 |
| 2011 | 3 | 7 |
| 2012 | 4 | 5 |
| 2012 | 4 | NA |
| 2012 | 3 | NA |
| 2012 | 2 | 8 |
I have already computed the mean of the non-NA values of this variable for each year and age-group, which gives a dataframe like this:
| surveyperiod_year | age_group | mean_value |
|---|---|---|
| 2008 | 1 | 7.982143 |
| 2008 | 2 | 7.896907 |
| 2008 | 3 | 7.917293 |
| 2008 | 4 | 8.096491 |
| 2008 | 5 | 7.82906 |
| 2009 | 1 | 7.850242 |
| 2009 | 2 | 8.021739 |
| 2009 | 3 | 7.99537 |
| 2009 | 4 | 8.066372 |
| 2009 | 5 | 7.99569 |
| 2010 | 1 | 7.988827 |
| 2010 | 2 | 7.873016 |
| 2010 | 3 | 8.199029 |
| 2010 | 4 | 8 |
| 2010 | 5 | 7.961686 |
| 2011 | 1 | 7.565934 |
| 2011 | 2 | 8.045455 |
| 2011 | 3 | 7.855346 |
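Now I want to use the mean_value of each year and age-group to fill the NA values of the variable in the matching year and age-group. For example, to fill the NAs in variable where year = 2008 and age-group = 1, I wrote this code: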
data$variable[data$surveyperiod_year == 2008 & data$age_group == 1 & is.na(data$variable) == T] <- data_mean$variable_mean[data_mean$surveyperiod_year == 2008 & data_mean$age_group == 1]
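But the result is wrong: instead of filling only the matching NAs in variable, R assigned values to all of them, and the values were matched by row position rather than by the same year and age-group.
Could anyone advise what I should do? Thanks!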
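One possible approach, as a minimal sketch rather than a definitive answer: build a lookup key from year and age-group in both data frames, use `match()` to find each row's entry in the mean table, and overwrite only the NA rows. The toy data below copies a few rows from the tables above. The column name `mean_value` is taken from the mean table as shown (the code in the question refers to `variable_mean`), and the helper names `key_data`, `key_mean`, `idx`, and `na_rows` are made up for illustration.

```r
# Toy data: a few rows copied from the tables in the question
data <- data.frame(
  surveyperiod_year = c(2008, 2008, 2008, 2008, 2009, 2009, 2009),
  age_group         = c(2, 2, 2, 1, 2, 1, 3),
  variable          = c(10, 7, NA, 8, 5, 8, NA)
)
data_mean <- data.frame(
  surveyperiod_year = c(2008, 2008, 2009, 2009, 2009),
  age_group         = c(1, 2, 1, 2, 3),
  mean_value        = c(7.982143, 7.896907, 7.850242, 8.021739, 7.99537)
)

# One key per (year, age-group) pair, built the same way in both data frames
key_data <- paste(data$surveyperiod_year, data$age_group)
key_mean <- paste(data_mean$surveyperiod_year, data_mean$age_group)

# For every row of data, the position of its group in data_mean
# (NA if that group is missing from data_mean)
idx <- match(key_data, key_mean)

# Overwrite only the rows where variable is NA; other rows stay untouched
na_rows <- is.na(data$variable)
data$variable[na_rows] <- data_mean$mean_value[idx[na_rows]]

print(data)
```

Because the lookup goes through the key and not through row position, each NA receives the mean of its own year and age-group, and all groups are handled in one step without writing a separate line per year and age-group. If a group is absent from the mean table, its NA stays NA.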