2011-04-15
I have a few questions. The original text reads:

FFT analysis of the (gene) expression profiles. Fourier analysis was performed on each profile in the quality-controlled set (5,081 oligonucleotides). Profiles were smoothed with missing values imputed using a locally weighted regression algorithm with local weighting restricted to 12% using R. Fourier analysis was performed on each profile using the fft() function of R, padded with zeros to 64 measurements. The power spectrum was calculated using the spectrum() function of R. The power at each frequency (Power()), the total power (Ptot), and the frequency of maximum power (Fmax) were determined. The periodicity score was defined as Power[(Fmax-1) + (Fmax) + (Fmax+1)]/Ptot. The most frequent value of Fmax across all profiles was deemed the major frequency (m) and used in determining phase information. The phase of each profile was calculated as atan2[-I(m), R(m)], where atan2 is R's arctangent function and I and R are the imaginary and real parts of the FFT. Profiles were then ordered in increasing phase from -π to π. The loess smooth profiles were drawn through the raw expression data using the loess() function found in the modern regression library of R (version 1.5.1). The default parameters were used, with the exception that local weighting was reduced to 30%. For the averaged profiles of the functional groups, the loess smooth profiles were calculated for each expression profile individually and subsequently averaged to create the representative profile. These same methods were applied to both the randomized set and the yeast cell cycle dataset.

I have similar data to process, but some parts are unclear to me when I try to reproduce the method. Having written to the authors without getting a reply, I am asking here.

1. "Each profile (gene expression profile) was smoothed with missing values imputed using a locally weighted regression algorithm...": I am stuck here. The gene expression profiles are first defined as time series, and some time points have no data. I do not know how to call the locally weighted regression or how to apply the "restricted to 12%" setting.
2. How is "padded with zeros to 64 measurements" done with fft()?
3. How is the periodicity score computed?
4. How is the ordering by phase from -π to π implemented?

Thank you for any guidance and useful suggestions. I look forward to your replies.
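For reference, the quantities in the quoted methods can be written out as below. This is only one reading of the text: the symbols X(k), P(k), N and φ are notation introduced here, and P(k) is taken as the squared FFT magnitude, whereas spectrum() in R may scale, taper, or detrend differently.

```latex
X(k) = \sum_{t=0}^{N-1} x_t \, e^{-2\pi i k t / N}, \qquad N = 64 \ \text{(after zero padding)}

P(k) = |X(k)|^2, \qquad
P_{tot} = \sum_{k} P(k), \qquad
F_{max} = \arg\max_{k} P(k)

\text{periodicity score} =
  \frac{P(F_{max}-1) + P(F_{max}) + P(F_{max}+1)}{P_{tot}}

\phi = \operatorname{atan2}\bigl(-I(m),\, R(m)\bigr) \in [-\pi, \pi],
\qquad R(m) = \operatorname{Re} X(m), \quad I(m) = \operatorname{Im} X(m)
```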
All replies
2011-4-17 11:46:40
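I am not familiar with processing gene data.
My guesses are as follows:
(1) The first step uses a locally weighted nonparametric fit, e.g. the loess() function, to fill in the missing values.
(2) The second step is FFT filtering, e.g. fftfilter(), with the corresponding parameter n=64.

I do not understand the third and fourth steps at all.

(If you could provide the data, the detailed intermediate steps, or the goal of the first step, it might be clearer.)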
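A minimal sketch of guesses (1) and (2), under stated assumptions: the profile below is synthetic (48 equally spaced points with two interior gaps), `span = 0.12` is only one interpretation of "local weighting restricted to 12%", and `degree = 1` with `surface = "direct"` are choices made here to keep such a narrow window stable. Zero padding is done by simply appending zeros before calling fft(), rather than by a filtering function.

```r
# Hypothetical example profile: 48 time points, two missing values.
set.seed(1)
t <- 1:48
y <- sin(2 * pi * t / 24) + rnorm(48, sd = 0.1)
y[c(10, 30)] <- NA

# Fit loess on the observed points only.
# span = 0.12 is an assumed reading of "restricted to 12%".
obs <- !is.na(y)
fit <- loess(y ~ t,
             data = data.frame(t = t[obs], y = y[obs]),
             span = 0.12, degree = 1,
             control = loess.control(surface = "direct"))

# Predict at every time point: this smooths the profile and fills the gaps.
smoothed <- predict(fit, newdata = data.frame(t = t))

# Pad with zeros up to 64 measurements, then take the FFT.
n_pad <- 64
padded <- c(smoothed, rep(0, n_pad - length(smoothed)))
spec <- fft(padded)
```

With only a handful of time points per profile, a 12% span would cover too few points for a local fit, so the span would need to be larger in that case.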
2011-4-19 09:24:34
1# zhanguozhou
Friendly bump.
2011-4-24 16:56:07
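Thank you both, and sorry for not stating the question clearly.
Changes in gene expression are represented by signal intensity. The data format is as follows: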
gene  1h    6h    12h   18h   24h   30h   36h   42h   48h
1     0.14  0.15  0.13  0.12  0.13  0.96  4.19  3.19  1.04
2     0.29  0.21  0.25  0.24  0.14  0.15  0.51  4.03  3.77
3     0.18  0.17  0.16  0.17  0.19  3.42  4.46  1.74  0.65
4     7.59  1.48  0.96  0.77  0.42  0.27  0.28  0.88  1.63
5     0.63  0.50  0.26  0.18  0.14  0.18  0.25  3.03  3.86
6     0.40  0.25  0.19  0.12  0.34  3.21  2.44  0.94  0.61
7     0.24  0.22  0.19  0.17  0.18  0.18  0.24  3.24  4.12
8     0.31  0.26  0.22  0.20  0.21  0.43  2.67  4.83  2.08
9     0.17  0.18  0.16  0.16  0.19  1.95  3.58  2.06  0.88
10    0.19  0.21  0.19  0.17  0.22  2.38  3.86  1.94  0.72
......
On this basis, the following data operations are applied, which I do not understand:
1  Convert each row into a time series, imputing the missing data with a locally weighted smooth (<=12%)?
2  Apply fft() to each row, padded with zeros to 64 measurements?
3  Apply spectrum() to each row?
4  Obtain the power at each frequency, the total power (P_tot), and the frequency of maximum power (F_max)?
5  Periodicity score = Power[(F_max-1) + (F_max) + (F_max+1)]/P_tot, used for grouping?
6  The most frequent value of F_max across all profiles was deemed the major frequency (m) and used in determining phase information?
7  The phase of each profile was calculated as atan2[-I(m), R(m)];
8  Profiles were ordered in increasing phase from -pi to pi;
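The steps listed above (from the FFT onward) could be sketched roughly as follows for the ten sample rows. Several things here are assumptions rather than the authors' procedure: the time points are treated as equally spaced, the mean is removed before padding so the zero-frequency term does not dominate, and the power is taken directly as the squared FFT magnitude instead of calling spectrum(). The names `power_spectrum` and `n_pad` are hypothetical helpers introduced for this example.

```r
# The ten sample profiles from the post (columns: 1h, 6h, ..., 48h).
x <- matrix(c(
  0.14, 0.15, 0.13, 0.12, 0.13, 0.96, 4.19, 3.19, 1.04,
  0.29, 0.21, 0.25, 0.24, 0.14, 0.15, 0.51, 4.03, 3.77,
  0.18, 0.17, 0.16, 0.17, 0.19, 3.42, 4.46, 1.74, 0.65,
  7.59, 1.48, 0.96, 0.77, 0.42, 0.27, 0.28, 0.88, 1.63,
  0.63, 0.50, 0.26, 0.18, 0.14, 0.18, 0.25, 3.03, 3.86,
  0.40, 0.25, 0.19, 0.12, 0.34, 3.21, 2.44, 0.94, 0.61,
  0.24, 0.22, 0.19, 0.17, 0.18, 0.18, 0.24, 3.24, 4.12,
  0.31, 0.26, 0.22, 0.20, 0.21, 0.43, 2.67, 4.83, 2.08,
  0.17, 0.18, 0.16, 0.16, 0.19, 1.95, 3.58, 2.06, 0.88,
  0.19, 0.21, 0.19, 0.17, 0.22, 2.38, 3.86, 1.94, 0.72
), nrow = 10, byrow = TRUE)

n_pad <- 64  # padded length, as in the quoted methods

# FFT coefficients and power at the positive frequencies k = 1..n_pad/2.
# Removing the mean first is an assumption made for this sketch.
power_spectrum <- function(y) {
  y <- y - mean(y)
  z <- fft(c(y, rep(0, n_pad - length(y))))
  keep <- 2:(n_pad / 2 + 1)
  list(power = Mod(z[keep])^2, coef = z[keep])
}

res   <- lapply(seq_len(nrow(x)), function(i) power_spectrum(x[i, ]))
power <- t(sapply(res, `[[`, "power"))  # genes x frequencies
coef  <- t(sapply(res, `[[`, "coef"))   # complex, genes x frequencies

p_tot <- rowSums(power)                 # total power per gene
f_max <- apply(power, 1, which.max)     # frequency index of maximum power

# Periodicity score: power at F_max and its two neighbours over total power.
score <- sapply(seq_len(nrow(power)), function(i) {
  idx <- f_max[i] + (-1:1)
  idx <- idx[idx >= 1 & idx <= ncol(power)]
  sum(power[i, idx]) / p_tot[i]
})

# Major frequency m: the most common F_max across all profiles.
m <- as.integer(names(which.max(table(f_max))))

# Phase at m, then sort genes by increasing phase between -pi and pi.
phase <- atan2(-Im(coef[, m]), Re(coef[, m]))
ord <- order(phase)
x_ordered <- x[ord, ]
```

Plotting `x_ordered` as a heat map (for example with image()) would then show the genes arranged by the timing of their peak, which appears to be the layout the attached figure is aiming for.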
When finished, the result is the genes arranged in the format of the attached figure (transcriptome01.png).
    Thanks again for your help!
Attachment: transcriptome01.png (663.86 KB)
2011-4-24 17:00:07
Please do not hesitate to point out any errors or inaccuracies!