Title:
Coupling hidden Markov models for the discovery of Cis-regulatory modules in multiple species
---
Authors:
Qing Zhou, Wing Hung Wong
---
Year of latest submission:
2007
---
Classification:
Primary category: Statistics
Secondary category: Applications
Category description: Biology, Education, Epidemiology, Engineering, Environmental Sciences, Medical, Physical Sciences, Quality Control, Social Sciences
---
Abstract:
  Cis-regulatory modules (CRMs) composed of multiple transcription factor binding sites (TFBSs) control gene expression in eukaryotic genomes. Comparative genomic studies have shown that these regulatory elements are more conserved across species due to evolutionary constraints. We propose a statistical method to combine module structure and cross-species orthology in de novo motif discovery. We use a hidden Markov model (HMM) to capture the module structure in each species and couple these HMMs through multiple-species alignment. Evolutionary models are incorporated to consider correlated structures among aligned sequence positions across different species. Based on our model, we develop a Markov chain Monte Carlo approach, MultiModule, to discover CRMs and their component motifs simultaneously in groups of orthologous sequences from multiple species. Our method is tested on both simulated and biological data sets in mammals and Drosophila, where significant improvement over other motif and module discovery methods is observed. 
---
PDF link:
https://arxiv.org/pdf/708.4318