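Abstract (translated from the Chinese):
A computational challenge in validating candidate disease genes from high-throughput genomic studies is elucidating the associations between a candidate gene set and disease phenotypes. Because existing annotations of disease-causing genes are incomplete, conventional gene set enrichment analysis often fails to reveal associations between disease phenotypes and gene sets made up of poorly annotated genes. We propose a network-based computational approach, called rcNet, to discover associations between gene sets and disease phenotypes. Assuming coherent associations between genes ranked by their relevance to the query gene set and disease phenotypes ranked by their relevance to the hidden target disease phenotypes of the query gene set, we formulate a learning framework that maximizes rank coherence with respect to known disease phenotype-gene associations. An efficient algorithm coupling ridge regression with label propagation, along with two variants, is introduced to find the optimal solution of the framework. We evaluated the rcNet algorithms and existing baseline methods with both leave-one-out cross-validation and a task of predicting recently discovered disease-gene associations in OMIM. The experiments show that the rcNet algorithms achieved the best overall rankings compared to the baselines. To further validate the reproducibility of this performance, we applied the algorithms to identify the target diseases of novel candidate disease genes obtained from recent GWAS, DNA copy number variation analysis, and gene expression profiling studies. Across all three case studies, the algorithms ranked the target disease of the candidate genes at the top of the rank list in many cases. The rcNet algorithms are available as a web tool for disease and gene set association analysis at http://compbio.cs.umn.edu/dgsa_rcNet.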
---
English title:
Inferring Disease and Gene Set Associations with Rank Coherence in Networks
---
Authors:
TaeHyun Hwang, Wei Zhang, Maoqiang Xie, Rui Kuang
---
Year of latest submission:
2011
---
Classification:
Top-level category: Quantitative Biology
Subcategory: Genomics
Category description: DNA sequencing and assembly; gene and motif finding; RNA editing and alternative splicing; genomic structure and processes (replication, transcription, methylation, etc); mutational processes.
--
Top-level category: Computer Science
Subcategory: Artificial Intelligence
Category description: Covers all areas of AI except Vision, Robotics, Machine Learning, Multiagent Systems, and Computation and Language (Natural Language Processing), which have separate subject areas. In particular, includes Expert Systems, Theorem Proving (although this may overlap with Logic in Computer Science), Knowledge Representation, Planning, and Uncertainty in AI. Roughly includes material in ACM Subject Classes I.2.0, I.2.1, I.2.3, I.2.4, I.2.8, and I.2.11.
--
Top-level category: Computer Science
Subcategory: Machine Learning
Category description: Papers on all aspects of machine learning research (supervised, unsupervised, reinforcement learning, bandit problems, and so on) including also robustness, explanation, fairness, and methodology. cs.LG is also an appropriate primary category for applications of machine learning methods.
--
Top-level category: Quantitative Biology
Subcategory: Molecular Networks
Category description: Gene regulation, signal transduction, proteomics, metabolomics, gene and enzymatic networks
--
---
English abstract:
A computational challenge to validate the candidate disease genes identified in a high-throughput genomic study is to elucidate the associations between the set of candidate genes and disease phenotypes. The conventional gene set enrichment analysis often fails to reveal associations between disease phenotypes and the gene sets with a short list of poorly annotated genes, because the existing annotations of disease causative genes are incomplete. We propose a network-based computational approach called rcNet to discover the associations between gene sets and disease phenotypes. Assuming coherent associations between the genes ranked by their relevance to the query gene set, and the disease phenotypes ranked by their relevance to the hidden target disease phenotypes of the query gene set, we formulate a learning framework maximizing the rank coherence with respect to the known disease phenotype-gene associations. An efficient algorithm coupling ridge regression with label propagation, and two variants are introduced to find the optimal solution of the framework. We evaluated the rcNet algorithms and existing baseline methods with both leave-one-out cross-validation and a task of predicting recently discovered disease-gene associations in OMIM. The experiments demonstrated that the rcNet algorithms achieved the best overall rankings compared to the baselines. To further validate the reproducibility of the performance, we applied the algorithms to identify the target diseases of novel candidate disease genes obtained from recent studies of GWAS, DNA copy number variation analysis, and gene expression profiling. The algorithms ranked the target disease of the candidate genes at the top of the rank list in many cases across all the three case studies. The rcNet algorithms are available as a webtool for disease and gene set association analysis at http://compbio.cs.umn.edu/dgsa_rcNet.
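The abstract names label propagation as one building block of rcNet. As a rough illustration of that generic technique only (not the rcNet algorithm itself, which also couples it with ridge regression and a rank-coherence objective), the sketch below spreads a query gene set's scores over a small network using the standard update f <- alpha * S f + (1 - alpha) * y. The toy network, the gene labels g1 to g4, the value of alpha, and the function names are all assumptions made for this example.

```python
import math

def normalize(W):
    """Symmetric normalization S = D^{-1/2} W D^{-1/2} of an adjacency matrix."""
    deg = [sum(row) for row in W]
    n = len(W)
    return [[W[i][j] / math.sqrt(deg[i] * deg[j]) if W[i][j] else 0.0
             for j in range(n)] for i in range(n)]

def label_propagation(W, y, alpha=0.5, tol=1e-9, max_iter=1000):
    """Iterate f <- alpha * S f + (1 - alpha) * y until f stops changing."""
    S = normalize(W)
    n = len(y)
    f = list(y)
    for _ in range(max_iter):
        new_f = [alpha * sum(S[i][j] * f[j] for j in range(n)) + (1 - alpha) * y[i]
                 for i in range(n)]
        if max(abs(a - b) for a, b in zip(new_f, f)) < tol:
            return new_f
        f = new_f
    return f

# Toy path network g1 - g2 - g3 - g4 (hypothetical genes); only g1 is in the query set.
W = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
y = [1.0, 0.0, 0.0, 0.0]

scores = label_propagation(W, y)
# Gene indices sorted from most to least relevant to the query set.
ranking = sorted(range(len(scores)), key=lambda i: -scores[i])
```

In this toy case the scores fall off with network distance from the query gene, so genes closer to g1 rank higher. A ranking of this kind over genes is the sort of quantity the abstract describes comparing against a ranking over disease phenotypes.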
---
PDF link:
https://arxiv.org/pdf/1102.3919