2022-03-04
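Abstract (translated):
This work proposes a Markovian, memoryless model of DNA that greatly reduces its complexity. Nucleotide sequences are encoded as symbolic sequences called words, from which the authors determine meaningful word lengths and groups of words that share symbolic similarities. Treating each node as a group of similar words and each edge as their functional connectivity yields a network of the grammatical rules that govern how groups of words appear in DNA. According to the authors, the model predicts transitions between groups of words in DNA with unprecedented accuracy and makes many informational quantities easy to compute, which helps characterize DNA. They also reduce the DNA of known bacteria to a network of only tens of nodes, show how the model can detect similar (or dissimilar) genes in different organisms, and identify which symbol sequences carry most of the information content of DNA. DNA can therefore be treated as a language, a Markovian language, in which a "word" is an element of a group and the grammar is the set of rules behind the transition probabilities between any two groups.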
---
Title:
Markovian language model of the DNA and its information content
---
Authors:
Shambhavi Srivastava and Murilo S. Baptista
---
Year of latest submission:
2015
---
Classification:

Primary category: Physics
Secondary category: Biological Physics
Category description: Molecular biophysics, cellular biophysics, neurological biophysics, membrane biophysics, single-molecule biophysics, ecological biophysics, quantum phenomena in biological systems (quantum biophysics), theoretical biophysics, molecular dynamics/modeling and simulation, game theory, biomechanics, bioinformatics, microorganisms, virology, evolution, biophysical methods.
--
Primary category: Quantitative Biology
Secondary category: Other Quantitative Biology
Category description: Work in quantitative biology that does not fit into the other q-bio classifications
--

---
Abstract (original English):
  This work proposes a markovian memoryless model for the DNA that simplifies enormously the complexity of it. We encode nucleotide sequences into symbolic sequences, called words, from which we establish meaningful length of words and group of words that share symbolic similarities. Interpreting a node to represent a group of similar words and edges to represent their functional connectivity allows us to construct a network of the grammatical rules governing the appearance of group of words in the DNA. Our model allows to predict the transition between group of words in the DNA with unprecedented accuracy, and to easily calculate many informational quantities to better characterize the DNA. In addition, we reduce the DNA of known bacteria to a network of only tens of nodes, show how our model can be used to detect similar (or dissimilar) genes in different organisms, and which sequences of symbols are responsible for the most of the information content of the DNA. Therefore, the DNA can indeed be treated as a language, a markovian language, where a "word" is an element of a group, and its grammar represents the rules behind the probability of transitions between any two groups.
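
For readers who want a feel for the idea, here is a rough Python sketch of a first-order Markov model over groups of DNA words. It is not the authors' method: the abstract does not say how word length or word groups are chosen, so the fixed word length of 3, the non-overlapping split, the `group_of` rule (group by first symbol), and the toy sequence are all assumptions made up for illustration.

```python
from collections import Counter, defaultdict
from math import log2


def split_into_words(sequence, word_length):
    """Cut a nucleotide string into non-overlapping words of fixed length.

    Assumption: the paper may choose word lengths differently.
    """
    usable = len(sequence) - len(sequence) % word_length
    return [sequence[i:i + word_length] for i in range(0, usable, word_length)]


def group_of(word):
    """Hypothetical grouping rule: words that share the same first symbol."""
    return word[0]


def transition_probabilities(groups):
    """Estimate P(next group | current group) from consecutive pairs."""
    counts = defaultdict(Counter)
    for current, nxt in zip(groups, groups[1:]):
        counts[current][nxt] += 1
    return {
        current: {nxt: n / sum(row.values()) for nxt, n in row.items()}
        for current, row in counts.items()
    }


def entropy_rate(groups, probs):
    """Entropy in bits per transition, weighting each row by how often
    its group occurs as the source of a transition."""
    sources = Counter(groups[:-1])
    total = sum(sources.values())
    rate = 0.0
    for current, row in probs.items():
        weight = sources[current] / total
        rate -= weight * sum(p * log2(p) for p in row.values())
    return rate


sequence = "ATGCGATACGCTTGAGGCTAATGCCGTA"  # toy input, not real genomic data
words = split_into_words(sequence, word_length=3)
groups = [group_of(w) for w in words]
probs = transition_probabilities(groups)

print(words)
print(probs)
print(entropy_rate(groups, probs))
```

Each key of `probs` plays the role of a node and each nonzero entry the role of a weighted edge, which is the network picture the abstract describes. The entropy figure is one simple example of an "informational quantity"; the paper may use different ones.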
---
PDF link:
https://arxiv.org/pdf/1510.02375