Title:
CoHSI V: Identical multiple scale-independent systems within genomes and computer software
---
Authors:
Les Hatton, Gregory Warr
---
Year of latest submission:
2019
---
Classification:
Primary category: Quantitative Biology
Secondary category: Other Quantitative Biology
Category description: Work in quantitative biology that does not fit into the other q-bio classifications
---
Abstract:
A mechanism-free and symbol-agnostic conservation principle, the Conservation of Hartley-Shannon Information (CoHSI) is predicted to constrain the structure of discrete systems regardless of their origin or function. Despite their distinct provenance, genomes and computer software share a simple structural property; they are linear symbol-based discrete systems, and thus they present an opportunity to test in a comparative context the predictions of CoHSI. Here, without any consideration of, or relevance to, their role in specifying function, we identify that 10 representative genomes (from microbes to human) and a large collection of software contain identically structured nested subsystems. In the case of base sequences in genomes, CoHSI predicts that if we split the genome into n-tuples (a 2-tuple is a pair of consecutive bases; a 3-tuple is a trio and so on), without regard for whether or not a region is coding, then each collection of n-tuples will constitute a homogeneous discrete system and will obey a power-law in frequency of occurrence of the n-tuples. We consider 1-, 2-, 3-, 4-, 5-, 6-, 7- and 8-tuples of ten species and demonstrate that the predicted power-law behavior is emphatically present, and furthermore as predicted, is insensitive to the start window for the tuple extraction i.e. the reading frame is irrelevant. We go on to provide a proof of Chargaff's second parity rule and on the basis of this proof, predict higher order tuple parity rules which we then identify in the genome data. CoHSI predicts precisely the same behavior in computer software. This prediction was tested and confirmed using 2-, 3- and 4-tuples of the hexadecimal representation of machine code in multiple computer programs, underlining the fundamental role played by CoHSI in defining the landscape in which discrete symbol-based systems must operate.
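
The tuple extraction described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes non-overlapping n-tuples read from a chosen start offset (the "reading frame"), and it assumes that the higher-order parity rule compares each tuple with its reverse complement, which is the usual statement of the generalized second parity rule; the abstract itself does not spell out either detail. The function names and the toy sequence are hypothetical.

```python
from collections import Counter

# Complement table for DNA bases (assumes an uppercase A/C/G/T alphabet).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def tuple_counts(seq, n, offset=0):
    """Count non-overlapping n-tuples of seq, starting at index offset.

    Changing offset corresponds to changing the reading frame.
    Assumption: tuples do not overlap; a trailing partial tuple is dropped.
    """
    return Counter(seq[i:i + n] for i in range(offset, len(seq) - n + 1, n))


def ranked_frequencies(counts):
    """Return occurrence counts sorted from most to least frequent.

    Plotting these against rank on log-log axes is one common way to
    look for power-law behavior.
    """
    return sorted(counts.values(), reverse=True)


def reverse_complement(tup):
    """Return the reverse complement of a tuple of bases, e.g. AAC -> GTT."""
    return tup.translate(_COMPLEMENT)[::-1]


def parity_pairs(counts):
    """Pair each tuple's count with the count of its reverse complement.

    Under the assumed parity rule the two counts should be close
    for a long single-stranded sequence.
    """
    return {t: (c, counts.get(reverse_complement(t), 0)) for t, c in counts.items()}


if __name__ == "__main__":
    toy = "ACGTACGTTGCAACGT"  # hypothetical toy sequence, far too short for real statistics
    for frame in range(3):
        counts = tuple_counts(toy, 3, offset=frame)
        print(frame, ranked_frequencies(counts), parity_pairs(counts))
```

On a real genome the same functions would be applied for each tuple length from 1 to 8 and for every start offset, and the ranked frequencies compared across offsets.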
---
PDF link:
https://arxiv.org/pdf/1902.09360