Title:
Big Data Challenges in Genome Informatics
---
Author:
Ka-Chun Wong
---
Year of latest submission:
2018
---
Categories:
Top-level category: Quantitative Biology
Subcategory: Other Quantitative Biology
Category description: Work in quantitative biology that does not fit into the other q-bio classifications
--
Top-level category: Computer Science
Subcategory: Computational Engineering, Finance, and Science
Category description: Covers applications of computer science to the mathematical modeling of complex systems in the fields of science, engineering, and finance. Papers here are interdisciplinary and applications-oriented, focusing on techniques and tools that enable challenging computational simulations to be performed, for which the use of supercomputers or distributed computing platforms is often required. Includes material in ACM Subject Classes J.2, J.3, and J.4 (economics).
---
Abstract:
In recent years, we have witnessed a dramatic data explosion in genomics, thanks to the improvement in sequencing technologies and the drastically decreasing costs. We are entering the era of millions of available genomes. Notably, each genome can be composed of billions of nucleotides stored as plain text files in GigaBytes (GBs). It is undeniable that those genome data impose unprecedented data challenges for us. In this article, we briefly discuss the big data challenges associated with genomics in recent years.
---
PDF link:
https://arxiv.org/pdf/1803.09632