全部版块 我的主页
论坛 数据科学与人工智能 数据分析与数据科学 MATLAB等数学软件专版
1022 2
2017-05-04
  • Title: Optimized R functions for analysis of ecological community data using the R virtual laboratory (RvLab)
  • Author: Oulas, Anastasis; Varsos, Constantinos; Patkos, Theodore; Pavloudi, Christina;Gougousis, Alexandros; Ijaz, Umer; Filiopoulou, Irene; Pattakos, Nikolaos; Vanden Berghe, Edward; Fernández - Guerra, Antonio; Faulwetter, Sarah; Chatzinikolaou, Eva; Pafilis, Evangelos; Bekiari, Chryssoula; Doerr, Martin; Arvanitidis, Christos
  • Date: 2016
  • Is Part Of: Biodiversity Data Journal, 2016 [Peer Reviewed Journal]
  • Description: Background Parallel data manipulation using R has previously been addressed by members of the R community, however most of these studies produce ad hoc solutions that are not readily available to the average R user. Our targeted users, ranging from the expert ecologist/microbiologists to computational biologists, often experience difficulties in finding optimal ways to exploit the full capacity of their computational resources. In addition, improving performance of commonly used R scripts becomes increasingly difficult especially with large datasets. Furthermore, the implementations described here can be of significant interest to expert bioinformaticians or R developers. Therefore, our goals can be summarized as: (i) description of a complete methodology for the analysis of large datasets by combining capabilities of diverse R packages, (ii) presentation of their application through a virtual R laboratory (RvLab) that makes execution of complex functions and visualization of results easy and readily available to the end-user. New information In this paper, the novelty stems from implementations of parallel methodologies which rely on the processing of data on different levels of abstraction and the availability of these processes through an integrated portal. Parallel implementation R packages, such as the pbdMPI (Programming with Big Data – Interface to MPI) package, are used to implement Single Program Multiple Data (SPMD) parallelization on primitive mathematical operations, allowing for interplay with functions of the vegan package. The dplyr and RPostgreSQL R packages are further integrated offering connections to dataframe like objects (databases) as secondary storage solutions whenever memory demands exceed available RAM resources. The RvLab is running on a PC cluster, using version 3.1.2 (2014-10-31) on a x86_64-pc-linux-gnu (64-bit) platform, and offers an intuitive virtual environmet interface enabling users to perform analysis of ecological and microbial communities based on optimized vegan functions. A beta version of the RvLab is available after registration at: https://portal.lifewatchgreece.eu/


8507226.pdf
大小:(1.02 MB)

只需: 6 个论坛币  马上下载



二维码

扫码加我 拉你入群

请注明:姓名-公司-职位

以便审核进群资格,未注明则拒绝

全部回复
2017-5-4 22:57:14
一哇一嘛呆 发表于 2017-5-4 18:23
  • Title: Optimized R functions for analysis of ecological community data using the R virtual labo ...
  • 好的呀好
    二维码

    扫码加我 拉你入群

    请注明:姓名-公司-职位

    以便审核进群资格,未注明则拒绝

    2017-5-5 04:52:27
    谢谢分享。
    二维码

    扫码加我 拉你入群

    请注明:姓名-公司-职位

    以便审核进群资格,未注明则拒绝

    栏目导航
    热门文章
    推荐文章

    说点什么

    分享

    扫码加好友,拉您进群
    各岗位、行业、专业交流群