摘要翻译:
类预测是微阵列基因表达
数据分析的一个重要应用。微阵列数据的高维性使得许多预测技术(如logistic回归、判别分析)的应用变得困难。解决这一问题的有效方法是利用降维统计技术。Rasch模型(RM)越来越多地应用于心理学相关的应用中,为处理高维微阵列数据提供了一个有吸引力的框架。本文利用二值化的微阵列基因表达数据,研究了基于RM的建模在降维方面的潜力,并在线性判别分析的分类预测中研究了其预测性能。两个不同的公开可用的微阵列数据集被用来说明该方法的一般框架。以主成分分析(PCA)为基准,采用再随机化方案对该方法的性能进行了评估。我们的结果表明,基于RM的降维和基于PCA的降维是一样有效的。该方法具有通用性,可推广应用于其他高维数据问题。
---
英文标题:
《Rasch-based high-dimensionality data reduction and class prediction with
applications to microarray gene expression data》
---
作者:
Andrej Kastrin, Borut Peterlin
---
最新提交年份:
2010
---
分类信息:
一级分类:Computer Science 计算机科学
二级分类:Artificial Intelligence
人工智能
分类描述:Covers all areas of AI except Vision, Robotics, Machine Learning, Multiagent Systems, and Computation and Language (Natural Language Processing), which have separate subject areas. In particular, includes Expert Systems, Theorem Proving (although this may overlap with Logic in Computer Science), Knowledge Representation, Planning, and Uncertainty in AI. Roughly includes material in ACM Subject Classes I.2.0, I.2.1, I.2.3, I.2.4, I.2.8, and I.2.11.
涵盖了人工智能的所有领域,除了视觉、机器人、机器学习、多智能体系统以及计算和语言(自然语言处理),这些领域有独立的学科领域。特别地,包括专家系统,定理证明(尽管这可能与计算机科学中的逻辑重叠),知识表示,规划,和人工智能中的不确定性。大致包括ACM学科类I.2.0、I.2.1、I.2.3、I.2.4、I.2.8和I.2.11中的材料。
--
一级分类:Statistics 统计学
二级分类:Applications 应用程序
分类描述:Biology, Education, Epidemiology, Engineering, Environmental Sciences, Medical, Physical Sciences, Quality Control, Social Sciences
生物学,教育学,流行病学,工程学,环境科学,医学,物理科学,质量控制,社会科学
--
一级分类:Statistics 统计学
二级分类:Methodology 方法论
分类描述:Design, Surveys, Model Selection, Multiple Testing, Multivariate Methods, Signal and Image Processing, Time Series, Smoothing, Spatial Statistics, Survival Analysis, Nonparametric and Semiparametric Methods
设计,调查,模型选择,多重检验,多元方法,信号和图像处理,时间序列,平滑,空间统计,生存分析,非参数和半参数方法
--
一级分类:Statistics 统计学
二级分类:Machine Learning
机器学习
分类描述:Covers machine learning papers (supervised, unsupervised, semi-supervised learning, graphical models, reinforcement learning, bandits, high dimensional inference, etc.) with a statistical or theoretical grounding
覆盖机器学习论文(监督,无监督,半监督学习,图形模型,强化学习,强盗,高维推理等)与统计或理论基础
--
---
英文摘要:
Class prediction is an important application of microarray gene expression data analysis. The high-dimensionality of microarray data, where number of genes (variables) is very large compared to the number of samples (obser- vations), makes the application of many prediction techniques (e.g., logistic regression, discriminant analysis) difficult. An efficient way to solve this prob- lem is by using dimension reduction statistical techniques. Increasingly used in psychology-related applications, Rasch model (RM) provides an appealing framework for handling high-dimensional microarray data. In this paper, we study the potential of RM-based modeling in dimensionality reduction with binarized microarray gene expression data and investigate its prediction ac- curacy in the context of class prediction using linear discriminant analysis. Two different publicly available microarray data sets are used to illustrate a general framework of the approach. Performance of the proposed method is assessed by re-randomization scheme using principal component analysis (PCA) as a benchmark method. Our results show that RM-based dimension reduction is as effective as PCA-based dimension reduction. The method is general and can be applied to the other high-dimensional data problems.
---
PDF链接:
https://arxiv.org/pdf/1006.1030