Urgent: how do I calculate principal component scores?

2014-08-27

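I am writing a paper and have collected 13 years of data. I want to compute a composite index = Σ Wi·Zit (summed over i; i = 1, 2, ..., k; t = 1, 2, ..., n), where Wi is the variance contribution rate of the i-th principal component and Zit is the score on the i-th principal component. How do I calculate the principal component scores, and then this composite index for each year? My results so far are attached below. Any guidance would be greatly appreciated.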

Attachments (2 images, not reproduced here):

JJXVXX)P[2VS3AS$943IK_V.jpg (64.1 KB)

68_9YL~4H6]]I]OAD{O6`]K.jpg (94.08 KB)

All replies

ReneeBK

2014-8-28 05:33:01

http://www.mun.ca/biology/scarr/2900_PCA_Analysis.htm

ReneeBK

2014-8-28 05:36:43

First, let's define a score:

John, Mike and Kate get the following percentages for exams in Maths, Science, English and Music:

   Maths Science English Music
John  80       85       60    55
Mike  90       85       70    45
Kate  95       80       40    50
In this case there are 12 scores in total. Each score represents the exam results for each person in a particular subject. So a score in this case is simply a representation of where a row and column intersect.

Now let's informally define a principal component:

In the table above, can you easily plot the data in a 2D graph? No, because there are four subjects (which means four variables), i.e.:

You could plot two subjects in the exact same way you would with x & y co-ordinates in a 2D graph.
You could even plot three subjects in the same way you would plot x, y & z in a 3D graph (though this is generally bad practice, because some distortion is inevitable in the 2D representation of 3D data).
But how would you plot 4 subjects?

At the moment we have four variables, each representing just one subject. One way around this is to combine the subjects into, say, just two new variables that we can then plot. This is known as dimensionality reduction; multidimensional scaling is one family of methods for it.

Principal component analysis is closely related to multidimensional scaling (classical multidimensional scaling on Euclidean distances gives the same configuration as PCA). It is a linear transformation of the variables into a lower-dimensional space that retains the maximal amount of information (variance) about the variables. For example, this would let us look at the types of subjects each student is perhaps more suited to.

A principal component is therefore a linear combination of the original variables. In R, this is:

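DF <- data.frame(Maths = c(80, 90, 95), Science = c(85, 85, 80), English = c(60, 70, 40), Music = c(55, 45, 50))
# PCA on the raw marks; prcomp() centres each column by default
prcomp(DF, scale. = FALSE)
This gives something like the following (only the first two principal components are shown, for simplicity; the signs of a whole column may be flipped on your machine):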

                PC1         PC2
Maths    0.27795606  0.76772853
Science -0.17428077 -0.08162874
English -0.94200929  0.19632732
Music    0.07060547 -0.60447104
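
As a minimal sketch of where these numbers come from, the block below refits the same model and pulls out the loadings and the share of variance each component explains. It assumes base R only; the names pca, loadings, var_prop and max_diff are introduced here for illustration and are not part of the original post. It also checks that the loadings agree, up to sign, with the eigenvectors of the covariance matrix.

```r
DF <- data.frame(Maths = c(80, 90, 95), Science = c(85, 85, 80),
                 English = c(60, 70, 40), Music = c(55, 45, 50),
                 row.names = c("John", "Mike", "Kate"))

# Fit the PCA; columns are centred by default and not rescaled here
pca <- prcomp(DF, scale. = FALSE)

# Loadings: one column per principal component
loadings <- pca$rotation
print(round(loadings[, 1:2], 2))

# Proportion of total variance explained by each component
var_prop <- pca$sdev^2 / sum(pca$sdev^2)
print(round(var_prop, 3))

# The loadings are the eigenvectors of the covariance matrix, up to sign
eig <- eigen(cov(DF))
max_diff <- max(abs(abs(eig$vectors[, 1:2]) - abs(unname(loadings[, 1:2]))))
print(max_diff < 1e-6)
```

The var_prop vector is the kind of "variance contribution rate" that a weighted composite index would use as weights.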
So what is a principal component score?

It is one entry in the table at the end of this post: the value of one principal component for one person.

The output from R means we can now plot each person's score across all subjects in a 2D graph as follows:

     x (PC1)                                       y (PC2)
John 0.28*80 + -0.17*85 + -0.94*60 + 0.07*55       0.77*80 + -0.08*85 + 0.19*60 + -0.60*55
Mike 0.28*90 + -0.17*85 + -0.94*70 + 0.07*45       0.77*90 + -0.08*85 + 0.19*70 + -0.60*45
Kate 0.28*95 + -0.17*80 + -0.94*40 + 0.07*50       0.77*95 + -0.08*80 + 0.19*40 + -0.60*50
Which simplifies to:

     x (PC1)  y (PC2)
John  -44.6    33.2
Mike  -51.9    48.8
Kate  -21.1    44.35
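There are six principal component scores in the table above. They were computed from the raw marks with the loadings rounded to two decimals, so they are approximate. Note also that prcomp() centres each variable by default, so the scores it reports (in the x component of its result) equal these values shifted by a constant for each component; the relative positions of the students are unchanged. You can now plot the scores in a 2D graph to get a sense of the type of subjects each student is perhaps more suited to.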
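
The hand calculation can be checked in R. The sketch below is only an illustration under stated assumptions: it keeps the first two components (an arbitrary choice), and the names raw_scores, centred_scores, w and composite are made up here. It computes the uncentred scores from the table, the centred scores that prcomp() stores, and one common way to build the variance-weighted composite index from the original question (the sum of Wi times Zit over the retained components).

```r
DF <- data.frame(Maths = c(80, 90, 95), Science = c(85, 85, 80),
                 English = c(60, 70, 40), Music = c(55, 45, 50),
                 row.names = c("John", "Mike", "Kate"))
pca <- prcomp(DF, scale. = FALSE)

X <- as.matrix(DF)
L <- pca$rotation[, 1:2]          # loadings of the first two components

# Uncentred scores, as in the hand calculation: raw marks times loadings
raw_scores <- X %*% L
print(round(raw_scores, 1))

# Centred scores: subtract the column means first, then project
centred_scores <- scale(X, center = TRUE, scale = FALSE) %*% L

# These match what prcomp() stores in pca$x
stopifnot(isTRUE(all.equal(unname(centred_scores), unname(pca$x[, 1:2]))))

# Variance contribution rates W_i of the retained components (assumed: 2)
w <- pca$sdev[1:2]^2 / sum(pca$sdev^2)

# Composite index for each row t: sum over i of W_i * Z_it
composite <- as.vector(pca$x[, 1:2] %*% w)
names(composite) <- rownames(DF)
print(round(composite, 2))
```

With yearly data, each row of DF would be one year instead of one student, and composite would hold one index value per year. Because the sign of a component is arbitrary, it is worth checking the direction of each component before combining them.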

389544197

2014-8-28 19:28:44

ReneeBK posted on 2014-8-28 05:36
First, let's define a score:

John, Mike and Kate get the following percentages for exams in Maths, ...

OK, I understand. But what should I do if negative values appear?

bakoll

2014-12-27 22:23:36

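389544197 posted on 2014-8-28 19:28
OK, I understand. But what should I do if negative values appear?

A city's principal component (factor) score is negative because the raw data were standardized before the calculation, which treats the average level of each economic indicator as zero. A negative score therefore only means that the city is below the average level in the original data. If you need non-negative values, shift the scores: add a positive constant and then multiply by a coefficient.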
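
A minimal sketch of such a shift, reusing the exam data from the earlier reply as a stand-in for the city data. Min-max rescaling to the range 0 to 100 is just one possible choice of constant and coefficient, and the names z and rescaled are introduced here for illustration. Any transformation of this kind with a positive coefficient keeps the ranking of the observations unchanged.

```r
DF <- data.frame(Maths = c(80, 90, 95), Science = c(85, 85, 80),
                 English = c(60, 70, 40), Music = c(55, 45, 50),
                 row.names = c("John", "Mike", "Kate"))

# Standardised PCA, as described in the reply above
pca <- prcomp(DF, scale. = TRUE)

# Scores on the first component; they sum to zero, so some are negative
z <- pca$x[, 1]
print(round(z, 3))

# Shift by the minimum, then multiply by a positive coefficient (0 to 100 scale)
rescaled <- (z - min(z)) / (max(z) - min(z)) * 100
print(round(rescaled, 1))

# The ordering of the observations is unchanged
print(identical(order(z), order(rescaled)))
```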
