Principal components - SPSS Forum - 经管之家

i.c.t.

2014-05-20

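I am running a principal component analysis, I believe in SPSS version 13. Running it twice on the same data expressed in different units gives different results.
The first run used the raw data: urban population share 85, GDP per capita 80354;
The second run used rescaled data: urban population share 0.85, GDP per capita 8.0354 (in units of 10,000);

Both data sets were standardized with SPSS's built-in standardization, yet the first run gave 4 principal components and the second gave 5...

Could someone explain why? Thanks.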

All replies

complicated

2014-5-20 10:39:26

Bumping this. I couldn't answer it myself, so I hope an expert can help! Many thanks!

Nicolle

2014-5-20 12:49:40

Notice: the author has been banned or the content was deleted, so this post is automatically hidden.

kuangsir6

2014-5-20 13:58:27

Don't standardize; try mean scaling (均值化, dividing each variable by its mean) instead.
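To make the suggestion concrete, here is a minimal sketch in Python with NumPy (not SPSS), assuming that mean scaling means dividing each variable by its own mean. The data, the column meanings, and the helper names `mean_scale` and `standardize` are all made up for illustration. It shows that the covariance matrix of mean-scaled data does not depend on the units, while standardization forces every variance to 1.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical data, not the poster's: 50 regions, 2 positive variables.
raw = np.column_stack([
    rng.uniform(30, 90, 50),        # urban population share, in percent
    rng.uniform(20000, 90000, 50),  # GDP per capita, in yuan
])

# Same data in other units: percent -> proportion, yuan -> 10,000 yuan.
rescaled = raw * np.array([0.01, 1e-4])


def mean_scale(data):
    """Divide each column by its mean (assumes nonzero column means)."""
    return data / data.mean(axis=0)


def standardize(data):
    """Z-score each column using the sample standard deviation."""
    return (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)


cov_mean_raw = np.cov(mean_scale(raw), rowvar=False)
cov_mean_rescaled = np.cov(mean_scale(rescaled), rowvar=False)
cov_z = np.cov(standardize(raw), rowvar=False)

# Diagonal of the mean-scaled covariance: squared coefficients of variation.
cv_squared = (raw.std(axis=0, ddof=1) / raw.mean(axis=0)) ** 2

print(np.allclose(cov_mean_raw, cov_mean_rescaled))  # unit-free
print(np.diag(cov_mean_raw), cv_squared)
print(np.diag(cov_z))
```

After mean scaling, each variable keeps its relative spread (its coefficient of variation), so a covariance-based PCA on mean-scaled data weights variables by relative variability rather than treating them all equally. This only makes sense for variables whose means are well away from zero.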

ReneeBK

2014-5-21 02:21:54

85 is not a proportion of urban population (a proportion lies between 0 and 1). I wonder whether the problem might be the large ratios between pairs of variables. For example, 80354 is roughly 1000 times larger than 85, so the ratio of variances will be roughly 10^6. Perhaps significant digits are being lost in the eigenvalue solution. Please fix the data typo and try again.
PCA is not scale-free. PCA on the variance-covariance matrix of a set of variables will typically yield different results from PCA on the correlation matrix of the same variables.
If you were analyzing correlations, there would be no reason for the results to come out different, because rescaling a variable does not change its correlations. So the likely explanation is some unnoticed change in the data. Look at all the means and r's.
If you were analyzing variances, which must be considered a mistake here, there is no reason for the loadings from the two analyses to resemble each other much. And the "number of factors extracted" is not meaningfully related to a cutoff of 1.0, if that was used. When PCA is run on correlations, the 1.0 represents the amount of variance contributed by each standardized variable, and an eigenvalue "less than one" says that the factor is worth less than a single variable and thus might be ignored for subsequent rotation... assuming you are working from a theory about important latent factors.
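The difference between the two kinds of PCA can be sketched as follows in Python with NumPy (not SPSS). The data are simulated, and the variable meanings and the helper `explained_share` are assumptions for illustration only. The share of variance per component is computed from the eigenvalues of the covariance matrix and of the correlation matrix, before and after a change of units.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200

# Hypothetical variables on very different scales (not the poster's data).
urban_pct = rng.normal(60, 15, n)        # urban share, in percent
gdp_yuan = rng.normal(50000, 20000, n)   # GDP per capita, in yuan
raw = np.column_stack([urban_pct, gdp_yuan])

# Same data in other units: percent -> proportion, yuan -> 10,000 yuan.
rescaled = raw * np.array([0.01, 1e-4])


def explained_share(matrix):
    """Share of total variance per component, largest eigenvalue first."""
    eigvals = np.linalg.eigvalsh(matrix)[::-1]
    return eigvals / eigvals.sum()


# PCA on the covariance matrix: depends on the units.
cov_raw = explained_share(np.cov(raw, rowvar=False))
cov_rescaled = explained_share(np.cov(rescaled, rowvar=False))

# PCA on the correlation matrix: does not depend on the units.
corr_raw = explained_share(np.corrcoef(raw, rowvar=False))
corr_rescaled = explained_share(np.corrcoef(rescaled, rowvar=False))

print("covariance, raw units:     ", cov_raw)
print("covariance, rescaled units:", cov_rescaled)
print("correlation, raw units:    ", corr_raw)
print("correlation, rescaled units:", corr_rescaled)
```

In the covariance runs the first component is dominated by whichever variable has the larger variance in the chosen units, so the shares move when the units change. The correlation runs agree with each other, which is why a different component count after standardization points to a difference in the data itself.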

Lisrelchen

2014-5-23 00:28:55

Changing the scale of a variable by a multiplicative constant will NOT change the correlations. Since a PCA of standardized variables is solely a function of the correlations, I would attribute the difference you found between the two data sets to typo(s), either in calculation or in reporting. Please compare the means, SDs, Ns and R matrix (recalculate the means and SDs under the different scaling).
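This check is easy to run outside SPSS. Below is a minimal sketch in Python with NumPy on simulated data. The six variables, the unit factors, and the helper `kaiser_count` are hypothetical, and the eigenvalue-greater-than-1 rule is only assumed to be the extraction criterion that was used. It compares the correlation matrix, its eigenvalues, and the number of retained components before and after rescaling two columns.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 40 cases, 6 correlated variables on mixed scales.
latent = rng.normal(size=(40, 2))
weights = rng.normal(size=(2, 6))
noise = 0.5 * rng.normal(size=(40, 6))
scales = np.array([10.0, 2e4, 1.0, 300.0, 5.0, 0.1])
raw = (latent @ weights + noise) * scales + 100.0

# Change the units of the first two columns only (multiplicative constants).
units = np.array([0.01, 1e-4, 1.0, 1.0, 1.0, 1.0])
rescaled = raw * units


def kaiser_count(data):
    """Eigenvalues of the correlation matrix and how many exceed 1.0."""
    corr = np.corrcoef(data, rowvar=False)
    eigvals = np.linalg.eigvalsh(corr)[::-1]  # largest first
    return int((eigvals > 1.0).sum()), eigvals


n_raw, eig_raw = kaiser_count(raw)
n_rescaled, eig_rescaled = kaiser_count(rescaled)

same_corr = np.allclose(np.corrcoef(raw, rowvar=False),
                        np.corrcoef(rescaled, rowvar=False))

print("correlation matrices equal:", same_corr)
print("components retained:", n_raw, n_rescaled)
```

If the same comparison on the real data shows different correlation matrices, the two files differ by more than a change of units, for example a mistyped value or a column rescaled for only some cases.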
