2008-07-18

Does anyone know what cophenetic correlation is used for?

SPSS doesn't seem to have it. Which software does?


All replies
2008-7-19 00:01:00

In statistics, and especially in biostatistics, cophenetic correlation (more precisely, the cophenetic correlation coefficient) is a measure of how faithfully a dendrogram preserves the pairwise distances between the original unmodeled data points. Although it has been most widely applied in the field of biostatistics (typically to assess cluster-based models of DNA sequences, or other taxonomic models), it can also be used in other fields of inquiry where raw data tend to occur in clumps, or clusters. This coefficient has also been proposed for use as a test for nested clusters.


2008-7-19 00:03:00

MATLAB:

cophenet - Cophenetic Correlation Coefficient

Syntax

c = cophenet(Z,Y)
[c,d] = cophenet(Z,Y)

Description

c = cophenet(Z,Y) computes the cophenetic correlation coefficient for the hierarchical cluster tree represented by Z. Z is the output of the linkage function. Y contains the distances or dissimilarities used to construct Z, as output by the pdist function. Z is a matrix of size (m-1)-by-3, with distance information in the third column. Y is a vector of length m(m-1)/2, one entry per pair of the m observations.

[c,d] = cophenet(Z,Y) returns the cophenetic distances d in the same lower triangular distance vector format as Y.
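To make the distance vector format concrete, here is a small Python sketch (standard library only) of how a pair of observations maps to a position in such a vector. It assumes the pair ordering (2,1), (3,1), ..., (m,1), (3,2), ... used by pdist, written here with 0-based indices; the helper name condensed_index is made up for this illustration.

```python
from itertools import combinations


def condensed_index(i, j, m):
    """0-based position of the pair (i, j) in a condensed distance vector for m points.

    Hypothetical helper for illustration; assumes the pairs are stored in the
    order (0,1), (0,2), ..., (0,m-1), (1,2), ..., (m-2,m-1).
    """
    if i == j:
        raise ValueError("there is no entry for the distance from a point to itself")
    if i > j:
        i, j = j, i  # the distance is symmetric, so order the pair
    return m * i - i * (i + 1) // 2 + (j - i - 1)


m = 4
pairs = list(combinations(range(m), 2))  # all unordered pairs, in storage order
indices = [condensed_index(i, j, m) for i, j in pairs]
print(len(pairs), indices)
```

For m = 4 there are 4*3/2 = 6 pairs, so both Y and the cophenetic distances d have 6 entries.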

The cophenetic correlation for a cluster tree is defined as the linear correlation coefficient between the cophenetic distances obtained from the tree, and the original distances (or dissimilarities) used to construct the tree. Thus, it is a measure of how faithfully the tree represents the dissimilarities among observations.

The cophenetic distance between two observations is represented in a dendrogram by the height of the link at which those two observations are first joined. That height is the distance between the two subclusters that are merged by that link.
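As a rough illustration of this definition, the Python sketch below (standard library only) runs a naive average-linkage clustering on four points on a line and records, for every pair, the height at which the two points are first joined. It is a toy written for this thread, not how MATLAB's linkage or cophenet is implemented; the function name and the data are assumptions.

```python
from itertools import combinations


def cophenetic_distances(points):
    """Naive average-linkage clustering of 1-D points.

    Returns a dict mapping each pair (i, j), i < j, to the height of the
    link at which observations i and j are first joined.
    """
    def dist(a, b):
        return abs(points[a] - points[b])

    clusters = [[i] for i in range(len(points))]
    coph = {}
    while len(clusters) > 1:
        # Find the two clusters with the smallest average inter-point distance.
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            total = sum(dist(p, q) for p in clusters[a] for q in clusters[b])
            d = total / (len(clusters[a]) * len(clusters[b]))
            if best is None or d < best[0]:
                best = (d, a, b)
        height, a, b = best
        # Every pair split across the two merged clusters is first joined here.
        for p in clusters[a]:
            for q in clusters[b]:
                coph[(min(p, q), max(p, q))] = height
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return coph


points = [0.0, 1.0, 5.0, 7.0]  # toy data chosen for this example
coph = cophenetic_distances(points)
for pair in sorted(coph):
    print(pair, coph[pair])
```

Points 0 and 1 merge at height 1, points 2 and 3 at height 2, and the two resulting clusters merge at height 5.5 (the average of 5, 7, 4 and 6), so every pair that straddles the two clusters has cophenetic distance 5.5.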

The output value, c, is the cophenetic correlation coefficient. The magnitude of this value should be very close to 1 for a high-quality solution. This measure can be used to compare alternative cluster solutions obtained using different algorithms.

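The cophenetic correlation between Z(:,3) and Y is defined as

$$c = \frac{\sum_{i<j} (Y_{ij} - y)(Z_{ij} - z)}{\sqrt{\sum_{i<j} (Y_{ij} - y)^2 \sum_{i<j} (Z_{ij} - z)^2}}$$

where: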

  • Yij is the distance between objects i and j in Y.

  • Zij is the cophenetic distance between objects i and j, from Z(:,3).

  • y and z are the averages of the distances Yij and of the cophenetic distances Zij, respectively.
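The quantities listed above combine into an ordinary Pearson correlation over all pairs i < j. A minimal Python sketch (standard library only) follows; the two input lists are toy values worked out by hand for four points at 0, 1, 5 and 7 clustered with average linkage, not output from MATLAB.

```python
import math


def cophenetic_correlation(Y, Z):
    """Pearson correlation between original distances Y and cophenetic distances Z.

    Both arguments are equal-length lists with one entry per pair i < j.
    """
    y_bar = sum(Y) / len(Y)
    z_bar = sum(Z) / len(Z)
    numerator = sum((y - y_bar) * (z - z_bar) for y, z in zip(Y, Z))
    denominator = math.sqrt(
        sum((y - y_bar) ** 2 for y in Y) * sum((z - z_bar) ** 2 for z in Z)
    )
    return numerator / denominator


# Pairs in the order (0,1), (0,2), (0,3), (1,2), (1,3), (2,3); illustrative values.
Y = [1.0, 5.0, 7.0, 4.0, 6.0, 2.0]  # original pairwise distances
Z = [1.0, 5.5, 5.5, 5.5, 5.5, 2.0]  # heights at which each pair is first joined
c = cophenetic_correlation(Y, Z)
print(round(c, 4))
```

For this toy tree c comes out at about 0.90; a tree that reproduced every distance exactly would give c = 1.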

Example

X = [rand(10,3); rand(10,3)+1; rand(10,3)+2];
Y = pdist(X);
Z = linkage(Y,'average');

% Compute Spearman's rank correlation between the
% dissimilarities and the cophenetic distances
[c,D] = cophenet(Z,Y);
r = corr(Y',D','type','spearman')

r =
    0.8279
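For readers without MATLAB, the rank-based variant in this example can be sketched in plain Python: replace each distance by its rank (tied values share the average rank) and take the Pearson correlation of the ranks. The helper names and the toy vectors below are assumptions made for illustration; the vectors are hand-computed for four points at 0, 1, 5 and 7 under average linkage, so the result is unrelated to the 0.8279 shown above.

```python
import math


def average_ranks(values):
    """1-based ranks of the values; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1
        for k in order[pos:end + 1]:
            ranks[k] = shared
        pos = end + 1
    return ranks


def pearson(a, b):
    """Linear correlation coefficient between two equal-length lists."""
    a_bar = sum(a) / len(a)
    b_bar = sum(b) / len(b)
    num = sum((x - a_bar) * (y - b_bar) for x, y in zip(a, b))
    den = math.sqrt(sum((x - a_bar) ** 2 for x in a) * sum((y - b_bar) ** 2 for y in b))
    return num / den


def spearman(a, b):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(a), average_ranks(b))


Y = [1.0, 5.0, 7.0, 4.0, 6.0, 2.0]  # original distances (toy values)
D = [1.0, 5.5, 5.5, 5.5, 5.5, 2.0]  # cophenetic distances (toy values)
r = spearman(Y, D)
print(round(r, 4))
```

The four tied cophenetic distances of 5.5 all receive rank 4.5, which is why ties need the averaging step.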

2008-7-19 00:16:00

It is a procedure built into NCSS that is used to help find the best cluster solution. As NCSS puts it: "Cophenetic Correlation is the Pearson correlation between the actual distances and the predicted distances based on this particular hierarchical configuration. A value of 0.75 or above needs to be achieved in order for the clustering to be considered useful."


