全部版块 我的主页
论坛 数据科学与人工智能 数据分析与数据科学 python论坛
1064 1
2018-06-07
    data = pd.read_csv("tree_addhealth.csv")
   
    #upper-case all colu`enter code here`mn names in DataFrame
    data.columns = map(str.upper, data.columns)
   
    data_clean = data.dropna()
   
    cluster=data_clean[['ALCEVR1','MAREVER1','ALCPROBS1','DEVIANT1','VIOL1',
    'DEP1','ESTEEM1','SCHCONN1','PARACTV', 'PARPRES','FAMCONCT']]
   
    clustervar=cluster.copy()
    clustervar['ALCEVR1']=preprocessing.scale(clustervar['ALCEVR1'].astype('float64'))
    clustervar['ALCPROBS1']=preprocessing.scale(clustervar['ALCPROBS1'].astype('float64'))
    clustervar['MAREVER1']=preprocessing.scale(clustervar['MAREVER1'].astype('float64'))
     clustervar['DEP1']=preprocessing.scale(clustervar['DEP1'].astype('float64'))
    clustervar['ESTEEM1']=preprocessing.scale(clustervar['ESTEEM1'].astype('float64'))
    clustervar['VIOL1']=preprocessing.scale(clustervar['VIOL1'].astype('float64'))
    clustervar['DEVIANT1']=preprocessing.scale(clustervar['DEVIANT1'].astype('float64'))
    clustervar['FAMCONCT']=preprocessing.scale(clustervar['FAMCONCT'].astype('float64'))
    clustervar['SCHCONN1']=preprocessing.scale(clustervar['SCHCONN1'].astype('float64'))
    clustervar['PARACTV']=preprocessing.scale(clustervar['PARACTV'].astype('float64'))
    clustervar['PARPRES']=preprocessing.scale(clustervar['PARPRES'].astype('float64'))
   
    clus_train, clus_test = train_test_split(clustervar, test_size=.3, random_state=123)
   
    model3=KMeans(n_clusters=3)
    model3.fit(clus_train)
    clusassign=model3.predict(clus_train)
    #clusassign.shape=(clus_train.shape[0],1)
   
    y=Series(clusassign)
    x=clus_train
    target_names=[0,1,2]
    colors = ['navy', 'turquoise', 'darkorange']
   
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis as LDA
    LDA_2 = LDA(n_components=2)
    plot_columns = LDA_2.fit(x,y).transform(x)
    for color, target_name in zip(colors, target_names):
    plt.scatter(plot_columns[y == target_name, 0], plot_columns[y   ==target_name, 1], alpha=.8, color=color,
                    label=target_name)
    plt.legend(loc='best', shadow=False, scatterpoints=1)
    plt.title('LDA of dataset')
附件列表

Row Data.docx

大小:126.88 KB

 马上下载

参考数据

二维码

扫码加我 拉你入群

请注明:姓名-公司-职位

以便审核进群资格,未注明则拒绝

全部回复
2018-6-23 17:17:29
代码请添加注释详细说明, 否则很难阅读!
二维码

扫码加我 拉你入群

请注明:姓名-公司-职位

以便审核进群资格,未注明则拒绝

栏目导航
热门文章
推荐文章

说点什么

分享

扫码加好友,拉您进群
各岗位、行业、专业交流群