How to do ridge regression - SPSS Forum

hooyan

2007-10-14

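How do I run a ridge regression? Using the attached example, I would like to obtain the regression coefficient for each parameter along with the related statistics. If possible, please also explain the steps. Thanks.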

Attachment: 164005.rar (2.07 KB), containing 求岭回归.xls

All replies

hanszhu

2007-10-16 08:08:00

For SAS:

/* THIS RUN DEMONSTRATES ONE APPROACH TO RIDGE REGRESSION.
   THIS RUN MAY EASILY BE EXPANDED TO PLOT THE RIDGE TRACES OF
   ALL THE B-VALUES AS WELL AS B'B; */

/* SIMULATE 50 OBSERVATIONS WITH FOUR HIGHLY CORRELATED PREDICTORS */
DATA;
START=51735717;
DO N=1 TO 50;
     U=RANUNI(START)*5;
     X1=U+RANNOR(START)*.5;
     X2=U+RANNOR(START)*.5;
     X3=U+RANNOR(START)*.5;
     X4=U+RANNOR(START)*.5;
     Y=1+X1+X2+X3+X4+RANNOR(START);
     KEEP X1-X4 Y;
     OUTPUT;
END;
RUN;

PROC IML;
RESET AUTONAME ;
START MAIN;
*------------------ RIDGE REGRESSION ------------------------*;
 N= J({1});
 USE _LAST_ ;
 READ ALL INTO XY ;
 J= NCOL(XY)-{1};
 N= NROW(XY);
 IJ={1}:J;
 /* CENTER EVERY COLUMN, THEN FORM THE CORRELATION MATRIX R */
 XY=XY- J(N,{1})*( J({1},N)*XY* RECIP(N));
 C=XY`*XY;
 S= DIAG( RECIP( SQRT( VECDIAG(C))));
 R=S*C*S;
 PRINT " CORRELATION MATRIX";
 PRINT R;
 SX= S[IJ,IJ];
 SY= RECIP( S[J+{1},J+{1}]);
 RX= R[IJ,IJ];
 RY= R[IJ,J+{1}];
 SKIP 2;
 *-------------- OBTAIN O L S ESTIMATES -------------*;
 CALL EIGEN( M, E, RX);
 GRX=E* DIAG( RECIP( FUZZ(M)))*E`;
 B_OLS=GRX*RY;
 SSE={1}-RY`*GRX*RY;
 MSE=SSE* RECIP(N-J-{1});
 PRINT B_OLS;
 /* TB_OLS: OLS COEFFICIENTS BACK ON THE ORIGINAL SCALE */
 TB_OLS=SX*B_OLS*SY;
 PRINT " OLS ESTIMATES";
 PRINT TB_OLS;
 Q= SSQ(B_OLS)-MSE* TRACE(GRX);
 PRINT Q;
 IF ( Q<={0}) THEN DO;
 PRINT 'Q<=0, K NOT DETERMINED';
 STOP;
 END;
 SKIP 2;
 *---- SOLVE FOR K SUCH THAT SSQ(BK)=Q, BY NEWTONS METHOD ----;
 K={0.5};
 L=E`*RY;
 IT={0};
LOOP: KJ=K* J(J,{1});
 IT=IT+{1};
 IF ( IT>{25}) THEN GOTO GOTK;
 F= SSQ(L# RECIP(M+KJ))-Q;
 IF ( ABS(F)<{1E-6}) THEN GOTO GOTK;
 RMK=(M+KJ)#(M+KJ)#(M+KJ);
 DF={2}* SUM(L#L# RECIP(RMK));
 CF=F* RECIP(DF);
 K=K+CF;
 GOTO LOOP;
GOTK: BK=E* DIAG( RECIP(M+KJ))*E`*RY;
   BKB= SSQ(BK);
   PRINT K, BK, BKB, IT;
   TBK=SX*BK*SY;
   PRINT " RIDGE ESTIMATES";
   PRINT TBK;
*--------- PLOT THE RIDGE TRACE ---------*;
 /* OK COLLECTS ONE ROW PER K: B(K)'B(K), K, AND THE TARGET Q */
RT: OK= SSQ(B_OLS)||{ 0};
K={0};
LL: K=K+{.1};
 KJ=K* J(J,{1});
 SBK= SSQ(E* DIAG( RECIP(M+KJ))*E`*RY);
 OK=OK//(SBK||K);
 IF ( K<{2}) THEN GOTO LL;
 OK=OK|| J( NROW(OK),{1},Q);
 _TMP_ROW = 'ROW1 ' : compress('ROW'+char(nrow(OK)));
 CREATE KK ( RENAME=(_TMP_ROW=ROW )) FROM OK [ROWNAME=_TMP_ROW ];
 APPEND FROM OK [ROWNAME=_TMP_ROW];
FINISH MAIN;
RUN MAIN;
QUIT;

OPTIONS NOOVP;
PROC PLOT;
PLOT COL2*COL1='*' COL2*COL3='-' / OVERLAY;
TITLE 'RIDGE TRACE'; LABEL COL1='B(K)''B(K)' COL2='K VALUE';
RUN;
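
For readers without SAS, the Newton iteration in the LOOP/GOTK part of the program above can be sketched in plain Python. This is only an illustration under stated assumptions: the function name solve_k and the toy numbers are made up, m stands for the eigenvalues of the predictor correlation matrix (M in the SAS code), l for the projections E`*RY (L in the SAS code), and q for the target squared length of the coefficient vector (Q in the SAS code).

```python
def solve_k(m, l, q, k=0.5, tol=1e-10, max_iter=50):
    """Find k with sum((l_i / (m_i + k))**2) == q by Newton's method.

    m: eigenvalues of the predictor correlation matrix (assumed positive)
    l: projections of the predictor-response correlations on the eigenvectors
    q: target squared length of the ridge coefficient vector
    Returns the final k and the number of iterations used.
    """
    for iteration in range(1, max_iter + 1):
        f = sum((li / (mi + k)) ** 2 for mi, li in zip(m, l)) - q
        if abs(f) < tol:
            return k, iteration
        # The derivative of f with respect to k is -df,
        # so the Newton update k - f/f' becomes k + f/df.
        df = 2.0 * sum(li * li / (mi + k) ** 3 for mi, li in zip(m, l))
        k += f / df
    return k, max_iter


# Toy inputs (hypothetical): q is built from a known k so the answer can be checked.
m = [2.0, 0.5]
l = [1.0, 0.5]
q = sum((li / (mi + 0.3)) ** 2 for mi, li in zip(m, l))
k_hat, n_iter = solve_k(m, l, q)
print(round(k_hat, 6), n_iter)
```

Starting from k = 0.5, as the SAS program does, the iteration should recover a k very close to the 0.3 used to build q.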

hanszhu

2007-10-16 08:36:00

Ridge Regression in SPSS

SPSS does a lot through its GUI, but not everything. Many procedures can only be run through syntax files, which goes back to the days when SPSS was command driven. Syntax is also useful when you want to repeat a command often. In addition, some people write small programs to do specific things in SPSS, and many of these can be downloaded from http://www.spsstools.net/. The people at SPSS evidently felt ridge regression was useful enough to ship the syntax for it with the main program, which suggests that it may be in the GUI soon.

To run a ridge regression, create a new syntax file, and type:

INCLUDE 'Ridge regression.sps'.

(The full stop is important!!! SPSS is usually not case sensitive.)

Click the little triangle or go to RUN, and either run the whole syntax file or highlight just the part you want to run. If this does not work, it is most likely because SPSS did not find the file. If so, use the file search in Windows (or the equivalent on a Mac) to find a file called "Ridge Regression.sps". It should be somewhere in the SPSS folder. Then write out the full path. On my computer, for example, I would write:

INCLUDE 'C:\Program Files\SPSS\Ridge regression.sps'.

The ridge regression procedure is now loaded into active memory. To run a ridge regression, stay in the syntax window. Suppose you are trying to predict an exam score (EXAM) from marks on 5 assignments (ASSIGN1, ASSIGN2, ...). Write:

RIDGEREG DEP=exam /ENTER = assign1 assign2 assign3 assign4 assign5/.

(Again, the full stop is important.)

Highlight this and run it. This gives you the two graphs that help you decide how much shrinkage to apply. Shrinkage is measured by K. Once you have chosen a good K (for example, where the coefficients have leveled out; pretend it is .5 here), type:

RIDGEREG DEP=exam /ENTER = assign1 assign2 assign3 assign4 assign5/k=.5.

This will produce the coefficients for this value of K. There are more advanced ways to choose K; see Hastie et al. (2001).
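
To make the role of K concrete, here is a minimal sketch in plain Python (not SPSS) of what a ridge trace tabulates: the standardized coefficients b(K) that solve (R + K I) b = r, where R is the correlation matrix of the predictors and r holds the correlations between each predictor and the response. The helper names solve and ridge_coefs and the two-predictor correlations are hypothetical, and this is not claimed to be how the SPSS macro works internally.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    aug = [list(map(float, a[i])) + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda idx: abs(aug[idx][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for j in range(col, n + 1):
                aug[row][j] -= factor * aug[col][j]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def ridge_coefs(corr_xx, corr_xy, k):
    """Standardized ridge coefficients: solve (R + k I) b = r."""
    p = len(corr_xy)
    shrunk = [[corr_xx[i][j] + (k if i == j else 0.0) for j in range(p)]
              for i in range(p)]
    return solve(shrunk, corr_xy)


# Hypothetical example: two predictors correlated at 0.9.
R = [[1.0, 0.9], [0.9, 1.0]]
r = [0.8, 0.7]
for k in (0.0, 0.1, 0.25, 0.5):
    b = ridge_coefs(R, r, k)
    print(k, [round(v, 4) for v in b])
```

With these made-up correlations the second coefficient is negative at K = 0 and turns positive as K grows, which is the kind of settling-down one looks for when reading the trace.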

NOTE1: It is worth playing around with SPSS syntax. Many people find it better than the windows interface. It can certainly be quicker.

NOTE2: If you have lots of predictor variables, the procedure will not be able to print all of your output if the output width is left at the default found on most university computers. To change it, go to EDIT in the menus, then Options, and select the Viewer tab at the top. Your default text output page width is probably 80. Choose the Wide option, or a custom width, so that the output is printed wide enough.

It is worth playing around with the various options just to see what SPSS can do.

hanszhu

2007-10-16 08:47:00

/* SAS Example of Ridge Regression */
options nocenter pagesize=80 linesize=120;

data farmprod;
   infile '/users/n/newman/public_html/Classes/S510/DATA/farmprod.dat' firstobs=2;
   input year y x1 x2 x3 x4 x5 x6 x7 x8;

/*----- fitting raw data w/ OLS ------------------------------------ */
/* NOTE: SAS scales "automatically" for calculation of eigenvalues
   so that X'X has 1's on the diagonal */
proc reg;
   model y = x1 x2 x3 x4 x5 x6 x7 x8/collin VIF;
run;

/* --- preparation for plotting ridge trace ---------------------- */
title 'Ridge Trace of Farm Production Data';
symbol1 value=x color=black;
symbol2 v=circle c=red;
symbol3 v=square c=green;
symbol4 v=triangle c=blue;
symbol5 v=plus c=orange;
symbol6 v=6 c=purple;
symbol7 v=7 c=brown;
symbol8 v=8 c=magenta;
/* note valid colors include the above and cyan,gray,pink,white,yellow */
legend2 position=(top right inside) across=3 cborder=black offset=(0,0)
        label=(color=blue position=(top center) 'Predictors');

/*----- fitting ridge regression -------------------------------- */
proc reg graphics outest=temp ridge=0 to 0.02 by 0.005 outvif;
   model y = x1 x2 x3 x4 x5 x6 x7 x8/collin VIF;
   plot/ridgeplot nomodel legend=legend2 nostat vref=0 lvref=1 cvref=blue;
run;

/* --- looking at coefficients for different shrinkage par. values */
proc print data=temp;
run;

hanszhu

2007-10-16 08:48:00

For S-Plus

ridge.lm <- function(y,Xmat,k.vec,plot.it=T,crossvalid=F,original.scale=T,mycex=0.7) {
  #if crossvalid=T, exact PRESS statistic calculated
  #if original.scale=T, the ridge trace will plot coefficients based on unctr'd, unscaled Xmat
  #NOTE: the [i] subscripts below were missing from the forum post (apparently
  #      eaten by the forum markup) and are restored so that each k gets its own result
  n <- length(y)
  xmeans <- apply(Xmat,2,mean)
  xscale <- sqrt(apply(Xmat,2,var)*(n-1))
  X.std <- cbind(1,scale(Xmat)/sqrt(n-1))
  p <- dim(X.std)[[2]]
  if(sum(k.vec==0)==0) k.vec <- c(0,k.vec)   #add 0 shrinkage if not included
  r <- length(k.vec)
  betas <- matrix(NA,r,p)
  VIF <- matrix(NA,r,p-1)
  C.k <- rep(NA,r)
  df.k <- rep(NA,r)
  PR.ridge.k <- rep(NA,r)
  GCV.k <- rep(NA,r)
  real.cv <- NA
  if(crossvalid) real.cv <- rep(NA,r)
  #--OLS model
  ols <- lm(y ~ X.std-1)
  sigma.2 <- (summary(ols)$sigma)^2
  #---looping over the different shrinkage parameters
  #---don't shrink the intercept since centered data
  for(i in 1:r) {
    cat("iter=",i,"\n")
    k <- c(0,rep(k.vec[i],p-1))
    X.k <- t(X.std) %*% X.std + k * diag(p)
    betas[i,] <- solve(qr(X.k), t(X.std) %*% cbind(y))
    errs <- y-X.std%*%cbind(betas[i,])
    SSres.k <- sum(errs^2)
    temp <- solve(t(X.k))
    VIF[i,] <- diag(temp%*%t(X.std)%*%X.std%*%temp)[-1]
    H.k.ii <- diag(X.std[,-1]%*%solve(t(X.std[,-1])%*%X.std[,-1]+k[-1]*diag(p-1))%*%t(X.std[,-1]))
    C.k[i] <- SSres.k/sigma.2-n+2+2*sum(H.k.ii)
    df.k[i] <- sum(H.k.ii)
    PR.ridge.k[i] <- sum((errs/(1-1/n-H.k.ii))^2)
    GCV.k[i] <- SSres.k/(n-1-df.k[i])^2
    if(crossvalid) {
      #---exact leave-one-out PRESS: refit without observation j, predict it
      pred.err <- 0
      for(j in 1:n) {
        X <- Xmat[-j,]
        X1.std <- cbind(1,scale(X)/sqrt(n-2))
        X.k <- t(X1.std) %*% X1.std + k * diag(p)
        b <- solve(qr(X.k), t(X1.std) %*% cbind(y[-j]))
        xm <- apply(X,2,mean)
        xs <- sqrt(apply(X,2,var)*(n-2))
        omit.x <- c(1,(Xmat[j,]-xm)/xs)
        errs <- y[j]-sum(omit.x*b)
        pred.err <- pred.err + errs^2
      }
      real.cv[i] <- pred.err
    }
  }
  #---backtransforming betas to original location and scale of Xmat
  beta.orig <- t(t(betas[,2:p])/xscale)
  intercept <- betas[,1]-apply(t(t(beta.orig)*xmeans),1,sum)
  beta.orig <- cbind(intercept,beta.orig)
  #---ridge trace plot
  if(plot.it) {
    orig.par <- par()
    par(mfrow=c(3,2),oma=c(0,0,3,0))
    if(original.scale) {
      betas.ts <- as.ts(beta.orig[,-1])
    } else {
      betas.ts <- as.ts(betas[,-1])
    }
    ts.plot(betas.ts,xlab="k",ylab="Betas",main="Ridge trace",
            axes=F,type='b',lty=1:(p-1),pch=1:(p-1),cex=mycex)
    axis(side=2)
    # axis(side=1,at=1:r,labels=as.character(k.vec)) ???doesn't work??
    mylab <- as.character(k.vec)
    axis(side=1,at=1:r,labels=mylab,cex=mycex)
    box()
    legend(round(r*.75),max(as.vector(betas.ts)),legend=paste("x",as.character(1:(p-1))),
           lty=1:(p-1),marks=1:(p-1),cex=mycex)
    plot(k.vec,C.k,type='l',xlab="k",ylab="C_k",main="C_k vs k",cex=mycex)
    plot(k.vec,PR.ridge.k,type='l',xlab="k",lty=1,
         ylab="PR(Ridge)_k",main="Pseudo-Press vs k",cex=mycex)
    if(crossvalid) {
      lines(k.vec,real.cv,type='l',lty=2)
      legend(k.vec[1],max(real.cv,PR.ridge.k),legend=c("PRESS","PR(Ridge)"),lty=2:1,cex=mycex)
    }
    plot(k.vec,GCV.k,type='l',xlab="k",ylab="GCV(k)",main="GCV(k) vs k",cex=mycex)
    plot(k.vec,df.k,type='l',xlab="k",ylab="df_k",main="df vs k",cex=mycex)
    par(mgp=orig.par$mgp)
  }
  dimnames(VIF) <- list(as.character(k.vec),paste("beta",as.character(1:(p-1))))
  dimnames(beta.orig) <- list(as.character(k.vec),paste("beta",as.character(0:(p-1))))
  #S-Plus returns these as a named list; R does not allow a multi-argument return()
  return(betas,beta.orig,VIF,C.k,df.k,PR.ridge.k,real.cv,GCV.k)
}

#---farm data
# temp <- read.table("Data/farmprod.dat",header=T,row.names="year")
# k.vec <- seq(0,0.02,by=0.01)
# out <- ridge.lm(temp[,1],temp[,-1],k.vec=k.vec,plot.it=T,crossvalid=T)

#---hospital data
temp <- as.matrix(read.table("Data/hospital.dat",header=T,row.names=NULL))
k.vec <- seq(0,0.25,by=0.01)
out <- ridge.lm(temp[,6],temp[,-6],k.vec=k.vec,plot.it=T,original.scale=F,crossvalid=T)
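
On the correlation scale, the VIF and df.k calculations in the function above come down to two short formulas: VIF(k) = diag((R + kI)^-1 R (R + kI)^-1) and df(k) = trace(R (R + kI)^-1), where R is the correlation matrix of the predictors. The following is a rough plain-Python sketch of just those two quantities, not a port of the S-Plus function; the helper names invert, matmul and ridge_vif_and_df and the example matrix are assumptions made for illustration.

```python
def invert(a):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    # Augmented matrix [a | I]; the right half becomes the inverse.
    aug = [list(map(float, a[i])) + [1.0 if i == j else 0.0 for j in range(n)]
           for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda idx: abs(aug[idx][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for row in range(n):
            if row != col:
                factor = aug[row][col]
                aug[row] = [v - factor * w for v, w in zip(aug[row], aug[col])]
    return [row[n:] for row in aug]


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def ridge_vif_and_df(corr_xx, k):
    """Ridge VIFs and effective degrees of freedom for shrinkage parameter k."""
    p = len(corr_xx)
    shrunk_inv = invert([[corr_xx[i][j] + (k if i == j else 0.0) for j in range(p)]
                         for i in range(p)])
    sandwich = matmul(matmul(shrunk_inv, corr_xx), shrunk_inv)
    hat_like = matmul(corr_xx, shrunk_inv)
    vif = [sandwich[i][i] for i in range(p)]
    df = sum(hat_like[i][i] for i in range(p))
    return vif, df


# Hypothetical example: two predictors correlated at 0.9.
R = [[1.0, 0.9], [0.9, 1.0]]
for k in (0.0, 0.1, 0.5):
    vif, df = ridge_vif_and_df(R, k)
    print(k, [round(v, 4) for v in vif], round(df, 4))
```

At k = 0 this gives the ordinary VIFs, 1/(1 - 0.81) for both predictors here, and both the VIFs and the effective degrees of freedom fall as k increases.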

#6

大笨蛋

2007-12-22 16:04:00

What a jumble this all is.
Even I can't follow it.

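#7

斜阳残雪

2007-12-22 22:20:00

I only know how to do it in SPSS:
Open a new syntax window and enter:
include '<your SPSS directory>\ridge regression.sps'.
ridgereg dep=y/enter x1 x2 x3.
Here y is the dependent variable and x1, x2, x3 are the independent variables; adapt these to your own data.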

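#8

unicornliu

2007-12-22 23:25:00

Then how do you judge how far the regression at the chosen k has removed the multicollinearity?
How do you obtain the VIF or DW values?
Thanks in advance for any guidance~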

#9

hanszhu

2007-12-23 01:47:00

http://www.ats.ucla.edu/stat/sas/examples/alsm/alsmsasch10.htm

#10

阳光下的星星

2008-12-20 18:47:00

How do you do this in R? A worked example would be ideal. Thanks!

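#11

doctor1985

2009-12-4 13:20:07

Ridge regression is available under Regression > Optimal Scaling > Regularization, so there is no need to go to all that trouble. It is already fairly complete in SPSS 18.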
