For SAS:
THIS RUN DEMONSTRATES ONE APPROACH TO RIDGE REGRESSION.
THIS RUN MAY EASILY BE EXPANDED TO PLOT THE RIDGE TRACES OF
ALL THE B-VALUES AS WELL AS B'B;
Ridge Regression in SPSS
SPSS does lots of things with its GUI interface, but not everything. There are many procedures that can only be used through syntax files. Thus goes back to the old days when SPSS was command driven. The syntax is also useful when you want to repeat a command often. Further, some people write little mini-programs to do specific things in SPSS. Many of these can be downloaded from http://www.spsstools.net/. The people at SPSS felt ridge regression was so useful that they they include the syntax with the main program. This suggests that it will be in the GUI soon.
To run a ridge regression, create a new syntax file, and type:
INCLUDE 'Ridge regression.sps'.
(The full stop is important!!! SPSS is usually not case sensitive.)
Click the little triangle or go to RUN and either run the whole syntax file or just highlight the part you want to run. If this does not work it is likely because SPSS did not find the file. If so, go to SEARCH FILES in Windows (or whatever it is for a MAC) and find a file called "Ridge Regression.sps". It should be somewhere in the SPSS folder. Write the whole address out. So, for my computer I would write:
INCLUDE 'C:\Program Files\SPSS\Ridge regression.sps'.
Now the program has in its active memory the ridge regression procedure. To run a ridge regression stay in the syntax. Suppose you are trying to predict exam score (EXAM) from marks on 5 assignments (ASSIGN1, ASSIGN2, ...). Write
RIDGEREG DEP=exam /ENTER = assign1 assign2 assign3 assign4 assign5/.
(Again, full stop important).
Highlight this and run. This will give you the two graphs to help you decide how much shrinkage to have. Shrinkage is measured by K. If you choose a good K (like where the coefficients have leveled out, pretend it is .5 here), then type
RIDGEREG DEP=exam /ENTER = assign1 assign2 assign3 assign4 assign5/k=.5.
This will produce the coefficients for this value K. There are more advanced ways to choose k. See Hastie et al. (2001).
NOTE1: It is worth playing around with SPSS syntax. Many people find it better than the windows interface. It can certainly be quicker.
NOTE2: If you have lots of predictor variables, the procedure will not be able to print out all of your output if the screen width is set to the default of most university computers. To change the default, go to EDIT in the menus, then options. You get a screen like this:

Tick the viewer thingee at the top and you get a screen like this:

Your default for text output page size width is probably 80. Just tick the Wide option or the custom one so that it prints wide enough for your output.
It is worth playing around with the various options just to see what SPSS can do.
/* SAS Example of Ridge Regression */ options nocenter pagesize=80 linesize=120; data farmprod; infile '/users/n/newman/public_html/Classes/S510/DATA/farmprod.dat' firstobs=2; input year y x1 x2 x3 x4 x5 x6 x7 x8; /*----- fitting raw data w/ OLS ------------------------------------ */ /* NOTE: SAS scales "automatically" for calculation of eigenvalues so that X'X has 1's on the diagonal */ proc reg; model y = x1 x2 x3 x4 x5 x6 x7 x8/collin VIF; run; /* --- preparation for plotting ridge trace ---------------------- */ title 'Ridge Trace of Farm Production Data'; symbol1 value=x color=black; symbol2 v=circle c=red; symbol3 v=square c=green; symbol4 v=triangle c=blue; symbol5 v=plus c=orange; symbol6 v=6 c=purple; symbol7 v=7 c=brown; symbol8 v=8 c=magenta; /* note valid colors include the above and cyan,gray,pink,white,yellow */ legend2 position=(top right inside) across=3 cborder=black offset=(0,0) label=(color=blue position=(top center) 'Predictors'); /*----- fitting ridge regression -------------------------------- */ proc reg graphics outest=temp ridge=0 to 0.02 by 0.005 outvif; model y = x1 x2 x3 x4 x5 x6 x7 x8/collin VIF; plot/ridgeplot nomodel legend=legend2 nostat vref=0 lvref=1 cvref=blue; run; /* --- looking at coefficients for different shrinkage par. values */ proc print data=temp; run;
For S-Plus
ridge.lm <- function(y,Xmat,k.vec,plot.it=T,crossvalid=F,original.scale=T,mycex=0.7) { #if crossvalid=T, exact PRESS statistic calculated #if original.scale=T, the ridge trace will plot coefficients based on unctr'd, unscaled Xmat n <- length(y) xmeans <- apply(Xmat,2,mean) xscale <- sqrt(apply(Xmat,2,var)*(n-1)) X.std <- cbind(1,scale(Xmat)/sqrt(n-1)) p <- dim(X.std)[[2]] if(sum(k.vec==0)==0) k.vec <- c(0,k.vec) #add 0 shrinkage if not included r <- length(k.vec) betas <- matrix(NA,r,p) VIF <- matrix(NA,r,p-1) C.k <- rep(NA,r) df.k <- rep(NA,r) PR.ridge.k <- rep(NA,r) GCV.k <- rep(NA,r) real.cv <- NA if(crossvalid) real.cv <- rep(NA,r) #--OLS model ols <- lm(y ~ X.std-1) sigma.2 <- (summary(ols)$sigma)^2 #---looping over the different shrinkage parameters #---don't shrink the intercept since centered data for(i in 1:r) { cat("iter=",i,"\n") k <- c(0,rep(k.vec,p-1)) X.k <- t(X.std) %*% X.std + k * diag(p) betas[i,] <- solve(qr(X.k), t(X.std) %*% cbind(y)) errs <- y-X.std%*%cbind(betas[i,]) SSres.k <- sum(errs^2) temp <- solve(t(X.k)) VIF[i,] <- diag(temp%*%t(X.std)%*%X.std%*%temp)[-1] H.k.ii <- diag(X.std[,-1]%*%solve(t(X.std[,-1])%*%X.std[,-1]+k[-1]*diag(p-1))%*%t(X.std[,-1])) C.k <- SSres.k/sigma.2-n+2+2*sum(H.k.ii) df.k <- sum(H.k.ii) PR.ridge.k <- sum((errs/(1-1/n-H.k.ii))^2) GCV.k <- SSres.k/(n-1-df.k)^2 if(crossvalid) { pred.err <- 0 for(j in 1:n) { X <- Xmat[-j,] X1.std <- cbind(1,scale(X)/sqrt(n-2)) X.k <- t(X1.std) %*% X1.std + k * diag(p) b <- solve(qr(X.k), t(X1.std) %*% cbind(y[-j])) xm <- apply(X,2,mean) xs <- sqrt(apply(X,2,var)*(n-2)) omit.x <- c(1,(Xmat[j,]-xm)/xs) errs <- y[j]-sum(omit.x*b) pred.err <- pred.err + errs^2 } real.cv <- pred.err } } #---backtransforming betas to original location and scale of Xmat beta.orig <- t(t(betas[,2:p])/xscale) intercept <- betas[,1]-apply(t(t(beta.orig)*xmeans),1,sum) beta.orig <- cbind(intercept,beta.orig) #---ridge trace plot if(plot.it) { orig.par <- par() par(mfrow=c(3,2),oma=c(0,0,3,0)) if(original.scale) { betas.ts <- as.ts(beta.orig[,-1]) } else { betas.ts <- as.ts(betas[,-1]) } ts.plot(betas.ts,xlab="k",ylab="Betas",main="Ridge trace", axes=F,type='b',lty=1:(p-1),pch=1:(p-1),cex=mycex) axis(side=2) # axis(side=1,at=1:r,labels=as.character(k.vec)) ???doesn't work?? mylab <- as.character(k.vec) axis(side=1,at=1:r,labels=mylab,cex=mycex) box() legend(round(r*.75),max(as.vector(betas.ts)),legend=paste("x",as.character(1:(p-1))), lty=1:(p-1),marks=1:(p-1),cex=mycex) plot(k.vec,C.k,type='l',xlab="k",ylab="C_k",main="C_k vs k",cex=mycex) plot(k.vec,PR.ridge.k,type='l',xlab="k",lty=1, ylab="PR(Ridge)_k",main="Pseudo-Press vs k",cex=mycex) if(crossvalid) { lines(k.vec,real.cv,type='l',lty=2) legend(k.vec[1],max(real.cv,PR.ridge.k),legend=c("PRESS","PR(Ridge)"),lty=2:1,cex=mycex) } plot(k.vec,GCV.k,type='l',xlab="k",ylab="GCV(k)",main="GCV(k) vs k",cex=mycex) plot(k.vec,df.k,type='l',xlab="k",ylab="df_k",main="df vs k",cex=mycex) par(mgp=orig.par$mgp) } dimnames(VIF) <- list(as.character(k.vec),paste("beta",as.character(1:(p-1)))) dimnames(beta.orig) <- list(as.character(k.vec),paste("beta",as.character(0:(p-1)))) return(betas,beta.orig,VIF,C.k,df.k,PR.ridge.k,real.cv,GCV.k) } #---farm data # temp <- read.table("Data/farmprod.dat",header=T,row.names="year") # # k.vec <- seq(0,0.02,by=0.01) # # out <- ridge.lm(temp[,1],temp[,-1],k.vec=k.vec,plot.it=T,crossvalid=T) # #---hospital data temp <- as.matrix(read.table("Data/hospital.dat",header=T,row.names=NULL)) k.vec <- seq(0,0.25,by=0.01) out <- ridge.lm(temp[,6],temp[,-6],k.vec=k.vec,plot.it=T,original.scale=F,crossvalid=T) 我只会SPSS的做法:
新建一个语句窗口,然后输入:
include'<你的SPSS目录>\ridge regression.sps'.
ridgereg dep=y/enter x1 x2 x3.
其中y是因变量,x1x2x3是自变量,形式根据自己需要更改
扫码加好友,拉您进群



收藏
