钟威 Wei Zhong (William), Assistant Professor
The Wang Yanan Institute for Studies in Economics (WISE); Department of Statistics, School of Economics (SOE), Xiamen University
Education
Ph.D. in Statistics, Department of Statistics, Pennsylvania State University, State College, PA, 2012
Advisor: Professor Runze Li
B.S. in Statistics, School of Mathematical Sciences, Beijing Normal University (BNU), Beijing, China, 2008
Research Interests
> High/Ultra-high dimensional data analysis: large p small n problems
> Econometrics and Financial Econometrics
> Nonparametric and semiparametric models
> Large covariance matrix estimation
> Applications of statistics in business analytics, information science, finance, etc.
A3:
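This is a very good question, and it has also been one of the most active research directions in statistics over the past 20-plus years: high dimensional data analysis, penalized regression and regularization methods. Specifically, when the number of variables is larger than the sample size, ordinary least squares can no longer be used. For regression analysis you can use regression with a penalty term, for example LASSO (Tibshirani, 1996) or SCAD (Fan and Li, 2001). In recent years there has been a lot of research on ultra-high dimensional data, where the number of variables is far larger than the sample size. In that setting you can first apply an independence screening method; see Fan and Lv (2008), Li, Zhong and Zhu (2012), etc. Finally, I suggest you read this review article: Fan, J., Lv, J., and Qi, L. (2011)
Sparse high-dimensional models in economics.
Annual Review of Economics, 3, 291-317
Link: http://orfe.princeton.edu/~jqfan/publications-general.html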
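To make the two-step idea in this answer concrete, here is a minimal sketch in Python with NumPy: screen variables by marginal correlation first, then fit a LASSO on the survivors. It is an illustration under stated assumptions, not the exact procedure of any of the cited papers. The function names (`soft_threshold`, `lasso_cd`, `marginal_screen`), the simulated data, the screening size `d=20` and the penalty `lam=0.3` are all hypothetical choices made for this example.

```python
import numpy as np


def soft_threshold(z, t):
    """Soft-thresholding operator used in the LASSO coordinate update."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate descent for (1/(2n)) * ||y - X b||^2 + lam * ||b||_1."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    resid = y.astype(float)  # copy of y; equals y - X @ beta since beta starts at 0
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the partial residual (column j added back).
            rho = X[:, j] @ resid / n + col_sq[j] * beta[j]
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            resid += X[:, j] * (beta[j] - new_bj)  # keep the residual in sync
            beta[j] = new_bj
    return beta


def marginal_screen(X, y, d):
    """Return indices of the d columns with the largest |marginal correlation| with y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    corr = np.abs(Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc))
    return np.argsort(corr)[::-1][:d]


# Simulated "large p, small n" data (assumed setup): only the first 3 variables matter.
rng = np.random.default_rng(0)
n, p = 50, 500
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:3] = [3.0, -2.0, 1.5]
y = X @ beta_true + 0.5 * rng.standard_normal(n)

kept = marginal_screen(X, y, d=20)           # step 1: screen 500 variables down to 20
beta_hat = lasso_cd(X[:, kept], y, lam=0.3)  # step 2: penalized regression on the rest
selected = sorted(kept[np.flatnonzero(beta_hat)].tolist())
print("variables kept after screening + LASSO:", selected)
```

In practice the penalty level would be chosen by something like cross-validation rather than fixed by hand, and SCAD would need a different thresholding rule than the soft-thresholding shown here.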
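Q4: Forum member swei007:
I would like to ask: for handling multidimensional data, besides principal components, factor analysis, and neural networks, what other dimension reduction methods are commonly used?
A4: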
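For high dimensional data analysis, there are currently two main classes of methods. The first is dimension reduction, which works somewhat like PCA (principal component analysis): it finds linear combinations of the variables and uses each combination as a new variable. For this, see the papers by leading statisticians such as Dennis Cook, Bing Li, and Xiangrong Yin. The second, which many more people work on, is variable selection. You can start with this review article: Fan, J., Lv, J., and Qi, L. (2011)
Sparse high-dimensional models in economics.
Annual Review of Economics, 3, 291-317
Link: http://orfe.princeton.edu/~jqfan/publications-general.html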
Next, I recommend reading The Elements of Statistical Learning:
Data Mining, Inference, and Prediction.
Second Edition by Trevor Hastie, Robert Tibshirani, Jerome Friedman
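
As a small illustration of the "linear combinations of variables" idea behind dimension reduction, here is a sketch of plain PCA computed through the singular value decomposition in NumPy. It only shows the PCA baseline mentioned in the answer, not the methods of Cook, Li, or Yin. The function name `pca` and the simulated two-factor data are assumptions made for this example.

```python
import numpy as np


def pca(X, k):
    """Return scores, loadings and explained-variance shares of the first k components."""
    Xc = X - X.mean(axis=0)                      # center each variable
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    loadings = Vt[:k].T                          # p x k: weights of each linear combination
    scores = Xc @ loadings                       # n x k: the new, lower dimensional variables
    explained = s[:k] ** 2 / np.sum(s ** 2)      # share of total variance per component
    return scores, loadings, explained


# Simulated data (assumed setup): 10 observed variables driven by 2 hidden factors.
rng = np.random.default_rng(1)
n, p = 100, 10
factors = rng.standard_normal((n, 2))
mixing = rng.standard_normal((2, p))
X = factors @ mixing + 0.1 * rng.standard_normal((n, p))

scores, loadings, explained = pca(X, k=2)
print("shape of reduced data:", scores.shape)
print("variance share of each component:", explained.round(3))
print("nonzero weights in first component:", np.count_nonzero(loadings[:, 0]))
```

Each component generally puts a nonzero weight on every original variable, which is the main practical contrast with variable selection: selection keeps a subset of the original variables, so the result stays directly interpretable.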