Linear Regression Models for Panel Data Using SAS, STATA, LIMDEP, and SPSS

This attachment includes:

Lillo & Mantegna - Variety And Volatility In Financial Markets(pdf).pdf

[This post was edited by the author on 2006-4-24 1:28:23]

2006-4-23 22:54:00

[Download] Marcus Pivato.Analysis,Measure and Probability.A Visual Introduction.2003.pd

Online Mathematics Materials

In my spare time I've been developing some online educational materials for mathematics. Some of these may eventually become part of the Felynx Cougati library of multimedia mathematics materials, an ambitious project which I am peripherally involved in. Others were developed for courses I was lecturing, or as personal projects.

- Classification of Cellular Automata: lecture notes I prepared for an informal seminar on cellular automata that I delivered at the University of Houston in spring of 2002. [Gzipped PostScript Version] [Adobe PDF Version]
- Lecture notes in Partial Differential Equations (250 pages; working draft): I originally wrote these for Math 3363, a course I taught at the University of Houston in 2001-2002. I am now using them to teach Math 305 at Trent University. [Gzipped PostScript Version] [Adobe PDF Version]
- Visual Abstract Algebra (240 pages; working draft): an introduction to groups, rings, and fields, which I'm using to teach Math 330 here at Trent.
- Analysis, Measure & Probability: A Visual Introduction (163 pages; working draft): my attempt to develop a new pedagogical approach to real analysis, with emphasis on geometric intuition and probabilistic interpretations. [Postscript] [PDF]
- Voting, Arbitration, and Fair Division: the mathematics of social choice (133 pages). Contents: survey of voting procedures and their shortcomings. Sen's Impossibility Theorem, Arrow's Impossibility Theorem, and the Gibbard-Satterthwaite Impossibility Theorem. Binary voting systems; May's theorem. Weighted and vector-weighted systems. Bentham's utilitarianism; von Neumann-Morgenstern definition of cardinal utility. Bargaining games and the Nash Arbitration Scheme. Fair division theory. [Gzipped Postscript] [Adobe PDF]
- Models of Philosophy (121 pages): a short monograph on how mathematical methods could be profitably employed to address certain problems in philosophy of mind, philosophy of language, philosophy of science, and political philosophy.
- Lecture notes in Linear Algebra: I wrote these for Math 223, a course I taught at the University of Toronto in spring of 2000.
- Ergodic Theory, Stochastic Processes, and Information Theory (30 pages): slides from a two-hour seminar I delivered in 1999 at the University of Toronto, intended as an introduction for nonspecialists. It is (I hope) accessible to any mathematically literate person. [Gzipped PostScript Version] [Adobe PDF Version]

[This post was edited by the author on 2006-4-24 3:42:23]

Attachment list

49691.pdf

Size: 1.31 MB

[Recommended] Statistics Ebooks

2006-4-24 12:37:00

2006-4-24 21:31:00

[Recommended] Selected Discussion Topic

[This post was edited by the author on 2006-4-24 21:52:50]

2006-4-25 07:22:00

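[Recommended]

Practical Statistical Data Analysis and SPSS 12.0 Applications (《实用数据统计分析及SPSS 12.0应用》)
Publisher: Posts & Telecom Press (人民邮电出版社)
Authors: 求是科技 / 章文波 / 陈红艳

[This post was edited by the author on 2006-4-25 7:34:52]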

2006-4-25 07:23:00

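[Recommended]

SPSS 12 Statistical Modeling and Applied Practice (《SPSS12统计建模与应用实务》)
Publisher: China Railway Publishing House (中国铁道出版社)
Authors: 林杰斌 / 林川雄 / 刘明德 / 飞捷工作室
Listed: 2006-04-03. Published: February 2006

[This post was edited by the author on 2006-4-25 7:35:23]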

2006-4-25 07:23:00

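Social Statistical Analysis: An SPSS Application Tutorial (《社会统计分析——SPSS应用教程》)
Publisher: Tsinghua University Press (清华大学出版社)
Authors: 卢湘鸿 / 周爽 / 朱志洪 / 朱星萍
Listed: 2006-03-29. Published: March 2006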

2006-4-25 07:24:00

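Chinese edition: 《统计学习基础—数据挖掘、推理与预测》 [recommended by 华储网]
The Elements of Statistical Learning: Data Mining, Inference, and Prediction
Publisher: Publishing House of Electronics Industry (电子工业出版社)
Authors: Trevor Hastie / Robert Tibshirani / Jerome Friedman (USA); translated by 范明, 柴玉梅, 昝红英, et al.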

2006-4-25 07:24:00

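Chinese edition: 《统计学习理论》 [recommended by 华储网]
Statistical Learning Theory
Publisher: Publishing House of Electronics Industry (电子工业出版社)
Authors: Vladimir N. Vapnik (USA) / 许建华 / 张学工
Listed: 2004-07-17. Published: June 2004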

2006-4-25 22:00:00

Linear Regression Models for Panel Data Using SAS, STATA, LIMDEP, and SPSS

Hun Myoung Park

Winter 2005

Table of Contents

This document summarizes linear regression models for panel data and illustrates how to estimate each model using SAS 9.1, STATA 9.0, LIMDEP 8.0, and SPSS 13.0. It focuses on linear regression models and does not address nonlinear models (e.g., logit and probit models).

[This post was edited by the author on 2006-4-25 22:52:25]

SASCHEN

2006-4-26 04:24:00

[Download] Resampling Methods: Concepts, Applications, and Justification

50042.pdf
Size: 171.43 KB

Introduction

In recent years many emerging statistical analytical tools, such as exploratory data analysis (EDA), data visualization, robust procedures, and resampling methods, have been gaining attention among psychological and educational researchers. However, many researchers tend to embrace traditional statistical methods rather than experimenting with these new techniques, even though the data structure does not meet certain parametric assumptions. Three factors contribute to this conservative practice. First, newer methods are generally not included in statistics courses, and as a result, the concepts of these newer methods seem obscure to many people. Second, in the past most software developers devoted efforts to program statistical packages for conventional data analysis. Even if researchers are aware of these new techniques, the limited software availability hinders them from implementing them. Last, even with awareness of these concepts and access to software, some researchers hesitate to apply "marginal" procedures. Traditional procedures are perceived as founded on solid theoretical justification and empirical substantiation, while newer techniques face harsh criticisms and seem to be lacking theoretical support.
This article concentrates on one of the newer techniques, namely resampling, and attempts to address the above issues. First, concepts of different types of resampling will be introduced with simple examples. Next, software applications for resampling are illustrated. Contrary to popular belief, many resampling tools are available in standard statistical applications such as SAS and SYSTAT. Resampling can also be performed in spreadsheet programs such as Excel. Last but not least, arguments for and against resampling are discussed. I propose that there should be more than one way to construe probabilistic inferences and that counterfactual reasoning is a viable means to justify the use of resampling as an inferential tool.
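
As a rough illustration of one resampling technique (the bootstrap), here is a minimal Python sketch. It is not taken from the article; the function name `bootstrap_ci`, the sample values, and the choice of a percentile interval for the mean are all assumptions made for illustration.

```python
import random
from statistics import mean


def bootstrap_ci(data, n_boot=2000, alpha=0.05, rng=None):
    """Percentile bootstrap confidence interval for the mean (illustrative)."""
    rng = rng or random.Random()
    n = len(data)
    # Resample the data with replacement n_boot times and record each mean.
    boot_means = sorted(mean(rng.choices(data, k=n)) for _ in range(n_boot))
    # Read the interval endpoints off the sorted bootstrap means.
    lo = boot_means[int((alpha / 2) * n_boot)]
    hi = boot_means[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi


# Hypothetical sample; a fixed seed only makes the run repeatable.
sample = [12.1, 9.8, 11.4, 10.3, 13.7, 8.9, 10.8, 12.5]
low, high = bootstrap_ci(sample, rng=random.Random(42))
```

The interval is built from the empirical distribution of resampled means, so no normality assumption is needed, which is the property the article emphasizes.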

[This post was edited by the author on 2006-4-26 4:28:17]

2006-4-26 12:30:00

Multiple Imputation for Missing Data

Overview

SAS/STAT software, Version 8, introduces the experimental MI and MIANALYZE procedures for creating and analyzing multiply imputed data sets for incomplete multivariate data. Multiple imputation provides a useful strategy for dealing with data sets with missing values. Instead of filling in a single value for each missing value, Rubin's (1987) multiple imputation procedure replaces each missing value with a set of plausible values that represent the uncertainty about the right value to impute. These multiply imputed data sets are then analyzed by using standard procedures for complete data and combining the results from these analyses. No matter which complete-data analysis is used, the process of combining results from different imputed data sets is essentially the same. This results in statistically valid inferences that properly reflect the uncertainty due to missing values.

The MI procedure is a multiple imputation procedure that creates multiply imputed data sets for incomplete p-dimensional multivariate data. It uses methods that incorporate appropriate variability across m imputations. Once the m complete data sets are analyzed using standard SAS/STAT procedures, PROC MIANALYZE can be used to generate valid statistical inferences about these parameters by combining the results.

Introduction

Most SAS statistical procedures exclude observations with any missing variable values from an analysis. These observations are called incomplete cases. While using only complete cases is simple, you lose the information in the incomplete cases. This approach also ignores possible systematic differences between the complete cases and the incomplete cases, and the resulting inference may not be applicable to the population of all cases, especially when the number of complete cases is small.

Some SAS procedures use all the available cases in an analysis, that is, cases with available information. For example, PROC CORR estimates a variable mean by using all cases with nonmissing values on this variable, ignoring the possible missing values in other variables. PROC CORR also estimates a correlation by using all cases with nonmissing values for this pair of variables. This may make better use of the available data, but the resulting correlation matrix may not be positive definite.

Another strategy is single imputation, in which you substitute a value for each missing value. Standard statistical procedures for complete data analysis can then be used with the filled-in data set. For example, each missing value can be imputed from the variable mean of the complete cases, or it can be imputed from the mean conditional on observed values of other variables. This approach treats missing values as if they were known in the complete-data analysis. Single imputation does not reflect the uncertainty about the predictions of the unknown missing values, and the resulting estimated variances of the parameter estimates will be biased towards zero.
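
A small Python sketch may make the variance claim concrete. It uses made-up numbers (not data from the SAS documentation) and unconditional mean imputation, the simplest of the single-imputation schemes mentioned above.

```python
from statistics import mean, variance

# Hypothetical variable with two missing values (None).
values = [2.0, 4.0, 6.0, 8.0, None, None]

observed = [v for v in values if v is not None]
fill = mean(observed)  # single imputation: the mean of the complete cases

# Treat the filled-in values as if they had really been observed.
filled = [fill if v is None else v for v in values]

var_observed = variance(observed)  # sample variance of the observed values
var_filled = variance(filled)      # sample variance after mean imputation
```

With these numbers the sample variance drops from about 6.67 to 4.0: the imputed points sit exactly at the mean, add nothing to the sum of squares, and yet increase the apparent sample size. That is why variance estimates computed from the filled-in data are too small.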

Instead of filling in a single value for each missing value, a multiple imputation procedure (Rubin 1987) replaces each missing value with a set of plausible values that represent the uncertainty about the right value to impute. The multiply imputed data sets are then analyzed by using standard procedures for complete data and combining the results from these analyses. No matter which complete-data analysis is used, the process of combining the results from different data sets is essentially the same.

SAS/STAT software implements multiple imputation inference in three distinct phases:

1. Create m multiply imputed complete data sets using the MI procedure.
2. Analyze the m complete data sets by using standard procedures such as PROC REG or PROC GLM.
3. Generate valid statistical inferences about the parameters of interest by combining the results using the MIANALYZE procedure.

Figure 1. The Multiple Imputation Process Using SAS Software
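
The combining phase is usually described by Rubin's (1987) rules for a scalar parameter. The Python sketch below is an illustration of those rules only; it is not the MIANALYZE procedure, and the function name `pool_rubin` and the input numbers are hypothetical.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Combine m point estimates and their squared standard errors."""
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching lengths")
    q_bar = mean(estimates)          # pooled point estimate
    w = mean(variances)              # within-imputation variance
    b = variance(estimates)          # between-imputation variance (divisor m - 1)
    t = w + (1 + 1 / m) * b          # total variance
    # Degrees of freedom for the t reference distribution (infinite if b == 0).
    df = math.inf if b == 0 else (m - 1) * (1 + w / ((1 + 1 / m) * b)) ** 2
    return q_bar, t, df


# Hypothetical results from m = 3 imputed data sets.
q_bar, total_var, df = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

The between-imputation term is what carries the uncertainty due to the missing values; dropping it would reproduce the understated variances of single imputation.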

Imputation Mechanisms

The SAS multiple imputation procedures assume that the missing data are missing at random (MAR), that is, the probability that an observation is missing may depend on the observed values but not the missing values. These procedures also assume that the parameters θ of the data model and the parameters φ of the missing data indicators are distinct. That is, knowing the values of θ does not provide any additional information about φ, and vice versa. If both MAR and the distinctness assumptions are satisfied, the missing data mechanism is said to be ignorable.
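
To make the MAR assumption concrete, the following Python sketch simulates a data set in which the chance that y is missing depends only on the fully observed x. The model, the probabilities 0.8 and 0.1, and the function names are assumptions chosen for illustration.

```python
import random


def simulate_mar(n, rng):
    """Generate (x, y) pairs where y is missing at random given x."""
    rows = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)             # always observed
        y = 2.0 * x + rng.gauss(0.0, 1.0)   # may be deleted below
        # Missingness depends on the observed x only, never on y itself.
        p_miss = 0.8 if x > 0 else 0.1
        rows.append((x, None if rng.random() < p_miss else y))
    return rows


def missing_rate(rows, keep):
    """Share of missing y among rows whose x satisfies keep(x)."""
    selected = [y for x, y in rows if keep(x)]
    return sum(y is None for y in selected) / len(selected)


rows = simulate_mar(10_000, random.Random(1))
rate_high_x = missing_rate(rows, lambda x: x > 0)
rate_low_x = missing_rate(rows, lambda x: x <= 0)
```

Here the data are not missing completely at random, since the missing rate differs sharply between the two x groups, but they are MAR because x is observed for every row.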

The MI procedure provides three methods for imputing missing values and the method of choice depends on the type of missing data pattern. For monotone missing data patterns, either a parametric regression method that assumes multivariate normality or a nonparametric method that uses propensity scores is appropriate. For an arbitrary missing data pattern, a Markov chain Monte Carlo (MCMC) method that assumes multivariate normality can be used.

Regression Method

In the regression method, a regression model is fitted for each variable with missing values, with the previous variables as covariates. Based on the resulting model, a new regression model is then simulated and is used to impute the missing values for each variable.
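
The idea can be sketched in Python for the simplest case of one covariate. This is an illustrative reading of the paragraph above, not the PROC MI implementation: the function name, the flat-prior posterior draw, and the sample data are assumptions.

```python
import math
import random


def impute_by_regression(x, y, rng):
    """Fill missing y (None) using one simulated regression of y on x."""
    obs = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
    n = len(obs)
    if n < 3:
        raise ValueError("need at least three complete pairs")
    sx = sum(xi for xi, _ in obs)
    sy = sum(yi for _, yi in obs)
    sxx = sum(xi * xi for xi, _ in obs)
    sxy = sum(xi * yi for xi, yi in obs)
    det = n * sxx - sx * sx
    # Least-squares fit on the complete cases.
    b1 = (n * sxy - sx * sy) / det
    b0 = (sy - b1 * sx) / n
    sse = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in obs)
    # V = (X'X)^{-1} for the design matrix with columns [1, x].
    v00, v01, v11 = sxx / det, -sx / det, n / det
    # Simulate a new model: sigma*^2 = SSE / chi-square(n - 2).
    chi2 = rng.gammavariate((n - 2) / 2, 2.0)
    sigma = math.sqrt(sse / chi2)
    # beta* = beta_hat + sigma* L z, with L the lower Cholesky factor of V.
    l00 = math.sqrt(v00)
    l10 = v01 / l00
    l11 = math.sqrt(v11 - l10 * l10)
    z0, z1 = rng.gauss(0, 1), rng.gauss(0, 1)
    b0_star = b0 + sigma * l00 * z0
    b1_star = b1 + sigma * (l10 * z0 + l11 * z1)
    # Impute from the simulated model, adding residual noise.
    return [
        yi if yi is not None else b0_star + b1_star * xi + sigma * rng.gauss(0, 1)
        for xi, yi in zip(x, y)
    ]


# Hypothetical data with two missing responses.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, None, 8.2, 9.8, None]
completed = impute_by_regression(x, y, random.Random(0))
```

Drawing new coefficients and a new residual variance for each imputation, instead of reusing the fitted values, is what lets the m imputed data sets differ in a way that reflects estimation uncertainty.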

Propensity Score Method

The propensity score is the conditional probability of assignment to a particular treatment given a vector of observed covariates. In the propensity score method, a propensity score is generated for each variable with missing values to indicate the probability of the observation being missing. The observations are then grouped based on these propensity scores, and an approximate Bayesian bootstrap imputation is applied to each group.
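
The approximate Bayesian bootstrap step applied within one propensity-score group can be sketched as follows. The Python function and the numbers are hypothetical; forming the groups from estimated propensity scores is not shown.

```python
import random


def abb_impute(observed, n_missing, rng):
    """Approximate Bayesian bootstrap for one group of observations."""
    # Step 1: draw a pool of the same size as the observed values, with replacement.
    pool = rng.choices(observed, k=len(observed))
    # Step 2: draw the imputed values from that pool, again with replacement.
    return rng.choices(pool, k=n_missing)


# Hypothetical group: four observed values and three missing ones.
group_observed = [3.1, 4.7, 5.0, 6.2]
imputed = abb_impute(group_observed, 3, random.Random(7))
```

The extra resampling in step 1 is what distinguishes this from a plain hot-deck draw: it lets the donor pool itself vary from one imputation to the next.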

MCMC Method

In MCMC, one constructs a Markov chain long enough for the distribution of its elements to stabilize to a common, stationary distribution. By repeatedly simulating steps of the chain, one simulates draws from the distribution of interest.

In Bayesian inference, information about unknown parameters is expressed in the form of a posterior distribution. MCMC has been applied as a method for exploring posterior distributions in Bayesian inference. That is, through MCMC, one can simulate the entire joint distribution of the unknown quantities and obtain simulation-based estimates of posterior parameters that are of interest.

Assuming that the data are from a multivariate normal distribution, data augmentation is applied to Bayesian inference with missing data by repeating a series of imputation and posterior steps. These two steps are iterated long enough for the results to be reliable for a multiply imputed data set (Schafer 1997). The goal is to have the iterates converge to their stationary distribution and then to simulate an approximately independent draw of the missing values.
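
The alternation of imputation and posterior steps can be sketched in Python for a deliberately simplified case: a single normal variable with a noninformative prior, instead of the multivariate normal model the procedure assumes. The function name and data are hypothetical.

```python
import math
import random
from statistics import mean, variance


def data_augmentation(y, n_iter, rng):
    """Alternate I-steps and P-steps for one normal variable (n_iter >= 1)."""
    obs = [v for v in y if v is not None]
    n, n_mis = len(y), len(y) - len(obs)
    mu, sigma2 = mean(obs), variance(obs)   # starting values
    draws = []
    for _ in range(n_iter):
        # I-step: impute the missing values given the current parameters.
        filled = obs + [rng.gauss(mu, math.sqrt(sigma2)) for _ in range(n_mis)]
        # P-step: draw new parameters from their posterior given the filled data.
        ybar, s2 = mean(filled), variance(filled)
        sigma2 = (n - 1) * s2 / rng.gammavariate((n - 1) / 2, 2.0)
        mu = rng.gauss(ybar, math.sqrt(sigma2 / n))
        draws.append((mu, sigma2))
    return draws, filled


# Hypothetical data with two missing values.
y = [1.0, 2.0, None, 4.0, 5.0, None, 3.5]
draws, last_imputation = data_augmentation(y, 200, random.Random(0))
```

In practice one would discard an initial burn-in and keep imputations spaced far enough apart in the chain to be approximately independent, as the paragraph above describes.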

Release 8.2

Release 8.2 of SAS/STAT software includes the second experimental releases of the MI and MIANALYZE procedures. Additions to PROC MI include the TRANSFORM statement to transform variables before performing the imputation, autocorrelation and iteration plots, a monotone-data MCMC method to impute just enough values to achieve a monotone missing pattern for the imputed data, and the EM statement to derive the MLE and related EM results.

For More Information

For more information, refer to the paper "Multiple Imputation for Missing Data: Concepts and New Development" and the documentation on the MI and MIANALYZE procedures, which is available for downloading from the SAS/STAT Documentation section on this Community site.

References

Rubin, D. B. (1987), Multiple Imputation for Nonresponse in Surveys, New York: John Wiley & Sons, Inc.

Schafer, J. L. (1997), Analysis of Incomplete Multivariate Data, New York: Chapman and Hall.

2006-4-26 12:33:00

Analysis of incomplete multivariate data from repeated measurement experiments.

Crepeau H, Koziol J, Reid N, Yuh YS.

This paper analyses two sets of data that consist of repeated measurements with missing data. The missing observations always occur at the end of the series of repeated measurements. The score test for multivariate normal data is used to compare treatment groups; if the original data are not multivariate normal they are replaced by expected normal scores.

2006-4-26 12:35:00

Multivariate Analysis of Incomplete Mapped Data.

Stéphane Dray, Nathalie Pettorelli, and Daniel Chessel

50110.pdf
Size: 405.31 KB

[This post was edited by the author on 2006-4-26 12:40:26]

2006-4-26 12:42:00

[Download] LISREL: ANALYSIS OF MULTIVARIATE DATA WITH MISSING VALUES

LISREL: ANALYSIS OF MULTIVARIATE DATA WITH MISSING VALUES

50112.pdf
Size: 206.95 KB

[This post was edited by the author on 2006-4-26 12:44:46]

2006-4-26 12:49:00

STAT 598A: Statistical Analysis with Missing Data

Lectures: TTH 12:00PM - 1:15PM, UNIV 019

Textbook: Statistical Analysis with Missing Data, Roderick J. A. Little, Donald B. Rubin, which is available at http://www.purdueu.com/
- Chapter 1 of the book, Introduction, is available at the publisher's web site.
Syllabus
Computational Methods
Bayesian Inference
Homework Assignments:
- HW #1: Problems 1.1 and 1.6; Due Tue., Jan. 24.
- HW #2: Problems 6.2, 6.18, and 6.20; Due Tue., Feb. 14.
- HW #3: Problems 7.10 and 7.16; Due Tue., Feb. 21.
- HW #4: Problems 8.16, 10.1, 10.2, 10.3, and 10.4; Due Tue., Mar. 21.
Term Project Topics
Term Project Presentations [There will be an LCD projector available for presentations]
Software:
- The multivariate normal distribution (download and run % java -jar normal.jar or run online)
- The general location model (download and run % java -jar glom.jar or run online)
- Multivariate-t regression (download and run % java -jar t.jar or run online)
References:
- Chapter 11: Multivariate Normal Examples
  1. Mixed-Effects Models (Tue. 03/21/2006): Pinheiro, J. C., Liu, C., and Wu, Y. (2001). Efficient Algorithms for Robust Estimation in Linear Mixed-Effects Models Using the Multivariate t-Distribution, Journal of Computational and Graphical Statistics 10, 249-276.
  2. Factor Analysis (Thu. 03/23/2006): Liu, C. and Rubin, D. B. (1998). Maximum likelihood estimation of factor analysis using the ECME algorithm with complete and incomplete data, Statistica Sinica 8, 729-747.
- Chapter 12: Robust Estimation
  1. Multivariate Multiple Regression (ML) (Tue. 03/28/2006):
    Liu, C. and Rubin, D. B. (1995). ML estimation of the t distribution using EM and its extensions, ECM and ECME, Statistica Sinica 5, 19-39.
    Liu, C. (1997). ML estimation of the multivariate t distribution and the EM algorithms, J. Multi. Anal. 63, 296-312.
  2. Multivariate Multiple Regression (Bayesian) (Thu. 03/30/2006): Liu, C. (1996). Bayesian Robust Multivariate Linear Regression With Incomplete Data, 91, 1219-1227.
- Chapters 13 and 14: Mixed Normal and Non-normal Incomplete Data (Tue. 04/04-06/2006)
  1. General Location Models: Liu, C. and Rubin, D. B. (1998). Ellipsoidally Symmetric Extensions of the General Location Model for Mixed Categorical and Continuous Data, Biometrika 85, 673-688.
- Miscellanea (Thu. 04/11-13/2006)
  1. Identification of Differentially Expressed Genes: Liang, F., Liu, C., and Wang, N. (2006). A Robust Sequential Bayesian Method for Identification of Differentially Expressed Genes, to appear in Statistica Sinica.
  2. Logistic, probit, and robit Models: Liu, C. (2004). Robit regression: a simple robust alternative to logistic regression and probit regression, in Missing Data and Bayesian Methods in Practic, eds. A. Gelman and X. Meng.
  3. Multivariate Logistic, probit, and robit Models: Liu, C. (2001). Bayesian Analysis of Multivariate Probit Models: Discussion on ``The Art of Data Augmentation'' by van Dyk and Meng, Journal of Computational and Graphical Statistics 10, 75-81.
- Additional References
  1. Schafer, J.L. (1997), Analysis of Incomplete Multivariate Data, Chapman & Hall, London
  2. Ekholm and Skinner (1998), The Muscatine Children's Obesity Data Reanalysed Using Pattern Mixture Models, Applied Statistics, 251-263
  3. Wakefield, J. (2004), Ecological Inference for 2 x 2 Tables, J. R. Statist. Soc. A, 385-445
  4. Liu, C. (1998), Information Matrix Computation from Conditional Information via Normal Approximation, Biometrika, 85, 973-979.
  5. Liu, C. (1999). Efficient ML Estimation of the Multivariate Normal Distribution from Incomplete Data, J. Multi. Anal. 69, 206-217.

2006-4-26 13:01:00

BIOSTAT 2065: Analysis of Incomplete Data

Instructor: Gong Tang

Course syllabus
Lecture notes and supplements:

September 1, 2005

Notes

Assignment(Word) Answer of HW#1

September 6, 2005

Notes

September 8, 2005

Will go over some examples, no extra notes.

September 13 & 15, 2005

Notes

Past work on problem 2.13 Answer of HW#2
September 20 & 22, 2005

Notes Solution of HW#3
September 27, 2005

Notes

September 29, 2005

Notes Solution of HW#4
October 4, 2005

Notes
October 6, 2005

Notes Solution of HW#5
October 11, 2005

Notes
October 13 & 18, 2005

Notes Solution of HW#6
October 20, 2005

Notes Proof of Louis' formula

October 25, 2005

Notes Historic Midterm Exam

November 1, 2005

Notes

November 8, 2005

Midterm exam and answer
November 8, 2005

Topics for the final project Teams
November 10, 2005

Notes
November 15, 2005

Notes
November 22, 2005

An example to run OSWALD
December 1, 2005

Three papers on testing whether data are MCAR:

Testing for Random Dropouts in Repeated Measurement Data. P. Diggle, Biometrics, Volume 45, 1255-1258, 1989.

A Test of Missing Completely at Random for Multivariate Data with Missing Values. R.J.A. Little, JASA, Vol. 83, 1198-1202, 1988.

A Test of Missing Completely at Random for Generalized Estimating Equations with Missing Values. H.Chen and R.J.A. Little, Biometrika, Vol. 86, 1-13, 1999.

statax

2006-4-26 13:14:00

thanks a lot

the posters above are so kind...

2006-4-26 21:48:00

吴明隆 / SPSS Statistical Applications in Practice (SPSS统计应用实务) (free, haha)

2006-4-26 21:48:00

[Download] Greene, Econometric Analysis (English 5th edition)

[Classic] Fumio Hayashi - Econometrics

2006-4-26 21:49:00

[Download][Recommended]

[This post was edited by the author on 2006-4-26 22:40:39]

[Download] Gujarati, Basic Econometrics (English 4th edition)

2006-4-26 21:49:00

2006-4-26 21:53:00

[Recommended]

Bayesian Models for Categorical Data
Peter Congdon

Chapter 1: Principles of Bayesian Inference
Chapter 2: Model Comparison and Choice
Chapter 3: Regression for Metric Outcomes
Chapter 4: Models for Binary and Count Outcomes
Chapter 5: Further Questions in Binomial and Count Regression
Chapter 6: Random Effect and Latent Variable Models for Multicategory Outcomes
Chapter 7: Ordinal Regression
Chapter 8: Discrete Spatial Data
Chapter 9: Time Series Models for Discrete Variables
Chapter 10: Hierarchical and Panel Data Models
Chapter 11: Missing Data Models
Chapter 12: Index

The use of Bayesian methods for the analysis of data has grown substantially in areas as diverse as applied statistics, psychology, economics and medical science. Bayesian Methods for Categorical Data sets out to demystify modern Bayesian methods, making them accessible to students and researchers alike. Emphasizing the use of statistical computing and applied data analysis, this book provides a comprehensive introduction to Bayesian methods of categorical outcomes.
* Reviews recent Bayesian methodology

[This post was edited by the author on 2006-4-26 21:55:10]

2006-4-26 22:03:00

[Download][Recommended]

A good book on cointegration analysis

[This post was edited by the author on 2006-4-26 22:04:20]

J. M. Wooldridge:Ec.... Panel Data [long]

2006-4-26 22:08:00

[Download][Recommended]

J. M. Wooldridge: Econometric Analysis of Cross Section and Panel Data

[This post was edited by the author on 2006-4-26 22:41:13]
