Experimental Design and Data Analysis
General information, including the course purpose, lecture times and locations, tutorials/labs, assessment, and textbooks, can also be downloaded here as a single document: the first handout given in the first lecture.
What is all this jumbled stuff?! And what does it have to do with econometrics?!
Originally I just put up a memorial tablet for people to remember and take warning from; I never meant for people to keep worshipping it.
Instructor: Scott Long
Teaching Assistant Spring 2005/2006: Jason Cummings
S650 is the second course in sociology’s graduate sequence in applied statistics. The first course, S554, deals with models in which the dependent variable is continuous. These include the linear regression model, seemingly unrelated regressions, and systems of simultaneous equations. S650 deals with regression models in which the dependent variable is limited or categorical. Such models include probit, logit, ordered logit, and Poisson regression, among others. The prerequisite for this class is a prior course in regression. To see the syllabus, click here.
Most materials other than the course notes (available at TIS or the Campus Bookstore) can be downloaded here. Files will be added throughout the semester.
If you want to install the ado files needed for this class, follow this link. You will also find sample programs and data sets at that location. While you may freely use my ado files, you must purchase Stata itself, either from the Stata Corporation or from the IU Stat/Math Center.
Enrollment: Unfortunately, there are more students who want to take S650 than there are seats in the class. First priority is given to graduate students in sociology, since this is a required course for them. Otherwise, authorizations for the class are given on a first-come, first-served basis. If you are interested in taking the class, contact the graduate secretary in sociology to get on the list. The graduate secretary (socgrad@indiana.edu) will contact you regarding authorization for the class. If you are given an authorization, you need to sign up for the class during the normal enrollment period; if you do not, your authorization will be given to the next student on the wait list.
Time conflicts: If you have another class that overlaps with the lecture time for S650, you will need to take one of the classes in another semester. If you have a time conflict with all of the lab times, you should take 650 some other semester. If you can attend some of the labs each week and you are already familiar with Stata (or can learn it on your own), you will probably do fine but might have to work harder than students who can attend lab. While most of the lab time is used for students doing independent work, the teaching assistant will give some short lectures related to the assignments. For example, he/she might provide additional information about keeping a research log or how to format tables using Word.
Lecture Notes
Practice Problems with Solutions
Statistical Inference, Spring 2002
Instructor: Peng Zeng
Office: 230C Parker Hall
Email: zengpen AT auburn DOT edu
Phone: (334) 844-3680
Office hours: 3:30-4:30 pm, Tuesday/Thursday, or by appointment
George W. Collins, II
Conventional methods for handling missing data, like listwise deletion or regression imputation, are prone to three serious problems: they make inefficient use of the available data, they can yield biased parameter estimates, and they tend to produce standard errors that understate the true uncertainty.
These newer methods for handling missing data, maximum likelihood and multiple imputation, have been around for at least a decade, but they have become practical only in the last few years with the introduction of widely available, user-friendly software. Maximum likelihood and multiple imputation have very similar statistical properties. If the assumptions are met, both are approximately unbiased and efficient; that is, they have minimum sampling variance. What's remarkable is that these newer methods rest on less demanding assumptions than those required by conventional methods for handling missing data. At present, maximum likelihood is best suited for linear models or log-linear models for contingency tables. Multiple imputation, on the other hand, can be used for virtually any statistical problem.
This course will cover the theory and practice of both maximum likelihood and multiple imputation. Maximum likelihood for linear models will be demonstrated with Amos 4, a software package designed for estimating structural equation models with latent variables. Multiple imputation will be demonstrated with two new SAS procedures, PROC MI and PROC MIANALYZE.
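The seminar itself demonstrates multiple imputation with SAS; purely as an illustration of the same impute-analyze-combine workflow, here is a minimal sketch in R using the mice package (an alternative tool, not one used in the course), with an invented data frame and model:

## Minimal sketch of the multiple-imputation workflow, using R's mice package
## rather than the SAS procedures used in the seminar; data and model are invented.
library(mice)

set.seed(3)
n  <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 1 + 0.5 * x1 - 0.3 * x2 + rnorm(n)
x1[sample(n, 40)] <- NA                      # introduce some missing values
dat <- data.frame(y, x1, x2)

imp  <- mice(dat, m = 5, printFlag = FALSE)  # step 1: create 5 completed data sets
fits <- with(imp, lm(y ~ x1 + x2))           # step 2: analyze each completed data set
summary(pool(fits))                          # step 3: combine results (Rubin's rules)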
In addition to Professor Allison's text Missing Data, participants receive a bound manual containing detailed lecture notes (with equations and graphics), examples of computer printout, and many other useful features. This book frees participants from the distracting task of note taking.
1. Assumptions for missing data methods
2. Problems with conventional methods
3. Maximum likelihood (ML)
4. ML with EM algorithm
5. Direct ML with Amos
6. ML for contingency tables
7. Multiple Imputation (MI)
8. MI under multivariate normal model
9. MI with SAS
10. MI with categorical and nonnormal data
11. Interactions and nonlinearities
12. Using auxiliary variables
13. Other parametric approaches to MI
14. Linear hypotheses and likelihood ratio tests
15. Nonparametric and partially parametric methods
16. Sequential generalized regression models
17. MI and ML for nonignorable missing data
Participants in the April 2005 seminar were asked to rate the course on a scale of 1 (worst) to 10 (best). The average score for 27 respondents was 9.2. They were also asked if they wished to make an attributed statement regarding the course. Here are all the comments that were received:
"This has been a great learning experience for me. Intensive, yet reasonably paced, it offered a balanced combination of theories of missing data adjustment and practical applications. For someone like me who has had little previous experience with missing data analysis, this is a good way to get started."
Anca Romantan, Annenberg School for Communication, University of Pennsylvania
"Wonderful course! Makes you realize what your data/analysis is 'missing'."
Faika Zanjani, University of Pennsylvania
"Dr. Allison explains things thoroughly and with enough datail that the student is able to use the material after the course. A large amount of material is carefully condensed and presented in such a way as to still be easily comprehended. The course has an amazing balance between theory and practice. The presentations are engaging."
Jim Godbold, Mount Sinai School of Medicine
"This is a great class. I would recomend it for anyone doing applied or simulation research with missing data."
Carolyn Furlow, Georgia State University
"Even for a novice researcher with no SAS experience, this course has been an invaluable review of conceptual and practical issues related to missing data. Clear, cogent and thorough."
Angela Duckworth, Positive Psychology Center, University of Pennsylvania
"This course is very helpful and Dr. Allison explains complicated contents very easily."
Sunhee Park, University of Pennsylvania School of Nursing
"Theoretically informed, but a very practical 'how-to-do' approach to very common problems. Readily applicable to 'real-world' situations."
Daniel K. Cooper, Harris Interactive
"Missing data is becoming a big issue in all industries, from telecommunications to bank/financial services. Professor Allison taught us how to tackle this problem with the most up-to-date methodologies (both theoretical and practical approaches)."
Shakuntala Choudhury, Senior Marketing Statistician
Categorical Data Analysis
http://www.stat.ufl.edu/~presnell/Courses/sta4504-2000sp/
Course Information
Instructor
The instructor for this section is Brett Presnell. His office hours and other contact information are given on Presnell's home page.
Syllabus
Here is the syllabus for the course (in PDF format).
Handouts
Lecture Notes: copies of the transparencies used in class (chapters 1, 2, and 4 were done on the blackboard). Provided in three formats, 1, 2, and 4 slides to a page, for those who wish to conserve paper (pdf files).
Chapter 3 slides. (2 to a page version) (4 to a page version).
Chapter 5 slides (2 to a page version) (4 to a page version).
Chapter 6 slides (2 to a page version) (4 to a page version).
Chapter 8 slides (2 to a page version) (4 to a page version).
Downloading and using data from the General Social Survey.
SAS
Most of the computations for this class will be demonstrated using SAS. SAS is available on the PCs in the CIRCA labs (such as CSE 211). The CIRCA "SAS for Windows" handout will get you started (hard copies are also available from CIRCA). You can also get SAS for your home PC through the new Student Home-Use Program (current price is $35 for one academic year).
SAS code for examples done in class (and for some of the exercises)
SAS Manuals This is a link to nearly a full set of SAS manuals. You might specifically be interested in the entries for PROC FREQ , PROC GENMOD, PROC CATMOD, and PROC LOGISTIC. Simple "PROCS", like MEANS, SORT, and UNIVARIATE can be found in the SAS Procedures Guide, while more involved procedures are in the SAS/STAT User's Guide.
R and Rweb
Many (all?) of the computations for this class can be done using "R", a free, open-source implementation of the "S" statistical programming language. You can install R on your own PC or use the web-based version, Rweb. Whenever time permits, I will make available R programs (scripts) for the various examples done in class and in the text.
The R page for this course: everything you need to know about R (yeah, right).
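As a small, unofficial illustration of the kind of computation involved (the counts below are invented, not from the course), here is an R sketch of a chi-squared test for a 2x2 table and the corresponding logistic regression:

## Invented 2x2 table: exposure (rows) by outcome (columns)
tab <- matrix(c(30, 70, 15, 85), nrow = 2, byrow = TRUE,
              dimnames = list(exposure = c("yes", "no"),
                              outcome  = c("case", "control")))
chisq.test(tab)                                     # test of independence
(tab[1, 1] * tab[2, 2]) / (tab[1, 2] * tab[2, 1])   # sample odds ratio

## The same comparison as a logistic regression on grouped binomial data
dat <- data.frame(exposure = c("yes", "no"),
                  case = tab[, "case"], control = tab[, "control"])
fit <- glm(cbind(case, control) ~ exposure, family = binomial, data = dat)
summary(fit)
exp(coef(fit))                                      # baseline odds and odds ratio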
Other Things
Some data sources:
General Social Survey (15 March 1999 release).
SDA: Survey Documentation and Analysis: Click on SDA Archive to see some of the available survey data sources. The General Social Survey is also available here. The Multi-Investigator Survey might yield some interesting information (how do things like the order or wording of questions affect responses?).
An Example of Misinterpreted Odds Ratios
The Effect of Race and Sex on Physicians' Recommendations for Cardiac Catheterization
Misunderstandings about the Effects of Race and Sex on Physicians' Referrals for Cardiac Catheterization
Race, Sex, and Physicians' Referrals for Cardiac Catheterization
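The gist of that exchange is how easily an odds ratio gets read as a ratio of probabilities. A quick numerical sketch in R (with illustrative rates chosen here, not figures taken from the articles):

p1 <- 0.85   # hypothetical referral probability, group 1
p2 <- 0.91   # hypothetical referral probability, group 2
(p1 / (1 - p1)) / (p2 / (1 - p2))   # odds ratio, about 0.56
p1 / p2                             # ratio of probabilities, about 0.93
## Describing the first number as "group 1 is over 40% less likely to be referred"
## would badly overstate a gap of about 6 percentage points.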
g(x) = pi N(x, theta_1) + (1 - pi) N(x, theta_2)
The model has five parameters: pi, the two means, and the two variances.
Suppose we want to do MLE. The log likelihood is
l(theta | X) = SUM_i log [ pi N(x_i, theta_1) + (1 - pi) N(x_i, theta_2) ]
Maximizing this is difficult because of the + inside the logarithm.
Also, the actual maximum of this likelihood gives parameter estimates that are unwanted: set one component's mean equal to a single observation and let its variance shrink toward zero. The density of that component at that observation grows without bound, so the total likelihood is unbounded as well.
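As a quick numerical check (simulated data, not from the notes): the R function below evaluates this mixture log-likelihood, and shrinking one component's standard deviation toward zero while its mean sits exactly on a data point drives the value upward without bound.

set.seed(1)
x <- c(rnorm(50, 0, 1), rnorm(50, 4, 1))        # simulated two-group data

mix_loglik <- function(x, p, mu1, sd1, mu2, sd2) {
  sum(log(p * dnorm(x, mu1, sd1) + (1 - p) * dnorm(x, mu2, sd2)))
}

mix_loglik(x, 0.5, 0, 1, 4, 1)                  # a sensible set of parameter values

## Component 1 centered exactly on x[1]: its term grows like -log(sd1) as sd1 -> 0,
## while every other term stays bounded because component 2 still covers those points.
sapply(c(1e-2, 1e-50, 1e-100, 1e-200),
       function(s) mix_loglik(x, 0.5, x[1], s, 4, 1))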
EM is a general iterative procedure for finding a local optimum for this type of hard MLE problem.
Example: waiting for the AP&M elevator, where the waiting time has an exponential distribution, which has the Markov property of history-independence: Pr(T > t+r | T>t) = Pr(T > r)
What is the expected time to wait, called mu?
Data: wait 7 min, success
wait 12 min, give up (censored data)
wait 8 min, success
wait 5 min, give up
Guess mu = 8. Fill in the missing data: by memorylessness, a wait censored at time t has expected total length t + mu, so the completed values are 7, 12 + 8 = 20, 8, and 5 + 8 = 13. The new estimate of mu is the mean of these, 48/4 = 12. Repeat.
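A minimal R sketch of this iteration (the numbers are the ones from the example; the code itself is not part of the original notes):

t_obs <- c(7, 12, 8, 5)                # observed times, in minutes
event <- c(TRUE, FALSE, TRUE, FALSE)   # TRUE = elevator arrived, FALSE = gave up (censored)

mu <- 8                                # initial guess
for (iter in 1:100) {
  ## E step: by memorylessness, a wait censored at time t has expected total length t + mu
  filled <- ifelse(event, t_obs, t_obs + mu)
  ## M step: the complete-data MLE of an exponential mean is the sample mean
  mu <- mean(filled)
}
mu
## Converges to 16, which matches the closed-form censored-data MLE:
## total observed time / number of completed waits = 32 / 2 = 16.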
E step: compute the expected value of the missing information (or, more generally, a probability distribution over it), given the observed data and the current parameter estimates.
M step: compute the MLE of the model parameters, treating the expected values filled in for the missing data as if they had been observed.
Theorem: Under certain conditions, this process converges to parameter values theta at which the likelihood is locally maximal.
The other situation where EM helps is when maximizing the original incomplete likelihood l(theta | X) directly is too difficult, and latent variables are introduced to simplify the problem. Consider the general mixture-modeling scenario where we have components numbered i = 1 to i = M.
Suppose we have observed data X generated by a pdf with parameter theta. To estimate theta, we want to maximize the log-likelihood l(theta; X) = log p(X; theta).
Suppose there is additional data called Z also generated with parameter theta. Names for the Z data include latent, missing, and hidden.
Let the "complete" data be T = (X,Z) with log-likelihood l_0(theta; X, Z).
In the Gaussian mixture case, z_i can be a 0/1 variable that reveals whether x_i was generated by theta_1 or theta_2.
Since p(Z, X | theta') = p(Z | X, theta') p(X | theta'), we have p(X | theta') = p(Z, X | theta') / p(Z | X, theta').
Changing to log-likelihoods, l(theta'; X) = l_0(theta'; Z, X) - l_1(theta'; Z | X), where l_1 is the log-likelihood based on the conditional distribution of Z given X.
Now take the expectation of each side over Z, where Z follows its conditional distribution given X and a current parameter value theta (different from theta'). On the left we have
E[ l(theta'; X) ] = E[ log p(X; theta') ] = log p(X; theta'),
since the left-hand side does not involve Z. On the right we have
E[ l_0(theta'; Z, X) ] - E[ l_1(theta'; Z | X) ], with Z ~ p(Z | X, theta).
Call this expression Q(theta', theta) - R(theta', theta).
We have a lemma that says we can increase this expression just by increasing Q(theta', theta).
Lemma: If Q(theta', theta) > Q(theta,theta) then Q(theta', theta) - R(theta', theta) > Q(theta, theta) - R(theta, theta).
In words, this lemma says that if the expected complete log-likelihood is increased, then the incomplete log-likelihood is also increased.
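The lemma is stated without proof in the notes; a sketch (added here) follows from Jensen's inequality, which shows that the R term can never work against an increase in Q:

\[
R(\theta',\theta) - R(\theta,\theta)
  = E\!\left[\log \frac{p(Z \mid X, \theta')}{p(Z \mid X, \theta)}\right]
  \le \log E\!\left[\frac{p(Z \mid X, \theta')}{p(Z \mid X, \theta)}\right]
  = \log \int p(z \mid X, \theta')\, dz = \log 1 = 0,
\]

where the expectation is over Z ~ p(Z | X, theta). Hence R(theta', theta) <= R(theta, theta) for every theta', so whenever Q(theta', theta) > Q(theta, theta), the difference Q - R increases as well.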
Based on the E step, the M step is to find the value of theta' that maximizes Q(theta', thetaj), where thetaj denotes the current parameter estimate at iteration j.
What is Q(theta', thetaj)? It is E [ l_0(theta'; T) ] where the expectation averages over alternative values for the missing data Z, whose distribution depends on the known data X and the parameter thetaj.
Often, a major simplification is possible in the E step. The simplification is to bring the expectation inside the l_0 function, so it applies just to Z:
E[ l_0(theta'; X, Z) ] = l_0(theta'; X, E[Z])
where the distribution of Z is a function of thetaj and also of X.
Often we can simplify this by finding a single special z value such that the integral over Z is the same as evaluating the integrand for this special z value.
The obvious choice for the special z value is the expectation of Z. We want to use z = E[ Z | X, thetaj ] as an imputed value for Z, instead of averaging over all possible values of Z. In this case Q(theta', thetaj) = log p(X, E[ Z | X, thetaj ] | theta').
In general, the expectation of a function equals the function of the expectation if and only if the function is linear. Fortunately, many complete-data log-likelihoods are linear in the missing data Z.
A further simplification: if a missing variable Yi is binary-valued (0/1), then E[ Yi ] = P(Yi = 1).
In the M step we have a function of theta to maximize. Usually we do this by computing the derivative and solving for where it equals zero. Sometimes the derivative is a sum of terms, each of which involves only a subset of the parameters in theta; then we can solve separately for each subset by setting the corresponding term to zero.
Now return to the two-component mixture. Ignoring the terms involving pi, the complete-data log-likelihood is
l_0(theta'; X, Z) = SUM_i [ (1 - Zi) log f(xi, theta'1) + Zi log f(xi, theta'2) ].
We want to take the average value of this, for Z following a distribution that depends on X and thetaj. Note that log f(xi, theta'1) and log f(xi, theta'2) do not depend on Z, so we take them outside the expectation and get
SUM_i [ (1 - E[Zi]) log f(xi, theta'1) + E[Zi] log f(xi, theta'2) ].
For fixed X and thetaj, we can compute E[Zi] using Bayes' rule:
E[Zi] = P(Zi = 1 | X, thetaj) = P(Zi = 1 | xi, thetaj)
      = P(Zi = 1 and xi | thetaj) / [ P(Zi = 1 and xi | thetaj) + P(Zi = 0 and xi | thetaj) ]
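Putting the E step (Bayes' rule for E[Zi]) and the M step (separate weighted MLEs for each parameter) together, here is a minimal R sketch of EM for the two-component Gaussian mixture; the simulated data and the starting values are ad hoc additions, not part of the original notes:

set.seed(2)
x <- c(rnorm(100, 0, 1), rnorm(100, 4, 1.5))   # simulated data from two components

## starting values (Zi = 1 indicates component 2, as in the derivation above)
p <- 0.5; mu1 <- min(x); mu2 <- max(x); sd1 <- sd(x); sd2 <- sd(x)

for (iter in 1:200) {
  ## E step: E[Zi] = P(Zi = 1 | xi, current parameters), by Bayes' rule
  d1 <- (1 - p) * dnorm(x, mu1, sd1)   # proportional to P(Zi = 0 and xi)
  d2 <- p * dnorm(x, mu2, sd2)         # proportional to P(Zi = 1 and xi)
  z  <- d2 / (d1 + d2)

  ## M step: each parameter is solved for separately, using weighted sufficient statistics
  p   <- mean(z)
  mu1 <- sum((1 - z) * x) / sum(1 - z)
  mu2 <- sum(z * x) / sum(z)
  sd1 <- sqrt(sum((1 - z) * (x - mu1)^2) / sum(1 - z))
  sd2 <- sqrt(sum(z * (x - mu2)^2) / sum(z))
}
c(p = p, mu1 = mu1, mu2 = mu2, sd1 = sd1, sd2 = sd2)

## Observed-data log-likelihood at the fitted values (it can only increase across iterations)
sum(log((1 - p) * dnorm(x, mu1, sd1) + p * dnorm(x, mu2, sd2)))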
Copyright © 1998-2005 Mike Brookes, Imperial College, London, UK. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled "GNU Free Documentation License".
To cite this manual use: Brookes, M., "The Matrix Reference Manual", [online] http://www.ee.ic.ac.uk/hp/staff/dmb/matrix/intro.html, 2005
The Matrix Reference Manual
Quantitative Research Methods: Multivariate, Spring 2004
OP, this is too expensive and I can't afford it. Can't you lower the price a bit?
If nobody can afford it, you still won't get any money, and without that money you can't buy more of the good stuff on the forum!
Alternatively, you can download the *.raw file and the *-readin.do file and run the *-readin.do file in your version of Stata. This is useful if you are running an earlier version of Stata.
Please let me know if you have trouble reading the files or if you would prefer to have the data in another format.
To use the *.ado files, put them in your current directory, in your Stata "ado" directory, or in another directory where Stata knows to look for them.
These are *not* thoroughly tested functions! Please let me know of any bugs that you find in these functions.
You might also examine the functions longplot and linkplot from that same website.