2014-04-23
My understanding is that a sample can be adjusted based on a known population using the "Weight Cases" function in SPSS. What is the proper procedure for weighting by multiple factors? Let's say I have WEIGHT_SEX and WEIGHT_ETHNICITY. Can I compute a WEIGHT_COMPOSITE by taking their product? Are there any downsides to doing this that I should be aware of?
All replies
2014-4-23 01:49:59
       
What kind of weight do you have in your case? Is it a frequency weight (where the weight value indicates how many times the row of data is counted)? The SPSS WEIGHT command performs this kind of frequency weighting.
2014-4-23 01:50:51
Base SPSS isn't very deft at handling sampling weights. See this website for some details on weighting (including caveats on SPSS): http://www.ats.ucla.edu/stat/spss/faq/weights.htm

Weight Cases in SPSS treats each line as representing a certain number of observed cases. For example, if you assign a weight of 100 to a particular line (case), that line is treated as 100 replicated observations of the information in it. This means that your sampling inferences will be based on a sample size that is too large, and hence the calculations that follow will overstate the precision. As an example:

You might have 100 observations that you think are representative of 10,000 people in the population. The sampling properties of your statistics are driven by the 100 observations in your sample. Using weight cases would make SPSS think that your sample actually consisted of 10,000 cases, and thus statistical inference (in terms of standard errors, confidence intervals, hypothesis tests) would be wildly over-optimistic.
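To make the size of the effect concrete, here is a minimal Python sketch (not SPSS syntax) that computes the standard error of a mean twice: once from 100 simulated observations, and once treating a weight of 100 per row as a replication count, which is how a frequency weight behaves. The data, the seed, and the helper name `standard_error` are made up for illustration.

```python
import math
import random

def standard_error(values, freq_weights=None):
    """Standard error of the mean, treating weights as replication counts."""
    if freq_weights is None:
        freq_weights = [1] * len(values)
    n = sum(freq_weights)  # apparent sample size after weighting
    mean = sum(w * x for x, w in zip(values, freq_weights)) / n
    var = sum(w * (x - mean) ** 2 for x, w in zip(values, freq_weights)) / (n - 1)
    return math.sqrt(var / n)

random.seed(0)
sample = [random.gauss(50, 10) for _ in range(100)]  # hypothetical data

se_unweighted = standard_error(sample)             # based on n = 100
se_weighted = standard_error(sample, [100] * 100)  # pretends n = 10,000
ratio = se_unweighted / se_weighted

print(f"SE with n=100:     {se_unweighted:.4f}")
print(f"SE with 'n'=10000: {se_weighted:.4f}")
print(f"ratio:             {ratio:.2f}")
```

The weighted standard error comes out roughly ten times smaller, even though no new information was collected; that is the over-optimism described above.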
2014-4-23 01:52:36
The proper procedure would be to first create a new categorical variable which will be the intersection of sex & ethnicity (i.e., it will have categories white-male, white-female, africanamerican-male, africanamerican-female, etc.).

Then one has to identify the weight for each of the categories (i.e., subpopulations/strata) identified by the new variable.

This may or may not be the same as the products of the weight variables (most likely not).
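A small Python sketch can show why the product may fail. The population shares and sample counts below are invented, the ethnicity labels "A" and "B" are placeholders, and each weight is assumed to be the population share divided by the sample share.

```python
# Hypothetical population shares for each sex-by-ethnicity cell (assumed values).
population_share = {
    ("male", "A"): 0.30, ("male", "B"): 0.20,
    ("female", "A"): 0.20, ("female", "B"): 0.30,
}
# Hypothetical sample counts for the same cells (assumed values).
sample_count = {
    ("male", "A"): 40, ("male", "B"): 10,
    ("female", "A"): 10, ("female", "B"): 40,
}
n = sum(sample_count.values())

def marginal(table, axis):
    """Sum a cell table over the other factor, keyed by the level on `axis`."""
    totals = {}
    for cell, value in table.items():
        totals[cell[axis]] = totals.get(cell[axis], 0.0) + value
    return totals

def marginal_weights(axis):
    """Weights for one factor alone: population share / sample share."""
    pop = marginal(population_share, axis)
    samp = marginal(sample_count, axis)
    return {level: pop[level] / (samp[level] / n) for level in pop}

# Weight per intersection cell: population share / sample share.
cell_weight = {c: population_share[c] / (sample_count[c] / n) for c in sample_count}

# Composite weight built as the product of the two single-factor weights.
sex_weight = marginal_weights(0)
eth_weight = marginal_weights(1)
product_weight = {c: sex_weight[c[0]] * eth_weight[c[1]] for c in sample_count}

for cell in sample_count:
    print(cell, round(cell_weight[cell], 3), round(product_weight[cell], 3))
```

In this made-up example the sample already matches the population on sex and on ethnicity separately, so every product weight is 1, while the cell weights are 0.75 and 2. The product only recovers the cell weights when the two factors are distributed independently in both the sample and the population.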

@James Stanley makes an important point. The issue is not necessarily with SPSS, but rather with how the weighting is used. A way to deal with that issue is to "re-base the weight variable to the sample size". That is, assign weights such that the weighted total sample size is equal to (or very close to) the unweighted sample size. This can be achieved by computing the weight for each category as the population proportion for the category divided by the sample proportion for the category. That is, suppose there are k subpopulations with n_i observations from the i-th one, adding to a total sample size of n. Suppose you know that the population proportion for category i is p_i. Then the weight for that category is w_i = p_i / (n_i / n). In computation, the total weighted sample size under this approach will only be approximately equal to n because of rounding issues.
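The re-basing formula can be checked with a few lines of Python. This is only a sketch: the category labels, counts, and population proportions are hypothetical, and `rebased_weights` is a name chosen here for illustration.

```python
def rebased_weights(sample_counts, population_shares):
    """Return w_i = p_i / (n_i / n) for every category i."""
    n = sum(sample_counts.values())
    return {k: population_shares[k] / (sample_counts[k] / n) for k in sample_counts}

# Hypothetical sample counts n_i and known population proportions p_i.
sample_counts = {"a": 50, "b": 30, "c": 20}
population_shares = {"a": 0.4, "b": 0.4, "c": 0.2}

weights = rebased_weights(sample_counts, population_shares)

# Weighted total with full-precision weights: equals n up to floating-point error.
weighted_total = sum(weights[k] * sample_counts[k] for k in sample_counts)

# Weighted total after rounding the weights to two decimals, as one might
# when typing them into a syntax file: no longer exactly n.
rounded_total = sum(round(weights[k], 2) * sample_counts[k] for k in sample_counts)

print(weights)
print(weighted_total, rounded_total)
```

With these numbers the full-precision weights reproduce the sample size of 100, while weights rounded to two decimals give about 99.9, which is the rounding effect mentioned above.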

Survey sample weighting is a complicated matter. There are different types of weights, and each type has to be handled differently. For example, SPSS may not be able to handle replicate weights.
2014-4-23 01:53:24
If you need a multidimensional correction for representativeness, you might want to use the SPSSINC RAKE extension command. It computes weights that match specified marginals in up to 10 dimensions. You can get it from the SPSS Community website at www.ibm.com/developerworks/spssdevcentral. It requires the Python Essentials, which are also available via that site, and the Advanced Statistics module. The latter is needed because RAKE uses GENLOG to fit a loglinear model as part of the process.

If you have a complex samples design, though, you should use the procedures in the Complex Samples option.
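For readers who want to see what matching marginals means, below is a rough Python sketch of raking by iterative proportional fitting in two dimensions. It is not the SPSSINC RAKE implementation (which, as noted, fits a loglinear model with GENLOG); it is a different, simpler technique aimed at the same goal. The function names, sample counts, and target proportions are all hypothetical.

```python
def weighted_margin(counts, weights, axis):
    """Weighted count per level of one factor (axis 0 or 1 of the cell key)."""
    totals = {}
    for cell, count in counts.items():
        totals[cell[axis]] = totals.get(cell[axis], 0.0) + weights[cell] * count
    return totals

def rake(counts, targets_by_axis, iterations=50):
    """Adjust cell weights until weighted marginals match the target proportions.

    counts: {(level0, level1): sample count}
    targets_by_axis: [targets for axis 0, targets for axis 1], each a dict
                     mapping level -> population proportion.
    """
    n = sum(counts.values())
    weights = {cell: 1.0 for cell in counts}
    for _ in range(iterations):
        for axis, targets in enumerate(targets_by_axis):
            totals = weighted_margin(counts, weights, axis)
            for cell in counts:
                level = cell[axis]
                # Scale so this level's weighted count equals its target count.
                weights[cell] *= targets[level] * n / totals[level]
    return weights

# Hypothetical sample counts by sex and ethnicity, and assumed population marginals.
counts = {("m", "A"): 40, ("m", "B"): 20, ("f", "A"): 10, ("f", "B"): 30}
sex_targets = {"m": 0.5, "f": 0.5}
eth_targets = {"A": 0.6, "B": 0.4}

weights = rake(counts, [sex_targets, eth_targets])
n = sum(counts.values())
sex_margin = {k: v / n for k, v in weighted_margin(counts, weights, 0).items()}
eth_margin = {k: v / n for k, v in weighted_margin(counts, weights, 1).items()}
print(sex_margin, eth_margin)
```

Only the marginal distributions are needed as input here, not the joint distribution of the cells, which is what makes raking useful when population figures are available for each factor separately. The weights are re-based, so the weighted total stays at the sample size.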
2014-4-23 01:54:37

You can use an SPSS macro for multivariate weighting. Download the macro from this page: SPSS multivariate weighting

You only have to define the weighting parameters (the number of parameters is unlimited) and the name of the weighting variable. Everything else is calculated automatically.
