Gains from proportionate stratified sampling

2012-02-15

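   Anyone who replies with material related to this entry will be rewarded with 10 to 100 forum coins, depending on the quality of the material. You are also welcome to share your own views, upload files, and so on. Please avoid copying material that is easy to find with a search engine.
   I post a varying number of entries every day, so I sincerely invite people from all statistical backgrounds to comment.
   Replies are very welcome! Quantity and quality both matter.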

This is probably just an application of sampling theory.
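
For reference, the gain named in the title can be written down with a standard textbook decomposition. This is a sketch under stated assumptions: sampling with replacement (or a negligible sampling fraction), population variances defined with divisor N, and proportional allocation n_h = n W_h. The notation (W_h = N_h/N for the stratum share, \bar{Y}_h and \sigma_h^2 for the stratum mean and variance) is introduced here and does not come from the thread.

```latex
\begin{aligned}
\sigma^2 &= \sum_h W_h \sigma_h^2 + \sum_h W_h (\bar{Y}_h - \bar{Y})^2, \\
V_{\mathrm{srs}}(\bar{y}) &= \frac{\sigma^2}{n}, \qquad
V_{\mathrm{prop}}(\bar{y}_{st}) = \sum_h \frac{W_h^2 \sigma_h^2}{n W_h}
  = \frac{1}{n} \sum_h W_h \sigma_h^2, \\
V_{\mathrm{srs}}(\bar{y}) - V_{\mathrm{prop}}(\bar{y}_{st})
  &= \frac{1}{n} \sum_h W_h (\bar{Y}_h - \bar{Y})^2 \;\ge\; 0.
\end{aligned}
```

Under these assumptions the gain is the between-stratum part of the variance divided by n, so it is large when the stratum means differ a lot and zero when they are all equal.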

All replies

区域经济爱好者

2012-10-25 09:12:08

Title

gsample -- Sampling

Syntax

      gsample [#|varname] [if] [in] [weight] [, options]

   options                Description
   ---------------------------------------------------------------------
   percent                sample size is in percent
   wor                    sample without replacement
   strata(varlist)        variables identifying strata
   cluster(varlist)       variables identifying resampling clusters
   idcluster(newvar)      create new cluster ID variable
   keep                   keep observations that do not meet if and in
   generate(newvar)       store sampling frequencies in newvar
   replace                overwrite existing variables
   ---------------------------------------------------------------------
   aweights are allowed; see weight.

Description

gsample draws a random sample from the data in memory. Simple random sampling (SRS) is supported, as well as unequal probability sampling (UPS), of which sampling with probabilities proportional to size (PPS) is a special case. Both methods, SRS and UPS/PPS, provide sampling with replacement and sampling without replacement. Furthermore, stratified sampling and cluster sampling are supported.

# specifies the size of the sample. The default for gsample is to replace the data in memory with the sampled observations in random order. Alternatively, gsample may store a new variable containing the sampling frequencies of the observations (see the generate(newvar) option). In the case of sampling without replacement (see the wor option), the sample size must be less than or equal to the number of sampling units in the data. Sampling units are either single observations or clusters identified by the cluster() option. If # is not specified or if #==., the sample size is equal to the observed number of units in the data. For stratified sampling, # units will be selected from each stratum identified by the strata() option. Alternatively, specify varname instead of #, where varname is a variable containing for each stratum a specific sample size. varname is assumed to be constant within strata.
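
The stratified, without-replacement case described above can be illustrated outside Stata. The following is a minimal Python sketch of the idea, not gsample's or mm_sample()'s actual implementation; the function name `stratified_sample`, its arguments, and the toy data are hypothetical. It accepts either one size for every stratum (like #) or a per-stratum size (like varname), and rejects a size larger than the stratum.

```python
import random
from collections import defaultdict


def stratified_sample(rows, stratum_of, sizes, seed=None):
    """Draw a without-replacement sample from each stratum.

    rows       -- list of observations
    stratum_of -- function mapping an observation to its stratum label
    sizes      -- one int used for every stratum, or a dict {label: size}
    """
    rng = random.Random(seed)

    # Group observations by stratum.
    by_stratum = defaultdict(list)
    for row in rows:
        by_stratum[stratum_of(row)].append(row)

    sample = []
    for label, members in by_stratum.items():
        n = sizes[label] if isinstance(sizes, dict) else sizes
        # Without replacement, a stratum cannot supply more units than it has.
        if n > len(members):
            raise ValueError(
                f"stratum {label!r}: size {n} exceeds {len(members)} units"
            )
        sample.extend(rng.sample(members, n))
    return sample


# Hypothetical data: 10 units in stratum "a", 5 units in stratum "b".
rows = [("a", i) for i in range(10)] + [("b", i) for i in range(5)]

# Proportionate allocation: a 40% sampling fraction in every stratum.
sizes = {"a": 4, "b": 2}
sample = stratified_sample(rows, lambda r: r[0], sizes, seed=1)
```

With the per-stratum sizes chosen as the same fraction of each stratum, this is the proportionate allocation that the thread title refers to.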
Specifying aweights causes unequal probability sampling (UPS/PPS) to be performed. In this case the sampling probabilities of the observations are proportional to the specified weights. gsample is implemented as a wrapper for the mm_sample() function from the moremata package. See help for mm_sample() for methodological details and references. Note that many different algorithms for unequal probability sampling without replacement have been proposed in the literature, and there may be better solutions than the method implemented here. In addition, UPS without replacement may fail if the distribution of weights is very uneven (see help for mm_sample() for an explanation of this problem).
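
As a rough illustration of sampling with probabilities proportional to weights, here is a small Python sketch for the with-replacement case. It is not the algorithm used by mm_sample(); the function names and the toy weights are hypothetical. The returned list plays the role of the sampling frequencies that generate(newvar) would store. The second helper reflects my assumption about the "very uneven weights" problem: without replacement, a unit's intended inclusion probability n * w_i / sum(w) cannot exceed 1. Check help for mm_sample() for the authoritative explanation.

```python
import random
from collections import Counter


def pps_frequencies(weights, n, seed=None):
    """Draw n units with replacement, probability proportional to weight.

    Returns one sampling frequency per unit (how often it was drawn).
    """
    rng = random.Random(seed)
    draws = rng.choices(range(len(weights)), weights=weights, k=n)
    counts = Counter(draws)
    return [counts[i] for i in range(len(weights))]


def max_inclusion_probability(weights, n):
    """Largest n * w_i / sum(w); a value above 1 is infeasible
    when sampling without replacement (assumed failure condition)."""
    return n * max(weights) / sum(weights)


weights = [0, 1, 1, 8]  # hypothetical size measures
freq = pps_frequencies(weights, n=20, seed=42)

# With these weights, even n=2 without replacement would ask for an
# inclusion probability of 2 * 8 / 10 for the last unit.
worst = max_inclusion_probability(weights, n=2)
```

A unit with weight 0 is never drawn, and the frequencies always sum to n.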

If you are serious about sampling, you should first set the random number seed; see help generate.
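
In Stata the seed is set with `set seed #` before drawing the sample. The reason is the same in any language: a fixed seed makes the draw reproducible. A minimal Python sketch of that principle, with an arbitrary seed value and a hypothetical helper `draw`:

```python
import random


def draw(seed):
    """Draw 5 of 1000 unit indices without replacement using a fixed seed."""
    rng = random.Random(seed)
    return rng.sample(range(1000), 5)


first = draw(12345)
second = draw(12345)

# Same seed, same sample: the draw can be repeated exactly.
same = first == second
```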
