Most Stata commands can deal with weighted data. Stata allows four kinds of weights:
1. fweights, or frequency weights, are integer weights that indicate the number of duplicated
observations each record represents.
2. pweights, or sampling weights, are weights that denote the inverse of the probability that
the observation is included because of the sampling design.
3. aweights, or analytic weights, are weights that are inversely proportional to the variance
of an observation; that is, the variance of the jth observation is assumed to be
sigma^2/w_j, where w_j is the weight of the jth observation. Typically, the observations represent averages
and each weight is the number of elements that gave rise to the average. For most Stata
commands, the recorded scale of aweights is irrelevant; Stata internally rescales them to
sum to N, the number of observations in your data, when it uses them.
4. iweights, or importance weights, are weights that indicate the "importance" of the
observation in some vague sense. iweights have no formal statistical definition; any
command that supports iweights will define exactly how they are treated. Usually, they
are intended for use by programmers who want to produce a certain computation.
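As a rough illustration of the first and third definitions, the sketch below uses a small made-up dataset; the variables y, x, pop, and pop10 and their values are hypothetical. It relies only on the behavior described above: a frequency weight stands for duplicated observations, and the recorded scale of an analytic weight is irrelevant.
. clear
. input y x pop
  1 1 2
  2 3 3
  5 4 1
  4 6 4
  end
. * frequency-weighted summary: each record counts pop times
. summarize y [fweight=pop]
. * analytic weights: multiplying the weights by 10 should not change the fit
. regress y x [aweight=pop]
. generate pop10 = 10*pop
. regress y x [aweight=pop10]
. * physically duplicate each record pop times, then summarize unweighted
. expand pop
. summarize y
The two summarize commands should report the same number of observations, mean, and standard deviation, and the two regress commands should report the same coefficients and standard errors. expand changes the data in memory, so it is left until last.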
The general syntax is
command ... [weightword=exp] ...
For example:
. anova y x1 x2 x1*x2 [fweight=pop]
. regress avgy avgx1 avgx2 [aweight=cellpop]
. regress y x1 x2 x3 [pweight=1/prob]
. scatter y x [aweight=y2], mfcolor(none)
The square brackets are part of the syntax and must be typed.
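Because the weight is given as an expression (exp), it can be a variable name or a calculation on variables. As a small sketch, assuming prob holds each observation's probability of selection and using the hypothetical variable name sampwt, the two regress commands below should be equivalent:
. generate sampwt = 1/prob
. regress y x1 x2 x3 [pweight=sampwt]
. regress y x1 x2 x3 [pweight=1/prob]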
Stata allows abbreviations: fw for fweight, aw for aweight, pw for pweight, and iw for iweight. You could type
. anova y x1 x2 x1*x2 [fw=pop]
. regress avgy avgx1 avgx2 [aw=cellpop]
. regress y x1 x2 x3 [pw=1/prob]
. scatter y x [aw=y2], mfcolor(none)
Also, each command has its own idea of the "natural" kind of weight. If you type
. regress avgy avgx1 avgx2 [w=cellpop]
the command will tell you what kind of weight it is assuming and perform the request as if you
had specified that kind of weight.
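For instance, assuming that analytic weights are the natural kind for regress (an assumption here, consistent with the aweight example above), the following two commands would be expected to produce the same estimates:
. regress avgy avgx1 avgx2 [w=cellpop]
. regress avgy avgx1 avgx2 [aw=cellpop]
Naming the weight type explicitly, as in the second command, keeps the choice visible to anyone reading the command later.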
There are synonyms for some of the weight types. fweight can also be referred to as frequency
(abbreviation freq). aweight can be referred to as cellsize (abbreviation cell):