The varlist1 (varlist2) syntax is of special use to programmers. It verifies that the data are sorted by
varlist1 varlist2 and then performs a by as if only varlist1 were specified. For instance,
by pid (time): generate growth = (bp - bp[_n-1])/bp
performs the generate by values of pid but first verifies that the data are sorted by pid and time within pid.
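I have seen both by varlist1 (varlist2) and by varlist1 varlist2 on the forum many times and used to treat them as interchangeable. Today I checked the help file and found that they differ: variables inside the parentheses take part in the sort order but are not used as grouping variables in the computation. See the example above.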
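To make the difference concrete, here is a minimal sketch that runs the help-file command both ways. The four-observation dataset and the variable names growth1 and growth2 are made up for illustration; only pid, time, and bp come from the help-file example.

```stata
clear
input pid time bp
1 1 100
1 2 110
2 1 120
2 2 126
end
sort pid time

* Groups are defined by pid only; time merely fixes the order within each pid.
* The first observation of each pid is missing because bp[_n-1] does not exist there.
by pid (time): generate growth1 = (bp - bp[_n-1])/bp

* Groups are defined by every pid-time pair, so here each group holds a single
* observation and bp[_n-1] is missing in all of them.
by pid time: generate growth2 = (bp - bp[_n-1])/bp

list, sepby(pid)
```

With this toy data, growth1 should be filled in for the second observation of each pid, while growth2 should be missing everywhere, because _n restarts at 1 inside every pid-time group.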
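In addition, by and bys (short for bysort) differ in that by does not sort: it requires the data to be sorted already, unless the sort option is specified. bysort sorts the data itself before running the command.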
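A short sketch of that difference, assuming the auto dataset that ships with Stata. The choice of price as the "wrong" sort key is arbitrary and only there to leave the data unsorted with respect to rep78.

```stata
sysuse auto, clear
sort price                      // data are now ordered by price, not by rep78

* by does not sort, so this should stop with a "not sorted" error;
* capture noisily shows the message but lets the do-file continue.
capture noisily by rep78: summarize mpg
display "return code from by: " _rc

* bysort sorts on rep78 first and then runs the command for each group.
bysort rep78: summarize mpg
```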
Syntax
by varlist: stata_cmd
bysort varlist: stata_cmd
The above diagrams show by and bysort as they are typically used. The full syntax of the commands is
by varlist1 [(varlist2)] [, sort rc0]: stata_cmd
bysort varlist1 [(varlist2)] [, rc0]: stata_cmd
For example, by rep78, sort: tabulate foreign is equivalent to bysort rep78: tabulate foreign
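The parentheses combine with bysort in the same way. The sketch below again assumes the auto dataset; the new variable names rank_in_group and cheapest are hypothetical and chosen only for this example.

```stata
sysuse auto, clear

* Sort by foreign and price, but group by foreign only:
* _n then counts cars within each group from cheapest to most expensive.
bysort foreign (price): generate rank_in_group = _n

* Within each foreign group, the first observation is the cheapest car.
bysort foreign (price): generate cheapest = make[1]

list foreign make price rank_in_group cheapest in 1/5
```

Writing bysort foreign (price): here should behave like running sort foreign price and then by foreign (price):, which avoids a separate sort step.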