Clyde offers the following approach:
In Stata terminology, all of the stkcd's in your example have at least 5 consecutive years of observations. I assume what you mean is 5 consecutive years of observations with non-missing values of x. I notice that in your example data there are no gaps in the year variable, but the code above does not rely on that; it will work correctly even if there are some gaps.
The logic is that, with the observations sorted by year within stkcd, a spell of consecutive non-missing observations begins when x is not missing but x[_n-1] is. Here we actually start by counting runs of consecutive missing or consecutive non-missing observations. Each run then has a length. Next we calculate the length of the longest run of non-missing values for each stkcd, and we keep those stkcd's whose longest run has a length of at least 5.
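For readers who want to trace the run-length logic outside Stata, here is a minimal sketch in plain Python. It is not the Stata code referred to above, only an illustration of the same idea. The function names, the dictionary layout, and the sample values are all hypothetical. The sketch also assumes that a gap in year ends a run, which is one possible reading of "consecutive years".

```python
def longest_nonmissing_run(rows):
    """Return the longest run of consecutive years with non-missing x.

    rows: list of (year, x) pairs for one stkcd; x is None when missing.
    Assumption: a gap in year breaks a run, just as a missing x does.
    """
    longest = current = 0
    prev_year = None
    for year, x in sorted(rows, key=lambda r: r[0]):  # sort by year
        if x is None:
            current = 0                      # a missing value ends the run
        elif current > 0 and year == prev_year + 1:
            current += 1                     # the run continues
        else:
            current = 1                      # a new run starts here
        longest = max(longest, current)
        prev_year = year
    return longest


def keep_long_runs(panel, min_run=5):
    """Keep only the stkcd's whose longest non-missing run is >= min_run."""
    return {k: v for k, v in panel.items()
            if longest_nonmissing_run(v) >= min_run}


# Hypothetical data: stkcd -> list of (year, x)
panel = {
    1: [(2000, 1.0), (2001, 2.0), (2002, None), (2003, 3.0),
        (2004, 4.0), (2005, 5.0), (2006, 6.0), (2007, 7.0)],
    2: [(2000, 1.0), (2001, 2.0), (2002, 3.0), (2003, None),
        (2004, 4.0), (2005, 5.0), (2006, 6.0)],
    3: [(2000, 1.0), (2001, 2.0), (2002, 3.0),
        (2004, 4.0), (2005, 5.0), (2006, 6.0)],  # no row for 2003
}

kept = keep_long_runs(panel)
print(sorted(kept))
```

In this made-up panel only stkcd 1 survives: its longest non-missing run covers 2003 through 2007, while the other two firms never get past three consecutive years.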