
Case 2. I already have an id variable, and I have multiple observations per id, but I want a new id variable containing 1 for the first id, 2 for the second, and so on. Such questions often arise with panel data and in other circumstances. Perhaps the identifier variable is a string (id "numbers" 1A038, 2B217, ...) and you need numeric identifiers (1, 2, ...) because some Stata commands require them. Perhaps the original id is numeric (of the form 102938, 149384, 150394, ...) but you want to draw a graph using the identifier as one of the axes and want the data points equally spaced.
  Answer 1.
To create a new variable newid from the existing variable oldid, whether oldid is string or numeric, type
        . egen newid = group(oldid)
  The new variable newid will contain 1 for the first value of
  oldid, 2 for the second value, and so on.
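  As a minimal sketch of answer 1, the commands below build a small
  dataset and number its groups.  The five observations and their string
  values are made up for illustration only.
        . * build a toy dataset with a string identifier (hypothetical values)
        . clear
        . set obs 5
        . generate str5 oldid = "1A038" in 1/2
        . replace oldid = "2B217" in 3
        . replace oldid = "3C550" in 4/5
        . * number the distinct values of oldid 1, 2, ... in sorted order
        . egen newid = group(oldid)
        . list oldid newid, sepby(oldid)
  Here newid should be 1 for both 1A038 observations, 2 for 2B217, and 3
  for both 3C550 observations, because group() numbers the distinct
  values of oldid in sorted order.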
  Answer 2.
To create a new variable newid from the existing variable oldid, whether oldid is string or numeric, type
        . sort oldid
        . by oldid: gen newid = 1 if _n==1
        . replace newid = sum(newid)
        . replace newid = . if missing(oldid)
  Both answers yield the same results:  the four lines of answer 2 amount to
  what egen does.  It is, however, worth understanding answer 2.
  We start with the existing identifier oldid, which may be either a numeric
  variable or a string variable.
        . sort oldid
  This command puts the observations in the order of oldid.
        . by oldid: gen newid = 1 if _n == 1 
  This command creates a new variable newid that is 1 for the first
  observation for each individual and missing otherwise.  _n is the
  Stata way of referring to the observation number; in a 10-observation
  dataset, _n takes on the values 1, 2, ..., 10.  When _n is
  combined with by, however, _n is the observation number within
  by-group, in this case, within oldid.  If there were three
  oldid==1 observations followed by two oldid==2 observations in
  the dataset, _n would take on the values 1, 2, 3, 1, 2.  Thus,
  by ...: ...  if _n==1 is a way to refer to the first
  observation in each by-group.  See the sections of [U] indexed under
  by varlist: prefix.
  by oldid: gen newid=1 if _n==1 sets newid to 1 in the first
  observation of each oldid.
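  The difference between _n on its own and _n under by can be checked
  with a short sketch.  The variable names overall and within are
  hypothetical, chosen only for this illustration; the data are the three
  oldid==1 and two oldid==2 observations described above.
        . * three observations with oldid==1, then two with oldid==2
        . clear
        . set obs 5
        . generate oldid = cond(_n <= 3, 1, 2)
        . sort oldid
        . * _n without by: observation number in the whole dataset
        . generate overall = _n
        . * _n with by: observation number within each oldid
        . by oldid: generate within = _n
        . list oldid overall within, sepby(oldid)
  The listing should show overall running 1, 2, 3, 4, 5 and within
  running 1, 2, 3, 1, 2.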
        . replace newid = sum(newid)
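  This command replaces newid by its cumulative or running sum.  Because
  newid is 1 only in the first observation of each oldid, the running sum
  goes up by 1 each time a new oldid begins and stays constant within an
  oldid.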
        . replace newid = . if missing(oldid)
  This command puts a missing value into newid wherever oldid contains
  a missing value.  This step is probably unnecessary because if
  oldid really is an ID variable, it should never contain missing
  values anyway.
  Let us see how that works for a simple dataset. Missing values (.)
  make no difference to a cumulative sum. In that context, they are treated as
  numerically equal to 0.
        oldid     newid (as created)   newid (as replaced) 
        1             1                    1 
        1             .                    1
        1             .                    1 
        1             .                    1
        22            1                    2 
        22            .                    2 
        22            .                    2
        33            1                    3
        33            .                    3
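  The table can be reproduced with the following sketch.  The helper
  variable created is hypothetical; it only keeps a copy of newid as
  first created so that both columns can be listed side by side.
        . * nine observations: four with oldid 1, three with 22, two with 33
        . clear
        . set obs 9
        . generate oldid = cond(_n <= 4, 1, cond(_n <= 7, 22, 33))
        . sort oldid
        . * flag the first observation of each oldid
        . by oldid: generate newid = 1 if _n == 1
        . * keep a copy of the flag before it is overwritten
        . generate created = newid
        . * running sum of the flags; missing values count as 0
        . replace newid = sum(newid)
        . list oldid created newid, sepby(oldid)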
  We have said that both answers are the same. But there is an advantage to
  the first. Using the label option
        . egen newid = group(oldid), label
  will ensure that the values of the existing variable oldid (or their value
  labels if they exist) are copied across as value labels for the new
  variable newid. That way, you get the best of both worlds: tidy identifiers
  with values 1 and up, and labels that preserve information from your
  existing dataset.
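  A short sketch of the label option, again with made-up string
  identifiers, lists the new variable both with and without its value
  labels:
        . * toy dataset with a string identifier (hypothetical values)
        . clear
        . set obs 5
        . generate str5 oldid = "1A038" in 1/2
        . replace oldid = "2B217" in 3
        . replace oldid = "3C550" in 4/5
        . * number the groups and attach the old values as value labels
        . egen newid = group(oldid), label
        . * default listing displays the value labels of newid
        . list oldid newid
        . * nolabel displays the underlying numbers 1, 2, 3
        . list oldid newid, nolabel
  The first listing should show the original strings in the newid column,
  and the second the underlying integers.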
            
