How to split a sample according to a certain variable in Stata?

Asked 25/9, 2012 at 16:20 Answered 3/9, 2014 at 22:16

I'd like to split a sample according to a specific variable, creating four sub-samples, one for each quartile of the variable's distribution. The aim is to show that the level of this variable influences the outcome of a regression, making it significant or not.

Kolnick answered 25/9, 2012 at 16:20 Comment(0)

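The easiest way to do this is to use the egen command's cut() function with the group(4) option, which cuts your variable into four groups of roughly equal frequency (that is, at the quartiles) rather than into equally spaced intervals.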

Example:

. sysuse auto, clear
(1978 Automobile Data)

. sum price, detail

                            Price
-------------------------------------------------------------
      Percentiles      Smallest
 1%         3291           3291
 5%         3748           3299
10%         3895           3667       Obs                  74
25%         4195           3748       Sum of Wgt.          74

50%       5006.5                      Mean           6165.257
                        Largest       Std. Dev.      2949.496
75%         6342          13466
90%        11385          13594       Variance        8699526
95%        13466          14500       Skewness       1.653434
99%        15906          15906       Kurtosis       4.819188

. egen price_cut = cut(price), group(4)

. table price_cut, contents(n price min price max price)

----------------------------------------------
price_cut |   N(price)  min(price)  max(price)
----------+-----------------------------------
        0 |         18       3,291       4,187
        1 |         19       4,195       4,934
        2 |         18       5,079       6,303
        3 |         19       6,342      15,906
----------------------------------------------
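Since the goal is to compare a regression across the four sub-samples, here is a minimal sketch of that next step. The regression of mpg on weight is only an illustrative assumption (the question does not say which model is being fitted); substitute your own outcome and regressors.

```stata
sysuse auto, clear

* Assign each observation to one of four roughly equal-sized price groups (coded 0-3)
egen price_cut = cut(price), group(4)

* Fit the same (illustrative) regression separately within each group
bysort price_cut: regress mpg weight
```

Each block of output then corresponds to one quartile group, so the coefficients and their significance can be compared across groups.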

I hope this helps you.

Neptunian answered 25/9, 2012 at 17:11 Comment(0)

This is the easiest way you can go about it:

xtile xx=yourvariable, nq(4)
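
As a hedged sketch of how this might be used on the auto dataset: the variable name price_q and the mpg-on-weight regression are assumptions chosen for illustration, not part of the question. Note that xtile codes the groups 1 to 4, whereas egen's cut() with group(4) codes them 0 to 3.

```stata
sysuse auto, clear

* Create a quartile indicator for price (coded 1-4)
xtile price_q = price, nq(4)

* Check how many observations fall into each quartile
tabulate price_q

* One way to compare sub-samples: let the slope on weight differ by quartile
regress mpg c.weight##i.price_q

* Joint test of whether the weight slope differs across quartiles
testparm c.weight#i.price_q
```

Fitting one model with interactions, rather than four separate regressions, gives a direct test of whether the effect differs across quartiles.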

I hope this helps.

Stereotyped answered 3/9, 2014 at 22:16 Comment(1)

Not easier; not more difficult either, but not easier. xtile does what it is designed to do; egen's cut() allows other ways of subdivision too. – Crucifix 4/9, 2014 at 6:55
