How to rename many variables with string suffixes

In Stata, I have a set of variables that all begin with pkg. In their current state, their endings are numeric: pkg1, pkg2, pkg3, pkg4 and so on.

I need to change all of these variables' endings to strings: pkgmz, pkggmz, pkgsp, pkgsptc etc.

I have a column of these string endings, which I can designate as a local list.

For example:

local croplist mz gmz sp sptc mil cof suk tea ric

How do I change the numeric endings to the string endings?

My guess at the code can be found below and the ??? indicate where I am stumped:

local croplist crops mz gmz sp sptc mil cof suk tea ric

foreach x of varlist pkg* {
    local new1 = substr(`x', 1, 3)
    local new2 = ???
    rename `x' ``new1'`new2''
    label var ``new1'`new2'' "Avg district level `new2' price"
}

I wonder if it would be better to utilize the regexr() command, but can't think of a way to include it.

Any help is appreciated.

Downey answered 3/12, 2012 at 2:3 Comment(0)

There is no need here to invoke regular expressions. You have the new suffixes; the prefix pkg is always the same, so the labour of extracting it repeatedly is unnecessary. The heart of the problem is cycling over two lists at once. Here is one way to fix your code.


local croplist mz gmz sp sptc mil cof suk tea ric
local j = 1 
foreach x of varlist pkg* {
    local sffx : word `j' of `croplist' 
    rename `x' pkg`sffx'
    label var pkg`sffx' "Avg district level `sffx' price"
    local ++j 
}

Some further notes. The rename command in Stata 12 and later can handle this directly. regexr() is a function, not a command. Your rename command has too many quotation marks, so it wouldn't work. There is a more general discussion at http://www.stata-journal.com/sjpdf.html?articlenum=pr0009 (a little out of date, but relevant to the main issue).

EDIT 30 July 2018

I tend now more often to use gettoken:

local croplist mz gmz sp sptc mil cof suk tea ric
foreach x of varlist pkg* {
    gettoken sffx croplist: croplist
    rename `x' pkg`sffx'
    label var pkg`sffx' "Avg district level `sffx' price"
}

The local macro croplist is a stack. Each time around the loop we take the top item from the stack and leave the rest for the next time.
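The stack behaviour can be watched directly. This is only a minimal sketch with a shortened three-word list (the shortened list and the display text are illustrative, not part of the original answer); it shows the word taken and what remains on each pass.

```stata
// Sketch: gettoken removes the first word of croplist on every pass.
local croplist mz gmz sp
while "`croplist'" != "" {
    // first word goes into sffx; the remainder is written back to croplist
    gettoken sffx croplist : croplist
    display "took: `sffx'   left: `croplist'"
}
```

Once the list is empty the loop stops, which is also why the renaming loop above needs at least as many suffixes as there are pkg variables.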

Viola answered 3/12, 2012 at 7:21 Comment(3)
Oh so that's how you advance the counter in stata! Thanks! I was always trying to do something like j=j+1, as you would do in most other programming (visual basic, matlab, etc.). It seems to me stata goes out of its way sometimes to be different from intuitive programming conventions. - Downey
@Nick Cox You mentioned that the rename command in Stata 12+ can handle this. Is it possible to do so in a single rename command? I posted an answer with a few options using the new command, but I couldn't figure out if it was possible to use a single call. - Asante
@Michael A It is possible to do it with one command, but the only answer that occurs to me is not engaging: rename (pkg1-pkg9) (pkgmz pkggmz pkgsp pkgsptc pkgmil pkgcof pkgsuk pkgtea pkgric). Still, it remains true that people will spend minutes trying to think up a clever trick when the names could have been typed in seconds. - Viola

Here is another way to do it. tokenize puts separate words in local macros numbered 1 upwards. The nested reference ``j'' is handled just as in elementary algebra: evaluate the inside macro reference first.

 
tokenize "mz gmz sp sptc mil cof suk tea ric" 
forval j = 1/9 {
    rename pkg`j' pkg``j''
    label var pkg``j'' "Avg district level ``j'' price"
}
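To see the nested reference at work without renaming anything, here is a small sketch using a shortened three-word list (the list and the display text are illustrative only).

```stata
// Sketch: tokenize stores successive words in local macros 1, 2, 3, ...
tokenize "mz gmz sp"
forvalues j = 1/3 {
    // the inner reference expands first, to 1, 2 or 3;
    // the outer pair of quotes then expands the macro with that number
    display "j = `j'   suffix = ``j''"
}
```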
Viola answered 3/12, 2012 at 8:1 Comment(1)
This would be good when the numbers are consecutive. As it turns out, in my case the numbers are not consecutive (although I thought they were when I originally posted my question). So I will give the check to Nick's first answer. But both are very useful. - Downey

Ben asked in a comment about incrementing counters held in local macros.

Stata's local macros are in general for holding strings; string characters can be numeric, so holding numbers is a special case, but naturally a very useful one. This thread alone has shown several examples. It helps to hold that history in mind. A longstanding syntax is based on the forms

local macname <contents> 

and

local macname = <expression> 

The first form copies the contents to macname as they stand, while the second form evaluates the expression before assigning the result to macname. For several versions, the main way to increment counters was

local j = `j' + 1 

but the syntax

local ++j 

is now allowed. Note, however, that although

local j++ 

is also allowed, it won't work as you may expect: what happens is consistent with the first syntax for macros, in which whatever follows the macro name is copied rather than evaluated.

So, if this looks a little odd given your background, that's understandable, but local macros were intended for string processing, not arithmetic. Mata is much more mainstream-like in this regard.
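The difference between copying and evaluating is easy to check interactively. The macro names a and b below are arbitrary choices for this sketch.

```stata
// Sketch: copying versus evaluating when defining a local macro.
local a 2 + 2        // first form: the text "2 + 2" is copied as is
local b = 2 + 2      // second form: the expression is evaluated first
display "`a'"        // shows 2 + 2
display "`b'"        // shows 4

local j = 1
local ++j            // increment in place
display `j'          // shows 2
```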

I wrote a tutorial on loops and macros in

Cox, N.J. 2002. How to face lists with fortitude. Stata Journal 2(2): 202-222

which is accessible to all at

http://www.stata-journal.com/sjpdf.html?articlenum=pr0005

Viola answered 10/12, 2012 at 10:16 Comment(0)

In Stata 12 and later, rename can handle this case in several ways.

This method creates a new macro new_croplist containing variable names pkgmz pkggmz pkgsp pkgsptc pkgmil pkgcof pkgsuk pkgtea pkgric, then uses rename to rename variables following the pattern pkg<digits> to the names specified in new_croplist. The numbers following pkg do not need to be consecutive.

local croplist mz gmz sp sptc mil cof suk tea ric
local new_croplist
foreach name of local croplist {
    local new_croplist `new_croplist' pkg`name'
}
rename pkg# (`new_croplist')

A second method uses the new rename command twice; as before, this does not require consecutive numbers in the original names. The first command renames variables of the pattern pkg<digits> to the names specified in croplist. The second command adds the prefix pkg to the new variable names.

rename pkg# (`croplist')
rename (`croplist') pkg=

In both cases, and in general when using the rename command (referred to as rename group in Stata's documentation), the number of old variable names must match the number of new variable names, so make sure that the number of variables matched by pkg# matches the number of new names specified in `croplist'.
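One way to check the counts before renaming is sketched below. It assumes that every variable whose name starts with pkg is one of the numbered variables, because unab with pkg* matches on the prefix alone; the macro names oldvars, nold and nnew are made up for this example.

```stata
// Sketch: compare the number of pkg variables with the number of suffixes.
local croplist mz gmz sp sptc mil cof suk tea ric
unab oldvars : pkg*                      // expand pkg* into a list of names
local nold : word count `oldvars'
local nnew : word count `croplist'
if `nold' != `nnew' {
    display as error "found `nold' pkg variables but `nnew' new suffixes"
    exit 198
}
```

If the counts differ, the do-file stops with an error before any variable has been renamed.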

Asante answered 18/6, 2015 at 22:20 Comment(0)

An alternative to using a counter, as in @Nick's excellent example, is to employ macro shift:

clear

forvalues i = 1 / 9 {
    generate pkg`i' = runiform()
}

local croplist mz gmz sp sptc mil cof suk tea ric
tokenize "`croplist'"

foreach var of varlist pkg* {
    rename `var' pkg`1'
    label var pkg`1' "Avg district level `1' price"
    macro shift
}

You can also use the ds command to obtain a list of variable names starting with pkg:

local croplist mz gmz sp sptc mil cof suk tea ric
tokenize "`croplist'"

ds pkg*

foreach var of varlist `r(varlist)' {
    rename `var' pkg`1'
    label var pkg`1' "Avg district level `1' price"
    macro shift
}

In both cases you get:

pkgmz           float   %9.0g                 Avg district level mz price
pkggmz          float   %9.0g                 Avg district level gmz price
pkgsp           float   %9.0g                 Avg district level sp price
pkgsptc         float   %9.0g                 Avg district level sptc price
pkgmil          float   %9.0g                 Avg district level mil price
pkgcof          float   %9.0g                 Avg district level cof price
pkgsuk          float   %9.0g                 Avg district level suk price
pkgtea          float   %9.0g                 Avg district level tea price
pkgric          float   %9.0g                 Avg district level ric price
Kennithkennon answered 26/5, 2018 at 20:20 Comment(0)
