ctags and R regex
Asked Answered
C

2

2

I'm trying to use ctags with R. Using this answer I've added

--langdef=R
--langmap=r:.R.r
--regex-R=/^[ \t]*"?([.A-Za-z][.A-Za-z0-9_]*)"?[ \t]*<-[ \t]function/\1/f,Functions/
--regex-R=/^"?([.A-Za-z][.A-Za-z0-9_]*)"?[ \t]*<-[ \t][^f][^u][^n][^c][^t][^i][^o][^n]/\1/g,GlobalVars/ 
--regex-R=/[ \t]"?([.A-Za-z][.A-Za-z0-9_]*)"?[ \t]*<-[ \t][^f][^u][^n][^c][^t][^i][^o][^n]/\1/v,FunctionVariables

to my .ctags file. However when trying to generate tags with the following R file

x <- 1
foo <- function () {
    y <- 2
    return(y)
}

Only the function foo is recognized. I want to generate tags also for variables (i.e for x and y in my code above). Should I change the regex in the ctags file? Thanks in advance.

Ceroplastics answered 25/8, 2015 at 14:21 Comment(2)
did you install the plugin and modify your .vimrc?Gillett
@Gillett I'm on Vim using this tagbar settings.Ceroplastics
L
4

Those patterns don't appear to be correct, because a variable is only identified if it is assigned a value that has the same length as "function" but does not share any of its characters. E.g.:

x <- aaaaaaaa

The following ctags configuration should work properly:

--langdef=R
--langmap=R:.R.r
--regex-R=/^[ \t]*"?([.A-Za-z][.A-Za-z0-9_]*)"?[ \t]*<-[ \t]function[ \t]*\(/\1/f,Functions/
--regex-R=/^"?([.A-Za-z][.A-Za-z0-9_]*)"?[ \t]*<-[ \t][^\(]+$/\1/g,GlobalVars/
--regex-R=/[ \t]"?([.A-Za-z][.A-Za-z0-9_]*)"?[ \t]*<-[ \t][^\(]+$/\1/v,FunctionVariables/

The idea here is that a function must be followed by a parenthesis while the variable declarations do not. Given the limitations of a regex parser this parenthesis must appear in the same line as the keyword "function" for this configuration to work.

Update: R support will be included in Universal Ctags, so give it a try and report bugs and missing features.

Loom answered 26/8, 2015 at 9:9 Comment(0)
W
0

The problem with the original regex is that it does not capture variable declarations / assignments that are shorter than 8 characters (since the regex requires 8 characters that are NOT f-u-n-c-t-i-o-n in that specific order).

The problem with the regex updated by Vitor is that, it does not capture variable declarations / assignments which incorporate parantheses, a situation which is quite common in R.

And a forgotton issue is that, neither of them captures superassigned objects with <<- (only local assignment with <- is covered).

As I checked the Universal Ctags repo, although pcre regex is planned to be supported as also raised in Issue 519 and a commented out pcre flag exists in the configuration file, there is not yet support for pcre type positive/negative lookahead or lookbehind expressions in ctags unfortunately. When that support starts, things will be much easier.

My solution, first of all, takes into account superassignment by "<{1,2}-" and also the fact that the right side of assignment can include:

  • either a sequence of 8 characters which are not f-u-n-c-t-i-o-n followed by any or no characters (.*)
  • OR a sequence of at most 7 any characters (.{1,7})

My proposed regex patterns are as such:

--langdef=R
--langmap=R:.R.r
--regex-R=/^[ \t]*"?([.A-Za-z][.A-Za-z0-9_]*)"?[ \t]*<-[ \t]function[ \t]*\(/\1/f,Functions/
--regex-R=/^"?([.A-Za-z][.A-Za-z0-9_]*)"?[ \t]*<{1,2}-[ \t]([^f][^u][^n][^c][^t][^i][^o][^n].*|.{1,7}$)/\1/g,GlobalVars/ 
--regex-R=/[ \t]"?([.A-Za-z][.A-Za-z0-9_]*)"?[ \t]*<{1,2}-[ \t]([^f][^u][^n][^c][^t][^i][^o][^n].*|.{1,7}$)/\1/v,FunctionVariables/

And my tests show that, it captures much more objects than the previous ones.

Edit: Even these do not capture the variables that are declared as arguments to functions. Since ctags regex makes a non-greedy search and non-capturing groups do not work, the result is still limited. But we can at least capture the first arguments to functions and define them as an additional type as a, Arguments:

--regex-R=/.+function[ ]*\([ \t]*(,*([^,= \(\)]+)[,= ]*)/\2/a,Arguments/
Woodland answered 9/6, 2017 at 23:19 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.