Passing multiple arguments via command line in R
A

7

5

I am trying to pass multiple file path arguments via the command line to an Rscript, where they can then be processed using an arguments parser. Ultimately, I would want something like this:

Rscript test.R --inputfiles fileA.txt fileB.txt fileC.txt --printvar yes --size 10 --anotheroption helloworld -- etc...

passed through the command line, with the result available as a character vector in R once parsed:

args$inputfiles = c("fileA.txt", "fileB.txt", "fileC.txt")

I have tried several parsers, including optparse and getopt, but neither seems to support this functionality. I know argparse does, but it is currently not available for R version 2.15.2.

Any ideas?

Thanks

Amelita answered 9/12, 2012 at 18:55 Comment(3)
Can you elaborate on why @agstudy's solution does not work? It is pretty accurate. - Bethelbethena
Also, possible duplicate of #2151712. - Bethelbethena
@RicardoSaporta It is not a duplicate. It is a little bit different. - Mantic
A
1

After searching around, and wanting to avoid writing a new package from scratch, I figured the best way to pass multiple arguments with the optparse package is to separate the input files with a character that is unlikely to appear in a file name (for example, a colon):

Rscript test.R --inputfiles fileA.txt:fileB.txt:fileC.txt etc...

File names can also contain spaces as long as the spaces are escaped. The shell then passes the whole colon-separated list to the script as a single argument, so optparse sees it as one value:

Rscript test.R --inputfiles file\ A.txt:file\ B.txt:fileC.txt etc...

Ultimately, it would be nice to have a package (possibly a modified version of optparse) that supports multiple arguments as described in the question and shown below:

Rscript test.R --inputfiles fileA.txt fileB.txt fileC.txt

One would think such a basic feature would be implemented in a widely used package such as optparse.

Cheers

Amelita answered 10/12, 2012 at 23:57 Comment(2)
Why not explain how you use the optparse package? - Mantic
That's beyond the scope of my question. Optparse examples can be found in the documentation here: cran.r-project.org/web/packages/optparse/optparse.pdf - Amelita
W
9

The way you describe command line options is different from the way that most people would expect them to be used. Normally, a command line option would take a single parameter, and parameters without a preceding option are passed as arguments. If an argument would take multiple items (like a list of files), I would suggest parsing the string using strsplit().

Here's an example using optparse:

library(optparse)

option_list <- list(
  make_option(c("-f", "--filelist"), default = "blah.txt",
              help = "comma separated list of files (default %default)")
)

parser <- OptionParser(option_list = option_list)
# positional_arguments = TRUE returns the parsed options and any leftover arguments
arguments <- parse_args(parser, positional_arguments = TRUE)
opt <- arguments$options
args <- arguments$args

# strsplit() returns a list with one element per input string
myfilelist <- strsplit(opt$filelist, ",")

print(myfilelist)
print(args)

Here are several example runs:

$ Rscript blah.r -h
Usage: blah.r [options]


Options:
    -f FILELIST, --filelist=FILELIST
        comma separated list of files (default blah.txt)

    -h, --help
        Show this help message and exit


$ Rscript blah.r -f hello.txt
[[1]]
[1] "hello.txt"

character(0)
$ Rscript blah.r -f hello.txt world.txt
[[1]]
[1] "hello.txt"

[1] "world.txt"
$ Rscript blah.r -f hello.txt,world.txt another_argument and_another
[[1]]
[1] "hello.txt" "world.txt"

[1] "another_argument" "and_another"
$ Rscript blah.r an_argument -f hello.txt,world.txt,blah another_argument and_another
[[1]]
[1] "hello.txt" "world.txt" "blah"     

[1] "an_argument"      "another_argument" "and_another"     

Note that for the strsplit, you can use a regular expression to determine the delimiter. I would suggest something like the following, which would let you use commas or colons to separate your list:

myfilelist <- strsplit (opt$filelist,"[,:]")
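
As a small aside, and only as a sketch with a made-up input string: strsplit() returns a list (which is why the runs above print [[1]]), so taking the first element gives a plain character vector of file names. The variable names filelist_arg and myfiles are just for illustration.

```r
# Example value as it might arrive from the command line (assumed input).
filelist_arg <- "hello.txt,world.txt:blah.txt"

# strsplit() returns a list with one element per input string,
# so [[1]] extracts the character vector for our single string.
myfiles <- strsplit(filelist_arg, "[,:]")[[1]]

print(myfiles)
```

Each element of myfiles can then be passed to file.exists(), read.table() and so on.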
Whiggism answered 22/8, 2016 at 20:40 Comment(0)
A
6

Although it wasn't released on CRAN when this question was asked, a beta version of the argparse package is up there now, and it can do this. It is basically a wrapper around the popular Python module of the same name, so you need to have a recent version of Python installed to use it. See the install notes for more info. The basic example included sums an arbitrarily long list of numbers, which should not be hard to modify so that you can grab an arbitrarily long list of input files.

> install.packages("argparse")
> library("argparse")
> example("ArgumentParser")
Achromatous answered 1/3, 2013 at 2:13 Comment(0)
M
4

At the top of your script test.R, put this:

args <- commandArgs(trailingOnly = TRUE)

hh <- paste(unlist(args),collapse=' ')
listoptions <- unlist(strsplit(hh,'--'))[-1]
options.args <- sapply(listoptions,function(x){
         unlist(strsplit(x, ' '))[-1]
        })
options.names <- sapply(listoptions,function(x){
  option <-  unlist(strsplit(x, ' '))[1]
})
names(options.args) <- unlist(options.names)
print(options.args)

to get:

$inputfiles
[1] "fileA.txt" "fileB.txt" "fileC.txt"

$printvar
[1] "yes"

$size
[1] "10"

$anotheroption
[1] "helloworld"
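
A possible variant of the snippet above, offered as a rough sketch rather than a tested replacement: instead of pasting the arguments into one string and splitting on '--', walk the vector as commandArgs() returns it and start a new group at every token that begins with '--'. The helper name group_args and the example values are made up for illustration. Because the tokens are never re-joined, a file name containing a space stays in one piece. A value that itself starts with '--' would still be mistaken for a flag.

```r
# Hypothetical helper: group the tokens that follow each "--flag".
group_args <- function(args) {
  is_flag <- grepl("^--", args)
  # Every flag opens a new group; cumsum() gives each token its group id.
  group_id <- cumsum(is_flag)
  # Tokens before the first flag have group id 0 and are dropped.
  keep <- group_id > 0
  groups <- split(args[keep], group_id[keep])
  # The first token of each group is the flag, the rest are its values.
  values <- lapply(groups, function(g) g[-1])
  names(values) <- unname(vapply(groups, function(g) sub("^--", "", g[1]),
                                 character(1)))
  values
}

# Stand-in for commandArgs(trailingOnly = TRUE); assumed example input.
example_args <- c("--inputfiles", "file A.txt", "fileB.txt",
                  "--printvar", "yes", "--size", "10")
parsed <- group_args(example_args)
print(parsed)
```

The result is always a list, one element per flag, and a flag with no values gets an empty character vector.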
Mantic answered 9/12, 2012 at 19:1 Comment(6)
This does not work for me. My arguments are much more complicated. There are roughly 8-10 options passed. Hence the argument parsers. - Amelita
@Omar Complicated? How? Can you give an example? - Mantic
Sure, I added an example in my question. Your solution would only work for me if the files were the only thing I was passing in as an argument. No? - Amelita
@Omar If you are comfortable with VBS scripts, you can write one and call it from R. I think it is easy to find this on SO. - Mantic
I keep finding missing features with this quick method. For example, files with spaces in their names become file\ a.txt when escaped, and this is not detected. Can anyone suggest a more robust way of parsing these arguments? Or would I have to write a parser from scratch? - Amelita
@Omar Good idea to write it from scratch, and to share it of course. - Mantic
R
1

@agstudy's solution does not work properly if the input arguments are lists of the same length. By default, sapply will collapse results of the same length into a matrix rather than a list. The fix is simple enough: explicitly set simplify to FALSE in the sapply call that parses the arguments.

args <- commandArgs(trailingOnly = TRUE)

hh <- paste(unlist(args),collapse=' ')
listoptions <- unlist(strsplit(hh,'--'))[-1]
options.args <- sapply(listoptions,function(x){
         unlist(strsplit(x, ' '))[-1]
        }, simplify=FALSE)
options.names <- sapply(listoptions,function(x){
  option <-  unlist(strsplit(x, ' '))[1]
})
names(options.args) <- unlist(options.names)
print(options.args)
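
To see the difference in isolation, here is a small sketch with two made-up option strings that each carry two values. The names opts, simplified and kept_as_list are only for illustration.

```r
# Two assumed option strings, each with a name followed by two values.
opts <- c("inputfiles a.txt b.txt", "outputfiles c.txt d.txt")

# Drop the first token (the option name) and keep the values.
get_values <- function(x) strsplit(x, " ")[[1]][-1]

# Default behaviour: equal-length results are collapsed into a matrix.
simplified <- sapply(opts, get_values)

# With simplify = FALSE the result stays a list, one element per option.
kept_as_list <- sapply(opts, get_values, simplify = FALSE)

print(is.matrix(simplified))
print(is.list(kept_as_list))
```

lapply(opts, get_values) would also return a list, but without the automatic names that sapply adds for character input.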
Remove answered 11/12, 2014 at 20:23 Comment(0)
I
1

I just ran into this issue, and fortunately the {argparser} package supports multiple-value arguments. The add_argument() function has an nargs argument, and setting it to Inf works.

As an example:

library(argparser)
cli_args <- c("-s", 2, 3, 5)
arg_parser("Test with multiple values") |>
  add_argument("--subject", "sub", type = "numeric", nargs = Inf) |>
  parse_args(cli_args)
#> [[1]]
#> [1] FALSE
#> 
#> $help
#> [1] FALSE
#> 
#> $opts
#> [1] NA
#> 
#> $subject
#> [1] 2 3 5

Created on 2023-10-10 with reprex v2.0.2

Impracticable answered 9/10, 2023 at 15:46 Comment(0)
M
0

I had this same issue, and the workaround that I developed is to adjust the input command line arguments before they are fed to the optparse parser, by concatenating whitespace-delimited input file names together using an alternative delimiter such as a "pipe" character, which is unlikely to be used as part of a file name.

The adjustment is then reversed after parsing, by splitting each parsed string on the delimiter with str_split().

Here is some example code:

#!/usr/bin/env Rscript

library(optparse)
library(stringr)

# ---- Part 1: Helper Functions ----

# Function to collapse multiple input arguments into a single string 
# delimited by the "pipe" character
insert_delimiter <- function(rawarg) {
  # Identify index locations of arguments with "-" as the very first
  # character.  These are presumed to be flags.  Prepend with a "dummy"
  # index of 0, which we'll use in the index step calculation below.
  flagloc <- c(0, which(str_detect(rawarg, '^-')))
  # Additionally, append a second dummy index at the end of the real ones.
  n <- length(flagloc)
  flagloc[n+1] <- length(rawarg) + 1
  
  concatarg <- c()
  
  # Counter over the output command line arguments, with multiple input
  # command line arguments concatenated together into a single string as
  # necessary
  ii <- 1
  # Counter over the flag index locations
  for(ij in seq(1,length(flagloc)-1)) {
    # Calculate the index step size between consecutive pairs of flags
    step <- flagloc[ij+1]-flagloc[ij]
    # Case 1: empty flag with no arguments
    if (step == 1) {
      # Ignore dummy index at beginning
      if (ij != 1) {
        concatarg[ii] <- rawarg[flagloc[ij]]
        ii <- ii + 1
      }
    }
    # Case 2: standard flag with one argument
    else if (step == 2) {
      concatarg[ii] <- rawarg[flagloc[ij]]
      concatarg[ii+1] <- rawarg[flagloc[ij]+1]
      ii <- ii + 2
    }
    # Case 3: flag with multiple whitespace delimited arguments (not
    # currently handled correctly by optparse)
    else if (step > 2) {
      concatarg[ii] <- rawarg[flagloc[ij]]
      # Concatenate multiple arguments using the "pipe" character as a delimiter
      concatarg[ii+1] <- paste0(rawarg[(flagloc[ij]+1):(flagloc[ij+1]-1)],
                                collapse='|')
      ii <- ii + 2
    }
  }
  
  return(concatarg)
}

# Function to remove "pipe" character and re-expand parsed options into an
# output list again
remove_delimiter <- function(rawopt) {
  outopt <- list()
  for(nm in names(rawopt)) {
    if (typeof(rawopt[[nm]]) == "character") {
      outopt[[nm]] <- unlist(str_split(rawopt[[nm]], '\\|'))
    } else {
      outopt[[nm]] <- rawopt[[nm]]
    }
  }
  
  return(outopt)
}

# ---- Part 2: Example Usage ----

# Prepare list of allowed options for parser, in standard fashion
option_list <- list(
  make_option(c('-i', '--inputfiles'), type='character', dest='fnames',
              help='Space separated list of file names', metavar='INPUTFILES'),
  make_option(c('-p', '--printvar'), type='character', dest='pvar',
              help='Valid options are "yes" or "no"',
              metavar='PRINTVAR'),
  make_option(c('-s', '--size'), type='integer', dest='sz',
              help='Integer size value',
              metavar='SIZE')
)

# This is the customary pattern that optparse would use to parse command line
# arguments, however it chokes when there are multiple whitespace-delimited
# options included after the "-i" or "--inputfiles" flag.
#opt <- parse_args(OptionParser(option_list=option_list),
#                  args=commandArgs(trailingOnly = TRUE))

# This works correctly
opt <- remove_delimiter(parse_args(OptionParser(option_list=option_list),
                        args=insert_delimiter(commandArgs(trailingOnly = TRUE))))

print(opt)

Assuming the above file were named fix_optparse.R, here is the output result:

> chmod +x fix_optparse.R 
> ./fix_optparse.R --help
Usage: ./fix_optparse.R [options]


Options:
    -i INPUTFILES, --inputfiles=INPUTFILES
        Space separated list of file names

    -p PRINTVAR, --printvar=PRINTVAR
        Valid options are "yes" or "no"

    -s SIZE, --size=SIZE
        Integer size value

    -h, --help
        Show this help message and exit


> ./fix_optparse.R --inputfiles fileA.txt fileB.txt fileC.txt --printvar yes --size 10
$fnames
[1] "fileA.txt" "fileB.txt" "fileC.txt"

$pvar
[1] "yes"

$sz
[1] 10

$help
[1] FALSE

>

A minor limitation of this approach is that if any of the other arguments can accept a "pipe" character as valid input, those arguments will not be treated correctly. However, I think you could probably develop a slightly more sophisticated version of this solution to handle that case as well. This simple version works most of the time, and illustrates the general idea.
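
One way to guard against that limitation, sketched in base R with two hypothetical helpers, join_values() and split_values() (neither is part of optparse or stringr): refuse to join when a value already contains the delimiter, so the clash is reported up front rather than producing a wrong split later. The pipe is matched literally with fixed = TRUE, so no regex escaping is needed.

```r
# Hypothetical helper: join values with a delimiter, but fail early
# if any value already contains that delimiter.
join_values <- function(values, delim = "|") {
  if (any(grepl(delim, values, fixed = TRUE))) {
    stop("a value already contains the delimiter: ", delim)
  }
  paste(values, collapse = delim)
}

# Hypothetical helper: reverse join_values() by splitting on the
# literal delimiter.
split_values <- function(joined, delim = "|") {
  strsplit(joined, delim, fixed = TRUE)[[1]]
}

files <- c("fileA.txt", "fileB.txt", "fileC.txt")
joined <- join_values(files)
restored <- split_values(joined)

print(joined)
print(restored)
```

The same check could be called inside insert_delimiter() before its paste0() step, or the delimiter could be made a parameter so that a different character can be chosen per script.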

Mitrailleuse answered 13/9, 2022 at 19:49 Comment(0)
