Parse a command line string into flags and arguments in Golang

I'm looking for a package that would take a string such as -v --format "some example" -i test and parse it into a slice of strings, handling quotes, spaces, etc. properly:

-v
--format
some example
-i
test

I've checked the built-in flag package as well as other flag-handling packages on GitHub, but none of them seem to handle this particular case of parsing a raw string into tokens. Before trying to do it myself, I'd rather look for a package, as I'm sure there are a lot of special cases to handle.

Any suggestions?

Mayda answered 6/12, 2015 at 14:53 Comment(7)
This is a mixture of what the shell does (the quoted string grouping) and the particular behavior of GNU option parsing tools (the interspersed args and flags, and different long/short flag format). I'm not aware of anyone combining these into a package.Amalea
A quick hack (if you are flexible about how you provide the input) would be the following (please see the flag package's documentation): set := &flag.FlagSet{}; v := set.Bool("v", false, ""); format := set.String("format", "", ""); i := set.String("i", "", ""); set.Parse([]string{"-v", "--format=some example", "-i=test"}); for k, v := range set.Args() { log.Println(k, v) }; log.Printf("v=%v format=%v i=%v", *v, *format, *i)Bessbessarabia
I'm not sure if I understand. Are you planning on doing something like: cmd -v --format "some example" -i test? If so, you could just grab all of the arguments from os.Args.Punchdrunk
I don't know how that question could have attracted "opinionated answers and spam", as it's a specific problem, which would require a specific answer. Anyway I couldn't find any package so I've ended up doing it myself. Solution is there: github.com/laurent22/massren/blob/…Mayda
@this.lau_ take a look at go-getoptions, I needed extra flexibility. Using opt.SetUnknownMode("pass") will leave things as you want them in the remaining slice.Vaccaro
@this.lau_: Super useful for me. Thanks for posting a link to your codebase for massren :]Serai
@this.lau_ also useful for me, thanks! Though I modified example a bit to iterate over runes in string (not bytes) to handle unicodeSerriform
M
11

For information, this is the function I've ended up creating.

It splits a command into its arguments. For example, cat -v "some file.txt" will return ["cat", "-v", "some file.txt"].

It also correctly handles escaped characters, spaces in particular. So cat -v some\ file.txt will also correctly be split into ["cat", "-v", "some file.txt"]

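import "fmt"

// parseCommandLine splits a command string into arguments, honoring
// single quotes, double quotes and backslash escapes.
func parseCommandLine(command string) ([]string, error) {
    var args []string
    state := "start"
    current := ""
    quote := "\""
    escapeNext := false // set when the previous character was a backslash
    for i := 0; i < len(command); i++ {
        c := command[i]

        // Inside quotes: copy everything up to the matching closing quote.
        if state == "quotes" {
            if string(c) != quote {
                current += string(c)
            } else {
                args = append(args, current)
                current = ""
                state = "start"
            }
            continue
        }

        // Escaped character: take it literally and treat it as part of an argument.
        if escapeNext {
            current += string(c)
            escapeNext = false
            state = "arg"
            continue
        }

        if c == '\\' {
            escapeNext = true
            continue
        }

        // Opening quote: remember which quote character closes it.
        if c == '"' || c == '\'' {
            state = "quotes"
            quote = string(c)
            continue
        }

        // Inside an unquoted argument: whitespace ends it.
        if state == "arg" {
            if c == ' ' || c == '\t' {
                args = append(args, current)
                current = ""
                state = "start"
            } else {
                current += string(c)
            }
            continue
        }

        // Between arguments: skip whitespace, start a new argument otherwise.
        if c != ' ' && c != '\t' {
            state = "arg"
            current += string(c)
        }
    }

    if state == "quotes" {
        return nil, fmt.Errorf("unclosed quote in command line: %s", command)
    }

    if current != "" {
        args = append(args, current)
    }

    return args, nil
}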
Mayda answered 27/10, 2017 at 11:5 Comment(1)
this will make an error if command starts with "Greengage
C
16

Looks similar to shlex:

import "github.com/google/shlex"
args, err := shlex.Split("one \"two three\" four") // args is []string{"one", "two three", "four"}
Corody answered 13/6, 2020 at 4:53 Comment(1)
Should be the official answer.Haerr
M
5

If the args were passed to your program on the command line then the shell should handle this and os.Args will be populated correctly. For example, in your case os.Args[1:] will equal

[]string{"-v", "--format", "some example", "-i", "test"}

If you just have the string though, for some reason, and you'd like to mimic what the shell would do with it, then I recommend a package like https://github.com/kballard/go-shellquote
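
Once you have the tokens as a slice, one way to turn them into flag values is the standard flag package. This is only a minimal sketch, assuming the three flags from the question; the options struct and the parseTokens function are names made up for illustration:

```go
package main

import (
	"flag"
	"fmt"
)

// options holds the values of the flags used in the question (hypothetical struct).
type options struct {
	verbose bool
	format  string
	input   string
	rest    []string // arguments left over after the flags
}

// parseTokens runs an already tokenized command line through a FlagSet.
func parseTokens(tokens []string) (options, error) {
	var o options
	fs := flag.NewFlagSet("demo", flag.ContinueOnError)
	fs.BoolVar(&o.verbose, "v", false, "verbose output")
	fs.StringVar(&o.format, "format", "", "output format")
	fs.StringVar(&o.input, "i", "", "input name")
	if err := fs.Parse(tokens); err != nil {
		return o, err
	}
	o.rest = fs.Args()
	return o, nil
}

func main() {
	o, err := parseTokens([]string{"-v", "--format", "some example", "-i", "test"})
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("v=%v format=%q i=%q rest=%q\n", o.verbose, o.format, o.input, o.rest)
}
```

Note that the flag package stops parsing at the first non-flag argument, so it does not support flags interspersed with positional arguments the way GNU tools do.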

Marrin answered 27/5, 2018 at 3:57 Comment(0)
B
2

@laurent's answer is wonderful, but it doesn't work when the command includes multi-byte UTF-8 characters.

It fails the third test:

import (
    "reflect"
    "testing"
)

func TestParseCommandLine(t *testing.T) {
    tests := []struct {
        name  string
        input string
        want  []string
    }{
        {
            "normal",
            "hello world",
            []string{"hello", "world"},
        },
        {
            "quote",
            "hello \"world hello\"",
            []string{"hello", "world hello"},
        },
        {
            "utf-8",
            "hello 世界",
            []string{"hello", "世界"},
        },
        {
            "space",
            "hello\\ world",
            []string{"hello world"},
        },
    }
    for _, tt := range tests {
        t.Run(tt.name, func(t *testing.T) {
            got, err := parseCommandLine(tt.input)
            if err != nil {
                t.Fatalf("unexpected error: %v", err)
            }
            if !reflect.DeepEqual(got, tt.want) {
                t.Errorf("expect %v, got %v", tt.want, got)
            }
        })
    }
}

Based on that answer, I wrote this function, which works correctly with UTF-8. The only change is replacing for i := 0; i < len(command); i++ { c := command[i] with for _, c := range command, so the loop iterates over runes instead of bytes.
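
The difference between the two loops can be seen in isolation. This is just an illustrative sketch; byBytes and byRunes are hypothetical helpers that rebuild a string the same way each version of the parser accumulates characters:

```go
package main

import "fmt"

// byBytes rebuilds s one byte at a time, like the index-based loop.
// Converting a single byte to a string treats it as its own code point,
// so multi-byte characters come out garbled.
func byBytes(s string) string {
	out := ""
	for i := 0; i < len(s); i++ {
		out += string(rune(s[i]))
	}
	return out
}

// byRunes rebuilds s one rune at a time, like the range-based loop.
func byRunes(s string) string {
	out := ""
	for _, r := range s {
		out += string(r)
	}
	return out
}

func main() {
	s := "世界"
	fmt.Println(byBytes(s) == s) // false: each byte was re-encoded separately
	fmt.Println(byRunes(s) == s) // true: runes round-trip unchanged
}
```

For plain ASCII input both loops behave the same, which is why the first two tests pass with either version.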

Here is my version:

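import "fmt"

// parseCommandLine splits a command string into arguments, honoring
// single quotes, double quotes and backslash escapes. It iterates over
// runes, so multi-byte UTF-8 characters are preserved.
func parseCommandLine(command string) ([]string, error) {
    var args []string
    state := "start"
    current := ""
    quote := "\""
    escapeNext := false // set when the previous character was a backslash
    for _, c := range command {

        // Inside quotes: copy everything up to the matching closing quote.
        if state == "quotes" {
            if string(c) != quote {
                current += string(c)
            } else {
                args = append(args, current)
                current = ""
                state = "start"
            }
            continue
        }

        // Escaped character: take it literally and treat it as part of an argument.
        if escapeNext {
            current += string(c)
            escapeNext = false
            state = "arg"
            continue
        }

        if c == '\\' {
            escapeNext = true
            continue
        }

        // Opening quote: remember which quote character closes it.
        if c == '"' || c == '\'' {
            state = "quotes"
            quote = string(c)
            continue
        }

        // Inside an unquoted argument: whitespace ends it.
        if state == "arg" {
            if c == ' ' || c == '\t' {
                args = append(args, current)
                current = ""
                state = "start"
            } else {
                current += string(c)
            }
            continue
        }

        // Between arguments: skip whitespace, start a new argument otherwise.
        if c != ' ' && c != '\t' {
            state = "arg"
            current += string(c)
        }
    }

    if state == "quotes" {
        return nil, fmt.Errorf("unclosed quote in command line: %s", command)
    }

    if current != "" {
        args = append(args, current)
    }

    return args, nil
}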
Bordeaux answered 26/9, 2021 at 12:46 Comment(3)
brave but not convincing.Glialentn
@ЯрославРахматуллин should I make some further updates for making it more convincing?Bordeaux
No, it makes enough sense the way it is and is clearly an improvement, for which you deserve +1. Anyway, I think I would update the other answerGlialentn
C
0

hedzr/cmdr might be a good fit. It's a getopt-like command-line parser that is lightweight and offers both a fluent API and a classical style.

Chemiluminescence answered 30/5, 2019 at 8:39 Comment(0)
W
0

I know this is an old question, but it might still be relevant. What about using a regex? It is quite simple and might be enough for most cases:

r := regexp.MustCompile(`\"[^\"]+\"|\S+`)
m := r.FindAllString(`-v --format "some example" -i test`, -1)
fmt.Printf("%q", m)
// Prints out ["-v" "--format" "\"some example\"" "-i" "test"]

You can try https://go.dev/play/p/1K0MlsOUzQI

Edit:

To also treat test\ abc as a single entry, use this regex: \"[^\"]+\"|\S+\\\s\S+|\S+
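
Since the first pattern keeps the surrounding quotes in the matched tokens, one possible way to drop them is to use capture groups and pick whichever group matched. This is only a sketch with a slightly different pattern from the one above; tokenRe and splitUnquoted are names invented for illustration, and it does not handle backslash escapes or quotes that start in the middle of a token:

```go
package main

import (
	"fmt"
	"regexp"
)

// tokenRe matches either a double-quoted run (group 1 holds the content
// without the quotes) or a run of non-space characters (group 2).
var tokenRe = regexp.MustCompile(`"([^"]*)"|(\S+)`)

// splitUnquoted returns the tokens of s with surrounding double quotes removed.
func splitUnquoted(s string) []string {
	var out []string
	for _, m := range tokenRe.FindAllStringSubmatch(s, -1) {
		if m[2] != "" {
			out = append(out, m[2]) // bare token
		} else {
			out = append(out, m[1]) // quoted token, possibly empty
		}
	}
	return out
}

func main() {
	fmt.Printf("%q\n", splitUnquoted(`-v --format "some example" -i test`))
	// Prints ["-v" "--format" "some example" "-i" "test"]
}
```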

Woodhead answered 6/12, 2021 at 15:8 Comment(1)
I wouldn't recommend rolling your own arg parsing manually....Gemology
