Dynamically parse yaml field to one of a finite set of structs in Go

I have a YAML file in which one field can be represented by one of several kinds of structs. To keep the code and YAML simple, let's say I have these two YAML files:

    kind: "foo"
    spec:
      fooVal: 4

    kind: "bar"
    spec:
      barVal: 5

And these structs for parsing:

    type Spec struct {
        Kind string      `yaml:"kind"`
        Spec interface{} `yaml:"spec"`
    }
    type Foo struct {
        FooVal int `yaml:"fooVal"`
    }
    type Bar struct {
        BarVal int `yaml:"barVal"`
    }

I know that I can use map[string]interface{} as the type of the Spec field. But the real example is more complex and involves more possible struct types than just Foo and Bar, which is why I would rather not parse spec into a generic map.

I've found a workaround for this: unmarshal the YAML into an intermediate struct, check the kind field, marshal the map[string]interface{} value back into YAML, and unmarshal that into the concrete type:

    var spec Spec
    if err := yaml.Unmarshal([]byte(src), &spec); err != nil {
        panic(err)
    }
    // Re-encode the generic spec value so it can be decoded into a concrete type.
    tmp, err := yaml.Marshal(spec.Spec)
    if err != nil {
        panic(err)
    }
    switch spec.Kind {
    case "foo":
        var foo Foo
        if err := yaml.Unmarshal(tmp, &foo); err != nil {
            panic(err)
        }
        fmt.Printf("foo value is %d\n", foo.FooVal)
    case "bar":
        var bar Bar
        if err := yaml.Unmarshal(tmp, &bar); err != nil {
            panic(err)
        }
        fmt.Printf("bar value is %d\n", bar.BarVal)
    }

But this requires an additional step and consumes more memory (the real YAML file could be much bigger than these examples). Is there a more elegant way to unmarshal YAML dynamically into one of a finite set of structs?

Update: I'm using the github.com/go-yaml/yaml v2.1.0 YAML parser.

Armhole asked 19/3, 2021 at 14:24
Comments (2):
Which yaml package and version of the package are you using? (Mastic)
@Mastic sorry, updated the question (Armhole)

For use with yaml.v2 you can do the following:

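// yamlNode captures the unmarshal callback for a field so that decoding
// can be deferred until the concrete target type is known.
type yamlNode struct {
    unmarshal func(interface{}) error
}

func (n *yamlNode) UnmarshalYAML(unmarshal func(interface{}) error) error {
    n.unmarshal = unmarshal
    return nil
}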

type Spec struct {
    Kind string      `yaml:"kind"`
    Spec interface{} `yaml:"-"`
}
func (s *Spec) UnmarshalYAML(unmarshal func(interface{}) error) error {
    type S Spec
    type T struct {
        S    `yaml:",inline"`
        Spec yamlNode `yaml:"spec"`
    }

    obj := &T{}
    if err := unmarshal(obj); err != nil {
        return err
    }
    *s = Spec(obj.S)

    switch s.Kind {
    case "foo":
        s.Spec = new(Foo)
    case "bar":
        s.Spec = new(Bar)
    default:
        // Report bad input as an error instead of crashing the caller.
        return fmt.Errorf("unknown kind %q", s.Kind)
    }
    return obj.Spec.unmarshal(s.Spec)
}

https://play.golang.org/p/Ov0cOaedb-x


For use with yaml.v3 you can do the following:

type Spec struct {
    Kind string      `yaml:"kind"`
    Spec interface{} `yaml:"-"`
}
func (s *Spec) UnmarshalYAML(n *yaml.Node) error {
    type S Spec
    type T struct {
        *S   `yaml:",inline"`
        Spec yaml.Node `yaml:"spec"`
    }

    obj := &T{S: (*S)(s)}
    if err := n.Decode(obj); err != nil {
        return err
    }

    switch s.Kind {
    case "foo":
        s.Spec = new(Foo)
    case "bar":
        s.Spec = new(Bar)
    default:
        // Report bad input as an error instead of crashing the caller.
        return fmt.Errorf("unknown kind %q", s.Kind)
    }
    return obj.Spec.Decode(s.Spec)
}

https://play.golang.org/p/ryEuHyU-M2Z
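
Both variants list the known kinds in a switch. If the set of kinds grows, as the question anticipates, one option is to move that mapping into a lookup table. The following is only a sketch of that idea: `registry` and `decodeSpec` are hypothetical names, not part of any yaml package, and the decoder passed in `main` is a hand-written stand-in so the example runs with the standard library alone. The `decode` parameter has the same shape as the callback yaml.v2 hands to UnmarshalYAML.

```go
package main

import "fmt"

// Foo and Bar mirror the structs from the question.
type Foo struct{ FooVal int }
type Bar struct{ BarVal int }

// registry maps each kind to a constructor for its concrete type.
// Hypothetical helper, not part of any yaml package.
var registry = map[string]func() interface{}{
	"foo": func() interface{} { return new(Foo) },
	"bar": func() interface{} { return new(Bar) },
}

// decodeSpec allocates the struct registered for kind and passes it to decode.
// Unknown kinds are reported as an error.
func decodeSpec(kind string, decode func(interface{}) error) (interface{}, error) {
	newSpec, ok := registry[kind]
	if !ok {
		return nil, fmt.Errorf("unknown kind %q", kind)
	}
	spec := newSpec()
	if err := decode(spec); err != nil {
		return nil, err
	}
	return spec, nil
}

func main() {
	// Stand-in decoder: fills the target by hand instead of parsing YAML.
	fake := func(out interface{}) error {
		if f, ok := out.(*Foo); ok {
			f.FooVal = 4
		}
		return nil
	}

	spec, err := decodeSpec("foo", fake)
	fmt.Printf("%T %v\n", spec, err) // *main.Foo <nil>

	_, err = decodeSpec("baz", fake)
	fmt.Println(err) // unknown kind "baz"
}
```

Inside the UnmarshalYAML methods above, the switch could then become a call such as `decodeSpec(s.Kind, obj.Spec.unmarshal)` for yaml.v2 or `decodeSpec(s.Kind, obj.Spec.Decode)` for yaml.v3, assuming the helper is adopted as written.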

Mastic answered 19/3, 2021 at 18:11 Comment(0)

You can do this by implementing a custom UnmarshalYAML func. However, with the v2 version of the API, you would basically do the same thing as you do now and just encapsulate it a bit better.

If you switch to using the v3 API however, you get a better UnmarshalYAML that actually lets you work on the parsed YAML node before it is processed into a native Go type. Here's how that looks:

package main

import (
    "errors"
    "fmt"
    "gopkg.in/yaml.v3"
)

type Spec struct {
    Kind string      `yaml:"kind"`
    Spec interface{} `yaml:"spec"`
}
type Foo struct {
    FooVal int `yaml:"fooVal"`
}
type Bar struct {
    BarVal int `yaml:"barVal"`
}

func (s *Spec) UnmarshalYAML(value *yaml.Node) error {
    s.Kind = ""
    var specNode *yaml.Node
    // A mapping node stores keys and values alternately in Content,
    // so step through it in pairs.
    for i := 0; i+1 < len(value.Content); i += 2 {
        key, val := value.Content[i], value.Content[i+1]
        if key.Kind != yaml.ScalarNode {
            continue
        }
        switch key.Value {
        case "kind":
            if val.Kind != yaml.ScalarNode {
                return errors.New("kind is not a scalar")
            }
            s.Kind = val.Value
        case "spec":
            specNode = val
        }
    }
    if s.Kind == "" {
        return errors.New("missing field `kind`")
    }
    if specNode == nil {
        return errors.New("missing field `spec`")
    }
    // Decode only the spec node into the concrete type selected by kind.
    switch s.Kind {
    case "foo":
        var foo Foo
        if err := specNode.Decode(&foo); err != nil {
            return err
        }
        s.Spec = foo
    case "bar":
        var bar Bar
        if err := specNode.Decode(&bar); err != nil {
            return err
        }
        s.Spec = bar
    default:
        return errors.New("unknown kind: " + s.Kind)
    }
    return nil
}

var input1 = []byte(`
kind: "foo"
spec:
  fooVal: 4
`)

var input2 = []byte(`
kind: "bar"
spec:
  barVal: 5
`)

func main() {
    var s1, s2 Spec
    if err := yaml.Unmarshal(input1, &s1); err != nil {
        panic(err)
    }
    fmt.Printf("Type of spec from input1: %T\n", s1.Spec)
    if err := yaml.Unmarshal(input2, &s2); err != nil {
        panic(err)
    }
    fmt.Printf("Type of spec from input2: %T\n", s2.Spec)
}
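
The UnmarshalYAML method above depends on how a mapping node lays out its children: keys and values alternate in Content, so a key at index i has its value at index i+1. Below is a minimal sketch of that lookup in isolation. The `node` type and the `lookup` function are hypothetical stand-ins, not the real yaml.Node API, chosen so the example runs with the standard library only.

```go
package main

import "fmt"

// node is a simplified, hypothetical stand-in for yaml.v3's yaml.Node.
// For a mapping, Content holds key, value, key, value, ...
type node struct {
	Value   string
	Content []*node
}

// lookup returns the value node stored under key, or nil if the key is absent.
func lookup(mapping *node, key string) *node {
	// Step in pairs; i+1 must stay in range so the value index is valid.
	for i := 0; i+1 < len(mapping.Content); i += 2 {
		if mapping.Content[i].Value == key {
			return mapping.Content[i+1]
		}
	}
	return nil
}

func main() {
	// Equivalent of a document where spec comes before kind.
	doc := &node{Content: []*node{
		{Value: "spec"}, {Content: []*node{{Value: "fooVal"}, {Value: "4"}}},
		{Value: "kind"}, {Value: "foo"},
	}}

	fmt.Println(lookup(doc, "kind").Value)                   // foo
	fmt.Println(lookup(lookup(doc, "spec"), "fooVal").Value) // 4
	fmt.Println(lookup(doc, "missing") == nil)               // true
}
```

Bounding the loop by `i+1 < len(Content)` rather than by half the length means keys are found regardless of their position in the mapping.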

I suggest looking into the possibility of using YAML tags instead of your current structure to model this in your YAML; tags have been designed exactly for this purpose. Instead of the current YAML

kind: "foo"
spec:
  fooVal: 4

you could write

--- !foo
fooVal: 4

Now you don't need the describing structure with kind and spec anymore. Loading this would look a bit different as you'd need a wrapping root type you can define UnmarshalYAML on, but it may be feasible if this is just a part of a larger structure. You can access the tag !foo in the yaml.Node's Tag field.

Geographical answered 19/3, 2021 at 15:46 Comment(0)
