What would be the best approach to converting protoc generated structs from bson structs?
I'm writing a RESTful API in Golang, which also has a gRPC api. The API connects to a MongoDB database, and uses structs to map out entities. I also have a .proto definition which matches like for like the struct I'm using for MongoDB.

I just wondered if there was a way to share or re-use the .proto defined code for the MongoDB calls as well. I've noticed the structs protoc generates have json tags for each field, but obviously there aren't bson tags etc.

I have something like...

// Menu -
type Menu struct {
    ID          bson.ObjectId      `json:"id" bson:"_id"`
    Name        string             `json:"name" bson:"name"`
    Description string             `json:"description" bson:"description"`
    Mixers      []mixers.Mixer     `json:"mixers" bson:"mixers"`
    Sections    []sections.Section `json:"sections" bson:"sections"`
}

But then I also have protoc generated code...

type Menu struct {
    Id          string     `protobuf:"bytes,1,opt,name=id" json:"id,omitempty"`
    Name        string     `protobuf:"bytes,2,opt,name=name" json:"name,omitempty"`
    Description string     `protobuf:"bytes,3,opt,name=description" json:"description,omitempty"`
    Mixers      []*Mixer   `protobuf:"bytes,4,rep,name=mixers" json:"mixers,omitempty"`
    Sections    []*Section `protobuf:"bytes,5,rep,name=sections" json:"sections,omitempty"`
}

Currently I'm having to convert between the two structs depending on what I'm doing, which is tedious and probably a considerable performance hit. So is there a better way of converting between the two, or of re-using one of them for both tasks?

Sapienza answered 17/7, 2017 at 22:49 Comment(8)
It might be possible to just manually add the bson tags. Have you tried it as a test? If it works, you could probably write a script to take care of it from then on.Eckardt
With the bson.ObjectId, you could put both in the struct (or embed), then just make sure when you retrieve one from either source, you populate the empty one. I suppose that still exposes some tedious work, but not as much as converting the entire struct.Eckardt
Trouble is, I was planning on automating the code generation on build or something, so it would just override it. I guess I could just not do that and manually update it, but it feels like there should be a standard way of doing this. Surely loads of people are spitting out mongodb queries into gRPC in Golang? Embedding the ID could work actually! Still tricky as you mentioned thoughSapienza
You can have a look at gogoprotobuf's extension moretags. I used it for this very use case and it works fine.Nimiety
@MarkusWMahlberg how did you deal with the ID parameter naming mismatch (Id string and ID bson.ObjectId) ?Sapienza
Hey ! So how did you handle it?Hundredfold
@Hundredfold have you found any good solution?Achlamydeous
The best idea for me is just to create different objects for the MongoDB use. In the protobufs you define messages that serve as the communication interface. To me, it's important to separate every layer: the communication and the stored data should be different structures. Even if they are composed of the same data, they don't have the same roles :) That's my norm when I design any architecture or when I code: I try to separate as much as I can, and I write a lot of mappers ...Hundredfold
T
2

Having lived with this same issue, I can say there are a couple of ways to solve it. They fall into two general approaches:

  1. Use the same data type
  2. Use two different struct types and map between them

If you want to use the same data type, you'll have to modify the code generation:

You can use something like gogoprotobuf which has an extension to add tags. This should give you bson tags in your structs.

You could also post-process your generated files, either with regular expressions or something more complicated involving the go abstract syntax tree.
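As a rough illustration of the regular-expression route, the sketch below appends a bson tag after every json tag it finds in a line of generated source. The helper name addBSONTags is made up for this example, and it assumes the bson name should simply mirror the json name (so an _id mapping would still need special handling). It is also not idempotent: running it twice adds the tag twice.

```go
package main

import (
	"fmt"
	"regexp"
)

// jsonTag matches a json struct tag such as json:"name,omitempty".
// Group 1 is the field name, group 2 the optional ",omitempty" style suffix.
var jsonTag = regexp.MustCompile(`json:"([^",]+)(,[^"]*)?"`)

// addBSONTags (hypothetical helper) appends a bson tag that mirrors each
// json tag found in src.
func addBSONTags(src string) string {
	return jsonTag.ReplaceAllString(src, `${0} bson:"${1}${2}"`)
}

func main() {
	line := "Name string `protobuf:\"bytes,2,opt,name=name\" json:\"name,omitempty\"`"
	fmt.Println(addBSONTags(line))
}
```

A version built on the go/ast package would be more robust against unusual formatting, at the cost of more code.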

If you choose to map between them:

  1. Use reflection. You can write a package that will take two structs and try to take the values from one and apply it to another. You'll have to deal with edge cases (slight naming differences, which types are equivalent, etc), but you'll have better control over edge cases if they ever come up.

  2. Use JSON as an intermediary. As long as the generated json tags match, this will be a quick coding exercise and the performance hit of serializing and deserializing might be acceptable if this isn't in a tight loop in your code.

  3. Hand-write or codegen mapping functions. Depending on how many structs you have, you could write out a bunch of functions that translate between the two.
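To make the JSON-intermediary option concrete, here is a minimal sketch using only the standard library. The dbMenu and pbMenu types are simplified stand-ins for the two Menu structs in the question (the ID is a plain string here, which is an assumption), and convert is a hypothetical helper name. Because both structs share the json tag "id", the Id/ID naming mismatch takes care of itself.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// dbMenu stands in for the hand-written MongoDB struct.
type dbMenu struct {
	ID   string `json:"id" bson:"_id"`
	Name string `json:"name" bson:"name"`
}

// pbMenu stands in for the protoc-generated struct.
type pbMenu struct {
	Id   string `json:"id,omitempty"`
	Name string `json:"name,omitempty"`
}

// convert copies src into dst by encoding src as JSON and decoding the
// result into dst. dst must be a pointer.
func convert(src, dst interface{}) error {
	data, err := json.Marshal(src)
	if err != nil {
		return err
	}
	return json.Unmarshal(data, dst)
}

func main() {
	var out pbMenu
	if err := convert(dbMenu{ID: "abc", Name: "Lunch"}, &out); err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", out)
}
```

Fields whose json tags differ between the two structs are silently dropped, so this approach works best when the tags are kept in sync.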

At my workplace, we ended up doing a bit of all of them: forking the protoc generator to do some custom tags, a reflection based structs overlay package for mapping between arbitrary structs, and some hand-written ones in more performance sensitive or less automatable mappings.
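This is not the overlay package mentioned above, only a small sketch of the reflection idea. It copies exported fields whose names match case-insensitively (so ID and Id line up) and whose types are identical. The function name copyFields and the source/target types are invented for the example, and nested structs, slices of different element types, and pointers are deliberately left out.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

type source struct {
	ID    string
	Name  string
	Count int
}

type target struct {
	Id    string
	Name  string
	Count int64 // different type on purpose: it will be skipped
}

// copyFields copies exported fields from src (a struct) into dst (a pointer
// to a struct) when names match case-insensitively and types are identical.
func copyFields(dst, src interface{}) {
	d := reflect.ValueOf(dst).Elem()
	s := reflect.ValueOf(src)
	for i := 0; i < s.NumField(); i++ {
		sf := s.Type().Field(i)
		if sf.PkgPath != "" { // unexported field
			continue
		}
		df := d.FieldByNameFunc(func(n string) bool {
			return strings.EqualFold(n, sf.Name)
		})
		if df.IsValid() && df.CanSet() && df.Type() == sf.Type {
			df.Set(s.Field(i))
		}
	}
}

func main() {
	var out target
	copyFields(&out, source{ID: "42", Name: "Lunch", Count: 3})
	fmt.Printf("%+v\n", out)
}
```

Skipping mismatched types silently keeps the sketch short; a real package would probably report them or apply conversion rules.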

Thinia answered 2/7, 2020 at 16:24 Comment(0)
F
2

I have played with it and have a working example with:

github.com/gogo/protobuf v1.3.1
go.mongodb.org/mongo-driver v1.4.0
google.golang.org/grpc v1.31.0

First of all I would like to share my proto/contract/example.proto file:

syntax = "proto2";

package protobson;

import "gogoproto/gogo.proto";

option (gogoproto.sizer_all) = true;
option (gogoproto.marshaler_all) = true;
option (gogoproto.unmarshaler_all) =  true;
option go_package = "gitlab.com/8bitlife/proto/go/protobson";

service Service {
    rpc SayHi(Hi) returns (Hi) {}
}

message Hi {
    required bytes id = 1 [(gogoproto.customtype) = "gitlab.com/8bitlife/protobson/custom.BSONObjectID", (gogoproto.nullable) = false, (gogoproto.moretags) = "bson:\"_id\""] ;
    required int64 limit = 2  [(gogoproto.nullable) = false, (gogoproto.moretags) = "bson:\"limit\""] ;
}

It contains a simple gRPC service Service that has a SayHi method with request type Hi. It sets several options: gogoproto.sizer_all, gogoproto.marshaler_all, gogoproto.unmarshaler_all. Their meaning is described on the gogoprotobuf extensions page. The Hi message itself contains two fields:

  1. id that has additional options specified: gogoproto.customtype and gogoproto.moretags
  2. limit with only gogoproto.moretags option

BSONObjectID used in gogoproto.customtype for id field is a custom type that I defined as custom/objectid.go:

package custom

import (
    "go.mongodb.org/mongo-driver/bson/bsontype"
    "go.mongodb.org/mongo-driver/bson/primitive"
)

type BSONObjectID primitive.ObjectID

func (u BSONObjectID) Marshal() ([]byte, error) {
    return u[:], nil
}

func (u BSONObjectID) MarshalTo(data []byte) (int, error) {
    return copy(data, (u)[:]), nil
}

func (u *BSONObjectID) Unmarshal(d []byte) error {
    copy((*u)[:], d)
    return nil
}

func (u *BSONObjectID) Size() int {
    return len(*u)
}

func (u *BSONObjectID) UnmarshalBSONValue(t bsontype.Type, d []byte) error {
    copy(u[:], d)
    return nil
}

func (u BSONObjectID) MarshalBSONValue() (bsontype.Type, []byte, error) {
    return bsontype.ObjectID, u[:], nil
}

It is needed because we have to define custom marshaling and unmarshaling methods for both protocol buffers and the MongoDB driver. This allows us to use this type as an object identifier in MongoDB. To "explain" it to the MongoDB driver, I marked it with a bson tag by using the (gogoproto.moretags) = "bson:\"_id\"" option in the proto file.
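The protobuf half of that contract can be tried without any dependencies. The sketch below uses a local objectID type as a stand-in for primitive.ObjectID (which is likewise a 12-byte array) and shows the byte round trip a protobuf bytes field goes through. Unlike the methods above, this Unmarshal rejects input of the wrong length; that check is an addition for illustration, not part of the original answer.

```go
package main

import (
	"errors"
	"fmt"
)

// objectID is a stand-in for primitive.ObjectID, which is also a [12]byte.
type objectID [12]byte

// Marshal returns the raw 12 bytes, the form a protobuf bytes field carries.
func (id objectID) Marshal() ([]byte, error) {
	return id[:], nil
}

// Unmarshal fills the ID from raw bytes and rejects any other length.
func (id *objectID) Unmarshal(d []byte) error {
	if len(d) != len(id) {
		return errors.New("objectID: expected 12 bytes")
	}
	copy(id[:], d)
	return nil
}

func main() {
	in := objectID{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12}
	raw, _ := in.Marshal()

	var out objectID
	if err := out.Unmarshal(raw); err != nil {
		panic(err)
	}
	fmt.Println(in == out)
}
```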

To generate source code from the proto file I used:

protoc \
    --plugin=/Users/pstrokov/go/bin/protoc-gen-gogo \
    --plugin=/Users/pstrokov/go/bin/protoc-gen-go \
    -I=/Users/pstrokov/Workspace/protobson/proto/contract \
    -I=/Users/pstrokov/go/pkg/mod/github.com/gogo/protobuf@v1.3.1 \
    --gogo_out=plugins=grpc:. \
    example.proto

I have tested it on my MacOS with running MongoDB instance: docker run --name mongo -d -p 27017:27017 mongo:

package main

import (
    "context"
    "log"
    "net"
    "time"

    "gitlab.com/8bitlife/protobson/gitlab.com/8bitlife/proto/go/protobson"
    "go.mongodb.org/mongo-driver/bson"
    "go.mongodb.org/mongo-driver/mongo"
    "go.mongodb.org/mongo-driver/mongo/options"
    "google.golang.org/grpc"
)

type hiServer struct {
    mgoClient *mongo.Client
}

func (s *hiServer) SayHi(ctx context.Context, hi *protobson.Hi) (*protobson.Hi, error) {
    collection := s.mgoClient.Database("local").Collection("bonjourno")
    res, err := collection.InsertOne(ctx, bson.M{"limit": hi.Limit})
    if err != nil { panic(err) }
    log.Println("generated _id", res.InsertedID)

    out := &protobson.Hi{}
    if err := collection.FindOne(ctx, bson.M{"_id": res.InsertedID}).Decode(out); err != nil { return nil, err }
    log.Println("found", out.String())
    return out, nil
}

func main() {
    ctx, cancel := context.WithCancel(context.Background())
    defer cancel()
    lis, err := net.Listen("tcp", "localhost:0")
    if err != nil { log.Fatalf("failed to listen: %v", err) }
    clientOptions := options.Client().ApplyURI("mongodb://localhost:27017")
    clientOptions.SetServerSelectionTimeout(time.Second)
    client, err := mongo.Connect(ctx, clientOptions)
    if err != nil { log.Fatal(err) }
    if err := client.Ping(ctx, nil); err != nil { log.Fatal(err) }
    grpcServer := grpc.NewServer()
    protobson.RegisterServiceServer(grpcServer, &hiServer{mgoClient: client})
    go grpcServer.Serve(lis); defer grpcServer.Stop()
    conn, err := grpc.Dial(lis.Addr().String(), grpc.WithInsecure())
    if err != nil { log.Fatal(err) }; defer conn.Close()
    hiClient := protobson.NewServiceClient(conn)
    response, err := hiClient.SayHi(ctx, &protobson.Hi{Limit: 99})
    if err != nil { log.Fatal(err) }
    if response.Limit != 99 { log.Fatal("unexpected limit", response.Limit) }
    if response.Id.Size() == 0 { log.Fatal("expected a valid ID of the new entity") }
    log.Println(response.String())
}

Sorry for the formatting of the last code snippet :) I hope this can help.

Frangipani answered 1/8, 2020 at 14:34 Comment(1)
why exactly did you need to make sure there are custom BSON marshaling and un-marshaling methods? making sure there are bson tags is not enough? why?Acidfast
A
1

I'm in the process of testing and may provide code shortly, (ping me if you don't see it and you want it) but https://godoc.org/go.mongodb.org/mongo-driver/bson/bsoncodec looks like the ticket. protoc will make your structs and you don't have to mess with customizing them. Then you can customize the mongo-driver to do the mapping of certain types for you and it looks like their library for this is pretty good.

This is great because if I use the protoc-generated structs, I'd like to keep them in my application core / domain layer. I don't want to be concerned about MongoDB compatibility over there.

So right now, it seems to me that the statement in @Liyan Chang's answer,

"If you want to use the same data type, you'll have to modify the code generation",

doesn't necessarily have to be the case, because you can still opt to use one data type.

You can use one generated type and account for seemingly whatever you need to in terms of getting and setting data to the DB with this codec system.

See https://mcmap.net/q/826119/-necessity-of-bson-struct-tag-when-using-with-mongodb-go-client - the bson struct tags are not the be-all and end-all. It looks like a codec can help with this.

See https://mcmap.net/q/826120/-how-to-ignore-nulls-while-unmarshalling-a-mongodb-document for a nice write-up about codecs in general.

Please keep in mind these codecs were released in 1.3 of the mongodb go driver. I found this which directed me there: https://developer.mongodb.com/community/forums/t/mgo-setbson-to-mongo-golang-driver/2340/2?u=yehuda_makarov

Acidfast answered 2/12, 2020 at 16:2 Comment(2)
Hey, were you able to get it working? And if so, can you provide some sample code for it?Darbie
@Darbie hey, no. I got a little burnt out on the idea, and looked into postgres/mapping. Then found out some people prefer to have mappers for moving from protobuf to the data layer. And never got back to this.Acidfast
M
1

Note: As of May 2023 there are multiple contradicting/outdated answers and protobuf 3 is the latest version. After a lot of digging I came up with this:

  • brew install protobuf - First we need to install protoc, proto compiler.
  • go install google.golang.org/protobuf/cmd/protoc-gen-go@latest - Installs protoc-gen-go globally. This is a plugin for the Google protocol buffer compiler to generate Go code.
  • go install github.com/favadi/protoc-go-inject-tag@latest - Extension that can add any custom tags on the generated go structs. We need bson.
  • Update your .bashrc/.zshrc file (Mac/Linux). PATH needs to include the Go bin directories; otherwise protoc won't be able to find the protoc-gen-go plugin.
export PATH=~/flutter/bin:$PATH
export PATH=~/.local/bin/:$PATH
export LANG=en_US.UTF-8
export GOROOT=/usr/local/go
export GOPATH=$HOME/go
export GOBIN=$GOPATH/bin
export PATH=$PATH:$GOROOT/bin:$GOBIN
  • cd your-project - Protobuf files can be generated for the entire project in one command.
  • protoc --go_out=. **/*.proto - This command generates *.pb.go files. These are necessary to marshal and unmarshal our kafka messages.
  • string id = 1; // @gotags: bson:"id,omitempty" - Add this comment above or right side of the proto fields that need to be tagged with bson tags.
  • protoc-go-inject-tag -remove_tag_comment -input="**/**/*.pb.go" - Run this command after you have generated the Go files using protoc.

Detailed Explanation

  • Protobuf does not generate bson tags by default, and neither does the Go plugin (protoc-gen-go). The reasoning can be found in this GitHub ticket: MutateHook for protoc-gen-go. Some older answers on StackOverflow suggest writing our own script to add the missing tags. That is not a task I'd like to carry out as long as there's a GitHub repo that can do it. See: What would be the best approach to converting protoc generated structs from bson structs? | StackOverflow

    Can I customize the code generated by protoc-gen-go? In general, no. Protocol buffers are intended to be a language-agnostic data interchange format, and implementation-specific customizations run counter to that intent.

    This has been variously discussed before, but the decision usually settles on, “we do not think this is a feature we can or should add.” Unfortunately, this package has a need to stick strictly to the protobuf standards, which by design targets multiple languages, and needs to ensure maximum compatibility. We have already been bit before by adding json tags, because in Go doing so was so easy to do. But now that protobuf has a standard JSON mapping, those JSON tags are now non-compliant, and the standard library encoding/json cannot be retrofitted to make it compliant. However, because people have been relying on the json tags, we cannot just remove them, even though they were a mistake. Because of this history, we’re quite reluctant to add anything unilaterally, and the protobuf project as a whole frowns upon adding language-specific features, because as mentioned, it needs to be language-agnostic. There have been people presenting tools to perform this ask, but the official golang protobuf module is unlikely to ever take up things that have not been agreed upon by the wider protobuf standard.

  • srikrsna/protoc-gen-gotag - Not working - Initially I found this library, protoc-gen-gotag (PGGT), in the GitHub ticket MutateHook for protoc-gen-go. The library seems outdated and abandoned. I simply don't get how it's supposed to be used. The instructions don't offer a clear path forward, and neither do any resources on the web. Not even this apparently decent tutorial gave a good indication of what to do to make tagger.tags work: New official MongoDB Go Driver and Google Protobuf — making them work together.

  • favadi/protoc-go-inject-tag - Working - After hours of digging on the web I stumbled once again on this github ticket: protoc-gen-go: support go_tag option to specify custom struct tags. Reading again I found a library that uses magic comment syntax to add the missing bson tags: protoc-go-inject-tag. Fortunately it works with latest protobuf 3. It also seems to have better traction. And even better is that the syntax is not distracting from the go struct, thus maintaining decent readability of the generated structs.

    • go install github.com/favadi/protoc-go-inject-tag@latest - Installs the extension
    • protoc --go_out=. **/*.proto - generate go protobuf as usual
    • string id = 1; // @gotags: bson:"id,omitempty" - Add this comment above or right side of the proto fields that need to be tagged with bson tags.
    • protoc-go-inject-tag -remove_tag_comment -input="**/**/*.pb.go" - Run this command after you have generated the Go files using protoc. Sadly there's no way around this second step. Note: for some reason the glob syntax does not descend into deeply nested folders, so the pattern has to be repeated for every level. This means that if we have proto files 3 folders deep, this command won't match them. -remove_tag_comment removes the @gotags comments from the generated struct. (Let me know if you find a fix for the glob pattern)
Manhole answered 30/5, 2023 at 20:16 Comment(0)
D
0

I was able to put together a relatively clean solution thanks to this package: https://github.com/custom-app/protobson

It's a custom bson encoder which uses the official proto and protoreflect packages under the hood. It is therefore fully capable of handling the different protobuf constructs such as oneof, repeated and map, which result in interfaces in the generated field types that bson tags alone won't be able to handle.

It allows you to use the generated json tags for the BSON tags as well as other options. Here's an example of how to use it:

ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
defer cancel()

hosts := []string{fmt.Sprintf("%v:%v", os.Getenv("MONGO_HOST"), os.Getenv("MONGO_PORT"))}

auth := options.Credential{
    Username: os.Getenv("MONGO_USER"),
    Password: os.Getenv("MONGO_PASSWORD"),
}

registry := bson.NewRegistry()

codec := protobson.NewCodec(protobson.WithFieldNamerByJsonName())
msgType := reflect.TypeOf((*proto.Message)(nil)).Elem()

registry.RegisterInterfaceDecoder(msgType, codec)
registry.RegisterInterfaceEncoder(msgType, codec)

// Named clientOpts so the variable does not shadow the options package.
clientOpts := options.Client().SetHosts(hosts).SetAuth(auth).SetRegistry(registry)

Client, err = mongo.Connect(ctx, clientOpts)

For the ObjectId fields in proto definitions I'd recommend using the string type:

string id = 1 [json_name="_id"];

This does however have one drawback: when you want to insert or update documents with an ObjectID field, you cannot use the generated proto types directly, or the IDs will be inserted as strings. I was able to work around this by marshalling to JSON and back. I'm sure there is a cleaner way to do this in the custom bson encoder, and this approach also risks losing number types.

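newItem := &pb.Item{...}
// Named jsonBytes so the variable does not shadow the json package name.
jsonBytes, err := protojson.Marshal(newItem)
if err != nil {
    return err
}
var newItemBson bson.M
if err := bson.UnmarshalExtJSON(jsonBytes, true, &newItemBson); err != nil {
    return err
}
newItemBson["itemId"] = itemId // of type primitive.ObjectID

result, err := mongo.Collection("items").InsertOne(context.TODO(), newItemBson)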
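The risk of losing number types in a JSON round trip can be illustrated with the standard library alone. This sketch uses encoding/json rather than protojson and bson.UnmarshalExtJSON, so it is an analogy and not the exact behaviour of the code above: decoding into a generic map turns every number into a float64, and large 64-bit integers no longer survive. The helper name roundTrip is made up for the example. As far as I know, protojson sidesteps the precision problem by writing 64-bit integers as JSON strings, which in turn means they may arrive in the bson.M as strings.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// roundTrip encodes n as JSON, decodes it into a generic map, and converts
// the resulting float64 back to int64.
func roundTrip(n int64) (int64, error) {
	data, err := json.Marshal(map[string]int64{"count": n})
	if err != nil {
		return 0, err
	}
	var doc map[string]interface{}
	if err := json.Unmarshal(data, &doc); err != nil {
		return 0, err
	}
	return int64(doc["count"].(float64)), nil
}

func main() {
	const big int64 = 9007199254740993 // 2^53 + 1, not representable as float64
	got, err := roundTrip(big)
	if err != nil {
		panic(err)
	}
	fmt.Println(got == big)
}
```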
Dubious answered 29/6 at 12:3 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.