How to decode binary/raw google protobuf data

I have a core dump that contains encoded protobuf data, and I want to decode it and inspect the contents. I have the .proto file that defines the message. My proto file looks like this:

$  cat my.proto 
message header {
  required uint32 u1 = 1;
  required uint32 u2 = 2;
  optional uint32 u3 = 3 [default=0];
  optional bool   b1 = 4 [default=true];
  optional string s1 = 5;
  optional uint32 u4 = 6;
  optional uint32 u5 = 7;
  optional string s2 = 9;
  optional string s3   = 10; 
  optional uint32 u6 = 8;
}

And protoc version:

$  protoc --version
libprotoc 2.3.0

I have tried the following:

  1. Dump the raw data from the core

    (gdb) dump memory b.bin 0x7fd70db7e964 0x7fd70db7e96d

  2. Pass it to protoc

    //proto file (my.proto) is in the current dir
    $ protoc --decode --proto_path=$pwd my.proto < b.bin
    Missing value for flag: --decode
    To decode an unknown message, use --decode_raw.

    $ protoc --decode_raw < /tmp/b.bin
    Failed to parse input.

Any thoughts on how to decode it? The documentation doesn’t explain much on how to go about it.

Edit: Data in binary format (10 bytes)

(gdb) x/10xb 0x7fd70db7e964
0x7fd70db7e964: 0x08    0xff    0xff    0x01    0x10    0x08    0x40    0xf7
0x7fd70db7e96c: 0xd4    0x38
Runaway answered 27/1, 2016 at 22:50 Comment(0)
T
55

You used --decode_raw correctly, but your input does not seem to be a protobuf.

For --decode, you need to specify the type name, like:

protoc --decode header my.proto < b.bin

However, if --decode_raw reports a parse error, then --decode will too.

It would seem that the bytes you extracted via gdb are not a valid protobuf. Perhaps your addresses aren't exactly right: if you added or removed a byte at either end, it probably won't parse.

I note that according to the addresses you specified, the protobuf is only 9 bytes long, which is only enough space for three or four of the fields to be set. Is that what you are expecting? Perhaps you could post the bytes here.
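The byte count can be checked with a little arithmetic. This is only a sketch: it assumes, as the comments under this answer confirm, that gdb's dump memory writes the bytes from the start address up to but not including the stop address. The addresses are the ones from the question; the variable names are made up for illustration.

```python
# Addresses copied from the gdb command in the question.
start = 0x7fd70db7e964
stop = 0x7fd70db7e96d

# Assumption: "dump memory FILE START STOP" writes the half-open range
# [START, STOP), so the byte at STOP itself is not written.
dumped = stop - start
print(dumped)  # 9

# To capture N bytes, STOP has to be START + N.
wanted = 10
print(hex(start + wanted))  # 0x7fd70db7e96e
```

So a 10-byte message starting at 0x7fd70db7e964 would need 0x7fd70db7e96e as the stop address.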

EDIT:

The 10 bytes you added to your question appear to decode successfully using --decode_raw:

$ echo 08ffff01100840f7d438 | xxd -r -p | protoc --decode_raw
1: 32767
2: 8
8: 928375

Cross-referencing the field numbers, we get:

u1: 32767
u2: 8
u6: 928375
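
To see where these numbers come from, here is a minimal hand-rolled decoder. It is a sketch, not a replacement for protoc: it only handles varint fields (wire type 0), which happens to be all this particular message contains. The function names and the NAMES table are made up for illustration; the table is copied by hand from my.proto.

```python
def read_varint(buf, pos):
    """Return (value, next_pos) for the base-128 varint starting at buf[pos]."""
    value = 0
    shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift  # low 7 bits carry the payload
        if not byte & 0x80:              # high bit clear: last byte of the varint
            return value, pos
        shift += 7


def decode_varint_fields(buf):
    """Decode a message that contains only varint (wire type 0) fields."""
    fields = []
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if wire_type != 0:
            raise ValueError(f"wire type {wire_type} is not handled in this sketch")
        value, pos = read_varint(buf, pos)
        fields.append((field_number, value))
    return fields


# Field number -> name for the uint32/bool fields of "header" in my.proto.
NAMES = {1: "u1", 2: "u2", 3: "u3", 4: "b1", 6: "u4", 7: "u5", 8: "u6"}

data = bytes.fromhex("08ffff01100840f7d438")
decoded = decode_varint_fields(data)
for number, value in decoded:
    print(f"{NAMES.get(number, number)}: {value}")
# u1: 32767
# u2: 8
# u6: 928375
```

Calling decode_varint_fields(data[:9]) raises "truncated varint", because the ninth byte (0xd4) still has its continuation bit set. That is consistent with a 9-byte dump failing to parse.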
Transmogrify answered 28/1, 2016 at 18:36 Comment(7)
Thanks for the response, I added the raw bytes (10 bytes) in my question above. And yes, only some of the optional fields will be set here, so this is expected.Runaway
@brokenfoot: It appears to me that the bytes you gave do in fact parse successfully -- I edited my answer to show this. Somehow b.bin must not have ended up containing exactly those bytes. The dump memory command you gave looks like it would only dump 9 bytes. Remember that the dump does not include the byte at the end address -- it includes up to the byte immediately before it.Transmogrify
Perfect! I wasn't aware that the dump memory command doesn't include the last byte in the output. Thanks a lot!Runaway
Is it possible to pass protoc hex instead of the binary?Reft
@Reft No, but the unix command xxd -r -p decodes hex to binary, so you can use it in a pipeline like shown in my answer. If you're not running protoc from a unix command line then you'll have to come up with some other solution...Transmogrify
The protoc --decode ... command from this answer works for me, but instead of field names in the output, I get the "indexes" assigned to these fields. How can I fix that?Fromm
@BartekPacia It sounds like you may not have specified the correct type or proto file. If you try to decode a protobuf as the wrong type, it usually won't fail, but because the field numbers don't match up against the fields defined in the type definition, they'll instead be printed "raw" (as if you used --decode_raw).Transmogrify
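
For readers without xxd, the hex-to-binary step discussed in the comments above can also be done in a few lines of Python. This is a sketch under the assumption that the hex digits are available as a plain string; the output location is an arbitrary choice for illustration.

```python
import os
import tempfile

# Hex digits as posted in the question; spaces between bytes are allowed.
hex_dump = "08 ff ff 01 10 08 40 f7 d4 38"
raw = bytes.fromhex(hex_dump)

# Write the binary form to a file that can then be fed to protoc on stdin.
out_path = os.path.join(tempfile.gettempdir(), "b.bin")
with open(out_path, "wb") as f:
    f.write(raw)

print(len(raw))  # 10
```

The resulting file can be passed to protoc --decode_raw or protoc --decode through input redirection, as in the answer.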
M
32

protoc --decode [message_name] [.proto_file_path] < [binary_file_path],

where

  • [message_name] is the name of the message object in the .proto file. If the message is inside a package in the .proto file, use package_name.message_name.
  • [.proto_file_path] is the path to the .proto file where the message is defined.
  • [binary_file_path] is the path to the file you want to decode.

Example for the situation in the question (assuming that my.proto and b.bin are in your current working directory):

protoc --decode header my.proto < b.bin

Meganmeganthropus answered 7/6, 2018 at 11:38 Comment(5)
Thanks, package_name.message_name was key for me!Evenhanded
If my.proto imports other .proto files, how does protoc find them? Which path does it use?Mcminn
@dahohu527, feels like the alternatives are either your current path on the command line or the path of the .proto file. Maybe you could try it out to find which one it is :)Meganmeganthropus
The mention of the package name was really helpful. Thanks.Rambert
This is the error that I get: File does not reside within any path specified using --proto_path (or -I). You must specify a --proto_path which encompasses this file. Note that the proto_path must be an exact prefix of the .proto file names -- protoc is too dumb to figure out when two paths (e.g. absolute and relative) are equivalent (it's harder than you think).Trumaine
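
On the import and --proto_path questions in the comments above: protoc resolves imports relative to the directories given with --proto_path, and the error message says that the .proto path must start with exactly one of those directories. One way to satisfy that is to make both paths absolute. The sketch below only builds the command line and does not run protoc; the helper name build_decode_command and its extra_import_dirs parameter are hypothetical.

```python
import os
import shlex


def build_decode_command(proto_file, message_type, extra_import_dirs=()):
    """Build a protoc --decode command whose --proto_path is an exact
    prefix of the .proto file path (both are made absolute)."""
    proto_file = os.path.abspath(proto_file)
    proto_dir = os.path.dirname(proto_file)
    cmd = ["protoc", f"--proto_path={proto_dir}"]
    # Additional roots under which imported .proto files can be found.
    for directory in extra_import_dirs:
        cmd.append(f"--proto_path={os.path.abspath(directory)}")
    cmd.append(f"--decode={message_type}")
    cmd.append(proto_file)
    return cmd


cmd = build_decode_command("my.proto", "header")
print(shlex.join(cmd))  # paste into a shell and add: < b.bin
```

If the message is declared inside a package, message_type would be package_name.message_name, as described in the answer.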
R
9

proto file:

syntax = "proto3";
package response;

// protoc --gofast_out=. response.proto

message Response {
  int64 UID        
  ....
}

use protoc:
protoc --decode=response.Response response.proto < response.bin
protoc --decode=[package].[Message type] proto.file < protobuf.response
Really answered 6/9, 2019 at 8:52 Comment(0)
P
0

Mouse Melon

I just made a software application that can be used to view and edit Protocol Buffers data. It's commercial software, but the free version is probably good enough for what you want.

Poucher answered 10/7, 2024 at 17:49 Comment(0)
