avro - schema for logicalType

I am trying to learn Avro and have a question about schemas.

Some documents say

{
      "name": "userid",
       "type" : "string",
       "logicalType" : "uuid"
},

And some say

{
  "name": "userid",
  "type" : {
    "type" : "string",
    "logicalType" : "uuid"
  }
},

Which one is right? Or are they the same?

Thank you!

Poriferous answered 9/3, 2021 at 4:1 Comment(1)
I originally thought both, but now I think maybe just the second one is correct. (Pawsner)

I ran variants of your schemas through the Avro tools "random" command (aliased as avro below), which tries to generate a random value for a schema.

Used on its own as a schema, the variant with the nested type syntax (a name next to a nested "type" object) is rejected:

avro random --schema '{ "name": "userid", "type" : { "type": "string", "logicalType" : "uuid" } }' -

[...] No type: {"name":"userid","type":{"type":"string","logicalType":"uuid"}}

However, it works when putting the logicalType next to type:

avro random --schema ' { "type" : "string", "logicalType" : "uuid" }' -

[...] Objavro.schemaL{"type":"string","logicalType":"uuid"}avro.codecdeflate [... random bits]

Now, when we use it as a field in a record, we get a warning if logicalType sits next to type, and the property is ignored:

avro random --schema '{ "type": "record", "fields": [ { "type" : "string", "logicalType" : "uuid", "name": "f"} ] , "name": "rec"}' -

[...] WARN avro.Schema: Ignored the rec.f.logicalType property ("uuid"). It should probably be nested inside the "type" for the field.
Objavro.schema{"type":"record","name":"rec","fields":[{"name":"f","type":"string","logicalType":"uuid"}]}avro.codecdeflate [... random bits]

The nested syntax is accepted without a warning:

avro random --schema '{ "type": "record", "fields": [ { "type" : { "type": "string", "logicalType" : "uuid" } , "name": "f"} ] , "name": "rec"}' -

[... random bits]{"type":"record","name":"rec","fields":[{"name":"f","type":{"type":"string","logicalType":"uuid"}}]}avro.codecdeflate [... random bits]
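
The only difference between the last two commands is where logicalType sits inside the field. As a rough illustration, here is a minimal Python sketch of that rewrite. It assumes the field's "type" is a plain primitive name, and nest_field_logical_type is a hypothetical helper written for this answer, not part of Avro:

```python
import copy
import json


def nest_field_logical_type(field):
    """Return a copy of a record field with a field-level logicalType
    moved inside its "type" object.

    Hypothetical helper for illustration only. It handles just the simple
    case where "type" is a primitive type name given as a string.
    """
    fixed = copy.deepcopy(field)
    if "logicalType" in fixed and isinstance(fixed["type"], str):
        fixed["type"] = {
            "type": fixed["type"],
            "logicalType": fixed.pop("logicalType"),
        }
    return fixed


# The field that triggered the warning above.
flat_field = {"type": "string", "logicalType": "uuid", "name": "f"}
print(json.dumps(nest_field_logical_type(flat_field)))
```

Fields that already use the nested syntax are returned unchanged, since they carry no field-level logicalType.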

Further, if we look at logical types inside arrays, the non-nested syntax works:

avro random --count 1 --schema ' { "type": "array", "items": { "type" : "string", "logicalType" : "uuid" , "name": "f"} , "name": "farr" } ' -

[... random bits]

While the nested version fails:

avro random --count 1 --schema ' { "type": "array", "items": {"type": { "type" : "string", "logicalType" : "uuid" , "name": "f"} } , "name": "farr" } ' -

[...] No type: {"type":{"type":"string","logicalType":"uuid","name":"f"}}

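It appears that if a logicalType is the type of a field in a record, you need to use the nested syntax; otherwise, you need to use the non-nested syntax. Put differently, these results are consistent with logicalType being an attribute of the type (schema) object rather than of the field. A standalone schema or an array's "items" is itself the type object, so logicalType sits directly next to "type". A record field is a wrapper with its own "name" and "type", so the annotation has to go inside the field's "type".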
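
To check a schema for this pattern, something like the following minimal Python sketch could be used. It assumes the schema has already been loaded into dicts and lists (for example with json.loads). misplaced_logical_types is a hypothetical helper that only mimics the warning shown earlier; it is not how Avro itself parses schemas:

```python
def misplaced_logical_types(schema, path="$"):
    """Return the paths of record fields that carry logicalType on the
    field object itself instead of inside the field's "type".

    Simplified illustration: walks records, arrays, maps and unions only.
    """
    found = []
    if isinstance(schema, list):  # union: check every branch
        for i, branch in enumerate(schema):
            found += misplaced_logical_types(branch, f"{path}[{i}]")
    elif isinstance(schema, dict):
        kind = schema.get("type")
        if kind == "record":
            for field in schema.get("fields", []):
                field_path = f"{path}.{field['name']}"
                if "logicalType" in field:  # annotation on the field wrapper
                    found.append(field_path)
                found += misplaced_logical_types(field["type"], field_path)
        elif kind == "array":
            found += misplaced_logical_types(schema["items"], path + ".items")
        elif kind == "map":
            found += misplaced_logical_types(schema["values"], path + ".values")
    return found


flat = {"type": "record", "name": "rec",
        "fields": [{"name": "f", "type": "string", "logicalType": "uuid"}]}
nested = {"type": "record", "name": "rec",
          "fields": [{"name": "f",
                      "type": {"type": "string", "logicalType": "uuid"}}]}

print(misplaced_logical_types(flat))    # field f is flagged
print(misplaced_logical_types(nested))  # nothing is flagged
```

A top-level {"type": "string", "logicalType": "uuid"} or the same object used as an array's "items" is not flagged, which matches the commands above that ran without a warning.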

Selfjustifying answered 29/3, 2021 at 8:47 Comment(0)
