Ensuring consistency between Kafka schemas and an OpenAPI specification

A number of APIs/microservices provide access to critical resources, including Kafka topics. The API/microservice messages are validated using an OpenAPI specification that defines the API/microservice contract. Once the microservice validates a message, it is published to a Kafka topic, at which point the message is validated (again) against Kafka's schema registry.

The problem is that there are two message definitions against which messages are validated (the OpenAPI spec and the Kafka schema registry), and it is a challenge to keep both definitions in sync.
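
To make the duplication concrete, here is a minimal Python sketch (the schema, topic, and URLs are hypothetical) in which the same payload is validated twice: once inside the service against the OpenAPI-derived schema, and again by the schema-registry-aware serializer on produce. It requires the jsonschema and confluent-kafka packages.

    import json

    from jsonschema import validate  # validation inside the API/microservice
    from confluent_kafka import Producer
    from confluent_kafka.schema_registry import SchemaRegistryClient
    from confluent_kafka.schema_registry.json_schema import JSONSerializer
    from confluent_kafka.serialization import SerializationContext, MessageField

    # Hypothetical message definition; in practice this is extracted from the
    # OpenAPI spec's components/schemas section. The registry holds a second,
    # independently maintained copy of the same definition -- the sync problem.
    order_schema = {
        "type": "object",
        "properties": {"id": {"type": "string"}, "amount": {"type": "number"}},
        "required": ["id", "amount"],
    }

    message = {"id": "42", "amount": 9.99}

    # Validation point 1: the microservice checks the body against the
    # OpenAPI-derived schema.
    validate(instance=message, schema=order_schema)

    # Validation point 2: the serializer checks (and registers) the message
    # against Schema Registry before producing.
    registry = SchemaRegistryClient({"url": "http://localhost:8081"})
    serializer = JSONSerializer(json.dumps(order_schema), registry)
    producer = Producer({"bootstrap.servers": "localhost:9092"})
    producer.produce(
        "orders",
        value=serializer(message, SerializationContext("orders", MessageField.VALUE)),
    )
    producer.flush()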

With this in mind, I have a few questions:

  • is there a way to convert OpenAPI specs to the Kafka schema registry format (and vice versa)?
  • is there a way to have Kafka verify against an OpenAPI spec instead of the registry (probably not a great solution, as native Kafka capabilities should be used)?
  • is there a way to have an API/microservice validate its messages against a Kafka schema instead of an OpenAPI spec (again, probably not a good approach, since OpenAPI specs are the standard way to define messages for APIs)?

Lastly, which of the above makes the most sense? Are there any other, better alternatives?

Henigman answered 16/12, 2020 at 14:32 Comment(7)
Unclear what you mean by "Kafka schema registry format"... The new version of the registry accepts JSON schemas (which is what OpenAPI uses, right?). Also, "native Kafka" has zero concept of a schema, so you're open to doing anything with your messages, with the tradeoff of latency. – Suprasegmental
I should be more specific: Confluent's Kafka distribution has a schema registry. Schemas can be defined and stored in the registry and are then used to validate messages. Two topic attributes enable this capability: "confluent.value.schema.validation" and "confluent.key.schema.validation". When set to "true" they enable message validation, and if a message (or any one message in a batch) is invalid, the message/batch is rejected. On your second point, I agree there are latency considerations, but message validation (in my environment) is the higher priority. – Henigman
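For reference, a minimal sketch of enabling that broker-side validation on a single topic (this assumes Confluent Server with a Schema Registry configured; the topic name is hypothetical):

    # confluent.value.schema.validation is a Confluent Server (not Apache
    # Kafka) topic-level config; the broker rejects values that do not match
    # a registered schema.
    kafka-configs --bootstrap-server localhost:9092 \
      --alter --entity-type topics --entity-name orders \
      --add-config confluent.value.schema.validation=true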
I know what it does. To be more specific: what "format" are you expecting of a schema? In particular, the Schema Registry has a pluggable interface for defining your own formats, and you could inspect the source code for how the existing Avro/Protobuf/JSON serializers operate on/validate messages, but that's on a per-client basis, not per-topic or cluster-wide. – Suprasegmental
I would like an industry-standard OpenAPI spec to be the format. Rationale: APIs/microservices and Kafka are often combined, so a harmonized message definition AND security model between APIs/microservices and Kafka is ideal. APIs/microservices use an OpenAPI spec to define/validate the message AND the security definition (scopes), which allows fine-grained access control on a per-topic basis: by using OpenAPI specs (one for each topic), I can not only validate messages consistently between the API/microservice and Kafka, but I also get a harmonized security model. – Henigman
OpenAPI / HTTP are more oriented around synchronous request/response bodies. You could use swaggergen, for example, to create models, then pass those around Kafka topics. However, you might also want to look at AsyncAPI. – Suprasegmental
Great suggestion. I did look at AsyncAPI, and it is pretty close to the OpenAPI specification, but unfortunately we have a large investment in OpenAPI specifications, which drives the need for OpenAPI... that being said, there are common elements (components) that I may be able (with a bit of work) to salvage from the OpenAPI specs and repurpose for an AsyncAPI message definition. – Henigman
In any case, in my experience Kafka messages are more events than request/response models, so the schemas have some overlap but are not 1:1. If you need to do message validation, it would be performed in the serializers rather than on the broker side. I know Confluent Server has that confluent.value.schema.validation field, but I think that is tied specifically to their Schema Registry, which, as mentioned, does offer extensions for custom schema types. – Suprasegmental

After many attempts to solve this problem, I have concluded that there is an architectural approach that allows JSON Schema, OpenAPI specs, and Schema Registry to work well together (I have a prototype working, but it is too complex to show here). Here is the general approach...

First a few facts that underpin my approach:

  • Fact: OpenAPI (v3.1+) is a superset of JSON Schema; the request and response definitions are 100% compliant with JSON Schema
  • Fact: Schema Registry supports JSON Schema; JSON schemas appear to be first-class citizens: they can be stored in and queried from Schema Registry, and libraries (although not easily found) are available that allow JSON Schema to work with Kafka (consumers, producers, ksqlDB for Confluent) and its connectors; see the registration sketch after this list
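
As a concrete illustration of the second fact, here is a minimal sketch (subject name, URL, and schema are hypothetical) that stores and retrieves a JSON Schema in Schema Registry using the confluent-kafka Python client:

    import json

    from confluent_kafka.schema_registry import Schema, SchemaRegistryClient

    registry = SchemaRegistryClient({"url": "http://localhost:8081"})

    # A hypothetical event definition, expressed as plain JSON Schema.
    order_event = {
        "$schema": "http://json-schema.org/draft-07/schema#",
        "title": "OrderPlaced",
        "type": "object",
        "properties": {"id": {"type": "string"}, "amount": {"type": "number"}},
        "required": ["id", "amount"],
    }

    # Register under the topic's value subject; schema_type="JSON" marks it
    # as JSON Schema (as opposed to AVRO or PROTOBUF).
    schema_id = registry.register_schema(
        "orders-value", Schema(json.dumps(order_event), schema_type="JSON")
    )

    # It can be queried back like any other schema in the registry.
    latest = registry.get_latest_version("orders-value")
    print(schema_id, latest.schema.schema_type)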

The implication is that JSON Schema, rather than Avro, would need to be the primary way to define the messages flowing through the system (I suspect this may not be looked upon very fondly by the Kafka community). But using JSON Schema allows OpenAPI, Schema Registry, and JSON Schema to play nicely together!

In other words a generalized solution would have:

  • OpenAPI becomes the primary way of defining the APIs that frequently act as front-ends in an event streaming solution
  • JSON schemas define the OpenAPI request/response formats
  • With a modest bit of tooling (several tools are available), OpenAPI specs can be assembled/bundled from JSON Schema parts, so OpenAPI and JSON schemas are finally harmonized
  • Since API requests/responses form the "events" in an event management system, and their definition is independent of the OpenAPI spec, JSON schemas now define the events
  • And finally, since the JSON schemas are in Schema Registry, all events are defined in a Kafka-friendly way and accessible in any normal Kafka fashion (see the sketch after this list)
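
A minimal sketch of that layout (file name, subject, and URL are hypothetical): one JSON Schema file is the single source of truth, the OpenAPI spec $ref's it instead of restating it, and the same schema is registered in Schema Registry for the Kafka side.

    from confluent_kafka.schema_registry import Schema, SchemaRegistryClient

    # 1. The shared JSON Schema file is the one message definition.
    with open("schemas/order-placed.json") as f:
        order_schema_str = f.read()

    # 2. The OpenAPI spec does not restate the schema; it $ref's the shared
    #    file, so bundling tools can assemble the spec from JSON Schema parts.
    openapi_fragment = {
        "requestBody": {
            "content": {
                "application/json": {
                    "schema": {"$ref": "schemas/order-placed.json"}
                }
            }
        }
    }

    # 3. The same schema is registered in Schema Registry, so producers,
    #    consumers, and connectors validate against the identical definition.
    registry = SchemaRegistryClient({"url": "http://localhost:8081"})
    registry.register_schema(
        "orders-value", Schema(order_schema_str, schema_type="JSON")
    )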
Henigman answered 22/1, 2022 at 16:38 Comment(1)
Did you have a look at asyncapi.com? – Bacardi
