I fetch protobuf data from Google Pub/Sub and deserialize it into a Message object, so I end up with a PCollection<Message>. Here is the sample code:
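    import com.google.protobuf.InvalidProtocolBufferException;
    import org.apache.beam.sdk.io.gcp.pubsub.PubsubMessage;
    import org.apache.beam.sdk.transforms.DoFn;
    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;

    // Message is the protobuf-generated class; LOG is assumed to be an SLF4J logger.
    public class ProcessPubsubMessage extends DoFn<PubsubMessage, Message> {
        private static final Logger LOG = LoggerFactory.getLogger(ProcessPubsubMessage.class);

        @ProcessElement
        public void processElement(@Element PubsubMessage element, OutputReceiver<Message> receiver) {
            byte[] payload = element.getPayload();
            try {
                Message message = Message.parseFrom(payload);
                receiver.output(message);
            } catch (InvalidProtocolBufferException e) {
                // Log the full exception and skip the malformed payload.
                LOG.error("Got exception while parsing message from pubsub", e);
            }
        }
    }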
    PCollection<Message> event = psMessage.apply("Parsing data from pubsub message",
        ParDo.of(new ProcessPubsubMessage()));
I want to apply a transformation to the PCollection<Message> event so that I can write it in Parquet format. I know Apache Beam provides ParquetIO, but it works with PCollection<GenericRecord>, so converting from Message to GenericRecord might solve the problem (I don't yet know how to do that). Is there an easy way to write in Parquet format?
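For reference, here is a minimal sketch of the write side as I understand it from the ParquetIO documentation, assuming the conversion to GenericRecord were already done. The class name ParquetWriteSketch, the Avro schema (fields id and timestamp), the 5-minute window, and the shard count are all placeholders made up for illustration and are not part of my actual pipeline. The windowing and explicit shard count are there because the input comes from an unbounded Pub/Sub source.

```java
import org.apache.avro.Schema;
import org.apache.avro.SchemaBuilder;
import org.apache.avro.generic.GenericRecord;
import org.apache.beam.sdk.io.FileIO;
import org.apache.beam.sdk.io.parquet.ParquetIO;
import org.apache.beam.sdk.transforms.windowing.FixedWindows;
import org.apache.beam.sdk.transforms.windowing.Window;
import org.apache.beam.sdk.values.PCollection;
import org.joda.time.Duration;

public class ParquetWriteSketch {

    // Hypothetical Avro schema: a real one would have to mirror the fields
    // of the protobuf Message, which are not shown in this question.
    static final Schema SCHEMA = SchemaBuilder.record("Event")
        .namespace("example")
        .fields()
        .requiredString("id")
        .requiredLong("timestamp")
        .endRecord();

    // Writes already-converted records as Parquet files under outputDir.
    // The input collection is assumed to carry a coder for GenericRecord
    // (for example an AvroCoder built from SCHEMA).
    static void writeParquet(PCollection<GenericRecord> records, String outputDir) {
        records
            // Unbounded input must be windowed before a file-based write.
            .apply("Window into 5 minutes",
                Window.<GenericRecord>into(FixedWindows.of(Duration.standardMinutes(5))))
            .apply("Write Parquet",
                FileIO.<GenericRecord>write()
                    .via(ParquetIO.sink(SCHEMA))
                    .to(outputDir)
                    .withSuffix(".parquet")
                    // Placeholder shard count for the streaming write.
                    .withNumShards(1));
    }
}
```

What is still missing is the step that turns each Message into a GenericRecord matching that schema.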