Apache Beam: Unable to find registrar for gs
Beam uses both Google's auto/value and auto/service tools.

I want to run a pipeline with the Dataflow runner, with the data stored on Google Cloud Storage.

I've added these dependencies:

<dependency>
    <groupId>org.apache.beam</groupId>
    <artifactId>beam-runners-google-cloud-dataflow-java</artifactId>
    <version>2.0.0</version>
</dependency>

<dependency>
    <groupId>org.apache.beam</groupId>
    <artifactId>beam-sdks-java-extensions-google-cloud-platform-core</artifactId>
    <version>2.0.0</version>
</dependency>

I'm able to start the pipeline from IntelliJ. But when the jar is built with mvn package and run with java -jar, it throws an error:

java.lang.IllegalStateException: Unable to find registrar for gs

The fatjar is packaged with maven-assembly-plugin. The GcsFileSystemRegistrar class is in the jar.

Planck answered 5/6, 2017 at 9:15

The issue is in the way you are building your fatjar. The maven-assembly-plugin is not handling files associated with ServiceLoader correctly. ServiceLoader relies on each implementation being listed in META-INF/services/org.apache.beam.sdk.io.FileSystemRegistrar so that Java knows how to find it. Each Beam jar that provides a registrar ships its own copy of that file, and unless those copies are merged, typically only one jar's entries end up in the fatjar.

The META-INF/services/org.apache.beam.sdk.io.FileSystemRegistrar file in your fatjar most likely contains only:

org.apache.beam.sdk.io.LocalFileSystemRegistrar

It needs to list both of the following, plus any other implementations that you want:

org.apache.beam.sdk.io.LocalFileSystemRegistrar
org.apache.beam.sdk.extensions.gcp.storage.GcsFileSystemRegistrar

Your best bet is to build your fatjar with a tool that understands these ServiceLoader requirements, such as the maven-shade-plugin configured with the ServicesResourceTransformer.
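
As a rough sketch of what that could look like in the pom.xml, assuming a shade plugin version of 3.0.0 and a hypothetical main class com.example.MyPipeline (neither comes from the question, so adjust both to your project):

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <!-- Assumed version; use whichever release your build standardizes on. -->
  <version>3.0.0</version>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <filters>
          <!-- Drop signature files from signed dependencies; leftover
               signatures can make the merged jar fail verification. -->
          <filter>
            <artifact>*:*</artifact>
            <excludes>
              <exclude>META-INF/*.SF</exclude>
              <exclude>META-INF/*.DSA</exclude>
              <exclude>META-INF/*.RSA</exclude>
            </excludes>
          </filter>
        </filters>
        <transformers>
          <!-- Concatenates META-INF/services/* files from all jars, so every
               FileSystemRegistrar implementation stays listed. -->
          <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
          <!-- Sets Main-Class so the jar can be started with java -jar.
               com.example.MyPipeline is a hypothetical placeholder. -->
          <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
            <mainClass>com.example.MyPipeline</mainClass>
          </transformer>
        </transformers>
      </configuration>
    </execution>
  </executions>
</plugin>
```

After mvn package, the META-INF/services/org.apache.beam.sdk.io.FileSystemRegistrar file inside the shaded jar should list both registrars shown above.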

Eskisehir answered 5/6, 2017 at 15:45 Comment(2)
Thanks! You're right. I figured it out late last night by creating a Beam project from the archetype and comparing my pom.xml with the new one. After switching to maven-shade-plugin it started working just fine. - Carnegie
If you are using Gradle with the Shadow plugin, you can fix this by using mergeServiceFiles() in the shadowJar { ... } closure. - Reek

This looks like a problem with the assembly strategy: you should accumulate/merge the service entries for org.apache.beam.sdk.io.FileSystemRegistrar. More on a similar problem here.
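
If you would rather stay on maven-assembly-plugin, one way to get that merging is a custom assembly descriptor that enables the plugin's metaInf-services container descriptor handler. The following is only a sketch and not something taken from this answer: the descriptor id and the file location (for example src/assembly/fatjar.xml) are hypothetical, and the dependency set simply mimics what the stock jar-with-dependencies descriptor does.

```xml
<assembly xmlns="http://maven.apache.org/ASSEMBLY/2.0.0"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://maven.apache.org/ASSEMBLY/2.0.0 http://maven.apache.org/xsd/assembly-2.0.0.xsd">
  <!-- Hypothetical id; it becomes the classifier of the built jar. -->
  <id>fatjar</id>
  <formats>
    <format>jar</format>
  </formats>
  <includeBaseDirectory>false</includeBaseDirectory>
  <containerDescriptorHandlers>
    <!-- Merges META-INF/services/* files instead of keeping a single copy. -->
    <containerDescriptorHandler>
      <handlerName>metaInf-services</handlerName>
    </containerDescriptorHandler>
  </containerDescriptorHandlers>
  <dependencySets>
    <!-- Unpack the project artifact and its runtime dependencies into the jar root. -->
    <dependencySet>
      <outputDirectory>/</outputDirectory>
      <useProjectArtifact>true</useProjectArtifact>
      <unpack>true</unpack>
      <scope>runtime</scope>
    </dependencySet>
  </dependencySets>
</assembly>
```

The plugin configuration would then point at this file through its descriptors element in place of the jar-with-dependencies descriptorRef.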

Mcginn answered 5/6, 2017 at 15:45
