How to store eventhub checkpoint data locally in Java
Asked Answered
Z

1

7

I'm trying to find an example of Custom Checkpoint Manager in JAVA, that can store checkpoint data in a local folder.

Basically, I'm building a java application that reads data from azure event hub with multiple consumer groups. Previously I was instantiating the EventProcessorHost using storage account connection string and storage container located in azure blobs - which was working fine.

POM entry:

        <dependency>
            <groupId>com.microsoft.azure</groupId>
            <artifactId>azure-eventhubs-eph</artifactId>
            <version>2.4.0</version>
        </dependency>

Java code for instantiating the host:

storageConnectionString="DefaultEndpointsProtocol=https;AccountName=MyAccountName;AccountKey=MyAccountKey;EndpointSuffix=core.windows.net";
storageContainerName="MyContainerName",

EventProcessorHost host = new EventProcessorHost(
                EventProcessorHost.createHostName(hostNamePrefix),
                eventHubName,
                consumerGroupName,
                eventHubConnectionString.toString(),
                storageConnectionString,
                storageContainerName);

Now, the requirement is to use local folder in Azure Databricks cluster (a DBFS:/ path) to store the checkpoint data.

I think I'll have to write a custom checkpoint manager implementing ICheckpointManager. I was able to find an example doing this in SQL database, but I wasn't able to find an example of CheckpointManager storing checkpoint data on a local folder.

Can anyone please help, either give me a link to the example or a code snippet?

Zsa answered 24/7, 2019 at 15:51 Comment(1)
I was looking for something similar in Java. Any update on how you finally did it @Mani Mukhtar?Husein
D
0

you can create a custom checkpoint manager that implements icheckpointmanager, something like that,

import com.microsoft.azure.eventprocessorhost.Checkpoint;
import com.microsoft.azure.eventprocessorhost.ICheckpointManager;
import com.microsoft.azure.eventprocessorhost.PartitionContext;
import java.io.File;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.util.concurrent.CompletableFuture;

public class LocalCheckpointManager implements ICheckpointManager {
    private final String checkpointDirectory;

    public LocalCheckpointManager(String checkpointDirectory) {
        this.checkpointDirectory = checkpointDirectory;
    }

    @Override
    public CompletableFuture<Boolean> checkpointStoreExists() {
        File dir = new File(checkpointDirectory);
        return CompletableFuture.completedFuture(dir.exists() && dir.isDirectory());
    }

    @Override
    public CompletableFuture<Void> createCheckpointStoreIfNotExists() {
        File dir = new File(checkpointDirectory);
        if (!dir.exists()) {
            dir.mkdirs();
        }
        return CompletableFuture.completedFuture(null);
    }

    @Override
    public CompletableFuture<Checkpoint> getCheckpoint(PartitionContext context) {
        File checkpointFile = new File(checkpointDirectory, context.getPartitionId() + ".checkpoint");
        if (!checkpointFile.exists()) {
            return CompletableFuture.completedFuture(null);
        }

        try (FileReader reader = new FileReader(checkpointFile)) {
            char[] buffer = new char[(int) checkpointFile.length()];
            reader.read(buffer);
            String offset = new String(buffer);
            Checkpoint checkpoint = new Checkpoint();
            checkpoint.setOffset(offset);
            checkpoint.setPartitionId(context.getPartitionId());
            return CompletableFuture.completedFuture(checkpoint);
        } catch (IOException e) {
            return CompletableFuture.failedFuture(e);
        }
    }

    @Override
    public CompletableFuture<Void> createCheckpoint(PartitionContext context, Checkpoint checkpoint) {
        File checkpointFile = new File(checkpointDirectory, context.getPartitionId() + ".checkpoint");

        try (FileWriter writer = new FileWriter(checkpointFile)) {
            writer.write(checkpoint.getOffset());
            return CompletableFuture.completedFuture(null);
        } catch (IOException e) {
            return CompletableFuture.failedFuture(e);
        }
    }

    @Override
    public CompletableFuture<Void> deleteCheckpoint(PartitionContext context) {
        File checkpointFile = new File(checkpointDirectory, context.getPartitionId() + ".checkpoint");
        if (checkpointFile.exists()) {
            checkpointFile.delete();
        }
        return CompletableFuture.completedFuture(null);
    }
}

and then you use it in the event processor host

String dbfsCheckpointDir = "/dbfs/mnt/checkpoints"; // change it with your DBFS path

EventProcessorHost host = new EventProcessorHost(
    EventProcessorHost.createHostName(hostNamePrefix),
    eventHubName,
    consumerGroupName,
    eventHubConnectionString.toString(),
    new LocalCheckpointManager(dbfsCheckpointDir)
);

host.registerEventProcessor(EventProcessor.class, eventProcessorOptions).get();

i hope it helps.

Dismissal answered 11/8 at 17:6 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.