Batch Processing on Kubernetes
Does anyone here have experience with batch processing (e.g. Spring Batch) on Kubernetes? Is it a good idea? How can we prevent a batch process from processing the same data if we use the Kubernetes auto-scaling feature? Thank you.

Certiorari answered 30/3, 2020 at 4:25 Comment(4)
I personally looked into using Kubernetes for batch processing and decided against it because of the auto-scaling problem you mention. My batch processes typically run 1-4 hours. Technically I could run them in Kubernetes, but my pods would end up repeating work every time the cluster scaled down. I might be able to checkpoint when I receive a SIGTERM, but that started to seem like too much work compared with just spinning up individual VMs and watching their progress. Overall, Kubernetes doesn't seem designed for this use case. – Asthmatic
Here's a longer article that somebody wrote on the topic. Bottom line: Kubernetes might be a great choice for some needs and a bad choice for others: ben-morris.com/do-you-really-need-kubernetes – Asthmatic
It's also worth noting that Google had a beta program for batch processing in GKE but then deprecated it: cloud.google.com/kubernetes-engine/docs/concepts/batch – Asthmatic
You could use the safe-to-evict annotation to prevent the autoscaler from killing long-running jobs when it scales down nodes during pod resource optimization. See details: https://mcmap.net/q/591997/-prevent-killing-some-pods-when-scaling-down-possible – Aril
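For reference, the annotation from the comment above goes on the pod (or the pod template of a Job); a minimal sketch, with placeholder names for the pod and image:

```yaml
# Tells the cluster autoscaler this pod must not be evicted when
# scaling a node down. Pod and image names are placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: long-running-batch-job
  annotations:
    cluster-autoscaler.kubernetes.io/safe-to-evict: "false"
spec:
  restartPolicy: Never
  containers:
    - name: worker
      image: my-batch-image:latest
```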
Does anyone here have experience with batch processing (e.g. Spring Batch) on Kubernetes? Is it a good idea?

For Spring Batch, we (the Spring Batch team) do have some experience on the matter, which we have shared in several conference talks.

Running batch jobs on Kubernetes can be tricky:

  • pods may be re-scheduled by k8s onto different nodes in the middle of processing
  • cron jobs might be triggered twice
  • etc.

This requires additional non-trivial work on the developer's side to make sure the batch application is fault-tolerant (resilient to node failure, pod re-scheduling, etc) and safe against duplicate job execution in a clustered environment.

Spring Batch takes care of this additional work for you and can be a good choice to run batch workloads on k8s for several reasons:

  • Cost efficiency: Spring Batch jobs maintain their state in an external database, which makes it possible to restart them from the last save point in case of job/node failure or pod re-scheduling
  • Robustness: Safe against duplicate job executions thanks to a centralized job repository
  • Fault-tolerance: Retry/Skip failed items in case of transient errors like a call to a web service that might be temporarily down or being re-scheduled in a cloud environment
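The restart-from-last-save-point idea behind the first bullet can be sketched in plain Java. This is an illustration of the concept only, not Spring Batch's actual API: Spring Batch persists this kind of state in its job repository database rather than a local file, and the class and file names below are made up for the example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

public class CheckpointedJob {

    // Read the index of the next item to process (0 if no checkpoint yet).
    static int readCheckpoint(Path p) {
        try {
            return Files.exists(p) ? Integer.parseInt(Files.readString(p).trim()) : 0;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Persist progress after each item, like a save point.
    static void writeCheckpoint(Path p, int next) {
        try {
            Files.writeString(p, Integer.toString(next));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static void reset(Path p) {
        try {
            Files.deleteIfExists(p);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Process items starting from the last save point; returns how many
    // items this run actually processed. A re-scheduled pod that runs
    // the same job again skips everything already committed.
    static int run(List<String> items, Path checkpoint) {
        int start = readCheckpoint(checkpoint);
        int processed = 0;
        for (int i = start; i < items.size(); i++) {
            // ... process items.get(i) here ...
            writeCheckpoint(checkpoint, i + 1);
            processed++;
        }
        return processed;
    }

    public static void main(String[] args) {
        Path cp = Path.of(System.getProperty("java.io.tmpdir"), "checkpoint-demo.txt");
        reset(cp);
        System.out.println("first run processed: " + run(List.of("a.csv", "b.csv", "c.csv"), cp));
        System.out.println("second run processed: " + run(List.of("a.csv", "b.csv", "c.csv"), cp));
        reset(cp);
    }
}
```

Because completed work is committed before moving on, re-running the job after a failure is cheap: the second run finds the checkpoint and processes nothing it has already done.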

I wrote a blog post in which I explain all these aspects in detail, with code examples. You can find it here: Spring Batch on Kubernetes: Efficient batch processing at scale.

How can we prevent a batch process from processing the same data if we use the Kubernetes auto-scaling feature?

Making each job process a different data set is the way to go (one job per file, for example). There are also other patterns you might be interested in; see Job Patterns in the k8s docs.
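One way to give each replica its own slice of the data is to key the partition on the pod's index. A minimal sketch in plain Java, assuming an indexed Kubernetes Job (which sets the JOB_COMPLETION_INDEX environment variable on each pod); WORKER_COUNT is a hypothetical variable you would set to the Job's completions value:

```java
import java.util.ArrayList;
import java.util.List;

public class PartitionedWorker {

    // Round-robin slice: worker k of n takes items k, k+n, k+2n, ...
    // Every item lands in exactly one worker's slice, so no two
    // replicas ever process the same data.
    static List<String> partition(List<String> inputs, int workerIndex, int workerCount) {
        List<String> mine = new ArrayList<>();
        for (int i = workerIndex; i < inputs.size(); i += workerCount) {
            mine.add(inputs.get(i));
        }
        return mine;
    }

    public static void main(String[] args) {
        // In an indexed Kubernetes Job, each pod receives a distinct
        // JOB_COMPLETION_INDEX value (0 .. completions-1).
        int index = Integer.parseInt(System.getenv().getOrDefault("JOB_COMPLETION_INDEX", "0"));
        int count = Integer.parseInt(System.getenv().getOrDefault("WORKER_COUNT", "1"));
        List<String> files = List.of("a.csv", "b.csv", "c.csv", "d.csv", "e.csv");
        System.out.println("worker " + index + " processes " + partition(files, index, count));
    }
}
```

Because the slices are disjoint by construction, scaling the number of workers up or down (with a matching completions/parallelism change) never causes two pods to pick up the same file.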

Ghibelline answered 30/3, 2020 at 11:58 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.