Akka Streams on Kubernetes

Hi guys, I have some questions about Akka Streams on Kubernetes.

My scenario is: Kafka in, flow, Kafka out.

First question: is it OK to run 3 or more pods with the same application? Is scaling to 3 pods a good idea? Since all instances use the same Kafka group ID, the topic partitions should be balanced across them.

Second question, about health checks: Kubernetes uses health checks to kill and start a new instance when the main process exits or the health check stops responding. What is the best way to build a health check endpoint? Is there any way to check whether the stream is alive or stuck?

Third question, about metrics: is there any way to extract metrics about the stream, like throughput or slow areas? Or only via Lightbend Telemetry?

Any other warnings about deploying Akka Streams on Kubernetes?

Thx a lot guys!!!


Hello Jonatas,

is it OK to run 3 or more pods with the same application? Is scaling to 3 pods a good idea? Since all instances use the same Kafka group ID, the topic partitions should be balanced across them.

This should indeed be fine, as long as you don’t need to share any state across partitions, of course.
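
For illustration, a minimal consume-transform-produce sketch (assuming Alpakka Kafka and Akka 2.6; the bootstrap address, group id, and topic names are placeholders): every pod runs exactly this code, and because they all share the same group id, Kafka assigns each pod its own subset of the input topic's partitions.

```scala
import akka.actor.ActorSystem
import akka.kafka.scaladsl.{Committer, Consumer, Producer}
import akka.kafka.{CommitterSettings, ConsumerSettings, ProducerMessage, ProducerSettings, Subscriptions}
import org.apache.kafka.clients.producer.ProducerRecord
import org.apache.kafka.common.serialization.{StringDeserializer, StringSerializer}

object KafkaInFlowKafkaOut extends App {
  implicit val system: ActorSystem = ActorSystem("stream-app")

  // Same group id in every pod: Kafka balances the topic's partitions
  // across however many pods are currently running.
  val consumerSettings =
    ConsumerSettings(system, new StringDeserializer, new StringDeserializer)
      .withBootstrapServers("kafka:9092")   // placeholder
      .withGroupId("my-flow-group")         // placeholder

  val producerSettings =
    ProducerSettings(system, new StringSerializer, new StringSerializer)
      .withBootstrapServers("kafka:9092")   // placeholder

  Consumer
    .committableSource(consumerSettings, Subscriptions.topics("input-topic"))
    .map { msg =>
      // the "Flow" part: transform the value, keep the offset to commit later
      ProducerMessage.single(
        new ProducerRecord[String, String]("output-topic", msg.record.key, msg.record.value),
        msg.committableOffset)
    }
    .via(Producer.flexiFlow(producerSettings))
    .map(_.passThrough)
    .runWith(Committer.sink(CommitterSettings(system)))
}
```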

Kubernetes uses health checks to kill and start a new instance when the main process exits or the health check stops responding. What is the best way to build a health check endpoint? Is there any way to check whether the stream is alive or stuck?

I can’t really think of a meaningful health check to perform in this case.

is there any way to extract metrics about the stream, like throughput or slow areas? Or only via Lightbend Telemetry?

Outside of Lightbend Telemetry you could add your own stages reporting on this kind of information, but there’s nothing out of the box.
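
As a sketch of such a hand-rolled stage (the name `throughputProbe` is made up for illustration, and it assumes Akka 2.6 for the scheduler API): a pass-through flow that counts elements and logs the count at a fixed interval.

```scala
import java.util.concurrent.atomic.AtomicLong

import scala.concurrent.duration._

import akka.NotUsed
import akka.actor.ActorSystem
import akka.stream.scaladsl.Flow

object StreamProbes {

  // A tiny pass-through probe you can drop between stages with `.via(...)`.
  // It counts elements and logs the count once per interval, giving a rough
  // throughput number for that section of the graph. Hand-rolled, not an Akka API.
  def throughputProbe[T](name: String, interval: FiniteDuration)(
      implicit system: ActorSystem): Flow[T, T, NotUsed] = {
    val counter = new AtomicLong(0L)
    import system.dispatcher
    system.scheduler.scheduleWithFixedDelay(interval, interval) { () =>
      system.log.info("{}: {} elements in the last {}", name, counter.getAndSet(0L), interval)
    }
    Flow[T].map { elem =>
      counter.incrementAndGet()
      elem
    }
  }
}
```

Dropping it in at two points, e.g. `.via(StreamProbes.throughputProbe("before-enrich", 10.seconds))` and again after a suspect stage, gives a rough idea of where elements slow down.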

Any other warnings about deploying Akka Streams on Kubernetes?

There are some general guidelines around running Akka applications in Docker at Deploying • Akka Documentation, but it mostly comes down to application-specific tuning.


I think there is no “best way”. We use the sbt-buildInfo plugin and sometimes monitoring inside the application for really high-level metrics. Examples here and here. But this is just one possible option.

There was a repo called akka-streams-checkpoint. The original repo was deleted or moved, but some artifacts are still on Maven Central. I’m not saying that using an unmaintained codebase is a good idea, but studying its inner logic and using it as a basis for your own implementation can be useful (if you search on GitHub there is an old fork, too, as a starting point). (Actually, I’m a bit shocked, because I worked on multiple projects that used this lib, which means I should have come up with something for them.) You can still read about the basic concepts from the lib’s author.

One thing: deciding whether a stream is stuck or simply not getting enough data is HARD. You can make assumptions, e.g. that you should get at least one element every 10 minutes (or something like that), but if your upstream (for example, your Kafka) is failing, you will kill the innocent stream apps over and over.
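
If you do want to encode that kind of assumption anyway, the built-in `idleTimeout` stage is one way to do it. A minimal sketch (the `Msg` type and the 10-minute value are just placeholders):

```scala
import scala.concurrent.duration._

import akka.NotUsed
import akka.stream.scaladsl.Flow

object StalenessGuard {
  // Placeholder element type; use whatever your Kafka flow actually carries.
  type Msg = String

  // idleTimeout fails the stream with a TimeoutException when no element has
  // passed for the given duration; drop it into the graph with `.via(guard)`
  // and let the failure feed your restart/liveness handling. As noted above,
  // it cannot tell a stuck stream from an upstream that simply has nothing to
  // deliver, so a perfectly healthy app may still get restarted.
  val guard: Flow[Msg, Msg, NotUsed] = Flow[Msg].idleTimeout(10.minutes)
}
```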

I also found this: https://github.com/ruippeixotog/akka-stream-mon, but it seems dead too. It could also be good for starting points and ideas. Actually, I really like its inner design.