In my use case, I need to produce data to Kafka. When Kafka is not available, I want the producing stream to backpressure and retry with exponential backoff, without losing any of the produced data.
It looks like Akka Streams / Alpakka doesn’t offer a solution for that. I’d like to be proven wrong.
Here’s what I have (it’s Kotlin, but should be pretty straightforward; the first line generates an Iterable of all Int values counting up from 0: 0, 1, 2, …).
Source.from(generateSequence(0) { it + 1 }.asIterable())
    .throttle(5000, Duration.ofSeconds(1))   // at most 5000 elements per second
    .map {
        ProducerMessage.single(ProducerRecord<String, String>("increasing-topic-1", it.toString()))
    }
    // wrap the producer flow so it is restarted with exponential backoff on failure
    // (min backoff 100 ms, max backoff 30 s, random factor 0.5, unlimited restarts)
    .via(RestartFlow.onFailuresWithBackoff(Duration.ofMillis(100), Duration.ofSeconds(30), 0.5, -1) {
        Producer.flexiFlow<String, String, NotUsed>(producerSettings)
    })
    .runWith(Sink.ignore(), mat)
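Not shown above: producerSettings and mat come from a completely ordinary setup, roughly along these lines (the system name and bootstrap server are placeholders; the exact values shouldn’t matter for the question):

import akka.actor.ActorSystem
import akka.kafka.ProducerSettings
import akka.stream.ActorMaterializer
import org.apache.kafka.common.serialization.StringSerializer

val system = ActorSystem.create("producer-test")
// materializer passed to runWith above
val mat = ActorMaterializer.create(system)
// plain string-keyed / string-valued producer settings
val producerSettings = ProducerSettings.create(system, StringSerializer(), StringSerializer())
    .withBootstrapServers("localhost:9092")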
What I see (consistent with what RestartFlow’s docs say) is that when I run this program and make Kafka unavailable and then available again, some numbers are lost: e.g. there are 60k messages in the topic, but the highest one contains the number 72k.
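For reference, I check the topic afterwards with a quick throwaway consumer, roughly like this (reusing system and mat from above; group id and timeout are arbitrary). It just counts the records and reports the highest value it has seen:

import akka.kafka.ConsumerSettings
import akka.kafka.Subscriptions
import akka.kafka.javadsl.Consumer
import akka.stream.javadsl.Sink
import org.apache.kafka.clients.consumer.ConsumerConfig
import org.apache.kafka.common.serialization.StringDeserializer
import java.time.Duration

val consumerSettings = ConsumerSettings.create(system, StringDeserializer(), StringDeserializer())
    .withBootstrapServers("localhost:9092")
    .withGroupId("increasing-topic-check")
    .withProperty(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest")

Consumer.plainSource(consumerSettings, Subscriptions.topics("increasing-topic-1"))
    .map { it.value().toLong() }
    // stop reading once the topic has been drained (crude, but enough for a sanity check)
    .takeWithin(Duration.ofSeconds(30))
    .runWith(Sink.seq(), mat)
    .thenAccept { values ->
        val highest = values.fold(0L) { acc, v -> maxOf(acc, v) }
        println("count=${values.size}, highest=$highest")
    }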