Hi David,
maybe the discussion in Throughput and batching with Alpakka Kafka is helpful to you?
Short summary:
- The `mapAsync` part that you found is independent of the order of publishing. In itself, it only guarantees that you will receive the `Result`s (i.e. the success notifications of the publishing process) in the same order as the messages you put into it. (If the `Producer` was using `mapAsyncUnordered` instead, this would not be guaranteed. The `parallelism` setting affects the size of the internal buffer that `mapAsync` uses, but that buffer uses head-of-line blocking to guarantee ordering.)
- The order of publishing is managed by the underlying `KafkaProducer` from the Kafka clients library. The Alpakka `flexiFlow` will pass messages to the `KafkaProducer` in the same order they arrive, then collect the futures that track publishing success for each message and hand them back to you once they're done. The `KafkaProducer` will usually keep the order, but make sure to read the documentation for the producer configs `retries` and `max.in.flight.requests.per.connection` at https://kafka.apache.org/documentation/ ! In particular, the default for `max.in.flight.requests.per.connection` is `5` and I don't think Alpakka overrides this by default, so a retried batch can end up behind a later one, and you need to adjust your configuration (for example by lowering that value to `1`) to retain ordering.
- Finally - although I'm certain you're aware of that - "publishing in order" only preserves order for downstream consumers if your partitioner is the same for the source and target topic, as two messages consumed from the source will inherently lose their relative order if they are published to two different partitions downstream.
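To make the second point a bit more concrete, here is a minimal sketch of producer properties that should keep per-partition ordering when sends are retried. The config keys are the standard Kafka producer ones; the class name, the helper method, the broker address and the concrete `retries` value are just placeholders I made up for illustration. As far as I know, enabling `enable.idempotence` is an alternative that lets you keep up to 5 requests in flight, but please check that against the Kafka documentation for your client version.

```java
import java.util.Properties;

// Hypothetical helper class, not part of Alpakka or the Kafka clients library.
class OrderedProducerConfig {

    // Builds producer properties that trade some throughput for strict ordering.
    static Properties orderedProducerProps(String bootstrapServers) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.setProperty("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        // Wait for all in-sync replicas before a send counts as successful.
        props.setProperty("acks", "all");
        // Illustrative value: failed sends are retried a few times.
        props.setProperty("retries", "5");
        // Only one unacknowledged request per connection, so a retried
        // batch cannot be overtaken by a later one.
        props.setProperty("max.in.flight.requests.per.connection", "1");
        return props;
    }

    public static void main(String[] args) {
        // Assumed local broker address, replace with your own.
        Properties props = orderedProducerProps("localhost:9092");
        props.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

With Alpakka you would not normally build the `Properties` yourself. I believe the same keys can be set in the `kafka-clients` section of the producer configuration or via `ProducerSettings.withProperty`, but I have not verified that against your Alpakka version.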
Hope that helps!