Let’s say I have a simple graph: Source → Flow → Sink
The Source is an AzureServiceBusSource.
The Flow takes each message and uses the provided information to make an HTTP request to an API. When that request is successful, I complete the message at the Service Bus, effectively removing it from the Queue. When it fails with an expected error value, it is forwarded to the Dead Letter Message Queue, also removing it from the Queue.
The problem is: how do I deal effectively with transient failures, e.g. when the API is not available?
Inside the Flow, each HTTP request to this service is retried 5 times. Currently, when the retry attempts are exhausted, the element is discarded and the next element is tried. This is not ideal: the target service could still be unavailable, and I don’t want to discard elements, because the systems are supposed to become consistent eventually.
When a request has failed 5 times, I can conclude that the external service is unreachable. At that point, I want to immediately stop pulling new elements from my Source, and I also don’t want to lose the element that caused the error.
Ideally, there would be a backoff period, after which the Flow resumes with the element that failed last.
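To make the behaviour I am after concrete, here is a minimal, library-agnostic sketch in plain Python. It is only a model of the semantics, not Akka Streams code: `process_with_backoff`, `TransientError`, `call_api` and the backoff parameters are all hypothetical names and values chosen for illustration. The failing element is held and retried after a backoff, and nothing new is pulled from the source until it has succeeded.

```python
import time
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")
R = TypeVar("R")


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (e.g. API unavailable)."""


def process_with_backoff(
    source: Iterable[T],
    call_api: Callable[[T], R],           # hypothetical stand-in for the HTTP request
    attempts_per_round: int = 5,          # the 5 immediate retries from the question
    min_backoff: float = 1.0,             # assumed values, in seconds
    max_backoff: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[R]:
    # The loop asks the source for the next element only after the current
    # one has succeeded, so nothing is pulled while the API is down.
    for element in source:
        backoff = min_backoff
        while True:
            succeeded = False
            result = None
            for _ in range(attempts_per_round):
                try:
                    result = call_api(element)
                    succeeded = True
                    break
                except TransientError:
                    continue  # immediate retry within the round
            if succeeded:
                yield result
                break
            # Round exhausted: keep the same element, wait, then try again.
            sleep(backoff)
            backoff = min(backoff * 2, max_backoff)
```

In a backpressured stream, the equivalent would be a stage that does not signal demand upstream until the current element has been handled, which is what I do not know how to express with the built-in stages.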
What are my options?
It would be unnecessary to restart the whole stream, because downstream stages not depicted here could very well still be working. Why disrupt them?
A RestartFlow seems to be lossy, so the original message would be lost. I cannot afford that.
Another idea was to propagate the failure upstream to the Source, so that it could be replaced with another instance of the Source, effectively restarting it. However, there seems to be no way to propagate exceptions upstream.
What should I do now? Do I need to use a KillSwitch, a custom Graph Stage, or an Actor to do what I need to do?
Any hints would be appreciated.