The ScalaWS documentation lists a play.ws.ahc.maxRequestRetry parameter that can be configured on AsyncHttpClientConfig. If this is the right file, the default appears to be 5.
What is the behavior of maxRequestRetry in the presence of timeouts? Let’s say that I make a call to an endpoint and the call times out. Will that call be retried up to maxRequestRetry times?
@marcospereira Thanks for the response. It seems to me that there’s a mismatch between what play-ws is trying to do and what is actually happening. If you look at play-ws, it is configured for 5 retries by default, yet no retries happen. Either retries should be made to work, or the default should be set to 0.
I mean that Play’s default configuration intends to perform 5 retries of any request. The API (or the configuration, in this case) is making that promise to me as the app developer. In reality, however, it seems that no retries are performed, which looks like a bug. See the Stack Overflow link posted above: others are running into the same scenario and questioning it too. Either the retries should be made to work with the configured value of 5, or the default should be changed to 0, because that is how it actually behaves. Either way, it seems to be an issue somewhere that setting a non-zero value for retries does not result in any retries.
Well, as I said, the semantics of the retry are not defined by play-ws; it is just exposing the configuration. I understand, though, that this should not be a concern for the developer using play-ws. But setting it to zero or any other value sounds pointless to me if in the end the expectations aren’t met.
Anyway, I don’t have the time to investigate how async-http-client handles the retries and under which circumstances (retry on connection error? connection timeout?). By the way, under which circumstances are you expecting a retry? Finally, do you have time to do some investigation here?
My fear is actually that I will update the Play version at some point and all of a sudden I will have working retries when I wasn’t expecting them.
In my system, some calls make purchases through a legacy service that can take a long time to respond under load. We sometimes see the call time out (because the legacy service is still generating the response) even though the purchase went through. If I suddenly got 5 retries on every timed-out call without explicitly asking for them, many users could wind up with 6 purchases instead of 1. That would be a big problem. In my experience, many (most?) endpoints are not written to be idempotent, so I would expect the default to be 0 retries, with developers carefully and explicitly enabling retries where needed on a per-call basis, not on a per-WSClient basis.
My suggestion would be to change the default configuration to 0 retries, though I understand if others feel differently.
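In the meantime, one way to guard against a surprise change in behavior is to pin the value explicitly instead of relying on the default. A minimal application.conf fragment, assuming the play.ws.ahc.maxRequestRetry key from the ScalaWS documentation is the setting that controls this:

```
# application.conf
# Pin retries to 0 so a future upgrade cannot silently enable them.
# Assumption: this key is the one AsyncHttpClient actually honors.
play.ws.ahc.maxRequestRetry = 0
```

Whether the underlying client applies this setting to timed-out requests is still the open question in this thread, so this only makes the intent explicit.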
> under which circumstances are you expecting a retry?
We have implemented retries ourselves at the application level, and we use them explicitly only for calls where we know a retry is safe and desired. So far we retry on TimeoutException or ConnectException.
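To illustrate the kind of application-level retry described above, here is a minimal sketch, not our production code. The `Retry.retrying` helper is hypothetical and not part of play-ws, and it assumes the failures surface as java.util.concurrent.TimeoutException and java.net.ConnectException on the returned Future.

```scala
import java.net.ConnectException
import java.util.concurrent.TimeoutException
import scala.concurrent.{ExecutionContext, Future}

object Retry {
  // Hypothetical helper, not part of play-ws.
  // Runs `call` once, then re-runs it up to `maxRetries` more times,
  // but only when the Future fails with a timeout or a connection error.
  // Any other failure, or running out of retries, is returned to the caller.
  def retrying[A](maxRetries: Int)(call: () => Future[A])(
      implicit ec: ExecutionContext): Future[A] =
    call().recoverWith {
      case _: TimeoutException | _: ConnectException if maxRetries > 0 =>
        retrying(maxRetries - 1)(call)
    }
}
```

A call that is known to be safe to repeat could then be wrapped as `Retry.retrying(2)(() => ws.url(url).get())`, where `ws` is a WSClient and `url` is a placeholder. Calls such as the purchase request would simply not be wrapped, which keeps the decision per call. Note that this sketch retries immediately, with no delay or backoff, and does not catch exceptions thrown synchronously by `call`.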
> do you have time to do some investigation here?
I don’t have any experience with AHC or Netty, so I don’t think it would be very efficient for me to take a stab at this problem, but I could take a look at the underlying code sometime.