Hi, I’m going to exaggerate a bit here for illustrative purposes.
Is it possible to configure akka-http such that even 100,000 POSTs sent within a space of 1 ms (whether over the same TCP connection or not) wouldn’t result in any Connection Refused errors being sent to any of the clients?
What I would like to happen is that all of those 100,000 requests would get queued (in no particular order) and then be processed by [insert number of processing cores here] threads. The kicker is that for every request received I’m making an async call to a legacy service, which needs to be throttled to 1 per second. Is there a way to configure akka-http to just take 100,000 seconds to work through all of the requests, and not throw any exceptions?
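To make the intent concrete, here is a rough sketch of the kind of setup I have in mind (this is not code from my actual project; `callLegacyService` is a placeholder, and the buffer size and port are purely illustrative). Incoming requests are offered to a bounded queue, which is drained at one element per second; in this sketch the HTTP response is just an acknowledgment that the request was queued, not the result of the legacy call:

```scala
import akka.actor.ActorSystem
import akka.http.scaladsl.Http
import akka.http.scaladsl.model.StatusCodes
import akka.http.scaladsl.server.Directives._
import akka.stream.{OverflowStrategy, QueueOfferResult}
import akka.stream.scaladsl.{Sink, Source}

import scala.concurrent.Future
import scala.concurrent.duration._
import scala.util.Success

object ThrottledQueueSketch extends App {
  implicit val system: ActorSystem = ActorSystem("throttled-queue")

  // Placeholder for the async call to the legacy service.
  def callLegacyService(payload: String): Future[String] =
    Future.successful(s"processed: $payload")

  // A bounded queue drained at most once per second, no matter how fast requests arrive.
  val queue =
    Source
      .queue[String](bufferSize = 100000, OverflowStrategy.dropNew)
      .throttle(elements = 1, per = 1.second)
      .mapAsync(parallelism = 1)(callLegacyService)
      .to(Sink.foreach(result => system.log.info(result)))
      .run()

  val route =
    post {
      entity(as[String]) { payload =>
        // Acknowledge as soon as the request is enqueued; reject if the buffer is full.
        onComplete(queue.offer(payload)) {
          case Success(QueueOfferResult.Enqueued) => complete(StatusCodes.Accepted)
          case _                                  => complete(StatusCodes.ServiceUnavailable)
        }
      }
    }

  Http().newServerAt("localhost", 8080).bind(route)
}
```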
If this is not possible, what’s the way to get the closest to that goal?
This is a pretty interesting question that could provide enough material for a whole article…
I guess the underlying question here is how queuing and timeouts work together for such a spike and what other resources are needed to queue up 100000 requests.
Since each TCP connection can only handle a single request at a time with HTTP/1.1, let’s assume we are talking about 100000 incoming TCP connections in one ms. This will be pretty tough on the OS, and a lot depends on how your TCP stack is configured. Here is a very quick rundown of what is involved:
- How many (TCP SYN) packets can the network deliver in 1 ms?
- Can the OS process these packets (complete the TCP handshakes), or will some be dropped?
- If some packets are dropped, will they be retried by the client, and for how long?
- Once the TCP handshake completes, the connection sits in the socket accept backlog until the application accepts it. What is the size of that backlog, and how much memory does it need per connection?
- How much memory and how many other resources does the OS need to maintain 100000 TCP connections (buffer sizes, config data in ipfilter, etc.)?
- When the TCP handshake is successful, will the clients wait long enough after they have sent their requests before timing out?
Akka HTTP has a max-connections setting which governs how many connections are accepted concurrently; when that number is reached, further TCP connections should be backlogged in the socket accept backlog, which is maintained by the OS.
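For reference, here is a minimal sketch (in Scala, with purely illustrative numbers) of raising those knobs for such a test. max-connections and backlog are real settings in the akka.http.server section, but the values below are just placeholders to tune via load testing:

```scala
import akka.actor.ActorSystem
import akka.http.scaladsl.Http
import akka.http.scaladsl.server.Directives._
import akka.http.scaladsl.settings.ServerSettings
import com.typesafe.config.ConfigFactory

object TunedServerSketch extends App {
  // Illustrative values only; the right numbers come out of load testing.
  val config = ConfigFactory.parseString(
    """
      |akka.http.server {
      |  max-connections = 4096   # how many connections Akka HTTP handles concurrently
      |  backlog = 1024           # hint for the size of the OS accept backlog
      |}
      |""".stripMargin
  ).withFallback(ConfigFactory.load())

  implicit val system: ActorSystem = ActorSystem("tuned-server", config)

  val route = post { complete("ok") }

  Http()
    .newServerAt("0.0.0.0", 8080)
    .withSettings(ServerSettings(system)) // reads the akka.http.server section above
    .bind(route)
}
```

Keep in mind that the backlog value is only a hint to the OS (on Linux it is additionally capped by net.core.somaxconn), so the OS-level questions above still apply.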
If there is more intermediate infrastructure, like reverse proxies, then these will introduce their own settings and queues, and you will have to take them into account as well.
I’d run performance tests, and then try to find out what the current problems are. This is likely more of an OS TCP stack optimization problem than an Akka problem after all.
I think my case is much more mundane. I created a simple test project to show my simplified setup. I have:

#1: Undertow that responds after one second (LaunchMeFirstApp.scala)
#2: Undertow that calls #1 synchronously in a thread pool (UndertowApp.scala)
#3: Akka-http that calls #1 async (AkkaHttpApp.scala)
If you launch all three and then the Gatling test (Runner.scala) and press 1 to test the Undertow app, you’ll get decent results - 10k messages all OK
> request count 10000 (OK=10000 KO=0 )
If you launch Gatling and press 0 to test the Akka app, you’ll get lots (99%+) of failures (j.n.ConnectException: Connection refused: no further information)
> request count 10000 (OK=187 KO=9813 )
I understand that my settings must be incorrect (the Scala object Conf in AkkaHttpApp.scala). Do you have any pointers as to what could be going on?