Akka gRPC is not yet highly optimized; we are still making improvements there. Even with those improvements, though, we expect some overhead compared to a fully native implementation.
It seems that akka-grpc has an overhead of about 35 ms? In my case it is only about 2 ms. Is that maybe a hardware difference? Could hardware alone account for a jump from 2 ms to 35 ms? Or is there something wrong with my benchmarking?
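
One way to rule out a benchmarking artifact is to discard JVM warm-up calls and compare percentiles rather than a single average, since JIT compilation and connection setup can inflate the first requests by a lot. Below is a minimal sketch of such a measurement in plain Java. The class `LatencyBench`, its methods, and the `Thread.sleep(1)` placeholder are all hypothetical and not part of Akka gRPC; the placeholder would be replaced by one blocking unary request against the server under test.

```java
import java.util.Arrays;

// Hypothetical helper for illustration only; not part of Akka gRPC.
final class LatencyBench {

    // Runs `call` `warmup` times without recording anything, then times
    // `iterations` further calls. Returns the samples in nanoseconds,
    // sorted in ascending order.
    static long[] measureNanos(Runnable call, int warmup, int iterations) {
        for (int i = 0; i < warmup; i++) {
            call.run(); // warm-up: lets the JIT and connection pools settle
        }
        long[] samples = new long[iterations];
        for (int i = 0; i < iterations; i++) {
            long start = System.nanoTime();
            call.run();
            samples[i] = System.nanoTime() - start;
        }
        Arrays.sort(samples);
        return samples;
    }

    // Nearest-rank percentile over an ascending-sorted array, p in (0, 100].
    static long percentile(long[] sorted, double p) {
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        int idx = Math.min(sorted.length - 1, Math.max(0, rank - 1));
        return sorted[idx];
    }

    public static void main(String[] args) {
        // Placeholder workload (assumption): stands in for one blocking
        // request/response round trip to the service being measured.
        Runnable call = () -> {
            try {
                Thread.sleep(1);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };

        long[] samples = measureNanos(call, 200, 1000);
        System.out.printf("p50 = %.3f ms%n", percentile(samples, 50.0) / 1e6);
        System.out.printf("p99 = %.3f ms%n", percentile(samples, 99.0) / 1e6);
    }
}
```

Running the same harness against both implementations on the same machine, with the same payload and the same number of concurrent calls, should make it easier to tell whether a 2 ms versus 35 ms gap comes from hardware, from warm-up, or from the setup of the benchmark itself.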