System is terminated as expected or crashed?

In graceful shutdown, is it possible to know whether the system is terminated as expected or crashed due to unexpected error?

This topic is based on a conversation between an anonymous user of Akka and our “Ask AI about Akka” bot in the documentation.

Yes, it is possible to know whether the system is terminated as expected or crashed due to an unexpected error. In Akka, the Coordinated Shutdown process is used to manage the graceful shutdown of an application. This process is triggered when a node is sent a SIGTERM signal, not a SIGKILL.

During the Coordinated Shutdown, the system will perform a Cluster(system).leave on itself, change the status of the member to Exiting, allow any shards to be shut down gracefully, and migrate ClusterSingletons if this was the oldest node. Finally, the node is removed from the Akka Cluster membership. This process ensures that the system is terminated as expected.

However, if an error occurs, such as a crash, the system may be terminated unexpectedly. In such cases, the system may not have time to run the Coordinated Shutdown process. For example, if the guardian actor crashes on start, the system is immediately terminated, and there is no actor system to keep running the coordinated shutdown anymore.

In general, the Coordinated Shutdown will be most important for things that need some graceful shutting down, such as sharding waiting for stopping entities, Akka HTTP/gRPC giving in-flight requests time to complete before closing connections, and cluster leaving instead of relying on SBR to kick a node out.

When a node crashes without leaving the cluster, the remaining nodes see it as unreachable. To handle such scenarios, Akka provides the Split Brain Resolver module. This module has a number of strategies that can ensure that the cluster continues to function during network partitions and node failures.

Assume we have two functions: handleTerminate() should be called when the system exits properly, and handleCrash() should be called when the system crashes. Write sample code in Java to show how we can call them correctly in Coordinated Shutdown.

Sure, you can use the Coordinated Shutdown API to add tasks that will be executed when the system is shutting down. Here is sample code in Java:

import akka.Done;
import akka.actor.CoordinatedShutdown;
import java.util.concurrent.CompletableFuture;

// `system` is assumed to be your already running ActorSystem

// Define your shutdown handlers
Runnable handleTerminate = () -> System.out.println("System terminated properly");
Runnable handleCrash = () -> System.out.println("System crashed");

// Add your shutdown handlers to the Coordinated Shutdown
CoordinatedShutdown.get(system)
    .addTask(
        CoordinatedShutdown.PhaseBeforeServiceUnbind(),
        "taskName",
        () -> {
            handleTerminate.run();
            return CompletableFuture.completedFuture(Done.getInstance());
        });

// Add JVM shutdown hook for crash
Runtime.getRuntime().addShutdownHook(new Thread() {
    public void run() {
        handleCrash.run();
    }
});

In this code, handleTerminate is added as a task to the Coordinated Shutdown and is executed when the system shuts down gracefully. handleCrash is registered as a JVM shutdown hook and is executed whenever the JVM shuts down, whatever the cause.

Please note that handleCrash is therefore also called when the system shuts down properly, because JVM shutdown hooks run on normal termination as well. They do not run at all if the process is killed with SIGKILL or the JVM itself aborts. If you want to distinguish between a crash and a normal termination, you would need to add some additional logic.
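One possible form of that additional logic is a flag that the Coordinated Shutdown task sets and the JVM shutdown hook reads. The following is a minimal sketch in plain Java, not an Akka API: the class ShutdownOutcome and its methods are hypothetical names, and it assumes markGraceful() is called from inside the task passed to addTask(...) while onJvmExit(...) is called from the JVM shutdown hook.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical helper: remembers whether a graceful shutdown was started. */
final class ShutdownOutcome {
    private final AtomicBoolean graceful = new AtomicBoolean(false);

    /** Assumption: called from a Coordinated Shutdown task, so only on graceful shutdown. */
    void markGraceful() {
        graceful.set(true);
    }

    /** Call from the JVM shutdown hook; runs exactly one of the two handlers. */
    void onJvmExit(Runnable handleTerminate, Runnable handleCrash) {
        if (graceful.get()) {
            handleTerminate.run();
        } else {
            handleCrash.run();
        }
    }

    public static void main(String[] args) {
        ShutdownOutcome outcome = new ShutdownOutcome();

        // The hook decides which handler to run when the JVM exits.
        Runtime.getRuntime().addShutdownHook(new Thread(() ->
            outcome.onJvmExit(
                () -> System.out.println("System terminated properly"),
                () -> System.out.println("System crashed"))));

        // In an Akka application this call would sit inside the addTask(...) body.
        outcome.markGraceful();
    }
}
```

Run as written, the program prints "System terminated properly" at exit; without the markGraceful() call it prints "System crashed". Two caveats apply to this sketch. JVM shutdown hooks run concurrently and in no guaranteed order, so if Coordinated Shutdown is itself started from a JVM shutdown hook, the flag may be read before the task has set it. And neither handler runs if the process is killed with SIGKILL.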

From handleTerminate, can I access certain class variables of actors to support cleanup logic?

In general, accessing actor state directly from outside the actor is not recommended in Akka, as it breaks the actor encapsulation and can lead to race conditions. However, you can design your actors to respond to specific messages with their state, which you can then use in your shutdown logic.

Here’s an example of how you might do this:

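import akka.actor.typed.ActorRef;
import akka.actor.typed.Behavior;
import akka.actor.typed.javadsl.Behaviors;

public class MyActor {
    // Message that requests the actor's state; the reply is sent to replyTo
    public static final class GetState {
        public final ActorRef<String> replyTo;

        public GetState(ActorRef<String> replyTo) {
            this.replyTo = replyTo;
        }
    }

    // The state is passed in as a parameter, so it stays private to the actor
    public static Behavior<GetState> create(String state) {
        return Behaviors.receive(GetState.class)
            .onMessage(GetState.class, msg -> {
                // Respond with the actor's state
                msg.replyTo.tell(state);
                return Behaviors.same();
            })
            .build();
    }
}

Then, in your shutdown logic, you can send a GetState message to the actor and use the response in your cleanup logic:

import akka.Done;
import akka.actor.typed.ActorRef;
import akka.actor.typed.javadsl.AskPattern;
import java.time.Duration;
import java.util.concurrent.CompletionStage;
import java.util.function.Supplier;

// `myActor` is assumed to be an ActorRef<MyActor.GetState> and `system` a typed ActorSystem
// Define your shutdown handler; it returns a stage so a shutdown task can wait for the cleanup
Supplier<CompletionStage<Done>> handleTerminate = () -> {
    // Ask the actor for its state
    CompletionStage<String> reply =
        AskPattern.ask(
            myActor,
            (ActorRef<String> replyTo) -> new MyActor.GetState(replyTo),
            Duration.ofSeconds(3),
            system.scheduler());

    // Use the actor's state in your cleanup logic
    return reply.thenApply(state -> {
        System.out.println("Actor's state: " + state);
        // Perform cleanup...
        return Done.getInstance();
    });
};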

Please note that this is a simplified example and actual usage would depend on your specific use case.
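During shutdown the actor may already have stopped, so a request for its state can fail or time out. The sketch below shows one way to make sure cleanup still runs in that case, using only plain Java futures as a stand-in for the reply to the ask. CleanupStep, cleanupWithState and the fallback value are hypothetical names chosen for illustration, not part of Akka.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;
import java.util.concurrent.TimeoutException;
import java.util.function.Consumer;

/** Hypothetical helper: plain Java, no Akka types. */
final class CleanupStep {
    /**
     * Runs cleanup with the reply, or with fallbackState if the reply failed.
     * The returned stage completes only after cleanup has run.
     */
    static CompletionStage<Void> cleanupWithState(
            CompletionStage<String> reply, String fallbackState, Consumer<String> cleanup) {
        return reply
            .exceptionally(error -> fallbackState) // e.g. the request timed out
            .thenAccept(cleanup);
    }

    public static void main(String[] args) {
        List<String> seen = new ArrayList<>();

        // Reply arrives: cleanup sees the actor's state.
        cleanupWithState(CompletableFuture.completedFuture("state-1"), "unknown", seen::add);

        // Reply fails (stand-in for an ask timeout): cleanup sees the fallback.
        CompletableFuture<String> failed = new CompletableFuture<>();
        failed.completeExceptionally(new TimeoutException("no reply"));
        cleanupWithState(failed, "unknown", seen::add);

        System.out.println(seen); // prints [state-1, unknown]
    }
}
```

Because the returned stage completes only after cleanup has run, a shutdown task that returns it (mapped to Done) would keep its phase open until the cleanup finishes or the phase timeout expires.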
