How does Akka Database sharding work?

How does Akka Database sharding work?

This topic is based on a conversation between an anonymous user of Akka and our “Ask AI about Akka” bot in the documentation.

Akka Persistence R2DBC is able spread the load of your application over more than one database. The event journal, snapshot store and durable state can be split up over multiple tables and physical backend databases.

  1. Database Sharding: Using a single non-distributed database can become a bottleneck for applications that have high throughput requirements. To address this, the event journal, snapshot store, and durable state can be split up over multiple tables and physical backend databases. This is achieved through database partitioning and database sharding. The data is partitioned by the slices that are used for eventsBySlices and Projections.

  2. Data Partitions: You can configure how many data partitions are needed. A data partition corresponds to a separate database table. The number of data partitions must be between 1 and 1024 and a whole number divisor of 1024 (number of slices), e.g. 2, 4, 8, 16. Each data partition corresponds to a database table. The tables will have the data partition as suffix, e.g. event_journal_0, event_journal_1.

  3. Configuration: The data partitions are configured with the number of tables and databases that the data will be split into. The selection of data partition is made from the slice of the persistenceId. The number of Projection instances when using eventsBySlices must be greater than or equal to the number of data partitions.

  4. Schema: Each data partition corresponds to a table. You can copy the DDL statements for the tables and indexes from Creating the schema but change the table and index names to include data partition suffix.

  5. Example: If you configure 8 data partitions and 4 databases, an individual entity will map to a specific slice and the entity will read and write to the table that covers the corresponding slice range. Each database may host several of the data partition tables. Each database requires a separate connection factory and connection pool.

You can find more details on the source page.

This is a corrected answer to the reply of the “Ask AI about Akka” bot in the documentation. Originally the bot explained the concept of “Akka Cluster Sharding” which must not be confused with “Akka Database Sharding”.