# Partitions and objects in diskless topics
Diskless topics use the standard Kafka partitioning model but store data in cloud object storage instead of broker-local disks. Brokers batch messages and upload them as objects to the storage layer.
## Partitions
Partitions in diskless topics behave the same as in classic Kafka topics. Each partition is an append-only, ordered log of messages, which provides per-partition ordering, parallel processing, and horizontal scalability.
- Producers write to partitions based on a key or round-robin logic.
- Consumers read from partitions independently, enabling concurrent processing.
- The number of partitions controls how many producers or consumers can operate in parallel.
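The routing rules above can be sketched as follows. This is an illustrative model only, not the actual client implementation: real Kafka producers hash serialized keys with murmur2, while this sketch uses `hashlib` for brevity. The function names and the partition count are hypothetical.

```python
# Illustrative sketch of producer-side partition routing (not the
# official Kafka client code, which uses murmur2 key hashing).
import hashlib
from itertools import count

NUM_PARTITIONS = 6  # hypothetical partition count for this example

def partition_for_key(key: bytes, num_partitions: int = NUM_PARTITIONS) -> int:
    """Keyed records: the same key always maps to the same partition,
    which is what preserves per-key ordering."""
    digest = int.from_bytes(hashlib.md5(key).digest()[:4], "big")
    return digest % num_partitions

_rr = count()

def partition_round_robin(num_partitions: int = NUM_PARTITIONS) -> int:
    """Keyless records: cycle through partitions to spread load evenly."""
    return next(_rr) % num_partitions
```

Because keyed routing is deterministic, adding partitions changes the key-to-partition mapping, which is one reason partition counts are usually chosen up front.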
## Objects in diskless topics
In classic Kafka, partitioned data is stored in ordered segment files on broker disks. Diskless topics replace these segments with cloud-stored objects.
Each object is a batch of messages that a broker uploads to cloud object storage. Unlike classic Kafka segments, an object is not limited to a single partition. It can include messages from multiple partitions. Messages within an object are not ordered across partitions.
| Storage detail | Classic Kafka segment | Diskless topics object |
|---|---|---|
| Location | Local disk on broker | Cloud object storage |
| Structure | Ordered messages per partition | Batches containing messages from one or more partitions |
| Management | By the broker | By internal Batch Coordinator metadata |
| Replication | Kafka replication protocol | Provided by the storage provider |
Message ordering is preserved at the partition level using metadata. When a broker uploads a batch, it registers the offset range and object reference with the internal Batch Coordinator. Consumers use this metadata to fetch messages in the correct order, even when data spans multiple objects.
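The metadata flow described above can be sketched as a minimal model. All names here (`BatchEntry`, `BatchCoordinator`, `register_upload`, `objects_for_fetch`) are hypothetical; the real Batch Coordinator is an internal component with its own API.

```python
# Illustrative sketch: brokers register uploaded objects with per-partition
# offset ranges, and consumers use that metadata to read in offset order.
from dataclasses import dataclass, field

@dataclass
class BatchEntry:
    object_key: str   # reference to the object in cloud storage
    partition: int
    base_offset: int  # first offset this object holds for the partition
    count: int        # number of messages for this partition

@dataclass
class BatchCoordinator:
    index: dict = field(default_factory=dict)  # partition -> [BatchEntry]

    def register_upload(self, object_key, per_partition_ranges):
        """Broker side: called after an object upload completes.
        per_partition_ranges: {partition: (base_offset, count)} -- one
        object may carry messages from several partitions."""
        for partition, (base, count) in per_partition_ranges.items():
            self.index.setdefault(partition, []).append(
                BatchEntry(object_key, partition, base, count))

    def objects_for_fetch(self, partition, from_offset):
        """Consumer side: which objects hold offsets >= from_offset,
        returned in offset order even when data spans multiple objects."""
        entries = sorted(self.index.get(partition, []),
                         key=lambda e: e.base_offset)
        return [e for e in entries if e.base_offset + e.count > from_offset]
```

The key point the sketch illustrates is that ordering lives in the metadata, not in the objects themselves: an object may interleave partitions, but each partition's offset ranges let a consumer reassemble a strictly ordered stream.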
To reduce latency, each broker may cache frequently accessed objects in memory or on ephemeral disk, typically within the same availability zone.
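A broker-side cache of hot objects can be sketched as a simple LRU keyed by object reference. This is a hypothetical illustration of the idea, not the broker's actual cache implementation; `ObjectCache` and `fetch_fn` are names invented for this example.

```python
# Hypothetical sketch of a broker-side LRU cache for hot objects,
# avoiding repeated reads from cloud object storage.
from collections import OrderedDict

class ObjectCache:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self._cache = OrderedDict()  # object_key -> object bytes

    def get(self, key, fetch_fn):
        """Return cached bytes, fetching from object storage on a miss."""
        if key in self._cache:
            self._cache.move_to_end(key)      # mark as recently used
            return self._cache[key]
        data = fetch_fn(key)                  # e.g. a GET against object storage
        self._cache[key] = data
        if len(self._cache) > self.capacity:
            self._cache.popitem(last=False)   # evict least recently used
        return data
```

Keeping the cache within the broker's availability zone matters because cross-zone reads from object storage add both latency and data-transfer cost.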
## Related pages