Your Confluent Cluster Linking DR setup is solid. Two clusters, one link, replication lag under five seconds. Then your platform team asks a reasonable question: the DR cluster is sitting idle, your producers in the secondary region are paying cross-regional latency on every write, and a failover still involves a manual promotion step. Can both clusters carry live traffic simultaneously? Can a regional outage become a non-event rather than an incident?
Active-active is the answer. Both clusters handle live producer and consumer traffic at all times. When one goes down, the other is already running. This post covers how to build that topology correctly using Confluent's bidirectional cluster link mode, including the topic naming convention Confluent recommends, how consumers handle the split event stream, and the full operational runbooks for failover and failback.
This post assumes familiarity with Confluent Cluster Linking fundamentals — mirror topics, offset preservation, the cluster link object. If you have not read Architecture: Confluent Cluster Linking, start there.
The big picture
A bidirectional cluster link is a single link object that treats both clusters as equal peers. There is no source cluster and no destination cluster. Either cluster can host mirror topics of topics owned by the other. Data and metadata — including consumer offsets from both regular topics and mirror topics — flow in both directions across a single link.
This is different from creating two separate unidirectional links. With two separate links, consumer offset sync covers only the topics owned by that link's source. In active-active, consumers read from a mix of locally owned topics and mirrored topics. A bidirectional link syncs offsets for both types on both sides, which is what makes clean failover possible when consumers are already spread across both topic sets.
Confluent's recommendation is to use bidirectional mode for all DR deployments, active-active and active-passive alike. There is no capability available in unidirectional links that is not available in bidirectional mode, and bidirectional mode unlocks the reverse command for failback, which unidirectional links do not support.
Topic naming: the same-name convention
Active-active requires a naming convention that keeps owned topics and mirror topics distinguishable on the same cluster. The approach Confluent recommends is to give owned topics the same name on both clusters — orders on cluster A and orders on cluster B — and use the link's link.prefix setting to namespace the mirror copies.
When link.prefix is set to us-east. on cluster B's side of the link, the mirror of cluster A's orders topic on cluster B is named us-east.orders. The owned orders topic on cluster B is untouched. The two never collide. The same applies in reverse: cluster A's mirror of cluster B's orders is named eu-west.orders.
The significant operational benefit is that producers never need to know about the link topology. A producer in us-east-1 always writes to orders. A producer in eu-west-1 always writes to orders. If a producer is rerouted to the other region during a failover, it continues writing to orders — the same topic name on the surviving cluster. No producer reconfiguration is needed for the topic name itself, only for the bootstrap server endpoint.
Consumers read from orders (local owned) and us-east.orders or eu-west.orders (mirrored from the remote cluster), depending on which cluster they are connected to. The prefix makes the origin of each stream explicit without requiring producers to embed region information in their write path.
Cluster type requirements
Bidirectional mode requires Dedicated or Enterprise clusters on Confluent Cloud. Basic and Standard clusters do not support it. On Confluent Platform, both clusters must be running version 7.5 or later. Open-source Apache Kafka cannot be used as either endpoint. Verify your cluster types before designing around this architecture.
The reverse-and-start, reverse-and-pause, and truncate-and-restore commands are available on Confluent Cloud Dedicated and Enterprise clusters, and on Confluent Platform 7.7 or later in KRaft mode. ZooKeeper-based Confluent Platform clusters do not support these commands. If your Confluent Platform clusters are still on ZooKeeper, failback requires creating a new forward link rather than using the reverse commands.
Setting up the bidirectional link
A bidirectional link is created with --config link.mode=BIDIRECTIONAL. Because both clusters are peers, each needs credentials to authenticate to the other. The --remote-api-key and --remote-api-secret flags provide credentials for the local cluster to reach the remote. The --local-api-key and --local-api-secret flags provide credentials for the remote cluster to reach back to the local. Both pairs are required. The link name must be identical on both clusters.
# Create the bidirectional link — run this command once, targeting Cluster A.
# The link is registered on both clusters simultaneously.
confluent kafka link create active-active-link \
--cluster lkc-clusterA \
--remote-cluster lkc-clusterB \
--remote-bootstrap-server pkc-b.eu-west-1.aws.confluent.cloud:9092 \
--remote-api-key CLUSTER_B_KEY \
--remote-api-secret CLUSTER_B_SECRET \
--local-api-key CLUSTER_A_KEY \
--local-api-secret CLUSTER_A_SECRET \
--config link.mode=BIDIRECTIONAL,consumer.offset.sync.enable=true,consumer.offset.sync.ms=5000
Consumer offset sync is enabled at link creation. With a bidirectional link, a single creation command covers both directions: offsets for owned topics sync to the remote cluster's mirrors, and offsets for mirror topics sync back to the source. Both are necessary for clean failover when consumers are spread across both topic sets.
Next, create the mirror topics on each cluster with the appropriate prefix. On cluster B, mirrors of cluster A's topics get the us-east. prefix; on cluster A, mirrors of cluster B's topics get the eu-west. prefix. Note that link.prefix and auto.create.mirror.topics.enable cannot be used together on a bidirectional link — mirror topics with a prefix must be created explicitly as shown here.
# On Cluster B: mirror Cluster A's owned topics with prefix
confluent kafka mirror create orders \
--link active-active-link \
--mirror-topic us-east.orders \
--cluster lkc-clusterB
confluent kafka mirror create payments \
--link active-active-link \
--mirror-topic us-east.payments \
--cluster lkc-clusterB
# On Cluster A: mirror Cluster B's owned topics with prefix
confluent kafka mirror create orders \
--link active-active-link \
--mirror-topic eu-west.orders \
--cluster lkc-clusterA
confluent kafka mirror create payments \
--link active-active-link \
--mirror-topic eu-west.payments \
--cluster lkc-clusterA
Once lag stabilizes on both mirror sets, both clusters carry the full event stream: locally owned topics for their region's producers, and prefixed mirrors of the remote cluster's owned topics.
Producer routing
Because owned topics share the same name on both clusters, producer routing is straightforward. A producer always writes to orders, regardless of which cluster it is connected to. The only routing decision is which cluster's bootstrap server to connect to, which is determined by the producer's region or deployment environment.
A producer in us-east-1 connects to cluster A and writes to orders. A producer in eu-west-1 connects to cluster B and writes to orders. During a failover, a rerouted producer connects to the surviving cluster and continues writing to orders. No topic name change is required. This is the key operational advantage of the same-name convention.
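The routing decision reduces to a few lines of configuration logic. This is an illustrative sketch, not a client library API: the class name, region keys, and the us-east endpoint string are hypothetical (the eu-west endpoint echoes the link example above), and bootstrap.servers is the only producer setting that differs per region.

```java
import java.util.Map;
import java.util.Properties;

public class ProducerRouting {
    static final String TOPIC = "orders"; // same name on both clusters

    // Hypothetical endpoints; in practice these come from deployment config.
    static final Map<String, String> BOOTSTRAP = Map.of(
        "us-east-1", "pkc-a.us-east-1.aws.confluent.cloud:9092",
        "eu-west-1", "pkc-b.eu-west-1.aws.confluent.cloud:9092");

    // During failover, call this with the surviving region.
    // The topic name never changes; only the endpoint does.
    static Properties configFor(String region) {
        Properties props = new Properties();
        props.put("bootstrap.servers", BOOTSTRAP.get(region));
        return props;
    }
}
```

The point of the sketch is what is absent: there is no topic-name logic anywhere in the routing path.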
The constraint that must be enforced is that producers never write to a prefixed mirror topic. us-east.orders and eu-west.orders are read-only mirrors. ACLs should explicitly deny write access to any topic matching the mirror prefix pattern for all application service accounts, so a misconfigured producer fails fast with a clear authorization error rather than a less obvious broker-side rejection.
Consumer groups must have globally unique names across both clusters. The bidirectional link uses consumer group names to sync offsets between clusters; duplicate group names across clusters cause offset sync collisions. Additionally, a consumer group can be active on only one cluster at a time. When moving a consumer group from cluster A to cluster B — during failover or deliberately — shut it down on cluster A completely before starting it on cluster B. Starting it on both simultaneously causes split-brain offset commits and unpredictable behavior.
Consumer strategy and ordering
A consumer that needs the full event stream for orders must read from two topics on its local cluster: the owned orders topic and the prefixed mirror. On cluster A that is orders and eu-west.orders. On cluster B that is orders and us-east.orders.
Kafka supports subscribing to multiple topics natively in a single consumer instance, either by passing a list of topic names or by using a regex pattern. A regex approach handles the mirror topic discovery automatically without hardcoding prefixes:
// Regex subscription: matches "orders" and any prefixed mirror of orders
consumer.subscribe(Pattern.compile("(.*\\.)?orders"));
This consumer receives records from both orders and any topic matching *.orders on the same cluster, which covers the mirror regardless of which region prefix is in use.
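The pattern's behavior is easy to check in isolation. This standalone sketch (the class and method names are ours, not a Kafka API) assumes the consumer applies the pattern with full-string match semantics:

```java
import java.util.regex.Pattern;

public class MirrorPatternCheck {
    // The subscription pattern from above, applied as a full-string match.
    static final Pattern ORDERS = Pattern.compile("(.*\\.)?orders");

    static boolean matches(String topic) {
        return ORDERS.matcher(topic).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches("orders"));         // true: the owned topic
        System.out.println(matches("us-east.orders")); // true: a prefixed mirror
        System.out.println(matches("eu-west.orders")); // true: a prefixed mirror
        System.out.println(matches("orders-dlq"));     // false: full match required
    }
}
```

One caution: the pattern matches any dotted name ending in .orders, so it relies on mirror prefixes being the only dotted topic names on the cluster. If that cannot be guaranteed, tighten the pattern to the known prefixes, for example (us-east\.|eu-west\.)?orders.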
The ordering problem
Within a single topic, Kafka guarantees ordering within a partition. Across two topics, there is no such guarantee. An event written to orders on cluster A at time T and an event written to orders on cluster B at T+40ms will arrive at a consumer in an order determined by replication lag and poll timing. The consumer has no inherent way to know which event happened first.
This is a fundamental property of any system that accepts writes at two locations simultaneously, not a Cluster Linking limitation. Active-active trades global ordering for write locality and availability. The three patterns below represent the main strategies for handling it.
Pattern 1: timestamp windowing
Embed a producer-side wall-clock timestamp in each message as a header or payload field. The consumer reads from both topics, buffers events within a short time window, and processes them in timestamp order before committing offsets. The window must be larger than your P99 replication lag between clusters to allow the slower stream to catch up before the window closes.
The trade-off is added processing latency equal to the window size. For use cases where near-real-time processing is acceptable but strict ordering matters, this is usually the right choice. Use broker-assigned timestamps (LogAppendTime) rather than producer-assigned timestamps (CreateTime) where possible, as broker clocks are more consistent across heterogeneous infrastructure.
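The windowing logic can be sketched independently of any Kafka client code. The Event shape, class, and method names here are illustrative, not a library API:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

public class TimestampWindowMerger {
    // Illustrative event shape: topic of origin, record timestamp, payload.
    public record Event(String topic, long timestampMs, String payload) {}

    private final long windowMs; // must exceed P99 replication lag between clusters
    private final PriorityQueue<Event> buffer =
        new PriorityQueue<>(Comparator.comparingLong(Event::timestampMs));

    public TimestampWindowMerger(long windowMs) {
        this.windowMs = windowMs;
    }

    // Buffer a record from either the owned topic or the prefixed mirror.
    public void add(Event e) {
        buffer.add(e);
    }

    // Release, in timestamp order, every buffered event old enough that the
    // slower stream can no longer deliver anything earlier.
    public List<Event> drainReady(long nowMs) {
        List<Event> ready = new ArrayList<>();
        while (!buffer.isEmpty() && buffer.peek().timestampMs() <= nowMs - windowMs) {
            ready.add(buffer.poll());
        }
        return ready;
    }
}
```

In a real consumer this sits between poll() and offset commit, with nowMs driven by the highest timestamp seen so far rather than the wall clock, and offsets committed only once the corresponding records have been drained.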
Pattern 2: per-entity sequence numbers
Embed a monotonically increasing sequence number per entity key in each event. The consumer tracks the last processed sequence number per entity and buffers out-of-order arrivals until the gap is filled. This approach is more precise than timestamp ordering because it does not rely on clock synchronization across regions. It requires the producer to maintain a per-entity counter, which is straightforward for stateful producer services and achievable via a shared counter store for stateless ones.
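The resequencing logic amounts to a small data structure. This sketch assumes sequence numbers start at 1 with no gaps; the names and Event shape are illustrative:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class PerEntityResequencer {
    // Illustrative event shape: entity key, per-entity sequence number, payload.
    public record Event(String key, long seq, String payload) {}

    private final Map<String, Long> lastDelivered = new HashMap<>();
    private final Map<String, TreeMap<Long, Event>> pending = new HashMap<>();

    // Accept an arrival from either topic; return every event that is now
    // deliverable in sequence order for that entity (possibly none).
    public List<Event> accept(Event e) {
        TreeMap<Long, Event> q = pending.computeIfAbsent(e.key(), k -> new TreeMap<>());
        q.put(e.seq(), e);
        List<Event> deliverable = new ArrayList<>();
        long next = lastDelivered.getOrDefault(e.key(), 0L) + 1;
        while (!q.isEmpty() && q.firstKey() == next) {
            deliverable.add(q.pollFirstEntry().getValue());
            lastDelivered.put(e.key(), next);
            next++;
        }
        return deliverable;
    }
}
```

A production version would also bound the pending buffer and alert on gaps that never fill, which indicate a lost or misrouted event.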
Pattern 3: idempotent processing
For many use cases, strict ordering is not actually required. What is required is that every event is processed exactly once and that the final state is correct regardless of arrival order. Each event carries a globally unique ID; the consumer applies each event as an idempotent upsert and skips duplicates. This is the lowest-complexity consumer design and the most horizontally scalable. If your downstream state store supports idempotent upserts — a database INSERT ... ON CONFLICT DO UPDATE, for example — this pattern adds no buffering overhead and no sequence tracking.
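A sketch of this consumer design, with in-memory collections standing in for a real dedupe store and downstream database. All names are illustrative, and the per-entity version field is an assumption beyond the text: it supplies the last-writer-wins rule that makes the upsert independent of arrival order.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class IdempotentApplier {
    // Illustrative event shape: globally unique id, entity key,
    // per-entity version for last-writer-wins, new value.
    public record Event(String id, String entityKey, long version, String value) {}

    private final Set<String> seenIds = new HashSet<>();      // in practice: a TTL'd store or DB unique constraint
    private final Map<String, Event> state = new HashMap<>(); // stands in for the downstream upsert target

    // Apply the event as an idempotent, order-independent upsert.
    // Returns false for duplicates, which are skipped entirely.
    public boolean apply(Event e) {
        if (!seenIds.add(e.id())) {
            return false; // duplicate event id: already processed
        }
        state.merge(e.entityKey(), e,
            (cur, inc) -> inc.version() > cur.version() ? inc : cur); // keep the newer version
        return true;
    }

    public String currentValue(String entityKey) {
        Event e = state.get(entityKey);
        return e == null ? null : e.value();
    }
}
```

With a database backing the state, the merge step becomes a conditional upsert such as INSERT ... ON CONFLICT DO UPDATE ... WHERE excluded.version > current.version, and the seen-id set becomes a unique constraint on the event id.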
In practice, most active-active deployments use pattern 3 as the default and layer stricter ordering only on the specific event types where the business logic genuinely requires it. Designing the entire consumer estate for strict global ordering is more expensive than the actual ordering requirements justify in most systems.
ACLs in active-active
Each cluster maintains an independent ACL store. ACLs are not replicated by the bidirectional link. For active-active to survive failover cleanly, both clusters must have identical ACL configuration for all application service accounts before a failover event occurs.
The permission model per cluster is: application producers need WRITE on the locally owned topic (orders) only. Explicit deny ACLs on any topic matching the mirror prefix pattern (us-east.*, eu-west.*) surface misconfigured producers immediately. Application consumers need READ on both owned and mirror topics. The link's service account needs READ and DescribeConfigs on the remote cluster's owned topics.
The Terraform pattern from the Cluster Linking post applies here directly: define owned and mirror topic sets as separate variables, provision producer WRITE ACLs only against the owned set, and apply the full ACL configuration to both cluster IDs in a single plan. Include an explicit ACL verification step in both the failover and failback runbooks.
Operational procedures
Active-active is the topology where normal failure handling requires the fewest commands. That is the point of it. Understanding which operations are automatic and which require deliberate action is important, because reaching for mirror commands during a normal outage is both unnecessary and risky.
Unplanned outage: what happens automatically
When cluster A becomes unavailable, cluster B continues operating without any intervention:
- Cluster B's owned orders topic keeps accepting writes. It was never affected.
- Cluster B's us-east.orders mirror enters SOURCE_UNAVAILABLE state and stops advancing. The data already replicated is intact and readable. Consumers can still consume up to the last replicated offset.
- When cluster A recovers, the bidirectional link resumes automatically. us-east.orders catches up without any commands.
The only deliberate actions needed during an unplanned outage are application-level, not Kafka-level:
- Confirm the outage is real. Check Confluent Cloud status and your monitoring. Define a threshold in your DR policy before acting — for example, cluster unreachable for more than five minutes.
- Record the mirror lag. This is your RPO at the moment of the outage. The Partition Mirror Lag and Last Source Fetch Offset fields tell you the data window.
confluent kafka mirror describe us-east.orders \
--link active-active-link \
--cluster lkc-clusterB
- Reroute producers from the failed region to cluster B. They write to orders. Only the bootstrap server changes.
- Shut down consumers on cluster A, then restart them on cluster B. A consumer group can be active on only one cluster at a time. Shut down completely on cluster A before starting on cluster B. Committed offsets for both orders and us-east.orders are already present on cluster B via bidirectional offset sync. Consumers resume without an offset reset.
- Monitor us-east.orders status. SOURCE_UNAVAILABLE is expected and requires no action. Alert on it for awareness, not for response.
confluent kafka mirror list \
--link active-active-link \
--cluster lkc-clusterB
When cluster A recovers, the link resumes on its own. Reroute producers back to cluster A, shut down consumers on cluster B, and restart them on cluster A. Their offsets are already synced back. The topology is fully restored with no mirror commands at any point.
There is no meaningful concept of failback in active-active. When cluster A is healthy again, redirecting traffic back to it is a routing preference — returning producers to their home region for latency reasons — not a recovery procedure. The data never stopped flowing and no mirror state was modified.
When mirror commands are actually needed
There are two scenarios where deliberately running mirror commands is justified in an active-active deployment. Both are optional choices, not required recovery steps.
Extended or permanent loss of cluster A. If cluster A is not coming back and you need us-east.orders on cluster B to become writable — for example, to consolidate both streams into a single namespace — run failover to convert it immediately. This accepts data loss equal to the replication lag at the time of the outage.
confluent kafka mirror failover us-east.orders \
--link active-active-link \
--cluster lkc-clusterB
If you run failover and cluster A later recovers, restoring the original topology requires truncate-and-restore on cluster A's orders topic followed by reverse-and-start on cluster B's us-east.orders. truncate-and-restore deletes divergent records written to cluster A after the outage; ensure consumers can reprocess if needed. This command requires KRaft mode on Confluent Platform — it is not available on ZooKeeper-based clusters. If you are on Confluent Platform 7.9.0–7.9.2 or 8.0.0 with Tiered Storage, do not use it — a known bug can cause silent data loss. Upgrade to 7.9.3, 8.0.1, or 8.1.0 first.
# Resync cluster A's orders from cluster B's now-writable us-east.orders
confluent kafka mirror truncate-and-restore orders \
--link active-active-link \
--cluster lkc-clusterA
# Once lag is zero, flip the relationship back
confluent kafka mirror reverse-and-start us-east.orders \
--link active-active-link \
--cluster lkc-clusterB
Planned DR testing or scheduled maintenance. If you want to deliberately shift all traffic to cluster B to verify the topology under realistic conditions, use reverse-and-start. This is a clean, data-safe operation that requires both clusters to be healthy. Stop producers on cluster A first, run reverse-and-start on cluster B's mirrors of cluster A's topics, redirect producers and consumers to cluster B, and verify. To restore the original topology afterward, run reverse-and-start again in the opposite direction.
# Stop producers on cluster A first, then:
confluent kafka mirror reverse-and-start us-east.orders \
--link active-active-link \
--cluster lkc-clusterB
confluent kafka mirror reverse-and-start us-east.payments \
--link active-active-link \
--cluster lkc-clusterB
For transactional producers, run reverse-and-start on one topic at a time and monitor each topic to its completed end state before running it on the next. Starting the application before all topics have finished transitioning can result in writes landing on a topic still in an immutable intermediate state.
When active-active is the wrong choice
Strict global ordering requirements. If your consumers require a single globally ordered event stream, active-active cannot provide it. You are producing into two independent ordered streams and merging them. Active-passive with a single writable topic is the correct architecture for this requirement.
High consumer estate complexity. Every consumer in the system must subscribe to two topics and apply an ordering or deduplication strategy. If your consumer estate is large and heterogeneous, the cost of modifying every consumer may outweigh the availability benefit. Active-passive is simpler to operate at scale.
Strong consistency requirements on shared entities. Financial transactions, inventory counts, and other entities where concurrent writes from two regions can produce conflicting state are poor fits for active-active without explicit conflict resolution logic in the consumer. An inventory system decrementing stock from two clusters simultaneously without coordination can produce negative inventory counts. This is a distributed systems problem that active-active exposes, not a Kafka problem. If your domain requires read-your-writes consistency or last-write-wins semantics across regions, the consumer complexity is significant.
Low write volume with acceptable cross-region latency. If write throughput is modest and the latency cost of routing all writes to a single primary cluster is negligible, active-passive is simpler and its operational model is more straightforward. Active-active earns its complexity at scale and at geographic distances where cross-region write latency is a meaningful user-facing problem.
Summary
Active-active with Confluent's bidirectional cluster link is built on three decisions: a single bidirectional link object rather than two separate unidirectional links, the same-name topic convention with link.prefix handling mirror disambiguation, and bidirectional consumer offset sync that covers both owned and mirror topic offsets on both clusters simultaneously.
The same-name convention is the detail that makes the operational model tractable. Producers always write to orders. Failover is a bootstrap server change, not a topic name change. Consumer regex subscriptions handle both the owned and mirror topic streams without hardcoded names. The prefixed mirrors make the origin of each stream explicit for consumers without imposing that knowledge on producers.
The bidirectional link's reverse command removes the most operationally awkward part of active-passive failback: creating a new link in the reverse direction under pressure. Failback becomes a sequence of reverse commands against the existing link object, with consumer offsets valid at every step and no offset resets at any transition.
The complexity that active-active introduces lives in the consumer layer. The ordering problem across two topic streams is real, and pattern 3 — idempotent processing with deduplication by event ID — is the right default for most event types. Reserve timestamp windowing or per-entity sequence numbers for the specific event types where the business logic genuinely requires strict ordering.
Related on this blog: Architecture series