Kafka and ZooKeeper: Legacy, Challenges, and the Future with KRaft

December 2, 2025

Apache Kafka has become the go-to solution for real-time event streaming across various industries. Its distributed architecture and scalability power high-throughput pipelines used in everything from e-commerce to telemetry systems. But behind this robust system, there has been a key enabler quietly coordinating the cluster's heartbeat: Apache ZooKeeper.

Apache ZooKeeper played an integral role in managing Kafka’s operations for many years. But as Kafka matured, the dependency on ZooKeeper was phased out: ZooKeeper mode was deprecated in Kafka 3.5 and removed entirely in Kafka 4.0, ushering in a new era of simplified architecture through the KRaft protocol.

In this blog post, we’ll take a deep dive into the role ZooKeeper has played in Kafka’s architecture, why this dependency existed, its limitations, and how Kafka has evolved beyond it. Whether you’re just starting out or managing production Kafka clusters, understanding this evolution is crucial.

What Is ZooKeeper and Why Was It Used in Kafka?

Apache ZooKeeper is a distributed coordination service that offers features such as configuration management, leader election, and metadata synchronization. Its primary purpose is to maintain reliable coordination across distributed systems. When Kafka was first developed, ZooKeeper was adopted to handle critical functions that Kafka itself wasn’t equipped for at the time. In the context of Kafka, ZooKeeper historically played a crucial role in:

  • Broker Coordination

ZooKeeper maintains a list of all the active Kafka brokers in the cluster. When a broker starts, it registers itself in ZooKeeper. This helps other components discover and communicate with it.

  • Controller Election

In Kafka, one broker acts as a controller to manage the state of partitions and replicas. ZooKeeper was responsible for electing this controller broker through leader election.

  • Metadata Management

ZooKeeper stored metadata such as topics and their partitions, replica assignments, and configuration changes. This data was critical for Kafka brokers to operate consistently and ensure fault tolerance.

  • Detecting Failures

ZooKeeper used ephemeral nodes to detect broker failures. If a broker went down, its session expired and its node in ZooKeeper disappeared, prompting the controller to elect new leaders for the partitions that broker had been leading.

In essence, ZooKeeper enabled Kafka to be distributed, consistent, and highly available.
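The broker registration, controller election, and failure detection described above can be sketched in a few lines of Python. This is a minimal in-memory model, not a real ZooKeeper client: the `FakeZooKeeper` and `Broker` classes are hypothetical stand-ins, and only the `/brokers/ids/<id>` and `/controller` paths mirror what ZooKeeper-based Kafka actually used. In a real cluster the election winner is whichever broker creates `/controller` first; here that is simply the order in which the watchers were registered.

```python
class FakeZooKeeper:
    """In-memory stand-in for ZooKeeper's ephemeral-node behaviour (hypothetical)."""

    def __init__(self):
        self.nodes = {}     # path -> (data, owning session id)
        self.watchers = []  # callbacks invoked with the path of each deleted node

    def create_ephemeral(self, path, data, session_id):
        """Create the node unless it already exists; return True on success."""
        if path in self.nodes:
            return False
        self.nodes[path] = (data, session_id)
        return True

    def expire_session(self, session_id):
        """Delete every node owned by the session, then notify the watchers."""
        dead = [p for p, (_, owner) in self.nodes.items() if owner == session_id]
        for path in dead:
            del self.nodes[path]
        for path in dead:
            for callback in list(self.watchers):
                callback(path)


class Broker:
    def __init__(self, broker_id, zk):
        self.broker_id = broker_id
        self.zk = zk
        self.session_id = f"session-{broker_id}"
        self.alive = True

    def start(self):
        # Register under /brokers/ids so other components can discover this broker.
        self.zk.create_ephemeral(f"/brokers/ids/{self.broker_id}", "host:port", self.session_id)
        self.zk.watchers.append(self.on_node_deleted)
        self.try_become_controller()

    def try_become_controller(self):
        # The first broker to create /controller wins the election.
        return self.zk.create_ephemeral("/controller", self.broker_id, self.session_id)

    def on_node_deleted(self, path):
        # A vanished /controller node means the old controller is gone: re-run the election.
        if self.alive and path == "/controller":
            self.try_become_controller()

    def crash(self):
        self.alive = False
        self.zk.expire_session(self.session_id)


def live_broker_ids(zk):
    prefix = "/brokers/ids/"
    return sorted(int(p[len(prefix):]) for p in zk.nodes if p.startswith(prefix))


def controller_id(zk):
    entry = zk.nodes.get("/controller")
    return entry[0] if entry else None


zk = FakeZooKeeper()
brokers = [Broker(i, zk) for i in (1, 2, 3)]
for broker in brokers:
    broker.start()

print(controller_id(zk))    # 1
brokers[0].crash()          # session expires, ephemeral nodes disappear
print(live_broker_ids(zk))  # [2, 3]
print(controller_id(zk))    # 2
```

The point of the sketch is that nothing explicitly "reports" the failure: the broker's nodes vanish with its session, and the survivors react to that.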

Benefits of Using ZooKeeper with Kafka

While managing ZooKeeper adds some complexity, it offers several advantages in a Kafka deployment:

  • Reliable Coordination: ZooKeeper ensured that Kafka clusters could coordinate actions like controller failover and partition management.
  • Consistency: Centralized metadata management helped keep the Kafka cluster in sync.
  • Scalability Foundation: In earlier versions, ZooKeeper allowed Kafka to scale to hundreds of brokers efficiently.

However, as Kafka evolved and adoption grew to enterprise-level scales, several challenges started surfacing.

The Challenges of Using ZooKeeper with Kafka

While ZooKeeper was essential in Kafka’s early success, it introduced several complexities and limitations as Kafka grew in scale and usage:

  • Operational Overhead

Maintaining a separate ZooKeeper cluster demanded additional resources, monitoring, and troubleshooting. Administrators had to deal with cluster synchronization, node failures, and quorum requirements.

  • Scalability Bottlenecks

ZooKeeper wasn’t built to handle high-frequency updates or manage large metadata volumes. As Kafka clusters scaled with thousands of partitions, ZooKeeper became a performance bottleneck.

  • Increased Latency

Changes like partition reassignment, topic creation, or leadership failover had to be propagated through ZooKeeper, often resulting in increased latency for those operations.

  • Tight Coupling

Kafka’s core services were tightly bound to ZooKeeper. Any issues with ZooKeeper (e.g., split-brain scenarios or session expirations) directly impacted Kafka’s availability and stability.

The KRaft Revolution: Kafka Without ZooKeeper

To address the shortcomings of ZooKeeper and simplify Kafka’s architecture, the community introduced KRaft – short for Kafka Raft Metadata mode. This was a significant milestone in Kafka’s evolution.

KRaft eliminates the need for ZooKeeper by integrating metadata management directly into Kafka using a Raft-based consensus protocol. Here’s how it changes the game:

Key Features of KRaft:

  • Metadata Quorum: Replaces ZooKeeper with a quorum of Kafka controllers that replicate metadata using the Raft consensus algorithm.
  • Self-Managed Metadata: Kafka stores topic configurations, partition assignments, and other metadata internally in a replicated metadata log, with no need for external coordination.
  • Faster Leader Elections: KRaft enables faster and more deterministic leader elections, reducing downtime in failover scenarios.
  • Improved Observability: Metrics and logs are consolidated, giving developers a clearer picture of what’s happening across the system.
  • Simplified Deployment: No need to provision or maintain a separate ZooKeeper cluster.
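To make the "self-managed metadata" idea concrete: in KRaft mode the cluster metadata is kept as an ordered log of records that the controller quorum replicates, and each node rebuilds its view of the cluster by replaying that log. The sketch below illustrates only the replay idea. The record types and field names are hypothetical simplifications, not Kafka's actual record schemas, and no Raft replication is modelled.

```python
from dataclasses import dataclass, field


@dataclass
class ClusterState:
    brokers: set = field(default_factory=set)
    topics: dict = field(default_factory=dict)  # topic -> {partition: leader broker id}


def apply(state, record):
    """Apply one metadata record to the in-memory state (hypothetical record types)."""
    kind = record["type"]
    if kind == "register_broker":
        state.brokers.add(record["broker_id"])
    elif kind == "unregister_broker":
        state.brokers.discard(record["broker_id"])
    elif kind == "create_topic":
        state.topics[record["topic"]] = {}
    elif kind == "partition_change":
        state.topics[record["topic"]][record["partition"]] = record["leader"]
    else:
        raise ValueError(f"unknown record type: {kind}")
    return state


def replay(log, up_to=None):
    """Rebuild the state from scratch by applying records in log order."""
    state = ClusterState()
    for record in log[:up_to]:
        apply(state, record)
    return state


log = [
    {"type": "register_broker", "broker_id": 1},
    {"type": "register_broker", "broker_id": 2},
    {"type": "create_topic", "topic": "orders"},
    {"type": "partition_change", "topic": "orders", "partition": 0, "leader": 1},
    {"type": "unregister_broker", "broker_id": 1},
    {"type": "partition_change", "topic": "orders", "partition": 0, "leader": 2},
]

print(replay(log, up_to=4).topics)  # {'orders': {0: 1}}
print(replay(log).topics)           # {'orders': {0: 2}}
print(replay(log).brokers)          # {2}
```

Because every node applies the same records in the same order, any two nodes that have read up to the same offset hold the same metadata, which is what lets a standby controller take over without first reloading state from an external store.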

ZooKeeper vs. KRaft: A Quick Comparison

| Feature / Capability | ZooKeeper-Based Kafka | Kafka with KRaft (Raft Metadata Mode) |
| --- | --- | --- |
| Metadata Management | Stored in external ZooKeeper | Stored internally in Kafka metadata quorum |
| Leader Election | ZooKeeper handles controller election | Raft algorithm ensures deterministic elections |
| Architecture Complexity | Requires 2 clusters (Kafka + ZK) | Single Kafka cluster, no external dependencies |
| Failover Time | Slower, session-based | Significantly faster, deterministic |
| Scalability | Limited by ZooKeeper write/latency constraints | Designed for larger metadata loads and broker counts |
| Operational Cost | Requires separate cluster setup and maintenance | Reduced cost; fewer nodes and easier ops |
| Security Management | Two security models (Kafka + ZK) | Unified Kafka security model |
| Observability | Split between Kafka and ZooKeeper | Centralized in Kafka controllers |
| Configuration Management | ZK stores configs | Kafka stores configs natively |
| Support & Future Roadmap | Deprecated in Kafka 3.5; removed in Kafka 4.0 | Officially the future of Kafka |

Ksolves: Your Trusted Partner for Apache Kafka Consulting

At Ksolves, we help businesses build powerful, real-time streaming solutions with Apache Kafka. Whether you’re running Kafka clusters with ZooKeeper, planning a smooth migration to KRaft, or starting your Kafka journey from the ground up, our certified Kafka experts are here to guide you every step of the way.

Our Kafka Consulting Services Include:

  • Kafka Cluster Setup & Performance Tuning
  • ZooKeeper to KRaft Migration
  • Real-Time Data Pipeline Development
  • 24/7 Kafka Monitoring, Scaling & Support

Contact us to discuss your project requirements with our experts.

Conclusion

ZooKeeper was an essential part of Kafka’s early journey, ensuring reliable coordination, managing metadata, and enabling Kafka’s distributed nature. But as demands grew and architecture evolved, the need for a more scalable, integrated solution became clear. That solution is KRaft. Whether you’re maintaining an existing cluster or planning a new one, understanding this architectural shift is vital to making informed decisions.

AUTHOR

Atul Khanduri

Atul Khanduri, a seasoned Associate Technical Head at Ksolves India Ltd., has 12+ years of expertise in Big Data, Data Engineering, and DevOps. Skilled in Java, Python, Kubernetes, and cloud platforms (AWS, Azure, GCP), he specializes in scalable data solutions and enterprise architectures.

Frequently Asked Questions

Is ZooKeeper completely removed from Kafka?

Yes, in current releases. ZooKeeper mode was deprecated in Kafka 3.5 and removed in Kafka 4.0, which runs only in KRaft mode. ZooKeeper support remains available only in the older 3.x release line.

Is it mandatory to migrate to KRaft?

Yes, if you want to upgrade to Kafka 4.0 or later. Since new Kafka improvements are built around KRaft, staying on ZooKeeper will limit future upgrades and features.

Can I migrate an existing ZooKeeper-based Kafka cluster to KRaft?

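Yes. Kafka 3.x releases provide a ZooKeeper-to-KRaft migration path (KIP-866) that is designed to be carried out through rolling restarts rather than a full cluster outage: you provision a KRaft controller quorum, let it copy the metadata out of ZooKeeper, and then restart the brokers in KRaft mode. The migration has to be completed on a 3.x release before upgrading to Kafka 4.0, and it still calls for careful planning and testing.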

Does KRaft improve Kafka performance?

Yes. KRaft offers faster leader elections, reduced metadata latency, quicker failovers, and better scalability because metadata is handled internally using Raft consensus.

What happens to ZooKeeper after migrating to KRaft?

Once the Kafka cluster runs successfully in KRaft mode, ZooKeeper is no longer required and can be safely decommissioned.

Do I still need quorum nodes with KRaft?

Yes. Instead of ZooKeeper nodes, you now maintain a KRaft controller quorum (usually 3 controllers) responsible for metadata replication and consensus.
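The reason quorums are usually sized at 3 (or 5) follows from majority arithmetic: a Raft-style quorum can make progress only while more than half of its voters are reachable. A minimal sketch of that calculation, with hypothetical helper names:

```python
def majority(voters: int) -> int:
    """Smallest number of voters that forms a majority."""
    if voters < 1:
        raise ValueError("a quorum needs at least one voter")
    return voters // 2 + 1


def tolerated_failures(voters: int) -> int:
    """How many voters can be lost while a majority is still reachable."""
    return voters - majority(voters)


for n in (1, 3, 4, 5):
    print(n, majority(n), tolerated_failures(n))
# 1 1 0
# 3 2 1
# 4 3 1
# 5 3 2
```

Note that 4 controllers tolerate no more failures than 3, which is why odd quorum sizes are the norm.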

Is KRaft production-ready for real workloads?

Yes. KRaft was declared production-ready for new clusters in Kafka 3.3, with significant stability improvements in later versions, and it is the only mode available from Kafka 4.0 onward.

Will my applications change after migrating to KRaft?

No. Producer and consumer APIs remain unchanged. Migration impacts cluster configuration and operations, not application code.

Is KRaft more reliable than ZooKeeper?

Yes. KRaft offers deterministic Raft-based consensus, avoids ZooKeeper session issues, and keeps all metadata inside Kafka for better consistency and reliability.

Can I run a mixed cluster with some brokers on ZooKeeper and some on KRaft?

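Not as a permanent setup. The only exception is the transitional phase of a ZooKeeper-to-KRaft migration on Kafka 3.x, during which KRaft controllers temporarily coexist with brokers that are still in ZooKeeper mode. Outside of that migration window, a Kafka cluster operates entirely in one mode only: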

ZooKeeper mode → all brokers depend on ZooKeeper

KRaft mode → all brokers depend on the internal Raft-based controller quorum
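One practical way to tell the two modes apart is to look at a node's `server.properties`: ZooKeeper-mode brokers set `zookeeper.connect`, while KRaft-mode nodes set `process.roles` (alongside `node.id` and `controller.quorum.voters`). The helper below is a small sketch built on that observation. The function names, the returned labels, and the host names in the sample configs are all hypothetical, and the parser handles only simple `key=value` lines.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def detect_mode(props):
    """Classify a node's config; the labels are illustrative, not Kafka terminology."""
    has_zk = bool(props.get("zookeeper.connect"))
    has_kraft = bool(props.get("process.roles"))
    if has_kraft and not has_zk:
        return "kraft"
    if has_zk and not has_kraft:
        return "zookeeper"
    if has_zk and has_kraft:
        return "both-set"  # assumption: only expected while a migration is in progress
    return "unknown"


zk_config = """
# ZooKeeper-mode broker (sample values)
broker.id=1
zookeeper.connect=zk1:2181,zk2:2181,zk3:2181
"""

kraft_config = """
# KRaft-mode broker (sample values)
process.roles=broker
node.id=1
controller.quorum.voters=101@ctrl1:9093,102@ctrl2:9093,103@ctrl3:9093
"""

print(detect_mode(parse_properties(zk_config)))     # zookeeper
print(detect_mode(parse_properties(kraft_config)))  # kraft
```

Running a check like this across every node's config is a quick sanity test that the whole cluster agrees on one mode.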