Kafka and ZooKeeper: Legacy, Challenges, and the Future with KRaft
December 2, 2025
Apache Kafka has become the go-to solution for real-time event streaming across various industries. Its distributed architecture and scalability power high-throughput pipelines used in everything from e-commerce to telemetry systems. But behind this robust system, there has been a key enabler quietly coordinating the cluster's heartbeat: Apache ZooKeeper.
ZooKeeper played an integral role in managing Kafka’s operations for many years. But as Kafka has matured, that dependency has been phased out (Kafka 4.0 runs without ZooKeeper entirely), ushering in a simplified architecture built on the KRaft protocol.
In this blog, we’ll take a deep dive into the role ZooKeeper has played in Kafka’s architecture, why this dependency existed, its limitations, and how Kafka has evolved beyond it. Whether you’re just starting out or managing production Kafka clusters, understanding this evolution is crucial.
What Is ZooKeeper and Why Was It Used in Kafka?
Apache ZooKeeper is a distributed coordination service that offers features such as configuration management, leader election, and metadata synchronization. Its primary purpose is to maintain reliable coordination across distributed systems. When Kafka was first developed, ZooKeeper was adopted to handle critical functions that Kafka itself wasn’t equipped for at the time. In the context of Kafka, ZooKeeper historically played a crucial role in:
Broker Coordination
ZooKeeper maintained a list of all active Kafka brokers in the cluster. When a broker started, it registered itself in ZooKeeper, which let other components discover and communicate with it.
Controller Election
In Kafka, one broker acts as a controller to manage the state of partitions and replicas. ZooKeeper was responsible for electing this controller broker through leader election.
Metadata Management
ZooKeeper stored metadata such as topics and their partitions, replica assignments, and configuration changes. This data was critical for Kafka brokers to operate consistently and ensure fault tolerance.
Detecting Failures
ZooKeeper used ephemeral nodes to detect broker failures. If a broker went down, its node in ZooKeeper disappeared, prompting the controller to reassign its tasks.
In essence, ZooKeeper enabled Kafka to be distributed, consistent, and highly available.
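The registration, failure-detection, and controller-election roles above all rest on the same primitive: an ephemeral node that exists only as long as its owner's session does. The following is a minimal in-memory sketch of that idea, not a real ZooKeeper client. `ToyCoordinator` and its helpers are hypothetical names, and using the broker id as the session id is a simplifying assumption; only the `/brokers/ids/<id>` and `/controller` paths mirror what ZooKeeper-based Kafka used.

```python
class ToyCoordinator:
    """In-memory stand-in for ZooKeeper's ephemeral nodes (not a real client)."""

    def __init__(self):
        self.nodes = {}     # znode path -> id of the session that owns it
        self.watchers = []  # callbacks called with the path of each deleted node

    def create_ephemeral(self, path, session_id):
        """Create a node tied to a session; return False if the path is taken."""
        if path in self.nodes:
            return False
        self.nodes[path] = session_id
        return True

    def expire_session(self, session_id):
        """Simulate a session timeout: the session's nodes vanish, watchers fire."""
        gone = [p for p, owner in self.nodes.items() if owner == session_id]
        for path in gone:
            del self.nodes[path]
        for path in gone:
            for notify in self.watchers:
                notify(path)

    def live_brokers(self):
        prefix = "/brokers/ids/"
        return sorted(int(p[len(prefix):]) for p in self.nodes if p.startswith(prefix))


def register_broker(zk, broker_id):
    # Simplification: the broker id doubles as the session id.
    return zk.create_ephemeral(f"/brokers/ids/{broker_id}", session_id=broker_id)


def try_become_controller(zk, broker_id):
    # Whoever creates /controller first wins; later attempts fail until it is deleted.
    return zk.create_ephemeral("/controller", session_id=broker_id)


zk = ToyCoordinator()
for broker_id in (1, 2, 3):
    register_broker(zk, broker_id)

won_first = try_become_controller(zk, 1)    # True: /controller did not exist yet
won_second = try_become_controller(zk, 2)   # False: broker 1 already holds it

deleted = []
zk.watchers.append(deleted.append)
zk.expire_session(1)                        # broker 1 "crashes"
won_after_failure = try_become_controller(zk, 2)  # True: the node was released
```

When broker 1's session expires, both its registration and its controller node disappear in one step, which is why a single mechanism could serve membership tracking and controller failover at once.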
Benefits of Using ZooKeeper with Kafka
While managing ZooKeeper added some complexity, it offered several advantages in a Kafka deployment:
Reliable Coordination: ZooKeeper ensured that Kafka clusters could coordinate actions like controller failover and partition management.
Consistency: Centralized metadata management helped keep the Kafka cluster in sync.
Scalability Foundation: In the earlier versions, ZooKeeper allowed Kafka to scale to hundreds of brokers efficiently.
However, as Kafka evolved and adoption grew to enterprise-level scales, several challenges started surfacing.
The Challenges of Using ZooKeeper with Kafka
While ZooKeeper was essential in Kafka’s early success, it introduced several complexities and limitations as Kafka grew in scale and usage:
Operational Overhead
Maintaining a separate ZooKeeper cluster demanded additional resources, monitoring, and troubleshooting. Administrators had to deal with cluster synchronization, node failures, and quorum requirements.
Scalability Bottlenecks
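ZooKeeper wasn’t built to handle high-frequency updates or manage large metadata volumes. As Kafka clusters scaled to many thousands of partitions, ZooKeeper became a performance bottleneck, notably because a newly elected controller had to reload the full cluster state from ZooKeeper before it could act.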
Increased Latency
Changes like broker reassignment, topic creation, or leadership failover had to be propagated through ZooKeeper, often resulting in increased latency for those operations.
Tight Coupling
Kafka’s core services were tightly bound to ZooKeeper. Any issues with ZooKeeper (e.g., split-brain scenarios or session expirations) directly impacted Kafka’s availability and stability.
The KRaft Revolution: Kafka Without ZooKeeper
To address the shortcomings of ZooKeeper and simplify Kafka’s architecture, the community introduced KRaft – short for Kafka Raft Metadata mode. This was a significant milestone in Kafka’s evolution.
KRaft eliminates the need for ZooKeeper by integrating metadata management directly into Kafka using a Raft-based consensus protocol. Here’s how it changes the game:
Key Features of KRaft:
Metadata Quorum: Replaces ZooKeeper with a quorum of Kafka controllers that replicate metadata using the Raft consensus algorithm.
Self-Managed Metadata: Kafka stores topic configurations, partition assignments, and other metadata internally, with no need for external coordination.
Faster Leader Elections: Standby controllers already hold a replicated copy of the metadata log, so a new active controller can take over without first reloading state from an external store, reducing downtime in failover scenarios.
Improved Observability: Metrics and logs are consolidated, giving developers a clearer picture of what’s happening across the system.
Simplified Deployment: No need to provision or maintain a separate ZooKeeper cluster.
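To make the metadata quorum idea concrete, here is a toy sketch of the majority-commit rule that Raft-style replication relies on: a metadata record counts as committed once a majority of voters have stored it. This illustrates the principle only and is not Kafka's implementation. `ToyMetadataQuorum` and `committed_offset` are hypothetical names, the records are made-up strings, and leader election and Raft's term checks are deliberately left out.

```python
def committed_offset(replicated):
    """Highest offset held by a majority of voters (-1 if nothing is committed).

    `replicated` maps voter id -> last log offset that voter has stored.
    """
    ordered = sorted(replicated.values(), reverse=True)
    majority = len(ordered) // 2 + 1
    return ordered[majority - 1]


class ToyMetadataQuorum:
    """A leader's metadata log plus each voter's replication progress."""

    def __init__(self, voters, leader):
        self.leader = leader
        self.log = []
        self.replicated = {v: -1 for v in voters}

    def append(self, record):
        """The leader writes a record locally and returns its offset."""
        self.log.append(record)
        self.replicated[self.leader] = len(self.log) - 1
        return len(self.log) - 1

    def follower_fetched(self, voter, offset):
        """Record that a follower has stored the log up to `offset`."""
        offset = min(offset, len(self.log) - 1)
        self.replicated[voter] = max(self.replicated[voter], offset)

    def committed_records(self):
        return self.log[: committed_offset(self.replicated) + 1]


quorum = ToyMetadataQuorum(voters=(1, 2, 3), leader=1)
quorum.append("create topic 'orders' with 3 partitions")
quorum.append("set retention.ms on 'orders'")
before = quorum.committed_records()   # only the leader has the records
quorum.follower_fetched(2, 0)         # voter 2 catches up to offset 0
after = quorum.committed_records()    # offset 0 is now on 2 of 3 voters
```

Because every voter holds the same ordered log, any of them can become the active controller without fetching state from elsewhere, which is the property behind the faster failover listed above.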
ZooKeeper vs. KRaft: A Quick Comparison
| Feature / Capability | ZooKeeper-Based Kafka | Kafka with KRaft (Raft Metadata Mode) |
| --- | --- | --- |
| Metadata Management | Stored in external ZooKeeper | Stored internally in Kafka metadata quorum |
| Leader Election | ZooKeeper handles controller election | Raft algorithm ensures deterministic elections |
| Architecture Complexity | Requires 2 clusters (Kafka + ZK) | Single Kafka cluster, no external dependencies |
| Failover Time | Slower, session-based | Significantly faster, deterministic |
| Scalability | Limited by ZooKeeper write/latency constraints | Designed for larger metadata loads and broker counts |
| Operational Cost | Requires separate cluster setup and maintenance | Reduced cost; fewer nodes and easier ops |
| Security Management | Two security models (Kafka + ZK) | Unified Kafka security model |
| Observability | Split between Kafka and ZooKeeper | Centralized in Kafka controllers |
| Configuration Management | ZK stores configs | Kafka stores configs natively |
| Support & Future Roadmap | Deprecated in Kafka 3.5; removed in Kafka 4.0 | Officially the future of Kafka |
Ksolves: Your Trusted Partner for Apache Kafka Consulting
At Ksolves, we help businesses build powerful, real-time streaming solutions with Apache Kafka. Whether you’re running Kafka clusters with ZooKeeper, planning a smooth migration to KRaft, or starting your Kafka journey from the ground up, our certified Kafka experts are here to guide you every step of the way.
Our Kafka Consulting Services Include:
Kafka Cluster Setup & Performance Tuning
ZooKeeper to KRaft Migration
Real-Time Data Pipeline Development
24/7 Kafka Monitoring, Scaling & Support
Contact us to discuss your project requirements with our experts.
Conclusion
ZooKeeper was an essential part of Kafka’s early journey, ensuring reliable coordination, managing metadata, and enabling Kafka’s distributed nature. But as demands grew and architecture evolved, the need for a more scalable, integrated solution became clear. That solution is KRaft. Whether you’re maintaining an existing cluster or planning a new one, understanding this architectural shift is vital to making informed decisions.
Atul Khanduri, a seasoned Associate Technical Head at Ksolves India Ltd., has 12+ years of expertise in Big Data, Data Engineering, and DevOps. Skilled in Java, Python, Kubernetes, and cloud platforms (AWS, Azure, GCP), he specializes in scalable data solutions and enterprise architectures.
Has ZooKeeper been removed from Kafka?
Yes, in current releases. ZooKeeper mode was deprecated in Kafka 3.5 and removed in Kafka 4.0, which runs only in KRaft mode. ZooKeeper support survives only in the 3.x release lines, with 3.9 being the last.
Is it mandatory to migrate to KRaft?
Yes, eventually. Kafka 4.0 and later run only in KRaft mode, so a ZooKeeper-based cluster has to migrate before it can upgrade beyond the 3.x line, and new Kafka improvements are built around KRaft.
Can I migrate an existing ZooKeeper-based Kafka cluster to KRaft?
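Yes. Kafka 3.x includes a documented ZooKeeper-to-KRaft migration path (KIP-866) that proceeds through rolling restarts rather than a full-cluster outage: you provision a KRaft controller quorum with migration enabled, roll the brokers so that metadata is copied from ZooKeeper, roll them again as KRaft brokers, and then finalize the migration. It still needs careful planning and testing, and it must be completed on a 3.x release before upgrading to Kafka 4.0.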
Does KRaft improve Kafka performance?
Yes, mainly for metadata operations. KRaft offers faster controller failover, reduced metadata latency, and better scalability in partition count because metadata is handled internally using Raft consensus. The produce and consume data path itself is largely unchanged.
What happens to ZooKeeper after migrating to KRaft?
Once the Kafka cluster runs successfully in KRaft mode, ZooKeeper is no longer required and can be safely decommissioned.
Do I still need quorum nodes with KRaft?
Yes. Instead of ZooKeeper nodes, you now maintain a KRaft controller quorum (usually 3 controllers) responsible for metadata replication and consensus.
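The sizing of a controller quorum follows from simple majority arithmetic, sketched below. The helper names and the `controller-N.example.internal` hostnames are hypothetical; the `id@host:port` format is the one used by Kafka's `controller.quorum.voters` setting, and port 9093 is only a common convention.

```python
def majority(voters: int) -> int:
    """Smallest number of voters that forms a majority."""
    return voters // 2 + 1


def tolerated_failures(voters: int) -> int:
    """How many voters can be lost while a majority is still reachable."""
    return voters - majority(voters)


def quorum_voters(controllers) -> str:
    """Render a controller.quorum.voters value from (node_id, host, port) tuples."""
    return ",".join(f"{node_id}@{host}:{port}" for node_id, host, port in controllers)


# Hypothetical three-controller quorum.
controllers = [
    (1, "controller-1.example.internal", 9093),
    (2, "controller-2.example.internal", 9093),
    (3, "controller-3.example.internal", 9093),
]

for size in (3, 4, 5):
    print(size, "voters ->", majority(size), "needed,", tolerated_failures(size), "may fail")

print("controller.quorum.voters=" + quorum_voters(controllers))
```

Three controllers tolerate one failure and five tolerate two, while a fourth controller raises the majority to three without tolerating any additional failure, which is why odd quorum sizes are the usual choice.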
Is KRaft production-ready for real workloads?
Yes. KRaft has been marked production-ready for new clusters since Kafka 3.3, with significant stability improvements in later versions, and it is the only mode available from Kafka 4.0 onward.
Will my applications change after migrating to KRaft?
No. Producer and consumer APIs remain unchanged. Migration impacts cluster configuration and operations, not application code.
Is KRaft more reliable than ZooKeeper?
Yes. KRaft offers deterministic Raft-based consensus, avoids ZooKeeper session issues, and keeps all metadata inside Kafka for better consistency and reliability.
Can I run a mixed cluster with some brokers on ZooKeeper and some on KRaft?
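Not as a steady state. A running cluster operates entirely in one mode; the only exception is the temporary transitional phase of a ZooKeeper-to-KRaft migration, during which KRaft controllers coexist with brokers that still use ZooKeeper. Outside of that migration window, the modes are: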
• ZooKeeper mode → all brokers depend on ZooKeeper
• KRaft mode → all brokers depend on the internal Raft-based controller quorum