
How Ksolves Built a 1.5PB Data Lakehouse for a Multi-Operator Telecom CDR Provider

Industry
Telecommunication
Technology
Apache NiFi, NiFi Minion, Apache Kafka, Apache Spark, Apache Hudi, Trino, Apache Superset, Keycloak, Apache Airflow

Client Overview

Every day, telecom operators generate massive volumes of Call Detail Records (CDRs), which are logs of every call made, received, or missed on their network. For a business that manages CDR data on behalf of multiple telecom operators, handling that data reliably, securely, and at scale is not optional. It is the entire product.


One leading telecommunications data services provider was doing exactly that. They were collecting over 5TB of CDR files every day from mobile sites spread across multiple locations. Four or more Mobile Network Operators (MNOs) were all sharing the same platform, and each one required its data to be kept completely separate from the others. Any mix-up would be a serious regulatory breach.


Their existing setup could not keep up. They needed a new data platform built from scratch, one that could collect data securely from every site, bring it all together in one place, and make it available for analysis without ever letting one operator see another’s data.


That is what Ksolves, an AI-first company, built for them.

Key Challenges

The client came to Ksolves with four problems that needed to be solved before anything else could work:

  • No Secure Way to Collect CDR Files at Each Site: Each mobile site needed a safe and auditable way for operators to upload their CDR files. Simple shared folders were not good enough. Every upload needed to be controlled, logged, and isolated so that nothing inside the environment was ever exposed.
  • Keeping All Sites in Sync: The client had four or more mobile sites spread across different locations. If the data processing rules at one site were slightly different from another, it could cause silent errors in the data. Those errors would only show up later, after reports had already been generated and billing cycles had already run.
  • Handling 5TB of Data Every Day Without Gaps: The platform needed to collect and process over 5TB of CDR data every single day without missing anything. A gap in the data means incomplete billing records, incorrect reports, and broken SLA commitments to the operators.
  • Keeping Each Operator's Data Completely Separate: All four or more operators were sharing the same platform. But each one's data had to be kept completely invisible to the others. This was not just a preference. It was a legal and regulatory requirement.
Our Solutions

Ksolves, working with an AI-first delivery approach, used AI-assisted planning tools to design the architecture, validate security configurations, and accelerate development before any code was written. The platform was built in two layers: one at the edge (the mobile sites) and one at the center (the analytics hub).

  • Secure File Collection at Every Mobile Site: At each mobile site, a tool called SFTPGo was set up to act as a secure gateway. Operators can upload their CDR files through it, but they cannot see or access anything else in the system. Every upload is logged. Once the files arrive, Apache NiFi checks them for errors, cleans them up, and prepares them for sending to the central hub. NiFi Registry makes sure all sites follow the same rules. If a rule changes, it is updated at every site at the same time, with the ability to roll back instantly if something goes wrong.
  • Sending Data to the Central Hub: Once the data is ready at the edge, it is sent securely to a central Apache Kafka system. Kafka acts as a buffer between the mobile sites and the main processing engine, so activity at one end never slows down the other.
  • Storing 1.5PB of Data Safely: All the CDR data is stored in a MinIO storage system that can hold up to 1.5PB. MinIO uses a technology called erasure coding, which means the data is protected even if some of the hardware fails. The system is set up to keep 12 months of CDR history available at all times.
  • Processing Billions of Records: Apache Spark processes the incoming data from Kafka and organizes it into structured tables using Apache Hudi. This makes the data reliable, queryable, and ready for analysis. The system is built to stay running even if parts of it fail, thanks to YARN High Availability.
  • Keeping Every Operator's Data Separate: A security system called Keycloak manages who can log in and what they can see. Each operator gets their own isolated workspace. Apache Superset enforces role-based and row-level security controls on all dashboards, ensuring no operator can ever see another's records.
  • Making the Data Available for Analysis: Trino lets analysts run fast federated SQL queries directly on the stored data without moving it anywhere. Apache Superset uses those queries to power separate, governed dashboards for each operator, all updated with fresh data.
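To make the edge layer concrete, here is a minimal pure-Python sketch of the validate-then-route step that the NiFi flows perform at each mobile site: check every incoming CDR record, quarantine anything malformed, and batch valid records per operator. The field names and CSV format are illustrative assumptions, not the client's actual CDR schema, and this stands in for NiFi processors, not production code.

```python
"""Sketch of the edge-layer NiFi step: validate CDR records, quarantine
bad rows, and batch valid rows per operator. Schema is an assumption."""
import csv
import io

REQUIRED_FIELDS = ("operator_id", "caller", "callee", "start_time", "duration_sec")

def validate_record(row: dict) -> bool:
    """A record is valid only if every required field is present and
    non-empty and the duration parses as a non-negative integer."""
    if any(not row.get(f) for f in REQUIRED_FIELDS):
        return False
    try:
        return int(row["duration_sec"]) >= 0
    except ValueError:
        return False

def route_cdr_file(raw: str) -> tuple[dict, list]:
    """Split a raw CDR file into per-operator batches of valid rows and a
    quarantine list of invalid rows, mirroring NiFi's validate-then-route."""
    per_operator: dict[str, list] = {}
    quarantine: list[dict] = []
    for row in csv.DictReader(io.StringIO(raw)):
        if validate_record(row):
            per_operator.setdefault(row["operator_id"], []).append(row)
        else:
            quarantine.append(row)
    return per_operator, quarantine

if __name__ == "__main__":
    sample = (
        "operator_id,caller,callee,start_time,duration_sec\n"
        "mno_a,100,200,2024-01-01T00:00:00,60\n"
        "mno_b,101,201,2024-01-01T00:01:00,-5\n"  # invalid duration: quarantined
        "mno_b,102,202,2024-01-01T00:02:00,30\n"
    )
    batches, bad = route_cdr_file(sample)
    print(sorted(batches), len(bad))  # → ['mno_a', 'mno_b'] 1
```

Because invalid rows are quarantined rather than dropped, nothing silently disappears, which is what the "no gaps" requirement demands; NiFi Registry then keeps this logic identical across all sites.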

Technology Stack

Component | Details
Edge Ingestion | SFTPGo, Apache NiFi, NiFi Minion, NiFi Registry
Message Bus | Apache Kafka
Processing Engine | Apache Spark on YARN HA
Table Format | Apache Hudi
Storage | MinIO Enterprise, 1.5PB capacity, erasure coding
Query Engine | Trino
Analytics | Apache Superset, row-level security per MNO
Identity and Security | Keycloak (OIDC SSO), Nginx, HAProxy
Orchestration | Apache Airflow
AI Tooling | AI-assisted architecture design, security validation, flow development
Impact

The platform has been running in production since launch and has delivered the following results:

  • 99.99% Data Durability: Every CDR record stored in the platform is protected by MinIO's erasure coding. No single hardware failure can cause data loss.
  • 5TB of Data Processed Every Day Without Gaps: The platform handles the full daily volume of CDR data across all mobile sites, with no missing records and no processing failures.
  • Zero Downtime Since Launch: The platform has not had a single unplanned outage since it went live. YARN HA and HAProxy keep everything running even when individual components fail.
  • 4+ Operators Running on One Platform, Fully Isolated: Each operator has its own completely separate data environment on the shared platform. No operator can see another's data, meeting all regulatory and contractual requirements.
  • 12 Months of CDR History Available for Querying: Analysts can query up to 12 months of historical CDR data in seconds using Trino, without any data movement or pipeline delays.
  • Faster Delivery with AI: Ksolves' AI-first delivery approach compressed the architecture design and development phases significantly, getting the platform into production ahead of schedule.
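The durability figure above comes from erasure coding: MinIO splits each object into data and parity shards so the object survives losing drives. MinIO itself uses Reed-Solomon coding, which tolerates multiple simultaneous shard losses; the sketch below is a deliberately simplified single-parity (XOR) version that shows the core idea of rebuilding a lost shard from the survivors.

```python
"""Simplified single-parity illustration of erasure coding: one XOR parity
shard lets us rebuild any one lost data shard. Real systems (e.g. MinIO's
Reed-Solomon coding) tolerate multiple shard losses."""

def encode(shards: list) -> bytes:
    """Compute one XOR parity shard over equal-length data shards."""
    parity = bytearray(len(shards[0]))
    for shard in shards:
        for i, b in enumerate(shard):
            parity[i] ^= b
    return bytes(parity)

def recover(survivors: list, parity: bytes) -> bytes:
    """Rebuild the single missing data shard: XOR of survivors and parity."""
    return encode(survivors + [parity])

if __name__ == "__main__":
    data = [b"CDR1", b"CDR2", b"CDR3"]
    parity = encode(data)
    # Simulate losing the second shard and rebuilding it:
    rebuilt = recover([data[0], data[2]], parity)
    print(rebuilt)  # → b'CDR2'
```

With Reed-Solomon, the same principle extends to several parity shards spread across drives and nodes, which is how the platform keeps every CDR record intact through hardware failures.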


Client Testimonial

“We needed a partner who understood the complexity of managing data for multiple operators on a single platform. Ksolves built exactly what we needed. Every operator’s data stays completely isolated, the platform has not gone down once since launch, and our analysts can query 12 months of history in seconds.”
— Head of Data Architecture, Telecommunications Data Services Provider

Conclusion

Ksolves, with its AI-first delivery approach, built a data platform that brought everything together. CDR data from every mobile site now flows into one secure, central system. Every operator’s data stays separate. Reports are fast, accurate, and always available.


The platform currently handles over 5TB of CDR data every day across a 1.5PB storage foundation, with 12 months of history available for analysis and zero downtime since launch. As the client onboards new operators and data volumes grow, the platform scales to match without any rebuilding.


If you are a telecom provider managing CDR data across multiple operators and locations, explore Ksolves' Big Data and Lakehouse Services and find out how a purpose-built data platform can work for your operations.

Managing CDR Data Across Multiple Operators? Let’s Build the Right Platform.