How Ksolves Built a Unified MDM Platform for a UAE Automotive Group and Connected Fragmented Customer Data Across 8 Brands and 3 ERP Systems
The client is a UAE-based premium automotive group with one of the most diverse brand portfolios in the Gulf. Their portfolio spans exclusive dealerships for BMW, Rolls-Royce, Mini, BMW Motorrad, Geely, INEOS, and Riddara; vehicle rental and leasing under Budget Rent-A-Car and Prime Limousines; and a multi-brand after-sales division through Pitstop Autocare.
Across these eight business lines, the group manages everything from new vehicle purchases and service bookings to short-term rentals and long-term fleet leasing. Their data lived in three separate ERPs: Keyloop Autoline Drive, ERPNext (Polaris), and SpeedAuto. These systems did not talk to each other, which meant customer records were duplicated, fragmented, and impossible to unify for loyalty, CRM, or analytics.
The group’s Chief Digital Officer engaged Ksolves to design and build an enterprise master data management (MDM) and CDW platform from scratch, one that could consolidate customer and vehicle data from all three source systems into a single, governed golden record layer.
Ksolves’ Big Data consulting team used AI-assisted data profiling and automated de-duplication modelling at the design stage to reduce configuration effort and cut delivery time by roughly half compared with a conventional build approach.
Fragmented Customer Identity Across Three Source Systems
The same customer appearing in Keyloop (dealership), ERPNext (finance and ERP), and SpeedAuto (rental) existed as three separate, unlinked records with no shared identifier. There was no way to build a unified Customer 360 view. This caused duplicate communications, missed cross-sell opportunities, and loyalty programme gaps across every brand.
Missing Vehicle Lifecycle Golden Record
Vehicle records, including VIN, ownership history, service events, rental periods, and modifications, were spread across source systems with no connection between them. No master vehicle record existed to track a car from new sale through servicing, leasing, and resale across the full group lifecycle.
Batch-Only Data Flows with No Real-Time Updates
Keyloop's primary integration method was batch CSV exports. No event-driven architecture was in place, which meant data freshness across business lines was poor. Cross-system analytics depended on stale overnight feeds. Real-time responses to customer activity were not possible.
No De-duplication or Golden Record Logic
Duplicate customer records were a confirmed issue across source systems. Without deterministic matching on keys such as Emirates ID, and without fuzzy matching across Arabic and English name variants, the group had no way to detect or resolve duplicate identities at the point of ingestion.
No Centralized Analytics or CDW Layer
Business intelligence was fragmented per system. Leadership had no cross-functional view of customer lifetime value, vehicle performance, or revenue across dealership, after-sales, and rental lines. Group-level reporting depended entirely on manual extraction and Excel consolidation.
UAE PDPL and GDPR Compliance Gaps
Customer PII, including Emirates ID, passport numbers, and mobile data, flowed across systems with no governed compliance framework. There was no consent tracking, no encryption policy, no audit logging, and no configured data retention aligned to UAE PDPL or GDPR requirements.
No Governed Data Stewardship Process
With no stewardship portal, data quality ownership was undefined. Borderline merge decisions, incorrect records, and cleansing rule management had no structured workflow. Data quality depended entirely on informal manual processes across business units.
Multi-Source Ingestion Engine
Ksolves designed native connectors for Keyloop Autoline Drive (polling and delta CSV), ERPNext/Polaris (REST API and bulk), and SpeedAuto (API and file-based), along with a generic REST API gateway for third-party sources and a scheduled CSV batch upload interface. All flows maintained full provenance metadata, idempotent processing, and quarantine queuing for failed records. Apache NiFi and AWS API Gateway handled end-to-end ingestion orchestration.
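As a rough illustration of what idempotent processing, provenance metadata, and quarantine queuing can look like together, here is a minimal Python sketch. It is not the NiFi flow used in the project; the function names, the `customer_id` validation rule, and the record shape are all assumptions made for the example.

```python
import hashlib
import json
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class IngestionResult:
    accepted: list = field(default_factory=list)
    quarantined: list = field(default_factory=list)
    skipped_duplicates: int = 0


def record_key(source: str, payload: dict) -> str:
    """Stable idempotency key: hash of the source name plus canonical JSON."""
    canonical = json.dumps(payload, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(f"{source}|{canonical}".encode("utf-8")).hexdigest()


def ingest(source: str, batch_id: str, records: list, seen_keys: set) -> IngestionResult:
    """Process one batch; replaying the same batch must not create new records."""
    result = IngestionResult()
    for payload in records:
        key = record_key(source, payload)
        if key in seen_keys:
            # Already ingested earlier: skip instead of writing a second copy.
            result.skipped_duplicates += 1
            continue
        if not payload.get("customer_id"):
            # Hypothetical validation rule: failed records are parked, not dropped.
            result.quarantined.append(
                {"payload": payload, "reason": "missing customer_id",
                 "source": source, "batch_id": batch_id}
            )
            continue
        seen_keys.add(key)
        result.accepted.append({
            "payload": payload,
            # Provenance travels with the record so its origin stays traceable.
            "provenance": {
                "source": source,
                "batch_id": batch_id,
                "idempotency_key": key,
                "ingested_at": datetime.now(timezone.utc).isoformat(),
            },
        })
    return result
```

Replaying the same batch with the same `seen_keys` set yields only skipped duplicates and re-quarantined failures, which is the property that makes retries safe.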
Intelligent De-duplication and Golden Record Engine
The de-duplication engine combined deterministic exact-match on Emirates ID, passport, email, phone, and CRM ID with probabilistic fuzzy matching using Jaro-Winkler, Levenshtein, and Soundex algorithms, covering both Arabic and English name variants. Configurable confidence thresholds auto-merged high-confidence duplicates and routed borderline cases to a governed Stewardship Queue. AI-assisted matching model calibration was used during the build to tune thresholds accurately across all three source system formats without manual trial and error. The result: greater than 95% de-duplication accuracy across 1 million records in 30 minutes or less.
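The combination of deterministic keys, fuzzy scoring, and threshold-based routing can be sketched in a few lines of Python. This is a simplified illustration, not the production Spark engine: it uses only Levenshtein distance (not Jaro-Winkler or Soundex), the threshold values and field names are hypothetical, and it assumes names have already been transliterated to a common script.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance between two strings."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def name_similarity(a: str, b: str) -> float:
    """Normalised similarity in [0, 1]; 1.0 means identical after case folding."""
    a, b = a.strip().casefold(), b.strip().casefold()
    if not a and not b:
        return 1.0
    return 1.0 - levenshtein(a, b) / max(len(a), len(b))


AUTO_MERGE_THRESHOLD = 0.92  # hypothetical value, tuned per deployment
REVIEW_THRESHOLD = 0.75      # hypothetical value, tuned per deployment


def classify_pair(rec_a: dict, rec_b: dict) -> str:
    """Return 'auto_merge', 'steward_review', or 'distinct' for a candidate pair."""
    id_a, id_b = rec_a.get("emirates_id"), rec_b.get("emirates_id")
    if id_a and id_b:
        # Deterministic rule: when both IDs are present, they decide the outcome.
        # Treating differing IDs as distinct is an assumption of this sketch.
        return "auto_merge" if id_a == id_b else "distinct"
    score = name_similarity(rec_a["name"], rec_b["name"])
    if score >= AUTO_MERGE_THRESHOLD:
        return "auto_merge"
    if score >= REVIEW_THRESHOLD:
        return "steward_review"  # borderline: route to a human steward
    return "distinct"
```

With these example thresholds, a one-letter spelling difference in a long name is merged automatically, while a pair that differs by several characters lands in the review queue rather than being merged or discarded.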
Centralized Data Repository and CDW on ClickHouse
ClickHouse was selected as the primary columnar analytical database for both the MDM golden record layer and the decoupled CDW. The Bronze-Silver-Gold Medallion Architecture stored raw, cleansed, and curated data in clearly separated layers. The platform delivered sub-500ms Customer 360 and Vehicle 360 API responses, full SCD Type 2 and Type 4 support, and native ODBC/JDBC connectivity for Power BI, Tableau, and Qlik Sense, all within the AWS Middle East (UAE) region for PDPL data residency compliance.
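For readers unfamiliar with SCD Type 2, the idea is that a changed attribute never overwrites the old row; the old row is closed and a new current row is appended. The in-memory Python sketch below illustrates that behaviour only. It is not ClickHouse DDL, and the `CustomerVersion` type and its fields are assumptions for the example.

```python
from dataclasses import dataclass, replace
from datetime import datetime
from typing import List, Optional


@dataclass(frozen=True)
class CustomerVersion:
    customer_id: str
    email: str
    valid_from: datetime
    valid_to: Optional[datetime] = None  # None means "still valid"
    is_current: bool = True


def apply_scd2(history: List[CustomerVersion], customer_id: str,
               new_email: str, changed_at: datetime) -> List[CustomerVersion]:
    """Return a new history with the change recorded as an SCD Type 2 version."""
    updated = []
    for row in history:
        if row.customer_id == customer_id and row.is_current:
            if row.email == new_email:
                return list(history)  # nothing changed, keep history as-is
            # Close the old version instead of overwriting it.
            updated.append(replace(row, valid_to=changed_at, is_current=False))
        else:
            updated.append(row)
    # Append the new current version.
    updated.append(CustomerVersion(customer_id, new_email, valid_from=changed_at))
    return updated
```

Because closed rows keep their `valid_from` and `valid_to` bounds, a query can reconstruct what the record looked like at any point in time.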
Real-Time Event-Driven Architecture
All record lifecycle events (creation, update, merge, and deletion) were published to Apache Kafka via AWS MSK as structured, schema-governed Avro payloads. Entities covered: Customer, Vehicle, Transaction, and Events/Leads. End-to-end event latency was maintained below 5 seconds, enabling real-time downstream consumption by CDP, marketing automation, and contact centre systems.
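A schema-governed lifecycle event might look something like the sketch below. The Avro schema is written as a Python dictionary, and a small helper builds an event that conforms to its enums. The record name, namespace, field names, and symbols are hypothetical; the project's actual schemas are not described in this case study, and no Kafka client call is shown.

```python
import json
import time
import uuid

# Hypothetical Avro record schema for a record lifecycle event.
LIFECYCLE_EVENT_SCHEMA = {
    "type": "record",
    "name": "RecordLifecycleEvent",
    "namespace": "example.mdm.events",
    "fields": [
        {"name": "event_id", "type": "string"},
        {"name": "entity", "type": {
            "type": "enum", "name": "Entity",
            "symbols": ["Customer", "Vehicle", "Transaction", "EventLead"]}},
        {"name": "event_type", "type": {
            "type": "enum", "name": "EventType",
            "symbols": ["CREATED", "UPDATED", "MERGED", "DELETED"]}},
        {"name": "golden_record_id", "type": "string"},
        {"name": "occurred_at", "type": {"type": "long", "logicalType": "timestamp-millis"}},
    ],
}


def _symbols(field_name: str) -> list:
    """Look up the allowed enum symbols for a field in the schema."""
    for f in LIFECYCLE_EVENT_SCHEMA["fields"]:
        if f["name"] == field_name:
            return f["type"]["symbols"]
    raise KeyError(field_name)


def build_event(entity: str, event_type: str, golden_record_id: str,
                occurred_at_ms: int = None) -> dict:
    """Build an event dict, rejecting values the schema's enums do not allow."""
    if entity not in _symbols("entity"):
        raise ValueError(f"unknown entity: {entity}")
    if event_type not in _symbols("event_type"):
        raise ValueError(f"unknown event type: {event_type}")
    return {
        "event_id": str(uuid.uuid4()),
        "entity": entity,
        "event_type": event_type,
        "golden_record_id": golden_record_id,
        "occurred_at": occurred_at_ms if occurred_at_ms is not None
        else int(time.time() * 1000),
    }


if __name__ == "__main__":
    print(json.dumps(build_event("Customer", "MERGED", "GR-0001"), indent=2))
```

Carrying the event time in the payload also gives consumers a way to measure end-to-end latency against their own receive time.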
Data Governance Portal and Stewardship Queue
A web-based governance portal, built on the group's preferred Next.js and React stack, gave data stewards real-time stewardship queue management, side-by-side source record comparison with confidence scoring, no-code matching rule configuration, live data quality KPI dashboards, and RBAC-governed access controls aligned to Azure AD SSO.
Security and Compliance Framework
The platform enforced AES-256 encryption at rest, TLS 1.3 in transit, Azure AD-based RBAC and ABAC across all access points, immutable audit logs retained for 7 years, and configurable consent and retention policies per entity. The full platform was deployed within AWS me-central-1 (UAE) to satisfy UAE PDPL data residency requirements and support GDPR consent management. Vulnerability management was configured with zero tolerance for findings scored at CVSS 7.0 or higher.
Technology Stack
| Category | Technology | Role |
|---|---|---|
| Integration | Apache NiFi + AWS API Gateway | Ingestion orchestration across Keyloop, ERPNext, and SpeedAuto |
| Processing | Apache Spark + MLlib | Distributed cleansing, standardization, and de-duplication |
| Database | ClickHouse | Columnar MDM and CDW engine, sub-500ms query responses |
| Platform | AWS MSK (Apache Kafka) | Real-time event backbone, under 5-second end-to-end latency |
| Security | Azure AD + AWS WAF + GuardDuty | RBAC/ABAC, AES-256, TLS 1.3, UAE PDPL and GDPR compliance |
| Frontend | React / Next.js | Data Governance Portal for stewardship and data quality management |
Unified Customer Identity Across All 8 Brands
Before, a customer appearing across BMW dealership, Budget Rent-A-Car, and Pitstop Autocare existed as three separate, unlinked identities. After, a single authoritative golden customer record consolidates identities from all three source systems using Emirates ID, fuzzy Arabic and English name matching, and configurable survivorship rules. The unified Customer 360 view is available across all group brands at sub-500ms API response times.
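Survivorship rules decide which source's value "wins" for each attribute once duplicate records have been linked. A minimal Python sketch of one common approach follows: per-attribute source priority, with the most recent update as a tie-breaker. The priority orders, field names, and sample values are invented for illustration and are not the group's actual configuration.

```python
from datetime import date


def survive(records: list, attribute: str, source_priority: list):
    """Pick the surviving value for one attribute across linked source records."""
    candidates = [r for r in records if r.get(attribute)]  # ignore empty values
    if not candidates:
        return None

    def rank(record: dict):
        src = record["source"]
        # Unknown sources sort after every listed source.
        priority = (source_priority.index(src)
                    if src in source_priority else len(source_priority))
        # Lower tuple wins: best priority first, then most recent update.
        return (priority, -record["updated_at"].toordinal())

    return min(candidates, key=rank)[attribute]


def build_golden_record(records: list, rules: dict) -> dict:
    """Apply one survivorship rule per attribute to assemble the golden record."""
    return {attr: survive(records, attr, order) for attr, order in rules.items()}


if __name__ == "__main__":
    linked = [
        {"source": "keyloop", "updated_at": date(2024, 1, 10),
         "email": "old@example.com", "phone": ""},
        {"source": "erpnext", "updated_at": date(2023, 5, 1),
         "email": "fin@example.com", "phone": "+971500000001"},
        {"source": "speedauto", "updated_at": date(2024, 3, 2),
         "email": None, "phone": "+971500000002"},
    ]
    rules = {  # hypothetical per-attribute source priority
        "email": ["erpnext", "keyloop", "speedauto"],
        "phone": ["speedauto", "erpnext", "keyloop"],
    }
    print(build_golden_record(linked, rules))
```

Keeping the rules in a plain mapping is what makes them configurable: changing which system is trusted for an attribute is a data change, not a code change.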
De-duplication at Scale: 1 Million Records in 30 Minutes or Less
Before, duplicate records were a known data quality issue with no automated resolution process. After, the Apache Spark and MLlib de-duplication engine processes 1 million records in 30 minutes or less at greater than 95% accuracy. Borderline records are automatically routed to the governed Stewardship Queue for human review. Ad-hoc manual resolution has been replaced by a fully governed automated pipeline.
Real-Time Event Integration with Under 5-Second Latency
Before, all cross-system data flows were batch-only. Keyloop was limited to CSV exports, creating multi-hour data freshness gaps and preventing real-time reactions to customer activity. After, every customer and vehicle record lifecycle event is published to AWS MSK (Apache Kafka) within 5 seconds of creation or update, enabling real-time CDP triggers, marketing automation, and contact centre screen-pops without polling the central repository.
99.99% Platform Availability with Disaster Recovery
Before, no unified data platform existed and no enterprise-level HA or DR strategy governed the central data layer. After, the multi-AZ AWS deployment with active-passive disaster recovery delivers 99.99% availability, with RPO of 1 hour or less and RTO of 4 hours or less, making the platform dependable for group-wide operational use.
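To put the availability percentage in concrete terms, it can be converted into a yearly downtime budget. The snippet below is plain arithmetic on the stated figure, assuming a 365-day year; the function name is just for illustration.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a 365-day year


def downtime_budget_minutes(availability_pct: float) -> float:
    """Maximum minutes of downtime per year allowed by an availability target."""
    return MINUTES_PER_YEAR * (100 - availability_pct) / 100


if __name__ == "__main__":
    print(f"{downtime_budget_minutes(99.99):.2f} minutes per year")
```

At 99.99%, the budget works out to roughly 52.6 minutes of downtime per year.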
UAE PDPL and GDPR Compliance Fully Established
Before, customer PII including Emirates ID and passport data flowed across systems without a governed compliance framework, audit trail, or configured data retention policy. After, the platform enforces AES-256 encryption at rest, TLS 1.3 in transit, immutable audit logs retained for 7 years, and configurable consent and retention policies per entity, all within AWS me-central-1 (UAE) for data residency compliance.
“We had the same customer living as three separate records across Keyloop, ERPNext, and SpeedAuto with no way to connect them. Ksolves built a de-duplication and golden record platform that handles Arabic and English name variants, Emirates ID matching, and cross-brand identity resolution, and delivered it on schedule. Our CDO now has a Customer 360 view that simply didn’t exist before.”
– Group Head of Technology, UAE Premium Automotive Group
Before this engagement, a UAE premium automotive group operating 8 brands across Keyloop, ERPNext, and SpeedAuto had no unified customer identity, no master vehicle record, no real-time data flows, and no governed compliance framework. Cross-brand analytics, personalised CX, and loyalty programmes were not structurally possible.
After, Ksolves designed and delivered a purpose-built MDM and CDW platform that provides a single golden record for every customer and vehicle across all brands, powered by Apache Spark de-duplication, ClickHouse columnar analytics, and Apache Kafka real-time event streaming, deployed fully within AWS UAE for PDPL compliance. The platform meets all 20 functional and non-functional requirements defined in the engagement scope.
The configuration-first, microservices-native architecture lets the group onboard new brands or source systems in under 1 hour with zero downtime, keeping the data backbone ready for expansion into mobility services, loyalty programmes, and marketing automation.
Talk to Ksolves About Designing a Master Data Management Platform That Consolidates Your Data, Built Precisely to Your Architecture.