Case Study: Master Data Governance

The Problem

Three systems, none of them in agreement.

A chemical company more than 160 years old, owned by a major diversified holding company, had accumulated three separate enterprise systems over years of growth and acquisition. Each system managed records across a different part of the business: banks, vendors, and clients.

The systems ran simultaneously, but they did not stay synchronized. Every transaction risked creating a duplicate. Worse, duplication was not only a cross-system problem. It happened within individual systems too. The result was fragmented records, inconsistent identifiers, and no reliable master list of anything.

That inconsistency carried real cost. Records lost to duplication or double-processed through synchronization gaps created liability: payments sent to the wrong entity, commitments tracked against the wrong account, and decisions made on data nobody could fully trust.

The business needed:

All historical duplication cleaned up and reconciled
Consistent, deduplicated records established across banks, vendors, and clients
A durable process to prevent duplication from recurring
All of this without replacing or consolidating any of the three existing systems

The Approach

Two distinct problems, two different tools.

Cleaning up the historical data and keeping future data clean are fundamentally different challenges. We treated them separately.

For the historical backlog, the data was too inconsistent for rules alone. Company names were spelled differently across systems. Fields did not map cleanly. Records had been entered by different people over decades with no shared standards. We used AI-based matching to identify likely duplicates and relationships across that noise, surfacing candidates for reconciliation that deterministic matching would have missed.
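
To make the idea of surfacing candidates concrete, here is a minimal sketch. The case study does not say which matching model was used, so this example swaps in plain string similarity from Python's standard library (`difflib.SequenceMatcher`) as a stand-in. The record IDs, company names, suffix list, and the 0.85 threshold are all hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations
import re

# Hypothetical legal-form suffixes to ignore when comparing names.
LEGAL_SUFFIXES = {"inc", "llc", "ltd", "corp", "co", "gmbh"}


def normalize(name: str) -> str:
    """Lowercase, replace punctuation with spaces, drop legal suffixes."""
    tokens = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    return " ".join(t for t in tokens if t not in LEGAL_SUFFIXES)


def candidate_duplicates(records, threshold=0.85):
    """Return (id_a, id_b, score) for pairs whose names look alike.

    `records` is a list of (record_id, name) tuples. The output is a
    list of candidates for review, not a list of confirmed duplicates.
    """
    candidates = []
    for (id_a, name_a), (id_b, name_b) in combinations(records, 2):
        score = SequenceMatcher(None, normalize(name_a), normalize(name_b)).ratio()
        if score >= threshold:
            candidates.append((id_a, id_b, round(score, 3)))
    # Highest-scoring pairs first, so reviewers see the likeliest matches.
    return sorted(candidates, key=lambda pair: -pair[2])


# Hypothetical sample data.
records = [
    ("V-001", "Acme Chemical Supply, Inc."),
    ("V-417", "ACME Chemicals Supply"),
    ("V-902", "Northwind Logistics LLC"),
]

for id_a, id_b, score in candidate_duplicates(records):
    print(id_a, id_b, score)
```

Comparing every pair is quadratic in the number of records, so a backlog of real size would need some form of blocking (for example, only comparing records that share a postal code) before scoring. That step is omitted here to keep the sketch short.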

Once the legacy data was clean, we switched approaches entirely. The forward-looking process was built to be fully deterministic: explicit rules, no model inference, no ongoing inference costs. The client could understand it, maintain it, and own it without us.
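
A deterministic check of this kind might look like the sketch below. The actual rules are not described in the case study, so the `match_key` function, the `tax_id` field, and the `MasterIndex` class are hypothetical names chosen for illustration. The point is only that every decision comes from an exact, inspectable key comparison.

```python
import re


def match_key(name: str, tax_id: str) -> tuple[str, str]:
    """Build an exact-match key from explicit, auditable rules."""
    clean_name = re.sub(r"[^a-z0-9]", "", name.lower())  # letters and digits only
    clean_tax_id = re.sub(r"\D", "", tax_id)             # digits only
    return (clean_name, clean_tax_id)


class MasterIndex:
    """In-memory index from match key to master record id (illustrative only)."""

    def __init__(self):
        self._by_key = {}

    def register(self, record_id: str, name: str, tax_id: str) -> str:
        """Return the existing master id for this key, or adopt record_id as new."""
        key = match_key(name, tax_id)
        return self._by_key.setdefault(key, record_id)


index = MasterIndex()
first = index.register("V-001", "Acme Supply", "12-3456789")
second = index.register("V-777", "ACME  Supply.", "123456789")
print(first, second)  # both calls resolve to the same master id
```

Because the key is a pure function of the input fields, the same record always produces the same answer, and a maintainer can trace any match back to the rule that produced it.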

The integration itself was bespoke: written specifically for the three systems as they existed, preserving their structure and operations while adding a reconciliation layer that kept records synchronized going forward.

Why It Worked

The right tool for each phase, not one tool for everything.

A common mistake in data cleanup projects is reaching for AI everywhere, including in places where deterministic logic is cheaper, faster, and more auditable. We used AI where the data was genuinely ambiguous and human judgment would have been needed otherwise. Once that ambiguity was resolved, we encoded the results as rules and handed control back to the client.
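
One way to picture "encoding the results as rules" is a static lookup table built from the reviewed matches. The sketch below is an assumption about how that could be done, not a description of the delivered system: it groups confirmed duplicate pairs with a small union-find and emits a crosswalk from each record id to a single master id. The function name, the ids, and the choice of the smallest id as master are all hypothetical.

```python
def build_crosswalk(confirmed_pairs):
    """Collapse confirmed duplicate pairs into a {record_id: master_id} table."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps chains short
            x = parent[x]
        return x

    for a, b in confirmed_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Keep the lexicographically smallest id as master:
            # an arbitrary choice, but a stable and explainable one.
            keep, drop = sorted((root_a, root_b))
            parent[drop] = keep

    return {record_id: find(record_id) for record_id in parent}


# Hypothetical pairs confirmed during the cleanup review.
confirmed = [("V-417", "V-001"), ("V-417", "V-530"), ("C-010", "C-002")]
crosswalk = build_crosswalk(confirmed)
print(crosswalk["V-530"])  # resolves through V-417 to V-001
```

Once built, the table is plain data. It can be stored, diffed, and audited like any other configuration, with no model involved at lookup time.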

This meant the forward process incurred no AI inference costs, required no model retraining, and was fully transparent to the internal team. There were no black boxes in day-to-day operations.

We also treated the constraint of keeping all three systems in place as a design requirement, not a limitation. The integration layer was built around those systems as they were, which meant no disruptive migration, no retraining staff on new tools, and a much shorter path to a working solution.

Results

Clean records, sustainable process, clean handoff.

Hundreds of thousands of records deduplicated and reconciled across all three systems
Consistent master records established for banks, vendors, and clients
All three existing systems preserved, with no platform replacement or migration required
Forward process fully deterministic: no ongoing AI inference costs
Internal team able to maintain and extend the process independently after handoff

Consistent Records Across Three Legacy Enterprise Systems

Data problems don't fix themselves.