
Case Study

Consistent Records Across Three Legacy Enterprise Systems

Sector: Specialty Chemicals
Client Type: 160+-year-old subsidiary
Use Case: Master Data Governance
Engagement: AI + Deterministic Integration

100,000s — company records deduplicated and reconciled across three systems
Zero — ongoing AI costs once the project was done; the forward process runs on rules, not inference
3 systems — bespoke integration kept all three intact, with no platform replacement

Three systems, none of them in agreement.

A 160+-year-old chemical company owned by a major diversified holding company had accumulated three separate enterprise systems over years of growth and acquisition. Each system managed records across a different part of the business: banks, vendors, and clients.

The systems ran simultaneously, but they did not stay synchronized. Every transaction risked creating a duplicate. Worse, duplication was not only a cross-system problem. It happened within individual systems too. The result was fragmented records, inconsistent identifiers, and no reliable master list of anything.

That inconsistency carried real cost. Records duplicated, or double-processed through synchronization gaps, created liability: payments sent to the wrong entity, commitments tracked against the wrong account, decisions made on data nobody could fully trust.

The business needed:

  • All historical duplication cleaned up and reconciled
  • Consistent, deduplicated records established across banks, vendors, and clients
  • A durable process to prevent duplication from recurring
  • All of this without replacing or consolidating any of the three existing systems

Two distinct problems, two different tools.

Cleaning up the historical data and keeping future data clean are fundamentally different challenges. We treated them separately.

For the historical backlog, the data was too inconsistent for rules alone. Company names were spelled differently across systems. Fields did not map cleanly. Records had been entered by different people over decades with no shared standards. We used AI-based matching to identify likely duplicates and relationships across that noise, surfacing candidates for reconciliation that deterministic matching would have missed.
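The candidate-generation step can be sketched as follows. This is a minimal illustration, not the actual matching pipeline: `difflib` similarity stands in for the LLM-based matching described above, and the field names, suffix list, and threshold are all illustrative assumptions.

```python
from difflib import SequenceMatcher

def normalize(name: str) -> str:
    """Collapse case, punctuation, and common legal suffixes before comparing."""
    cleaned = "".join(ch for ch in name.lower() if ch.isalnum() or ch.isspace())
    for suffix in (" gmbh", " ag", " inc", " ltd", " llc"):
        cleaned = cleaned.removesuffix(suffix)
    return " ".join(cleaned.split())

def duplicate_candidates(records, threshold=0.85):
    """Pairwise pass that surfaces likely duplicates for human review."""
    candidates = []
    for i, a in enumerate(records):
        for b in records[i + 1:]:
            score = SequenceMatcher(
                None, normalize(a["name"]), normalize(b["name"])
            ).ratio()
            if score >= threshold:
                # Surface the pair rather than merging automatically.
                candidates.append((a["id"], b["id"], round(score, 2)))
    return candidates
```

In practice a pairwise pass over hundreds of thousands of records would need blocking (grouping by country, first token, etc.) to stay tractable; the point here is only the shape of normalize-then-score candidate surfacing.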

Once the legacy data was clean, we switched approaches entirely. The forward-looking process was built to be fully deterministic: explicit rules, no model inference, no ongoing costs. The client could understand it, maintain it, and own it without us.
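The forward-looking guard can be pictured as a deterministic admission rule: every incoming record is reduced to a canonical key, and a record is only created if that key is unseen. A minimal sketch, with hypothetical field names (`tax_id`, `country`) standing in for whatever identifying fields the real rules used:

```python
def canonical_key(record: dict) -> tuple:
    """Deterministic identity key: the same inputs always yield the same key."""
    name = " ".join(record["name"].lower().split())
    # Prefer a hard identifier when present; fall back to the normalized name.
    return (record["country"], record.get("tax_id") or name)

class MasterIndex:
    """Rule-based guard: a record is admitted only if its key is unseen."""

    def __init__(self):
        self._keys = {}  # canonical key -> master record id

    def admit(self, record: dict):
        key = canonical_key(record)
        if key in self._keys:
            # Route the transaction to the existing master record
            # instead of creating a duplicate.
            return ("duplicate_of", self._keys[key])
        self._keys[key] = record["id"]
        return ("created", record["id"])
```

Because there is no model in the loop, the same input always produces the same decision, and the internal team can read the rule, test it, and extend it.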

The integration itself was bespoke: written specifically for the three systems as they existed, preserving their structure and operations while adding a reconciliation layer that kept records synchronized going forward.
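One way to picture such a reconciliation layer is a mapping from each system's local identifiers to a shared master ID, kept outside the three systems so none of them has to change. A hypothetical sketch (system names and ID formats are invented for illustration):

```python
class CrossSystemMap:
    """Maps each system-local ID to a shared master ID,
    without modifying the source systems themselves."""

    def __init__(self):
        self._to_master = {}  # (system, local_id) -> master_id
        self._members = {}    # master_id -> set of (system, local_id)

    def link(self, system: str, local_id: str, master_id: str):
        self._to_master[(system, local_id)] = master_id
        self._members.setdefault(master_id, set()).add((system, local_id))

    def master_of(self, system: str, local_id: str):
        return self._to_master.get((system, local_id))

    def siblings(self, system: str, local_id: str):
        """All records across the systems that refer to the same entity."""
        master = self.master_of(system, local_id)
        if master is None:
            return set()
        return self._members[master] - {(system, local_id)}
```

The layer answers "which records in the other two systems are this same company?" without requiring any system to adopt another's identifiers.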

The right tool for each phase, not one tool for everything.

A common mistake in data cleanup projects is reaching for AI everywhere, including in places where deterministic logic is cheaper, faster, and more auditable. We used AI where the data was genuinely ambiguous and human judgment would have been needed otherwise. Once that ambiguity was resolved, we encoded the results as rules and handed control back to the client.

This meant the forward process cost nothing to run, required no model retraining, and was fully transparent to the internal team. There were no black boxes in day-to-day operations.

The constraint of keeping all three systems in place was also treated as a design requirement, not a limitation. The integration layer was built around those systems as they were, which meant no disruptive migration, no re-training staff on new tools, and a much shorter path to a working solution.

Clean records, sustainable process, clean handoff.

  • Hundreds of thousands of records deduplicated and reconciled across all three systems
  • Consistent master records established for banks, vendors, and clients
  • All three existing systems preserved, with no platform replacement or migration required
  • Forward process fully deterministic: no ongoing AI inference costs
  • Internal team able to maintain and extend the process independently after handoff

Tech Stack

Python
LLM-based entity matching
Deterministic deduplication rules
Cross-system record reconciliation
Bespoke integration layer

Key Outcomes

100,000s of records cleaned across 3 systems
Consistent master records for banks, vendors, clients
No system replacement required
Forward process is fully deterministic and maintainable
Zero ongoing AI inference costs after handoff

Work With Us

Data problems don't fix themselves.

We build integrations and data systems designed to work within your existing infrastructure, then hand them off clean.
