Case Study
The Problem
A chemical company more than 160 years old, owned by a major diversified holding company, had accumulated three separate enterprise systems over years of growth and acquisition. Each system managed records for a different part of the business: banks, vendors, and clients.
The systems ran in parallel, but they did not stay synchronized. Every transaction risked creating a duplicate. Worse, duplication was not only a cross-system problem: it happened within individual systems too. The result was fragmented records, inconsistent identifiers, and no reliable master list of anything.
That inconsistency carried real cost. Records lost to duplication or double-processed through synchronization gaps created liability: payments sent to the wrong entity, commitments tracked against the wrong account, decisions made on data nobody could fully trust.
The business needed:
The Approach
Cleaning up the historical data and keeping future data clean are fundamentally different challenges. We treated them separately.
For the historical backlog, the data was too inconsistent for rules alone. Company names were spelled differently across systems. Fields did not map cleanly. Records had been entered by different people over decades with no shared standards. We used AI-based matching to identify likely duplicates and relationships across that noise, surfacing candidates for reconciliation that deterministic matching would have missed.
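To make the idea of surfacing candidates concrete, here is a minimal sketch. It does not reproduce the matching model used on the project, which is not described here; it swaps in plain string similarity from Python's standard library as a stand-in. The record IDs, company names, and the 0.85 threshold are all hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations


def normalize(name: str) -> str:
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in name.lower())
    return " ".join(cleaned.split())


def candidate_pairs(records, threshold=0.85):
    """Return (id_a, id_b, score) for every pair of records whose names look alike.

    The output is a list of candidates for human review, not a list of
    confirmed duplicates. The threshold is an assumed value for illustration.
    """
    candidates = []
    for (id_a, name_a), (id_b, name_b) in combinations(records, 2):
        score = SequenceMatcher(None, normalize(name_a), normalize(name_b)).ratio()
        if score >= threshold:
            candidates.append((id_a, id_b, round(score, 3)))
    # Strongest matches first, so reviewers see the likeliest duplicates early.
    return sorted(candidates, key=lambda c: c[2], reverse=True)


# Hypothetical records drawn from the three systems.
records = [
    ("bank-001", "Acme Chemical Supply, Inc."),
    ("vendor-417", "ACME Chemical Supply Inc"),
    ("client-093", "Borealis Coatings Ltd."),
    ("client-208", "Borealis Coating Ltd"),
]

for id_a, id_b, score in candidate_pairs(records):
    print(id_a, id_b, score)
```

Pairwise comparison like this grows quadratically with the number of records, so a real backlog would likely need blocking or indexing first. The point of the sketch is only the shape of the output: scored candidate pairs handed to reconciliation rather than automatic merges.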
Once the legacy data was clean, we switched approaches entirely. The forward-looking process was built to be fully deterministic: explicit rules, no model inference, no ongoing costs. The client could understand it, maintain it, and own it without us.
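A deterministic forward process of this kind might look like the sketch below. It is an illustration under stated assumptions, not the rules delivered to the client: the `tax_id` field, the key definition, and the `MasterRegistry` class are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field


def match_key(name: str, tax_id: str) -> tuple:
    """Build an exact-match key from a normalized name and a digits-only tax ID.

    Which fields form the key is an assumption made for this example.
    """
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in name.lower())
    norm_name = " ".join(cleaned.split())
    norm_tax = "".join(ch for ch in tax_id if ch.isdigit())
    return (norm_name, norm_tax)


@dataclass
class MasterRegistry:
    """Maps each match key to one master ID. No scoring, no model inference."""

    by_key: dict = field(default_factory=dict)
    next_id: int = 1

    def resolve(self, name: str, tax_id: str) -> tuple:
        """Return (master_id, created). The same input always gives the same answer."""
        key = match_key(name, tax_id)
        if key in self.by_key:
            # Known entity: link to the existing master record instead of duplicating.
            return self.by_key[key], False
        master_id = f"M{self.next_id:06d}"
        self.next_id += 1
        self.by_key[key] = master_id
        return master_id, True


registry = MasterRegistry()
first = registry.resolve("Acme Chemical Supply, Inc.", "12-3456789")
second = registry.resolve("ACME CHEMICAL SUPPLY INC", "123456789")
print(first, second)
```

Because every decision is a dictionary lookup on an explicit key, the behavior can be read, audited, and changed by an internal team without any model in the loop.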
The integration itself was bespoke: written specifically for the three systems as they existed, preserving their structure and operations while adding a reconciliation layer that kept records synchronized going forward.
Why It Worked
A common mistake in data cleanup projects is reaching for AI everywhere, including in places where deterministic logic is cheaper, faster, and more auditable. We used AI where the data was genuinely ambiguous and human judgment would have been needed otherwise. Once that ambiguity was resolved, we encoded the results as rules and handed control back to the client.
This meant the forward process incurred no model usage costs, required no model retraining, and was fully transparent to the internal team. There were no black boxes in day-to-day operations.
The constraint of keeping all three systems in place was treated as a design requirement, not a limitation. The integration layer was built around those systems as they were, which meant no disruptive migration, no retraining of staff on new tools, and a much shorter path to a working solution.
Results
Work With Us
We build integrations and data systems designed to work within your existing infrastructure, then hand them off clean.