
Here is a conversation I have had at least thirty times in the past two years: A CTO shows me their shiny new AI initiative, and within five minutes we hit the same wall. "We would love to use this for predictive maintenance, but our production data is locked in a system from 2008." Or: "Our AVEVA PI historian has decades of sensor data, but our data science team cannot access it."
The Hidden Cost of Legacy Systems

Legacy systems cost enterprises an average of nearly $40,000 annually in maintenance alone, and they can consume up to 80% of IT budgets. Source: Industry Research
Yet here is the uncomfortable truth most vendors will not tell you: ripping and replacing that 15-year-old system is almost never the right answer.
During my years working across manufacturing, mining, and utilities at organisations like BHP, Rio Tinto, and Schneider Electric, I spent considerable time connecting legacy systems to modern platforms. What I learned is that the integration problem is not a technology problem. It is a pattern recognition problem. Once you understand the five core integration patterns, you can connect almost anything to anything.
This guide is for IT managers and CTOs who are tired of hearing "you need to modernise first" and want practical, battle-tested approaches to legacy system integration that actually ship.
Before we dive into the technical patterns, let me explain why I am so adamant about integration over replacement.
The Westpac Banking Group is currently working to retire 120 legacy systems. That is not a typo. This is a multi-year, capital-intensive effort that ties up enormous resources. For mid-market Australian businesses with IT teams of five to twenty people, that approach simply does not scale.
What works instead is adding a communication layer. The key insight: your legacy system is not the problem; the missing communication layer is. Add that layer, and suddenly your 20-year-old ERP becomes a data source for your AI models.
Based on industry experience and documented case studies, five core integration patterns consistently work. In order of preference (cleanest to messiest), they are:

1. API wrappers
2. Direct database connections
3. Middleware and message brokers
4. File-based integration
5. Screen scraping (RPA)
Let me walk you through each one with real implementation details.
Pattern 1: API Wrappers

If your legacy system exposes any kind of interface, even SOAP, even COM objects, even a command-line interface, you can wrap it with a modern REST or GraphQL API.
How It Works:
You build a thin service layer that translates modern API calls into whatever native protocol your legacy system speaks. This wrapper handles authentication, error handling, and data transformation.
```python
# Example: wrapping a legacy SOAP service with a REST endpoint
from flask import Flask, jsonify
import zeep

app = Flask(__name__)
legacy_client = zeep.Client('http://legacy-system:8080/service?wsdl')

@app.route('/api/v1/orders/<order_id>', methods=['GET'])
def get_order(order_id):
    # Translate the REST call into the legacy system's SOAP operation
    legacy_response = legacy_client.service.GetOrder(OrderID=order_id)
    # Transform the SOAP response into a modern JSON shape
    return jsonify({
        'order_id': legacy_response.OrderID,
        'customer': legacy_response.CustomerName,
        'status': legacy_response.Status,
        'created_at': str(legacy_response.CreateDate),
    })
```
AVEVA PI System Example:
AVEVA (formerly OSIsoft) provides excellent developer technologies for this exact purpose. Their PI Web API offers RESTful access to time-series data, and the PI SQL Framework lets you query historian data with standard SQL.
Consider a manufacturing plant with 15 years of sensor data locked in PI. By building an API wrapper that exposes temperature, pressure, and flow data as REST endpoints, a data science team can go from "we cannot access the data" to "we have a working predictive maintenance model" in six weeks. This is a common pattern I saw repeatedly during my time at Schneider Electric.
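On the consuming side, PI Web API lookups follow a two-step shape: resolve a tag path to a WebId, then pull recorded values for that WebId. Here is a minimal sketch using only the standard library; the host name and tag are hypothetical, and the endpoint shapes follow AVEVA's PI Web API conventions.

```python
import json
import urllib.request

PI_BASE = "https://pi-server.example.com/piwebapi"  # hypothetical host

def point_lookup_url(base, server, tag):
    """URL that resolves a PI tag path to its WebId (points controller)."""
    return f"{base}/points?path=\\\\{server}\\{tag}"

def recorded_url(base, web_id, start, end):
    """URL for archived values in a time range (streams/recorded)."""
    return f"{base}/streams/{web_id}/recorded?startTime={start}&endTime={end}"

def parse_recorded(payload):
    """PI Web API wraps results in an 'Items' array of Timestamp/Value pairs."""
    return [(item["Timestamp"], item["Value"]) for item in payload.get("Items", [])]

def fetch_recorded(server, tag, start, end):
    """Resolve the tag, then fetch and flatten its recorded values."""
    with urllib.request.urlopen(point_lookup_url(PI_BASE, server, tag)) as resp:
        web_id = json.load(resp)["WebId"]
    with urllib.request.urlopen(recorded_url(PI_BASE, web_id, start, end)) as resp:
        return parse_recorded(json.load(resp))
```

In production you would add authentication (PI Web API supports Kerberos and basic auth) and certificate validation; the point here is how little code sits between the historian and a data science notebook.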
When to Use:

- The legacy system exposes any programmatic interface: SOAP, COM, XML-RPC, or even a command-line tool you can shell out to.
- Multiple modern consumers need the same legacy data and you want one governed access point.
Cost Estimate: $15,000 to $50,000 AUD for a well-architected wrapper service, depending on complexity.
Pattern 2: Direct Database Connections

Many legacy systems, even those without APIs, sit on top of relational databases. If you can connect to that database, you have a direct line to the data.
How It Works:
Use ODBC or JDBC drivers to establish read access to the legacy database. Build views or stored procedures that expose the specific data sets you need, then connect your AI pipeline directly.
```sql
-- Create a view that exposes legacy inventory data
CREATE VIEW ai_inventory_feed AS
SELECT
    p.product_id,
    p.product_name,
    i.quantity_on_hand,
    i.reorder_point,
    i.last_updated
FROM legacy_products p
JOIN legacy_inventory i ON p.product_id = i.product_id
WHERE i.last_updated > DATEADD(day, -1, GETDATE());
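In production the integration service would connect over pyodbc or JDBC; as a self-contained illustration of the read-only view pattern, here is the same idea against an in-memory SQLite database (the freshness filter is omitted because SQLite has no DATEADD; table and column names follow the view above).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE legacy_products (product_id INTEGER, product_name TEXT);
    CREATE TABLE legacy_inventory (
        product_id INTEGER, quantity_on_hand INTEGER,
        reorder_point INTEGER, last_updated TEXT);

    -- Expose only the columns the AI pipeline needs, nothing else
    CREATE VIEW ai_inventory_feed AS
    SELECT p.product_id, p.product_name, i.quantity_on_hand,
           i.reorder_point, i.last_updated
    FROM legacy_products p
    JOIN legacy_inventory i ON p.product_id = i.product_id;
""")
conn.execute("INSERT INTO legacy_products VALUES (1, 'Valve')")
conn.execute("INSERT INTO legacy_inventory VALUES (1, 40, 25, '2024-06-01')")

# The integration service reads the view, never the base tables
rows = conn.execute("SELECT * FROM ai_inventory_feed").fetchall()
```

The design choice that matters: the AI pipeline's access account sees only the view, so schema churn in the base tables stays contained behind it.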
Critical Security Considerations:

This approach requires robust security controls. In my experience, you need:

- A dedicated, read-only database account scoped to the views you expose, never the base tables.
- Network-level restrictions so only the integration service can reach the database port.
- Encryption in transit between the integration service and the database.
- Audit logging of every query the integration account runs.
Despite being over 30 years old, ODBC remains essential in enterprise environments. It provides standardised database connectivity across diverse systems, enabling applications to communicate with various databases without specialised code.
When to Use:

- The legacy application has no API, but its relational database is reachable and its schema is documented, or at least stable.
- Read access is sufficient; writing back through the database risks bypassing the application's business logic.
Cost Estimate: $5,000 to $20,000 AUD for secure database integration setup, plus ongoing monitoring.
Pattern 3: Middleware and Message Brokers

When you need to orchestrate data flow between multiple legacy systems, or when legacy systems need to communicate asynchronously, middleware is your friend.
How It Works:
Deploy a message broker (RabbitMQ, Apache Kafka, or AWS SQS) as an intermediary. Legacy systems publish events to the broker, and modern systems subscribe to those events.
```javascript
// Modern AI service subscribing to legacy events
const amqp = require('amqplib');

async function connectToLegacyEvents() {
    const connection = await amqp.connect('amqp://broker');
    const channel = await connection.createChannel();

    // Subscribe to inventory changes published by the legacy ERP
    await channel.assertQueue('legacy.inventory.changes');
    channel.consume('legacy.inventory.changes', (msg) => {
        const inventoryUpdate = JSON.parse(msg.content.toString());
        // Feed the update to the AI demand forecasting model
        feedToAIModel(inventoryUpdate);
        channel.ack(msg);
    });
}
```
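One caveat on the consumer above: it trusts whatever JSON arrives on the queue, and legacy producers emit malformed payloads more often than you would hope. A thin validation layer between broker and model keeps bad messages out of the pipeline. A minimal sketch in Python; the required field names are assumptions matching the inventory example.

```python
import json

# Fields the forecasting model needs (assumed names for this sketch)
REQUIRED_FIELDS = {"product_id", "quantity_on_hand", "last_updated"}

def parse_inventory_event(raw_bytes):
    """Validate a legacy inventory message before it reaches the model.

    Returns the parsed dict, or None for malformed payloads so the
    consumer can dead-letter them instead of crashing.
    """
    try:
        event = json.loads(raw_bytes.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        return None
    if not isinstance(event, dict) or not REQUIRED_FIELDS.issubset(event):
        return None
    return event
```

Returning None rather than raising lets the consumer acknowledge and dead-letter bad messages, so one corrupt export does not stall the whole queue.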
Middleware Options That Work Well:

- RabbitMQ: mature, flexible routing, modest operational footprint.
- Apache Kafka: high-throughput event streaming with durable replay.
- AWS SQS: fully managed with minimal operations, a natural fit if you are already on AWS.
According to integration experts, middleware excels in ecosystems where multiple legacy systems require seamless integration, when transforming data between formats is a regular necessity, or when operating on specifically tailored legacy systems that need to retain unique protocols.
When to Use:

- Multiple legacy and modern systems need to exchange data, not just one pair.
- Asynchronous delivery is acceptable or desirable, and you want producers and consumers decoupled.
Cost Estimate: $30,000 to $100,000 AUD for enterprise middleware implementation, less for open-source self-managed.
Pattern 4: File-Based Integration

This is more common than you might think. Many legacy systems, especially older manufacturing and logistics software, export data as flat files. Embrace it.
How It Works:
The legacy system runs scheduled exports (nightly, hourly, or triggered) to CSV, XML, or JSON files. An integration service monitors a file location (local directory, FTP, or SFTP server), picks up new files, transforms the data, and routes it to your AI systems.
```python
# File-based integration for a legacy inventory system
import pandas as pd
from watchdog.observers import Observer
from watchdog.events import FileSystemEventHandler

class LegacyFileHandler(FileSystemEventHandler):
    def on_created(self, event):
        if event.src_path.endswith('.csv'):
            # Parse the legacy CSV format (older exports are often Latin-1)
            df = pd.read_csv(event.src_path, encoding='latin-1')
            # Transform to the modern schema
            transformed = transform_legacy_format(df)
            # Push to the AI pipeline
            push_to_ai_pipeline(transformed)
            # Archive the processed file so it is not handled twice
            archive_file(event.src_path)

# Monitor the legacy export directory
observer = Observer()
observer.schedule(LegacyFileHandler(), '/mnt/legacy-exports/')
observer.start()
```
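The handler above leaves transform_legacy_format abstract. As a sketch of what such a transform typically does, here is one possible implementation; the legacy column names (PROD_CD, QTY, UPD_DT) and date format are hypothetical stand-ins for whatever your export actually contains.

```python
import pandas as pd

# Hypothetical mapping from legacy export columns to the modern schema
COLUMN_MAP = {
    "PROD_CD": "product_id",
    "QTY": "quantity_on_hand",
    "UPD_DT": "last_updated",
}

def transform_legacy_format(df):
    """Rename legacy columns, normalise dates, drop everything else."""
    out = df.rename(columns=COLUMN_MAP)
    # Assume the legacy export stores dates as DD/MM/YYYY strings; emit ISO
    out["last_updated"] = (
        pd.to_datetime(out["last_updated"], dayfirst=True).dt.strftime("%Y-%m-%d")
    )
    # Keep only the columns the pipeline needs
    return out[list(COLUMN_MAP.values())]
```

Most of the real effort in this pattern lives here: every quirk of the legacy export (encodings, date formats, padded codes) gets absorbed in one place instead of leaking into the AI pipeline.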
Real-World Pattern:
Consider a construction firm whose job costing system predates APIs entirely. But it can export daily CSV reports. A Python service that watches the SFTP server, parses the exports, and feeds the data into a cash flow forecasting model can be built in about two weeks. Total cost: typically under $10,000 AUD.
When to Use:

- The legacy system already produces scheduled exports, or can be configured to, and batch latency of minutes to hours is acceptable.
- You need something working quickly with minimal change to the legacy side.
Cost Estimate: $5,000 to $25,000 AUD depending on transformation complexity.
Pattern 5: Screen Scraping (RPA)

I want to be direct here: screen scraping should be your last resort. It is brittle, maintenance-intensive, and breaks whenever the underlying application changes. But sometimes it is the only option.
How It Works:
Robotic Process Automation (RPA) tools interact with the legacy application's user interface, reading data from screens and entering commands as a human would.
When Screen Scraping Makes Sense:
Danske Bank deployed 250 UiPath bots to handle compliance, loans, and KYC activities on systems built with COBOL. The result: 60,000 annual employee hours saved, and data accuracy improved to 90%.
But Danske Bank is a large enterprise with dedicated RPA teams. For mid-market Australian businesses, I only recommend screen scraping when:

- No API, database access, or file export exists, so the first four patterns are ruled out.
- The application's UI is stable and changes rarely.
- The process volume justifies the ongoing maintenance cost.
Technical Reality:
Any change in the underlying technology stack, from OS to web browser to the actual application, will break the integration. Budget for ongoing maintenance.
Cost Estimate: $20,000 to $80,000 AUD initial implementation, plus $10,000 to $30,000 AUD annual maintenance.
Here is the decision framework I use with clients:
| Question | If Yes | If No |
|---|---|---|
| Does the system have any API (REST, SOAP, GraphQL)? | Build an API wrapper | Continue |
| Can you access the underlying database? | Use database direct connection | Continue |
| Does the system support file exports? | Use file-based integration | Continue |
| Is there a message queue or event system? | Use middleware | Continue |
| Is the UI stable and predictable? | Consider screen scraping | Accept manual process |
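The decision table above can be encoded as a short function, which is handy when auditing a portfolio of systems rather than one. A sketch; each flag answers one question from the table, checked top to bottom.

```python
def choose_pattern(has_api=False, has_db_access=False, has_file_export=False,
                   has_message_queue=False, ui_is_stable=False):
    """Walk the decision framework top to bottom, returning the first match."""
    if has_api:
        return "API wrapper"
    if has_db_access:
        return "Database direct connection"
    if has_file_export:
        return "File-based integration"
    if has_message_queue:
        return "Middleware"
    if ui_is_stable:
        return "Screen scraping"
    return "Accept manual process"
```

The ordering is the point: the function never recommends a messier pattern while a cleaner one is available.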
The Hybrid Approach:

In practice, most enterprise integrations use multiple patterns. A typical manufacturing integration might use:

- An API wrapper over the PI historian for time-series sensor data.
- A direct database connection for ERP master data.
- File-based integration for older systems that can only export CSVs.
Here is the roadmap that consistently works for legacy integration projects:

1. Audit every legacy system and catalogue the interfaces it exposes: API, database, file export, message queue, or only a UI.
2. Pick the quickest win first and prove value before tackling the hardest system.
3. Build the first integration with security and monitoring in place from day one.
4. Expand system by system, reusing the patterns and tooling from that first win.
For Australian CTOs, legacy integration comes with additional considerations:
Privacy Act 1988 Compliance: When integrating legacy systems containing personal information, ensure your integration layer maintains the same (or better) security controls as the source system. Data in transit must be encrypted.
Cloud Residency: If your AI platform runs in the cloud, verify that data remains in Australian regions. AWS Sydney (ap-southeast-2) and Azure Australia East both support major AI services.
Industry-Specific Requirements: Sectors such as utilities and mining may carry additional obligations, for example those applying to operators of critical infrastructure, so confirm with your compliance team before exposing operational data to new systems.
Based on current Australian market rates, here is what you should budget for legacy integration:
| Integration Approach | Initial Cost (AUD) | Annual Maintenance (AUD) |
|---|---|---|
| API Wrapper | $15,000 - $50,000 | $5,000 - $15,000 |
| Database Connection | $5,000 - $20,000 | $2,000 - $8,000 |
| File-Based Integration | $5,000 - $25,000 | $3,000 - $10,000 |
| Middleware Platform | $30,000 - $100,000 | $15,000 - $40,000 |
| Screen Scraping (RPA) | $20,000 - $80,000 | $10,000 - $30,000 |
ROI Calculation: Integration vs Replacement
Full System Replacement
- Cost: $150,000 - $500,000+
- Timeline: 12-24 months
- Risk: High (disruption, data migration)
Integration Approach
- Cost: $15,000 - $100,000
- Timeline: 2-4 months
- Risk: Low (legacy system continues working)
Savings: $100,000 - $400,000+
Time to value: 6-12 months faster
For mid-market systems, integration usually delivers ROI within 6 to 12 months.
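That payback claim is easy to sanity-check yourself: payback in months is the upfront cost divided by the net monthly benefit. The figures below are illustrative assumptions, not results from any specific project.

```python
def payback_months(initial_cost, monthly_benefit, monthly_maintenance):
    """Months until cumulative net benefit covers the upfront spend."""
    net = monthly_benefit - monthly_maintenance
    if net <= 0:
        raise ValueError("integration never pays back at these rates")
    return initial_cost / net

# Illustrative only: a $40,000 wrapper saving $6,000/month with $1,000/month upkeep
months = payback_months(40_000, 6_000, 1_000)  # 8.0 months
```

Run your own numbers before committing; the maintenance line is the one most teams underestimate.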
Here are the mistakes that repeatedly cause integration projects to fail:
1. Starting with the hardest system. Pick your quickest win first. Build organisational confidence before tackling the 30-year-old mainframe.
2. Underestimating data quality. Legacy systems often have data quality issues you do not discover until you start extracting. Budget time for data cleansing.
3. Ignoring security. That legacy database has been protected by obscurity for years. Once you expose it via an API, it needs real security controls.
4. No monitoring. Legacy systems fail silently. Implement health checks and alerting from day one.
5. Going it alone. Integration requires knowledge of both the legacy system and modern architectures. If your team lacks legacy expertise, bring in specialists for the initial implementation.
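Of these, the monitoring mistake is the cheapest to fix. Even a simple data-freshness check catches silent failures: if the newest record in an integration feed is older than a threshold, something upstream has stopped. A minimal sketch; the 24-hour threshold is an assumption you would tune per feed.

```python
from datetime import datetime, timedelta, timezone

def feed_is_stale(last_updated, max_age=timedelta(hours=24), now=None):
    """True if the feed's newest record is older than max_age.

    `last_updated` is the timestamp of the most recent record the
    integration delivered; `now` is injectable for testing.
    """
    now = now or datetime.now(timezone.utc)
    return (now - last_updated) > max_age
```

Wire the result to whatever alerting you already run; the check itself is trivial, which is exactly why there is no excuse to skip it.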
When legacy integration is done right, your AI initiatives unlock value that was previously impossible:
| Before Integration | After Integration | Business Impact |
|---|---|---|
| Sensor data locked in 20-year-old PLCs | Predictive maintenance models active | 30-50% reduction in unplanned downtime |
| ERP history inaccessible | Demand forecasting trained on decade of data | 20-30% better inventory planning |
| Manual data extraction from legacy formats | Automated document processing | 80% time savings |
| Systems never talked to each other | Real-time unified dashboards | Single source of truth |
Market Opportunity

The Australian system integration market is projected to reach USD 18.07 billion by 2030, growing at 14.18% annually. This growth is driven by organisations finding ways to extract value from existing investments rather than constantly replacing them. Source: Mordor Intelligence
If you are sitting on legacy systems that are blocking your AI ambitions, here is what I recommend: inventory those systems and the interfaces each one exposes, run them through the decision framework above, and start with the single quickest win.
The dirty secret of digital transformation is that most of it is plumbing. Getting data from point A to point B reliably, securely, and efficiently. That is not glamorous work, but it is the work that makes AI initiatives actually succeed.
Ready to unlock your legacy data? Book a Technical Consultation to discuss your integration challenges and explore the best approach for your systems.
Sources: Research synthesized from MuleSoft Connectivity Benchmark Report, AVEVA PI System Documentation, iTnews Australia, Prismatic API Integration Guide, Moesif API Integration Guide, ServiceNow Legacy Systems Cost Analysis, and Mordor Intelligence Australia System Integration Market Report.