    Technical Strategy

    Legacy System Integration: How to Connect Old Software to Modern AI (Without Ripping and Replacing)

Jan 3, 2026 · By Solve8 Team · 12 min read


    The $40,000 Problem Nobody Wants to Talk About

    Here is a conversation I have had at least thirty times in the past two years: A CTO shows me their shiny new AI initiative, and within five minutes we hit the same wall. "We would love to use this for predictive maintenance, but our production data is locked in a system from 2008." Or: "Our AVEVA PI historian has decades of sensor data, but our data science team cannot access it."

The Hidden Cost of Legacy Systems: legacy systems cost enterprises an average of nearly $40,000 annually in maintenance alone, and they consume up to 80% of IT budgets. (Source: Industry Research)

    Yet here is the uncomfortable truth most vendors will not tell you: ripping and replacing that 15-year-old system is almost never the right answer.

    During my years working across manufacturing, mining, and utilities at organisations like BHP, Rio Tinto, and Schneider Electric, I spent considerable time connecting legacy systems to modern platforms. What I learned is that the integration problem is not a technology problem. It is a pattern recognition problem. Once you understand the five core integration patterns, you can connect almost anything to anything.

    This guide is for IT managers and CTOs who are tired of hearing "you need to modernise first" and want practical, battle-tested approaches to legacy system integration that actually ship.


    Why Rip-and-Replace Almost Never Works

    Before we dive into the technical patterns, let me explain why I am so adamant about integration over replacement.

    The Westpac banking group is currently working to retire 120 legacy systems. That is not a typo. This is a multi-year, capital-intensive effort that ties up enormous resources. For mid-market Australian businesses with IT teams of five to twenty people, that approach simply does not scale.

    Here is what works instead:

    • Bendigo and Adelaide Bank used AI-powered modernisation to cut developer migration time by 90%, reducing testing from 80 hours to 5 minutes
    • Northern Star Resources (Australian mining company) used AVEVA PI's AF SDK to automate data collection from multiple legacy systems, eliminating manual Excel transfers entirely
    • Logistics firms that integrate their legacy freight systems rather than replacing them commonly see 30-40% processing time reductions

    The key insight: your legacy system is not the problem. The lack of a communication layer is the problem. Add that layer, and suddenly your 20-year-old ERP becomes a data source for your AI models.


    The Five Integration Patterns That Actually Work

    Based on industry experience and documented case studies, five core integration patterns consistently work. In order of preference (cleanest to messiest), they are:

The Five Legacy Integration Patterns

Choose based on your legacy system's capabilities, in order of preference:

• Has any interface (SOAP, COM, CLI) → API Wrapper ($15-50K): real-time, modern, scalable
• Has an accessible database → Database Direct ($5-20K): direct access, batch OK
• Needs complex orchestration → Middleware ($30-100K): multiple-system coordination
• Only supports exports → File-Based ($5-25K): batch processing only
• No other option → Screen Scraping ($20-80K): last resort

    Let me walk you through each one with real implementation details.

    Pattern 1: API Wrappers - The Gold Standard

    If your legacy system exposes any kind of interface, even SOAP, even COM objects, even a command-line interface, you can wrap it with a modern REST or GraphQL API.

    How It Works:

    You build a thin service layer that translates modern API calls into whatever native protocol your legacy system speaks. This wrapper handles authentication, error handling, and data transformation.

# Example: wrapping a legacy SOAP service with a modern REST API
from flask import Flask, jsonify
import zeep

app = Flask(__name__)
legacy_client = zeep.Client('http://legacy-system:8080/service?wsdl')

@app.route('/api/v1/orders/<order_id>', methods=['GET'])
def get_order(order_id):
    # Translate the REST call into the legacy system's SOAP operation
    try:
        legacy_response = legacy_client.service.GetOrder(OrderID=order_id)
    except zeep.exceptions.Fault as fault:
        # Surface legacy faults as a gateway error rather than crashing
        return jsonify({'error': str(fault)}), 502

    # Transform the SOAP response into a modern JSON format
    return jsonify({
        'order_id': legacy_response.OrderID,
        'customer': legacy_response.CustomerName,
        'status': legacy_response.Status,
        'created_at': str(legacy_response.CreateDate)
    })
    

    AVEVA PI System Example:

    AVEVA (formerly OSIsoft) provides excellent developer technologies for this exact purpose. Their PI Web API offers RESTful access to time-series data, and the PI SQL Framework lets you query historian data with standard SQL.

    Consider a manufacturing plant with 15 years of sensor data locked in PI. By building an API wrapper that exposes temperature, pressure, and flow data as REST endpoints, a data science team can go from "we cannot access the data" to "we have a working predictive maintenance model" in six weeks. This is a common pattern I saw repeatedly during my time at Schneider Electric.
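To make the PI Web API pattern concrete, here is a minimal sketch of pulling recorded values into plain Python structures. The server URL and WebId are placeholders, and the hardcoded payload is a trimmed example of the JSON shape the `recorded` endpoint returns; substitute your own server and authentication.

```python
# Sketch: querying recorded values from the PI Web API.
# The base URL and WebId below are placeholders, not real identifiers.
import json
from urllib.parse import urlencode

def recorded_values_url(base_url, web_id, start, end):
    """Build a PI Web API 'recorded' stream query URL."""
    params = urlencode({'startTime': start, 'endTime': end})
    return f"{base_url}/piwebapi/streams/{web_id}/recorded?{params}"

def parse_pi_response(payload):
    """Flatten a PI Web API JSON response into (timestamp, value) pairs."""
    return [(item['Timestamp'], item['Value']) for item in payload['Items']]

url = recorded_values_url('https://pi-server.example.com', 'P0abcDEF', '*-7d', '*')

# A trimmed sample of the JSON shape the endpoint returns:
sample = json.loads('''{"Items": [
    {"Timestamp": "2026-01-02T00:00:00Z", "Value": 72.4},
    {"Timestamp": "2026-01-02T00:01:00Z", "Value": 72.9}
]}''')
readings = parse_pi_response(sample)
```

From here, `readings` can feed straight into a pandas DataFrame or a model training pipeline.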

    When to Use:

    • Legacy system has any programmable interface
    • You need real-time or near-real-time data access
    • Multiple modern systems need to consume the data

    Cost Estimate: $15,000 to $50,000 AUD for a well-architected wrapper service, depending on complexity.


    Pattern 2: Database Direct Connections

    Many legacy systems, even those without APIs, sit on top of relational databases. If you can connect to that database, you have a direct line to the data.

    How It Works:

    Use ODBC or JDBC drivers to establish read access to the legacy database. Build views or stored procedures that expose the specific data sets you need, then connect your AI pipeline directly.

    -- Create a view that exposes legacy inventory data
    CREATE VIEW ai_inventory_feed AS
    SELECT
        p.product_id,
        p.product_name,
        i.quantity_on_hand,
        i.reorder_point,
        i.last_updated
    FROM legacy_products p
    JOIN legacy_inventory i ON p.product_id = i.product_id
    WHERE i.last_updated > DATEADD(day, -1, GETDATE());
    

    Critical Security Considerations:

    This approach requires robust security controls. In my experience, you need:

    • Read-only database user with access only to specific tables or views
    • IP-based access control lists restricting which servers can connect
    • Connection encryption (TLS) for all database traffic
    • Audit logging on all queries

    Despite being over 30 years old, ODBC remains essential in enterprise environments. It provides standardised database connectivity across diverse systems, enabling applications to communicate with various databases without specialised code.
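The view-based, read-only extraction pattern looks the same regardless of driver. This sketch uses Python's built-in sqlite3 purely as a stand-in; in production you would connect with an ODBC driver (e.g. pyodbc) using the locked-down, read-only account described above, and the table names would be your own.

```python
# Sketch of the view-based, read-only extraction pattern.
# sqlite3 stands in for your ODBC/JDBC connection; the schema is illustrative.
import sqlite3

conn = sqlite3.connect(':memory:')
conn.executescript('''
    CREATE TABLE legacy_products (product_id INTEGER, product_name TEXT);
    CREATE TABLE legacy_inventory (product_id INTEGER, quantity_on_hand INTEGER);
    INSERT INTO legacy_products VALUES (1, 'Widget'), (2, 'Gasket');
    INSERT INTO legacy_inventory VALUES (1, 40), (2, 0);

    -- Expose only the columns the AI pipeline actually needs
    CREATE VIEW ai_inventory_feed AS
    SELECT p.product_id, p.product_name, i.quantity_on_hand
    FROM legacy_products p
    JOIN legacy_inventory i ON p.product_id = i.product_id;
''')

rows = conn.execute('SELECT * FROM ai_inventory_feed ORDER BY product_id').fetchall()
```

Granting the integration account access to the view only, never the base tables, keeps the blast radius small if credentials leak.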

    When to Use:

    • Legacy system uses a relational database you can access
    • You need batch or near-real-time data synchronisation
    • The legacy vendor will not (or cannot) provide API access

    Cost Estimate: $5,000 to $20,000 AUD for secure database integration setup, plus ongoing monitoring.


    Pattern 3: Middleware and Message Queues

    When you need to orchestrate data flow between multiple legacy systems, or when legacy systems need to communicate asynchronously, middleware is your friend.

    How It Works:

    Deploy a message broker (RabbitMQ, Apache Kafka, or AWS SQS) as an intermediary. Legacy systems publish events to the broker, and modern systems subscribe to those events.

// Modern AI service subscribing to legacy events via a message broker
const amqp = require('amqplib');

async function connectToLegacyEvents() {
    const connection = await amqp.connect('amqp://broker');
    const channel = await connection.createChannel();

    // Subscribe to inventory changes from the legacy ERP
    await channel.assertQueue('legacy.inventory.changes');

    channel.consume('legacy.inventory.changes', (msg) => {
        const inventoryUpdate = JSON.parse(msg.content.toString());

        // Feed the update to the AI demand forecasting model
        feedToAIModel(inventoryUpdate);

        // Acknowledge only after successful processing
        channel.ack(msg);
    });
}

connectToLegacyEvents().catch(console.error);
    
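The subscriber above assumes the legacy-side adapter publishes self-describing JSON events. A minimal Python sketch of that envelope follows; the field names are illustrative, not a standard, and the pika call in the comment is one way the publish step might look.

```python
# Sketch: the JSON event envelope a legacy-side adapter might publish.
# Field names are illustrative, not a standard.
import json
from datetime import datetime, timezone

def make_inventory_event(product_id, quantity, source='legacy-erp'):
    """Wrap a legacy inventory change in a self-describing event."""
    return {
        'event_type': 'inventory.changed',
        'source': source,
        'emitted_at': datetime.now(timezone.utc).isoformat(),
        'payload': {'product_id': product_id, 'quantity_on_hand': quantity},
    }

# In production this dict would be serialised and published, e.g. with pika:
#   channel.basic_publish(exchange='', routing_key='legacy.inventory.changes',
#                         body=json.dumps(event))
event = make_inventory_event(42, 17)
body = json.dumps(event)
```

Including `source` and `emitted_at` on every event makes debugging far easier once several legacy systems share one broker.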

    Middleware Options That Work Well:

    • MuleSoft: Enterprise-grade, excellent for complex transformations. Expensive but powerful.
    • Apache Camel: Open-source, Java-based. Great for teams with Java expertise.
    • n8n or Workato: Low-code options for simpler integrations.
    • Custom Node.js/Python services: When you need full control.

    According to integration experts, middleware excels in ecosystems where multiple legacy systems require seamless integration, when transforming data between formats is a regular necessity, or when operating on specifically tailored legacy systems that need to retain unique protocols.

    When to Use:

    • Multiple legacy systems need to share data
    • You need asynchronous, event-driven integration
    • Data transformations are complex

    Cost Estimate: $30,000 to $100,000 AUD for enterprise middleware implementation, less for open-source self-managed.


    Pattern 4: File-Based Integration

    This is more common than you might think. Many legacy systems, especially older manufacturing and logistics software, export data as flat files. Embrace it.

    How It Works:

    The legacy system runs scheduled exports (nightly, hourly, or triggered) to CSV, XML, or JSON files. An integration service monitors a file location (local directory, FTP, or SFTP server), picks up new files, transforms the data, and routes it to your AI systems.

# File-based integration for a legacy inventory system
import time
import pandas as pd
from watchdog.observers import Observer
from watchdog.events import FileSystemEventHandler

class LegacyFileHandler(FileSystemEventHandler):
    def on_created(self, event):
        if event.src_path.endswith('.csv'):
            # Parse the legacy CSV format (older exports are often Latin-1)
            df = pd.read_csv(event.src_path, encoding='latin-1')

            # Transform to the modern schema
            transformed = transform_legacy_format(df)

            # Push to the AI pipeline
            push_to_ai_pipeline(transformed)

            # Archive the processed file
            archive_file(event.src_path)

# Monitor the legacy export directory until interrupted
observer = Observer()
observer.schedule(LegacyFileHandler(), '/mnt/legacy-exports/')
observer.start()
try:
    while True:
        time.sleep(1)
except KeyboardInterrupt:
    observer.stop()
observer.join()
    

    Real-World Pattern:

    Consider a construction firm whose job costing system predates APIs entirely. But it can export daily CSV reports. A Python service that watches the SFTP server, parses the exports, and feeds the data into a cash flow forecasting model can be built in about two weeks. Total cost: typically under $10,000 AUD.
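The transform step in that service is usually just a mapping from the legacy export's column names onto your modern schema. Here is a sketch using the standard library's csv module; the legacy column names (JOBNO, COSTCODE, AMT) are hypothetical.

```python
# Sketch: mapping a legacy CSV export's columns onto a modern schema.
# The legacy column names here are hypothetical.
import csv
import io

LEGACY_TO_MODERN = {
    'JOBNO': 'job_id',
    'COSTCODE': 'cost_code',
    'AMT': 'amount',
}

def transform_legacy_rows(csv_text):
    """Rename legacy columns and coerce amounts to float."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        modern = {LEGACY_TO_MODERN[k]: v for k, v in row.items()
                  if k in LEGACY_TO_MODERN}
        modern['amount'] = float(modern['amount'])
        rows.append(modern)
    return rows

export = "JOBNO,COSTCODE,AMT\nJ-101,LAB,1250.50\nJ-102,MAT,880.00\n"
records = transform_legacy_rows(export)
```

Keeping the mapping in one dictionary means that when the legacy vendor renames a column, the fix is a one-line change.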

    When to Use:

    • Legacy system has no API but can export files
    • Batch processing (not real-time) is acceptable
    • Partners or vendors send data via file drops

    Cost Estimate: $5,000 to $25,000 AUD depending on transformation complexity.


    Pattern 5: Screen Scraping - The Last Resort

    I want to be direct here: screen scraping should be your last resort. It is brittle, maintenance-intensive, and breaks whenever the underlying application changes. But sometimes it is the only option.

    How It Works:

    Robotic Process Automation (RPA) tools interact with the legacy application's user interface, reading data from screens and entering commands as a human would.

    When Screen Scraping Makes Sense:

    Danske Bank deployed 250 UiPath bots to handle compliance, loans, and KYC activities on systems built with COBOL. The result: 60,000 annual employee hours saved, and data accuracy improved to 90%.

    But Danske Bank is a large enterprise with dedicated RPA teams. For mid-market Australian businesses, I only recommend screen scraping when:

    • No other integration method is possible
    • The ROI clearly justifies the ongoing maintenance burden
    • You have budget for RPA platform licensing (UiPath, Automation Anywhere)
    • The underlying application is stable and rarely updated

    Technical Reality:

    Any change in the underlying technology stack, from OS to web browser to the actual application, will break the integration. Budget for ongoing maintenance.

    Cost Estimate: $20,000 to $80,000 AUD initial implementation, plus $10,000 to $30,000 AUD annual maintenance.


    Choosing the Right Pattern: A Decision Framework

    Here is the decision framework I use with clients:

Legacy Integration Decision Framework

Work through these questions in order, based on what capabilities your legacy system has:

| Question | If Yes | If No |
| --- | --- | --- |
| Does the system have any API (REST, SOAP, GraphQL, COM, CLI)? | Build an API wrapper ($15K-50K, best option) | Continue |
| Can you access the underlying database? | Use a direct database connection ($5K-20K) | Continue |
| Does the system support file exports (CSV, XML, JSON)? | Use file-based integration ($5K-25K) | Continue |
| Is there a message queue or event system? | Use middleware ($30K-100K) | Continue |
| Is the UI stable and predictable? | Consider screen scraping ($20K-80K, last resort) | Accept a manual process |
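The framework is simple enough to express as a function, which is handy when auditing a portfolio of legacy systems. The capability flags mirror the questions above; this is a sketch, not a substitute for judgement on any individual system.

```python
# The decision framework above, expressed as a function.
# The capability flags mirror the questions in the framework.
def choose_pattern(has_api=False, has_db_access=False, can_export_files=False,
                   has_message_queue=False, ui_stable=False):
    """Walk the decision questions top to bottom; return the first match."""
    if has_api:
        return 'API wrapper'
    if has_db_access:
        return 'Database direct'
    if can_export_files:
        return 'File-based integration'
    if has_message_queue:
        return 'Middleware'
    if ui_stable:
        return 'Screen scraping'
    return 'Accept manual process'

# A system with DB access AND file exports still gets the cleaner option first
choice = choose_pattern(has_db_access=True, can_export_files=True)
```

Running this over a spreadsheet of your systems' capabilities gives a first-pass integration plan in minutes.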

    The Hybrid Approach:

    In practice, most enterprise integrations use multiple patterns. A typical manufacturing integration might use:

    • Database connection for historical production data (batch, overnight)
    • API wrapper around AVEVA PI for real-time sensor data
    • File-based integration for vendor price lists updated weekly
    • Middleware to orchestrate all data flows into the AI platform

    Implementation Roadmap: Start Small, Scale Fast

    Here is the roadmap that consistently works for legacy integration projects:

Legacy Integration Roadmap

| Phase | Timeline | Focus |
| --- | --- | --- |
| 1. Discovery | Weeks 1-2 | Audit data sources, document capabilities, identify quick wins, map to AI use cases |
| 2. Proof of Concept | Weeks 3-6 | One high-value, low-risk system; implement the simplest pattern; validate data quality |
| 3. Production | Weeks 7-18 | Harden security; implement monitoring and error handling; document runbooks |
| 4. Scale | Ongoing | Apply patterns to more systems; build reusable components; consider middleware for complex orchestration |

    Phase 1: Discovery (2 weeks)

    • Audit all data sources and existing integration points
    • Document legacy system capabilities (APIs, databases, export functions)
    • Identify quick wins (systems where integration is straightforward)
    • Map data flows to AI use cases

    Phase 2: Proof of Concept (4 weeks)

    • Select one high-value, low-risk integration
    • Implement using the simplest viable pattern
    • Validate data quality and latency
    • Measure baseline for ROI calculation

    Phase 3: Production Implementation (6-12 weeks)

    • Harden security and access controls
    • Implement monitoring and alerting
    • Build error handling and retry logic
    • Document runbooks for operations team
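Retry logic for flaky legacy endpoints does not need a framework; a small exponential-backoff helper covers most cases. The attempt counts and delays below are illustrative defaults, not recommendations for any particular system.

```python
# Sketch: retry with exponential backoff for flaky legacy endpoints.
# Attempt counts and delays are illustrative defaults.
import time

def with_retries(fn, attempts=3, base_delay=0.1):
    """Call fn, retrying on exception with exponentially growing delays."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries: let the caller (and alerting) see it
            time.sleep(base_delay * (2 ** attempt))

# Simulate a legacy endpoint that fails twice, then succeeds
calls = []
def flaky():
    calls.append(1)
    if len(calls) < 3:
        raise ConnectionError('legacy system timeout')
    return 'ok'

result = with_retries(flaky, attempts=5, base_delay=0.01)
```

Pair this with alerting on the final failure so retries never silently mask a dead legacy system.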

    Phase 4: Scale (Ongoing)

    • Apply proven patterns to additional systems
    • Build reusable integration components
    • Consider middleware for complex orchestration

    The Australian Context: Data Sovereignty and Compliance

    For Australian CTOs, legacy integration comes with additional considerations:

    Privacy Act 1988 Compliance: When integrating legacy systems containing personal information, ensure your integration layer maintains the same (or better) security controls as the source system. Data in transit must be encrypted.

    Cloud Residency: If your AI platform runs in the cloud, verify that data remains in Australian regions. AWS Sydney (ap-southeast-2) and Azure Australia East both support major AI services.

    Industry-Specific Requirements:

    • Healthcare: RACGP and ADHA standards for patient data
    • Financial Services: APRA CPS 234 for information security
    • Government: ISM compliance for classified workloads

    Cost Reality Check

    Based on current Australian market rates, here is what you should budget for legacy integration:

| Integration Approach | Initial Cost (AUD) | Annual Maintenance (AUD) |
| --- | --- | --- |
| API Wrapper | $15,000 - $50,000 | $5,000 - $15,000 |
| Database Connection | $5,000 - $20,000 | $2,000 - $8,000 |
| File-Based Integration | $5,000 - $25,000 | $3,000 - $10,000 |
| Middleware Platform | $30,000 - $100,000 | $15,000 - $40,000 |
| Screen Scraping (RPA) | $20,000 - $80,000 | $10,000 - $30,000 |

    ROI Calculation: Integration vs Replacement

    Full System Replacement

    • Cost: $150,000 - $500,000+
    • Timeline: 12-24 months
    • Risk: High (disruption, data migration)

    Integration Approach

    • Cost: $15,000 - $100,000
    • Timeline: 2-4 months
    • Risk: Low (legacy system continues working)

Savings: $100,000 - $400,000+. Time to value: 6 to 12 months faster.

    Compare this to full system replacement, which typically ranges from $150,000 to $500,000+ for mid-market systems. Integration usually delivers ROI within 6 to 12 months.
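The payback arithmetic behind that claim is simple. The figures in this sketch are placeholders drawn from the middle of the ranges above; plug in your own estimates of monthly benefit and maintenance.

```python
# Back-of-envelope payback arithmetic. The figures used below are
# placeholders from the ranges above -- substitute your own estimates.
def payback_months(initial_cost, monthly_benefit, monthly_maintenance):
    """Months until cumulative net benefit covers the initial cost."""
    net = monthly_benefit - monthly_maintenance
    if net <= 0:
        return None  # the integration never pays back
    return initial_cost / net

# e.g. a $40,000 API wrapper saving $6,000/month with $800/month upkeep
months = payback_months(40_000, 6_000, 800)
```

With these placeholder inputs the wrapper pays back in under eight months, comfortably inside the 6-to-12-month window.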


    Common Mistakes to Avoid

    Here are the mistakes that repeatedly cause integration projects to fail:

    1. Starting with the Hardest System Pick your quickest win first. Build organisational confidence before tackling the 30-year-old mainframe.

    2. Underestimating Data Quality Legacy systems often have data quality issues you do not know about until you start extracting. Budget time for data cleansing.

    3. Ignoring Security That legacy database has been protected by obscurity for years. Once you expose it via API, it needs real security controls.

    4. No Monitoring Legacy systems fail silently. Implement health checks and alerting from day one.
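For batch integrations, a useful first health check is simply data freshness: alert when the last successful sync is older than the expected window. The 25-hour threshold below is illustrative, sized for a nightly job with some slack.

```python
# Sketch: a freshness check for batch integrations -- alert when the last
# successful sync is older than a threshold. The threshold is illustrative.
from datetime import datetime, timedelta, timezone

def is_feed_stale(last_sync, max_age=timedelta(hours=25)):
    """True when the feed has missed its expected (e.g. nightly) window."""
    return datetime.now(timezone.utc) - last_sync > max_age

fresh = datetime.now(timezone.utc) - timedelta(hours=2)
stale = datetime.now(timezone.utc) - timedelta(days=3)
```

Wire this into whatever alerting you already run (a cron job posting to Slack is enough to start) so a silent legacy failure surfaces within a day, not a quarter.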

    5. Going Alone Integration requires knowledge of both the legacy system and modern architectures. If your team lacks legacy expertise, bring in specialists for the initial implementation.


    What Success Looks Like

    When legacy integration is done right, your AI initiatives unlock value that was previously impossible:

| Before Integration | After Integration | Business Impact |
| --- | --- | --- |
| Sensor data locked in 20-year-old PLCs | Predictive maintenance models active | 30-50% reduction in unplanned downtime |
| ERP history inaccessible | Demand forecasting trained on a decade of data | 20-30% better inventory planning |
| Manual data extraction from legacy formats | Automated document processing | 80% time savings |
| Systems never talked to each other | Real-time unified dashboards | Single source of truth |

Market Opportunity: The Australian system integration market is projected to reach USD 18.07 billion by 2030, growing at 14.18% annually. This growth is driven by organisations finding ways to extract value from existing investments rather than constantly replacing them. (Source: Mordor Intelligence)


    Next Steps

    If you are sitting on legacy systems that are blocking your AI ambitions, here is what I recommend:

    1. Audit your current integration points - What is already connected, and how?
    2. Map your data flows - Where does the data your AI needs actually live?
    3. Identify one quick win - Which integration would deliver the most value with the least complexity?
    4. Start small - Prove the pattern works before scaling.

The dirty secret of digital transformation is that most of it is plumbing: getting data from point A to point B reliably, securely, and efficiently. That is not glamorous work, but it is the work that makes AI initiatives actually succeed.

    Ready to unlock your legacy data? Book a Technical Consultation to discuss your integration challenges and explore the best approach for your systems.


    Sources: Research synthesized from MuleSoft Connectivity Benchmark Report, AVEVA PI System Documentation, iTnews Australia, Prismatic API Integration Guide, Moesif API Integration Guide, ServiceNow Legacy Systems Cost Analysis, and Mordor Intelligence Australia System Integration Market Report.