AI-Powered ETL in Amazon Redshift

28 April 2026 | Jack Bailey

When the Warehouse Starts Doing the Work

In our previous piece, we explored how ETL (Extract, Transform, and Load) is evolving into adaptive, intelligent systems. In Redshift environments, we are now seeing what that shift looks like in practice.

For most of its life, Amazon Redshift has been treated as the final step in the data pipeline. Data is prepared elsewhere, transformed in dedicated processing layers, and then loaded into the warehouse for analysis.

That assumption is starting to break down.

At Ardent, we work with organisations operating large-scale Redshift environments where the boundaries between ingestion, transformation, and analytics are no longer as clearly defined as they once were.

We see platforms ingesting large volumes of data from multiple sources, often combining datasets such as TV audience data, mobile location data, and digital interaction data, each arriving in different formats and at different speeds.

These pipelines are designed to clean, join, and enrich that data so it can be used consistently across analytics and reporting, with outputs in some cases refreshed as often as hourly to support downstream decision-making.

As data volumes grow and workloads diversify, the warehouse is no longer just storing data. It is increasingly responsible for shaping, optimising, and serving it in real time.

This is not simply a tooling change. It reflects a deeper architectural shift in how data platforms are designed and operated.

The Hidden Cost of Separation

Traditional ETL has been built on separation. Data is extracted, transformed, and loaded across distinct layers, often owned by different teams and optimised for different objectives.

At smaller scales, this structure provides clarity and control. At enterprise level, it introduces friction.

Each transition between systems creates overhead:

Latency between stages
Duplication of transformation logic
Additional operational dependencies
Misalignment between how data is processed and how it is consumed

This is most clearly visible in environments where Redshift is tightly coupled with multiple upstream processing layers. The same transformations are often implemented in more than one place, for example aggregating or reshaping data before it is loaded and then repeating similar logic inside the warehouse to support different reporting needs.

Over time, this duplication makes the architecture harder to work with. Even small changes require updates across multiple layers, and performance optimisation becomes fragmented rather than centrally managed.
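
One rough way to surface this kind of duplication is to fingerprint the SQL held in each layer and look for matches. The sketch below is illustrative only: the inventory, layer names, and queries are hypothetical, and the normalisation is deliberately crude (it lowercases everything, including string literals), so it will only catch copies that are textually near-identical rather than logic that is merely similar.

```python
import hashlib
import re
from collections import defaultdict


def fingerprint(sql: str) -> str:
    """Collapse whitespace and case, drop a trailing semicolon, then hash."""
    normalised = re.sub(r"\s+", " ", sql.strip().lower()).rstrip(";")
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()[:12]


def find_duplicates(transformations):
    """Group (layer, name, sql) entries by fingerprint.

    Returns only the groups whose members live in more than one layer.
    """
    groups = defaultdict(list)
    for layer, name, sql in transformations:
        groups[fingerprint(sql)].append((layer, name))
    return [
        members
        for members in groups.values()
        if len({layer for layer, _ in members}) > 1
    ]


# Hypothetical inventory of transformations collected from two layers.
INVENTORY = [
    ("upstream_job", "hourly_rollup",
     "SELECT region, COUNT(*) FROM events GROUP BY region;"),
    ("warehouse", "region_counts",
     "select region, count(*)\n  from events\n  group by region"),
    ("warehouse", "daily_totals",
     "SELECT day, SUM(views) FROM events GROUP BY day"),
]

duplicates = find_duplicates(INVENTORY)
for group in duplicates:
    print("Same logic found in:", group)
```

In a real estate, the inventory would come from wherever each layer stores its logic, and a proper SQL parser would give far better matching than text normalisation.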

Redshift Is No Longer Just the Destination

Amazon Redshift was originally positioned as the endpoint of the data journey. Today, that role is evolving.

Capabilities such as automated workload management, materialised views, and in-database processing are turning Redshift from a passive store into an active processing layer.

More importantly, optimisation features such as automatic workload management and concurrency scaling mean Redshift can adjust how queries are executed and how resources are allocated, based on current workload conditions.

We are seeing organisations shift transformation logic into the warehouse not just to simplify pipelines, but to take advantage of this adaptive behaviour.

The move from ETL toward ELT is part of this transition, but it does not fully capture what is happening.

The more significant change is that the warehouse is starting to participate in optimisation decisions that were previously external.

When Optimisation Stops Being Manual

In traditional environments, performance is maintained through ongoing intervention. Engineers monitor pipelines, identify bottlenecks, refactor transformations, and tune infrastructure to keep systems running efficiently.

In Redshift-based architectures, much of this activity is shifting into the system itself.

Redshift can:

Learn workload patterns over time and adjust execution strategies
Dynamically allocate compute based on query complexity and concurrency
Prioritise workloads based on demand, balancing ingestion, transformation, and analytics
Optimise both performance and cost through adaptive resource management

In practice, this changes how engineering effort is applied.

We repeatedly see teams spending less time on reactive optimisation and more time on structuring data models and workloads so the system can operate effectively. The focus often moves from fine-tuning individual queries to shaping how the system behaves under load.
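
The prioritisation behaviour listed above can be pictured with a toy model. The sketch below is not Redshift's scheduling algorithm, which is internal to the service. It is a plain priority queue, with hypothetical workload types, priority values, and query names, showing what it means for competing workloads to be ordered by priority first and arrival second.

```python
import heapq
from dataclasses import dataclass, field

# Hypothetical priority levels for illustration; a lower number runs first.
PRIORITY = {"analytics": 0, "transformation": 1, "ingestion": 2}


@dataclass(order=True)
class QueuedQuery:
    priority: int
    arrival: int  # tie-breaker: earlier arrivals run first
    name: str = field(compare=False)


def run_order(queries):
    """Return query names in the order a simple priority queue would run them."""
    heap = [
        QueuedQuery(PRIORITY[kind], position, name)
        for position, (kind, name) in enumerate(queries)
    ]
    heapq.heapify(heap)
    order = []
    while heap:
        order.append(heapq.heappop(heap).name)
    return order


order = run_order([
    ("ingestion", "load_events"),
    ("analytics", "dashboard_refresh"),
    ("transformation", "build_rollup"),
    ("analytics", "adhoc_report"),
])
print(order)
```

The design point is that the engineer declares relative importance once, in one place, instead of tuning each query to avoid the others.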

ETL Becomes a Property of the System

One of the most significant changes in Redshift environments is how ETL is conceptualised. Rather than existing as a sequence of pipelines, it begins to emerge from the interaction of workloads within the warehouse itself.

Ingestion processes, transformation queries, BI dashboards, and analytical workloads all draw on the same pool of resources, with Redshift’s workload management determining how those demands are balanced in real time.

As a result, transformation is no longer confined to a predefined stage. It happens continuously, shaped by how data is accessed and used.

In practical terms, this changes where and how transformation logic is applied. Instead of preparing data upstream before loading it into Redshift, raw data can be ingested directly and shaped within the warehouse using scheduled queries and materialised views. As query patterns evolve, those transformations can be adjusted in place, without rebuilding upstream pipelines or reprocessing entire datasets.

In our work with clients, we see this most clearly in environments where duplicated transformations, previously spread across multiple processing layers, are consolidated into the warehouse. As this happens, duplication is reduced and the distance between raw data and usable insight decreases.

The result is not just simplification, but a change in how the system behaves. ETL becomes less about orchestrating pipelines and more about coordinating how the system operates as a whole.
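
As a minimal sketch of the load-raw-then-shape pattern described above, the snippet below assembles the three Redshift SQL statements involved: a COPY of raw files, a materialised view that holds the transformation, and a manual refresh that a scheduled query could issue. Every name in it (bucket, role ARN, table, view, columns) is a placeholder, and it only builds strings; running them against a cluster, and deciding whether automatic or scheduled refresh is appropriate, is left to the reader's environment.

```python
# All identifiers below are hypothetical placeholders, not values from a real platform.
RAW_TABLE = "raw_events"
VIEW_NAME = "hourly_audience"
S3_PATH = "s3://example-bucket/events/"
IAM_ROLE = "arn:aws:iam::123456789012:role/example-redshift-role"


def copy_statement(table: str, s3_path: str, iam_role: str) -> str:
    """Build a COPY statement that loads Parquet files from S3 without reshaping them."""
    return f"COPY {table} FROM '{s3_path}' IAM_ROLE '{iam_role}' FORMAT AS PARQUET;"


def materialized_view_statement(view: str, select_sql: str) -> str:
    """Build a materialised view definition that requests automatic refresh."""
    return f"CREATE MATERIALIZED VIEW {view} AUTO REFRESH YES AS {select_sql};"


def refresh_statement(view: str) -> str:
    """Build the statement a scheduled query could run to refresh the view on demand."""
    return f"REFRESH MATERIALIZED VIEW {view};"


# The transformation lives in the warehouse as the body of the view.
SELECT_SQL = (
    "SELECT channel, DATE_TRUNC('hour', event_time) AS event_hour, "
    "COUNT(*) AS events "
    f"FROM {RAW_TABLE} "
    "GROUP BY channel, DATE_TRUNC('hour', event_time)"
)

statements = [
    copy_statement(RAW_TABLE, S3_PATH, IAM_ROLE),
    materialized_view_statement(VIEW_NAME, SELECT_SQL),
    refresh_statement(VIEW_NAME),
]

for statement in statements:
    print(statement)
```

String formatting is used here purely for readability; identifiers and paths from untrusted input should never be interpolated into SQL this way.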

Bringing Transformation Closer to Use

As transformation moves into the warehouse, it also moves closer to the point of consumption.

In traditional architectures, data is heavily processed before it is made available for analysis. By the time it reaches the warehouse, much of the logic is fixed.

In Redshift-centric models, transformation becomes more iterative.

Raw data can be ingested rapidly, with structures applied dynamically using features such as materialised views and incremental refresh. This allows data to be shaped in response to how it is queried, rather than according to predefined assumptions.

This is particularly visible in environments where reporting requirements evolve quickly, with teams adjusting transformation logic within the warehouse, rather than repeatedly reworking upstream pipelines.

This creates a tighter feedback loop between data availability and data usage, improving both responsiveness and relevance.
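
The incremental refresh idea mentioned above can be illustrated outside the warehouse. The sketch below is a conceptual model in plain Python, not a description of how Redshift implements it: it keeps a per-channel total up to date by applying only newly arrived rows, then checks that the result matches a full recomputation. The channel names and figures are made up, and the model handles appended rows only, not updates or deletes.

```python
from collections import defaultdict


def full_refresh(rows):
    """Recompute the aggregate from every row, as a full rebuild would."""
    totals = defaultdict(int)
    for channel, views in rows:
        totals[channel] += views
    return dict(totals)


def incremental_refresh(totals, new_rows):
    """Apply only the newly arrived rows to an existing aggregate."""
    updated = dict(totals)
    for channel, views in new_rows:
        updated[channel] = updated.get(channel, 0) + views
    return updated


# Hypothetical data: (channel, views) pairs.
base_rows = [("news", 10), ("sport", 5), ("news", 3)]
new_rows = [("sport", 7), ("film", 2)]

view = full_refresh(base_rows)              # initial build
view = incremental_refresh(view, new_rows)  # later refresh touches 2 rows, not 5

# Both routes should agree; the incremental one simply does less work.
matches_full_rebuild = view == full_refresh(base_rows + new_rows)
print(view, matches_full_rebuild)
```

The cost of the incremental path scales with the volume of new data rather than the size of the whole table, which is what makes frequent refreshes practical.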

The Trade-Off: Visibility in Adaptive Systems

As Redshift takes on more responsibility for optimisation and transformation, system behaviour becomes less static:

Queries may be executed differently depending on workload conditions
Resources may be reallocated dynamically
Execution paths can change without direct intervention

While this improves performance and efficiency, it introduces a new challenge: understanding system behaviour at a detailed level.

This becomes most apparent when investigating performance or reconciling outputs across workloads, where system-driven optimisation makes execution decisions less immediately visible. This shifts the focus once again toward observability and control.

Organisations need:

Clear visibility into workload behaviour
Monitoring aligned with dynamic execution patterns
Governance frameworks that operate effectively in adaptive environments

Without this, the system’s ability to optimise can outpace the organisation’s ability to interpret it.
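
One simple starting point for that visibility is to track how much the runtime of the same query varies from run to run. The sketch below assumes run records have already been exported as (query label, seconds) pairs; the labels, timings, and the 0.5 threshold are all hypothetical, and the coefficient of variation is only one of several reasonable stability measures.

```python
from collections import defaultdict
from statistics import mean, pstdev


def flag_unstable(runs, threshold=0.5):
    """Return {label: coefficient of variation} for queries with erratic runtimes.

    The coefficient of variation is the standard deviation divided by the mean,
    so it compares variability across queries of very different durations.
    """
    by_query = defaultdict(list)
    for label, seconds in runs:
        by_query[label].append(seconds)

    flagged = {}
    for label, times in by_query.items():
        average = mean(times)
        if len(times) > 1 and average > 0:
            variation = pstdev(times) / average
            if variation > threshold:
                flagged[label] = round(variation, 2)
    return flagged


# Hypothetical run history: one steady query, one whose runtime swings widely.
RUNS = [
    ("hourly_rollup", 40.0), ("hourly_rollup", 42.0), ("hourly_rollup", 41.0),
    ("audience_join", 30.0), ("audience_join", 120.0), ("audience_join", 35.0),
]

flagged = flag_unstable(RUNS)
print(flagged)
```

A flagged query is not necessarily a problem; in an adaptive system it is a prompt to look at what else was running at the time.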

Real-World Shift: Adaptive Architectures in Practice

Organisations are increasingly moving away from tightly coupled pipeline architectures toward more flexible, workload-aware designs, a shift we are seeing across large-scale Redshift deployments.

In some cases, this includes adopting ‘multi-cluster’ or ‘hub-and-spoke’ models to isolate different types of workloads. In others, it involves consolidating transformation logic into the warehouse to reduce duplication and improve consistency.

What drives these changes is not just performance, but contention.

As ingestion, transformation, and analytics workloads scale simultaneously, static pipeline structures struggle to coexist efficiently. Adaptive resource management becomes essential to maintain stability.

The system must continuously respond to demand, rather than rely on predefined execution patterns.

Where to Start

For organisations already using Redshift, this shift does not require a complete redesign.

At Ardent, we typically begin by reassessing how responsibilities are distributed across the platform:

Which transformations genuinely need to happen upstream?
Where is data being moved unnecessarily?
Which workloads would benefit from dynamic optimisation?

From there, incremental changes can introduce significant improvements:

Moving selected transformations into the warehouse
Leveraging automated workload management
Reducing duplication across processing layers
Simplifying orchestration where possible

The objective is not to eliminate ETL, but to reduce the rigidity that surrounds it.

What This Means Now

ETL is not disappearing, but its role is changing.

As platforms like Amazon Redshift become more adaptive, the boundary between storage, processing, and optimisation continues to blur.

The organisations that benefit most will not be those that build the most complex pipelines, but those that simplify their architectures, reduce unnecessary separation between layers, and allow their systems to take on more of the operational burden.

If you would like to discuss this further, or to find out how Ardent could help your business, please get in touch: https://www.ardentisys.com/contact-us/
