All Articles
/
Best Practices

What is reverse ETL?

To operationalize your data, you need reverse ETL.

Published on Jan 17, 2024 by AirOps Team

The modern data stack has come a long way in recent years.

Cloud data warehouses with affordable storage and compute. Data ingestion tools that streamline the way businesses collect, process, and prepare data. Powerful transformation and modeling applications. Not to mention a dizzying number of data analytics and business intelligence (BI) tools to help teams visualize their data.

There’s one more step to go, though: Making that data actionable.

Historically, this has been a tricky problem – getting neatly transformed and modeled data out of the data warehouse and into operational tools and documents like Salesforce, Hubspot, Zendesk, Google Sheets, Notion, Coda, Monday, and Asana isn’t a simple process.

… it’s getting easier, though, thanks to something called reverse ETL

Reverse ETL makes it possible to move beyond dashboards and visualizations. It helps companies get data into the hands of business end-users, inside of their favorite operational tools (aka data operationalization). It also solves the data adoption problem, a key issue that an overwhelming majority of organizations (and the data teams who support them) struggle to solve.

You’ll often hear it referred to as the “final mile” in data analytics. After reading you’ll understand why that’s the case.

Here’s what we’re going to cover:

  • What is reverse ETL?
  • Why would a company use reverse ETL to move data back out of its warehouse?
  • What’s the difference between ETL/ELT and reverse ETL?
  • How does reverse ETL fit into the modern data stack?
  • What are the biggest benefits of reverse ETL?
  • What problems can reverse ETL solve?
  • Reverse ETL use cases for sales, marketing, customer success, and product teams
  • Does your organization need reverse ETL?
  • Should you build or buy a reverse ETL solution?
  • How to choose a reverse ETL tool

If you want to get right to the tactical side of reverse ETL, jump to the What problems can reverse ETL solve? section. We’ll explain how it helps solve some of the biggest problems that plague data teams, individual functions, and the business as a whole.

If you’re curious about the details behind reverse ETL: what it is, why companies use it, how it works, and why it’s the perfect addition to any modern data stack, keep reading.

What is reverse ETL?

Reverse ETL refers to the process of replicating data from a storage system (like a cloud data warehouse) to various third-party tools and applications that you use for everyday operations, like Salesforce, Zendesk, Hubspot, and countless other SaaS applications. 

Here’s a step-by-step breakdown of the process:

  1. Data is ingested into the data warehouse and transformed to enrich it and add business logic
  2. Data is translated to fit with the target destination’s API requirements
  3. Data is loaded into the target application

For our visual learners out there, this simple diagram shows how data flows through the reverse ETL process:

Source

Why would a company use reverse ETL to move data back out of its warehouse?

If you’re wondering, “Why in the world would I want to move data back out of the warehouse when we just went through the hassle of getting it in there?” you wouldn’t be alone. 

It’s a valid question, because you probably poured a lot of resources into deploying your data storage system: You collected data from all of your sources. Your data team cleaned, prepared, and organized it into nice, usable data models, then created aesthetically pleasing dashboards and visualizations. 

There’s a problem with those beautiful dashboards and visualizations, though: most people won’t use them. If a salesperson spends most of their time in Salesforce, that’s where they want to see their data. Many people do a lot of their work inside of Excel and Google Sheets, and good luck getting them to change their workflows. Let’s face it, Excel is probably the most successful piece of software ever. People aren’t going to swap it for a sleek visualization in a BI tool (no matter how nice it looks).

This is the data adoption problem in action, and it’s a big issue for most startups and early-stage companies that have invested in data analytics and BI

Reverse ETL addresses this problem because it brings data to the tools people are already using. They don’t have to make fundamental changes to their behavior to use data. It makes working with (and extracting value from) data so much easier.

If you want to develop a data-driven culture within your organization, reverse ETL is a great way to get there.

Reverse ETL also makes sense from an efficiency perspective: Your data warehouse (or other central storage system) contains all of the data that’s been modeled based on business requirements. It also houses the core metrics definitions that business users rely on to drive daily operations, like annual recurring revenue (ARR), customer lifetime value (CLV), and customer acquisition costs (CAC). 

This is a perfect environment for a solution like reverse ETL because it allows you to get your transformed, modeled data back into core operating tools and documents. As a result, you can move beyond analysis and into operationalization 🎉.

What’s the difference between ETL/ELT and reverse ETL?

ETL/ELT and reverse ETL have different functions, but if you want a full understanding of how reverse ETL works, it helps to learn the differences between two common data ingestion methods with similar names: ETL and ELT.

ETL and ELT both extract raw data from various sources, like SaaS applications and production databases, transform the data into clean and structured data, and then load the data into a storage system, like a data warehouse.

ETL was the precursor to ELT and was first introduced in the 1970s. ELT entered the scene in the 2000s. It arose from a need for a faster, more scalable data ingestion method as the volume of data that the average business generated increased.

The main difference between the two is the order in which data is transformed. With ETL, data is transformed and then loaded into the data warehouse. With ELT, data is loaded into the warehouse and then also transformed inside of it.

Reverse ETL is the inverse of the ETL process → it moves data out of the warehouse and back into SaaS tools and other operational systems. 

You can learn more about ETL and ELT in our detailed guide: ETL vs. ELT: Understanding the Differences & Benefits. Or, check out this simplified chart that explains the main differences between ETL, ELT, and reverse ETL:

ETL
Extract → Transform → Load
ELT
Extract → Load → Transform
Reverse ETL
Data Source SaaS applications, business tools, production databases, etc. SaaS applications, business tools, production databases, etc. Data warehouse
Data Destination Data warehouse or data lake, a transactional / third-party app databases Data warehouse or data lake Transactional / third-party app databases
Transformation process Data is transformed to meet the storage system’s requirements before being loaded into the data storage system Transformation occurs after loading data into the destination storage system Data is formatted for API/frontend app requirements and transformation happens before storing results in the destination system
Purpose Used to power dashboards and reports Used to power dashboards and reports Used to power operations and workflows

ETL, ELT, and reverse ETL do have something in common, though: they can all be part of an organization’s modern data stack.

How does reverse ETL fit into the modern data stack?

Reverse ETL is part of the most recent iteration of the modern data stack. It’s not something you’ll see in every startup or early-stage company’s stack, but it’s becoming more and more common. 

Before reverse ETL, a company’s data was more or less siloed inside of a data storage system once it was loaded and transformed. If someone wanted to use that data to generate reports, make decisions, and drive action, they generally needed engineering help to get data out of the warehouse and into dashboards and BI tools.

Dashboards and BI tools certainly have their place. Being able to access data inside of them doesn’t mean data is actionable or operationalized, though. 

Here’s a quick overview of what the modern data stack looks like with reverse ETL included:

  1. Data sources, including SaaS apps, event data from your website or app, databases, etc.
  2. Data storage system, like a cloud data warehouse
  3. Data ingestion and integration through ETL or ELT
  4. Data modeling
  5. Data analytics, including process, tooling, and visualization
  6. Reverse ETL to make your data actionable

A diagram of the modern data stack with reverse ETL included.
Source

If you aren’t familiar with the different layers of a modern data stack, get the details here:
How to Build a Modern Data Stack.

In the humble opinion of Team AirOps, reverse ETL is a tool that will power the future of work because it allows everyone (even non-technical folks) to operationalize data and self-serve their data needs. 

When people can work with data inside of the tools and documents they already use every day, it frees up data and engineering team resources to focus on solving the company’s most difficult technical problems.

Discover AirOps AI Apps

Powerful AI apps you can customize & run at scale: write SQL, run NLP, draft personalized content, and more.

What are the biggest benefits of reverse ETL?

With reverse ETL, operations teams can access and use data inside of the tools they already use to do their jobs, like Salesforce, Hubspot, Zendesk, Monday, and others. Reverse ETL even makes it possible to access data inside core operating documents, like Google Sheets and Notion. 

This enables something we’ve mentioned a few times now: data operationalization.

Data operationalization frees people to do more of their best work, more efficiently and effectively. When everyone in an organization can seamlessly access, understand, and build with the data they need in the core operating tools they already know and love, that’s data operationalization in action. 

Reverse ETL also enables business end-users to self-serve their own data needs, which increases efficiency and removes friction to using data in day-to-day workflows.

What problems can reverse ETL solve?

By now you’re probably onto the fact that reverse ETL is awesome. But, what kind of problems does this new technology actually solve? As far as we’re concerned, the biggest value-add of reverse ETL is data adoption.

The vast majority of data science projects fail. According to one estimate, only 20% of analytic insights will deliver business outcomes through 2022. Ouch.

Why do so many data projects fail, though? The general consensus on the reason is a matter of adoption. Data and engineering teams spend a ton of time and resources getting data into beautiful dashboards, visualizations, and reports… and then no one uses them.

Reverse ETL addresses the adoption problem because it doesn’t require people to change their behavior to use data. 

When you can push data into the tools and systems that people are already using, the data adoption problem is suddenly much easier to solve. Here are three examples of the problems that can be solved with the help of reverse ETL solutions:

1. You don’t have accurate performance metrics to support scaling/fundraising.

If a company wants to scale or fundraise, they need to be able to (accurately) report performance metrics

In an ideal world, accurate reporting wouldn’t require playing a high-stakes game of “How the heck did we get this number last time?” Not only is this inefficient and frustrating, but it’s also an approach that’s prone to errors and inconsistencies. 

Luckily, reverse ETL can get you one step closer to an ideal world, one that doesn’t involve downloading and comparing various CSV files to figure out how a metric was calculated. 

2. The data team is swamped.

In many startups, early-stage companies, and other organizations, demand for analytics dramatically exceeds supply. There’s simply no way the data team will be able to handle every request.

Since reverse ETL allows people to self-serve their own day-to-day data needs, it gives data and engineering teams more time to work on complicated projects that require advanced technical skills and knowledge.

3. The product team is working on new features, but they need a better understanding of user behavior to develop features people actually want. 

Much like marketing metrics alone can’t paint a complete picture of your users, neither can product metrics on their own. With reverse ETL tools, the product team can analyze product data, sales data, support data, and any other relevant data that’s stored within the warehouse. 

This allows for a more comprehensive, 360-degree view of users that can be used to guide the development of new product features.

Reverse ETL use cases for sales, marketing, customer success, and product teams

There are tons of potential use cases for reverse ETL – here are some examples for sales, marketing, customer success, and product functions:

Sales

  • Use combined datasets to create lead scoring models that will help sales focus on the right conversations with the right customers. 
  • Pull data from Salesforce into Google Sheets for real-time sales forecasting.
  • Combine customer behavioral data from various data sources to help sales reps forge more meaningful connections with leads and prospects.

Marketing

  • Combine customer data from sales, product, and customer support to develop personalized marketing campaigns.
  • View customer purchase frequency from systems like Stripe inside of marketing automation tools like Hubspot.
  • Use demographic data to segment leads and deliver more effective email drip campaigns. 

Customer Support

  • Sync data from internal support channels with Zendesk.
  • Automatically push customer support data into Salesforce to supercharge the sales process and improve intra-team collaboration.
  • Sync data between Zendesk, Salesforce, and Hubspot so that CS (and other teams) can see accurate health score metrics for all customers they work with.

Product

  • Integrate product usage data with CRM data to proactively identify customers who are at risk of churning.
  • Identify highly engaged users, then share that information with sales so they can follow up with upsells or special offers.
  • Assess product usage data from Pendo alongside customer support ticket data from Zendesk. This helps the product team proactively identify problems that might cause a customer to contact support. 

Does your organization need reverse ETL?

We’ve waxed poetic about reverse ETL for a while now, so the answer to this question might surprise you.

The answer to the question of  “who needs reverse ETL?” is… it depends.

While reverse ETL is a powerful solution that enables data operationalization and increases the likelihood of widespread organizational data adoption, that doesn’t mean it’s a necessary part of every company’s data stack.

For example, there are plenty of enterprise businesses and large companies that want a mature data stack (as opposed to a modern data stack). There are plenty of reasons for this: Many large, established companies operate their data analytics and BI programs in an on-prem or hybrid cloud environment that uses local servers for data storage and processing. Large enterprises also tend to have a lot of technical data resources and a carefully governed API strategy, which effectively does the same thing as reverse ETL – moving data from a SaaS app into the data warehouse requires the same lift as moving data from the warehouse into a SaaS app. Further, enterprise-level businesses often use mature data stacks because they’re (typically) slower to change how they do things… especially if what they’re doing is working for the company.

On the flip side, there are also plenty of smaller businesses that won’t get much value from reverse ETL. Dashboards and reports may be more than enough for their needs, even if they already have a modern data stack. 

Alternatives to reverse ETL

If you need to sync data between a data warehouse and SaaS tools, but don’t necessarily need a full-fledged reverse ETL pipeline, there are two additional solutions to consider.

Integration Platforms (iPaaS)

iPaaS tools like Zapier, Workato, and IFTTT allow users to send data between platforms, all thanks to the power of no-code. Each tool performs certain actions based on a predefined trigger that you set, like sending a welcome email in Hubspot when a new lead is created in Salesforce. 

These tools can be super helpful but use them wisely. They’re meant to solve simple workflow challenges via point-to-point integrations between tools. They aren’t designed to be scalable and pipelines can quickly get out of control as you add new custom workflows, triggers, and if/then clauses. 

Additionally, you can’t use iPaaS tools to build custom data models that accurately sync data to downstream apps (because those apps require that your data be in a specific format or shape).

Customer Data Platforms (CDPs)

Platforms like Segment and mParticle are common players in the customer data space. They provide a single, custom-built platform where organizations can store all of their customer data. Data can also be easily accessed and activated in the various tools that different functions use.

The primary benefit of CDPs is that they have event collection capabilities for behavioral data, identity resolution, and audience management. This can be incredibly useful for some companies, but keep in mind that the underlying data model is rigid and can’t handle complex use cases.

CDP and iPaaS solutions are certainly enough for some teams. They’re no-code, easy to use, and don’t require any technical engineering resources. But, relying on them too much as the organization grows can cause problems, especially if you want to improve the entire organization’s access to data (not just a single department or function).

A modern data stack, with a single data warehouse and a reverse ETL solution to operationalize data, is a better long-term solution for startups, early-stage companies, and other growth-minded organizations that want easy access to and scalability of their data.

Should you build or buy a reverse ETL solution?

If you’ve decided that reverse ETL makes sense for your organization, you have two options: build or buy.

When you build a reverse ETL solution from scratch, you need to create every individual data connection between the data source and the data warehouse. This can take weeks or months depending on the resources at your disposal. Don’t forget about maintenance, either, because that will require additional resources. Every time an app or tool updates its API, maintenance will be required. 

Alternatively, you can buy a reverse ETL tool. These SaaS solutions come equipped with pre-built data connectors, though the exact number and types of connectors will vary from one company to another. 

Since very few organizations have spare data resources lying around, the majority of companies will opt to purchase a reverse ETL tool.

How to choose a reverse ETL tool

Over the past several years, more and more reverse ETL tools have entered the market. These tools are fully managed and easily scalable. User-friendliness varies between providers – some are designed to be used by data teams, and others are geared towards non-technical business end-users. 

With that in mind, here’s a broad overview of the most important features and functionalities you should look for in a reverse ETL tool:

  • Number of connectors: Reverse ETL tools aren’t useful if they don’t connect to the applications your teams rely on to do their work. 
  • A growing catalog of connectors: Ideally, any reverse ETL tool you consider will also be adding new connectors to its library with some degree of regularity. 
  • Ease of use: This will mean different things to different users, so keep that in mind. Some reverse ETL tools require SQL knowledge to operate effectively. Others have a visual interface powered by low-code or no-code tools that require zero SQL skills.
  • Reliable syncing: APIs have limits to how much data can be synced, so your reverse ETL tool should be reliable and efficient (i.e., only sync data that’s been recently changed or updated.)
  • Customized alerts and observability: Organizations should have complete control over their alerts to ensure they receive mission-critical information when needed. 
  • Compliance and security: Every industry runs up against different compliance and security requirements. Make sure any reverse ETL tool that you consider meets the same standards that your organization is held to. 
  • Customer support: There’s always a chance that something will go awry, so double-check that any provider you’re considering has robust, dependable, and easily accessible support options. 
  • Pricing structure: Pricing for reverse ETL tools is based on things like the number of connectors/integrations and data consumption. Before you purchase a tool, confirm that the pricing structure will work for the way your organization plans to use it.

Reverse ETL is your ticket to actionable, operationalized, and widely adopted data

If you’re ready to get data out of your warehouse and into the hands of business end-users, reverse ETL deserves a place in your organization’s data stack. 

When teams have access to data in the tools they already use every single day, they’re far more likely to use it to make logical, evidence-based decisions that will move the business forward. Take that, data silos. 

At AirOps, we firmly believe that reverse ETL represents the future of work. We’re on a mission to empower more end-users to operationalize data… and we hope that you’ll join us! Click here to sign up for email updates so that you can be the first to know about new product features and other AirOps-related news.