Using AI to generate YAML entries for dbt models

How to use artificial intelligence to automatically generate YAML entries for your dbt models.

Published on Jan 17, 2024 by AirOps Team

After an analytics engineer creates a new dbt model, they frequently skip an important step in their workflow: generating YAML entries that include the model name, columns, and descriptions.

It’s easy to see why so many people neglect this step – generating YAML entries for dbt models can be frustrating and time-consuming. When you’re still learning the dbt framework, this seemingly small task can be especially onerous.

The result? Poorly documented work and a variety of downstream data problems. 

Luckily, there’s an easy way to save tons of time: use AI Data Sidekick to automatically generate a YAML entry from a dbt SQL query.

Automatically write SQL in dbt Cloud using AI

Data Sidekick combines the power of AI with context from your data warehouse to automate common data-related tasks. 

With the dbt Config data app, you can input a SQL query and instantly: 

  • Generate a dbt YAML file entry for the model that the SQL query creates
  • Automatically define the model’s name, columns, and definitions
  • Add tests on fields to double-check that everything is working correctly
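
To make the list above concrete, here is a rough sketch of the kind of entry such a tool might produce. The model name `customer_orders`, its columns, its descriptions, and the SQL shown in the comments are all hypothetical and chosen only for illustration; this is not actual Sidekick output. The structure follows dbt's standard model properties format, and `unique` and `not_null` are two of dbt's built-in generic tests.

```yaml
# Hypothetical input query (models/customer_orders.sql), shown for context:
#   select
#       customer_id,
#       count(order_id)  as order_count,
#       sum(order_total) as lifetime_value
#   from {{ ref('orders') }}
#   group by customer_id

version: 2

models:
  - name: customer_orders            # matches the .sql file name
    description: "One row per customer with aggregate order metrics."
    columns:
      - name: customer_id
        description: "Unique identifier for the customer."
        tests:
          - unique                   # no duplicate customers
          - not_null                 # every row must have a customer
      - name: order_count
        description: "Number of orders the customer has placed."
        tests:
          - not_null
      - name: lifetime_value
        description: "Sum of order totals across all of the customer's orders."
```

An entry like this usually lives in a `.yml` file next to the model's `.sql` file. In dbt 1.8 and later, `data_tests:` is the preferred key, although `tests:` is still accepted.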

Work with data 10x faster.

Magically draft, correct, and explain SQL. Instantly write Python scripts. Free for individuals and small teams.

See how Data Sidekick performs in the wild

See what Kyle Dempsey, Head of CX and Solutions Architecture at AirOps, has to say about using Sidekick to generate YAML entries for his dbt data models.

Be honest – before Sidekick, how often did you skip writing YAML file entries?
Honestly, more often than I should probably admit. The process of writing a .yml file entry is slow and monotonous, so I skipped it a lot.

This isn’t something I’m exactly proud of – if the dbt Config isn’t generated correctly, errors in your downstream data pipeline are almost guaranteed. Plus, incorrectly defined or undefined columns make it hard to standardize, communicate, and reuse data definitions.

How has Sidekick changed your data workflow?
My documentation game has improved tremendously. Having Sidekick means that I have literally no excuse – skipping this step would make zero sense because fully documenting all of my models is so quick and easy.
