Integration Jobs Metadata

What information is captured and when?

Context

For some integration jobs, you can configure Orchestra to collect additional metadata from a different source, enriching your job with extra context. A good example is a dbt Cloud job that uses Snowflake as the adapter: here you can configure Orchestra to enrich your dbt Cloud Task Runs and Operations with their Snowflake cost, allowing you to track the cost of each dbt Cloud Task Run over time.

Available Configurations

  • dbt Cloud + Snowflake. You can configure Orchestra to collect the cost of each query in your dbt Cloud job from Snowflake. This cost is described as follows:

    • The sum of the Snowflake cloud services credits used by the query and an estimate of the warehouse compute credits, derived from the query duration and the warehouse size. NOTE: this estimate does not include time a warehouse remains active beyond the query's duration, for example due to generous auto-suspend settings. See the sketch after this list for how such an estimate can be derived.

  • Coalesce + Snowflake. Similar to the dbt Cloud + Snowflake configuration, you can configure Coalesce jobs that use Snowflake to collect the Snowflake query cost.

  • Airflow trigger. When Orchestra is triggered via an Airflow DAG, you can configure Orchestra to collect information about the parent DAG.

  • ECS + dbt Core. When running dbt Core jobs as part of an ECS task you can configure Orchestra to collect the artifact output files from the dbt Core run and parse them into dbt operations in the Orchestra UI. More information about this configuration can be found here.
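
As an illustration of the cost estimate described above, the query below derives a per-query figure from Snowflake's SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY view by combining the cloud services credits with a duration-based estimate of warehouse compute credits. This is a minimal sketch for reference only: the query tag filter is hypothetical, the credits-per-hour figures are Snowflake's standard rates for standard warehouses, and the exact calculation Orchestra performs may differ.

-- Hypothetical sketch: per-query cost from ACCOUNT_USAGE.QUERY_HISTORY
SELECT
    query_id,
    warehouse_name,
    -- Cloud services credits reported directly by Snowflake
    credits_used_cloud_services,
    -- Estimated warehouse compute credits: execution time (ms) converted to hours,
    -- multiplied by the standard credits-per-hour rate for the warehouse size
    (execution_time / 3600000.0) * CASE warehouse_size
        WHEN 'X-Small'  THEN 1
        WHEN 'Small'    THEN 2
        WHEN 'Medium'   THEN 4
        WHEN 'Large'    THEN 8
        WHEN 'X-Large'  THEN 16
        WHEN '2X-Large' THEN 32
        WHEN '3X-Large' THEN 64
        WHEN '4X-Large' THEN 128
        ELSE 0
    END AS estimated_compute_credits
FROM snowflake.account_usage.query_history
WHERE query_tag = '<QUERY_TAG>';  -- hypothetical filter for the job's queries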

For Snowflake, to ensure metadata is fetched as expected, grant the role used in Orchestra the MONITOR and USAGE privileges on each warehouse in which queries are executed:

GRANT MONITOR, USAGE ON WAREHOUSE "<WAREHOUSE>" TO ROLE <ROLE>;
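
To confirm the privileges are in place, you can inspect the grants on the warehouse, or the grants held by the role, using Snowflake's SHOW GRANTS command:

SHOW GRANTS ON WAREHOUSE "<WAREHOUSE>";
SHOW GRANTS TO ROLE <ROLE>;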
