Run job

Description

This Task triggers a dbt Cloud job using this endpoint.

Use Cases

We recommend creating a Run job Task for every "job" within dbt Cloud.

This way, you can use Orchestra to trigger your data transformation on a cron or event-based schedule. This has a number of advantages over dbt's built-in scheduler:

  • You can co-ordinate tasks outside of dbt, like ingestion tools (for operations that move data into your data warehouse) or reverse ETL tools (for moving data out of the warehouse)

  • You can use Orchestra to trigger jobs across dbt Cloud Accounts and Projects. This lets you build a data mesh architecture with a high degree of flexibility in how different jobs are orchestrated

  • When dbt Cloud jobs run, Data Warehouse cost is incurred. Running these operations on a schedule you set explicitly keeps these costs from getting out of hand. Costs can escalate easily when orchestrating multiple jobs across different accounts or projects, because dbt Cloud triggers cross-project jobs automatically where that setting is enabled

  • Within dbt Cloud itself, orchestrating across dbt Projects is only available on an Enterprise dbt Cloud plan

  • We aggregate metadata from the dbt Task in the same place as the metadata from other operations in your Pipeline

Setup guide

Fetch the dbt Job ID

  1. Head to dbt Cloud and select "Deploy" -> "Jobs"

  2. You'll be taken to an overview of your jobs where the URL is: https://cloud.getdbt.com/deploy/{account_id}/projects/{project_id}/jobs

  3. Select the job you want and read the job_id from the URL, which is of the form: https://cloud.getdbt.com/deploy/{account_id}/projects/{project_id}/jobs/{job_id}
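If you would rather extract the ID programmatically than read it off the address bar, the URL form in step 3 can be parsed with a regular expression. This is a minimal sketch: the helper name `parse_job_url` is hypothetical, and it assumes all three IDs are numeric, as in the Job ID example in this guide.

```python
import re

# Matches the job URL form shown in step 3.
# Assumption: account_id, project_id and job_id are all numeric.
JOB_URL_PATTERN = re.compile(
    r"/deploy/(?P<account_id>\d+)/projects/(?P<project_id>\d+)/jobs/(?P<job_id>\d+)"
)


def parse_job_url(url: str) -> dict:
    """Return account_id, project_id and job_id (as integers) from a dbt Cloud job URL."""
    match = JOB_URL_PATTERN.search(url)
    if match is None:
        # The jobs overview URL from step 2 ends at /jobs and lands here.
        raise ValueError(f"Not a dbt Cloud job URL: {url}")
    return {name: int(value) for name, value in match.groupdict().items()}


ids = parse_job_url("https://cloud.getdbt.com/deploy/111/projects/222/jobs/12345")
print(ids["job_id"])  # 12345
```

The jobs overview URL from step 2 has no trailing job_id, so the helper raises a `ValueError` for it rather than returning a partial result.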

Parameters

These parameters are required to run the Run job Task:

| Name | Data type | Restrictions | Example |
|------|-----------|--------------|---------|
| Job ID | Number | N.A. | 12345 |
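To show where the Job ID ends up, here is a sketch of the kind of request that triggers a dbt Cloud job. It assumes the dbt Cloud API v2 "trigger job run" route and `Token` authorization; the exact request Orchestra sends is not documented on this page. The helper name `build_trigger_request`, the `cause` text, and the placeholder IDs and token are hypothetical. The request is built but not sent.

```python
import json
import urllib.request


def build_trigger_request(
    account_id: int,
    job_id: int,
    api_token: str,
    cause: str = "Triggered by Orchestra",  # hypothetical cause text
    host: str = "https://cloud.getdbt.com",
) -> urllib.request.Request:
    """Build (without sending) a POST request that triggers a dbt Cloud job run."""
    # Assumption: dbt Cloud API v2 route for triggering a job run.
    url = f"{host}/api/v2/accounts/{account_id}/jobs/{job_id}/run/"
    body = json.dumps({"cause": cause}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Token {api_token}",
            "Content-Type": "application/json",
        },
    )


request = build_trigger_request(account_id=111, job_id=12345, api_token="<your-token>")
print(request.get_method(), request.full_url)
```

Passing the result to `urllib.request.urlopen` would send it. In Orchestra the Job ID is the only Task parameter, so the account and token are assumed to come from your dbt Cloud integration settings rather than from the Task.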

Error handling

API Requests

If we receive the following error codes from dbt Cloud, we'll raise an error and the task will move to a failed state.

| Code | Description | Handling |
|------|-------------|----------|
| 401 | Unauthorised | We will raise an error and parse the raw error message from the dbt response as the Orchestra message |
| 404 | Not Found | We will raise an error with the HTTP Reason as the Orchestra message |
| Other error code | | We will raise an error with the HTTP Reason as the Orchestra message |
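The handling rules above can be sketched as a small function. This is an illustration, not Orchestra's implementation: `orchestra_message`, `check_response` and `DbtCloudRequestError` are hypothetical names, and the sketch assumes the raw error message sits under `status.user_message` in the dbt response body, falling back to the raw body if it does not.

```python
import json


class DbtCloudRequestError(Exception):
    """Hypothetical error type; raising it stands in for the task moving to a failed state."""


def orchestra_message(status_code: int, reason: str, body: str) -> str:
    """Choose the message text according to the error handling table."""
    if status_code == 401:
        # 401: use the raw error message from the dbt response.
        # Assumption: it is found under status.user_message.
        try:
            return json.loads(body)["status"]["user_message"]
        except (ValueError, KeyError, TypeError):
            return body
    # 404 and any other error code: use the HTTP Reason.
    return reason


def check_response(status_code: int, reason: str, body: str) -> None:
    """Raise for error codes; return silently otherwise."""
    if status_code >= 400:
        raise DbtCloudRequestError(orchestra_message(status_code, reason, body))


print(orchestra_message(404, "Not Found", "{}"))  # Not Found
```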

Failed models or tests

The job response from dbt Cloud looks like this:

{
    "status": {
        "code": 200,
        "is_success": true,
        "user_message": "Success!",
        "developer_message": ""
    },
    "data": {
        "id": 1234567,
        "trigger_id": 123456,
        "account_id": 12345,
        "environment_id": 12345,
        "project_id": 123456,
        "job_definition_id": 124254,
        "status": 10,
        "dbt_version": "1.0",
        "git_branch": "main",
        "git_sha": "4e041f69021cbddjhfjkdha4f78d50bb961054dd73",
        "status_message": null,
        "owner_thread_id": null,
        "executed_by_thread_id": "scheduler-run-0-7dfjdjnf-tp2sr",
        "deferring_run_id": null,
        "artifacts_saved": true,
        "artifact_s3_path": "env_name/runs/123456/artifacts/target",
        "has_docs_generated": false,
        "has_sources_generated": false,
        "notifications_sent": true,
        "blocked_by": [],
        "scribe_enabled": true,
        "created_at": "2023-07-14 11:49:37.958419+00:00",
        "updated_at": "2023-07-14 11:50:15.464103+00:00",
        "dequeued_at": "2023-07-14 11:49:38.034838+00:00",
        "started_at": "2023-07-14 11:49:42.666082+00:00",
        "finished_at": "2023-07-14 11:50:15.324682+00:00",
        "last_checked_at": "2023-07-14 11:50:15.373394+00:00",
        "last_heartbeat_at": "2023-07-14 11:50:12.653844+00:00",
        "should_start_at": null,
        "trigger": null,
        "job": null,
        "environment": null,
        "run_steps": [],
        "status_humanized": "Success",
        "in_progress": false,
        "is_complete": true,
        "is_success": true,
        "is_error": false,
        "is_cancelled": false,
        "duration": "00:00:37",
        "queued_duration": "00:00:04",
        "run_duration": "00:00:32",
        "duration_humanized": "37 seconds",
        "queued_duration_humanized": "4 seconds",
        "run_duration_humanized": "32 seconds",
        "created_at_humanized": "3 months, 3 weeks ago",
        "finished_at_humanized": "3 months, 3 weeks ago",
        "retrying_run_id": null,
        "can_retry": false,
        "retry_not_supported_reason": null,
        "job_id": 123456,
        "is_running": null,
        "href": "https://cloud.getdbt.com/deploy/account_id/projects/project_id/runs/run_id/"
    }
}

This response describes the run as a whole rather than individual models or tests, so the Orchestra message field contains limited information. Detailed run information is available in the Lineage tab.
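As an illustration of what can be read from this response, the sketch below pulls out the run-level fields. The helper name `summarise_run` is hypothetical, and the sample input is a trimmed copy of the response shown above. Note that nothing in it identifies which model or test failed.

```python
def summarise_run(response: dict) -> dict:
    """Extract run-level outcome fields from a dbt Cloud run response."""
    data = response["data"]
    return {
        "run_id": data["id"],
        "job_id": data["job_id"],
        "state": data["status_humanized"],
        "complete": data["is_complete"],
        "succeeded": data["is_success"],
        "errored": data["is_error"],
        "cancelled": data["is_cancelled"],
        "duration": data["duration"],
    }


# Trimmed copy of the example response above.
sample = {
    "data": {
        "id": 1234567,
        "job_id": 123456,
        "status_humanized": "Success",
        "is_complete": True,
        "is_success": True,
        "is_error": False,
        "is_cancelled": False,
        "duration": "00:00:37",
    }
}

summary = summarise_run(sample)
print(summary["state"], summary["duration"])  # Success 00:00:37
```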

Retrying from the point of failure

dbt Cloud supports retrying a job from the point of failure.

In this case, Orchestra will make a request to this endpoint.

This can be done in the Orchestra UI:

  1. Navigate to "View All Runs"

  2. Select a run by clicking on it

  3. Select "Re-run pipeline", then select "Re-run from Failed" to retry from the point of failure
