How to Call the dbt Cloud API From an ETL Pipeline and Wait for Job Completion

To call the dbt Cloud API from an ETL pipeline and wait for job completion, trigger the job with a POST request to dbt Cloud's run endpoint and capture the run ID from the response. Then poll the run status endpoint in a loop until the run reaches a terminal status (success, error, or cancelled), and use that result to decide whether the next pipeline step should proceed.

This guide is for data engineers who use dbt Cloud for transformations and need to trigger dbt jobs from within an ETL orchestration layer, then gate downstream steps on the result. After reading, you will be able to build a workflow that triggers a dbt Cloud job, waits for it to finish, and routes downstream steps based on whether the job succeeded or failed.

The core pattern is three steps: POST to the trigger endpoint, poll the status endpoint until the job exits a running state, then branch on the final status. A dbt run that completes successfully means downstream ETL steps can proceed against fresh model output. A run that errors or is cancelled means the downstream steps must be held until the problem is resolved.

The Problem

dbt Cloud jobs are long-running. A full model refresh can take 5 to 60 minutes depending on model complexity, warehouse size, and data volume. You cannot fire the trigger API call and immediately proceed as if the job is finished.

Teams that do not handle this correctly fall into two patterns. The first adds an arbitrary sleep delay, choosing a number of minutes that feels safe, then moves on. This either wastes time when the job finishes early, or breaks when the job runs longer than expected. The second pattern skips waiting entirely and runs downstream steps on stale, pre-transformation data, producing reports that reflect the state of the warehouse before the dbt models ran.

What is actually needed is a polling loop: a mechanism that checks run status on a regular interval and only proceeds when dbt Cloud returns a finished state, whether that state is success, error, or cancelled. To learn how Integrate.io can help automate your ETL pipelines and dbt transformations, reach out to our team to discuss your use case with a sales engineer.

What You'll Need

A dbt Cloud account with at least one configured job and a known job ID (visible in dbt Cloud under Deploy > Jobs)
A dbt Cloud API token, created under Account Settings > API Tokens; a service account token with the "Developer" role is sufficient
Your dbt Cloud account ID, visible in the URL when logged in: cloud.getdbt.com/accounts/YOUR_ACCOUNT_ID/
An ETL orchestration tool that supports REST API calls and polling within a workflow (Integrate.io covers both natively)
The downstream ETL steps that should run only after the dbt job succeeds, already defined in your pipeline

How to Call the dbt Cloud API From an ETL Pipeline and Wait for Job Completion: Step-by-Step

Step 1: Locate Your dbt Cloud Account ID, Job ID, and Generate an API Token

Before you can trigger a dbt Cloud job from an external system, you need three values: your account ID, the job ID for the specific job you want to trigger, and an API token that authorizes the request.

What to do:

Log in to dbt Cloud and note your account ID, which appears in the URL immediately after /accounts/
Navigate to Deploy > Jobs, select the job you want to trigger, and look at the URL; the job ID appears after /jobs/
Go to Account Settings > API Tokens and create a new service account token; assign it at least the "Developer" role
Copy the token value immediately; dbt Cloud shows it only once and you cannot retrieve it later
Store the token in your ETL tool's secrets manager as an encrypted secret, not in plain text in the workflow config

Output of this step: Three values ready for use: account ID, job ID, and an API token stored as an encrypted secret.
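One way to keep these three values out of the workflow definition is to read them from environment variables that your secrets manager populates at runtime. The sketch below assumes that setup; the variable names (DBT_CLOUD_ACCOUNT_ID, DBT_CLOUD_JOB_ID, DBT_CLOUD_API_TOKEN) and the DbtCloudConfig class are hypothetical, not part of dbt Cloud.

```python
import os
from dataclasses import dataclass, field

# Hypothetical variable names; use whatever your secrets manager exposes.
REQUIRED_KEYS = ("DBT_CLOUD_ACCOUNT_ID", "DBT_CLOUD_JOB_ID", "DBT_CLOUD_API_TOKEN")


@dataclass(frozen=True)
class DbtCloudConfig:
    account_id: int
    job_id: int
    # repr=False keeps the token out of logs if the config object is printed.
    api_token: str = field(repr=False)


def load_config(env=os.environ) -> DbtCloudConfig:
    """Read the account ID, job ID, and API token from the environment."""
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise RuntimeError(f"Missing required settings: {', '.join(missing)}")
    return DbtCloudConfig(
        account_id=int(env["DBT_CLOUD_ACCOUNT_ID"]),
        job_id=int(env["DBT_CLOUD_JOB_ID"]),
        api_token=env["DBT_CLOUD_API_TOKEN"],
    )
```

Failing fast on a missing value means a misconfigured pipeline stops before it sends any request to dbt Cloud.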

Step 2: Trigger the dbt Cloud Job via a POST Request

With the account ID, job ID, and token in hand, you can fire the trigger request. This POST call starts the dbt Cloud job and returns a run ID that you will use in every subsequent status check.

What to do:

Construct the POST request URL using this format: https://cloud.getdbt.com/api/v2/accounts/{account_id}/jobs/{job_id}/run/
Set the Authorization header to: Token YOUR_API_TOKEN
Set the Content-Type header to: application/json
Send a request body containing at minimum: {"cause": "Triggered by ETL pipeline"}
Optionally include steps_override, a list of dbt commands, in the body if you want this trigger to run different commands (for example, only specific dbt models) rather than the steps in the full job definition
Parse the response body and extract the run ID from the data.id field; store it as a pipeline variable

Output of this step: A triggered dbt Cloud job and a run ID stored as a pipeline variable, ready to use in the polling step.
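For pipelines that make this call from code, the request described above can be sketched with Python's standard library. The function names (build_trigger_request, extract_run_id, trigger_job) and the 30-second network timeout are illustrative choices; the URL, headers, body, and data.id field come from the steps in this section.

```python
import json
import urllib.request

BASE_URL = "https://cloud.getdbt.com/api/v2"


def build_trigger_request(account_id, job_id, api_token, cause="Triggered by ETL pipeline"):
    """Build the POST request that starts a run of the given job."""
    url = f"{BASE_URL}/accounts/{account_id}/jobs/{job_id}/run/"
    body = json.dumps({"cause": cause}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Token {api_token}",
            "Content-Type": "application/json",
        },
    )


def extract_run_id(response_body):
    """Pull the run ID out of the data.id field of the trigger response."""
    return json.loads(response_body)["data"]["id"]


def trigger_job(account_id, job_id, api_token, opener=urllib.request.urlopen):
    """Send the trigger request and return the run ID.

    The opener parameter exists so the network call can be replaced in tests.
    """
    request = build_trigger_request(account_id, job_id, api_token)
    with opener(request, timeout=30) as response:
        return extract_run_id(response.read())
```

Calling trigger_job(account_id, job_id, api_token) returns the run ID to store as the pipeline variable. With the default opener, an HTTP error response raises urllib.error.HTTPError, so a rejected token or wrong job ID surfaces at this step rather than during polling.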

Where Integrate.io helps: Integrate.io's REST API task component in the workflow canvas lets you configure the POST URL, headers, and body directly through the UI. The response fields, including data.id, are automatically available as named variables that downstream workflow steps can reference without any parsing code.

Step 3: Poll the Run Status Endpoint Until the Job Completes

dbt Cloud does not push a completion event to your pipeline. You need to ask for the run status on a regular interval until the job exits the running state. This is the polling loop that replaces both the arbitrary sleep and the fire-and-forget approach.

What to do:

Construct the GET request URL: https://cloud.getdbt.com/api/v2/accounts/{account_id}/runs/{run_id}/
Use the same Authorization header as the trigger step
Read the status_humanized field from the response; possible values include "Queued", "Starting", "Running", "Success", "Error", and "Cancelled"
Set a polling interval of 30 to 60 seconds; most dbt jobs do not change state faster than this, and polling more frequently adds unnecessary API load
Continue polling until status_humanized is one of the terminal values: "Success", "Error", or "Cancelled"; checking only for not "Running" and not "Queued" would end the loop early on any other in-progress value, such as "Starting"
Record the final status value as a pipeline variable for the branching step

Output of this step: A confirmed final run status ("Success", "Error", or "Cancelled") from the dbt Cloud run status endpoint.
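If you do write the loop yourself, a minimal sketch looks like the following. It assumes the run record sits under the data key of the response, as data.id does in the trigger response; fetch_run and wait_for_run are hypothetical names. The loop treats only "Success", "Error", and "Cancelled" as terminal, so any other value keeps it waiting. It deliberately has no timeout, which Step 6 covers.

```python
import json
import time
import urllib.request

# Statuses after which the run will not change again.
TERMINAL_STATUSES = {"Success", "Error", "Cancelled"}


def fetch_run(account_id, run_id, api_token, opener=urllib.request.urlopen):
    """GET the run record and return the object under the response's data key."""
    url = f"https://cloud.getdbt.com/api/v2/accounts/{account_id}/runs/{run_id}/"
    request = urllib.request.Request(url, headers={"Authorization": f"Token {api_token}"})
    with opener(request, timeout=30) as response:
        return json.loads(response.read())["data"]


def wait_for_run(get_run, interval_seconds=30, sleep=time.sleep):
    """Call get_run() repeatedly until the run reaches a terminal status.

    Returns the final run record so later steps can read other fields from it.
    """
    while True:
        run = get_run()
        if run["status_humanized"] in TERMINAL_STATUSES:
            return run
        sleep(interval_seconds)
```

A typical call is wait_for_run(lambda: fetch_run(account_id, run_id, api_token)). Passing the fetch function in as an argument keeps the loop testable without network access.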

Where Integrate.io helps: Integrate.io's workflow supports a polling task type that sends a GET request on a configured interval, checks a response field against a target value, and repeats until the condition is met or a timeout is reached. This eliminates the need for a custom Python script to manage the polling loop; the interval, timeout, and completion condition are set through the workflow configuration.

Step 4: Branch the Workflow Based on the dbt Job Result

Once the polling loop exits with a final status, the workflow needs to decide what happens next. Running downstream steps regardless of the dbt job outcome is a common error in this pattern.

What to do:

Read the final status variable captured in Step 3
If the status is "Success", route the workflow to the downstream ETL steps that depend on the dbt models being current
If the status is "Error" or "Cancelled", route the workflow to a failure branch that sends an alert containing the run ID and the status value, then halts execution
Do not proceed to downstream steps on a "Cancelled" status; a cancelled run means the dbt models are not refreshed, and downstream consumers will read stale data
Include the run ID in every failure alert so the dbt Cloud run can be looked up directly

Output of this step: A workflow that routes to the correct downstream path based on dbt job outcome, with failure alerts containing the run ID.
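In code, the routing rule above reduces to a single comparison. The sketch below is illustrative: route_on_status and the branch labels "downstream" and "failure" are hypothetical names, and sending the alert is left to your orchestration tool.

```python
def route_on_status(final_status, run_id):
    """Return (branch, message) for the final dbt run status.

    Only "Success" proceeds. "Error", "Cancelled", and any unexpected
    value all go to the failure branch.
    """
    if final_status == "Success":
        return "downstream", f"dbt run {run_id} succeeded; proceeding to downstream steps"
    return "failure", f"dbt run {run_id} ended with status {final_status!r}; downstream steps halted"
```

Testing for "Success" rather than listing the failure values means a status the workflow has never seen before fails safe instead of slipping through to downstream steps.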

Where Integrate.io helps: Integrate.io's workflow branching component evaluates the status variable from the polling step and routes execution to the success path or the failure path with no custom logic required. Alert tasks in the failure branch can include the run ID and status as dynamic fields.

Step 5: Pass the dbt Run Timestamp to Downstream Steps

The dbt Cloud run response includes a finished_at timestamp. Capturing this value and passing it downstream gives subsequent ETL steps a reliable reference point for incremental load filtering and audit logging.

What to do:

During the polling step, when the job reaches a final state, extract the finished_at field from the run status response
Store this timestamp as a named pipeline variable
Pass it to downstream ETL steps as a reference value; for example, a Snowflake load step that runs after the dbt job can use this timestamp to confirm it is reading post-transformation data
Log the run ID and the finished_at timestamp in your pipeline's job history so you can audit which dbt run produced the data your downstream steps consumed
If your downstream steps use incremental filters based on a watermark, the finished_at timestamp is the appropriate watermark to use for this cycle

Output of this step: A pipeline variable containing the dbt run completion timestamp, available to all downstream workflow steps for incremental filtering and audit records.
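Before passing the timestamp along, it can help to normalize it into a timezone-aware value. The sketch below assumes finished_at arrives as an ISO 8601-style string; verify that against an actual response from your account. The sample value in the usage note, the function names, and the decision to treat a timestamp without an offset as UTC are all assumptions.

```python
from datetime import datetime, timezone


def parse_finished_at(run):
    """Convert the run's finished_at string to a timezone-aware UTC datetime."""
    raw = run.get("finished_at")
    if not raw:
        raise ValueError("Run has no finished_at value; it may not be complete yet")
    # Older Python versions do not accept a trailing "Z" in fromisoformat.
    parsed = datetime.fromisoformat(raw.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        # Assumption: a timestamp without an offset is treated as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)


def audit_record(run):
    """Build the entry to log in the pipeline's job history."""
    return {
        "dbt_run_id": run["id"],
        "dbt_finished_at": parse_finished_at(run).isoformat(),
    }
```

For a run record such as {"id": 42, "finished_at": "2024-01-15 08:30:00.123456+00:00"}, audit_record returns the run ID alongside the timestamp in a single consistent format that downstream filters can compare against.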

Step 6: Handle Edge Cases: Timeouts and Cancelled Runs

dbt runs occasionally queue for extended periods before starting. Without a polling timeout, a stalled run in "Queued" or "Running" state will hold your workflow open indefinitely.

What to do:

Set a maximum polling duration in your orchestration tool; two hours is a reasonable ceiling for most dbt jobs, but adjust based on your longest expected run time
If the polling loop reaches the maximum duration without seeing a terminal status, treat the run as failed and fire the failure branch with a "polling timeout" message and the run ID
Handle "Cancelled" status the same as "Error" at every point in the workflow; a cancellation means the models are not fresh
If a run was cancelled by a user action in dbt Cloud, surface this to the team as a pipeline failure so the situation is investigated before downstream steps consume stale data
Document the timeout value in your team's runbook so engineers know where to adjust it if dbt job runtimes grow

Output of this step: A workflow with a polling timeout configured, "Cancelled" handled as a failure condition, and the run ID included in all failure notifications.
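For a hand-written polling loop, the timeout can be enforced with a monotonic clock, as in the sketch below. The function names and the "Polling timeout" label are hypothetical; the two-hour default mirrors the ceiling suggested above.

```python
import time

TERMINAL_STATUSES = ("Success", "Error", "Cancelled")


def wait_with_timeout(get_status, max_seconds=7200, interval_seconds=30,
                      clock=time.monotonic, sleep=time.sleep):
    """Poll get_status() until a terminal status or until max_seconds elapse.

    Returns the terminal status, or "Polling timeout" if the next poll
    would fall after the deadline.
    """
    deadline = clock() + max_seconds
    while True:
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        if clock() + interval_seconds > deadline:
            return "Polling timeout"
        sleep(interval_seconds)


def is_failure(status):
    """Treat everything except "Success" as a failure, including timeouts."""
    return status != "Success"
```

A monotonic clock is used instead of wall-clock time so that system clock adjustments during a long run cannot shorten or extend the timeout.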

Step 7: Schedule the Full Workflow and Remove Any Separate dbt Schedules

Once the ETL orchestration tool owns the dbt job trigger, the native dbt Cloud job schedule for that job should be disabled. Running both creates duplicate job executions and potential conflicts.

What to do:

Set the schedule for the full ETL-plus-dbt workflow in your orchestration tool at the frequency your downstream reporting requires
Log into dbt Cloud, navigate to Deploy > Jobs, select the job, and disable the job's native schedule
Verify the schedule is disabled by checking the job's Next Run field in dbt Cloud; it should show no upcoming run
Add a note in the job description in dbt Cloud indicating it is triggered by your ETL orchestration tool, so other team members do not re-enable the schedule
Document in your team's runbook that this job is owned by the ETL workflow and must not have its dbt Cloud schedule re-enabled

Output of this step: One unified schedule in the ETL orchestration tool, with the dbt Cloud job's native schedule disabled and the change documented.

Common Mistakes to Avoid

Using a fixed sleep delay instead of polling: A 10-minute sleep works today when the dbt job takes 8 minutes; it breaks next month when the model grows and the job takes 15 minutes. Always poll for the actual completion status rather than estimating a safe duration.
Not capturing the run ID from the trigger response: Without the run ID, you cannot poll the correct run's status, especially when the same job has multiple runs in the queue. Extract and store data.id immediately after the trigger POST, before any other step runs.
Proceeding downstream on a "Cancelled" status: A cancelled run means the dbt models did not finish refreshing. Treating "Cancelled" as a non-failure and running downstream steps means those steps consume pre-transformation data. Route "Cancelled" to the same failure branch as "Error".
Running both an ETL workflow schedule and a dbt Cloud job schedule for the same job: This runs the job twice per cycle, doubles compute cost, and can cause conflicts when the two runs overlap. Disable the dbt Cloud schedule as soon as the ETL workflow takes over scheduling authority.
Not setting a polling timeout: If dbt Cloud experiences an issue and a run stays in "Running" state past any reasonable duration, an untimed polling loop will hold your workflow open until it is manually killed. Set a maximum polling duration and fire a timeout alert if it is exceeded.
Hardcoding the account ID and job ID in the workflow definition: These values change when jobs are cloned, migrated, or when the team moves to a different dbt Cloud account. Store them as pipeline variables or secrets so they can be updated without editing the workflow definition directly.

Conclusion

Calling the dbt Cloud API from an ETL pipeline and waiting for job completion requires four things done in sequence: triggering the run via POST, capturing the run ID, polling the status endpoint until the job exits the running state, and branching on the final status before any downstream step executes.

The full pattern covered here starts with locating the account ID, job ID, and API token; continues through the trigger POST and polling loop; passes the finished_at timestamp downstream for incremental filtering; and ends with consolidating the schedule in the orchestration tool so dbt Cloud's native scheduler is no longer involved.

Integrate.io handles the dbt Cloud trigger, the polling loop, and the success/failure branching through its REST API task and workflow branching components, removing the need for a custom Python script to manage the poll interval, timeout, and state evaluation. Once this pattern is in place for one dbt job, the same workflow structure extends to any additional dbt jobs the team adds, with each job getting its own trigger-poll-branch sequence in the orchestration graph.
