ECS Run task
Details how to use Orchestra to run a task in ECS
Description
This job triggers an ECS standalone task.
Use Cases
We recommend creating an ECS run task for each standalone task you wish to run in ECS.
This way, you can use Orchestra to trigger your script on a cron or event-based schedule. This has a number of advantages:
You can co-ordinate tasks outside of AWS
A common use case for this is to trigger a crawl job that runs in ECS and outputs data to S3. Once this job has completed, Orchestra can trigger an ingestion job to move the data to Snowflake, and you can then use a dbt job to transform that data.
Another use case is to run dbt core jobs using ECS containers. With Orchestra you can run your ingestion task and then trigger your dbt core job in ECS.
You can use Orchestra to trigger jobs across AWS Accounts / Environments
When AWS ECS jobs run, cost is incurred. Running these operations on a schedule you set explicitly helps ensure these costs do not get out of hand
We aggregate metadata from the AWS ECS Task in the same place as the metadata from other operations in your Pipeline
Using Orchestra to trigger ECS
To run an ECS task using Orchestra you will need:
An ECS cluster
A task definition using Fargate as the launch type and awsvpc as the network mode
A VPC with at least one subnet and a security group that allows traffic to and from the ECS containers
Parameters and setup
These parameters are required to run the ECS Run task
Name | Data type | Restrictions | Example |
---|---|---|---|
Cluster | String | N.A. | dummy-cluster |
Task definition | String | N.A. | task_def:revision |
Subnet IDs | String | Comma separated string | subnet-xxxxxxxx,subnet-yyyyyyyy,subnet-zzzzzzzz |
Security group IDs | String | Comma separated string | sg-xxxxxxxx,sg-yyyyyyyy,sg-zzzzzzzz |
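As an illustration of how the four parameters above relate to the underlying AWS API, here is a minimal sketch that turns them into keyword arguments for boto3's ECS `run_task` call. The function names `split_ids` and `build_run_task_request` are hypothetical, and this is not necessarily how Orchestra builds the request internally; it only shows where each parameter ends up for a Fargate task using the awsvpc network mode.

```python
from typing import Any, Dict, List


def split_ids(value: str) -> List[str]:
    """Split a comma separated string into a list of trimmed, non-empty IDs."""
    return [part.strip() for part in value.split(",") if part.strip()]


def build_run_task_request(
    cluster: str,
    task_definition: str,
    subnet_ids: str,
    security_group_ids: str,
) -> Dict[str, Any]:
    """Build keyword arguments for the boto3 ECS client's run_task call.

    Hypothetical helper: the launch type is fixed to Fargate and the
    network settings go under awsvpcConfiguration.
    """
    subnets = split_ids(subnet_ids)
    if not subnets:
        raise ValueError("at least one subnet ID is required")
    return {
        "cluster": cluster,
        "taskDefinition": task_definition,  # e.g. "task_def:revision"
        "launchType": "FARGATE",
        "networkConfiguration": {
            "awsvpcConfiguration": {
                "subnets": subnets,
                "securityGroups": split_ids(security_group_ids),
            }
        },
    }
```

With AWS credentials configured, the resulting dictionary could be passed as `boto3.client("ecs").run_task(**request)`.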
Limitations
Orchestra only supports running ECS containers on Fargate. Please contact us if you need to run using EC2 or external machines.
Orchestra only supports running ECS containers using the awsvpc network mode. Please contact us if you need to use other network modes.
dbt Core
A common use case is to use ECS standalone tasks to run dbt Core projects. Some additional configuration is required to set this up correctly within Orchestra:
Additional permissions are required for the user provided to Orchestra to execute the AWS commands. Details can be found here.
Upload artifacts to S3 ready for Orchestra to download and build operations from. Details can be found here.
In Orchestra, for the ECS task, select Collect additional metadata and configure the following parameters (required unless marked optional):
S3 bucket name: the name of the S3 bucket where the artifacts will be stored
S3 key prefix: the key prefix for the above S3 bucket where Orchestra should download the artifacts from
(optional) Warehouse identifier: for Orchestra to generate more accurate metadata once the task is complete, you can include the warehouse identifier that dbt runs on. For example, your Snowflake Account ID
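To make the relationship between the S3 bucket name and the S3 key prefix concrete, the sketch below computes the object key each artifact would have under a given prefix. The helper name `artifact_keys` is hypothetical, and the file names used in the example (`manifest.json` and `run_results.json`, which dbt writes to its `target/` directory) are an assumption; the page linked above is the authority on which artifacts Orchestra expects.

```python
import posixpath
from typing import Dict, Iterable


def artifact_keys(key_prefix: str, filenames: Iterable[str]) -> Dict[str, str]:
    """Map each artifact file name to its S3 object key under key_prefix.

    Hypothetical helper: leading and trailing slashes on the prefix are
    ignored so that "dbt/prod" and "dbt/prod/" give the same keys.
    """
    prefix = key_prefix.strip("/")
    return {
        name: posixpath.join(prefix, name) if prefix else name
        for name in filenames
    }
```

For example, `artifact_keys("dbt/prod/", ["manifest.json", "run_results.json"])` gives the keys `dbt/prod/manifest.json` and `dbt/prod/run_results.json`. At the end of the dbt run inside the container, each file could then be uploaded with `boto3.client("s3").upload_file(f"target/{name}", bucket_name, key)`.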
Error Handling
Orchestra uses the output from the ECS DescribeTasks API to determine the status of your task. When a task enters the 'deprovisioning' state, Orchestra parses all the containers in the task and checks their exit codes. If any container has a non-zero exit code, Orchestra fails the task and alerts you. If all containers have a zero exit code, the task moves to succeeded and the next step in your pipeline is triggered.
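The exit-code check described above can be sketched as follows. This is an illustration rather than Orchestra's actual implementation: the function name `evaluate_task` and the returned status strings are hypothetical. It takes one entry from the `tasks` list of a DescribeTasks response. Treating `STOPPED` the same as `DEPROVISIONING`, and treating a container with no exit code as failed, are both assumptions made for this sketch.

```python
from typing import Any, Dict, List, Tuple

# Task states in which containers have finished (STOPPED is an assumption;
# the text above only mentions the deprovisioning state).
FINISHED_STATES = ("DEPROVISIONING", "STOPPED")


def evaluate_task(task: Dict[str, Any]) -> Tuple[str, List[str]]:
    """Return (status, failed_container_names) for one DescribeTasks task."""
    if task.get("lastStatus") not in FINISHED_STATES:
        return "running", []
    failed = [
        container.get("name", "<unnamed>")
        for container in task.get("containers", [])
        # Assumption: a missing exitCode counts as a failure.
        if container.get("exitCode") != 0
    ]
    if failed:
        return "failed", failed
    return "succeeded", []
```

With boto3, the input would come from `boto3.client("ecs").describe_tasks(cluster=cluster, tasks=[task_arn])["tasks"][0]`.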