🔗Connections

How does Orchestra connect to Integrations?

Every integration requires a Connection. A connection is a configuration for connecting to another part of your stack. It's a separate object because in many instances it's reusable - for example, if you want to configure several Tasks for a specific Integration, it would be arduous to manually specify things like Account IDs, Client IDs, Bearer Tokens, and so on for each one.

Examples - when should I create a connection?

When you want to re-use a credential for an integration, you should create a connection. If you only have a single environment ("Production"), then one connection per integration would suffice. If you use dev, staging (or UAT), and production environments, you would need three.
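The rule of thumb above (one connection per integration per environment) can be sketched in a few lines of Python. This is only an illustration: the `Connection` class and the `connections_needed` helper are hypothetical names invented for this example and are not part of Orchestra's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Connection:
    """Hypothetical stand-in for a reusable connection object."""
    integration: str
    environment: str


def connections_needed(integration: str, environments: list[str]) -> list[Connection]:
    # One connection per environment in which the integration is used.
    return [Connection(integration, env) for env in environments]


# A single "Production" environment needs one connection.
single = connections_needed("snowflake", ["production"])

# Dev, staging (or UAT), and production need three.
multi = connections_needed("snowflake", ["dev", "staging", "production"])

print(len(single), len(multi))
```

As the examples below show, some integrations need fewer connections than this rule suggests, because a single credential already covers every environment.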

Examples

  • Fivetran: users tend to only use Fivetran for Production data. Users can duplicate all their Fivetran syncs for Staging and Production, but this would double the cost, so it is not advised. In Fivetran, the token is scoped to all connections, so Orchestra users typically only need to create one Fivetran connection.

  • Snowflake: Snowflake is a database with a full role-based access control system ("RBAC system"). Every user in Snowflake has their own API credential, which means that, depending on which resources a user has access to, multiple connections may be required for a Snowflake integration.

  • dbt Cloud: typically, a user in dbt Cloud has access to all projects in a given dbt Cloud account. This means that only one connection in Orchestra is necessary. When dbt Cloud implements an RBAC system, multiple connections may be required. For ease of logical separation (for example, co-ordinating different projects across the same account), it may still be more convenient to use multiple dbt Cloud connections in Orchestra.

Use cases

Environment management

Using Snowflake as an example, you may have multiple environments in Snowflake to separate your staging area from your production area. These may have different Connection Parameters, which you can see below.

In this case, it's wise to use a production-dedicated account to manipulate data in production, which ensures this environment is "locked down" and users cannot programmatically (or via the UI) update data in Snowflake. You might not want to do this for staging. This necessitates having different connections.
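As a rough sketch of what separate staging and production connections might look like, the Python below keeps one set of parameters per environment and looks them up by name. Every field name, account, user, and role here is a made-up placeholder chosen for illustration; it does not reflect Orchestra's actual connection schema or any real Snowflake account.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SnowflakeConnection:
    """Hypothetical connection parameters; field names are assumptions."""
    name: str
    account: str
    user: str
    role: str
    database: str


# One connection per environment, each with its own dedicated user and role.
CONNECTIONS = {
    "staging": SnowflakeConnection(
        name="snowflake_staging",
        account="example_account",
        user="STAGING_SVC",
        role="STAGING_RW",
        database="STAGING",
    ),
    "production": SnowflakeConnection(
        name="snowflake_production",
        account="example_account",
        user="PROD_SVC",  # production-dedicated account
        role="PROD_RW",
        database="PRODUCTION",
    ),
}


def connection_for(environment: str) -> SnowflakeConnection:
    # Fail loudly rather than silently falling back to another environment.
    try:
        return CONNECTIONS[environment]
    except KeyError:
        raise ValueError(f"No connection configured for {environment!r}") from None


print(connection_for("production").user)
```

Keeping the production user out of the staging connection means that work run against staging can never write to production by accident.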

dbt Cloud - dbt Mesh

In dbt Cloud, you might have different Cloud accounts or Projects for different teams. This is particularly helpful if you have multiple teams whose repositories feed into each other. For example, an Analytics Engineering team might take care of key Fact and Dimension tables, whereas an analyst team has their own repository for SQL that takes those Fact and Dimension tables as sources.

Triggering and monitoring jobs across these different environments can be tricky, because you might need a different Token in dbt Cloud to access different projects. Using Orchestra, you can quickly spin up different connection objects to easily trigger and monitor dbt Cloud Tasks across these environments.

These can be used in the Task Builder to chain different operations together.
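To make the chaining idea concrete, here is a minimal Python sketch in which each task names the connection it runs with and the tasks it depends on. The task names, connection names, and dictionary layout are all hypothetical and are not the Task Builder's real format; the ordering uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9+).

```python
from graphlib import TopologicalSorter

# Hypothetical tasks: each uses a different dbt Cloud connection (token).
TASKS = {
    "build_core_models": {
        "connection": "dbt_cloud_analytics_engineering",
        "depends_on": [],
    },
    "build_analyst_models": {
        "connection": "dbt_cloud_analysts",
        "depends_on": ["build_core_models"],
    },
}


def run_order(tasks: dict) -> list[str]:
    # Map each task to the set of tasks that must finish before it starts.
    graph = {name: set(spec["depends_on"]) for name, spec in tasks.items()}
    return list(TopologicalSorter(graph).static_order())


order = run_order(TASKS)
for name in order:
    print(f"{name} -> {TASKS[name]['connection']}")
```

In this sketch the Analytics Engineering project's Fact and Dimension tables are built first, and the analyst project that uses them as sources runs afterwards with its own connection.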

We'll be adding RBAC for connections soon, which will give data product owners even more fine-grained control over who can use which connections, and for what purpose.
