AWS Glue

AWS Glue is a serverless data integration service.

Type: Cloud Provider / Infrastructure

Website: https://aws.amazon.com/glue/

General docs: https://aws.amazon.com/glue/resources/

Python SDK Docs: https://github.com/awslabs/aws-glue-libs

API Docs: https://docs.aws.amazon.com/glue/latest/dg/aws-glue-api.html

Authentication

To connect AWS Glue to Orchestra, follow the same process you normally use for programmatic access in AWS: create an IAM user with the required permissions and generate an access key for it. The prerequisites and steps are listed below.

Prerequisites

To get started, you will need:

  • An AWS IAM user with the required permissions for the job you wish to run

  • An access key for the IAM user

Instructions

  1. Navigate to the IAM console in the AWS account you wish to run the job in.

  2. Attach the required permissions to the IAM user. See below for the required permissions for each job.

  3. Generate a CLI access key for the user. More information is available in the AWS docs. Once the access key is generated, you can download a CSV file containing your Access Key ID and Secret Access Key.
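As a rough sketch of how the key pair from the CSV might be used outside Orchestra, the Python below reads the two values from environment variables and builds the keyword arguments for a boto3 Glue client. It assumes boto3 is installed and that the values were exported as AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY and (optionally) AWS_DEFAULT_REGION. The helper names glue_client_kwargs and make_glue_client are hypothetical, and this does not describe how Orchestra itself stores or uses the key.

```python
import os


def glue_client_kwargs(env=None):
    """Build boto3 client keyword arguments from environment variables.

    `env` defaults to os.environ; passing a dict is handy for testing.
    """
    env = os.environ if env is None else env
    required = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))

    kwargs = {
        "aws_access_key_id": env["AWS_ACCESS_KEY_ID"],
        "aws_secret_access_key": env["AWS_SECRET_ACCESS_KEY"],
    }
    # The region is optional here; boto3 can also read it from its own config.
    region = env.get("AWS_DEFAULT_REGION")
    if region:
        kwargs["region_name"] = region
    return kwargs


def make_glue_client(env=None):
    """Create a Glue client from the access key (assumes boto3 is installed)."""
    import boto3  # imported lazily so glue_client_kwargs works without boto3

    return boto3.client("glue", **glue_client_kwargs(env))
```

Keeping the secret in environment variables rather than in source code is only one option; any secret store that keeps the key out of version control would serve the same purpose.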

Required Permissions

AWS Run Glue job: the following actions are required. The example policy allows them on all resources ("Resource": "*"); you can restrict the resource to specific jobs if desired.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "VisualEditor0",
      "Effect": "Allow",
      "Action": [
        "glue:StartJobRun",
        "glue:GetJobRun",
        "glue:BatchStopJobRun"
      ],
      "Resource": "*"
    }
  ]
}
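To illustrate how the three actions in this policy map to API calls, here is a minimal sketch using boto3. It is an assumption for illustration, not necessarily how Orchestra runs the job. The function run_glue_job and its parameters are hypothetical; start_job_run, get_job_run and batch_stop_job_run are the boto3 Glue client methods that correspond to the permitted actions.

```python
import time

# Job run states after which polling stops.
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "STOPPED", "TIMEOUT", "ERROR", "EXPIRED"}


def run_glue_job(client, job_name, poll_seconds=30, max_polls=120, sleep=time.sleep):
    """Start a Glue job run, poll until it finishes, and request a stop on timeout.

    `client` is expected to behave like boto3.client("glue").
    Returns a (run_id, state) tuple.
    """
    # Requires glue:StartJobRun
    run_id = client.start_job_run(JobName=job_name)["JobRunId"]

    for _ in range(max_polls):
        # Requires glue:GetJobRun
        run = client.get_job_run(JobName=job_name, RunId=run_id)["JobRun"]
        state = run["JobRunState"]
        if state in TERMINAL_STATES:
            return run_id, state
        sleep(poll_seconds)

    # Still running after max_polls checks: ask Glue to stop the run.
    # Requires glue:BatchStopJobRun
    client.batch_stop_job_run(JobName=job_name, JobRunIds=[run_id])
    return run_id, "STOP_REQUESTED"
```

With boto3 installed and credentials configured, a call might look like run_glue_job(boto3.client("glue"), "my-job"), where "my-job" is a placeholder job name. "STOP_REQUESTED" is a label local to this sketch, not a Glue job run state.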
