Ingest and Catalog Data from Teradata Vantage to Amazon S3 with AWS Glue Scripts
Overview
This quickstart details the process of ingesting and cataloging data from Teradata Vantage to Amazon S3 with AWS Glue.
Tip
For ingesting data to Amazon S3 when cataloging is not a requirement, consider Teradata Write NOS capabilities.
Prerequisites
- Access to an Amazon AWS account
- Access to a Teradata Vantage instance
Note
If you need a test instance of Vantage, you can provision one for free at https://clearscape.teradata.com
- A database client to send queries for loading the test data
Loading of test data
- In your favorite database client run the following queries
Amazon AWS setup
In this section, we will cover in detail each of the steps below:
- Create an Amazon S3 bucket to ingest data
- Create an AWS Glue Catalog Database for storing metadata
- Store Teradata Vantage credentials in AWS Secrets Manager
- Create an AWS Glue Service Role to assign to ETL jobs
- Create a connection to a Teradata Vantage Instance in AWS Glue
- Create an AWS Glue Job
- Draft a script for automated ingestion and cataloging of Teradata Vantage data into Amazon S3
Create an Amazon S3 Bucket to Ingest Data
- In Amazon S3, select Create bucket.
- Assign a name to your bucket and take note of it.
- Leave all settings at their default values.
- Click on Create bucket.
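The console steps above can also be scripted. The following is a minimal sketch, assuming the boto3 SDK is installed and AWS credentials are configured; the function names, the bucket name, and the Region are hypothetical placeholders. Outside us-east-1, S3 expects an explicit location constraint, which the helper adds.

```python
def bucket_request(bucket_name, region="us-east-1"):
    """Build the keyword arguments for the S3 create_bucket call."""
    params = {"Bucket": bucket_name}
    # us-east-1 is the default location and is not passed as a constraint.
    if region != "us-east-1":
        params["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return params


def create_bucket(bucket_name, region="us-east-1"):
    """Create the bucket. Requires boto3 and valid AWS credentials."""
    import boto3  # imported lazily so bucket_request works without the SDK

    s3 = boto3.client("s3", region_name=region)
    return s3.create_bucket(**bucket_request(bucket_name, region))
```

Calling create_bucket("my-ingest-bucket", "eu-west-1") would then create the bucket, provided the (hypothetical) name is globally unique. All other settings stay at their defaults, as in the console steps.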
Create an AWS Glue Catalog Database for Storing Metadata
- In AWS Glue, select Data catalog, Databases.
- Click on Add database.
- Define a database name and click on Create database.
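As a scripted alternative to the console steps above, the database could be created through the Glue API. This is a sketch under the same assumptions (boto3 installed, credentials configured); the function names and the optional description are illustrative and not part of this guide.

```python
def database_request(database_name, description=""):
    """Build the keyword arguments for the Glue create_database call."""
    database_input = {"Name": database_name}
    if description:
        database_input["Description"] = description
    return {"DatabaseInput": database_input}


def create_catalog_database(database_name, description=""):
    """Create the Glue Data Catalog database. Requires boto3 and credentials."""
    import boto3  # imported lazily so database_request works without the SDK

    glue = boto3.client("glue")
    return glue.create_database(**database_request(database_name, description))
```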
Store Teradata Vantage credentials in AWS Secrets Manager
- In AWS Secrets Manager, select Create new secret.
- The secret should be an Other type of secret with the following keys and values according to your Teradata Vantage Instance:
  - USER
  - PASSWORD
Tip
In the case of ClearScape Analytics Experience, the user is always "demo_user", and the password is the one you defined when creating your ClearScape Analytics Experience environment.
- Assign a name to the secret.
- The rest of the steps can be left with the default values.
- Create the secret.
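The secret described above is a JSON object with the keys USER and PASSWORD. A minimal sketch of creating it with boto3 follows, assuming the SDK is installed and credentials are configured; the function names and the secret name are hypothetical, and in practice the password should come from a secure prompt or environment variable rather than source code.

```python
import json


def secret_request(secret_name, user, password):
    """Build the keyword arguments for the Secrets Manager create_secret call."""
    return {
        "Name": secret_name,
        # Key names follow the guide: USER and PASSWORD.
        "SecretString": json.dumps({"USER": user, "PASSWORD": password}),
    }


def create_vantage_secret(secret_name, user, password):
    """Store the credentials. Requires boto3 and valid AWS credentials."""
    import boto3  # imported lazily so secret_request works without the SDK

    client = boto3.client("secretsmanager")
    return client.create_secret(**secret_request(secret_name, user, password))
```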
Create an AWS Glue Service Role to Assign to ETL Jobs
The role you create needs the standard permissions of a Glue service role, plus read access to the secret and the S3 bucket you've created.
- In AWS, go to the IAM service.
- Under Access Management, select Roles.
- In roles, click on Create role.
- In select trusted entity, select AWS service and pick Glue from the dropdown.
- In add permissions:
  - Search for AWSGlueServiceRole.
  - Click the related checkbox.
  - Search for SecretsManagerReadWrite.
  - Click the related checkbox.
- In Name, review, and create:
  - Define a name for your role.
  - Click on Create role.
- Return to Access Management, Roles, and search for the role you've just created.
- Select your role.
- Click on Add permissions, then Create inline policy.
- Click on JSON.
- In the Policy editor, paste the JSON object below, substituting the name of the bucket you've created.
- Click Next.
- Assign a name to your policy.
- Click on Create policy.
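The role setup above could be scripted roughly as follows. This is a sketch, assuming boto3 and configured credentials; the function names and the default policy name are hypothetical. The two managed policy ARNs correspond to the AWSGlueServiceRole and SecretsManagerReadWrite policies selected in the console. The inline policy shown is an assumption: a bucket-scoped set of S3 actions that may differ from the policy this guide intends.

```python
import json

GLUE_SERVICE_POLICY_ARN = "arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole"
SECRETS_POLICY_ARN = "arn:aws:iam::aws:policy/SecretsManagerReadWrite"


def glue_trust_policy():
    """Trust policy that lets the AWS Glue service assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "glue.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def bucket_inline_policy(bucket_name):
    """Assumed inline policy: list the bucket, read and write its objects."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket_name}"],
            },
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
                "Resource": [f"arn:aws:s3:::{bucket_name}/*"],
            },
        ],
    }


def create_glue_role(role_name, bucket_name, policy_name="glue-bucket-access"):
    """Create the role and attach policies. Requires boto3 and credentials."""
    import boto3  # imported lazily so the policy builders work without the SDK

    iam = boto3.client("iam")
    iam.create_role(
        RoleName=role_name,
        AssumeRolePolicyDocument=json.dumps(glue_trust_policy()),
    )
    for arn in (GLUE_SERVICE_POLICY_ARN, SECRETS_POLICY_ARN):
        iam.attach_role_policy(RoleName=role_name, PolicyArn=arn)
    iam.put_role_policy(
        RoleName=role_name,
        PolicyName=policy_name,
        PolicyDocument=json.dumps(bucket_inline_policy(bucket_name)),
    )
```

Printing json.dumps(bucket_inline_policy("my-ingest-bucket"), indent=2) yields a JSON document in the shape the Policy editor accepts, with the bucket name already substituted.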