Integrate Teradata Jupyter extensions with Google Vertex AI

Note

This how-to shows you how to add Teradata Extensions to a Jupyter Notebooks environment. A hosted version of Jupyter Notebooks integrated with Teradata Extensions and analytics tools is available for functional testing for free at https://clearscape.teradata.com.

Overview

Teradata Jupyter extensions provide a Teradata SQL kernel and several UI extensions that let users access and navigate a Teradata database from a Jupyter environment. Google Vertex AI is Google Cloud's new unified ML platform. Vertex AI Workbench provides a Jupyter-based development environment for the entire data science workflow. This article describes how to integrate our Jupyter extensions with Vertex AI Workbench so that Vertex AI users can take advantage of the Teradata extensions in their ML pipelines.

Vertex AI Workbench supports two types of notebooks: managed notebooks and user-managed notebooks. Here we focus on user-managed notebooks. We show two ways to integrate our Jupyter extensions with user-managed notebooks: use a startup script to install our kernel and extensions, or use a custom container.

Prerequisites

  • Access to a Teradata Vantage instance
    Note

    If you need a test instance of Vantage, you can provision one for free at https://clearscape.teradata.com

  • Google Cloud account with Vertex AI enabled
  • Google Cloud Storage bucket to store the startup script and the Teradata Jupyter extensions package

Integration

There are two ways to run Teradata Jupyter Extensions in Vertex AI:

  1. Use startup script
  2. Use custom container

These two integration methods are described below.

Use startup script

When we create a new notebook instance, we can specify a startup script. This script runs only once after the instance is created. Here are the steps:

  1. Download Teradata Jupyter extensions package

Go to the Vantage Modules for Jupyter page and download the Linux version of the Teradata Jupyter extensions package bundle.

  2. Upload the package to a Google Cloud Storage bucket
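
As a rough sketch, the upload can be done from a terminal with `gsutil cp`. The bucket URI and the archive file name below are placeholders, not values from this guide. The helper only prints the command unless `DRY_RUN=0` is set.

```shell
#!/bin/bash
# Sketch: copy a local file into a Cloud Storage bucket with gsutil.
# BUCKET_URI and the archive name are hypothetical placeholders.
BUCKET_URI="${BUCKET_URI:-gs://your-bucket/teradata-jupyter}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the command, 0 = execute it

upload_to_bucket() {
  local file="$1"
  if [ "$DRY_RUN" = "1" ]; then
    echo "gsutil cp $file $BUCKET_URI/"
  else
    gsutil cp "$file" "$BUCKET_URI/"
  fi
}

upload_to_bucket "teradata-jupyter-extensions-linux.zip"   # placeholder file name
```

The same helper can be reused to upload the startup script once it is written.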

  3. Write a startup script and upload it to the Cloud Storage bucket

The script should fetch the Teradata Jupyter extensions package from the Cloud Storage bucket and install the Teradata SQL kernel and extensions.
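
A minimal sketch of such a script follows. It is an illustration under assumptions, not the official script: the bucket URI, the working directory, the `/opt/conda` prefix, and the file names inside the bundle (`teradatakernel`, `teradatasql`, `*.whl`) are all assumed, so check them against your download. By default the sketch only prints the commands it would run.

```shell
#!/bin/bash
# Sketch of a post-startup script for a user-managed notebook.
# Assumptions (not from this guide): the bucket URI, the working directory,
# the /opt/conda prefix, and the file names inside the extensions bundle.
BUCKET_URI="${BUCKET_URI:-gs://your-bucket/teradata-jupyter}"
WORK_DIR="${WORK_DIR:-/home/jupyter/teradata}"
DRY_RUN="${DRY_RUN:-1}"   # change the default to 0 in the copy you upload

# Print each command in dry-run mode; otherwise execute it and stop on failure.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@" || exit 1
  fi
}

main() {
  run mkdir -p "$WORK_DIR"
  run cd "$WORK_DIR"
  # Fetch the bundle from the bucket and unpack it (assumes a single archive).
  run gsutil cp "$BUCKET_URI/*" .
  run unzip -o ./*.zip
  # Install the Teradata SQL kernel: the binary plus its kernelspec directory.
  run cp teradatakernel /usr/local/bin/
  run jupyter kernelspec install ./teradatasql --prefix=/opt/conda
  # Install the prebuilt extension wheels assumed to ship in the bundle.
  run pip install --find-links . ./*.whl
}

main
```

Running the sketch locally with the default `DRY_RUN=1` is a cheap way to review the command sequence before uploading it.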

  4. Create a new notebook and add the startup script from the Cloud Storage bucket

  5. It may take a few minutes for the notebook creation process to complete. When it is done, click Open notebook.
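
If you prefer the command line to the console, the notebook could also be created with `gcloud notebooks instances create`, passing the script through `--post-startup-script`. This is a sketch only: the instance name, zone, machine type, image family, and script URI are placeholders chosen for illustration. The command is printed for review rather than executed.

```shell
#!/bin/bash
# Sketch: create a user-managed notebook that runs a post-startup script.
# Instance name, zone, machine type, image family, and script URI are placeholders.
INSTANCE_NAME="${INSTANCE_NAME:-teradata-notebook}"
LOCATION="${LOCATION:-us-central1-a}"
STARTUP_SCRIPT_URI="${STARTUP_SCRIPT_URI:-gs://your-bucket/scripts/startup.sh}"

create_cmd=(
  gcloud notebooks instances create "$INSTANCE_NAME"
  --location="$LOCATION"
  --machine-type=n1-standard-4
  --vm-image-project=deeplearning-platform-release
  --vm-image-family=common-cpu-notebooks
  --post-startup-script="$STARTUP_SCRIPT_URI"
)

# Print the command for review; execute it with: "${create_cmd[@]}"
echo "${create_cmd[*]}"
```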

Use custom container

Another option is to provide a custom container when creating a notebook.

  1. Download Teradata Jupyter extensions package

Go to the Vantage Modules for Jupyter page and download the Linux version of the Teradata Jupyter extensions package bundle.

  2. Copy this package to your work directory and unzip it

  3. Build a custom Docker image

The custom container must expose a service on port 8080. It is recommended to create a container derived from a Google Deep Learning Containers image, because those images are already configured to be compatible with user-managed notebooks.

The Dockerfile should start from one of these base images and install the Teradata SQL kernel and extensions on top of it.
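
One way to sketch this is a short script that writes a Dockerfile into the work directory. The base image tag and the bundle file names (`teradatakernel`, `teradatasql`, `*.whl`) are assumptions, not values from this guide. The sketch relies on the base image to provide the service on port 8080.

```shell
#!/bin/bash
# Sketch: write a Dockerfile into the directory where the bundle was unzipped.
# Assumptions (not from this guide): the base image tag and the bundle's
# file names (teradatakernel, teradatasql/, *.whl).
WORK_DIR="${WORK_DIR:-.}"

cat > "$WORK_DIR/Dockerfile" <<'EOF'
# Base image: a Google Deep Learning Containers image (assumed tag).
FROM gcr.io/deeplearning-platform-release/base-cpu:latest

USER root

# Install the Teradata SQL kernel binary and its kernelspec.
COPY ./teradatakernel /usr/local/bin/
RUN chmod 755 /usr/local/bin/teradatakernel
COPY ./teradatasql /tmp/teradatasql
RUN jupyter kernelspec install /tmp/teradatasql --prefix=/opt/conda

# Install the prebuilt extension wheels assumed to ship in the bundle.
COPY ./*.whl /tmp/teradata-wheels/
RUN pip install --find-links /tmp/teradata-wheels /tmp/teradata-wheels/*.whl
EOF

echo "Wrote $WORK_DIR/Dockerfile"
```

If you need R as well as Python, a different Deep Learning Containers base image may be a better fit.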

  4. In your work directory (where you unzipped the Teradata Jupyter extensions package), run docker build to build the image.
  5. Push the Docker image to Google Container Registry or Artifact Registry

Refer to the Google Container Registry or Artifact Registry documentation for details on pushing a Docker image to a registry.
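
The build and push steps might look like the sketch below, which targets Artifact Registry. The project ID, region, repository, and image name are placeholders, and the repository is assumed to exist already. Commands are printed unless `DRY_RUN=0` is set.

```shell
#!/bin/bash
# Sketch: build the image and push it to Artifact Registry.
# PROJECT_ID, REGION, REPOSITORY, and the image name/tag are placeholders.
PROJECT_ID="${PROJECT_ID:-your-project}"
REGION="${REGION:-us-central1}"
REPOSITORY="${REPOSITORY:-your-repo}"
IMAGE_URI="$REGION-docker.pkg.dev/$PROJECT_ID/$REPOSITORY/teradata-jupyter:latest"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

# Print each command in dry-run mode; otherwise execute it and stop on failure.
run() {
  if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@" || exit 1; fi
}

build_and_push() {
  # Let Docker authenticate to Artifact Registry through gcloud.
  run gcloud auth configure-docker "$REGION-docker.pkg.dev"
  # Build from the directory that holds the Dockerfile and the unzipped bundle.
  run docker build -t "$IMAGE_URI" .
  run docker push "$IMAGE_URI"
}

build_and_push
```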

  6. Create a new notebook

In the Environment section, set the custom container field to the location of your newly created custom container.
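
As a command-line alternative to the console, `gcloud notebooks instances create` accepts the image location through `--container-repository` and `--container-tag`. The sketch below uses placeholder values for the instance name, zone, machine type, and image path, and only prints the command.

```shell
#!/bin/bash
# Sketch: create a user-managed notebook from a custom container image.
# Instance name, zone, machine type, and image path are placeholders.
INSTANCE_NAME="${INSTANCE_NAME:-teradata-notebook}"
LOCATION="${LOCATION:-us-central1-a}"
CONTAINER_REPOSITORY="${CONTAINER_REPOSITORY:-us-central1-docker.pkg.dev/your-project/your-repo/teradata-jupyter}"
CONTAINER_TAG="${CONTAINER_TAG:-latest}"

create_cmd=(
  gcloud notebooks instances create "$INSTANCE_NAME"
  --location="$LOCATION"
  --machine-type=n1-standard-4
  --container-repository="$CONTAINER_REPOSITORY"
  --container-tag="$CONTAINER_TAG"
)

# Print the command for review; execute it with: "${create_cmd[@]}"
echo "${create_cmd[*]}"
```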

Further reading

Note

If you have any questions or need further assistance, please visit our community forum where you can get support and interact with other community members.
