
Run Teradata Jupyter Notebook Demos for VantageCloud Lake in Google Cloud Vertex AI

Overview

This quickstart explains how to run Teradata Jupyter Notebook Demos for VantageCloud Lake on Vertex AI, the AI/ML platform for Google Cloud.

Prerequisites

Vertex AI Google Cloud environment setup

When you create a new notebook instance, you can specify a startup script. This script, which runs only once after instance creation, will install the Teradata Jupyter extension package and clone a GitHub repository into the new user-managed notebooks instance.

  • Download Teradata Jupyter extensions package

  • Create Google Cloud Storage Bucket

    • Create a bucket with a name relevant to the project (e.g., teradata_jupyter).
    • Ensure that the bucket name is globally unique. For instance, if the name teradata_jupyter has already been used, it will not be available for subsequent users.
  • Upload the unzipped Jupyter extension package to your Google Cloud Storage bucket as a file.

  • Write the following startup script and save it as startup.sh to your local machine.

Below is an example script that retrieves the Teradata Jupyter extension package from the Google Cloud Storage bucket, installs the Teradata SQL kernel and extensions, and clones the lake-demos repository.

info

Remember to replace teradata_jupyter in the gsutil cp command with the name of your own bucket.

  • Upload this script to your Google Cloud Storage bucket as a file.

Initiating a user-managed notebook instance

  • Access Vertex AI Workbench

    • Return to Vertex AI Workbench in Google Cloud console.
    • Create a new User-Managed Notebook via Advanced Options or directly at https://notebook.new/.
  • Under Details, name your notebook, select your region, and select Continue.

  • Under Environment, select Browse to choose your startup.sh script from your Google Cloud Storage bucket.

  • Select Create to initiate the notebook. It may take a few minutes for the notebook creation process to complete. When it is done, click OPEN JUPYTERLAB.

info

You must whitelist the IP address of your notebook instance in your VantageCloud Lake environment to allow the connection. This approach is appropriate for a trial environment. For production environments, VPCs, subnets, and security groups might need to be configured and whitelisted.

  • In JupyterLab, open a notebook with a Python kernel and run the following command to find your notebook instance's IP address.
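One way to look up the address from a Python kernel is sketched below, using only the standard library. It assumes the third-party service https://api.ipify.org, which returns the caller's public IP address as plain text; the function name get_public_ip and its parameters are illustrative and not part of the Teradata demos.

```python
import ipaddress
import urllib.request

# Assumption: api.ipify.org is an external echo service that answers with the
# caller's public IP address as plain text. Any equivalent service would work.
IP_ECHO_URL = "https://api.ipify.org"


def get_public_ip(url=IP_ECHO_URL, opener=urllib.request.urlopen, timeout=10):
    """Return the public IP address reported by the echo service as a string."""
    with opener(url, timeout=timeout) as response:
        text = response.read().decode("utf-8").strip()
    # ip_address() raises ValueError if the reply is not a valid IP address.
    return str(ipaddress.ip_address(text))
```

In a notebook cell, print(get_public_ip()) shows the address as seen from outside Google Cloud, which is the value to add in VantageCloud Lake. The opener parameter exists only so the function can be exercised without network access.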

VantageCloud Lake Configuration

  • In the VantageCloud Lake environment, under Settings, add the IP address of your notebook instance.

Edit vars.json

Navigate into the lake-demos directory in your notebook.

Right-click vars.json to open the file with the editor.

Edit the vars.json file to include the credentials required to run the demos.

Variable     Value
"host"       Public IP value from your VantageCloud Lake environment
"UES_URI"    Open Analytics endpoint from your VantageCloud Lake environment
"dbc"        The master password of your VantageCloud Lake environment

To retrieve a Public IP address and Open Analytics Endpoint follow these instructions.
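If you prefer to make the edit from a notebook cell instead of the editor, the following is a minimal sketch. It assumes that "host", "UES_URI", and "dbc" already exist as top-level keys in vars.json; the actual file layout may differ, in which case the function raises an error and leaves the file untouched. The helper name update_vars is hypothetical.

```python
import json
from pathlib import Path


def update_vars(path, host, ues_uri, dbc_password):
    """Set "host", "UES_URI", and "dbc" in the JSON file at `path` and save it."""
    file = Path(path)
    data = json.loads(file.read_text(encoding="utf-8"))
    updates = {"host": host, "UES_URI": ues_uri, "dbc": dbc_password}
    # Assumption: the three keys sit at the top level of the file. Check all of
    # them before changing anything so a mismatch never writes a partial edit.
    for key in updates:
        if key not in data:
            raise KeyError(f"{key!r} not found at the top level of {file.name}")
    data.update(updates)
    file.write_text(json.dumps(data, indent=2) + "\n", encoding="utf-8")
    return data
```

A call would look like update_vars("vars.json", host, ues_uri, dbc_password), with the three values taken from your VantageCloud Lake environment. Reading the password with getpass.getpass() avoids leaving it in the notebook's cell history.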

info

Change the passwords in the vars.json file. In the sample vars.json, the passwords of all users default to "password". This is only a placeholder for the sample file. Change every password field to a strong password, secure the file as necessary, and follow other password management best practices.
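To confirm that no placeholder was missed, a small check such as the sketch below can list every field still set to "password". It walks the JSON recursively, so it makes no assumption about how vars.json is nested; the function name find_default_values is illustrative.

```python
import json


def find_default_values(node, default="password", path="$"):
    """Return the JSON paths of every value equal to `default`."""
    hits = []
    if isinstance(node, dict):
        for key, value in node.items():
            hits.extend(find_default_values(value, default, f"{path}.{key}"))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            hits.extend(find_default_values(value, default, f"{path}[{index}]"))
    elif node == default:
        # A leaf value that still matches the sample placeholder.
        hits.append(path)
    return hits


def check_vars_file(path="vars.json"):
    """Load the JSON file at `path` and return the paths still set to "password"."""
    with open(path, encoding="utf-8") as handle:
        return find_default_values(json.load(handle))
```

An empty list from check_vars_file() means no value in the file equals the sample default. It does not judge password strength.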

Run demos

Execute all the cells in 0_Demo_Environment_Setup.ipynb to set up your environment, then run 1_Demo_Setup_Base_Data.ipynb to load the base data required for the demos.

To learn more about the demo notebooks, go to the Teradata Lake demos page on GitHub.

Summary

In this quickstart guide, we configured Google Cloud Vertex AI Workbench Notebooks to run Teradata Jupyter Notebook Demos for VantageCloud Lake.
