teradatamlspk - Teradata Python package for running Spark workloads on Vantage

Details


Overview

teradatamlspk is a Python package built as an extension of teradataml, the Teradata Python package. The syntax and usability of the teradatamlspk APIs are kept similar to those of the PySpark APIs, so existing PySpark workloads that run on the Spark engine can be migrated to Teradata Vantage and run there with minimal changes.
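To illustrate what "similar to PySpark" could look like in practice, here is a minimal sketch of opening a session against Vantage. The helper name connect_to_vantage is hypothetical. The module path teradatamlspk.sql, the class name TeradataSession, and the host, user, and password keyword arguments are assumptions, not details confirmed on this page, so check them against the package documentation before relying on them.

```python
def connect_to_vantage(host, user, password):
    """Open a PySpark-style session on Vantage (hypothetical helper).

    Assumption: teradatamlspk exposes TeradataSession in teradatamlspk.sql,
    mirroring pyspark.sql.SparkSession, and its builder accepts connection
    details as keyword arguments.
    """
    # Imported lazily so this module still loads where teradatamlspk
    # is not installed.
    from teradatamlspk.sql import TeradataSession

    # In PySpark the equivalent line would be:
    #   SparkSession.builder.getOrCreate()
    return TeradataSession.builder.getOrCreate(
        host=host, user=user, password=password
    )
```

Under these assumptions, a migrated script would differ from the PySpark original mainly in its import line and in how the session is created.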

teradatamlspk also provides a function, pyspark2teradataml, that converts a PySpark script to a teradatamlspk Python script. It also generates an HTML report of the conversion, which helps the user understand the changes made and carry out any manual changes still needed in the generated script so that it can run on Vantage.
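A minimal sketch of calling the conversion utility follows. The wrapper convert_script is a hypothetical helper. Importing pyspark2teradataml from the top-level teradatamlspk package and passing it the script path as a single argument are assumptions, and where the converted script and the HTML report are written is not stated here.

```python
from pathlib import Path


def convert_script(script_path):
    """Convert a PySpark script for Vantage (hypothetical wrapper)."""
    path = Path(script_path)
    # Fail early with a clear message if the input script is missing.
    if not path.is_file():
        raise FileNotFoundError(f"PySpark script not found: {path}")

    # Assumption: the function is importable from the package root and
    # takes the script path as its argument. Imported lazily so this
    # module still loads where teradatamlspk is not installed.
    from teradatamlspk import pyspark2teradataml

    pyspark2teradataml(str(path))
```

After the call, review the HTML report for any statements that need manual changes before running the generated script on Vantage.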

Dependent Python Packages: 

  • teradataml 20.0.0.0 or later
  • PrettyTable

 

OS version: Not Applicable
Release version: 20.00.00.00
