Tips

A collection of tips for scaling ML training jobs, generalizing them for flexibility, and making them portable. Think of this as DevOps for ML training jobs. The tips show how to run multiple tasks in parallel within your code, pass parameters to jobs from the command line and input files, package training code, build custom containers with training code, and deploy training code on Vertex AI Training to take advantage of scalable managed infrastructure at the job level.

Using This Repository

  • Each notebook that defines a parameter as BUCKET = PROJECT_ID can be customized:
    • change this to BUCKET = PROJECT_ID + 'suffix' if a GCS bucket with the same name as the project already exists.
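For example, the customization amounts to a one-line change. The values below are hypothetical placeholders; in the notebooks PROJECT_ID is retrieved from the environment:

```python
# Hypothetical project id for illustration only:
PROJECT_ID = 'my-project'

# Default used throughout the notebooks:
BUCKET = PROJECT_ID

# If a bucket named after the project already exists, add a suffix:
BUCKET = PROJECT_ID + '-tips'  # 'my-project-tips'
```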

IDEs

For more details on setting up a working environment in a variety of IDEs beyond just JupyterLab, please check out the IDE folder.

Tools

Notes

  • aiplatform Python Client
    • All about the Vertex AI Python client: its versions (aiplatform_v1 and aiplatform_v1beta) and layers (aiplatform and aiplatform.gapic), with deeper details and examples of using each.

Python: Notebooks on Skills For ML Training Jobs and Tasks

  • Python Multiprocessing
    • tips for executing multiple tasks at the same time
  • Python Job Parameters
    • tips for passing values to programs from the command line (argparse, docopt, click) or with files (JSON, YAML, pickle)
  • Python Client for GCS
    • tips for interacting with GCS from Python and from Vertex AI
  • Python Packages
    • prepare ML training code as a file (module), folders, packages, and distributions (source and built), and store them in custom repositories with Artifact Registry
  • Python Custom Containers
    • tips for building derivative containers with Cloud Build and Artifact Registry
  • Python Training
    • move training code out of a notebook and into Vertex AI Training Custom Jobs
    • This demonstrates workflows that directly use the code formats created in Python Packages and the custom containers built in Python Custom Containers
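As a taste of the multiprocessing tips above, here is a minimal sketch; the task function and shard counts are made up for illustration, not taken from the notebooks:

```python
from multiprocessing import Pool

def process_shard(shard: int) -> int:
    # Stand-in for real work, e.g. preprocessing one data shard.
    return shard * shard

if __name__ == '__main__':
    # Run the task over 8 inputs with 4 worker processes in parallel;
    # Pool.map preserves the input order in its results.
    with Pool(processes=4) as pool:
        results = pool.map(process_shard, range(8))
    print(results)  # [0, 1, 4, 9, 16, 25, 36, 49]
```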
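Similarly, the job-parameter tips boil down to patterns like the following sketch. The flag names are hypothetical, and the notebooks also cover docopt, click, YAML, and pickle as alternatives:

```python
import argparse
import json

# Hypothetical training-job flags for illustration:
parser = argparse.ArgumentParser(description='example training job')
parser.add_argument('--epochs', type=int, default=10)
parser.add_argument('--learning_rate', type=float, default=0.001)

# A training service passes these on the job's command line; here we
# simulate that with an explicit argument list.
args = parser.parse_args(['--epochs', '5'])

# The same values could arrive via a JSON input file instead:
params = json.loads('{"epochs": 5, "learning_rate": 0.001}')
```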

BigQuery: Notebooks on BigQuery Topics

Additional Tips

Notebooks on Skills For BigQuery

  • New series will go here (see ToDo)

ToDo:

  • split this folder with subfolders
  • Python, BigQuery, KFP, ...
    • KFP Layers: components, tasks, artifacts, pipelines, IO
    • BQ Layers: project, dataset, table, rows, columns, cells + access, operations, ...
  • [IP] BigQuery Tips:
    • BigQuery - Python Clients
    • BigQuery - R
    • BigQuery - Data Types
    • BigQuery - Tables
    • BigQuery - UDF
    • BigQuery - Remote Functions
  • Add Git workflow tip - how to clone with PAT
  • [DEV] add KFP tip, include component authoring
  • [IP] secret manager tip
  • IDEs: Colab, VS Code local, VS Code remote, Jupyter, ...