trainML Documentation
Automatic Dependency Installation

July 22, 2021

trainML

trainML jobs now accept lists of packages to install with apt, pip, or conda as part of the job creation process. Jobs also automatically install the dependencies listed in a requirements.txt file in the root of the model code working directory.

How It Works

You can now specify a list of apt, pip, and/or conda packages as part of the job creation process. After the job's model code is loaded, trainML will automatically install the packages from the apt list, the conda list, the requirements.txt file, and finally the pip list. Specifying package dependencies works for all job types and is particularly recommended for non-notebook jobs.
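As a rough mental model of that ordering, the sketch below arranges a set of package lists into the sequence described above. It is illustrative only: `install_plan` and `INSTALL_ORDER` are hypothetical names, and the code does not reflect how trainML implements installation internally.

```python
# Illustrative only: models the documented install order, not trainML internals.
INSTALL_ORDER = ["apt", "conda", "requirements.txt", "pip"]


def install_plan(packages, has_requirements_file):
    """Return (source, items) steps in the order the post describes."""
    plan = []
    for source in INSTALL_ORDER:
        if source == "requirements.txt":
            # The requirements file is only a step when the model code has one.
            if has_requirements_file:
                plan.append((source, ["requirements.txt"]))
        elif packages.get(source):
            plan.append((source, list(packages[source])))
    return plan


plan = install_plan(
    {"pip": ["catboost"], "apt": ["libpng-dev"], "conda": ["libuv"]},
    has_requirements_file=True,
)
for source, items in plan:
    print(source, items)
```

One practical consequence of this ordering is that packages in the pip list are installed after those in requirements.txt.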

Using the Web Platform

Navigate to the dashboard of the job type you wish to create and click the Create button. After filling out the other fields necessary to start the job, expand the Environment section of the form. Under the Package Dependencies header there are now three new text areas, one each for the pip, apt, and conda dependency lists. Add each package on its own line. To pin a specific version of a dependency, use the same syntax you would use with the package manager's install command, e.g. package==version for pip, package=version for apt, and "package=version" for conda. Click Next, review the list of packages to install, and create the job.
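The three pinning syntaxes are easy to mix up, so here is a small sketch that formats a pinned entry for each package manager. The `pin` helper is hypothetical and not part of trainML; it only reproduces the string formats listed above.

```python
def pin(manager, package, version):
    """Format a pinned package entry for the given package manager."""
    if manager == "pip":
        return f"{package}=={version}"
    if manager == "apt":
        return f"{package}={version}"
    if manager == "conda":
        # The post shows conda pins wrapped in literal double quotes.
        return f'"{package}={version}"'
    raise ValueError(f"unsupported package manager: {manager}")


print(pin("pip", "nemo_toolkit[all]", "1.0.0b2"))
print(pin("conda", "pyarrow", "3.0.0"))
```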

No additional configuration is necessary to install the packages listed in a pip requirements.txt file. Simply ensure the file is well formed and located in the root directory of the model you specify in the Model section of the job form.

Using the SDK

To specify package dependencies using the SDK, add a packages dictionary to the environment dictionary of the job create command.

If you change any default value in the environment dictionary, you must also specify the type item in that dictionary.

The packages dictionary accepts only three keys (apt, pip, and conda), none of which is required. Specify the package dependencies as lists of strings. When pinning conda packages, be sure to include the quotes explicitly. An example specification is the following:

job = await trainml.jobs.create(
    name="Test Notebook Job",
    ...
    environment=dict(
        type="DEEPLEARNING_PY38",
        packages=dict(
            apt=["llvm-9-dev", "libpng-dev"],
            pip=["fbprophet", "catboost", "nemo_toolkit[all]==1.0.0b2"],
            conda=["libuv", "libpng", "\"pyarrow=3.0.0\""],
        ),
    ),
)
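If you build the packages dictionary programmatically, it can help to check it against the rules above before passing it to the job create command. The following is a minimal sketch; `validate_packages` and `ALLOWED_KEYS` are hypothetical names and not part of the trainML SDK, which may perform its own validation.

```python
ALLOWED_KEYS = {"apt", "pip", "conda"}


def validate_packages(packages):
    """Check that only the allowed keys are used and each value is a list of strings."""
    unknown = set(packages) - ALLOWED_KEYS
    if unknown:
        raise ValueError(f"unsupported keys: {sorted(unknown)}")
    for key, value in packages.items():
        if not isinstance(value, list) or not all(isinstance(v, str) for v in value):
            raise TypeError(f"{key} must be a list of strings")
    return packages


# Same values as the example above; the apt and conda keys could also be omitted.
environment = dict(
    type="DEEPLEARNING_PY38",
    packages=validate_packages(
        dict(
            apt=["llvm-9-dev", "libpng-dev"],
            pip=["fbprophet", "catboost", "nemo_toolkit[all]==1.0.0b2"],
            conda=["libuv", "libpng", '"pyarrow=3.0.0"'],
        )
    ),
)
print(environment)
```

The resulting environment dictionary has the same shape as the one in the example and could be passed as the environment argument.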

Using the CLI

To specify package dependencies using the CLI, use the command line option for each package manager you need (--apt-packages, --pip-packages, or --conda-packages). Specify the packages for each option as a single comma-separated argument. An example is the following:

trainml job create notebook \
 --apt-packages 'llvm-9-dev,libpng-dev' \
 --pip-packages 'fbprophet,catboost,nemo_toolkit[all]==1.0.0b2' \
 --conda-packages 'libuv,libpng,"pyarrow=3.0.0"' \
 "Test Notebook Job"
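If you generate CLI invocations from a script, the package lists need to be joined into the comma-separated form shown above. The sketch below does only that; `package_options` is a hypothetical helper, and the snippet prints the command rather than running it.

```python
import shlex


def package_options(packages):
    """Turn package lists into the comma-separated CLI options from the post."""
    args = []
    for manager in ("apt", "pip", "conda"):
        names = packages.get(manager)
        if names:
            args += [f"--{manager}-packages", ",".join(names)]
    return args


command = (
    ["trainml", "job", "create", "notebook"]
    + package_options(
        {
            "apt": ["llvm-9-dev", "libpng-dev"],
            "pip": ["fbprophet", "catboost", "nemo_toolkit[all]==1.0.0b2"],
            "conda": ["libuv", "libpng", '"pyarrow=3.0.0"'],
        }
    )
    + ["Test Notebook Job"]
)

# shlex.join (Python 3.8+) adds shell quoting where it is needed.
print(shlex.join(command))
```

Keeping the command as a list of arguments means the literal double quotes around the conda pin survive without manual escaping if the list is later handed to something like subprocess.run.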
Copyright © 2022 trainML, LLC, All rights reserved