Load Model Code Directly From Your Laptop

· 3 min read

You can now start any job type from model code stored on your local computer without committing the code to a git repository. In combination with the trainML CLI, starting a notebook from your local computer is as simple as:

trainml job create notebook --model-dir ~/model-code --data-dir ~/data "My Notebook"

How It Works

When creating a new job, in the Model section of the job form you can now select Local from the Model Type dropdown. If this option is selected, specify the storage path as the location on your local computer you want the data to be copied from or to. The path must be an absolute path (starting with /, like /home/username/data), a home directory relative path (starting with ~/, like ~/data), or an environment variable based path (starting with $, like $HOME/data, where the HOME environment variable on your local computer is set to /home/username). Relative paths (starting with ./) are not supported.
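As a rough illustration, the accepted path forms behave like standard shell expansion. The sketch below is hypothetical helper code, not part of the trainML client, which performs its own path resolution:

```python
import os

def resolve_storage_path(path: str) -> str:
    """Illustrative only: expand the accepted storage path forms.

    Accepts absolute (/...), home-relative (~/...), and environment
    variable based ($VAR/...) paths; rejects ./ relative paths.
    """
    if path.startswith("./"):
        raise ValueError("Relative paths starting with ./ are not supported")
    # ~/data and $HOME/data both resolve against your local environment
    return os.path.expandvars(os.path.expanduser(path))
```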

caution

All Local options require the local connection capability. Please ensure your system meets the prerequisites and all required components are installed prior to using this option.

Once the job starts, it will wait for you to connect so it can download the model code from your computer. Click the Connect button and follow the instructions. Once the model code has been transferred, the job will automatically start.

Using the CLI

As shown above, you can utilize this option automatically through the trainML CLI by specifying the --model-dir argument. When specified, the command will automatically create the job with the local model type configured, activate the local connection, transfer the model data, and open the notebook once the job starts (or watch the logs if it is a training/inference job).

If you also want to automatically upload a dataset during job creation, you can specify the --data-dir option. However, if you plan to reuse the same dataset on multiple jobs, you only need to do this once. For future jobs, use the --dataset option and specify the dataset name or ID instead. This saves the time of transferring the data again and avoids being charged to store multiple copies of the same data.
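For example, a first job can upload the dataset with --data-dir, and later jobs can reference it by name with --dataset (the dataset name "my-dataset" here is illustrative):

```shell
# First job: uploads ~/data and stores it as a reusable dataset
trainml job create notebook --model-dir ~/model-code --data-dir ~/data "First Notebook"

# Later jobs: reuse the stored dataset by name instead of re-uploading
trainml job create notebook --model-dir ~/model-code --dataset "my-dataset" "Second Notebook"
```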

Using the SDK

You can also start a job using the local model option with the Python SDK by specifying a source_type of local and a source_uri of the local directory path in the model dictionary in the create function.

from trainml import TrainML
import asyncio

trainml = TrainML()

async def main():
    job = await trainml.jobs.create(
        name="Training Job with Local Output",
        ....
        model=dict(source_type="local", source_uri="~/tensorflow-model"),
    )
    # Jobs using a local model wait for you to connect in the
    # "waiting for data/model download" state
    await job.wait_for("waiting for data/model download")
    attach_task = asyncio.create_task(job.attach())
    connect_task = asyncio.create_task(job.connect())
    await asyncio.gather(attach_task, connect_task)

asyncio.run(main())

Since the job will wait for you to connect to transfer the model data, you must first wait for the job to be ready to transfer (the waiting for data/model download state), then connect. The attach function in the example above is optional and provides the log output from the data transfers and job.
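The create_task/gather pattern can be seen in isolation with stand-in coroutines; attach and connect below are placeholders for the real SDK methods, which stream logs and perform the upload respectively:

```python
import asyncio

# Stand-ins for job.attach() and job.connect(); create_task schedules
# both immediately so log streaming and the upload run concurrently.
async def attach():
    await asyncio.sleep(0.01)  # simulate streaming log output
    return "logs"

async def connect():
    await asyncio.sleep(0.01)  # simulate the model transfer
    return "uploaded"

async def main():
    attach_task = asyncio.create_task(attach())
    connect_task = asyncio.create_task(connect())
    # gather waits for both tasks and returns their results in order
    return await asyncio.gather(attach_task, connect_task)

results = asyncio.run(main())
print(results)  # -> ['logs', 'uploaded']
```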