Local Connection Capability
Some advanced platform functionality depends on the connection capability of the trainML CLI and SDK. This capability is required to perform the following functions:
- Create a dataset using the Local storage type
- Populate a job's model code using the Local model type
- Receive training or inference job model outputs using the Local storage type
- Allow workers to access services (databases, other applications) running on your local computer at runtime
Prerequisites
To use the connection capability, you must meet the following prerequisites:
- You must have Docker installed
- If using Windows, you must install Docker with the WSL 2 backend
- The local user must be configured to manage Docker (without sudo)
- You must install the trainML CLI and configure its API keys
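The prerequisites above can be checked quickly from a terminal. The following is a minimal sketch, not an official trainML tool: the helper names have_cmd and check_prereqs are made up for this example, and it assumes that a successful docker info without sudo means the current user can manage Docker. It does not verify that the API keys are configured.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the named command is on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Hypothetical helper: print one line per prerequisite and
# return non-zero if any of them appears to be missing.
check_prereqs() {
    missing=0
    if have_cmd docker; then
        echo "ok: docker is installed"
        # docker info contacts the daemon, so it fails when the current
        # user cannot reach it without sudo (or the daemon is not running).
        if docker info >/dev/null 2>&1; then
            echo "ok: the Docker daemon is reachable without sudo"
        else
            echo "fail: the Docker daemon is not reachable by this user"
            missing=1
        fi
    else
        echo "fail: docker not found on PATH"
        missing=1
    fi
    if have_cmd trainml; then
        echo "ok: trainml CLI found"
    else
        echo "fail: trainml CLI not found on PATH"
        missing=1
    fi
    return "$missing"
}

check_prereqs || echo "One or more prerequisites appear to be missing."
```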
How to Use
The trainML CLI is the only way to connect to jobs and other resources. Resources that support the connection capability (jobs, datasets, checkpoints, and models) have connect and disconnect subcommands. To connect to a job, run:
trainml job connect <job ID or name>
By default, connecting will automatically attach the terminal to the log output of the resource. To connect without attaching, run:
trainml job connect --no-attach <job ID or name>
If you connect without attaching, you need to disconnect after the job is complete to avoid potential conflicts with future jobs:
trainml job disconnect <job ID or name>
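Because an unattached connection has to be closed by hand, it can help to pair the two commands in a small wrapper. The sketch below is one possible approach, not part of the trainML CLI: the function name with_connection and the TRAINML override variable are assumptions made for this example. It uses only the connect --no-attach and disconnect commands shown above.

```shell
#!/bin/sh
# Assumption of this sketch: TRAINML can be overridden (for example
# TRAINML=echo) to print the commands instead of running them.
TRAINML="${TRAINML:-trainml}"

# Hypothetical wrapper: connect without attaching, run the given
# command, then always disconnect. Returns the command's exit status.
with_connection() {
    job="$1"
    shift
    "$TRAINML" job connect --no-attach "$job" || return 1
    rc=0
    "$@" || rc=$?
    # Disconnect even if the command failed, to avoid conflicts
    # with future jobs.
    "$TRAINML" job disconnect "$job"
    return "$rc"
}
```

For example, with_connection "<job ID or name>" ./wait_for_outputs.sh would run a script of your own (wait_for_outputs.sh is a placeholder name) while the connection is open and disconnect afterwards.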
You can view the status of your current connections by running:
trainml connection list
You can also obtain the connection command by clicking the Connect button on a resource in the web interface.
Troubleshooting
Issues Connecting
If you have trouble connecting, remove all existing connections by running:
trainml connection remove-all --all-projects
Run docker ps and ensure there are no running containers using a trainML image.
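The docker ps check can be narrowed to the containers of interest. The sketch below assumes that trainML images have "trainml" somewhere in their image name. That is a guess made for illustration, so confirm the actual image name in your own docker ps output. The function name list_trainml_containers is also made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: list running containers whose image name
# contains "trainml" (case-insensitive). The match pattern is an
# assumption; adjust it to the image name you actually see.
# Returns non-zero when no matching container is running.
list_trainml_containers() {
    docker ps --format '{{.ID}} {{.Image}} {{.Names}}' | grep -i 'trainml'
}

# Usage:
#   list_trainml_containers || echo "No matching containers running."
```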