
Spawn Training Jobs Directly From Notebooks


You can now convert a notebook directly into a training job, which makes it easy to run independent training experiments while you keep working on your project. Unlike copying the notebook into another notebook job, a training job runs autonomously, sends its output to the location you specify, and terminates automatically when it finishes.

How It Works

From the Notebook Dashboard, select the notebook job you want to convert and click the Copy button in the dashboard menu. The Copy button is enabled only when a single job is selected and that job is either running or stopped. To create a training job from the notebook, select Convert to Training Job as the Copy Type. With this copy type selected, you can modify the job's resources and the datasets used as input data. You can also configure the training job's workers and the location where its output data is sent.
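The rule for when the Copy button is available can be sketched in a few lines of Python. This is only an illustration of the condition described above, not trainML code: the `NotebookJob` class, the `copy_enabled` function, and the status strings are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class NotebookJob:
    """Hypothetical stand-in for a notebook job shown on the dashboard."""
    name: str
    status: str  # assumed values for this sketch, e.g. "running" or "stopped"


# Statuses from which a notebook can be copied, per the description above.
COPYABLE_STATUSES = {"running", "stopped"}


def copy_enabled(selected: Sequence[NotebookJob]) -> bool:
    """Return True only when exactly one job is selected and it is running or stopped."""
    return len(selected) == 1 and selected[0].status in COPYABLE_STATUSES
```

Under these assumptions, selecting a single stopped notebook enables the button, while selecting two notebooks, or one that is in any other state, does not.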

Click the Copy button on the form to create the new training job. The training job starts running automatically once the copy is complete. You can view the execution logs to monitor its progress. When training is complete, the job uploads its results and stops automatically.

Creating training jobs this way is a great way to get started with trainML training jobs. The most challenging part of creating a training job is typically crafting the correct worker command for your code, given the locations of the input data, the model, and the desired outputs. By starting with a notebook job, you can interactively test your code, confirm that all required libraries are installed, and get familiar with the directory structure of the job environment. Once you have a working job command, you can spin off training jobs as needed by converting the notebook.
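One way to work out a worker command while still in the notebook is to assemble it from the paths you have verified interactively. The sketch below is a minimal illustration, not a trainML API: the `build_worker_command` helper, the `train.py` script name, the `--data-dir` and `--output-dir` flags, and the example paths are all assumptions made for the example. Substitute the script, flags, and directories from your own job environment.

```python
import shlex
from typing import Sequence


def build_worker_command(
    script: str,
    data_dir: str,
    output_dir: str,
    extra_args: Sequence[str] = (),
) -> str:
    """Assemble a shell-safe command string for a training script.

    All argument names here are hypothetical; use whatever flags your
    own training script actually accepts.
    """
    parts = [
        "python", script,
        "--data-dir", data_dir,
        "--output-dir", output_dir,
        *extra_args,
    ]
    # shlex.quote protects paths that contain spaces or shell metacharacters.
    return " ".join(shlex.quote(part) for part in parts)


# Example paths are placeholders, not the real layout of a job environment.
command = build_worker_command(
    "train.py", "/data/input", "/data/output", ["--epochs", "10"]
)
print(command)
```

Running the resulting string in a notebook terminal first is a quick check that the paths and flags are right before you paste it into the worker field of the copy form.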