New job environments based on Python 3.10 are now available for all frameworks.
Checkpoint Export
trainML checkpoints can now be exported to common data locations for use in other environments.
Interval Checkpointing for Training Jobs
You can now configure automatic checkpointing on a scheduled interval during training for trainML Training Jobs.
Job Init Script Support
trainML jobs can now run shell scripts prior to workload execution and data attachment to further configure the workload environment.
Instant Job Rerun
trainML Training and Inference jobs can now be rerun on updated data with only two clicks.
PyTorch 2.0
Following the recent major release of PyTorch 2.0, new trainML job environments based on it are now available.
Use Large Language Model Checkpoints for Free
Start building your own ChatGPT-like applications with popular open-source Large Language Models like GPT-J, GPT-NeoX, and BloomZ.
Attach Checkpoints to Edge Inference Devices
CloudBender™ Device Configurations have been expanded to allow attaching public and private Checkpoints to edge inference devices.
Create Checkpoints and Datasets from Job Outputs
Checkpoints and Datasets are now supported output destinations for trainML Training and Inference jobs.
Automatically Integrate Hugging Face Models
Checkpoints can now be created directly from public or private Hugging Face models.