How It Works
Creating a Dataset
Any dataset created through a training job automatically becomes persistent after it is loaded. In the Data section of the training job form, select Create New Dataset from the Dataset Type field. You can then choose the source of the data from the Input Data Type field and specify the URI of the data in the Input Data Storage Path field. When the job starts, the new dataset will be created with the default name of Job - <Job Name>.
Using a Persistent Dataset
To use a persistent dataset, select My Dataset from the Dataset Type field in the Data section of the job form. Select the desired dataset from the list and create the job. Once the job is running, you can access the dataset in the /opt/trainml/input directory or through the TRAINML_DATA_PATH environment variable.
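As a minimal sketch of reading the dataset from inside a running job, the following Python snippet resolves the dataset location from TRAINML_DATA_PATH and falls back to /opt/trainml/input when the variable is not set. The helper names dataset_dir and list_dataset_files are hypothetical and chosen for illustration only; they are not part of any trainML library.

```python
import os
from pathlib import Path

# Directory documented above as the mount point for the dataset.
DEFAULT_DATA_DIR = "/opt/trainml/input"


def dataset_dir() -> Path:
    """Return the dataset directory, preferring TRAINML_DATA_PATH if set."""
    return Path(os.environ.get("TRAINML_DATA_PATH", DEFAULT_DATA_DIR))


def list_dataset_files(root: Path) -> list[Path]:
    """Return every regular file under root, sorted for stable output."""
    return sorted(p for p in root.rglob("*") if p.is_file())


if __name__ == "__main__":
    root = dataset_dir()
    # Print the first few files, relative to the dataset root.
    for path in list_dataset_files(root)[:10]:
        print(path.relative_to(root))
```

Reading the path from the environment variable rather than hard-coding the directory keeps the same script usable outside a job, for example by pointing TRAINML_DATA_PATH at a local copy of the data.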
The size of persistent datasets is included in the monthly storage charge. The first 50 GB of storage is free each month; any storage in excess of 50 GB is charged at 0.20 credits per GB per month. Private datasets are charged based on the actual size of the dataset once it has been created. A dataset's size counts towards your monthly storage charge for as long as it exists, regardless of whether it is in use or how many jobs are using it. For example, if you create a 50 GB dataset on the 15th of the month, that dataset counts as 25 GB/month in the computation of your storage charge for that month, because it existed for roughly half the month. This number is the same whether you never use the dataset on a job for the entire month or use it on 100 separate jobs concurrently for the rest of the month.
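The arithmetic above can be sketched in Python as follows. This is an illustration only, not the billing implementation: the day-based proration is an assumption inferred from the 50 GB example, and the function names are hypothetical. Only the 50 GB free allowance and the 0.20 credits per GB per month rate come from the text.

```python
FREE_GB_PER_MONTH = 50.0          # free allowance stated above
RATE_CREDITS_PER_GB_MONTH = 0.20  # rate stated above for storage beyond the allowance


def prorated_gb(size_gb: float, days_stored: float, days_in_month: float) -> float:
    """GB/month contributed by a dataset that existed for part of the month.

    Assumption: proration is linear in the number of days the dataset existed.
    """
    if not 0 <= days_stored <= days_in_month:
        raise ValueError("days_stored must be between 0 and days_in_month")
    return size_gb * days_stored / days_in_month


def monthly_storage_charge(total_gb_months: float) -> float:
    """Credits charged for the month after subtracting the free allowance."""
    billable = max(0.0, total_gb_months - FREE_GB_PER_MONTH)
    return billable * RATE_CREDITS_PER_GB_MONTH


if __name__ == "__main__":
    # A 50 GB dataset that existed for 15 days of a 30-day month.
    usage = prorated_gb(50, 15, 30)
    print(usage, monthly_storage_charge(usage))
```

Under these assumptions the 50 GB dataset from the example contributes 25 GB/month, which is within the free allowance, so it adds nothing to the charge on its own. The number of jobs using the dataset does not appear anywhere in the calculation.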