Scratch

Working with Intermediate Data in Code Ocean

The scratch folder is a dedicated folder mounted to the capsule for working with large intermediate data in Code Ocean. It behaves differently in Cloud Workstation sessions than in Reproducible Runs. In both cases, the scratch folder is mounted EFS storage that is practically unlimited in size.

Cloud Workstation Scratch Folder

For Cloud Workstation (CW) sessions, the scratch folder is a mounted drive whose contents persist for the lifetime of the capsule. Files written to scratch during a CW session remain visible in the capsule IDE after the session is shut down and are available in all subsequent sessions unless the user deletes them. These files are not available during a Reproducible Run.

In a CW session, scratch is on the path /root/capsule/scratch
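As an illustration, writing intermediate data to scratch from a CW session might look like the minimal sketch below. The path comes from the text above; the helper name write_intermediate and the file name are hypothetical and not part of any Code Ocean API.

```python
from pathlib import Path

# Scratch path inside a Cloud Workstation session, as stated above.
CW_SCRATCH = Path("/root/capsule/scratch")


def write_intermediate(name: str, data: bytes, scratch: Path = CW_SCRATCH) -> Path:
    """Write bytes to a file under the scratch folder and return its path.

    `write_intermediate` is a hypothetical helper used only for illustration.
    """
    scratch.mkdir(parents=True, exist_ok=True)  # no-op if the folder already exists
    target = scratch / name
    target.write_bytes(data)
    return target
```

In a session, write_intermediate("features.bin", payload) would place the file at /root/capsule/scratch/features.bin, where it should remain available to later CW sessions.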

The scratch folder is a convenient location to store large data before creating a data asset. Contents of the scratch folder can be made into a data asset from the capsule IDE or during a CW session.

It is best practice to delete files that are no longer needed from scratch so they do not take up storage unnecessarily. If the capsule is deleted, its scratch folder is deleted as well.
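One possible way to keep scratch tidy is to look for files that have not been modified recently. The sketch below is only an assumption about how such a cleanup could be scripted; the function name remove_stale_files and the age threshold are hypothetical. It defaults to a dry run, so nothing is deleted unless explicitly requested.

```python
import time
from pathlib import Path


def remove_stale_files(scratch: Path, max_age_days: float, dry_run: bool = True) -> list[Path]:
    """Return files under `scratch` older than `max_age_days`; delete them if dry_run is False.

    Hypothetical helper for illustration; age is judged by last-modified time.
    """
    cutoff = time.time() - max_age_days * 86400  # 86400 seconds per day
    stale = [p for p in scratch.rglob("*") if p.is_file() and p.stat().st_mtime < cutoff]
    if not dry_run:
        for path in stale:
            path.unlink()
    return stale
```

Calling remove_stale_files(Path("/root/capsule/scratch"), 30) in a CW session would list candidates for review; passing dry_run=False would then delete them.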

Reproducible Run Scratch Folder

For Reproducible Runs, the scratch folder functions as a temporary folder that is empty at the start of the run and is emptied at the end of the run. The capsule workspace (i.e., the core files excluding data assets) is limited to 5 GB, so a Reproducible Run will fail if files created during the run push the workspace past this limit. The scratch folder can be used during a run to safely create files or work with intermediate data of any size. Because the folder is emptied at the end of each run, any results that need to be kept must be moved to the results folder before the run finishes.

From within the code folder, scratch can be accessed using the relative path ../scratch (best practice) or the absolute path /scratch
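The pattern described above (intermediate files in scratch, final output moved to results) can be sketched as follows. This is a minimal example under stated assumptions: the function name run_step, the file names, and the toy computation are hypothetical, and the location of the results folder is passed in by the caller because this page does not specify it.

```python
import shutil
from pathlib import Path


def run_step(scratch: Path, results: Path) -> Path:
    """Build intermediate data in scratch, then move the final file to results.

    Hypothetical helper for illustration only.
    """
    scratch.mkdir(parents=True, exist_ok=True)
    results.mkdir(parents=True, exist_ok=True)

    # Intermediate data lives in scratch and is discarded when the run ends.
    intermediate = scratch / "intermediate.txt"
    intermediate.write_text("\n".join(str(i * i) for i in range(1000)))

    # Derive the final result from the intermediate file.
    total = sum(int(line) for line in intermediate.read_text().splitlines())
    summary = scratch / "summary.txt"
    summary.write_text(f"{total}\n")

    # Move the file that should survive the run into the results folder.
    return Path(shutil.move(str(summary), str(results / summary.name)))
```

From a script in the code folder, this could be called as run_step(Path("../scratch"), Path("../results")), where ../results is an assumed location for the results folder that mirrors the relative scratch path given above.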
