Adding a New Dataset

You can add a new dataset from the DATA ASSETS page or from a capsule:

  1. Go to the DATA ASSETS page.

  2. Click + Add dataset to add a new dataset.

Clicking + Add dataset opens an interactive form.

By default, a new data asset is private (i.e., only the owner can see it). To learn how to share a data asset with others, see Sharing Data Assets.

Upload From Your Local Machine

  1. Click + Add dataset.

  2. Click Next (local files is the default option).

  3. Drag and drop the files or folders you want to upload from your local drive, or click Choose Files to browse.

  4. Complete the fields:

    • Dataset Name (required)—Use a meaningful name so that others can easily find the dataset.

    • Default Folder (required)—The folder name inside a capsule. Use a name that’s similar to the dataset name. Spaces and some special characters are not allowed here.

    • Description (optional)—Add some text to make the dataset easy to find and understand.

    • Tags (required)—Tags are another way to help people find your dataset.

  5. Click Add Dataset to finish.
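
Because the Default Folder field rejects spaces and some special characters, it can help to derive the folder name from the dataset name before filling in the form. The sketch below is one way to do that. The function name suggest_default_folder and the set of characters it keeps (letters, digits, '.', '_' and '-') are assumptions made for illustration; the platform's actual rules may be looser or stricter.

```python
import re


def suggest_default_folder(dataset_name: str) -> str:
    """Suggest a folder name that resembles the dataset name.

    Assumption: only letters, digits, '.', '_' and '-' are safe.
    The real list of disallowed characters is not documented here.
    """
    # Collapse each run of whitespace into a single underscore.
    name = re.sub(r"\s+", "_", dataset_name.strip())
    # Drop every character outside the assumed safe set.
    name = re.sub(r"[^A-Za-z0-9._-]", "", name)
    # Lowercasing is a stylistic choice, not a platform requirement.
    return name.lower()


print(suggest_default_folder("Mouse Brain Atlas (v2)"))  # mouse_brain_atlas_v2
```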

You can upload individual files or a whole folder. Each file is limited to 5 GB. The total size of a folder is not limited, but the upload must finish within the 24-hour timeout.

Uploads cannot be resumed: if an upload is interrupted (by the timeout or any other issue), you have to start it again from the beginning.
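
Since an interrupted upload has to be restarted, it may be worth checking a folder against these limits before starting. The following is a minimal local sketch, not part of the platform: the helper names are hypothetical, the 5 GB limit is assumed to mean 5 x 10^9 bytes (the more conservative reading), and the time estimate assumes a steady connection speed that you supply yourself.

```python
import os

MAX_FILE_BYTES = 5 * 10**9    # assumed: 5 GB counted in decimal bytes
UPLOAD_TIMEOUT_HOURS = 24     # upload timeout stated in this guide


def find_oversized_files(root, limit=MAX_FILE_BYTES):
    """Return (files larger than limit, total bytes) for a folder tree."""
    oversized = []
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            if not os.path.isfile(path):
                continue  # skip broken symlinks and other non-files
            size = os.path.getsize(path)
            total += size
            if size > limit:
                oversized.append((path, size))
    return oversized, total


def estimated_upload_hours(total_bytes, megabits_per_second):
    """Rough upload time in hours at a constant link speed."""
    bytes_per_second = megabits_per_second * 1_000_000 / 8
    return total_bytes / bytes_per_second / 3600
```

For example, calling find_oversized_files on the folder you plan to upload lists any file that would be rejected, and passing the returned total to estimated_upload_hours gives a figure to compare against UPLOAD_TIMEOUT_HOURS.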

Import From a Cloud Provider

  1. Click + Add dataset.

  2. Choose AWS S3 Bucket or Google Cloud and then click Next.

  3. Provide information about the bucket you want to use.

    • You can import the entire bucket or a specific folder. If the bucket is protected, click Advanced Settings to enter your credentials and then click Next.

    • For AWS users, leave Keep files on external storage unchecked. To add a remote dataset instead, see Establish an External Link to an AWS S3 Bucket.

  4. Complete the fields:

    • Dataset Name (required)— Use a meaningful name so that others can find the dataset easily.

    • Default Folder (required)—The folder name inside a capsule. Use a name that’s similar to the dataset name. Spaces and some special characters are not allowed here.

    • Description (optional)—Add some text to make the dataset easy to find and understand.

    • Tags (required)—Tags are another way to help people find your dataset.

  5. Click Add Dataset to finish.

Establish an External Link to an AWS S3 Bucket

To add a remote dataset whose files stay in your S3 bucket:

  1. Click + Add dataset.

  2. Click AWS S3 Bucket and then click Next.

  3. Specify the Bucket Name and the Folder Name.

  4. Check Keep files on external storage.

  5. Select your AWS credentials (see the Secret Management Guide if you need help setting up a secret).

  6. Complete the fields:

    • Dataset Name (required)—Use a meaningful name so that others can find the dataset easily.

    • Default Folder (required)—The folder name inside a capsule. Use a name that’s similar to the dataset name. Spaces and some special characters are not allowed here.

    • Description (optional)—Add some text to make the dataset easy to find and understand.

    • Tags (required)—Tags are another way to help people find your dataset.

  7. Click Add Dataset to finish.
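
The form asks for the Bucket Name and the Folder Name separately, while S3 locations are often shared as a single s3:// URI. The sketch below splits such a URI into the two values. The function name split_s3_uri and the example URI are hypothetical, and whether the form expects the folder with or without surrounding slashes is an assumption (the sketch strips them).

```python
from urllib.parse import urlparse


def split_s3_uri(uri: str) -> tuple[str, str]:
    """Split 's3://bucket/some/folder/' into (bucket, folder)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an s3:// URI: {uri!r}")
    # The host part is the bucket; the path is the folder.
    # Leading and trailing slashes are stripped (assumed form input style).
    return parsed.netloc, parsed.path.strip("/")


bucket, folder = split_s3_uri("s3://example-bucket/raw/2024/")
print(bucket)  # example-bucket
print(folder)  # raw/2024
```

An empty folder value means the URI points at the bucket root, which corresponds to using the entire bucket.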

Create a New Dataset from the Scratch Folder

To create a dataset from the scratch folder that was created in the Cloud Workstation:

  1. From the dropdown list, click Create dataset.

  2. Provide the Title, Description, and Tags.

The new dataset can then be viewed and used in capsules.
