Data Asset

Prerequisites

  • Generated Access Token with Datasets scope

  • The Data Asset's ID

You can find the Data Asset's ID below its title.

Data Asset Object

  • created float64

    Data Asset creation time

  • description string

    Data Asset description

  • field string

    Field of research

  • id integer

    Metadata id

  • last_transferred integer

    Time Data Asset's files were last transferred to a different S3 storage location.

  • last_used integer

    Time the Data Asset was last used, in seconds since the Unix epoch.

  • mount string

    The default mount folder of the Data Asset.

  • name string

    Name of the Data Asset

  • size string

    Size in bytes of the Data Asset.

  • state enum

    Data Asset creation state

    • draft

      Data Asset is still being created.

    • ready

      Data Asset is ready to use.

    • failed

      Data Asset creation failed.

  • source_bucket

    Information on bucket from which Data Asset was created.

    • bucket string

      The original bucket's name.

    • origin enum

      • aws, local, gcp

    • prefix string

      The folder in the S3 bucket from which the Data Asset was created.

  • tags list<string>

    Keywords for searching the Data Asset by.

  • transfer_error

    The error that occurred during the last transfer attempt if it failed.

  • type enum

    Type of the Data Asset.

    • dataset

    • result

    • combined

    • model

  • app_parameters list

    The name and value of app panel parameters used to generate the result Data Asset.

  • contained_data_assets list

    List of structs containing information about the contained Data Assets, if the Data Asset is of type combined.

  • custom_metadata dictionary

    Custom metadata values set by the user, keyed by the custom metadata fields defined by the deployment admin.

  • nextflow_profile string

    Pipeline Nextflow profile used to generate the result Data Asset.

  • provenance dictionary

    Shows the Data Asset provenance if type is result.

    • commit

      Commit the Data Asset was created from.

    • run_script

      Script the Data Asset was created by.

    • DataAssets

      Data Assets that were used to create the Data Asset.

    • docker_image

      Docker image used to create the Data Asset.

    • Capsule

      Capsule used to create the Data Asset.

Create Data Asset

POST https://{codeocean-domain}/api/v1/data_assets

This API allows for the creation of Data Assets from either an S3 bucket or the results of a computation. Note: Admins and Capsule/Pipeline owners can create a Data Asset from another user's run.

Prerequisite

Before using this API call, you may require AWS Cloud Credentials configured as Secrets or an Assumable Role.

Request Body

Fields marked with * are required.

  • name* string

    Data Asset name.

  • description string

    Data Asset description.

  • mount* string

    Data Asset default mount folder.

  • tags* list<string>

    Keywords applied to the Data Asset to aid in searching.

  • custom_metadata dictionary

    Map of key-value pairs, according to the custom metadata fields defined by the admin. The value type depends on the field type:

    • string custom field: string

    • number custom field: number

    • date custom field: number (Unix epoch timestamp in seconds)

  • source* struct

    One of aws or computation.

    • aws*

      • bucket* string

        The S3 bucket from which the Data Asset will be created.

      • prefix* path

        The folder in the S3 bucket from which the Data Asset will be created.

      • keep_on_external_storage* boolean

        When set to true, the Data Asset files will not be copied to Code Ocean.

      • public* boolean

        When set to true, Code Ocean will try to access the source bucket without credentials.

      • use_input_bucket boolean

        When set to true, Code Ocean will try to create the Data Asset from an internal input bucket. Only allowed for Admin users. All aws properties except prefix are ignored.

    • computation

      • id string

        Computation ID.

      • path path

        Results path. Leave empty to capture all result files.

  • target struct

    Optional. Designates an S3 storage location outside of Code Ocean. Only applicable when the source type is computation.

    • aws

      • bucket string

        The S3 bucket in which the Data Asset will be created.

      • prefix path

        The folder in the S3 bucket in which the Data Asset will be created.

  • results_info struct

    When the source of the data is a result in an S3 bucket originating from an exported Capsule/Pipeline, additional information can be provided to populate lineage and provenance.

    • capsule/pipeline_id string

      The ID of the Capsule or Pipeline that was executed. Must be provided when using results_info.

    • version string

      Capsule or Pipeline Release version.

    • commit string

      Commit hash of the Capsule/Pipeline code at the time of execution.

    • run_script string

      Path to the script that was executed, relative to the /Capsule folder, when it was not the default code/run.

    • data_assets array<string>

      IDs of Data Assets used during the run.

    • parameters array<dictionary>

      Run parameters.

      • name string

        Parameter label.

      • param_name string

        Parameter name.

      • value string

        Parameter value.

    • nextflow_profile string

      Pipeline Nextflow profile.

    • processes array

      Pipeline processes' information.

      • name string

        Pipeline process name as it appears in the main.nf script.

      • capsule_id string

        The ID of the Capsule executed in the process.

      • version integer

        Release Capsule version.

      • public boolean

        When set to true, indicates the Capsule is a public Code Ocean App.

      • parameters array<dictionary>

        Run parameters.

        • name string

          Parameter title.

        • param_name string

          Parameter name.

        • value string

          Parameter value.

Scope

Type
Permission

Data Asset

Read & Write

Create a Data Asset from a Public S3 Bucket
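
A minimal sketch of a request body for this case, using only the fields from the Request Body section. The helper name s3_source_body and all example values (bucket, prefix, tags) are hypothetical, and sending the request (HTTP client, authentication) is left out.

```python
import json


def s3_source_body(name, mount, tags, bucket, prefix, public=True, keep_external=False):
    """Build a Create Data Asset body whose source is an S3 bucket."""
    return {
        "name": name,
        "mount": mount,
        "tags": list(tags),
        "source": {
            "aws": {
                "bucket": bucket,
                "prefix": prefix,
                # False: the files are copied to Code Ocean.
                "keep_on_external_storage": keep_external,
                # True: Code Ocean tries to access the bucket without credentials.
                "public": public,
            }
        },
    }


# Hypothetical values, for illustration only.
body = s3_source_body(
    "example-dataset", "example-dataset", ["example"],
    "example-public-bucket", "path/to/data",
)
print(json.dumps(body, indent=2))
```

For a private bucket, the same body with public=False would presumably apply, with the credentials mentioned under Prerequisite configured.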

Create a Data Asset from a Private S3 Bucket

Create an External Result Data Asset

Create a Result Data Asset
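
A sketch of a request body that captures the results of a computation, based on the source.computation and target fields in the Request Body section. The helper name result_body and the example values are hypothetical. Passing a target bucket corresponds to the external result case.

```python
def result_body(name, mount, tags, computation_id, path=None,
                target_bucket=None, target_prefix=None):
    """Build a Create Data Asset body whose source is a computation."""
    computation = {"id": computation_id}
    if path:
        # Assumption: omitting path is equivalent to leaving it empty,
        # which captures all result files.
        computation["path"] = path
    body = {
        "name": name,
        "mount": mount,
        "tags": list(tags),
        "source": {"computation": computation},
    }
    if target_bucket is not None:
        # Optional: store the result in S3 outside of Code Ocean.
        body["target"] = {"aws": {"bucket": target_bucket,
                                  "prefix": target_prefix or ""}}
    return body


# Hypothetical computation ID and bucket, for illustration only.
internal = result_body("run-output", "run-output", ["result"], "example-computation-id")
external = result_body("run-output", "run-output", ["result"], "example-computation-id",
                       target_bucket="example-bucket", target_prefix="results/run-1")
```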

Create a Result Data Asset from External Result
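
A sketch of how results_info might accompany an S3 source to record provenance for a Capsule run. The Request Body section writes the ID field as capsule/pipeline_id; the key capsule_id used here is an assumption, as are the helper name, the keep_on_external_storage and public settings, and all example values.

```python
def external_result_body(name, mount, tags, bucket, prefix, capsule_id,
                         commit=None, data_assets=()):
    """Build a Create Data Asset body for a result exported to S3."""
    # Assumption: the Capsule ID is sent under the key "capsule_id".
    results_info = {"capsule_id": capsule_id}
    if commit:
        results_info["commit"] = commit
    if data_assets:
        results_info["data_assets"] = list(data_assets)
    return {
        "name": name,
        "mount": mount,
        "tags": list(tags),
        "source": {
            "aws": {
                "bucket": bucket,
                "prefix": prefix,
                # Illustrative choices: leave the files in S3, use credentials.
                "keep_on_external_storage": True,
                "public": False,
            }
        },
        "results_info": results_info,
    }


# Hypothetical IDs and bucket, for illustration only.
body = external_result_body(
    "exported-result", "exported-result", ["result"],
    "example-bucket", "exports/run-1", "example-capsule-id",
    commit="0123abc", data_assets=["example-data-asset-id"],
)
```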

Create a Combined Data Asset

Create a Data Asset with Custom Metadata Tags
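
A sketch of a custom_metadata map with one value of each documented kind. Date fields take a number holding a Unix epoch timestamp in seconds, so the date is converted first. The field names below are hypothetical; real keys are whatever the admin has defined.

```python
from datetime import datetime, timezone


def to_epoch_seconds(moment):
    """Convert a timezone-aware datetime to whole seconds since the Unix epoch."""
    return int(moment.timestamp())


# Hypothetical custom metadata fields, for illustration only.
custom_metadata = {
    "Species": "Mus musculus",  # string custom field
    "Sample count": 42,         # number custom field
    # date custom field, sent as epoch seconds
    "Collection date": to_epoch_seconds(datetime(2024, 1, 1, tzinfo=timezone.utc)),
}
```

The map would then be sent as the custom_metadata field of the request body.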

Note: The API only confirms that the creation request is valid, not that the creation succeeded, because creation takes time. Poll the Data Asset's details and monitor its state until it is ready.
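
The polling described above can be sketched as follows. The fetch argument is a hypothetical callable that returns the Data Asset object for an ID (for example, a wrapper around the Get Data Asset call); the interval and attempt limits are arbitrary defaults, not values taken from the API.

```python
import time


def wait_until_ready(fetch, data_asset_id, interval=5.0, max_attempts=60,
                     sleep=time.sleep):
    """Poll a Data Asset until its state is ready, failing fast on failed."""
    for _ in range(max_attempts):
        asset = fetch(data_asset_id)
        state = asset.get("state")
        if state == "ready":
            return asset
        if state == "failed":
            raise RuntimeError(f"Data Asset {data_asset_id} creation failed")
        # state is draft (still being created): wait and try again.
        sleep(interval)
    raise TimeoutError(f"Data Asset {data_asset_id} not ready after {max_attempts} checks")
```

Passing sleep as a parameter keeps the loop easy to exercise without real delays.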

Get Data Asset

GET https://{codeocean-domain}/api/v1/data_assets/{data_asset_id}

This API retrieves metadata for your Data Asset.
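
A sketch of building this request with the Python standard library. The authentication scheme is an assumption not stated on this page: HTTP Basic with the access token as the username and an empty password. The function name and example values are hypothetical, and nothing is sent until the request is passed to urllib.request.urlopen.

```python
import base64
import urllib.request


def get_data_asset_request(domain, token, data_asset_id):
    """Build (but do not send) a GET request for a Data Asset's metadata."""
    url = f"https://{domain}/api/v1/data_assets/{data_asset_id}"
    # Assumption: Basic auth, token as username, empty password.
    credentials = base64.b64encode(f"{token}:".encode()).decode()
    return urllib.request.Request(
        url,
        method="GET",
        headers={"Authorization": f"Basic {credentials}"},
    )


# Hypothetical domain, token, and ID, for illustration only.
request = get_data_asset_request("codeocean.example.com", "example-token", "example-id")
```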

Path Parameters

Name
Type

data_asset_id *

string

Scope

Type
Permission

Data Asset

Read & Write

Update Metadata

PUT https://{codeocean-domain}/api/v1/data_assets/{data_asset_id}

This API updates the metadata of your Data Asset.

Path Parameters

Name
Type

data_asset_id*

string

Request Body

Name
Type

name*

string

description*

string

tags*

list<string>

mount*

string

custom_metadata

dictionary

Scope

Type
Permission

Data Asset

Write

Archive/Unarchive a Data Asset

PATCH https://{codeocean-domain}/api/v1/data_assets/{data_asset_id}/archive?archive={true|false}

This API archives or unarchives your Data Asset.

Path Parameters

Name
Type

data_asset_id*

string

Query Parameters

Name
Type

archive*

boolean

Scope

Type
Permission

Data Asset

Read & Write

Search Data Assets

POST https://{codeocean-domain}/api/v1/data_assets/search

This API allows for the searching of Data Assets in your deployment.

Request Body (all fields optional)

  • offset int

    Specifies the starting index for the search.

  • limit int

    Specifies how many items to return (up to 1000, defaults to 100).

  • next_token string

    The token for the next page of results, as provided in the previous response. If both offset and next_token are set, offset is ignored.

  • sort_order asc, desc

    Determines the result sort order. Must be provided together with sort_field; otherwise it is ignored.

  • sort_field created, type, name, size

    Determines the field to sort by (default created).

  • query string

    The search query. Can be free text or in the form “name:... tag:... run_script:... commit_id:...”.

  • type dataset, result, combined, or model

    Specifies the type of Data Asset to include in the response. If omitted, results may include all types.

  • ownership created, shared

    Search Data Assets by ownership. Defaults to all accessible Data Assets; admins have access to all Data Assets in the system.

    • created

      Only Data Assets created by the user.

    • shared

      Data Assets shared with the user.

  • origin internal, external

    Designates whether to return only internal or only external Data Assets.

  • favorite boolean

    Search only favorite Data Assets.

  • archived boolean

    Search only archived Data Assets.

  • filters list

    • key string

      Field key. Can be any of name, description, tags, or a custom field key defined by the admin.

    • value

      Field value to be included/excluded (optional).

    • values

      Field values in case of multiple values (optional).

    • range

      Field range to be included/excluded (only one of min/max needs to be set).

      • min number

      • max number

    • exclude boolean

      Whether to include or exclude the field value.
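
A sketch of a search body that combines a free-text query with filters, following the field list above. The helper name and the custom field key example_numeric_field are hypothetical; the limit check mirrors the documented maximum of 1000.

```python
def search_body(query=None, filters=(), limit=100):
    """Build a Search Data Assets request body."""
    if not 1 <= limit <= 1000:
        raise ValueError("limit must be between 1 and 1000")
    body = {"limit": limit}
    if query:
        body["query"] = query
    if filters:
        body["filters"] = list(filters)
    return body


filters = [
    # Include Data Assets tagged "example".
    {"key": "tags", "value": "example"},
    # Hypothetical numeric custom field: only a lower bound is set.
    {"key": "example_numeric_field", "range": {"min": 10}},
    # Exclude Data Assets whose name matches "scratch".
    {"key": "name", "value": "scratch", "exclude": True},
]
body = search_body(query="rna", filters=filters, limit=50)
```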

Response

Name
Type
Description

has_more

boolean

Indicates if there are more results.

next_token

string

Specifies the next page token for the next request.

results

array

Array of Data Assets found.

Note: In the Python SDK, the search_data_assets_iterator function repeatedly calls the search_data_assets function to return all the Data Assets that match the data_asset_params query. The limit parameter is the page size, i.e. the number of Data Assets returned on each iteration, so setting it to 1000 provides the best performance.
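
The paging pattern itself, independent of the SDK, can be sketched with the has_more and next_token response fields. This is not the SDK's implementation; search is a hypothetical callable that sends one search request body and returns the parsed response.

```python
def iter_search_results(search, params, page_size=1000):
    """Yield every matching Data Asset, following next_token across pages."""
    request = dict(params, limit=page_size)
    while True:
        page = search(request)
        yield from page.get("results", [])
        if not page.get("has_more"):
            return
        # Ask for the following page using the token from this response.
        request = dict(request, next_token=page["next_token"])
```

Because it is a generator, callers can stop early without fetching the remaining pages.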

Response Example

{

'has_more': True/False, (indicates whether there are more results)

'next_token': 'sometoken',

'results': [Data Asset Object]

}

Update Permissions of a Data Asset

POST https://{codeocean-domain}/api/v1/data_assets/{data_asset_id}/permissions

This API updates the permissions associated with a Data Asset in your deployment.

Path Parameters

Name
Type

data_asset_id*

string

Request Body

Name
Type

users*

array<dict>

groups*

array<dict>

everyone*

string

share_assets

bool

Delete Data Asset

DELETE https://{codeocean-domain}/api/v1/data_assets/{data_asset_id}

This API deletes your Data Asset.

Path Parameters

Name
Type

data_asset_id *

string

Scope

Type
Permission

Data Asset

Read & Write

Transfer Data Asset

POST https://{codeocean-domain}/api/v1/data_assets/{data_asset_id}/transfer

This is an Admin only API that allows transferring a Data Asset's files to a different S3 storage location. This can be used to convert an Internal Data Asset to External, or to change the storage location of an External Data Asset. When applied to Result Data Assets, provenance is maintained.

Note: When transferring files from an External Data Asset linked to a public S3 bucket, the S3 bucket should have ACLs disabled as described here.

Path Parameters

Name
Type

data_asset_id*

string

Request Body

  • target* struct

    Struct defining the target storage location.

    • aws*

      • bucket* string

        The S3 bucket the Data Asset's files will be transferred into.

      • prefix* string

        The folder in the S3 bucket in which the Data Asset's files will be placed.

  • force boolean

    Perform the transfer even if there are Release Pipelines using the Data Asset.

Note: Changing the storage location of a Data Asset will break any Pipeline that uses it, because of how the Data Asset is referenced in the main.nf file. As such, the force field must be set when transferring a Data Asset used by a Release Pipeline, and the Release Pipeline must then be updated.
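
A sketch of the transfer request body. It assumes force sits at the top level of the body next to target, which the field list does not state explicitly; the helper name and example values are hypothetical.

```python
def transfer_body(bucket, prefix, force=False):
    """Build a Transfer Data Asset request body."""
    body = {"target": {"aws": {"bucket": bucket, "prefix": prefix}}}
    if force:
        # Needed when a Release Pipeline uses the Data Asset;
        # the Release Pipeline must be updated afterwards.
        body["force"] = True
    return body


# Hypothetical bucket and prefix, for illustration only.
body = transfer_body("example-bucket", "data-assets/example", force=True)
```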

List Data Asset Files

POST https://{codeocean-domain}/api/v1/data_assets/{data_asset_id}/files

This API allows for listing of an Internal Data Asset's files.

Note: To list files in an External Data Asset, interact with the S3 bucket directly, e.g. using the AWS CLI. S3 information can be obtained from the source_bucket field returned by the Get Data Asset API.

Path Parameters

Name
Type

data_asset_id*

string

Request Body

Name
Type
Description

path

string

The path of a folder within the Data Asset. Empty path will retrieve a list of all files or folders at the root level.

Scope

Type
Permission

Data Asset

Read

Get Data Asset File URLs

GET https://{codeocean-domain}/api/v1/data_assets/{data_asset_id}/files/urls?path={path_to_file}

This API allows for the generation of two URLs to a file in an Internal Data Asset:

  • download_url - signed URL for downloading the file

  • view_url - signed URL for viewing the file in the browser
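
A sketch of building the endpoint URL with the file path safely encoded in the query string, since file paths can contain spaces or other reserved characters. The function name and example values are hypothetical.

```python
from urllib.parse import urlencode


def file_urls_endpoint(domain, data_asset_id, path_to_file):
    """Build the Get Data Asset File URLs endpoint for one file."""
    query = urlencode({"path": path_to_file})
    return f"https://{domain}/api/v1/data_assets/{data_asset_id}/files/urls?{query}"


# Hypothetical domain, ID, and file path, for illustration only.
url = file_urls_endpoint("codeocean.example.com", "example-id", "results/my file.csv")
```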

Note: This API was introduced in Code Ocean version 4.0. The previous Get Data Asset File Download URL API is deprecated but will be supported until August 2026.

Path Parameters

Name
Type

data_asset_id*

string

Query Parameters

Name
Type

path*

string

Scope

Type
Permission

Data Asset

Read

Response

  • download_url string Download file URL

  • view_url string View file URL
