Data Asset API

Prerequisites

  • Token with datasets scope

  • The data asset's ID to pass to the API call

You can find the data asset's ID below the title

Creating a Dataset from Computation

POST https://{domain}.codeocean.com/api/v1/data_assets

This API creates a data asset from the results of a previous computation (capsule run) using the specified parameters.

Path Parameters

  • POST* - /data_assets

  • {domain} - your VPC domain

Headers

  • -u*

    • Authorize with your Code Ocean API secret: -u $API_SECRET:

    • This sets the "Authorization: Basic" header to the base64-encoded credentials

  • -H*

    • Set this to: Content-Type: application/json

  • --data-raw

    • JSON data describing the data asset to create
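The -u flag makes curl send HTTP Basic credentials, with the API secret as the username and an empty password. As a minimal sketch of what that header value looks like (the helper name basic_auth_header is hypothetical and the secret is a placeholder, not a real token):

```python
import base64


def basic_auth_header(api_secret: str) -> str:
    """Return the Authorization header value for a secret with an empty password."""
    # HTTP Basic auth encodes "username:password"; the password is empty here,
    # which is why the curl examples end the -u value with a colon.
    credentials = f"{api_secret}:".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


if __name__ == "__main__":
    print(basic_auth_header("abc"))  # Basic YWJjOg==
```

With the requests library, passing auth=(api_secret, '') should produce the same header without building it by hand.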

Request Body

  • name* - string

    • The name of the data asset

  • description* - string/optional

    • A description for the data asset

  • mount* - string

    • Data asset default mount folder

  • source* - Computation ID

    • ID from previous capsule run

  • tags - string/list

    • Keywords for searching the data asset by

{
    "created":"created ID",
    "description":"the description provided",
    "files":"the number of the files in the dataset",
    "id":"dataset's ID",
    "lastUsed":0,
    "name":"name of the dataset",
    "sizeInBytes":"size of the dataset",
    "state":"DATA_ASSET_STATE_DRAFT",
    "tags":"the tags provided",
    "type":"DATA_ASSET_TYPE_DATASET"
}
Request Response Description
  • created - number

    • the data asset creation time in seconds from unix epoch.

  • description - string

    • a description of the data asset

  • id - string

    • the data asset internal id

  • last_used - number

    • the time this data asset was last used in seconds from unix epoch

  • name - string

    • name of the data asset

  • provenance - object

    • the capsule, commit, docker image, and run script that produced the data asset

  • state - DRAFT, READY, FAILED

    • the data asset creation state. Can be one of the following:

      • DRAFT - the data asset is still being created

      • READY - the data asset is ready for use

      • FAILED - the data asset creation failed

  • tags - array of strings

    • keywords for searching the data asset by

  • type - DATASET, RESULT

    • the type of the data asset. Can be one of the following

      • DATASET

      • RESULT
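Because a newly created data asset starts in the draft state, a caller will often want to wait until it becomes ready or fails. The sketch below shows one possible polling loop. The function names, the attempt count, and the delay are assumptions for illustration; fetch_state stands for any callable that returns the current "state" value, for example one that issues a GET request for the data asset.

```python
import time
from typing import Callable


def normalize_state(raw_state: str) -> str:
    """Map 'DATA_ASSET_STATE_READY', 'READY' or 'ready' to 'ready'."""
    return raw_state.lower().rsplit("_", 1)[-1]


def wait_until_ready(
    fetch_state: Callable[[], str],
    attempts: int = 30,
    delay_seconds: float = 10.0,
) -> str:
    """Poll fetch_state until the data asset is ready, fails, or attempts run out."""
    for attempt in range(attempts):
        state = normalize_state(fetch_state())
        if state == "ready":
            return state
        if state == "failed":
            raise RuntimeError("data asset creation failed")
        # Still in draft: wait before asking again, except after the last try.
        if attempt < attempts - 1:
            time.sleep(delay_seconds)
    raise TimeoutError("data asset is still in draft state")
```

Passing the fetch step in as a callable keeps the loop independent of any particular HTTP client and makes it easy to exercise with canned states.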

Request Format
curl -H "Content-Type: application/json" -u ${API_SECRET}: -X POST https://{domain}/api/v1/data_assets --data-raw '{
"name": "Data asset From API",
"description": "An example for creating data asset from CO API",
"mount": "ProteinFile",
"tags": [ "proteomics", "bioinformatics" ],
"source": {
"computation": {
"id": "c229ed13-ec06-43d0-abd9-4d481af3f5e3"
}
}
}'
Request Example Bash
curl -H "Content-Type: application/json" -u ${API_SECRET}: -X POST https://apps.codeocean.com/api/v1/data_assets --data-raw '{
"name": "Data asset From API",
"description": "An example for creating data asset from CO API",
"mount": "ProteinFile",
"tags": [ "proteomics", "bioinformatics" ],
"source": {
"computation": {
"id": "c229ed13-ec06-43d0-abd9-4d481af3f5e3"
}
}
}'
Request Example Python
import os, requests 


headers = {
  "Content-Type": "application/json"
} 


json_data = {
  "name": "Data asset From API",
  "description": "An example for creating data asset from CO API",
  "mount": "ProteinFile",
  "tags": [
    "proteomics",
    "bioinformatics"
  ],
  "source": {
    "computation": {
      "id": "c229ed13-ec06-43d0-abd9-4d481af3f5e3"
    }
  }
}
 


response = requests.post( 'https://apps.codeocean.com/api/v1/data_assets', headers=headers, 
json=json_data, 
auth=(os.getenv('API_SECRET', ''), ''), 
)
Request Example Response
{
  "app_parameters": [
    {
      "name": "Assembly ID",
      "value": "1"
    },
    {
      "name": "Assembly ID",
      "value": "1"
    }
  ],
  "created": 1689606430,
  "description": "An example for creating data asset from CO API",
  "id": "3ae5a2bc-b217-4d28-bfe8-d7d3f9692016",
  "last_used": 0,
  "name": "",
  "provenance": {
    "capsule": "7a3357c8-bb41-4934-b229-19dcf594a32d",
    "commit": "d3cdc2c45ca4a9b331dcfac114fd90684b354231",
    "docker_image": "a440b7ff-cbe0-4561-bd60-fc322d17a45a",
    "run_script": "code/run"
  },
  "state": "draft",
  "tags": [
    "proteomics",
    "bioinformatics"
  ],
  "type": "result"
}
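The created and last_used values are expressed in seconds since the Unix epoch. As a small sketch of turning such a value into a readable UTC timestamp (the helper name epoch_to_iso is hypothetical):

```python
from datetime import datetime, timezone


def epoch_to_iso(seconds: float) -> str:
    """Convert seconds since the Unix epoch to an ISO 8601 string in UTC."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc).isoformat()


if __name__ == "__main__":
    # "created" value taken from the example response above.
    print(epoch_to_iso(1689606430))  # 2023-07-17T15:07:10+00:00
```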

Create a Dataset from a Public Bucket

POST https://{domain}.codeocean.com/api/v1/data_assets

This API allows for the creation of datasets from a public bucket using specified parameters.

Path Parameters

  • POST* - /data_assets

  • {domain} - your VPC domain

Headers

  • -u*

    • Authorize with your Code Ocean API secret: -u $API_SECRET:

    • This sets the "Authorization: Basic" header to the base64-encoded credentials

  • --data-raw*

    • JSON data describing the data asset to create

  • -H

    • Set this to: Content-Type: application/json

Request Body

  • name* - string

    • Data asset name

  • description* - string/optional

    • Data asset description

  • mount* - string

    • Data asset default mount folder

  • tags* - list/string

    • Keywords for searching the data asset by

  • source*

    • bucket* - string

      • Name of your Public Data Bucket

    • prefix* - string

      • Path to your Directory

    • keep_on_external_storage* - boolean

      • Whether to keep the data on external storage

    • index_data* - boolean
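The two boolean fields must be sent as JSON booleans, not as the strings "False" or "True". The following sketch builds the request body for a public bucket and rejects string flags early. The function name public_bucket_payload is hypothetical; the nesting under source.aws follows the request format shown in this section.

```python
import json
from typing import List


def public_bucket_payload(name: str, description: str, mount: str, tags: List[str],
                          bucket: str, prefix: str,
                          keep_on_external_storage: bool = False,
                          index_data: bool = False) -> dict:
    """Build the JSON body for creating a data asset from a public bucket."""
    for flag in (keep_on_external_storage, index_data):
        # A string such as "False" would be serialized as a JSON string,
        # not as the JSON boolean false.
        if not isinstance(flag, bool):
            raise TypeError("boolean fields must be True or False, not strings")
    return {
        "name": name,
        "description": description,
        "mount": mount,
        "tags": list(tags),
        "source": {
            "aws": {
                "bucket": bucket,
                "prefix": prefix,
                "keep_on_external_storage": keep_on_external_storage,
                "index_data": index_data,
            }
        },
    }


if __name__ == "__main__":
    body = public_bucket_payload(
        "import public AWS bucket with dataset api", "meaningful-c", "citations",
        ["Genomics"], "codeocean-public-data", "example_datasets/ATAC/hg38_2bit/")
    print(json.dumps(body, indent=2))
```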

{
   "created": 1633277005,
   "description": "a descriptive description",
   "files": 0,
   "id": "fea84ebf-b58b-4ad2-994d-7169dc3880fb",
   "lastUsed": 0,
   "name": "my dataset",
   "sizeInBytes": 0,
   "state": "DATA_ASSET_STATE_DRAFT",
   "tags": [ "t1", "t2" ],
   "type": "DATA_ASSET_TYPE_DATASET"
}
Request Format
curl -H "Content-Type: application/json" -u ${API_SECRET}: -X POST https://{domain}/api/v1/data_assets --data-raw '{
"name": "myDatasetFromPublic",
"description": "a descriptive description",
"mount": "Mymount",
"tags": ["t1", "t2"],
"source": {
"aws": {
"bucket": "Public Bucket",
"prefix": "PREFIX",
"keep_on_external_storage":false,
"index_data":false
}
}
}'
Request Example Bash
curl --location --request POST 'https://apps.codeocean.com/api/v1/data_assets' \
--header 'Content-Type: application/json' \
-u "${API_SECRET}:" \
--data-raw '{
"name":"import public AWS bucket with dataset api",
"description":"meaningful-c",
"mount":"citations",
"tags":["Genomics"],
"source":{
"aws":{
"bucket":"codeocean-public-data",
"prefix":"example_datasets/ATAC/hg38_2bit/",
"keep_on_external_storage":false,
"index_data":false
}
}
}'
Request Example Python
import os, requests 


headers = {
  "Content-Type": "application/json"
} 


json_data = {
  "name": "import public AWS bucket with dataset api",
  "description": "meaningful-c",
  "mount": "citations",
  "tags": [
    "Genomics"
  ],
  "source": {
    "aws": {
      "bucket": "codeocean-public-data",
      "prefix": "example_datasets/ATAC/hg38_2bit/",
      # Python booleans are serialized as JSON false/true
      "keep_on_external_storage": False,
      "index_data": False
    }
  }
}

# Basic auth: the API secret is the username and the password is empty
response = requests.post( 'https://apps.codeocean.com/api/v1/data_assets', headers=headers, json=json_data, auth=(os.getenv('API_SECRET', ''), ''),
)
Request Example Response
{
  "created": 1689618780,
  "description": "meaningful-c",
  "id": "c39f20e6-9ded-4460-8292-76fc42fd1c00",
  "last_used": 0,
  "name": "import public AWS bucket with dataset api",
  "state": "draft",
  "tags": [
    "Genomics"
  ],
  "type": "dataset"
}
curl --location --request GET 'https://acmecorp.codeocean.com/api/v1/data_assets/37a93748-ce90-4980-913b-2de0908d5212' \
-u "${CUSTOM_KEY}:"

Create a Data Asset from Private Bucket

POST https://{domain}.codeocean.com/api/v1/data_assets

This API allows for the creation of datasets from an S3 or GCP bucket using specified parameters.

Path Parameters

  • POST* - /data_assets

Headers

  • -u

    • Authorize with your Code Ocean API secret: -u $API_SECRET:

    • This sets the "Authorization: Basic" header to the base64-encoded credentials

  • --data-raw

    • JSON data describing the data asset to create

  • -H

    • Set this to: Content-Type: application/json

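Request Body

  • name* - string

    • Data asset name

  • description* - string/optional

    • Data asset description

  • mount* - string

    • Data asset default mount folder

  • tags* - list/string

    • Keywords for searching the data asset by

  • source*

    • index_data* - boolean

    • keep_on_external_storage* - boolean

      • Whether to keep the data on external storage

    • prefix* - string

      • Path to your Directory

    • bucket* - string

      • Name of your bucket

    • access_key_id - string

      • AWS access key ID, as passed in the request format

    • secret_access_key - string

      • AWS secret access key, as passed in the request format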
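For a private bucket the request also carries AWS credentials, which the curl request format in this section reads from the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables. A possible Python equivalent is sketched below. The function name private_bucket_source is hypothetical, and failing early on missing variables is a design choice of this sketch, not documented API behavior.

```python
import os
from typing import Mapping


def private_bucket_source(bucket: str,
                          environ: Mapping[str, str] = os.environ,
                          keep_on_external_storage: bool = True,
                          index_data: bool = True) -> dict:
    """Build the "source" object for a private AWS bucket from environment credentials."""
    required = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")
    missing = [key for key in required if not environ.get(key)]
    if missing:
        # Fail before sending a request that cannot be authorized.
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return {
        "aws": {
            "bucket": bucket,
            "keep_on_external_storage": keep_on_external_storage,
            "index_data": index_data,
            "access_key_id": environ["AWS_ACCESS_KEY_ID"],
            "secret_access_key": environ["AWS_SECRET_ACCESS_KEY"],
        }
    }
```

Keeping the credentials in the environment avoids hard-coding them in scripts; the returned dictionary can be placed under the "source" key of the request body.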

{
    "has_more" - boolean: indicates whether there are more results
    "results" - array: array of datasets found
}
Request Format
curl --location --request POST 'https://{domain}/api/v1/data_assets' \
--header 'Content-Type: application/json' \
-u "${API_SECRET}:" \
--data-raw '{
"name": "My External Data Asset",
"description": "External Indexed Dataset From API",
"mount": "external-indexed",
"tags": [ "t1","t2"],
"source": {
"aws": {
"bucket": "codeocean-datasetapi-test-cs",
"keep_on_external_storage": true,
"index_data": true,
"access_key_id": "'"$AWS_ACCESS_KEY_ID"'",
"secret_access_key": "'"$AWS_SECRET_ACCESS_KEY"'"
}
}
}'
Request Example Bash
curl --location --request POST 'https://apps.codeocean.com/api/v1/data_assets' \
--header 'Content-Type: application/json' \
-u "${API_SECRET}:" \
--data-raw '{
"name":"import public AWS bucket with dataset api",
"description":"meaningful-c",
"mount":"citations",
"tags":["Genomics"],
"source":{
"aws":{
"bucket":"codeocean-public-data",
"prefix":"example_datasets/ATAC/hg38_2bit/",
"keep_on_external_storage":false,
"index_data":false
}
}
}'
Request Example Python
import os, requests 


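# Only HTTP headers belong here; the request body goes in json_data below
headers = {
  "Content-Type": "application/json"
}

json_data = {
  "name": "import public AWS bucket with dataset api",
  "description": "meaningful-c",
  "mount": "citations",
  "tags": [
    "Genomics"
  ],
  "source": {
    "aws": {
      "bucket": "codeocean-public-data",
      "prefix": "example_datasets/ATAC/hg38_2bit/",
      # Python booleans are serialized as JSON false/true
      "keep_on_external_storage": False,
      "index_data": False
    }
  }
}

# Basic auth: the API secret is the username and the password is empty
response = requests.post( 'https://apps.codeocean.com/api/v1/data_assets', headers=headers, json=json_data, auth=(os.getenv('API_SECRET', ''), ''),
)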
Request Example Response
{
  "created": 1689618780,
  "description": "meaningful-c",
  "id": "c39f20e6-9ded-4460-8292-76fc42fd1c00",
  "last_used": 0,
  "name": "import public AWS bucket with dataset api",
  "state": "draft",
  "tags": [
    "Genomics"
  ],
  "type": "dataset"
}

Get Metadata from Dataset

GET https://{domain}.codeocean.com/api/v1/data_assets/{data_set_id}

This API retrieves metadata for your data asset.

Path Parameters

  • GET* - /data_assets/:data_asset_id

Headers

  • -u* - String

    • Authorize with your Code Ocean API secret: -u $API_SECRET:

    • This sets the "Authorization: Basic" header to the base64-encoded credentials

  • -H - String

    • Set this to: Content-Type: application/json

{
  "created": float64 - data asset creation time,
  "description": string - data asset description,
  "files": int64 - total number of files in the data asset if available,
  "id": string - the data asset internal id,
  "lastUsed": float64 - the last time the data asset was used in seconds since epoch,
  "name": string - data asset name,
  "size": int64 - the total size in bytes of the data asset if available,
  "state": string - data asset state - draft / ready / failed,
  "tags": array of string tags,
  "type": string - dataset / result 
}
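The fields listed above map naturally onto a small typed record. The sketch below parses a response dictionary into a dataclass. The DataAsset class and parse_data_asset function are hypothetical names, and treating files and size as optional follows the "if available" wording in the schema.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class DataAsset:
    id: str
    name: str
    state: str
    type: str
    created: float
    description: str = ""
    tags: List[str] = field(default_factory=list)
    last_used: float = 0
    files: Optional[int] = None
    size: Optional[int] = None


def parse_data_asset(payload: Dict[str, Any]) -> DataAsset:
    """Build a DataAsset from a decoded JSON response."""
    return DataAsset(
        id=payload["id"],
        name=payload["name"],
        state=payload["state"],
        type=payload["type"],
        created=payload["created"],
        description=payload.get("description", ""),
        tags=list(payload.get("tags", [])),
        # The schema spells this key "lastUsed"; example responses use "last_used".
        last_used=payload.get("last_used", payload.get("lastUsed", 0)),
        files=payload.get("files"),
        size=payload.get("size"),
    )
```

With the requests library, the dictionary would typically come from response.json().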
Request Response Description
  • created - float64

    • data asset creation time

  • description - string

    • data asset description

  • field - string

    • field of research

  • id - string

    • metadata id

  • keywords - string/list

    • associated keywords

  • name - string

    • name of the dataset

  • owner - string

    • identification ID of the owner

  • published_capsule - boolean

    • value indicates whether this capsule is published

  • slug - string

    • unknown

  • status - string

    • indicates the status of this capsule

Request Format
curl --location --request GET 'https://{domain}/api/v1/data_assets/{dataset_id}' \
--header 'Content-Type: application/json' \
-u "${API_SECRET}:"
Request Example Bash
curl --location --request GET 'https://apps.codeocean.com/api/v1/capsules/4bc97533-6eb4-48ac-966f-648548a756d2' \
--header 'Content-Type: application/json' \
-u "${API_SECRET}:"
Request Example Python
import os, requests 


headers = {
  "Content-Type": "application/json"
}
 
# Basic auth: the API secret is the username and the password is empty
response = requests.get( 'https://apps.codeocean.com/api/v1/capsules/4bc97533-6eb4-48ac-966f-648548a756d2', headers=headers, auth=(os.getenv('API_SECRET', ''), ''),
)
Request Example Response
{
  "cloned_from_url": "",
  "created": 1673385764,
  "description": "This tool takes an alignment of reads or fragments as input (BAM file) and generates a coverage track (bigWig or bedGraph) as output. The coverage is calculated as the number of reads per bin, where bins are short consecutive counting windows of a defined size. It is possible to extended the length of the reads to better reflect the actual fragment length. bamCoverage offers normalization by scaling factor, Reads Per Kilobase per Million mapped reads (RPKM), counts per million (CPM), bins per million mapped reads (BPM) and 1x depth (reads per genome coverage, RPGC).\n\nSource : https://deeptools.readthedocs.io/en/develop/content/tools/bamCoverage.html",
  "field": "Bioinformatics",
  "id": "4bc97533-6eb4-48ac-966f-648548a756d2",
  "keywords": [
    "ChIP",
    "Normalization"
  ],
  "name": "deepTools-bamCoverage",
  "owner": "467ef120-2c93-42eb-8865-5866004243bf",
  "published_capsule": "",
  "slug": "7607289",
  "status": "non-published"
}

Update Metadata

PUT https://{domain}.codeocean.com/api/v1/data_assets/{data_set_id}

This API allows for the updating of the metadata for your data asset.

Path Parameters

  • PUT* - /data_assets/:data_asset_id

  • {domain} - your VPC domain

Headers

  • -u*

    • Authorize with your Code Ocean API secret: -u $API_SECRET:

    • This sets the "Authorization: Basic" header to the base64-encoded credentials

  • -H*

    • Set this to: Content-Type: application/json

  • --data-raw*

    • JSON data with the updated metadata

Request Body

  • name* - string

    • The name of the data asset

  • description* - string

    • A description for the data asset

  • tags* - list of strings

    • Keywords to search the data asset by

  • mount* - string

    • Data asset default mount folder

  • custom_metadata* - map

    • Map of key-value pairs. Keys should match the custom metadata fields defined by the admin. Possible values:

      • string custom field - string

      • number custom field - number

      • date custom field - number, a Unix epoch timestamp in seconds
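Date custom fields are sent as numbers, so a calendar date has to be converted to epoch seconds first. The sketch below shows one way to do that. The helper name and the custom field names ("Sample Type", "Replicates", "Collection Date") are hypothetical; real keys must match the fields your admin has defined, and using midnight UTC for a date is an assumption of this sketch.

```python
import calendar
from datetime import date


def date_to_epoch_seconds(day: date) -> int:
    """Return midnight UTC of the given date as seconds since the Unix epoch."""
    # timegm interprets the time tuple as UTC, unlike time.mktime (local time).
    return calendar.timegm(day.timetuple())


# Hypothetical custom fields: one string, one number, one date.
custom_metadata = {
    "Sample Type": "tissue",
    "Replicates": 3,
    "Collection Date": date_to_epoch_seconds(date(2023, 7, 17)),
}

if __name__ == "__main__":
    print(custom_metadata["Collection Date"])  # 1689552000
```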

Request Response Description
  • created - float64

    • data asset creation time

  • description - string

    • data asset description

  • files - int64

    • total number of files in the data asset if available

  • id - string

    • the data asset internal id

  • last_used - float64

    • the last time the data asset was used in seconds since epoch

  • name - string

    • name of the dataset

  • size - int64

    • the total size in bytes of the data asset if available

  • state - string

    • data asset state - draft / ready / failed

  • tags - array

    • array of string tags

  • type - string

    • dataset / result

Request Format
curl -X PUT 'https://{domain}/api/v1/data_assets/{dataset_id}' \
-u "${API_SECRET}:" \
-H 'Content-Type: application/json' \
--data-raw '{
"name": "Modified The Name",
"description": "a new description from the API!",
"tags": ["I","Am","New"],
"mount": "NewMount"
}'
Request Example Bash
curl -X PUT 'https://apps.codeocean.com/api/v1/data_assets/d36665a7-ef59-4b8e-a799-bee7f83ee317' \
-u "${API_SECRET}:" \
-H 'Content-Type: application/json' \
--data-raw '{
"name": "Modified The Name",
"description": "a new description from the API!",
"tags": ["I","Am","New"],
"mount": "NewMount"
}'
Request Example Python
import os, requests 


headers = {
  "Content-Type": "application/json"
}


json_data = {
  "name": "Modified The Name",
  "description": "a new description from the API!",
  "tags": [
    "I",
    "Am",
    "New"
  ],
  "mount": "NewMount"
}
 
# Basic auth: the API secret is the username and the password is empty
response = requests.put( 'https://apps.codeocean.com/api/v1/data_assets/d36665a7-ef59-4b8e-a799-bee7f83ee317', headers=headers, json=json_data, auth=(os.getenv('API_SECRET', ''), ''),
)
Request Example Response
{
  "created": 1682956260,
  "description": "a new description from the API!",
  "files": 1,
  "id": "2fc14a4d-5746-4f66-8da8-079ed3441286",
  "last_used": 1682956368,
  "name": "Modified The Name",
  "size": 4594062,
  "state": "ready",
  "tags": [
    "I",
    "Am",
    "New"
  ],
  "type": "dataset"
}

Archiving/Unarchiving a Dataset

PATCH https://{domain}.codeocean.com/api/v1/data_assets/{data_set_id}/archive?archive=true

This API archives or unarchives your data asset, depending on the value of the archive query parameter.

Path Parameters

  • PATCH* - /data_assets/:data_asset_id/archive

  • {domain} - your VPC domain

Headers

  • -u*

    • Authorize with your Code Ocean API secret: -u $API_SECRET:

    • This sets the "Authorization: Basic" header to the base64-encoded credentials

  • -H

    • Set this to: Content-Type: application/json
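Archiving and unarchiving use the same path and differ only in the archive query parameter. The sketch below builds that URL from its parts. The function name archive_url is hypothetical, and domain is assumed to be the full host name, as in the request formats on this page.

```python
from urllib.parse import quote, urlencode


def archive_url(domain: str, data_asset_id: str, archive: bool) -> str:
    """Build the PATCH URL that archives (True) or unarchives (False) a data asset."""
    # Send lowercase "true"/"false"; str(True) would give "True" instead.
    query = urlencode({"archive": "true" if archive else "false"})
    asset = quote(data_asset_id, safe="")
    return f"https://{domain}/api/v1/data_assets/{asset}/archive?{query}"


if __name__ == "__main__":
    print(archive_url("apps.codeocean.com", "e25ec103-a712-4882-a9fa-3cd5a80438a8", True))
```

The same care applies when using requests: passing params={"archive": True} would be encoded as archive=True, so the string "true" is the safer value to send.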

Request Format
Archiving a Dataset

curl -H "Content-Type: application/json" -u ${API_SECRET}: -X PATCH "https://{domain}/api/v1/data_assets/{dataset_id}/archive?archive=true"

Unarchiving a Dataset

curl -H "Content-Type: application/json" -u ${API_SECRET}: -X PATCH "https://{domain}/api/v1/data_assets/{dataset_id}/archive?archive=false"
Request Example Bash
Archiving a Dataset

curl -H "Content-Type: application/json" -u ${API_SECRET}: -X PATCH \
"https://apps.codeocean.com/api/v1/data_assets/e25ec103-a712-4882-a9fa-3cd5a80438a8/archive?archive=true"


Unarchiving a Dataset


curl -H "Content-Type: application/json" -u ${API_SECRET}: -X PATCH \
"https://apps.codeocean.com/api/v1/data_assets/e25ec103-a712-4882-a9fa-3cd5a80438a8/archive?archive=false"
Request Example Python
Archiving a Dataset


import os, requests 


headers = {
  "Content-Type": "application/json"
} 


params = {
  "archive": "true"
}
 
response = requests.patch( 'https://apps.codeocean.com/api/v1/data_assets/e25ec103-a712-4882-a9fa-3cd5a80438a8/archive', params=params, headers=headers, auth=(os.getenv('API_SECRET', ''), ''), 
)


Unarchiving a Dataset


headers = {
  "Content-Type": "application/json"
}
 
params = {
  "archive": "false"
}
 
response = requests.patch( 'https://apps.codeocean.com/api/v1/data_assets/e25ec103-a712-4882-a9fa-3cd5a80438a8/archive', params=params, headers=headers, auth=(os.getenv('API_SECRET', ''), ''), 
)
Request Example Response

This request returns no response body.