Data Asset API

Prerequisites

  • Token with datasets scope

  • The data asset's ID to pass to the API call

You can find the data asset's ID below the title

Create Data Asset

POST https://{domain}/api/v1/data_assets

This API allows for the creation of data assets from either an S3 bucket or the results of a computation.

Prerequisite

Before using this API call, you must have AWS Cloud Credentials configured as Secrets or an Assumable Role.

Path Parameters

Name    Type    Description

POST*

/data_assets

Headers

Name    Type    Description

-u*

Authorize with the Code Ocean API Secret: -u $API_SECRET:

This sets the "Authorization: Basic <base64 string>" header.

-H*

Set this to: Content-Type: application/json

--data-raw*

JSON data describing the data asset to create
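
As an illustration of what the -u option does, the sketch below builds the same Basic authorization header by hand. The function name `basic_auth_header` and the placeholder secret are hypothetical; in practice curl's -u option or the `auth=` argument of `requests` does this for you.

```python
import base64


def basic_auth_header(api_secret: str) -> dict:
    """Build the header that `curl -u "$API_SECRET:"` would send."""
    # The secret is the username; the password after the colon is empty.
    credentials = f"{api_secret}:".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return {"Authorization": f"Basic {token}"}


if __name__ == "__main__":
    # "my-secret" is a placeholder, not a real Code Ocean secret.
    print(basic_auth_header("my-secret"))
```

The trailing colon matters: it separates the username (the secret) from an empty password.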

Request Body

Name    Type    Description

name*

string

data asset name

description*

string

data asset description

mount*

string

data asset default mount folder

tags*

list/string

keywords to search the data asset by

custom_metadata

map of key-value pairs, according to the custom metadata fields defined by the admin. Possible values:

- string custom field: string

- number custom field: number

- date custom field: number (Unix epoch timestamp in seconds)

source*

aws*

bucket*

the S3 bucket from which the data asset will be created

- bucket name

prefix*

the folder in the S3 bucket from which the data asset will be created

- directory path

keep_on_external_storage

boolean

when this property is set to true, an External Data Asset is created. When it is set to false or omitted, the data asset files are copied into Code Ocean and an Internal Data Asset is created.

public

boolean

when this property is set to true, Code Ocean will try to access the source bucket without credentials

computation

id

string

computation ID

path

string

path to a folder in Results; leave empty to capture all files

target

aws

bucket

The S3 bucket in which the result files will be stored. Specifying this parameter creates an external result data asset.

prefix

The folder in the S3 bucket in which the data asset will be created
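
The request body fields above can be assembled programmatically. The following is a minimal sketch, not part of the API itself: the helper names `s3_source_payload` and `computation_source_payload` are hypothetical, and sending optional flags only when they are enabled is an assumption based on the field descriptions.

```python
import json


def s3_source_payload(name, description, mount, tags, bucket, prefix,
                      public=False, keep_on_external_storage=False):
    """Request body for a data asset created from an S3 bucket."""
    aws = {"bucket": bucket, "prefix": prefix}
    # Assumption: optional flags are sent only when enabled. The table says an
    # omitted keep_on_external_storage behaves like false; "public" is assumed
    # to behave the same way.
    if public:
        aws["public"] = True
    if keep_on_external_storage:
        aws["keep_on_external_storage"] = True
    return {"name": name, "description": description, "mount": mount,
            "tags": list(tags), "source": {"aws": aws}}


def computation_source_payload(name, description, mount, tags, computation_id,
                               path=None, target_bucket=None, target_prefix=None):
    """Request body for a data asset created from a computation's results."""
    computation = {"id": computation_id}
    if path:  # leave path out to capture all result files
        computation["path"] = path
    payload = {"name": name, "description": description, "mount": mount,
               "tags": list(tags), "source": {"computation": computation}}
    if target_bucket:  # a target bucket makes this an external result data asset
        target = {"bucket": target_bucket}
        if target_prefix:
            target["prefix"] = target_prefix
        payload["target"] = {"aws": target}
    return payload


if __name__ == "__main__":
    body = computation_source_payload(
        "RNA-Sequencing", "these are reads from an experiment", "Reads",
        ["Genomics", "RNA"], "8f174aed-64ce-43eb-9c16-64d25da84bda",
        path="Alignment/", target_bucket="my-bucket",
        target_prefix="deposit/my/results/")
    print(json.dumps(body, indent=2))
```

Either dictionary can be passed as the `json=` argument of `requests.post`.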

Create a Data Asset from a Public S3 Bucket

Request Example Bash
curl --location --request POST 'https://codeocean.com/api/v1/data_assets' \
--header 'Content-Type: application/json' \
-u "${API_SECRET}:" \
--data-raw '{
"name":"import public AWS bucket with dataset api",
"description":"meaningful-c",
"mount":"citations",
"tags":["Genomics"],
"source":
        {
        "aws":
                {
                        "public":true,
                        "bucket":"codeocean-public-data",
                        "prefix":"example_datasets/ATAC/hg38_2bit/"
                }
        }
}'
Request Example Python
import os, requests 


headers = {
  "Content-Type": "application/json"
} 

json_data = {
  "name": "import public AWS bucket with dataset api",
  "description": "meaningful-c",
  "mount": "citations",
  "tags": [
    "Genomics"
  ],
  "source": {
    "aws": {
      "public": True,
      "bucket": "codeocean-public-data",
      "prefix": "example_datasets/ATAC/hg38_2bit/"
    }
  }
}

# The API secret is the Basic-auth username; the password is empty.
response = requests.post(
    'https://codeocean.com/api/v1/data_assets',
    headers=headers,
    json=json_data,
    auth=(os.getenv('API_SECRET', ''), ''),
)
Response
{
  "created": 1689618780,
  "description": "meaningful-c",
  "id": "c39f20e6-9ded-4460-8292-76fc42fd1c00",
  "last_used": 0,
  "name": "import public AWS bucket with dataset api",
  "state": "draft",
  "tags": [
    "Genomics"
],
  "type": "dataset"
}

Create a Data Asset from a Private S3 Bucket

Request Example Bash
curl --location --request POST 'https://codeocean.com/api/v1/data_assets' \
--header 'Content-Type: application/json' \
-u "${API_SECRET}:" \
--data-raw '{
"name":"import private AWS bucket with Data Asset API",
"description":"meaningful-c",
"mount":"citations",
"tags":["Genomics"],
"source":
        {
        "aws":
                {
                "bucket":"codeocean-private-data",
                "prefix":"example_datasets/ATAC/hg38_2bit/"
                }
        }
}'
Request Example Python
import os, requests 


headers = {
  "Content-Type": "application/json"
}

# No "public" flag here: the bucket is read with the configured AWS credentials.
json_data = {
  "name": "import private AWS bucket with Data Asset API",
  "description": "meaningful-c",
  "mount": "citations",
  "tags": [
    "Genomics"
  ],
  "source": {
    "aws": {
      "bucket": "codeocean-private-data",
      "prefix": "example_datasets/ATAC/hg38_2bit/"
    }
  }
}

# The API secret is the Basic-auth username; the password is empty.
response = requests.post(
    'https://codeocean.com/api/v1/data_assets',
    headers=headers,
    json=json_data,
    auth=(os.getenv('API_SECRET', ''), ''),
)
Response
{
  "created": 1689618780,
  "description": "meaningful-c",
  "id": "c39f20e6-9ded-4460-8292-76fc42fd1c00",
  "last_used": 0,
  "name": "import private AWS bucket with dataset api",
  "state": "draft",
  "tags": [
    "Genomics"
  ],
  "type": "dataset"
}

Create an External Result Data Asset

Request Example Bash
# "path" names a folder in Results (here, Alignment).
curl -H "Content-Type: application/json" -u ${API_SECRET}: -X POST https://codeocean.com/api/v1/data_assets --data-raw '{
"name": "RNA-Sequencing",
"description": "these are reads from an experiment",
"mount": "Reads",
"tags": ["Genomics", "RNA"],
"source": 
        {
        "computation": 
                {
                        "id":"8f174aed-64ce-43eb-9c16-64d25da84bda",
                        "path":"Alignment/"
                }
        },
"target":
        {
        "aws":
                {
                        "bucket":"my-bucket",
                        "prefix":"deposit/my/results/"
                }
        }
}'
Request Example Python
import os, requests 


headers = {
  "Content-Type": "application/json"
} 


json_data = {
  "name": "RNA-Sequencing",
  "description": "these are reads from an experiment",
  "mount": "Reads",
  "tags": [ "Genomics", "RNA" ],
  "source": {
      "computation": {
         "id": "8f174aed-64ce-43eb-9c16-64d25da84bda"
      }
  },
  "target": {
   "aws": {
     "bucket": "my-bucket",
     "prefix": "deposit/my/results/"
   }
 }
}
 

response = requests.post('https://codeocean.com/api/v1/data_assets', headers=headers, json=json_data,
    auth=(os.getenv('API_SECRET', ''), ''),  # secret as username, empty password
)
Response
{
    "created":1701897946,
    "description":"these are reads from an experiment",
    "id":"bc774b3d-ec4f-4687-a8b2-74014bf02e1a",
    "last_used":0,
    "name":"",
    "provenance":{
        "capsule":"cd4ef788-b404-4406-8bda-76b5e41a7b8d",
        "commit":"a7a8aac7d19866c7fcc5877ac4c6e4d55811fab8",
        "data_assets":["eeefcc52-b445-4e3c-80c5-0e65526cd712"],
        "docker_image":"",
        "run_script":"code/run"
    },
    "state":"draft",
    "tags":["Genomics","RNA"],
    "type":"result"
}

Create a Result Data Asset

Request Example Bash
curl -H "Content-Type: application/json" -u ${API_SECRET}: -X POST https://codeocean.com/api/v1/data_assets --data-raw '{
  "name": "Data asset From API",
  "description": "An example for creating data asset from CO API",
  "mount": "some-folder",
  "tags": [ "keyword1", "keyword2" ],
  "source": {
      "computation": {
         "id": "8f174aed-64ce-43eb-9c16-64d25da84bda"
      }
  }
}'
Request Example Python
import os, requests 

headers = {
  "Content-Type": "application/json"
} 

json_data = {
  "name": "Data asset From API",
  "description": "An example for creating data asset from CO API",
  "mount": "some-folder",
  "tags": [ "keyword1", "keyword2" ],
  "source": {
      "computation": {
         "id": "8f174aed-64ce-43eb-9c16-64d25da84bda"
      }
  }
}
 
response = requests.post('https://codeocean.com/api/v1/data_assets', headers=headers, json=json_data,
    auth=(os.getenv('API_SECRET', ''), ''),  # secret as username, empty password
)
Response
{
    "created":1701959542,
    "description":"An example for creating data asset from CO API",
    "id":"edbce303-4097-418a-9175-8397ee0d3833",
    "last_used":0,
    "name":"",
    "provenance":{
        "capsule":"cd4ef788-b404-4406-8bda-76b5e41a7b8d",
        "commit":"a7a8aac7d19866c7fcc5877ac4c6e4d55811fab8",
        "data_assets":["eeefcc52-b445-4e3c-80c5-0e65526cd712"],
        "docker_image":"",
        "run_script":"code/run"
    },
    "state":"draft",
    "tags":["keyword1","keyword2"],
    "type":"result"
}

Create a Data Asset with Custom Metadata Tags

Request Example Bash
# Remove "path" to capture all of Results.
curl -H "Content-Type: application/json" -u ${API_SECRET}: -X POST https://codeocean.com/api/v1/data_assets --data-raw '{
"name": "myDatasetFromPublic",
"description": "a descriptive description",
"mount": "Mymount",
"tags": ["t1", "t2"],
"custom_metadata":
        {
                "some_field": "one", 
                "another_field": 1, 
                "dateField": 1676246400 
        },
"source": 
        {
        "computation": 
                {
                "id": "computation_ID",
                "path": "/path/to/folder/"
                }
        }
}'
Request Example Python
import os, requests 


headers = {
  "Content-Type": "application/json"
} 


json_data = {
    "name": "myDatasetFromPublic",
    "description": "a descriptive description",
    "mount": "Mymount",
    "tags": ["t1", "t2"],
    "custom_metadata": {
        "some_field": "one",
        "another_field": 1,
        "dateField": 1676246400
    },
    "source": {
        "computation": {
            "id": "computation_ID",
            "path": "/path/to/folder/"  # remove to capture all of Results
        }
    }
}

# The API secret is the Basic-auth username; the password is empty.
response = requests.post(
    'https://codeocean.com/api/v1/data_assets',
    headers=headers,
    json=json_data,
    auth=(os.getenv('API_SECRET', ''), ''),
)
Response
{ 
    "created": 1633277005, 
    "description": "a descriptive description",
     "files": 0, 
    "id": "fea84ebf-b58b-4ad2-994d-7169dc3880fb", 
    "last_used": 0, 
    "name": "my dataset", 
    "size": 0, 
    "state": "DATA_ASSET_STATE_DRAFT", 
    "tags": [ "t1", "t2" ], 
    "custom_metadata":{ 	 
            "some_field": "one", 
            "another_field": 1, 
            "dateField": 1676246400 
            },
    "type": "DATA_ASSET_TYPE_DATASET" 
}

The API only confirms that the creation request is valid, not that the creation succeeded, because creation takes time. Poll the data asset's details and monitor its state until it is ready.
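
A polling loop along those lines might look like the sketch below. It assumes the "draft / ready / failed" state values listed for the metadata endpoint; the function names, the timeout and the polling interval are hypothetical choices, not values prescribed by the API.

```python
import time


def wait_until_ready(fetch_state, timeout_s=600.0, interval_s=5.0, sleep=time.sleep):
    """Poll fetch_state() until it returns "ready"; raise on "failed" or timeout."""
    deadline = time.monotonic() + timeout_s
    while True:
        state = fetch_state()
        if state == "ready":
            return state
        if state == "failed":
            raise RuntimeError("data asset creation failed")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"data asset still in state {state!r}")
        sleep(interval_s)


def fetch_state_via_api(domain, data_asset_id, api_secret):
    """Read the "state" field from GET /api/v1/data_assets/{id}."""
    import requests  # third-party; imported lazily so the poller has no dependency
    response = requests.get(
        f"https://{domain}/api/v1/data_assets/{data_asset_id}",
        auth=(api_secret, ""),  # secret as username, empty password
        timeout=30,
    )
    response.raise_for_status()
    return response.json()["state"]
```

A call such as `wait_until_ready(lambda: fetch_state_via_api("codeocean.com", asset_id, secret))` would block until the asset is ready, where `asset_id` is the `id` returned by the create request.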

Attach/Detach Data Asset Capsules

POST https://{domain}/api/v1/capsules/{capsule_id}/data_assets

This API attaches one or many Data Assets to a Capsule/Pipeline.

Prerequisite

Before using this API call, you must have AWS Cloud Credentials configured as Secrets or an Assumable Role.

Path Parameters

Name    Type    Description

POST*

/capsules/:capsule_id/data_assets

Headers

Name    Type    Description

-u*

Authorize with the Code Ocean API Secret: -u $API_SECRET:

-H*

Set this to: Content-Type: application/json

--data-raw*

JSON list of the data assets to attach (or of the data asset IDs to detach)

Request Body

Name    Type    Description

id*

string

data asset ID

mount*

string

Folder to mount data

Response Description
  • external - boolean

    • indicates whether the data asset is external

  • id - string

    • data asset ID

  • job_id - string

    • for internal use

  • mount - string

    • name of folder to mount the data asset

  • mount_state - string

    • for internal use

  • ready - boolean

    • data asset is attached and ready for use in capsule
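
Because the response is a list with one entry per data asset, a caller may want to confirm that every entry reports `ready`. The helper below is a small sketch of that check; `not_ready_ids` is a hypothetical name, and the sample response uses made-up IDs.

```python
def not_ready_ids(attach_response):
    """Return the IDs of data assets whose `ready` flag is not true."""
    return [item["id"] for item in attach_response if not item.get("ready", False)]


if __name__ == "__main__":
    # Hypothetical response body, shaped like the fields described above.
    sample = [
        {"external": False, "id": "asset-a", "job_id": "", "mount": "Sequences",
         "mount_state": "", "ready": True},
        {"external": True, "id": "asset-b", "job_id": "", "mount": "Reference",
         "mount_state": "", "ready": False},
    ]
    print(not_ready_ids(sample))
```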

Attach

Request Example Bash
curl -i --location --request POST 'https://codeocean.com/api/v1/capsules/a4367940-e863-4819-afbd-1b6b9f9b1256/data_assets' \
--header 'Content-Type: application/json' \
-u "${API_SECRET}:" \
--data-raw '[
       {"id": "052b6c02-2b81-4eca-b064-5e886c806ebe"},
        {"id": "9378e12a-f349-4f07-8f4b-9e64b8b8b514"}
]'
Request Example Python
import os, requests 


headers = {
  "Content-Type": "application/json"
}

json_data = [
  {"id": "7fbd0ba0-9603-4e5a-9810-579207d4c1d3"}
]

response = requests.post('https://codeocean.com/api/v1/capsules/eb082456-d031-4a42-80b0-f209b8728927/data_assets',
    headers=headers,
    json=json_data,
    auth=(os.getenv('API_SECRET', ''), ''),)
Response
HTTP/1.1 200 OK
Date: Mon, 27 Nov 2023 14:12:35 GMT
Content-Type: application/json
Content-Length: 251
Connection: keep-alive
Api-Version: 2021-09-09
Cache-Control: no-cache, no-store
Request-Id: e510e058-90c7-4241-adcb-7c055ffa0a00
X-Content-Type-Options: nosniff

[
    {
        "external":false,
        "id":"052b6c02-2b81-4eca-b064-5e886c806ebe",
        "job_id":"",
        "mount":"Sequences",
        "mount_state":"",
        "ready":true
    },
    {
        "external":false,
        "id":"9378e12a-f349-4f07-8f4b-9e64b8b8b514",
        "job_id":"",
        "mount":"Reference",
        "mount_state":"",
        "ready":true
    }
]

Detach

Request Example Bash
curl -X DELETE \
'https://codeocean.com/api/v1/capsules/a4367940-e863-4819-afbd-1b6b9f9b1256/data_assets' \
-u "${API_SECRET}:" -H 'Content-Type: application/json' \
--data-raw '[
    "4ec22934-d75f-4385-9c52-c8e593a4234c",     
    "052b6c02-2b81-4eca-b064-5e886c806ebe",
    "9378e12a-f349-4f07-8f4b-9e64b8b8b514"
]'
Request Example Python
import os, requests 


headers = {
  "Content-Type": "application/json"
}

json_data = [
  "7fbd0ba0-9603-4e5a-9810-579207d4c1d3"
]

# Detaching uses the DELETE method, matching the curl example.
response = requests.delete('https://codeocean.com/api/v1/capsules/eb082456-d031-4a42-80b0-f209b8728927/data_assets',
    headers=headers,
    json=json_data,
    auth=(os.getenv('API_SECRET', ''), ''),)

Get Metadata from Data Asset

GET https://{domain}/api/v1/data_assets/{data_asset_id}

This API retrieves metadata for your data asset.

Path Parameters

Name    Type    Description

GET *

/data_assets/:data_asset_id

Headers

Name    Type    Description

-u*

String

Authorize with the Code Ocean API Secret: -u $API_SECRET:

This sets the "Authorization: Basic <base64 string>" header.

-H*

String

Set this to: Content-Type: application/json

{
  "created": float64 - data asset creation time,
  "description": string - data asset description,
  "files": int64 - total number of files in the data asset if available,
  "id": string - the data asset internal id,
  "last_used": float64 - the last time the data asset was used in seconds since epoch,
  "name": string - data asset name,
  "size": int64 - the total size in bytes of the data asset if available,
  "state": string - data asset state - draft / ready / failed,
  "tags": array of string tags,
  "type": string - dataset / result 
}
Response Description
  • created - float64

    • data asset creation time

  • description - string

    • data asset description

  • field - string

    • field of research

  • id - string

    • metadata id

  • keywords - string/list

    • associated keywords

  • name - string

    • name of the dataset

  • owner - string

    • identification ID of the owner

  • published_capsule - boolean

    • value indicates whether this capsule is published

  • slug - integer

    • unknown

  • status - boolean

    • indicates the status of this capsule

Request Example Bash
curl --location --request GET 'https://codeocean.com/api/v1/data_assets/4bc97533-6eb4-48ac-966f-648548a756d2' \
--header 'Content-Type: application/json' \
-u "${API_SECRET}:"
Request Example Python
import os, requests 


headers = {
  "Content-Type": "application/json"
}
 
response = requests.get('https://codeocean.com/api/v1/data_assets/4bc97533-6eb4-48ac-966f-648548a756d2', headers=headers,
    auth=(os.getenv('API_SECRET', ''), ''),  # secret as username, empty password
)
Response
{
  "cloned_from_url": "",
  "created": 1673385764,
  "description": "This tool takes an alignment of reads or fragments as input (BAM file) and generates a coverage track (bigWig or bedGraph) as output. The coverage is calculated as the number of reads per bin, where bins are short consecutive counting windows of a defined size. It is possible to extended the length of the reads to better reflect the actual fragment length. bamCoverage offers normalization by scaling factor, Reads Per Kilobase per Million mapped reads (RPKM), counts per million (CPM), bins per million mapped reads (BPM) and 1x depth (reads per genome coverage, RPGC).\n\nSource : https://deeptools.readthedocs.io/en/develop/content/tools/bamCoverage.html",
  "field": "Bioinformatics",
  "id": "4bc97533-6eb4-48ac-966f-648548a756d2",
  "keywords": [
    "ChIP",
    "Normalization"
  ],
  "name": "deepTools-bamCoverage",
  "owner": "467ef120-2c93-42eb-8865-5866004243bf",
  "published_capsule": "",
  "slug": "7607289",
  "status": "non-published"
}

Update Metadata for a Data Asset

PUT https://{domain}/api/v1/data_assets/{data_asset_id}

This API allows for the updating of the metadata for your data asset.

Path Parameters

Name    Type    Description

PUT*

/data_assets/:data_asset_id

Your VPC domain

Headers

Name    Type    Description

-u*

Authorize with the Code Ocean API Secret: -u $API_SECRET:

This sets the "Authorization: Basic <base64 string>" header.

-H*

Set this to: Content-Type: application/json

--data-raw*

JSON data with the new or updated metadata

Request Body

Name    Type    Description

name*

string

The name of the data asset

description*

string

A description for the data asset

tags*

list/string

Keywords to search the data asset by

mount*

string

Data asset default mount folder

custom_metadata

Map of key-value pairs matching the custom metadata fields defined by the admin. Possible values:

string custom field - string

number custom field - number

date custom field - number

- Unix (epoch) timestamp in seconds
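
Since date fields are sent as epoch seconds, a calendar date has to be converted before it goes into `custom_metadata`. The sketch below assumes midnight UTC is the intended instant, which the documentation does not state; the helper name is hypothetical and the field names are taken from the request examples.

```python
from datetime import date, datetime, timezone


def date_to_epoch_seconds(d: date) -> int:
    """Midnight UTC of the given calendar date as a Unix timestamp in seconds."""
    return int(datetime(d.year, d.month, d.day, tzinfo=timezone.utc).timestamp())


custom_metadata = {
    "some_field": "one",      # string custom field
    "another_field": 1,       # number custom field
    "dateField": date_to_epoch_seconds(date(2023, 2, 13)),  # date custom field
}
print(custom_metadata["dateField"])  # 1676246400
```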

Response Description
  • created - float64

    • data asset creation time

  • description - string

    • data asset description

  • files - int64

    • total number of files in the data asset if available

  • id - string

    • the data asset internal id

  • last_used - float64

    • the last time the data asset was used in seconds since epoch

  • name - string

    • name of the dataset

  • size - int64

    • the total size in bytes of the data asset if available

  • state - string

    • data asset state - draft / ready / failed

  • tags - array

    • array of string tags

  • type - string

    • dataset / result

Request Example Bash
curl -X PUT 'https://codeocean.com/api/v1/data_assets/d36665a7-ef59-4b8e-a799-bee7f83ee317' \
-u "${API_SECRET}:" \
-H 'Content-Type: application/json' \
--data-raw '{
        "name": "Modified The Name",
        "description": "a new description from the API!",
        "tags": ["I","Am","New"],
        "mount": "NewMount"
}'
Request Example Python
import os, requests 

headers = {
  "Content-Type": "application/json"
}

json_data = {
  "name": "Modified The Name",
  "description": "a new description from the API!",
  "tags": [
    "I",
    "Am",
    "New"
  ],
  "mount": "NewMount"
}
 
response = requests.put('https://codeocean.com/api/v1/data_assets/d36665a7-ef59-4b8e-a799-bee7f83ee317', headers=headers, json=json_data,
    auth=(os.getenv('API_SECRET', ''), ''),  # secret as username, empty password
)
Response
{
  "created": 1682956260,
  "description": "a new description from the API!",
  "files": 1,
  "id": "2fc14a4d-5746-4f66-8da8-079ed3441286",
  "last_used": 1682956368,
  "name": "Modified The Name",
  "size": 4594062,
  "state": "ready",
  "tags": [
    "I",
    "Am",
    "New"
  ],
  "type": "dataset"
}

Archiving/Unarchiving a Data Asset

PATCH https://{domain}/api/v1/data_assets/{data_asset_id}/archive?archive=true

This API archives or unarchives your data asset.

Path Parameters

Name    Type    Description

PATCH*

/data_assets/:data_asset_id/archive

Your VPC domain

Headers

Name    Type    Description

-u*

Authorize with the Code Ocean API Secret: -u $API_SECRET:

This sets the "Authorization: Basic <base64 string>" header.

-H*

Set this to: Content-Type: application/json

Request Example Bash
Archiving a Dataset

curl -H "Content-Type: application/json" -u ${API_SECRET}: -X PATCH \
"https://codeocean.com/api/v1/data_assets/e25ec103-a712-4882-a9fa-3cd5a80438a8/archive?archive=true"


Unarchiving a Dataset


curl -H "Content-Type: application/json" -u ${API_SECRET}: -X PATCH \
"https://codeocean.com/api/v1/data_assets/e25ec103-a712-4882-a9fa-3cd5a80438a8/archive?archive=false"
Request Example Python
Archiving a Dataset


import os, requests 


headers = {
  "Content-Type": "application/json"
} 


params = {
  "archive": "true"
}
 
response = requests.patch('https://codeocean.com/api/v1/data_assets/e25ec103-a712-4882-a9fa-3cd5a80438a8/archive', params=params, headers=headers, auth=(os.getenv('API_SECRET', ''), ''), 
)


Unarchiving a Dataset


headers = {
  "Content-Type": "application/json"
}
 
params = {
  "archive": "false"
}
 
response = requests.patch('https://codeocean.com/api/v1/data_assets/e25ec103-a712-4882-a9fa-3cd5a80438a8/archive', params=params, headers=headers, auth=(os.getenv('API_SECRET', ''), ''), 
)
Response

There is no response.
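
With no response body to inspect, a caller would presumably rely on the HTTP status code. The sketch below wraps both directions in one helper; the function names are hypothetical, and treating any non-error status as success is an assumption rather than documented behavior.

```python
def archive_request(domain, data_asset_id, archive):
    """Return the (url, params) pair for the archive endpoint."""
    url = f"https://{domain}/api/v1/data_assets/{data_asset_id}/archive"
    params = {"archive": "true" if archive else "false"}
    return url, params


def set_archived(domain, data_asset_id, api_secret, archive):
    """Archive (True) or unarchive (False) a data asset."""
    import requests  # third-party; imported lazily so archive_request stays dependency-free
    url, params = archive_request(domain, data_asset_id, archive)
    response = requests.patch(url, params=params, auth=(api_secret, ""), timeout=30)
    # Assumption: there is no body, so an error status is the only failure signal.
    response.raise_for_status()
    return response.status_code
```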
