This API allows for the creation of data assets from either an S3 bucket or the results of a computation.
Prerequisite
Before using this API call, you must have AWS Cloud Credentials configured as Secrets or an Assumable Role.
Path Parameters
Name | Type | Description
POST* | /data_assets |
Headers
Name | Type | Description
-u* |  | Authorize with the Code Ocean API secret: -u $API_SECRET: (this sets the "Authorization: Basic" header with a base64 string)
-H* |  | Set this to: Content-Type: application/json
--data-raw* |  | JSON data describing the data asset to create
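To make the effect of the -u option concrete, the following minimal sketch builds the same "Authorization: Basic" header by hand. It assumes standard HTTP Basic authentication, with the API secret as the user name and an empty password; the function name and the "my-secret" value are placeholders for illustration only.

```python
import base64


def basic_auth_header(api_secret: str) -> dict:
    """Build the header that `curl -u "$API_SECRET:"` would send."""
    # curl joins user name and password with a colon; the password is empty here.
    credentials = f"{api_secret}:".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return {"Authorization": f"Basic {token}"}


# "my-secret" is a placeholder, not a real API secret.
print(basic_auth_header("my-secret"))
```

In practice an HTTP client usually does this for you, so building the header manually is only needed when the client has no Basic auth option.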
Request Body
Name | Type | Description
name* | string | Data asset name
description* | string | Data asset description
mount* | string | Data asset default mount folder
tags* | list/string | Keywords for searching the data asset
custom_metadata |  | Map of key-value pairs, according to the custom metadata fields defined by the admin. Possible values: string, number, or date, depending on the custom field
source* |  |
source.aws* |  |
source.aws.bucket* |  | The S3 bucket from which the data asset is created (bucket name)
source.aws.prefix* |  | The folder in the S3 bucket from which the data asset is created (directory path)
source.aws.keep_on_external_storage* | boolean | When set to true, an External Data Asset is created. When set to false or omitted, the data asset files are copied into Code Ocean and an Internal Data Asset is created
source.aws.public* | boolean | When set to true, Code Ocean tries to access the source bucket without credentials
source.computation |  |
source.computation.id | string | Computation ID
source.computation.path | string | Path to a folder in Results; leave empty to capture all files
target |  |
target.aws |  |
target.aws.bucket |  | The S3 bucket in which the result files are stored. Specifying this parameter creates an external result data asset
target.aws.prefix |  | The folder in the S3 bucket in which the data asset is created
Dotted names denote nesting: source.aws.bucket is the bucket key inside aws, inside source.
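As an illustration of how the request body fields nest when the source is an S3 bucket, here is a minimal sketch that assembles such a body. The helper name and all values are hypothetical. Keys are written in lowercase to match the request examples on this page; for keep_on_external_storage and public, which appear in no example here, that casing is an assumption.

```python
import json


def build_s3_import_payload(name, description, mount, tags, bucket, prefix,
                            keep_on_external_storage=False, public=False):
    """Assemble a request body that creates a data asset from an S3 bucket."""
    return {
        "name": name,
        "description": description,
        "mount": mount,
        "tags": list(tags),
        "source": {
            "aws": {
                "bucket": bucket,
                "prefix": prefix,
                # True keeps the files in S3 (External Data Asset);
                # False copies them into Code Ocean (Internal Data Asset).
                "keep_on_external_storage": keep_on_external_storage,
                # True asks Code Ocean to read the bucket without credentials.
                "public": public,
            }
        },
    }


# All values below are placeholders.
payload = build_s3_import_payload(
    name="Example reads",
    description="imported from S3",
    mount="reads",
    tags=["Genomics"],
    bucket="my-bucket",
    prefix="path/to/folder/",
    public=True,
)
print(json.dumps(payload, indent=2))
```

The resulting dictionary can be sent as the JSON body of the POST request.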
{"created":1689618780,"description":"meaningful-c","id":"c39f20e6-9ded-4460-8292-76fc42fd1c00","last_used":0,"name":"import public AWS bucket with dataset api","state":"draft","tags":["Genomics"],"type":"dataset"}
{"created":1689618780,"description":"meaningful-c","id":"c39f20e6-9ded-4460-8292-76fc42fd1c00","last_used":0,"name":"import private AWS bucket with dataset api","state":"draft","tags":["Genomics"],"type":"dataset"}
Create an External Result Data Asset
Request Example Bash
# "Alignment/" is a folder in Results.
curl -H "Content-Type: application/json" -u "${API_SECRET}:" -X POST https://codeocean.com/api/v1/data_assets --data-raw '{
  "name": "RNA-Sequencing",
  "description": "these are reads from an experiment",
  "mount": "Reads",
  "tags": ["Genomics", "RNA"],
  "source": {"computation": {"id": "8f174aed-64ce-43eb-9c16-64d25da84bda", "path": "Alignment/"}},
  "target": {"aws": {"bucket": "my-bucket", "prefix": "deposit/my/results/"}}
}'
Request Example Python
import os

import requests

headers = {"Content-Type": "application/json"}
json_data = {
    "name": "RNA-Sequencing",
    "description": "these are reads from an experiment",
    "mount": "Reads",
    "tags": ["Genomics", "RNA"],
    "source": {"computation": {"id": "8f174aed-64ce-43eb-9c16-64d25da84bda"}},
    "target": {"aws": {"bucket": "my-bucket", "prefix": "deposit/my/results/"}},
}

# HTTP Basic auth: the API secret is the user name and the password is empty.
response = requests.post(
    "https://codeocean.com/api/v1/data_assets",
    headers=headers,
    json=json_data,
    auth=(os.getenv("API_SECRET", ""), ""),
)
Response
{"created":1701897946,"description":"these are reads from an experiment","id":"bc774b3d-ec4f-4687-a8b2-74014bf02e1a","last_used":0,"name":"","provenance":{"capsule":"cd4ef788-b404-4406-8bda-76b5e41a7b8d","commit":"a7a8aac7d19866c7fcc5877ac4c6e4d55811fab8","data_assets":["eeefcc52-b445-4e3c-80c5-0e65526cd712"],"docker_image":"","run_script":"code/run"},"state":"draft","tags":["Genomics","RNA"],"type":"result"}
Create a Result Data Asset
Request Example Bash
curl -H "Content-Type: application/json" -u "${API_SECRET}:" -X POST https://codeocean.com/api/v1/data_assets --data-raw '{
  "name": "Data asset From API",
  "description": "An example for creating data asset from CO API",
  "mount": "some-folder",
  "tags": ["keyword1", "keyword2"],
  "source": {"computation": {"id": "8f174aed-64ce-43eb-9c16-64d25da84bda"}}
}'
Request Example Python
import os

import requests

headers = {"Content-Type": "application/json"}
json_data = {
    "name": "Data asset From API",
    "description": "An example for creating data asset from CO API",
    "mount": "some-folder",
    "tags": ["keyword1", "keyword2"],
    "source": {"computation": {"id": "8f174aed-64ce-43eb-9c16-64d25da84bda"}},
}

# HTTP Basic auth: the API secret is the user name and the password is empty.
response = requests.post(
    "https://codeocean.com/api/v1/data_assets",
    headers=headers,
    json=json_data,
    auth=(os.getenv("API_SECRET", ""), ""),
)
Response
{"created":1701959542,"description":"An example for creating data asset from CO API","id":"edbce303-4097-418a-9175-8397ee0d3833","last_used":0,"name":"","provenance":{"capsule":"cd4ef788-b404-4406-8bda-76b5e41a7b8d","commit":"a7a8aac7d19866c7fcc5877ac4c6e4d55811fab8","data_assets":["eeefcc52-b445-4e3c-80c5-0e65526cd712"],"docker_image":"","run_script":"code/run"},"state":"draft","tags":["keyword1","keyword2"],"type":"result"}
The API returns only a confirmation that the creation request is valid, not that the creation succeeded, because creation takes time. Poll the data asset details and monitor its state until it is ready.
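The polling described here could be sketched roughly as follows. This is an illustration under assumptions, not the official client: it assumes the data asset details are available via GET on the same /api/v1/data_assets/<id> path that the update call uses, and that the state field takes the values draft, ready, or failed. The helper names, interval, and timeout are hypothetical. The demonstration at the end runs against canned responses, so no API call is made.

```python
import base64
import json
import time
import urllib.request

# Assumption: details are served by GET on the path the update call uses.
# Verify the endpoint for your deployment before relying on it.
BASE_URL = "https://codeocean.com/api/v1/data_assets"


def fetch_details(data_asset_id, api_secret):
    """Fetch data asset details with HTTP Basic auth (secret as user name)."""
    token = base64.b64encode(f"{api_secret}:".encode()).decode()
    request = urllib.request.Request(
        f"{BASE_URL}/{data_asset_id}",
        headers={"Authorization": f"Basic {token}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def wait_until_ready(get_details, interval=5.0, timeout=600.0, sleep=time.sleep):
    """Poll get_details() until state is "ready"; raise on "failed" or timeout."""
    deadline = time.monotonic() + timeout
    while True:
        details = get_details()
        state = details.get("state")
        if state == "ready":
            return details
        if state == "failed":
            raise RuntimeError(f"data asset {details.get('id')} failed")
        if time.monotonic() >= deadline:
            raise TimeoutError("data asset is still not ready")
        sleep(interval)


# Offline demonstration with canned responses instead of real API calls.
canned = iter([{"id": "x", "state": "draft"}, {"id": "x", "state": "ready"}])
result = wait_until_ready(lambda: next(canned), sleep=lambda _: None)
print(result["state"])
```

Against a real deployment, the callable would wrap the fetch, for example wait_until_ready(lambda: fetch_details(asset_id, api_secret)), where asset_id is the id returned by the creation request.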
{
"created": float64 - data asset creation time,
"description": string - data asset description,
"files": int64 - total number of files in the data asset if available,
"id": string - the data asset internal id,
"last_used": float64 - the last time the data asset was used in seconds since epoch,
"name": string - data asset name,
"size": int64 - the total size in bytes of the data asset if available,
"state": string - data asset state - draft / ready / failed,
"tags": array of string tags,
"type": string - dataset / result
}
{"cloned_from_url":"","created":1673385764, "description": "This tool takes an alignment of reads or fragments as input (BAM file) and generates a coverage track (bigWig or bedGraph) as output. The coverage is calculated as the number of reads per bin, where bins are short consecutive counting windows of a defined size. It is possible to extended the length of the reads to better reflect the actual fragment length. bamCoverage offers normalization by scaling factor, Reads Per Kilobase per Million mapped reads (RPKM), counts per million (CPM), bins per million mapped reads (BPM) and 1x depth (reads per genome coverage, RPGC).\n\nSource : https://deeptools.readthedocs.io/en/develop/content/tools/bamCoverage.html",
"field":"Bioinformatics","id":"4bc97533-6eb4-48ac-966f-648548a756d2","keywords": ["ChIP","Normalization" ],"name":"deepTools-bamCoverage","owner":"467ef120-2c93-42eb-8865-5866004243bf","published_capsule":"","slug":"7607289","status":"non-published"}
This API updates the metadata of a data asset.
Path Parameters
Name | Type | Description
PUT* | /data_assets/:data_asset_id | Your VPC domain
Headers
Name | Type | Description
-u* |  | Authorize with the Code Ocean API secret: -u $API_SECRET: (this sets the "Authorization: Basic" header with a base64 string)
-H* |  | Set this to: Content-Type: application/json
--data-raw* |  | JSON data with the new or updated metadata
Request Body
Name | Type | Description
name* | string | The name of the data asset
description* | string | A description for the data asset
tags* | list/string | Keywords to search the data asset by
mount* | string | Data asset default mount folder
custom_metadata |  | Map of key-value pairs that must match the custom metadata fields defined by the admin. Possible values: string custom field - string; number custom field - number; date custom field - number (Unix epoch timestamp in seconds)
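Because a date custom field is sent as a Unix epoch timestamp in seconds, a small conversion step is usually needed before building the update body. The sketch below shows one way to do it. The custom field names (species, replicates, collected_on) are hypothetical and stand in for whatever fields the admin has defined; the other values are taken from the request example on this page.

```python
from datetime import datetime, timezone


def to_epoch_seconds(moment: datetime) -> int:
    """Convert a timezone-aware datetime to a Unix timestamp in seconds."""
    if moment.tzinfo is None:
        raise ValueError("use a timezone-aware datetime to avoid ambiguity")
    return int(moment.timestamp())


# The custom field names below are hypothetical; use the ones your admin defined.
update_body = {
    "name": "Modified The Name",
    "description": "a new description from the API!",
    "tags": ["I", "Am", "New"],
    "mount": "NewMount",
    "custom_metadata": {
        "species": "human",  # string custom field
        "replicates": 3,  # number custom field
        "collected_on": to_epoch_seconds(  # date custom field
            datetime(2023, 5, 1, tzinfo=timezone.utc)
        ),
    },
}
print(update_body["custom_metadata"]["collected_on"])
```

Rejecting naive datetimes is a design choice in this sketch: it avoids silently interpreting a date in the local time zone of the machine that sends the request.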
Response Description
created - float64
data asset creation time
description - string
data asset description
files - int64
total number of files in the data asset if available
id - string
the data asset internal id
last_used - float64
the last time the data asset was used in seconds since epoch
name - string
name of the data asset
size - int64
the total size in bytes of the data asset if available
state - string
data asset state - draft / ready / failed
tags - array of strings
tags attached to the data asset
type - string
dataset / result
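The response fields listed above can be mapped onto a small typed structure on the client side. The following is one possible sketch, not part of the API: the DataAsset class is hypothetical, files and size are treated as optional because they are only present "if available", and the sample JSON is the response shown in this section's example.

```python
import json
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DataAsset:
    """Client-side view of the data asset fields described in the response."""

    id: str
    name: str
    description: str
    state: str  # draft / ready / failed
    type: str  # dataset / result
    created: float
    last_used: float
    tags: List[str] = field(default_factory=list)
    files: Optional[int] = None  # only present if available
    size: Optional[int] = None  # only present if available

    @classmethod
    def from_json(cls, text: str) -> "DataAsset":
        raw = json.loads(text)
        # Keep only the documented fields and ignore anything else.
        known = {k: raw[k] for k in cls.__dataclass_fields__ if k in raw}
        return cls(**known)


sample = (
    '{"created":1682956260,"description":"a new description from the API!",'
    '"files":1,"id":"2fc14a4d-5746-4f66-8da8-079ed3441286",'
    '"last_used":1682956368,"name":"Modified The Name","size":4594062,'
    '"state":"ready","tags":["I","Am","New"],"type":"dataset"}'
)
asset = DataAsset.from_json(sample)
print(asset.name, asset.state, asset.size)
```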
Request Example Bash
curl -X PUT 'https://codeocean.com/api/v1/data_assets/d36665a7-ef59-4b8e-a799-bee7f83ee317' \
  -u "${API_SECRET}:" \
  -H 'Content-Type: application/json' \
  --data-raw '{ "name": "Modified The Name", "description": "a new description from the API!", "tags": ["I", "Am", "New"], "mount": "NewMount" }'
Request Example Python
import os

import requests

headers = {"Content-Type": "application/json"}
json_data = {
    "name": "Modified The Name",
    "description": "a new description from the API!",
    "tags": ["I", "Am", "New"],
    "mount": "NewMount",
}

# HTTP Basic auth: the API secret is the user name and the password is empty.
response = requests.put(
    "https://codeocean.com/api/v1/data_assets/d36665a7-ef59-4b8e-a799-bee7f83ee317",
    headers=headers,
    json=json_data,
    auth=(os.getenv("API_SECRET", ""), ""),
)
Response
{"created":1682956260,"description":"a new description from the API!","files":1,"id":"2fc14a4d-5746-4f66-8da8-079ed3441286","last_used":1682956368,"name":"Modified The Name","size":4594062,"state":"ready","tags": ["I","Am","New" ],"type":"dataset"}