Data Asset API

Prerequisites

  • Token with datasets scope

  • The data asset's ID to pass to the API call

You can find the data asset's ID below the title
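
The curl examples on this page pass the token with `-u ${API_SECRET}:`, which is HTTP Basic authentication with the token as the username and an empty password. A minimal sketch of building that header by hand, assuming the token is stored in an `API_SECRET` environment variable; the helper name `basic_auth_header` is hypothetical and not part of the API:

```python
import base64
import os


def basic_auth_header(token: str) -> dict:
    """Return an Authorization header for HTTP Basic auth.

    The token is the username and the password is empty, which is
    what `curl -u ${API_SECRET}:` sends.
    """
    credentials = f"{token}:".encode("utf-8")
    encoded = base64.b64encode(credentials).decode("ascii")
    return {"Authorization": f"Basic {encoded}"}


# API_SECRET is the environment variable name used in the examples on this page.
headers = basic_auth_header(os.getenv("API_SECRET", ""))
```

With the `requests` library the same effect is achieved by passing `auth=(token, '')`, as the Python examples on this page do.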

Creating a Dataset from Computation

POST https://{domain}.codeocean.com/api/v1/data_assets

This API allows for the creation of datasets using specified parameters.

Path Parameters

Headers

Request Body

{
    "created":"created ID",
    "description":"the description provided",
    "files":"the number of files in the dataset",
    "id":"dataset's ID",
    "lastUsed":0,
    "name":"name of the dataset",
    "sizeInBytes":"size of the dataset",
    "state":"DATA_ASSET_STATE_DRAFT",
    "tags":"the tags provided",
    "type":"DATA_ASSET_TYPE_DATASET"
}
Request Response Description
  • created - integer

    • the data asset creation time in seconds from unix epoch.

  • description - string

    • a description of the data asset

  • id - string

    • the data asset internal id

  • last_used - integer

    • the time this data asset was last used in seconds from unix epoch

  • name - string

    • name of the data asset

  • provenance - object

    • the origin of the data asset: the capsule, commit, docker image, and run script it was created from

  • state - DRAFT, READY, FAILED

    • the data asset creation state. Can be one of the following:

      • DRAFT - the data asset is still being created

      • READY - the data asset is ready for use

      • FAILED - the data asset creation failed

  • tags - array of strings

    • keywords for searching the data asset by

  • type - DATASET, RESULT

    • the type of the data asset. Can be one of the following

      • DATASET

      • RESULT
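
The fields described above can be turned into something easier to work with in a script. The following is a minimal sketch, not an official client: the function name `summarize_data_asset` is hypothetical, and treating a `last_used` of 0 as "not used yet" is an assumption based on the example responses.

```python
from datetime import datetime, timezone

# States named in the description above, in the lowercase form
# that the example responses use.
KNOWN_STATES = {"draft", "ready", "failed"}


def summarize_data_asset(asset: dict) -> dict:
    """Normalize the state and convert epoch seconds to UTC datetimes."""
    # Accepts "draft", "DRAFT" or "DATA_ASSET_STATE_DRAFT" by keeping
    # only the part after the last underscore.
    state = str(asset.get("state", "")).lower().rsplit("_", 1)[-1]
    if state not in KNOWN_STATES:
        raise ValueError(f"unexpected state: {asset.get('state')!r}")

    last_used = asset.get("last_used", 0)
    return {
        "id": asset["id"],
        "name": asset.get("name", ""),
        "state": state,
        "created": datetime.fromtimestamp(asset["created"], tz=timezone.utc),
        # Assumption: 0 means the data asset has not been used yet.
        "last_used": (
            datetime.fromtimestamp(last_used, tz=timezone.utc) if last_used else None
        ),
        "tags": list(asset.get("tags", [])),
    }
```

Calling it on a parsed response, for example `summarize_data_asset(response.json())`, returns a dictionary with timezone-aware `created` and `last_used` values.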

Request Format
curl -H "Content-Type: application/json" -u ${API_SECRET}: -X POST https://{domain}/api/v1/data_assets --data-raw '{
"name": "Data asset From API",
"description": "An example for creating data asset from CO API",
"mount": "ProteinFile",
"tags": [ "proteomics", "bioinformatics" ],
"source": {
"computation": {
"id": "c229ed13-ec06-43d0-abd9-4d481af3f5e3"
}
}
}'
Request Example Bash
curl -H "Content-Type: application/json" -u ${API_SECRET}: -X POST https://apps.codeocean.com/api/v1/data_assets --data-raw '{
"name": "Data asset From API",
"description": "An example for creating data asset from CO API",
"mount": "ProteinFile",
"tags": [ "proteomics", "bioinformatics" ],
"source": {
"computation": {
"id": "c229ed13-ec06-43d0-abd9-4d481af3f5e3"
}
}
}'
Request Example Python
import os, requests 


headers = {
  "Content-Type": "application/json"
} 


json_data = {
  "name": "Data asset From API",
  "description": "An example for creating data asset from CO API",
  "mount": "ProteinFile",
  "tags": [
    "proteomics",
    "bioinformatics"
  ],
  "source": {
    "computation": {
      "id": "c229ed13-ec06-43d0-abd9-4d481af3f5e3"
    }
  }
}
 


response = requests.post( 'https://apps.codeocean.com/api/v1/data_assets', headers=headers, 
json=json_data, 
auth=(os.getenv('API_SECRET', ''), ''), 
)
Request Example Response
{
  "app_parameters": [
    {
      "name": "Assembly ID",
      "value": "1"
    },
    {
      "name": "Assembly ID",
      "value": "1"
    }
  ],
  "created": 1689606430,
  "description": "An example for creating data asset from CO API",
  "id": "3ae5a2bc-b217-4d28-bfe8-d7d3f9692016",
  "last_used": 0,
  "name": "",
  "provenance": {
    "capsule": "7a3357c8-bb41-4934-b229-19dcf594a32d",
    "commit": "d3cdc2c45ca4a9b331dcfac114fd90684b354231",
    "docker_image": "a440b7ff-cbe0-4561-bd60-fc322d17a45a",
    "run_script": "code/run"
  },
  "state": "draft",
  "tags": [
    "proteomics",
    "bioinformatics"
  ],
  "type": "result"
}
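
The request body used in the examples above can also be assembled with a small helper, which keeps the nested `source.computation.id` structure in one place. This is a sketch only; the function name `computation_asset_payload` is hypothetical, and the field names are taken from the curl example.

```python
import json


def computation_asset_payload(name, description, mount, tags, computation_id):
    """Build the JSON body for creating a data asset from a computation."""
    return {
        "name": name,
        "description": description,
        "mount": mount,
        "tags": list(tags),
        # The computation whose results become the data asset.
        "source": {"computation": {"id": computation_id}},
    }


payload = computation_asset_payload(
    name="Data asset From API",
    description="An example for creating data asset from CO API",
    mount="ProteinFile",
    tags=["proteomics", "bioinformatics"],
    computation_id="c229ed13-ec06-43d0-abd9-4d481af3f5e3",
)

# Serialized form, equivalent to the --data-raw argument in the curl example.
body = json.dumps(payload)
```

The resulting `payload` can be passed as `json=payload` to `requests.post`, as in the Python example above.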

Create a Dataset from a Public Bucket

POST https://{domain}.codeocean.com/api/v1/data_assets

This API allows for the creation of datasets from a public bucket using specified parameters.

Path Parameters

Headers

Request Body

{
   "created": 1633277005,
   "description": "a descriptive description",
   "files": 0,
   "id": "fea84ebf-b58b-4ad2-994d-7169dc3880fb",
   "lastUsed": 0,
   "name": "my dataset",
   "sizeInBytes": 0,
   "state": "DATA_ASSET_STATE_DRAFT",
   "tags": [ "t1", "t2" ],
   "type": "DATA_ASSET_TYPE_DATASET"
}
Request Format
curl -H "Content-Type: application/json" -u ${API_SECRET}: -X POST https://{domain}/api/v1/data_assets --data-raw '{
"name": "myDatasetFromPublic",
"description": "a descriptive description",
"mount": "Mymount",
"tags": ["t1", "t2"],
"source": {
"aws": {
"bucket": "Public Bucket",
"prefix": "PREFIX",
"keep_on_external_storage":false,
"index_data":false
}
}
}'
Request Example Bash
curl --location --request POST 'https://apps.codeocean.com/api/v1/data_assets' \
--header 'Content-Type: application/json' \
-u "${API_SECRET}:" \
--data-raw '{
"name":"import public AWS bucket with dataset api",
"description":"meaningful-c",
"mount":"citations",
"tags":["Genomics"],
"source":{
"aws":{
"bucket":"codeocean-public-data",
"prefix":"example_datasets/ATAC/hg38_2bit/",
"keep_on_external_storage":false,
"index_data":false
}
}
}'
Request Example Python
import os, requests 


headers = {
  "Content-Type": "application/json"
} 


json_data = {
  "name": "import public AWS bucket with dataset api",
  "description": "meaningful-c",
  "mount": "citations",
  "tags": [
    "Genomics"
  ],
  "source": {
    "aws": {
      "bucket": "codeocean-public-data",
      "prefix": "example_datasets/ATAC/hg38_2bit/",
      # Python booleans, serialized as JSON false (not the string "False")
      "keep_on_external_storage": False,
      "index_data": False
    }
  }
}


# Basic auth: the API token is the username and the password is empty
response = requests.post(
  'https://apps.codeocean.com/api/v1/data_assets',
  headers=headers,
  json=json_data,
  auth=(os.getenv('API_SECRET', ''), ''),
)
Request Example Response
{
  "created": 1689618780,
  "description": "meaningful-c",
  "id": "c39f20e6-9ded-4460-8292-76fc42fd1c00",
  "last_used": 0,
  "name": "import public AWS bucket with dataset api",
  "state": "draft",
  "tags": [
    "Genomics"
  ],
  "type": "dataset"
}
curl --location --request GET 'https://acmecorp.codeocean.com/api/v1/data_assets/37a93748-ce90-4980-913b-2de0908d5212' \
-u "${CUSTOM_KEY}:"
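
One detail that is easy to get wrong when moving from curl to Python is the two flags in the `aws` source: they are JSON booleans, so in Python they should be `False`/`True` rather than strings. A minimal sketch of a payload builder for a public bucket follows; the function name `public_bucket_payload` is hypothetical, and the field names come from the curl example.

```python
import json


def public_bucket_payload(name, description, mount, tags, bucket, prefix=""):
    """Build the JSON body for importing a public AWS bucket as a data asset."""
    return {
        "name": name,
        "description": description,
        "mount": mount,
        "tags": list(tags),
        "source": {
            "aws": {
                "bucket": bucket,
                "prefix": prefix,
                # Real booleans: json.dumps writes them as false, not "False".
                "keep_on_external_storage": False,
                "index_data": False,
            }
        },
    }


payload = public_bucket_payload(
    name="import public AWS bucket with dataset api",
    description="meaningful-c",
    mount="citations",
    tags=["Genomics"],
    bucket="codeocean-public-data",
    prefix="example_datasets/ATAC/hg38_2bit/",
)
body = json.dumps(payload)
```

Serializing the payload yields `"keep_on_external_storage": false`, matching the curl request body.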

Create a Data Asset from Private Bucket

POST https://{domain}.codeocean.com/api/v1/data_assets

This API allows for the creation of datasets from an S3 or GCP bucket using specified parameters.

Path Parameters

Headers

Request Body

{
    "has_more" - boolean: indicates whether there are more results
    "results" - array: array of datasets found
}
Request Format
curl --location --request POST 'https://{domain}/api/v1/data_assets' \
--header 'Content-Type: application/json' \
-u "${API_SECRET}:" \
--data-raw '{
"name": "My External Data Asset",
"description": "External Indexed Dataset From API",
"mount": "external-indexed",
"tags": [ "t1","t2"],
"source": {
"aws": {
"bucket": "codeocean-datasetapi-test-cs",
"keep_on_external_storage": true,
"index_data": true,
"access_key_id": "'"$AWS_ACCESS_KEY_ID"'",
"secret_access_key": "'"$AWS_SECRET_ACCESS_KEY"'"
}
}
}'
Request Example Bash
curl --location --request POST 'https://apps.codeocean.com/api/v1/data_assets' \
--header 'Content-Type: application/json' \
-u "${API_SECRET}:" \
--data-raw '{
"name":"import public AWS bucket with dataset api",
"description":"meaningful-c",
"mount":"citations",
"tags":["Genomics"],
"source":{
"aws":{
"bucket":"codeocean-public-data",
"prefix":"example_datasets/ATAC/hg38_2bit/",
"keep_on_external_storage":false,
"index_data":false
}
}
}'
Request Example Python
import os, requests


# HTTP headers only; the request payload belongs in json_data
headers = {
  "Content-Type": "application/json"
}


json_data = {
  "name": "import public AWS bucket with dataset api",
  "description": "meaningful-c",
  "mount": "citations",
  "tags": [
    "Genomics"
  ],
  "source": {
    "aws": {
      "bucket": "codeocean-public-data",
      "prefix": "example_datasets/ATAC/hg38_2bit/",
      # Python booleans, serialized as JSON false (not the string "False")
      "keep_on_external_storage": False,
      "index_data": False
    }
  }
}


# Basic auth: the API token is the username and the password is empty
response = requests.post(
  'https://apps.codeocean.com/api/v1/data_assets',
  headers=headers,
  json=json_data,
  auth=(os.getenv('API_SECRET', ''), ''),
)
Request Example Response
{
  "created": 1689618780,
  "description": "meaningful-c",
  "id": "c39f20e6-9ded-4460-8292-76fc42fd1c00",
  "last_used": 0,
  "name": "import public AWS bucket with dataset api",
  "state": "draft",
  "tags": [
    "Genomics"
  ],
  "type": "dataset"
}
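
For a private bucket, the Request Format above injects `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` from the shell environment. A Python equivalent might look like the sketch below. The function name `private_bucket_source` is hypothetical, only the AWS case is covered, and failing early when a credential is missing is a design choice of this sketch, not documented API behavior.

```python
import os

REQUIRED_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def private_bucket_source(bucket, env=None):
    """Build the "source" object for a private AWS bucket.

    Credentials are read from `env` (defaults to os.environ) so that
    they never need to be written into the script itself.
    """
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {
        "aws": {
            "bucket": bucket,
            # Values taken from the Request Format example above.
            "keep_on_external_storage": True,
            "index_data": True,
            "access_key_id": env["AWS_ACCESS_KEY_ID"],
            "secret_access_key": env["AWS_SECRET_ACCESS_KEY"],
        }
    }
```

The returned dictionary goes under the `"source"` key of the request body, next to `name`, `description`, `mount`, and `tags`.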

Get Metadata from Dataset

GET https://{domain}.codeocean.com/api/v1/data_assets/{data_set_id}

This API retrieves metadata for your data asset.

Path Parameters

Headers

{
  "created": float64 - data asset creation time,
  "description": string - data asset description,
  "files": int64 - total number of files in the data asset if available,
  "id": string - the data asset internal id,
  "lastUsed": float64 - the last time the data asset was used in seconds since epoch,
  "name": string - data asset name,
  "size": int64 - the total size in bytes of the data asset if available,
  "state": string - data asset state - draft / ready / failed,
  "tags": array of string tags,
  "type": string - dataset / result 
}
Request Response Description
  • created - float64

    • data asset creation time

  • description - string

    • data asset description

  • field - string

    • field of research

  • id - string

    • metadata id

  • Keywords - string/list

    • associated keywords

  • Name - string

    • name of the dataset

  • Owner - string

    • identification ID of the owner

  • Published Capsule - boolean

    • value indicates whether this capsule is published

  • Slug - string

    • unknown

  • Status - string

    • indicates the status of this capsule

Request Format
curl --location --request GET 'https://{domain}/api/v1/data_assets/{dataset_id}' \
--header 'Content-Type: application/json' \
-u "${API_SECRET}:"
Request Example Bash
curl --location --request GET 'https://apps.codeocean.com/api/v1/data_assets/4bc97533-6eb4-48ac-966f-648548a756d2' \
--header 'Content-Type: application/json' \
-u "${API_SECRET}:"
Request Example Python
import os, requests 


headers = {
  "Content-Type": "application/json"
}
 
response = requests.get( 'https://apps.codeocean.com/api/v1/data_assets/4bc97533-6eb4-48ac-966f-648548a756d2', headers=headers, auth=(os.getenv('API_SECRET', ''), ''), 
)
Request Example Response
{
  "cloned_from_url": "",
  "created": 1673385764,
  "description": "This tool takes an alignment of reads or fragments as input (BAM file) and generates a coverage track (bigWig or bedGraph) as output. The coverage is calculated as the number of reads per bin, where bins are short consecutive counting windows of a defined size. It is possible to extended the length of the reads to better reflect the actual fragment length. bamCoverage offers normalization by scaling factor, Reads Per Kilobase per Million mapped reads (RPKM), counts per million (CPM), bins per million mapped reads (BPM) and 1x depth (reads per genome coverage, RPGC).\n\nSource : https://deeptools.readthedocs.io/en/develop/content/tools/bamCoverage.html",
  "field": "Bioinformatics",
  "id": "4bc97533-6eb4-48ac-966f-648548a756d2",
  "keywords": [
    "ChIP",
    "Normalization"
  ],
  "name": "deepTools-bamCoverage",
  "owner": "467ef120-2c93-42eb-8865-5866004243bf",
  "published_capsule": "",
  "slug": "7607289",
  "status": "non-published"
}
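
Because a newly created data asset starts in the draft state, a common use of this endpoint is to poll it until the state becomes ready or failed. The following is a minimal sketch under stated assumptions: `wait_until_ready` is a hypothetical helper, the caller supplies a `fetch` function that performs the GET request and returns the parsed JSON, and the interval and attempt limit are arbitrary defaults.

```python
import time
from typing import Callable


def wait_until_ready(
    fetch: Callable[[], dict],
    interval: float = 5.0,
    max_attempts: int = 60,
) -> dict:
    """Call fetch() repeatedly until the data asset leaves the draft state.

    Returns the metadata once state is "ready", raises RuntimeError if
    creation failed, and raises TimeoutError after max_attempts polls.
    """
    for _ in range(max_attempts):
        asset = fetch()
        state = str(asset.get("state", "")).lower()
        if state == "ready":
            return asset
        if state == "failed":
            raise RuntimeError("data asset creation failed")
        # Still draft (or an unrecognized state): wait and try again.
        time.sleep(interval)
    raise TimeoutError(f"data asset not ready after {max_attempts} attempts")
```

With `requests`, `fetch` could be a small function that wraps the GET call from the Python example above and returns `response.json()`.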

Update Metadata

PUT https://{domain}.codeocean.com/api/v1/data_assets/{data_set_id}

This API allows for the updating of the metadata for your data asset.

Path Parameters

Headers

Request Body

Request Response Description
  • created - float64

    • data asset creation time

  • description - string

    • data asset description

  • files - int64

    • total number of files in the data asset if available

  • id - string

    • the data asset internal id

  • last_used - float64

    • the last time the data asset was used in seconds since epoch

  • name - string

    • name of the dataset

  • size - int64

    • the total size in bytes of the data asset if available

  • state - string

    • data asset state - draft / ready / failed

  • tags - array

    • array of string tags

  • type - string

    • dataset / result

Request Format
curl -X PUT 'https://{domain}/api/v1/data_assets/{dataset_id}' \
-u "${API_SECRET}:" \
-H 'Content-Type: application/json' \
--data-raw '{
"name": "Modified The Name",
"description": "a new description from the API!",
"tags": ["I","Am","New"],
"mount": "NewMount"
}'
Request Example Bash
curl -X PUT 'https://apps.codeocean.com/api/v1/data_assets/d36665a7-ef59-4b8e-a799-bee7f83ee317' \
-u "${API_SECRET}:" \
-H 'Content-Type: application/json' \
--data-raw '{
"name": "Modified The Name",
"description": "a new description from the API!",
"tags": ["I","Am","New"],
"mount": "NewMount"
}'
Request Example Python
import os, requests 


headers = {
  "Content-Type": "application/json"
}


json_data = {
  "name": "Modified The Name",
  "description": "a new description from the API!",
  "tags": [
    "I",
    "Am",
    "New"
  ],
  "mount": "NewMount"
}
 
response = requests.put( 'https://apps.codeocean.com/api/v1/data_assets/d36665a7-ef59-4b8e-a799-bee7f83ee317', headers=headers, json=json_data, auth=(os.getenv('API_SECRET', ''), ''), 
)
Request Example Response
{
  "created": 1682956260,
  "description": "a new description from the API!",
  "files": 1,
  "id": "2fc14a4d-5746-4f66-8da8-079ed3441286",
  "last_used": 1682956368,
  "name": "Modified The Name",
  "size": 4594062,
  "state": "ready",
  "tags": [
    "I",
    "Am",
    "New"
  ],
  "type": "dataset"
}
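
For scripts that should not depend on third-party packages, the same PUT request can be prepared with the standard library. This is a sketch only: `build_update_request` is a hypothetical helper, the set of editable fields is taken from the request examples above, and whether the API accepts a body with only some of those fields is not stated here.

```python
import base64
import json
import urllib.request

# Fields that appear in the PUT request examples above.
EDITABLE_FIELDS = {"name", "description", "tags", "mount"}


def build_update_request(domain, asset_id, token, **fields):
    """Prepare (but do not send) a PUT request that updates metadata."""
    unknown = set(fields) - EDITABLE_FIELDS
    if unknown:
        raise ValueError(f"unsupported fields: {sorted(unknown)}")
    # Basic auth with the token as username and an empty password.
    auth = base64.b64encode(f"{token}:".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"https://{domain}/api/v1/data_assets/{asset_id}",
        data=json.dumps(fields).encode("utf-8"),
        method="PUT",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {auth}",
        },
    )
```

The prepared request can then be sent with `urllib.request.urlopen(request)`; keeping construction separate from sending makes the URL and body easy to inspect first.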

Archiving/Unarchiving a Dataset

PATCH https://{domain}.codeocean.com/api/v1/data_assets/{data_set_id}/archive?archive=true

This API archives or unarchives your data asset, depending on the value of the archive query parameter.

Path Parameters

Headers

Request Format
Archiving a Dataset

curl -H "Content-Type: application/json" -u ${API_SECRET}: -X PATCH "https://{domain}/api/v1/data_assets/{dataset_id}/archive?archive=true"

Unarchiving a Dataset

curl -H "Content-Type: application/json" -u ${API_SECRET}: -X PATCH "https://{domain}/api/v1/data_assets/{dataset_id}/archive?archive=false"
Request Example Bash
Archiving a Dataset

curl -H "Content-Type: application/json" -u ${API_SECRET}: -X PATCH \
"https://apps.codeocean.com/api/v1/data_assets/e25ec103-a712-4882-a9fa-3cd5a80438a8/archive?archive=true"


Unarchiving a Dataset


curl -H "Content-Type: application/json" -u ${API_SECRET}: -X PATCH \
"https://apps.codeocean.com/api/v1/data_assets/e25ec103-a712-4882-a9fa-3cd5a80438a8/archive?archive=false"
Request Example Python
Archiving a Dataset


import os, requests 


headers = {
  "Content-Type": "application/json"
} 


params = {
  "archive": "true"
}
 
response = requests.patch( 'https://apps.codeocean.com/api/v1/data_assets/e25ec103-a712-4882-a9fa-3cd5a80438a8/archive', params=params, headers=headers, auth=(os.getenv('API_SECRET', ''), ''), 
)


Unarchiving a Dataset


headers = {
  "Content-Type": "application/json"
}
 
params = {
  "archive": "false"
}
 
response = requests.patch( 'https://apps.codeocean.com/api/v1/data_assets/e25ec103-a712-4882-a9fa-3cd5a80438a8/archive', params=params, headers=headers, auth=(os.getenv('API_SECRET', ''), ''), 
)
Request Example Response

There is no response.
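
Since archiving and unarchiving differ only in the value of the `archive` query parameter, both can share one URL builder. A minimal sketch, assuming the URL layout shown in the request examples above; the function name `archive_url` is hypothetical.

```python
from urllib.parse import quote, urlencode


def archive_url(domain: str, asset_id: str, archive: bool) -> str:
    """Return the PATCH URL that archives (True) or unarchives (False) a data asset."""
    # The API examples use the lowercase strings "true" and "false".
    query = urlencode({"archive": "true" if archive else "false"})
    return (
        f"https://{domain}/api/v1/data_assets/"
        f"{quote(asset_id, safe='')}/archive?{query}"
    )
```

The URL can be passed directly to `requests.patch`, in which case the separate `params` dictionary from the Python example is not needed.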
