Data Asset API

Prerequisites

  • Generated Token with Datasets Scope

  • The Data Asset's ID

You can find the Data Asset's ID below its title.

  • created

    • float64

      • Data Asset creation time

  • description

    • string

      • Data Asset description

  • field

    • string

      • field of research

  • id

    • integer

      • metadata id

  • lastUsed

    • integer

      • the time this Data Asset was last used in seconds from unix epoch.

  • name

    • string

      • name of the Data Asset

  • size

    • string

      • the size in bytes of the Data Asset.

  • state

    • string

      • the Data Asset creation state

        • DATA_ASSET_STATE_DRAFT

          • the Data Asset is still being created.

        • DATA_ASSET_STATE_READY

          • the Data Asset is ready to use.

        • DATA_ASSET_STATE_FAILED

          • the Data Asset creation failed.

  • source_bucket

    • information on bucket from which Data Asset was created.

      • bucket

        • the original bucket's name

      • origin

        • aws/local/gcp

      • prefix

        • the folder in the S3 bucket from which the Data Asset was created.

  • tags

    • keywords used to search for the Data Asset.

  • type

    • the type of the Data Asset.

      • DATA_ASSET_TYPE_DATASET

      • DATA_ASSET_TYPE_RESULT

  • custom_metadata

    • map of key value pairs, according to custom metadata fields defined by the admin and values that were set by the user.

    • provenance

      • shows the Data Asset provenance; only relevant when type is result.

        • commit

          • the commit the Data Asset was created from.

        • runScript

          • the script the Data Asset was created by.

        • DataAssets

          • the Data Assets that were used to create the Data Asset.

        • DockerImage

          • the docker image used to create the Data Asset.

        • Capsule

          • the Capsule used to create the Data Asset.

    • app_parameters

      • the list of command-line arguments and their names provided to the Capsule run script in the related computation.

      • items

        • name

          • parameter name

        • value

          • parameter value
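The fields listed above can be read into a small typed record on the client side. The sketch below is illustrative only: `DataAssetSummary` and `summarize` are hypothetical names, it keeps just a handful of the listed fields, and it assumes `created` is expressed in seconds since the Unix epoch (as `lastUsed` is).

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any, Dict, List


@dataclass
class DataAssetSummary:
    """Hypothetical client-side view of a few Data Asset fields."""
    id: str
    name: str
    state: str
    type: str
    created: datetime
    tags: List[str]


def summarize(asset: Dict[str, Any]) -> DataAssetSummary:
    """Build a summary from a decoded Data Asset JSON object."""
    return DataAssetSummary(
        id=str(asset["id"]),
        name=asset["name"],
        state=asset.get("state", ""),
        type=asset.get("type", ""),
        # Assumption: "created" is seconds since the Unix epoch.
        created=datetime.fromtimestamp(float(asset["created"]), tz=timezone.utc),
        tags=list(asset.get("tags") or []),
    )
```

Passing the decoded JSON of a single Data Asset to `summarize` yields a record whose `created` field is a timezone-aware `datetime`.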

Create Data Asset

POST https://{domain}/api/v1/data_assets

This API allows for the creation of data assets from either an S3 bucket or the results of a computation.

Prerequisite

Before using this API call, you may require AWS Cloud Credentials configured as Secrets or an Assumable Role.

Path Parameters

Name | Type | Description

POST*

/data_assets

Headers

Name | Type | Description

-u*

Authorize with your Code Ocean API secret: -u $API_SECRET:

This sets the HTTP Basic "Authorization" header (a base64-encoded string).

-H*

Set this to: Content-Type: application/json

--data-raw*

JSON data describing the data asset to create
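To make the `-u` header concrete, here is a minimal sketch of how an HTTP Basic `Authorization` value is built from the API secret. The function name `basic_auth_header` is hypothetical; the encoding itself is standard HTTP Basic authentication with the secret as the username and an empty password.

```python
import base64


def basic_auth_header(api_secret: str) -> str:
    """Return the Authorization header value that `-u $API_SECRET:` produces."""
    # The secret is the username and the password is empty, hence the trailing colon.
    credentials = f"{api_secret}:".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")
```

With `requests`, passing `auth=(api_secret, "")` produces the same header, so building it by hand is rarely necessary.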

Request Body

Name | Type | Description

name*

string

data asset name

description*

string

data asset description

mount*

string

data asset default mount folder

tags*

list/string

keywords for searching the data asset by

custom_metadata

map of key-value pairs, according to the custom metadata fields defined by the admin. Possible value types:

- custom field: string

- custom field: number

- custom field: date

source*

aws*

bucket*

the S3 bucket from which the data asset will be created

- bucket name

prefix*

the folder in the S3 bucket from which the data asset will be created

- directory path

keep_on_external_storage*

boolean

when this property is set to true, an External Data Asset is created. When it is set to false or omitted, the data asset files are copied into Code Ocean and an Internal Data Asset is created.

public*

boolean

when this property is set to true, Code Ocean tries to access the source bucket without credentials

computation

id

string

computation ID

path

string

path to a folder in Results; leave empty to capture all files

target

aws

bucket

the S3 bucket in which the result files will be stored. Specifying this parameter creates an external result data asset

prefix

the folder in the S3 bucket in which the data asset will be created
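The request body takes one of two source shapes (an S3 bucket or a computation) plus optional `target` and `custom_metadata` entries. The sketch below assembles such a body; the helper names `s3_source`, `computation_source` and `create_payload` are hypothetical, and the lowercase spelling of `keep_on_external_storage` is an assumption based on the lowercase keys used in the request examples.

```python
from typing import Any, Dict, List, Optional


def s3_source(bucket: str, prefix: str = "", public: bool = False,
              keep_on_external_storage: bool = False) -> Dict[str, Any]:
    """Describe an S3 bucket source."""
    aws: Dict[str, Any] = {"bucket": bucket, "prefix": prefix}
    if public:
        aws["public"] = True
    if keep_on_external_storage:
        # Assumption: lowercase key spelling, matching the other keys.
        aws["keep_on_external_storage"] = True
    return {"aws": aws}


def computation_source(computation_id: str, path: Optional[str] = None) -> Dict[str, Any]:
    """Describe a computation source; omit path to capture all Results."""
    computation: Dict[str, Any] = {"id": computation_id}
    if path:
        computation["path"] = path
    return {"computation": computation}


def create_payload(name: str, description: str, mount: str, tags: List[str],
                   source: Dict[str, Any],
                   target: Optional[Dict[str, Any]] = None,
                   custom_metadata: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Assemble the JSON body for a create request."""
    payload: Dict[str, Any] = {
        "name": name,
        "description": description,
        "mount": mount,
        "tags": list(tags),
        "source": source,
    }
    if target is not None:
        payload["target"] = target
    if custom_metadata:
        payload["custom_metadata"] = dict(custom_metadata)
    return payload
```

The resulting dictionary can be passed as `json=` to `requests.post`, as in the request examples that follow.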

Create a Data Asset from a Public S3 Bucket

Request Example Bash
curl --location --request POST 'https://codeocean.com/api/v1/data_assets' \
--header 'Content-Type: application/json' \
-u "${API_SECRET}:" \
--data-raw '{
"name":"import public AWS bucket with dataset api",
"description":"meaningful-c",
"mount":"citations",
"tags":["Genomics"],
"source":
        {
        "aws":
                {
                        "public":true,
                        "bucket":"codeocean-public-data",
                        "prefix":"example_datasets/ATAC/hg38_2bit/"
                }
        }
}'
Request Example Python
import os, requests 


headers = {
  "Content-Type": "application/json"
} 

json_data = {
  "name": "import public AWS bucket with dataset api",
  "description": "meaningful-c",
  "mount": "citations",
  "tags": [
    "Genomics"
  ],
  "source": {
    "aws": {
      "public": True,
      "bucket": "codeocean-public-data",
      "prefix": "example_datasets/ATAC/hg38_2bit/"
    }
  }
}

# Basic auth: the API secret is the username and the password is empty.
response = requests.post('https://codeocean.com/api/v1/data_assets', headers=headers, json=json_data, auth=(os.getenv('API_SECRET', ''), ''))

Create a Data Asset from a Private S3 Bucket

Request Example Bash
curl --location --request POST 'https://codeocean.com/api/v1/data_assets' \
--header 'Content-Type: application/json' \
-u "${API_SECRET}:" \
--data-raw '{
"name":"import private AWS bucket with Data Asset API",
"description":"meaningful-c",
"mount":"citations",
"tags":["Genomics"],
"source":
        {
        "aws":
                {
                "bucket":"codeocean-private-data",
                "prefix":"example_datasets/ATAC/hg38_2bit/"
                }
        }
}'
Request Example Python
import os, requests 


headers = {
  "Content-Type": "application/json"
}

json_data = {
  "name": "import private AWS bucket with Data Asset API",
  "description": "meaningful-c",
  "mount": "citations",
  "tags": [
    "Genomics"
  ],
  "source": {
    "aws": {
      "bucket": "codeocean-private-data",
      "prefix": "example_datasets/ATAC/hg38_2bit/"
    }
  }
}
 
# Basic auth: the API secret is the username and the password is empty.
response = requests.post('https://codeocean.com/api/v1/data_assets', headers=headers, json=json_data, auth=(os.getenv('API_SECRET', ''), ''))

Create an External Result Data Asset

Request Example Bash
# "path" selects a folder in Results (here: Alignment/).
curl -H "Content-Type: application/json" -u "${API_SECRET}:" -X POST https://codeocean.com/api/v1/data_assets --data-raw '{
"name": "RNA-Sequencing",
"description": "these are reads from an experiment",
"mount": "Reads",
"tags": ["Genomics", "RNA"],
"source":
        {
        "computation":
                {
                        "id": "8f174aed-64ce-43eb-9c16-64d25da84bda",
                        "path": "Alignment/"
                }
        },
"target":
        {
        "aws":
                {
                        "bucket": "my-bucket",
                        "prefix": "deposit/my/results/"
                }
        }
}'
Request Example Python
import os, requests 


headers = {
  "Content-Type": "application/json"
} 


json_data = {
  "name": "RNA-Sequencing",
  "description": "these are reads from an experiment",
  "mount": "Reads",
  "tags": [ "Genomics", "RNA" ],
  "source": {
      "computation": {
         "id": "8f174aed-64ce-43eb-9c16-64d25da84bda"
      }
  },
  "target": {
   "aws": {
     "bucket": "my-bucket",
     "prefix": "deposit/my/results/"
   }
 }
}
 

# Basic auth: the API secret is the username and the password is empty.
response = requests.post('https://codeocean.com/api/v1/data_assets', headers=headers, json=json_data, auth=(os.getenv('API_SECRET', ''), ''))

Create a Result Data Asset

Request Example Bash
curl -H "Content-Type: application/json" -u ${API_SECRET}: -X POST https://codeocean.com/api/v1/data_assets --data-raw '{
  "name": "Data asset From API",
  "description": "An example for creating data asset from CO API",
  "mount": "some-folder",
  "tags": [ "keyword1", "keyword2" ],
  "source": {
      "computation": {
         "id": "8f174aed-64ce-43eb-9c16-64d25da84bda"
      }
  }
}'
Request Example Python
import os, requests 

headers = {
  "Content-Type": "application/json"
} 

json_data = {
  "name": "Data asset From API",
  "description": "An example for creating data asset from CO API",
  "mount": "some-folder",
  "tags": [ "keyword1", "keyword2" ],
  "source": {
      "computation": {
         "id": "8f174aed-64ce-43eb-9c16-64d25da84bda"
      }
  }
}
 
# Basic auth: the API secret is the username and the password is empty.
response = requests.post('https://codeocean.com/api/v1/data_assets', headers=headers, json=json_data, auth=(os.getenv('API_SECRET', ''), ''))

Create a Data Asset with Custom Metadata Tags

Request Example Bash
# Remove "path" to capture all Results.
curl -H "Content-Type: application/json" -u "${API_SECRET}:" -X POST https://codeocean.com/api/v1/data_assets --data-raw '{
"name": "myDatasetFromPublic",
"description": "a descriptive description",
"mount": "Mymount",
"tags": ["t1", "t2"],
"custom_metadata":
        {
                "some_field": "one",
                "another_field": 1,
                "dateField": 1676246400
        },
"source":
        {
        "computation":
                {
                "id": "computation_ID",
                "path": "/path/to/folder/"
                }
        }
}'
Request Example Python
import os, requests 


headers = {
  "Content-Type": "application/json"
} 


json_data = {
  "name": "myDatasetFromPublic",
"description": "a descriptive description",
"mount": "Mymount",
"tags": ["t1", "t2"],
"custom_metadata":
    {
        "some_field": "one", 
        "another_field": 1, 
        "dateField": 1676246400 
},
"source": 
        {
        "computation": 
                    {
                        "id": "computation_ID",
                        "path": "/path/to/folder/"  # remove this key to capture all Results
                    }
        }
}

# Basic auth: the API secret is the username and the password is empty.
response = requests.post('https://codeocean.com/api/v1/data_assets', headers=headers, json=json_data, auth=(os.getenv('API_SECRET', ''), ''))
Response

The API only returns a confirmation that the creation request is valid, not that the creation succeeded, because creation takes time. Poll the data asset details and monitor its state until it is ready.
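The polling described above could be sketched as follows. `wait_until_ready` is a hypothetical helper: it takes a `fetch` callable that returns the decoded Get response, and it assumes the `state` value ends in `ready` or `failed` (covering both the short `ready` form and the `DATA_ASSET_STATE_READY` form). The interval and timeout defaults are arbitrary.

```python
import time
from typing import Any, Callable, Dict


def wait_until_ready(fetch: Callable[[], Dict[str, Any]],
                     interval: float = 5.0,
                     timeout: float = 600.0,
                     sleep: Callable[[float], None] = time.sleep) -> Dict[str, Any]:
    """Call fetch() repeatedly until the data asset is ready, failed, or timed out."""
    deadline = time.monotonic() + timeout
    while True:
        asset = fetch()
        # Assumption: the state is reported as e.g. "ready" or "DATA_ASSET_STATE_READY".
        state = str(asset.get("state", "")).lower()
        if state.endswith("ready"):
            return asset
        if state.endswith("failed"):
            raise RuntimeError("data asset creation failed")
        if time.monotonic() >= deadline:
            raise TimeoutError("data asset was not ready before the timeout")
        sleep(interval)
```

A `fetch` callable might wrap the Get request, for example `lambda: requests.get(url, auth=(api_secret, "")).json()`, where `url` is the data asset's details URL.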

Get Dataset

GET https://{domain}/api/v1/data_assets/{data_set_id}

This API retrieves metadata for your data asset.

Path Parameters

Name | Type | Description

GET *

/data_assets/:data_asset_id

Headers

Name | Type | Description

-u*

String

Authorize with your Code Ocean API secret: -u $API_SECRET:

This sets the HTTP Basic "Authorization" header (a base64-encoded string).

-H*

String

Set this to: Content-Type: application/json

{
  "created": float64 - data asset creation time,
  "description": string - data asset description,
  "files": int64 - total number of files in the data asset if available,
  "id": string - the data asset internal id,
  "lastUsed": float64 - the last time the data asset was used in seconds since epoch,
  "name": string - data asset name,
  "size": int64 - the total size in bytes of the data asset if available,
  "state": string - data asset state - draft / ready / failed,
  "tags": array of string tags,
  "type": string - dataset / result 
}
Request Example Bash
curl --location --request GET 'https://codeocean.com/api/v1/data_assets/4bc97533-6eb4-48ac-966f-648548a756d2' \
--header 'Content-Type: application/json' \
-u "${API_SECRET}:"
Request Example Python
import os, requests 


headers = {
  "Content-Type": "application/json"
}
 
# Basic auth: the API secret is the username and the password is empty.
response = requests.get('https://codeocean.com/api/v1/data_assets/4bc97533-6eb4-48ac-966f-648548a756d2', headers=headers, auth=(os.getenv('API_SECRET', ''), ''))
Response

Update Metadata

PUT https://{domain}/api/v1/data_assets/{data_set_id}

This API allows for the updating of the metadata for your data asset.

Path Parameters

Name | Type | Description

PUT*

/data_assets/:data_asset_id

Your VPC domain

Headers

Name | Type | Description

-u*

Authorize with your Code Ocean API secret: -u $API_SECRET:

This sets the HTTP Basic "Authorization" header (a base64-encoded string).

-H*

Set this to: Content-Type: application/json

--data-raw*

JSON with parameters

Request Body

Name | Type | Description

name*

string

The name of the data asset

description*

string

A description for the data asset

tags*

list/string

Keywords to search the data asset by

mount*

string

Data asset default mount folder

custom_metadata

Map of key-value pairs matching the custom metadata fields defined by the admin. Possible value types:

string custom field - string

number custom field - number

date custom field - number

- Unix (epoch) timestamp in seconds
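Because date custom fields are sent as Unix timestamps in seconds, a calendar date has to be converted before it goes into `custom_metadata`. A minimal sketch, assuming the date is meant as midnight UTC (the time zone convention is an assumption, and `to_epoch_seconds` is a hypothetical helper name):

```python
from datetime import date, datetime, time, timezone


def to_epoch_seconds(day: date) -> int:
    """Convert a calendar date to a Unix timestamp in seconds (midnight UTC)."""
    # Assumption: dates are interpreted as midnight UTC.
    midnight_utc = datetime.combine(day, time.min, tzinfo=timezone.utc)
    return int(midnight_utc.timestamp())
```

For example, `to_epoch_seconds(date(2023, 2, 13))` returns `1676246400`, the `dateField` value used in the custom metadata request example.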

Request Example Bash
curl -X PUT 'https://codeocean.com/api/v1/data_assets/d36665a7-ef59-4b8e-a799-bee7f83ee317' \
-u "${API_SECRET}:" \
-H 'Content-Type: application/json' \
--data-raw '{
        "name": "Modified The Name",
        "description": "a new description from the API!",
        "tags": ["I","Am","New"],
        "mount": "NewMount"
}'
Request Example Python
import os, requests 

headers = {
  "Content-Type": "application/json"
}

json_data = {
  "name": "Modified The Name",
  "description": "a new description from the API!",
  "tags": [
    "I",
    "Am",
    "New"
  ],
  "mount": "NewMount"
}
 
# Basic auth: the API secret is the username and the password is empty.
response = requests.put('https://codeocean.com/api/v1/data_assets/d36665a7-ef59-4b8e-a799-bee7f83ee317', headers=headers, json=json_data, auth=(os.getenv('API_SECRET', ''), ''))
Response

Archiving/Unarchiving a Dataset

PATCH https://{domain}/api/v1/data_assets/{data_set_id}/archive?archive=true

This API archives or unarchives your data asset, depending on the value of the archive query parameter.

Path Parameters

Name | Type | Description

PATCH*

/data_assets/:data_asset_id

Your VPC domain

Headers

Name | Type | Description

-u*

Authorize with your Code Ocean API secret: -u $API_SECRET:

This sets the HTTP Basic "Authorization" header (a base64-encoded string).

-H*

Set this to: Content-Type: application/json

Request Example Bash
Archiving a Dataset

curl -H "Content-Type: application/json" -u "${API_SECRET}:" -X PATCH \
"https://codeocean.com/api/v1/data_assets/e25ec103-a712-4882-a9fa-3cd5a80438a8/archive?archive=true"


Unarchiving a Dataset


curl -H "Content-Type: application/json" -u "${API_SECRET}:" -X PATCH \
"https://codeocean.com/api/v1/data_assets/e25ec103-a712-4882-a9fa-3cd5a80438a8/archive?archive=false"
Request Example Python
Archiving a Dataset


import os, requests 


headers = {
  "Content-Type": "application/json"
} 


params = {
  "archive": "true"
}
 
response = requests.patch('https://codeocean.com/api/v1/data_assets/e25ec103-a712-4882-a9fa-3cd5a80438a8/archive', params=params, headers=headers, auth=(os.getenv('API_SECRET', ''), ''), 
)


Unarchiving a Dataset


headers = {
  "Content-Type": "application/json"
}
 
params = {
  "archive": "false"
}
 
response = requests.patch('https://codeocean.com/api/v1/data_assets/e25ec103-a712-4882-a9fa-3cd5a80438a8/archive', params=params, headers=headers, auth=(os.getenv('API_SECRET', ''), ''), 
)
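Archiving and unarchiving differ only in the `archive` query parameter, so both calls can share one helper. The sketch below only builds the URL and parameters; `archive_request` is a hypothetical name and the domain is passed in by the caller.

```python
from typing import Dict, Tuple


def archive_request(domain: str, data_asset_id: str, archive: bool) -> Tuple[str, Dict[str, str]]:
    """Return the URL and query parameters for an archive or unarchive call."""
    url = f"https://{domain}/api/v1/data_assets/{data_asset_id}/archive"
    # The endpoint takes the flag as the literal strings "true" or "false".
    params = {"archive": "true" if archive else "false"}
    return url, params
```

The result can be passed on as `requests.patch(url, params=params, auth=(os.getenv('API_SECRET', ''), ''))`.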

Search Data Assets

POST https://{domain}/api/v1/data_assets/search

This API allows for the searching of Data Assets in your deployment.

Path Parameters

Name | Type | Description

POST*

/data_assets/search

Headers

Name | Type | Description

-u*

Authorize with your Code Ocean API secret: -u $API_SECRET:

This sets the HTTP Basic "Authorization" header (a base64-encoded string).

-H*

Set this to: Content-Type: application/json

--data-raw*

JSON with parameters

Request Body

Name | Type | Description

offset*

int

the index of the first result to return.

limit*

int

specifies how many items to return.

sort_order

string

asc,desc

determines the sort order of the results. Must be provided together with sort_field; otherwise it is ignored.

sort_field

string

created,type,name,size

determines the field to sort by.

type

string

dataset, result

if omitted, the results may include both datasets and results.

ownership*

string

created, shared

search Data Asset by ownership.

favorite*

boolean

search only favorite Data Assets.

archived*

boolean

search only archived Data Assets.

query*

string

determines the search query.

format

name: .. tag: ... run_script: ... commit_id: ...

Request Example Bash
curl -X POST 'https://codeocean.com/api/v1/data_assets/search' \
-u "${API_SECRET}:" \
-H 'Content-Type: application/json' \
--data-raw '{
        "offset": 0,
        "limit": 10,
        "sort_order": "desc",
        "sort_field": "name",
        "type": "dataset",
        "ownership": "created",
        "favorite": false,
        "archived": false,
        "query": "tag:bioinforma name:Saccro"
}'
Request Example Python
import os, requests 


headers = {
  "Content-Type": "application/json"
} 


json_data = {
        "offset": 0,
        "limit": 10,
        "sort_order": "desc",
        "sort_field": "name",
        "type": "dataset",
        "ownership": "created",
        "favorite": False,
        "archived": False,
        "query": "tag:bioinforma name:Saccro"
}

# Send the search parameters as the JSON request body, as in the Bash example.
response = requests.post('https://codeocean.com/api/v1/data_assets/search',
json=json_data, headers=headers,
auth=(os.getenv('API_SECRET', ''), ''),
)
Response

{
  "has_more": boolean - indicates whether there are more results,
  "results": array of Data Asset objects
}
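Since each response carries a `has_more` flag, a client can page through all matches by advancing `offset`. A minimal sketch, assuming a caller-supplied `search` callable that sends the request body and returns the decoded response (`iter_search_results` and `search` are hypothetical names):

```python
from typing import Any, Callable, Dict, Iterator


def iter_search_results(search: Callable[[Dict[str, Any]], Dict[str, Any]],
                        query: Dict[str, Any],
                        page_size: int = 10) -> Iterator[Dict[str, Any]]:
    """Yield every Data Asset matching query, one page at a time."""
    offset = 0
    while True:
        # Copy the query and set the paging fields for this request.
        body = dict(query, offset=offset, limit=page_size)
        page = search(body)
        results = page.get("results") or []
        yield from results
        # Stop when the server reports no more results or returns an empty page.
        if not page.get("has_more") or not results:
            break
        offset += len(results)
```

Here `search` might be `lambda body: requests.post(url, json=body, auth=(api_secret, "")).json()`, with `url` pointing at the search endpoint.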

Update Permissions of a Data Asset

POST https://{domain}/api/v1/data_assets/{data_set_id}/permissions

This API updates the permissions associated with a Data Asset in your deployment.

Path Parameters

Name | Type | Description

POST*

/data_assets/{data_set_id}/permissions

Headers

Name | Type | Description

-u*

Authorize with your Code Ocean API secret: -u $API_SECRET:

This sets the HTTP Basic "Authorization" header (a base64-encoded string).

-H*

Set this to: Content-Type: application/json

--data-raw*

JSON with parameters

Request Body

Name | Type | Description

users*

array<dict>

list of dictionaries with username and role.

role : owner, viewer, editor.

groups*

array<dict>

list of dictionaries with groups and roles

everyone*

string

viewer or none
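The permissions body can be assembled and checked on the client before it is sent. In the sketch below, `permissions_payload` is a hypothetical helper; the `email`, `group` and `role` keys follow the request example, and it assumes groups accept the same roles as users, which the table above does not state explicitly.

```python
from typing import Any, Dict, Iterable, Tuple

USER_ROLES = {"owner", "editor", "viewer"}
EVERYONE_ROLES = {"viewer", "none"}


def permissions_payload(users: Iterable[Tuple[str, str]] = (),
                        groups: Iterable[Tuple[str, str]] = (),
                        everyone: str = "none") -> Dict[str, Any]:
    """Build the permissions body from (email, role) and (group, role) pairs."""
    user_list = list(users)
    group_list = list(groups)
    # Assumption: groups accept the same roles as users.
    for _, role in user_list + group_list:
        if role not in USER_ROLES:
            raise ValueError(f"unknown role: {role}")
    if everyone not in EVERYONE_ROLES:
        raise ValueError(f"everyone must be one of {sorted(EVERYONE_ROLES)}")
    return {
        "users": [{"email": email, "role": role} for email, role in user_list],
        "groups": [{"group": group, "role": role} for group, role in group_list],
        "everyone": everyone,
    }
```

Validating roles locally gives a clearer error than a rejected request, but the server remains the authority on which values are accepted.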

Request Example Bash
curl -X POST 'https://codeocean.com/api/v1/data_assets/{data_asset_id}/permissions' \
-u "${API_SECRET}:" \
-H 'Content-Type: application/json' \
--data-raw '{
  "users": [{"email": "john@codeocean.com", "role":"owner"}],
  "groups": [{"group":"ad-group","role":"viewer"}],
  "everyone": "viewer"
}'
Request Example Python
import os, requests 


headers = {
  "Content-Type": "application/json"
} 


json_data = {
  "users": [{"email": "john@codeocean.com", "role":"owner"}],
  "groups": [{"group":"ad-group","role":"viewer"}],
  "everyone": "viewer"
}

# Replace {data_asset_id} with the Data Asset's ID; the permissions go in the JSON request body.
response = requests.post('https://codeocean.com/api/v1/data_assets/{data_asset_id}/permissions',
json=json_data, headers=headers,
auth=(os.getenv('API_SECRET', ''), ''),
)