Data Asset

Prerequisites

  • Generated Access Token with Datasets scope

  • The Data Asset's ID

You can find the Data Asset's ID below its title.

  • created float64

    Data Asset creation time

  • description string Data Asset description

  • field string

    Field of research

  • id integer

    Metadata id

  • last_used integer

    Time Data Asset was last used in seconds from unix epoch.

  • name string

    Name of the Data Asset

  • size string Size in bytes of the Data Asset.

  • state enum

    Data Asset creation state

    • draft

      Data Asset is still being created.

    • ready

      Data Asset is ready to use.

    • failed

      Data Asset creation failed.

  • source_bucket

    Information on bucket from which Data Asset was created.

    • bucket string

      The original bucket's name.

    • origin enum

      • aws, local, gcp

    • prefix string

      The folder in the S3 bucket from which the Data Asset was created.

  • tags list<string>

    Keywords for searching the Data Asset by.

  • type enum

    Type of the Data Asset.

    • DATA_ASSET_TYPE_DATASET

    • DATA_ASSET_TYPE_RESULT

  • custom_metadata dictionary

    According to custom metadata fields defined by the admin and values that were set by the user.

  • provenance dictionary

    Shows the Data Asset provenance, only relevant for kind = result.

    • commit

      Commit the Data Asset was created from.

    • run_script

      Script the Data Asset was created by.

    • DataAssets

      Data Assets that were used to create the Data Asset.

    • docker_Image

      Docker image used to create the Data Asset.

    • Capsule

      Capsule used to create the Data Asset.
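
As a rough illustration of how the fields above fit together, the sketch below summarizes a Data Asset object held as a plain Python dict. The sample values are invented, `summarize_data_asset` is a hypothetical helper rather than part of the SDK, and the code assumes that `created` is expressed in seconds since the Unix epoch like `last_used`.

```python
from datetime import datetime, timezone

# Hypothetical sample object: field names follow the list above,
# values are invented for illustration only.
sample_data_asset = {
    "name": "Example dataset",
    "description": "Reads from an experiment",
    "state": "ready",
    "created": 1676246400.0,   # assumed to be seconds since the Unix epoch
    "last_used": 1676250000,   # seconds since the Unix epoch
    "size": "2048",            # size in bytes, delivered as a string
    "tags": ["Genomics", "RNA"],
}


def summarize_data_asset(asset: dict) -> dict:
    """Return a small, human-friendly summary of a Data Asset dict."""
    return {
        "name": asset["name"],
        # Only the "ready" state means the Data Asset can be used.
        "is_ready": asset["state"] == "ready",
        "created_at": datetime.fromtimestamp(asset["created"], tz=timezone.utc),
        "last_used_at": datetime.fromtimestamp(asset["last_used"], tz=timezone.utc),
        "size_bytes": int(asset["size"]),
        "tags": list(asset.get("tags", [])),
    }


summary = summarize_data_asset(sample_data_asset)
print(summary["name"], summary["is_ready"], summary["created_at"].isoformat())
```

Converting `size` to an integer and the timestamps to timezone-aware datetimes up front keeps later comparisons and sorting straightforward.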

Create Data Asset

POST https://{domain}/api/v1/data_assets

This API creates Data Assets from either an S3 bucket or the results of a computation.

Prerequisite

Before using this API call, you may require AWS Cloud Credentials configured as Secrets or an Assumable Role.

Request Body

Name Type

  • name* string

  • description* string

  • mount* string

  • tags* list<string>

  • custom_metadata

  • source*

    • aws*

      • bucket* bucket name

      • prefix* path

      • keep_on_external_storage* boolean

      • public* boolean

    • computation

      • id string

      • path path

  • target

    • aws

      • bucket bucket name

      • prefix path
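
The request body described above can be assembled as a plain dictionary before it is sent. The sketch below is only an illustration: `build_create_body` is a hypothetical helper, not part of the SDK, and it assumes that `source` carries exactly one of `aws` or `computation`, since a Data Asset is created from either an S3 bucket or a computation's results.

```python
import json


def build_create_body(name, description, mount, tags, source,
                      custom_metadata=None, target=None):
    """Assemble a Create Data Asset request body as a dict.

    Hypothetical helper. Assumes `source` holds exactly one key,
    either "aws" or "computation".
    """
    if len(source) != 1 or next(iter(source)) not in ("aws", "computation"):
        raise ValueError('source must contain exactly one of "aws" or "computation"')
    body = {
        "name": name,
        "description": description,
        "mount": mount,
        "tags": list(tags),
        "source": source,
    }
    # Optional sections are only included when the caller supplies them.
    if custom_metadata is not None:
        body["custom_metadata"] = custom_metadata
    if target is not None:
        body["target"] = target
    return body


body = build_create_body(
    name="import public AWS bucket with dataset api",
    description="meaningful-c",
    mount="citations",
    tags=["Genomics"],
    source={
        "aws": {
            "public": True,
            "bucket": "codeocean-public-data",
            "prefix": "example_datasets/ATAC/hg38_2bit/",
        }
    },
)
# json.dumps emits real JSON booleans (true/false), not quoted strings.
print(json.dumps(body, indent=2))
```

Serializing with `json.dumps` avoids hand-written quoting mistakes in the payload.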

Scope

Type Permission

  • Data Asset Read & Write

Create a Data Asset from a Public S3 Bucket

Request Example Bash
curl -X POST https://{domain}/api/v1/data_assets \
   -u "cop_d23dasd312" \
   -H "Content-Type: application/json" \
   --data-raw '{
      "name": "import public AWS bucket with dataset api",
      "description": "meaningful-c",
      "mount": "citations",
      "tags": ["Genomics"],
      "source": {
         "aws": {
             "public": true,
             "bucket": "codeocean-public-data",
             "prefix": "example_datasets/ATAC/hg38_2bit/"
         }
      }
   }'
Request Example Python SDK
from codeocean.data_asset import DataAssetParams, Source, AWSS3Source

data_asset_params = DataAssetParams(
    name="Dataset From Bucket",
    description="S3 bucket import",
    mount="my-data",
    tags=["my", "data"],
    source=Source(
        aws=AWSS3Source(
            bucket="codeocean-public-data",
            prefix="example_datasets/ATAC/hg38_2bit/",
            public=True,
        ),
    ),
)

data_asset = client.data_assets.create_data_asset(data_asset_params)

Create a Data Asset from a Private S3 Bucket

Request Example Bash
curl -X POST https://{domain}/api/v1/data_assets \
   -u "cop_d23dasd312" \
   -H "Content-Type: application/json" \
   --data-raw '{
      "name": "import private AWS bucket with Data Asset API",
      "description": "meaningful-c",
      "mount": "citations",
      "tags": ["Genomics"],
      "source": {
         "aws": {
             "bucket": "codeocean-private-data",
             "prefix": "example_datasets/ATAC/hg38_2bit/"
         }
      }
   }'
Request Example Python SDK
from codeocean.data_asset import DataAssetParams, Source, AWSS3Source

data_asset_params = DataAssetParams(
    name="Dataset From Bucket",
    description="S3 bucket import",
    mount="my-data",
    tags=["my", "data"],
    source=Source(
        aws=AWSS3Source(
            bucket="codeocean-private-data",
            prefix="example_datasets/ATAC/hg38_2bit/",
        ),
    ),
)

data_asset = client.data_assets.create_data_asset(data_asset_params)

Create an External Result Data Asset

Request Example Bash
# Alignment/ is a folder in the computation's Results.
curl -X POST "https://codeocean.[my-domain].com/api/v1/data_assets" \
   -H "Content-Type: application/json" \
   -u "cop_d23dasd312" \
   --data-raw '{
       "name": "RNA-Sequencing",
       "description": "these are reads from an experiment",
       "mount": "Reads",
       "tags": ["Genomics", "RNA"],
       "source": {
          "computation": {
              "id": "8f174aed-64ce-43eb-9c16-64d25da84bda",
              "path": "Alignment/"
          }
       },
       "target": {
          "aws": {
              "bucket": "my-bucket",
              "prefix": "deposit/my/results/"
          }
       }
   }'
Request Example Python SDK
from codeocean.data_asset import (
    DataAssetParams,
    Source,
    ComputationSource,
    Target,
    AWSS3Target,
)

data_asset_params = DataAssetParams(
    name="RNA-Sequencing",
    description="these are reads from an experiment",
    mount="Reads",
    tags=["Genomics", "RNA"],
    source=Source(
        computation=ComputationSource(
            id="8f174aed-64ce-43eb-9c16-64d25da84bda",
            path="Alignment/",
        ),
    ),
    target=Target(
        aws=AWSS3Target(
            bucket="my-bucket",
            prefix="deposit/my/results/",
        ),
    ),
)

# `client` is an already-authenticated Code Ocean SDK client.
data_asset = client.data_assets.create_data_asset(data_asset_params)

Create a Result Data Asset

Request Example Bash
curl -X POST "https://codeocean.[my-domain].com/api/v1/data_assets" \
   -H "Content-Type: application/json" \
   -u "cop_d23dasd312" \
   --data-raw '{
       "name": "Data asset From API",
       "description": "An example for creating data asset from CO API",
       "mount": "some-folder",
       "tags": ["keyword1", "keyword2"],
       "source": {
          "computation": {
              "id": "8f174aed-64ce-43eb-9c16-64d25da84bda"
          }
       }
   }'
Request Example Python SDK
from codeocean.data_asset import (
        DataAssetParams, 
        Source,
        ComputationSource
        )
        
data_asset_params = DataAssetParams(
    name="RNA-Sequencing",
    description="these are reads from an experiment",
    mount="Reads",
    tags=["Genomics", "RNA"],
    source=Source(
        computation=ComputationSource(
            id="8f174aed-64ce-43eb-9c16-64d25da84bda",
            path="Alignment/"
        ),
    )
)
    
data_asset = client.data_assets.create_data_asset(data_asset_params)

Create a Data Asset with Custom Metadata Tags

Request Example Bash
# Remove "path" to include all Results.
curl -X POST "https://codeocean.[my-domain].com/api/v1/data_assets" \
   -H "Content-Type: application/json" \
   -u "cop_d23dasd312" \
   --data-raw '{
       "name": "myDatasetFromPublic",
       "description": "a descriptive description",
       "mount": "Mymount",
       "tags": ["t1", "t2"],
       "custom_metadata": {
           "some_field": "one",
           "another_field": 1,
           "dateField": 1676246400
       },
       "source": {
          "computation": {
              "id": "computation_ID",
              "path": "/path/to/folder/"
          }
       }
   }'
Request Example Python SDK
from codeocean.data_asset import (
    DataAssetParams,
    Source,
    ComputationSource,
)

data_asset_params = DataAssetParams(
    name="RNA-Sequencing",
    description="these are reads from an experiment",
    mount="Reads",
    tags=["Genomics", "RNA"],
    # Custom metadata is passed as a keyword argument holding a dictionary.
    custom_metadata={
        "some_field": "one",
        "another_field": 1,
        "dateField": 1676246400,
    },
    source=Source(
        computation=ComputationSource(
            id="8f174aed-64ce-43eb-9c16-64d25da84bda",
            path="Alignment/",
        ),
    ),
)

# `client` is an already-authenticated Code Ocean SDK client.
data_asset = client.data_assets.create_data_asset(data_asset_params)
Response

The API returns only a confirmation that the creation request is valid, not that the creation succeeded, because creation takes time. Poll the Data Asset's details (see Get Data Asset) and monitor its state until it is ready.
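
A minimal polling sketch for the advice above, under stated assumptions: `wait_until_ready` is a hypothetical helper, and `get_state` stands for any zero-argument callable that returns the Data Asset's current state as one of the strings `draft`, `ready`, or `failed` (for example, a thin wrapper around the Get Data Asset call). The timeout and interval values are arbitrary defaults.

```python
import time


def wait_until_ready(get_state, timeout_s=600.0, interval_s=5.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll get_state() until it returns "ready".

    Raises RuntimeError if the state becomes "failed" and TimeoutError
    if the Data Asset is still not ready once timeout_s has elapsed.
    `sleep` and `clock` are injectable so the loop can be tested quickly.
    """
    deadline = clock() + timeout_s
    while True:
        state = get_state()
        if state == "ready":
            return state
        if state == "failed":
            raise RuntimeError("Data Asset creation failed")
        if clock() >= deadline:
            raise TimeoutError("Data Asset was not ready before the timeout")
        sleep(interval_s)


# Demonstration with a stand-in for the real lookup: two "draft"
# responses followed by "ready". Sleeping is disabled for the demo.
_states = iter(["draft", "draft", "ready"])
final_state = wait_until_ready(lambda: next(_states), sleep=lambda _s: None)
print(final_state)
```

Treating `failed` as an error, rather than continuing to poll, keeps a failed creation from silently running into the timeout.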

Get Data Asset

GET https://{domain}/api/v1/data_assets/{data_asset_id}

This API retrieves metadata for your Data Asset.

Path Parameters

Name Type

  • data_asset_id* string

Scope

Type Permission

  • Data Asset Read & Write

Request Example Bash
curl https://{domain}/api/v1/data_assets/4bc97533-6eb4-48ac-966f-648548a756d2 \
   -u "cop_d23dasd312"
Request Example Python SDK
data_asset = client.data_assets.get_data_asset(data_asset_id="4bc97533-6eb4-48ac-966f-648548a756d2")

Update Metadata

PUT https://{domain}/api/v1/data_assets/{data_asset_id}

This API updates the metadata of your Data Asset.

Path Parameters

Name Type

  • data_asset_id* string

Request Body

Name Type

  • name* string

  • description* string

  • tags* list<string>

  • mount* string

  • custom_metadata dictionary

Scope

Type Permission

  • Data Asset Write

Request Example Bash
curl -X PUT "https://codeocean.[my-domain].com/api/v1/data_assets/d36665a7-ef59-4b8e-a799-bee7f83ee317" \
   -H "Content-Type: application/json" \
   -u "cop_d23dasd312" \
   --data-raw '{
        "name": "Modified The Name",
        "description": "a new description from the API!",
        "tags": ["I", "Am", "New"],
        "mount": "NewMount"
   }'
Request Example Python SDK
from codeocean.data_asset import DataAssetUpdateParams

data_asset_params = DataAssetUpdateParams(
    name="Modified The Name",
    description="a new description from the SDK!",
    tags=["I","Am","New"],
    mount="NewMount",
)

data_asset = client.data_assets.update_metadata(
    data_asset_id="4bc97533-6eb4-48ac-966f-648548a756d2",
    update_params=data_asset_params,
)
Response

Archiving/Unarchiving a Dataset

PATCH https://{domain}/api/v1/data_assets/{data_asset_id}/archive?archive={true|false}

This API archives or unarchives your Data Asset.

Path Parameters

Name Type

  • data_asset_id* string

Query Parameters

Name Type

  • archive* boolean

Scope

Type Permission

  • Data Asset Read & Write

Request Example Bash
Archiving a Dataset

curl -X PATCH "https://{domain}/api/v1/data_assets/e25ec103-a712-4882-a9fa-3cd5a80438a8/archive?archive=true" \
   -u "cop_d23dasd312"

Unarchiving a Dataset

curl -X PATCH "https://{domain}/api/v1/data_assets/e25ec103-a712-4882-a9fa-3cd5a80438a8/archive?archive=false" \
   -u "cop_d23dasd312"
Request Example Python SDK
client.data_assets.archive_data_asset(
    data_asset_id="edf1a1df-4e97-4888-9e2a-92bf70e341e8",
    archive=True,
)

Search Data Assets

POST https://{domain}/api/v1/data_assets/search

This API searches the Data Assets in your deployment.

Request Body

Name Type

  • offset* int

  • limit* int

  • sort_order string

  • sort_field string

  • type string

  • ownership* string

  • favorite* boolean

  • archived* boolean

  • query* string

Request Example Bash
curl -X POST https://{domain}/api/v1/data_assets/search \
   -u "cop_d23dasd312" \
   -H "Content-Type: application/json" \
   --data-raw '{
        "offset": 0,
        "limit": 10,
        "sort_order": "desc",
        "sort_field": "name",
        "type": "dataset",
        "ownership": "created",
        "favorite": false,
        "archived": false,
        "query": "tag:bioinforma name:Saccro"
   }'
Request Example Python SDK
from codeocean.data_asset import DataAssetSearchParams

data_asset_params = DataAssetSearchParams(
    limit=10,
    offset=2,
    # Real booleans: a non-empty string such as "false" is truthy in Python.
    archived=False,
    favorite=False,
    query="tag:Bioinforma name:Sequencing",
)

data_assets = client.data_assets.search_data_assets(data_asset_params)
Response

{
    "has_more": true | false,    (indicates whether there are more results)
    "results": [Data Asset Object]
}
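
The `has_more` flag can drive a simple pagination loop. The sketch below is an illustration under assumptions: `iter_all_results` is a hypothetical helper, `search_page` stands for any callable that performs one search request and returns a dict shaped like the response above, and `offset` is assumed to count items rather than pages.

```python
def iter_all_results(search_page, limit=10):
    """Yield every result by requesting pages until has_more is false.

    `search_page(offset=..., limit=...)` is a hypothetical callable that
    returns a dict with "results" and "has_more" keys.
    """
    offset = 0
    while True:
        page = search_page(offset=offset, limit=limit)
        results = page["results"]
        yield from results
        # Stop on the last page, or on an empty page to avoid looping forever.
        if not page["has_more"] or not results:
            return
        offset += len(results)


# Stand-in for the real search call, backed by an in-memory list.
_fake_assets = [{"name": f"asset-{i}"} for i in range(25)]


def fake_search_page(offset, limit):
    chunk = _fake_assets[offset:offset + limit]
    return {"results": chunk, "has_more": offset + limit < len(_fake_assets)}


all_names = [asset["name"] for asset in iter_all_results(fake_search_page, limit=10)]
print(len(all_names))
```

Advancing `offset` by the number of results actually returned, instead of by `limit`, keeps the loop correct even if a page comes back shorter than requested.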

Update Permissions of a Data Asset

POST https://{domain}/api/v1/data_assets/{data_asset_id}/permissions

This API updates the permissions associated with a Data Asset in your deployment.

Path Parameters

Name Type

  • data_asset_id* string

Request Body

Name Type

  • users* array<dict>

  • groups* array<dict>

  • everyone* string

  • share_assets bool

Request Example Bash
curl -X POST https://{domain}/api/v1/data_assets/{data_asset_id}/permissions \
  -u "cop_d23dasd312" \
  -H "Content-Type: application/json" \
  --data-raw '{
       "users": [{"email": "john@codeocean.com", "role": "owner"}],
       "groups": [{"group": "ad-group", "role": "viewer"}],
       "everyone": "viewer",
       "share_assets": true
  }'
Request Example Python SDK
from codeocean.data_asset import (
    Permissions
)
                
from codeocean.components import (
    UserPermissions,
    EveryoneRole,
    UserRole,
)
                
update_permissions = Permissions(
    users=[
        UserPermissions(
            email="jake@codeocean.com",
            role=UserRole(value="owner"),
        ),
    ],
    everyone=EveryoneRole(value="viewer"),
    share_assets=True,
)    

client.data_assets.update_permissions(
    data_asset_id="e25ec103-a712-4882-a9fa-3cd5a80438a8",
    permissions=update_permissions,
)