Data Connectors

Each data connector consists of two Capsules: one that runs a query and writes the results to a data file (Parquet, text, or another tabular format), and one that automatically creates a Data Asset from those results.
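The first half of that pattern (run a query, write a tabular file) can be sketched as follows. This is a minimal illustration and not the code of any connector Capsule: it uses Python's built-in sqlite3 module as a stand-in for a remote database, and the names `run_query_to_csv`, `demo`, `samples`, and `query_result.csv` are hypothetical.

```python
import csv
import sqlite3
from pathlib import Path


def run_query_to_csv(conn, sql, out_path):
    """Run `sql` on `conn` and write the rows, with a header line, to `out_path`.

    Returns the number of data rows written.
    """
    cursor = conn.execute(sql)
    # cursor.description holds one tuple per result column; item 0 is the name.
    columns = [col[0] for col in cursor.description]
    out_path = Path(out_path)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    row_count = 0
    with out_path.open("w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(columns)
        for row in cursor:
            writer.writerow(row)
            row_count += 1
    return row_count


def demo(out_dir):
    """Build a tiny in-memory table (stand-in for a real source) and export a query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE samples (id INTEGER, tissue TEXT)")
    conn.executemany(
        "INSERT INTO samples VALUES (?, ?)",
        [(1, "liver"), (2, "lung"), (3, "liver")],
    )
    sql = "SELECT id, tissue FROM samples WHERE tissue = 'liver' ORDER BY id"
    written = run_query_to_csv(conn, sql, Path(out_dir) / "query_result.csv")
    conn.close()
    return written
```

Calling `demo("results")` would write `results/query_result.csv`. In the two-Capsule design described above, a file like this is what the second Capsule would turn into a Data Asset.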

Each entry below gives the Capsule's title, a description, and its required input data.

Databricks - Data Connector

Query a Databricks metastore to pull data using SQL.

  • None

Databricks - Data Asset Generation

Create a Data Asset from a SQL query selecting from a Databricks metastore.

  • None

BigQuery - Data

This Capsule connects to Google Cloud's BigQuery, runs a query, and downloads the results as a file.

  • None

BigQuery - Data Asset Generator

A Data Asset generator for Google BigQuery.

  • None

MySQL - Data Connector

This Capsule will query a MySQL database and output a file containing the requested data.

  • None

MySQL - Data Asset Generation

This Capsule will run the MySQL - Data Connector capsule to query a MySQL database and automatically create a Data Asset from the result.

  • None

Snowflake - Data Connector

This Capsule will query a Snowflake database and output a file containing the requested data.

  • None

Snowflake - Data Asset Generation

This Capsule will run the Snowflake - Data Connector capsule to query a Snowflake database and automatically create a Data Asset from the result.

  • None

AWS Athena (Glue) Data Connector

This Capsule will submit a SQL query using Athena and will output a result file for use within Code Ocean.

  • None

AWS Athena (Glue) Data Asset Generation

This Capsule performs an AWS Athena query by calling an external Capsule (the AWS Glue Athena connector) and generates a Data Asset from the output.

  • None

Redshift Data Connector

This Capsule will query a Redshift database and output a file containing the requested data.

  • None

Redshift Data Connector - Data Asset Generation

This Capsule will run the Redshift Data Connector capsule to query a Redshift database and automatically create a Data Asset from the result.

  • None

Fetch data with ffq

Pulls data from GEO, SRA, EMBL-EBI, DDBJ, or BioSample by accession number.

  • None

Download data from BaseSpace

Downloads demultiplexed (fastq.gz) or raw (bcl) Illumina sequencing data through the Illumina BaseSpace CLI. This Capsule requires a BaseSpace account and NGS data owned by or shared with the user.

  • None

Publishing data to Tableau

This Capsule provides a Streamlit-based interface that simplifies interactions with Tableau. It helps publish new data sources to Tableau Cloud from local spreadsheet files.

  • .csv or .tsv files to upload to Tableau.
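
Before uploading, it can help to confirm that an input file parses as expected. The sketch below is an assumption about how such a check might look, not part of the Capsule itself: it reads a .csv or .tsv file with Python's standard csv module and picks the delimiter from the file extension. The names `DELIMITERS` and `read_spreadsheet` are hypothetical.

```python
import csv
from pathlib import Path

# Map each supported extension to its field delimiter.
DELIMITERS = {".csv": ",", ".tsv": "\t"}


def read_spreadsheet(path):
    """Return (header, rows) for a .csv or .tsv file.

    Raises ValueError for unsupported extensions or empty files.
    """
    path = Path(path)
    suffix = path.suffix.lower()
    if suffix not in DELIMITERS:
        raise ValueError(f"unsupported file type: {path.name}")
    with path.open(newline="") as handle:
        rows = list(csv.reader(handle, delimiter=DELIMITERS[suffix]))
    if not rows:
        raise ValueError(f"{path.name} is empty")
    return rows[0], rows[1:]
```

The first returned value is the header row and the second is the list of data rows, so a quick look at both shows whether the columns were split correctly.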