FAQ—General

How do I revert a commit?

The compute capsule Workbench editor currently does not offer Git features beyond committing and diffing. Advanced Git operations such as revert, log, and remote can be run from the command line in a Cloud Workstation session.
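As a minimal sketch of what this looks like in a Cloud Workstation terminal, the helper below wraps the standard `git log` and `git revert` commands. The function name `revert_commit` is hypothetical and only for illustration; it assumes you run it from inside the capsule's Git repository.

```shell
# revert_commit is a hypothetical helper name, not a Code Ocean command.
# Usage: revert_commit <commit>   (run inside the capsule's Git repository)
revert_commit() {
    # Show the last few commits for context.
    git log --oneline -n 5
    # Add a new commit that undoes <commit>; existing history is not rewritten.
    git revert --no-edit "$1"
}
```

For example, `revert_commit HEAD` undoes the most recent commit, while passing a commit hash picked from the log output targets an earlier one.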

What happens if I close my browser tab during a Reproducible Run?

Reproducible Runs are designed to continue even after the browser tab is closed. To see which capsules are running, go to the capsule dashboard and look for capsules with a running status. You can click back into a running capsule to check its progress.

What happens if I close my browser tab during a Cloud Workstation development session?

Cloud Workstation sessions end after 2 hours of idleness (CPU usage below 2%). If your script is still running, you can get back into the session from the dashboard in the same way as described above.

How does the EC2 machine turn on and shut down?

It depends on the machine type chosen during setup (see Compute Resources for more details).

For the standard machine type, the system turns on a new EC2 machine when there are not enough free compute slots for the current computation, and shuts the machine down after it has been idle for an hour (or for the time specified in the deployment plan).

For the dedicated machine type, the system turns on a new EC2 machine dedicated to that computation and shuts it down right after the computation completes.

How does resource allocation work on the worker machines?

It depends on the machine type chosen during setup (see Compute Resources for more details).

For the standard machine type, Code Ocean reserves 1 GB of memory for the underlying processes and containers (i.e., the worker proxy, worker runner, EC2 agent, etc.). Taking this reservation into account, we calculate the amount of memory each slot gets, so on a 64 GB worker machine each slot gets a bit less than 8 GB ((64-1)/8). We place a memory limit on each computation container according to the number of slots it occupies, so a computation cannot use memory reserved for other computations or for the underlying processes. This limit is also the basis for the usage displayed in the Cloud Workstation: when usage hits 100%, you will get an "Out Of Memory" error, even if there is free memory elsewhere on the worker machine.

For dedicated machines, Code Ocean reserves only 200 MB for the underlying processes, and the rest is made available to the computation.
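The slot arithmetic above can be sketched as a small shell function. The function name `slot_memory_gb` is hypothetical, the 8-slot count for a 64 GB standard machine is taken from the example in the text, and the dedicated case assumes the same 64 GB machine with 200 MB treated as 0.2 GB; other machine sizes may be divided differently.

```shell
# slot_memory_gb TOTAL_GB RESERVED_GB SLOTS
# Prints the memory (in GB) available to each slot after the reservation.
# Hypothetical helper for illustration; not part of Code Ocean.
slot_memory_gb() {
    awk -v total="$1" -v reserved="$2" -v slots="$3" \
        'BEGIN { printf "%.3f\n", (total - reserved) / slots }'
}

slot_memory_gb 64 1 8      # standard machine from the text: prints 7.875
slot_memory_gb 64 0.2 1    # dedicated machine (assumed 64 GB): prints 63.800
```

A computation that occupies two slots on the standard machine would therefore be limited to roughly twice the first figure.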

What is the limit of Git file and repository size?

Each Code Ocean capsule contains a Git repository, which is subject to maximum file and repository size limits. The total size of the repository must be less than 2 GB and no individual file can be larger than 100 MB.

Your larger files, which tend to be the data, can be placed under the data folder, which is automatically added to .gitignore. This has the effect of hiding these files from Git, thereby bypassing any size limits. (Alternatively, you can also manually create a .gitignore file, and specify particular folders that should be hidden from Git.)

These limits also apply when you import an existing repository into Code Ocean. If you run into a size limit error when trying to import a repository, please talk to your system administrator or reach out to the Code Ocean support team; we will be happy to help.
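Before committing or importing, it can help to check for oversized files locally. The sketch below lists files above a byte threshold using POSIX `find`. The function name `find_large_files` is hypothetical, and the default threshold assumes that 100 MB means 100 × 1024 × 1024 bytes, which the text does not specify.

```shell
# find_large_files [DIR] [LIMIT_BYTES]
# Lists files larger than LIMIT_BYTES under DIR, skipping Git's own .git directory.
# Hypothetical helper for illustration; not part of Code Ocean.
find_large_files() {
    # Assumption: the 100 MB default is taken as 104857600 bytes.
    find "${1:-.}" -name .git -prune -o -type f -size +"${2:-104857600}c" -print
}
```

Running `find_large_files .` from the capsule root prints one path per oversized file. Files under the data folder are listed too, although Git already ignores them; anything else on the list is a candidate for the data folder or for .gitignore.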

Why are there these limits?

As Linus Torvalds puts it, "Git fundamentally never really looks at less than the whole repo. Even if you limit things a bit (ie check out just a portion, or have the history go back just a bit), git ends up still always caring about the whole thing, and carrying the knowledge around.

So git scales really badly if you force it to look at everything as one huge repository."

How do I export capsules and reproduce results on my local machine?

Code Ocean allows authors and readers to download an entire compute capsule. From the capsule's menu, click the Capsule tab and select Export.

This will open a download screen where you can download the environment template, metadata, code, and, optionally, the data. When you unzip the download, you will see something like the following:

  • REPRODUCING.md contains specific instructions for how to reproduce the capsule's results locally, with notes on the necessary prerequisites and commands. If you have downloaded a published capsule, this document will point to the preserved Docker image in our registry.

  • /code has your capsule's code, and /data has your capsule's data.

  • /metadata has a file called metadata.yml that will look something like this (for an unpublished capsule):

metadata_version: 1
name: Cape Feare
authors:
- name: Sideshow Bob
  affiliations:
  - name: The Krusty the Clown Show
corresponding_contributor:
  name: Sideshow Bob

Published capsules will have all of the information in the corresponding capsule's metadata.

  • The /environment folder contains, at a minimum, a file called Dockerfile. If you've employed a postInstall script, you will see a postInstall file as well. Other files may appear if, for instance, you use an additional PPA.

    • Dockerfile is the recipe for rebuilding your capsule's computational environment locally. Each will begin with a line like:

FROM registry.org_name.codeocean.com/codeocean/r-studio:1.2.5019-r4.0.3-ubuntu18.04

This tells Docker which base image to pull and where to pull it from. If the environment has been customized further, there will be more commands such as:

ARG DEBIAN_FRONTEND=noninteractive

RUN apt-get update \
    && apt-get install -y --no-install-recommends \
      "curl=7.47.0-1ubuntu2.2" \
      "gcc=4:5.3.1-1ubuntu1" \
      "libnlopt-dev=2.4.2+dfsg-2" \
      "pandoc=1.16.0.2~dfsg-1" \
    && rm -rf /var/lib/apt/lists/*

and so on.

Reproducing your results locally is likely to be less user-friendly than reproducing results on Code Ocean. Docker requires some familiarity with the command line.

Here are some helpful resources related to Docker:

Why do I need to install everything when I add a new package?

Code Ocean uses Docker as the underlying technology for establishing the capsule's computing environment. The system detects changes to the environment (i.e., changes in the Dockerfile) to determine whether to trigger a rebuild. If the environment has not been modified, it uses the cached information from the previous build to skip the build phase.

As noted above, any edit made in the environment folder changes the Dockerfile, which in turn can result in a complete rebuild of the Docker image during a subsequent computation session. Since adding a new package changes the Dockerfile, the system triggers a rebuild and reinstalls everything.

How can I debug a Shiny app from a Cloud Workstation in Code Ocean?

Shiny is an RStudio project, so, per its documentation, you can run your Shiny app from the RStudio Cloud Workstation. You can then debug it by checking the logs in RStudio's built-in terminal.

How do I display images in a Markdown file on Code Ocean?

This is a general Markdown question rather than one specific to Code Ocean. Appending ?raw=true to the image URL does the trick. For example, ![](./image.png?raw=true) displays image.png in the Markdown file, provided image.png and the Markdown file are in the same directory.

The same trick works in Code Ocean. See the discussion on Stack Overflow.

What do I do if a file is too large to be committed to the data directory?

You can add it to .gitignore, since committing large data files to Git is generally not good practice. See the .gitignore section for more information.

Can you leave a comment on a commit after it has already been committed?

Not from the UI. However, you can amend the commit message in the terminal from a Cloud Workstation.
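A minimal sketch of doing this from a Cloud Workstation terminal, using the standard `git commit --amend` command. The helper name `amend_message` is hypothetical. Note that this only changes the most recent commit, and it rewrites that commit, so it is best used before the commit has been shared.

```shell
# amend_message is a hypothetical helper name, not a Code Ocean command.
# Usage: amend_message "New commit message"   (run inside the capsule's Git repository)
amend_message() {
    # Replace the message of the most recent commit; the commit gets a new hash.
    # Any staged changes are folded into the amended commit, so check `git status` first.
    git commit --amend -m "$1"
}
```

For example, `amend_message "Fix typo in preprocessing script"` replaces the message of the latest commit without adding a new entry to the history.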

Can I run workflow languages (e.g., Nextflow) in Code Ocean?

Yes. Capsules run on a Linux system, so you can install any workflow packages you require. You do need to know how to configure the path to the workflow's master script. We encourage you to look for example capsules on the explore page of our public platform.

For Nextflow in particular, here are links to example capsules:

How do I collect logs of failed runs?

You can download the logs of a computation run, which makes it easier to share error messages with the Code Ocean support team for troubleshooting. To do this:

  1. From the Timeline, find the relevant failed run and select Download Error Log from the dropdown list.

  2. Click Save.

  3. Email the Error Log to your admin or to the Code Ocean support team (support@codeocean.com).

Plotting issue with R 4.1.0

RStudio sometimes crashes when saving a plot. This is a known issue and may be caused by an incompatibility between the R and RStudio versions. You can solve it by adding the following to the postInstall script:

# Download and install a newer RStudio Server build
wget https://download2.rstudio.org/server/bionic/amd64/rstudio-server-2021.09.0-351-amd64.deb
dpkg -i rstudio-server-2021.09.0-351-amd64.deb

Where should I save large files and/or intermediate files?

For large files, your best bet is to turn them into a dataset and attach it to the capsule. This works both in the capsule and in a Cloud Workstation. Please check the Data Assets Guide for more information.

For intermediate files, we recommend saving to the scratch folder when working in a Cloud Workstation. Files there do not count toward the capsule's size, which also helps performance.
