Reproducible Runs

Before you can perform a Reproducible Run from the Capsule IDE, the environment must be built and the run file must specify which script the Capsule should run.

Run File

The run file is a bash script that is often referred to as the driver script. The driver script generates all desired results in an automated way without any human intervention. This way, the results are less prone to variations from human input. When executing a Reproducible Run, the run file will be executed end to end.
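As a rough illustration of the idea, a driver script might look like the sketch below. The step script name (analyze.sh), the output file, and the temporary demo directory are hypothetical stand-ins rather than anything a Capsule provides; the point is only that every step runs in order, without prompts, and the script stops at the first failure.

```shell
#!/usr/bin/env bash
# Stop at the first failing command, on unset variables, and on pipeline errors.
set -euo pipefail

# Hypothetical demo workspace standing in for a Capsule; a real run file would not need this.
workdir="$(mktemp -d)"
mkdir -p "$workdir/code" "$workdir/results"

# Hypothetical analysis step, created here only so the sketch is self-contained.
cat > "$workdir/code/analyze.sh" <<'EOF'
#!/usr/bin/env bash
set -euo pipefail
echo "mean=2" > ../results/summary.txt
EOF

# The driver part: run each step from the code folder, in order, with no prompts.
cd "$workdir/code"
bash analyze.sh
echo "results written: $(ls ../results)"
```

Because of set -e, a failing step aborts the whole run instead of silently producing partial results.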

Setting up Run File

  1. Hover over the first script to run in the Capsule

  2. Click the dropdown menu that appears on it

  3. Select Set as File to Run

You'll then get a 'run' script that executes your main script. This file is a shell script and can be modified as you see fit to run your analyses.

Properties of a good Run File

First, your run script should run headlessly, meaning that it does not require user input, nor expect a pop-up display, during runtime.

Second, the run script should perform a very small number of actions, such as pointing to another script in the Capsule.
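The two properties above can be sketched as follows. This is a minimal example under stated assumptions: the main script is a hypothetical stand-in generated on the fly, and MPLBACKEND only matters if your code uses matplotlib. It is not the file that the Capsule generates for you.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Ask matplotlib (if it is used) for a non-interactive backend so no window is opened.
export MPLBACKEND=Agg

# Hypothetical main script, generated here only to keep the sketch runnable.
main_script="$(mktemp)"
cat > "$main_script" <<'EOF'
# With stdin closed off, read hits end-of-file at once instead of waiting for a user.
if read -r answer; then
  echo "got input: $answer"
else
  echo "no input available, using defaults"
fi
EOF

# The run file itself does one thing: call the main script with no interactive input.
run_output="$(bash "$main_script" < /dev/null)"
echo "$run_output"
```

Redirecting standard input from /dev/null is one way to make an accidental prompt fall through immediately rather than hang the run.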

Executing a Reproducible Run

The Reproducible Run button is located at the top of the Reproducibility panel, above the Timeline.

Once you click it, the system prepares the machine to run your computation. The file is temporarily locked, and the Reproducibility panel shows the computation progress information.

You can stop the run by clicking the Stop Run button. For more information, see Stop Computation after Run Step. When the run is complete, you are returned to the Timeline, where the results of the run are displayed whether or not it was successful.

To view the run details from the Reproducibility panel, click Run Details in the dropdown menu.

Closing the browser tab during a run

Reproducible Runs are designed to persist even after the browser tab is closed. To review the Capsules you have been running, navigate to the My Capsule dashboard and look for those with a running status. You can click back into a running Capsule to observe the ongoing process.

Availability of Files during Reproducible Run

Folder       | Available in Reproducible Run | Path in Computation       | Alternative Path in Computation
-------------|-------------------------------|---------------------------|--------------------------------
Metadata     |                               | /root/capsule/metadata    | -
Environment  |                               | /root/capsule/environment | -
Code         |                               | /root/capsule/code        | /code
Data         |                               | /root/capsule/data        | /data
Results      |                               | /root/capsule/results     | /results
Scratch (CW) |                               | /root/capsule/scratch     | /scratch
Scratch (RR) |                               | /root/capsule/scratch     | /scratch
CW Root FS   |                               | /                         | -
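To see which of these folders are present in a given computation, a small check like the one below can be placed in a script. It is only a sketch: report_path is a hypothetical helper name, and the loop uses the alternative paths listed in this section.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: print whether a directory exists at the given path.
report_path() {
  if [ -d "$1" ]; then
    echo "$1: available"
  else
    echo "$1: missing"
  fi
}

# Alternative (short) paths listed in this section.
for dir in /code /data /results /scratch; do
  report_path "$dir"
done
```

Outside a Capsule, most of these paths will simply be reported as missing.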

Referring to Files During a Run

It is best to use relative paths where possible, for example, ../data and ../results (relative to the /code working directory). This increases compatibility if the Capsule is exported to run locally.

The initial working directory is /code when your code is run. You will rarely need to refer to it explicitly if you use relative paths.

If you have code in a subfolder, add .. where necessary, for example, load('../../data/my_data.csv'). From a working directory one level below /code, this goes up two parent directories and then looks for data/my_data.csv.
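The way these relative paths resolve can be checked with a small experiment. The sketch below builds a hypothetical stand-in for the folder layout in a temporary directory (the analysis subfolder name is an assumption) and resolves the data folder from two different working directories.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical stand-in for the Capsule layout, built in a temp dir so the sketch runs anywhere.
root="$(cd "$(mktemp -d)" && pwd -P)"
mkdir -p "$root/code/analysis" "$root/data" "$root/results"

# From the code folder, one level up reaches the sibling data folder.
cd "$root/code"
from_code="$(cd ../data && pwd -P)"

# From a subfolder of code, two levels up are needed.
cd "$root/code/analysis"
from_subfolder="$(cd ../../data && pwd -P)"

echo "from code:      $from_code"
echo "from subfolder: $from_subfolder"
```

Both lines print the same data folder, which is why the number of .. segments depends on the directory the code is run from.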

Concurrent Runs

From the computation progress view, you can click Back to Timeline and execute another run concurrently.

When two computations are running at the same time, you can click View Run Details on either one to return to its computation progress information.

Concurrent runs can be on either Flex or Dedicated machines.

The output displayed after a run will be for the second of the two concurrent runs. The run output can be switched between concurrent runs in the console by clicking View Output.

Stop Computation after Run Step

A computation can be stopped during a post-run step, for example, results collection during a Reproducible Run or syncing back to the Capsule after a Cloud Workstation session. Stopping the computation can reduce the time spent waiting for potentially unnecessary steps to complete (e.g. collecting results that aren’t needed).

A warning is issued before the action is performed. For a Reproducible Run, the warning is that the results will be lost. For a Cloud Workstation, the warning is that any changes made to the Capsule and environment will be lost.