Backup and restore

Learn about Code Ocean VPC backup and restore capabilities.

Overview

Code Ocean VPC stores persistent data that needs to be backed up in the following storage components:

  • EBS data volume

  • S3 buckets

  • RDS analytics DB

The system comes with a default backup plan that includes: EBS & RDS daily snapshots using AWS Backup, S3 bucket versioning, and a 14-day retention period. The retention period determines how many days EBS & RDS snapshots are kept in the AWS Backup vault and how many days non-current (including deleted) S3 object versions are kept in S3 buckets that store persistent data.

You can configure the backup frequency (a cron expression) to meet your RPO (recovery point objective) requirements, and the retention period (in days), through the CloudFormation template parameters.
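As a rough illustration of how a backup frequency maps to an RPO, the sketch below builds an AWS-style cron expression for a given interval in hours. The helper name backup_cron_every_hours is hypothetical, and whether the template parameter expects the cron(...) wrapper shown here is an assumption, so check the parameter description in your CloudFormation template before using the value.

```shell
# Build an AWS-style cron expression that fires every N hours, on the hour (UTC).
# AWS cron fields: minutes hours day-of-month month day-of-week year.
# Hypothetical helper for illustration; not part of Code Ocean VPC.
backup_cron_every_hours() {
    hours=$1
    case "$hours" in
        1|2|3|4|6|8|12)
            # "0/N" in the hours field means: start at hour 0, repeat every N hours
            printf 'cron(0 0/%s ? * * *)\n' "$hours"
            ;;
        24)
            # Once a day at 00:00 UTC
            printf 'cron(0 0 ? * * *)\n'
            ;;
        *)
            echo "unsupported interval: $hours (use a divisor of 24)" >&2
            return 1
            ;;
    esac
}

# A backup every 6 hours keeps the worst-case data loss window at roughly 6 hours.
backup_cron_every_hours 6
```

Shorter intervals lower the RPO at the cost of more snapshots held in the vault for the duration of the retention period.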

Disaster recovery

As part of a disaster recovery plan, you can protect production data on your Code Ocean VPC further by configuring automated backups to a different AWS account and/or region.

We provide a separate Code Ocean VPC CloudFormation backup stack to receive EBS & RDS snapshots and S3 replication from a production Code Ocean VPC deployment. The backup stack supports backup of a single Code Ocean VPC from VPC version 2.19.

The latest CloudFormation backup stack template version is available here: https://codeocean-vpc.s3.amazonaws.com/templates/codeocean-backup/v1.2.0/codeocean-backup.template.yaml

Note that newer versions of Code Ocean VPC may require an upgraded backup stack to support new backup features. The Code Ocean VPC release notes state the minimum required backup stack version.

Backup configuration

Prerequisites

To use cross-account backup, you must first enable the Cross-account backup feature in the master/management account in your AWS organization.

  1. Log in using your AWS Organizations management account credentials. Cross-account backup can only be enabled or disabled using these credentials.

  2. Open the AWS Backup console at https://console.aws.amazon.com/backup.

  3. In My account, choose Settings.

  4. For Cross-account backup, click Turn-on.

More information can be found in the AWS documentation.
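The same setting can also be changed from the AWS CLI. The following is a minimal sketch, assuming the AWS CLI is configured with credentials for the AWS Organizations management account. The run helper and the DRY_RUN variable are conventions of this example only: by default the commands are printed rather than executed.

```shell
# Print each command when DRY_RUN=1 (the default); execute it when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

enable_cross_account_backup() {
    # Turn on the organization-wide cross-account backup setting
    run aws backup update-global-settings \
        --global-settings isCrossAccountBackupEnabled=true
    # Read the settings back to confirm the change
    run aws backup describe-global-settings
}

enable_cross_account_backup
```

Set DRY_RUN=0 to apply the change once the printed commands look right.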

Configuration steps

  1. Deploy the Code Ocean VPC CloudFormation backup stack in your backup destination AWS account/region.

    • Specify the Code Ocean VPC source AWS account ID.

    • You can configure the Backup Retention Period parameter as needed. It determines how many days non-current (including deleted) S3 object versions are kept in the S3 backup buckets.

  2. Update the source Code Ocean VPC CloudFormation stack with the following Backup Configuration parameters:

    • Destination Backup Vault ARN (e.g. arn:aws:backup:us-east-1:0123456789:backup-vault:destination-stack-v1).

    • Destination Backup S3 KMS Key. This would be the AWS managed S3 KMS key alias ARN in the destination AWS account (e.g. arn:aws:kms:us-east-1:0123456789:alias/aws/s3).

    • Destination Backup S3 Storage Class: GLACIER_IR. S3 Glacier Instant Retrieval delivers the fastest access to cheaper archive storage, which will allow you to restore a Code Ocean VPC without waiting to restore all S3 objects in the S3 backup buckets.

    • Destination Backup S3 Bucket ARNs (Capsules, Datasets, etc.)

  3. Wait for the stack update to complete.

  4. Replicate existing S3 objects from source to destination:

    1. Start an SSM session on the source deployment services instance

    2. Generate and copy the S3 batch replication commands:

      sudo su -
      co-admin generate-backup-s3-replication-commands

    3. Start an AWS Console Cloud Shell, paste the commands into the shell and execute.

    4. You can track the progress of the S3 batch replication jobs in AWS S3 console under Batch Operations. Note that S3 reports status Failed for batch jobs that try to replicate empty S3 buckets with no objects in them. These errors can be safely ignored.
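Mistyped account IDs and ARNs are an easy way to end up with a failed stack update. The sketch below shows one way to sanity-check values before entering them as stack parameters. The function names are hypothetical, the patterns only cover the standard "aws" partition, and real AWS account IDs are 12 digits long, so the shortened placeholder IDs used in the examples above are flagged as invalid.

```shell
# Hypothetical validators for illustration; they check shape only, not existence.
is_account_id() {
    printf '%s\n' "$1" | grep -Eq '^[0-9]{12}$'
}

is_backup_vault_arn() {
    printf '%s\n' "$1" |
        grep -Eq '^arn:aws:backup:[a-z0-9-]+:[0-9]{12}:backup-vault:[A-Za-z0-9_-]{2,50}$'
}

is_s3_bucket_arn() {
    printf '%s\n' "$1" |
        grep -Eq '^arn:aws:s3:::[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$'
}

# Report whether a value passes the given validator.
check() {
    if "$1" "$2"; then
        echo "ok       $2"
    else
        echo "INVALID  $2"
    fi
}

# Example values below are made up.
check is_backup_vault_arn "arn:aws:backup:us-east-1:012345678901:backup-vault:destination-stack-v1"
check is_s3_bucket_arn "arn:aws:s3:::my-backup-capsules"
check is_account_id "0123456789"   # only 10 digits, so this is reported as INVALID
```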

Automatic backup

After completing the above backup setup, Code Ocean VPC will automatically copy EBS & RDS snapshots, as they are created, to the destination backup stack, and continuously replicate S3 data into the destination S3 backup buckets in the backup stack.
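One way to confirm that snapshot copies are arriving is to list the recovery points in the destination vault from the backup account. This is a minimal sketch: the vault name is a made-up example, and the run helper with its DRY_RUN variable is a convention of this example that prints commands by default instead of executing them.

```shell
# Print each command when DRY_RUN=1 (the default); execute it when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# List recovery points of one resource type (EBS or RDS) in a backup vault.
list_recovery_points() {
    vault=$1
    resource_type=$2
    run aws backup list-recovery-points-by-backup-vault \
        --backup-vault-name "$vault" \
        --by-resource-type "$resource_type" \
        --output table
}

# Hypothetical vault name; use the vault created by your backup stack.
list_recovery_points destination-stack-v1 EBS
list_recovery_points destination-stack-v1 RDS
```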

Manual backup

The following steps describe the process to generate on-demand snapshots of the system’s EBS data volume and the RDS analytics DB. This will be required during a system migration process to take a final snapshot of a current system. It is also recommended to generate a backup before upgrading the system to allow roll back to a previous state.

EBS data volume

  1. Using the AWS console, navigate to AWS Backup > My account > Protected resources and click on Create on-demand backup.

  2. Fill in the following fields:

    1. Resource type: EBS

    2. Volume ID: [Code Ocean deployment EBS volume ID]

    3. Total retention period: 1 day (or more as required)

    4. Backup vault: [Code Ocean deployment backup vault]

    5. IAM role: [Code Ocean deployment BackupRole]

  3. Click on Create on-demand backup.

RDS analytics DB

  1. Using the AWS console, navigate to AWS Backup > My account > Protected resources and click on Create on-demand backup.

  2. Fill in the following fields:

    1. Resource type: RDS

    2. Database name: [Code Ocean deployment RDS analytics DB name]

    3. Total retention period: 1 day (or more as required)

    4. Backup vault: [Code Ocean deployment backup vault]

    5. IAM role: [Code Ocean deployment BackupRole]

  3. Click on Create on-demand backup.

If the Code Ocean VPC is configured with backup to a destination backup vault, the above generated snapshots will be automatically copied into the destination backup vault.
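The console steps above can also be approximated with the AWS CLI. The following is a hedged sketch rather than the documented procedure: the vault name, resource ARNs and role ARN are made-up placeholders, and the run helper with its DRY_RUN variable is a convention of this example that prints the command by default instead of executing it.

```shell
# Print each command when DRY_RUN=1 (the default); execute it when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Start an on-demand AWS Backup job.
# Arguments: vault name, resource ARN, IAM role ARN, retention in days (default 1).
start_on_demand_backup() {
    vault=$1
    resource_arn=$2
    role_arn=$3
    retention_days=${4:-1}
    run aws backup start-backup-job \
        --backup-vault-name "$vault" \
        --resource-arn "$resource_arn" \
        --iam-role-arn "$role_arn" \
        --lifecycle "DeleteAfterDays=$retention_days"
}

# Placeholder values: replace them with those of your deployment.
ROLE=arn:aws:iam::012345678901:role/BackupRole

# EBS data volume
start_on_demand_backup my-vault \
    arn:aws:ec2:us-east-1:012345678901:volume/vol-0123456789abcdef0 "$ROLE"

# RDS analytics DB
start_on_demand_backup my-vault \
    arn:aws:rds:us-east-1:012345678901:db:my-analytics-db "$ROLE"
```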

Restore from backup

During disaster recovery, you can restore your Code Ocean VPC production deployment by provisioning another instance of Code Ocean VPC and performing a restore from the backup stack.

  1. Deploy a new instance of Code Ocean VPC into the new account/region. Make sure to use the same version as the source Code Ocean VPC deployment you are restoring from.

    • Specify the backup AWS account ID under the Restore Configuration > Restore source account ID CloudFormation stack parameter.

    • Copy any other stack configuration parameter from the source deployment as needed.

  2. Wait for the stack creation to complete.

  3. SSM into the Code Ocean VPC services instance in the new deployment and stop the codeocean systemd service:

    sudo su -
    systemctl stop codeocean

    This prevents the system from running cleanup maintenance jobs while S3 data is being replicated into it. Such jobs would otherwise delete the replicated S3 data.

  4. Update the Code Ocean VPC CloudFormation backup stack with the following Restore Configuration parameters:

    1. Destination Restore S3 KMS Key. This would be the AWS managed S3 KMS key alias ARN in the new deployment AWS account (e.g. arn:aws:kms:us-east-1:0123456789:alias/aws/s3).

    2. Destination Restore S3 Bucket ARNs (Capsules, Datasets, etc.). These would be the S3 bucket ARNs in the new deployment.

  5. Wait for the stack update to complete.

  6. Replicate S3 data from backup to the new deployment:

    1. Start an SSM session on the new deployment's services instance

    2. Generate and copy the S3 replication commands:

      sudo su -
      co-admin generate-restore-s3-replication-commands

      Fill in all required command line arguments, including the Backup Account ID, the IAM S3 Restore Role ARN, and the various bucket ARNs, all of which you can find in the backup CloudFormation stack Resources list. You will run the generated AWS CLI commands in the backup account in the next step.

    3. Start an AWS Console Cloud Shell in the backup account, paste the commands into the shell and execute

    4. You can track the progress of the S3 batch replication jobs in the backup account AWS S3 console under Batch Operations. Note that S3 reports status Failed for batch jobs that try to replicate empty S3 buckets with no objects in them. These errors can be safely ignored.

  7. Select the EBS & RDS snapshots to restore:

    1. In the backup account, open the AWS Backup console > Backup vaults > click the backup stack vault name.

    2. Under Recovery points click on the matching EBS & RDS snapshots you'd like to restore and copy their ARNs.

  8. Copy EBS & RDS snapshots from backup to the new deployment:

    1. Start an SSM session on the new deployment's services instance.

    2. Generate and copy the snapshot copy commands:

      sudo su -
      co-admin generate-restore-snapshot-copy-commands

      Fill in all required command line arguments, including the snapshot ARNs and the backup vault name. For the IAM role ARN, you can use the AWS Backup default service role (e.g. arn:aws:iam::0123456789:role/service-role/AWSBackupDefaultServiceRole). You will run the generated AWS CLI commands in the backup account in the next step.

    3. Start an AWS Console Cloud Shell in the backup account, paste the commands into the shell and execute.

    4. You can track the progress of the snapshot copy jobs in the backup account AWS Backup console under My account > Jobs.

  9. Wait for the snapshot copy jobs to complete. The snapshots should appear in the new deployment backup vault under Recovery points.

  10. Restore a temporary EBS volume from the EBS snapshot:

    1. In the AWS console of the new deployment navigate to AWS Backup > Backup vaults

    2. Click on the backup vault name

    3. Under the Recovery points section, click on the EBS snapshot to restore

    4. Copy aside the Creation time of the EBS snapshot

    5. Click on Restore

    6. Fill in the following fields:

      1. Volume type: gp3

      2. Size: [match the size of the existing data volume]

      3. Availability zone: [choose AZ]

      4. Throughput: [match the throughput of the existing data volume]

      5. Encryption: change KMS key to use the default aws/ebs key (copy the key ID from the AWS KMS console > AWS managed keys > aws/ebs)

    7. Click on Restore backup

    8. Wait until the restore completes

    9. Navigate to AWS Backup > My account > Jobs > Restore jobs

    10. Copy aside the EBS Volume ID from the EBS restore job

  11. Restore a temporary RDS DB instance from the RDS snapshot:

    1. Navigate back to the AWS backup vault Recovery points and click on the RDS snapshot to restore

    2. Click on Restore

    3. Fill in the following fields:

      1. instance type: db.t3.micro

      2. storage class: gp3

      3. Availability and durability: Do not create a standby instance

      4. DB Instance Identifier: tmp-codeocean-restore

      5. VPC: [choose the Code Ocean deployment VPC]

      6. DB parameter group: [choose the Code Ocean DB parameter group]

      7. IAM DB Authentication Enabled: Disable

    4. Click on Restore backup

    5. Wait until the restore completes

    6. Find and copy aside the current RDS analytics DB master password in AWS Console > Secrets Manager > Secrets > /[codeocean-stack-name]/analytics/master-password-0 > Retrieve secret value

    7. Navigate to the AWS RDS console > Databases and click on the tmp-codeocean-restore DB

    8. Click Modify

    9. Update the master password using the master password you copied above

    10. Update the security group to the current Code Ocean deployment analytics DB security group

    11. Click Continue

    12. Select Apply immediately

    13. Click Modify DB instance

    14. Wait until the update completes

    15. Copy aside the endpoints for both the current RDS analytics DB and the restored RDS analytics DB

  12. Start an SSM session on the new deployment's services instance and run the following commands:

    sudo su -
    screen # starts a new screen session that you can return to if your SSM session ends
    
    systemctl stop codeocean

  13. Copy in data from the restored EBS volume:

    1. Attach the restored volume to the services instance:

      REGION=[your region, e.g. us-east-1]
      VOLUME=[restored EBS volume ID, e.g. vol-0123456789]
      
      aws ec2 attach-volume \
          --region "$REGION" \
          --volume-id "$VOLUME" \
          --instance-id "$(ec2-metadata -i | grep -o 'i-[0-9a-z]*')" \
          --device /dev/sdg
      
      mkdir -p /mnt/snapshot
      mount -o nouuid /dev/sdg /mnt/snapshot

    2. Override the data volume content with the snapshot content:

      time rsync -ahHv --delete /mnt/snapshot/ /data/

  14. Copy in data from the restored RDS DB:

    export PGPASSWORD=[master password]
    export RDS_HOST=[endpoint of current RDS analytics DB]
    export RDS_SNAPSHOT_HOST=[endpoint of restored RDS analytics DB]
    
    cd /mnt/snapshot/
    
    pg_dump --host $RDS_SNAPSHOT_HOST --port 5432 --username root -d codeocean -Fc -b -v -f codeocean.sql
    
    pg_restore --host $RDS_HOST --port 5432 --username root -d codeocean -c -v codeocean.sql
    
    dropdb --host $RDS_HOST --port 5432 --username root superset
    dropuser --host $RDS_HOST --port 5432 --username root superset

  15. Restore the S3 buckets to the exact snapshot time:

    pip3 install git+https://github.com/sveniu/s3-pit-restore.git
    
    TIME=[snapshot timestamp using format "MM-dd-yyyy HH:mm:ss +TZ", e.g. "12-20-2023 00:00:00 +0"]
    
    restore () {
        local name=$1
        local bucket
        # Look up the bucket name for this component in the deployment's stack metadata
        bucket=$(jq -r ".Buckets.$name" /etc/codeocean/stack.json)
        s3-pit-restore -b "$bucket" -B "$bucket" -t "$TIME"
    }
    
    restore Capsules
    restore Datasets
    restore DockerRegistry
    restore Inputfiles
    restore Licenses
    restore Packages
    restore Public
    restore Results

  16. Restore the system secrets:

    co-init restore-secrets

  17. Start the codeocean service:

    systemctl start codeocean

  18. Run the migration that sets the state of all datasets to “not-cached”:

    curl -X POST "localhost:8300/migrate?migration_num=6"

  19. Delete the restored EBS volume:

    1. Unmount the volume

      umount /mnt/snapshot

    2. In the AWS console, navigate to EC2 > Elastic Block Store > Volumes, filter by volume ID, and select the restored volume

    3. Click on Actions > Detach volume

    4. Click on Delete

  20. Delete the restored RDS DB:

    1. Navigate to the AWS RDS console > Databases and click on the tmp-codeocean-restore DB

    2. Click on Actions > Delete

    3. Uncheck Create final snapshot and Retain automated backups, acknowledge and click Delete
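Step 15 expects the snapshot time in the format "MM-dd-yyyy HH:mm:ss +TZ". If you noted the snapshot Creation time as an ISO 8601 UTC timestamp such as 2023-12-20T00:00:00Z, a small helper like the one below can reformat it. This is a sketch under that assumption: the function name is hypothetical, and timestamps with a non-UTC offset are rejected rather than converted.

```shell
# Convert "YYYY-MM-DDTHH:MM:SSZ" (UTC) into the "MM-dd-yyyy HH:mm:ss +0" form
# used for the TIME variable in step 15. Hypothetical helper for illustration.
to_pit_restore_time() {
    converted=$(printf '%s\n' "$1" | sed -nE \
        's/^([0-9]{4})-([0-9]{2})-([0-9]{2})T([0-9]{2}:[0-9]{2}:[0-9]{2})Z$/\2-\3-\1 \4 +0/p')
    if [ -z "$converted" ]; then
        echo "expected YYYY-MM-DDTHH:MM:SSZ, got: $1" >&2
        return 1
    fi
    printf '%s\n' "$converted"
}

TIME=$(to_pit_restore_time "2023-12-20T00:00:00Z")
echo "$TIME"   # prints: 12-20-2023 00:00:00 +0
```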

Migration

Migration of a Code Ocean VPC deployment to a new AWS account or region happens in two phases:

  1. Setup, where you create a new Code Ocean VPC deployment in the destination AWS account or region, and copy in the bulk of data from the source deployment. This phase happens in the background while the source deployment is still up and running.

  2. Switch, where you stop work on the source deployment, take final snapshots of EBS & RDS, copy in the remaining data, and finally perform the DNS switch.

Phase 1 - Copy Bulk of Data

  1. Deploy a new instance of Code Ocean VPC into the destination account/region. Make sure to use the same version as the source Code Ocean VPC deployment.

    • Specify the source AWS account ID under the Restore Configuration > Restore source account ID CloudFormation stack parameter.

    • Copy any other stack configuration parameter from the source deployment as needed.

    • If you opt into ACM certificate validation via DNS, it's recommended to first set the new CloudFormation stack on a different domain than the source Code Ocean VPC deployment. Later, when switching DNS in phase 2 of the migration you can update the DNS configuration as needed.

  2. Wait for the stack creation to complete.

  3. SSM into the Code Ocean VPC services instance in the new deployment and stop the codeocean systemd service:

    sudo su -
    systemctl stop codeocean

    This prevents the system from running cleanup maintenance jobs while S3 data is being replicated into it. Such jobs would otherwise delete the replicated S3 data.

  4. Update the source Code Ocean VPC CloudFormation stack with the following Backup Configuration parameters:

    1. Destination Backup Vault ARN (e.g. arn:aws:backup:us-east-1:0123456789:backup-vault:destination-stack-v1). This would be the backup vault ARN in the destination deployment.

    2. Destination Backup S3 KMS Key. This would be the AWS managed S3 KMS key alias ARN in the destination AWS account (e.g. arn:aws:kms:us-east-1:0123456789:alias/aws/s3).

    3. Destination Backup S3 Storage Class: STANDARD (not GLACIER_IR).

    4. Destination Backup S3 Bucket ARNs (Capsules, Datasets, etc.). These would be the S3 bucket ARNs in the destination deployment.

  5. Wait for the stack update to complete.

  6. Replicate existing S3 objects from source to destination:

    1. Start an SSM session on the source deployment's services instance.

    2. Generate and copy the S3 replication commands:

      sudo su -
      co-admin generate-backup-s3-replication-commands

    3. Start an AWS Console Cloud Shell in the source account, paste the commands into the shell and execute.

    4. You can track the progress of the S3 batch replication jobs in the source account AWS S3 console under Batch Operations. Note that S3 reports status Failed for batch jobs that try to replicate empty S3 buckets with no objects in them. These errors can be safely ignored.

  7. Create an on-demand backup of the source deployment EBS data volume as described in the Manual backup section above.

  8. Wait for the snapshot to be automatically copied into the destination deployment’s backup vault. The snapshot should appear in the destination deployment backup vault under Recovery points.

  9. Copy in data from the EBS data volume snapshot to the destination deployment EBS data volume by executing the following steps from the restore section: 10, 12, 13, 19.

Phase 2 - Remaining Data, Switch

  1. Stop work and bring down the source Code Ocean VPC deployment as described here.

  2. Update the source deployment CloudFormation stack to remove S3 replication rules:

    1. Remove the following Backup Configuration stack parameters, and update the stack

      1. Destination Backup S3 KMS Key

      2. Destination Backup S3 Bucket ARNs (Capsules, Datasets, etc.)

    2. Wait for the stack update to complete

  3. Stop the codeocean systemd service on the source deployment services instance:

    sudo su -
    systemctl stop codeocean

  4. Create on-demand backups of the source deployment EBS data volume and RDS DB as described in the Manual backup section above.

    The snapshots will be automatically copied into the destination deployment’s backup vault.

  5. Copy in data from the EBS data volume and RDS DB snapshots to the destination deployment by executing steps 10-20 from the restore section. Skip step 15 as there is no need to restore S3 buckets to previous state.

  6. Update the destination deployment CloudFormation stack to remove restore configuration:

    1. Remove the Restore Configuration > Restore source account ID CloudFormation stack parameter and update the stack

    2. Wait for the stack update to complete

  7. Perform any required DNS switch
