Code Ocean VPC Administration Guide
v2.13

Alarms

Learn alarm features.

Last updated 1 year ago

Predefined Alarms

Code Ocean natively reports metrics to CloudWatch. These include system metrics such as disk space, CPU, and memory usage, as well as application metrics. Alarms on these metrics are also provisioned as part of the deployment.
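To see which of the provisioned alarms are currently firing, the CloudWatch API can be queried instead of the console. The following is a minimal sketch, assuming Python with the boto3 library and AWS credentials for the deployment account. The helper names are hypothetical, and the alarm names in a real deployment may carry a prefix or suffix that this guide does not show.

```python
from typing import Dict, Iterable, List


def alarm_names_in_state(pages: Iterable[Dict], state: str = "ALARM") -> List[str]:
    """Collect the names of metric alarms that are in the given state.

    `pages` are DescribeAlarms response pages, each a dict that may hold a
    "MetricAlarms" list whose items have "AlarmName" and "StateValue" keys.
    """
    names = []
    for page in pages:
        for alarm in page.get("MetricAlarms", []):
            if alarm.get("StateValue") == state:
                names.append(alarm["AlarmName"])
    return sorted(names)


def fetch_alarm_pages(region_name: str):
    """Return an iterator over DescribeAlarms pages (needs boto3 and credentials)."""
    import boto3  # third-party; imported lazily so the helper above works without it

    paginator = boto3.client("cloudwatch", region_name=region_name).get_paginator(
        "describe_alarms"
    )
    return paginator.paginate()


if __name__ == "__main__":
    # Region is an assumption; use the region of your deployment.
    for name in alarm_names_in_state(fetch_alarm_pages("us-east-1")):
        print(name)
```

Keeping the filtering separate from the API call makes it possible to check the logic against a saved response without touching AWS.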

Important Alarms

| Alarm | Description | Threshold |
| --- | --- | --- |
| services-unhealthy-host | The services machine is not responding to health checks | Whenever this happens |
| services-data-volume-usage-70/-90 | Disk space on the services machine is getting low | Disk usage is greater than 70%/90% |
| critical-error | Critical errors returned by Code Ocean services | Whenever this happens |
| services-cpu-usage-high | CPU usage on the services machine is high | CPU utilization is greater than 70% 6 times within 30 minutes |
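The thresholds listed above can be read as simple predicates over metric samples. The sketch below illustrates one reading; it is not the deployed alarm definition. It assumes that the CPU alarm looks at the datapoints of one 30-minute window (for example, one every 5 minutes) and fires when at least 6 of them exceed 70%, and that the two disk alarms are named with the -70 and -90 suffixes. The function names are hypothetical.

```python
from typing import Iterable, List

CPU_THRESHOLD_PCT = 70.0        # from the threshold column
CPU_BREACHES_REQUIRED = 6       # "6 times" within the 30-minute window
DISK_THRESHOLDS_PCT = (70, 90)  # the -70 and -90 variants of the disk alarm


def cpu_alarm_fires(samples_pct: Iterable[float]) -> bool:
    """Return True if enough samples in one 30-minute window exceed the threshold.

    Assumption: `samples_pct` holds the CPU utilization datapoints of a single
    30-minute window.
    """
    breaches = sum(1 for sample in samples_pct if sample > CPU_THRESHOLD_PCT)
    return breaches >= CPU_BREACHES_REQUIRED


def disk_alarms_firing(used_pct: float) -> List[str]:
    """Return the names of the disk usage alarms whose threshold is exceeded."""
    return [
        f"services-data-volume-usage-{threshold}"
        for threshold in DISK_THRESHOLDS_PCT
        if used_pct > threshold
    ]
```

Both checks use a strict "greater than" comparison, matching the wording of the thresholds, so a value of exactly 70% does not count as a breach.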

How to Address Alarms

  • services-unhealthy-host - Check the status of the Code Ocean services EC2 instance:

    • First, check whether the instance is running in the AWS EC2 console: https://console.aws.amazon.com/ec2/v2/home?region=us-east-1#Instances:v=3;instanceState=running;search=:codeocean-services

    • Connect to the instance using Session Manager and run sudo systemctl restart codeocean

  • services-data-volume-usage - This alarm indicates low disk space on the data EBS volume that is attached to the Code Ocean services instance.

    First, increase the size of the data EBS volume:

    1. Open the AWS console, go to the EC2 service, select Instances on the left, and search for the codeocean-services instance. Here's a direct link: https://console.aws.amazon.com/ec2/v2/home?region=us-east-1#Instances:v=3;instanceState=running;search=:codeocean-services

    2. Select the instance and click the Storage tab.

    3. Click the volume with Device Name /dev/sde or /dev/sdf.

    4. Right-click the volume and click Modify volume.

    5. Enter the new Size for the volume. Doubling the current size is normally recommended; for example, if it is currently 500 GiB, enter 1024 to make it 1 TiB.

    6. Click Modify.

    See AWS docs for reference.

    Next, extend the file system for this volume on the Code Ocean services instance:

    1. Go back to the codeocean-services instance in the AWS EC2 console.

    2. Right-click the instance and click Connect.

    3. Select the Session Manager tab and click Connect.

    4. In the terminal, enter the following commands:

       sudo su - ec2-user
       sudo xfs_growfs -d /data

    See AWS docs for reference.

  • critical-error - In case of a critical error, generate a support bundle and send it to the Code Ocean support team for analysis and help with troubleshooting.

  • services-cpu-usage-high - Contact Code Ocean if the issue persists.

Subscribe to Alarms

You can get notified when alarms happen using AWS SNS. All of the Code Ocean alarms are reported to the alarms-<id> SNS topic, which you can subscribe to.

Email subscription to an SNS topic

Open the Amazon SNS console: https://console.aws.amazon.com/sns/v3/home

  1. In the navigation pane, choose Topics. A list of topics should be visible. Find the predefined topic named alarms-<id> and copy its ARN value.

  2. In the navigation pane, switch to Subscriptions and select Create subscription.

  3. In the Create subscription dialog box, for Topic ARN, paste the topic ARN that you copied in step 1.

  4. For Protocol, choose Email.

  5. For Endpoint, enter an email address that you can use to receive the notification, and then choose Create subscription.

  6. From your email application, open the message from AWS Notifications and confirm your subscription.

    Your web browser displays a confirmation response from Amazon SNS.

AWS SNS notifications can also be delivered to Slack or Microsoft Teams.
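The two console procedures above, growing the data EBS volume and subscribing an email address to the alarms topic, could also be scripted. The following is a minimal sketch, assuming Python with the boto3 library and credentials that are allowed to call EC2 ModifyVolume and SNS Subscribe. The helper names, the volume ID, and the topic ARN are hypothetical placeholders. After the volume is modified, the file system still has to be extended on the instance with xfs_growfs.

```python
from typing import Dict


def doubled_size_gib(current_gib: int) -> int:
    """Apply the guide's rule of thumb: double the current volume size."""
    if current_gib <= 0:
        raise ValueError("current size must be positive")
    return current_gib * 2


def email_subscription_request(topic_arn: str, email: str) -> Dict[str, str]:
    """Build the arguments for an SNS Subscribe call using the email protocol."""
    if "@" not in email:
        raise ValueError("endpoint must be an email address")
    return {"TopicArn": topic_arn, "Protocol": "email", "Endpoint": email}


def grow_data_volume(volume_id: str, current_gib: int, region_name: str) -> int:
    """Request the larger size for the data volume and return the new size in GiB."""
    import boto3  # third-party; imported lazily so the pure helpers work without it

    new_size = doubled_size_gib(current_gib)
    ec2 = boto3.client("ec2", region_name=region_name)
    ec2.modify_volume(VolumeId=volume_id, Size=new_size)
    return new_size


def subscribe_email(topic_arn: str, email: str, region_name: str) -> str:
    """Subscribe an email address to the alarms topic.

    The subscription stays pending until the recipient confirms it from the
    message sent by AWS Notifications.
    """
    import boto3

    sns = boto3.client("sns", region_name=region_name)
    response = sns.subscribe(**email_subscription_request(topic_arn, email))
    return response["SubscriptionArn"]
```

Note that doubled_size_gib(500) yields 1000, whereas the guide's example rounds this up to 1024 GiB (1 TiB); pass the rounded value to modify_volume directly if you prefer a round figure.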