Alarms
Learn alarm features.
Predefined Alarms
Code Ocean natively reports metrics to CloudWatch. These include system metrics such as disk space, CPU, and memory usage, as well as application metrics. Alarms on these metrics are also provisioned as part of the deployment.
Important Alarms
Alarm | Description | Threshold |
---|---|---|
services-unhealthy-host | The services machine is not responding to health checks | Whenever this happens |
services-data-volume-usage-70/-90 | The services machine disk space is getting low | Disk used is greater than 70%/90% |
critical-error | Critical errors returned by Code Ocean services | Whenever this happens |
services-cpu-usage-high | Services machine CPU usage is high | CPU utilization is greater than 70% 6 times within 30 minutes |
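To see which of these alarms are currently firing without opening the console, something like the following sketch may help. It assumes the AWS CLI is installed and configured with permission to read CloudWatch. The function name `list_firing_alarms` is hypothetical, and the default region is taken from the console links on this page.

```shell
# List CloudWatch alarms that are currently in the ALARM state.
# Assumption: the AWS CLI is configured; pass a region if yours is not us-east-1.
list_firing_alarms() {
  region="${1:-us-east-1}"
  aws cloudwatch describe-alarms \
    --state-value ALARM \
    --region "$region" \
    --query 'MetricAlarms[].[AlarmName,StateReason]' \
    --output table
}
```

Call it as `list_firing_alarms` or `list_firing_alarms eu-west-1`. An empty table means no alarm is firing in that region.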
How to Address Alarms
services-unhealthy-host - We need to check the status of the Code Ocean service EC2 instance:
First, check if the instance is running in the AWS EC2 console: https://console.aws.amazon.com/ec2/v2/home?region=us-east-1#Instances:v=3;instanceState=running;search=:codeocean-services
Connect to the instance using Session Manager and run
sudo systemctl restart codeocean
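The same checks can be sketched from the command line. This is a minimal example under stated assumptions: the instance is assumed to carry a Name tag of `codeocean-services` (the console link above only does a free-text search), the function names are hypothetical, and the `codeocean` unit name is taken from the restart command above.

```shell
# Run from a workstation with the AWS CLI configured.
# Assumption: the instance has a Name tag equal to "codeocean-services".
services_instance_state() {
  region="${1:-us-east-1}"
  aws ec2 describe-instances \
    --region "$region" \
    --filters "Name=tag:Name,Values=codeocean-services" \
    --query 'Reservations[].Instances[].[InstanceId,State.Name]' \
    --output text
}

# Run on the instance itself (for example, in a Session Manager shell).
# Shows the state of the codeocean unit before you decide to restart it.
codeocean_service_status() {
  sudo systemctl status codeocean --no-pager
}
```

If `services_instance_state` prints nothing, the tag assumption probably does not hold for your deployment, and the console search is the safer route.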
services-data-volume-usage - This alarm indicates low disk space on the data EBS volume that is attached to the Code Ocean services instance.
First, we need to increase the data EBS volume size:
1. Open the AWS console, go to the EC2 service, select Instances on the left and search for the codeocean-services instance. Here's a direct link: https://console.aws.amazon.com/ec2/v2/home?region=us-east-1#Instances:v=3;instanceState=running;search=:codeocean-services
2. Select the instance and click the Storage tab.
3. Click on the volume with Device Name /dev/sde or sdf.
4. Right click on the volume and click Modify volume.
5. Enter the new Size for the volume (it's normally recommended to double the current size. For example, if it's currently 500 GiB, then enter 1024 to make it 1 TiB).
6. Click Modify.
See AWS docs for reference.
Next, we need to extend the file system for this volume on the Code Ocean services instance:
1. Go back to the codeocean-services instance in the AWS EC2 console.
2. Right click the instance and click Connect.
3. Select the Session Manager tab and click Connect.
4. In the terminal enter the following commands:
sudo su - ec2-user
sudo xfs_growfs -d /data
See AWS docs for reference.
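The resize can also be requested with the AWS CLI. The sketch below is illustrative only: the function names are hypothetical, the volume ID is a placeholder you would look up on the instance's Storage tab, and the helper doubles the size strictly (500 becomes 1000), so pass a different size if you prefer a rounder figure such as 1024.

```shell
# Strictly double a size given in GiB (500 -> 1000).
double_size_gib() {
  echo $(( $1 * 2 ))
}

# Request the resize from a workstation with the AWS CLI configured.
# Arguments: volume ID (placeholder), current size in GiB, optional region.
grow_data_volume() {
  volume_id="$1"
  current_gib="$2"
  region="${3:-us-east-1}"
  aws ec2 modify-volume \
    --region "$region" \
    --volume-id "$volume_id" \
    --size "$(double_size_gib "$current_gib")"
}

# Run on the instance: show /data usage so you can compare the figures
# before and after running xfs_growfs.
data_volume_usage() {
  df -h /data
}
```

For example, `grow_data_volume <volume-id> 500` requests a 1000 GiB volume. Running `data_volume_usage` before and after `sudo xfs_growfs -d /data` is a quick way to confirm that the file system picked up the extra space.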
critical-error - In case of a critical error, you will need to generate a support bundle and send it to the Code Ocean support team for analysis and help with troubleshooting.
services-cpu-usage-high - Contact Code Ocean if the issue persists.
Subscribe to Alarms
You can get notified when alarms happen using AWS SNS.
All of the Code Ocean alarms are reported to the alarms-<id> SNS topic, which you can subscribe to.
Email subscription to an SNS topic
Open the Amazon SNS console at https://console.aws.amazon.com/sns/v3/home.
In the navigation pane, choose Topics. A list of topics should be visible. Find the predefined topic with the name alarms-<id> and copy its ARN value.
In the navigation pane, switch to Subscriptions, and select Create subscription.
In the Create subscription dialog box, for Topic ARN, paste the topic ARN that you copied in the previous step.
For Protocol, choose Email.
For Endpoint, enter an email address that you can use to receive the notification, and then choose Create subscription.
From your email application, open the message from AWS Notifications and confirm your subscription.
Your web browser displays a confirmation response from Amazon SNS.
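The console steps above can be approximated with the AWS CLI. This is a sketch, not the documented procedure: the function names are hypothetical, the `alarms-` substring match assumes no other topic in the account uses that prefix, and the email address and account ID in the usage note are placeholders.

```shell
# Print the ARN(s) of topics whose ARN contains "alarms-".
# Assumption: only the Code Ocean alarms topic matches this substring.
find_alarms_topic() {
  region="${1:-us-east-1}"
  aws sns list-topics \
    --region "$region" \
    --query "Topics[?contains(TopicArn, 'alarms-')].TopicArn" \
    --output text
}

# Create an email subscription on the given topic.
# Assumption: the CLI's default region matches the region in the topic ARN.
subscribe_email() {
  topic_arn="$1"
  email="$2"
  aws sns subscribe \
    --topic-arn "$topic_arn" \
    --protocol email \
    --notification-endpoint "$email"
}
```

For example: `subscribe_email "arn:aws:sns:us-east-1:123456789012:alarms-<id>" ops@example.com`. As with the console flow, the subscription stays pending until the recipient confirms it from the email sent by AWS Notifications.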
See more on how to connect AWS SNS with Slack or Microsoft Teams.