Is AWS Down? Complete Status Check Guide + Quick Fixes
EC2 instances not responding?
S3 buckets timing out?
Lambda functions failing?
Before panicking, verify whether AWS is actually down or whether it's a configuration issue on your end. Here's your complete guide to checking AWS status and responding to outages.
Quick Check: Is AWS Actually Down?
Don't assume it's AWS. Many "AWS down" reports are actually configuration errors, quota limits, or region-specific issues that can be resolved quickly.
1. Check Official Sources
AWS Service Health Dashboard:
health.aws.amazon.com/health/status
What to look for:
- Green checkmarks = Service operational
- Yellow indicators = Service degradation
- Red indicators = Service disruption
- Recent events = Click for details
Shows status for:
- EC2 (Compute)
- S3 (Storage)
- Lambda (Serverless)
- RDS (Databases)
- CloudFront (CDN)
- Route 53 (DNS)
- All AWS regions
API Status Check:
apistatuscheck.com/api/aws
Why use it:
- Real-time monitoring (checks every 5 minutes)
- Historical uptime data
- Instant alerts (Slack, Discord, email)
- Tracks individual services separately
- Multi-region monitoring
Twitter/X Search:
Search "AWS down" on Twitter
Why it works:
- DevOps teams report outages instantly
- AWS support responds here
- See which regions affected
- Identify specific services down
Pro tip: Search specific services: "EC2 down", "S3 down us-east-1", etc.
2. Check Region-Specific Status
AWS operates in multiple regions worldwide:
| Region Code | Location | Common Name |
|---|---|---|
| us-east-1 | N. Virginia | US East (most common) |
| us-east-2 | Ohio | US East 2 |
| us-west-1 | N. California | US West |
| us-west-2 | Oregon | US West 2 |
| eu-west-1 | Ireland | Europe |
| eu-central-1 | Frankfurt | Europe Central |
| ap-southeast-1 | Singapore | Asia Pacific |
| ap-northeast-1 | Tokyo | Asia Pacific |
| sa-east-1 | SΓ£o Paulo | South America |
Critical insight: AWS outages are almost always region-specific. us-east-1 can be down while us-west-2 is fine.
How to check your region:
- AWS Console → Top-right dropdown shows current region
- Check your resource configurations
- Look at health.aws.amazon.com region-by-region
Best practice: Deploy to multiple regions for redundancy.
3. Check Service-Specific Status
AWS has 200+ services. Focus on the major ones:
| Service | What It Does | Most Common Issues |
|---|---|---|
| EC2 | Virtual servers | Instance launch failures, connectivity |
| S3 | Object storage | High error rates, slow responses |
| Lambda | Serverless compute | Invocation failures, timeouts |
| RDS | Managed databases | Connection failures, slow queries |
| CloudFront | CDN | Cache misses, edge location issues |
| Route 53 | DNS | Resolution failures (rare) |
Your service might be down while AWS globally is up.
Common AWS Error Messages (And What They Mean)
EC2: "InsufficientInstanceCapacity"
What it means: AWS doesn't have enough physical capacity in that availability zone.
Causes:
- High demand in specific AZ
- Instance type shortage
- Spot instance availability
Quick fixes:
- Try different availability zone (us-east-1a → us-east-1b)
- Try different instance type (m5.large → m5a.large)
- Wait 30-60 minutes and retry
- Use different region temporarily
Long-term fix: Use Auto Scaling with multiple AZs.
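The zone/type fallbacks above can also be scripted. This is a minimal sketch using boto3; the AMI ID and zone names in the example call are hypothetical, and the actual launch requires AWS credentials.

```python
def fallback_order(zones, instance_types):
    """Yield (zone, instance_type) pairs: try every zone for one
    instance type before falling back to the next type."""
    for itype in instance_types:
        for zone in zones:
            yield zone, itype

def launch_with_fallback(zones, instance_types, ami):
    """Try run_instances across AZs and instance types until one
    succeeds. Requires AWS credentials to actually run."""
    import boto3
    from botocore.exceptions import ClientError
    ec2 = boto3.client("ec2")
    for zone, itype in fallback_order(zones, instance_types):
        try:
            return ec2.run_instances(
                ImageId=ami, InstanceType=itype,
                MinCount=1, MaxCount=1,
                Placement={"AvailabilityZone": zone},
            )
        except ClientError as e:
            # Only swallow capacity errors; surface everything else
            if e.response["Error"]["Code"] != "InsufficientInstanceCapacity":
                raise
    raise RuntimeError("No capacity in any zone/type combination")

# Example call (hypothetical AMI ID):
# launch_with_fallback(["us-east-1a", "us-east-1b"],
#                      ["m5.large", "m5a.large"], "ami-12345678")
```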
S3: "503 Service Unavailable" or "SlowDown"
What it means: S3 is throttling requests or overloaded.
Causes:
- Too many requests to same prefix
- S3 service degradation
- Regional outage
Quick fixes:
- Implement exponential backoff (retry with increasing delays)
- Check S3 request rate limits
- Distribute requests across key prefixes
- Check AWS Status for S3 issues
Code example (exponential backoff):
import random
import time

import boto3
from botocore.exceptions import ClientError

def s3_get_with_retry(bucket, key, max_retries=5):
    s3 = boto3.client('s3')
    for i in range(max_retries):
        try:
            return s3.get_object(Bucket=bucket, Key=key)
        except ClientError as e:
            # boto3 surfaces S3 throttling as 'SlowDown' (HTTP 503)
            if e.response['Error']['Code'] in ('SlowDown', '503'):
                time.sleep(2 ** i + random.random())  # backoff + jitter
            else:
                raise
    raise Exception("Max retries exceeded")
Lambda: "Rate Exceeded" or "TooManyRequestsException"
What it means: Hit Lambda concurrency limits.
Causes:
- Account-level concurrent execution limit (default: 1000)
- Reserved concurrency limit
- Burst limit exceeded
Quick fixes:
- Check Lambda console → Throttles metric
- Request concurrency limit increase (AWS Support)
- Implement queue (SQS) to smooth traffic
- Check if specific function has reserved concurrency set too low
Check current limits:
aws lambda get-account-settings --region us-east-1
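The same check works from code. A minimal boto3 sketch, assuming the us-east-1 region as an example; the summary function requires AWS credentials to run, while the helper is pure:

```python
def unreserved_concurrency(settings):
    """Extract the unreserved concurrent-execution headroom from a
    get-account-settings response."""
    return settings["AccountLimit"]["UnreservedConcurrentExecutions"]

def print_concurrency_summary():
    """Print account-level Lambda concurrency limits.
    Requires AWS credentials; region is an example."""
    import boto3
    client = boto3.client("lambda", region_name="us-east-1")
    settings = client.get_account_settings()
    print("Account limit:", settings["AccountLimit"]["ConcurrentExecutions"])
    print("Unreserved:   ", unreserved_concurrency(settings))
```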
RDS: "Cannot Connect to Database"
What it means: Can't reach RDS instance.
Causes:
- Security group blocking access
- RDS instance stopped/terminated
- Network connectivity issue
- Regional outage
Quick fixes:
- Check RDS instance status (Console → RDS → Databases)
- Verify security group allows your IP (port 3306 for MySQL, 5432 for PostgreSQL)
- Check VPC routing/subnet configuration
- Test from EC2 instance in same VPC
- Check AWS Status for RDS issues
Test connection from EC2:
# MySQL
mysql -h your-rds-endpoint.rds.amazonaws.com -u admin -p
# PostgreSQL
psql -h your-rds-endpoint.rds.amazonaws.com -U admin -d mydb
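If no database client is installed, a plain TCP check answers the "is the port reachable at all?" question. A stdlib-only sketch; the endpoint in the example is hypothetical:

```python
import socket

def port_reachable(host, port, timeout=5):
    """Return True if a TCP connection to host:port succeeds within
    the timeout; False on refusal, timeout, or DNS failure."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Example (substitute your actual RDS endpoint and port):
# port_reachable("your-rds-endpoint.rds.amazonaws.com", 3306)
```

A False result with a running instance usually points at the security group or VPC routing rather than the database itself.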
CloudFront: "502 Bad Gateway" or "504 Gateway Timeout"
What it means: CloudFront can't reach your origin server.
Causes:
- Origin server down (S3, EC2, ALB)
- Origin timeout too short
- SSL/TLS certificate issues
- Origin security group blocking CloudFront IPs
Quick fixes:
- Check origin server health
- Verify origin domain/IP is correct (CloudFront console)
- Check origin response time (should be < 30 sec)
- Whitelist CloudFront IP ranges in security groups
- Check SSL certificate validity
Get CloudFront IP ranges:
curl https://ip-ranges.amazonaws.com/ip-ranges.json | grep CLOUDFRONT
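For automation, the same JSON document can be filtered properly instead of grepped. A stdlib-only sketch:

```python
import json
import urllib.request

RANGES_URL = "https://ip-ranges.amazonaws.com/ip-ranges.json"

def cloudfront_prefixes(ranges):
    """Pull the IPv4 CIDR blocks tagged CLOUDFRONT out of the
    parsed ip-ranges.json document."""
    return [p["ip_prefix"] for p in ranges["prefixes"]
            if p["service"] == "CLOUDFRONT"]

def fetch_cloudfront_prefixes():
    """Download and filter the published AWS IP ranges."""
    with urllib.request.urlopen(RANGES_URL) as resp:
        return cloudfront_prefixes(json.load(resp))
```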
Route 53: DNS Resolution Failures
What it means: DNS queries not resolving (very rare).
Causes:
- Hosted zone misconfigured
- Record set errors
- Health check failures causing failover
- Actual Route 53 outage (extremely rare)
Quick fixes:
- Test DNS resolution: dig yourdomain.com or nslookup yourdomain.com
- Check Route 53 hosted zone records (Console → Route 53)
- Verify nameservers match (domain registrar = Route 53 nameservers)
- Check health check status
- Check AWS Status for Route 53 issues
Test DNS from multiple locations:
# Using dig
dig @8.8.8.8 yourdomain.com
# Using nslookup
nslookup yourdomain.com 8.8.8.8
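The same resolution check can run from a script. Note one difference from dig/nslookup: the stdlib goes through the system resolver, so you cannot point it at a specific DNS server like 8.8.8.8:

```python
import socket

def resolve_ipv4(hostname):
    """Resolve a hostname to its IPv4 addresses via the system
    resolver. Raises socket.gaierror on resolution failure."""
    infos = socket.getaddrinfo(hostname, None, family=socket.AF_INET)
    return sorted({info[4][0] for info in infos})

# Example:
# resolve_ipv4("yourdomain.com")
```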
Quick Fixes: AWS Service Issues
Fix #1: Check AWS Personal Health Dashboard
First stop for AWS issues.
How to access:
- AWS Console → Search "Health"
- Or visit: console.aws.amazon.com/health
What you'll see:
- Issues affecting YOUR resources
- Scheduled maintenance events
- Recent events history
- Affected resources list
Action items:
- Read event details
- Check "Affected resources" tab
- Follow AWS recommendations
- Set up email/SNS notifications
Fix #2: Verify Region Selection
Wrong region = resources "disappear"
Check current region:
- Top-right corner of AWS Console
- Should match where you created resources
Common mistake:
- Created EC2 in us-east-1
- Console switched to us-west-2
- "Where did my instances go?!"
Fix:
- Switch to correct region in dropdown
- Set up AWS CLI default region:
aws configure set region us-east-1
Fix #3: Check Service Quotas/Limits
AWS has limits on everything.
Common limits:
- EC2 On-Demand instances per region (quotas are now vCPU-based)
- S3 bucket names (globally unique)
- Lambda concurrent executions (default: 1,000)
- EBS storage per region (measured in TiB per volume type)
Check quotas:
- AWS Console → Service Quotas
- Search for service (e.g., "EC2")
- See current limit vs. usage
- Request increase if needed
Via CLI:
aws service-quotas list-service-quotas --service-code ec2
Pro tip: Request limit increases BEFORE you need them (can take 24-48 hours).
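A periodic script can flag quotas worth increasing early. A boto3 sketch, assuming us-east-1 as an example region; the listing requires AWS credentials, while the threshold helper is pure:

```python
def near_limit(usage, quota, threshold=0.8):
    """True when current usage is at or above the threshold fraction
    of the quota, a signal to request an increase early."""
    return quota > 0 and usage >= quota * threshold

def list_ec2_quotas():
    """Print EC2 quota names and values. Requires AWS credentials."""
    import boto3
    sq = boto3.client("service-quotas", region_name="us-east-1")
    paginator = sq.get_paginator("list_service_quotas")
    for page in paginator.paginate(ServiceCode="ec2"):
        for quota in page["Quotas"]:
            print(f"{quota['QuotaName']}: {quota['Value']}")
```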
Fix #4: Implement Retry Logic with Exponential Backoff
AWS recommends exponential backoff for all API calls.
Why:
- Handles temporary failures
- Respects throttling
- Improves reliability
Implementation (Python boto3):
from botocore.config import Config
import boto3
# Configure automatic retries
config = Config(
    retries={
        'max_attempts': 10,
        'mode': 'adaptive'  # or 'standard'
    }
)
# Use with any AWS client
s3 = boto3.client('s3', config=config)
ec2 = boto3.client('ec2', config=config)
JavaScript (AWS SDK v3):
import { S3Client } from "@aws-sdk/client-s3";
const client = new S3Client({
  maxAttempts: 10,
  retryMode: "adaptive"
});
Fix #5: Check CloudWatch Metrics
CloudWatch shows what's actually happening.
Key metrics to check:
EC2:
- CPUUtilization
- StatusCheckFailed
- NetworkIn/NetworkOut
S3:
- 4xxErrors, 5xxErrors
- AllRequests
- BytesDownloaded
Lambda:
- Invocations
- Errors
- Throttles
- Duration
RDS:
- CPUUtilization
- DatabaseConnections
- ReadLatency, WriteLatency
How to access:
- AWS Console → CloudWatch → Metrics
- Select namespace (AWS/EC2, AWS/S3, etc.)
- Graph metrics for last 1-24 hours
- Look for spikes/drops
CLI example:
aws cloudwatch get-metric-statistics \
--namespace AWS/EC2 \
--metric-name CPUUtilization \
--dimensions Name=InstanceId,Value=i-1234567890abcdef0 \
--start-time 2026-02-07T00:00:00Z \
--end-time 2026-02-07T23:59:59Z \
--period 3600 \
--statistics Average
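The same query in boto3, with the time window computed instead of hard-coded. The instance ID and region are example values, and the fetch itself requires AWS credentials:

```python
from datetime import datetime, timedelta, timezone

def trailing_window(hours=24):
    """Return (start, end) UTC timestamps for the trailing N hours."""
    end = datetime.now(timezone.utc)
    return end - timedelta(hours=hours), end

def fetch_cpu_average(instance_id):
    """Hourly average CPUUtilization for the past day.
    Requires AWS credentials; region is an example."""
    import boto3
    cw = boto3.client("cloudwatch", region_name="us-east-1")
    start, end = trailing_window()
    resp = cw.get_metric_statistics(
        Namespace="AWS/EC2",
        MetricName="CPUUtilization",
        Dimensions=[{"Name": "InstanceId", "Value": instance_id}],
        StartTime=start, EndTime=end,
        Period=3600, Statistics=["Average"],
    )
    return sorted(resp["Datapoints"], key=lambda d: d["Timestamp"])
```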
Fix #6: Check Security Groups and NACLs
Most connectivity issues = security group misconfiguration.
Security Groups (instance-level firewall):
Check rules:
- EC2 Console → Security Groups
- Find relevant group
- Check Inbound rules (incoming traffic)
- Check Outbound rules (outgoing traffic)
Common issues:
- SSH (port 22) not allowed from your IP
- HTTP/HTTPS (80/443) not open to 0.0.0.0/0
- RDS port not open to application security group
- Forgot to allow outbound traffic (rare, but happens)
Quick fix for testing:
- Temporarily allow all traffic: 0.0.0.0/0 on all ports
- If it works, narrow down to specific ports/IPs
- NEVER leave wide open in production
Network ACLs (subnet-level firewall):
- Usually left at default (allow all)
- Check if someone modified them
- VPC → Network ACLs
Fix #7: Check IAM Permissions
"Access Denied" errors = IAM issue, not AWS down.
Troubleshoot:
1. Check who you are:
aws sts get-caller-identity
2. Test specific permission:
aws iam simulate-principal-policy \
--policy-source-arn arn:aws:iam::123456789012:user/YourUser \
--action-names s3:GetObject \
--resource-arns arn:aws:s3:::your-bucket/*
3. Check CloudTrail for denied actions:
- CloudTrail → Event history
- Filter: "Error code = AccessDenied"
- See exactly which permission is missing
Common fixes:
- Attach policy with required permissions
- Add resource to existing policy
- Check if MFA required
- Verify you're using correct AWS account
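Steps 1 and 2 above can be combined in a script. A boto3 sketch, assuming you are signed in as an IAM user (an assumed-role session needs the underlying role ARN instead of the STS caller ARN); the bucket name is hypothetical:

```python
def denied_actions(simulation):
    """Return action names that did not evaluate to 'allowed' in a
    simulate-principal-policy response."""
    return [r["EvalActionName"] for r in simulation["EvaluationResults"]
            if r["EvalDecision"] != "allowed"]

def check_s3_read(bucket):
    """Simulate s3:GetObject for the current caller.
    Requires AWS credentials; bucket name is hypothetical."""
    import boto3
    arn = boto3.client("sts").get_caller_identity()["Arn"]
    sim = boto3.client("iam").simulate_principal_policy(
        PolicySourceArn=arn,
        ActionNames=["s3:GetObject"],
        ResourceArns=[f"arn:aws:s3:::{bucket}/*"],
    )
    return denied_actions(sim)
```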
Fix #8: Use AWS Support (If You Have a Plan)
AWS Support tiers:
| Plan | Response Time | Cost |
|---|---|---|
| Basic | No tech support | Free |
| Developer | 12-24 hours | $29/month |
| Business | 1 hour (critical) | $100+/month |
| Enterprise | 15 minutes (critical) | $15,000+/month |
When to contact support:
- Service limits need increasing
- Billing issues
- Technical issues you can't resolve
- Account or security issues
How to open case:
- AWS Console → Support → Create case
- Choose category (Service limit, technical, billing)
- Describe issue with details
- Attach CloudWatch graphs, error messages
Pro tip: Include AWS request IDs from error messages (speeds up troubleshooting).
EC2 Not Working?
Issue: Can't Connect to EC2 Instance
Troubleshoot:
1. Check instance state:
- EC2 Console → Instances
- Should be "running" (green)
- "stopped" = start it
- "terminated" = it's gone, launch new one
2. Check security group:
- Select instance → Security tab
- Click security group name
- Inbound rules should include:
- SSH (port 22) from your IP for Linux
- RDP (port 3389) from your IP for Windows
3. Test network connectivity:
# Ping (if ICMP allowed)
ping ec2-xx-xx-xx-xx.compute.amazonaws.com
# Test SSH port
telnet ec2-xx-xx-xx-xx.compute.amazonaws.com 22
# Or
nc -zv ec2-xx-xx-xx-xx.compute.amazonaws.com 22
4. Check if you have correct key:
- SSH requires .pem key file
- Key must match what you selected at launch
- Key permissions must be 400:
chmod 400 your-key.pem
5. Check System Status Checks:
- EC2 Console → Instance → Status checks tab
- "2/2 checks passed" = healthy
- Failed checks = hardware/network issue → Reboot or contact AWS
Issue: EC2 Instance Slow or Unresponsive
Causes:
- CPU throttling (T instance credits exhausted)
- Memory exhausted
- Disk I/O bottleneck
- Network saturation
Troubleshoot:
1. Check CloudWatch metrics:
- CPU, Network, Disk I/O graphs
- Look for maxed out metrics
2. For T instances (T2, T3, T4g), check CPU credits:
- CloudWatch → Metrics → EC2 → Per-Instance Metrics
- CPUCreditBalance
- If near zero, you're being throttled
Solutions:
- Switch to unlimited mode (costs more but no throttling)
- Upgrade to M, C, or R instance type
- Optimize application
3. Connect via EC2 Instance Connect or Session Manager:
- Browser-based console access (no SSH needed)
- EC2 Console → Instance → Connect button
S3 Not Working?
Issue: S3 Bucket Access Denied
Causes:
- Bucket policy blocking access
- IAM permissions missing
- Bucket in different region
- Bucket doesn't exist
Troubleshoot:
1. Check bucket exists:
aws s3 ls s3://your-bucket-name
2. Check bucket region:
aws s3api get-bucket-location --bucket your-bucket-name
3. Check bucket policy:
- S3 Console → Bucket → Permissions → Bucket policy
- Look for "Deny" statements
4. Check IAM permissions:
- Need s3:GetObject, s3:PutObject, s3:ListBucket, etc.
5. Check Block Public Access settings:
- S3 Console → Bucket → Permissions → Block public access
- May need to disable for public buckets
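The existence and region checks above can be scripted together, which also distinguishes "bucket missing" from "access denied". A boto3 sketch; the actual calls require AWS credentials:

```python
def bucket_region(location_response):
    """Normalize get-bucket-location output: buckets in us-east-1
    report a LocationConstraint of None."""
    return location_response.get("LocationConstraint") or "us-east-1"

def diagnose_bucket(bucket):
    """Report whether a bucket is reachable and where it lives.
    Requires AWS credentials; substitute your bucket name."""
    import boto3
    from botocore.exceptions import ClientError
    s3 = boto3.client("s3")
    try:
        s3.head_bucket(Bucket=bucket)  # 404 = missing, 403 = denied
    except ClientError as e:
        return f"head_bucket failed: {e.response['Error']['Code']}"
    loc = s3.get_bucket_location(Bucket=bucket)
    return f"bucket OK, region: {bucket_region(loc)}"
```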
Issue: S3 High Error Rates
Check Service Health Dashboard:
- health.aws.amazon.com → S3
- Look for your region
Implement retry logic:
- See Fix #4 above
Optimize request patterns:
- Distribute across key prefixes (avoid sequential keys)
- Use CloudFront for frequently accessed objects
- Enable S3 Transfer Acceleration for uploads
Lambda Not Working?
Issue: Lambda Timeouts
Causes:
- Function timeout too short (default: 3 sec, max: 15 min)
- Slow dependencies (database, API calls)
- Cold starts
- VPC networking delays
Quick fixes:
1. Increase timeout:
- Lambda Console → Function → Configuration → General
- Set timeout higher (but find root cause)
2. Check CloudWatch Logs:
- Lambda Console → Function → Monitor → View logs in CloudWatch
- See exactly where function is slow
3. Optimize function:
- Reduce package size
- Increase memory (also increases CPU)
- Remove VPC if not needed (VPC adds latency)
- Use provisioned concurrency for critical functions
Issue: Lambda "Function Not Found"
Causes:
- Function in wrong region
- Function deleted
- Wrong function name
Quick fixes:
- Check region (top-right dropdown)
- List functions: aws lambda list-functions
- Verify function ARN
When AWS Actually Goes Down
What Happens
Major AWS outages (recent):
- December 2021: us-east-1 outage (7 hours) - networking issue
- July 2022: us-east-1 power issue (2 hours)
- June 2023: us-east-1 EC2 API issues (3 hours)
Typical causes:
- Power issues at data centers
- Networking failures
- Software deployment bugs
- Rare: DDoS attacks
Impact:
- Regional (usually just one region)
- Service-specific (EC2 down, but S3 works)
- Cascading failures (one service depends on another)
How AWS Responds
Communication:
- AWS Service Health Dashboard
- @AWSSupport on Twitter
- Personal Health Dashboard notifications
- Post-incident reports (PIR) published later
Timeline:
- 0-15 min: Users report issues on Twitter
- 15-30 min: AWS acknowledges on dashboard
- 30-90 min: Regular updates
- Resolution: Hours to days for major outages
- Post-mortem: Detailed PIR published weeks later
What to Do During Outages
1. Activate failover (if configured):
- Switch to different region
- Use read replicas for databases
- Activate standby resources
2. Monitor Personal Health Dashboard:
- Shows YOUR affected resources
- Provides specific guidance
3. Communicate with stakeholders:
- Update status page
- Notify customers
- Set expectations
4. Document incident:
- Screenshot error messages
- Save CloudWatch graphs
- Note timeline
- Use for post-mortem
5. Consider SLA credits:
- AWS SLAs vary by service (e.g., 99.99% for EC2 at the region level, 99.9% for S3 Standard)
- If missed, request service credits
- Submit within 30 days of incident
AWS Down Checklist
Follow these steps in order:
Step 1: Verify it's actually AWS
- Check AWS Service Health Dashboard
- Check AWS Personal Health Dashboard
- Search Twitter: "AWS down [region]"
- Check specific service status
- Verify correct region selected
Step 2: Service-specific checks
- EC2: Check instance status, security groups
- S3: Test bucket access, check error rates
- Lambda: Check CloudWatch logs, metrics
- RDS: Test connection, check instance status
- CloudFront: Check origin health
- Route 53: Test DNS resolution
Step 3: Configuration troubleshooting
- Check security groups/NACLs
- Verify IAM permissions
- Check service quotas/limits
- Review CloudWatch metrics
- Check CloudTrail for errors
Step 4: Implement workarounds
- Add retry logic with exponential backoff
- Failover to different region (if multi-region)
- Use alternate service (e.g., S3 → CloudFront)
- Scale resources if capacity issue
Step 5: Contact AWS (if needed)
- Open AWS Support case
- Include request IDs, error messages
- Attach CloudWatch graphs
- Escalate if critical
Prevent Future Issues
1. Design for Failure
AWS Best Practices:
Multi-AZ deployment:
Single AZ = single point of failure
Multi-AZ = survives data center failure
Multi-Region for critical workloads:
- Active-active or active-passive
- Route 53 health checks + failover
- Cross-region replication (S3, RDS, DynamoDB)
Example architecture:
+------------------+       +------------------+
|    us-east-1     |       |    us-west-2     |
|    (Primary)     |<----->|    (Backup)      |
|                  |       |                  |
|  EC2 Auto Scale  |       |  EC2 Auto Scale  |
|  RDS Multi-AZ    |       |  RDS Read Rep    |
|  S3 (CRR ->)     |       |  S3 (<- CRR)     |
+------------------+       +------------------+
         ^                          ^
         |                          |
      Route 53 (health check + failover)
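The Route 53 failover pair in this architecture can be defined programmatically. A boto3 sketch; the hosted zone ID, record name, IPs, and health check ID are all hypothetical, and applying the change requires AWS credentials:

```python
def failover_change(name, ip, role, health_check_id=None):
    """Build an UPSERT for a Route 53 failover A record.
    role is 'PRIMARY' or 'SECONDARY'."""
    record = {
        "Name": name,
        "Type": "A",
        "SetIdentifier": f"{name}-{role.lower()}",
        "Failover": role,
        "TTL": 60,
        "ResourceRecords": [{"Value": ip}],
    }
    if health_check_id:
        record["HealthCheckId"] = health_check_id
    return {"Action": "UPSERT", "ResourceRecordSet": record}

def apply_failover(zone_id, changes):
    """Submit the change batch. Requires AWS credentials;
    zone_id is hypothetical."""
    import boto3
    boto3.client("route53").change_resource_record_sets(
        HostedZoneId=zone_id, ChangeBatch={"Changes": changes},
    )

# Example pair: primary answers while its health check passes,
# secondary takes over when it fails.
# apply_failover("Z123EXAMPLE", [
#     failover_change("app.example.com.", "192.0.2.10", "PRIMARY",
#                     health_check_id="hc-primary-id"),
#     failover_change("app.example.com.", "198.51.100.10", "SECONDARY"),
# ])
```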
2. Implement Monitoring and Alerts
CloudWatch Alarms:
Critical alarms to set up:
- EC2 StatusCheckFailed
- RDS DatabaseConnections > threshold
- Lambda Errors > threshold
- S3 4xxErrors or 5xxErrors spike
- ALB TargetResponseTime > threshold
Example alarm (CLI):
aws cloudwatch put-metric-alarm \
--alarm-name ec2-cpu-high \
--alarm-description "Alert if CPU exceeds 80%" \
--metric-name CPUUtilization \
--namespace AWS/EC2 \
--statistic Average \
--period 300 \
--threshold 80 \
--comparison-operator GreaterThanThreshold \
--dimensions Name=InstanceId,Value=i-1234567890abcdef0 \
--evaluation-periods 2 \
--alarm-actions arn:aws:sns:us-east-1:123456789012:my-topic
Third-party monitoring:
- API Status Check - External monitoring
- Datadog, New Relic, Dynatrace - APM
- PagerDuty - Incident management
3. Use AWS Health API
Automate health check monitoring:
import boto3

# Note: the AWS Health API requires a Business, Enterprise On-Ramp,
# or Enterprise Support plan.
health = boto3.client('health', region_name='us-east-1')

# Get all open issues
events = health.describe_events(
    filter={
        'eventStatusCodes': ['open', 'upcoming']
    }
)

for event in events['events']:
    print(f"Service: {event['service']}")
    print(f"Region: {event.get('region', 'GLOBAL')}")
    print(f"Status: {event['eventStatusCode']}")
    print(f"Type: {event['eventTypeCode']}")
Set up SNS notifications:
- Personal Health Dashboard → Preferences
- Configure email/SMS for events
4. Regular DR Drills
Disaster Recovery testing:
Quarterly exercises:
- Simulate region failure
- Failover to backup region
- Test recovery time
- Document issues found
- Update runbooks
GameDay exercises:
- AWS hosts GameDay events
- Simulate real outage scenarios
- Practice incident response
- Improve team coordination
5. Keep Service Quotas Ahead
Proactive limit increases:
Before Black Friday, product launches, etc.:
- Review current usage
- Project peak demand
- Request quota increases 2-4 weeks early
- Confirm increases before event
Auto-scaling quotas:
- Make sure auto-scaling limits match instance quotas
- Request limits 2x peak demand (headroom)
Key Takeaways
Before assuming AWS is down:
- Check AWS Service Health Dashboard
- Check Personal Health Dashboard
- Verify correct region selected
- Search Twitter for "AWS down [region]"
- Test specific service (EC2, S3, Lambda, etc.)
Common fixes:
- Check security groups (most connectivity issues)
- Verify IAM permissions (most access denied errors)
- Check service quotas (hit limits)
- Implement retry logic with exponential backoff
- Review CloudWatch metrics and logs
Service-specific issues:
- EC2: Security groups, status checks, instance capacity
- S3: Bucket policies, retry logic, key distribution
- Lambda: Timeouts, concurrency limits, CloudWatch logs
- RDS: Security groups, connection limits, Multi-AZ
- CloudFront: Origin health, SSL certificates
- Route 53: DNS records, health checks (rarely down)
If AWS is actually down:
- Monitor Health Dashboard for updates
- Activate failover to different region (if configured)
- Communicate with stakeholders
- Document incident for post-mortem
- Consider requesting SLA credits
Prevent future issues:
- Design multi-AZ/multi-region architecture
- Set up CloudWatch alarms
- Use Personal Health Dashboard API
- Practice DR drills quarterly
- Request service quota increases proactively
Remember: Most AWS issues are configuration errors or hitting limits, not actual AWS outages. Check security groups, IAM permissions, and quotas first.
Need real-time AWS status monitoring? Track AWS uptime with API Status Check - Get instant alerts when AWS services go down.
Related Resources
- Is AWS Down Right Now? - Live status check
- AWS Outage History - Past incidents and timeline
- AWS vs Azure Uptime - Which cloud is more reliable?
- Multi-Region DR Strategy - Build resilient cloud architecture
Monitor Your APIs
Check the real-time status of 100+ popular APIs used by developers.
View API Status →