aws-cloud-monitoring
$ npx mdskill add automateyournetwork/netclaw/aws-cloud-monitoring
Monitors AWS CloudWatch metrics, alarms, logs, and network performance
- Analyzes network latency, VPC flow logs, and CloudWatch alarms
- Uses AWS CloudWatch, CloudWatch Logs, and VPC flow log APIs
- Checks metrics for EC2, ELB, NAT Gateway, and Transit Gateway
- Delivers dashboards and alerts for network health and performance
SKILL.md
.github/skills/aws-cloud-monitoring
---
name: aws-cloud-monitoring
description: "AWS CloudWatch monitoring: metrics, alarms, log queries, VPC flow log analysis, network performance. Use when checking AWS alarms, analyzing VPC flow logs, investigating network latency, or monitoring VPN and NAT Gateway metrics."
version: 1.0.0
license: Apache-2.0
tags: [aws, cloudwatch, monitoring, metrics, alarms, logs, flow-logs]
---

# AWS Cloud Monitoring

## MCP Server

- **Command**: `uvx awslabs.cloudwatch-mcp-server@latest` (stdio transport)
- **Requires**: `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `AWS_REGION` (or `AWS_PROFILE`)

## Key Capabilities

- **Metrics**: Query CloudWatch metrics for any AWS service (EC2, ELB, TGW, NAT GW, VPN)
- **Alarms**: List and inspect CloudWatch alarms and their states
- **Logs**: Run CloudWatch Logs Insights queries across any log group
- **Flow Logs**: Analyze VPC and TGW flow logs for traffic patterns and dropped connections

## Workflow: Network Monitoring Dashboard

When a user asks "how is our AWS network performing?":

1. **Check alarms**: List CloudWatch alarms in ALARM state
2. **VPN metrics**: Tunnel state, bytes in/out for site-to-site VPNs
3. **NAT Gateway metrics**: Active connections, packets dropped, bytes processed
4. **Transit Gateway metrics**: Bytes in/out, packets dropped per attachment
5. **ELB metrics**: Healthy/unhealthy targets, latency, 5xx errors
6. **Report**: Network health dashboard with any issues flagged

## Workflow: Flow Log Analysis

When investigating traffic patterns or security events:

1. **Query VPC flow logs**: Filter by source IP, destination IP, port, action (ACCEPT/REJECT)
2. **Identify rejected traffic**: Find REJECT entries to see blocked connections
3. **Top talkers**: Aggregate by source/destination to find heaviest traffic flows
4. **Time correlation**: Narrow to specific time windows around incidents
5. **Report**: Traffic analysis with recommendations

## Common CloudWatch Network Metrics

| Service | Metric | What It Tells You |
|---------|--------|-------------------|
| VPN | `TunnelState` | 0=down, 1=up for each tunnel |
| VPN | `TunnelDataIn/Out` | Bytes through each VPN tunnel |
| NAT GW | `ActiveConnectionCount` | Active NAT connections |
| NAT GW | `PacketsDropCount` | Packets dropped (capacity issue) |
| NAT GW | `BytesProcessed` | Traffic volume through NAT |
| TGW | `BytesIn/BytesOut` | Traffic per TGW attachment |
| TGW | `PacketDropCountBlackhole` | Blackhole route drops |
| ELB | `HealthyHostCount` | Healthy targets behind ALB/NLB |
| ELB | `TargetResponseTime` | Backend latency |
| EC2 | `NetworkIn/NetworkOut` | Instance network throughput |
| EC2 | `NetworkPacketsIn/Out` | Instance packet rate |

## Flow Log Query Examples

```
# Top rejected connections in last hour
fields @timestamp, srcAddr, dstAddr, dstPort, action
| filter action = "REJECT"
| stats count() as rejections by srcAddr, dstAddr, dstPort
| sort rejections desc
| limit 20

# Traffic from specific source
fields @timestamp, srcAddr, dstAddr, dstPort, bytes, action
| filter srcAddr = "10.0.1.50"
| sort @timestamp desc

# Top talkers by bytes
fields srcAddr, dstAddr, bytes
| stats sum(bytes) as totalBytes by srcAddr, dstAddr
| sort totalBytes desc
| limit 10
```

## Important Rules

- **CloudWatch Logs Insights queries have a cost**: be mindful of time range and data volume
- **Region-specific**: metrics and logs are scoped to the configured region
- **Record in GAIT**: log monitoring investigations for audit trail

## Environment Variables

- `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `AWS_REGION` (or `AWS_PROFILE`)
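
## Example: Scripting a Flow Log Query (Sketch)

The flow log workflow (filter by source, port, and action, then aggregate over a bounded time window) can also be scripted outside the MCP server. The following is a minimal Python sketch, not part of the skill itself. The helper names `build_flow_log_query`, `last_hour_window`, and `run_query` are hypothetical, as is the log group name in the usage note. `run_query` assumes `boto3` is installed and that credentials come from the environment variables listed above.

```python
import time
from datetime import datetime, timedelta, timezone


def build_flow_log_query(action=None, src_addr=None, dst_port=None, limit=20):
    """Build a Logs Insights query that counts flows per source/destination/port.

    Hypothetical helper: field names follow the flow log query examples
    in this document (srcAddr, dstAddr, dstPort, action).
    """
    filters = []
    if action:
        filters.append(f'action = "{action}"')
    if src_addr:
        filters.append(f'srcAddr = "{src_addr}"')
    if dst_port is not None:
        filters.append(f"dstPort = {int(dst_port)}")

    lines = ["fields @timestamp, srcAddr, dstAddr, dstPort, action"]
    if filters:
        lines.append("| filter " + " and ".join(filters))
    lines.append("| stats count() as flows by srcAddr, dstAddr, dstPort")
    lines.append("| sort flows desc")
    lines.append(f"| limit {int(limit)}")
    return "\n".join(lines)


def last_hour_window(now=None):
    """Return (start, end) as epoch seconds covering the hour before `now`."""
    now = now or datetime.now(timezone.utc)
    start = now - timedelta(hours=1)
    return int(start.timestamp()), int(now.timestamp())


def run_query(log_group, query, start, end, max_polls=60):
    """Start the query and poll for results. Requires AWS credentials.

    Keeping the time window narrow limits the data scanned, which is
    what Logs Insights charges for.
    """
    import boto3  # imported lazily so the builder works without boto3

    logs = boto3.client("logs")
    query_id = logs.start_query(
        logGroupName=log_group,
        startTime=start,
        endTime=end,
        queryString=query,
    )["queryId"]

    for _ in range(max_polls):
        response = logs.get_query_results(queryId=query_id)
        if response["status"] in ("Complete", "Failed", "Cancelled", "Timeout"):
            return response
        time.sleep(1)
    raise TimeoutError(f"query {query_id} did not finish after {max_polls} polls")
```

A possible use, assuming a flow log group named `/vpc/flow-logs` (a placeholder): `run_query("/vpc/flow-logs", build_flow_log_query(action="REJECT"), *last_hour_window())`. Building the query string separately from running it lets the query be reviewed or logged for the audit trail before any billable scan starts.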