---
name: "aws-solution-architect"
description: Design AWS architectures for startups using serverless patterns and IaC templates. Use when asked to design serverless architecture, create CloudFormation templates, optimize AWS costs, set up CI/CD pipelines, or migrate to AWS. Covers Lambda, API Gateway, DynamoDB, ECS, Aurora, and cost optimization.
---
# AWS Solution Architect
Design scalable, cost-effective AWS architectures for startups with infrastructure-as-code templates.
---
## Workflow
### Step 1: Gather Requirements
Collect application specifications:
```
- Application type (web app, mobile backend, data pipeline, SaaS)
- Expected users and requests per second
- Budget constraints (monthly spend limit)
- Team size and AWS experience level
- Compliance requirements (GDPR, HIPAA, SOC 2)
- Availability requirements (SLA, RPO/RTO)
```
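An availability target is easier to reason about once it is converted into a downtime budget. The sketch below does that arithmetic for an SLA string such as `"99.9%"`; the function name `allowed_downtime_minutes` and the 30-day month are assumptions made for illustration and are not part of the skill's scripts.

```python
def allowed_downtime_minutes(sla: str, days: int = 30) -> float:
    """Return the downtime budget in minutes for an SLA string such as "99.9%"."""
    availability = float(sla.strip().rstrip("%")) / 100
    if not 0 < availability <= 1:
        raise ValueError(f"SLA must be between 0 and 100 percent: {sla!r}")
    total_minutes = days * 24 * 60  # 43,200 minutes in a 30-day month
    return round((1 - availability) * total_minutes, 2)


for sla in ("99%", "99.9%", "99.99%"):
    print(sla, "->", allowed_downtime_minutes(sla), "minutes per 30 days")
```

A 99.9% target leaves roughly 43 minutes of downtime per 30-day month, which helps when deciding whether a single-AZ or multi-AZ design is needed.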
### Step 2: Design Architecture
Run the architecture designer to get pattern recommendations:
```bash
python scripts/architecture_designer.py --input requirements.json
```
**Example output:**
```json
{
  "recommended_pattern": "serverless_web",
  "service_stack": ["S3", "CloudFront", "API Gateway", "Lambda", "DynamoDB", "Cognito"],
  "estimated_monthly_cost_usd": 35,
  "pros": ["Low ops overhead", "Pay-per-use", "Auto-scaling"],
  "cons": ["Cold starts", "15-min Lambda limit", "Eventual consistency"]
}
```
Select from recommended patterns:
- **Serverless Web**: S3 + CloudFront + API Gateway + Lambda + DynamoDB
- **Event-Driven Microservices**: EventBridge + Lambda + SQS + Step Functions
- **Three-Tier**: ALB + ECS Fargate + Aurora + ElastiCache
- **GraphQL Backend**: AppSync + Lambda + DynamoDB + Cognito
See `references/architecture_patterns.md` for detailed pattern specifications.
**Validation checkpoint:** Confirm the recommended pattern matches the team's operational maturity and compliance requirements before proceeding to Step 3.
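Part of this checkpoint can be scripted. The following is a minimal sketch, assuming the design output has the fields shown in the example output above and the requirements file uses the `budget_monthly_usd` field from the JSON format under Input Requirements. The helper `review_design` is hypothetical and is not provided by `architecture_designer.py`.

```python
def review_design(design: dict, requirements: dict) -> dict:
    """Compare a design's cost estimate with the budget and list trade-offs to confirm."""
    cost = design["estimated_monthly_cost_usd"]
    budget = requirements["budget_monthly_usd"]
    return {
        "pattern": design["recommended_pattern"],
        "within_budget": cost <= budget,
        "budget_headroom_usd": budget - cost,
        # The listed cons still need a human decision; this only surfaces them.
        "tradeoffs_to_confirm": list(design.get("cons", [])),
    }


# Values copied from the example output and the example requirements JSON.
design = {
    "recommended_pattern": "serverless_web",
    "estimated_monthly_cost_usd": 35,
    "cons": ["Cold starts", "15-min Lambda limit", "Eventual consistency"],
}
requirements = {"budget_monthly_usd": 500}

print(review_design(design, requirements))
```

The budget comparison is mechanical, but operational maturity and compliance fit remain judgment calls for the team.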
### Step 3: Generate IaC Templates
Create infrastructure-as-code for the selected pattern:
```bash
# Serverless stack (CloudFormation)
python scripts/serverless_stack.py --app-name my-app --region us-east-1
```
**Example CloudFormation YAML output (core serverless resources):**
```yaml
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Parameters:
  AppName:
    Type: String
    Default: my-app
Resources:
  ApiFunction:
    Type: AWS::Serverless::Function
    Properties:
      Handler: index.handler
      Runtime: nodejs20.x
      MemorySize: 512
      Timeout: 30
      Environment:
        Variables:
          TABLE_NAME: !Ref DataTable
      Policies:
        - DynamoDBCrudPolicy:
            TableName: !Ref DataTable
      Events:
        ApiEvent:
          Type: Api
          Properties:
            Path: /{proxy+}
            Method: ANY
  DataTable:
    Type: AWS::DynamoDB::Table
    Properties:
      BillingMode: PAY_PER_REQUEST
      AttributeDefinitions:
        - AttributeName: pk
          AttributeType: S
        - AttributeName: sk
          AttributeType: S
      KeySchema:
        - AttributeName: pk
          KeyType: HASH
        - AttributeName: sk
          KeyType: RANGE
```
> Full templates including API Gateway, Cognito, IAM roles, and CloudWatch logging are generated by `serverless_stack.py` and also available in `references/architecture_patterns.md`.
**Example CDK TypeScript snippet (three-tier pattern):**
```typescript
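import * as ecs from 'aws-cdk-lib/aws-ecs';
import * as ec2 from 'aws-cdk-lib/aws-ec2';
import * as rds from 'aws-cdk-lib/aws-rds';

// Place inside a Stack constructor, where `this` is the stack.
// VPC spanning two Availability Zones
const vpc = new ec2.Vpc(this, 'AppVpc', { maxAzs: 2 });
// ECS cluster that hosts the Fargate services
const cluster = new ecs.Cluster(this, 'AppCluster', { vpc });
// Aurora Serverless v2: fractional capacity (0.5 ACU) is only valid on v2,
// which is configured through DatabaseCluster rather than ServerlessCluster.
const db = new rds.DatabaseCluster(this, 'AppDb', {
  engine: rds.DatabaseClusterEngine.auroraPostgres({
    version: rds.AuroraPostgresEngineVersion.VER_15_2,
  }),
  vpc,
  writer: rds.ClusterInstance.serverlessV2('writer'),
  serverlessV2MinCapacity: 0.5,
  serverlessV2MaxCapacity: 4,
});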
```
### Step 4: Review Costs
Analyze estimated costs and optimization opportunities:
```bash
python scripts/cost_optimizer.py --resources current_setup.json --monthly-spend 2000
```
**Example output:**
```json
{
  "current_monthly_usd": 2000,
  "recommendations": [
    { "action": "Right-size RDS db.r5.2xlarge → db.r5.large", "savings_usd": 420, "priority": "high" },
    { "action": "Purchase 1-yr Compute Savings Plan at 40% utilization", "savings_usd": 310, "priority": "high" },
    { "action": "Move S3 objects >90 days to Glacier Instant Retrieval", "savings_usd": 85, "priority": "medium" }
  ],
  "total_potential_savings_usd": 815
}
```
Output includes:
- Monthly cost breakdown by service
- Right-sizing recommendations
- Savings Plans opportunities
- Potential monthly savings
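The totals in a report like the example above can be checked and ranked with a few lines of Python. This is a sketch that assumes the report has the fields shown in the example output; `summarize_savings` is a hypothetical helper, not a function exposed by `cost_optimizer.py`.

```python
PRIORITY_ORDER = {"high": 0, "medium": 1, "low": 2}


def summarize_savings(report: dict) -> dict:
    """Order recommendations by priority, then by savings, and total the savings."""
    recs = sorted(
        report["recommendations"],
        key=lambda r: (PRIORITY_ORDER.get(r["priority"], len(PRIORITY_ORDER)), -r["savings_usd"]),
    )
    total = sum(r["savings_usd"] for r in recs)
    return {
        "ordered_actions": [r["action"] for r in recs],
        "total_savings_usd": total,
        "savings_percent": round(100 * total / report["current_monthly_usd"], 2),
    }


# Figures copied from the example output.
report = {
    "current_monthly_usd": 2000,
    "recommendations": [
        {"action": "Move S3 objects >90 days to Glacier Instant Retrieval", "savings_usd": 85, "priority": "medium"},
        {"action": "Purchase 1-yr Compute Savings Plan at 40% utilization", "savings_usd": 310, "priority": "high"},
        {"action": "Right-size RDS db.r5.2xlarge → db.r5.large", "savings_usd": 420, "priority": "high"},
    ],
}

summary = summarize_savings(report)
print(summary["total_savings_usd"], summary["savings_percent"])
```

With the example figures, 420 + 310 + 85 = 815 USD, which is 40.75% of the 2,000 USD monthly spend.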
### Step 5: Deploy
Deploy the generated infrastructure:
```bash
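# CloudFormation
# CAPABILITY_AUTO_EXPAND is required when the template uses a transform
# such as AWS::Serverless-2016-10-31.
aws cloudformation create-stack \
  --stack-name my-app-stack \
  --template-body file://template.yaml \
  --capabilities CAPABILITY_IAM CAPABILITY_AUTO_EXPAND

# CDK
cdk deploy

# Terraform
terraform init && terraform apply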
```
### Step 6: Validate and Handle Failures
Verify deployment and set up monitoring:
```bash
# Check stack status
aws cloudformation describe-stacks --stack-name my-app-stack
# Set up CloudWatch alarms
aws cloudwatch put-metric-alarm --alarm-name high-errors ...
```
**If stack creation fails:**
1. Check the failure reason:
   ```bash
   aws cloudformation describe-stack-events \
     --stack-name my-app-stack \
     --query 'StackEvents[?ResourceStatus==`CREATE_FAILED`]'
   ```
2. Review CloudWatch Logs for Lambda or ECS errors.
3. Fix the template or resource configuration.
4. Delete the failed stack before retrying:
   ```bash
   aws cloudformation delete-stack --stack-name my-app-stack
   # Wait for deletion
   aws cloudformation wait stack-delete-complete --stack-name my-app-stack
   # Redeploy
   aws cloudformation create-stack ...
   ```
**Common failure causes:**
- IAM permission errors → verify the `--capabilities` flags (`CAPABILITY_IAM`, or `CAPABILITY_NAMED_IAM` if the template assigns custom names to IAM resources) and role trust policies
- Resource limit exceeded → request quota increase via Service Quotas console
- Invalid template syntax → run `aws cloudformation validate-template --template-body file://template.yaml` before deploying
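When a stack has many events, it can help to post-process the JSON returned by `describe-stack-events` instead of reading it by eye. The sketch below filters that JSON for `CREATE_FAILED` events and returns them oldest first, since the API lists events newest first and the earliest failure is usually the root cause. The helper `failed_resources` is hypothetical, and the sample data is hand-written for illustration rather than captured CLI output.

```python
def failed_resources(events_response: dict) -> list[dict]:
    """Extract CREATE_FAILED events, oldest first, from describe-stack-events JSON."""
    failures = [
        {
            "resource": event["LogicalResourceId"],
            "reason": event.get("ResourceStatusReason", "no reason reported"),
        }
        for event in events_response.get("StackEvents", [])
        if event.get("ResourceStatus") == "CREATE_FAILED"
    ]
    failures.reverse()  # the API returns newest first; show the first failure first
    return failures


# Hand-written sample for illustration only, not real CLI output.
sample = {
    "StackEvents": [
        {"LogicalResourceId": "ApiFunction", "ResourceStatus": "CREATE_FAILED",
         "ResourceStatusReason": "Resource creation cancelled"},
        {"LogicalResourceId": "DataTable", "ResourceStatus": "CREATE_FAILED",
         "ResourceStatusReason": "(placeholder for the underlying error message)"},
        {"LogicalResourceId": "my-app-stack", "ResourceStatus": "CREATE_IN_PROGRESS"},
    ]
}

for failure in failed_resources(sample):
    print(f'{failure["resource"]}: {failure["reason"]}')
```

To use it with real data, pipe `aws cloudformation describe-stack-events --stack-name my-app-stack --output json` into a script that calls `json.load(sys.stdin)` and passes the result to the function.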
---
## Tools
### architecture_designer.py
Generates architecture patterns based on requirements.
```bash
python scripts/architecture_designer.py --input requirements.json --output design.json
```
**Input:** JSON with app type, scale, budget, compliance needs
**Output:** Recommended pattern, service stack, cost estimate, pros/cons
### serverless_stack.py
Creates serverless CloudFormation templates.
```bash
python scripts/serverless_stack.py --app-name my-app --region us-east-1
```
**Output:** Production-ready CloudFormation YAML with:
- API Gateway + Lambda
- DynamoDB table
- Cognito user pool
- IAM roles with least privilege
- CloudWatch logging
### cost_optimizer.py
Analyzes costs and recommends optimizations.
```bash
python scripts/cost_optimizer.py --resources inventory.json --monthly-spend 5000
```
**Output:** Recommendations for:
- Idle resource removal
- Instance right-sizing
- Reserved capacity purchases
- Storage tier transitions
- NAT Gateway alternatives
---
## Quick Start
### MVP Architecture (< $100/month)
```
Ask: "Design a serverless MVP backend for a mobile app with 1000 users"
Result:
- Lambda + API Gateway for API
- DynamoDB pay-per-request for data
- Cognito for authentication
- S3 + CloudFront for static assets
- Estimated: $20-50/month
```
### Scaling Architecture ($500-2000/month)
```
Ask: "Design a scalable architecture for a SaaS platform with 50k users"
Result:
- ECS Fargate for containerized API
- Aurora Serverless for relational data
- ElastiCache for session caching
- CloudFront for CDN
- CodePipeline for CI/CD
- Multi-AZ deployment
```
### Cost Optimization
```
Ask: "Optimize my AWS setup to reduce costs by 30%. Current spend: $3000/month"
Provide: Current resource inventory (EC2, RDS, S3, etc.)
Result:
- Idle resource identification
- Right-sizing recommendations
- Savings Plans analysis
- Storage lifecycle policies
- Target savings: $900/month
```
### IaC Generation
```
Ask: "Generate CloudFormation for a three-tier web app with auto-scaling"
Result:
- VPC with public/private subnets
- ALB with HTTPS
- ECS Fargate with auto-scaling
- Aurora with read replicas
- Security groups and IAM roles
```
---
## Input Requirements
Provide these details for architecture design:
| Requirement | Description | Example |
|-------------|-------------|---------|
| Application type | What you're building | SaaS platform, mobile backend |
| Expected scale | Users, requests/sec | 10k users, 100 RPS |
| Budget | Monthly AWS limit | $500/month max |
| Team context | Size, AWS experience | 3 devs, intermediate |
| Compliance | Regulatory needs | HIPAA, GDPR, SOC 2 |
| Availability | Uptime requirements | 99.9% SLA, 1hr RPO |
**JSON Format:**
```json
{
  "application_type": "saas_platform",
  "expected_users": 10000,
  "requests_per_second": 100,
  "budget_monthly_usd": 500,
  "team_size": 3,
  "aws_experience": "intermediate",
  "compliance": ["SOC2"],
  "availability_sla": "99.9%"
}
```
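A requirements file can be sanity-checked before it is passed to the designer. The sketch below assumes every field in the JSON format above is required and has the type shown there; whether `architecture_designer.py` enforces the same rules is not documented here, and `validate_requirements` is a hypothetical helper.

```python
NUMBER = (int, float)

# Field names and types taken from the JSON format above.
EXPECTED_TYPES = {
    "application_type": str,
    "expected_users": NUMBER,
    "requests_per_second": NUMBER,
    "budget_monthly_usd": NUMBER,
    "team_size": NUMBER,
    "aws_experience": str,
    "compliance": list,
    "availability_sla": str,
}


def validate_requirements(req: dict) -> list[str]:
    """Return a list of problems; an empty list means the structure looks valid."""
    errors = []
    for field, expected in EXPECTED_TYPES.items():
        if field not in req:
            errors.append(f"missing field: {field}")
        elif not isinstance(req[field], expected):
            errors.append(f"wrong type for {field}: {type(req[field]).__name__}")
    return errors


example = {
    "application_type": "saas_platform",
    "expected_users": 10000,
    "requests_per_second": 100,
    "budget_monthly_usd": 500,
    "team_size": 3,
    "aws_experience": "intermediate",
    "compliance": ["SOC2"],
    "availability_sla": "99.9%",
}

print(validate_requirements(example) or "requirements look valid")
```

In practice the dictionary would come from `json.load` on `requirements.json`.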
---
## Output Formats
### Architecture Design
- Pattern recommendation with rationale
- Service stack diagram (ASCII)
- Monthly cost estimate and trade-offs
### IaC Templates
- **CloudFormation YAML**: Production-ready SAM/CFN templates
- **CDK TypeScript**: Type-safe infrastructure code
- **Terraform HCL**: Configs for teams that already use Terraform, including across multiple cloud providers
### Cost Analysis
- Current spend breakdown with optimization recommendations
- Priority action list (high/medium/low) and implementation checklist
---
## Reference Documentation
| Document | Contents |
|----------|----------|
| `references/architecture_patterns.md` | 6 patterns: serverless, microservices, three-tier, data processing, GraphQL, multi-region |
| `references/service_selection.md` | Decision matrices for compute, database, storage, messaging |
| `references/best_practices.md` | Serverless design, cost optimization, security hardening, scalability |