pentagi

Name: pentagi
Author: TerminalSkills/skills
$npx mdskill add TerminalSkills/skills/pentagi
Execute autonomous penetration tests with embedded security tools.
Plans and runs multi-agent security assessments in Docker sandboxes.
Integrates nmap, Metasploit, SQLmap, and Neo4j knowledge graphs.
Uses LLM agents to coordinate research, development, and infrastructure.
Delivers comprehensive reports via Grafana and Langfuse dashboards.
SKILL.md

.github/skills/pentagiView on GitHub ↗
---
name: pentagi
description: >-
  Run AI-powered penetration testing with PentAGI. Use when a user asks to
  automate security testing, set up autonomous pentesting, deploy an AI-driven
  vulnerability scanner, build a self-hosted security testing platform, or
  conduct penetration tests with LLM-powered agents.
license: Apache-2.0
compatibility: 'Docker, any LLM provider (OpenAI, Anthropic, Ollama, Bedrock, Gemini, DeepSeek)'
metadata:
  author: terminal-skills
  version: 1.0.0
  category: devops
  tags:
    - pentagi
    - penetration-testing
    - security
    - ai-agents
    - autonomous
    - vulnerability
---

# PentAGI

## Overview

PentAGI is a fully autonomous AI agent system for penetration testing. It deploys a multi-agent architecture where specialized AI agents (research, development, infrastructure) collaborate to plan, execute, and report security assessments. All operations run in sandboxed Docker containers with 20+ professional security tools (nmap, metasploit, sqlmap, nikto, gobuster, etc.). Features a knowledge graph (Neo4j + Graphiti) for persistent learning across engagements, web intelligence via built-in browser, and comprehensive monitoring with Grafana/Langfuse. Self-hosted — your data stays on your infrastructure.

## Instructions

### Step 1: Quick Deployment

```bash
# Clone the repository
git clone https://github.com/vxcontrol/pentagi.git
cd pentagi

# Copy and configure environment
cp .env.example .env
```

```bash
# .env — Essential configuration
# LLM Provider (choose one)
OPENAI_API_KEY=sk-...                    # OpenAI
# ANTHROPIC_API_KEY=sk-ant-...           # Anthropic
# OLLAMA_SERVER_URL=http://host:11434    # Local Ollama

# Primary model for the main agent
LLM_MODEL=gpt-4o                         # or claude-3-5-sonnet, llama3.1
LLM_PROVIDER=openai                       # openai, anthropic, ollama, bedrock, gemini, deepseek

# Search provider for web intelligence
TAVILY_API_KEY=tvly-...                   # Tavily (recommended)
# GOOGLE_SEARCH_API_KEY=...              # or Google Custom Search
# SEARXNG_URL=http://localhost:8080      # or self-hosted SearXNG

# Security — change these in production
POSTGRES_PASSWORD=your-secure-password
SECRET_KEY=your-secret-key-min-32-chars
```

```bash
# Deploy the full stack
docker compose up -d

# Access the web UI
open http://localhost:3000
```

The stack deploys: React frontend, Go backend (GraphQL API), PostgreSQL with pgvector, Neo4j knowledge graph, security tools container, web scraper, and monitoring (Grafana + Langfuse).

### Step 2: Configure AI Agents

PentAGI uses a team of specialized agents that collaborate on the assessment.

```yaml
# Agent architecture (configured via UI or API)
#
# Primary Agent (Orchestrator)
#   ├── Researches target, plans attack phases
#   ├── Delegates to specialists:
#   │   ├── Research Agent — OSINT, web scraping, CVE lookup
#   │   ├── Development Agent — exploit modification, payload crafting
#   │   └── Infrastructure Agent — container management, tool setup
#   ├── Executes security tools in sandboxed containers
#   └── Generates vulnerability reports
#
# Each agent has access to:
#   - 20+ security tools (nmap, metasploit, sqlmap, nikto, etc.)
#   - Web browser for research
#   - Knowledge graph for persistent memory
#   - Previous engagement learnings
```

### Step 3: Start a Penetration Test via Web UI

```text
1. Open http://localhost:3000
2. Create a new engagement:
   - Target: IP address, domain, or CIDR range
   - Scope: Which services/ports to test
   - Rules of engagement: What's allowed (e.g., no DoS, no data exfiltration)
   - Objective: "Full security assessment" or specific focus
3. The AI agent:
   - Performs reconnaissance (nmap, whois, DNS enumeration)
   - Identifies services and versions
   - Searches for known vulnerabilities (CVE databases)
   - Attempts exploitation with appropriate tools
   - Documents findings with evidence
   - Generates a vulnerability report
```

### Step 4: GraphQL API Integration

```typescript
// Integrate PentAGI into your security pipeline via GraphQL
const PENTAGI_URL = 'http://localhost:3000/graphql'

// Create a new engagement
const createEngagement = await fetch(PENTAGI_URL, {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Authorization': `Bearer ${API_TOKEN}`,
  },
  body: JSON.stringify({
    query: `
      mutation CreateTask($input: CreateTaskInput!) {
        createTask(input: $input) {
          id
          status
          createdAt
        }
      }
    `,
    variables: {
      input: {
        target: '192.168.1.0/24',
        objective: 'Perform a comprehensive security assessment of the internal network segment. Focus on identifying exposed services, default credentials, unpatched vulnerabilities, and potential lateral movement paths.',
        scope: ['port-scan', 'service-enum', 'vuln-scan', 'web-app-test'],
        constraints: ['no-dos', 'no-data-exfil', 'business-hours-only'],
      },
    },
  }),
})

// Monitor progress
const checkStatus = await fetch(PENTAGI_URL, {
  method: 'POST',
  headers: { 'Content-Type': 'application/json', 'Authorization': `Bearer ${API_TOKEN}` },
  body: JSON.stringify({
    query: `
      query TaskStatus($id: ID!) {
        task(id: $id) {
          id
          status
          progress
          currentPhase
          findings {
            severity
            title
            description
            evidence
            remediation
          }
          logs {
            timestamp
            agent
            action
            output
          }
        }
      }
    `,
    variables: { id: engagement.id },
  }),
})
```

### Step 5: Knowledge Graph — Persistent Learning

```text
# PentAGI remembers findings across engagements via Neo4j + Graphiti
#
# After each engagement, the knowledge graph stores:
# - Vulnerability patterns found per technology stack
# - Successful exploitation techniques
# - Network topology relationships
# - Service fingerprints and their known weaknesses
#
# In future engagements, the agent queries this knowledge to:
# - Prioritize attack vectors that worked before on similar targets
# - Skip techniques known to fail on specific configurations
# - Correlate findings across multiple assessments
# - Identify systemic issues across the organization
```

### Step 6: Monitoring and Reporting

```bash
# Grafana dashboards — real-time monitoring
open http://localhost:3001
# Dashboards include:
# - Active agent operations and tool execution
# - Token usage and LLM cost tracking
# - Container resource utilization
# - Engagement timeline and progress

# Langfuse — LLM observability
open http://localhost:3002
# Track:
# - Agent reasoning chains
# - Prompt effectiveness
# - Token usage per engagement phase
# - Model performance comparison
```

```bash
# Export vulnerability report
curl -H "Authorization: Bearer $API_TOKEN" \
  "http://localhost:3000/api/v1/tasks/$TASK_ID/report" \
  -o vulnerability-report.pdf

# Report includes:
# - Executive summary
# - Detailed findings with CVSS scores
# - Evidence (screenshots, command output)
# - Remediation recommendations
# - Risk matrix
```

## Guidelines

- **Always get written authorization** before running PentAGI against any target. Unauthorized penetration testing is illegal.
- Deploy on an isolated network segment — PentAGI's sandboxed containers contain offensive tools.
- Use `constraints` to enforce rules of engagement — prevent DoS, data exfiltration, or out-of-scope testing.
- Start with `Ollama` for local/private assessments — no data leaves your infrastructure.
- The knowledge graph improves over time — run PentAGI consistently to build organizational security intelligence.
- Review agent actions in real-time via the web UI — autonomous doesn't mean unsupervised.
- PentAGI complements manual testing — use it for initial reconnaissance and known vulnerability scanning, then have humans investigate complex logic flaws.
- Resource requirements: 8GB+ RAM, 4+ CPU cores. GPU optional (only for local LLM via Ollama).
- Langfuse integration helps optimize LLM costs — track which models give best results per phase.
More from TerminalSkills/skills

Skill	Description
3dsmax-rendering	>-
3dsmax-scripting	>-
3proxy	>-
ably	>-
aceternity-ui	>-
act	>-
activepieces	>-
actix-web	\|
ad-campaign-optimization	>-
adonisjs	>-