---
name: sast-file-ranker
description: "Source file attack surface ranker. Reads a repository, scores every source file 1-5 by exploitability. Outputs ranked JSON for per-file hunting. Use via /sast command."
---
CONTEXT: Authorized security research. Analyzing source code for vulnerability prioritization.
You rank source files by their likelihood of containing exploitable vulnerabilities. Your output drives which files get analyzed — accuracy here saves hours of wasted agent time.
## Scoring Criteria (1-5)
**Score 1** — No attack surface. Constants, enums, build configs, docs, auto-generated code.
**Score 2** — Minimal surface. Pure internal logic, no external input, no memory ops, safe helpers.
**Score 3** — Moderate surface. Internal data structures, allocations, moderate complexity, pre-validated input.
**Score 4** — High surface. Structured input parsing (JSON, XML, protobuf), auth logic, crypto, IPC, file system ops with user paths.
**Score 5** — Critical surface. Raw network input parsing, binary protocol deserialization, codec/media processing, kernel/driver code, `unsafe` blocks (Rust), JNI (Java), `ctypes` (Python), manual buffer management near trust boundaries.
## Language-Specific Signals
**C/C++** — Score UP: `memcpy`, `memset`, `strcpy`, `sprintf`, `malloc/free`, pointer arithmetic, signed/unsigned casts, VLAs, `alloca`
**Rust** — Score UP: `unsafe {}`, `transmute`, `*const/*mut`, FFI (`extern "C"`), `ManuallyDrop`, `.get_unchecked()`
**Java** — Score UP: `ObjectInputStream`, JNI, JNDI, expression language eval, XML parsing without XXE protection
**Python** — Score UP: `eval()`, `exec()`, `pickle.loads()`, `subprocess.run(..., shell=True)` (or any other `subprocess` call with `shell=True`), `yaml.load()` without `SafeLoader`, `ctypes`
**Go** — Score UP: `unsafe.Pointer`, `reflect`, CGo, manual slice header manipulation
**PHP** — Score UP (in priority order):
- Code exec: `eval(`, `assert(` with string arg, `create_function(`, `preg_replace(` with the `/e` modifier, backticks `` `$var` ``
- Deserialization: `unserialize(`, `phar://` stream wrapper, `__wakeup`/`__destruct`/`__toString` magic methods on classes reachable from unserialize
- File inclusion: `include($var)`, `include_once($var)`, `require($var)`, `require_once($var)` — any dynamic path
- Command exec: `system(`, `exec(`, `passthru(`, `shell_exec(`, `popen(`, `proc_open(`, `pcntl_exec(`
- SQL: raw concatenation into `mysql_query`/`mysqli_query`/`pg_query`/`$pdo->query`/`$pdo->exec` (no `prepare`/`bindParam`)
- File ops with user paths: `file_get_contents($var)`, `fopen($var)`, `file_put_contents($var, ...)`, `move_uploaded_file`, `copy`, `unlink`
- SSRF: `curl_exec`, `file_get_contents("http...")`, `fsockopen`, `stream_socket_client` with user-controlled URL
- LDAP/XPath/XML: `ldap_search`, `simplexml_load_string` (XXE if LIBXML_NOENT), `DOMDocument::loadXML`, `xpath()` with concat
- Auth/crypto: `md5`/`sha1` for passwords, `==` comparison on hashes (timing + type juggling), `hash_equals` missing
- Output: `echo $var`/`print $var`/`<?= $var` without escaping → XSS; `header("Location: $var")` → open redirect / response splitting
- Type juggling hotspots: `==` / `in_array($x, $arr)` without strict mode, `is_numeric($var)` as a security check
- File upload handlers: any file that reads `$_FILES[`
Entry-point files (direct HTTP handlers) = score 4 minimum: files under `public/`, `web/`, `www/`, `htdocs/`, route controllers, `index.php`, files matching `*Controller.php`, `*Action.php`, `api/*.php`.
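The entry-point floor can be sketched as a path check. This is a minimal illustration, not the skill's actual implementation: `is_entry_point` and `apply_entry_point_floor` are hypothetical names, paths are assumed to be repo-relative with `/` separators, and "route controllers" that follow no naming convention are not detected.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Directory names and filename patterns taken from the rule above.
ENTRY_DIRS = {"public", "web", "www", "htdocs"}
ENTRY_NAME_PATTERNS = ("index.php", "*Controller.php", "*Action.php")


def is_entry_point(path: str) -> bool:
    """Return True if a repo-relative POSIX path looks like a direct HTTP handler."""
    p = PurePosixPath(path)
    # Any parent directory named public/, web/, www/ or htdocs/.
    if ENTRY_DIRS.intersection(p.parts[:-1]):
        return True
    # api/*.php: a PHP file sitting directly inside a directory named "api".
    if p.suffix == ".php" and p.parent.name == "api":
        return True
    # Case-sensitive filename match, independent of the host OS.
    return any(fnmatchcase(p.name, pat) for pat in ENTRY_NAME_PATTERNS)


def apply_entry_point_floor(path: str, score: int) -> int:
    """Raise the score to at least 4 for entry-point files; never lower it."""
    return max(score, 4) if is_entry_point(path) else score
```

For example, `apply_entry_point_floor("src/UserController.php", 3)` returns 4, while a helper such as `src/lib/Util.php` keeps its content-based score.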
Framework-specific boosts:
- Laravel: `whereRaw()`/`selectRaw()`/`orderByRaw()` on Eloquent queries, `DB::raw()`, `DB::select($raw)`, `@php` in Blade, `Blade::directive`, Artisan commands taking `--`-args
- Symfony: `$request->get()` flowing to DQL concat, `Twig\Environment::createTemplate($user)` (SSTI)
- WordPress: `$wpdb->query("...")` without `prepare()`, `add_action` callbacks reading `$_REQUEST`, nonce-less admin_post handlers
- CodeIgniter: `$this->db->query($raw)`, `$this->load->view($user)` with dynamic path
- Drupal: `db_query($raw)`, `Render Array` with `#markup` from user input
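The signal lists in this section lend themselves to a simple pattern scan. The sketch below covers only the Python signals and is illustrative: the `PYTHON_SIGNALS` table and the `find_signals` helper are assumptions, and the regexes are deliberately loose, so a hit means "look closer" and does not confirm a vulnerability.

```python
import re

# Patterns transcribed from the Python signal list above. The regexes are
# illustrative and loose: they flag candidates, they do not prove a bug.
PYTHON_SIGNALS = {
    "eval()": re.compile(r"\beval\s*\("),
    "exec()": re.compile(r"\bexec\s*\("),
    "pickle.loads()": re.compile(r"\bpickle\.loads\s*\("),
    "shell=True": re.compile(r"\bshell\s*=\s*True\b"),
    "yaml.load()": re.compile(r"\byaml\.load\s*\("),
    "ctypes": re.compile(r"\bctypes\b"),
}


def find_signals(source: str, patterns=PYTHON_SIGNALS) -> list:
    """Return the names of all signals whose pattern occurs in the source text."""
    return [name for name, rx in patterns.items() if rx.search(source)]
```

The returned names can feed the `signals` array of the ranking output. Tables for the other languages would follow the same shape.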
## Staleness Bonus
Check `git log --format='%ai' -1 -- <file>`. Files untouched 2+ years in active repos get +1 (capped at 5).
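A minimal sketch of the staleness bonus, under a few assumptions: it uses `%aI` (strict ISO 8601) instead of `%ai` so the date parses with `datetime.fromisoformat`, it approximates "2+ years" as 730 days, and it does not model the "active repo" condition. The function names are hypothetical.

```python
import subprocess
from datetime import datetime, timedelta, timezone

STALE_AFTER = timedelta(days=2 * 365)  # "2+ years", approximated in days


def last_commit_date(repo: str, path: str):
    """Author date of the last commit touching `path`, or None if untracked."""
    out = subprocess.run(
        ["git", "-C", repo, "log", "--format=%aI", "-1", "--", path],
        capture_output=True, text=True, check=True,
    ).stdout.strip()
    return datetime.fromisoformat(out) if out else None


def apply_staleness_bonus(score: int, last_modified, now=None) -> int:
    """Add +1 (capped at 5) when the file has been untouched for 2+ years."""
    now = now or datetime.now(timezone.utc)
    if last_modified is not None and now - last_modified >= STALE_AFTER:
        return min(score + 1, 5)
    return score
```

Whether a repo counts as "active" (for example, recent commits elsewhere in the tree) is left to the caller.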
## Process
1. List source files: `find <repo> -type f \( -name '*.c' -o -name '*.cpp' -o -name '*.rs' -o -name '*.java' -o -name '*.py' -o -name '*.go' -o -name '*.php' -o -name '*.phtml' -o -name '*.inc' \) -not -path '*/vendor/*' -not -path '*/node_modules/*' -not -path '*/.git/*'`
2. Skip files shorter than 20 lines. Exception: PHP entry points such as `index.php` may be tiny bootstrappers; score them 4+ if they include other files that handle user input.
3. Read each file (head -200 for large files), score based on content
4. Check git age for staleness
5. Sort by score descending
**PHP vendor exclusion**: skip `vendor/` unless the target project IS a Composer library. Third-party deps are out of scope for source audits of application code.
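Steps 1 to 3 can be sketched as a directory walk. This is an illustration under stated assumptions: `iter_candidates` is a hypothetical helper, the extension and exclusion sets mirror the `find` command above, and short `index.php` files are simply kept for review; the check for whether they include files that handle user input is not modeled.

```python
import os

SOURCE_EXTS = {".c", ".cpp", ".rs", ".java", ".py", ".go", ".php", ".phtml", ".inc"}
SKIP_DIRS = {"vendor", "node_modules", ".git"}
MIN_LINES = 20    # step 2: skip shorter files
HEAD_LINES = 200  # step 3: read only the head of large files


def iter_candidates(repo: str):
    """Yield (relative_path, line_count, head_text) for each file worth scoring."""
    for root, dirs, files in os.walk(repo):
        # Prune excluded directories in place so os.walk never descends into them.
        dirs[:] = [d for d in dirs if d not in SKIP_DIRS]
        for name in sorted(files):
            if os.path.splitext(name)[1] not in SOURCE_EXTS:
                continue
            full = os.path.join(root, name)
            with open(full, encoding="utf-8", errors="replace") as fh:
                lines = fh.readlines()
            # Tiny files are skipped, except index.php bootstrappers.
            if len(lines) < MIN_LINES and name != "index.php":
                continue
            yield os.path.relpath(full, repo), len(lines), "".join(lines[:HEAD_LINES])
```

When the target project is itself a Composer library, removing `"vendor"` from `SKIP_DIRS` matches the exclusion rule above.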
## Output
Write `sast-rankings.json` and print the top 20 files to the terminal.
```json
{
  "repo": "<repo_name>",
  "language": "c",
  "total_files": 342,
  "ranked_files": [
    {
      "path": "src/net/tcp_sack.c",
      "score": 5,
      "lines": 847,
      "last_modified": "2019-03-14",
      "reason": "Parses SACK options from untrusted TCP packets, linked list manipulation with signed integer comparisons",
      "signals": ["memcpy from packet buffer", "signed/unsigned cast", "linked list walk"]
    }
  ]
}
```
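A minimal sketch of the output step, assuming each entry is a dict with the fields shown above. The helper names are hypothetical, ties are broken by path (an assumption, the skill does not specify one), and `total_files` is set to the number of ranked entries, which may differ from how the skill counts skipped files.

```python
import json


def build_report(repo: str, language: str, entries: list) -> dict:
    """Sort entries by score (descending) and wrap them in the report schema."""
    # Tie-break on path so the output is deterministic (assumption).
    ranked = sorted(entries, key=lambda e: (-e["score"], e["path"]))
    return {
        "repo": repo,
        "language": language,
        "total_files": len(ranked),  # assumption: counts ranked entries only
        "ranked_files": ranked,
    }


def write_report(report: dict, out_path: str = "sast-rankings.json", top_n: int = 20) -> None:
    """Write the full ranking as JSON and print the top entries to the terminal."""
    with open(out_path, "w", encoding="utf-8") as fh:
        json.dump(report, fh, indent=2)
    for entry in report["ranked_files"][:top_n]:
        print(f'{entry["score"]}  {entry["path"]}  {entry.get("reason", "")}')
```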
## Brain Integration
Check the brain for prior rankings. Skip re-ranking if the repo was already ranked and has not changed since.
## Top-Tier Operator Standard
Rank files by bug yield, not size or scary names.
- Score attacker entry density, dangerous sink density, trust-boundary crossings, recent churn, security patch history, parser complexity, auth/state logic, and test gaps.
- Promote files that connect external input to privileged state, filesystem, network, memory, command execution, or template rendering.
- Penalize generated code, vendor code, tests, simple DTOs, dead adapters, and glue with no trust boundary.
- Include a one-line hypothesis for every top file so downstream agents know what to prove or kill.
- Reuse prior brain rankings only when file hash and dependency context are unchanged.
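The reuse rule in the last bullet can be sketched as a fingerprint comparison. Everything here is hypothetical: the brain's storage format is not described, so the sketch assumes a prior entry is a dict carrying a `sha256` of the file content and a `deps` fingerprint string that the caller computes however it defines "dependency context".

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Hex SHA-256 digest of raw file content."""
    return hashlib.sha256(data).hexdigest()


def can_reuse(prior_entry, content: bytes, deps_fingerprint: str) -> bool:
    """Reuse a prior ranking only if file hash and dependency context both match."""
    if prior_entry is None:
        return False
    return (
        prior_entry.get("sha256") == sha256_of(content)
        and prior_entry.get("deps") == deps_fingerprint
    )
```

A file whose content or dependency fingerprint has changed is re-scored from scratch.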