vision-analyze
$ npx mdskill add aAAaqwq/AGI-Super-Team/vision-analyze
Analyze images for description, OCR, and visual insights.
- Extracts text and answers questions from JPG, PNG, GIF, and WebP files.
- Depends on multimodal vision models for image understanding.
- Executes analysis based on user prompts attached to image paths.
- Returns structured text responses describing content or extracted data.
SKILL.md
.github/skills/vision-analyze
---
name: vision-analyze
description: "Image analysis using multimodal vision models. Use when the user needs to: (1) Describe what's in an image, (2) Extract text from images (OCR), (3) Analyze visual content, (4) Compare images, (5) Answer questions about images. Supports JPG, PNG, GIF, WebP formats."
metadata:
  {
    "openclaw":
      {
        "emoji": "👁️",
        "requires": {},
      },
  }
---
# Vision Analyze
Analyze images using the built-in vision capabilities of multimodal AI models.
## Quick Start
### Analyze an Image
Describe what's in an image:
```python
# The agent will automatically use vision when you provide an image path
image("/path/to/image.jpg", prompt="Describe what's in this image")
```
### Extract Text (OCR)
Extract text from images:
```python
image("/path/to/document.png", prompt="Extract all text from this image")
```
### Analyze Multiple Images
Compare or analyze multiple images:
```python
images(["/path/to/image1.jpg", "/path/to/image2.jpg"],
       prompt="Compare these two images and describe the differences")
```
## Usage Patterns
### Visual Q&A
Ask specific questions about image content:
```python
image("menu.jpg", prompt="What are the prices of the main courses?")
image("chart.png", prompt="What trend does this graph show?")
image("screenshot.png", prompt="What error message is displayed?")
```
### Content Moderation
Check image content:
```python
image("upload.jpg", prompt="Is this image appropriate for a professional setting?")
```
### Data Extraction
Extract structured data from visual content:
```python
image("receipt.jpg", prompt="Extract the date, total amount, and items purchased")
image("business_card.png", prompt="Extract name, phone, email, and company")
image("form.jpg", prompt="Extract all filled fields as key-value pairs")
```
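If the extracted fields come back as plain text, a small parser can turn them into a dictionary. The sketch below assumes the reply lists one `key: value` pair per line, which the skill does not guarantee. `parse_key_value_lines` and the sample reply are hypothetical and not part of the skill.

```python
def parse_key_value_lines(response_text):
    """Parse 'key: value' lines into a dict (assumed reply format)."""
    fields = {}
    for line in response_text.splitlines():
        # Drop surrounding whitespace and any leading list bullet.
        line = line.strip().lstrip("-*").strip()
        if ":" not in line:
            continue  # Skip lines that carry no key-value pair.
        key, value = line.split(":", 1)
        # Remove leftover markdown emphasis markers around the key.
        key = key.strip().strip("*").strip()
        if key:
            fields[key] = value.strip()
    return fields


# Hypothetical reply text, for illustration only.
sample_reply = "- Date: 2024-01-05\n- Total: $12.50"
print(parse_key_value_lines(sample_reply))
```

Splitting on the first colon only keeps values such as times or URLs intact. If the reply format matters, asking for it explicitly in the prompt (for example, "one `key: value` pair per line") makes the parsing less fragile.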
### Visual Comparison
Compare images:
```python
images(["before.jpg", "after.jpg"],
       prompt="What changes were made between these two images?")
```
## Tips
- **Be specific**: The more specific your prompt, the better the results
- **Multiple images**: You can analyze up to 20 images at once
- **Supported formats**: JPG, PNG, GIF, WebP
- **Size limits**: Large images are automatically resized
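The format and batch-size limits above can be checked before a call is made. The following is a minimal sketch, not part of the skill: `validate_image_batch` and `split_into_batches` are hypothetical helpers, and treating `.jpeg` as equivalent to JPG is an assumption.

```python
from pathlib import Path

# Limits taken from the tips above; ".jpeg" is an assumed alias for JPG.
SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".webp"}
MAX_IMAGES_PER_CALL = 20


def validate_image_batch(paths):
    """Return the paths as a list if the batch fits the documented limits."""
    if not paths:
        raise ValueError("at least one image path is required")
    if len(paths) > MAX_IMAGES_PER_CALL:
        raise ValueError(
            f"{len(paths)} images given, limit is {MAX_IMAGES_PER_CALL}"
        )
    for path in paths:
        # Compare extensions case-insensitively, so "photo.PNG" passes.
        if Path(path).suffix.lower() not in SUPPORTED_EXTENSIONS:
            raise ValueError(f"unsupported image format: {path}")
    return list(paths)


def split_into_batches(paths, size=MAX_IMAGES_PER_CALL):
    """Split a long list of paths into chunks of at most `size` items."""
    return [paths[i:i + size] for i in range(0, len(paths), size)]
```

Each batch returned by `split_into_batches` can then be validated and passed to a separate `images(...)` call. Note that comparisons only see the images within a single call.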
## When to Use
- Reading text from screenshots, documents, or photos
- Describing visual content for accessibility
- Analyzing charts, graphs, or diagrams
- Comparing visual changes
- Extracting data from forms or receipts
- Understanding UI elements or error messages