web-reader

$npx mdskill add HKUDS/Vibe-Trading/web-reader

Convert URLs into clean Markdown text for reading web pages, articles, and documents.

  • Helps extract readable content from web pages by removing ads and distractions.
  • Integrates with the read_url tool to process URLs directly without bash commands.
  • Uses Jina Reader for conversion but falls back to bash if blocked.
  • Delivers results as JSON with title, content in Markdown, and length.
SKILL.md
.github/skills/web-readerView on GitHub ↗
---
name: web-reader
description: Read web pages, articles, and document links by converting URLs into Markdown text. Use the `read_url` tool directly, without bash.
category: tool
---
# Web Reading

## Purpose

Converts any URL into clean Markdown text, removing ads, navigation, styling, and other distractions. Suitable for:
- Reading API documentation (`tushare`, `OKX`, `yfinance`, and similar)
- Reading technical articles and blogs
- Retrieving research reports and announcements
- Reading GitHub README / Wiki pages

## Usage

**Call the `read_url` tool directly (do not use bash + requests, call the tool directly):**

```
read_url(url="https://tushare.pro/document/2?doc_id=27")
```

Returns JSON:
```json
{
  "status": "ok",
  "title": "Page title",
  "url": "Original URL",
  "content": "Page content in Markdown format",
  "length": 12345
}
```

## Notes

- Content longer than 8000 characters will be truncated, with the total length noted at the end
- Some websites may block Jina Reader (returning HTTP 451). In that case, fall back to bash + requests
- Dynamically rendered SPA pages may return only skeleton HTML
- Chinese content is supported normally

## Common Usage

### Read API Documentation
```
read_url(url="https://tushare.pro/document/2?doc_id=27")
```

### Read Technical Articles
```
read_url(url="https://blog.example.com/quantitative-trading-guide")
```

### Retrieve GitHub Project Information
```
read_url(url="https://github.com/PaddlePaddle/PaddleOCR")
```
More from HKUDS/Vibe-Trading