categorizing-bsky-accounts
$ npx mdskill add oaustegard/claude-skills/categorizing-bsky-accounts
Extract and categorize Bluesky accounts by topic using keywords.
- Analyzes bio and posts to identify relevant account themes.
- Depends on the extracting-keywords skill for YAKE and stopwords.
- Processes following lists, followers, or custom handle inputs.
- Returns topic classifications for each analyzed account.
---
name: categorizing-bsky-accounts
description: Analyze and categorize Bluesky accounts by topic using keyword extraction. Use when users mention Bluesky account analysis, following/follower lists, topic discovery, account curation, or network analysis.
metadata:
version: 0.2.1
---
# Categorizing Bluesky Accounts
Fetch Bluesky account data and extract keywords for Claude to categorize by topic. The script compresses account context (bio + posts) into bio + keywords, then Claude performs intelligent categorization.
## Prerequisites
**Requires:** extracting-keywords skill (provides YAKE venv + domain stopwords)
The analyzer delegates keyword extraction to the extracting-keywords skill, which provides:
- Optimized YAKE installation with minimal dependencies
- Domain-specific stopwords: English (574), AI/ML (1357), Life Sciences (1293)
- Support for 34 languages
## Core Workflow
When users request Bluesky account analysis:
1. **Ensure keyword extraction is set up** - Invoke the extracting-keywords skill using the Skill tool to ensure YAKE venv exists (skip if already invoked in this session)
2. **Determine input mode** based on user's request:
   - Following list → use `--following handle`
   - Followers → use `--followers handle`
   - List of handles → use `--handles "h1,h2,h3"`
   - File provided → use `--file accounts.txt`
3. **Configure parameters:**
   - `--accounts N` - Number to analyze (default: 100, max: 100)
   - `--posts N` - Posts per account (default: 20, max: 100)
   - `--stopwords [en|ai|ls]` - Choose domain-specific stopwords:
     - `en`: English (general purpose)
     - `ai`: AI/ML domain (recommended for tech accounts)
     - `ls`: Life Sciences (for biomedical/research accounts)
   - `--exclude "pattern1,pattern2"` - Skip spam/bot accounts
4. **Run script** - Outputs simple text format to stdout:
```
@handle1.bsky.social (Display Name)
Bio text here
Keywords: keyword1, keyword2, keyword3
@handle2.bsky.social (Another Name)
Bio text here
Keywords: keyword4, keyword5, keyword6
```
5. **Categorize accounts** - Claude analyzes bio + keywords to categorize by topic
## Quick Start
**Analyze following list with AI/ML stopwords:**
```bash
python scripts/bluesky_analyzer.py --following austegard.com --stopwords ai
```
**Analyze followers:**
```bash
python scripts/bluesky_analyzer.py --followers austegard.com
```
**Analyze specific handles:**
```bash
python scripts/bluesky_analyzer.py --handles "user1.bsky.social,user2.bsky.social,user3.bsky.social"
```
**From file:**
```bash
python scripts/bluesky_analyzer.py --file accounts.txt --stopwords ai
```
**Filter out bot accounts:**
```bash
python scripts/bluesky_analyzer.py --following handle --exclude "bot,spam,promo" --stopwords ai
```
## Parameters
### Input Modes (choose one)
**--handles "h1,h2,h3"**
Comma-separated list of Bluesky handles
**--following HANDLE**
Analyze accounts followed by HANDLE
**--followers HANDLE**
Analyze accounts following HANDLE
**--file PATH**
Read handles from file (one per line)
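The four input modes are mutually exclusive. A minimal sketch of how such a command line could be declared with `argparse` follows. The actual argument handling in `scripts/bluesky_analyzer.py` is not shown in this document, so `build_parser` and `split_handles` are hypothetical names, and stripping a leading `@` is an assumption.

```python
import argparse
from typing import List


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser: exactly one input mode must be given."""
    parser = argparse.ArgumentParser(prog="bluesky_analyzer.py")
    modes = parser.add_mutually_exclusive_group(required=True)
    modes.add_argument("--handles", help='comma-separated handles, e.g. "h1,h2,h3"')
    modes.add_argument("--following", metavar="HANDLE")
    modes.add_argument("--followers", metavar="HANDLE")
    modes.add_argument("--file", metavar="PATH")
    return parser


def split_handles(raw: str) -> List[str]:
    """Split a --handles value, dropping blanks and a leading '@' (assumed)."""
    return [h.strip().lstrip("@") for h in raw.split(",") if h.strip()]
```

With a required mutually exclusive group, `argparse` rejects a command line that names two modes, or none, before any network request is made.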
### Analysis Options
**--accounts N**
Number of accounts to analyze (1-100, default: 100)
**--posts N**
Posts to fetch per account (1-100, default: 20)
**--stopwords [en|ai|ls]**
Stopwords to use for keyword extraction (default: en)
- `en`: English stopwords (574 terms) - general purpose
- `ai`: AI/ML domain stopwords (1357 terms) - tech-focused accounts
- `ls`: Life Sciences stopwords (1293 terms) - biomedical/research accounts
**--exclude "word1,word2"**
Skip accounts with these keywords in bio/posts
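How `--exclude` matches its patterns is not specified here. The sketch below assumes a case-insensitive substring match against the bio and post text; `parse_patterns` and `should_exclude` are hypothetical helpers, not functions from the script.

```python
from typing import Iterable, List


def parse_patterns(raw: str) -> List[str]:
    """Turn "bot,spam,promo" into lowercase patterns, ignoring blanks."""
    return [p.strip().lower() for p in raw.split(",") if p.strip()]


def should_exclude(bio: str, posts: Iterable[str], patterns: List[str]) -> bool:
    """Assumed semantics: case-insensitive substring match in bio or any post."""
    haystack = " ".join([bio, *posts]).lower()
    return any(p in haystack for p in patterns)
```

Under this assumption a short pattern such as `bot` would also match "robotics", so longer or more specific patterns are the safer choice.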
## Output Format
The script writes a simple text format to stdout for Claude to process:
```
@alice.bsky.social (Alice Smith)
AI researcher working on LLM alignment and safety
Keywords: alignment, safety research, interpretability, llm evaluation
@bob.bsky.social (Bob Johnson)
Full-stack developer building web applications
Keywords: react, typescript, node.js, api design, postgresql
@carol.bsky.social (Carol Williams)
Biotech researcher studying CRISPR applications
Keywords: crispr, gene editing, therapeutics, clinical trials
```
Claude then categorizes accounts based on bio + keywords without hardcoded rules.
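If the output needs to be post-processed in code, for example to count keywords or to save results, the text format is easy to parse. The sketch below is one possible reader, not part of the skill. It assumes every account starts with a line beginning with `@` and ends with a `Keywords:` line, so a bio line that itself starts with `@` would be misread as a new account.

```python
from typing import Any, Dict, List, Optional


def parse_accounts(text: str) -> List[Dict[str, Any]]:
    """Parse '@handle (Name)' / bio / 'Keywords: ...' blocks into dicts."""
    accounts: List[Dict[str, Any]] = []
    current: Optional[Dict[str, Any]] = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between accounts
        if line.startswith("@"):
            handle, _, rest = line.partition(" ")
            # Drop only the outer parentheses around the display name.
            if rest.startswith("(") and rest.endswith(")"):
                rest = rest[1:-1]
            current = {"handle": handle[1:], "name": rest, "bio": "", "keywords": []}
            accounts.append(current)
        elif current is None:
            continue  # ignore anything before the first account header
        elif line.startswith("Keywords:"):
            parts = line[len("Keywords:"):].split(",")
            current["keywords"] = [k.strip() for k in parts if k.strip()]
        else:
            # Bios may span several lines; join them with spaces.
            current["bio"] = (current["bio"] + " " + line).strip()
    return accounts
```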
## Common Workflows
### Audit Your Following List
```bash
python scripts/bluesky_analyzer.py --following your-handle.bsky.social --stopwords ai
```
Claude will categorize accounts by topic and identify patterns in who you follow.
### Find Experts in a Topic
```bash
python scripts/bluesky_analyzer.py --following alice.bsky.social --stopwords ai
```
Ask Claude: "Which of these accounts are ML researchers?" or "Who focuses on climate tech?"
### Analyze a Curated List
```bash
cat > accounts.txt << 'EOF'
expert1.bsky.social
expert2.bsky.social
expert3.bsky.social
EOF
python scripts/bluesky_analyzer.py --file accounts.txt --stopwords ls
```
### Filter Out Bot Accounts
```bash
python scripts/bluesky_analyzer.py --following handle --exclude "bot,spam,promo,follow back" --stopwords ai
```
## Technical Details
### Keyword Extraction
Keyword extraction is delegated to the **extracting-keywords** skill, which runs YAKE from its own venv with these settings:
- **Stopwords options** (--stopwords):
  - `en`: English (574 terms) - general purpose
  - `ai`: AI/ML domain (1357 terms) - filters technical noise, ML boilerplate
  - `ls`: Life Sciences (1293 terms) - filters research methodology, clinical terms
- N-grams: 1-3 words
- Deduplication: 0.9 threshold
- Top keywords: 10 per account
- Performance: ~5% overhead with domain stopwords vs English
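To make the n-gram and deduplication settings concrete, here is a standard-library sketch of the idea. It is an illustration only: it uses `difflib.SequenceMatcher` as a stand-in similarity measure, whereas YAKE has its own candidate scoring and similarity function. The names `ngrams` and `dedupe` are hypothetical.

```python
from difflib import SequenceMatcher
from typing import List


def ngrams(tokens: List[str], max_n: int = 3) -> List[str]:
    """All contiguous 1..max_n word sequences, shortest n first."""
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, max_n + 1)
        for i in range(len(tokens) - n + 1)
    ]


def dedupe(ranked: List[str], threshold: float = 0.9, top: int = 10) -> List[str]:
    """Walk candidates best-first; drop any that is more than `threshold`
    similar to a keyword already kept, and stop after `top` keywords."""
    kept: List[str] = []
    for cand in ranked:
        if len(kept) >= top:
            break
        if all(SequenceMatcher(None, cand, k).ratio() <= threshold for k in kept):
            kept.append(cand)
    return kept
```

A threshold of 0.9 is permissive: only near-identical candidates, such as singular and plural forms of the same phrase, are merged, while overlapping phrases like "gene" and "gene editing" both survive.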
### API Rate Limits
Bluesky API limits:
- 3000 requests per 5 minutes
- 5000 points per hour for record writes (a per-account write limit; a read-only analysis should not be affected by it)
The analyzer respects these limits with built-in delays.
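A ceiling of 3000 requests per 5 minutes works out to one request every 0.1 seconds on average. How the analyzer implements its delays is not shown here; the sketch below is one simple way to space requests evenly. `Throttle` is a hypothetical helper, and the clock and sleep functions are injectable only so the behavior can be checked without real waiting.

```python
import time
from typing import Callable


class Throttle:
    """Enforce a minimum interval between calls (hypothetical helper)."""

    def __init__(self, max_requests: int = 3000, window_s: float = 300.0,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self.interval = window_s / max_requests  # 0.1 s for 3000 per 5 min
        self._clock = clock
        self._sleep = sleep
        self._next_ok = 0.0  # earliest time the next request may start

    def wait(self) -> float:
        """Block until the next request is allowed; return seconds slept."""
        now = self._clock()
        delay = max(0.0, self._next_ok - now)
        if delay:
            self._sleep(delay)
        self._next_ok = max(now, self._next_ok) + self.interval
        return delay
```

Calling `wait()` before each API request keeps the average rate at or below the limit. Even spacing is a conservative choice, since it never bursts even when the window would allow it.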
### Categorization Algorithm
**Script's role:**
1. Fetch account data (bio + posts)
2. Extract keywords to compress context
3. Output bio + keywords in simple format
**Claude's role:**
1. Read bio + keywords for each account
2. Intelligently categorize by topic (no hardcoded rules)
3. Group accounts, identify patterns, answer user questions
Because categorization happens in Claude rather than in code, the categories can adapt to each request, which hardcoded keyword matching cannot do.
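The last step of the script's role, turning an account into the compact block Claude reads, can be sketched as follows. `format_account` is a hypothetical function, and collapsing the bio onto a single line is an assumption, not documented behavior of the script.

```python
from typing import Iterable


def format_account(handle: str, display_name: str, bio: str,
                   keywords: Iterable[str]) -> str:
    """Render one account as the three-line block: header, bio, keywords."""
    bio_line = " ".join(bio.split())  # collapse newlines and repeated spaces
    return "\n".join([
        f"@{handle} ({display_name})",
        bio_line,
        "Keywords: " + ", ".join(keywords),
    ])
```

Whatever the exact implementation, the point of this step is compression: a bio plus ten keywords stands in for up to 100 posts per account, which keeps a 100-account run small enough for Claude to read in one pass.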
## Troubleshooting
**"No accounts to analyze"**
- Verify handle format (include domain: handle.bsky.social)
- Check if account exists and has public following/followers
**"Insufficient content for keyword extraction"**
- Account has few posts (<5)
- Posts are very short
- Try increasing `--posts` parameter
**Rate limit errors**
- Reduce `--accounts` parameter
- Add delays between batches
- Check Bluesky API status
**Import errors**
- Verify extracting-keywords skill is available
- Check YAKE venv exists: `/home/claude/yake-venv/bin/python -c "import yake"`
- Verify Python 3.8+: `python3 --version`
## Integration with Other Skills
**Built-in integration:**
- **extracting-keywords**: Automatically delegates keyword extraction to this skill's optimized YAKE venv with domain-specific stopwords
## Example Sessions
**User:** "Can you analyze the accounts I follow on Bluesky and tell me what topics they focus on?"
**Claude:**
```bash
python scripts/bluesky_analyzer.py --following user-handle.bsky.social --stopwords ai
```
Based on the output, I can see you follow:
- **AI/ML researchers** (15 accounts): Focus on LLM safety, alignment, interpretability
- **Software engineers** (20 accounts): Web development, React, TypeScript, DevOps
- **Writers** (8 accounts): Tech journalism, newsletters, long-form content
- **Scientists** (7 accounts): Climate science, biotech, physics
**User:** "Find ML researchers in @alice's network"
**Claude:**
```bash
python scripts/bluesky_analyzer.py --following alice.bsky.social --stopwords ai
```
I found 23 ML researchers in Alice's network:
- 8 working on LLM alignment and safety
- 6 focused on model evaluation and benchmarks
- 5 in ML infrastructure and MLOps
- 4 in computer vision and multimodal models
**User:** "Here's a list of 30 accounts, categorize them"
**Claude:**
```bash
python scripts/bluesky_analyzer.py --file accounts.txt --stopwords ai
```
Categorized into:
- Climate Tech (8 accounts)
- Biotech (6 accounts)
- Fintech (5 accounts)
- AI/ML (7 accounts)
- Other (4 accounts)