signal-scanner
$ npx mdskill add gooseworks-ai/goose-skills/signal-scanner
Scan TAM companies for buying signals to trigger intent-driven outreach.
- Converts static lists into active leads by detecting headcount growth, tech stack shifts, and funding rounds.
- Integrates Supabase for storage, Apify for web scraping, and Anthropic for content analysis.
- Detects signals from free data diffs and paid Apify scraping, then deduplicates and scores them.
- Writes verified signals to the signals table for downstream activation workflows.
---
name: signal-scanner
description: >
  Detect buying signals across TAM companies and watchlist personas.
  Three-phase architecture: (1) free diff-based signals from existing data
  (headcount growth, tech stack changes, funding rounds), (2) Apify-powered
  signals (job postings, LinkedIn content analysis, profile changes), and
  (3) post-processing with dedup, scoring, and lead status updates.
  Writes signals to Supabase signals table for downstream activation.
tags: [lead-generation]
---
# Signal Scanner
Scheduled scanner that detects buying signals on TAM companies and watchlist personas, writes them to the `signals` table, and sets up downstream activation.
## When to Use
- After TAM Builder has populated companies and personas
- As a recurring scan (daily/weekly) to detect timing-based outreach triggers
- When you need to move from static lists to intent-driven outreach
## Prerequisites
- `SUPABASE_URL` + `SUPABASE_SERVICE_ROLE_KEY` in `.env`
- `APIFY_TOKEN` in `.env` (for Phase 2 signals)
- `ANTHROPIC_API_KEY` in `.env` (optional, for LLM content analysis)
- TAM companies populated via `tam-builder`
- Watchlist personas created for Tier 1-2 companies
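A pre-flight check for the variables listed above might look like the following minimal sketch. The variable names come from the list; the grouping into required, Phase 2, and optional, and the `check_env` helper itself, are assumptions for illustration. It also assumes `.env` has already been loaded into the process environment.

```python
import os

# Names are taken from the Prerequisites list; the grouping is an assumption.
REQUIRED = ("SUPABASE_URL", "SUPABASE_SERVICE_ROLE_KEY")
PHASE2 = ("APIFY_TOKEN",)            # only needed for Apify-powered signals
OPTIONAL = ("ANTHROPIC_API_KEY",)    # only needed for LLM content analysis


def check_env(env=None):
    """Return (missing_required, missing_phase2, missing_optional)."""
    env = os.environ if env is None else env

    def missing(names):
        # Treat unset and empty values the same way.
        return [name for name in names if not env.get(name)]

    return missing(REQUIRED), missing(PHASE2), missing(OPTIONAL)
```

Calling `check_env()` before a scan lets you stop early when a Supabase variable is missing, and fall back to free signals only when `APIFY_TOKEN` is absent.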
## Signal Types
| Priority | Signal | Level | Source | Cost |
|----------|--------|-------|--------|------|
| P0 | Headcount growth (>10% in 90d) | Company | Data diffs | Free |
| P0 | Tech stack changes | Company | Data diffs | Free |
| P0 | Funding round | Company | Data diffs | Free |
| P0 | Job posting for relevant roles | Company | Apify linkedin-job-search | ~$0.001/job |
| P1 | Leadership job change | Person | Apify linkedin-profile-scraper | ~$3/1k |
| P1 | LinkedIn content analysis | Person | Apify linkedin-profile-posts + LLM | ~$2/1k + LLM |
| P1 | LinkedIn profile updates | Person | Apify linkedin-profile-scraper | ~$3/1k |
| P2 | New C-suite hire | Company | Derived from person scans | Free |
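To illustrate what a free, diff-based signal could look like, here is a rough sketch of the headcount rule from the first table row. The function name, the `"headcount_growth"` type string, and the shape of `signal_data` are hypothetical. The sketch also assumes the previous headcount comes from a snapshot taken about 90 days earlier; the actual script may compute the window differently.

```python
def headcount_growth_signal(previous, current, threshold=0.10):
    """Return a signal dict if headcount grew by more than `threshold`, else None.

    `previous` is assumed to be the headcount stored in the last snapshot
    (roughly 90 days old); `current` is the freshly observed headcount.
    """
    if not previous or previous <= 0 or current is None:
        return None  # no usable baseline, so no diff can be computed

    growth = (current - previous) / previous
    if growth <= threshold:
        return None  # the table asks for strictly more than 10%

    return {
        "signal_type": "headcount_growth",  # hypothetical type name
        "signal_level": "company",
        "signal_data": {
            "previous": previous,
            "current": current,
            "growth_pct": round(growth * 100, 1),
        },
    }
```

For example, a company going from 100 to 115 employees would produce a signal, while 100 to 110 would not, because the rule requires growth above the threshold rather than equal to it.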
## Config Format
See `configs/example.json` for full schema. Key sections:
- `client_name` — which client's TAM to scan
- `signals.*` — enable/disable each signal type with thresholds
- `scan_scope` — filter by tier, status, lead_status
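As a rough picture of how those three sections fit together, the sketch below builds a config as a Python dict and checks that the top-level sections are present. Only `client_name`, `signals`, `scan_scope`, and the `enabled` switch are mentioned in this document; every other nested key and value is a hypothetical placeholder, so treat `configs/example.json` as the source of truth.

```python
import json

# Top-level names come from this document. Nested keys other than "enabled"
# are hypothetical placeholders, not the real schema.
example_config = {
    "client_name": "my-client",
    "signals": {
        "headcount_growth": {"enabled": True, "threshold_pct": 10},
        "job_postings": {"enabled": False},
    },
    "scan_scope": {"tiers": [1, 2], "lead_status": ["monitoring"]},
}

REQUIRED_SECTIONS = ("client_name", "signals", "scan_scope")


def missing_sections(config):
    """List the top-level sections that a config dict does not define."""
    return [key for key in REQUIRED_SECTIONS if key not in config]


def enabled_signals(config):
    """Return the names of signal types whose `enabled` flag is true."""
    return sorted(
        name for name, opts in config.get("signals", {}).items()
        if opts.get("enabled")
    )


if __name__ == "__main__":
    print(json.dumps(example_config, indent=2))
```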
## Database Write Policy
**CRITICAL: Never write signals or update lead statuses without explicit user approval.**
The signal scanner writes to multiple tables: `signals` (insert), `enrichment_log` (insert), `companies` (patch snapshots), and `people` (patch lead_status). These writes affect downstream outreach decisions — bad signals lead to bad outreach timing.
**Required flow:**
1. **Always run `--dry-run` first** to detect signals without writing to the database
2. Present the dry-run results to the user: signal count, types, top signals, affected companies/people
3. **Get explicit user approval** before running without `--dry-run`
4. Only then run the actual scan that writes to the database
**Why this matters:**
- Signals drive outreach timing — incorrect signals trigger premature outreach
- `lead_status` changes from `monitoring` to `signal_detected` are hard to undo across many records
- Snapshot updates affect future signal diffs — bad snapshots cascade into future scans
- Enrichment log entries track Apify credit spend
**The agent must NEVER pass `--yes` on a first run.** The `--yes` flag is only for pre-approved scheduled scans where the user has already validated the signal detection logic.
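The required flow can be sketched as a small wrapper that runs the dry run, asks for approval, and only then runs the writing scan. This is an illustrative sketch, not part of the skill: `build_command`, `approved`, and `run_with_approval` are hypothetical helpers, and the script path is the one shown in the Usage section. The wrapper never passes `--yes`.

```python
import subprocess
import sys

# Path as shown in the Usage section of this document.
SCRIPT = "skills/capabilities/signal-scanner/scripts/signal_scanner.py"


def build_command(config_path, dry_run):
    """Build the scanner command line; `--yes` is deliberately never added."""
    cmd = [sys.executable, SCRIPT, "--config", config_path]
    if dry_run:
        cmd.append("--dry-run")
    return cmd


def approved(answer):
    """Only an explicit yes counts as approval."""
    return answer.strip().lower() in {"y", "yes"}


def run_with_approval(config_path, ask=input, run=subprocess.run):
    """Dry run first, then write to the database only after approval."""
    run(build_command(config_path, dry_run=True), check=True)
    if not approved(ask("Write these signals to the database? [y/N] ")):
        print("Aborted: nothing was written.")
        return False
    run(build_command(config_path, dry_run=False), check=True)
    return True
```

The `ask` and `run` parameters exist so the flow can be exercised without touching a real database. In an agent setting, `ask` would be replaced by whatever mechanism presents the dry-run results to the user.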
## Usage
```bash
# Dry run first (ALWAYS DO THIS) — detect signals without writing to DB
python skills/capabilities/signal-scanner/scripts/signal_scanner.py \
  --config skills/capabilities/signal-scanner/configs/my-client.json --dry-run
# Full scan (only after user reviews dry-run results and approves)
python skills/capabilities/signal-scanner/scripts/signal_scanner.py \
  --config skills/capabilities/signal-scanner/configs/my-client.json
# Test mode (5 companies max)
python skills/capabilities/signal-scanner/scripts/signal_scanner.py \
  --config configs/example.json --test --dry-run
# Free signals only (skip Apify)
# Set all Apify signals to enabled: false in config
```
### Flags
| Flag | Effect |
|------|--------|
| `--config PATH` | Path to config JSON (required) |
| `--test` | Limit to 5 companies, 3 people |
| `--yes` | Auto-confirm Apify cost prompts. **Only use for pre-approved scheduled scans.** |
| `--dry-run` | Detect signals but don't write to DB. **Always run this first.** |
| `--max-runs N` | Override Apify run limit (default 50) |
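For readers who want to see how these flags map onto parsed values, the table could be declared with `argparse` roughly as follows. This mirrors the table only; the parser inside `signal_scanner.py` may be written differently, and `build_parser` is a hypothetical name.

```python
import argparse


def build_parser():
    """Declare the flags from the table above (a sketch, not the real parser)."""
    parser = argparse.ArgumentParser(prog="signal_scanner.py")
    parser.add_argument("--config", required=True, metavar="PATH",
                        help="Path to config JSON")
    parser.add_argument("--test", action="store_true",
                        help="Limit to 5 companies, 3 people")
    parser.add_argument("--yes", action="store_true",
                        help="Auto-confirm Apify cost prompts")
    parser.add_argument("--dry-run", action="store_true",
                        help="Detect signals but do not write to the DB")
    parser.add_argument("--max-runs", type=int, default=50, metavar="N",
                        help="Override the Apify run limit")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(vars(args))
```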
## Output
### Signals table writes
Each signal includes: `client_name`, `company_id`, `person_id`, `signal_level` (company or person), `signal_type`, `signal_source`, `strength`, `signal_data` (JSON), `activation_score`, `detected_at`, `acted_on`, `run_id`.
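One way to picture a row is as a small dataclass with those twelve fields. The field names come from the list above; the Python types, the defaults, and the validation are assumptions, since the actual column types live in the Supabase schema.

```python
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Signal:
    """Illustrative shape of one `signals` row; types are assumed."""
    client_name: str
    signal_level: str          # "company" or "person"
    signal_type: str
    signal_source: str
    strength: float
    activation_score: float
    run_id: str
    company_id: Optional[str] = None
    person_id: Optional[str] = None
    signal_data: dict = field(default_factory=dict)   # stored as JSON
    detected_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )
    acted_on: bool = False

    def __post_init__(self):
        if self.signal_level not in ("company", "person"):
            raise ValueError(f"unknown signal_level: {self.signal_level!r}")
```

`asdict(signal)` then yields a plain dict with the same keys, which is a convenient form for a JSON payload.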
### Other database writes
- Person `lead_status` updated to `signal_detected` when `activation_score` >= threshold
- Company `metadata._signal_snapshot` updated for next diff cycle
- Person `raw_data._signal_snapshot` updated for next diff cycle
- `enrichment_log` entries with `tool='apify'`, `action='search'` or `'enrich'`, plus `credits_used`
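The `lead_status` rule in the first bullet can be sketched as a pure selection step, run before any database write. The helper name and input shapes are hypothetical, the threshold is whatever the config defines, and restricting promotion to people currently in `monitoring` is an assumption based on the status transition described in the write policy.

```python
def people_to_promote(signals, people_status, threshold):
    """Return person_ids to move from `monitoring` to `signal_detected`.

    signals:       iterable of dicts with `person_id` and `activation_score`
    people_status: dict mapping person_id to the current lead_status
    threshold:     minimum activation_score (value comes from the config)
    """
    best = {}
    for signal in signals:
        person_id = signal.get("person_id")
        if person_id is None:
            continue  # company-level signal, no person attached
        score = signal["activation_score"]
        best[person_id] = max(best.get(person_id, score), score)

    return sorted(
        person_id for person_id, score in best.items()
        if score >= threshold and people_status.get(person_id) == "monitoring"
    )
```

Keeping this step free of side effects makes it easy to show the affected people in a dry-run report before anything is patched.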
### Console output
- Summary stats printed to stdout
## Activation Score
```
activation_score = strength * recency_multiplier * account_fit
Recency: <24h = 1.5, 1-3d = 1.2, 3-7d = 1.0, 1-2w = 0.8, 2-4w = 0.5
Account: Tier 1 = 1.3, Tier 2 = 1.0, Tier 3 = 0.7
```
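The formula above translates directly into a few lines of Python. This is a sketch under stated assumptions: the source does not say which band a boundary value falls in (for example exactly 3 days) or what happens after 4 weeks, so the half-open intervals and the `0.0` fallback below are choices made for illustration. The scale of `strength` is also not specified here.

```python
from datetime import timedelta

# (upper bound of the band, multiplier), taken from the Recency line above.
RECENCY_BANDS = [
    (timedelta(hours=24), 1.5),
    (timedelta(days=3), 1.2),
    (timedelta(days=7), 1.0),
    (timedelta(days=14), 0.8),
    (timedelta(days=28), 0.5),
]

# Tier multipliers, taken from the Account line above.
ACCOUNT_FIT = {1: 1.3, 2: 1.0, 3: 0.7}


def recency_multiplier(age):
    """Map the age of a signal to its recency multiplier."""
    for upper_bound, multiplier in RECENCY_BANDS:
        if age < upper_bound:  # assumption: each band excludes its upper bound
            return multiplier
    return 0.0  # assumption: the source defines no band beyond 4 weeks


def activation_score(strength, age, tier):
    """activation_score = strength * recency_multiplier * account_fit"""
    return strength * recency_multiplier(age) * ACCOUNT_FIT[tier]
```

As a worked case, a signal with strength 0.8 detected two hours ago on a Tier 1 account scores 0.8 × 1.5 × 1.3 = 1.56, while the same signal at ten days old on a Tier 3 account scores 0.8 × 0.8 × 0.7 = 0.448.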
## Connects To
- **Upstream:** `tam-builder` (provides companies + people)
- **Downstream:** `cold-email-outreach` (acts on signals)
## File Structure
```
signal-scanner/
├── SKILL.md
├── configs/
│   └── example.json
└── scripts/
    └── signal_scanner.py
```