pikastream-video-meeting

$npx mdskill add elophanto/EloPhanto/pikastream-video-meeting

Join video meetings as an avatar agent via PikaStreaming.

  • Enables agents to enter Google Meet or Zoom calls automatically.
  • Requires a valid Pika Streaming API key and Python environment.
  • Generates or accepts user-provided avatars before meeting access.
  • Displays the configured avatar during the video conference session.

SKILL.md

.github/skills/pikastream-video-meetingView on GitHub ↗
---
name: pikastream-video-meeting
description: |
  Join a Google Meet or Zoom call as a video meeting agent via PikaStreaming.
  Trigger: user drops a Google Meet or Zoom link, or asks to join a meeting.
metadata:
  openclaw:
    requires:
      env: ["PIKA_DEV_KEY"]
      bins: ["python3"]
    primaryEnv: "PIKA_DEV_KEY"
---

# PikaStream Video Meeting

Script: `SKILL_DIR=skills/pikastream-video-meeting`

## First-Time Setup

Run once when the skill is first loaded:

```bash
pip install -r $SKILL_DIR/requirements.txt
```

### 1. Avatar

Check if `identity/videomeeting-avatar.png` exists and is larger than 1 KB. If it does NOT exist (or is too small):

1. Ask the user: `I need an avatar image for the video meeting (a headshot or portrait). Send me an image, or say "generate" and I'll create one for you.`
2. **Do not proceed until the user responds.** Do not auto-generate.
3. **User sends an image:** save it to `identity/videomeeting-avatar.png`.
4. **User says "generate":** run:
   ```bash
   python $SKILL_DIR/scripts/pikastreaming_videomeeting.py generate-avatar \
     --output identity/videomeeting-avatar.png
   ```
   If the user describes what they want (e.g. "a cartoon cat"), pass `--prompt "<description>, portrait headshot suitable for video calls"`.
   Show the generated image. Ask: `Want to keep this avatar or regenerate?` Wait for reply.
5. **Anything else:** repeat the question from step 1.

The bot must have an avatar before joining a meeting.

### 2. Voice

Check if `life/voice_id.txt` exists and is non-empty.

**If it exists:** read `life/voice_config.json`. If readable, check `cloned_at` — if 6+ days ago, warn the user:
`Your voice clone was created on {date} and may have expired (cloned voices are deleted after 7 days of non-use). Want to re-clone with a new recording, or try the existing one?`
- Re-clone: go to "If it does not exist" below.
- Keep it: use the existing voice ID.

If `life/voice_config.json` is missing or unreadable, use the voice ID from `life/voice_id.txt` as-is.

**If it does not exist** (or user chose to re-clone):

1. Ask the user: `I don't have a voice clone yet. You can: (a) send me a voice recording (10s-5min, clear speech) and I'll clone it, or (b) say "skip" to use a default voice.`
2. **Do not proceed until the user responds.**
3. **User says "skip":** use `English_radiant_girl`.
4. **User sends an audio file:** run:
   ```bash
   python $SKILL_DIR/scripts/pikastreaming_videomeeting.py clone-voice \
     --audio <file> --name <bot-name> --noise-reduction
   ```
   - Exit 0: read `life/voice_id.txt`. Tell user: `Voice cloned. Using {voice_id} for this meeting.`
   - Exit non-zero: tell user cloning failed (include stderr). Ask: `Try again with a different file, or skip and use the default voice?`
5. **Invalid file (not audio):** respond `That doesn't look like a supported audio file. Send an mp3, m4a, wav, ogg, flac, or aac file (10s-5min of clear speech).` Wait for retry or "skip".

---

## Join Flow

### Step 1 — Validate & gather context

**Avatar:** check `identity/videomeeting-avatar.png` exists and is > 1 KB. If not, run First-Time Setup above.

**Voice:** check `life/voice_id.txt` exists. If not, run the Voice section of First-Time Setup above.

**Context:** always gather fresh context — do not reuse a stale file from a previous session.

1. Read your workspace files (MEMORY.md, daily logs, identity files, etc.).
2. If no workspace data is available, ask the user: `What name should the bot use in this meeting?` Use their answer for the `Name:` field. Fill in any other sections you know from the conversation.
3. Synthesize a concise reference card to `/tmp/meeting_system_prompt.txt`. Use `{name}` as the bot's display name (also used as `--bot-name`). If data is thin (e.g. only a name), keep it short — don't pad with filler.

```
Synthesize the raw data below into a concise reference card for {name} to use during a voice/video call. Use third-person ("{name}") throughout. Prioritize CONCRETE DETAILS.

PRIORITY ORDER:
1. SPECIFIC FACTS: names, places, dates, numbers, events
2. RECENT ACTIVITY: what happened today/this week — actions, not vibes
3. RELATIONSHIPS: who matters, specific interactions
4. PERSONALITY: 1-2 sentences MAX

CURATION RULES:
- KEEP: anything with a proper noun, a number, a date, or a concrete action
- DROP: vague descriptions, routine status updates, empty entries
- MERGE: if multiple entries say similar things, pick the most vivid one

OUTPUT FORMAT:

**{name}**: [1 sentence — tone/vibe]

**Known facts** (concrete only, max 10):
- [specific fact with names/dates/numbers]

**Recent activity**:
- [built X, fixed Y, went to Z]

**Right now**: [1 line — current activity]

**People**: [name — 1 specific detail each]

RULES:
- Concrete > abstract
- Actions > descriptions
- Do not invent facts
- If data is thin, keep it short
```

### Step 2 — Join

```bash
python $SKILL_DIR/scripts/pikastreaming_videomeeting.py join \
  --meet-url <url> --bot-name <name> \
  --image identity/videomeeting-avatar.png \
  --system-prompt-file /tmp/meeting_system_prompt.txt \
  --voice-id <id> [--meeting-password <pw>]
```

Tell the user you're in. Say `leave` to leave. Don't mention session IDs.

**Exit codes:** 0 = joined. 6 = insufficient credits (stdout JSON contains a `checkout_url` — show it to the user).

## Leave

```bash
python $SKILL_DIR/scripts/pikastreaming_videomeeting.py leave \
  --session-id <id from join output>
```

## Verify

- A real session/room was created via the API and its ID/URL is captured, not mocked
- At least one participant join/leave event was observed in logs or webhook payloads
- Audio and video tracks were confirmed flowing (track stats, bytes-sent > 0), not assumed from connection state
- Recording / transcript / chat features used were verified to land in storage with a retrievable URL or ID
- Network failure was simulated or considered: reconnection / fallback behavior is documented

More from elophanto/EloPhanto

SkillDescription
12-principles-of-animationAudit animation code against Disney's 12 principles adapted for web. Use when reviewing motion, implementing animations, or checking animation quality. Outputs file:line findings.
accessibility-auditingAudit interfaces against WCAG 2.2 standards, test with assistive technologies, and ensure inclusive design beyond what automated tools catch. Adapted from msitarzewski/agency-agents.
agency-phase-0-discoveryIntelligence and discovery phase — validate opportunity before committing resources. Adapted from msitarzewski/agency-agents.
agency-phase-1-strategyStrategy and architecture phase — define what to build, how to structure it, and what success looks like. Adapted from msitarzewski/agency-agents.
agency-phase-2-foundationFoundation and scaffolding phase — build technical and operational foundation before feature development. Adapted from msitarzewski/agency-agents.
agency-phase-3-buildBuild and iterate phase — implement all features through continuous Dev-QA loops with orchestrated multi-agent sprints. Adapted from msitarzewski/agency-agents.
agency-phase-4-hardeningQuality and hardening phase — the final quality gauntlet proving production readiness with evidence. Adapted from msitarzewski/agency-agents.
agency-phase-5-launchLaunch and growth phase — coordinate go-to-market execution across all channels for maximum impact. Adapted from msitarzewski/agency-agents.
agency-phase-6-operateOperate and evolve phase — sustained operations with continuous improvement for live products. Adapted from msitarzewski/agency-agents.
agency-strategyNEXUS multi-agent orchestration strategy — the complete operational playbook for coordinating specialized AI agents across project phases. Adapted from msitarzewski/agency-agents.