hume

$npx mdskill add vm0-ai/vm0-skills/hume

Analyze emotion from text or video using Hume AI API.

  • Detects sentiment and facial expressions from media inputs.
  • Integrates with Hume AI API for batch inference jobs.
  • Executes analysis based on user prompts mentioning emotion.
  • Returns structured JSON results via API responses.

SKILL.md

.github/skills/humeView on GitHub ↗
---
name: hume
description: Hume AI API for emotion analysis. Use when user mentions "Hume", "emotion
  AI", "sentiment analysis", or voice emotion detection.
---

## Troubleshooting

If requests fail, run `zero doctor check-connector --env-name HUME_TOKEN` or `zero doctor check-connector --url https://api.hume.ai/v0/batch/jobs --method GET`

## Expression Measurement (Batch)

### Start Inference Job from URLs

Write to `/tmp/hume_request.json`:

```json
{
  "urls": ["https://example.com/media-file.mp4"],
  "models": {
    "face": {},
    "prosody": {},
    "language": {}
  },
  "notify": true
}
```

Then run:

```bash
curl -s -X POST "https://api.hume.ai/v0/batch/jobs" --header "Content-Type: application/json" --header "X-Hume-Api-Key: $HUME_TOKEN" -d @/tmp/hume_request.json | jq .
```

### Start Inference Job with Text

Write to `/tmp/hume_request.json`:

```json
{
  "text": ["I am so excited about this!", "This is really disappointing."],
  "models": {
    "language": {}
  }
}
```

Then run:

```bash
curl -s -X POST "https://api.hume.ai/v0/batch/jobs" --header "Content-Type: application/json" --header "X-Hume-Api-Key: $HUME_TOKEN" -d @/tmp/hume_request.json | jq .
```

### List Jobs

```bash
curl -s "https://api.hume.ai/v0/batch/jobs?limit=10" --header "X-Hume-Api-Key: $HUME_TOKEN" | jq .
```

### List Jobs by Status

```bash
curl -s "https://api.hume.ai/v0/batch/jobs?limit=10&status=COMPLETED" --header "X-Hume-Api-Key: $HUME_TOKEN" | jq .
```

### Get Job Details

```bash
curl -s "https://api.hume.ai/v0/batch/jobs/{job-id}" --header "X-Hume-Api-Key: $HUME_TOKEN" | jq .
```

### Get Job Predictions

```bash
curl -s "https://api.hume.ai/v0/batch/jobs/{job-id}/predictions" --header "X-Hume-Api-Key: $HUME_TOKEN" | jq .
```

### Download Job Artifacts

```bash
curl -s "https://api.hume.ai/v0/batch/jobs/{job-id}/artifacts" --header "X-Hume-Api-Key: $HUME_TOKEN" --output /tmp/hume_artifacts.zip
```

## Text-to-Speech

### Synthesize Speech (JSON Response)

Write to `/tmp/hume_request.json`:

```json
{
  "utterances": [
    {
      "text": "Hello, how are you today?",
      "description": "A warm and friendly greeting"
    }
  ],
  "format": {
    "type": "mp3"
  }
}
```

Then run:

```bash
curl -s -X POST "https://api.hume.ai/v0/tts" --header "Content-Type: application/json" --header "X-Hume-Api-Key: $HUME_TOKEN" -d @/tmp/hume_request.json | jq .
```

The response contains base64-encoded audio in `.generations[].audio`.

### Synthesize Speech with Specific Voice

Write to `/tmp/hume_request.json`:

```json
{
  "utterances": [
    {
      "text": "Welcome to our platform!",
      "speed": 1.0
    }
  ],
  "voice": {
    "name": "{voice-name}"
  },
  "format": {
    "type": "mp3"
  }
}
```

Then run:

```bash
curl -s -X POST "https://api.hume.ai/v0/tts" --header "Content-Type: application/json" --header "X-Hume-Api-Key: $HUME_TOKEN" -d @/tmp/hume_request.json | jq .
```

### Synthesize and Save Audio File

Write to `/tmp/hume_request.json`:

```json
{
  "utterances": [
    {
      "text": "This is a test of text-to-speech synthesis."
    }
  ],
  "format": {
    "type": "mp3"
  }
}
```

Then extract and decode the audio:

```bash
curl -s -X POST "https://api.hume.ai/v0/tts" --header "Content-Type: application/json" --header "X-Hume-Api-Key: $HUME_TOKEN" -d @/tmp/hume_request.json | jq -r '.generations[0].audio' | base64 -d > /tmp/hume_output.mp3
```

### Synthesize Multiple Utterances

Write to `/tmp/hume_request.json`:

```json
{
  "utterances": [
    {
      "text": "First sentence.",
      "description": "calm and measured"
    },
    {
      "text": "Second sentence!",
      "description": "excited and energetic",
      "trailing_silence": 0.5
    }
  ],
  "format": {
    "type": "mp3"
  }
}
```

Then run:

```bash
curl -s -X POST "https://api.hume.ai/v0/tts" --header "Content-Type: application/json" --header "X-Hume-Api-Key: $HUME_TOKEN" -d @/tmp/hume_request.json | jq .
```

## EVI Configs

### List Configs

```bash
curl -s "https://api.hume.ai/v0/evi/configs?page_size=20" --header "X-Hume-Api-Key: $HUME_TOKEN" | jq .
```

### Create Config

Write to `/tmp/hume_request.json`:

```json
{
  "name": "My EVI Config"
}
```

Then run:

```bash
curl -s -X POST "https://api.hume.ai/v0/evi/configs" --header "Content-Type: application/json" --header "X-Hume-Api-Key: $HUME_TOKEN" -d @/tmp/hume_request.json | jq .
```

## EVI Prompts

### List Prompts

```bash
curl -s "https://api.hume.ai/v0/evi/prompts?page_size=20" --header "X-Hume-Api-Key: $HUME_TOKEN" | jq .
```

### Create Prompt

Write to `/tmp/hume_request.json`:

```json
{
  "name": "Customer Support Agent",
  "text": "You are a friendly and empathetic customer support agent. Listen carefully and help resolve issues."
}
```

Then run:

```bash
curl -s -X POST "https://api.hume.ai/v0/evi/prompts" --header "Content-Type: application/json" --header "X-Hume-Api-Key: $HUME_TOKEN" -d @/tmp/hume_request.json | jq .
```

## EVI Chats

### List Chat Events

```bash
curl -s "https://api.hume.ai/v0/evi/chats/{chat-id}?page_size=50&ascending_order=true" --header "X-Hume-Api-Key: $HUME_TOKEN" | jq .
```

## Available Models for Expression Measurement

| Model | Description |
|-------|-------------|
| `face` | Facial expression analysis from images/video |
| `prosody` | Vocal expression analysis from audio |
| `language` | Emotion analysis from text |
| `ner` | Named entity recognition from text |
| `burst` | Non-speech vocal sounds (laughter, sighs) |
| `facemesh` | Detailed facial landmark detection |

## TTS Utterance Parameters

| Parameter | Type | Description |
|-----------|------|-------------|
| `text` | string | Content to convert to speech (required) |
| `description` | string | Acting directions or voice style prompt |
| `speed` | number | Speech rate multiplier (default: 1.0) |
| `trailing_silence` | number | Silence after utterance in seconds (default: 0) |

## Response Codes

| Status | Description |
|--------|-------------|
| `200` | Success |
| `201` | Resource created |
| `400` | Invalid request parameters |
| `401` | Missing or invalid API key |
| `404` | Resource not found |
| `429` | Rate limit exceeded |
| `5xx` | Server error |

## Guidelines

1. **Authentication**: Use `X-Hume-Api-Key` header (not Bearer token) for all requests
2. **Batch Jobs**: Jobs are asynchronous; poll the job details endpoint until status is `COMPLETED`
3. **Models**: Specify only the models you need in batch requests to reduce processing time
4. **TTS Description**: Use the `description` field to control emotional tone and style of generated speech
5. **Pagination**: EVI endpoints use zero-based page numbering with configurable page size (1-100)
6. **API Key Types**: Organization API Key for TTS and EVI; Personal API Key for Expression Measurement

## API Reference

- Documentation: https://dev.hume.ai/reference
- Portal: https://app.hume.ai
- API Keys: https://app.hume.ai/keys

More from vm0-ai/vm0-skills

SkillDescription
account-reconciliationPerform account reconciliations comparing general ledger balances against subledgers, bank statements, or external records. Use for bank reconciliation, GL-to-subledger reconciliation, intercompany reconciliation, balance sheet reconciliation, reconciling item analysis, outstanding item aging, or clearing open items.
agentphoneBuild AI phone agents with AgentPhone API. Use when the user wants to make phone calls, send/receive SMS, manage phone numbers, create voice agents, set up webhooks, or check usage — anything related to telephony, phone numbers, or voice AI.
ahrefsAhrefs SEO API for backlink and keyword analysis. Use when user mentions
amplitudeAmplitude product analytics API. Use when user mentions "Amplitude",
analysis-qaQuality-check a data analysis before sharing — verify joins, aggregations, denominators, time ranges, and metric definitions. Detect pitfalls like survivorship bias, average-of-averages, join explosion, timezone mismatches, incomplete periods, and selection bias. Includes documentation templates for reproducible analyses.
anthropic-managed-agentsAnthropic Managed Agents API for programmatically creating, running, and streaming AI agents on Anthropic's cloud infrastructure. Use when the user mentions "Managed Agents", "Anthropic agent sessions", or needs to create/run/stream an Anthropic agent with tool use (bash, git, web), attach GitHub repositories, or inject secrets via Vault. Do NOT use for standard Claude Messages API — use the Claude API skill instead.
apifyApify web scraping platform. Use when user mentions "scrape website",
asanaAsana API for tasks and projects. Use when user mentions "Asana", "asana.com",
atlassianAtlassian API for Confluence and Jira. Use when user mentions "Confluence
attioAttio REST API for AI-native CRM operations — manage companies, people, deals, and custom objects, plus notes, tasks, lists, and comments. Use when the user mentions "Attio", "CRM record", "create company", "add person", "list entry", "CRM note", or "CRM task".