replicate
$
npx mdskill add vm0-ai/vm0-skills/replicateRun open-source ML models via Replicate's HTTP API
- Generates images or text from hosted open-source models
- Depends on Replicate API and requires a connector setup
- Executes async jobs by submitting version IDs and inputs
- Delivers results through output URLs from completed predictions
SKILL.md
.github/skills/replicateView on GitHub ↗
---
name: replicate
description: Replicate API for running open-source ML models in the cloud. Use when user mentions "Replicate", "run a model on Replicate", "AI image generation", "SDXL", "FLUX", "Llama", or "open-source ML inference".
---
# Replicate
Replicate lets you run open-source machine learning models via a simple HTTP API. Submit a prediction, poll until it completes, and retrieve the output URLs.
> Official docs: `https://replicate.com/docs/reference/http`
---
## When to Use
Use this skill when you need to:
- Generate images using SDXL, FLUX Schnell, or other diffusion models
- Run text generation with Llama or other open-source LLMs
- Execute any model hosted on replicate.com
- Poll the status of an async prediction job
---
## Prerequisites
Connect the **Replicate** connector at [app.vm0.ai/connectors](https://app.vm0.ai/connectors).
> **Troubleshooting:** If requests fail, run `zero doctor check-connector --env-name REPLICATE_TOKEN` or `zero doctor check-connector --url https://api.replicate.com/v1/models --method GET`
---
## How to Use
All predictions are asynchronous: submit a job, then poll until `status` is `succeeded` or `failed`.
### 1. Run a Model by Version ID
Write to `/tmp/replicate_prediction.json`:
```json
{
"version": "<model-version-id>",
"input": {
"prompt": "A photorealistic cat sitting on a chair"
}
}
```
```bash
curl -s -X POST "https://api.replicate.com/v1/predictions" --header "Authorization: Bearer $REPLICATE_TOKEN" --header "Content-Type: application/json" -d @/tmp/replicate_prediction.json | jq '{id, status, urls}'
```
The response includes a prediction `id` and a `urls.get` URL for polling.
### 2. Run the Latest Version of a Model
Replace `<owner>` and `<model-name>` with the model's owner and name (e.g. `black-forest-labs` / `flux-schnell`).
Write to `/tmp/replicate_prediction.json`:
```json
{
"input": {
"prompt": "A photorealistic cat sitting on a chair"
}
}
```
```bash
curl -s -X POST "https://api.replicate.com/v1/models/<owner>/<model-name>/predictions" --header "Authorization: Bearer $REPLICATE_TOKEN" --header "Content-Type: application/json" -d @/tmp/replicate_prediction.json | jq '{id, status, urls}'
```
### 3. Poll Prediction Status
Replace `<prediction-id>` with the `id` from the create response.
```bash
curl -s "https://api.replicate.com/v1/predictions/<prediction-id>" --header "Authorization: Bearer $REPLICATE_TOKEN" | jq '{id, status, output, error}'
```
Keep polling every 2–5 seconds until `status` is `succeeded` or `failed`.
| `status` | Meaning |
|-------------|----------------------------------------|
| `starting` | Model is cold-starting |
| `processing`| Model is running |
| `succeeded` | Output is ready in the `output` field |
| `failed` | Check the `error` field for details |
| `canceled` | Prediction was canceled |
### 4. Generate an Image with FLUX Schnell
Write to `/tmp/replicate_flux.json`:
```json
{
"input": {
"prompt": "A serene mountain lake at sunrise, photorealistic",
"num_outputs": 1
}
}
```
```bash
curl -s -X POST "https://api.replicate.com/v1/models/black-forest-labs/flux-schnell/predictions" --header "Authorization: Bearer $REPLICATE_TOKEN" --header "Content-Type: application/json" -d @/tmp/replicate_flux.json | jq '{id, status, urls}'
```
### 5. Generate an Image with Stability AI SDXL
Write to `/tmp/replicate_sdxl.json`:
```json
{
"input": {
"prompt": "A cyberpunk cityscape at night, neon lights, 4k",
"negative_prompt": "blurry, low quality",
"num_outputs": 1,
"width": 1024,
"height": 1024
}
}
```
```bash
curl -s -X POST "https://api.replicate.com/v1/models/stability-ai/sdxl/predictions" --header "Authorization: Bearer $REPLICATE_TOKEN" --header "Content-Type: application/json" -d @/tmp/replicate_sdxl.json | jq '{id, status, urls}'
```
### 6. Run a Text Generation Model (Llama 3 70B)
Write to `/tmp/replicate_llama.json`:
```json
{
"input": {
"prompt": "Explain quantum entanglement in simple terms.",
"max_tokens": 512
}
}
```
```bash
curl -s -X POST "https://api.replicate.com/v1/models/meta/llama-3-70b-instruct/predictions" --header "Authorization: Bearer $REPLICATE_TOKEN" --header "Content-Type: application/json" -d @/tmp/replicate_llama.json | jq '{id, status, urls}'
```
Text generation responses stream tokens as an array. Poll until `succeeded`, then read `output` (an array of strings — join them for the full response).
### 7. List Recent Predictions
```bash
curl -s "https://api.replicate.com/v1/predictions" --header "Authorization: Bearer $REPLICATE_TOKEN" | jq '.results[] | {id, status, created_at, urls}'
```
### 8. Search for Models
```bash
curl -s "https://api.replicate.com/v1/models" --header "Authorization: Bearer $REPLICATE_TOKEN" | jq '.results[] | {url, description}'
```
### 9. Get Model Details
Replace `<owner>/<model-name>` with the model identifier.
```bash
curl -s "https://api.replicate.com/v1/models/<owner>/<model-name>" --header "Authorization: Bearer $REPLICATE_TOKEN" | jq '{url, description, latest_version}'
```
### 10. Run via a Deployment
Replace `<deployment-owner>` and `<deployment-name>` with the deployment's owner and name.
Write to `/tmp/replicate_deploy.json`:
```json
{
"input": {
"prompt": "A futuristic robot in a garden"
}
}
```
```bash
curl -s -X POST "https://api.replicate.com/v1/deployments/<deployment-owner>/<deployment-name>/predictions" --header "Authorization: Bearer $REPLICATE_TOKEN" --header "Content-Type: application/json" -d @/tmp/replicate_deploy.json | jq '{id, status, urls}'
```
---
## Guidelines
1. **Always poll after submit**: Predictions are async. Never assume instant completion — always poll `GET /v1/predictions/<id>` until `status` is `succeeded` or `failed`.
2. **Poll interval**: 2–5 seconds is reasonable. Cold-starting models may take 30–60 seconds on the first prediction.
3. **Image output**: `output` will be an array of URLs (e.g. `["https://replicate.delivery/..."]`). Download with `curl -L`.
4. **Text output**: `output` is an array of token strings. Join them: `| jq '.output | join("")'`.
5. **Popular models**:
- Image: `black-forest-labs/flux-schnell`, `stability-ai/sdxl`
- Text: `meta/llama-3-70b-instruct`
6. **Version vs. latest**: Use `/v1/models/<owner>/<name>/predictions` to always run the latest version. Use `/v1/predictions` with a `version` ID to pin a specific version.
More from vm0-ai/vm0-skills
- account-reconciliationPerform account reconciliations comparing general ledger balances against subledgers, bank statements, or external records. Use for bank reconciliation, GL-to-subledger reconciliation, intercompany reconciliation, balance sheet reconciliation, reconciling item analysis, outstanding item aging, or clearing open items.
- agentphoneBuild AI phone agents with AgentPhone API. Use when the user wants to make phone calls, send/receive SMS, manage phone numbers, create voice agents, set up webhooks, or check usage — anything related to telephony, phone numbers, or voice AI.
- ahrefsAhrefs SEO API for backlink and keyword analysis. Use when user mentions
- amplitudeAmplitude product analytics API. Use when user mentions "Amplitude",
- analysis-qaQuality-check a data analysis before sharing — verify joins, aggregations, denominators, time ranges, and metric definitions. Detect pitfalls like survivorship bias, average-of-averages, join explosion, timezone mismatches, incomplete periods, and selection bias. Includes documentation templates for reproducible analyses.
- anthropic-managed-agentsAnthropic Managed Agents API for programmatically creating, running, and streaming AI agents on Anthropic's cloud infrastructure. Use when the user mentions "Managed Agents", "Anthropic agent sessions", or needs to create/run/stream an Anthropic agent with tool use (bash, git, web), attach GitHub repositories, or inject secrets via Vault. Do NOT use for standard Claude Messages API — use the Claude API skill instead.
- apifyApify web scraping platform. Use when user mentions "scrape website",
- asanaAsana API for tasks and projects. Use when user mentions "Asana", "asana.com",
- atlassianAtlassian API for Confluence and Jira. Use when user mentions "Confluence
- attioAttio REST API for AI-native CRM operations — manage companies, people, deals, and custom objects, plus notes, tasks, lists, and comments. Use when the user mentions "Attio", "CRM record", "create company", "add person", "list entry", "CRM note", or "CRM task".