lancedb

$ npx mdskill add mkurman/zorai/lancedb

Store and retrieve multimodal data efficiently without a server.

  • Enables agents to index text, images, and audio for fast search.
  • Integrates with pandas DataFrames and Open-Clip embeddings.
  • Executes hybrid queries using columnar storage and vector similarity.
  • Delivers results directly as lists or pandas DataFrames.

SKILL.md

.github/skills/lancedb
---
name: lancedb
description: "LanceDB — serverless vector database for AI. Columnar storage on Lance format, zero-copy access, multimodal search (text + images + audio), and direct DataFrame integration. No separate server."
tags: [lancedb, vector-database, embedded, multimodal, embeddings, python, zorai]
---
## Overview

LanceDB is a developer-friendly, serverless vector database built on the Lance columnar format. It supports multimodal search (text, image, audio embeddings), hybrid search, and efficient streaming ingestion without a separate server process.

## Installation

```bash
uv pip install lancedb
```

## Create and Query

```python
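import lancedb
import numpy as np

# Connect to (or create) a local database directory; no server process is needed
db = lancedb.connect("./my_lancedb")

# Each row is a dict; "vector" holds a 128-dimensional embedding.
# mode="overwrite" lets the example be re-run without a "table exists" error.
table = db.create_table("vectors", [
    {"vector": np.random.rand(128), "text": "hello world"},
    {"vector": np.random.rand(128), "text": "goodbye moon"},
], mode="overwrite")

# Nearest-neighbor search with a random query vector, returned as a list of dicts
results = table.search(np.random.rand(128)).limit(5).to_list()
print([r["text"] for r in results])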
```
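
To make the query above less opaque, here is a minimal pure-Python sketch of what a nearest-neighbor search does conceptually: score every row against the query vector and return the closest rows first. It is an illustration only, not LanceDB's implementation. The helper names `l2_distance` and `brute_force_search`, the `distance` key, and the two-dimensional toy vectors are all assumptions made up for this example. Euclidean (L2) distance is used because, as far as I know, it is LanceDB's default metric.

```python
import math

def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def brute_force_search(rows, query, limit=5):
    """Return up to `limit` rows closest to `query`, nearest first.

    Hypothetical helper: each returned row is a copy of the input row
    with an added "distance" key holding its distance to the query.
    """
    scored = [{**row, "distance": l2_distance(row["vector"], query)} for row in rows]
    scored.sort(key=lambda r: r["distance"])
    return scored[:limit]

# Toy 2-dimensional rows, invented for illustration (real embeddings are much wider)
rows = [
    {"vector": [0.0, 0.0], "text": "hello world"},
    {"vector": [3.0, 4.0], "text": "goodbye moon"},
]

results = brute_force_search(rows, query=[0.0, 1.0], limit=5)
print([r["text"] for r in results])  # → ['hello world', 'goodbye moon']
```

Scanning every row like this is fine for small tables. On larger tables a vector index is normally used to avoid the full scan, at the cost of approximate results.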

## Open-Clip Embeddings

```python
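import lancedb
from lancedb.embeddings import get_registry
from lancedb.pydantic import LanceModel, Vector

db = lancedb.connect("./my_lancedb")

# Look up the OpenCLIP embedding function in LanceDB's embedding registry
# (assumed to need the optional open-clip-torch package installed)
clip = get_registry().get("open-clip").create()

class Images(LanceModel):
    image: str = clip.SourceField()  # image path or URL, embedded automatically on add
    vector: Vector(clip.ndims()) = clip.VectorField()  # filled in by the embedding function

table = db.create_table("images", schema=Images)
table.add([{"image": "photo.jpg"}, {"image": "diagram.png"}])

# The text query is embedded by the same CLIP model, so text can retrieve images
results = table.search("sunset landscape").limit(3).to_pandas()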
```

## References
- [LanceDB docs](https://lancedb.github.io/lancedb/)
- [Lance format](https://lancedb.github.io/lance/)
