databricks-apps-python
$
npx mdskill add databricks/databricks-agent-skills/databricks-apps-pythonBuild Databricks applications. For full examples and recipes, see the **[Databricks Apps Cookbook](https://apps-cookbook.dev/)**.
SKILL.md
.github/skills/databricks-apps-pythonView on GitHub ↗
---
name: databricks-apps-python
description: "Builds Databricks applications. Prefers AppKit (TypeScript + React SDK) for new apps; falls back to Python frameworks (Dash, Streamlit, Gradio, Flask, FastAPI, Reflex) when Python is required. Handles OAuth authorization, app resources, SQL warehouse and Lakebase connectivity, model serving, foundation model APIs, and deployment. Use when building web apps, dashboards, ML demos, or REST APIs for Databricks, or when the user mentions AppKit, Streamlit, Dash, Gradio, Flask, FastAPI, Reflex, or Databricks app."
---
# Databricks Applications
Build Databricks applications. For full examples and recipes, see the **[Databricks Apps Cookbook](https://apps-cookbook.dev/)**.
---
## AppKit (Preferred for New Apps)
**[AppKit](https://github.com/databricks/appkit)** is the recommended SDK for new Databricks apps. It is a TypeScript + React SDK with a plugin architecture, built-in caching, telemetry, and end-to-end type safety.
### Requirements
- Node.js v22+
- Databricks CLI v0.295.0+
### Scaffold a new app
```bash
databricks apps init
```
This interactive command scaffolds the full project, installs dependencies, and optionally deploys.
### Deploy
```bash
databricks apps deploy
```
### AppKit plugins
| Plugin | Purpose |
|--------|---------|
| **Analytics** | SQL queries against Databricks SQL Warehouses — file-based, typed, cached |
| **Genie** | Conversational AI/BI interface with natural language queries |
| **Files** | Browse/upload Unity Catalog Volumes |
| **Lakebase** | OLTP PostgreSQL via Lakebase with OAuth token management |
### AI-assisted development
```bash
# Install agent skills for AI-powered scaffolding
databricks experimental aitools skills install
# Query AppKit docs inline
npx @databricks/appkit docs "your question here"
```
### AppKit documentation
- **[AppKit Docs](https://databricks.github.io/appkit/docs/)** — getting started, plugins, API reference
- **[AI-assisted development](https://databricks.github.io/appkit/docs/development/ai-assisted-development)** — guidance for code assistants
- **[llms.txt](https://databricks.github.io/appkit/llms.txt)** — machine-readable docs for AI context
---
## Python Apps (alternative)
Use Python when: the team is Python-only, you need Streamlit/Dash/Gradio, or you are extending an existing Python app.
## Critical Rules for Python apps (always follow)
- **MUST** confirm framework choice or use [Python Framework Selection](#python-framework-selection) below
- **MUST** use SDK `Config()` for authentication (never hardcode tokens)
- **MUST** use `app.yaml` `valueFrom` for resources (never hardcode resource IDs)
- **MUST** use `dash-bootstrap-components` for Dash app layout and styling
- **MUST** use `@st.cache_resource` for Streamlit database connections
- **MUST** deploy Flask with Gunicorn, FastAPI with uvicorn (not dev servers)
## Required Steps for Python apps
Copy this checklist and verify each item:
```
- [ ] Framework selected
- [ ] Auth strategy decided: app auth, user auth, or both
- [ ] App resources identified (SQL warehouse, Lakebase, serving endpoint, etc.)
- [ ] Backend data strategy decided (SQL warehouse, Lakebase, or SDK)
- [ ] Deployment method: CLI or DABs
```
---
## Python Framework Selection
| Framework | Best For | app.yaml Command |
|-----------|----------|------------------|
| **Dash** | Production dashboards, BI tools, complex interactivity | `["python", "app.py"]` |
| **Streamlit** | Rapid prototyping, data science apps, internal tools | `["streamlit", "run", "app.py"]` |
| **Gradio** | ML demos, model interfaces, chat UIs | `["python", "app.py"]` |
| **Flask** | Custom REST APIs, lightweight apps, webhooks | `["gunicorn", "app:app", "-w", "4", "-b", "0.0.0.0:8000"]` |
| **FastAPI** | Async APIs, auto-generated OpenAPI docs | `["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]` |
| **Reflex** | Full-stack Python apps without JavaScript | `["reflex", "run", "--env", "prod"]` |
**Default**: Recommend **Streamlit** for prototypes, **Dash** for production dashboards, **FastAPI** for APIs, **Gradio** for ML demos.
---
## Quick Reference
| Concept | Details |
|---------|---------|
| **Runtime** | Python 3.11, Ubuntu 22.04, 2 vCPU, 6 GB RAM |
| **Pre-installed** | Dash 2.18.1, Streamlit 1.38.0, Gradio 4.44.0, Flask 3.0.3, FastAPI 0.115.0 |
| **Auth (app)** | Service principal via `Config()` — auto-injected `DATABRICKS_CLIENT_ID`/`DATABRICKS_CLIENT_SECRET` |
| **Auth (user)** | `x-forwarded-access-token` header — see [references/1-authorization.md](references/1-authorization.md) |
| **Resources** | `valueFrom` in app.yaml — see [references/2-app-resources.md](references/2-app-resources.md) |
| **Cookbook** | https://apps-cookbook.dev/ |
| **Docs** | https://docs.databricks.com/dev-tools/databricks-apps/ |
---
## Detailed Guides
**Authorization**: Use [references/1-authorization.md](references/1-authorization.md) when configuring app or user authorization — covers service principal auth, on-behalf-of user tokens, OAuth scopes, and per-framework code examples. (Keywords: OAuth, service principal, user auth, on-behalf-of, access token, scopes)
**App resources**: Use [references/2-app-resources.md](references/2-app-resources.md) when connecting your app to Databricks resources — covers SQL warehouses, Lakebase, model serving, secrets, volumes, and the `valueFrom` pattern. (Keywords: resources, valueFrom, SQL warehouse, model serving, secrets, volumes, connections)
**Frameworks**: See [references/3-frameworks.md](references/3-frameworks.md) for Databricks-specific patterns per framework — covers Dash, Streamlit, Gradio, Flask, FastAPI, and Reflex with auth integration, deployment commands, and Cookbook links. (Keywords: Dash, Streamlit, Gradio, Flask, FastAPI, Reflex, framework selection)
**Deployment**: Use [references/4-deployment.md](references/4-deployment.md) when deploying your app — covers Databricks CLI, Asset Bundles (DABs), app.yaml configuration, and post-deployment verification. (Keywords: deploy, CLI, DABs, asset bundles, app.yaml, logs)
**Lakebase**: Use [references/5-lakebase.md](references/5-lakebase.md) when using Lakebase (PostgreSQL) as your app's data layer — covers auto-injected env vars, psycopg2/asyncpg patterns, and when to choose Lakebase vs SQL warehouse. (Keywords: Lakebase, PostgreSQL, psycopg2, asyncpg, transactional, PGHOST)
**CLI commands**: Use [references/6-cli-approach.md](references/6-cli-approach.md) for managing app lifecycle via CLI — covers creating, deploying, monitoring, and deleting apps. (Keywords: CLI, create app, deploy app, app logs)
**Foundation Models**: See [examples/llm_config.py](examples/llm_config.py) for calling Databricks foundation model APIs — covers OAuth M2M auth, OpenAI-compatible client wiring, and token caching. (Keywords: foundation model, LLM, OpenAI client, chat completions)
---
## Workflow
1. Determine the task type:
**New app from scratch?** → Use [AppKit](#appkit-preferred-for-new-apps) (`databricks apps init`). Fall back to [Python Framework Selection](#python-framework-selection) only if Python is required.
**Setting up authorization?** → Read [references/1-authorization.md](references/1-authorization.md)
**Connecting to data/resources?** → Read [references/2-app-resources.md](references/2-app-resources.md)
**Using Lakebase (PostgreSQL)?** → Read [references/5-lakebase.md](references/5-lakebase.md)
**Deploying to Databricks?** → Read [references/4-deployment.md](references/4-deployment.md)
**Using CLI for app lifecycle?** → Read [references/6-cli-approach.md](references/6-cli-approach.md)
**Calling foundation model/LLM APIs?** → See [examples/llm_config.py](examples/llm_config.py)
2. Follow the instructions in the relevant guide
3. For full code examples, browse https://apps-cookbook.dev/
---
## Core Architecture
All Python Databricks apps follow this pattern:
```
app-directory/
├── app.py # Main application (or framework-specific name)
├── models.py # Pydantic data models
├── backend.py # Data access layer
├── requirements.txt # Additional Python dependencies
├── app.yaml # Databricks Apps configuration
└── README.md
```
### Backend Toggle Pattern
```python
import os
from databricks.sdk.core import Config
USE_MOCK = os.getenv("USE_MOCK_BACKEND", "true").lower() == "true"
if USE_MOCK:
from backend_mock import MockBackend as Backend
else:
from backend_real import RealBackend as Backend
backend = Backend()
```
### SQL Warehouse Connection (shared across all frameworks)
```python
from databricks.sdk.core import Config
from databricks import sql
cfg = Config() # Auto-detects credentials from environment
conn = sql.connect(
server_hostname=cfg.host,
http_path=f"/sql/1.0/warehouses/{os.getenv('DATABRICKS_WAREHOUSE_ID')}",
credentials_provider=lambda: cfg.authenticate,
)
```
### Pydantic Models
```python
from pydantic import BaseModel, Field
from datetime import datetime
from enum import Enum
class Status(str, Enum):
ACTIVE = "active"
PENDING = "pending"
class EntityOut(BaseModel):
id: str
name: str
status: Status
created_at: datetime
class EntityIn(BaseModel):
name: str = Field(..., min_length=1)
status: Status = Status.PENDING
```
---
## Common Issues
| Issue | Solution |
|-------|----------|
| **Connection exhausted** | Use `@st.cache_resource` (Streamlit) or connection pooling |
| **Auth token not found** | Check `x-forwarded-access-token` header — only available when deployed, not locally |
| **App won't start** | Check `app.yaml` command matches framework; check `databricks apps logs <name>` |
| **Resource not accessible** | Add resource via UI, verify SP has permissions, use `valueFrom` in app.yaml |
| **Import error on deploy** | Add missing packages to `requirements.txt` (pre-installed packages don't need listing) |
| **Lakebase app crashes on start** | `psycopg2`/`asyncpg` are NOT pre-installed — MUST add to `requirements.txt` |
| **Port conflict** | Apps must bind to `DATABRICKS_APP_PORT` env var (defaults to 8000). Never use 8080. Streamlit is auto-configured; for others, read the env var in code or use 8000 in app.yaml command |
| **Streamlit: set_page_config error** | `st.set_page_config()` must be the first Streamlit command |
| **Dash: unstyled layout** | Add `dash-bootstrap-components`; use `dbc.themes.BOOTSTRAP` |
| **Slow queries** | Use Lakebase for transactional/low-latency; SQL warehouse for analytical queries |
---
## Platform Constraints
| Constraint | Details |
|------------|---------|
| **Runtime** | Python 3.11, Ubuntu 22.04 LTS |
| **Compute** | 2 vCPUs, 6 GB memory (default) |
| **Pre-installed frameworks** | Dash, Streamlit, Gradio, Flask, FastAPI, Shiny |
| **Custom packages** | Add to `requirements.txt` in app root |
| **Network** | Apps can reach Databricks APIs; external access depends on workspace config |
| **User auth** | Public Preview — workspace admin must enable before adding scopes |
---
## Official Documentation
- **[AppKit](https://databricks.github.io/appkit/docs/)** — preferred SDK for new apps (TypeScript + React)
- **[Databricks Apps Overview](https://docs.databricks.com/dev-tools/databricks-apps/)** — main docs hub
- **[Apps Cookbook](https://apps-cookbook.dev/)** — ready-to-use code snippets (Streamlit, Dash, Reflex, FastAPI)
- **[Authorization](https://docs.databricks.com/dev-tools/databricks-apps/auth)** — app auth and user auth
- **[Resources](https://docs.databricks.com/dev-tools/databricks-apps/resources)** — SQL warehouse, Lakebase, serving, secrets
- **[app.yaml Reference](https://docs.databricks.com/dev-tools/databricks-apps/app-runtime)** — command and env config
- **[System Environment](https://docs.databricks.com/dev-tools/databricks-apps/system-env)** — pre-installed packages, runtime details
## Related Skills
- **databricks-app-apx** - full-stack apps with FastAPI + React
- **databricks-dabs** - deploying apps via DABs
- **[databricks-python-sdk](../databricks-python-sdk/SKILL.md)** - backend SDK integration
- **databricks-lakebase** - adding persistent PostgreSQL state (autoscaling managed PG with branching)
- **databricks-model-serving** - serving ML models for app integration
More from databricks/databricks-agent-skills
- databricks-agent-bricksCreate Agent Bricks: Knowledge Assistants (KA) for document Q&A and Supervisor Agents for multi-agent orchestration (MAS).
- databricks-ai-functionsUse Databricks built-in AI Functions (ai_classify, ai_extract, ai_summarize, ai_mask, ai_translate, ai_fix_grammar, ai_gen, ai_analyze_sentiment, ai_similarity, ai_parse_document, ai_query, ai_forecast) to add AI capabilities directly to SQL and PySpark pipelines without managing model endpoints. Also covers document parsing and building custom RAG pipelines (parse → chunk → index → query).
- databricks-aibi-dashboardsCreate Databricks AI/BI dashboards. Must use when creating, updating, or deploying Lakeview dashboards as Databricks Dashboard have a unique json structure. CRITICAL: You MUST test ALL SQL queries via CLI BEFORE deploying. Follow guidelines strictly.
- databricks-appsBuild apps on Databricks Apps platform. Use when asked to create dashboards, data apps, analytics tools, or visualizations. Evaluates data access patterns (analytics vs Lakebase synced tables) before scaffolding. Invoke BEFORE starting implementation.
- databricks-coreDatabricks CLI operations: auth, profiles, data exploration, and bundles. Contains up-to-date guidelines for Databricks-related CLI tasks.
- databricks-dabsCreate, configure, validate, deploy, run, and manage DABs — Declarative Automation Bundles (formerly Databricks Asset Bundles) — for Databricks resources including dashboards, jobs, pipelines, alerts, volumes, and apps
- databricks-dbsql>-
- databricks-docsDatabricks documentation reference via llms.txt index. Use when other skills do not cover a topic, looking up unfamiliar Databricks features, or needing authoritative docs on APIs, configurations, or platform capabilities.
- databricks-execution-compute>-
- databricks-icebergApache Iceberg tables on Databricks — Managed Iceberg tables, External Iceberg Reads (fka Uniform), Compatibility Mode, Iceberg REST Catalog (IRC), Iceberg v3, Snowflake interop, PyIceberg, OSS Spark, external engine access and credential vending. Use when creating Iceberg tables, enabling External Iceberg Reads (uniform) on Delta tables (including Streaming Tables and Materialized Views via compatibility mode), configuring external engines to read Databricks tables via Unity Catalog IRC, integrating with Snowflake catalog to read Foreign Iceberg tables