postgres-migrations

$npx mdskill add pr-pm/prpm/postgres-migrations

Fix PostgreSQL migration errors and optimize schema changes.

  • Resolves subquery grouping issues and immutable function violations.
  • Depends on PostgreSQL SQL dialect and migration frameworks.
  • Analyzes error logs to suggest CTE restructuring or function fixes.
  • Outputs corrected SQL queries with idempotent migration patterns.

SKILL.md

.github/skills/postgres-migrationsView on GitHub ↗
---
name: postgres-migrations
description: Comprehensive guide to PostgreSQL migrations - common errors, generated columns, full-text search, indexes, idempotent migrations, and best practices for database schema changes
---

# PostgreSQL Migrations Skill

## Common PostgreSQL Migration Errors and Solutions

### 1. "Subquery uses ungrouped column from outer query"

**Cause**: Subquery in SELECT/CASE references columns from outer query that aren't in GROUP BY.

**Solution**: Use CTE (Common Table Expression) to separate aggregation from subqueries:

```sql
-- ❌ Bad - subquery references ungrouped p.id
SELECT
  SPLIT_PART(p.id, '/', 1) as author,
  COUNT(*) as count,
  CASE WHEN EXISTS (
    SELECT 1 FROM users WHERE username = SPLIT_PART(p.id, '/', 1)
  ) THEN TRUE ELSE FALSE END as claimed
FROM packages p
GROUP BY SPLIT_PART(p.id, '/', 1);

-- ✅ Good - use CTE to compute aggregates first
WITH author_stats AS (
  SELECT
    SPLIT_PART(p.id, '/', 1) as author,
    COUNT(*) as count
  FROM packages p
  GROUP BY SPLIT_PART(p.id, '/', 1)
)
SELECT
  author,
  count,
  EXISTS (SELECT 1 FROM users WHERE username = author_stats.author) as claimed
FROM author_stats;
```

### 2. "Functions in index expression must be marked IMMUTABLE"

**Cause**: PostgreSQL requires functions in indexes/generated columns to be IMMUTABLE.

**Problem Functions**:
- `array_to_string()` - marked STABLE, not IMMUTABLE
- `to_char()` - depends on timezone/locale settings
- `now()` - changes over time

**Solution**: Create IMMUTABLE wrapper functions:

```sql
-- Create IMMUTABLE wrapper for array_to_string
CREATE OR REPLACE FUNCTION immutable_array_to_string(text[], text)
RETURNS text AS $$
  SELECT array_to_string($1, $2)
$$ LANGUAGE SQL IMMUTABLE PARALLEL SAFE;

-- Use in generated column
ALTER TABLE packages
ADD COLUMN search_vector tsvector
GENERATED ALWAYS AS (
  setweight(to_tsvector('english', coalesce(name, '')), 'A') ||
  setweight(to_tsvector('english', immutable_array_to_string(tags, ' ')), 'B')
) STORED;

-- Now you can index it
CREATE INDEX idx_search ON packages USING gin(search_vector);
```

### 3. "Relation does not exist" (Extensions)

**Cause**: Extension not installed (e.g., `pg_stat_statements`, `pg_trgm`, `uuid-ossp`).

**Solution**: Make extension usage optional with error handling:

```sql
-- Try to create extension, ignore if unavailable
DO $$
BEGIN
  IF NOT EXISTS (SELECT 1 FROM pg_extension WHERE extname = 'pg_trgm') THEN
    BEGIN
      CREATE EXTENSION pg_trgm;
    EXCEPTION
      WHEN insufficient_privilege OR feature_not_supported THEN
        RAISE NOTICE 'pg_trgm extension not available - skipping trigram indexes';
    END;
  END IF;
END $$;

-- Only create trigram indexes if extension exists
DO $$
BEGIN
  IF EXISTS (SELECT 1 FROM pg_extension WHERE extname = 'pg_trgm') THEN
    CREATE INDEX idx_name_trgm ON packages USING gin(name gin_trgm_ops);
  END IF;
END $$;
```

### 4. Idempotent Migrations

**Always use IF (NOT) EXISTS** to make migrations re-runnable:

```sql
-- Tables
CREATE TABLE IF NOT EXISTS users (...);

-- Columns
ALTER TABLE users ADD COLUMN IF NOT EXISTS email VARCHAR(255);

-- Indexes
CREATE INDEX IF NOT EXISTS idx_users_email ON users(email);

-- Drop operations
DROP TABLE IF EXISTS old_table CASCADE;
DROP INDEX IF EXISTS old_index;
DROP VIEW IF EXISTS old_view CASCADE;
DROP FUNCTION IF EXISTS old_function(args);

-- Extensions
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
```

### 5. Handling Circular Dependencies

**Issue**: Table A references table B, table B references table A.

**Solution**: Create tables first without foreign keys, then add constraints:

```sql
-- Step 1: Create tables without foreign keys
CREATE TABLE users (
  id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
  name VARCHAR(255)
);

CREATE TABLE posts (
  id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
  author_id UUID  -- No FK constraint yet
);

-- Step 2: Add foreign key constraints
ALTER TABLE posts
ADD CONSTRAINT fk_posts_author
FOREIGN KEY (author_id) REFERENCES users(id);
```

### 6. Working with Generated Columns

**Rules**:
- Must use IMMUTABLE functions only
- Cannot reference other generated columns
- Use STORED (not VIRTUAL in PostgreSQL)
- Cannot be updated directly

```sql
-- ✅ Good - IMMUTABLE functions
ALTER TABLE packages
ADD COLUMN full_name TEXT
GENERATED ALWAYS AS (namespace || '/' || name) STORED;

-- ✅ Good - with COALESCE for nulls
ALTER TABLE packages
ADD COLUMN search_text TEXT
GENERATED ALWAYS AS (
  coalesce(name, '') || ' ' || coalesce(description, '')
) STORED;

-- ❌ Bad - NOW() is not immutable
ALTER TABLE logs
ADD COLUMN year INTEGER
GENERATED ALWAYS AS (EXTRACT(YEAR FROM NOW())) STORED;  -- ERROR

-- ✅ Good - use created_at column instead
ALTER TABLE logs
ADD COLUMN year INTEGER
GENERATED ALWAYS AS (EXTRACT(YEAR FROM created_at)) STORED;
```

### 7. Materialized Views

**Best Practices**:

```sql
-- Create with data
CREATE MATERIALIZED VIEW IF NOT EXISTS package_rankings AS
SELECT
  id,
  name,
  total_downloads,
  ROW_NUMBER() OVER (ORDER BY total_downloads DESC) as rank
FROM packages
WHERE visibility = 'public';

-- Create indexes on materialized views
CREATE INDEX IF NOT EXISTS idx_rankings_downloads
ON package_rankings(total_downloads DESC);

-- Refresh function
CREATE OR REPLACE FUNCTION refresh_rankings()
RETURNS void AS $$
BEGIN
  REFRESH MATERIALIZED VIEW CONCURRENTLY package_rankings;
END;
$$ LANGUAGE plpgsql;

-- Schedule refresh (requires pg_cron extension)
-- SELECT cron.schedule('refresh-rankings', '0 * * * *', 'SELECT refresh_rankings()');
```

### 8. Full-Text Search Optimization

**Pattern**: Use generated column + GIN index for best performance:

```sql
-- 1. Create immutable helper
CREATE OR REPLACE FUNCTION immutable_array_to_string(text[], text)
RETURNS text AS $$
  SELECT array_to_string($1, $2)
$$ LANGUAGE SQL IMMUTABLE PARALLEL SAFE;

-- 2. Add generated column
ALTER TABLE packages
ADD COLUMN search_vector tsvector
GENERATED ALWAYS AS (
  setweight(to_tsvector('english', coalesce(name, '')), 'A') ||
  setweight(to_tsvector('english', coalesce(description, '')), 'B') ||
  setweight(to_tsvector('english', immutable_array_to_string(tags, ' ')), 'C')
) STORED;

-- 3. Create GIN index
CREATE INDEX idx_packages_search ON packages USING gin(search_vector);

-- 4. Query using the index
SELECT *
FROM packages
WHERE search_vector @@ websearch_to_tsquery('english', 'react hooks');
```

### 9. Composite Indexes for Common Queries

**Principles**:
- Equality filters first, then ranges, then sorts
- Most selective columns first
- Include WHERE clause conditions

```sql
-- Query: WHERE type = 'agent' AND category = 'development' ORDER BY downloads DESC
CREATE INDEX idx_packages_type_category_downloads
ON packages(type, category, total_downloads DESC)
WHERE visibility = 'public';

-- Query: WHERE author = 'foo' AND deprecated = FALSE ORDER BY created_at DESC
CREATE INDEX idx_packages_author_active
ON packages(author_id, created_at DESC)
WHERE deprecated = FALSE AND visibility = 'public';

-- Partial index for common filter
CREATE INDEX idx_packages_verified
ON packages(verified, total_downloads DESC)
WHERE verified = TRUE AND visibility = 'public';
```

### 10. Migration File Structure

**Best Practice Template**:

```sql
-- Migration XXX: Description
-- Brief explanation of what this migration does

-- ============================================
-- EXTENSIONS
-- ============================================

CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
CREATE EXTENSION IF NOT EXISTS "pg_trgm";

-- ============================================
-- TABLES
-- ============================================

CREATE TABLE IF NOT EXISTS table_name (
  id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
  name VARCHAR(255) NOT NULL,
  created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW()
);

-- ============================================
-- INDEXES
-- ============================================

CREATE INDEX IF NOT EXISTS idx_table_name ON table_name(name);

-- ============================================
-- VIEWS
-- ============================================

CREATE OR REPLACE VIEW view_name AS
SELECT * FROM table_name WHERE active = true;

-- ============================================
-- FUNCTIONS
-- ============================================

CREATE OR REPLACE FUNCTION function_name()
RETURNS void AS $$
BEGIN
  -- Function body
END;
$$ LANGUAGE plpgsql;

-- ============================================
-- TRIGGERS
-- ============================================

CREATE OR REPLACE FUNCTION update_timestamp()
RETURNS TRIGGER AS $$
BEGIN
  NEW.updated_at = NOW();
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER trigger_update_timestamp
  BEFORE UPDATE ON table_name
  FOR EACH ROW
  EXECUTE FUNCTION update_timestamp();

-- ============================================
-- COMMENTS
-- ============================================

COMMENT ON TABLE table_name IS 'Description of table purpose';
COMMENT ON COLUMN table_name.name IS 'Description of column';
```

## Common Patterns

### Pattern: Auto-updating Timestamps

```sql
CREATE OR REPLACE FUNCTION update_updated_at()
RETURNS TRIGGER AS $$
BEGIN
  NEW.updated_at = NOW();
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

-- Apply to all tables that need it
CREATE TRIGGER trigger_users_updated_at
  BEFORE UPDATE ON users
  FOR EACH ROW
  EXECUTE FUNCTION update_updated_at();
```

### Pattern: Soft Delete

```sql
ALTER TABLE packages ADD COLUMN IF NOT EXISTS deleted_at TIMESTAMP WITH TIME ZONE;

CREATE INDEX IF NOT EXISTS idx_packages_not_deleted
ON packages(id) WHERE deleted_at IS NULL;

-- View for active records
CREATE OR REPLACE VIEW active_packages AS
SELECT * FROM packages WHERE deleted_at IS NULL;
```

### Pattern: Enumerated Types

```sql
-- Option 1: CHECK constraint (more flexible)
ALTER TABLE packages
ADD COLUMN status VARCHAR(50) DEFAULT 'active'
CHECK (status IN ('active', 'deprecated', 'archived'));

-- Option 2: ENUM type (more strict)
CREATE TYPE package_status AS ENUM ('active', 'deprecated', 'archived');
ALTER TABLE packages ADD COLUMN status package_status DEFAULT 'active';
```

### Pattern: JSON/JSONB Columns

```sql
ALTER TABLE packages ADD COLUMN IF NOT EXISTS metadata JSONB DEFAULT '{}';

-- Index on JSONB keys
CREATE INDEX IF NOT EXISTS idx_packages_metadata_tags
ON packages USING gin((metadata->'tags'));

-- Index on specific JSON path
CREATE INDEX IF NOT EXISTS idx_packages_metadata_version
ON packages((metadata->>'version'));
```

## Performance Tips

### 1. ANALYZE After Migrations

```sql
-- Update statistics after adding indexes or bulk data
ANALYZE packages;
ANALYZE VERBOSE packages;  -- Show details
```

### 2. EXPLAIN Your Queries

```sql
-- Check if indexes are being used
EXPLAIN ANALYZE
SELECT * FROM packages WHERE type = 'agent' ORDER BY downloads DESC LIMIT 10;

-- Look for:
-- - "Index Scan" (good) vs "Seq Scan" (bad for large tables)
-- - High "cost" values
-- - Long "execution time"
```

### 3. Vacuum After Bulk Changes

```sql
-- Clean up dead rows
VACUUM ANALYZE packages;

-- Full vacuum (locks table)
VACUUM FULL packages;
```

## Migration Checklist

- [ ] All CREATE statements use IF (NOT) EXISTS
- [ ] All DROP statements use IF EXISTS
- [ ] All functions in indexes/generated columns are IMMUTABLE
- [ ] Foreign keys reference existing tables
- [ ] Indexes have meaningful names (idx_table_column pattern)
- [ ] Extensions are optional with error handling
- [ ] Comments added for complex logic
- [ ] Test migration in local/dev before production
- [ ] Migration is idempotent (can run multiple times safely)
- [ ] Large migrations include progress logging

## Testing Migrations Locally

```bash
# Run migration
npm run migrate

# Check for errors
docker-compose logs postgres

# Rollback if needed (manual)
# Connect to DB and DROP objects created by migration

# Verify
docker-compose exec postgres psql -U prpm -d prpm_registry -c "\d packages"
docker-compose exec postgres psql -U prpm -d prpm_registry -c "\di"  # List indexes
```

## Resources

- [PostgreSQL CREATE INDEX](https://www.postgresql.org/docs/current/sql-createindex.html)
- [Generated Columns](https://www.postgresql.org/docs/current/ddl-generated-columns.html)
- [Full-Text Search](https://www.postgresql.org/docs/current/textsearch.html)
- [IMMUTABLE Functions](https://www.postgresql.org/docs/current/xfunc-volatility.html)

More from pr-pm/prpm