---
name: python-parallelization
description: Transform sequential Python code into parallel/concurrent implementations. Use when asked to parallelize Python code, improve code performance through concurrency, convert loops to parallel execution, or identify parallelization opportunities. Handles CPU-bound (multiprocessing), I/O-bound (asyncio, threading), and data-parallel (vectorization) scenarios.
---
# Python Parallelization Skill
Transform sequential Python code to leverage parallel and concurrent execution patterns.
## Workflow
1. **Analyze** the code to identify parallelization candidates
2. **Classify** the workload type (CPU-bound, I/O-bound, or data-parallel)
3. **Select** the appropriate parallelization strategy
4. **Transform** the code with proper synchronization and error handling
5. **Verify** correctness and measure expected speedup
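
The Classify step can be approximated by comparing CPU time with wall-clock time for one representative call. The sketch below is a rough heuristic, not part of the skill itself: `classify_workload`, `busy`, `waiting`, and the 0.5 threshold are all hypothetical names and values chosen for illustration.

```python
import time


def classify_workload(fn, *args, threshold=0.5):
    """Guess whether fn is CPU-bound or I/O-bound from one timed call.

    Hypothetical helper: the 0.5 threshold is an illustrative assumption.
    """
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    fn(*args)
    cpu_used = time.process_time() - cpu_start
    wall_used = time.perf_counter() - wall_start
    # A ratio near 1 means the call kept the CPU busy; near 0 means it mostly waited.
    ratio = cpu_used / wall_used if wall_used > 0 else 1.0
    return "CPU-bound" if ratio >= threshold else "I/O-bound"


def busy(n):
    # Pure-Python arithmetic: spends its time on the CPU.
    return sum(i * i for i in range(n))


def waiting(seconds):
    # Stand-in for a network or disk wait.
    time.sleep(seconds)


print(classify_workload(busy, 2_000_000))
print(classify_workload(waiting, 0.2))
```

A single timed call is noisy, so treat the result as a hint and confirm it with a profiler before choosing a strategy.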
## Parallelization Decision Tree
```
Is the bottleneck CPU-bound or I/O-bound?
CPU-bound (computation-heavy):
├── Independent iterations? → multiprocessing.Pool / ProcessPoolExecutor
├── Shared state needed? → multiprocessing with Manager or shared memory
├── NumPy/Pandas operations? → Vectorization first, then consider numba/dask
└── Large data chunks? → chunked processing with Pool.map
I/O-bound (network, disk, database):
├── Many independent requests? → asyncio with aiohttp/aiofiles
├── Legacy sync code? → ThreadPoolExecutor
├── Mixed sync/async? → asyncio.to_thread()
└── Database queries? → Connection pooling + async drivers
Data-parallel (array/matrix ops):
├── NumPy arrays? → Vectorize, avoid Python loops
├── Pandas DataFrames? → Use built-in vectorized methods
├── Large datasets? → Dask for out-of-core parallelism
└── GPU available? → Consider CuPy or JAX
```
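
As one illustration of the "Mixed sync/async" branch, the sketch below wraps a blocking function with `asyncio.to_thread()` (Python 3.9+) so several calls can overlap. `legacy_lookup` and `lookup_all` are hypothetical stand-ins, and `time.sleep` simulates the blocking wait.

```python
import asyncio
import time


def legacy_lookup(key):
    """Stand-in for a blocking call from synchronous legacy code (hypothetical)."""
    time.sleep(0.1)  # simulates a blocking network or disk wait
    return f"value-for-{key}"


async def lookup_all(keys):
    # Each blocking call runs in a worker thread, so the event loop stays free
    # and the waits overlap instead of adding up.
    return await asyncio.gather(*(asyncio.to_thread(legacy_lookup, k) for k in keys))


results = asyncio.run(lookup_all(["a", "b", "c"]))
print(results)
```

`asyncio.gather()` returns results in the order the awaitables were passed, regardless of which thread finishes first.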
## Transformation Patterns
### Pattern 1: Loop to ProcessPoolExecutor (CPU-bound)
**Before:**
```python
results = []
for item in items:
    results.append(expensive_computation(item))
```
**After:**
```python
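from concurrent.futures import ProcessPoolExecutor

# expensive_computation must be a picklable, module-level function.
# On spawn-based platforms, run this under `if __name__ == "__main__":`.
with ProcessPoolExecutor() as executor:
    results = list(executor.map(expensive_computation, items))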
```
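
For many small tasks, the `chunksize` argument of `executor.map` batches items per worker message. The following is a minimal sketch under stated assumptions: `expensive_computation` here is a hypothetical stand-in, `run_parallel` and `run_sequential` are illustrative names, and the chunk size of 16 is an arbitrary starting point to tune by measurement.

```python
from concurrent.futures import ProcessPoolExecutor


def expensive_computation(n):
    """Stand-in CPU-bound task (hypothetical): sum of squares below n."""
    return sum(i * i for i in range(n))


def run_parallel(items, max_workers=None, chunksize=16):
    # chunksize batches items per inter-process message, which cuts pickling
    # overhead when there are many small tasks. 16 is an arbitrary default.
    with ProcessPoolExecutor(max_workers=max_workers) as executor:
        return list(executor.map(expensive_computation, items, chunksize=chunksize))


def run_sequential(items):
    # Reference implementation used to check the parallel output.
    return [expensive_computation(n) for n in items]
```

Call `run_parallel(...)` from code guarded by `if __name__ == "__main__":`, since spawn-based platforms re-import the module in each worker. Comparing its output with `run_sequential(...)` on the same input is a cheap correctness check.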
### Pattern 2: Sequential I/O to Async (I/O-bound)
**Before:**
```python
import requests

def fetch_all(urls):
    return [requests.get(url).json() for url in urls]
```
**After:**
```python
import asyncio
import aiohttp

async def fetch_all(urls):
    async with aiohttp.ClientSession() as session:
        tasks = [fetch_one(session, url) for url in urls]
        return await asyncio.gather(*tasks)

async def fetch_one(session, url):
    async with session.get(url) as response:
        return await response.json()
```
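
With many URLs, an unbounded `asyncio.gather()` can open too many connections at once, and a single failure raises out of the whole call. The sketch below shows one possible way to cap concurrency with `asyncio.Semaphore` and collect failures with `return_exceptions=True`. `fake_fetch` is a hypothetical stand-in for the HTTP call so the example runs without a network, and the limit of 2 is illustrative.

```python
import asyncio


async def fake_fetch(url):
    """Stand-in for an HTTP request (hypothetical); fails for URLs containing 'bad'."""
    await asyncio.sleep(0.05)
    if "bad" in url:
        raise ValueError(f"cannot fetch {url}")
    return {"url": url}


async def fetch_all_bounded(urls, limit=2):
    semaphore = asyncio.Semaphore(limit)  # at most `limit` requests in flight

    async def guarded(url):
        async with semaphore:
            return await fake_fetch(url)

    # return_exceptions=True keeps one failure from discarding the other results;
    # failures come back as exception objects in the matching position.
    return await asyncio.gather(*(guarded(u) for u in urls), return_exceptions=True)


urls = ["https://example.com/a", "https://example.com/bad", "https://example.com/c"]
outcomes = asyncio.run(fetch_all_bounded(urls))
for outcome in outcomes:
    print("error:" if isinstance(outcome, Exception) else "ok:", outcome)
```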
### Pattern 3: Nested Loops to Vectorization
**Before:**
```python
result = []
for i in range(len(a)):
    row = []
    for j in range(len(b)):
        row.append(a[i] * b[j])
    result.append(row)
```
**After:**
```python
import numpy as np
result = np.outer(a, b)
```
### Pattern 4: Mixed CPU/IO with asyncio
```python
import asyncio
from concurrent.futures import ProcessPoolExecutor

async def hybrid_pipeline(data, urls):
    # Inside a coroutine, get_running_loop() is preferred over get_event_loop()
    loop = asyncio.get_running_loop()
    # CPU-bound work in a process pool (cpu_heavy_fn must be picklable)
    with ProcessPoolExecutor() as pool:
        processed = await loop.run_in_executor(pool, cpu_heavy_fn, data)
    # I/O-bound work with async (fetch is a coroutine function defined elsewhere)
    results = await asyncio.gather(*[fetch(url) for url in urls])
    return processed, results
```
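
Pattern 4 as written awaits the CPU-bound step before starting the fetches. If the two steps are independent, they can overlap. The variant below is a sketch under assumptions: `cpu_heavy_fn` and `fetch` are hypothetical stand-ins, the executor is passed in as a parameter, and the demo uses a thread pool only to stay self-contained.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


def cpu_heavy_fn(data):
    """Stand-in for the CPU-bound step (hypothetical)."""
    return sum(x * x for x in data)


async def fetch(url):
    """Stand-in for an async HTTP call (hypothetical)."""
    await asyncio.sleep(0.05)
    return f"fetched {url}"


async def hybrid_pipeline(data, urls, executor):
    loop = asyncio.get_running_loop()
    # Schedule the executor job first, then await it together with the fetches
    # so the computation and the I/O overlap.
    cpu_future = loop.run_in_executor(executor, cpu_heavy_fn, data)
    processed, *results = await asyncio.gather(cpu_future, *(fetch(u) for u in urls))
    return processed, results


# A thread pool keeps this demo self-contained; for real CPU-bound work pass a
# ProcessPoolExecutor created under `if __name__ == "__main__":`.
with ThreadPoolExecutor() as pool:
    processed, results = asyncio.run(hybrid_pipeline([1, 2, 3], ["u1", "u2"], pool))
print(processed, results)
```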
## Parallelization Candidates
Look for these patterns in code:
| Pattern | Indicator | Strategy |
|---------|-----------|----------|
| `for item in collection` with independent iterations | No shared mutation | `Pool.map` / `executor.map` |
| Multiple `requests.get()` or file reads | Sequential I/O | `asyncio.gather()` |
| Nested loops over arrays | Numerical computation | NumPy vectorization |
| `time.sleep()` or blocking waits | Waiting on external resources | Threading or async |
| Large list comprehensions | Independent transforms | `Pool.map` with chunking |
## Safety Requirements
Always preserve correctness when parallelizing:
1. **Identify shared state** - variables modified across iterations break parallelism
2. **Check dependencies** - iteration N depending on N-1 requires sequential execution
3. **Handle exceptions** - wrap parallel code in try/except, use `executor.submit()` for granular error handling
4. **Manage resources** - use context managers, limit worker count to avoid exhaustion
5. **Preserve ordering** - use `map()` over `submit()` when order matters
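
Requirements 3 to 5 can be combined in one pattern: submit tasks individually, handle each failure on its own, and restore input order by index. This is a minimal sketch, assuming a hypothetical `risky_task` that fails for one input. It uses `ThreadPoolExecutor`, but the same `submit()` and `as_completed()` calls apply to `ProcessPoolExecutor`.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def risky_task(n):
    """Stand-in task (hypothetical) that fails for one input."""
    if n == 3:
        raise ValueError("bad input: 3")
    return n * 10


def run_with_error_handling(items, max_workers=4):
    results = [None] * len(items)
    errors = {}
    # The context manager shuts the pool down; max_workers bounds resource use.
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # Map each future back to its input index so order can be restored.
        future_to_index = {
            executor.submit(risky_task, item): i for i, item in enumerate(items)
        }
        for future in as_completed(future_to_index):  # yields in completion order
            i = future_to_index[future]
            try:
                results[i] = future.result()
            except ValueError as exc:
                errors[i] = exc  # record the failure; other tasks keep their results
    return results, errors


results, errors = run_with_error_handling([1, 2, 3, 4])
print(results)
print(errors)
```

Failed positions stay `None` in `results`, and `errors` maps the failing index to its exception, so the caller can decide whether to retry or abort.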
## Common Pitfalls
- **GIL trap**: In standard CPython, threads don't speed up CPU-bound Python code because of the global interpreter lock; use multiprocessing instead
- **Pickle failures**: Lambdas and functions or classes defined inside another function can't be pickled for multiprocessing; use module-level definitions
- **Memory explosion**: ProcessPoolExecutor pickles and copies arguments to each worker process; use shared memory for large data
- **Async in sync**: You can't just add `async` to existing code; the call chain must be restructured so every caller awaits
- **Over-parallelization**: Parallel overhead exceeds gains for small workloads (typically fewer than about 1000 cheap items); measure before and after
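
Pickle failures can be caught before work is handed to a process pool. The sketch below uses a hypothetical `is_picklable` helper to test candidates with `pickle.dumps`. It checks a lambda, a `functools.partial` that could replace it, an importable function, and a lock.

```python
import functools
import math
import operator
import pickle
import threading


def is_picklable(obj):
    """Return True if pickle.dumps accepts obj (process pools rely on pickling)."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


print(is_picklable(lambda x: x * 2))                     # lambda
print(is_picklable(functools.partial(operator.mul, 2)))  # possible replacement for the lambda
print(is_picklable(math.sqrt))                           # importable function
print(is_picklable(threading.Lock()))                    # OS-level resource
```

Functions are pickled by reference to their importable name, which is why the lambda is rejected while the `partial` over `operator.mul` is accepted.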
## Verification Checklist
Before finalizing transformed code:
- [ ] Output matches sequential version for test inputs
- [ ] No race conditions (shared mutable state properly synchronized)
- [ ] Exceptions are caught and handled appropriately
- [ ] Resources are properly cleaned up (pools closed, connections released)
- [ ] Worker count is bounded (default or explicit limit)
- [ ] Added appropriate imports
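
The first checklist item can be automated with a small harness that runs both versions on the same input and reports whether outputs match and how the timings compare. This is a sketch with hypothetical names (`slow_double`, `sequential`, `parallel`, `compare`); `time.sleep` stands in for an I/O wait.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_double(n):
    """Stand-in I/O-bound task (hypothetical): waits, then doubles its input."""
    time.sleep(0.05)
    return n * 2


def sequential(items):
    return [slow_double(n) for n in items]


def parallel(items, max_workers=8):
    # Bounded worker count; the context manager closes the pool on exit.
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        return list(executor.map(slow_double, items))


def compare(items):
    """Return (outputs_match, speedup) for the two implementations."""
    start = time.perf_counter()
    expected = sequential(items)
    seq_time = time.perf_counter() - start

    start = time.perf_counter()
    actual = parallel(items)
    par_time = time.perf_counter() - start
    return expected == actual, seq_time / par_time


match, speedup = compare(list(range(8)))
print(f"outputs match: {match}, speedup: {speedup:.1f}x")
```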