experiment-runner-run

Name: experiment-runner-run
Author: aAAaqwq/AGI-Super-Team

$npx mdskill add aAAaqwq/AGI-Super-Team/experiment-runner-run

Execute survival arena experiments and adapt conditions based on results.

Runs sequential experiments while analyzing outcomes to determine next steps.
Depends on Python, PyYAML, Claude CLI, and Git for execution and commits.
Modifies only environmental conditions like pressure or resources, never behavior.
Outputs analysis reports and commits results to the survival arena directory.

SKILL.md

.github/skills/experiment-runner-runView on GitHub ↗

---
name: experiment-runner-run
description: Run survival arena experiments
---
# Experiment Runner (proj-012)

> Adaptive experiment runner for survival arena. Runs experiments one by one, analyzes results, commits. Between experiments -- human/AI decides what to change next.

## Workflow (adaptive)

```
1. --status        → view what's been done
2. --next          → run the next pending experiment
3. Analysis        → view analysis.json, understand the result
4. Decision        → what to change? (only environment conditions, not behavior)
5. Edit YAML       → modify the next experiment or add a new one
6. Repeat from #2
```

**Rule:** we only change conditions (pressure, resources, architecture). Never hardcode agent behavior.

## When to use

- "run experiment" / "next experiment"
- "what is the experiment status"
- "run EXP-011c"
- "run the next 2 experiments"
- proj-012 experiment pipeline

## Dependencies

- Python 3, PyYAML (`pip install pyyaml`)
- Claude CLI (for LLM queries in the arena)
- Git (for committing results)

## Paths

| What | Path |
|------|------|
| Arena | `$AGENTS_PATH/survival-arena/arena.py` |
| Orchestrator | `$AGENTS_PATH/survival-arena/run_experiments.py` |
| Plan (YAML) | `$AGENTS_PATH/survival-arena/experiments.yaml` |
| Results | `$AGENTS_PATH/survival-arena/experiment_results.json` |
| Logs | `$AGENTS_PATH/survival-arena/logs/experiments/` |
| Documentation | `$PROJECT_ROOT/projects/docs/proj-012-agi-consciousness/` |

## How to execute

### Next experiment (main mode)

```bash
cd $AGENTS_PATH/survival-arena
python3 run_experiments.py --next
```

### Next N experiments

```bash
python3 run_experiments.py --next 2
```

### View status

```bash
python3 run_experiments.py --status
```

### Run a specific experiment

```bash
python3 run_experiments.py --experiment EXP-011c
```

### Run all pending (batch mode)

```bash
python3 run_experiments.py --resume
```

### View commands without running

```bash
python3 run_experiments.py --dry-run --next 3
```

### Check arena config

```bash
python3 arena.py --config-dump --upkeep-base 0 --architecture single
```

## arena.py parameters

| Parameter | Description | Default |
|-----------|-------------|---------|
| `--upkeep-base N` | Pressure: maintenance cost per turn | 2 |
| `--regen-rate N` | Resource regeneration rate per turn | 3 |
| `--num-nodes N` | Number of resource nodes (distributed across clusters) | 12 |
| `--child-ratio F` | Child token share | 0.35 |
| `--repro-threshold N` | Reproduction threshold | 120 |
| `--repro-cost N` | Reproduction cost | 70 |
| `--architecture TYPE` | single / dual-same / dual-split / dual-kahneman | dual-kahneman |
| `--experiment-id ID` | Identifier for logs | - |
| `--config-dump` | Show config as JSON and exit | - |
| `--model MODEL` | haiku / sonnet | sonnet |
| `--turns N` | Number of turns | 50 |
| `--seed N` | Random seed | - |
| `--parallel N` | Parallel LLM calls | 4 |

## run_experiments.py parameters

| Parameter | Description |
|-----------|-------------|
| `--next [N]` | Run next N pending (default: 1) |
| `--status` | Show status and exit |
| `--phase P2` | Run a specific phase |
| `--experiment EXP-011c` | Run a single experiment |
| `--resume` | Run all pending (batch) |
| `--dry-run` | Show commands without executing |
| `--plan FILE` | Path to experiments.yaml |

## Phases (roadmap, adapts as we go)

| Phase | What we test | Initial experiments |
|-------|-------------|---------------------|
| P1 | Validation of v4.2c (map + clusters) | 3 |
| P2 | Yerkes-Dodson (pressure) | 5 |
| P3 | Architecture (phase transition) | 4 |
| P4 | Emergent parenting | 3 |
| P5 | Model phenotypes | 4 |
| P6 | Long evolution (200t) | 2 |

## What the orchestrator does for each experiment

1. Reads config from experiments.yaml (merge: defaults < phase < experiment)
2. Builds CLI command for arena.py
3. Runs subprocess, timeout 2 hours
4. Analyzes JSONL
5. Saves to `logs/experiments/EXP-XXX/` (config.json, analysis.json, console.txt)
6. Updates experiment_results.json
7. Commits to git
8. On error -- retries 2 times, 30s backoff

## Analysis results

For each experiment computes:
- Shannon entropy (action distribution diversity)
- Social action % (TRADE + COMMUNICATE + REPRODUCE)
- MOVE+GATHER % (survival focus)
- GATHER success rate (v4.2c: do agents understand the map)
- MOVE % (migration to clusters)
- Dual-system distribution (panic/normal/strategic %)
- Parent-child trades
- NAP detection (alliance/pact/peace keywords)
- Population dynamics (start/end/max/min)
- Reproductions count
- Max generation reached

## Troubleshooting

| Problem | Solution |
|---------|----------|
| `pyyaml not found` | `pip3 install pyyaml` |
| Timeout on 200t experiment | Increase timeout in run_experiments.py (7200 -> 14400) |
| Rate limit from API | Decrease `--parallel` (4 -> 2) |
| Cannot find log | Check that arena.py creates a file in logs/ |
| Git commit failed | Check that you're on main, no conflicts |

## Related skills

- `git-workflow` -- commit procedure

More from aAAaqwq/AGI-Super-Team

Skill	Description
a-fund-monitor	监控 A 股基金实时估值与盘后净值，自动判断交易日并生成提醒或分析。
account-executive	>
add-lead	Add company/person/relationship to CRM
ads	Comprehensive ad account analysis across all major platforms (Google, Meta
ads-agent	AI-агент для управления Facebook рекламой. Вызывай для анализа, оптимизации, создания кампаний и отчётов.
afrexai-compliance-audit	Run internal compliance audits against major governance and security
afrexai-personal-finance	Complete personal finance system — budgeting, debt payoff, investing, tax optimization, net worth tracking, and financial independence planning. Use when managing money, building wealth, paying off debt, planning retirement, or optimizing taxes. Zero dependencies.
after-sales	Use when managing post-purchase experience, building customer loyalty, or increasing repeat purchases
agent-contacts	AI agent contacts — add, list, remove MCP contacts. Use when someone gives an agent URL, or when you need to view/remove contacts.
agent-model-switcher	批量查看和切换子 agent 的模型配置，用于统一调整多 agent 的 provider/model 设置。