General Staff

Humans and AI should be perfectly specialized: the human does only what the AI cannot, and the AI (the General Staff) does everything it can.

Design

The naive build

The most naive instantiation: a scheduled loop + a harness with a set of skills + a memory database.

Scheduled loop — the clock. Ticks on an interval; this is what makes it “alive” rather than invoked.
Harness with skills — the Staff’s hands. A model loop with a set of tools/skills it can call.
Memory database — what persists across ticks; the standing knowledge the loop reads at the top and writes at the bottom.

Two clocks

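Liveness (the system keeps running) and the session (one model context) run on different clocks. Conflating them, by keeping a single session alive so that the system stays alive, is the bloat trap.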

Hot loop (fast). A perpetual scheduler re-fires mortal ticks. Each tick boots fresh, reads the relevant slice of memory + current doctrine, does one bounded unit of work, writes results/feedback back, and dies. No context survives between ticks, so context rot cannot accumulate from one tick to the next. The system is perpetual; every process in it is mortal. Liveness lives in the scheduler; continuity lives in memory, not in a context window.
Cold loop (slow). Off the hot path, keeper-agents maintain the store: triage, dedupe, prune stale facts, fold feedback in. The goal is a store that gets truer, not longer. (Reference implementation: this repo’s nightly heal.)

Fast loop acts; slow loop learns.
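
A minimal sketch of one keeper pass, to make "truer, not longer" concrete. It is not the repo's nightly heal. `dedupe_file` is a hypothetical helper, and treating each line as one fact is an assumption.

```sh
#!/bin/sh
# Keeper sketch: one cold-loop pass that dedupes a memory file in place.
# dedupe_file is a hypothetical helper name, not part of the original design.
dedupe_file() {
    file="$1"
    [ -f "$file" ] || return 0   # nothing to maintain yet
    # Keep blank lines (markdown spacing); keep only the first copy of any other line.
    # Write to a temp file first so a failed pass never truncates the store.
    awk 'NF == 0 || !seen[$0]++' "$file" > "$file.tmp" && mv "$file.tmp" "$file"
}
```

Something like `dedupe_file memory/done.md` would run from a slow scheduler (nightly, say), never from inside a tick.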

Orchestrator + ephemeral workers

One perpetual orchestrator; recursion for parallelism, not for liveness. The orchestrator spawns ephemeral worker instances per task (work a PR, review another, scan chat), collects results, writes back. Workers are disposable; the orchestrator is the single owner of memory + verification. Avoid N peer daemons — conflicting writes, compounding errors, a swarm with no commander.
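
A rough sketch of the fan-out shape, assuming each worker is a short-lived process that reads its task on stdin. `fan_out` and `WORKER_CMD` are hypothetical names; the real worker invocation is not specified here. Workers write only to private scratch files, so the orchestrator stays the single writer.

```sh
#!/bin/sh
# Fan-out sketch. fan_out and WORKER_CMD are hypothetical names.
# Usage: WORKER_CMD='<command reading a task on stdin>' fan_out "task 1" "task 2"
fan_out() {
    if [ -z "${WORKER_CMD:-}" ]; then
        echo "fan_out: WORKER_CMD is not set" >&2
        return 1
    fi
    scratch="$(mktemp -d)" || return 1
    n=0
    for task in "$@"; do
        n=$((n + 1))
        # One ephemeral worker per task, each with its own output file.
        ( printf '%s\n' "$task" | sh -c "$WORKER_CMD" > "$scratch/$n.out" 2>&1 ) &
    done
    wait   # block until every worker has exited; none outlives the call
    # Collect results in task order; only the orchestrator reads them.
    i=1
    while [ "$i" -le "$n" ]; do
        cat "$scratch/$i.out"
        i=$((i + 1))
    done
    rm -rf "$scratch"
}
```

The orchestrator would decide what, if anything, to write to memory from the collected output. Workers never touch todo.md or done.md themselves, which is what keeps writes from conflicting.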

Active memory: status-indexed, not time-indexed

A perpetual loop doesn’t live in days, it lives in task state. A daily log is time-indexed — so “what’s still open” smears across many entries, decays, and contradicts, and every tick has to re-derive open work from history. The fix: index by status, not time.

The principle: don’t cache the world, re-read it. The orchestrator re-reads reality (Slack, GitHub) every tick rather than trusting a stale snapshot. Persist only the task list and the done-record. Memory shrinks to its true minimum:

todo.md — what’s not done. A task lives here, verbatim, until it’s done. Freshness-independent: an open task cannot fall through a crack, because “not done” is its storage location. The task block carries its own context.
done.md — the archive of completed tasks. The only real memory needed — so the orchestrator doesn’t redo finished work.
tone.md — stable, hand-set voice for when the Staff speaks as the principal (PR comments, status, chat). The alignment layer for speech.

The loop collapses to: read todo.md → scan Slack/GitHub for new tasks + now-completable ones → do/assign one → move finished tasks to done.md → exit.

Two guardrails this model lives or dies on:

Dedup on add. Each tick re-sees tasks it already captured. The add-step must check: already in todo or done? If so, skip. (Truer, not longer.)
The world decides “done,” not the orchestrator. A task moves to done.md only when reality confirms it (PR merged, message sent), re-checked each tick — never when the orchestrator merely believes it acted. Keeps “done” honest and re-derivable.
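
The two guardrails can be sketched as a pair of shell functions. `MEMORY_DIR`, `add_task`, and `settle_task` are hypothetical names. Storing one task per line is a simplifying assumption, since the design lets a task block carry more context than that. The check passed to `settle_task` stands in for whatever asks the world (is the PR merged? was the message sent?).

```sh
#!/bin/sh
# Guardrail sketch. MEMORY_DIR, add_task, settle_task are hypothetical names;
# one task per line is an assumption made for brevity.
MEMORY_DIR="${MEMORY_DIR:-memory}"

init_memory() {
    mkdir -p "$MEMORY_DIR" && touch "$MEMORY_DIR/todo.md" "$MEMORY_DIR/done.md"
}

# Guardrail 1: dedup on add. Append only if the exact line is in neither file.
add_task() {
    task="$1"
    [ -n "$task" ] || return 1
    init_memory || return 1
    if grep -Fxq -- "$task" "$MEMORY_DIR/todo.md" "$MEMORY_DIR/done.md"; then
        return 0   # already captured (open or finished): skip
    fi
    printf '%s\n' "$task" >> "$MEMORY_DIR/todo.md"
}

# Guardrail 2: the world decides "done". The remaining arguments are a check
# command; the task moves to done.md only if that command exits 0.
settle_task() {
    task="$1"
    shift
    [ "$#" -ge 1 ] || return 1   # no check supplied: never assume done
    init_memory || return 1
    if ! "$@"; then
        return 1                 # reality has not confirmed it: stays in todo.md
    fi
    # grep -v exits 1 when nothing is left over, which is fine here.
    grep -Fxv -- "$task" "$MEMORY_DIR/todo.md" > "$MEMORY_DIR/todo.md.tmp" || true
    mv "$MEMORY_DIR/todo.md.tmp" "$MEMORY_DIR/todo.md"
    grep -Fxq -- "$task" "$MEMORY_DIR/done.md" || printf '%s\n' "$task" >> "$MEMORY_DIR/done.md"
}
```

Because `add_task` checks done.md as well as todo.md, a finished task that a later scan re-sees does not get re-opened.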

(Doctrine, if used, regenerates from todo.md — the brief is just a prioritized view of open tasks — not from logs.)

v1 spec (the dumb one)

Two things are free: time (it can run forever) and tokens (not my money). So v1 is deliberately wasteful — no event triggers, no budgeting, no smart scheduling. Brute-force ticking proves the skeleton. Optimize nothing.

Goal: one mortal tick, fired on a dumb interval, that reads memory → does one bounded thing → writes memory → exits. Nothing else.

Components (memory folder + 2 scripts):

memory/ — plain markdown files. The whole database. (No DB, no embeddings. Files.)
- todo.md — open tasks, verbatim, until done.
- done.md — completed tasks (so it doesn’t redo them).
- tone.md — hand-set voice. Stable. You write this once.
tick.sh (hot loop — one tick) — does exactly:
1. cat memory/todo.md memory/tone.md into a prompt.
2. Invoke PI (the harness) with that prompt + the skill set, told to: scan Slack/GitHub for new tasks (dedup vs todo+done before adding) and now-completable ones; do/assign ONE; move anything the world confirms finished from todo.md to done.md.
3. Exit. The process dies. No state held between ticks; the world + todo.md and done.md are the only state (tone.md is read-only input).
clock — the perpetual scheduler. v1 = the dumbest thing that re-fires:
```
while true; do ./tick.sh >> clock.log 2>&1; sleep 60; done
```
(Upgrade to launchd/cron once the tick works. Don’t start there.)
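
A possible skeleton for tick.sh, following steps 1 to 3 above. The command line for PI is not given in this spec, so `HARNESS_CMD` is a hypothetical placeholder for any command that accepts the prompt on stdin. `build_prompt` and `run_tick` are also illustrative names, as are the prompt headings.

```sh
#!/bin/sh
# tick.sh sketch: one mortal tick. HARNESS_CMD is a hypothetical stand-in for
# the real PI invocation, which this spec does not define.
MEMORY_DIR="${MEMORY_DIR:-memory}"

# Step 1: todo.md + tone.md become the prompt, followed by the tick's orders.
build_prompt() {
    printf '%s\n' '## Open tasks (todo.md)'
    cat "$MEMORY_DIR/todo.md"
    printf '%s\n' '## Voice (tone.md)'
    cat "$MEMORY_DIR/tone.md"
    printf '%s\n' '## Instructions'
    printf '%s\n' 'Scan Slack/GitHub for new tasks and now-completable ones.'
    printf '%s\n' 'Dedup against todo.md and done.md before adding anything.'
    printf '%s\n' 'Do or assign ONE task.'
    printf '%s\n' 'Move a task from todo.md to done.md only if the world confirms it finished.'
}

# Steps 2 and 3: hand the prompt to the harness, then return so the process can exit.
run_tick() {
    if [ -z "${HARNESS_CMD:-}" ]; then
        echo "tick: HARNESS_CMD is not set" >&2
        return 1
    fi
    for f in todo.md tone.md; do
        [ -f "$MEMORY_DIR/$f" ] || { echo "tick: missing $MEMORY_DIR/$f" >&2; return 1; }
    done
    build_prompt | sh -c "$HARNESS_CMD"
}
```

Ending the file with a bare `run_tick` call would turn this into the script the clock loop re-fires. Nothing is kept in shell variables across runs, so killing the process at any point loses at most one tick.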

Explicitly NOT in v1: ephemeral worker fan-out, event triggers, token budgets, verification gates, doctrine regeneration, safety on unattended writes. All strict upgrades after one tick runs end-to-end.

Done = the clock runs, ticks fire on the interval, and across ticks todo.md gains real tasks scanned from Slack/GitHub and done.md gains tasks the world confirms finished — all without you touching it. That’s proof of life. Everything else is iteration.