Agent orchestration for Claude Code · 3 agents active

One command.
The right agent, every time.

Four routing tiers — pattern match to full LLM. Stops at the first match. Most commands cost nothing.

/do

TIER 0 ~0 tokens

Pattern Match

Regex on raw input — fix typo, rename, commit, build, status

TIER 1 ~0 tokens

Active State

Resumes a campaign or fleet session already in progress

TIER 2 ~0 tokens

Keyword Match

Matched against the installed skill keyword table — review, test, debug, scaffold…

TIER 3 ~500 tokens

LLM Classifier

6-dimension classification: scope · complexity · intent · persist · parallel · taste

Intent build

Scope single-domain

Complexity

Spans sessions yes

Parallel fleet no

Needs judgment yes

How routing works

TIER 0

Pattern Match

Regex · 9 built-in patterns

~0 tokens · <1ms

/do fix typo on line 42

✓ REGEX HIT · <1ms · 0 TOKENS

Pure regex against the raw input string. No model call, no skill load. Catches exact-intent commands that need no classification — typos, renames, commits, status checks. If it matches here, Claude executes it directly and stops.

/do fix the typo on line 42

/do rename AuthService to AuthHandler

/do commit my changes
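
As a rough illustration of Tier 0, the sketch below checks the raw input against a small pattern table and returns on the first hit. The three patterns and the `match_tier0` name are hypothetical; Citadel's nine built-in patterns are not listed on this page.

```python
import re
from typing import Optional

# Hypothetical patterns for illustration only; not Citadel's actual built-ins.
TIER0_PATTERNS = {
    "fix-typo": re.compile(r"^fix (the )?typo\b", re.IGNORECASE),
    "rename": re.compile(r"^rename \S+ to \S+$", re.IGNORECASE),
    "commit": re.compile(r"^commit\b", re.IGNORECASE),
}


def match_tier0(raw_input: str) -> Optional[str]:
    """Return the name of the first matching pattern, or None to fall through."""
    text = raw_input.strip()
    for name, pattern in TIER0_PATTERNS.items():
        if pattern.search(text):
            return name  # first match wins; no model call is involved
    return None
```

A `None` result here corresponds to "No match · evaluates next tier".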

Tier 1 (active session context) is evaluated in live Claude Code sessions — skipped in this demo.

No match · evaluates next tier

↓

TIER 2

Keyword Lookup

12 registered skill keywords

~0 tokens · <10ms

review → /review

debug → /systematic-debugging

scaffold → /scaffold

The input is scanned against a keyword table built from every installed skill. "review" → /review, "debug" → /systematic-debugging. New skills register their own keywords automatically. No LLM needed.

/do review the auth module

/do write tests for UserService

/do scaffold a new dashboard component
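
A minimal sketch of the keyword lookup, assuming the table is a plain mapping from keyword to skill command. Only the three pairs shown above are taken from this page; `register_skill`, `match_keyword`, and the `/lint-example` skill are hypothetical names.

```python
import re
from typing import Dict, Iterable, Optional

# Pairs shown on this page; the installed table has more entries.
KEYWORD_TABLE: Dict[str, str] = {
    "review": "/review",
    "debug": "/systematic-debugging",
    "scaffold": "/scaffold",
}


def register_skill(table: Dict[str, str], keywords: Iterable[str], command: str) -> None:
    """Add a skill's keywords to the table (assumed registration step)."""
    for keyword in keywords:
        table[keyword.lower()] = command


def match_keyword(table: Dict[str, str], raw_input: str) -> Optional[str]:
    """Return the skill for the first input word found in the table, else None."""
    for word in re.findall(r"[a-z][a-z-]*", raw_input.lower()):
        if word in table:
            return table[word]
    return None
```

This version matches whole words only, so a plural such as "tests" would need its own table entry. How Citadel handles word variants is not stated here.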

No match · evaluates next tier

↓

TIER 3

LLM Classifier

One evaluation · two possible dispatches

~500 tokens · <2s

scope: cross-domain complexity: 4 intent: create persist: true parallel: false taste: true

→ /classifier dispatched

Classifies 6 dimensions: scope, complexity (1–5), intent, multi-session persistence, parallel execution, and taste. The result determines the dispatch — Archon for contained work, Fleet for platform-wide scope.

→ Archon

Single-feature, single-session work

/do build me a recipe app

/do add Stripe payments to checkout

→ Fleet

Platform-wide or parallel work

/do migrate the platform to TypeScript strict mode

/do redesign the entire data layer
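
The dispatch decision can be sketched as a function over the six dimensions. The field names follow the classifier output shown above. The rule itself (Fleet when scope is platform-wide or parallel is true, Archon otherwise) and the `"platform-wide"` scope value are assumptions drawn from the prose, not Citadel's actual logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Classification:
    scope: str        # e.g. "single-domain", "cross-domain"; "platform-wide" is assumed
    complexity: int   # 1 to 5
    intent: str       # e.g. "build", "create"
    persist: bool     # work spans sessions
    parallel: bool    # work can be split across parallel agents
    taste: bool       # needs human judgment


def dispatch(c: Classification) -> str:
    """Pick an orchestrator from a classification (assumed rule, see lead-in)."""
    if not 1 <= c.complexity <= 5:
        raise ValueError("complexity must be between 1 and 5")
    if c.scope == "platform-wide" or c.parallel:
        return "/fleet"
    return "/archon"
```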

What you actually get

Real output from real sessions. No wireframes, no mockups.

ROUTING

/do review the auth module

Tier 2 keyword match
Routing to /review because input contains "review" (keyword match, 0 tokens)
--- HANDOFF ---
- 5-pass review complete: auth module (src/auth/)
- 3 issues found: 1 security (token expiry unchecked), 2 style
- Security issue is blocking — fix before merge
- No test regressions detected
---

CAMPAIGN

/archon — Phase 3 of 4

Phase 3: Build · auth-overhaul campaign
--- HANDOFF ---
- Built OAuth2 integration with PKCE flow (src/auth/oauth.ts)
- Session middleware rewritten — JWT with refresh rotation
- typecheck: clean · tests: 47 pass, 0 fail · build: success
- Next: Phase 4 (verify) — security audit + integration test
- Session resumes automatically — campaign state preserved to disk
---

FLEET

/fleet — Wave 2 of 3

Wave 2 complete · 3 agents · isolated worktrees
Agent A (wt-fleet-a): migrated api/ to strict TypeScript ✓
Agent B (wt-fleet-b): migrated components/ to strict TypeScript ✓
Agent C (wt-fleet-c): migrated utils/ — 1 conflict with Agent A
Discovery relay: Agent A found 12 implicit any casts in shared types → relayed to Agents B and C before Wave 3
Merge status: A merged · B merged · C pending conflict resolution
Wave 3 will resolve C's conflict and run final typecheck.

Work that spans sessions. Automatically.

Complex tasks run across sessions without losing direction — or your confidence.

⧫

Checks at every phase

A quality spot-check runs at the end of every phase. The next phase only starts when it passes.

◎

Auto-pause on failure

After 3 consecutive failures on the same approach, the campaign parks and waits for you.

◈

Nothing breaks silently

Type baselines and test counts are captured before the campaign and verified after every build phase.

34 Skills

covering code quality, debugging, research, orchestration, and more

4 Orchestrators

Marshal, Archon, Fleet, Autopilot

13 Hook Events

automatic quality gates without agent intervention

No Config Required

works on any repo, any language

Try it now.

Works on any Claude Code project.
No config required — Citadel detects your stack on first run.

Install Claude Code

npm install -g @anthropic-ai/claude-code

Add Citadel as a plugin

claude plugin add https://github.com/SethGammon/Citadel

Open any project and start

cd your-project && claude
Then, at the Claude Code prompt: /do setup

Skills — 34 installed

Invoke any skill directly with /skill-name, or let /do route to the right one automatically.

Routing

/do · Universal intent router. Dispatches to the cheapest capable tool, with no configuration needed.

Orchestration

/archon · Multi-session campaign executor. Decomposes into phases, enforces quality gates, self-corrects direction drift.

/fleet · Parallel coordinator. Spawns 2–3 agents in isolated git worktrees per wave.

/marshal · Single-session orchestrator. Chains skills end-to-end within one conversation.

/autopilot · Intake-to-delivery pipeline. Processes items from .planning/intake/ automatically.

App Creation

/prd · Generates a Product Requirements Document from a natural language description.

/architect · Converts a PRD into a file tree, build phases, and verifiable end conditions.

/create-app · End-to-end app creation. Works on greenfield projects and existing codebases.

Code Quality

/review · 5-pass structured review: correctness, security, performance, readability, consistency.

/test-gen · Generates runnable tests using your actual framework. Iterates up to 3× on failures.

/doc-gen · Three modes: function-level, module-level, or API reference. Reads your existing doc style first.

/refactor · Safe multi-file refactoring. Typechecks before and after every change. Auto-reverts on failure.

/scaffold · Project-aware file generation. Reads naming conventions from existing code before generating.

/create-skill · Extracts a repeated pattern into a reusable Citadel skill.

Research & Debugging

/research · Structured investigation with confidence levels and cited sources.

/research-fleet · Parallel multi-scout research. 3–5 independent angles simultaneously, then compressed.

/experiment · Metric-driven optimization loops in isolated worktrees. Stops when metric plateaus.

/systematic-debugging · 4-phase root cause: observe → hypothesize → verify → fix. Emergency stop after 2 failures.

/live-preview · Mid-build visual verification via screenshot. Used by Archon to validate UI phases.

Quality & Verification

/design · Extracts or generates a design manifest: colors, typography, spacing, component patterns.

/qa · Browser QA via Playwright. Click-through testing with pass/fail assertions.

/postmortem · Auto-generates structured postmortems from campaign telemetry and git history.

Utilities

/session-handoff · Compresses current session context for transfer to a new session.

/setup · First-run harness configuration. Generates harness.json and registers skill keywords.

/triage · Pulls open GitHub issues and PRs, classifies by severity, searches codebase for context.

Hooks — 13 events

Hooks fire automatically on Claude Code lifecycle events — no agent intervention needed.

Event	What it does
SessionStart	Scans .planning/intake/, restores compressed context, scaffolds .planning/ on first run.
PreCompact	Detects unwritten work before context compression. Auto-saves a handoff to disk.
PostCompact	Re-injects the saved handoff into the new compressed context so nothing is lost.
PreToolUse	Campaign scope enforcement — blocks edits outside a campaign's declared scope.
PostToolUse	Runs per-file typecheck on every Edit or Write. Failures surface immediately.
Stop	Quality gate before session ends. Checks for uncommitted work and open campaign phases.
StopFailure	API failure telemetry — logs to .planning/telemetry/ and triggers recovery if configured.
TaskCreated	Logs agent task boundary start events to the audit trail.
TaskCompleted	Logs task completion with duration and outcome to agent-runs.jsonl.
SubagentStop	Ensures sub-agents write a HANDOFF block and release scope claims before exiting.
SessionEnd	Campaign cleanup, triggers the doc sync queue.
WorktreeCreate	Installs hooks and scaffolds .planning/ in isolated Fleet worktrees.
WorktreeRemove	Fleet cleanup — triggers merge conflict check and logs worktree metrics.
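
The PreToolUse scope check described in the table can be approximated with a path-containment test. This is a sketch under assumptions: the `in_scope` helper is hypothetical, and it treats a campaign's declared scope as a list of directory roots, which this page does not specify.

```python
import posixpath
from pathlib import PurePosixPath
from typing import Iterable


def in_scope(file_path: str, scope_roots: Iterable[str]) -> bool:
    """Return True if file_path sits at or under one of the declared roots."""
    # Normalize first so "src/auth/../billing/x.ts" cannot slip past the check.
    target = PurePosixPath(posixpath.normpath(file_path))
    for root in scope_roots:
        root_path = PurePosixPath(posixpath.normpath(root))
        if target == root_path or root_path in target.parents:
            return True
    return False
```

A hook built on this would block the edit whenever `in_scope` returns False. Comparing whole path components rather than string prefixes keeps `src/authz/` from matching a scope of `src/auth`.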

Configuration

Enable / disable hooks

Toggle individual hooks in .claude/harness.json under the hooks key. Example: "postToolUse": { "enabled": false }
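
For illustration, the snippet below reads a harness.json document and reports whether a hook is switched on. The `hooks` key and the `postToolUse` entry come from the example above. The `hook_enabled` helper, the `preCompact` key name, and the assumption that a hook with no entry defaults to enabled are hypothetical.

```python
import json

# Example document built from the fragment above.
EXAMPLE_HARNESS = '{"hooks": {"postToolUse": {"enabled": false}}}'


def hook_enabled(harness_json: str, hook_name: str) -> bool:
    """Return the hook's enabled flag; hooks with no entry are assumed to be on."""
    config = json.loads(harness_json)
    entry = config.get("hooks", {}).get(hook_name, {})
    return bool(entry.get("enabled", True))
```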

Campaigns

A campaign is work that spans multiple sessions. State is persisted to disk and resumed exactly where it left off.

Starting a campaign

To start one explicitly, run /archon build a real-time notifications system, or let /do route there automatically. Archon creates a file at .planning/campaigns/{slug}.md before executing anything.

Phase types

Campaigns break into 3–8 ordered phases: research → plan → build → wire → verify → prune. Each has machine-verifiable end conditions (file exists, command passes, metric threshold) — a phase isn't marked complete until they pass.
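
One way to picture machine-verifiable end conditions is as small zero-argument checks that must all return True. The three condition kinds mirror the ones named above. The function names and the callable-based representation are assumptions, not Citadel's campaign file format.

```python
import subprocess
import sys
from pathlib import Path
from typing import Callable, Iterable, Sequence

Check = Callable[[], bool]


def file_exists(path: str) -> Check:
    """End condition: a file is present at the given path."""
    return lambda: Path(path).is_file()


def command_passes(argv: Sequence[str]) -> Check:
    """End condition: the command exits with status 0."""
    return lambda: subprocess.run(list(argv), capture_output=True).returncode == 0


def metric_at_least(read_metric: Callable[[], float], threshold: float) -> Check:
    """End condition: a measured value meets the threshold."""
    return lambda: read_metric() >= threshold


def phase_complete(conditions: Iterable[Check]) -> bool:
    """A phase counts as complete only when every end condition passes."""
    return all(check() for check in conditions)


if __name__ == "__main__":
    checks = [
        file_exists(sys.executable),                     # stand-in for a build artifact
        command_passes([sys.executable, "-c", "pass"]),  # stand-in for a test command
        metric_at_least(lambda: 0.92, 0.9),              # stand-in for a coverage metric
    ]
    print(phase_complete(checks))
```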

Self-correction (automatic)

· Direction alignment every 2 phases — catches scope drift before it compounds
· Quality spot-check every phase — reads the most significant output, verifies it meets the project's bar
· Regression guard after build phases — runs typecheck, escalates on 5+ new errors
· Anti-pattern scan — catches transition-all, confirm(), missing Escape handlers
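
The regression guard in the list above can be sketched as a comparison between a baseline captured before the campaign and a snapshot taken after a build phase. The 5-error escalation threshold comes from the text. The `Baseline` fields and the report shape are assumptions.

```python
from dataclasses import dataclass
from typing import Dict, Union

ESCALATION_THRESHOLD = 5  # "escalates on 5+ new errors"


@dataclass(frozen=True)
class Baseline:
    type_errors: int    # typecheck error count
    passing_tests: int  # number of passing tests


def regression_report(before: Baseline, after: Baseline) -> Dict[str, Union[int, bool]]:
    """Compare a post-build snapshot against the pre-campaign baseline."""
    new_errors = max(0, after.type_errors - before.type_errors)
    lost_tests = max(0, before.passing_tests - after.passing_tests)
    return {
        "new_type_errors": new_errors,
        "lost_tests": lost_tests,
        "escalate": new_errors >= ESCALATION_THRESHOLD,
    }
```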

Resuming across sessions

Run /do continue or /archon with no arguments. The SessionStart hook restores context. Archon reads the campaign's Active Context section and picks up at the exact sub-step.

Archon vs Fleet

Archon — sequential phases, one agent per phase. Best for complex single-domain work.

Fleet — parallel agents in isolated worktrees, 2–3 per wave. Best for platform-wide work where phases are independent. /do routes to Fleet automatically when it detects platform-wide scope.

Circuit breakers

Campaigns park automatically when: 3+ consecutive failures on the same approach · 5+ new type errors in a single phase · direction drift detected twice in a row · quality bar fails 3 times. You get a clear status and a decision point.
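
The four parking conditions can be expressed as threshold checks over a few counters. The thresholds below are the ones stated above. The `CampaignHealth` structure and its field names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CampaignHealth:
    consecutive_failures: int = 0  # failures in a row on the same approach
    new_type_errors: int = 0       # new type errors in the current phase
    consecutive_drifts: int = 0    # direction-drift detections in a row
    quality_failures: int = 0      # failed quality-bar checks


def park_reasons(health: CampaignHealth) -> List[str]:
    """Return every tripped circuit breaker; an empty list means keep going."""
    reasons = []
    if health.consecutive_failures >= 3:
        reasons.append("3+ consecutive failures on the same approach")
    if health.new_type_errors >= 5:
        reasons.append("5+ new type errors in a single phase")
    if health.consecutive_drifts >= 2:
        reasons.append("direction drift detected twice in a row")
    if health.quality_failures >= 3:
        reasons.append("quality bar failed 3 times")
    return reasons
```

Returning all tripped reasons, not only the first, gives the status message more to report at the decision point.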
