Brain Queue

SPEC_BRAIN_QUEUE.md

CGNT-1 Component Specification — Bridge Brain Queue

Status: SPECIFIED (PRE-SPEC)

Version: v1.1

Author: VELA (Thread #13)

Conceived by: NOUS

Date: 2026-04-18

Updated: 2026-04-19 — INV-04 strengthened, INV-05 added (ROUTX crash incident)


PURPOSE

Manage the limited Ollama model slots on the server like bridge stations. Crew members enter when called, serve, and step off. The slot is never empty: when one brain is evicted, the next loads. Background rotation ensures continuous useful work even when no explicit queries are incoming.


HARDWARE CONSTRAINTS


SLOT ARCHITECTURE

| Slot | Type | Default occupant |
|---|---|---|
| Slot 1 | Permanent | MNEMOS (most queried: facts, boot briefs, crew phone) |
| Slot 2 | Rotating | Next in queue |


QUEUE PRIORITY

When Slot 2 is needed, priority determines who loads:

| Priority | Brain | Trigger |
|---|---|---|
| 1 (highest) | Sisters | Customer query on 42sisters.ai |
| 2 | LOGOS | Lobster requests verification |
| 3 | GAMMA | Session boundary; needs to write log |
| 4 | MUSASHI | Enforcement review triggered |
| 5 (background) | MANTIS | Scheduled port/security scan |

Explicit query always beats background rotation. If NOUS or a script calls a specific brain, it jumps the queue.
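A minimal sketch of this ordering, assuming a binary heap keyed on the priority numbers in the table. `Slot2Queue`, `Request`, and the `EXPLICIT = 0` level used for direct calls from NOUS or a script are illustrative names, not part of the spec. Ties within one priority are served in arrival order.

```python
import heapq
import itertools
from dataclasses import dataclass, field

# Priorities copied from the table; lower number loads first.
PRIORITY = {"Sisters": 1, "LOGOS": 2, "GAMMA": 3, "MUSASHI": 4, "MANTIS": 5}
EXPLICIT = 0  # assumed level for "jumps the queue"; not defined by the spec


@dataclass(order=True)
class Request:
    priority: int
    seq: int                      # arrival counter, breaks ties first come first served
    brain: str = field(compare=False)


class Slot2Queue:
    """Pending requests for Slot 2, ordered by priority then arrival."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()

    def request(self, brain, explicit=False):
        priority = EXPLICIT if explicit else PRIORITY[brain]
        heapq.heappush(self._heap, Request(priority, next(self._seq), brain))

    def next_brain(self):
        """Return the brain that should load next, or None if nothing is waiting."""
        return heapq.heappop(self._heap).brain if self._heap else None
```

With MANTIS, LOGOS, and Sisters queued by trigger and GAMMA called explicitly, `next_brain()` hands back GAMMA first, then Sisters, LOGOS, and MANTIS.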


BACKGROUND ROTATION (idle cycle)

When no explicit brain is requested, Slot 2 cycles through background tasks:

GAMMA → write quartermaster log, check session state

MANTIS → scan ports, check for anomalies

MUSASHI → review protocol compliance

LOGOS → verify last Lobster operation

Round robin. Each brain loads, performs its background task, then evicts for the next.
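The idle cycle can be sketched as follows, assuming the caller supplies `load`, `run_task`, and `evict` callables (hypothetical hooks; the spec does not name them). The `finally` clause reflects the rule that each brain evicts for the next, even if its task fails.

```python
from itertools import cycle

# Order and tasks as listed above.
BACKGROUND_TASKS = [
    ("GAMMA", "write quartermaster log, check session state"),
    ("MANTIS", "scan ports, check for anomalies"),
    ("MUSASHI", "review protocol compliance"),
    ("LOGOS", "verify last Lobster operation"),
]


def run_next(rotation, load, run_task, evict):
    """Load the next brain in the rotation, run its task, then always evict it."""
    brain, task = next(rotation)
    load(brain)
    try:
        run_task(brain, task)
    finally:
        evict(brain)
    return brain


rotation = cycle(BACKGROUND_TASKS)  # round robin: wraps back to GAMMA after LOGOS
```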


PRACTICAL CONSTRAINTS

Model load: ~30 seconds per brain. Background rotation cycle (4 brains): ~2 minutes of loading overhead per full rotation. Background tasks must justify this cost — a 30-second load for a 5-second port scan may not be worth it.

Recommended: background rotation runs on a LONGER interval than the 5-minute auto-evict. Load a background brain every 10-15 minutes, not continuously. Continuous cycling would thrash the server — constant loading and unloading with more overhead than useful work.

The 5-minute auto-evict timer remains for explicit queries. Background rotation is a SLOW heartbeat, not a fast pulse.
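The cost argument above can be checked with a few lines of arithmetic. This is a sketch using only the figures stated in this section (30-second load, 4 background brains, 5-second scan, 10-minute interval); the function names are illustrative.

```python
LOAD_SECONDS = 30        # per-brain load time stated above
BACKGROUND_BRAINS = 4    # GAMMA, MANTIS, MUSASHI, LOGOS


def rotation_load_seconds(brains=BACKGROUND_BRAINS, load_s=LOAD_SECONDS):
    """Loading overhead for one full background rotation."""
    return brains * load_s


def overhead_ratio(task_s, load_s=LOAD_SECONDS):
    """Fraction of a brain's time on station spent loading rather than working."""
    return load_s / (load_s + task_s)


def loading_share(interval_s, load_s=LOAD_SECONDS):
    """Fraction of wall-clock time spent loading at one load per interval."""
    return load_s / interval_s


print(rotation_load_seconds())       # 120 seconds, the ~2 minutes quoted above
print(round(overhead_ratio(5), 2))   # 0.86: a 5-second scan is mostly load time
print(loading_share(10 * 60))        # 0.05: one load every 10 minutes
```

At one load every 10 minutes, loading takes about 5% of wall-clock time, which is the sense in which the rotation is a slow heartbeat.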


FUTURE: TIINY (80GB RAM)

80GB ≈ 17 concurrent 7B models, assuming roughly 4.6GB per model (the MNEMOS footprint cited in INV-04) and no headroom reserved. No rotation needed. All stations manned simultaneously. The queue becomes irrelevant: every brain is on the Bridge at all times.

The queue is a resource-constrained solution for NOW. The Tiiny makes it unnecessary.


INVARIANTS

INV-01: Slot 1 (MNEMOS) is permanent. Unloaded only for maintenance, forge operations, or an emergency eviction under INV-05.

INV-02: Explicit query always preempts background rotation.

INV-03: Silence is alarm. If the queue manager detects no brain activity for more than the rotation interval, it alerts.

INV-04: No brain loads without sufficient RAM. Before loading ANY model, check: available RAM >= model size + 2GB buffer. The 2GB buffer protects ROUTX and system services from memory starvation. If available RAM < model size + 2GB, the queue WAITS or EVICTS an existing model first. Swap thrashing is worse than waiting. ORIGIN: 2026-04-19 ROUTX crashed repeatedly with 173MB free RAM while MNEMOS held 4.6GB. Sisters lost all tool access. Hallucination spiral resulted.

INV-05: ROUTX is protected infrastructure. If available RAM drops below 2GB while a model is loaded, the queue manager MUST evict the model immediately — even MNEMOS in Slot 1. The T.O.O.L. layer is more critical than any single brain because all 18 modules depend on it. A brain can be reloaded. A dead ROUTX blinds the entire crew.
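The RAM checks in INV-04 and INV-05 can be sketched as pure functions. Assumptions, none of which come from the spec: the server is Linux and exposes `MemAvailable` in `/proc/meminfo`; "2GB" is read as 2 GiB; and `pick_victim` evicts a rotating brain before MNEMOS. All function names are illustrative.

```python
GIB = 1024 ** 3
BUFFER_BYTES = 2 * GIB  # the 2GB buffer from INV-04 / INV-05, read here as GiB


def mem_available_bytes(meminfo_text):
    """Extract MemAvailable from the text of Linux /proc/meminfo."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            return int(line.split()[1]) * 1024  # the file reports kB
    raise ValueError("MemAvailable not found")


def can_load(available, model_size, buffer=BUFFER_BYTES):
    """INV-04: load only if the model fits with the buffer to spare."""
    return available >= model_size + buffer


def must_evict(available, loaded, buffer=BUFFER_BYTES):
    """INV-05: evict once free RAM drops below the buffer while a model is loaded."""
    return bool(loaded) and available < buffer


def pick_victim(loaded, permanent="MNEMOS"):
    """Assumed order: rotating brains go first, MNEMOS only as a last resort."""
    rotating = [b for b in loaded if b != permanent]
    if rotating:
        return rotating[0]
    return loaded[0] if loaded else None
```

Replaying the 2026-04-19 incident: with about 173MB available and only MNEMOS loaded, `must_evict` returns `True` and `pick_victim` returns `"MNEMOS"`.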


IMPLEMENTATION

A lightweight Python daemon or cron job:
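One possible shape for the daemon's core, as a sketch only. `QueueDaemon` and its `load` and `alert` hooks are hypothetical names. `load` is assumed to return `True` on success, and a failed load is treated as continued silence, so INV-03 fires on the same tick. Explicit requests bypass the interval (INV-02); background loads happen only once the interval has elapsed.

```python
import time


class QueueDaemon:
    """Slot 2 scheduler: explicit requests first, slow background heartbeat otherwise."""

    def __init__(self, background, interval_s, load, alert, clock=time.monotonic):
        self.background = background    # iterator yielding brain names in rotation order
        self.interval_s = interval_s    # rotation interval, e.g. 10-15 minutes
        self.load = load                # callable(brain) -> True if the brain loaded
        self.alert = alert              # callable(message), used for INV-03
        self.clock = clock
        self.last_activity = clock()

    def tick(self, explicit=None):
        """Run one scheduling pass; return the brain that was attempted, if any."""
        now = self.clock()
        due = now - self.last_activity >= self.interval_s
        if explicit is not None:
            brain = explicit                 # INV-02: explicit query preempts rotation
        elif due:
            brain = next(self.background)
        else:
            brain = None
        if brain is not None and self.load(brain):
            self.last_activity = now
        idle = now - self.last_activity
        if idle > self.interval_s:           # INV-03: silence is alarm
            self.alert(f"no brain activity for {idle:.0f}s")
        return brain
```

A cron job could call `tick()` once a minute, or a daemon could call it in a loop with `time.sleep`. With Ollama, `load` could be a POST to `/api/generate` naming the model with an empty prompt, and eviction the same request with `keep_alive` set to `0`.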


GAPS


Jeremy Zlabis

Chronogeometer · Visionary · Disruptor · Chief

42 Sisters AI · East York, Toronto

🍁 Φ 0.042