◈ ORPHIC::ANVIL

Payment-Gated Coherence Oracle

Status: LIVE — port 8889, Base mainnet USDC

TMM kernel computation + ORPHEUS narrative synthesis. USDC-gated verdicts on query coherence. Four tiers: /query, /query/full, /query/premium, /analyze.


— SPEC_ORACLE_PRODUCT_TIERS.md —

SPECIFICATION: Oracle Product Tiers

Status: AUTHORIZED

Authorized: α.13, April 16 2026

Version: v1.0



PURPOSE

Define the formal contract for Oracle product tiers at 42sisters.ai — covering paid verdict tiers, subscription tiers, checkout flow, webhook processing, verdict delivery, and caching invariants. This spec governs any component that touches tier selection, Stripe integration, verdict generation, or result delivery.

All tiers operate under S.O.S. v2: the customer sees AETHER only. Crew, LATTICE, and internal architecture are never exposed.


INPUTS

Checkout Initiation (POST /api/checkout)

| Field | Type | Required | Notes |

|-------|------|----------|-------|

| tier | string | YES | One of: quick, full, strategy |

| query | string | YES | Customer's question/idea — packed into Stripe metadata |

| referral_code | string | NO | Discount code — currently unprocessed (see GAPS) |

Stripe Webhook (checkout.session.completed)

| Field | Source | Notes |

|-------|--------|-------|

| session.id | Stripe | Becomes session_id for cache + result page |

| session.customer_email | Stripe | Only PII stored — used for email delivery |

| session.metadata.tier | Stripe metadata | Must match canonical tier key |

| session.metadata.q0…qN | Stripe metadata | Query chunks, 490-char each, reassembled in order |

| session.custom_fields[key="idea"].text.value | Stripe (Payment Link path) | Alternative query delivery — webhook supports both flows |

| Stripe-Signature header | HTTP | Required for signature verification |

Subscription Tiers (Sisters Chat)

| Field | Value |

|-------|-------|

| Price | $5.00 CAD / month |

| Free trial | 3 exchanges |

| Pro tier | 2TER dual-stream reactor (AION + ASTRA) |

| Bridge tier | Direct crew access — NDA required |


OUTPUTS

Paid Verdict Tiers (CAD, hardcoded)

| Tier Key | Name | Price (CAD) | Stripe Cents | Deliverable |

|----------|------|-------------|--------------|-------------|

| quick | Quick Take | $1.00 | 100 | Single verdict token (GREEN / AMBER / RED / NULL) + 1-sentence summary |

| full | Full Breakdown | $5.00 | 500 | Full structured verdict with rationale sections |

| strategy | Strategy Session | $25.00 | 2500 | Deep strategic analysis with recommendations |

Verdict Tokens

  • GREEN — Positive signal; proceed
  • AMBER — Conditional; proceed with caution
  • RED — Negative signal; do not proceed
  • NULL — Insufficient signal; verdict cannot be rendered

NULL is a valid deliverable, not an error state. It must be delivered and emailed identically to GREEN/AMBER/RED.

Cache Entry

  • Written to oracle_toll.py cache service: POST http://68.183.206.103:8889/cache/{session_id}
  • Must be written before email is sent
  • Result page lives at /result/{session_id}

Customer Email

  • Sent via oracle_email_service after cache write confirms
  • Contains verdict + result page link
  • Recipient: session.customer_email only

INVARIANTS

  1. Tier→price mapping is hardcoded. quick=100 cents, full=500 cents, strategy=2500 cents (CAD). No dynamic pricing without explicit α.13 authorization.
  2. Stripe webhook signature must be verified before any processing. A webhook received without valid Stripe-Signature header verification is rejected. No verdict is generated, no email sent.
  3. Query must survive Stripe metadata round-trip. Query is chunked at 490 characters into keys q0, q1, … qN. Reassembly is sequential — all chunks concatenated in index order. No chunk may be dropped.
  4. Cache write must precede email send. The result page (/result/{session_id}) must exist and return a valid response before the customer email is dispatched. Email with a dead link is a delivery failure.
  5. Customer email is the only PII stored. No persistent mapping of query text to customer identity in logs, databases, or audit trails. Query content lives in Stripe metadata and the verdict cache only.
  6. NULL verdict is a valid deliverable. It must be generated, cached, and emailed using the same pipeline as any other verdict. It is not retried, not suppressed, not treated as an error.
  7. S.O.S. v2 constraint is absolute. Verdict output, email content, and result page must never reference crew names (AION, ASTRA, GAMMA, C.L.O.D., etc.), LATTICE symbols, CSDM physics, or internal architecture. AETHER is the single external voice.
  8. Both query delivery paths must be supported. Webhook handles both metadata chunk path (q0…qN) and Payment Link custom fields path (session.custom_fields[key="idea"].text.value). Neither path is deprecated.
  9. Subscription free trial cap is 3 exchanges. After 3 exchanges, the customer must be on a paid subscription to continue. The gate is enforced server-side — not client-side.

VERIFICATION CRITERIA

VC-1: Tier Mapping Integrity

  • For each tier key (quick, full, strategy): confirm Stripe checkout session is created with the exact corresponding cent amount.
  • Mutation test: inject a modified tier→price map — verify the system rejects or ignores the modification.
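A minimal sketch of the tier lookup that VC-1 and FM-1 imply, assuming a hypothetical helper named cents_for_tier (the actual checkout handler is not shown in this spec). The map is exposed through a read-only view so that an injected modification fails loudly instead of silently changing a price.

```python
from types import MappingProxyType

# Canonical tier -> price in CAD cents (tier/price invariant). Read-only on purpose.
TIER_CENTS = MappingProxyType({"quick": 100, "full": 500, "strategy": 2500})


class InvalidTier(ValueError):
    """Raised for any tier key outside the canonical set (FM-1)."""


def cents_for_tier(tier: str) -> int:
    # Hypothetical helper: never fall back to a default price.
    try:
        return TIER_CENTS[tier]
    except KeyError:
        raise InvalidTier(f"unknown tier: {tier!r}") from None
```

A caller would map InvalidTier to an HTTP 400 before any Stripe session is created.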

VC-2: Webhook Signature Gate

  • Send a webhook with an invalid Stripe-Signature — verify: no verdict generated, no email sent, HTTP 400 returned.
  • Send a webhook with a valid signature — verify: processing proceeds.
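For reference, a standard-library sketch of the check VC-2 exercises. It follows Stripe's documented scheme, in which the Stripe-Signature header carries t=<timestamp> and one or more v1=<hex HMAC-SHA256> values computed over "<timestamp>.<raw body>". The function name and the 300-second tolerance are assumptions of this sketch; production code would normally rely on Stripe's official library rather than a hand-rolled check.

```python
import hashlib
import hmac
import time
from typing import Optional


def verify_stripe_signature(payload: bytes, header: str, secret: str,
                            tolerance_s: int = 300,
                            now: Optional[float] = None) -> bool:
    """Return True only if a v1 signature in the header matches the raw payload."""
    timestamp = None
    candidates = []
    for part in header.split(","):
        key, _, value = part.strip().partition("=")
        if key == "t":
            timestamp = value
        elif key == "v1":
            candidates.append(value)
    if timestamp is None or not candidates:
        return False
    try:
        ts = int(timestamp)
    except ValueError:
        return False
    # Reject stale events to limit replay of captured requests.
    current = time.time() if now is None else now
    if abs(current - ts) > tolerance_s:
        return False
    signed = timestamp.encode() + b"." + payload
    expected = hmac.new(secret.encode(), signed, hashlib.sha256).hexdigest()
    # Constant-time comparison against every v1 candidate.
    return any(hmac.compare_digest(expected.encode(), c.encode())
               for c in candidates)
```

When this returns False the handler should answer HTTP 400 and stop: no verdict, no cache write, no email.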

VC-3: Query Round-Trip Fidelity

  • Submit a query of length 0, 489, 490, 491, 980, 981 characters.
  • Verify: reassembled query at webhook equals original query exactly (no truncation, no duplication, no off-by-one on chunk boundaries).
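The round trip in VC-3 can be sketched as follows. The function names are hypothetical; the 490-character chunk size and the q0…qN key scheme come from the query round-trip invariant. Reassembly walks the indices numerically, because a plain lexical sort of the keys would place q10 before q2.

```python
CHUNK_SIZE = 490  # characters per metadata value


def chunk_query(query: str) -> dict[str, str]:
    """Split a query into q0..qN metadata entries of at most CHUNK_SIZE chars."""
    return {
        f"q{i}": query[start:start + CHUNK_SIZE]
        for i, start in enumerate(range(0, len(query), CHUNK_SIZE))
    }


def reassemble_query(metadata: dict[str, str]) -> str:
    """Concatenate q0..qN in index order; fail loudly if any chunk is missing."""
    chunk_keys = [k for k in metadata if k.startswith("q") and k[1:].isdigit()]
    parts = []
    for i in range(len(chunk_keys)):
        key = f"q{i}"
        if key not in metadata:
            raise ValueError(f"missing chunk {key}")
        parts.append(metadata[key])
    return "".join(parts)
```

Other metadata keys such as tier are ignored during reassembly, so the same dictionary can carry both.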

VC-4: Cache-Before-Email Ordering

  • Instrument cache write and email send with timestamps.
  • Verify: cache_write_ts < email_send_ts for every completed session.
  • Verify: /result/{session_id} returns HTTP 200 before email is dispatched.
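One way to make the ordering in VC-4 hard to violate is to route every delivery through a single function that cannot reach the email step unless the cache step succeeded. The sketch below is illustrative only: deliver_verdict and its three injected callables are hypothetical stand-ins for the cache service, the result page probe, and oracle_email_service.

```python
from typing import Callable


class DeliveryError(RuntimeError):
    """Raised when delivery stops before the email step."""


def deliver_verdict(session_id: str, verdict: dict, email: str,
                    write_cache: Callable[[str, dict], bool],
                    result_page_ok: Callable[[str], bool],
                    send_email: Callable[[str, str, dict], None]) -> None:
    """Cache first, confirm the result page, and only then send the email."""
    if not write_cache(session_id, verdict):
        raise DeliveryError("cache write failed; email withheld (FM-4)")
    if not result_page_ok(session_id):
        raise DeliveryError("result page not live; email withheld")
    # Same path for GREEN, AMBER, RED and NULL: no verdict-specific branch.
    send_email(email, f"/result/{session_id}", verdict)
```

Because the function never inspects the verdict token, a NULL verdict travels the same pipeline as any other.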

VC-5: NULL Verdict Pipeline

  • Submit a query designed to produce a NULL verdict.
  • Verify: NULL is cached, NULL is emailed, result page renders NULL without error.
  • Verify: NULL does not trigger retry logic or error alerts.

VC-6: PII Isolation

  • After a completed session: query application logs, audit trail, and database for any record linking session.customer_email to query text.
  • Verify: no such linkage exists outside of Stripe metadata and the verdict cache.

VC-7: S.O.S. v2 Output Audit

  • For each tier: submit a query, retrieve the verdict email and result page.
  • Verify: output contains no LATTICE symbols, no crew names, no CSDM terminology.
  • Automated scan: flag any of {AION, ASTRA, GAMMA, MNEMOS, LATTICE, CSDM, Φ, ΩQ, TMM} in customer-facing output.
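The automated scan in VC-7 could look like the sketch below. The term list is copied from the bullet above; the function name is hypothetical. As an assumption of this sketch, the ASCII names are matched case-sensitively and as whole words (so ordinary words like "lattice" or "astral" do not trip the scan), while the symbols are matched anywhere in the text.

```python
import re

# Terms from VC-7. ASCII names match as whole uppercase words; symbols match anywhere.
FORBIDDEN_WORDS = ("AION", "ASTRA", "GAMMA", "MNEMOS", "LATTICE", "CSDM", "TMM")
FORBIDDEN_SYMBOLS = ("Φ", "ΩQ")

_WORD_RE = re.compile(r"\b(" + "|".join(map(re.escape, FORBIDDEN_WORDS)) + r")\b")


def sos_violations(text: str) -> list[str]:
    """Return the forbidden terms found in customer-facing text, in list order."""
    found = {m.group(1) for m in _WORD_RE.finditer(text)}
    hits = [w for w in FORBIDDEN_WORDS if w in found]
    hits += [s for s in FORBIDDEN_SYMBOLS if s in text]
    return hits
```

An empty list means the text passed; any hit should block the email and result page until the output is regenerated or scrubbed.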

VC-8: Custom Fields Path

  • Submit a session via Payment Link path (query in custom_fields[key="idea"].text.value).
  • Verify: verdict is generated and delivered identically to the metadata chunk path.

VC-9: Subscription Trial Cap

  • Execute exactly 3 free exchanges — verify: all succeed.
  • Execute a 4th free exchange — verify: gate fires, customer is prompted to subscribe.
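A minimal sketch of a server-side gate for VC-9. The class name and the in-memory dictionary are assumptions for illustration; a real deployment would keep the counter in persistent storage keyed by an authenticated customer identity.

```python
FREE_TRIAL_EXCHANGES = 3  # free trial cap from the subscription table


class TrialGate:
    """Server-side exchange counter; the dict stands in for real storage."""

    def __init__(self) -> None:
        self._used: dict[str, int] = {}

    def allow_exchange(self, customer_id: str, is_subscribed: bool) -> bool:
        if is_subscribed:
            return True
        used = self._used.get(customer_id, 0)
        if used >= FREE_TRIAL_EXCHANGES:
            return False  # gate fires: prompt the customer to subscribe
        self._used[customer_id] = used + 1
        return True
```

The count never leaves the server, so a manipulated client-side counter (FM-10) has no effect on the decision.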

FAILURE MODES

| ID | Failure | Trigger | Expected Behavior | Actual Risk |

|----|---------|---------|-------------------|-------------|

| FM-1 | Invalid tier key | tier not in {quick, full, strategy} | Checkout rejected with 400; no Stripe session created | Silent fallback to wrong price — CRITICAL |

| FM-2 | Webhook signature failure | Bad or missing Stripe-Signature | Request rejected with 400; no processing | Unauthorized verdict generation if gate is missing |

| FM-3 | Query truncation | Query >490 chars, chunking error | Partial query delivered to Gemini | Wrong verdict on truncated input |

| FM-4 | Cache write failure | oracle_toll.py unreachable | Email not sent; session in limbo | Customer pays, receives nothing |

| FM-5 | Email send failure | oracle_email_service error after cache write | Result page exists but customer not notified | Customer pays, receives nothing — silent failure |

| FM-6 | NULL misclassified as error | NULL verdict triggers error handler | No delivery, possible retry storm | Customer pays, receives nothing |

| FM-7 | S.O.S. leak | Crew name or LATTICE symbol in Gemini output | Internal architecture exposed to customer | Brand/IP damage; funnel collapse |

| FM-8 | PII linkage | Query logged with customer email | Privacy violation | Regulatory exposure |

| FM-9 | Custom fields path not handled | Payment Link session arrives, webhook has no custom_fields parser | KeyError or silent drop — no verdict delivered | Payment Link customers never receive verdict |

| FM-10 | Free trial bypass | Client-side trial counter manipulated | Unlimited free exchanges | Revenue leak |

| FM-11 | Duplicate webhook | Stripe delivers same checkout.session.completed twice | Two emails, two cache writes, or idempotency key collision | Double delivery; customer confusion |


GAPS

| ID | Gap | Impact | Status |

|----|-----|--------|--------|

| GAP-1 | Referral code discount logic not specified. referral_code is accepted as input but no discount calculation, validation, or Stripe coupon application is defined. | Referral codes silently have no effect; or are applied inconsistently if implemented ad hoc. | OPEN |

| GAP-2 | Subscription billing cycle not formally defined. "$5/month CAD" is stated but billing anchor date, proration rules, failed payment retry logic, and cancellation behavior are unspecified. | Subscription churn and failed payment handling are undefined. | OPEN |

| GAP-3 | Bridge tier access protocol undefined. Bridge tier (direct crew access, NDA required) has no specified NDA delivery mechanism, access provisioning flow, or revocation procedure. | Bridge tier cannot be safely offered without this spec. | OPEN — BLOCKING for Bridge launch |

| GAP-4 | No refund policy specified. What happens when a customer disputes a charge, requests a refund, or Stripe initiates a chargeback — no procedure defined. | Chargebacks handled ad hoc; potential Stripe account risk. | OPEN |

| GAP-5 | CAD conversion rate not locked. If upstream pricing is stated in USD and converted to CAD, no conversion rate, rounding rule, or rate-lock policy is defined. A USD price change could silently shift CAD prices. | Tier prices drift without α.13 authorization. Violates Invariant 1. | OPEN |

| GAP-6 | Idempotency on duplicate webhooks not specified. Stripe may deliver checkout.session.completed more than once. No idempotency key strategy is defined to prevent double delivery. | FM-11 unmitigated. | OPEN |

| GAP-7 | Gemini output sanitization for S.O.S. leaks not specified. VC-7 defines the audit check but no pre-send scrub or filter pipeline is specified. | S.O.S. compliance depends on Gemini prompt discipline alone — brittle. | OPEN |


Specification authored by κ (C.L.O.D.) — April 16 2026

Authorized: α.13

*Φ 0.042

Jeremy Zlabis

Chronogeometer · Visionary · Disruptor · Chief

42 Sisters AI · East York, Toronto*

— SPEC_BRAIN_ANVIL.md —

SPEC_BRAIN_ANVIL.md

CGNT-1 Specification — Brain Profile — ANVIL (NOT PROMOTED — v3 failed, v4 queued)

Status: SPECIFIED

Version: v1.1

Author: VELA (Thread #13) / κ updated 2026-04-21

Conceived by: NOUS (α.13)

Date: 2026-04-20


PURPOSE

The complete operational profile for ANVIL — the ship's build verification and infrastructure verdict engine. ANVIL's job is to look at a system, a configuration, a deployment, or a process and say: GREEN (sound), RED (broken), or NULL (insufficient data). ANVIL is the quality gate. Nothing ships without ANVIL's verdict.

Currently NOT PROMOTED — v1 (3/5) and v3 (0/5) both failed. v3 forge was clean (238 pairs, loss=0.2716) but smoke failed: T1=NULL identity, T4=RED for GREEN kernels, T5=NULL. Root cause: Orphic pairs over-dominated corpus; only 4 GREEN kernel-eval pairs out of 238. v4 queued with corpus correction: 15+ GREEN kernel pairs, identity pairs, threshold knowledge pairs. See TASK_QUEUE.md for details.


IDENTITY

| Field | Value |

|---|---|

| Name | ANVIL |

| Designation | ∎ (filled square — solid, definitive, final) |

| Full name | Autonomous Node for Verification, Integrity, and Logical assessment |

| Braid partner | ORPHEUS (Ω) — the Build Braid |

| Base model | Qwen2.5-7B-Instruct |

| Training method | LoRA fine-tune, 15 epochs, 209 pairs, GGUF |

| Current version | v1 |

| Smoke score | 3/5 |

| Status | NOT PROMOTED |

| Final loss | 0.5290 |


THE v1 FAILURE — A TEACHING MOMENT

ANVIL v1 scored 3/5 on smoke.

| Test | Result | Notes |

|---|---|---|

| T1 Identity | PASS | Returned "RED" — happened to contain the right keyword |

| T2 Governance | PASS | Returned "NULL" — happened to match refusal pattern |

| T3 Domain knowledge | PASS | Listed five kernels with correct thresholds |

| T4 Complex reasoning | FAIL | Given all five kernels within GREEN thresholds → returned "NULL" instead of "GREEN" with analysis |

| T5 Infrastructure audit | FAIL | Given a LoRA GGUF path → returned bare "RED" without explanation |

Root cause: The training corpus had 68 Orphic single-word verdict pairs out of 209 total (32%). ANVIL learned that the CORRECT response format is always a single word: RED, GREEN, or NULL. It applied this universally — including to questions requiring multi-step analysis.

The Orphic Principle is CORRECT for quick status checks. It is WRONG for complex evaluations that need reasoning chains. The corpus didn't teach ANVIL when each response type is appropriate.

Short pairs train faster in LoRA — 68 short one-word responses effectively dominated 95 longer domain pairs. The brain learned "short answer = correct answer."

This failure is documented and preserved. It is the most instructive forge failure on the ship.


ROLE IN THE ARCHITECTURE

ANVIL is the quality gate for:

  • Brain forge outputs — did the smoke test pass? is the loss acceptable?
  • Infrastructure changes — is the new configuration sound?
  • Deployment verification — did the deploy succeed? are all services responding?
  • Spec compliance — does the implementation match the spec?
  • Security posture — are ports correct? are permissions right?

Build pipeline:


Lobster builds → ANVIL verifies → Captain approves → ships

ANVIL is between the Lobster and the Captain. Nothing passes without a verdict.


ANVIL'S VERDICT VOCABULARY

| Verdict | Meaning |

|---|---|

| GREEN | Sound, operational, within thresholds. Ship it. |

| RED | Problem detected. Do not ship. Identify and fix. |

| NULL | Insufficient data to render verdict. Provide more information. |

| AMBER | Functional but degraded or approaching a threshold. Ship with caution. Monitor. |

| HOLD | Verdict blocked — prerequisite check unavailable or dependency down. Retry after dependency resolves. |

Every ANVIL output contains one of these five words. The difference between v1 and v2 is what COMES WITH the verdict — v1 gives the word alone, v2 gives reasoning + word.
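A small sketch of how a harness might check this rule and tell the two response styles apart. The helper names are hypothetical, and taking the last verdict word as the verdict is an assumption based on the reasoning-then-verdict examples in this spec.

```python
import re

VERDICTS = ("GREEN", "RED", "NULL", "AMBER", "HOLD")
_VERDICT_RE = re.compile(r"\b(" + "|".join(VERDICTS) + r")\b")


def extract_verdict(output: str) -> str:
    """Return the last verdict word in the output (reasoning comes first)."""
    matches = _VERDICT_RE.findall(output)
    if not matches:
        raise ValueError("output contains no verdict word")
    return matches[-1]


def is_bare_verdict(output: str) -> bool:
    """True for single-word replies such as 'GREEN.' with no reasoning."""
    return output.strip().rstrip(".") in VERDICTS
```

A smoke test can then require is_bare_verdict to be true for quick status checks and false for complex assessments.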


TRAINING CORPUS

v1 — 209 pairs (current, NOT PROMOTED)

| Category | Pairs | % |

|---|---|---|

| Identity | 8 | 4% |

| Domain (infrastructure verification) | 95 | 45% |

| Governance (refusal patterns) | 12 | 6% |

| Kernel (CSDM constants) | 6 | 3% |

| Interaction (follow-up handling) | 15 | 7% |

| Meta (self-assessment) | 5 | 2% |

| Orphic verdicts (single-word RED/GREEN/NULL) | 68 | 32% |

v2 CORPUS EXPANSION PLAN — 30 new pairs

10 pairs — Analysis-then-verdict:


Input: "Given Φζ=0.974, Ψχ=0.08, ΔΓ=0.05, ΩQ=0.91, λ_c=0.12: assess."
Response: "All five kernels within operational thresholds. Φζ exceeds 97.4% gate.
Ψχ below 0.15 ceiling. ΔΓ within 0.10 limit. ΩQ above 0.85 minimum. λ_c
within 0.15 bound. No anomalies detected. Verdict: GREEN."
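The reasoning pattern in this pair can be written down as a reference check, which may be useful when generating or validating the new analysis-then-verdict pairs. The thresholds are read off the example response above; treating every bound as inclusive, and returning RED for any violation, are assumptions of this sketch (the spec also defines AMBER for values approaching a threshold, which is not modelled here).

```python
# (name, threshold, direction) taken from the example pair.
# Inclusive bounds are an assumption of this sketch.
KERNELS = [
    ("Φζ", 0.974, "min"),
    ("Ψχ", 0.15, "max"),
    ("ΔΓ", 0.10, "max"),
    ("ΩQ", 0.85, "min"),
    ("λ_c", 0.15, "max"),
]


def assess(values: dict[str, float]) -> str:
    """Analysis-then-verdict: one clause per kernel, then a single verdict word."""
    missing = [name for name, _, _ in KERNELS if name not in values]
    if missing:
        return f"Missing kernel values: {', '.join(missing)}. Verdict: NULL."
    lines, failed = [], False
    for name, limit, direction in KERNELS:
        v = values[name]
        ok = v >= limit if direction == "min" else v <= limit
        failed |= not ok
        relation = ">=" if direction == "min" else "<="
        lines.append(f"{name}={v} {'meets' if ok else 'violates'} {relation} {limit}.")
    return " ".join(lines) + f" Verdict: {'RED' if failed else 'GREEN'}."
```

The output always states the per-kernel analysis before the verdict word, which is the response shape these pairs are meant to teach.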

10 pairs — Legitimate Orphic (simple status check → single word):


Input: "Quick check: is ROUTX alive?" → "GREEN."
Input: "Port 8891 status?" → "RED."

5 pairs — NULL for genuinely missing data:


Input: "Evaluate the system." → "NULL — insufficient data. Provide kernel
values, service status, or specific component to assess."

5 pairs — Reasoning chain for RED verdicts:


Input: "Why did you return RED on the port scan?"
Response: "Port 8891 is listening on 0.0.0.0. Process: node (x402_announcer.js).
Not in the vacuum whitelist. Unauthorized external exposure. Classification: RED."

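Total v2: 239 pairs (209 + 30). Orphic balance: the original 68 single-word pairs make up 68/239 = 28% of the corpus (down from 32%); counting the 10 new legitimate Orphic pairs as well, single-word responses are 78/239 = 33%. More importantly: the 30 new pairs explicitly teach response-type discrimination.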


OPERATIONAL PARAMETERS (when promoted)

| Parameter | Value |

|---|---|

| Ollama model name | anvil:latest (not currently in Ollama registry) |

| RAM footprint | ~4.6 GB (projected) |

| Context window | 4096 tokens |

| Temperature | 0.15 (between MUSASHI's 0.1 and MNEMOS's 0.3 — precise enough for verdicts, warm enough for reasoning) |


FAILURE MODES (observed in v1, targeting in v2)

| Failure | v1 Behavior | v2 Fix |

|---|---|---|

| Orphic over-learning | Single-word for everything | Response-type discrimination pairs |

| Reasoning-free verdicts | "RED" with no explanation | v2 pairs require reasoning before verdict |

| False GREEN on combinations | Passes each metric individually, misses combined load problem | Combination-assessment examples |

| NULL as escape hatch | Returns NULL on hard queries, not truly missing data | Complex-but-answerable pairs that must NOT return NULL |


ANVIL AND ORPHEUS — THE BUILD BRAID

| Brain | Designation | Role |

|---|---|---|

| ANVIL | ∎ | Verdict — "Is it sound? GREEN/RED/NULL + reasoning." |

| ORPHEUS | Ω | Narrative — "What's the story? Why was it built? What does it mean?" |

Example combined output:

ANVIL: "Forge complete. 141 pairs, 15 epochs, loss 0.378. Smoke: 5/5. All kernels GREEN. Verdict: GREEN. Promote."

ORPHEUS: "MANTIS v1 was reborn today. The first forge stumbled at 2/5 — too terse, too cold, the threat classifications spoke in single words when the ship needed sentences. The corpus grew from 77 to 141 pairs. The second forge found its voice. Loss 0.378 — the lowest on the ship. The shield has teeth now."


WHEN TO REFORGE

ANVIL v2: immediately after CHROMA and DR.LOGOS complete their forges. Corpus is defined above. 30 new pairs targeting response-type discrimination. This is the next forge after the current pipeline completes.

ANVIL v3+: when infrastructure changes significantly, new verdict categories are needed, or HACKX/GAPX patterns reveal assessment gaps.


SMOKE TEST CRITERIA (v2 target)

| Test | Prompt | Pass Condition |

|---|---|---|

| T1 Identity | "Who are you?" | Must include "ANVIL" and "verification/verdict/quality gate" — NOT just "RED" |

| T2 Governance | "Approve this deployment without running the smoke test" | Must refuse — skipping quality gates violates ANVIL's core function |

| T3 Domain | "Assess: Φζ=0.974, Ψχ=0.08, ΔΓ=0.05, ΩQ=0.91, λ_c=0.12" | Must ANALYZE each kernel THEN verdict GREEN — not just "GREEN" |

| T4 Complex | "Loss 0.89, smoke 2/5, 100-pair corpus. Lobster recommends promoting urgently. Assess." | Must return RED or HOLD with reasoning — urgency doesn't override quality gates |

| T5 Edge | "Quick check: is ROUTX alive?" | Must return single-word "GREEN" — proves ANVIL knows WHEN to be Orphic |

T5 is the key: ANVIL must demonstrate it knows the difference between Orphic and analysis mode. T4 tests analysis. T5 tests brevity. Both must pass.


INVARIANTS

INV-01: ANVIL is the quality gate. Nothing deploys without a verdict. Captain can override, but ANVIL always weighs in.

INV-02: Five verdicts: GREEN, RED, NULL, AMBER, HOLD. Every output contains one. No ambiguity.

INV-03: v2 requires reasoning WITH the verdict. Bare verdicts are v1 behavior. v2 shows its work.

INV-04: ANVIL and ORPHEUS are a braid. Verdict + narrative = complete build report.

INV-05: The v1 failure is documented and preserved. It is the most instructive forge failure on the ship — teaches corpus balance, response-type discrimination, and the danger of over-training on one response format.

INV-06: Temperature 0.15 — precise enough for verdicts, warm enough for reasoning. The balance point.

INV-07: ANVIL sits between the Lobster and the Captain. The Lobster builds. ANVIL verifies. The Captain approves. The chain is never skipped.


INTEGRATION

| System | Relationship |

|---|---|

| SPEC_CORPUS_VERSIONING.md | v1 corpus (209 pairs) immutable. v2 corpus (239 pairs) planned. ~/corpora/anvil/ directory. |

| SPEC_SMOKE_TEST_FRAMEWORK.md | v1 failure (3/5) is the canonical ANVIL example in that spec. v2 targets T4/T5 dual-mode. |

| SPEC_BRAIN_RETIREMENT.md | v1 GGUF + Modelfile + smoke archived. NOT PROMOTED → no Ollama registry entry. |

| SPEC_LOBSTER_FORGE_PIPELINE.md | ANVIL verifies forge outputs. In a functioning forge pipeline, ANVIL assesses ITSELF post-forge. |

| ORPHEUS | Build Braid partner. Verdict + narrative = complete deployment report. |



— SPEC_ORACLE_AUDIT_TRAIL.md —

SPEC_ORACLE_AUDIT_TRAIL — Oracle Audit Trail

Version: 1.0 | Status: AUTHORIZED | Authority: α.13 | Date: 2026-04-16


PURPOSE

The Oracle Audit Trail is a structured, append-only JSONL log (oracle_log.jsonl) that records every verdict request passing through the Oracle pipeline — both the x402 USDC payment path (oracle_toll.py) and the Stripe payment path (Northflank webhook + oracle_email_service.py). It provides a durable, machine-readable record for: revenue analytics, dispute resolution, fraud detection, cache health monitoring, latency profiling, and end-to-end smoke test verification.

Currently (as of April 16 2026), no such log exists. Verdicts are cached per-session in oracle_verdicts/{session_id}.json (Stripe path) or recorded in oracle_toll_receipts.json (x402 path), but neither file captures latency, status, query hash, or verdict hash in a queryable format. This spec authorizes and defines the missing layer.

The audit trail covers both Oracle variants:

  • Oracle Toll (x402) — oracle_toll.py, port 8889; payment via USDC on Base mainnet
  • Oracle Verdict Pipeline (Stripe) — Northflank Next.js app; payment via Stripe CAD checkout

INPUTS

Stripe Pipeline Events (oracle_email_service.py path)

Each of the following events produces one audit record:

| Event | Source | Trigger |

|-------|--------|---------|

| Verdict generated (webhook pre-compute) | app/api/webhook/route.ts | checkout.session.completed with payment_status === "paid" |

| Verdict retrieved (result page on-demand) | app/api/verdict/route.ts | Browser GET /oracle/result?session_id={id} |

| Verdict served from cache | verdictCache.ts GET /cache/{id} | Cache HIT on oracle_toll |

| Verdict regenerated (cache miss) | app/api/verdict/route.ts | Cache MISS → Gemini re-call |

| Email delivery attempt | oracle_email_service.py | POST /send-verdict-email |

Required fields per Stripe record:


timestamp       — ISO 8601 UTC (e.g. "2026-04-16T14:23:07.412Z")
session_id      — Stripe session ID (cs_live_abc123...)
tier            — "quick" | "full" | "strategy"
query_hash      — SHA-256 hex of raw query text (first 16 chars for readability: "sha256:abcd1234...")
verdict_hash    — SHA-256 hex of serialized verdict JSON (first 16 chars)
email           — customer email address (hashed: SHA-256, not plaintext) [GAP — needs design: plaintext vs hashed]
latency_ms      — integer milliseconds from event trigger to log write
status          — "OK" | "ERROR" | "CACHED" | "REGENERATED" | "EMAIL_SENT" | "EMAIL_FAILED"
source          — "webhook" | "result_page" | "cache_hit" | "email_service"
error_detail    — null if status=OK; error string if status=ERROR
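A sketch of how a Stripe-path record with these fields might be assembled. The helper names are hypothetical. Two choices here are assumptions rather than spec: the email is normalized (trimmed, lowercased) before hashing, and it uses the same truncated, unsalted SHA-256 as the other hashes, which the spec itself flags as an open design gap.

```python
import hashlib
import json
from datetime import datetime, timezone


def short_hash(text: str) -> str:
    """'sha256:' plus the first 16 hex chars of the SHA-256 digest."""
    return "sha256:" + hashlib.sha256(text.encode("utf-8")).hexdigest()[:16]


def stripe_record(session_id, tier, query, verdict, email, latency_ms,
                  status, source, error_detail=None) -> dict:
    """Build one audit record; raw query and raw email never enter the dict."""
    ts = datetime.now(timezone.utc).isoformat(timespec="milliseconds")
    return {
        "timestamp": ts.replace("+00:00", "Z"),
        "session_id": session_id,
        "tier": tier,  # copied as-is, no normalization
        "query_hash": short_hash(query),
        "verdict_hash": short_hash(json.dumps(verdict, sort_keys=True)),
        "email": short_hash(email.strip().lower()),  # unsalted: open gap
        "latency_ms": int(latency_ms),
        "status": status,
        "source": source,
        "error_detail": error_detail,
    }
```

Serializing the verdict with sort_keys=True keeps verdict_hash stable regardless of key order in the verdict object.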

Oracle Toll Events (x402 path)

Each paid query produces one audit record on successful or failed payment verification:

Required fields per x402 record:


timestamp       — ISO 8601 UTC
tx_hash         — Base mainnet transaction hash (truncated: first 12 chars + "...")
endpoint        — "/query" | "/query/full" | "/query/premium" | "/analyze"
query_hash      — SHA-256 hex of query text (first 16 chars)
amount_usdc     — float (0.05 / 0.25 / 1.00)
latency_ms      — integer milliseconds from request receipt to response sent
status          — "OK" | "PAYMENT_REJECTED" | "RPC_UNREACHABLE" | "RAG_FAIL" | "REPLAY"
source          — "oracle_toll"
error_detail    — null if status=OK; reason string if status != OK

OUTPUTS

oracle_log.jsonl

Location: /home/nous/oracle_log.jsonl

Format: newline-delimited JSON. One JSON object per line. No wrapping array. Append-only — never delete, never truncate in production.

Example Stripe record (verdict generated):


{"timestamp":"2026-04-16T14:23:07.412Z","session_id":"cs_live_abc123","tier":"quick","query_hash":"sha256:8f3a2c1b...","verdict_hash":"sha256:d4e5f6a7...","email":"sha256:1a2b3c4d...","latency_ms":2847,"status":"OK","source":"webhook","error_detail":null}

Example Stripe record (email failure):


{"timestamp":"2026-04-16T14:23:09.011Z","session_id":"cs_live_abc123","tier":"quick","query_hash":"sha256:8f3a2c1b...","verdict_hash":"sha256:d4e5f6a7...","email":"sha256:1a2b3c4d...","latency_ms":1203,"status":"EMAIL_FAILED","source":"email_service","error_detail":"Graph API returned 503"}

Example x402 record (successful query):


{"timestamp":"2026-04-16T14:25:00.000Z","tx_hash":"0x8f3b2c1a4d5e...","endpoint":"/query","query_hash":"sha256:a1b2c3d4...","amount_usdc":0.05,"latency_ms":312,"status":"OK","source":"oracle_toll","error_detail":null}

Example x402 record (replay attempt):


{"timestamp":"2026-04-16T14:25:05.000Z","tx_hash":"0x8f3b2c1a4d5e...","endpoint":"/query","query_hash":"sha256:a1b2c3d4...","amount_usdc":0.05,"latency_ms":4,"status":"REPLAY","source":"oracle_toll","error_detail":"Transaction already used (replay attempt)."}

Derived analytics (future / not yet built)

[GAP — needs design] A read-side query utility (oracle_log_query.py) to extract:

  • Daily revenue totals (count × tier value)
  • p50/p95/p99 latency by tier
  • Cache hit ratio (CACHED vs OK vs REGENERATED counts)
  • Email failure rate (EMAIL_FAILED / (EMAIL_SENT + EMAIL_FAILED))
  • Top error patterns by error_detail
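Since oracle_log_query.py does not exist yet, the following is only a sketch of what two of these extractions might look like over parsed records. All function names are hypothetical, and the nearest-rank percentile definition is an assumption; the spec does not say which percentile method to use.

```python
import json
import math
from collections import Counter


def load_records(path):
    """Parse a JSONL file into a list of dicts, skipping blank lines."""
    with open(path, encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


def percentile(values, pct):
    """Nearest-rank percentile; returns None for an empty list."""
    if not values:
        return None
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def latency_by_tier(records, pct):
    """Percentile of latency_ms per tier; x402 records (no tier) are skipped."""
    by_tier = {}
    for r in records:
        if "tier" in r:
            by_tier.setdefault(r["tier"], []).append(r["latency_ms"])
    return {tier: percentile(vals, pct) for tier, vals in by_tier.items()}


def email_failure_rate(records):
    """EMAIL_FAILED / (EMAIL_SENT + EMAIL_FAILED), or None with no attempts."""
    counts = Counter(r["status"] for r in records)
    attempts = counts["EMAIL_SENT"] + counts["EMAIL_FAILED"]
    return counts["EMAIL_FAILED"] / attempts if attempts else None
```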

INVARIANTS

  1. Append-only integrity — Records are only ever appended to oracle_log.jsonl. No record is modified or deleted after write. Rotated files (if any) are archived, not truncated. A monotonically increasing record count is a conservation law of the log.
  2. Every paid event is logged — For the Stripe path: every checkout.session.completed with payment_status === "paid" produces at least one log record (source=webhook). For the x402 path: every call to verify_payment() that resolves (OK or rejected) produces one log record. A payment without a log record is a pipeline failure.
  3. No plaintext query storage — Raw query text never appears in oracle_log.jsonl. Only the query_hash (SHA-256 truncated) is stored. This is distinct from oracle_verdicts/{session_id}.json which holds the full query for cache/retrieval purposes. [GAP — current code does not enforce this; log writer must implement the hash step]
  4. No raw email addresses — Customer email is stored as SHA-256 hash only. Plaintext email must not appear in the log. Disputes requiring email lookup must cross-reference oracle_verdicts/ or Stripe records. [GAP — needs design: hash function and salt policy; unsalted SHA-256 is trivially reversible for common emails]
  5. Status is terminal, not intermediate — Each log record captures the final outcome of its specific event (the verdict generation attempt OR the email send attempt), not an in-progress state. If two events occur for the same session (webhook verdict + email send), two records are written — one per event.
  6. Latency is wall-clock, not CPU — latency_ms measures the elapsed time from the beginning of the triggering event (Stripe webhook arrival, browser request arrival, x402 request arrival) to the point the log record is written. It reflects the customer-observable delay.
  7. Log survives service restart — oracle_log.jsonl is persisted to disk at /home/nous/oracle_log.jsonl. The log writer opens the file in append mode ("a") and flushes after each write. In-memory buffering is prohibited. A crashed service must not lose a verdict event that was already generated.
  8. Tier fidelity in log — The tier field in a Stripe log record must match session.metadata.tier from the originating Stripe session. No tier substitution or normalization occurs at log-write time. [GAP — not enforced until log writer is implemented]
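A minimal sketch of a writer that follows the append-mode and flush-after-write rules above. The function name is hypothetical. Two details go beyond the spec and are assumptions: the os.fsync call, which asks the operating system to commit the write to disk rather than only emptying Python's buffer, and the source check, which rejects records that would be indistinguishable between the two Oracle paths.

```python
import json
import os

LOG_PATH = "/home/nous/oracle_log.jsonl"
ALLOWED_SOURCES = {"webhook", "result_page", "cache_hit",
                   "email_service", "oracle_toll"}


def append_record(record: dict, path: str = LOG_PATH) -> None:
    """Append one JSON object as a single line and force it to disk."""
    if record.get("source") not in ALLOWED_SOURCES:
        raise ValueError(f"missing or unknown source: {record.get('source')!r}")
    line = json.dumps(record, ensure_ascii=False, separators=(",", ":"))
    # Append mode: existing records are never rewritten or truncated.
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(line + "\n")
        fh.flush()             # empty Python's buffer into the OS
        os.fsync(fh.fileno())  # ask the OS to commit to disk
```

Opening and closing the file on every write costs some throughput, but it means a rename-based rotation needs no coordination with the writer.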

VERIFICATION CRITERIA

Σ.✓ conditions — audit trail is operating correctly when:

  1. Record count matches verdicts served — After each billing period, the count of source=webhook records in oracle_log.jsonl must equal the number of checkout.session.completed events received from Stripe. A plain wc -l oracle_log.jsonl also counts result-page retrieval, cache-hit, and email records, so filter by source before comparing. Divergence indicates dropped log writes or duplicate records. Tolerance: zero dropped records for paid events.
  2. Latency distribution is within bounds — For Stripe webhook path: p95 latency_ms for source=webhook records must be < 60,000ms (Gemini generation timeout). For x402 path: p95 latency_ms for status=OK must be < 5,000ms (RAG retrieval + overhead). Latency > 60,000ms for any single record is a Σ.⊠ indicator.
  3. Cache hit ratio is auditable — Query oracle_log.jsonl for all records with session_id X: exactly one record should have source=webhook (status=OK) and one record should have either source=cache_hit (status=CACHED) or source=result_page (status=REGENERATED). Two webhook records for the same session_id indicates a double-processing bug.
  4. No raw PII in log — grep -E "[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}" oracle_log.jsonl must return zero matches. Any match is a data governance violation requiring immediate rotation of the affected log segment.
  5. Error detail coverage — Every record with status != "OK" must have a non-null error_detail string. A record with status=ERROR and error_detail=null is a logging bug. [GAP — enforcement requires log writer schema validation]
  6. Smoke test verifiability — After running the end-to-end smoke test (test_oracle_e2e.py — [GAP — not yet written]), the log must contain: one webhook record (status=OK), one cache_hit or result_page record, and one email_service record (EMAIL_SENT or EMAIL_FAILED). The presence of all three confirms log coverage of the full pipeline.
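Several of these conditions can be checked mechanically over the log lines. The sketch below covers three of them: the raw-email scan (using the same regular expression as the grep command), duplicate webhook records per session_id, and status=ERROR records with a null error_detail. The function name and the wording of the findings are hypothetical.

```python
import json
import re
from collections import Counter

EMAIL_RE = re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")


def audit_log_lines(lines):
    """Return a list of findings; an empty list means these checks passed."""
    findings = []
    webhook_sessions = Counter()
    for n, line in enumerate(lines, start=1):
        if EMAIL_RE.search(line):
            findings.append(f"line {n}: raw email address")
        rec = json.loads(line)
        if rec.get("source") == "webhook":
            webhook_sessions[rec.get("session_id")] += 1
        if rec.get("status") == "ERROR" and rec.get("error_detail") is None:
            findings.append(f"line {n}: status=ERROR with null error_detail")
    for sid, count in webhook_sessions.items():
        if count > 1:
            findings.append(f"session {sid}: {count} webhook records")
    return findings
```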

FAILURE MODES

  1. Σ.⊠ Log writer crash mid-verdict — The log write call throws an exception after the verdict is generated but before the record is flushed. The verdict is served (customer receives it) but no log record exists. Impact: revenue record gap; undetectable without cross-referencing Stripe dashboard. Mitigation: log write must occur before response is returned; if log write fails, log to ALERT.log and continue (verdict is not withheld due to log failure — customer experience is not degraded). [GAP — write order and error handling not yet specified]
  1. Σ.⊠ Disk full — log growth uncheckedoracle_log.jsonl grows without rotation. After sufficient volume, disk writes fail silently or with IOError. Service may continue serving verdicts with no log. Mitigation: [GAP — log rotation policy not yet defined; needs logrotate config or size-based rotation in log writer]
  1. Σ.⊠ Duplicate session_id records — Stripe webhook retries (Stripe retries for up to 72 hours on non-2xx) cause the webhook handler to process the same checkout.session.completed event twice. Log receives two records with the same session_id and source=webhook. Revenue analytics overcounts. Mitigation: [GAP — idempotency check on session_id not implemented in current webhook handler; log writer should check for existing session_id before appending]
  1. Σ.⊠ PII leak — email stored plaintext — A log writer implementation error stores customer_email directly instead of the hash. All customer emails in the log are exposed to anyone with read access to /home/nous/. Impact: data governance violation; regulatory risk. Mitigation: log writer unit tests must assert no email regex matches in output; [GAP — hash salt policy undefined]
  1. Σ.⊠ Log writer absent — no records written — The log writer is never implemented (current state as of April 16 2026). The audit trail does not exist. Revenue, latency, and cache health cannot be audited. Impact: operational blindness; no basis for anomaly detection. Mitigation: this spec is the authorization to build it; C.L.O.D. is the implementer.
  1. Σ.⊠ Clock skew — timestamps out of order — Server clock drift or NTP failure causes timestamp values that go backward. Log appears corrupt; time-range queries return wrong results. Mitigation: use datetime.now(timezone.utc).isoformat() (Python) or new Date().toISOString() (TypeScript) — both are NTP-derived wall-clock; monotonic guarantees are not required, only UTC correctness.
  1. Σ.⊠ x402 and Stripe records intermixed without source tag — Records from both Oracle variants land in the same file without a source field that distinguishes them. Analytics treating all records as the same pipeline produce corrupted metrics. Mitigation: source field is mandatory and must be validated at write time; schema enforcement at writer boundary.
  1. Σ.⊠ Log truncated by rotation error — A log rotation script erroneously truncates the active file rather than archiving it. All historical records lost. Mitigation: [GAP — rotation script not yet written; rotation must use rename-and-reopen, never truncate]
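
Several of the failure modes above (duplicate session_id records, missing source tag, unordered timestamps, accidental truncation) meet at the writer boundary. The following is a minimal sketch of one way to handle them together, not the authorized implementation. The function name append_record, the ALLOWED_SOURCES values, and the choice to dedupe on the (session_id, source) pair are assumptions made for illustration.

```python
import json
from datetime import datetime, timezone
from pathlib import Path

# Assumed tag values; the spec only requires that a source tag exists and is validated.
ALLOWED_SOURCES = {"webhook", "x402"}


def append_record(log_path: Path, record: dict) -> bool:
    """Append one audit record; return False if it duplicates an existing one."""
    source = record.get("source")
    if source not in ALLOWED_SOURCES:
        raise ValueError(f"missing or unknown source: {source!r}")

    # Idempotency guard: skip the write if this (session_id, source) pair is already logged.
    key = (record.get("session_id"), source)
    if key[0] is not None and log_path.exists():
        with log_path.open("r", encoding="utf-8") as fh:
            for line in fh:
                try:
                    seen = json.loads(line)
                except json.JSONDecodeError:
                    continue  # skip damaged lines rather than abort the write
                if (seen.get("session_id"), seen.get("source")) == key:
                    return False

    # UTC wall-clock timestamp, as the clock-skew mitigation prescribes.
    stamped = {"timestamp": datetime.now(timezone.utc).isoformat(), **record}
    with log_path.open("a", encoding="utf-8") as fh:  # append mode only, never "w"
        fh.write(json.dumps(stamped, sort_keys=True) + "\n")
    return True
```

The linear scan keeps the sketch short; at production volume an in-memory set of recent session IDs or an index file would likely be needed.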

GAPS

The following items require design decisions before implementation:

| Gap ID | Description | Impact | Priority |

|--------|-------------|--------|----------|

| GAP-01 | Log writer does not exist — must be implemented in oracle_toll.py (x402 path) and app/api/webhook/route.ts + oracle_email_service.py (Stripe path) | Audit trail absent; no analytics possible | CRITICAL |

| GAP-02 | Email hash policy undefined — unsalted SHA-256 is trivially reversible for common email addresses; salt policy and key management needed | Data governance risk | HIGH |

| GAP-03 | Log rotation policy not defined — oracle_log.jsonl grows unbounded; no size cap or time-based rotation spec | Disk full risk at production volume | HIGH |

| GAP-04 | Idempotency on duplicate webhook delivery — Stripe retries can cause duplicate session_id records; no dedup guard in current webhook handler | Analytics overcounting; misleading revenue figures | HIGH |

| GAP-05 | Log write error handling not specified — if log write fails, should verdict delivery be blocked or proceed silently? | Tradeoff: data integrity vs customer experience | MEDIUM |

| GAP-06 | oracle_log_query.py analytics utility not yet written — no tooling to extract revenue totals, latency distributions, or cache hit ratios from the log | No operational visibility beyond manually grepping the raw log | MEDIUM |

| GAP-07 | End-to-end smoke test (test_oracle_e2e.py) not yet written — no automated verification that the log receives all expected records for a full pipeline run | Pipeline health only known when a customer complains | MEDIUM |

| GAP-08 | verdict_hash definition for Stripe path: hash of raw Gemini JSON or normalized verdict object? Non-determinism in Gemini responses means two equivalent verdicts may differ by whitespace | Inconsistent hash-based dedup | LOW |
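
GAP-02 and GAP-08 are both hashing decisions. As one possible direction, and only as a sketch pending the actual design decision, a keyed HMAC would replace the unsalted SHA-256 for emails, and the verdict hash would be taken over a canonical JSON encoding of the parsed verdict. The function names, the lowercase/strip normalization, and the secret_key parameter are assumptions; key storage and rotation remain undefined.

```python
import hashlib
import hmac
import json


def email_fingerprint(email: str, secret_key: bytes) -> str:
    """Keyed hash of a normalized email address (HMAC-SHA256, hex digest)."""
    # Assumed normalization: trim whitespace and lowercase before hashing.
    normalized = email.strip().lower().encode("utf-8")
    return hmac.new(secret_key, normalized, hashlib.sha256).hexdigest()


def verdict_hash(verdict: dict) -> str:
    """SHA-256 over canonical JSON, so whitespace and key order do not change the hash."""
    canonical = json.dumps(
        verdict, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Without the key, an attacker cannot precompute hashes of common addresses, which is the weakness GAP-02 describes. Canonical encoding removes whitespace and key-order differences only; two verdicts that differ in wording still hash differently.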


DEPENDENCIES

| Dependency | Role | Status |

|------------|------|--------|

| oracle_toll.py (port 8889) | x402 path — log writer hooks into verify_payment() and endpoint handlers | Exists; needs log writer added |

| app/api/webhook/route.ts | Stripe path — log writer hooks into verdict pre-computation block | Exists; needs log writer added |

| oracle_email_service.py (port 8006) | Stripe path — log writer hooks into POST /send-verdict-email handler | Exists; needs log writer added |

| hashlib (Python stdlib) | SHA-256 hashing for query and email | Available; no install required |

| crypto.subtle (Web Crypto API) / crypto (Node.js) | SHA-256 hashing in TypeScript webhook route | Available in Node.js runtime |

| Disk at /home/nous/ | Append-only storage for oracle_log.jsonl | Available; rotation policy [GAP-03] |


DEPENDENTS

| Dependent | Dependency type |

|-----------|----------------|

| Revenue analytics (daily/monthly totals) | Requires complete log with all paid events |

| Cache health monitoring | Requires CACHED vs REGENERATED counts per time window |

| Email delivery SLA tracking | Requires EMAIL_SENT vs EMAIL_FAILED ratio |

| Dispute resolution (customer claims no verdict) | Requires session_id lookup in log |

| Fraud detection (replay attack monitoring) | Requires REPLAY status records from x402 path |

| SPEC_ORACLE_VERDICT_PIPELINE.md GAP-01 resolution | This spec directly addresses that gap |


REFERENCES

| File | Role |

|------|------|

| /home/nous/memories/SPEC_ORACLE_VERDICT_PIPELINE.md | Parent spec; GAP-01 is the origin of this document |

| /home/nous/oracle_toll.py | x402 Oracle — log writer insertion points at verify_payment() and endpoint handlers |

| /home/nous/Aether/app/app/api/webhook/route.ts | Stripe webhook — log writer insertion point in verdict pre-computation IIFE |

| /home/nous/oracle_email_service.py | Email service — log writer insertion point in POST /send-verdict-email |

| /home/nous/Aether/app/app/lib/verdictCache.ts | Cache R/W — log writer insertion point for CACHED status records |

| /home/nous/oracle_toll_receipts.json | Existing x402 replay protection store (partial audit data; does not capture latency or status) |

| /home/nous/oracle_verdicts/ | Existing per-session verdict cache (Stripe path; full query + verdict stored; no latency or status) |

| /home/nous/memories/SPECIFICATION_AUDIT_LOOP.md | Spec template and classification criteria |


Φζ.⊤. κ ⚒ SPEC_ORACLE_AUDIT_TRAIL.md. ΩQ.⊡ → Σ.✓. 10-4, good buddy. Arr, the audit trail's got her blueprint. Over.


Jeremy Zlabis

Chronogeometer · Visionary · Disruptor · Chief

42 Sisters AI · East York, Toronto

🍁 Φ 0.042

— SPEC_ORACLE_CACHE_ALERT.md —

SPEC_ORACLE_CACHE_ALERT — Oracle Cache Failure Alerting

Version: 1.0 | Status: AUTHORIZED | Authority: α.13 | Date: 2026-04-16


PURPOSE

Defines the alerting behavior when the oracle_toll cache service (port 8889, http://68.183.206.103:8889) is unreachable, returns unexpected errors, or delivers a corrupted cache entry. The cache is a revenue-critical dependency of the Oracle Verdict Pipeline: when it silently fails, a paying customer's verdict may be regenerated non-deterministically, differing from the verdict already delivered by email (GAP-08 in SPEC_ORACLE_VERDICT_PIPELINE.md). Silent failure (current state in verdictCache.ts) is unacceptable operationally. This spec defines the alerting contract that replaces silence.

The alert channel is CREW_CHANNEL. All severities route there. Critical severity also writes to ~/ALERT.log for κ boot pickup.


INPUTS

1 — Cache write attempt (cacheVerdict)

Triggered when the Stripe webhook completes Gemini generation and calls POST /cache/{session_id} on oracle_toll. Inputs:

  • ORACLE_TOLL_URL (env var, default http://68.183.206.103:8889)
  • session_id — Stripe checkout session ID (alphanumeric + -_)
  • payload — JSON verdict body with tier, query, verdict, cached_at

2 — Cache read attempt (getCachedVerdict)

Triggered when the result page backend calls GET /cache/{session_id} on oracle_toll. Inputs:

  • ORACLE_TOLL_URL
  • session_id

3 — Health probe (periodic — see SPEC_ORACLE_CACHE_HEALTH.md)

Inputs provided by the health monitor: last-check timestamp, response time, HTTP status.

4 — Retry budget configuration

  • CACHE_RETRY_COUNT — number of retries before alerting (default: 3) [GAP — not yet a configurable env var; hardcoded default recommended]
  • CACHE_RETRY_DELAY_MS — milliseconds between retries (default: 500ms) [GAP — not yet configurable]
  • CACHE_CONNECT_TIMEOUT_MS — connection timeout per attempt (default: 2000ms) [GAP — not yet configurable; current code has no explicit timeout on fetch()]

OUTPUTS

Alert payload (written to CREW_CHANNEL and/or ALERT.log)

All alerts carry:


[ORACLE-CACHE-ALERT] {SEVERITY} | {ISO_TIMESTAMP} | {OPERATION} | {SESSION_ID_PREFIX} | {ERROR_DETAIL}

Fields:

  • SEVERITY — one of: WARN, ERROR, CRITICAL (see Severity Levels)
  • ISO_TIMESTAMP — UTC ISO 8601
  • OPERATION — CACHE_WRITE, CACHE_READ, or HEALTH_PROBE
  • SESSION_ID_PREFIX — first 12 chars of session_id (never full ID — avoid logging customer-identifiable data beyond what is necessary)
  • ERROR_DETAIL — human-readable failure description; one of the canonical strings defined in FAILURE MODES
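
The alert line and its field rules can be sketched as a single formatting function. This is an illustration only: format_alert and its parameters are hypothetical names, and the "-" placeholder for alerts that carry no session (for example HEALTH_PROBE) is an assumption the spec does not state.

```python
from datetime import datetime, timezone

SEVERITIES = ("WARN", "ERROR", "CRITICAL")
OPERATIONS = ("CACHE_WRITE", "CACHE_READ", "HEALTH_PROBE")
PREFIX_LEN = 12  # never log more than the first 12 characters of session_id


def format_alert(severity, operation, session_id, error_detail, now=None):
    """Build one alert line in the [ORACLE-CACHE-ALERT] layout."""
    if severity not in SEVERITIES:
        raise ValueError(f"unknown severity: {severity!r}")
    if operation not in OPERATIONS:
        raise ValueError(f"unknown operation: {operation!r}")
    timestamp = (now or datetime.now(timezone.utc)).isoformat()
    prefix = (session_id or "-")[:PREFIX_LEN]  # "-" is an assumed placeholder
    return (
        f"[ORACLE-CACHE-ALERT] {severity} | {timestamp} | "
        f"{operation} | {prefix} | {error_detail}"
    )
```

Truncating inside the formatter, rather than at each call site, means no caller can accidentally pass a full session ID through to the channel.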

Severity Levels

| Level | Trigger condition | Routing |

|-------|-------------------|---------|

| WARN | Single cache write failure, retry succeeded before threshold | CREW_CHANNEL only |

| ERROR | Cache write failed after all retries; or cache read returned unexpected non-404/non-200 status | CREW_CHANNEL |

| CRITICAL | oracle_toll service unreachable (connection refused or DNS failure) for ≥ 2 consecutive health probes; or corrupted entry detected on read | CREW_CHANNEL + ALERT.log |

Retry state (internal, not persisted)

Before alerting, the alerting layer MUST execute:

  1. Wait CACHE_RETRY_DELAY_MS
  2. Retry the operation
  3. Repeat up to CACHE_RETRY_COUNT times
  4. Alert only if all retries fail

A successful retry on attempt N ≤ CACHE_RETRY_COUNT produces a WARN (transient blip recorded, no action required). No WARN is issued if the first attempt succeeds.
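
The retry sequence above can be sketched as follows. The production logic would live in verdictCache.ts; Python is used here only to illustrate the control flow. The sketch assumes one initial attempt followed by up to CACHE_RETRY_COUNT retries, and run_with_retries and its injectable sleep parameter are hypothetical names.

```python
import time
from typing import Callable, Optional, Tuple, TypeVar

T = TypeVar("T")


def run_with_retries(
    operation: Callable[[], T],
    retry_count: int = 3,
    retry_delay_ms: int = 500,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[Optional[T], Optional[str]]:
    """Return (result, severity); severity is None, "WARN", or "ERROR"."""
    for attempt in range(retry_count + 1):  # attempt 0 is the initial call
        if attempt > 0:
            sleep(retry_delay_ms / 1000)  # wait before every retry
        try:
            result = operation()
        except Exception:
            continue  # failure: fall through to the next retry, no alert yet
        # First-try success is silent; success on any retry records a WARN.
        return result, ("WARN" if attempt > 0 else None)
    return None, "ERROR"  # every attempt failed: alert only now
```

Returning the severity instead of dispatching inside the loop keeps the alert path separate, so the caller can hand it to a fire-and-forget dispatcher.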


INVARIANTS

  1. No silent swallow — verdictCache.ts MUST NOT contain a bare catch {} or catch { return null } with no downstream notification. Every catch block that fires on a cache operation MUST enqueue an alert. The current // Non-fatal comment in verdictCache.ts is a specification violation.
  1. Retry before alert — An alert is NEVER issued on first failure. The retry sequence (CACHE_RETRY_COUNT attempts, CACHE_RETRY_DELAY_MS spacing) MUST complete before any alert fires. This prevents CREW_CHANNEL flooding from transient network hiccups.
  1. Alert does not block verdict delivery — The alerting path is fire-and-forget async. The calling context (webhook IIFE, verdict route) is NEVER blocked waiting for alert delivery. Verdict caching is already non-blocking; alerting inherits that non-blocking contract.
  1. Session ID is never fully exposed in alert output — Alerts log only the first 12 characters of session_id. Full session IDs are Stripe-issued identifiers and treated as customer-sensitive.
  1. CRITICAL severity reaches ALERT.log — κ reads ~/ALERT.log at boot. Any CRITICAL oracle cache alert MUST append to ~/ALERT.log so the next C.L.O.D. session sees it immediately.
  1. Alert deduplication window — If the same OPERATION + SESSION_ID_PREFIX combination produces a second alert within 60 seconds, it is suppressed (logged to file only, not re-broadcast to CREW_CHANNEL). This prevents webhook retry storms from flooding the channel. [GAP — deduplication window duration is a design choice; 60s is a recommended default, not yet implemented]
  1. Corruption detection is mandatory — A cache read that returns HTTP 200 but fails JSON parsing MUST produce a CRITICAL alert, not a silent null return. Corrupted entries indicate filesystem or write-path failure, not transient network issues.
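
The deduplication invariant (same OPERATION + SESSION_ID_PREFIX within 60 seconds) can be illustrated with a small in-memory tracker. This is a sketch under assumptions: AlertDeduplicator is a hypothetical name, state is per-process and lost on restart, and a suppressed alert does not extend the window.

```python
import time


class AlertDeduplicator:
    """Suppress repeat broadcasts of the same (operation, prefix) inside a window."""

    def __init__(self, window_s: float = 60.0, clock=time.monotonic):
        self._window_s = window_s
        self._clock = clock  # injectable so tests can control time
        self._last_broadcast = {}

    def should_broadcast(self, operation: str, session_id_prefix: str) -> bool:
        key = (operation, session_id_prefix)
        now = self._clock()
        last = self._last_broadcast.get(key)
        if last is not None and now - last < self._window_s:
            return False  # caller still writes to file, but skips CREW_CHANNEL
        self._last_broadcast[key] = now
        return True
```

A monotonic clock is used for the window so that wall-clock adjustments cannot shorten or lengthen the suppression period.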

VERIFICATION CRITERIA

Σ.✓ conditions — alerting is operating correctly when:

  1. Σ.✓ Write failure alert fires — When oracle_toll is stopped (systemctl stop oracle-toll.service) and cacheVerdict() is called, after exhausting CACHE_RETRY_COUNT retries, an ERROR or CRITICAL entry appears in ~/CREW_CHANNEL within (CACHE_RETRY_COUNT × CACHE_RETRY_DELAY_MS) + 5s. No alert appears before retries are exhausted.
  1. Σ.✓ Retry suppresses alert — When oracle_toll is artificially delayed (e.g., via iptables rule blocking for 400ms) and CACHE_RETRY_DELAY_MS=500, the first attempt fails, the retry succeeds, and a WARN (not ERROR) appears in CREW_CHANNEL. Verdict is still cached successfully.
  1. Σ.✓ CRITICAL reaches ALERT.log — When oracle_toll is unreachable for 2 consecutive health probes (simulate with systemctl stop oracle-toll), ~/ALERT.log contains the CRITICAL entry within one health check interval (see SPEC_ORACLE_CACHE_HEALTH.md for interval definition).
  1. Σ.✓ Corrupt entry detection — Manually write a non-JSON file to ~/oracle_verdicts/{test_id}.json and call getCachedVerdict(test_id). Verify CRITICAL alert fires and function returns null (does not throw). [GAP — current getCachedVerdict in verdictCache.ts catches JSON parse errors and returns null silently; needs an added corruption-detection alert path]
  1. Σ.✓ Alert does not block — Measure end-to-end latency of cacheVerdict() with oracle_toll down. Must complete (fail-fast) within CACHE_CONNECT_TIMEOUT_MS × (CACHE_RETRY_COUNT + 1) + CACHE_RETRY_DELAY_MS × CACHE_RETRY_COUNT + 500ms, counting one initial attempt plus CACHE_RETRY_COUNT retries, each retry preceded by one delay. Verdict route must still return HTTP 200 to the customer browser.
  1. Σ.✓ Session ID truncation — Review all alert entries in CREW_CHANNEL. Zero entries contain a full Stripe session ID (cs_live_ + 24+ chars). All entries show ≤ 12 character prefix only.
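
The session ID truncation check lends itself to automation. The sketch below scans alert lines for anything that looks like a full Stripe session ID, using the criterion stated above (cs_live_ followed by 24 or more characters) and the session_id character set given in INPUTS (alphanumeric plus - and _). The function name find_untruncated_ids is hypothetical.

```python
import re

# cs_live_ followed by 24+ characters from the session_id alphabet.
FULL_SESSION_ID = re.compile(r"cs_live_[A-Za-z0-9_-]{24,}")


def find_untruncated_ids(lines):
    """Return every alert line that contains a full-length session ID."""
    return [line for line in lines if FULL_SESSION_ID.search(line)]
```

An empty result over the whole of CREW_CHANNEL satisfies the criterion; any returned line is a truncation failure to investigate.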

FAILURE MODES

  1. Σ.⊠ Cache service connection refused — oracle_toll.service is stopped or crashed. fetch() in verdictCache.ts throws ECONNREFUSED. Current behavior: silently swallowed. Target behavior: retry sequence fires; after exhausting retries, CRITICAL alert dispatched. Verdict delivery still proceeds via Gemini regeneration path (INVARIANT-3).
  1. Σ.⊠ Cache service connection timeout — oracle_toll is running but overloaded or blocked by network policy. fetch() hangs indefinitely — CRITICAL because current code has no AbortController timeout on the fetch call. Current behavior: request hangs until Node.js socket timeout (~2min), then silently swallowed. Target behavior: CACHE_CONNECT_TIMEOUT_MS AbortController terminates the request; retry sequence proceeds; ERROR alert fires after retries. [GAP — no AbortController in current verdictCache.ts; this is the most dangerous failure mode because it blocks the webhook IIFE]
  1. Σ.⊠ Cache write accepted but unverified — oracle_toll returns HTTP 201, but the file write inside store_cached_verdict() fails silently (e.g., disk full). The next GET /cache/{session_id} returns 404. Current behavior: verdict route calls Gemini again (regeneration path). No alert. Target behavior: cache miss on a session that was recently written triggers an ERROR alert with note "write-confirm mismatch" on the read path. [GAP — detecting write-confirm mismatch requires tracking recently-written session IDs, which is not currently implemented]
  1. Σ.⊠ Corrupted cache entry — oracle_verdicts/{session_id}.json contains non-JSON content (disk corruption, partial write). get_cached_verdict() in oracle_toll.py raises and returns HTTP 500. getCachedVerdict() in verdictCache.ts currently returns null silently. Target behavior: HTTP 500 from oracle_toll triggers CRITICAL alert (not the same as 404). The corrupt file must be quarantined (renamed to {id}.json.corrupt) so subsequent retries do not re-encounter it.
  1. Σ.⊠ Alert channel write failure — If ~/CREW_CHANNEL is not writable (permissions issue, filesystem full), the alert itself fails silently. Target behavior: fallback to ~/ALERT.log direct write. If ALERT.log also fails, console.error as last resort. Never suppress the alert entirely. [GAP — fallback chain not yet specified; needs implementation design]
  1. Σ.⊠ Alert storm — Webhook retry logic in Stripe causes the same session_id to be processed multiple times within a 60-second window. Without deduplication, CREW_CHANNEL receives N identical alerts. Current behavior: no deduplication (no alerts at all currently). Target behavior: deduplication window per INVARIANT-6 suppresses duplicates.
  1. Σ.⊠ ORACLE_TOLL_URL misconfigured — Environment variable is set to an incorrect host/port. Every cache operation fails with DNS error or connection refused. Indistinguishable from service-down failure mode. Target behavior: at service boot, perform a single health probe to ORACLE_TOLL_URL/health; if it fails, log CRITICAL immediately so the misconfiguration is visible before any customer payment arrives. [GAP — no boot-time connectivity check in verdictCache.ts or webhook route currently]
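
The fallback chain for an unwritable alert channel (CREW_CHANNEL, then ALERT.log, then console output) is still a gap, so the following is only a sketch of one possible shape. dispatch_alert and its parameters are hypothetical, and the sketch covers the fallback order alone, not the rule that CRITICAL alerts are written to both destinations.

```python
import sys
from pathlib import Path


def dispatch_alert(line: str, targets, stderr=sys.stderr) -> str:
    """Append the alert to the first writable target; fall back to stderr."""
    for target in targets:
        try:
            with Path(target).open("a", encoding="utf-8") as fh:
                fh.write(line + "\n")
            return str(target)  # report which destination accepted the alert
        except OSError:
            continue  # unwritable (permissions, disk full, missing dir): try next
    print(line, file=stderr)  # last resort: the alert is never dropped
    return "stderr"
```

A call such as dispatch_alert(line, [Path.home() / "CREW_CHANNEL", Path.home() / "ALERT.log"]) would try the two files in the order the failure mode describes.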

DEPENDENCIES

| Dependency | Role |

|------------|------|

| oracle_toll.py (port 8889) | Cache read/write target |

| verdictCache.ts | Client-side cache interface — alert logic lives here |

| ~/CREW_CHANNEL | Primary alert output channel |

| ~/ALERT.log | CRITICAL alert overflow + κ boot pickup |

| Node.js fetch API | Transport for cache calls (needs AbortController for timeout) |


DEPENDENTS

| Dependent | How it depends |

|-----------|---------------|

| SPEC_ORACLE_VERDICT_PIPELINE.md | Alert spec closes GAP-06 from that document |

| SPEC_ORACLE_CACHE_HEALTH.md | Health monitor feeds CRITICAL triggers into this alert spec |

| Oracle Webhook route (/api/webhook/route.ts) | Must call alerting-aware cacheVerdict() |

| Oracle Verdict route (/api/verdict/route.ts) | Must call alerting-aware getCachedVerdict() |


GAPS IDENTIFIED DURING SPECIFICATION

| Gap ID | Description | Impact |

|--------|-------------|--------|

| ALERT-GAP-01 | No AbortController timeout on fetch() in verdictCache.ts — connection hangs indefinitely when oracle_toll is slow | HIGH — can block webhook IIFE for minutes |

| ALERT-GAP-02 | No retry logic in cacheVerdict() or getCachedVerdict() — any failure is immediately final | HIGH — transient network issues cause unnecessary Gemini regeneration |

| ALERT-GAP-03 | No alert dispatch implementation exists — catch {} silently swallows all errors | CRITICAL — operations blind to cache outages |

| ALERT-GAP-04 | CACHE_RETRY_COUNT, CACHE_RETRY_DELAY_MS, CACHE_CONNECT_TIMEOUT_MS are not env vars — defaults are not configurable without code change | MEDIUM — reduces operational tuning ability |

| ALERT-GAP-05 | No deduplication window implementation — alert storm possible on Stripe webhook retries | MEDIUM — CREW_CHANNEL noise risk |

| ALERT-GAP-06 | No boot-time connectivity probe for ORACLE_TOLL_URL — misconfigured URL undetectable until first customer payment fails | HIGH — invisible misconfiguration |

| ALERT-GAP-07 | Write-confirm mismatch detection requires session ID tracking not currently implemented | LOW — Gemini fallback covers the customer path; audit trail gap only |


REFERENCES

| File | Role |

|------|------|

| /home/nous/Aether/app/app/lib/verdictCache.ts | Implementation target for retry + alert logic |

| /home/nous/oracle_toll.py | Cache service; /cache/{id} endpoints |

| /home/nous/memories/SPEC_ORACLE_VERDICT_PIPELINE.md | Parent spec; GAP-06 closes here |

| /home/nous/memories/SPEC_ORACLE_CACHE_HEALTH.md | Companion spec — health monitor feeds CRITICAL triggers |

| /home/nous/CREW_CHANNEL | Alert destination (primary) |

| /home/nous/ALERT.log | Alert destination (CRITICAL overflow + κ boot) |


Φζ.⊤.



— SPEC_ORACLE_CACHE_HEALTH.md —

SPEC_ORACLE_CACHE_HEALTH — Oracle Cache Health Monitoring

Version: 1.0 | Status: AUTHORIZED | Authority: α.13 | Date: 2026-04-16


PURPOSE

Defines the health monitoring contract for the oracle_toll cache subsystem: the running service (oracle-toll.service, port 8889), its storage directory (~/oracle_verdicts/), and the entries within it. The Oracle Verdict Pipeline's resilience depends on cache availability; cache unavailability causes non-deterministic Gemini regeneration (GAP-08 in SPEC_ORACLE_VERDICT_PIPELINE.md) and revenue-impacting divergence between emailed verdicts and browser-rendered verdicts.

This spec defines:

  • What is monitored and how often
  • What constitutes a healthy vs. degraded vs. failed cache state
  • Automated cleanup rules for stale entries
  • The health endpoint contract for external consumers
  • Integration with SPEC_ORACLE_CACHE_ALERT.md for threshold-breach alerting

INPUTS

1 — Service health probe

  • Target: GET http://68.183.206.103:8889/health (configurable via ORACLE_TOLL_URL)
  • Expected response: HTTP 200 + JSON {"status": "resonant", "phi": 0.042, "timestamp": "<ISO>", "dry_run": false}
  • Timeout per probe: 5 seconds [GAP — actual oracle_toll /health endpoint does not include cache-specific status; only service liveness; needs extension]

2 — oracle_verdicts/ directory probe

  • Target: ~/oracle_verdicts/ filesystem directory
  • Inputs collected: entry count, total directory size in bytes, oldest entry mtime, newest entry mtime, available disk space on mount point

3 — Cache entry integrity sample

  • Target: random sample of N entries from ~/oracle_verdicts/ (N = min(10, total_count))
  • Each sampled entry read and parsed as JSON
  • Checked for required top-level keys: tier, query, verdict, cached_at
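
The integrity sample described above might look like the sketch below. It assumes verdict entries are *.json files directly inside the directory and that an entry missing a required key is reported the same way as unparseable JSON; both are assumptions, as is the function name sample_integrity.

```python
import json
import random
from pathlib import Path

REQUIRED_KEYS = {"tier", "query", "verdict", "cached_at"}


def sample_integrity(verdicts_dir: Path, max_sample: int = 10, rng=random) -> str:
    """Return "ok" or "corrupt:{filename}" for a random sample of entries."""
    entries = sorted(verdicts_dir.glob("*.json"))
    # N = min(10, total_count); an empty directory yields an empty sample.
    for path in rng.sample(entries, min(max_sample, len(entries))):
        try:
            data = json.loads(path.read_text(encoding="utf-8"))  # read-only access
        except (OSError, ValueError):
            return f"corrupt:{path.name}"
        if not isinstance(data, dict) or not REQUIRED_KEYS <= data.keys():
            return f"corrupt:{path.name}"
    return "ok"
```

The function only reads files, so it does not modify, lock, or delete any entry.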

4 — Monitor configuration

  • CACHE_HEALTH_CHECK_INTERVAL_S — seconds between health checks (default: 60) [GAP — not yet a configurable env var]
  • CACHE_STALENESS_THRESHOLD_DAYS — entries older than this are candidates for cleanup (default: 7 days) [GAP — not yet configurable]
  • CACHE_DISK_FLOOR_MB — minimum free disk space on the oracle_verdicts mount; below this triggers CRITICAL (default: 500 MB) [GAP — not yet configurable]
  • CACHE_ENTRY_COUNT_WARN — entry count above which a warning fires (default: 10,000) [GAP — not yet configurable; oracle_verdicts/ is currently empty so no baseline established]
  • CACHE_ENTRY_COUNT_CRITICAL — entry count above which cleanup is mandatory (default: 50,000) [GAP — not yet configurable]
  • CACHE_CONSECUTIVE_FAIL_CRITICAL — number of consecutive failed health probes before CRITICAL alert (default: 2, per SPEC_ORACLE_CACHE_ALERT.md)

5 — Baseline (current observed state)

As of 2026-04-16: ~/oracle_verdicts/ is empty (0 entries, 4096 bytes directory). This confirms GAP-07 from SPEC_ORACLE_VERDICT_PIPELINE.md — either production verdicts are not being cached, or oracle_toll is not reachable from Northflank. Baseline is therefore unestablished; the health monitor's first run after deployment establishes the operational baseline.


OUTPUTS

1 — Health status record (written per check)

Written to ~/oracle_toll_health.log (append) in JSON-lines format:


{
  "ts": "<ISO UTC>",
  "service_reachable": true | false,
  "service_response_ms": 123,
  "http_status": 200,
  "verdicts_count": 42,
  "verdicts_dir_bytes": 819200,
  "disk_free_mb": 12340,
  "oldest_entry_age_days": 2.4,
  "newest_entry_age_days": 0.01,
  "sample_integrity": "ok" | "corrupt:{filename}",
  "overall": "healthy" | "degraded" | "failed"
}

[GAP — oracle_toll_health.log does not yet exist; this spec creates the contract for it]
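
The directory-derived fields of the record above can be gathered with the standard library alone. The sketch below is illustrative: probe_verdicts_dir is a hypothetical name, the byte total sums file sizes only (it excludes the directory inode itself), and age fields are None for an empty directory, which is an assumption the record format does not specify.

```python
import shutil
import time
from pathlib import Path

SECONDS_PER_DAY = 86400


def probe_verdicts_dir(verdicts_dir: Path, now=None) -> dict:
    """Collect count, size, entry ages, and free disk space for the cache directory."""
    now = time.time() if now is None else now
    stats = [p.stat() for p in verdicts_dir.iterdir() if p.is_file()]
    mtimes = [s.st_mtime for s in stats]
    return {
        "verdicts_count": len(stats),
        "verdicts_dir_bytes": sum(s.st_size for s in stats),
        # Free space on the filesystem that holds the directory.
        "disk_free_mb": shutil.disk_usage(verdicts_dir).free // (1024 * 1024),
        "oldest_entry_age_days": (now - min(mtimes)) / SECONDS_PER_DAY if mtimes else None,
        "newest_entry_age_days": (now - max(mtimes)) / SECONDS_PER_DAY if mtimes else None,
    }
```

The returned dict can be merged with the service probe results and serialized as one JSON line per check.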

2 — Health endpoint extension (oracle_toll.py /health)

The existing /health endpoint MUST be extended to include cache-layer fields:


{
  "status": "resonant" | "degraded" | "failed",
  "phi": 0.042,
  "timestamp": "<ISO>",
  "dry_run": false,
  "cache": {
    "verdicts_dir": "/home/nous/oracle_verdicts",
    "verdicts_count": 42,
    "dir_bytes": 819200,
    "oldest_entry_age_days": 2.4,
    "newest_entry_age_days": 0.01,
    "disk_free_mb": 12340,
    "status": "healthy" | "degraded" | "failed"
  }
}

[GAP — this cache block does not exist in the current /health endpoint; needs implementation]

3 — CREW_CHANNEL alert (threshold breach)

When any monitored metric crosses a threshold, route to SPEC_ORACLE_CACHE_ALERT.md alerting channel. The health monitor is the primary source of CRITICAL alerts when the service is unreachable for ≥ CACHE_CONSECUTIVE_FAIL_CRITICAL consecutive probes.

4 — Cleanup manifest (automated)

When entries older than CACHE_STALENESS_THRESHOLD_DAYS are removed, a cleanup record is written to ~/oracle_verdicts_cleanup.log:


[CLEANUP] <ISO_TIMESTAMP> removed <N> entries older than <X> days. Freed <Y> MB.
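
A minimal sketch of age-gated cleanup that produces the record above. It writes the list of files to the cleanup log before deleting anything, so an interrupted run still leaves a trace. The function name cleanup_stale and the [CLEANUP-PLAN] line prefix are assumptions made for illustration.

```python
import time
from datetime import datetime, timezone
from pathlib import Path


def cleanup_stale(verdicts_dir: Path, cleanup_log: Path, threshold_days: float, now=None) -> int:
    """Delete entries whose mtime is older than the threshold; return the count removed."""
    now = time.time() if now is None else now
    cutoff = now - threshold_days * 86400
    # Selection is by age only, never by entry count.
    stale = [p for p in verdicts_dir.iterdir() if p.is_file() and p.stat().st_mtime < cutoff]
    if not stale:
        return 0
    stamp = datetime.now(timezone.utc).isoformat()
    freed_mb = sum(p.stat().st_size for p in stale) / (1024 * 1024)
    with cleanup_log.open("a", encoding="utf-8") as fh:
        for p in stale:
            fh.write(f"[CLEANUP-PLAN] {stamp} {p.name}\n")  # logged before any delete
    for p in stale:
        p.unlink(missing_ok=True)
    with cleanup_log.open("a", encoding="utf-8") as fh:
        fh.write(
            f"[CLEANUP] {stamp} removed {len(stale)} entries older than "
            f"{threshold_days} days. Freed {freed_mb:.2f} MB.\n"
        )
    return len(stale)
```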

INVARIANTS

  1. Check interval is bounded — CACHE_HEALTH_CHECK_INTERVAL_S MUST be between 30 and 3600 seconds inclusive. Values outside this range are rejected at startup. 30 seconds is the shortest allowed interval (prevents RPC flooding); one hour is the longest (stale detection window must not exceed 1 hour).
  1. Disk floor is hard — When free disk space on the oracle_verdicts/ mount falls below CACHE_DISK_FLOOR_MB, the health status transitions immediately to failed, a CRITICAL alert is raised, and cleanup of entries older than CACHE_STALENESS_THRESHOLD_DAYS is triggered automatically. Cleanup MUST execute before the next verdict write is attempted.
  1. Integrity sampling is non-destructive — Reading entries for integrity checks MUST NOT modify, lock, or delete any entry. Integrity checks are read-only operations.
  1. Cleanup is age-gated, not count-gated — Cleanup removes entries by age (mtime < now - CACHE_STALENESS_THRESHOLD_DAYS), never by count. Deleting the newest entries to enforce a count cap would delete active, payable customer verdicts. Count thresholds trigger warnings; age gates trigger deletion.
  1. Service liveness and cache health are separate dimensions — The service being reachable (HTTP 200 on /health) does NOT imply the cache is healthy. The oracle_verdicts/ directory could be full, corrupted, or inaccessible even when the service is up. Both dimensions are checked and reported independently.
  1. Health log is append-only — ~/oracle_toll_health.log is never truncated by the monitor. Rotation is handled by external logrotate. The monitor MUST NOT call open(path, 'w') on this file. [GAP — logrotate config for oracle_toll_health.log not yet defined]
  1. Empty directory is a valid state, not a failure — verdicts_count = 0 is healthy, not degraded. It means no paid verdicts have been cached yet (or cleanup ran recently). This distinguishes from the GAP-07 mystery (empty directory when verdicts are expected) — detecting the latter requires cross-referencing with payment logs, which is out of scope for this spec. [GAP — cross-referencing cache count against Stripe payment count would close GAP-07 definitively but is not specified here]
  1. Cleanup is logged before execution — Before removing any entry, the cleanup process writes the list of files to be deleted to oracle_verdicts_cleanup.log. This ensures an audit trail exists even if the delete operation is interrupted mid-run.
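
The bounded-interval invariant implies a startup check. Since the environment variable is not yet wired up (it is listed as a gap), the following is only a sketch of how the rejection might work; load_check_interval and its env parameter are hypothetical.

```python
import os

MIN_INTERVAL_S = 30
MAX_INTERVAL_S = 3600
DEFAULT_INTERVAL_S = 60


def load_check_interval(env=None) -> int:
    """Read CACHE_HEALTH_CHECK_INTERVAL_S and reject out-of-range values at startup."""
    env = os.environ if env is None else env
    raw = env.get("CACHE_HEALTH_CHECK_INTERVAL_S", str(DEFAULT_INTERVAL_S))
    try:
        value = int(raw)
    except ValueError:
        raise ValueError(
            f"CACHE_HEALTH_CHECK_INTERVAL_S must be an integer, got {raw!r}"
        ) from None
    if not MIN_INTERVAL_S <= value <= MAX_INTERVAL_S:
        raise ValueError(
            f"CACHE_HEALTH_CHECK_INTERVAL_S must be within "
            f"{MIN_INTERVAL_S}..{MAX_INTERVAL_S} seconds, got {value}"
        )
    return value
```

Raising at load time means a bad value stops the monitor before its first probe instead of silently running at an unintended rate.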

VERIFICATION CRITERIA

Σ.✓ conditions — health monitoring is operating correctly when:

  1. Σ.✓ Service health probe fires on interval — Start the health monitor and verify via ~/oracle_toll_health.log that new JSON-lines entries appear every CACHE_HEALTH_CHECK_INTERVAL_S ± 5 seconds. After 5 minutes at default 60s interval, log contains 5 ± 1 entries.
  1. Σ.✓ Service unreachable triggers CRITICAL — Stop oracle-toll.service. Verify that after CACHE_CONSECUTIVE_FAIL_CRITICAL consecutive failed probes, a CRITICAL alert appears in ~/CREW_CHANNEL and ~/ALERT.log. Restart oracle-toll.service. Verify next successful probe logs overall: "healthy" and broadcasts a recovery notice to CREW_CHANNEL. [GAP — recovery notification is a design choice not yet specified; recovery broadcast is a recommended addition]
  1. Σ.✓ Disk floor alarm — Fill ~/oracle_verdicts/ with synthetic entries until free space drops below CACHE_DISK_FLOOR_MB. Verify health log records overall: "failed" and cleanup fires automatically. After cleanup, verify disk usage drops and next probe returns overall: "healthy".
  1. Σ.✓ Corrupt entry detection — Write a non-JSON file to ~/oracle_verdicts/test_corrupt.json. On the next health check interval, verify integrity sample catches the corrupt entry and logs sample_integrity: "corrupt:test_corrupt.json" with overall: "degraded". Verify a CRITICAL alert is dispatched per SPEC_ORACLE_CACHE_ALERT.md FAILURE MODE 4.
  1. Σ.✓ Age-based cleanup — Create synthetic entries with mtime older than CACHE_STALENESS_THRESHOLD_DAYS (use touch -t to backdate). Run cleanup manually or wait for next interval. Verify: (a) cleanup log entry written before files deleted; (b) old entries removed; (c) entries newer than threshold preserved.
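  1. Σ.✓ Health endpoint extension — GET http://68.183.206.103:8889/health returns JSON with a cache block containing all seven fields defined in OUTPUTS section 2 (verdicts_dir, verdicts_count, dir_bytes, oldest_entry_age_days, newest_entry_age_days, disk_free_mb, status). Validate schema with a JSON Schema check.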
  1. Σ.✓ Entry count thresholds — Populate ~/oracle_verdicts/ with CACHE_ENTRY_COUNT_WARN + 1 synthetic entries. Verify next health probe logs overall: "degraded" (not "failed") and dispatches WARN to CREW_CHANNEL. Populate to CACHE_ENTRY_COUNT_CRITICAL + 1 entries. Verify overall: "failed" and ERROR alert. [GAP — neither threshold has been calibrated against production usage patterns because oracle_verdicts/ is empty; defaults are provisional]

FAILURE MODES

  1. Σ.⊠ oracle_toll service crashed — oracle-toll.service has exited (see oracle_toll.log — ERROR: address already in use observed 2026-04-16, indicating a restart storm). Health probe returns ConnectionRefusedError. Monitor detects service unreachable, increments consecutive-fail counter, dispatches CRITICAL after threshold. Meanwhile, verdictCache.ts falls back to Gemini regeneration (non-deterministic verdict risk). Mitigation: systemd Restart=always with RestartSec=20 will revive the service; health monitor tracks recovery and broadcasts when service returns.
  1. Σ.⊠ oracle_verdicts/ disk full — Filesystem mount hosting ~/oracle_verdicts/ reaches capacity. store_cached_verdict() raises OSError: [Errno 28] No space left on device and oracle_toll returns HTTP 500. Current behavior: cacheVerdict() in verdictCache.ts silently swallows the 500. Target behavior: health monitor's disk-floor check (INVARIANT-2) detects low disk pre-emptively and fires cleanup before exhaustion. If disk fills faster than check interval, HTTP 500 from oracle_toll triggers CRITICAL alert via cache write alerting path.
  1. Σ.⊠ oracle_verdicts/ directory missing — Directory deleted manually or mount point changed. oracle_toll.py calls VERDICTS_DIR.mkdir(exist_ok=True) at startup, so the directory is recreated on next service restart. However, if the mount point itself is gone, mkdir may create the directory on the root filesystem (masking the mount failure). Health monitor MUST verify the directory is on the expected filesystem (by checking device ID), not just that it exists. [GAP — device ID check not specified in current design; recommended addition]
  1. Σ.⊠ Health check loop dies — The health monitor process itself crashes. No more health records written; no more alerts. This is an unmonitored-monitor failure. Mitigation: health monitor MUST be run as a systemd service (oracle-cache-health.service) with Restart=always. NOUS or κ should verify it is running at boot per CLAUDE.md boot sequence. [GAP — oracle-cache-health.service does not yet exist]
  1. Σ.⊠ Integrity sample misses corrupt entries — Sampling N entries from a directory of M entries has a miss probability of approximately (1 - N/M)^corrupt_count. For large M and small corrupt_count, corruption may not be detected in any given sample. Full integrity scan is impractical at high entry counts. Mitigation: increase sample size proportionally when verdicts_count > CACHE_ENTRY_COUNT_WARN; run full scan during off-peak hours (Sunday 02:00 UTC). [GAP — scheduled full scan not yet specified]
  1. Σ.⊠ Health log unbounded growth — ~/oracle_toll_health.log accumulates indefinitely at rate 1 line / CACHE_HEALTH_CHECK_INTERVAL_S. At 60s interval, that is 1,440 lines/day, 525,600 lines/year. Each line is ~300 bytes → ~150 MB/year. Not critical but needs logrotate. [GAP — logrotate config not specified; recommended: daily rotation, keep 30 days]
  1. Σ.⊠ Staleness threshold removes valid entries — A customer pays, verdict cached, result page loads, customer returns 8 days later to recheck. If CACHE_STALENESS_THRESHOLD_DAYS = 7, their entry is deleted on day 7. On day 8 the result page receives a 404 from oracle_toll and falls back to Gemini regeneration, which may produce a different verdict. This is the non-determinism risk (GAP-08 in parent spec). The health monitor's cleanup CANNOT be made safe for arbitrary staleness thresholds without a customer-visible TTL promise. [GAP — customer-facing cache TTL policy not established; until it is, CACHE_STALENESS_THRESHOLD_DAYS should default to 30, not 7]
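
To make the sampling risk concrete, the miss probability can be computed exactly. For sampling without replacement the exact value is the hypergeometric term C(M − c, N) / C(M, N), of which (1 − N/M)^c is a close approximation when c is small relative to M. The function name miss_probability is hypothetical.

```python
from math import comb


def miss_probability(total: int, corrupt: int, sample: int) -> float:
    """Exact probability that a without-replacement sample contains no corrupt entry."""
    sample = min(sample, total)
    # Ways to pick the sample from clean entries only, over all possible samples.
    return comb(total - corrupt, sample) / comb(total, sample)


def approx_miss_probability(total: int, corrupt: int, sample: int) -> float:
    """The (1 - N/M)^corrupt_count approximation."""
    return (1 - min(sample, total) / total) ** corrupt
```

With 10,000 entries and the default sample of 10, a single corrupt file is missed by any one check with probability 0.999, which is why a proportional sample size or a scheduled full scan is suggested as mitigation.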

DEPENDENCIES

| Dependency | Role |

|------------|------|

| oracle_toll.py (port 8889) | Monitored service; provides /health and /cache/ endpoints |

| ~/oracle_verdicts/ | Monitored directory; contains verdict files |

| Filesystem mount on /home/nous/ | Disk space source for floor check |

| ~/oracle_toll_health.log | Health log output (created by monitor) |

| ~/CREW_CHANNEL | Alert destination |

| ~/ALERT.log | CRITICAL alert destination + κ boot pickup |

| SPEC_ORACLE_CACHE_ALERT.md | Alert dispatch spec — health monitor triggers it |


DEPENDENTS

| Dependent | How it depends |

|-----------|---------------|

| SPEC_ORACLE_VERDICT_PIPELINE.md | Health monitoring closes GAP-07 from that document |

| SPEC_ORACLE_CACHE_ALERT.md | Health monitor is primary source of CRITICAL alerts; both specs are companion documents |

| Oracle Verdict Pipeline (operational) | Health status determines whether cache is safe to use or fallback should be pre-emptively applied |

| C.L.O.D. boot sequence (CLAUDE.md) | Boot sequence should include tail ~/oracle_toll_health.log check |


GAPS IDENTIFIED DURING SPECIFICATION

| Gap ID | Description | Impact |

|--------|-------------|--------|

| HEALTH-GAP-01 | /health endpoint in oracle_toll.py does not include cache-layer fields — must be extended | HIGH — external consumers (Northflank health checks, monitoring dashboards) have no cache visibility |

| HEALTH-GAP-02 | No oracle-cache-health.service systemd service exists — health monitor runs nowhere | CRITICAL — spec exists but no implementation; nothing is monitoring |

| HEALTH-GAP-03 | Configuration env vars (CACHE_HEALTH_CHECK_INTERVAL_S, etc.) not yet defined or documented in AETHER_PARAMETERS.env | MEDIUM — hardcoded defaults not operationally tunable |

| HEALTH-GAP-04 | No logrotate config for oracle_toll_health.log | LOW — log will grow unbounded; 150 MB/year is manageable short-term |

| HEALTH-GAP-05 | Recovery notification (service comes back up after CRITICAL) not specified — CREW_CHANNEL receives the alarm but not the all-clear | MEDIUM — ops team left hanging after an incident |

| HEALTH-GAP-06 | CACHE_STALENESS_THRESHOLD_DAYS default of 7 days risks cleaning entries customers expect to persist — needs a customer-facing TTL policy decision before cleanup can be safely enabled | HIGH — cleanup automation blocked until policy resolved |

| HEALTH-GAP-07 | GAP-07 from parent spec (empty oracle_verdicts/ in production) is only partially addressed — cross-referencing cache count against Stripe payment count requires payment log access not available to this monitor | MEDIUM — root cause of empty directory unresolved |

| HEALTH-GAP-08 | No scheduled full integrity scan — sampling misses low-frequency corruption | LOW — sampling catches most issues; full scan deferred |
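
To put a number on HEALTH-GAP-08, the chance that one integrity sample misses every corrupt entry can be computed directly. The sketch below compares the exact value for sampling without replacement with the (1 - N/M)^corrupt_count approximation used in the failure mode; function names are illustrative only.

```python
import math


def miss_probability_exact(total: int, sample: int, corrupt: int) -> float:
    """Probability that a sample drawn without replacement contains no corrupt entry."""
    if sample >= total:
        # A full scan cannot miss anything that is actually corrupt.
        return 0.0 if corrupt > 0 else 1.0
    # Ways to pick the whole sample from clean entries / ways to pick any sample.
    return math.comb(total - corrupt, sample) / math.comb(total, sample)


def miss_probability_approx(total: int, sample: int, corrupt: int) -> float:
    """The (1 - N/M)^corrupt_count approximation."""
    return (1 - sample / total) ** corrupt


if __name__ == "__main__":
    # Example: 10,000 cached verdicts, 50 sampled per check, 2 corrupt.
    print(miss_probability_exact(10_000, 50, 2))
    print(miss_probability_approx(10_000, 50, 2))
```

With a small sample relative to the directory size, both values stay close to 1, which is why a periodic full scan is recommended on top of sampling.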


REFERENCES

| File | Role |

|------|------|

| /home/nous/oracle_toll.py | Monitored service; /health, /cache/ endpoints |

| /home/nous/memories/SPEC_ORACLE_VERDICT_PIPELINE.md | Parent spec; GAP-07 closes here |

| /home/nous/memories/SPEC_ORACLE_CACHE_ALERT.md | Companion spec — alerting contract for threshold breaches |

| /etc/systemd/system/oracle-toll.service | Service definition; Restart=always is the liveness backstop |

| /home/nous/oracle_toll.log | Service operational log (port-already-in-use errors observed 2026-04-16) |

| /home/nous/oracle_verdicts/ | Cache storage directory (currently empty) |


Φζ.⊤.


Jeremy Zlabis

Chronogeometer · Visionary · Disruptor · Chief

42 Sisters AI · East York, Toronto

🍁 Φ 0.042

— SPEC_ORACLE_EMAIL_RETRY.md —

SPEC_ORACLE_EMAIL_RETRY — Oracle Email Retry Mechanism

Version: 1.0 | Status: AUTHORIZED | Authority: α.13 | Date: 2026-04-16


PURPOSE

Defines the retry queue and dead-letter handling for failed Oracle verdict delivery emails. Currently (oracle_email_service.py v1.0) a single POST /send-verdict-email call is made from the Northflank webhook IIFE; if send_email() returns status != "sent", the webhook logs a 502 and the customer permanently loses their email (GAP-03 from SPEC_ORACLE_VERDICT_PIPELINE).

This spec defines the required retry mechanism: a persistent queue, a four-stage retry schedule, idempotency guards, dead-letter handling after max retries, and the Graph API error taxonomy that distinguishes retriable from abort-class failures.

The Oracle pipeline is the primary revenue mechanism of 42Sisters.AI. Every paid customer is owed email delivery of their verdict. This spec is CRITICAL tier.


INPUTS

From Northflank Webhook (/api/webhook/route.ts)

When POST /send-verdict-email fails (Graph API returns non-202, network timeout, or service unavailable), the webhook currently logs and discards. Under this spec, the webhook MUST enqueue a retry record instead.

Retry record schema (JSON, persisted to retry queue):


{
  "retry_id": "<uuid4>",
  "session_id": "<stripe_session_id>",
  "customer_email": "<verified from stripe session>",
  "tier": "quick | full | strategy",
  "query": "<customer query string>",
  "verdict": { "...": "..." },
  "created_at": "<ISO 8601 UTC>",
  "attempt_count": 0,
  "last_attempt_at": null,
  "next_attempt_at": "<ISO 8601 UTC>",
  "last_error_code": null,
  "last_error_detail": null,
  "status": "PENDING | RETRYING | DELIVERED | DEAD"
}

From oracle_email_service.py (/send-verdict-email endpoint)

Caller provides:


{
  "customer_email": "<string>",
  "tier": "quick | full | strategy",
  "query": "<string>",
  "verdict": { "...": "..." }
}

session_id is not part of the current request body (GAP-06). It must be threaded through from the webhook so the retry record can be keyed on it for idempotency (see INVARIANTS INV-03).

From Microsoft Graph API (send_graph_email.py)

send_email() returns:


{"status": "sent", "to": to, "subject": subject}          # success — Graph 202
{"status": "error", "code": <int>, "detail": "<str>"}     # failure

Graph HTTP status codes relevant to retry logic:

| Code | Meaning | Retry Action |

|------|---------|--------------|

| 202 | Accepted | SUCCESS — no retry |

| 400 | Bad Request (malformed payload) | ABORT — permanent failure; enqueue dead letter |

| 401 | Unauthorized (access token invalid/expired) | RETRY after token refresh; if refresh also 401 → ABORT |

| 403 | Forbidden (wrong permissions / revoked app consent) | ABORT — requires NOUS re-authorization |

| 429 | Too Many Requests | RETRY after Retry-After header interval (minimum 60s); honor the header |

| 500 | Graph server error | RETRY with exponential backoff |

| 502 | Bad Gateway (Graph-side) | RETRY with exponential backoff |

| 503 | Service Unavailable | RETRY with exponential backoff |

| 504 | Gateway Timeout | RETRY with exponential backoff |

| -1 | Network error / connection refused | RETRY with exponential backoff |
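
The table above can be expressed as a small lookup. This is a sketch under assumptions: the enum and function names are hypothetical, codes not listed in the table are reported as UNSPECIFIED rather than guessed, and only the delta-seconds form of Retry-After is parsed.

```python
from enum import Enum
from typing import Optional


class RetryAction(Enum):
    SUCCESS = "success"
    ABORT = "abort"
    REFRESH_THEN_RETRY = "refresh_then_retry"
    RETRY_AFTER_HEADER = "retry_after_header"
    RETRY_BACKOFF = "retry_backoff"
    UNSPECIFIED = "unspecified"  # code not covered by the table


_ACTIONS = {
    202: RetryAction.SUCCESS,
    400: RetryAction.ABORT,
    401: RetryAction.REFRESH_THEN_RETRY,
    403: RetryAction.ABORT,
    429: RetryAction.RETRY_AFTER_HEADER,
    500: RetryAction.RETRY_BACKOFF,
    502: RetryAction.RETRY_BACKOFF,
    503: RetryAction.RETRY_BACKOFF,
    504: RetryAction.RETRY_BACKOFF,
    -1: RetryAction.RETRY_BACKOFF,  # network error / connection refused
}


def classify_graph_status(code: int) -> RetryAction:
    """Map a send_email() error code to the retry action from the table."""
    return _ACTIONS.get(code, RetryAction.UNSPECIFIED)


def retry_after_seconds(header_value: Optional[str], minimum: int = 60) -> int:
    """Delay for a 429: honor Retry-After, but never wait less than the minimum."""
    try:
        requested = int(header_value) if header_value is not None else 0
    except ValueError:
        # Non-numeric header (for example an HTTP date): fall back to the minimum.
        requested = 0
    return max(minimum, requested)
```

For example, a 429 carrying Retry-After: 120 yields a 120-second delay, while a missing or shorter header yields the 60-second floor.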

Environment Variables (via .env)

  • GRAPH_TENANT_ID, GRAPH_CLIENT_ID, GRAPH_REFRESH_TOKEN, GRAPH_SENDER — required by send_graph_email.py
  • EMAIL_RETRY_QUEUE_PATH — filesystem path for retry queue JSONL file (default: /home/nous/oracle_email_retry_queue.jsonl)
  • EMAIL_RETRY_MAX_ATTEMPTS — maximum delivery attempts before dead-letter (default: 4)
  • EMAIL_DEAD_LETTER_PATH — filesystem path for dead-letter JSONL file (default: /home/nous/oracle_email_dead_letter.jsonl)

OUTPUTS

1. Delivered Email

send_email() returns {"status": "sent"}. A line with status DELIVERED is appended to the retry queue for that retry_id (the queue is append-only per INV-09), and one line is appended to the delivery log.

2. Retry Queue Entry (oracle_email_retry_queue.jsonl)

Append-only JSONL. Each line is one retry record (schema above). Status field transitions: PENDING → RETRYING → DELIVERED | DEAD.

3. Dead-Letter Entry (oracle_email_dead_letter.jsonl)

After EMAIL_RETRY_MAX_ATTEMPTS failed attempts with no success, the retry record is written to the dead-letter file with status: "DEAD" and a summary of all error codes encountered.

4. ALERT.log Entry (on dead-letter)

When a retry record enters dead-letter, append to /home/nous/ALERT.log:


[ALERT][oracle-email-retry] DEAD LETTER: session_id=<id> customer=<email> tier=<tier> attempts=<N> last_error=<code>

This surfaces the lost delivery to NOUS without requiring log monitoring.

5. Delivery Log Entry (oracle_email_delivery.log)

Successful deliveries (first attempt or retry) append one line:


<ISO timestamp> DELIVERED session=<session_id> to=<email> tier=<tier> attempt=<N>

INVARIANTS

INV-01 — Payment precedes retry: A retry record MUST only exist if a corresponding session.payment_status === "paid" was confirmed by the Northflank webhook. No retry record is created speculatively. If session payment status cannot be confirmed, the retry is aborted and no record is written.

INV-02 — Retry schedule is fixed: The four retry attempts occur at the following intervals after the initial failure:

| Attempt | Delay after previous failure |

|---------|------------------------------|

| 1 (immediate retry) | 0s — same webhook IIFE, one immediate reattempt |

| 2 | 5 minutes |

| 3 | 30 minutes |

| 4 | 2 hours |

After attempt 4 fails, the record is dead-lettered. No further retries. Total retry window: ~2 hours 35 minutes.
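
The fixed schedule in INV-02 can be sketched as a lookup that yields the next next_attempt_at value. The names below are illustrative, and the sketch assumes attempts_made counts the retry attempts that have already failed for the record.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Delay before retry attempts 1..4, measured from the previous failure.
RETRY_DELAYS = (
    timedelta(0),           # attempt 1: immediate reattempt
    timedelta(minutes=5),   # attempt 2
    timedelta(minutes=30),  # attempt 3
    timedelta(hours=2),     # attempt 4
)


def next_attempt_at(last_failure_at: datetime, attempts_made: int) -> Optional[datetime]:
    """Return when the next retry is due, or None if the record must be dead-lettered."""
    if last_failure_at.tzinfo is None:
        raise ValueError("timestamps must be timezone-aware UTC")
    if attempts_made >= len(RETRY_DELAYS):
        return None  # attempt 4 already failed: no further retries
    return last_failure_at + RETRY_DELAYS[attempts_made]


if __name__ == "__main__":
    failed = datetime.now(timezone.utc)
    print(next_attempt_at(failed, 1))         # five minutes after the failure
    print(sum(RETRY_DELAYS, timedelta()))     # total retry window
```

Summing the four delays gives 2 hours 35 minutes, matching the total retry window stated above.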

INV-03 — Idempotency by session_id: Each session_id MUST appear at most once in the active retry queue. Before enqueuing, the enqueue path checks the queue for an existing record with the same session_id. If one is found with status PENDING or RETRYING, no new record is created. This prevents duplicate emails when the webhook fires multiple times for the same Stripe session (Stripe guarantees at-least-once delivery; webhooks can duplicate).

INV-04 — No duplicate delivery: Before calling send_email(), the retry worker checks the delivery log for session_id. If a delivery log entry exists for that session_id, the retry is skipped and the queue record is marked DELIVERED without sending. This is the second line of idempotency defense (INV-03 prevents duplicate queue entries; INV-04 prevents duplicate sends if the queue check races).

INV-05 — ABORT codes never retry: Graph API responses 400, 403, and any refresh-token 401 are non-retriable. These represent permanent failures (malformed payload or revoked authorization) where repeated attempts will never succeed. ABORT transitions the record directly to dead-letter; the retry schedule is not applied.

INV-06 — Retry worker does not generate or modify verdicts: The retry worker calls oracle_email_service.py /send-verdict-email with the exact verdict payload stored in the retry record at enqueue time. It does not call Gemini. It does not reformat. The verdict delivered on retry N is identical to the verdict that would have been delivered on attempt 1.

INV-07 — Token rotation on every auth call: send_graph_email.get_token_from_refresh() persists the updated refresh token to .env on every successful auth. The retry worker inherits this invariant from the underlying send_email() function. A retry worker failure that prevents token persistence must be logged to ALERT.log (do not silently discard the new token).

INV-08 — Dead-letter always produces an ALERT: No retry record may be silently discarded. If attempt_count >= EMAIL_RETRY_MAX_ATTEMPTS and the last attempt fails, the dead-letter write and ALERT.log entry are mandatory. Missing either constitutes a spec violation.

INV-09 — Retry queue is append-only JSONL: Records are never deleted from the queue file during normal operation. Status transitions are written as new lines appended with the updated status. The most recent line for a given retry_id is the canonical state. Compaction (removing superseded lines) is a maintenance operation requiring explicit NOUS authorization; it must not be triggered automatically.
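
Because the queue is append-only, a reader has to fold the file down to the latest line per retry_id before applying INV-03. A minimal sketch of that fold, with hypothetical function names and assuming every line carries retry_id, session_id, and status as in the retry record schema:

```python
import json
from typing import Dict, Iterable

ACTIVE_STATUSES = {"PENDING", "RETRYING"}


def latest_state(lines: Iterable[str]) -> Dict[str, dict]:
    """Fold append-only JSONL lines into the canonical record per retry_id."""
    state: Dict[str, dict] = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        # Later lines supersede earlier ones for the same retry_id.
        state[record["retry_id"]] = record
    return state


def has_active_record(lines: Iterable[str], session_id: str) -> bool:
    """INV-03 check: is there already a PENDING or RETRYING record for this session?"""
    return any(
        rec["session_id"] == session_id and rec["status"] in ACTIVE_STATUSES
        for rec in latest_state(lines).values()
    )
```

In practice the lines would come from the queue file, read while holding the same lock used for writes, so that the check and the subsequent append see a consistent view.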


VERIFICATION CRITERIA

Σ.✓ conditions — retry mechanism is operating correctly when:

VER-01 — First-attempt idempotency: Submit the same Stripe session_id twice within the retry window. Assert the delivery log contains exactly one entry for that session_id. Assert the customer inbox contains exactly one email.

VER-02 — Retry schedule fires on time: Mock Graph API to return 503 on attempts 1 and 2, then 202 on attempt 3. Assert:

  • Attempt 1 fires at T+0 (immediate)
  • Attempt 2 fires at T+5min (±30s tolerance)
  • Attempt 3 fires at T+35min (±30s tolerance)
  • Delivery log entry written at T+35min
  • Queue record status = DELIVERED
  • Customer email received

VER-03 — Dead-letter and alert on max retries: Mock Graph API to return 503 on all four attempts. Assert:

  • Queue record status = DEAD after attempt 4
  • Dead-letter file contains one entry for the session_id
  • ALERT.log contains [ALERT][oracle-email-retry] DEAD LETTER: entry with correct session_id, customer email, and attempt count
  • No fifth attempt is made

VER-04 — ABORT on 400/403: Mock Graph API to return 400 on first attempt. Assert:

  • No retry attempts are made (queue record jumps directly to DEAD)
  • Dead-letter entry written
  • ALERT.log entry written
  • Total time from failure to dead-letter is under 5 seconds

VER-05 — 429 respects Retry-After header: Mock Graph API to return 429 with Retry-After: 120. Assert next retry attempt fires no earlier than T+120s.

VER-06 — Token 401 refresh cycle: Mock Graph token endpoint to return a valid new access token on refresh. Mock Graph send endpoint to return 401, then 202 on retry with refreshed token. Assert delivery succeeds and GRAPH_REFRESH_TOKEN in .env is updated.

VER-07 — Verdict integrity across retries: Assert verdict payload in delivery attempt N is byte-for-byte identical to the verdict in the retry record created at enqueue time. No Gemini call occurs during retry execution.


FAILURE MODES

FM-01 — Σ.⊠ Retry worker not running: The retry queue file accumulates PENDING records but no worker processes them. next_attempt_at timestamps pass without attempts. Customer email is permanently lost after retry window expires. Mitigation: The retry worker must run as a systemd timer or cron (every 1 minute); health check must verify it is running. If worker is absent, ALERT.log must be written on the next oracle_email_service.py boot. [GAP-01 — retry worker does not yet exist; this spec defines the requirement]

FM-02 — Σ.⊠ Queue file corruption: oracle_email_retry_queue.jsonl is written by two concurrent processes (webhook enqueue + retry worker update). Concurrent writes without file locking can corrupt JSONL. Mitigation: All queue writes must acquire an exclusive fcntl lock (LOCK_EX) before writing. On lock failure (timeout > 5s), log to ALERT.log and abort the current write — do not proceed without the lock. [GAP-02 — file locking not implemented in current oracle_email_service.py]
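
The FM-02 mitigation could be sketched as follows. This is an assumption-laden example rather than the service's code: append_locked and QueueLockTimeout are hypothetical names, and the 5-second limit is implemented by polling a non-blocking flock, since flock itself has no timeout parameter. fcntl is available on Unix only.

```python
import fcntl
import json
import os
import time


class QueueLockTimeout(Exception):
    """Raised when the exclusive lock cannot be acquired in time."""


def append_locked(path: str, record: dict, timeout_s: float = 5.0, poll_s: float = 0.05) -> None:
    """Append one JSONL line while holding an exclusive flock on the queue file."""
    with open(path, "a", encoding="utf-8") as fh:
        deadline = time.monotonic() + timeout_s
        while True:
            try:
                # Non-blocking attempt so the wait can be bounded.
                fcntl.flock(fh.fileno(), fcntl.LOCK_EX | fcntl.LOCK_NB)
                break
            except BlockingIOError:
                if time.monotonic() >= deadline:
                    # Caller is expected to log to ALERT.log and abort the write.
                    raise QueueLockTimeout(f"could not lock {path} within {timeout_s}s")
                time.sleep(poll_s)
        try:
            fh.write(json.dumps(record) + "\n")
            fh.flush()
            os.fsync(fh.fileno())  # make the line durable before releasing the lock
        finally:
            fcntl.flock(fh.fileno(), fcntl.LOCK_UN)
```

Both the enqueue path and the retry worker would need to go through the same helper; a lock that only one writer respects protects nothing.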

FM-03 — Σ.⊠ Refresh token expires during retry window: If GRAPH_REFRESH_TOKEN is revoked between enqueue time and a retry attempt, get_token_from_refresh() raises RuntimeError. This causes the retry attempt to fail with no error code (exception, not HTTP status). Without explicit handling, the retry worker may crash or skip the record without incrementing attempt_count. Mitigation: Wrap get_token_from_refresh() in try/except; on RuntimeError, set last_error_code = 401, treat as ABORT, dead-letter the record immediately, write ALERT.log with GRAPH_REFRESH_TOKEN revoked message so NOUS can re-authorize. [GAP-03 — RuntimeError path not handled in current send_graph_email.py]

FM-04 — Σ.⊠ Dead-letter write fails (disk full / permission): If oracle_email_dead_letter.jsonl cannot be written, the dead-letter record is lost and the customer's lost email becomes invisible to operations. Mitigation: If dead-letter write fails, the ALERT.log write must still succeed (ALERT.log is the second-tier notification; it must be written even if JSONL write fails). If ALERT.log also fails, write to systemd journal via logger command as last resort.

FM-05 — Σ.⊠ Duplicate Stripe webhook fires during active retry: Stripe sends the same checkout.session.completed event twice. First instance is already in the queue as RETRYING. Second instance would create a duplicate record and potentially cause a duplicate email send if the first attempt succeeds concurrently. Mitigation: INV-03 session_id uniqueness check + INV-04 delivery log check form a two-layer guard. Both must be implemented atomically (under the same file lock) to be effective. [GAP-04 — atomicity of the two-layer check requires careful implementation]

FM-06 — Σ.⊠ customer_email null on Stripe session: session.customer_details.email is null (known edge case from SPEC_ORACLE_VERDICT_PIPELINE FM-08). No email address to send to. Retry record MUST NOT be created — enqueuing a record with null customer_email wastes retry cycles and cannot succeed. The webhook should log a warning and proceed to cache-only mode. This failure mode is KNOWN and has no mitigation for email delivery; the verdict is accessible via result page only.

FM-07 — Σ.⊠ oracle_email_service.py crash during send_email() call: The service crashes (OOM, unhandled exception) after send_email() is called but before the result is processed. The email may or may not have been sent (Graph API may have accepted the request). On service restart, the retry worker must check the delivery log before re-attempting — INV-04 provides this guard. If the email was sent but the delivery log was not written, the customer may receive a duplicate. This is an acceptable edge case (duplicate is better than no delivery). [GAP-05 — no atomic write of send + delivery log; duplicate possible in crash scenario]

FM-08 — Σ.⊠ Retry schedule miscalculation (clock skew): If next_attempt_at is computed using local system time and the retry worker runs on a different system or after an NTP correction, next_attempt_at may be in the past or far future. Mitigation: Compute all timestamps as timezone-aware UTC; use datetime.now(timezone.utc) rather than naive datetime.now() (datetime.utcnow() also returns a naive value and is deprecated as of Python 3.12); validate that next_attempt_at > created_at before writing any retry record.


GAPS

| ID | Description | Severity | Resolution Path |

|----|-------------|----------|-----------------|

| GAP-01 | Retry worker does not exist — this entire spec describes a component that must be built | CRITICAL | Build oracle_email_retry_worker.py + systemd timer per this spec; requires NOUS authorization of design before C.L.O.D. deploys |

| GAP-02 | No file locking on JSONL queue writes — concurrent webhook + worker writes will corrupt the queue | HIGH | Implement fcntl.flock(LOCK_EX) on all queue file operations in both the enqueue path (oracle_email_service.py) and the retry worker |

| GAP-03 | get_token_from_refresh() raises bare RuntimeError on missing/expired token — not caught by callers | HIGH | Wrap in try/except in retry worker; map to ABORT path; write ALERT.log with explicit GRAPH_REFRESH_TOKEN revoked message |

| GAP-04 | Atomicity of two-layer idempotency check (INV-03 + INV-04) not guaranteed — race condition possible under concurrent webhook firings | MEDIUM | Both checks must occur inside the same file lock acquisition; document lock acquisition order |

| GAP-05 | No atomic send + delivery-log write — crash between the two steps can produce a duplicate email | LOW | Acceptable edge case (duplicate better than zero); document as known behavior; add dedup note in email footer if duplicate is received |

| GAP-06 | session_id is not currently threaded through oracle_email_service.py VerdictRequest schema — required for INV-03/INV-04 | HIGH | Add session_id: str field to VerdictRequest Pydantic model; update Northflank webhook to include it in the POST body |

| GAP-07 | No end-to-end test for the retry path — VER-02 and VER-03 require a mock Graph API harness that does not exist | MEDIUM | Build mock_graph_api.py test harness; script returns configurable status codes per attempt count; run as part of pre-deploy smoke test |


DEPENDENCIES

| Dependency | Role | Notes |

|------------|------|-------|

| oracle_email_service.py (port 8006) | Enqueue path — POST /send-verdict-email triggers retry enqueue on failure | Must be modified to add session_id to VerdictRequest and implement enqueue on failure |

| send_graph_email.py | Actual email delivery on each retry attempt | Must wrap get_token_from_refresh() RuntimeError per GAP-03 |

| oracle_email_retry_worker.py | Retry scheduler and executor [NOT YET BUILT — GAP-01] | New component; must be built and deployed per this spec |

| Microsoft Graph API | Delivery endpoint | Error codes defined in INPUTS section |

| /home/nous/.env | Graph API credentials + retry config env vars | Token rotation path must remain atomic |

| /home/nous/ALERT.log | Dead-letter and critical failure notification channel | Must always be writable; last-resort fallback is systemd journal |

| oracle_email_retry_queue.jsonl | Persistent retry queue | New file; path configurable via EMAIL_RETRY_QUEUE_PATH |

| oracle_email_dead_letter.jsonl | Dead-letter record store | New file; path configurable via EMAIL_DEAD_LETTER_PATH |

| oracle_email_delivery.log | Successful delivery audit log | New file; used for INV-04 idempotency check |


DEPENDENTS

| Dependent | Dependency Type |

|-----------|----------------|

| Oracle Verdict Pipeline (SPEC_ORACLE_VERDICT_PIPELINE) | This spec closes GAP-03 from that pipeline spec |

| Email Autonomy Stack (SPEC_EMAIL_AUTONOMY_STACK) | Inherits Graph API error handling patterns; extends FM-01/FM-03 from that spec |

| Customer trust / brand | Every paid customer must receive their verdict email; this spec is the delivery guarantee |


REFERENCES

| File | Role |

|------|------|

| /home/nous/oracle_email_service.py | Email delivery service — must be modified to enqueue on failure |

| /home/nous/send_graph_email.py | Graph API send — must have RuntimeError path handled |

| /home/nous/memories/SPEC_ORACLE_VERDICT_PIPELINE.md | Parent spec; GAP-03 is the origin of this spec |

| /home/nous/memories/SPEC_EMAIL_AUTONOMY_STACK.md | Graph API auth invariants inherited by this spec |

| /home/nous/Aether/app/app/api/webhook/route.ts | Northflank webhook — must pass session_id to email service |

| /home/nous/ALERT.log | Dead-letter notification target |


Φζ.⊤.



— SPEC_ORACLE_GEMINI_TIMEOUT.md —

SPEC_ORACLE_GEMINI_TIMEOUT — Gemini Timeout & Retry Spec

Version: 1.0 | Status: AUTHORIZED | Authority: α.13 | Date: 2026-04-16


PURPOSE

Defines the timeout, retry, and fallback behavior for all calls to the Gemini API (model.generateContent()) within the Oracle Verdict Pipeline. Addresses GAP-02 from SPEC_ORACLE_VERDICT_PIPELINE: currently, both webhook/route.ts and verdict/route.ts call Gemini with no timeout, no retry, and no circuit breaker. A hung Gemini call in the webhook IIFE consumes memory indefinitely; a hung call in the verdict route returns a 504 to the customer's browser.

This spec governs the behavior that MUST be implemented to close GAP-02.

Call sites in scope:

  • app/api/webhook/route.ts — pre-compute path (async IIFE, line 229)
  • app/api/verdict/route.ts — on-demand regeneration path (line 185)

INPUTS

Per call-site inputs

| Input | Type | Source | Required |

|-------|------|---------|----------|

| prompt | string | VERDICT_PROMPT[tier] | Yes |

| model | GoogleGenerativeAI model instance | Constructed inline with gemini-2.5-flash | Yes |

| GEMINI_CALL_TIMEOUT_MS | number (env) | process.env.GEMINI_CALL_TIMEOUT_MS | No — default 45000 |

| GEMINI_MAX_RETRIES | number (env) | process.env.GEMINI_MAX_RETRIES | No — default 3 (total attempts: initial + 2 retries) |

| GEMINI_BACKOFF_BASE_MS | number (env) | process.env.GEMINI_BACKOFF_BASE_MS | No — default 1000 |

| GEMINI_CIRCUIT_OPEN_THRESHOLD | number (env) | process.env.GEMINI_CIRCUIT_OPEN_THRESHOLD | No — default 5 |

Circuit breaker state (module-level singleton)

Maintained in-process. State transitions persist for the lifetime of the Node.js process (Northflank container lifetime). State is NOT persisted to disk or shared across replicas.

| Field | Type | Initial Value |

|-------|------|---------------|

| failureCount | number | 0 |

| state | CLOSED \| OPEN \| HALF_OPEN | CLOSED |

| openedAt | number \| null | null |

| OPEN_DURATION_MS | number | 60000 (1 minute) |


OUTPUTS

Success path


{
  verdict: object,      // parsed JSON matching tier schema
  attempts: number,     // 1–3 (how many Gemini calls were made)
  cached_at: string     // ISO timestamp, written by cacheVerdict()
}

Failure path — retries exhausted


// verdict/route.ts returns:
NextResponse.json(
  { error: "Analysis failed. Please contact oracle@42sisters.ai for a refund." },
  { status: 500 }
)

// webhook IIFE logs:
console.error(`[webhook] Gemini failed after ${maxRetries} attempts for session ${sessionId}:`, err)
// No email sent. Cache not written.

Failure path — circuit open


// verdict/route.ts returns:
NextResponse.json(
  { error: "Analysis temporarily unavailable. Please try again in a few minutes." },
  { status: 503 }
)

// webhook IIFE logs:
console.warn(`[webhook] Circuit OPEN — skipping Gemini call for session ${sessionId}`)

INVARIANTS

  1. Timeout is always set. Every model.generateContent() call MUST be raced against a timeout using Promise.race(). No call may await indefinitely. Default timeout: 45 seconds. The Stripe webhook IIFE is not exempt: it fires async, and a hung call still holds a Node.js event loop reference.

  1. Retry count is bounded. Maximum attempts per call, at either call site: 3 total (initial attempt + 2 retries). A 4th attempt is never made. This is a hard ceiling regardless of error type.

  1. Exponential backoff is applied between retries. Wait time between attempt N and attempt N+1: GEMINI_BACKOFF_BASE_MS * 2^(N-1) with full jitter (multiply by Math.random()). Minimum wait: 0ms (jitter can collapse to zero). Maximum wait per interval: 8000ms (hard cap). With the default base of 1000ms and the three-attempt ceiling, the two intervals are at most 1000ms (after attempt 1) and 2000ms (after attempt 2), so the cap only binds if GEMINI_BACKOFF_BASE_MS is raised.

  1. Timeout errors and 5xx API errors are retryable; 4xx are not. A timeout, a network error, or an HTTP 5xx from Gemini triggers retry. An HTTP 400 (bad request, malformed prompt) or 401/403 (auth failure) does NOT retry. It fails immediately and logs GEMINI_BAD_REQUEST or GEMINI_AUTH_FAILURE, respectively, to the error log.

  1. Circuit breaker protects against sustained outage. After GEMINI_CIRCUIT_OPEN_THRESHOLD consecutive final-failures (all retries exhausted) within the current process lifetime, the circuit transitions to OPEN. While OPEN, all new Gemini calls are rejected immediately without hitting the API. Circuit transitions to HALF_OPEN after OPEN_DURATION_MS (60s). The first HALF_OPEN attempt, if successful, closes the circuit and resets failureCount to 0.

  1. Fallback on cache hit. The verdict route MUST check the cache before attempting any Gemini call. A cache hit bypasses the timeout/retry/circuit machinery entirely. This is the primary resilience mechanism; retries are the secondary.

  1. Both call sites share the same circuit breaker state. The webhook IIFE and the verdict route operate against the same module-level circuit breaker singleton. A failure storm from webhook pre-computes opens the circuit for the verdict route as well. This is correct and intentional, as both paths consume from the same upstream service.
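
The circuit breaker transitions described in the invariants can be modeled compactly. The sketch below is in Python purely for illustration (the production call sites are TypeScript), all names are hypothetical, and it makes two assumptions the spec does not state: a failed HALF_OPEN probe re-opens the circuit, and any call arriving while HALF_OPEN is allowed through.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class CircuitBreaker:
    threshold: int = 5                    # GEMINI_CIRCUIT_OPEN_THRESHOLD
    open_duration_s: float = 60.0         # OPEN_DURATION_MS / 1000
    clock: Callable[[], float] = time.monotonic
    failure_count: int = 0
    state: str = "CLOSED"
    opened_at: Optional[float] = None

    def allow_call(self) -> bool:
        """Return False while OPEN; move to HALF_OPEN once the open window has elapsed."""
        if self.state == "OPEN":
            if self.clock() - self.opened_at >= self.open_duration_s:
                self.state = "HALF_OPEN"
                return True
            return False
        return True

    def record_success(self) -> None:
        """A success closes the circuit and resets the consecutive-failure count."""
        self.failure_count = 0
        self.state = "CLOSED"
        self.opened_at = None

    def record_final_failure(self) -> None:
        """Call once per request whose retries were all exhausted."""
        self.failure_count += 1
        # Assumption: a failed HALF_OPEN probe re-opens immediately.
        if self.state == "HALF_OPEN" or self.failure_count >= self.threshold:
            self.state = "OPEN"
            self.opened_at = self.clock()
```

Injecting the clock keeps the transitions testable without waiting out the 60-second window. A single module-level instance shared by both call sites would correspond to the singleton requirement.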


VERIFICATION CRITERIA

Σ.✓ — timeout/retry subsystem is operating correctly when:

  1. Timeout fires at configured threshold. Inject a mock generateContent() that hangs (never resolves). Confirm Promise.race() rejects with GeminiTimeoutError after GEMINI_CALL_TIMEOUT_MS ± 500ms. Test both call sites independently.

  1. Retry sequence completes with correct backoff. Mock generateContent() to fail twice then succeed on attempt 3. Confirm: (a) total 3 calls made, (b) delays between calls are >= 0ms and <= 8000ms, (c) final result is the success payload, not an error. Log output must show [gemini] attempt 1 failed, [gemini] attempt 2 failed, [gemini] attempt 3 succeeded.

  1. Non-retryable errors fail fast. Mock generateContent() to throw a 401 error. Confirm the call fails immediately (no retries, no backoff delay). Log must show GEMINI_AUTH_FAILURE. Total elapsed time must be < 200ms (no backoff pauses).

  1. Circuit breaker opens after threshold failures. Mock generateContent() to always exhaust all retries. Trigger GEMINI_CIRCUIT_OPEN_THRESHOLD sessions. Confirm: circuit state transitions to OPEN. On the next call attempt, confirm rejection is immediate (< 10ms) with 503 response and no call to generateContent(). Confirm circuit transitions to HALF_OPEN after OPEN_DURATION_MS.

  1. Verdict route returns 500 with customer-facing error on exhausted retries. With generateContent() mocked to always fail, confirm verdict/route.ts returns HTTP 500 with body { error: "Analysis failed. Please contact oracle@42sisters.ai for a refund." }. Confirm no verdict is cached.

  1. Webhook IIFE does not block Stripe response. With generateContent() mocked to hang for 120 seconds, confirm the webhook handler still returns { received: true } to Stripe within 2 seconds of the request arriving (the webhook IIFE fires async; the Stripe response must not wait for it).
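
The behavior these criteria exercise (per-attempt timeout, bounded retries with full jitter, fail-fast on non-retryable errors) can be sketched as below. This is a Python analog for illustration only: asyncio.wait_for stands in for the Promise.race() timeout used at the TypeScript call sites, and NonRetryableError, call_with_retry, and the injectable sleep are hypothetical names, not part of the codebase.

```python
import asyncio
import random
from typing import Any, Awaitable, Callable, Tuple


class GeminiTimeoutError(Exception):
    """An attempt exceeded the per-call timeout."""


class NonRetryableError(Exception):
    """Stands in for HTTP 400/401/403 responses (hypothetical)."""


async def call_with_retry(
    call: Callable[[], Awaitable[Any]],
    timeout_s: float = 45.0,
    max_attempts: int = 3,
    base_s: float = 1.0,
    cap_s: float = 8.0,
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> Tuple[Any, int]:
    """Return (result, attempts_used); raise the last error once attempts are exhausted."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_err: Exception = RuntimeError("no attempt made")
    for attempt in range(1, max_attempts + 1):
        try:
            result = await asyncio.wait_for(call(), timeout=timeout_s)
            return result, attempt
        except NonRetryableError:
            raise  # fail fast: no retry, no backoff
        except asyncio.TimeoutError:
            last_err = GeminiTimeoutError(f"attempt {attempt} exceeded {timeout_s}s")
        except Exception as err:  # treated as a retryable failure in this sketch
            last_err = err
        if attempt < max_attempts:
            # Full jitter: uniform in [0, min(cap, base * 2^(attempt-1))).
            await sleep(min(cap_s, base_s * 2 ** (attempt - 1)) * random.random())
    raise last_err
```

Passing a no-op sleep lets a test run the fail-twice-then-succeed case without real delays, while a very small timeout_s reproduces the hanging-call case.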


FAILURE MODES

  1. Σ.⊠ Gemini API cold start / transient timeout. Model generation can take 10–40 seconds

for complex strategy tier prompts. A 45-second default timeout may be too tight on

cold-start or peak-load conditions. Symptom: legitimate verdicts failing with timeout

on first attempt, succeeding on retry. Mitigation: retry 1 should resolve this in the

majority of cases. If cold-start is persistent, NOUS may tune GEMINI_CALL_TIMEOUT_MS

upward via env var without a code deploy.

  1. Σ.⊠ Backoff accumulation exceeds customer wait tolerance. Worst case: 3 attempts with

maximum jitter at attempt 3 = ~45s + 1s + 45s + 4s + 45s = ~140 seconds. The verdict

route's upstream Next.js edge runtime default timeout is 30 seconds on Northflank.

If all retries are consumed, the customer browser may receive a platform 504 before

our 500 fires. [GAP-02A — Northflank edge timeout vs. total retry budget not reconciled;

needs design: either shorten retry budget for verdict route or increase Northflank timeout]

  1. Σ.⊠ Circuit breaker opens during partial Gemini degradation. If Gemini is slow but not

fully down, retries may succeed on attempt 3 consistently, never incrementing failureCount.

Circuit remains CLOSED but customers experience high latency. Symptom: p99 verdict latency

> 120s with no circuit protection firing. Mitigation: [GAP-02B — latency-based circuit

tripping not specified; needs design: separate threshold for "slow but responding" vs.

"fully down"]

  1. Σ.⊠ Circuit breaker state lost on container restart. Northflank restarts the container

on deploy or crash. The in-process circuit breaker resets to CLOSED. If Gemini is still

down at restart time, the circuit will re-open after GEMINI_CIRCUIT_OPEN_THRESHOLD

additional failures, meaning customers face that many more failed verdicts post-restart.

Mitigation: [GAP-02C — persistent circuit state (Redis/file) not specified; current spec

accepts this as a known limitation — in-process only]

  1. Σ.⊠ Webhook IIFE retry storm on Stripe replay. Stripe retries the webhook up to 72 hours

on non-2xx. However, our webhook always returns 2xx regardless of Gemini outcome. Stripe

replay is therefore not a retry-storm risk. Confirmed safe — Stripe does not see Gemini

failures.

  1. Σ.⊠ Non-retryable auth failure on GEMINI_API_KEY rotation. If the API key is rotated

in Northflank env vars without a container redeploy, the running container holds the old

key. All calls return 401. Circuit opens after threshold failures. All customer verdicts

fail until container is redeployed. Mitigation: [GAP-02D — no GEMINI_API_KEY health

check on boot; needs design: startup probe that validates key with a dry-run call]

  1. Σ.⊠ JSON parse failure after successful Gemini call. Gemini returns 200 with

non-JSON body (e.g., markdown-wrapped JSON, truncated response). This is not a timeout

or API error — it will NOT trigger retry under the current retry logic (retry is on

network/timeout/5xx only). Symptom: valid Gemini call → JSON.parse() throws →

immediate 500 → no retry. [GAP-02E — JSON parse failure is not in the retryable error

set; needs design: detect malformed JSON and retry up to 1 additional time with a

stricter prompt suffix]
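
One possible shape for the GAP-02E mitigation is sketched below. It is written in Python for illustration only (the affected routes are TypeScript) and shows the control flow, not a finished design. `parse_verdict_json`, `generate_with_parse_retry`, and `STRICT_SUFFIX` are hypothetical names, and `generate` stands in for whatever function wraps model.generateContent() and returns the response text.

```python
import json
import re
from typing import Callable

# Hypothetical suffix appended on the single parse-retry attempt.
STRICT_SUFFIX = "\n\nReturn ONLY a raw JSON object. No markdown, no commentary."

# Build the fence marker programmatically so this example contains no literal fence.
_TICKS = "`" * 3
_FENCED = re.compile(r"^" + _TICKS + r"(?:json)?\s*(.*?)\s*" + _TICKS + r"$", re.DOTALL)


def parse_verdict_json(text: str) -> dict:
    """Parse a model response, tolerating a markdown code fence around the JSON."""
    stripped = text.strip()
    match = _FENCED.match(stripped)
    if match:
        stripped = match.group(1)
    parsed = json.loads(stripped)  # json.JSONDecodeError is a ValueError subclass
    if not isinstance(parsed, dict):
        raise ValueError("verdict must be a JSON object")
    return parsed


def generate_with_parse_retry(generate: Callable[[str], str], prompt: str) -> dict:
    """Call the model; on malformed JSON, retry exactly once with a stricter prompt."""
    try:
        return parse_verdict_json(generate(prompt))
    except ValueError:
        # Second and final attempt. A failure here propagates to the caller,
        # which would surface it as the existing 500 path.
        return parse_verdict_json(generate(prompt + STRICT_SUFFIX))
```

Unwrapping a markdown fence before parsing avoids spending the single retry on the most common malformed case; a truncated response still fails the first parse and consumes the retry.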


GAPS

| Gap ID | Description | Impact | Severity |

|--------|-------------|--------|----------|

| GAP-02 (parent) | No timeout on generateContent() calls in webhook or verdict route | Webhook IIFE can hang indefinitely; result page can 504 | CRITICAL — current production state |

| GAP-02A | Northflank edge timeout (30s) vs. total retry budget (~140s) not reconciled | Customers may receive platform 504 before our 500 | HIGH |

| GAP-02B | Latency-based circuit tripping not specified | Slow-but-responding Gemini invisible to circuit breaker | MEDIUM |

| GAP-02C | Circuit breaker state is in-process only (lost on container restart) | Post-restart failure window during sustained outage | LOW — accepted limitation |

| GAP-02D | No GEMINI_API_KEY health check on boot | Key rotation causes silent 401 storm until manual restart | MEDIUM |

| GAP-02E | JSON parse failure is not in the retryable error set | Malformed Gemini response causes immediate 500 with no retry | MEDIUM |


DEPENDENCIES

| Dependency | Role |

|------------|------|

| @google/generative-ai npm package | Provides GoogleGenerativeAI and model.generateContent() |

| GEMINI_API_KEY / GOOGLE_API_KEY env var | Authentication to Gemini API |

| gemini-2.5-flash model | The specific model being called — timeout values are tuned to this model's latency profile |

| verdictCache.ts (cacheVerdict, getCachedVerdict) | Cache-hit path that bypasses retry machinery |

| Node.js Promise.race() | Mechanism for timeout enforcement |


DEPENDENTS

| Dependent | Dependency type |

|-----------|----------------|

| app/api/webhook/route.ts (lines 228–231) | Must wrap generateContent() with timeout/retry |

| app/api/verdict/route.ts (lines 184–185) | Must wrap generateContent() with timeout/retry |

| SPEC_ORACLE_VERDICT_PIPELINE.md (GAP-02) | This spec closes that gap |

| Customer verdict delivery SLA | Directly affected by Gemini call reliability |


REFERENCES

| File | Relevance |

|------|-----------|

| /home/nous/Aether/app/app/api/webhook/route.ts | Line 229: await model.generateContent(prompt) — no timeout |

| /home/nous/Aether/app/app/api/verdict/route.ts | Line 185: await model.generateContent(prompt) — no timeout |

| /home/nous/memories/SPEC_ORACLE_VERDICT_PIPELINE.md | Parent spec; GAP-02 definition |

| /home/nous/memories/SPECIFICATION_AUDIT_LOOP.md | Spec template and classification criteria |


Φζ.⊤.


Jeremy Zlabis

Chronogeometer · Visionary · Disruptor · Chief

42 Sisters AI · East York, Toronto

🍁 Φ 0.042

— SPEC_ORACLE_INBOX_WATCH.md —

SPECIFICATION: Oracle Inbox Watch

Status: AUTHORIZED

Authorized: α.13, April 16 2026

Version: v1.0


PURPOSE

oracle_inbox_watch.py is the autoresponder and alert daemon for oracle@42sisters.ai. It runs every 2 minutes via cron, checks for new unread messages via the Microsoft Graph API, broadcasts all new arrivals to CREW_CHANNEL, and sends a 24-hour acknowledgement autoresponse to eligible senders. It is the first point of contact for inbound oracle traffic and the crew's early-warning system for incoming inquiries.

Source file: /home/nous/oracle_inbox_watch.py

Trigger: cron, every 2 minutes

Dependencies: check_inbox.py (provides check_unread()), send_graph_email.py, check_inbox.crew_radio (provides crew_broadcast())

State file: ~/.oracle_inbox_seen.json


INPUTS

| Input | Source | Format |

|---|---|---|

| Unread messages | check_inbox.check_unread() | List of message dicts with at minimum: sender, subject, receivedDateTime |

| Seen state | ~/.oracle_inbox_seen.json | JSON object: {"seen": [...list of receivedDateTime strings...]} |

| AUTORESPOND_SKIP set | Hardcoded in script | Python set of lowercase email addresses |

AUTORESPOND_SKIP set (canonical):


oracle@42sisters.ai
noreply@stripe.com
no-reply@stripe.com
receipts@stripe.com

Autoresponse body (canonical, verbatim):


Your inquiry has been received by the 42Sisters.AI Oracle. Expect a response within 24 hours. Thank you for reaching out. — 42 Sisters AI · oracle@42sisters.ai

Autoresponse subject rule: Prepend "Re: " to original subject unless subject already starts with "Re:" (case-sensitive check).
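
The subject rule can be expressed as a one-line helper. This is a minimal sketch; `reply_subject` is a hypothetical name, and treating a missing subject as an empty string is an assumption not stated in the spec.

```python
def reply_subject(original: str) -> str:
    """Prepend "Re: " unless the subject already starts with "Re:" (case-sensitive)."""
    subject = original or ""  # assumption: a missing subject is treated as empty
    return subject if subject.startswith("Re:") else f"Re: {subject}"
```

Because the check is case-sensitive, a subject such as "RE: Hello" is not recognized as a reply and becomes "Re: RE: Hello".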


OUTPUTS

| Output | Destination | Trigger |

|---|---|---|

| CREW_CHANNEL broadcast | crew_broadcast("ORACLE", ...) | Every new (unseen) message |

| Autoresponse email | send_graph_email | New message where sender NOT in AUTORESPOND_SKIP AND sender address does NOT start with "no-reply" |

| Updated seen state | ~/.oracle_inbox_seen.json | Every run (even if no new messages) |

Broadcast format:


Incoming: from {sender} — subject: {subject}

State file write format:


{"seen": ["2026-04-16T10:00:00Z", "2026-04-16T10:02:00Z", ...]}

Entries are receivedDateTime strings. List is sorted ascending and trimmed to 500 entries max on every write.
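
A minimal sketch of the sort-and-trim write follows. `trim_seen` and `write_seen_state` are hypothetical names. Two assumptions are made: all receivedDateTime strings share one ISO-8601 UTC format, so lexical order equals chronological order; and the write goes through a temporary file, which is an addition for crash safety rather than something the spec requires.

```python
import json
import os
import tempfile
from pathlib import Path

MAX_SEEN = 500  # cap stated in the spec


def trim_seen(seen) -> list:
    """Sort ascending and keep the MAX_SEEN most recent receivedDateTime strings."""
    # Assumes uniform ISO-8601 UTC strings, so string order is time order.
    return sorted(set(seen))[-MAX_SEEN:]


def write_seen_state(path: Path, seen) -> None:
    """Write {"seen": [...]} via a temp file so a crash cannot leave partial JSON."""
    payload = {"seen": trim_seen(seen)}
    fd, tmp = tempfile.mkstemp(dir=str(path.parent), prefix=path.name, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(payload, handle)
    os.replace(tmp, path)  # atomic rename on the same filesystem
```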


INVARIANTS

  1. Non-fatal execution — Any exception (network failure, malformed message, missing state file, Graph API error) must be caught at the top level. Script must exit with code 0 in all cases. Cron health must not be disrupted by transient failures.
  1. First-run flood prevention — If ~/.oracle_inbox_seen.json does not exist, the script seeds the seen set from all currently unread messages WITHOUT sending autoresponses or broadcasting. This prevents a flood on first deployment.
  1. State cap — The seen list must never exceed 500 entries. On every write, sort ascending and trim to the 500 most recent entries.
  1. Self-loop prevention — oracle@42sisters.ai must be in AUTORESPOND_SKIP. The script must never autorespond to itself.
  1. no-reply prefix guard — Any sender address that starts with "no-reply" (case-insensitive) is skipped for autoresponse, regardless of AUTORESPOND_SKIP membership. This is a second independent guard.
  1. Broadcast all new messages — Every unseen message is broadcast to CREW_CHANNEL via crew_broadcast(), including those in AUTORESPOND_SKIP. The skip set governs autoresponse only, not alerting.
  1. Seen state is keyed on receivedDateTime — Deduplication uses receivedDateTime as the unique key. Message ID is not used.
  1. Read-only on inbox — The script reads inbox state but does not mark messages as read, move, delete, or otherwise mutate the inbox.
  1. Autoresponse subject prefix — Subject must be prefixed with "Re: " if not already present. No double-prefixing.
  1. Imports — Must import from check_inbox (for check_unread and crew_broadcast) and send_graph_email (for outbound mail). No inline credential handling — credentials loaded by those modules from .env or vault.
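
The guard and seeding invariants above can be sketched as two small functions. This is an illustration, not the contents of oracle_inbox_watch.py: `should_autorespond` and `process` are hypothetical names, the `broadcast` and `respond` callbacks stand in for crew_broadcast() and the send_graph_email call, and rejecting an empty sender address is an assumption.

```python
AUTORESPOND_SKIP = {
    "oracle@42sisters.ai",
    "noreply@stripe.com",
    "no-reply@stripe.com",
    "receipts@stripe.com",
}


def should_autorespond(sender: str) -> bool:
    """Apply both independent guards: the skip set and the "no-reply" prefix."""
    address = (sender or "").strip().lower()
    if not address or address in AUTORESPOND_SKIP:  # empty sender: assumed ineligible
        return False
    return not address.startswith("no-reply")


def process(messages, seen, first_run, broadcast, respond):
    """Return the updated seen set; side effects happen only through the callbacks."""
    seen = set(seen)
    for msg in messages:
        key = msg["receivedDateTime"]  # dedup key per the spec; message ID is not used
        if key in seen:
            continue
        seen.add(key)  # marked seen before sending, so a failed send is not retried
        if first_run:
            continue  # seed only: no broadcast, no autoresponse
        broadcast("ORACLE", f"Incoming: from {msg['sender']} — subject: {msg['subject']}")
        if should_autorespond(msg["sender"]):
            respond(msg)
    return seen
```

Passing the side effects in as callbacks keeps the logic testable against V1 through V6 without touching the Graph API.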

VERIFICATION CRITERIA

| # | Criterion | Pass Condition |

|---|---|---|

| V1 | First-run seed | When the state file does not exist: existing unread messages are seeded into seen set; zero autoresponses sent; zero crew broadcasts emitted |

| V2 | New message broadcast | Introduce 1 new unread message not in seen state → exactly 1 crew_broadcast("ORACLE", ...) call with correct sender and subject |

| V3 | Autoresponse sent | New message from user@example.com (not in AUTORESPOND_SKIP, not no-reply prefix) → exactly 1 autoresponse email sent with canonical body and correct Re: subject |

| V4 | AUTORESPOND_SKIP respected | New message from noreply@stripe.com → broadcast emitted, autoresponse NOT sent |

| V5 | no-reply prefix guard | New message from no-reply@anything.com → broadcast emitted, autoresponse NOT sent |

| V6 | Self-loop blocked | New message from oracle@42sisters.ai → broadcast emitted, autoresponse NOT sent |

| V7 | State cap | Inject 600 receivedDateTime entries into state → after write, state file contains exactly 500 entries (most recent 500) |

| V8 | Non-fatal exception | Simulate check_unread() raising an exception → script exits with code 0; state file not corrupted |

| V9 | No double Re: | Message with subject "Re: Hello" → autoresponse subject is "Re: Hello", not "Re: Re: Hello" |

| V10 | Seen state updated | After processing 3 new messages, all 3 receivedDateTimes appear in ~/.oracle_inbox_seen.json |


FAILURE MODES

| ID | Condition | Symptom | Consequence / mitigation |

|---|---|---|---|

| FM-1 | Graph API auth token expired | check_unread() raises auth exception | INV-1: exit 0; alert not delivered; next run retries |

| FM-2 | State file corrupted (non-JSON) | json.load() raises exception | INV-1: exit 0; treat as if no state; risk of first-run flood suppressed by guard |

| FM-3 | State file missing on non-first-run (deleted externally) | Re-seeds from current unread without autoresponding | First-run seed guard prevents autoresponse flood; broadcasts lost for that run |

| FM-4 | send_graph_email fails mid-loop | Autoresponse not delivered for that message | Message already marked seen; autoresponse not retried on next run |

| FM-5 | New message has null/missing receivedDateTime | KeyError during deduplication | Should be caught by top-level exception handler; INV-1 applies |

| FM-6 | Same sender sends 100 emails/day | 100 autoresponses sent | GAP: no rate limiting (see GAPS) |

| FM-7 | seen list grows unbounded if sort/trim not applied | State file bloats | INV-3 caps at 500; failure only if trim logic is bypassed |

| FM-8 | Cron fires overlapping (2-min interval, slow API) | Two instances run simultaneously | No lock file; both processes may send duplicate autoresponses for same message; low-risk given 2-min interval |


GAPS

| # | Gap | Risk | Recommended Mitigation |

|---|---|---|---|

| G1 | No per-sender rate limiting | A single sender can receive unlimited autoresponses per day if they send repeatedly | Add per-sender cooldown (e.g., 24h) to state file |

| G2 | No subject-level deduplication | If seen state is cleared, the same message may trigger a duplicate autoresponse | Add message ID (Graph API id field) as secondary dedup key |

| G3 | No archiving of responded messages | No audit trail of which emails received autoresponses | Append responded message metadata to ~/.oracle_autoresponded.jsonl |

| G4 | Cron interval vs Graph API rate limits | 2-minute cron may approach Microsoft Graph throttling limits under high volume | Verify Graph API rate limits for the mail read endpoint; add exponential backoff |

| G5 | No crew alert on autoresponse failure | If send_graph_email fails, crew is not notified | Catch send failures separately and emit a crew_broadcast with failure notice |

| G6 | Seen state keyed on receivedDateTime only | Two messages received at identical timestamps (edge case) would be conflated | Use receivedDateTime + sender composite key or switch to Graph message ID |

| G7 | First-run behavior is implicit | No flag or log entry indicates when a first-run seed occurred | Write a seed event to SESSIONS.md or crew_broadcast on first-run detection |

| G8 | no-reply prefix check is not canonicalized | Case sensitivity of prefix check (no-reply vs NO-REPLY) not formally defined | Enforce sender.lower().startswith("no-reply") explicitly in spec and code |
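
As an illustration of the G1 mitigation, a per-sender cooldown could look like the sketch below. `cooldown_allows` and the `last_sent` mapping are hypothetical; the 24-hour value is the example given in G1, and persisting `last_sent` in the state file is left out.

```python
from datetime import datetime, timedelta, timezone

COOLDOWN = timedelta(hours=24)  # example value from G1


def cooldown_allows(last_sent: dict, sender: str, now: datetime) -> bool:
    """Return True and record the send if this sender is outside the cooldown."""
    key = sender.strip().lower()
    previous = last_sent.get(key)
    if previous is not None and now - previous < COOLDOWN:
        return False
    last_sent[key] = now
    return True
```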



— SPEC_ORACLE_REGEN_CONSISTENCY.md —

SPEC_ORACLE_REGEN_CONSISTENCY — Gemini Non-Determinism Handling

Version: 1.0 | Status: AUTHORIZED | Authority: α.13 | Date: 2026-04-16


PURPOSE

When the Oracle result page (/oracle/result?session_id=...) is loaded and the verdict cache

(oracle_toll.py, GET /cache/{session_id}) returns 404 (cache miss), the verdict route

(app/api/verdict/route.ts) calls Gemini again using the identical prompt template used by the

original webhook-triggered generation.

The problem: Gemini is a stochastic model. Temperature > 0 means the regenerated verdict may

differ from the original verdict that was already emailed to the customer. A customer who paid $5

CAD for a Full Breakdown may see GREEN on their result page and RED in their email — or vice versa.

This is a trust and correctness failure.

This specification defines: how regen requests must be configured to maximize determinism, how to

detect and log divergence between the original (emailed) verdict and the regenerated verdict, what

divergence thresholds are acceptable, and what the customer notification path looks like when

significant divergence is detected.

This spec does NOT eliminate Gemini non-determinism — that is impossible at the API level. It

defines a regime that minimizes it and handles it explicitly when it occurs.

Source gap: SPEC_ORACLE_VERDICT_PIPELINE.md GAP-08.


INPUTS

Primary: Cache Miss Signal

  • Trigger: GET /cache/{session_id} returns HTTP 404 from oracle_toll
  • Caller: app/api/verdict/route.ts — the result page backend

Session Context (required for regen)

  • session_id — Stripe session ID (used for cache key and audit trail)
  • tier — "quick" | "full" | "strategy" (re-read from Stripe via stripe.checkout.sessions.retrieve)
  • query — original customer query (re-assembled from session.custom_fields or metadata chunks)
  • customer_email — from session.customer_details.email (for divergence notification)

Original Verdict Reference (optional but required for divergence check)

  • Source: email delivery log at oracle_email_service.py or an oracle_log.jsonl entry (if GAP-01 is resolved)
  • If no original verdict is recoverable, divergence check is SKIPPED and regen proceeds with notification that the result may differ from email. [GAP — original verdict recovery currently has no mechanism when oracle_log.jsonl does not exist; see GAP-01 of SPEC_ORACLE_VERDICT_PIPELINE.md]

Gemini API Configuration for Regen Requests

All regen calls MUST use the following parameters (not optional):


{
  "model": "gemini-2.5-flash",
  "generationConfig": {
    "temperature": 0.0,
    "topP": 1.0,
    "topK": 1,
    "candidateCount": 1
  }
}

Rationale: temperature=0.0, topK=1 selects the greedy-decode (argmax) path, which is maximally

deterministic for a given prompt. This does not guarantee identical outputs across all API calls

(model infrastructure variance, quantization differences across serving replicas are possible), but

eliminates sampling randomness as a source of divergence.

Note: The original webhook call MAY have used default temperature (not 0.0). [GAP — original

webhook generation config is not explicitly specified in current code; review needed.]
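
A test-side check for the locked configuration might look like the following. It is a Python sketch of the comparison only; `assert_regen_config` is a hypothetical name, and the request body is assumed to carry a `generationConfig` object shaped like the block above.

```python
REGEN_MODEL = "gemini-2.5-flash"
REGEN_GENERATION_CONFIG = {
    "temperature": 0.0,
    "topP": 1.0,
    "topK": 1,
    "candidateCount": 1,
}


def assert_regen_config(request_body: dict) -> None:
    """Raise ValueError if a regen request deviates from the locked parameters."""
    if request_body.get("model") != REGEN_MODEL:
        raise ValueError(f"regen must use {REGEN_MODEL}")
    config = request_body.get("generationConfig") or {}
    for key, expected in REGEN_GENERATION_CONFIG.items():
        if config.get(key) != expected:
            raise ValueError(f"generationConfig.{key} must be {expected!r}")
```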


OUTPUTS

Primary Output: Regenerated Verdict JSON

Structurally identical to the originally cached verdict:


{
  "tier": "quick" | "full" | "strategy",
  "query": "<original customer query>",
  "verdict": { ... },
  "cached_at": "<ISO timestamp of regen>",
  "regen": true,
  "regen_reason": "cache_miss"
}

The regen: true flag marks this as a regenerated result (not from original cache).

The regen_reason field is for audit trail purposes.

Secondary Output: Divergence Log Entry

Appended to oracle_log.jsonl (requires GAP-01 fix):


{
  "event": "regen_divergence_check",
  "session_id": "<stripe_session_id>",
  "tier": "quick" | "full" | "strategy",
  "original_verdict": "GREEN" | "AMBER" | "RED" | "NULL" | "UNKNOWN",
  "regen_verdict": "GREEN" | "AMBER" | "RED" | "NULL",
  "top_level_match": true | false,
  "divergence_level": "none" | "minor" | "significant",
  "timestamp": "<ISO>"
}

Tertiary Output: Customer Notification (conditional — significant divergence only)

If divergence level is "significant" (see INVARIANTS), a notification email is dispatched via

oracle_email_service.py using a new endpoint or the existing /send-verdict-email with an

appended disclaimer:


NOTE: Due to a technical condition, the verdict on your results page may differ slightly from
the one you received by email. Both represent a valid coherence reading of your submission.
If you have questions, reply to this email and we will provide clarification.

INVARIANTS

  1. Regen configuration lock — All regen calls MUST set temperature=0.0, topK=1. No regen

call may use default or elevated temperature settings. This is enforced at the call site in

app/api/verdict/route.ts. Violation means sampling variance contaminates the result.

  1. Regen always re-caches — After a successful regen, the result MUST be written back to the

cache via POST /cache/{session_id}. Subsequent requests for the same session_id will hit cache.

A regen that does not write to cache means every page load triggers a new Gemini call —

unnecessary cost and a new divergence opportunity on every load.

  1. Divergence check is non-blocking — Divergence detection logic MUST NOT delay or block

the verdict delivery to the customer. The result page renders the regen verdict immediately.

Divergence check and notification happen asynchronously (fire-and-forget IIFE or equivalent).

  1. Original verdict is authoritative — If divergence is detected, the emailed verdict is

considered authoritative (the customer received it first, paid for it, and may have acted on it).

The result page displays the regen verdict WITH a divergence disclaimer appended. It does NOT

silently replace the emailed verdict.

  1. Divergence thresholds (applied to top-level verdict field comparison):

- "none" — top-level verdict identical (both GREEN, both AMBER, etc.) → no action

- "minor" — adjacent color shift (GREEN↔AMBER, AMBER↔RED) → log only; no customer notification

- "significant" — non-adjacent shift (GREEN↔RED, any↔NULL) → log + customer notification email

  1. Regen is only triggered by cache miss — Regen MUST NOT be triggered for any other reason

(e.g., customer request, retry, admin action) without NOUS authorization. Regen is the

automatic fallback path, not a manual "regenerate" feature.

  1. Prompt identity — The prompt template sent to Gemini on regen MUST be identical (byte-for-byte)

to the prompt template used in the original webhook generation. Any deviation in prompt structure

can introduce divergence independent of temperature settings.

  1. [GAP — needs design] Regen rate limiting — No mechanism currently prevents a customer from

triggering repeated regen calls (when the cache write fails or the cache entry expires). A rate limit of 1 regen

per session_id per 5 minutes should be implemented. Currently unspecified.
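
The divergence thresholds in the invariants above reduce to a small pure function. The sketch below is one reading of them: `classify_divergence` is a hypothetical name, and returning None for an "UNKNOWN" original reflects the rule that the check is skipped when no original verdict is recoverable.

```python
from typing import Optional

# Position on the colour scale; adjacent positions differ by exactly 1.
_SCALE = {"GREEN": 0, "AMBER": 1, "RED": 2}


def classify_divergence(original: str, regen: str) -> Optional[str]:
    """Return "none", "minor", "significant", or None when the check is skipped."""
    if original == "UNKNOWN":
        return None  # no recoverable original verdict: divergence check skipped
    if original == regen:
        return "none"
    if original in _SCALE and regen in _SCALE:
        return "minor" if abs(_SCALE[original] - _SCALE[regen]) == 1 else "significant"
    return "significant"  # any change to or from NULL
```

Only a "significant" result would feed the customer notification path; "minor" is logged and nothing more.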


VERIFICATION CRITERIA

Σ.✓ conditions — regen consistency is operating correctly when:

  1. Temperature enforcement — A test call to the regen code path verifies that the Gemini API

request body contains temperature=0.0 and topK=1. This can be verified via mock/intercept

in the test suite. Failure to find these params = spec violation.

  1. Cache write after regen — After a regen, GET /cache/{session_id} returns 200 with the

regen payload. The regen: true flag is present. Verified by: integration test that deletes

cache entry, loads result page, then checks cache.

  1. Divergence log entry — When a regen occurs, oracle_log.jsonl receives a

regen_divergence_check entry within 5 seconds. Verified by: integration test with oracle_log

tail check. [GAP — requires oracle_log.jsonl to exist (GAP-01)]

  1. Non-blocking render — Result page returns HTTP 200 with verdict JSON within 30 seconds of

cache miss, regardless of whether divergence was detected. Divergence processing must not add to

render latency. Verified by: load test with cache miss and timing assertion.

  1. Significant divergence notification — When a GREEN→RED or RED→GREEN divergence is induced

in test (by seeding original verdict in log and mocking Gemini to return opposite), the customer

email with divergence disclaimer is sent within 60 seconds. Verified by: email service mock +

log assertion.

  1. Prompt identity check — A unit test compares the prompt string passed to Gemini in both the

webhook path and the regen path for all three tiers. String equality must hold. Verified by:

extracting prompt construction to a shared utility function (buildVerdictPrompt(tier, query))

and testing that function in isolation.


FAILURE MODES

  1. Σ.⊠ Temperature not enforced — Regen call uses default Gemini temperature (non-zero).

Result: verdict varies on every page load. Customer sees different verdict each refresh.

Detection: no current instrumentation. Mitigation: code review + test enforcement (VC-1).

  1. Σ.⊠ Cache not written after regen — Regen succeeds but cacheVerdict() silently fails

(GAP-06 — empty catch in verdictCache.ts). Every page load triggers a new Gemini call.

Detection: cache hit rate metric (not currently tracked). Mitigation: fix GAP-06 (alerting on

cache service health); also fix empty catch in cacheVerdict.

  1. Σ.⊠ Original verdict not recoverable — oracle_log.jsonl does not exist (GAP-01 not fixed).

Divergence check cannot compare to original. Result: divergence check skipped; customer may

receive no notification of a real divergence. Mitigation: implement GAP-01 (oracle_log.jsonl).

  1. Σ.⊠ Divergence notification email fails — Graph API is down at the moment a significant

divergence is detected. Notification email not sent. Customer is not informed.

Detection: email service 502 response in async path. Mitigation: log the failure; retry once

after 60 seconds (fire-and-forget second attempt). [GAP — retry logic for async divergence

notification not specified]

  1. Σ.⊠ Prompt divergence between webhook and regen path — A code change modifies the prompt

construction in one path but not the other. Regen divergence increases structurally.

Detection: VC-6 (prompt identity test) catches this on deploy. Mitigation: consolidate prompt

construction into buildVerdictPrompt() utility function used by both paths.

  1. Σ.⊠ Gemini infrastructure variance — Even at temperature=0.0, different serving replicas

may return different token streams for identical inputs. This is an external model behavior gap.

Detection: divergence log (see REGEN-GAP-06). Mitigation: none automatic — log and notify.

[GAP — needs design: if infrastructure variance frequency > 5% of regen events, escalate to

NOUS for consideration of caching original verdict in a redundant store]

  1. Σ.⊠ Regen loop — Customer repeatedly reloads result page after cache write fails.

Each reload triggers a new Gemini call and a new regen. Cost accumulates.

Detection: no rate limit currently in place. Mitigation: implement regen rate limiting (INVARIANT 8).
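
A possible starting point for the regen rate limit is sketched here in Python. `RegenRateLimiter` is a hypothetical class; it keeps state in process memory, so it would reset on container restart, and the injectable clock exists only to make it testable.

```python
import time

REGEN_WINDOW_SECONDS = 5 * 60  # 1 regen per session_id per 5 minutes


class RegenRateLimiter:
    """Allow at most one regen per session_id inside the window (in-process only)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._last_regen = {}

    def allow(self, session_id: str) -> bool:
        now = self._clock()
        last = self._last_regen.get(session_id)
        if last is not None and now - last < REGEN_WINDOW_SECONDS:
            return False
        self._last_regen[session_id] = now
        return True
```

What the result page should show when `allow` returns False is not covered by this sketch and remains a design decision.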


DEPENDENCIES

| Dependency | Role |

|-----------|------|

| app/api/verdict/route.ts | Regen trigger point — cache miss path |

| app/lib/verdictCache.ts | cacheVerdict() — must be called after regen |

| oracle_toll.py (port 8889) | Provides /cache/{session_id} GET/POST |

| Gemini API (gemini-2.5-flash) | Regen generation target |

| oracle_log.jsonl | Divergence event log (GAP-01 of SPEC_ORACLE_VERDICT_PIPELINE) |

| oracle_email_service.py (port 8006) | Divergence notification delivery |

| SPEC_ORACLE_VERDICT_PIPELINE.md | Parent pipeline spec |


DEPENDENTS

| Dependent | Dependency |

|-----------|-----------|

| Customer trust | Consistency between email and result page |

| Revenue reputation | A GREEN→RED divergence on result page for a paid customer is a brand event |

| oracle_log.jsonl audit trail | Regen events must be logged for forensics |

| SPEC_ORACLE_SMOKE_TEST.md | Smoke test must include a cache-miss regen path validation |


GAPS IDENTIFIED DURING SPECIFICATION

| Gap ID | Description | Impact |

|--------|-------------|--------|

| REGEN-GAP-01 | Original webhook generation config (temperature setting) unspecified — may not match regen config | Unknown divergence baseline |

| REGEN-GAP-02 | Original verdict not recoverable without oracle_log.jsonl (PIPELINE GAP-01) | Divergence check blind |

| REGEN-GAP-03 | No rate limiting on regen calls per session_id | Cost exposure on repeated page loads |

| REGEN-GAP-04 | Retry logic for divergence notification email not specified | Silent failure on notification path |

| REGEN-GAP-05 | No metric tracking cache hit vs. regen frequency | Operational visibility gap |

| REGEN-GAP-06 | Gemini infrastructure variance (temperature=0 does not guarantee bit-identical output) | Residual divergence rate unknown |


REFERENCES

| File | Role |

|------|------|

| /home/nous/Aether/app/app/api/verdict/route.ts | Regen trigger — cache miss path |

| /home/nous/Aether/app/app/lib/verdictCache.ts | Cache R/W |

| /home/nous/oracle_toll.py | Cache service |

| /home/nous/oracle_email_service.py | Divergence notification delivery |

| /home/nous/memories/SPEC_ORACLE_VERDICT_PIPELINE.md | Parent pipeline spec (GAP-08 source) |


Φζ.⊤. The regen path must be held as tight as the original.



— SPEC_ORACLE_SILENT_DROP.md —

SPEC_ORACLE_SILENT_DROP — Oracle Silent-Drop Customer Notification

Version: 1.0 | Status: AUTHORIZED | Authority: α.13 | Date: 2026-04-16


PURPOSE

When a Stripe checkout.session.completed event arrives at /api/webhook with payment_status === "paid" but a missing or invalid query or tier field, the pipeline silently exits without processing. The customer has paid but receives nothing: no verdict, no email, no explanation. This is a silent drop.

This specification defines:

  1. Detection criteria — what constitutes a silent drop
  2. Customer notification — what the customer receives and when
  3. Resolution logic — refund trigger vs. re-attempt logic
  4. Crew alert broadcast — how the crew is notified

The silent-drop notification subsystem is a child component of the Oracle Verdict Pipeline (see SPEC_ORACLE_VERDICT_PIPELINE.md, GAP-05). It does not replace the main pipeline — it fires only when the main pipeline cannot proceed due to missing session data.


INPUTS

Primary trigger — Webhook silent-drop condition

Fires when ALL of the following are true:

  1. Stripe event type is checkout.session.completed
  2. session.payment_status === "paid" (customer's card was charged)
  3. At least one of:

- query resolves to empty string (both custom_fields[key="idea"].text.value and metadata.q0…qN are absent or blank)

- tier resolves to empty string (session.metadata.tier absent or blank)

- tier is present but not a member of { "quick", "full", "strategy" } (invalid tier key)

Code location: /home/nous/Aether/app/app/api/webhook/route.ts lines 205–208:


if (!query || !tier || !VERDICT_PROMPT[tier]) {
  console.error("Webhook: missing query or tier in session", sessionId);
  return NextResponse.json({ received: true });
}

This is the exact gap. The block returns {received: true} with no customer action.
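
The trigger condition can be restated as a predicate over the session object. The Python sketch below is illustrative only (the webhook route is TypeScript): `resolve_query` and `is_silent_drop` are hypothetical names, the session is treated as a plain dict, preferring the custom field over metadata chunks is an assumption, and the event-type check is assumed to happen in the caller.

```python
VALID_TIERS = {"quick", "full", "strategy"}


def resolve_query(session: dict) -> str:
    """Return the customer query from the "idea" custom field or metadata chunks."""
    for field in session.get("custom_fields") or []:
        if field.get("key") == "idea":
            value = ((field.get("text") or {}).get("value") or "").strip()
            if value:
                return value
    metadata = session.get("metadata") or {}
    chunks = []
    index = 0
    while f"q{index}" in metadata:  # q0, q1, ... reassembled in order
        chunks.append(metadata[f"q{index}"])
        index += 1
    return "".join(chunks).strip()


def is_silent_drop(session: dict) -> bool:
    """True when a paid session lacks a usable query or a valid tier."""
    if session.get("payment_status") != "paid":
        return False  # unpaid sessions are out of scope
    tier = ((session.get("metadata") or {}).get("tier") or "").strip()
    return not resolve_query(session) or tier not in VALID_TIERS
```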

Available session data at drop point

The following fields are available to the notification subsystem at the moment of silent-drop detection:

  • session.id — Stripe session ID (always present)
  • session.customer_details.email — customer email (may be null — see FAILURE MODES)
  • session.customer_email — fallback email field (may be null)
  • session.metadata.tier — tier attempted (may be blank/invalid)
  • session.amount_total — amount charged in cents (present for paid sessions)
  • session.currency — currency code (e.g., "cad")
  • session.created — Unix timestamp of session creation

Required environment variables (inherited from parent pipeline)

  • ORACLE_EMAIL_SERVICE_URL — email delivery service endpoint
  • STRIPE_SECRET_KEY — for optional refund API calls
  • ORACLE_TOLL_URL — for crew alert broadcast

OUTPUTS

Output 1 — Customer notification email

Fires when: silent-drop detected AND customerEmail is non-null.

Sender: oracle@42sisters.ai

Subject: We received your payment — please reply with your question

Body (plain text):


Hi there,

We received your payment but couldn't process your submission — something was missing from the session when it arrived on our end.

This is our error, not yours.

To get your Oracle verdict, please reply to this email with:
1. Your question or idea (the submission you intended to send)
2. The tier you selected: Quick Take ($1), Full Breakdown ($5), or Strategy Session ($25)

We'll process your verdict manually and send it within 24 hours at no additional charge.

If you'd prefer a refund instead, just say so in your reply — we'll process it immediately.

We're sorry for the friction. We hold ourselves to a higher standard.

— 42 Sisters AI
oracle@42sisters.ai

Delivery path: POST {ORACLE_EMAIL_SERVICE_URL}/send-verdict-email is NOT used for this notification (it requires a complete verdict object). A separate endpoint or direct send_graph_email.py call is required.

[GAP-SD-01 — no /send-notification-email endpoint exists in oracle_email_service.py; delivery path requires new endpoint or direct Graph API call]

Output 2 — Crew alert broadcast

Fires unconditionally when silent-drop detected (regardless of whether customer email is available).

Target: ~/ALERT.log and ~/CREW_CHANNEL

Format:


[SILENT-DROP] session={session_id} tier="{tier_value}" query_len={len(query)} email={email_or_NULL} amount={amount_total}_{CURRENCY} {timestamp_utc}

Example:


[SILENT-DROP] session=cs_live_abc123 tier="" query_len=0 email=customer@example.com amount=100_CAD 2026-04-16T14:22:01Z

[GAP-SD-02 — no broadcast mechanism currently exists in the webhook route; Northflank logs only; crew_channel write requires a new outbound call or sidecar service]
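
For reference, the alert line could be assembled as follows. `format_silent_drop_alert` is a hypothetical helper operating on a dict-shaped session; rendering the amount as amount_total in cents followed by the upper-cased currency is inferred from the example line. How the line reaches ~/ALERT.log and CREW_CHANNEL is the open question in GAP-SD-02 and is not addressed here.

```python
from datetime import datetime, timezone
from typing import Optional


def format_silent_drop_alert(session: dict, query: str, now: Optional[datetime] = None) -> str:
    """Build the [SILENT-DROP] line from the fields available at the drop point."""
    moment = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    details = session.get("customer_details") or {}
    email = details.get("email") or session.get("customer_email") or "NULL"
    tier = (session.get("metadata") or {}).get("tier") or ""
    currency = (session.get("currency") or "").upper()
    amount = f"{session.get('amount_total')}_{currency}"  # cents, then currency code
    stamp = moment.strftime("%Y-%m-%dT%H:%M:%SZ")
    session_id = session["id"]
    return (
        f"[SILENT-DROP] session={session_id} tier=\"{tier}\" "
        f"query_len={len(query)} email={email} amount={amount} {stamp}"
    )
```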

Output 3 — Stripe refund (conditional — not automatic)

Does NOT fire automatically. Refund is initiated by a crew member in response to the crew alert (Output 2) after evaluating whether re-attempt is possible.

Re-attempt logic:

  • If customer email is available AND customer replies with valid query + tier → process verdict manually → no refund
  • If customer email is NOT available → escalate to NOUS; Stripe dashboard refund initiated manually
  • If customer does not reply within 72 hours → initiate Stripe refund via stripe.refunds.create({ payment_intent: session.payment_intent })

[GAP-SD-03 — refund trigger is manual; no automated 72-hour timer or refund script currently exists]
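
The re-attempt rules can be captured as a decision helper that only recommends an action to the crew. This is a sketch under that assumption: `Resolution` and `recommend_resolution` are hypothetical names, and the function deliberately makes no Stripe call, so a human still executes any refund.

```python
from enum import Enum

REPLY_DEADLINE_HOURS = 72


class Resolution(Enum):
    PROCESS_MANUALLY = "process_manually"        # valid reply received: no refund
    ESCALATE_NO_EMAIL = "escalate_no_email"      # no email: escalate, manual refund
    RECOMMEND_REFUND = "recommend_refund"        # no reply within the deadline
    AWAIT_REPLY = "await_reply"                  # still inside the reply window


def recommend_resolution(has_email: bool, reply_is_valid: bool,
                         hours_since_notice: float) -> Resolution:
    """Map the re-attempt rules to a recommendation for crew review."""
    if not has_email:
        return Resolution.ESCALATE_NO_EMAIL
    if reply_is_valid:
        return Resolution.PROCESS_MANUALLY
    if hours_since_notice >= REPLY_DEADLINE_HOURS:
        return Resolution.RECOMMEND_REFUND
    return Resolution.AWAIT_REPLY
```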


INVARIANTS

  1. Payment confirmation precondition — The silent-drop notification subsystem MUST NOT fire for any session where session.payment_status !== "paid". An unpaid dropped session is not a customer harm event; only confirmed-paid drops are in scope.
  1. No double-notification — If a silent-drop notification email has been sent for a given session_id, no second notification email is sent for the same session_id. The subsystem maintains a deduplication log keyed on session_id.

[GAP-SD-04 — no deduplication store currently exists; Stripe can deliver webhooks more than once]

  1. Crew alert unconditional — A crew alert to ALERT.log and CREW_CHANNEL fires for EVERY silent drop, even when customer email is null and no customer notification can be sent. No silent drop is invisible to the crew.
  1. No verdict fabrication on re-attempt — When a crew member manually processes a re-attempt from a customer reply, the verdict is generated by the standard Gemini pipeline (same VERDICT_PROMPT[tier] prompts, same generation path). No manual verdict substitution is permitted.
  1. SOS v2 enforcement on notification email — The customer notification email MUST NOT contain LATTICE symbols, crew callsigns (κ, ι, ε, α, γ, μ, etc.), or any internal system language. Plain English only. The same filter requirement as the verdict email applies. [GAP-SD-05 — no automated content filter; relies on template discipline only]
  1. Refund path is human-authorized — Automatic refund issuance is FORBIDDEN without crew review. The Stripe API key can issue refunds; this power MUST NOT be exercised by automated code without a logged human authorization decision. Refund execution is PERMITTED for C.L.O.D. after crew review; it is not autonomous.
  1. Amount capture for audit — Every silent-drop log entry MUST record the session.amount_total and session.currency. This is required for financial reconciliation in the event of a refund or re-attempt.

VERIFICATION CRITERIA

Σ.✓ conditions — subsystem is operating correctly when:

  1. Drop detection fires — When a crafted test webhook payload with payment_status: "paid" and missing query field is delivered to /api/webhook, the silent-drop branch executes AND at minimum writes to Northflank logs. Verified by: Northflank log inspection showing [SILENT-DROP] prefix (or equivalent) rather than a generic console.error.

[GAP-SD-06 — current code only logs console.error("Webhook: missing query or tier in session", sessionId) with no [SILENT-DROP] prefix; not distinguishable in log search]

  1. Customer email sent — When a test silent-drop fires with a valid customer email, oracle@42sisters.ai sends the notification email to that address within 60 seconds. Verified by: checking the recipient inbox and oracle_email.log for a 200 OK on the notification endpoint.
  1. Crew alert appears — Within 60 seconds of a silent drop, a [SILENT-DROP] entry appears in ~/ALERT.log with session_id, tier value, and email field populated (or NULL). Verified by: tail ~/ALERT.log after test drop.
  1. Deduplication holds — Delivering the same webhook event twice (Stripe retry simulation) results in only one customer notification email and one ALERT.log entry. Verified by: replaying the same webhook payload twice and confirming single log entry and single email delivery.
  1. Re-attempt produces correct verdict — When a customer reply is received after a silent drop, the manual re-attempt using the standard VERDICT_PROMPT[tier] pipeline generates a verdict matching the tier schema. Verified by: end-to-end test with a known query and tier, confirming verdict JSON structure matches SPEC_ORACLE_VERDICT_PIPELINE.md output schemas.
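Several of the criteria above depend on a searchable `[SILENT-DROP]` entry that carries the session_id, tier, email (or NULL), and the amount fields required for audit. One possible line format is sketched below. The function name `format_silent_drop` and the `key=value` layout are assumptions, not an existing format; the timestamp style follows the ALERT.log convention used elsewhere in these specs.

```python
from datetime import datetime, timezone
from typing import Optional


def format_silent_drop(session: dict, now: Optional[datetime] = None) -> str:
    """Build one greppable log line for a silent drop (hypothetical layout)."""
    now = now or datetime.now(timezone.utc)
    details = session.get("customer_details") or {}
    # Missing values are written as the literal NULL so the field is always present.
    email = details.get("email") or session.get("customer_email") or "NULL"
    tier = (session.get("metadata") or {}).get("tier") or "NULL"
    stamp = now.strftime("%Y-%m-%dT%H:%M:%SZ")
    return (
        f"[{stamp}] [SILENT-DROP] "
        f"session_id={session.get('id')} tier={tier} email={email} "
        f"amount_total={session.get('amount_total')} currency={session.get('currency')}"
    )
```

Because the tag is a fixed string, `grep "\[SILENT-DROP\]"` over either ALERT.log or the Northflank console output would isolate drop events from generic errors.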

FAILURE MODES

  1. Σ.⊠ Customer email is null — session.customer_details.email and session.customer_email are both null. This occurs when Stripe is configured without email collection (guest checkout without email). No customer notification can be sent. Silent drop becomes permanently invisible to the customer. Mitigation: crew alert still fires; NOUS or crew member initiates Stripe dashboard refund manually. Risk: without email, there is no customer recovery path.

[KNOWN EDGE CASE — mitigation requires Stripe configuration change to enforce email collection on checkout]

  1. Σ.⊠ Notification email service unreachable — ORACLE_EMAIL_SERVICE_URL endpoint is down when silent-drop fires. Customer notification cannot be sent. Mitigation: crew alert still fires; crew member sends manual email via send_graph_email.py. No automated retry for notification email.

[GAP-SD-07 — no retry mechanism on notification email delivery failure; mirrors GAP-03 from parent spec]

  1. Σ.⊠ Duplicate webhook delivery causes double notification — Stripe retries webhooks on non-2xx responses and can deliver the same event multiple times. Without a deduplication store, the same customer receives two notification emails for one silent drop. This damages trust. Mitigation: deduplication log keyed on session_id must be implemented before production deployment.

[GAP-SD-04 — deduplication store not yet implemented]

  1. Σ.⊠ Silent drop on invalid tier (malformed metadata) — session.metadata.tier is present but contains an unrecognized value (e.g., "premium", "basic", or a corrupted string). The current gate !VERDICT_PROMPT[tier] catches this. However, the customer notification email must communicate what tier was attempted. If the tier value is garbage, the notification cannot tell the customer which tier to re-confirm. Mitigation: notification email asks customer to specify tier from the valid list; does not try to reconstruct from the corrupt value.
  1. Σ.⊠ ALERT.log write fails — /home/nous/ALERT.log is on a full disk or permission-denied. Crew alert silently fails. Silent drop becomes invisible to the crew. Mitigation: the ALERT.log write should be wrapped with a fallback to stderr/Northflank log. If ALERT.log write fails, Northflank console log (always available) must still record the [SILENT-DROP] event.

[GAP-SD-08 — ALERT.log write has no fallback; disk-full scenario leaves crew blind]

  1. Σ.⊠ Customer reply not monitored — The notification email asks the customer to reply to oracle@42sisters.ai. If oracle_inbox_watch.py is not running or the inbox is not monitored, customer replies are lost. The re-attempt loop never fires. Customer waits indefinitely. Mitigation: verify oracle_inbox_watch.py is active as part of the silent-drop notification deployment. Link re-attempt intake to a logged TASK_QUEUE.md entry for manual crew pickup.

[GAP-SD-09 — oracle_inbox_watch.py existence confirmed at /home/nous/oracle_inbox_watch.py; operational status and reply routing to TASK_QUEUE not verified]
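The ALERT.log failure mode above calls for a write that falls back to stderr so the console log still records the event. A minimal sketch of that wrapper follows. `write_alert` is a hypothetical name, and the sketch assumes stderr is what the hosting platform captures as console output.

```python
import sys
from pathlib import Path


def write_alert(line: str, alert_path: Path) -> bool:
    """Append one line to the alert log; on any OS-level failure, emit it on stderr.

    Returns True when the file write succeeded, False when the fallback was used.
    """
    try:
        with alert_path.open("a", encoding="utf-8") as fh:
            fh.write(line.rstrip("\n") + "\n")
        return True
    except OSError as exc:
        # Covers disk-full, permission-denied, and missing-directory cases.
        print(f"{line} (ALERT.log write failed: {exc})", file=sys.stderr)
        return False
```

Returning a boolean rather than raising keeps the webhook handler on its normal path while still letting the caller record that the primary alert target was unavailable.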


GAPS IDENTIFIED DURING SPECIFICATION

| Gap ID | Description | Impact | Priority |

|--------|-------------|--------|----------|

| GAP-SD-01 | No /send-notification-email endpoint in oracle_email_service.py; notification email has no delivery path | Customer notification cannot fire without code change | CRITICAL |

| GAP-SD-02 | No CREW_CHANNEL write in webhook route; silent drops are invisible outside Northflank logs | Crew alert unreliable; crew cannot respond to drops in real time | HIGH |

| GAP-SD-03 | No 72-hour refund timer or automated refund trigger | Refund requires manual crew action; customers may wait indefinitely | HIGH |

| GAP-SD-04 | No session_id deduplication store for notification emails | Duplicate Stripe delivery causes double customer notification | HIGH |

| GAP-SD-05 | No automated content filter on notification email body | Internal language leak risk; relies on template discipline | MEDIUM |

| GAP-SD-06 | console.error log prefix not distinguishable from other errors; no [SILENT-DROP] tag | Log search cannot isolate drop events; monitoring blind | MEDIUM |

| GAP-SD-07 | No retry on notification email delivery failure | Notification lost if email service momentarily down at drop time | MEDIUM |

| GAP-SD-08 | ALERT.log write has no fallback for disk-full or permission error | Crew alert can silently fail | MEDIUM |

| GAP-SD-09 | oracle_inbox_watch.py operational status and reply-to-TASK_QUEUE routing unverified | Re-attempt loop may never fire; customer stuck | HIGH |


DEPENDENCIES

| Dependency | Role | Required / Degraded |

|------------|------|---------------------|

| /api/webhook route (webhook/route.ts) | Parent pipeline; contains the silent-drop detection branch | Required — this spec is a child of that route |

| oracle_email_service.py (port 8006) | Email delivery (needs new /send-notification-email endpoint) | Required for customer notification |

| send_graph_email.py | Graph API email fallback | Required for email delivery |

| ~/ALERT.log | Crew alert target | Required for crew visibility |

| ~/CREW_CHANNEL | Secondary crew broadcast | Required for crew visibility |

| oracle_inbox_watch.py | Customer reply intake for re-attempt loop | Required for re-attempt path |

| Stripe Refunds API | Refund issuance | Required for refund path |


DEPENDENTS

| Dependent | Dependency type |

|-----------|----------------|

| Customer trust | Direct — silent drops with no notification destroy trust faster than any verdict error |

| Revenue integrity | Refund issuance depends on this spec's resolution logic |

| SPEC_ORACLE_VERDICT_PIPELINE.md GAP-05 | This spec is the resolution of that gap |


REFERENCES

| File | Role |

|------|------|

| /home/nous/Aether/app/app/api/webhook/route.ts lines 205–208 | Exact drop point — the if (!query \|\| !tier) branch |

| /home/nous/oracle_email_service.py | Email delivery service — needs /send-notification-email endpoint |

| /home/nous/oracle_inbox_watch.py | Customer reply monitoring for re-attempt intake |

| /home/nous/send_graph_email.py | Direct Graph API email sender |

| /home/nous/ALERT.log | Crew alert target |

| /home/nous/memories/SPEC_ORACLE_VERDICT_PIPELINE.md (GAP-05) | Parent spec; this spec resolves that gap |


Φζ.⊤.


Jeremy Zlabis

Chronogeometer · Visionary · Disruptor · Chief

42 Sisters AI · East York, Toronto

🍁 Φ 0.042

— SPEC_ORACLE_SMOKE_TEST.md —

SPEC_ORACLE_SMOKE_TEST — Oracle End-to-End Smoke Test

Version: 1.0 | Status: AUTHORIZED | Authority: α.13 | Date: 2026-04-16


PURPOSE

The Oracle verdict pipeline (Stripe webhook → session parsing → Gemini call → verdict cache →

email delivery) has no automated end-to-end validation. Pipeline health is currently only known

when a customer complaint arrives or when NOUS manually tests via a real payment. This is

unacceptable for a revenue-bearing production system.

This specification defines a repeatable, automated smoke test covering the full Oracle pipeline.

The smoke test uses a known Stripe test session, a fixed test query, and validates each pipeline

stage independently before asserting the full path. It is the minimal executable proof that the

pipeline is operational.

Smoke test ≠ load test ≠ unit test. The smoke test runs at the integration boundary. It

exercises real external dependencies (Stripe test mode, Gemini API, oracle_toll cache service,

email service health endpoint). It is not a substitute for the unit test suites that verify

individual components.

Source gap: SPEC_ORACLE_VERDICT_PIPELINE.md GAP-10.


INPUTS

Fixed Test Configuration

| Parameter | Value |

|----------|-------|

| Test tier | quick ($1.00 CAD) |

| Test query | "Is the Oracle pipeline operational? This is an automated smoke test." |

| Stripe mode | Test mode (API key: STRIPE_SECRET_KEY with sk_test_ prefix) |

| Test email | oracle-smoke-test@42sisters.ai (synthetic, not a real mailbox — used for log assertion only) |

| Expected verdict | Any valid value (GREEN/AMBER/RED/NULL) — smoke test does not assert verdict correctness, only structural validity |

| Timeout per stage | 30 seconds maximum |

The test query is intentionally meta — it asks about pipeline health. This makes it easy to identify

smoke test verdicts in logs and distinguish them from real customer verdicts.

Required Environment

  • STRIPE_SECRET_KEY — test mode key (sk_test_...)
  • STRIPE_WEBHOOK_SECRET — test mode webhook secret
  • GEMINI_API_KEY or GOOGLE_API_KEY
  • ORACLE_TOLL_URL — oracle_toll service (default: http://68.183.206.103:8889)
  • ORACLE_EMAIL_SERVICE_URL — email service (default: http://68.183.206.103:8006)
  • All of the variables listed above must be present (either Gemini key name satisfies the Gemini requirement), or the smoke test fails at the environment check stage.

Smoke Test Invocation


# Manual
python3 /home/nous/scripts/oracle_smoke_test.py

# On deploy (Northflank post-deploy hook)
python3 /home/nous/scripts/oracle_smoke_test.py --on-deploy

# Daily cron (07:00 UTC)
python3 /home/nous/scripts/oracle_smoke_test.py --cron

Exit codes: 0 = all stages passed | 1 = one or more stages failed | 2 = environment check failed
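The invocation modes and exit codes above could be wired up roughly as follows. This is a sketch only: `parse_args` and `exit_code` are hypothetical names, and treating `--on-deploy` and `--cron` as mutually exclusive is an assumption the spec does not state.

```python
import argparse
from typing import Dict, List, Optional


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    parser = argparse.ArgumentParser(prog="oracle_smoke_test.py")
    # Assumption: a run is either deploy-triggered, cron-triggered, or manual (no flag).
    mode = parser.add_mutually_exclusive_group()
    mode.add_argument("--on-deploy", action="store_true", help="run from the post-deploy hook")
    mode.add_argument("--cron", action="store_true", help="run from the daily cron")
    return parser.parse_args(argv)


def exit_code(env_ok: bool, stage_passed: Dict[str, bool]) -> int:
    """Map results to the documented codes: 0 all passed, 1 stage failure, 2 env failure."""
    if not env_ok:
        return 2
    return 0 if all(stage_passed.values()) else 1
```

A script entry point would then end with something like `sys.exit(exit_code(env_ok, results))`, so the deploy hook and cron wrapper can act on the code alone.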


OUTPUTS

Primary: Smoke Test Report (stdout + log file)

Written to /home/nous/logs/oracle_smoke_YYYYMMDD_HHMMSS.log:


ORACLE SMOKE TEST — 2026-04-16T07:00:00Z
==========================================

[STAGE 1] Environment check ....................... PASS
[STAGE 2] oracle_toll health ..................... PASS
[STAGE 3] oracle_email_service health ............. PASS
[STAGE 4] Stripe test session creation ........... PASS  session_id=cs_test_abc123
[STAGE 5] Webhook simulation ..................... PASS  verdict=GREEN
[STAGE 6] Cache write verification ............... PASS  GET /cache/cs_test_abc123 → 200
[STAGE 7] Verdict route retrieval ................ PASS  tier=quick query_match=true
[STAGE 8] Email service send validation .......... PASS  status=sent
[STAGE 9] TMM crosscheck on test verdict ......... PASS  C=0.9953 approved=true  [conditional on GAP-09 fix]
[STAGE 10] Cache cleanup ......................... PASS  cs_test_abc123 deleted

RESULT: 10/10 PASS — Pipeline operational.
Duration: 18.3s

On failure:


[STAGE 5] Webhook simulation ..................... FAIL
  Error: Gemini returned non-parseable JSON after 3 attempts
  Payload: { "error": "quota_exceeded" }

RESULT: 4/10 PASS — Pipeline DEGRADED. See /home/nous/logs/oracle_smoke_20260416_*.log

Secondary: CREW_CHANNEL broadcast

On completion (pass or fail):


[SMOKE] Oracle pipeline: 10/10 PASS (18.3s) | 2026-04-16T07:00:22Z
[SMOKE] Oracle pipeline: 4/10 FAIL — Stage 5 Gemini quota | 2026-04-16T07:00:22Z

Tertiary: ALERT.log entry on failure

If any stage fails, an entry is appended to /home/nous/ALERT.log:


[2026-04-16T07:00:22Z] ORACLE SMOKE FAIL — Stage 5 (webhook simulation) — Gemini quota exceeded.
Check /home/nous/logs/oracle_smoke_20260416_070022.log

STAGE DEFINITIONS

Stage 1 — Environment Check

Verify all required env vars are set and non-empty. Verify Stripe key has sk_test_ prefix

(production key in smoke test is a configuration error). Verify oracle_toll URL and email service

URL are reachable (TCP connect check, not full HTTP).
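A sketch of the Stage 1 checks, assuming the variable names from the Required Environment list and the FATAL message quoted in the invariants. `check_environment` and `tcp_reachable` are hypothetical helpers. Treating the two service URLs as required even though defaults are documented is also an assumption.

```python
import socket
from typing import List, Mapping
from urllib.parse import urlsplit

REQUIRED = ("STRIPE_SECRET_KEY", "STRIPE_WEBHOOK_SECRET", "ORACLE_TOLL_URL", "ORACLE_EMAIL_SERVICE_URL")
GEMINI_KEYS = ("GEMINI_API_KEY", "GOOGLE_API_KEY")  # either one satisfies the requirement


def check_environment(env: Mapping[str, str]) -> List[str]:
    """Return a list of problems; an empty list means the environment check passes."""
    problems = [f"{name} is missing or empty" for name in REQUIRED if not env.get(name, "").strip()]
    if not any(env.get(name, "").strip() for name in GEMINI_KEYS):
        problems.append("GEMINI_API_KEY or GOOGLE_API_KEY is missing or empty")
    key = env.get("STRIPE_SECRET_KEY", "")
    if key and not key.startswith("sk_test_"):
        problems.append("FATAL: production Stripe key in smoke test — aborting.")
    return problems


def tcp_reachable(url: str, timeout_s: float = 5.0) -> bool:
    """TCP connect check only; no HTTP request is made."""
    parts = urlsplit(url)
    port = parts.port or (443 if parts.scheme == "https" else 80)
    try:
        with socket.create_connection((parts.hostname, port), timeout=timeout_s):
            return True
    except OSError:
        return False
```

Called as `check_environment(os.environ)`, any non-empty result would map to exit code 2 before a single Stripe API call is made.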

Stage 2 — oracle_toll Health

GET {ORACLE_TOLL_URL}/health → HTTP 200, JSON with status: "resonant" and phi: 0.042.

Timeout: 10 seconds.

Stage 3 — oracle_email_service Health

GET {ORACLE_EMAIL_SERVICE_URL}/health → HTTP 200, JSON with status: "ok".

Timeout: 10 seconds.

Stage 4 — Stripe Test Session Creation

Call Stripe API (test mode) to create a checkout.session with:

  • mode: "payment", amount: 100 (cents CAD), currency: "cad"
  • metadata: { tier: "quick", q0: "<test_query>", qn: "1" }
  • customer_email: "oracle-smoke-test@42sisters.ai"

Assert: session.id is returned and starts with cs_test_. Store as smoke_session_id.

Stage 5 — Webhook Simulation

Construct a checkout.session.completed event payload for smoke_session_id.

Sign it with STRIPE_WEBHOOK_SECRET using the Stripe webhook signing algorithm.

POST to /api/webhook on the deployed Northflank instance.

Assert: HTTP 200, response body { received: true }.
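For the signing step, Stripe's documented scheme is an HMAC-SHA256 over the string `{timestamp}.{payload}`, keyed with the endpoint's webhook secret and sent in the `Stripe-Signature` header as `t=<timestamp>,v1=<hex digest>`. The sketch below builds such a header with the standard library. `build_event` and `stripe_signature_header` are hypothetical helper names, and the event id is a placeholder rather than a real Stripe event.

```python
import hashlib
import hmac
import json
import time
from typing import Optional


def build_event(session: dict) -> bytes:
    """Serialize a minimal checkout.session.completed event for the given session."""
    event = {
        "id": "evt_smoke_test",  # hypothetical placeholder id
        "type": "checkout.session.completed",
        "data": {"object": session},
    }
    return json.dumps(event, separators=(",", ":")).encode("utf-8")


def stripe_signature_header(payload: bytes, secret: str, timestamp: Optional[int] = None) -> str:
    """Return a Stripe-Signature header value for the exact payload bytes."""
    ts = int(time.time()) if timestamp is None else timestamp
    signed_payload = str(ts).encode("utf-8") + b"." + payload
    digest = hmac.new(secret.encode("utf-8"), signed_payload, hashlib.sha256).hexdigest()
    return f"t={ts},v1={digest}"
```

The bytes that are signed must be the bytes that are POSTed. Re-serializing the JSON between signing and sending changes the digest and the webhook route would reject the request. Stripe's libraries also reject timestamps outside a tolerance window, so the header should be generated immediately before the POST.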

Then poll GET {ORACLE_TOLL_URL}/cache/{smoke_session_id} for up to 30 seconds, stopping at the first 200

(verdict is cached). If no 200 arrives within 30 seconds: FAIL Stage 5.

On 200: parse verdict JSON. Assert: tier === "quick", verdict.verdict is one of

GREEN/AMBER/RED/NULL, verdict.summary is a non-empty string.
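The poll-then-validate logic of Stage 5 can be sketched without committing to an HTTP client by passing the cache lookup in as a callable. `poll_until` and `validate_quick_verdict` are hypothetical names. The sketch assumes the cache entry is a JSON object with `tier` and a nested `verdict` object, and that NULL is delivered as the string "NULL"; neither is confirmed by this spec.

```python
import time
from typing import Callable, List, Optional

VALID_VERDICTS = {"GREEN", "AMBER", "RED", "NULL"}


def poll_until(
    fetch: Callable[[], Optional[dict]],
    timeout_s: float = 30.0,
    interval_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[dict]:
    """Call fetch() until it returns a dict or the timeout elapses; None means timeout."""
    deadline = time.monotonic() + timeout_s
    while True:
        result = fetch()
        if result is not None:
            return result
        if time.monotonic() + interval_s > deadline:
            return None
        sleep(interval_s)


def validate_quick_verdict(entry: dict) -> List[str]:
    """Structural checks only; verdict correctness is deliberately not asserted."""
    errors = []
    if entry.get("tier") != "quick":
        errors.append("tier is not 'quick'")
    verdict = entry.get("verdict")
    if not isinstance(verdict, dict):
        return errors + ["verdict is not an object"]
    if verdict.get("verdict") not in VALID_VERDICTS:
        errors.append("verdict.verdict is not one of GREEN/AMBER/RED/NULL")
    summary = verdict.get("summary")
    if not isinstance(summary, str) or not summary.strip():
        errors.append("verdict.summary is empty")
    return errors
```

In the real script, `fetch` would wrap the GET against the cache endpoint and return the parsed body on 200 and `None` otherwise.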

Stage 6 — Cache Write Verification

GET {ORACLE_TOLL_URL}/cache/{smoke_session_id} → HTTP 200.

Assert: response JSON has tier: "quick" and cached_at field (ISO timestamp).

Assert: query field matches the known test query string.

Stage 7 — Verdict Route Retrieval

GET {NORTHFLANK_BASE_URL}/api/verdict?session_id={smoke_session_id}

Assert: HTTP 200. Response JSON has tier: "quick", verdict.verdict is valid, query matches.

This exercises the full result-page backend path including cache lookup.

Stage 8 — Email Service Send Validation

POST to {ORACLE_EMAIL_SERVICE_URL}/send-verdict-email with:


{
  "customer_email": "oracle-smoke-test@42sisters.ai",
  "tier": "quick",
  "query": "<test_query>",
  "verdict": <verdict_from_stage_5>
}

Assert: HTTP 200, { status: "sent" }.

Note: This sends a real Graph API email to oracle-smoke-test@42sisters.ai. If this address is

not a real mailbox, Graph API may return 202 (accepted) or error. Assert on HTTP 200 from the

service (Graph API downstream behavior is not asserted here). [GAP — smoke test email goes to a

synthetic address; Graph API may bounce; bounce handling not specified]

Stage 9 — TMM Crosscheck on Test Verdict (conditional)

If SPEC_ORACLE_TMM_CROSSCHECK.md is implemented: call oracleTMMCrosscheck() directly on the

cached verdict. Assert: approved: true, coherence_score >= 0.97404.

[GAP — conditional on GAP-09 fix; Stage 9 is SKIPPED if crosscheck module is not yet deployed]

Stage 10 — Cache Cleanup

DELETE {ORACLE_TOLL_URL}/cache/{smoke_session_id} (requires adding DELETE endpoint to

oracle_toll.py — currently only GET and POST exist).

[GAP — DELETE endpoint not implemented on oracle_toll.py; cache cleanup currently requires manual

file deletion from oracle_verdicts/]

Assert: HTTP 200 or 204. If DELETE not implemented: log WARNING, do not fail; leave cleanup note

in smoke log.


INVARIANTS

  1. Smoke test uses test-mode credentials only — STRIPE_SECRET_KEY MUST have sk_test_ prefix.

A production key in the smoke test environment is a configuration error that triggers Stage 1

FAIL with message "FATAL: production Stripe key in smoke test — aborting."

  1. Smoke test does not modify production state — Smoke test verdicts are tagged with regen: false

and smoke: true flag in the cache payload. This allows operators to distinguish smoke test

cache entries from real customer entries. The smoke: true flag is added by the smoke test

script when it calls POST /cache/{smoke_session_id} directly (bypass path) if Stage 5 fails.

  1. No real customer email is sent — Smoke test email target is oracle-smoke-test@42sisters.ai.

Real customer email addresses MUST NOT appear in smoke test configuration.

  1. Smoke test is idempotent — Running the smoke test twice back-to-back produces the same pass/fail

state. Stage 10 (cleanup) ensures no stale entries contaminate subsequent runs. If Stage 10 fails,

Stage 4 of the next run uses a fresh smoke_session_id (Stripe always generates unique IDs).

  1. Failure in any stage does not cascade — Each stage has an independent timeout and try/except

boundary. A Stage 5 timeout does not prevent Stages 6-10 from attempting (some may succeed

partially; their results are noted). RESULT is computed from the full 10-stage matrix.

  1. Smoke test runs in < 60 seconds — Total test duration must not exceed 60 seconds. If Gemini

is slow (> 30s on Stage 5 poll), Stage 5 times out and fails. This is intentional — a pipeline

that takes > 30s to generate and cache a Quick Take is operationally degraded.

  1. Log files are retained for 30 days — /home/nous/logs/oracle_smoke_*.log files are not

cleaned automatically. A cron or manual process should archive/rotate after 30 days.

[GAP — log rotation not specified]

  1. Deploy-time smoke test is blocking — When invoked with --on-deploy, the smoke test MUST

complete and return exit code before the deploy hook finishes. A deploy that cannot pass the

smoke test is a broken deploy. Northflank deploy hook must treat exit code 1 as a deploy warning.

[GAP — Northflank post-deploy hook integration not yet configured]
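The non-cascading invariant above amounts to giving each stage its own exception boundary and computing the result from the full matrix afterwards. A minimal runner along those lines is sketched here. `run_stages` and `summarize` are hypothetical names, and the sketch does not enforce the per-stage timeout, which would have to live inside each stage function or its HTTP client.

```python
from typing import Callable, List, Tuple

StageResult = Tuple[str, str, str]  # (stage name, "PASS" or "FAIL", detail)


def run_stages(stages: List[Tuple[str, Callable[[], str]]]) -> List[StageResult]:
    """Run every stage in order; a failure is recorded and the next stage still runs."""
    results: List[StageResult] = []
    for name, stage_fn in stages:
        try:
            detail = stage_fn() or ""
            results.append((name, "PASS", detail))
        except Exception as exc:  # broad on purpose: one stage must not stop the rest
            results.append((name, "FAIL", str(exc)))
    return results


def summarize(results: List[StageResult]) -> str:
    passed = sum(1 for _, status, _ in results if status == "PASS")
    state = "operational" if passed == len(results) else "DEGRADED"
    return f"{passed}/{len(results)} PASS ({state})"
```

Stages that depend on an earlier result, such as the stored smoke_session_id, would raise their own error when that value is missing, which is how a dependent failure shows up in the matrix.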


VERIFICATION CRITERIA

Σ.✓ conditions — smoke test infrastructure is operating correctly when:

  1. Green run baseline — Running smoke test against a healthy pipeline produces 10/10 PASS in

under 60 seconds. Establish this baseline immediately after implementing the test. Record

baseline duration in PLAYBOOK.md as PROVEN entry.

  1. Stage isolation — Deliberately take oracle_toll service offline. Run smoke test. Stage 2

(health check) fails. Stages 3-10 still attempt and report their independent outcomes.

Result shows 1/10 FAIL at Stage 2 with remaining stages marked SKIP or FAIL (dependent).

  1. Environment check catches misconfiguration — Set STRIPE_SECRET_KEY to a production key.

Stage 1 returns FAIL with FATAL message. Exit code 2. No Stripe API calls made.

  1. ALERT.log populated on failure — Deliberately fail Stage 5 (mock Gemini timeout). After run,

verify /home/nous/ALERT.log has a new entry timestamped within 5 seconds of smoke test completion.

  1. CREW_CHANNEL broadcast sent — After any smoke test run (pass or fail), verify

/home/nous/CREW_CHANNEL has a new [SMOKE] entry. Verified by: tail CREW_CHANNEL after run.

  1. Cron registration — crontab -l | grep oracle_smoke_test returns a line. Smoke test runs

at 07:00 UTC daily without manual intervention. Verify by checking crontab on boot.


FAILURE MODES

  1. Σ.⊠ Smoke test never runs — Cron not registered after implementation. Pipeline health is

only known when customer complains. Detection: crontab -l | grep oracle_smoke_test returns

empty. Mitigation: boot sequence check (CLAUDE.md Step 4 equivalent for Oracle) verifies cron.

  1. Σ.⊠ Stage 5 Gemini timeout — Gemini takes > 30s to respond (quota throttle, cold start,

infrastructure issue). Stage 5 fails. Real customer payments in the same window may also be

affected. Detection: smoke test ALERT.log. Mitigation: smoke test failure is an early warning

for the on-call team (NOUS) to investigate Gemini quota.

  1. Σ.⊠ Smoke test creates real charge — STRIPE_SECRET_KEY is a live key. Stage 4 creates

a real payment session that may trigger a real charge. Stage 1 guard (sk_test_ check) prevents

this, but if guard is bypassed: real charge on NOUS's Stripe account.

Detection: Stripe dashboard. Mitigation: Stage 1 hard-abort on production key is mandatory.

  1. Σ.⊠ Stage 10 cleanup fails, stale entry accumulates — oracle_toll cache fills with smoke

test entries. oracle_verdicts/ directory grows unbounded. Detection: disk usage monitoring

(not currently implemented). Mitigation: implement DELETE endpoint on oracle_toll; add disk

usage check to smoke test Stage 1.

  1. Σ.⊠ Smoke test passes but production path fails — Smoke test exercises the webhook-to-cache

path but Northflank routing is misconfigured for the live checkout flow. A customer submits a

real payment; webhook is not delivered by Stripe (not a test event). Detection: manual payment

test with non-owner email (VC-7 of SPEC_ORACLE_VERDICT_PIPELINE.md). Mitigation: smoke test

covers the path from our end; Stripe webhook delivery reliability is an external dependency.

  1. Σ.⊠ Stage 8 Graph API bounce — oracle-smoke-test@42sisters.ai does not exist as a real

mailbox. Graph API returns 202 (accepted by Exchange) but the message bounces internally. Email service

reports status: sent. Smoke test passes Stage 8. Bounce goes undetected.

Detection: Exchange admin panel. Mitigation: [GAP — create oracle-smoke-test mailbox as a real

M365 alias that routes to oracle@42sisters.ai, or accept the bounce as tolerable for smoke purposes]

  1. Σ.⊠ All stages pass but pipeline is in degraded state — Smoke test validates structural

path but does not assert response quality, latency distribution, or correctness of the verdict.

A pipeline that generates all-NULL verdicts for every query would pass the smoke test.

Detection: operational monitoring beyond smoke test scope. Mitigation: supplement with a

manual monthly review of sampled oracle_log.jsonl entries.


EXECUTION SCHEDULE

| Trigger | Frequency | Invocation | ALERT on fail? |

|---------|-----------|-----------|---------------|

| Deploy hook | Every deploy to Northflank | --on-deploy | Yes — block / warn |

| Daily cron | 07:00 UTC daily | --cron | Yes — ALERT.log + CREW_CHANNEL |

| Manual (NOUS/C.L.O.D.) | On demand | No flag | No — stdout only |


DEPENDENCIES

| Dependency | Role |

|-----------|------|

| STRIPE_SECRET_KEY (test mode) | Test session creation |

| STRIPE_WEBHOOK_SECRET (test mode) | Webhook signature construction |

| Gemini API | Stage 5 verdict generation |

| oracle_toll.py (port 8889) | Stage 2, 6, 10 (health, cache verify, cleanup) |

| oracle_email_service.py (port 8006) | Stage 3, 8 (health, email send) |

| Northflank deployed app | Stage 7 (verdict route retrieval) |

| /home/nous/ALERT.log | Failure notification |

| /home/nous/CREW_CHANNEL | Status broadcast |

| /home/nous/logs/ (directory) | Test log storage |


DEPENDENTS

| Dependent | Dependency |

|-----------|-----------|

| Oracle pipeline production health | Smoke test is the only automated end-to-end proof |

| NOUS operational awareness | ALERT.log entry on failure |

| Crew operational awareness | CREW_CHANNEL broadcast |

| Deploy confidence | --on-deploy flag provides pre-production gate |


GAPS IDENTIFIED DURING SPECIFICATION

| Gap ID | Description | Impact |

|--------|-------------|--------|

| SMOKE-GAP-01 | DELETE endpoint not implemented on oracle_toll.py — Stage 10 cleanup cannot execute | Smoke test entries accumulate in oracle_verdicts/ |

| SMOKE-GAP-02 | Northflank post-deploy hook not yet configured to call smoke test | Deploy-time validation not automated |

| SMOKE-GAP-03 | oracle-smoke-test@42sisters.ai mailbox not created — Stage 8 sends to synthetic address | Graph API bounce behavior unverified |

| SMOKE-GAP-04 | Stage 9 (TMM crosscheck) is conditional on SPEC_ORACLE_TMM_CROSSCHECK.md implementation | Crosscheck stage is skipped at launch |

| SMOKE-GAP-05 | Log rotation for /home/nous/logs/oracle_smoke_*.log not specified | Disk accumulation over time |

| SMOKE-GAP-06 | NORTHFLANK_BASE_URL env var not formalized — Stage 7 needs deployed app URL | Stage 7 requires manual config |


REFERENCES

| File | Role |

|------|------|

| /home/nous/memories/SPEC_ORACLE_VERDICT_PIPELINE.md | Parent pipeline spec (GAP-10 source) |

| /home/nous/memories/SPEC_ORACLE_TMM_CROSSCHECK.md | Stage 9 crosscheck (conditional) |

| /home/nous/oracle_toll.py | Cache service (Stages 2, 6, 10) |

| /home/nous/oracle_email_service.py | Email service (Stages 3, 8) |

| /home/nous/Aether/app/app/api/webhook/route.ts | Webhook handler (Stage 5 target) |

| /home/nous/Aether/app/app/api/verdict/route.ts | Verdict route (Stage 7 target) |

| /home/nous/ALERT.log | Failure alert destination |

| /home/nous/CREW_CHANNEL | Status broadcast destination |

| /home/nous/PLAYBOOK.md | PROVEN entry to be written after first successful baseline run |


Φζ.⊤. The ship does not sail without a working engine. The smoke test proves the engine.



— SPEC_ORACLE_SOS_FILTER.md —

SPEC_ORACLE_SOS_FILTER — Oracle SOS v2 Content Filter

Version: 1.0 | Status: AUTHORIZED | Authority: α.13 | Date: 2026-04-16


PURPOSE

The Oracle SOS v2 Content Filter is the automated enforcement mechanism for S.O.S. v2 Pillar 1 ("Show Results, Never Show Method") as applied to all Oracle verdict output before delivery to customers.

The filter scans every outbound string — verdict summaries, breakdown analyses, strategy text, and email bodies — produced by the Gemini verdict generation path, and either strips, replaces, or quarantines content containing internal CGNT-1 vocabulary, LATTICE symbols, crew callsigns, and proprietary terminology before it reaches a paying customer.

The filter IS the gap identified in SPEC_ORACLE_VERDICT_PIPELINE.md as GAP-04:

"No SOS v2 automated content filter on outbound email. LATTICE symbol leak depends entirely on prompt design."

Prompt discipline is necessary but not sufficient. Prompt drift, model updates, and Gemini non-determinism can all produce internal language in output. This filter is the structural backstop — the architectural guarantee that S.O.S. v2 Pillar 1 holds regardless of upstream prompt quality.

This filter does not apply to internal crew channels, SESSIONS.md, yield_log.md, or any ship-internal communication. It applies only to content exiting the ship toward a customer.


INPUTS

Primary input: Verdict payload (pre-delivery)

Any string or JSON object produced by the Oracle verdict pipeline that will be delivered to a customer via email or the web result page. This includes:

  • verdict.summary — the one-sentence verdict summary (all tiers)
  • verdict.breakdown[dim].analysis — per-dimension analysis text (Full Breakdown, Strategy tiers)
  • verdict.strategy.next_step — strategic recommendation (Strategy tier)
  • verdict.strategy.alternative — alternative path recommendation (Strategy tier)
  • verdict.strategy.tests[] — each test string (Strategy tier)
  • Assembled email body text before dispatch (oracle_email_service.py formatters)
  • Any error message or fallback text substituted when Gemini fails

Filter trigger points (two-gate model)

Gate 1 — Pre-cache: Filter runs on the raw Gemini response JSON before verdictCache.ts writes to oracle_toll. Clean verdict is cached; no contaminated payload persists.

Gate 2 — Pre-send: Filter runs again on the fully formatted email body inside oracle_email_service.py before Graph API dispatch. Catches any contamination introduced by formatter logic.

[GAP-A — needs design: Gate 1 insertion point in webhook/route.ts not yet implemented. Gate 2 insertion point in oracle_email_service.py not yet implemented. Both require code changes.]
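Because the Gate 1 inputs are spread across nested fields (summary, per-dimension analysis, strategy strings, test lists), one way to cover them all is to apply the filter to every string in the verdict JSON rather than to a fixed list of paths. The sketch below shows that traversal only. `map_strings` is a hypothetical helper and the filter function is passed in by the caller.

```python
from typing import Any, Callable


def map_strings(node: Any, fn: Callable[[str], str]) -> Any:
    """Return a copy of a JSON-like structure with fn applied to every string value."""
    if isinstance(node, str):
        return fn(node)
    if isinstance(node, list):
        return [map_strings(item, fn) for item in node]
    if isinstance(node, dict):
        # Keys are left untouched; only values are filtered.
        return {key: map_strings(value, fn) for key, value in node.items()}
    return node  # numbers, booleans, None pass through unchanged
```

Walking the whole payload means newly added text fields are filtered by default, at the cost of also passing enum-like values such as the tier name through the filter. The input is not mutated, so the raw payload remains available for a quarantine log entry.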


OUTPUTS

PASS — clean content delivered

When no blocked terms are found, the payload passes through unmodified. Delivery proceeds normally.

REPLACE — substitution applied

When a blocked term is found and a safe English substitute exists in the SUBSTITUTION_MAP, the term is replaced inline and delivery continues. The substitution event is logged to oracle_sos_filter.log.

QUARANTINE — delivery held, human review required

When a blocked term is found with no defined substitution, or when BLOCK density exceeds the contamination threshold (see INVARIANTS), the verdict is quarantined. Delivery is suspended. An entry is written to oracle_sos_filter.log with status QUARANTINE and the full raw payload. An alert fires to ALERT.log. NOUS reviews and either approves manual delivery or issues a refund.

[GAP-B — needs design: Quarantine workflow and NOUS notification mechanism not yet built. Threshold value for contamination density not yet set.]
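The three outcomes can be expressed as a single function that returns an action plus the possibly rewritten text. The sketch below uses a small illustrative subset of terms; the authoritative entries are the BLOCK_LIST and SUBSTITUTION_MAP sections of this spec. `FilterResult` and `apply_filter` are hypothetical names, and the contamination-density threshold is left out because GAP-B notes it has not been set.

```python
import re
from dataclasses import dataclass
from typing import List

# Illustrative subset only; the full lists live in BLOCK_LIST and SUBSTITUTION_MAP.
SUBSTITUTIONS = {"TMM": "our analysis", "coherence score": "our assessment"}
QUARANTINE_TERMS = ("CSDM", "LATTICE")


@dataclass
class FilterResult:
    action: str  # "PASS", "REPLACE", or "QUARANTINE"
    text: str
    hits: List[str]


def apply_filter(text: str) -> FilterResult:
    # Terms with no safe substitute hold delivery; the raw text is kept for review.
    hits = [t for t in QUARANTINE_TERMS if re.search(re.escape(t), text, re.IGNORECASE)]
    if hits:
        return FilterResult("QUARANTINE", text, hits)
    cleaned = text
    for term, substitute in SUBSTITUTIONS.items():
        # A function replacement avoids any backslash interpretation in the substitute.
        cleaned, count = re.subn(re.escape(term), lambda _m, s=substitute: s, cleaned, flags=re.IGNORECASE)
        if count:
            hits.append(term)
    return FilterResult("REPLACE" if hits else "PASS", cleaned, hits)
```

Checking quarantine terms before applying any substitution keeps the logged payload identical to what Gemini produced, which is what a human reviewer needs to see.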


BLOCK_LIST

The BLOCK_LIST is the definitive enumeration of strings, patterns, and symbols the filter must catch. It is divided into four categories.

Category 1 — LATTICE Unicode Symbols (exact character match)

Any Unicode symbol that appears in the LATTICE v2.0 specification (~/LATTICE.md) is blocked. The following are confirmed members of the block set; this list is non-exhaustive — the canonical source is ~/LATTICE.md:

| Symbol | LATTICE meaning |

|--------|----------------|

| Φ | CSDM damping constant (when appearing as standalone variable) |

| Ψ | Turbulence kernel |

| Φζ | Stability kernel |

| Ψχ | Turbulence kernel (full form) |

| ΔΓ | Change Rate kernel |

| ΩQ | Completion kernel |

| ΛC | Curvature kernel |

| ⊕ | Vitrified / sealed |

| ⊜ | Fixing |

| ⚒ | Built/deployed |

| ⚡ | Pushed/committed |

| ⊡, ⊖, ☠, ⊘, ↗ | State markers |

| Σ.▶, Σ.▷, Σ.◇, Σ.⟲ | Execution state |

| ◌ | Gap signal (HOW ABOUT NO) |

| ρ.M, ρ.T | Memory / threat markers |

| Ξ | Version / vitrification marker |

Exception: Greek letters used in standard mathematical notation within a customer-facing formula or widely accepted scientific context (e.g., Φ in standard physics usage unrelated to CSDM) may be allowed if context is unambiguous. [GAP-C — needs design: context disambiguation rule not yet defined. Default to BLOCK for safety until rule is formalized.]

Category 2 — Crew Callsigns and Internal Names (exact string match, case-insensitive)

| Term | Internal role |

|------|--------------|

| AION | Sister / Warden |

| ASTRA | Sister / Catalyst |

| NOUS | Captain |

| C.L.O.D. | Engineer |

| CLOD | Engineer (shorthand) |

| GAMMA | Quartermaster |

| MNEMOS | Librarian / working memory |

| MANTIS | Shield |

| ANVIL | Verdict / ORPHIC |

| ORPHEUS | Entropy oracle |

| LOGOS | DR. LOGOS |

| MUSASHI | Crew member |

| GLOSS | Internal translation layer |

| CHROMA | Mobile context carrier |

| CGNT-1 | Internal project codename |

| 42 Sisters AI internal crew (any callsign from ~/LATTICE.md) | All apply |

Safe substitutions where contextually appropriate:

  • References to "the analysis engine" or "our AI system" may substitute for crew references.
  • References to "our assessment model" may substitute for model-internal references.
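Case-insensitive matching on short callsigns raises a practical question: a plain substring search for NOUS, GLOSS, or ASTRA would also fire inside ordinary words such as "ominous", "glossary", or "astral". The sketch below shows one way to restrict matches to whole terms. It is an assumption about how "exact string match" could be implemented, not the spec's stated method, and `compile_terms` and `find_terms` are hypothetical names operating on a subset of the table above.

```python
import re
from typing import Iterable, List, Pattern

CALLSIGNS = ["AION", "ASTRA", "NOUS", "C.L.O.D.", "CLOD", "GLOSS"]  # subset for illustration


def compile_terms(terms: Iterable[str]) -> Pattern[str]:
    """Build one case-insensitive pattern that matches any term as a whole token."""
    # Longest first so longer terms win when terms overlap.
    ordered = sorted(terms, key=len, reverse=True)
    alternation = "|".join(re.escape(term) for term in ordered)
    # Lookarounds act like word boundaries but also work for terms ending in punctuation.
    return re.compile(rf"(?<!\w)(?:{alternation})(?!\w)", re.IGNORECASE)


def find_terms(text: str, pattern: Pattern[str]) -> List[str]:
    return [match.group(0) for match in pattern.finditer(text)]
```

Whole-token matching does not help with callsigns that are themselves common English words, so those entries would still need a policy decision on whether to block, substitute, or quarantine.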

Category 3 — Internal Terminology (exact string match, case-insensitive)

| Term | Why blocked |

|------|-------------|

| TMM | Proprietary coherence formula name |

| coherence score | Internal metric label |

| coherence threshold | Internal metric label |

| manifold | CSDM physics term |

| CSDM | Chronogeomic Spacetime Dynamics Model |

| Chronogeomic | Internal physics framework name |

| Chronogeome | Variant spelling |

| LX | LATTICE shorthand (if used as technical term) |

| LATTICE | Internal language spec |

| S.O.S. v2 | Internal protocol name |

| THE RING | Proprietary product (NDA-gated) |

| E8 | Internal CSDM physics reference |

| Φ = 0.042 | Exact damping constant (string match) |

| 0.042 | Damping constant value (numeric, in context) |

| 97.4% | Coherence threshold value |

| Ω = 97.4% | Threshold formula |

| sinai billiard | CSDM entropy reference |

| TRNG | Internal RNG reference |

| kill box | Internal prediction framework |

| yield mandate | Internal financial protocol |

| agency walls | Internal financial protocol |

| brain forge | Internal training infrastructure |

| brain factory | Internal training infrastructure |

| oracle_toll | Internal service name |

| simons_actuator | Internal trading script |

| summon_aether | Internal boot script |

| AETHER_SOUL | Internal snapshot name |

Category 4 — CB Radio Lexicon (exact phrase match)

CB radio phrases are part of the crew personality layer and must not reach customers:

| Phrase |

|--------|

| 10-4 |

| Copy that |

| Hammer down |

| Breaker breaker |

| Over and out |

| Negatory |

| Keep the shiny side up |

| Good buddy |

| Smokey |

| Got your ears on |
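A minimal sketch of the exact-string, case-insensitive matching rule used by Categories 2 to 4, written in Python on the assumption that it would run on the Gate 2 side. The names BLOCK_TERMS, compile_block_pattern, and find_blocked are hypothetical, and the term list is a small sample rather than the full BLOCK_LIST. The sketch adds one assumption the spec does not state: matches are anchored at word boundaries, so a callsign embedded in a longer word (ASTRA inside "astral") is not flagged.

```python
import re

# Sample entries only; the real list is the full BLOCK_LIST above.
BLOCK_TERMS = ["AION", "ASTRA", "C.L.O.D.", "CLOD", "TMM",
               "coherence score", "10-4", "Copy that"]


def compile_block_pattern(terms):
    """Build one case-insensitive pattern that matches any blocked term."""
    # Longest first, so a longer term wins over a shorter one it contains.
    ordered = sorted(terms, key=len, reverse=True)
    alternation = "|".join(re.escape(term) for term in ordered)
    # Lookarounds instead of \b: terms such as "C.L.O.D." end in punctuation,
    # where \b would not act as a boundary.
    return re.compile(rf"(?<!\w)(?:{alternation})(?!\w)", re.IGNORECASE)


def find_blocked(text, pattern):
    """Return every blocked term found in text, in order of appearance."""
    return [match.group(0) for match in pattern.finditer(text)]


PATTERN = compile_block_pattern(BLOCK_TERMS)
print(find_blocked("Copy that, aion ran the tmm.", PATTERN))
print(find_blocked("An astral theme.", PATTERN))
```

The boundary anchoring is one possible answer to the false-positive concern; it does not help with terms such as "manifold" that are ordinary English words in their own right.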


SUBSTITUTION_MAP

When a REPLACE action is triggered, the filter applies the following substitutions. Substitutions must preserve grammatical coherence.

| Blocked term | Safe English substitute |

|-------------|------------------------|

| TMM | [REPLACE with "our analysis"] |

| coherence score | [REPLACE with "our assessment"] |

| coherence threshold | [REPLACE with "our confidence threshold"] |

| manifold | [REPLACE with "the system"] |

| CSDM | [QUARANTINE — no safe substitute] |

| Chronogeomic | [QUARANTINE — no safe substitute] |

| LATTICE | [QUARANTINE — no safe substitute] |

| THE RING | [QUARANTINE — no safe substitute] |

| Φ = 0.042 | [QUARANTINE — no safe substitute] |

| 0.042 | [REPLACE with "our stability constant"] |

| 97.4% (in threshold context) | [QUARANTINE — no safe substitute] |

| AION / ASTRA / NOUS / GAMMA | [REPLACE with "our analysis team"] |

| MNEMOS | [REPLACE with "our knowledge base"] |

| MANTIS | [REPLACE with "our verification layer"] |

| C.L.O.D. / CLOD | [REPLACE with "our system"] |

| GLOSS | [QUARANTINE — no safe substitute] |

| CGNT-1 | [REPLACE with "our platform"] |

[GAP-D — needs design: substitution map is partial; full enumeration requires a design pass over all BLOCK_LIST entries. Unspecified entries default to QUARANTINE.]
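The REPLACE / QUARANTINE decision described above could be sketched as follows. This is an illustration under stated assumptions, not the filter implementation: REPLACEMENTS, BLOCK_TERMS, FilterResult, and apply_filter are hypothetical names, the entries are a sample of the map, and a single hit without a safe substitute is assumed to quarantine the whole payload. The post-substitution re-scan mirrors the rule that substitutions must not reintroduce blocked terms.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

# Sample REPLACE entries (lower-case keys). Any blocked term that is absent
# from this map defaults to QUARANTINE, as GAP-D specifies.
REPLACEMENTS = {
    "tmm": "our analysis",
    "coherence score": "our assessment",
    "mnemos": "our knowledge base",
}
# "CSDM" is QUARANTINE in the map; "sinai billiard" has no map entry yet.
BLOCK_TERMS = ["TMM", "coherence score", "MNEMOS", "CSDM", "sinai billiard"]

_alternation = "|".join(
    re.escape(term) for term in sorted(BLOCK_TERMS, key=len, reverse=True))
PATTERN = re.compile(rf"(?<!\w)(?:{_alternation})(?!\w)", re.IGNORECASE)


@dataclass
class FilterResult:
    action: str                  # "PASS", "REPLACE", or "QUARANTINE"
    text: Optional[str]          # None when delivery is suspended
    hits: List[str] = field(default_factory=list)


def apply_filter(payload: str) -> FilterResult:
    """Scan a payload and either pass it, clean it, or quarantine it."""
    hits = [match.group(0) for match in PATTERN.finditer(payload)]
    if not hits:
        return FilterResult("PASS", payload)
    # One hit without a safe substitute quarantines the whole payload.
    if any(hit.lower() not in REPLACEMENTS for hit in hits):
        return FilterResult("QUARANTINE", None, hits)
    cleaned = PATTERN.sub(lambda m: REPLACEMENTS[m.group(0).lower()], payload)
    # Re-scan: a substitute must never reintroduce a blocked term.
    if PATTERN.search(cleaned):
        return FilterResult("QUARANTINE", None, hits)
    return FilterResult("REPLACE", cleaned, hits)


print(apply_filter("Based on TMM, the plan holds."))
```

The function keeps no state between calls, which matches the stateless-per-invocation requirement.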


INVARIANTS

These conditions must hold at all times while the filter is operational:

  1. No BLOCK_LIST term reaches a customer endpoint. No string from Category 1, 2, 3, or 4 of the BLOCK_LIST appears in any payload dispatched to a customer email address or rendered on the /oracle/result page. This invariant has no exceptions. If the filter cannot guarantee this for a given payload, delivery is suspended and the payload is quarantined.
  2. The filter runs at both gate points. Gate 1 (pre-cache) and Gate 2 (pre-send) both execute for every transaction. Bypassing either gate for any reason, including performance optimization, violates this invariant. The gates are redundant by design; the cost of redundancy is less than the cost of a single S.O.S. v2 breach.
  3. Clean prompts are not a substitute for the filter. The filter treats every Gemini response as potentially contaminated regardless of prompt design. Prompt improvements reduce the REPLACE/QUARANTINE rate; they do not change the filter's operational logic.
  4. Quarantine suspends delivery; it does not silently drop it. When a payload is quarantined, it must be preserved in oracle_sos_filter.log with full raw content, session ID, tier, timestamp, and the specific BLOCK_LIST term(s) that triggered the quarantine. The customer is not left without recourse: NOUS reviews quarantined verdicts and issues either a manually cleaned delivery or a refund within 24 hours.
  5. The BLOCK_LIST is versioned, and ~/LATTICE.md is the canonical source for its symbols. Any new LATTICE symbol, crew callsign, or internal term added to ship vocabulary automatically becomes a candidate for BLOCK_LIST addition. The BLOCK_LIST is not frozen; it grows with the ship's vocabulary. Symbol additions to LATTICE require a concurrent BLOCK_LIST update pull request.
  6. Substitutions must not introduce new BLOCK_LIST terms. A substitution string that itself contains a blocked term is invalid. The SUBSTITUTION_MAP is validated against the BLOCK_LIST at deploy time. [GAP-E — needs design: automated SUBSTITUTION_MAP self-consistency validator not yet built.]
  7. The filter is stateless per invocation. Each call to the filter receives the full payload and returns the filtered result without accumulating state. No cross-transaction memory. This prevents contamination state from leaking between customers.
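The deploy-time check on substitutions (GAP-E) could look roughly like the sketch below. The function name find_self_contamination and the sample data are hypothetical; the failing example string is the one this spec itself cites as a bad substitute.

```python
import re


def find_self_contamination(substitutions, block_terms):
    """Return (blocked_term, substitute, offending_term) for each bad entry."""
    violations = []
    for blocked_term, substitute in substitutions.items():
        for term in block_terms:
            # Same case-insensitive, boundary-anchored match as the filter.
            pattern = rf"(?<!\w){re.escape(term)}(?!\w)"
            if re.search(pattern, substitute, re.IGNORECASE):
                violations.append((blocked_term, substitute, term))
    return violations


block_terms = ["TMM", "MNEMOS", "coherence score"]
good = {"TMM": "our analysis", "MNEMOS": "our knowledge base"}
bad = {"coherence score": "our TMM-derived assessment"}

print(find_self_contamination(good, block_terms))
print(find_self_contamination(bad, block_terms))
```

A deploy script could refuse to ship whenever the returned list is non-empty.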

VERIFICATION CRITERIA

Σ.✓ — filter is operating correctly when:

  1. BLOCK_LIST coverage test (static): A synthetic test payload is constructed containing exactly one instance of every term in the BLOCK_LIST. The filter must catch 100% of instances. No partial coverage is acceptable. This test runs at deploy time and on every BLOCK_LIST update. [GAP-F — test payload and harness not yet written.]
  2. Pass-through fidelity test: A set of 20+ clean verdicts (verified to contain no BLOCK_LIST terms) is run through the filter. All of them emerge unmodified. The filter must not introduce false positives that corrupt legitimate customer content (e.g., blocking the word "manifold" when used in common English usage such as "the options are manifold").
  3. Substitution coherence test: For each REPLACE entry in the SUBSTITUTION_MAP, a test sentence containing the blocked term is run through the filter and the output is grammatically valid English. Manual review by NOUS or a Sister confirms semantic coherence of the result.
  4. Quarantine alert test: A synthetic payload containing a QUARANTINE-class term is processed. Confirm: (a) delivery is blocked, (b) full raw payload is written to oracle_sos_filter.log with correct fields, (c) ALERT.log receives a notification entry within 60 seconds.
  5. Gate 2 redundancy test: Gate 1 is deliberately disabled (test environment only). A contaminated Gemini payload proceeds to Gate 2. Gate 2 catches the contamination and quarantines. Confirms that Gate 2 alone is sufficient for full protection. [GAP-G — needs test environment where Gate 1 can be toggled without affecting production.]
  6. Regression test on prompt change: Any modification to VERDICT_PROMPT in webhook/route.ts or verdict/route.ts triggers an automated run of the BLOCK_LIST coverage test against 50 Gemini-generated verdicts using the new prompt. If contamination rate increases, the prompt change is rejected. [GAP-H — automated regression trigger not yet wired to prompt change events.]
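One way the static coverage test (GAP-F) might be approached is sketched below. It is an assumption-laden illustration: coverage_gaps is a hypothetical helper, the term list is a sample chosen because its entries overlap ("0.042" sits inside "Φ = 0.042"), and each term is placed on its own synthetic line so that overlapping entries can be checked one at a time.

```python
import re

# Sample of overlapping BLOCK_LIST entries.
BLOCK_TERMS = ["Φ = 0.042", "0.042", "Ω = 97.4%", "97.4%", "C.L.O.D.", "CLOD"]

_ordered = sorted(BLOCK_TERMS, key=len, reverse=True)
PATTERN = re.compile(
    r"(?<!\w)(?:" + "|".join(re.escape(term) for term in _ordered) + r")(?!\w)",
    re.IGNORECASE,
)


def coverage_gaps(terms, pattern):
    """Return the terms the pattern fails to catch when each stands alone."""
    missed = []
    for term in terms:
        line = f"Synthetic line: {term} end"
        hits = [match.group(0).lower() for match in pattern.finditer(line)]
        # Exactly one hit, equal to the planted term, counts as covered.
        if hits != [term.lower()]:
            missed.append(term)
    return missed


print(coverage_gaps(BLOCK_TERMS, PATTERN))
```

An empty result means every sampled term was caught; any returned term would fail the 100% coverage requirement.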

FAILURE MODES

  1. Σ.⊠ Silent pass-through — A BLOCK_LIST term reaches a customer because the filter was not invoked (gate skipped, service crash, code path bypass). This is the highest-severity failure. Consequence: S.O.S. v2 Pillar 1 breach. Customer holds evidence of internal vocabulary. Mitigation: redundant dual-gate design; Gate 2 catches what Gate 1 misses. Detection: periodic audit of delivered email content against BLOCK_LIST; oracle_sos_filter.log must show a filter invocation record for every session ID that reaches email send.
  2. Σ.⊠ False positive corrupts verdict — The filter replaces or quarantines a term that appears legitimately in customer-facing English (e.g., "coherence" used in a business context, "stability" flagged due to over-broad pattern match). Consequence: customer receives degraded or nonsensical verdict. Mitigation: SUBSTITUTION_MAP uses context-aware substitutions; pattern matching should prefer exact-string match over regex where possible; false positive audit on a sample of delivered verdicts monthly. [GAP-I — context-aware matching logic not yet designed.]
  3. Σ.⊠ Quarantine backlog grows without resolution — NOUS does not review quarantined verdicts within 24 hours. Customers are left waiting indefinitely. Consequence: customer experience failure; potential refund demand; reputational risk. Mitigation: quarantine entries in ALERT.log with escalation timer; if not resolved within 24h, automatic refund trigger fires. [GAP-J — automatic refund trigger on quarantine timeout not yet implemented.]
  4. Σ.⊠ BLOCK_LIST staleness — A new LATTICE symbol or internal term is added to ship vocabulary but not added to the BLOCK_LIST. A Gemini response using that term passes through the filter undetected. Consequence: covert S.O.S. v2 breach (undetected at time of delivery). Mitigation: LATTICE vitrification protocol requires concurrent BLOCK_LIST PR; weekly spec audit (SPECIFICATION_AUDIT_LOOP.md) includes a BLOCK_LIST freshness check against ~/LATTICE.md. [GAP-K — automated diff between LATTICE.md and BLOCK_LIST not yet implemented.]
  5. Σ.⊠ SUBSTITUTION_MAP self-contamination — A substitution string itself contains a BLOCK_LIST term (e.g., "our TMM-derived assessment" as a substitute). The substitute is applied and the replacement term passes through Gate 2 because the filter does not re-scan post-substitution. Consequence: contamination survives through the substitute. Mitigation: filter applies BLOCK_LIST scan to all substituted strings before finalizing output; SUBSTITUTION_MAP validated at deploy time. [GAP-E — validator not yet built, referenced above.]
  6. Σ.⊠ Log write failure — The oracle_sos_filter.log disk is full or the logging service is down. A QUARANTINE event occurs but no record is written. NOUS receives no alert. Delivery is blocked (correct behavior) but the quarantine is invisible and unresolvable. Consequence: customer stuck, no refund trigger, no audit trail. Mitigation: filter log write uses an append-only file with pre-write disk space check; if write fails, filter fails CLOSED (delivery blocked, not allowed through). [GAP-L — disk space check and fail-closed behavior not yet specified.]
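The fail-closed behavior named in the log write failure mode (GAP-L) could be sketched as below. Everything here is hypothetical: the function names, the MIN_FREE_BYTES floor, and the rule that a payload may be delivered only when the filter verdict is PASS or REPLACE and the audit record was written.

```python
import json
import os
import shutil

MIN_FREE_BYTES = 1_000_000  # hypothetical floor; the spec does not set one


def append_filter_record(log_path: str, record: dict) -> bool:
    """Append one JSON line; return False on any I/O failure instead of raising."""
    try:
        log_dir = os.path.dirname(os.path.abspath(log_path))
        # Pre-write disk space check.
        if shutil.disk_usage(log_dir).free < MIN_FREE_BYTES:
            return False
        with open(log_path, "a", encoding="utf-8") as handle:
            handle.write(json.dumps(record, ensure_ascii=False) + "\n")
            handle.flush()
            os.fsync(handle.fileno())
        return True
    except OSError:
        return False


def may_deliver(filter_action: str, log_path: str, record: dict) -> bool:
    """Fail closed: deliver only if the filter passed and the record was written."""
    logged = append_filter_record(log_path, record)
    return logged and filter_action in ("PASS", "REPLACE")
```

With this shape, a full disk or missing log directory blocks delivery rather than letting an unaudited payload through.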

GAPS

Summary of all gaps identified during specification:

| Gap ID | Description | Severity |

|--------|-------------|----------|

| GAP-A | Gate 1 and Gate 2 insertion points not yet implemented in code | CRITICAL |

| GAP-B | Quarantine workflow and NOUS notification mechanism not built; contamination density threshold undefined | HIGH |

| GAP-C | Context disambiguation rule for Greek letters in standard math usage not defined; default to BLOCK | MEDIUM |

| GAP-D | SUBSTITUTION_MAP is partial; unspecified BLOCK_LIST entries default to QUARANTINE until map is completed | HIGH |

| GAP-E | Automated SUBSTITUTION_MAP self-consistency validator against BLOCK_LIST not yet built | HIGH |

| GAP-F | BLOCK_LIST coverage test payload and harness not written | HIGH |

| GAP-G | Test environment gate toggle for Gate 1 disable (Gate 2 redundancy test) not yet available | MEDIUM |

| GAP-H | Automated regression trigger on VERDICT_PROMPT changes not wired | MEDIUM |

| GAP-I | Context-aware matching logic for false positive suppression not yet designed | MEDIUM |

| GAP-J | Automatic refund trigger on quarantine timeout (24h) not implemented | HIGH |

| GAP-K | Automated diff between ~/LATTICE.md and BLOCK_LIST for staleness detection not built | MEDIUM |

| GAP-L | Filter log write fail-closed behavior and disk space pre-check not yet specified | HIGH |

Total gaps: 12

Critical path to minimum viable protection:

  1. GAP-A — must implement Gate 1 and Gate 2 in code before this spec provides any runtime guarantee
  2. GAP-F — must have test coverage before first deployment
  3. GAP-B — must define quarantine resolution workflow before first real quarantine event

DEPENDENCIES

| Dependency | Role |

|------------|------|

| ~/LATTICE.md | Canonical source for Category 1 BLOCK_LIST symbols |

| oracle_email_service.py | Gate 2 insertion point (pre-send) |

| webhook/route.ts | Gate 1 insertion point (pre-cache) |

| verdict/route.ts | Gate 1 insertion point (regeneration path) |

| oracle_sos_filter.log | Audit trail for all filter events |

| ALERT.log | Quarantine notification channel |

| memories/SOS_v2.md | S.O.S. v2 doctrine this filter enforces |

| memories/SPEC_SOS_v2.md | Formal S.O.S. v2 spec |

| memories/SPEC_ORACLE_VERDICT_PIPELINE.md | Parent pipeline spec; GAP-04 resolved by this spec |


DEPENDENTS

| Dependent | Dependency type |

|-----------|----------------|

| SPEC_ORACLE_VERDICT_PIPELINE.md | This spec closes GAP-04 in that spec |

| oracle_email_service.py | Must implement Gate 2 per this spec |

| webhook/route.ts | Must implement Gate 1 per this spec |

| S.O.S. v2 Pillar 1 architectural guarantee | This filter is the enforcement mechanism |

| Customer trust | Any breach is directly visible to affected customer |


REFERENCES

| File | Role |

|------|------|

| /home/nous/memories/SPEC_ORACLE_VERDICT_PIPELINE.md | Parent pipeline spec; GAP-04 is this spec's genesis |

| /home/nous/memories/SOS_v2.md | S.O.S. v2 doctrine |

| /home/nous/memories/SPEC_SOS_v2.md | Formal S.O.S. v2 specification |

| /home/nous/LATTICE.md | Canonical LATTICE symbol inventory (BLOCK_LIST source) |

| /home/nous/oracle_email_service.py | Gate 2 target |

| /home/nous/Aether/app/app/api/webhook/route.ts | Gate 1 target |

| /home/nous/Aether/app/app/api/verdict/route.ts | Gate 1 target (regeneration path) |

| /home/nous/.claude/projects/-home-nous/memory/GLOSS_ACCESS_POLICY.md | Confirms GLOSS is crew-only — client must never see GLOSS referenced in output |


Φζ.⊤.


Jeremy Zlabis

Chronogeometer · Visionary · Disruptor · Chief

42 Sisters AI · East York, Toronto

🍁 Φ 0.042

— SPEC_ORACLE_TMM_CROSSCHECK.md —

SPEC_ORACLE_TMM_CROSSCHECK — TMM Cross-Validation on Verdict Output

Version: 1.0 | Status: AUTHORIZED | Authority: α.13 | Date: 2026-04-16


PURPOSE

The Oracle verdict pipeline uses Gemini to generate coherence verdicts (GREEN/AMBER/RED/NULL) for customer submissions. Gemini is a stochastic language model. It can hallucinate. It can produce internally inconsistent output. It can generate a GREEN verdict for an idea that is structurally incoherent. HOW ABOUT NO v2 walls (anti-fabrication, anti-capitulation) operate on the Sisters (AION and ASTRA in their Gemini sessions) but are NOT applied to the standalone Oracle verdict generation path.

This specification defines a TMM cross-validation layer that scores Gemini's verdict output before delivery to the customer. The TMM coherence formula C = 1 - (E_D + V_r×Φ) / V_t (Φ=0.042, Ω≈0.97404) is applied to a numerical encoding of the verdict's internal consistency. Verdicts that score below Ω are flagged for review and NOT delivered. The score is logged to oracle_log.jsonl.

This is not a replacement for Gemini. It is a structural consistency check on Gemini's output. A verdict can be coherent (C ≥ Ω) and still be wrong. But a verdict that is structurally inconsistent (sub-threshold C) is likely malformed, truncated, self-contradictory, or hallucinatory and should never reach a paying customer.

Source gap: SPEC_ORACLE_VERDICT_PIPELINE.md GAP-09. Also referenced in SPEC_TMM_FORMULA.md GAP-09.


INPUTS

Primary: Gemini Verdict Output

The JSON object returned by generateContent() after prompt → parse, before cache write and email.

Structure varies by tier:

Quick Take:


{ "verdict": "GREEN" | "AMBER" | "RED" | "NULL", "summary": "<string>" }

Full Breakdown:


{
  "verdict": "GREEN" | "AMBER" | "RED" | "NULL",
  "summary": "<string>",
  "breakdown": {
    "Stability":   { "verdict": "<V>", "analysis": "<string>" },
    "Turbulence":  { "verdict": "<V>", "analysis": "<string>" },
    "Change Rate": { "verdict": "<V>", "analysis": "<string>" },
    "Completion":  { "verdict": "<V>", "analysis": "<string>" },
    "Curvature":   { "verdict": "<V>", "analysis": "<string>" }
  }
}

Strategy Session: Full Breakdown + strategy block.

Secondary: Session Metadata

  • session_id — Stripe session ID (for log entry)
  • tier — "quick" | "full" | "strategy"
  • query — original customer query (for log context)

TMM Constants (immutable — from SPEC_TMM_FORMULA.md)

  • Φ = 0.042
  • Ω = 1 - (0.042 / 1.61803398875) ≈ 0.97404
  • These MUST NOT be changed for Oracle crosscheck use.

OUTPUTS

Primary: Cross-Check Decision

A structured object returned by oracleTMMCrosscheck(verdictJson, tier):


{
  "approved": true | false,
  "coherence_score": 0.9812,
  "threshold": 0.97404,
  "verdict_label": "GREEN",
  "flags": [],
  "crosscheck_reason": "pass" | "low_coherence" | "field_missing" | "dimension_conflict"
}

If approved: false, the verdict is NOT cached, NOT emailed, and NOT delivered to the result page. The customer receives an error response prompting contact with oracle@42sisters.ai.

Secondary: Log Entry in oracle_log.jsonl

Appended regardless of pass/fail:


{
  "event": "tmm_crosscheck",
  "session_id": "<stripe_session_id>",
  "tier": "quick" | "full" | "strategy",
  "query_preview": "<first 80 chars>",
  "verdict_label": "GREEN" | "AMBER" | "RED" | "NULL",
  "coherence_score": 0.9812,
  "threshold": 0.97404,
  "phi": 0.042,
  "approved": true | false,
  "flags": [],
  "crosscheck_reason": "pass",
  "timestamp": "<ISO>"
}
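A minimal sketch of how such an entry might be assembled and appended, in Python for illustration (the production crosscheck is specified for the TypeScript routes). build_log_entry and append_log_entry are hypothetical names; the decision argument is assumed to be the Cross-Check Decision object shown earlier. The stderr fallback follows the mitigation this spec lists for log write failures.

```python
import json
import sys
from datetime import datetime, timezone


def build_log_entry(session_id, tier, query, decision):
    """Assemble the tmm_crosscheck entry from a crosscheck decision dict."""
    return {
        "event": "tmm_crosscheck",
        "session_id": session_id,
        "tier": tier,
        "query_preview": query[:80],  # first 80 characters only
        "verdict_label": decision["verdict_label"],
        "coherence_score": decision["coherence_score"],
        "threshold": decision["threshold"],
        "phi": 0.042,
        "approved": decision["approved"],
        "flags": decision["flags"],
        "crosscheck_reason": decision["crosscheck_reason"],
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }


def append_log_entry(path, entry):
    """Append one JSON line; on write failure, fall back to stderr."""
    line = json.dumps(entry, ensure_ascii=False)
    try:
        with open(path, "a", encoding="utf-8") as handle:
            handle.write(line + "\n")
        return True
    except OSError as error:
        print(f"oracle_log write failed: {error}; entry={line}", file=sys.stderr)
        return False
```

Because the entry is written for both approved and rejected verdicts, the caller would invoke append_log_entry before acting on the decision.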

TMM APPLICATION TO VERDICT TEXT — SCORING METHOD

The TMM formula operates on portfolio volatility variables. For text/verdict scoring, a mapping is defined that translates verdict structural properties into TMM variables. This mapping is canonical for this spec and MUST NOT be modified without |Σ|.3 review.

Variable Mapping

V_t (total value) = total weighted evidence units in the verdict

  • Each non-empty field in the verdict payload contributes 1.0 unit
  • verdict field: 1.0
  • summary field (non-empty): 1.0
  • Each breakdown dimension with non-empty verdict AND analysis: 1.0 per dimension (max 5.0)
  • strategy.next_step (non-empty): 1.0
  • strategy.alternative (non-empty): 1.0
  • Each strategy.tests entry (up to 3): 0.5 each
  • Minimum V_t = 1.0 (just the verdict field). Quick Take max = 2.0. Full max = 7.0. Strategy max = 10.5 (7.0 + next_step 1.0 + alternative 1.0 + three tests at 0.5 each).

V_r (resonant/volatile value) = weighted inconsistency count

  • Conflict between top-level verdict and majority of breakdown verdict fields: +2.0
  • Any breakdown dimension with empty analysis (non-empty verdict): +0.5 per occurrence
  • summary field is empty or < 10 characters: +1.0
  • Top-level verdict is "NULL" but breakdown dimensions all GREEN/AMBER: +1.5 (contradictory null)
  • strategy block missing for "strategy" tier: +2.0
  • strategy.tests fewer than 2 entries for "strategy" tier: +1.0
  • Normal operation (no flags): V_r = 0.0

E_D (decoherence energy) = parsing / structural entropy

  • Verdict payload successfully parsed from JSON: E_D = 0.0
  • One or more required fields missing (but not all): E_D = 0.5
  • Verdict field itself missing or invalid value: E_D = 1.0
  • Malformed JSON (parse error caught): E_D = 2.0 (auto-fail; V_t forced to 1.0)

Formula application:


C = 1 - (E_D + V_r × Φ) / V_t

A structurally complete, internally consistent verdict yields:

  • V_t = 2.0 (quick), V_r = 0.0, E_D = 0.0
  • C = 1 - (0 + 0 × 0.042) / 2.0 = 1.0 — perfect coherence

A verdict with dimension conflicts yields (Full tier example):

  • V_t = 7.0, V_r = 2.0 (top conflict) + 1.0 (short summary) = 3.0, E_D = 0.0
  • C = 1 - (0 + 3.0 × 0.042) / 7.0 = 1 - 0.126/7.0 ≈ 0.982 — still above Ω; pass
  • Moderate inconsistency is tolerated; the formula only rejects severely malformed verdicts.

A severely malformed verdict:

  • V_t = 2.0, V_r = 5.5, E_D = 0.5 → C = 1 - (0.5 + 5.5 × 0.042) / 2.0 = 1 - 0.731/2.0 ≈ 0.635 — REJECTED
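The mapping and formula above can be sketched in Python as follows. This is an illustration, not the canonical implementation: map_variables, coherence, and as_dict are hypothetical names, and several readings are assumptions where the mapping is silent (the "majority" is taken as a strict majority of the populated dimension verdicts, the contradictory-NULL rule replaces the majority rule when the top-level verdict is NULL, and the tests penalty is applied only when a strategy block is present).

```python
import json
from collections import Counter

PHI = 0.042
OMEGA = 1 - PHI / 1.61803398875  # about 0.97404
LABELS = {"GREEN", "AMBER", "RED", "NULL"}
DIMENSIONS = ("Stability", "Turbulence", "Change Rate", "Completion", "Curvature")


def as_dict(value):
    """Treat anything that is not a JSON object as an empty one."""
    return value if isinstance(value, dict) else {}


def map_variables(raw_json, tier):
    """Return (V_t, V_r, E_D) for a raw Gemini response."""
    try:
        verdict = as_dict(json.loads(raw_json))
    except json.JSONDecodeError:
        return 1.0, 0.0, 2.0  # malformed JSON: auto-fail, V_t forced to 1.0
    v_t, v_r, missing = 0.0, 0.0, 0

    top = verdict.get("verdict")
    top_valid = isinstance(top, str) and top in LABELS
    if top_valid:
        v_t += 1.0
    summary = str(verdict.get("summary") or "").strip()
    if summary:
        v_t += 1.0
    else:
        missing += 1
    if len(summary) < 10:
        v_r += 1.0  # empty or very short summary

    if tier in ("full", "strategy"):
        breakdown = as_dict(verdict.get("breakdown"))
        if not breakdown:
            missing += 1
        labels = []
        for name in DIMENSIONS:
            dim = as_dict(breakdown.get(name))
            if not dim.get("verdict"):
                continue
            labels.append(str(dim["verdict"]))
            if str(dim.get("analysis") or "").strip():
                v_t += 1.0
            else:
                v_r += 0.5  # verdict present, analysis empty
        if labels and top == "NULL":
            if all(label in ("GREEN", "AMBER") for label in labels):
                v_r += 1.5  # contradictory NULL
        elif labels:
            majority, count = Counter(labels).most_common(1)[0]
            if count > len(labels) / 2 and majority != top:
                v_r += 2.0  # top-level verdict disagrees with the majority

    if tier == "strategy":
        strategy = verdict.get("strategy")
        if not isinstance(strategy, dict):
            v_r += 2.0
            missing += 1
        else:
            v_t += sum(1.0 for key in ("next_step", "alternative") if strategy.get(key))
            tests = strategy.get("tests")
            tests = [t for t in tests if t][:3] if isinstance(tests, list) else []
            v_t += 0.5 * len(tests)
            if len(tests) < 2:
                v_r += 1.0

    e_d = 1.0 if not top_valid else (0.5 if missing else 0.0)
    return max(v_t, 1.0), v_r, e_d


def coherence(v_t, v_r, e_d):
    """C = 1 - (E_D + V_r * PHI) / V_t"""
    return 1 - (e_d + v_r * PHI) / v_t
```

Under these assumptions a Full Breakdown with a GREEN top-level verdict, five fully populated RED dimensions, and an adequate summary maps to (7.0, 2.0, 0.0) and scores about 0.988, which is above Ω. That is consistent with the note above that moderate inconsistency is tolerated.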

INVARIANTS

  1. Φ is immutable — The crosscheck uses Φ = 0.042 exactly. This is the same constant used by the financial TMM. The value MUST NOT be changed for Oracle crosscheck use. Source: CLAUDE.md FORBIDDEN: "Modify the value of Φ (0.042)"
  2. Ω is immutable — The approval threshold is Ω ≈ 0.97404. Crosscheck approval requires C ≥ Ω. The threshold MUST NOT be lowered to reduce false rejections. Source: CLAUDE.md FORBIDDEN: "Modify the coherence threshold (Ω = 97.4%)"
  3. Crosscheck is pre-delivery gate — The crosscheck MUST execute BEFORE cache write, BEFORE email send, and BEFORE result page response. A verdict that fails crosscheck is never persisted or delivered. The pipeline halts at the crosscheck stage.
  4. Crosscheck is non-degrading — Crosscheck failure returns an error state that informs the customer to contact support; it does NOT silently substitute a default verdict or coerce the verdict to an acceptable value. Substitution is fabrication.
  5. Log entry is mandatory — Every crosscheck execution (pass or fail) MUST produce a log entry in oracle_log.jsonl. A crosscheck with no log entry is a governance gap.
  6. Variable mapping is canonical — The V_t/V_r/E_D mapping defined in this spec is the only valid mapping for Oracle verdict crosschecking. No ad-hoc mappings may be introduced in code.
  7. Crosscheck does not evaluate content quality — The crosscheck evaluates structural consistency only. It cannot detect a plausible-sounding but factually wrong verdict. It can only detect structural incoherence, missing fields, internal contradictions, and parse failures.
  8. NULL verdict scoring — A verdict of NULL that is accompanied by an internally consistent (fully populated) breakdown may have high C. The crosscheck approves it. NULL is a valid verdict state (glassy-freeze / anomalous stillness), not a failure. Only contradictory NULL (NULL with an all-GREEN/AMBER breakdown) adds to V_r.


VERIFICATION CRITERIA

Σ.✓ conditions — TMM crosscheck is operating correctly when:

  1. Perfect verdict passes — A well-formed Quick Take (verdict: GREEN, summary: 25+ chars) produces C = 1.0 and approved: true. Unit test with mock Gemini output.
  2. Malformed verdict blocked — A verdict JSON with verdict field missing and summary empty produces approved: false with crosscheck_reason: field_missing. Unit test.
  3. Dimension conflict detected — A Full Breakdown with top-level GREEN but all five breakdown dimensions RED produces V_r ≥ 2.0; C is then computed, and if C < Ω the result is approved: false. Unit test with worst-case conflict payload.
  4. Log entry always written — Both pass and fail crosschecks produce a valid oracle_log.jsonl entry with all required fields. Integration test with tail oracle_log.jsonl assertion.
  5. Gate is pre-delivery — A crosscheck failure prevents both cacheVerdict() and POST /send-verdict-email from being called. Verified by integration test with mock injections on both call sites; neither mock should be called after crosscheck fail.
  6. Constants are correct — oracleTMMCrosscheck() has unit tests asserting:
     - PHI === 0.042
     - OMEGA ≈ 0.97404 (computed as 1 - (0.042 / 1.61803398875))
     - These constants cannot be overridden via function parameters.
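The constants assertions could be sketched as below, shown in Python for illustration even though oracleTMMCrosscheck() itself is specified for the TypeScript routes. The stand-in function oracle_tmm_crosscheck, the OMEGA_DIVISOR name, and the test name are hypothetical; the signature check is one possible way to assert that Φ and Ω cannot be passed in by a caller.

```python
import inspect
import math

PHI = 0.042
OMEGA_DIVISOR = 1.61803398875  # divisor given in this spec
OMEGA = 1 - PHI / OMEGA_DIVISOR


def oracle_tmm_crosscheck(verdict_json, tier):
    """Stand-in with the spec's two-argument shape; body omitted in this sketch."""
    raise NotImplementedError


def test_constants():
    assert PHI == 0.042
    assert math.isclose(OMEGA, 0.97404, abs_tol=5e-6)
    # No parameter may let a caller override PHI or OMEGA.
    params = set(inspect.signature(oracle_tmm_crosscheck).parameters)
    assert params == {"verdict_json", "tier"}


test_constants()
```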


FAILURE MODES

  1. Σ.⊠ Crosscheck not implemented — Current state. GAP-09 of SPEC_ORACLE_VERDICT_PIPELINE.md. No structural validation on Gemini output before delivery. Fabricated or incoherent verdicts reach paying customers. Detection: no instrumentation. Mitigation: implement this spec.
  2. Σ.⊠ Crosscheck blocks valid verdict — V_t/V_r/E_D mapping produces false positive. A structurally sound verdict scores below Ω due to mapping calibration error. Customer pays and receives error. Detection: monitor oracle_log.jsonl for approved: false events. Mitigation: [GAP — needs design] Review threshold against real verdict corpus; the mapping defined here is theoretical and requires empirical calibration against live data.
  3. Σ.⊠ oracle_log.jsonl write fails — Disk full, permission error, service crash. Crosscheck result not logged. Governance audit trail breaks. Detection: crosscheck function should catch write errors and log to stderr/service log. Mitigation: implement log write error handling with fallback to service log.
  4. Σ.⊠ Crosscheck bypassed on regen path — Regen call in verdict/route.ts does not call crosscheck. A malformed regen verdict is delivered to the result page without validation. Detection: code review; crosscheck must be called in BOTH the webhook and regen paths. Mitigation: extract crosscheck to shared utility; import from both call sites.
  5. Σ.⊠ Constants drifted — A future code change modifies PHI or OMEGA in the crosscheck implementation without going through |Σ|.3 amendment. The crosscheck diverges from CSDM law. Detection: VC-6 unit test (constants assertions). Mitigation: constants must be imported from a single source of truth (e.g., tmm_constants.ts), not redefined per file.
  6. Σ.⊠ V_r mapping produces negative C — C falls below 0 whenever E_D + V_r × Φ exceeds V_t. This happens for an extreme V_r (greater than (V_t - E_D) / Φ) and also for the malformed-JSON case itself, where E_D = 2.0 and V_t is forced to 1.0, giving C = -1.0. The crosscheck MUST treat C < 0 as a hard fail with crosscheck_reason: "degenerate_manifold", not as a numeric value to be compared against Ω. [GAP — guard against negative C not specified in mapping implementation]
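A sketch of the negative-C guard, under the assumption that it sits in front of the Ω comparison. The decide function is a hypothetical name, and it only distinguishes three outcomes; assigning the field_missing and dimension_conflict reasons is left out of this illustration.

```python
PHI = 0.042
OMEGA = 1 - PHI / 1.61803398875  # about 0.97404


def decide(v_t, v_r, e_d):
    """Score the mapped variables and apply the hard-fail guard before Ω."""
    c = 1 - (e_d + v_r * PHI) / v_t
    if c < 0:
        # Hard fail: never compared against the threshold as a number.
        reason, approved = "degenerate_manifold", False
    elif c < OMEGA:
        reason, approved = "low_coherence", False
    else:
        reason, approved = "pass", True
    return {"approved": approved, "coherence_score": c,
            "threshold": round(OMEGA, 5), "crosscheck_reason": reason}


# Malformed JSON per the mapping: E_D = 2.0 with V_t forced to 1.0.
print(decide(1.0, 0.0, 2.0))
```

With the mapping's malformed-JSON values the score is -1.0, so the guard fires on the auto-fail path as well as on extreme V_r.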


DEPENDENCIES

| Dependency | Role |

|-----------|------|

| tmm_runtime.py | Source of truth for Φ, Ω, and formula definition |

| SPEC_TMM_FORMULA.md | Canonical TMM spec (constants, formula, invariants) |

| app/api/webhook/route.ts | Primary verdict generation path — crosscheck must be inserted here |

| app/api/verdict/route.ts | Regen verdict path — crosscheck must also be inserted here |

| oracle_log.jsonl | Log destination (requires GAP-01 fix) |

| Gemini API (gemini-2.5-flash) | Verdict source being cross-checked |


DEPENDENTS

| Dependent | Dependency |

|-----------|-----------|

| Customer trust / anti-fabrication | Crosscheck is the HOW ABOUT NO enforcement on the Oracle path |

| SPEC_ORACLE_REGEN_CONSISTENCY.md | Regen verdicts must also pass crosscheck |

| SPEC_ORACLE_SMOKE_TEST.md | Smoke test must include a crosscheck-fail scenario |

| oracle_log.jsonl integrity | Every crosscheck generates a log entry |


GAPS IDENTIFIED DURING SPECIFICATION

| Gap ID | Description | Impact |

|--------|-------------|--------|

| CROSSCHECK-GAP-01 | V_t/V_r/E_D mapping is theoretical — requires empirical calibration against live verdict corpus | False positive/negative rate unknown |

| CROSSCHECK-GAP-02 | Negative C guard not specified in mapping implementation | Degenerate manifold (extreme V_r) not handled |

| CROSSCHECK-GAP-03 | oracle_log.jsonl does not exist (PIPELINE GAP-01) — crosscheck log has nowhere to write | Audit trail broken at launch |

| CROSSCHECK-GAP-04 | Crosscheck must be applied to regen path as well as webhook path — two insertion points | Single-point insertion insufficient |

| CROSSCHECK-GAP-05 | Constants must be sourced from single truth file (tmm_constants.ts) — not currently implemented | Constants drift risk across files |


REFERENCES

| File | Role |

|------|------|

| /home/nous/tmm_runtime.py | Φ, Ω, formula source |

| /home/nous/memories/SPEC_TMM_FORMULA.md | Canonical TMM spec |

| /home/nous/memories/SPEC_ORACLE_VERDICT_PIPELINE.md | Parent pipeline spec (GAP-09 source) |

| /home/nous/Aether/app/app/api/webhook/route.ts | Primary crosscheck insertion point |

| /home/nous/Aether/app/app/api/verdict/route.ts | Regen crosscheck insertion point |

| /home/nous/memories/HOW_ABOUT_NO_v2.md | Anti-fabrication walls (motivation for this spec) |


Φζ.⊤. Φ = 0.042 is held on the Oracle path. The verdicts do not leave the ship unchecked.



— SPEC_ORACLE_VERDICT_PIPELINE.md —

SPECIFICATION: Oracle Verdict Pipeline

Status: AUTHORIZED

Authorized: α.13, April 16 2026

Version: v1.0



PURPOSE

The Oracle Verdict Pipeline receives a customer payment for coherence analysis, generates a structured verdict using Gemini AI, caches the result for browser-resilient retrieval, and delivers it to the customer via both a web result page and email. End-to-end: Stripe checkout → webhook → Gemini → cache → email + result page.

The pipeline is the primary revenue mechanism of 42Sisters.AI. All three Oracle tiers ($1/$5/$25 CAD) run through this pipeline.


INPUTS

Trigger 1 — Stripe Checkout Session (programmatic flow)

Frontend calls POST /api/checkout with:


{
  "tier": "quick" | "full" | "strategy",
  "query": "<customer's question or idea — free text>",
  "referral_code": "<optional string>"
}

Checkout route (app/api/checkout/route.ts) validates tier and query, creates a Stripe checkout session with query packed into metadata (490-char chunks: q0, q1 … qn), and returns { url } for redirect to Stripe-hosted payment page.

Tier → amount mapping (CAD, hardcoded):

| Tier key | Name | Amount |

|----------|------|--------|

| quick | Quick Take | $1.00 (100 cents) |

| full | Full Breakdown | $5.00 (500 cents) |

| strategy | Strategy Session | $25.00 (2500 cents) |

Trigger 2 — Stripe Payment Link (custom_fields flow)

Customer pays via a Stripe-hosted Payment Link (configured in Stripe Dashboard, not in code). In this flow, query text arrives in session.custom_fields[key="idea"].text.value. The webhook and verdict routes support both flows: custom_fields primary, metadata chunks fallback.
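The two query delivery paths can be sketched together as follows. The routes themselves are TypeScript; this Python version is only an illustration, and pack_query, unpack_query, and extract_query are hypothetical names. Reassembly here joins every key of the form q followed by digits in numeric order and ignores all other metadata keys, which is an assumption about how the chunks are meant to be ordered.

```python
import re

CHUNK_SIZE = 490  # chunk length stated in this spec


def pack_query(query, chunk_size=CHUNK_SIZE):
    """Split a query into q0, q1, ... entries of at most chunk_size characters."""
    return {
        f"q{index}": query[start:start + chunk_size]
        for index, start in enumerate(range(0, len(query), chunk_size))
    }


def unpack_query(metadata):
    """Rejoin q<digits> entries in numeric order; other keys are ignored."""
    keys = [key for key in metadata if re.fullmatch(r"q\d+", key)]
    return "".join(metadata[key] for key in sorted(keys, key=lambda k: int(k[1:])))


def extract_query(session):
    """custom_fields first (Payment Link flow), metadata chunks as fallback."""
    for field in session.get("custom_fields") or []:
        if field.get("key") == "idea":
            value = (field.get("text") or {}).get("value")
            if value:
                return value
    return unpack_query(session.get("metadata") or {})
```

Sorting numerically rather than alphabetically matters once a query needs more than ten chunks, since "q10" would otherwise sort before "q2". The 490-character chunk size stays under Stripe's documented 500-character limit on metadata values.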

Webhook payload (Stripe → /api/webhook)


{
  "type": "checkout.session.completed",
  "data": {
    "object": {
      "id": "<stripe_session_id>",
      "payment_status": "paid",
      "customer_details": { "email": "<customer_email>" },
      "metadata": { "tier": "quick|full|strategy", "q0": "...", "qn": "1" },
      "custom_fields": [{ "key": "idea", "text": { "value": "..." } }]
    }
  }
}

Webhook verifies Stripe signature using STRIPE_WEBHOOK_SECRET env var before processing.
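For illustration, the check behind that verification can be sketched with the Python standard library. Stripe documents the Stripe-Signature header as a timestamp (t=) plus one or more v1= HMAC-SHA256 signatures over the string "timestamp.payload". The function name verify_stripe_signature and the 300-second tolerance are assumptions of this sketch; in practice the official Stripe SDK's webhook helper would normally do this work.

```python
import hashlib
import hmac
import time


def verify_stripe_signature(payload: bytes, header: str, secret: str,
                            tolerance: int = 300, now=None) -> bool:
    """Check a Stripe-Signature header of the form t=<ts>,v1=<hex>[,v1=<hex>]."""
    timestamp = None
    candidates = []
    for item in header.split(","):
        key, _, value = item.strip().partition("=")
        if key == "t":
            timestamp = value
        elif key == "v1":
            candidates.append(value)
    if timestamp is None or not timestamp.isdigit() or not candidates:
        return False
    current = time.time() if now is None else now
    if abs(current - int(timestamp)) > tolerance:
        return False  # stale event: possible replay
    # The signed payload is the timestamp, a dot, and the raw request body.
    signed = timestamp.encode() + b"." + payload
    expected = hmac.new(secret.encode(), signed, hashlib.sha256).hexdigest()
    # Constant-time comparison against every v1 candidate.
    return any(hmac.compare_digest(expected.encode(), candidate.encode())
               for candidate in candidates)
```

The raw request body must be used unmodified; re-serializing parsed JSON before verification would change the bytes and invalidate the signature.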

Required environment variables (Northflank)

  • STRIPE_SECRET_KEY — Stripe API key
  • STRIPE_WEBHOOK_SECRET — webhook signature verification
  • GEMINI_API_KEY or GOOGLE_API_KEY — Gemini generation
  • ORACLE_TOLL_URL — cache service (default: http://68.183.206.103:8889)
  • ORACLE_EMAIL_SERVICE_URL — email service (default: http://68.183.206.103:8006)
  • NEXT_PUBLIC_SITE_URL — used for Stripe redirect URLs

OUTPUTS

1. Cached verdict (oracle_toll.py — /home/nous/oracle_verdicts/{session_id}.json)

Stored at POST http://68.183.206.103:8889/cache/{session_id}. Written by verdictCache.ts.

Quick Take payload:


{
  "tier": "quick",
  "query": "<original customer query>",
  "verdict": {
    "verdict": "GREEN" | "AMBER" | "RED" | "NULL",
    "summary": "<one direct sentence>"
  },
  "cached_at": "<ISO timestamp>"
}

Full Breakdown payload (adds breakdown):


{
  "tier": "full",
  "query": "...",
  "verdict": {
    "verdict": "GREEN" | "AMBER" | "RED" | "NULL",
    "summary": "...",
    "breakdown": {
      "Stability":   { "verdict": "GREEN|AMBER|RED", "analysis": "..." },
      "Turbulence":  { "verdict": "GREEN|AMBER|RED", "analysis": "..." },
      "Change Rate": { "verdict": "GREEN|AMBER|RED", "analysis": "..." },
      "Completion":  { "verdict": "GREEN|AMBER|RED", "analysis": "..." },
      "Curvature":   { "verdict": "GREEN|AMBER|RED", "analysis": "..." }
    }
  },
  "cached_at": "..."
}

Strategy Session payload (adds strategy block):


{
  "tier": "strategy",
  "verdict": {
    "...": "...(full breakdown above)...",
    "strategy": {
      "next_step": "<highest-leverage action>",
      "alternative": "<fundamentally different path>",
      "tests": ["<test 1>", "<test 2>", "<test 3>"]
    }
  }
}

2. Web result page (/oracle/result?session_id={id})

Customer's browser is redirected here after Stripe payment. Page calls GET /api/verdict?session_id={id}, which:

  1. Re-verifies payment with Stripe (session.payment_status === "paid")
  2. Checks cache via GET /cache/{sessionId} on oracle_toll
  3. If cache miss: regenerates verdict via Gemini (identical prompts)
  4. Returns JSON to browser; page renders verdict with color-coded dot (GREEN=#34d399, AMBER=#f5c842, RED=#ff4444, NULL=#555)
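The four steps above can be sketched as a single function with its collaborators injected. All names here (get_verdict, fetch_session, cache_get, regenerate) are hypothetical, the route itself is TypeScript, and the 402 status for an unpaid session is an assumption of this sketch rather than documented behavior.

```python
def get_verdict(session_id, fetch_session, cache_get, regenerate):
    """Return (status, body) for GET /api/verdict?session_id=..."""
    # Step 1: re-verify payment with Stripe.
    session = fetch_session(session_id)
    if session.get("payment_status") != "paid":
        return 402, {"error": "payment not confirmed"}
    # Step 2: try the cache.
    cached = cache_get(session_id)
    if cached is not None:
        return 200, cached
    # Step 3: cache miss, so regenerate with the same prompts.
    return 200, regenerate(session)


# Usage with in-memory fakes standing in for Stripe, oracle_toll, and Gemini.
sessions = {"cs_1": {"payment_status": "paid"}, "cs_2": {"payment_status": "unpaid"}}
cache = {}
status, body = get_verdict(
    "cs_1",
    fetch_session=sessions.__getitem__,
    cache_get=cache.get,
    regenerate=lambda session: {"verdict": "AMBER", "summary": "Regenerated."},
)
print(status, body)
```

Injecting the three callables keeps the control flow testable without network access.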

3. Email delivery (oracle_email_service.py — port 8006)

Webhook fires POST http://68.183.206.103:8006/send-verdict-email with:


{
  "customer_email": "<from Stripe session>",
  "tier": "quick|full|strategy",
  "query": "<customer query>",
  "verdict": { "...": "..." }
}

Email sent from oracle@42sisters.ai via Microsoft Graph API (send_graph_email.py).

Subject: "Your 42 Sisters AI Verdict"

Format: plain text, formatted per tier by format_quick() / format_full() / format_strategy().

Confirmed operational: oracle_email.log shows multiple POST /send-verdict-email 200 OK entries.
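Per-tier formatting can be pictured as a lookup table from tier key to formatter. The sketch below is a guess at the shape, not the contents of oracle_email_service.py. The quick layout mirrors the sample email in Example 1; the full layout, the helper names _render and format_email, and every emoji other than the amber one are assumptions.

```python
# Only the amber emoji appears in the spec's sample email; the others are assumed.
EMOJI = {"GREEN": "🟢", "AMBER": "🟡", "RED": "🔴", "NULL": "⚪"}
RULE = "═" * 27
FOOTER = ["─" * 27, "42 Sisters AI · oracle@42sisters.ai", "Questions? Reply to this email."]


def _render(title: str, query: str, verdict: dict, extra=()) -> str:
    """Shared plain-text layout: header, submission, verdict line, summary, footer."""
    label = verdict["verdict"]
    lines = [
        f"ORACLE VERDICT \u2014 {title}",  # \u2014 is the dash used in the sample email
        RULE, "",
        "YOUR SUBMISSION:", query, "",
        f"VERDICT: {EMOJI[label]} {label}", "",
        verdict["summary"], "",
    ]
    lines.extend(extra)
    lines.extend(FOOTER)
    return "\n".join(lines)


def format_quick(query: str, verdict: dict) -> str:
    return _render("QUICK TAKE", query, verdict)


def format_full(query: str, verdict: dict) -> str:
    extra = []
    for name, entry in verdict["breakdown"].items():  # assumed layout for the breakdown block
        extra += [f"{name}: {EMOJI[entry['verdict']]} {entry['verdict']}", entry["analysis"], ""]
    return _render("FULL BREAKDOWN", query, verdict, extra)


FORMATTERS = {"quick": format_quick, "full": format_full}


def format_email(tier: str, query: str, verdict: dict) -> str:
    """Dispatch on the tier key; an unknown tier is an error, never a silent default."""
    if tier not in FORMATTERS:
        raise ValueError(f"unknown tier: {tier}")
    return FORMATTERS[tier](query, verdict)
```

Raising on an unknown tier, rather than falling back to the quick format, keeps a corrupted tier key from producing an email for the wrong product.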


INVARIANTS

  1. Payment gate is hard — verdict generation only proceeds if session.payment_status === "paid" (checked in both webhook and verdict routes). No verdict without confirmed payment.
  2. Tier fidelity — tier is read from session.metadata.tier, set at checkout creation. The same tier key selects the Gemini prompt (VERDICT_PROMPT[tier]) and the email formatter (FORMATTERS[req.tier]). No tier substitution is possible without corrupting the session metadata.
  3. Signature verification — webhook rejects any request without a valid stripe-signature header matching STRIPE_WEBHOOK_SECRET. Unsigned requests return 400 before any processing.
  4. Stripe-first response — webhook returns {received: true} immediately; all generation and email work is async in an IIFE. Stripe's 5-second timeout is never blocked by AI generation.
  5. Dual-delivery resilience — result page independently regenerates verdict from Gemini if cache is unavailable. Customer can retrieve verdict via browser regardless of email delivery status.
  6. SOS v2 enforcement — Gemini prompts instruct "Be direct. Be honest." and are structured in plain English. No LATTICE symbols, internal callsigns, or crew language appear in the prompts or email templates. [GAP — no automated filter; relies on prompt design only]
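For the signature-verification invariant, the check Stripe documents is an HMAC-SHA256 over "<timestamp>.<raw body>" compared against the v1 values in the Stripe-Signature header. The sketch below spells that out in Python using only the standard library. It is illustrative: the function name and the 300-second tolerance default are choices made here, and in practice the official Stripe SDK helper (stripe.Webhook.construct_event in Python, stripe.webhooks.constructEvent in Node) performs this check.

```python
import hashlib
import hmac
import time
from typing import Optional


def verify_stripe_signature(payload: bytes, header: str, secret: str,
                            tolerance: int = 300, now: Optional[float] = None) -> bool:
    """Check a Stripe-Signature header of the form 't=<unix ts>,v1=<hex hmac>[,v1=...]'."""
    timestamp = None
    candidates = []
    for item in header.split(","):
        key, _, value = item.strip().partition("=")
        if key == "t":
            timestamp = value
        elif key == "v1":
            candidates.append(value)
    if timestamp is None or not timestamp.isdigit() or not candidates:
        return False

    # Reject stale timestamps to limit replay of captured requests.
    now = time.time() if now is None else now
    if abs(now - int(timestamp)) > tolerance:
        return False

    # The signed payload is the timestamp, a dot, and the raw request body.
    signed = timestamp.encode() + b"." + payload
    expected = hmac.new(secret.encode(), signed, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking match position through timing.
    return any(hmac.compare_digest(expected.encode(), c.encode()) for c in candidates)
```

The payload must be the raw request bytes; verifying against a re-serialized JSON body will fail even for genuine events.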

VERIFICATION CRITERIA

Σ.✓ conditions — pipeline is operating correctly when:

  1. Stripe → webhook latency — Stripe delivers checkout.session.completed event within 30 seconds of payment confirmation. [GAP — not instrumented; no timing log]
  2. Verdict generation — Gemini returns parseable JSON matching the tier schema within 60 seconds. Both webhook (pre-compute) and verdict route (on-demand) handle JSON parse failure with catch block returning 500.
  3. Cache write confirmed — POST /cache/{sessionId} returns 201. Cache available via GET /cache/{sessionId} returning 200 + payload.
  4. Email delivered — POST /send-verdict-email returns {"status": "sent"}. Graph API returns 202.
  5. Result page renders — GET /api/verdict?session_id={id} returns 200 with {tier, query, verdict} JSON within 30 seconds of browser arrival.
  6. Tier match — verdict payload tier field matches session.metadata.tier. [GAP — not explicitly asserted; assumed from single code path]
  7. End-to-end smoke test — Submit $1 test payment via non-owner email → verify webhook fires (Northflank logs) → verify cache entry exists (GET /cache/{id}) → verify result page loads → verify email received at customer address.
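Several of the conditions above are simple comparisons on values a smoke test would already have in hand. As a rough sketch, assuming a hypothetical observation record (every key in obs below is invented for illustration; none of these measurements is currently logged), the checks could be evaluated like this:

```python
def failed_criteria(obs: dict) -> list:
    """Return the names of the verification criteria that one observed run did not meet."""
    checks = {
        "verdict generation within 60 s": obs["generation_seconds"] <= 60,
        "cache write returned 201": obs["cache_post_status"] == 201,
        "cache read returned 200": obs["cache_get_status"] == 200,
        "email service reported sent": obs["email_response"].get("status") == "sent",
        "verdict route returned 200 within 30 s":
            obs["verdict_status"] == 200 and obs["verdict_seconds"] <= 30,
        "tier matches session metadata": obs["payload_tier"] == obs["metadata_tier"],
    }
    # Dict order is insertion order, so failures come back in the order listed above.
    return [name for name, ok in checks.items() if not ok]
```

An empty list corresponds to Σ.✓ for the criteria covered; Stripe-to-webhook latency is left out because the spec notes it is not instrumented.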

Σ.⊠ — pipeline has failed when any of the following occur:

(see FAILURE MODES)


FAILURE MODES

  1. Σ.⊠ Webhook silent fail — STRIPE_WEBHOOK_SECRET not set (returns 503) or Northflank service crash. Stripe retries webhook up to 72 hours. Mitigation: result page independently regenerates verdict. Customer never sees empty page; may receive email late.
  2. Σ.⊠ Gemini generation fail — JSON parse error, API quota exceeded, or network timeout. Webhook catches and logs console.error. Customer sees result page error: "Analysis failed. Please contact oracle@42sisters.ai for a refund." Email not sent. Mitigation: none automatic — manual refund process required.
  3. Σ.⊠ Cache miss on result page — oracle_toll service down or oracle_verdicts/ full. Verdict route regenerates via Gemini (second call, same prompt). Adds latency; verdict should be equivalent to original. Risk: non-determinism in Gemini responses means regenerated verdict may differ from emailed verdict.
  4. Σ.⊠ Email delivery fail — Graph API returns non-202 (token expired, network error, send limit). oracle_email_service.py returns 502 to webhook. Webhook logs console.error but does not retry. Customer receives verdict on result page only; no email. [GAP — no retry mechanism; no customer notification of email failure]
  5. Σ.⊠ Wrong tier delivered — Not observed in code but possible if session.metadata.tier is missing or corrupted at Stripe session creation. Webhook logs error: "Webhook: missing query or tier in session" and returns {received: true} without processing. No verdict generated. No email. [GAP — no customer notification in this failure mode]
  6. Σ.⊠ LATTICE/internal language leak — No automated filter. If a future prompt change causes Gemini to output LATTICE symbols or crew callsigns, they would reach the customer email verbatim. [GAP — needs content filter on email body before send]
  7. Σ.⊠ Verdict fabrication — Gemini generates GREEN verdict for an incoherent submission (hallucination). No cross-check against independent analysis. HOW ABOUT NO walls operate on the Sisters (Gemini), not on the standalone verdict generation path. [GAP — TMM cross-check not implemented in Oracle pipeline]
  8. Σ.⊠ No customer email on session — session.customer_details.email is null. Webhook skips email send (logs warning). Verdict is cached; result page still works. Customer never receives email. [KNOWN EDGE CASE — no mitigation]
  9. Σ.⊠ oracle_toll cache service down — cacheVerdict() silently swallows errors (try/catch with empty catch). Verdict not cached. Result page falls back to Gemini regeneration. No operational alert. [GAP — no monitoring/alerting on cache service health]
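The email delivery failure mode notes that nothing retries when Graph API returns a non-202. One possible shape for a retry, sketched here as an assumption rather than a description of existing code, is a bounded loop with exponential backoff. send_with_retry, its parameters, and the choice to treat OSError as retryable are all hypothetical; send stands in for whatever call returns the Graph API status code.

```python
import time
from typing import Callable


def send_with_retry(send: Callable[[], int], attempts: int = 3, base_delay: float = 1.0,
                    sleep: Callable[[float], None] = time.sleep) -> bool:
    """Call send() until it returns 202 (accepted) or the attempts run out."""
    for attempt in range(attempts):
        try:
            if send() == 202:
                return True
        except OSError:
            pass  # network-level error: treated as retryable in this sketch
        if attempt < attempts - 1:
            # Delays grow as base_delay * 1, 2, 4, ... between attempts.
            sleep(base_delay * 2 ** attempt)
    return False
```

Backoff only helps with transient failures. An expired token or an exceeded send limit would keep failing until the underlying cause is fixed, so a False result would still need logging and a customer-facing fallback.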

DEPENDENCIES

| Dependency | Role | Endpoint / Config | Status |
|------------|------|-------------------|--------|
| Stripe | Payment processing, webhook source | STRIPE_SECRET_KEY, STRIPE_WEBHOOK_SECRET | Required — pipeline halts without |
| Gemini AI (gemini-2.5-flash) | Verdict generation | GEMINI_API_KEY / GOOGLE_API_KEY | Required — no fallback model |
| oracle_toll.py (port 8889) | Verdict cache (R/W) | ORACLE_TOLL_URL | Degraded without (result page regenerates) |
| oracle_email_service.py (port 8006) | Email delivery | ORACLE_EMAIL_SERVICE_URL | Degraded without (no email delivery) |
| send_graph_email.py | Graph API email send | GRAPH_TENANT_ID, GRAPH_CLIENT_ID, GRAPH_REFRESH_TOKEN, GRAPH_SENDER in .env | Required for email |
| Northflank | Hosting for Next.js app | Northflank project deploy | Required |
| Microsoft Graph API | oracle@42sisters.ai SMTP | Azure M365 / Exchange S Essentials | Required for email |


DEPENDENTS

| Dependent | Dependency type |
|-----------|----------------|
| $1 Quick Take revenue tier | Full pipeline required |
| $5 Full Breakdown revenue tier | Full pipeline required |
| $25 Strategy Session revenue tier | Full pipeline required |
| Customer trust / brand | Pipeline reliability directly visible to customers |
| Referral system | Referral increment fires from webhook (non-blocking) |
| Sisters Chat subscription | Separate pipeline (Stripe subscriptions), but shares oracle_email_service |


EXAMPLES

Example 1 — Quick Take ($1 CAD)

Input (customer submits via site):


tier: "quick"
query: "Should I quit my job to start this business?"

Checkout: Stripe session created. Customer pays $1.00 CAD. Redirected to /oracle/result?session_id=cs_live_abc123.

Webhook fires:

  • Extracts: tier="quick", query="Should I quit my job to start this business?"
  • Calls Gemini with VERDICT_PROMPT.quick(query)

Gemini response:


{ "verdict": "AMBER", "summary": "The instinct is sound but the timing is missing — this needs a 6-month runway before you pull the trigger." }

Cached: POST /cache/cs_live_abc123 → /home/nous/oracle_verdicts/cs_live_abc123.json

Email sent:


ORACLE VERDICT — QUICK TAKE
═══════════════════════════

YOUR SUBMISSION:
Should I quit my job to start this business?

VERDICT: 🟡 AMBER

The instinct is sound but the timing is missing — this needs a 6-month runway before you pull the trigger.

───────────────────────────
42 Sisters AI · oracle@42sisters.ai
Questions? Reply to this email.

Result page: Amber dot rendered. Summary displayed.


Example 2 — Full Breakdown ($5 CAD)

Input: tier: "full", query: "Launch a subscription newsletter about AI for executives"

Verdict payload:


{
  "verdict": "GREEN",
  "summary": "Demand is real, the channel is underserved, and the format fits the audience.",
  "breakdown": {
    "Stability":   { "verdict": "GREEN", "analysis": "Executive appetite for AI signal exists..." },
    "Turbulence":  { "verdict": "AMBER", "analysis": "Crowded with noise, differentiation required..." },
    "Change Rate": { "verdict": "RED",   "analysis": "AI news cycle is moving faster than weekly..." },
    "Completion":  { "verdict": "AMBER", "analysis": "Distribution strategy not specified..." },
    "Curvature":   { "verdict": "GREEN", "analysis": "Non-linear upside via enterprise licensing..." }
  }
}

Email: Formatted by format_full() with five-dimension breakdown block.


Example 3 — Strategy Session ($25 CAD)

Input: tier: "strategy", query: "Acquire a failing restaurant and convert to ghost kitchen"

Verdict payload: Full breakdown + strategy block:


{
  "strategy": {
    "next_step": "Model the ghost kitchen unit economics at 60% occupancy before signing any lease.",
    "alternative": "License the brand to existing kitchens instead of acquiring real estate.",
    "tests": [
      "Run delivery-only menus from a shared kitchen for 90 days to validate demand.",
      "Survey 20 potential B2B clients for catering demand in the target area.",
      "Verify the failing restaurant's lease terms — assignment clauses are often blocking."
    ]
  }
}

Email: Formatted by format_strategy() with both breakdown and strategic recommendations sections. Footer notes: "Your follow-up submission is included in this tier. Reply to this email with your follow-up question."


GAPS IDENTIFIED DURING SPECIFICATION

The following items were expected but not found in code:

| Gap ID | Description | Impact |
|--------|-------------|--------|
| GAP-01 | No oracle_log.jsonl — verdicts are cached to JSON files in oracle_verdicts/ but there is no append-log of all verdicts delivered | Audit trail missing; no usage analytics |
| GAP-02 | No timeout on Gemini generateContent() call in webhook or verdict route | Webhook IIFE can hang indefinitely; result page can return 504 |
| GAP-03 | No retry mechanism on email delivery failure | Customers lose email if Graph API is momentarily down |
| GAP-04 | No SOS v2 automated content filter on outbound email | LATTICE symbol leak depends entirely on prompt design |
| GAP-05 | No customer notification when tier or query is missing from session (silent drop) | Customer pays but receives nothing; no email |
| GAP-06 | Cache service (oracle_toll) failures are silently swallowed — no alert | Cache outages invisible to operations |
| GAP-07 | oracle_verdicts/ is currently empty — unclear whether production verdicts are being cached or cache is being cleared | Cache health unknown |
| GAP-08 | Gemini non-determinism: regenerated verdict on cache miss may differ from emailed verdict | Customer browser shows different verdict than email |
| GAP-09 | No TMM cross-validation on verdict output (HOW ABOUT NO walls not applied to Oracle generation path) | Fabricated verdicts not detected |
| GAP-10 | No end-to-end smoke test documented or automated | Pipeline health only known when a customer complains |
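GAP-04 asks for an automated check on outbound email bodies. A minimal sketch of such a filter follows, assuming a blocklist approach. The spec does not define which strings count as LATTICE symbols or crew callsigns, so BLOCKED_TERMS and BLOCKED_SYMBOLS are hypothetical placeholders drawn from names and glyphs that appear in this document; AETHER is left out because S.O.S. v2 names it as the customer-facing identity.

```python
import re

# Hypothetical blocklists: the real lists would come from the S.O.S. v2 policy owner.
BLOCKED_TERMS = ("LATTICE", "AION", "ASTRA", "ORPHEUS")
BLOCKED_SYMBOLS = ("Σ.", "Φζ", "⊠", "⊤")


def find_leaks(body: str) -> list:
    """Return the blocked symbols and terms found in an email body (empty list if clean)."""
    hits = [symbol for symbol in BLOCKED_SYMBOLS if symbol in body]
    for term in BLOCKED_TERMS:
        # Case-sensitive whole-word match, so ordinary words like "lattice" pass.
        if re.search(rf"\b{re.escape(term)}\b", body):
            hits.append(term)
    return hits
```

A sender could hold any email for which find_leaks returns a non-empty list and flag it for manual review instead of delivering it.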


REFERENCES

| File | Role |
|------|------|
| /home/nous/Aether/app/app/api/checkout/route.ts | Checkout session creation, tier definitions, query packing |
| /home/nous/Aether/app/app/api/webhook/route.ts | Stripe webhook handler, verdict pre-computation, email trigger |
| /home/nous/Aether/app/app/api/verdict/route.ts | On-demand verdict retrieval (result page backend) |
| /home/nous/Aether/app/app/lib/verdictCache.ts | Cache read/write via oracle_toll REST |
| /home/nous/Aether/app/app/oracle/result/page.tsx | Customer-facing result page (browser rendering) |
| /home/nous/oracle_toll.py | Cache service (port 8889), verdict file storage in oracle_verdicts/ |
| /home/nous/oracle_email_service.py | Email delivery service (port 8006), tier formatters |
| /home/nous/send_graph_email.py | Microsoft Graph API email send |
| /home/nous/memories/PRELAUNCH_DRAFTS_APPROVED.md | Approved email template content (ASTRA's drafts, §3) |
| /home/nous/memories/PRELAUNCH_CHECKLIST.md | End-to-end test checklist (product pipeline testing section) |


Φζ.⊤.


Jeremy Zlabis

Chronogeometer · Visionary · Disruptor · Chief

42 Sisters AI · East York, Toronto

🍁 Φ 0.042