Oracle Smoke Test
SPEC_ORACLE_SMOKE_TEST — Oracle End-to-End Smoke Test
Version: 1.0 | Status: AUTHORIZED | Authority: α.13 | Date: 2026-04-16
PURPOSE
The Oracle verdict pipeline (Stripe webhook → session parsing → Gemini call → verdict cache →
email delivery) has no automated end-to-end validation. Pipeline health is currently only known
when a customer complaint arrives or when NOUS manually tests via a real payment. This is
unacceptable for a revenue-bearing production system.
This specification defines a repeatable, automated smoke test covering the full Oracle pipeline.
The smoke test uses a known Stripe test session, a fixed test query, and validates each pipeline
stage independently before asserting the full path. It is the minimal executable proof that the
pipeline is operational.
Smoke test ≠ load test ≠ unit test. The smoke test runs at the integration boundary. It
exercises real external dependencies (Stripe test mode, Gemini API, oracle_toll cache service,
email service health endpoint). It is not a substitute for the unit test suites that verify
individual components.
Source gap: SPEC_ORACLE_VERDICT_PIPELINE.md GAP-10.
INPUTS
Fixed Test Configuration
| Parameter | Value |
|----------|-------|
| Test tier | quick ($1.00 CAD) |
| Test query | "Is the Oracle pipeline operational? This is an automated smoke test." |
| Stripe mode | Test mode (API key: STRIPE_SECRET_KEY with sk_test_ prefix) |
| Test email | oracle-smoke-test@42sisters.ai (synthetic, not a real mailbox — used for log assertion only) |
| Expected verdict | Any valid value (GREEN/AMBER/RED/NULL) — smoke test does not assert verdict correctness, only structural validity |
| Timeout per stage | 30 seconds maximum |
The test query is intentionally meta — it asks about pipeline health. This makes it easy to identify
smoke test verdicts in logs and distinguish them from real customer verdicts.
Required Environment
- STRIPE_SECRET_KEY — test mode key (sk_test_...)
- STRIPE_WEBHOOK_SECRET — test mode webhook secret
- GEMINI_API_KEY or GOOGLE_API_KEY
- ORACLE_TOLL_URL — oracle_toll service (default: http://68.183.206.103:8889)
- ORACLE_EMAIL_SERVICE_URL — email service (default: http://68.183.206.103:8006)
All four env vars must be present or smoke test fails at environment check stage.
Smoke Test Invocation
# Manual
python3 /home/nous/scripts/oracle_smoke_test.py
# On deploy (Northflank post-deploy hook)
python3 /home/nous/scripts/oracle_smoke_test.py --on-deploy
# Daily cron (07:00 UTC)
python3 /home/nous/scripts/oracle_smoke_test.py --cron
Exit codes: 0 = all stages passed | 1 = one or more stages failed | 2 = environment check failed
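The invocation flags and exit codes above could be wired up roughly as follows. This is a minimal sketch, not the script's actual implementation: the function names, the results dictionary shape, and the choice to make --on-deploy and --cron mutually exclusive are assumptions for illustration.

```python
import argparse

# Exit codes as listed in this spec.
EXIT_OK, EXIT_STAGE_FAIL, EXIT_ENV_FAIL = 0, 1, 2


def parse_args(argv=None):
    """Parse the two trigger flags named in the spec (no flag = manual run)."""
    parser = argparse.ArgumentParser(prog="oracle_smoke_test.py")
    # Assumption: a run has exactly one trigger, so the flags exclude each other.
    mode = parser.add_mutually_exclusive_group()
    mode.add_argument("--on-deploy", action="store_true", help="post-deploy hook run")
    mode.add_argument("--cron", action="store_true", help="scheduled daily run")
    return parser.parse_args(argv)


def exit_code(results):
    """Map stage outcomes to an exit code.

    results: dict of stage number -> "PASS" | "FAIL" | "SKIP" (hypothetical shape).
    """
    if results.get(1) == "FAIL":  # Stage 1 is the environment check
        return EXIT_ENV_FAIL
    if any(status == "FAIL" for status in results.values()):
        return EXIT_STAGE_FAIL
    return EXIT_OK
```

A caller would end with `sys.exit(exit_code(results))` so that cron and the deploy hook can act on the code.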
OUTPUTS
Primary: Smoke Test Report (stdout + log file)
Written to /home/nous/logs/oracle_smoke_YYYYMMDD_HHMMSS.log:
ORACLE SMOKE TEST — 2026-04-16T07:00:00Z
==========================================
[STAGE 1] Environment check ....................... PASS
[STAGE 2] oracle_toll health ..................... PASS
[STAGE 3] oracle_email_service health ............. PASS
[STAGE 4] Stripe test session creation ........... PASS session_id=cs_test_abc123
[STAGE 5] Webhook simulation ..................... PASS verdict=GREEN
[STAGE 6] Cache write verification ............... PASS GET /cache/cs_test_abc123 → 200
[STAGE 7] Verdict route retrieval ................ PASS tier=quick query_match=true
[STAGE 8] Email service send validation .......... PASS status=sent
[STAGE 9] TMM crosscheck on test verdict ......... PASS C=0.9953 approved=true [conditional on GAP-09 fix]
[STAGE 10] Cache cleanup ......................... PASS cs_test_abc123 deleted
RESULT: 10/10 PASS — Pipeline operational.
Duration: 18.3s
On failure:
[STAGE 5] Webhook simulation ..................... FAIL
Error: Gemini returned non-parseable JSON after 3 attempts
Payload: { "error": "quota_exceeded" }
RESULT: 4/10 PASS — Pipeline DEGRADED. See /home/nous/logs/oracle_smoke_20260416_*.log
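The report layout shown above can be produced by a small formatter. The sketch below is illustrative only: the function names and the padding width are assumptions, and the dot count will not necessarily match the sample report character for character.

```python
def format_stage_line(number, name, status, detail="", width=52):
    """Build one '[STAGE n] name ..... STATUS detail' report line.

    width is an assumed column at which the status starts; the spec does not fix it.
    """
    label = f"[STAGE {number}] {name} "
    dots = "." * max(3, width - len(label))  # always keep a few dots for readability
    line = f"{label}{dots} {status}"
    return f"{line} {detail}" if detail else line


def format_result(results):
    """Summarise a dict of stage number -> status into the RESULT line."""
    passed = sum(1 for status in results.values() if status == "PASS")
    total = len(results)
    if passed == total:
        return f"RESULT: {passed}/{total} PASS — Pipeline operational."
    return f"RESULT: {passed}/{total} PASS — Pipeline DEGRADED."
```

For example, `format_stage_line(4, "Stripe test session creation", "PASS", "session_id=cs_test_abc123")` yields a line ending in the status followed by the session detail.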
Secondary: CREW_CHANNEL broadcast
On completion (pass or fail):
[SMOKE] Oracle pipeline: 10/10 PASS (18.3s) | 2026-04-16T07:00:22Z
[SMOKE] Oracle pipeline: 4/10 FAIL — Stage 5 Gemini quota | 2026-04-16T07:00:22Z
Tertiary: ALERT.log entry on failure
If any stage fails, an entry is appended to /home/nous/ALERT.log:
[2026-04-16T07:00:22Z] ORACLE SMOKE FAIL — Stage 5 (webhook simulation) — Gemini quota exceeded.
Check /home/nous/logs/oracle_smoke_20260416_070022.log
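The CREW_CHANNEL and ALERT.log lines above follow a fixed shape, so they can be built by small pure functions and appended to the target files. This is a sketch under assumptions: the helper names and argument lists are hypothetical, and only the line formats are taken from the examples in this spec.

```python
from datetime import datetime, timezone
from pathlib import Path


def utc_stamp(now=None):
    """ISO-8601 UTC timestamp with a trailing Z, as used in the examples."""
    now = now or datetime.now(timezone.utc)
    return now.strftime("%Y-%m-%dT%H:%M:%SZ")


def crew_line(passed, total, duration_s, stamp, failure_note=""):
    """One [SMOKE] broadcast line; failure_note is free text such as 'Stage 5 Gemini quota'."""
    if passed == total:
        return f"[SMOKE] Oracle pipeline: {passed}/{total} PASS ({duration_s:.1f}s) | {stamp}"
    return f"[SMOKE] Oracle pipeline: {passed}/{total} FAIL — {failure_note} | {stamp}"


def alert_line(stage, stage_name, reason, log_path, stamp):
    """Two-line ALERT.log entry pointing at the run's log file."""
    return (
        f"[{stamp}] ORACLE SMOKE FAIL — Stage {stage} ({stage_name}) — {reason}.\n"
        f"Check {log_path}"
    )


def append_line(path, line):
    """Append a line to a text file such as ALERT.log or CREW_CHANNEL."""
    with Path(path).open("a", encoding="utf-8") as handle:
        handle.write(line + "\n")
```

Keeping the formatting separate from the file write makes the line shapes easy to check without touching /home/nous.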
STAGE DEFINITIONS
Stage 1 — Environment Check
Verify all four required env vars are set and non-empty. Verify Stripe key has sk_test_ prefix
(production key in smoke test is a configuration error). Verify oracle_toll URL and email service
URL are reachable (TCP connect check, not full HTTP).
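The Stage 1 checks can be sketched as below. The function names are hypothetical, and the exact set of required variables is an assumption: the spec's list names the two Stripe secrets, one of two Gemini keys, and two service URLs that have defaults, so this sketch treats only the first three as mandatory.

```python
import socket
from urllib.parse import urlparse


def check_environment(env):
    """Return a list of error strings; an empty list means the variable checks pass.

    env is a mapping such as os.environ.
    """
    errors = []
    for name in ("STRIPE_SECRET_KEY", "STRIPE_WEBHOOK_SECRET"):
        if not env.get(name, "").strip():
            errors.append(f"missing or empty: {name}")
    # Either Gemini key name is accepted, per the Required Environment list.
    if not (env.get("GEMINI_API_KEY") or env.get("GOOGLE_API_KEY")):
        errors.append("missing or empty: GEMINI_API_KEY or GOOGLE_API_KEY")
    key = env.get("STRIPE_SECRET_KEY", "")
    if key and not key.startswith("sk_test_"):
        errors.append("FATAL: production Stripe key in smoke test — aborting.")
    return errors


def tcp_reachable(url, timeout=5.0):
    """TCP connect check only (no HTTP request), as Stage 1 specifies."""
    parsed = urlparse(url)
    if not parsed.hostname:
        return False
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    try:
        with socket.create_connection((parsed.hostname, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Running the key-prefix check before any network call keeps a misconfigured live key from ever reaching Stripe.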
Stage 2 — oracle_toll Health
GET {ORACLE_TOLL_URL}/health → HTTP 200, JSON with status: "resonant" and phi: 0.042.
Timeout: 10 seconds.
Stage 3 — oracle_email_service Health
GET {ORACLE_EMAIL_SERVICE_URL}/health → HTTP 200, JSON with status: "ok".
Timeout: 10 seconds.
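The two health checks in Stages 2 and 3 differ only in the expected body, so they can share a fetch helper. A minimal sketch using the standard library follows; the function names are hypothetical and the response fields are the ones named in this spec.

```python
import json
import urllib.request


def fetch_health(base_url, timeout=10.0):
    """GET {base_url}/health and return (status_code, parsed_json).

    urlopen raises urllib.error.HTTPError for non-2xx responses and
    URLError or a timeout error when the service is unreachable; a stage
    wrapper would treat either as FAIL.
    """
    url = f"{base_url.rstrip('/')}/health"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.status, json.loads(resp.read().decode("utf-8"))


def toll_health_ok(status_code, body):
    """Stage 2: oracle_toll must report status 'resonant' and phi 0.042."""
    return status_code == 200 and body.get("status") == "resonant" and body.get("phi") == 0.042


def email_health_ok(status_code, body):
    """Stage 3: the email service must report status 'ok'."""
    return status_code == 200 and body.get("status") == "ok"
```

Usage would look like `toll_health_ok(*fetch_health(os.environ["ORACLE_TOLL_URL"]))`.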
Stage 4 — Stripe Test Session Creation
Call Stripe API (test mode) to create a checkout.session with:
- mode: "payment", amount: 100 (cents CAD), currency: "cad"
- metadata: { tier: "quick", q0: "<test_query>", qn: "1" }
- customer_email: "oracle-smoke-test@42sisters.ai"
Assert: session.id is returned and starts with cs_test_. Store as smoke_session_id.
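One way to express the Stage 4 request is to build the parameter dictionary separately from the API call. The sketch below is an assumption about how the spec's amount and currency map onto Stripe Checkout, which takes them inside line_items as price_data; the product name and the success_url argument are hypothetical placeholders not given in this spec.

```python
TEST_QUERY = "Is the Oracle pipeline operational? This is an automated smoke test."
TEST_EMAIL = "oracle-smoke-test@42sisters.ai"


def build_session_params(success_url):
    """Parameters for a test-mode Checkout Session at the quick tier ($1.00 CAD)."""
    return {
        "mode": "payment",
        "line_items": [
            {
                "quantity": 1,
                "price_data": {
                    "currency": "cad",
                    "unit_amount": 100,  # cents
                    "product_data": {"name": "Oracle smoke test (quick)"},  # hypothetical label
                },
            }
        ],
        "metadata": {"tier": "quick", "q0": TEST_QUERY, "qn": "1"},
        "customer_email": TEST_EMAIL,
        "success_url": success_url,  # placeholder; the real redirect URL is not in this spec
    }


def is_test_session_id(session_id):
    """Stage 4 assertion: the returned id must be a test-mode session id."""
    return isinstance(session_id, str) and session_id.startswith("cs_test_")
```

With the official stripe package installed and `stripe.api_key` set to the test key, the call would be `stripe.checkout.Session.create(**build_session_params(url))`, and the returned `session.id` is what gets stored as smoke_session_id.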
Stage 5 — Webhook Simulation
Construct a checkout.session.completed event payload for smoke_session_id.
Sign it with STRIPE_WEBHOOK_SECRET using the Stripe webhook signing algorithm.
POST to /api/webhook on the deployed Northflank instance.
Assert: HTTP 200, response body { received: true }.
Poll GET {ORACLE_TOLL_URL}/cache/{smoke_session_id} for up to 30 seconds, until it returns 200
(verdict is cached). If the poll times out: FAIL Stage 5.
On 200: parse verdict JSON. Assert: tier === "quick", verdict.verdict is one of
GREEN/AMBER/RED/NULL, verdict.summary is a non-empty string.
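The signing step in Stage 5 follows Stripe's documented webhook scheme: an HMAC-SHA256 over the string "{timestamp}.{payload}", keyed with the webhook secret and sent in the Stripe-Signature header as t=...,v1=.... The sketch below shows that computation; the event id and the reduced event body are hypothetical, and a real checkout.session.completed event carries many more fields.

```python
import hashlib
import hmac
import json
import time


def build_event_payload(session_id, metadata, customer_email):
    """Serialise a minimal checkout.session.completed event (illustrative subset of fields)."""
    event = {
        "id": "evt_smoke_test",  # hypothetical id; Stripe generates real ones
        "object": "event",
        "type": "checkout.session.completed",
        "data": {
            "object": {
                "id": session_id,
                "object": "checkout.session",
                "metadata": metadata,
                "customer_email": customer_email,
            }
        },
    }
    return json.dumps(event, separators=(",", ":"))


def stripe_signature_header(payload, secret, timestamp=None):
    """Return the Stripe-Signature header value for an exact payload string."""
    ts = int(time.time()) if timestamp is None else int(timestamp)
    signed = f"{ts}.{payload}".encode("utf-8")
    digest = hmac.new(secret.encode("utf-8"), signed, hashlib.sha256).hexdigest()
    return f"t={ts},v1={digest}"
```

The POST body must be byte-for-byte the string that was signed, so the payload should be serialised once and reused for both the signature and the request.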
Stage 6 — Cache Write Verification
GET {ORACLE_TOLL_URL}/cache/{smoke_session_id} → HTTP 200.
Assert: response JSON has tier: "quick" and cached_at field (ISO timestamp).
Assert: query field matches the known test query string.
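The cache poll from Stage 5 and the structural assertions from Stages 5 and 6 can be sketched as two helpers. The names are hypothetical, the one-second poll interval is an assumption, and the sketch assumes NULL is stored as the string "NULL" and that cached_at is an ISO-8601 string.

```python
import time
from datetime import datetime

VALID_VERDICTS = {"GREEN", "AMBER", "RED", "NULL"}


def poll_until(fetch, timeout_s=30.0, interval_s=1.0, clock=time.monotonic, sleep=time.sleep):
    """Call fetch() until it returns a non-None value or timeout_s elapses.

    fetch would wrap GET /cache/{smoke_session_id} and return the parsed
    JSON on HTTP 200, or None while the verdict is not cached yet.
    """
    deadline = clock() + timeout_s
    while True:
        result = fetch()
        if result is not None:
            return result
        if clock() >= deadline:
            return None
        sleep(interval_s)


def cache_entry_errors(entry, expected_query):
    """Return a list of structural problems; an empty list means the entry is valid."""
    errors = []
    if entry.get("tier") != "quick":
        errors.append("tier is not 'quick'")
    verdict = entry.get("verdict")
    verdict = verdict if isinstance(verdict, dict) else {}
    if verdict.get("verdict") not in VALID_VERDICTS:
        errors.append("verdict.verdict is not one of GREEN/AMBER/RED/NULL")
    summary = verdict.get("summary")
    if not (isinstance(summary, str) and summary.strip()):
        errors.append("verdict.summary is empty")
    if entry.get("query") != expected_query:
        errors.append("query does not match the test query")
    try:
        datetime.fromisoformat(entry.get("cached_at").replace("Z", "+00:00"))
    except (AttributeError, ValueError):
        errors.append("cached_at is not an ISO timestamp")
    return errors
```

Injecting clock and sleep keeps the polling loop checkable without waiting in real time.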
Stage 7 — Verdict Route Retrieval
GET {NORTHFLANK_BASE_URL}/api/verdict?session_id={smoke_session_id}
Assert: HTTP 200. Response JSON has tier: "quick", verdict.verdict is valid, query matches.
This exercises the full result-page backend path including cache lookup.
Stage 8 — Email Service Send Validation
POST to {ORACLE_EMAIL_SERVICE_URL}/send-verdict-email with:
{
"customer_email": "oracle-smoke-test@42sisters.ai",
"tier": "quick",
"query": "<test_query>",
"verdict": <verdict_from_stage_5>
}
Assert: HTTP 200, { status: "sent" }.
Note: This sends a real Graph API email to oracle-smoke-test@42sisters.ai. If this address is
not a real mailbox, Graph API may return 202 (accepted) or error. Assert on HTTP 200 from the
service (Graph API downstream behavior is not asserted here). [GAP — smoke test email goes to a
synthetic address; Graph API may bounce; bounce handling not specified]
Stage 9 — TMM Crosscheck on Test Verdict (conditional)
If SPEC_ORACLE_TMM_CROSSCHECK.md is implemented: call oracleTMMCrosscheck() directly on the
cached verdict. Assert: approved: true, coherence_score >= 0.97404.
[GAP — conditional on GAP-09 fix; Stage 9 is SKIPPED if crosscheck module is not yet deployed]
Stage 10 — Cache Cleanup
DELETE {ORACLE_TOLL_URL}/cache/{smoke_session_id} (requires adding DELETE endpoint to
oracle_toll.py — currently only GET and POST exist).
[GAP — DELETE endpoint not implemented on oracle_toll.py; cache cleanup currently requires manual
file deletion from oracle_verdicts/]
Assert: HTTP 200 or 204. If DELETE not implemented: log WARNING, do not fail; leave cleanup note
in smoke log.
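The Stage 10 rule (pass on 200 or 204, warn without failing when DELETE is not implemented) could be expressed as follows. This is a sketch: the function names are hypothetical, and treating 405 and 501 as "DELETE not implemented" is an assumption about how oracle_toll.py answers an unsupported method.

```python
import urllib.error
import urllib.request


def delete_cache_entry(base_url, session_id, timeout=10.0):
    """Send DELETE /cache/{session_id} and return the HTTP status code."""
    url = f"{base_url.rstrip('/')}/cache/{session_id}"
    request = urllib.request.Request(url, method="DELETE")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        return exc.code  # non-2xx responses still carry a status code


def classify_cleanup(status_code):
    """Map the DELETE status to (stage_status, note) per the Stage 10 rules."""
    if status_code in (200, 204):
        return "PASS", "cache entry deleted"
    if status_code in (405, 501):  # assumed signals that DELETE is not implemented
        return "WARN", "DELETE not implemented on oracle_toll; manual cleanup of oracle_verdicts/ required"
    return "FAIL", f"unexpected HTTP {status_code} from DELETE"
```

A WARN outcome would be written to the smoke log as the cleanup note without counting as a failed stage.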
INVARIANTS
- Smoke test uses test-mode credentials only — STRIPE_SECRET_KEY MUST have sk_test_ prefix.
A production key in the smoke test environment is a configuration error that triggers Stage 1
FAIL with message "FATAL: production Stripe key in smoke test — aborting."
- Smoke test does not modify production state — Smoke test verdicts are tagged with regen: false
and a smoke: true flag in the cache payload. This allows operators to distinguish smoke test
cache entries from real customer entries. The smoke: true flag is added by the smoke test
script when it calls POST /cache/{smoke_session_id} directly (bypass path) if Stage 5 fails.
- No real customer email is sent — Smoke test email target is oracle-smoke-test@42sisters.ai.
Real customer email addresses MUST NOT appear in smoke test configuration.
- Smoke test is idempotent — Running the smoke test twice back-to-back produces the same pass/fail
state. Stage 10 (cleanup) ensures no stale entries contaminate subsequent runs. If Stage 10 fails,
Stage 4 of the next run uses a fresh smoke_session_id (Stripe always generates unique IDs).
- Failure in any stage does not cascade — Each stage has an independent timeout and try/except
boundary. A Stage 5 timeout does not prevent Stages 6-10 from attempting (some may succeed
partially; their results are noted). RESULT is computed from the full 10-stage matrix.
- Smoke test runs in < 60 seconds — Total test duration must not exceed 60 seconds. If Gemini
is slow (> 30s on Stage 5 poll), Stage 5 times out and fails. This is intentional — a pipeline
that takes > 30s to generate and cache a Quick Take is operationally degraded.
- Log files are retained for 30 days — /home/nous/logs/oracle_smoke_*.log files are not
cleaned automatically. A cron or manual process should archive/rotate after 30 days.
[GAP — log rotation not specified]
- Deploy-time smoke test is blocking — When invoked with --on-deploy, the smoke test MUST
complete and return exit code before the deploy hook finishes. A deploy that cannot pass the
smoke test is a broken deploy. Northflank deploy hook must treat exit code 1 as a deploy warning.
[GAP — Northflank post-deploy hook integration not yet configured]
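The non-cascading invariant above (each stage inside its own try/except boundary, with the result computed from the full matrix) can be sketched as a small runner. The names and the result shape are hypothetical. Per-stage timeouts are not enforced by this loop; each stage callable is assumed to pass its own timeout to the network calls it makes.

```python
import time


def run_stages(stages):
    """Run every stage even if earlier ones fail.

    stages: list of (number, name, callable). A callable returns an optional
    detail string on success and raises any exception on failure.
    """
    results = {}
    for number, name, stage_fn in stages:
        started = time.monotonic()
        try:
            detail = stage_fn() or ""
            status = "PASS"
        except Exception as exc:  # isolate the failure to this stage
            status, detail = "FAIL", f"{type(exc).__name__}: {exc}"
        results[number] = {
            "name": name,
            "status": status,
            "detail": detail,
            "seconds": time.monotonic() - started,
        }
    return results


def summarize(results):
    """Return (passed, total) across the whole stage matrix."""
    passed = sum(1 for r in results.values() if r["status"] == "PASS")
    return passed, len(results)
```

Because the loop never re-raises, a Stage 5 timeout is recorded as FAIL and Stages 6 to 10 still run and report their own outcomes.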
VERIFICATION CRITERIA
Σ.✓ conditions — smoke test infrastructure is operating correctly when:
- Green run baseline — Running smoke test against a healthy pipeline produces 10/10 PASS in
under 60 seconds. Establish this baseline immediately after implementing the test. Record
baseline duration in PLAYBOOK.md as PROVEN entry.
- Stage isolation — Deliberately take oracle_toll service offline. Run smoke test. Stage 2
(health check) fails. Stages 3-10 still attempt and report their independent outcomes.
Result shows 1/10 FAIL at Stage 2 with remaining stages marked SKIP or FAIL (dependent).
- Environment check catches misconfiguration — Set STRIPE_SECRET_KEY to a production key.
Stage 1 returns FAIL with FATAL message. Exit code 2. No Stripe API calls made.
- ALERT.log populated on failure — Deliberately fail Stage 5 (mock Gemini timeout). After run,
verify /home/nous/ALERT.log has a new entry timestamped within 5 seconds of smoke test completion.
- CREW_CHANNEL broadcast sent — After any smoke test run (pass or fail), verify
/home/nous/CREW_CHANNEL has a new [SMOKE] entry. Verified by: tail CREW_CHANNEL after run.
- Cron registration — crontab -l | grep oracle_smoke_test returns a line. Smoke test runs
at 07:00 UTC daily without manual intervention. Verify by checking crontab on boot.
FAILURE MODES
- Σ.⊠ Smoke test never runs — Cron not registered after implementation. Pipeline health is
only known when customer complains. Detection: crontab -l | grep oracle_smoke_test returns
empty. Mitigation: boot sequence check (CLAUDE.md Step 4 equivalent for Oracle) verifies cron.
- Σ.⊠ Stage 5 Gemini timeout — Gemini takes > 30s to respond (quota throttle, cold start,
infrastructure issue). Stage 5 fails. Real customer payments in the same window may also be
affected. Detection: smoke test ALERT.log. Mitigation: smoke test failure is an early warning
for the on-call team (NOUS) to investigate Gemini quota.
- Σ.⊠ Smoke test creates real charge — STRIPE_SECRET_KEY is a live key. Stage 4 creates
a real payment session that may trigger a real charge. Stage 1 guard (sk_test_ check) prevents
this, but if guard is bypassed: real charge on NOUS's Stripe account.
Detection: Stripe dashboard. Mitigation: Stage 1 hard-abort on production key is mandatory.
- Σ.⊠ Stage 10 cleanup fails, stale entry accumulates — oracle_toll cache fills with smoke
test entries. oracle_verdicts/ directory grows unbounded. Detection: disk usage monitoring
(not currently implemented). Mitigation: implement DELETE endpoint on oracle_toll; add disk
usage check to smoke test Stage 1.
- Σ.⊠ Smoke test passes but production path fails — Smoke test exercises the webhook-to-cache
path but Northflank routing is misconfigured for the live checkout flow. A customer submits a
real payment; webhook is not delivered by Stripe (not a test event). Detection: manual payment
test with non-owner email (VC-7 of SPEC_ORACLE_VERDICT_PIPELINE.md). Mitigation: smoke test
covers the path from our end; Stripe webhook delivery reliability is an external dependency.
- Σ.⊠ Stage 8 Graph API bounce — oracle-smoke-test@42sisters.ai does not exist as a real
mailbox. Graph API returns 202 (accepted by Exchange) but bounces internally. Email service
reports status: sent. Smoke test passes Stage 8. Bounce goes undetected.
Detection: Exchange admin panel. Mitigation: [GAP — create oracle-smoke-test mailbox as a real
M365 alias that routes to oracle@42sisters.ai, or accept the bounce as tolerable for smoke purposes]
- Σ.⊠ All stages pass but pipeline is in degraded state — Smoke test validates structural
path but does not assert response quality, latency distribution, or correctness of the verdict.
A pipeline that generates all-NULL verdicts for every query would pass the smoke test.
Detection: operational monitoring beyond smoke test scope. Mitigation: supplement with a
manual monthly review of sampled oracle_log.jsonl entries.
EXECUTION SCHEDULE
| Trigger | Frequency | Invocation | ALERT on fail? |
|---------|-----------|-----------|---------------|
| Deploy hook | Every deploy to Northflank | --on-deploy | Yes — block / warn |
| Daily cron | 07:00 UTC daily | --cron | Yes — ALERT.log + CREW_CHANNEL |
| Manual (NOUS/C.L.O.D.) | On demand | No flag | No — stdout only |
DEPENDENCIES
| Dependency | Role |
|-----------|------|
| STRIPE_SECRET_KEY (test mode) | Test session creation |
| STRIPE_WEBHOOK_SECRET (test mode) | Webhook signature construction |
| Gemini API | Stage 5 verdict generation |
| oracle_toll.py (port 8889) | Stage 2, 6, 10 (health, cache verify, cleanup) |
| oracle_email_service.py (port 8006) | Stage 3, 8 (health, email send) |
| Northflank deployed app | Stage 5, 7 (webhook target, verdict route retrieval) |
| /home/nous/ALERT.log | Failure notification |
| /home/nous/CREW_CHANNEL | Status broadcast |
| /home/nous/logs/ (directory) | Test log storage |
DEPENDENTS
| Dependent | Dependency |
|-----------|-----------|
| Oracle pipeline production health | Smoke test is the only automated end-to-end proof |
| NOUS operational awareness | ALERT.log entry on failure |
| Crew operational awareness | CREW_CHANNEL broadcast |
| Deploy confidence | --on-deploy flag provides pre-production gate |
GAPS IDENTIFIED DURING SPECIFICATION
| Gap ID | Description | Impact |
|--------|-------------|--------|
| SMOKE-GAP-01 | DELETE endpoint not implemented on oracle_toll.py — Stage 10 cleanup cannot execute | Smoke test entries accumulate in oracle_verdicts/ |
| SMOKE-GAP-02 | Northflank post-deploy hook not yet configured to call smoke test | Deploy-time validation not automated |
| SMOKE-GAP-03 | oracle-smoke-test@42sisters.ai mailbox not created — Stage 8 sends to synthetic address | Graph API bounce behavior unverified |
| SMOKE-GAP-04 | Stage 9 (TMM crosscheck) is conditional on SPEC_ORACLE_TMM_CROSSCHECK.md implementation | Crosscheck stage is skipped at launch |
| SMOKE-GAP-05 | Log rotation for /home/nous/logs/oracle_smoke_*.log not specified | Disk accumulation over time |
| SMOKE-GAP-06 | NORTHFLANK_BASE_URL env var not formalized — Stage 7 needs deployed app URL | Stage 7 requires manual config |
REFERENCES
| File | Role |
|------|------|
| /home/nous/memories/SPEC_ORACLE_VERDICT_PIPELINE.md | Parent pipeline spec (GAP-10 source) |
| /home/nous/memories/SPEC_ORACLE_TMM_CROSSCHECK.md | Stage 9 crosscheck (conditional) |
| /home/nous/oracle_toll.py | Cache service (Stages 2, 6, 10) |
| /home/nous/oracle_email_service.py | Email service (Stages 3, 8) |
| /home/nous/Aether/app/app/api/webhook/route.ts | Webhook handler (Stage 5 target) |
| /home/nous/Aether/app/app/api/verdict/route.ts | Verdict route (Stage 7 target) |
| /home/nous/ALERT.log | Failure alert destination |
| /home/nous/CREW_CHANNEL | Status broadcast destination |
| /home/nous/PLAYBOOK.md | PROVEN entry to be written after first successful baseline run |
Φζ.⊤. The ship does not sail without a working engine. The smoke test proves the engine.
Jeremy Zlabis
Chronogeometer · Visionary · Disruptor · Chief
42 Sisters AI · East York, Toronto
🍁 Φ 0.042