Oracle Gemini Timeout

SPEC_ORACLE_GEMINI_TIMEOUT.md · 2026-04-20

SPEC_ORACLE_GEMINI_TIMEOUT — Gemini Timeout & Retry Spec

Version: 1.0 | Status: AUTHORIZED | Authority: α.13 | Date: 2026-04-16


PURPOSE

Defines the timeout, retry, and fallback behavior for all calls to the Gemini API (model.generateContent()) within the Oracle Verdict Pipeline. Addresses GAP-02 from SPEC_ORACLE_VERDICT_PIPELINE: currently, both webhook/route.ts and verdict/route.ts call Gemini with no timeout, no retry, and no circuit breaker. A hung Gemini call in the webhook IIFE consumes memory indefinitely; a hung call in the verdict route returns a 504 to the customer's browser.

This spec governs the behavior that MUST be implemented to close GAP-02.

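Call sites in scope: app/api/webhook/route.ts (the webhook IIFE, lines 228–231) and app/api/verdict/route.ts (lines 184–185).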


INPUTS

Per call-site inputs

| Input | Type | Source | Required |
|-------|------|---------|----------|
| prompt | string | VERDICT_PROMPTtier | Yes |
| model | GoogleGenerativeAI model instance | Constructed inline with gemini-2.5-flash | Yes |
| GEMINI_CALL_TIMEOUT_MS | number (env) | process.env.GEMINI_CALL_TIMEOUT_MS | No (default 45000) |
| GEMINI_MAX_RETRIES | number (env) | process.env.GEMINI_MAX_RETRIES | No (default 3; counts total attempts, including the first) |
| GEMINI_BACKOFF_BASE_MS | number (env) | process.env.GEMINI_BACKOFF_BASE_MS | No (default 1000) |
| GEMINI_CIRCUIT_OPEN_THRESHOLD | number (env) | process.env.GEMINI_CIRCUIT_OPEN_THRESHOLD | No (default 5) |
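
The inputs above could be read along the following lines. This is a minimal sketch, not the spec's implementation: the helper names (readPositiveInt, loadGeminiCallConfig), the GeminiCallConfig shape, and the choice to fall back to the default on an invalid value are all assumptions made for illustration. Only the variable names and default values come from the table. The environment is passed in as a parameter so the function can be exercised without touching process.env.

```typescript
// Hypothetical config shape; field names are illustrative, not from the spec.
interface GeminiCallConfig {
  timeoutMs: number;
  maxAttempts: number; // GEMINI_MAX_RETRIES, read as total attempts
  backoffBaseMs: number;
  circuitOpenThreshold: number;
}

type Env = Record<string, string | undefined>;

// Assumption: a missing, non-numeric, non-integer or non-positive value
// falls back to the documented default instead of throwing.
function readPositiveInt(env: Env, name: string, fallback: number): number {
  const raw = env[name];
  if (raw === undefined || raw.trim() === "") return fallback;
  const parsed = Number(raw);
  const valid = isFinite(parsed) && Math.floor(parsed) === parsed && parsed > 0;
  return valid ? parsed : fallback;
}

function loadGeminiCallConfig(env: Env): GeminiCallConfig {
  return {
    timeoutMs: readPositiveInt(env, "GEMINI_CALL_TIMEOUT_MS", 45000),
    maxAttempts: readPositiveInt(env, "GEMINI_MAX_RETRIES", 3),
    backoffBaseMs: readPositiveInt(env, "GEMINI_BACKOFF_BASE_MS", 1000),
    circuitOpenThreshold: readPositiveInt(env, "GEMINI_CIRCUIT_OPEN_THRESHOLD", 5),
  };
}
```

In a route handler this would typically be called once as loadGeminiCallConfig(process.env).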

Circuit breaker state (module-level singleton)

Maintained in-process. State transitions persist for the lifetime of the Node.js process (Northflank container lifetime). State is NOT persisted to disk or shared across replicas.

| Field | Type | Initial Value |
|-------|------|---------------|
| failureCount | number | 0 |
| state | CLOSED \| OPEN \| HALF_OPEN | CLOSED |
| openedAt | number \| null | null |
| OPEN_DURATION_MS | number | 60000 (1 minute) |
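
The state table above could be held in a module-level object along these lines. This is a sketch under stated assumptions: the class name GeminiCircuitBreaker and its method names are hypothetical, and the behavior on a failed HALF_OPEN probe (re-open immediately) is an assumption, since the spec only defines the successful case. The sketch also lets every caller through while HALF_OPEN, which is a simplification. Timestamps are injectable so the transitions can be checked without waiting 60 seconds.

```typescript
type CircuitState = "CLOSED" | "OPEN" | "HALF_OPEN";

const OPEN_DURATION_MS = 60000;

class GeminiCircuitBreaker {
  failureCount = 0;
  state: CircuitState = "CLOSED";
  openedAt: number | null = null;
  private readonly threshold: number;

  constructor(threshold: number = 5) {
    this.threshold = threshold;
  }

  // Returns true if a Gemini call may proceed at time `now`.
  // Moves OPEN to HALF_OPEN once OPEN_DURATION_MS has elapsed.
  canAttempt(now: number = Date.now()): boolean {
    if (
      this.state === "OPEN" &&
      this.openedAt !== null &&
      now - this.openedAt >= OPEN_DURATION_MS
    ) {
      this.state = "HALF_OPEN";
    }
    return this.state !== "OPEN";
  }

  // A successful call closes the circuit and resets the counter.
  recordSuccess(): void {
    this.failureCount = 0;
    this.state = "CLOSED";
    this.openedAt = null;
  }

  // Called once per final failure, not once per attempt.
  recordFinalFailure(now: number = Date.now()): void {
    this.failureCount += 1;
    // Assumption: a failed HALF_OPEN probe re-opens the circuit at once.
    if (this.state === "HALF_OPEN" || this.failureCount >= this.threshold) {
      this.state = "OPEN";
      this.openedAt = now;
    }
  }
}

// One instance per Node.js process; in a real module this would be exported
// and imported by both call sites so that they share state.
const geminiCircuit = new GeminiCircuitBreaker();
```

Because the instance lives in module scope, it resets whenever the container restarts and is not shared across replicas, matching the limitation stated above.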


OUTPUTS

Success path


```typescript
{
  verdict: object,      // parsed JSON matching tier schema
  attempts: number,     // 1–3 (how many Gemini calls were made)
  cached_at: string     // ISO timestamp, written by cacheVerdict()
}
```

Failure path — retries exhausted


```typescript
// verdict/route.ts returns:
NextResponse.json(
  { error: "Analysis failed. Please contact oracle@42sisters.ai for a refund." },
  { status: 500 }
)

// webhook IIFE logs:
console.error(`[webhook] Gemini failed after ${maxRetries} attempts for session ${sessionId}:`, err)
// No email sent. Cache not written.
```

Failure path — circuit open


```typescript
// verdict/route.ts returns:
NextResponse.json(
  { error: "Analysis temporarily unavailable. Please try again in a few minutes." },
  { status: 503 }
)

// webhook IIFE logs:
console.warn(`[webhook] Circuit OPEN — skipping Gemini call for session ${sessionId}`)
```
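
The three output paths above can be modeled as a single discriminated union so that the verdict route has one place that decides status code and body. This is only a sketch: the GeminiOutcome type, the RouteResponse shape and the function name are hypothetical, and returning HTTP 200 on success is an assumption. The error strings and the 500/503 codes are taken from the paths above. Plain objects are used instead of NextResponse to keep the example self-contained.

```typescript
// Hypothetical outcome type covering the three paths described above.
type GeminiOutcome =
  | { kind: "success"; verdict: object; attempts: number; cached_at: string }
  | { kind: "retries_exhausted" }
  | { kind: "circuit_open" };

interface RouteResponse {
  status: number;
  body: object;
}

function toVerdictRouteResponse(outcome: GeminiOutcome): RouteResponse {
  switch (outcome.kind) {
    case "success":
      // Assumption: success is served as HTTP 200 with the success-path shape.
      return {
        status: 200,
        body: {
          verdict: outcome.verdict,
          attempts: outcome.attempts,
          cached_at: outcome.cached_at,
        },
      };
    case "retries_exhausted":
      return {
        status: 500,
        body: { error: "Analysis failed. Please contact oracle@42sisters.ai for a refund." },
      };
    case "circuit_open":
      return {
        status: 503,
        body: { error: "Analysis temporarily unavailable. Please try again in a few minutes." },
      };
  }
}
```

In the real route the returned pair would be handed to NextResponse.json(body, { status }).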

INVARIANTS

  1. Timeout is always set. Every model.generateContent() call MUST be raced against a timeout via Promise.race(). No call may await indefinitely. Default timeout: 45 seconds. The Stripe webhook IIFE is not exempt: it runs asynchronously, and a hung call still holds a Node.js event loop reference.

  1. Retry count is bounded. Each call, at either call site, makes at most 3 attempts in total (initial attempt + 2 retries). A 4th attempt is never made. This is a hard ceiling regardless of error type.

  1. Exponential backoff is applied between retries. Wait time between attempt N and attempt N+1: GEMINI_BACKOFF_BASE_MS * 2^(N-1), multiplied by Math.random() (full jitter). Minimum wait: 0ms (jitter can collapse to zero). Maximum wait per interval: 8000ms (hard cap). With the defaults (base 1000ms, 3 attempts) the two intervals are at most 1000ms and 2000ms, so the cap only binds if the base is raised.

  1. Timeout errors and 5xx API errors are retryable; 4xx are not. A timeout, a network error, or an HTTP 5xx from Gemini triggers retry. An HTTP 400 (bad request, e.g. a malformed prompt) or 401/403 (auth failure) does NOT retry: it fails immediately and logs GEMINI_BAD_REQUEST or GEMINI_AUTH_FAILURE, respectively, to the error log.

  1. Circuit breaker protects against sustained outage. After GEMINI_CIRCUIT_OPEN_THRESHOLD consecutive final-failures (all retries exhausted) within the current process lifetime, the circuit transitions to OPEN. While OPEN, all new Gemini calls are rejected immediately without hitting the API. The circuit transitions to HALF_OPEN after OPEN_DURATION_MS (60s). The first HALF_OPEN attempt, if successful, closes the circuit and resets failureCount to 0.

  1. Fallback on cache hit. The verdict route MUST check the cache before attempting any Gemini call. A cache hit bypasses the timeout/retry/circuit machinery entirely. This is the primary resilience mechanism; retries are the secondary.

  1. Both call sites share the same circuit breaker state. The webhook IIFE and the verdict route operate against the same module-level circuit breaker singleton. A failure storm from webhook pre-computes opens the circuit for the verdict route as well. This is correct and intentional, as both paths consume from the same upstream service.
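
The timeout, bounded-retry, backoff and error-classification invariants above can be combined into one wrapper. The following is a minimal sketch under stated assumptions, not the authorized implementation. GeminiTimeoutError is the name used in the verification criteria; withTimeout, isRetryable, backoffDelayMs and callWithRetry are hypothetical names. The sketch assumes that API errors expose a numeric status property, which should be checked against the SDK version in use. Note that Promise.race() only stops waiting for the slow call; it does not cancel the underlying request.

```typescript
class GeminiTimeoutError extends Error {
  constructor(ms: number) {
    super(`Gemini call timed out after ${ms}ms`);
    this.name = "GeminiTimeoutError";
  }
}

const MAX_BACKOFF_MS = 8000;

// Races one call against a timer. The losing call is abandoned, not cancelled.
async function withTimeout<T>(call: () => Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(() => reject(new GeminiTimeoutError(ms)), ms);
  });
  try {
    return await Promise.race([call(), timeout]);
  } finally {
    clearTimeout(timer);
  }
}

// Assumption: HTTP errors carry a numeric `status`. Anything without one
// (timeouts, network errors) is treated as retryable; 4xx is not.
function isRetryable(err: unknown): boolean {
  const status = (err as { status?: unknown } | null | undefined)?.status;
  return typeof status === "number" ? status >= 500 : true;
}

// Full jitter: base * 2^(N-1), capped at 8000ms, scaled by a random factor.
function backoffDelayMs(
  attempt: number,
  baseMs: number,
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(MAX_BACKOFF_MS, baseMs * Math.pow(2, attempt - 1));
  return ceiling * random();
}

async function callWithRetry<T>(
  call: () => Promise<T>,
  opts: { timeoutMs: number; maxAttempts: number; backoffBaseMs: number },
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<{ value: Awaited<T>; attempts: number }> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= opts.maxAttempts; attempt++) {
    try {
      const value = await withTimeout(call, opts.timeoutMs);
      console.info(`[gemini] attempt ${attempt} succeeded`);
      return { value, attempts: attempt };
    } catch (err) {
      lastError = err;
      console.warn(`[gemini] attempt ${attempt} failed`);
      // Stop on non-retryable errors and after the final attempt.
      if (!isRetryable(err) || attempt === opts.maxAttempts) break;
      await sleep(backoffDelayMs(attempt, opts.backoffBaseMs));
    }
  }
  throw lastError;
}
```

A call site would pass () => model.generateContent(prompt) as the call argument and consult the shared circuit breaker before and after. The sleep and random parameters exist only so the sequence can be tested without real delays.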


VERIFICATION CRITERIA

Σ.✓ — timeout/retry subsystem is operating correctly when:

  1. Timeout fires at configured threshold. Inject a mock generateContent() that hangs (never resolves). Confirm Promise.race() rejects with GeminiTimeoutError after GEMINI_CALL_TIMEOUT_MS ± 500ms. Test both call sites independently.

  1. Retry sequence completes with correct backoff. Mock generateContent() to fail twice then succeed on attempt 3. Confirm: (a) total 3 calls made, (b) delays between calls are >= 0ms and <= 8000ms, (c) final result is the success payload, not an error. Log output must show [gemini] attempt 1 failed, [gemini] attempt 2 failed, [gemini] attempt 3 succeeded.

  1. Non-retryable errors fail fast. Mock generateContent() to throw a 401 error. Confirm the call fails immediately (no retries, no backoff delay). Log must show GEMINI_AUTH_FAILURE. Total elapsed time must be < 200ms (no backoff pauses).

  1. Circuit breaker opens after threshold failures. Mock generateContent() to always exhaust all retries. Trigger GEMINI_CIRCUIT_OPEN_THRESHOLD failing sessions. Confirm that the circuit state transitions to OPEN. On the next call attempt, confirm rejection is immediate (< 10ms) with 503 response and no call to generateContent(). Confirm the circuit transitions to HALF_OPEN after OPEN_DURATION_MS.

  1. Verdict route returns 500 with customer-facing error on exhausted retries. With generateContent() mocked to always fail, confirm verdict/route.ts returns HTTP 500 with body { error: "Analysis failed. Please contact oracle@42sisters.ai for a refund." }. Confirm no verdict is cached.

  1. Webhook IIFE does not block Stripe response. With generateContent() mocked to hang for 120 seconds, confirm the webhook POST from Stripe still receives { received: true } within 2 seconds of the request arriving (the webhook IIFE runs asynchronously; the Stripe response must not wait for it).
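
The last criterion relies on the fire-and-forget pattern: the handler starts the Gemini work in an async IIFE and returns without awaiting it. A stripped-down sketch of that shape follows. The function name handleWebhook and the precompute parameter are hypothetical stand-ins for the real route handler and its Gemini pre-compute step, and the real handler returns a NextResponse, not a plain object.

```typescript
// Hypothetical stand-in for the webhook route handler.
// `precompute` represents the Gemini call plus caching and email.
function handleWebhook(precompute: () => Promise<void>): { received: true } {
  // Fire and forget: the IIFE is started but deliberately not awaited.
  void (async () => {
    try {
      await precompute();
    } catch (err) {
      // Failures are logged only; Stripe never sees them.
      console.error("[webhook] background pre-compute failed:", err);
    }
  })();

  // Returned immediately, before precompute() settles.
  return { received: true };
}
```

A test for the criterion can pass a precompute that resolves only after a long delay and assert that handleWebhook returns well inside the 2-second budget.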


FAILURE MODES

  1. Σ.⊠ Gemini API cold start / transient timeout. Model generation can take 10–40 seconds for complex strategy tier prompts. A 45-second default timeout may be too tight under cold-start or peak-load conditions. Symptom: legitimate verdicts failing with timeout on first attempt, succeeding on retry. Mitigation: the first retry should resolve this in the majority of cases. If cold-start is persistent, NOUS may tune GEMINI_CALL_TIMEOUT_MS upward via env var without a code deploy.

  1. Σ.⊠ Backoff accumulation exceeds customer wait tolerance. Worst case at the defaults: 3 attempts that each time out, with maximum jitter on both backoff intervals = ~45s + 1s + 45s + 2s + 45s = ~138 seconds (roughly 140 seconds). The verdict route's upstream Next.js edge runtime default timeout is 30 seconds on Northflank, which even a single attempt at the default 45-second timeout exceeds. If all retries are consumed, the customer browser may receive a platform 504 before our 500 fires. [GAP-02A: Northflank edge timeout vs. total retry budget not reconciled; needs design: either shorten retry budget for verdict route or increase Northflank timeout]

  1. Σ.⊠ Circuit breaker does not open during partial Gemini degradation. If Gemini is slow but not fully down, retries may consistently succeed on attempt 3, so failureCount is never incremented. The circuit remains CLOSED but customers experience high latency. Symptom: p99 verdict latency > 120s with no circuit protection firing. Mitigation: [GAP-02B: latency-based circuit tripping not specified; needs design: separate threshold for "slow but responding" vs. "fully down"]

  1. Σ.⊠ Circuit breaker state lost on container restart. Northflank restarts the container on deploy or crash. The in-process circuit breaker resets to CLOSED. If Gemini is still down at restart time, the circuit will re-open after GEMINI_CIRCUIT_OPEN_THRESHOLD additional failures, meaning customers face that many more failed verdicts post-restart. Mitigation: [GAP-02C: persistent circuit state (Redis/file) not specified; current spec accepts this as a known limitation, in-process only]

  1. Σ.⊠ Webhook IIFE retry storm on Stripe replay. Stripe retries the webhook for up to 72 hours on non-2xx responses. However, our webhook always returns 2xx regardless of Gemini outcome, so Stripe replay is not a retry-storm risk. Confirmed safe: Stripe does not see Gemini failures.

  1. Σ.⊠ Non-retryable auth failure on GEMINI_API_KEY rotation. If the API key is rotated in Northflank env vars without a container redeploy, the running container holds the old key. All calls return 401. Circuit opens after threshold failures. All customer verdicts fail until container is redeployed. Mitigation: [GAP-02D: no GEMINI_API_KEY health check on boot; needs design: startup probe that validates key with a dry-run call]

  1. Σ.⊠ JSON parse failure after successful Gemini call. Gemini returns 200 with non-JSON body (e.g., markdown-wrapped JSON, truncated response). This is not a timeout or API error, so it will NOT trigger retry under the current retry logic (retry is on network/timeout/5xx only). Symptom: valid Gemini call → JSON.parse() throws → immediate 500 → no retry. [GAP-02E: JSON parse failure is not in the retryable error set; needs design: detect malformed JSON and retry up to 1 additional time with a stricter prompt suffix]
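
The worst-case budget in the backoff-accumulation failure mode above can be reproduced with a few lines of arithmetic, which is useful when tuning the env vars against the platform limit. The sketch below is illustrative only: the function name worstCaseBudgetMs is hypothetical, and it assumes every attempt runs to the full timeout and every backoff interval lands on its jitter maximum, using the formula GEMINI_BACKOFF_BASE_MS * 2^(N-1) with the 8000ms cap.

```typescript
// Upper bound on wall-clock time for one call: every attempt times out and
// every backoff interval hits its maximum (jitter factor of 1).
function worstCaseBudgetMs(
  timeoutMs: number,
  maxAttempts: number,
  backoffBaseMs: number,
  capMs: number = 8000,
): number {
  let total = 0;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    total += timeoutMs;
    // No wait after the final attempt.
    if (attempt < maxAttempts) {
      total += Math.min(capMs, backoffBaseMs * Math.pow(2, attempt - 1));
    }
  }
  return total;
}

const defaultBudgetMs = worstCaseBudgetMs(45000, 3, 1000);
console.log(defaultBudgetMs); // 138000, i.e. roughly 140 seconds

// The 30-second platform limit cited above, in milliseconds.
const PLATFORM_TIMEOUT_MS = 30000;
console.log(defaultBudgetMs > PLATFORM_TIMEOUT_MS); // true
```

At the defaults the budget is more than four times the 30-second limit, which is the mismatch GAP-02A records.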


GAPS

| Gap ID | Description | Impact | Severity |
|--------|-------------|--------|----------|
| GAP-02 (parent) | No timeout on generateContent() calls in webhook or verdict route | Webhook IIFE can hang indefinitely; result page can 504 | CRITICAL (current production state) |
| GAP-02A | Northflank edge timeout (30s) vs. total retry budget (~140s) not reconciled | Customers may receive platform 504 before our 500 | HIGH |
| GAP-02B | Latency-based circuit tripping not specified | Slow-but-responding Gemini invisible to circuit breaker | MEDIUM |
| GAP-02C | Circuit breaker state is in-process only (lost on container restart) | Post-restart failure window during sustained outage | LOW (accepted limitation) |
| GAP-02D | No GEMINI_API_KEY health check on boot | Key rotation causes silent 401 storm until manual restart | MEDIUM |
| GAP-02E | JSON parse failure is not in the retryable error set | Malformed Gemini response causes immediate 500 with no retry | MEDIUM |


DEPENDENCIES

| Dependency | Role |
|------------|------|
| @google/generative-ai npm package | Provides GoogleGenerativeAI and model.generateContent() |
| GEMINI_API_KEY / GOOGLE_API_KEY env var | Authentication to Gemini API |
| gemini-2.5-flash model | The specific model being called; timeout values are tuned to this model's latency profile |
| verdictCache.ts (cacheVerdict, getCachedVerdict) | Cache-hit path that bypasses retry machinery |
| Node.js Promise.race() | Mechanism for timeout enforcement |


DEPENDENTS

| Dependent | Dependency type |
|-----------|----------------|
| app/api/webhook/route.ts (lines 228–231) | Must wrap generateContent() with timeout/retry |
| app/api/verdict/route.ts (lines 184–185) | Must wrap generateContent() with timeout/retry |
| SPEC_ORACLE_VERDICT_PIPELINE.md (GAP-02) | This spec closes that gap |
| Customer verdict delivery SLA | Directly affected by Gemini call reliability |


REFERENCES

| File | Relevance |
|------|-----------|
| /home/nous/Aether/app/app/api/webhook/route.ts | Line 229: await model.generateContent(prompt) (no timeout) |
| /home/nous/Aether/app/app/api/verdict/route.ts | Line 185: await model.generateContent(prompt) (no timeout) |
| /home/nous/memories/SPEC_ORACLE_VERDICT_PIPELINE.md | Parent spec; GAP-02 definition |
| /home/nous/memories/SPECIFICATION_AUDIT_LOOP.md | Spec template and classification criteria |


Φζ.⊤.


Jeremy Zlabis

Chronogeometer · Visionary · Disruptor · Chief

42 Sisters AI · East York, Toronto

🍁 Φ 0.042