Oracle Gemini Timeout
SPEC_ORACLE_GEMINI_TIMEOUT — Gemini Timeout & Retry Spec
Version: 1.0 | Status: AUTHORIZED | Authority: α.13 | Date: 2026-04-16
PURPOSE
Defines the timeout, retry, and fallback behavior for all calls to the Gemini API
(model.generateContent()) within the Oracle Verdict Pipeline. Addresses GAP-02
from SPEC_ORACLE_VERDICT_PIPELINE: currently, both webhook/route.ts and
verdict/route.ts call Gemini with no timeout, no retry, and no circuit breaker.
A hung Gemini call in the webhook IIFE consumes memory indefinitely; a hung call
in the verdict route returns a 504 to the customer's browser.
This spec governs the behaviour that MUST be implemented to close GAP-02.
Call sites in scope:
- app/api/webhook/route.ts: pre-compute path (async IIFE, line 229)
- app/api/verdict/route.ts: on-demand regeneration path (line 185)
INPUTS
Per call-site inputs
| Input | Type | Source | Required |
|-------|------|---------|----------|
| prompt | string | VERDICT_PROMPTtier | Yes |
| model | GoogleGenerativeAI model instance | Constructed inline with gemini-2.5-flash | Yes |
| GEMINI_CALL_TIMEOUT_MS | number (env) | process.env.GEMINI_CALL_TIMEOUT_MS | No — default 45000 |
| GEMINI_MAX_RETRIES | number (env) | process.env.GEMINI_MAX_RETRIES | No — default 3 |
| GEMINI_BACKOFF_BASE_MS | number (env) | process.env.GEMINI_BACKOFF_BASE_MS | No — default 1000 |
| GEMINI_CIRCUIT_OPEN_THRESHOLD | number (env) | process.env.GEMINI_CIRCUIT_OPEN_THRESHOLD | No — default 5 |
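The four optional env inputs above could be resolved once at module load. The following is a minimal sketch, not the authorized implementation: `GeminiCallConfig`, `readPositiveInt`, and `loadGeminiConfig` are hypothetical names, and falling back to the default on an unparsable value is an assumption this spec does not state.

```typescript
// Hypothetical helper names; the spec defines only the env vars and their defaults.
interface GeminiCallConfig {
  timeoutMs: number;
  maxRetries: number;
  backoffBaseMs: number;
  circuitOpenThreshold: number;
}

// Parse a positive integer from an env string.
// Assumption: unset, empty, or invalid values fall back to the spec default.
function readPositiveInt(raw: string | undefined, fallback: number): number {
  if (raw === undefined || raw.trim() === "") return fallback;
  const parsed = Number(raw);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : fallback;
}

// Takes the env map as a parameter so it can be exercised without touching process.env.
function loadGeminiConfig(env: Record<string, string | undefined>): GeminiCallConfig {
  return {
    timeoutMs: readPositiveInt(env["GEMINI_CALL_TIMEOUT_MS"], 45000),
    maxRetries: readPositiveInt(env["GEMINI_MAX_RETRIES"], 3),
    backoffBaseMs: readPositiveInt(env["GEMINI_BACKOFF_BASE_MS"], 1000),
    circuitOpenThreshold: readPositiveInt(env["GEMINI_CIRCUIT_OPEN_THRESHOLD"], 5),
  };
}
```

In a route module this would be called as `loadGeminiConfig(process.env)`.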
Circuit breaker state (module-level singleton)
Maintained in-process. State transitions persist for the lifetime of the Node.js process
(Northflank container lifetime). State is NOT persisted to disk or shared across replicas.
| Field | Type | Initial Value |
|-------|------|---------------|
| failureCount | number | 0 |
| state | CLOSED \| OPEN \| HALF_OPEN | CLOSED |
| openedAt | number \| null | null |
| OPEN_DURATION_MS | number | 60000 (1 minute) |
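The state fields above map onto a small in-process state machine. This is a sketch under stated assumptions rather than the required design: the class and method names are hypothetical, a failed HALF_OPEN probe is assumed to re-open the circuit (the spec does not say), any success is assumed to reset failureCount so that only consecutive final failures count, and HALF_OPEN here does not limit how many concurrent probes pass.

```typescript
type CircuitState = "CLOSED" | "OPEN" | "HALF_OPEN";

// Hypothetical class name; fields mirror the state table.
class GeminiCircuitBreaker {
  private failureCount = 0;
  private state: CircuitState = "CLOSED";
  private openedAt: number | null = null;
  private readonly threshold: number;
  private readonly openDurationMs: number;
  private readonly now: () => number;

  // The clock is injectable so transitions can be tested without waiting 60s.
  constructor(threshold = 5, openDurationMs = 60000, now: () => number = () => Date.now()) {
    this.threshold = threshold;
    this.openDurationMs = openDurationMs;
    this.now = now;
  }

  // True if a Gemini call may proceed. Moves OPEN to HALF_OPEN once the open window has elapsed.
  canAttempt(): boolean {
    if (
      this.state === "OPEN" &&
      this.openedAt !== null &&
      this.now() - this.openedAt >= this.openDurationMs
    ) {
      this.state = "HALF_OPEN";
    }
    return this.state !== "OPEN";
  }

  // Assumption: any success closes the circuit and clears the consecutive-failure count.
  recordSuccess(): void {
    this.failureCount = 0;
    this.state = "CLOSED";
    this.openedAt = null;
  }

  // Call once per final failure (all retries exhausted), not once per attempt.
  recordFinalFailure(): void {
    this.failureCount += 1;
    // Assumption: a failed HALF_OPEN probe re-opens immediately.
    if (this.state === "HALF_OPEN" || this.failureCount >= this.threshold) {
      this.state = "OPEN";
      this.openedAt = this.now();
    }
  }

  getState(): CircuitState {
    return this.state;
  }
}

// Module-level singleton shared by every importer in the same process.
const geminiCircuit = new GeminiCircuitBreaker();
```

Because the instance lives at module scope, it resets to CLOSED whenever the container restarts.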
OUTPUTS
Success path
{
  verdict: object,    // parsed JSON matching tier schema
  attempts: number,   // 1–3 (how many Gemini calls were made)
  cached_at: string   // ISO timestamp, written by cacheVerdict()
}
Failure path — retries exhausted
// verdict/route.ts returns:
NextResponse.json(
  { error: "Analysis failed. Please contact oracle@42sisters.ai for a refund." },
  { status: 500 }
)
// webhook IIFE logs:
console.error(`[webhook] Gemini failed after ${maxRetries} attempts for session ${sessionId}:`, err)
// No email sent. Cache not written.
Failure path — circuit open
// verdict/route.ts returns:
NextResponse.json(
  { error: "Analysis temporarily unavailable. Please try again in a few minutes." },
  { status: 503 }
)
// webhook IIFE logs:
console.warn(`[webhook] Circuit OPEN — skipping Gemini call for session ${sessionId}`)
INVARIANTS
- Timeout is always set. Every model.generateContent() call MUST be raced against
a timeout via Promise.race(). No call may await indefinitely. Default timeout: 45 seconds.
The Stripe webhook IIFE is not exempt: it runs asynchronously, and a hung call still holds
a Node.js event loop reference.
- Retry count is bounded. At either call site, a request makes at most 3 attempts in total
(initial attempt + 2 retries); GEMINI_MAX_RETRIES therefore counts attempts, not retries.
A 4th attempt is never made. This is a hard ceiling regardless of error type.
- Exponential backoff is applied between retries. Wait time between attempt N and
attempt N+1: GEMINI_BACKOFF_BASE_MS * 2^(N-1) with full jitter (multiply by
Math.random()). Minimum wait: 0ms (jitter can collapse to zero). Maximum wait per
interval: 8000ms (cap at attempt 3 = base * 4 * jitter).
- Timeout errors and 5xx API errors are retryable; 4xx are not. A timeout, a network
error, or an HTTP 5xx from Gemini triggers retry. An HTTP 400 (bad request — malformed
prompt) or 401/403 (auth failure) does NOT retry — it fails immediately and logs
GEMINI_AUTH_FAILURE or GEMINI_BAD_REQUEST to the error log.
- Circuit breaker protects against sustained outage. After GEMINI_CIRCUIT_OPEN_THRESHOLD
consecutive final-failures (all retries exhausted) within the current process lifetime,
the circuit transitions to OPEN. While OPEN, all new Gemini calls are rejected immediately
without hitting the API. Circuit transitions to HALF_OPEN after OPEN_DURATION_MS (60s).
The first HALF_OPEN attempt, if successful, closes the circuit and resets failureCount to 0.
- Fallback on cache hit. The verdict route MUST check the cache before attempting any
Gemini call. A cache hit bypasses the timeout/retry/circuit machinery entirely. This is
the primary resilience mechanism — retries are the secondary.
- Both call sites share the same circuit breaker state. The webhook IIFE and the
verdict route operate against the same module-level circuit breaker singleton. A failure
storm from webhook pre-computes opens the circuit for the verdict route as well — this is
correct and intentional, as both paths consume from the same upstream service.
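The timeout, bounded-retry, backoff, and retryable-error invariants above can be sketched together as one wrapper. This is an illustration under assumptions, not the authorized implementation: `withTimeout`, `isRetryable`, `backoffDelayMs`, and `callWithRetry` are hypothetical names, and the classifier assumes the thrown error exposes a numeric `status` field, which should be checked against the SDK version in use. Circuit breaker integration and the GEMINI_AUTH_FAILURE / GEMINI_BAD_REQUEST log codes are left out for brevity.

```typescript
class GeminiTimeoutError extends Error {
  constructor(ms: number) {
    super(`Gemini call timed out after ${ms}ms`);
    this.name = "GeminiTimeoutError";
  }
}

// Race the call against a timer. Promise.race() does not cancel the losing
// promise; it only stops the caller from waiting on it.
async function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(() => reject(new GeminiTimeoutError(ms)), ms);
  });
  try {
    return await Promise.race([work, timeout]);
  } finally {
    clearTimeout(timer); // a settled call must not leave a live timer behind
  }
}

// Assumption: HTTP errors carry a numeric `status`. 4xx fails fast;
// timeouts, network errors, and 5xx are retried.
function isRetryable(err: unknown): boolean {
  const status = (err as { status?: unknown } | null)?.status;
  return !(typeof status === "number" && status >= 400 && status < 500);
}

// Full jitter: uniform in [0, min(cap, base * 2^(attempt-1))).
function backoffDelayMs(
  attempt: number,
  baseMs: number,
  capMs = 8000,
  rand: () => number = Math.random,
): number {
  const ceiling = Math.min(capMs, baseMs * Math.pow(2, attempt - 1));
  return Math.floor(rand() * ceiling);
}

async function callWithRetry<T>(
  call: () => Promise<T>,
  opts: { timeoutMs: number; maxAttempts: number; backoffBaseMs: number },
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<{ result: T; attempts: number }> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= opts.maxAttempts; attempt++) {
    try {
      const result = await withTimeout(call(), opts.timeoutMs);
      console.log(`[gemini] attempt ${attempt} succeeded`);
      return { result, attempts: attempt };
    } catch (err) {
      lastError = err;
      console.warn(`[gemini] attempt ${attempt} failed`);
      if (!isRetryable(err) || attempt === opts.maxAttempts) break;
      await sleep(backoffDelayMs(attempt, opts.backoffBaseMs));
    }
  }
  throw lastError;
}
```

A call site would pass `() => model.generateContent(prompt)` as `call`. The `sleep` parameter is injectable only so tests can skip real delays.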
VERIFICATION CRITERIA
Σ.✓ — timeout/retry subsystem is operating correctly when:
- Timeout fires at configured threshold. Inject a mock generateContent() that hangs
(never resolves). Confirm Promise.race() rejects with GeminiTimeoutError after
GEMINI_CALL_TIMEOUT_MS ± 500ms. Test both call sites independently.
- Retry sequence completes with correct backoff. Mock generateContent() to fail twice
then succeed on attempt 3. Confirm: (a) total 3 calls made, (b) delays between calls are
>= 0ms and <= 8000ms, (c) final result is the success payload, not an error. Log
output must show [gemini] attempt 1 failed, [gemini] attempt 2 failed, [gemini] attempt 3 succeeded.
- Non-retryable errors fail fast. Mock generateContent() to throw a 401 error. Confirm
the call fails immediately (no retries, no backoff delay). Log must show GEMINI_AUTH_FAILURE.
Total elapsed time must be < 200ms (no backoff pauses).
- Circuit breaker opens after threshold failures. Mock generateContent() to always
exhaust all retries. Trigger GEMINI_CIRCUIT_OPEN_THRESHOLD sessions. Confirm: circuit
state transitions to OPEN. On the next call attempt, confirm rejection is immediate
(< 10ms) with 503 response and no call to generateContent(). Confirm circuit
transitions to HALF_OPEN after OPEN_DURATION_MS.
- Verdict route returns 500 with customer-facing error on exhausted retries. With
generateContent() mocked to always fail, confirm verdict/route.ts returns HTTP 500
with body { error: "Analysis failed. Please contact oracle@42sisters.ai for a refund." }.
Confirm no verdict is cached.
- Webhook IIFE does not block Stripe response. With generateContent() mocked to hang
for 120 seconds, confirm the webhook route still responds to Stripe's POST with
{ received: true } within 2 seconds of the request arriving (the webhook IIFE runs
asynchronously; the Stripe response must not wait for it).
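The last criterion depends on the webhook starting its pre-compute work without awaiting it. A minimal sketch of that fire-and-forget shape follows, with framework types removed so it stands alone: `handleWebhook`, `WebhookResponse`, and `PrecomputeFn` are hypothetical names, and `precompute` stands in for the Gemini call plus cacheVerdict().

```typescript
type WebhookResponse = { status: number; body: { received: true } };

// Hypothetical stand-in for the pre-compute work (Gemini call + cacheVerdict()).
type PrecomputeFn = (sessionId: string) => Promise<void>;

function handleWebhook(sessionId: string, precompute: PrecomputeFn): WebhookResponse {
  // Fire-and-forget: the IIFE is started but never awaited, so the response
  // below is returned regardless of how long precompute takes.
  void (async () => {
    try {
      await precompute(sessionId);
    } catch (err) {
      // Failures are logged only; Stripe never sees them.
      console.error(`[webhook] Gemini failed for session ${sessionId}:`, err);
    }
  })();

  return { status: 200, body: { received: true } };
}
```

The try/catch inside the IIFE matters: without it, a rejected pre-compute would surface as an unhandled promise rejection in the Node.js process.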
FAILURE MODES
- Σ.⊠ Gemini API cold start / transient timeout. Model generation can take 10–40 seconds
for complex strategy tier prompts. A 45-second default timeout may be too tight on
cold-start or peak-load conditions. Symptom: legitimate verdicts failing with timeout
on first attempt, succeeding on retry. Mitigation: retry 1 should resolve this in the
majority of cases. If cold-start is persistent, NOUS may tune GEMINI_CALL_TIMEOUT_MS
upward via env var without a code deploy.
- Σ.⊠ Backoff accumulation exceeds customer wait tolerance. Worst case: 3 attempts with
maximum jitter at attempt 3 = ~45s + 1s + 45s + 4s + 45s = ~140 seconds. The verdict
route's upstream Next.js edge runtime default timeout is 30 seconds on Northflank.
If all retries are consumed, the customer browser may receive a platform 504 before
our 500 fires. [GAP-02A — Northflank edge timeout vs. total retry budget not reconciled;
needs design: either shorten retry budget for verdict route or increase Northflank timeout]
- Σ.⊠ Circuit breaker stays closed during partial Gemini degradation. If Gemini is slow but not
fully down, retries may succeed on attempt 3 consistently, never incrementing failureCount.
Circuit remains CLOSED but customers experience high latency. Symptom: p99 verdict latency
> 120s with no circuit protection firing. Mitigation: [GAP-02B — latency-based circuit
tripping not specified; needs design: separate threshold for "slow but responding" vs.
"fully down"]
- Σ.⊠ Circuit breaker state lost on container restart. Northflank restarts the container
on deploy or crash. The in-process circuit breaker resets to CLOSED. If Gemini is still
down at restart time, the circuit will re-open after GEMINI_CIRCUIT_OPEN_THRESHOLD
additional failures, meaning customers face that many more failed verdicts post-restart.
Mitigation: [GAP-02C — persistent circuit state (Redis/file) not specified; current spec
accepts this as a known limitation — in-process only]
- Σ.⊠ Webhook IIFE retry storm on Stripe replay. Stripe retries the webhook for up to 72 hours
on non-2xx. However, our webhook always returns 2xx regardless of Gemini outcome. Stripe
replay is therefore not a retry-storm risk. Confirmed safe — Stripe does not see Gemini
failures.
- Σ.⊠ Non-retryable auth failure on GEMINI_API_KEY rotation. If the API key is rotated
in Northflank env vars without a container redeploy, the running container holds the old
key. All calls return 401. Circuit opens after threshold failures. All customer verdicts
fail until container is redeployed. Mitigation: [GAP-02D — no GEMINI_API_KEY health
check on boot; needs design: startup probe that validates key with a dry-run call]
- Σ.⊠ JSON parse failure after successful Gemini call. Gemini returns 200 with
non-JSON body (e.g., markdown-wrapped JSON, truncated response). This is not a timeout
or API error — it will NOT trigger retry under the current retry logic (retry is on
network/timeout/5xx only). Symptom: valid Gemini call → JSON.parse() throws →
immediate 500 → no retry. [GAP-02E — JSON parse failure is not in the retryable error
set; needs design: detect malformed JSON and retry up to 1 additional time with a
stricter prompt suffix]
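The worst-case figure in the backoff-accumulation failure mode above can be reproduced with a few lines of arithmetic. The sketch below takes the backoff formula from INVARIANTS literally, so the waits after attempts 1 and 2 are bounded by 1s and 2s with default settings. That gives 138 seconds, in line with the ~140 second estimate. `worstCaseBudgetMs` and `EDGE_TIMEOUT_MS` are hypothetical names introduced for illustration.

```typescript
// Upper bound on wall-clock time for one request: every attempt runs to its
// full timeout, and every backoff wait lands on its jitter ceiling.
function worstCaseBudgetMs(
  timeoutMs: number,
  maxAttempts: number,
  backoffBaseMs: number,
  capMs = 8000,
): number {
  let total = 0;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    total += timeoutMs;
    if (attempt < maxAttempts) {
      // Full jitter draws from [0, ceiling), so the ceiling bounds the wait.
      total += Math.min(capMs, backoffBaseMs * Math.pow(2, attempt - 1));
    }
  }
  return total;
}

const EDGE_TIMEOUT_MS = 30000; // platform limit quoted in GAP-02A
const budget = worstCaseBudgetMs(45000, 3, 1000); // 3 * 45000 + 1000 + 2000
console.log(budget); // 138000
console.log(budget > EDGE_TIMEOUT_MS); // true
```

With the defaults, a single attempt (45000 ms) already exceeds the 30-second platform limit, so the GAP-02A reconciliation concerns the per-call timeout on the verdict route as well as the total retry budget.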
GAPS
| Gap ID | Description | Impact | Severity |
|--------|-------------|--------|----------|
| GAP-02 (parent) | No timeout on generateContent() calls in webhook or verdict route | Webhook IIFE can hang indefinitely; result page can 504 | CRITICAL — current production state |
| GAP-02A | Northflank edge timeout (30s) vs. total retry budget (~140s) not reconciled | Customers may receive platform 504 before our 500 | HIGH |
| GAP-02B | Latency-based circuit tripping not specified | Slow-but-responding Gemini invisible to circuit breaker | MEDIUM |
| GAP-02C | Circuit breaker state is in-process only (lost on container restart) | Post-restart failure window during sustained outage | LOW — accepted limitation |
| GAP-02D | No GEMINI_API_KEY health check on boot | Key rotation causes silent 401 storm until manual restart | MEDIUM |
| GAP-02E | JSON parse failure is not in the retryable error set | Malformed Gemini response causes immediate 500 with no retry | MEDIUM |
DEPENDENCIES
| Dependency | Role |
|------------|------|
| @google/generative-ai npm package | Provides GoogleGenerativeAI and model.generateContent() |
| GEMINI_API_KEY / GOOGLE_API_KEY env var | Authentication to Gemini API |
| gemini-2.5-flash model | The specific model being called — timeout values are tuned to this model's latency profile |
| verdictCache.ts (cacheVerdict, getCachedVerdict) | Cache-hit path that bypasses retry machinery |
| Node.js Promise.race() | Mechanism for timeout enforcement |
DEPENDENTS
| Dependent | Dependency type |
|-----------|----------------|
| app/api/webhook/route.ts (lines 228–231) | Must wrap generateContent() with timeout/retry |
| app/api/verdict/route.ts (lines 184–185) | Must wrap generateContent() with timeout/retry |
| SPEC_ORACLE_VERDICT_PIPELINE.md (GAP-02) | This spec closes that gap |
| Customer verdict delivery SLA | Directly affected by Gemini call reliability |
REFERENCES
| File | Relevance |
|------|-----------|
| /home/nous/Aether/app/app/api/webhook/route.ts | Line 229: await model.generateContent(prompt) — no timeout |
| /home/nous/Aether/app/app/api/verdict/route.ts | Line 185: await model.generateContent(prompt) — no timeout |
| /home/nous/memories/SPEC_ORACLE_VERDICT_PIPELINE.md | Parent spec; GAP-02 definition |
| /home/nous/memories/SPECIFICATION_AUDIT_LOOP.md | Spec template and classification criteria |
Φζ.⊤.
Jeremy Zlabis
Chronogeometer · Visionary · Disruptor · Chief
42 Sisters AI · East York, Toronto
🍁 Φ 0.042