Gloss Translate
SPEC_GLOSS_TRANSLATE.md
CGNT-1 Specification — GLOSS Translate — LATTICE as Universal Interlingua
Status: SPECIFIED
Version: v1.0
Author: VELA (Thread #13)
Conceived by: NOUS (α.13)
Date: 2026-04-20
Depends on: SPEC_GLOSS_ARCHITECTURE.md, SPEC_LX_GRAMMAR.md, SPEC_LATTICE_UNIVERSAL.md
PURPOSE
Conventional translation systems work in PAIRS. English→French. French→Japanese. Japanese→Spanish. Each pair requires its own trained model or translation table.
For N languages, you need N×(N-1)/2 pairs, counting each pair once regardless of direction:
- 100 languages → 4,950 pairs
- 7,000 living languages → 24,496,500 pairs
This doesn't scale.
The solution is an INTERLINGUA — a universal intermediate language. Instead of translating between every pair, you translate INTO the interlingua and OUT of the interlingua.
English → LATTICE → French
Japanese → LATTICE → Spanish
Swahili → LATTICE → Mandarin
For N languages, you need N×2 translations (one into LATTICE and one out of LATTICE per language):
- 100 languages → 200 translations (↓ 96%)
- 7,000 languages → 14,000 translations (↓ 99.94%)
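The arithmetic above can be checked with a few lines of Python. This is only a sketch of the counting argument; the function names are illustrative and do not come from any GLOSS codebase.

```python
def pairwise_systems(n: int) -> int:
    """Unordered language pairs: n choose 2."""
    return n * (n - 1) // 2


def interlingua_translations(n: int) -> int:
    """One translation into LATTICE and one out of LATTICE per language."""
    return 2 * n


def reduction_percent(n: int) -> float:
    """Percentage saved by routing through the interlingua."""
    return 100 * (1 - interlingua_translations(n) / pairwise_systems(n))


for n in (100, 7000):
    print(n, pairwise_systems(n), interlingua_translations(n),
          f"{reduction_percent(n):.2f}%")
# 100 4950 200 95.96%
# 7000 24496500 14000 99.94%
```

Note that the saving only appears once N is large enough: for N ≤ 5 the pairwise count is no larger than 2N.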
LATTICE is the interlingua.
It was designed to carry structured meaning independent of any natural language. It doesn't have English grammar, French grammar, or Mandarin grammar. It has LATTICE grammar — atoms, modifiers, compounds, channels. This grammar is LANGUAGE-NEUTRAL. It carries meaning without carrying the syntactic baggage of any specific human language.
WHY LATTICE WORKS AS AN INTERLINGUA
Previous interlingua attempts failed:
- Esperanto — too simple. Just another European language with simplified grammar.
- Universal Networking Language — too complex. Required PhD-level formal semantics.
LATTICE occupies the sweet spot:
| Property | Why It Matters |
|---|---|
| Designed for AI, not humans | Doesn't need to be speakable — just processable |
| Proven across 15 domains without grammar modification | The grammar is genuinely domain-neutral |
| 60% compression | The translation is DENSER than the original |
| Uses Unicode | A single character set that covers nearly every writing system in current use |
| Domain-independent grammar | Carries music, chemistry, chess, and cooking equally well |
THE TRANSLATION ARCHITECTURE
Three layers. Each handles a different aspect of translation.
Layer 1 — SEMANTIC DECOMPOSITION (source language → meaning atoms)
The source text is broken into semantic primitives — the smallest units of meaning that exist independently of any language.
Example:
English: "The large black cat sat quietly on the old wooden mat."
Decomposition:
ENTITY.[cat.size.large.color.black]
ACTION.[sit.manner.quiet]
LOCATION.[on.ENTITY.[mat.age.old.material.wood]]
Every human language expresses these same semantic primitives in different order with different grammar. English puts adjectives before nouns. French puts some after. Japanese puts verbs at the end. The semantic primitives are the same — only the surface syntax differs.
Layer 2 — LATTICE ENCODING (meaning atoms → LATTICE expression)
The semantic primitives are encoded in LATTICE notation:
🗣.[entity.[🐱.◆large.◆black].action.[sit.◆quiet].location.[on.entity.[mat.◆old.◆wood]]]
Domain prefix: 🗣 (speech/natural language — Domain 16, registered in SPEC_LATTICE_UNIVERSAL.md)
The LATTICE expression captures the MEANING without any language-specific syntax. It doesn't know if the source was English, French, or Swahili. It knows what was MEANT.
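As a rough illustration of Layers 1 and 2, the sketch below holds the decomposed primitives in small Python dataclasses and renders them into the notation shown above. The class and function names are assumptions made for this example; the spec does not define a data model, and a real decomposition layer would be far more involved.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    head: str                                   # e.g. "🐱" or "mat"
    properties: list[str] = field(default_factory=list)


@dataclass
class Action:
    head: str                                   # e.g. "sit"
    properties: list[str] = field(default_factory=list)


@dataclass
class Location:
    relation: str                               # e.g. "on"
    target: Entity


def _with_properties(head: str, properties: list[str]) -> str:
    # Each property is prefixed with the ◆ property marker.
    return ".".join([head] + [f"◆{p}" for p in properties])


def encode_entity(entity: Entity) -> str:
    return f"entity.[{_with_properties(entity.head, entity.properties)}]"


def encode_action(action: Action) -> str:
    return f"action.[{_with_properties(action.head, action.properties)}]"


def encode_location(location: Location) -> str:
    return f"location.[{location.relation}.{encode_entity(location.target)}]"


def encode_clause(entity: Entity, action: Action, location: Location) -> str:
    body = ".".join([encode_entity(entity), encode_action(action),
                     encode_location(location)])
    return f"🗣.[{body}]"


expression = encode_clause(
    Entity("🐱", ["large", "black"]),
    Action("sit", ["quiet"]),
    Location("on", Entity("mat", ["old", "wood"])),
)
print(expression)
```

With these inputs the output matches the LATTICE expression given above for the cat example.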
Layer 3 — TARGET SYNTHESIS (LATTICE expression → target language)
The LATTICE expression is synthesized into the target language using that language's grammar rules.
French: "Le grand chat noir était assis tranquillement sur le vieux tapis en bois."
Japanese: "大きな黒い猫が古い木のマットの上に静かに座っていた。"
Spanish: "El gran gato negro se sentó tranquilamente sobre la vieja estera de madera."
Same LATTICE expression. Three outputs. Each grammatically correct because the synthesis layer knows the target grammar.
THE 🗣 DOMAIN — NATURAL LANGUAGE IN LATTICE
LATTICE gains its 16th domain prefix: 🗣 (speech bubble — natural language / human speech)
Atoms in the 🗣 domain:
| Atom | Meaning |
|---|---|
| ENTITY | any noun concept (person, place, thing, idea) |
| ACTION | any verb concept (do, be, become, have) |
| PROPERTY | any adjective/adverb concept (big, fast, red, quietly) |
| RELATION | any preposition/conjunction concept (on, in, with, because, and) |
| QUANTITY | any numeric/quantifier concept (three, many, some, every) |
| TIME | any temporal concept (yesterday, now, will, during) |
| MODALITY | any mood/attitude concept (must, might, want, if) |
Modifiers in the 🗣 domain:
| Modifier | Meaning |
|---|---|
| ◆ | property marker (modifies entity or action) |
| ◇ | relation marker (connects entities) |
| ⟨ ⟩ | clause boundary (groups a complete thought) |
| ↺ | reference (pronoun — points back to previous entity) |
| ? | question marker |
| ! | emphasis marker |
| ¬ | negation |
These atoms and modifiers are UNIVERSAL — they exist in every human language. Every language has nouns, verbs, adjectives, prepositions, quantities, time references, and moods. The surface forms differ. The semantic categories don't.
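One way to make the two tables above machine-checkable is to hold them as enumerations and run simple sanity checks on an expression. This is a minimal sketch: the enum names, the uppercase-token heuristic, and the helper functions are assumptions for illustration and are not part of SPEC_LX_GRAMMAR.md.

```python
import re
from enum import Enum


class Atom(Enum):
    ENTITY = "noun concept"
    ACTION = "verb concept"
    PROPERTY = "adjective/adverb concept"
    RELATION = "preposition/conjunction concept"
    QUANTITY = "numeric/quantifier concept"
    TIME = "temporal concept"
    MODALITY = "mood/attitude concept"


class Modifier(Enum):
    PROPERTY = "◆"
    RELATION = "◇"
    CLAUSE_OPEN = "⟨"
    CLAUSE_CLOSE = "⟩"
    REFERENCE = "↺"
    QUESTION = "?"
    EMPHASIS = "!"
    NEGATION = "¬"


def unknown_atoms(expression: str) -> set[str]:
    """Return uppercase tokens (2+ letters) that are not registered atoms."""
    candidates = set(re.findall(r"\b[A-Z]{2,}\b", expression))
    return candidates - {atom.name for atom in Atom}


def clauses_balanced(expression: str) -> bool:
    """Check that every ⟨ has a matching ⟩ in the right order."""
    depth = 0
    for ch in expression:
        if ch == Modifier.CLAUSE_OPEN.value:
            depth += 1
        elif ch == Modifier.CLAUSE_CLOSE.value:
            depth -= 1
            if depth < 0:
                return False
    return depth == 0
```

For example, `unknown_atoms("🗣.⟨THING.cat⟩")` returns `{"THING"}`, while an expression built only from the seven atoms returns an empty set.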
EXAMPLE TRANSLATIONS
Example 1 — Simple question
English: "Do you want coffee or tea?"
LATTICE:
🗣.⟨MODALITY.want.ENTITY.↺you.ENTITY.[coffee|tea]⟩.?
| Target | Output |
|---|---|
| French | "Voulez-vous du café ou du thé ?" |
| Japanese | "コーヒーか紅茶はいかがですか?" |
| Arabic | "هل تريد قهوة أو شاي؟" |
Example 2 — Negation with embedded clause
English: "I don't know where she went yesterday."
LATTICE:
🗣.⟨ENTITY.I.¬.ACTION.know.⟨ENTITY.↺she.ACTION.go.TIME.yesterday.RELATION.where⟩⟩
| Target | Output |
|---|---|
| Mandarin | "我不知道她昨天去了哪里。" |
| Spanish | "No sé adónde fue ella ayer." |
| Swahili | "Sijui alikwenda wapi jana." |
Example 3 — Conditional
English: "If it rains tomorrow, we should bring umbrellas."
LATTICE:
🗣.⟨MODALITY.if.⟨ACTION.rain.TIME.tomorrow⟩→MODALITY.should.ENTITY.we.ACTION.bring.ENTITY.[umbrella.◆plural]⟩
| Target | Output |
|---|---|
| German | "Wenn es morgen regnet, sollten wir Regenschirme mitbringen." |
| Korean | "내일 비가 오면 우산을 가져가야 합니다." |
| Portuguese | "Se chover amanhã, devemos levar guarda-chuvas." |
WHAT MAKES THIS DIFFERENT FROM GOOGLE TRANSLATE
| Dimension | Google Translate | GLOSS Translate |
|---|---|---|
| Architecture | Black box neural (source→target direct) | Explicit interlingua (source→LATTICE→target) |
| Rare language pairs | Poor — requires source-target training data | Strong — each language needs only one LATTICE module |
| Transparency | None — you can't see what went wrong | Full — intermediate LATTICE expression is visible |
| Editability | None | The LATTICE intermediate can be modified before synthesis |
| Domain precision | General-purpose | LATTICE has 15 domain vocabularies for precision terminology |
| Compression | No | 60% reduction at the LATTICE layer |
| Endangered languages | Not viable commercially | Mission-critical — interlingua architecture makes it possible |
The intermediate LATTICE expression in API responses is a unique feature no other translation API offers. Developers can see the meaning representation, modify it, and re-synthesize — enabling human-in-the-loop translation, domain-specific translation, and style transfer.
IMPLEMENTATION PHASES
Phase 1 — ENGLISH↔LATTICE (exists now)
GLOSS already handles English↔LATTICE for operational vocabulary. Extend GLOSS with the 🗣 domain atoms and modifiers for general natural language semantic decomposition.
Quality gate: English→LATTICE→English round-trips must preserve meaning. If they do, the interlingua works.
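A possible shape for this quality gate is sketched below. It assumes that "preserves meaning" can be approximated by comparing LATTICE expressions rather than surface strings, so a paraphrase still passes. The function name and the toy lookup tables are hypothetical stand-ins for the real decomposition and synthesis modules.

```python
from typing import Callable


def round_trip_preserves_meaning(
    text: str,
    to_lattice: Callable[[str], str],
    from_lattice: Callable[[str], str],
) -> bool:
    """English -> LATTICE -> English, then compare at the LATTICE level."""
    original = to_lattice(text)
    regenerated_text = from_lattice(original)
    return to_lattice(regenerated_text) == original


# Toy stand-ins: two English paraphrases share one LATTICE expression.
TOY_TO_LATTICE = {
    "The cat sits.": "🗣.⟨ENTITY.cat.ACTION.sit⟩",
    "A cat is sitting.": "🗣.⟨ENTITY.cat.ACTION.sit⟩",
}
TOY_FROM_LATTICE = {
    "🗣.⟨ENTITY.cat.ACTION.sit⟩": "A cat is sitting.",
}

ok = round_trip_preserves_meaning(
    "The cat sits.",
    TOY_TO_LATTICE.__getitem__,
    TOY_FROM_LATTICE.__getitem__,
)
print(ok)  # True: the wording changed, the LATTICE expression did not
```

Comparing at the LATTICE level avoids penalizing harmless rewording, but it cannot catch errors that the decomposition makes consistently in both passes, so it complements human review and does not replace it.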
Phase 2 — MAJOR LANGUAGES (6 months after Phase 1)
Add synthesis modules for the top 10 languages by speaker count:
Mandarin, Spanish, English (done), Hindi, Arabic, Bengali, Portuguese, Russian, Japanese, French.
10 languages × 2 directions = 20 translation paths.
Without LATTICE: 10×9/2 = 45 pairs needed. LATTICE saves 56% even at just 10 languages.
Phase 3 — COMMUNITY EXPANSION
L3 Architects (SPEC_LATTICE_L3_CURRICULUM.md) can register new languages. A native Swahili speaker who reaches L3 can build the Swahili synthesis module — they understand both LATTICE grammar and Swahili grammar.
Revenue split (per SPEC_INVENTIONX.md): 70% to the language contributor. 30% to platform.
Phase 4 — ENDANGERED LANGUAGES
~7,000 living languages. ~3,000 endangered — spoken by fewer than 10,000 people. Nearly zero representation in commercial translation services. No commercial incentive exists to build Navajo↔Tagalog.
LATTICE changes the economics. An endangered language needs only ONE synthesis module. Once built:
- Navajo↔Mandarin — works
- Navajo↔French — works
- Navajo↔Arabic — works
Not because anyone built Navajo-Mandarin training data. Because Navajo→LATTICE exists and LATTICE→Mandarin exists.
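The composition argument can be sketched as a registry of per-language modules. Everything here is illustrative: `LanguageModule`, `ModuleRegistry`, and the toy modules are assumed names, and the toy modules return placeholder strings, not real translations.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass
class LanguageModule:
    decompose: Callable[[str], str]    # source text -> LATTICE expression
    synthesize: Callable[[str], str]   # LATTICE expression -> target text


class ModuleRegistry:
    def __init__(self) -> None:
        self._modules: Dict[str, LanguageModule] = {}

    def register(self, language: str, module: LanguageModule) -> None:
        self._modules[language] = module

    def reachable_pairs(self) -> int:
        """Directed source->target pairs available through LATTICE."""
        n = len(self._modules)
        return n * (n - 1)

    def translate(self, text: str, source: str, target: str) -> Tuple[str, str]:
        """Return (target text, intermediate LATTICE expression)."""
        lattice = self._modules[source].decompose(text)
        return self._modules[target].synthesize(lattice), lattice


def toy_module(language: str) -> LanguageModule:
    # Placeholder behavior only; a real module encodes the language's grammar.
    return LanguageModule(
        decompose=lambda text: "🗣.⟨ENTITY.cat.ACTION.sit⟩",
        synthesize=lambda lattice: f"[{language} rendering of {lattice}]",
    )


registry = ModuleRegistry()
for language in ("mandarin", "french", "arabic"):
    registry.register(language, toy_module(language))
before = registry.reachable_pairs()

registry.register("navajo", toy_module("navajo"))
after = registry.reachable_pairs()
print(before, after)  # 6 12
```

Registering one new module adds a path to and from every language already present, which is the economic point made above.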
The interlingua democratizes translation for languages the commercial market ignores. This is LATTICE's most important potential application.
PRODUCT MODEL
GLOSS Translate in OBI OS
Translation is a Bridge feature. The user types in any language. The Bridge decomposes → LATTICE → synthesizes into the target language. Available to all docked AIs. Included in the $42/month OBI OS subscription.
GLOSS Translate Standalone
42sisters.ai/translate — free web tool for basic translation (limited to 500 characters). Full translation for $10/month standalone or included in OBI OS.
The free tier is the funnel — same as LATTICE Training Arena.
GLOSS Translate API
api.42sisters.ai/translate — REST API for developers.
- Input: source text + source language + target language
- Output: target text + intermediate LATTICE expression (transparency)
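The input and output listed above might map onto JSON bodies like the ones built below. The field names (`text`, `source`, `target`, `translation`, `lattice`) are assumptions for illustration; the spec names the information carried but not the wire format, and no network call is made here.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class TranslateRequest:
    text: str      # source text
    source: str    # source language (hypothetical code, e.g. "en")
    target: str    # target language (hypothetical code, e.g. "fr")


@dataclass
class TranslateResponse:
    translation: str   # target text
    lattice: str       # intermediate LATTICE expression, exposed for transparency


def build_request_body(request: TranslateRequest) -> str:
    # ensure_ascii=False keeps non-ASCII text readable in the payload.
    return json.dumps(asdict(request), ensure_ascii=False)


def parse_response_body(body: str) -> TranslateResponse:
    data = json.loads(body)
    return TranslateResponse(translation=data["translation"],
                             lattice=data["lattice"])


request_body = build_request_body(
    TranslateRequest(text="Do you want coffee or tea?", source="en", target="fr")
)

# A response body as it might look for Example 1 of this spec.
sample_response = json.dumps({
    "translation": "Voulez-vous du café ou du thé ?",
    "lattice": "🗣.⟨MODALITY.want.ENTITY.↺you.ENTITY.[coffee|tea]⟩.?",
}, ensure_ascii=False)
response = parse_response_body(sample_response)
print(response.lattice)
```

Returning the LATTICE expression alongside the translation is what lets a client inspect or edit the meaning representation before asking for synthesis again.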
Pricing:
- Free tier: 1,000 chars/day
- $15/month: 100K chars/day
- Enterprise: custom
QUALITY CONSIDERATIONS
Idioms
"It's raining cats and dogs" decomposes to MEANING not WORDS.
Correct: 🗣.⟨ACTION.rain.◆heavy⟩
Incorrect: 🗣.⟨ACTION.rain.ENTITY.[cat.dog]⟩
The decomposition layer must recognize idioms and encode the intended meaning.
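A very simple way to honor this rule is to consult an idiom table before any literal decomposition. The sketch below assumes a hand-built table and exact-match lookup, which is only one possible approach; the names are hypothetical.

```python
from typing import Optional

# Hypothetical idiom table: normalized English idiom -> intended meaning.
IDIOM_MEANINGS = {
    "it's raining cats and dogs": "🗣.⟨ACTION.rain.◆heavy⟩",
}


def normalize(text: str) -> str:
    """Lowercase, unify apostrophes, and drop trailing punctuation."""
    return text.replace("’", "'").strip().rstrip(".!?").lower()


def decompose_idiom(text: str) -> Optional[str]:
    """Return the idiom's LATTICE meaning, or None if the text is not a known idiom."""
    return IDIOM_MEANINGS.get(normalize(text))


print(decompose_idiom("It's raining cats and dogs."))  # 🗣.⟨ACTION.rain.◆heavy⟩
print(decompose_idiom("It's raining."))                # None
```

When the lookup returns `None`, the text would fall through to ordinary literal decomposition. Idioms embedded inside longer sentences need more than exact matching.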
Cultural Context
"Please" in English is a politeness marker. In Japanese, the entire verb conjugation changes. The LATTICE modality system captures the INTENT (polite request) and lets the synthesis layer express it appropriately for the target culture.
Ambiguity
"Bank" means financial institution or river edge. Decomposition must disambiguate from context. If ambiguous: encode both meanings and flag for human review.
Poetry and Wordplay
Some elements are fundamentally untranslatable because they depend on the source language's sound, rhythm, or double meanings. HOW ABOUT NO applies to translation: we don't pretend we can translate everything perfectly.
Wordplay is flagged: "This expression contains a pun that doesn't translate. Literal meaning: [X]. Intended humor: [Y]."
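The flagging behavior described for ambiguity and wordplay can be sketched as follows. The sense encodings, cue words, and class names are assumptions made for this example; real disambiguation would rely on much richer context than keyword overlap.

```python
from dataclasses import dataclass, field


@dataclass
class Decomposition:
    candidates: list[str]                          # one entry per surviving meaning
    flags: list[str] = field(default_factory=list)

    @property
    def needs_review(self) -> bool:
        return len(self.candidates) > 1 or bool(self.flags)


# Hypothetical sense inventory and context cues.
SENSES = {
    "bank": {
        "finance": "ENTITY.[bank.◆financial]",
        "river": "ENTITY.[bank.◆river]",
    },
}
CUES = {
    "finance": {"money", "loan", "deposit"},
    "river": {"river", "water", "shore"},
}


def disambiguate(word: str, context: str) -> Decomposition:
    senses = SENSES[word]
    tokens = set(context.lower().split())
    matched = [name for name in senses if tokens & CUES[name]]
    # Exactly one matching sense wins; otherwise keep every sense and flag it.
    chosen = matched if len(matched) == 1 else list(senses)
    result = Decomposition([senses[name] for name in chosen])
    if len(chosen) > 1:
        result.flags.append(
            f"Ambiguous: '{word}' has {len(chosen)} candidate meanings.")
    return result


def pun_flag(literal: str, humor: str) -> str:
    return ("This expression contains a pun that doesn't translate. "
            f"Literal meaning: {literal}. Intended humor: {humor}.")


clear = disambiguate("bank", "she took a loan from the bank")
unclear = disambiguate("bank", "we met at the bank")
print(clear.needs_review, unclear.needs_review)  # False True
```

Keeping every candidate in the result, instead of silently picking one, is what lets a human reviewer see exactly what was uncertain.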
LATTICE GRAMMAR IS ALREADY LANGUAGE-NEUTRAL — PROOF
The 15 domain mappings in SPEC_LATTICE_UNIVERSAL.md prove that LATTICE grammar (atoms, modifiers, compounds, channels) works across structured domains without modification.
Natural language is another structured domain:
- Atoms = semantic primitives (entities, actions, properties)
- Modifiers = semantic relationships (properties, quantities, negation)
- Compounds = clauses (grouped thoughts)
- Channels = speech acts (statement, question, command, exclamation)
The same grammar that carries ♩.Ω.[E.4.●.↑] (music) and ⚗.[H.H.O] (chemistry) carries 🗣.⟨ENTITY.cat.ACTION.sit⟩ (natural language). No grammar modification needed.
LATTICE was accidentally designed as a universal interlingua the day it was created. It just took 15 domain mappings before we noticed.
INVARIANTS
INV-01: LATTICE is the interlingua. Translation always passes through LATTICE. No direct source→target shortcuts. The interlingua architecture is the value proposition — transparency, composability, and rare-language support all depend on the explicit intermediate representation.
INV-02: The intermediate LATTICE expression is VISIBLE to the user. Not hidden inside a black box. The user can see, edit, and learn from the semantic decomposition. Transparency is a feature, not a debug mode.
INV-03: Round-trip fidelity: source→LATTICE→source must preserve meaning. If English→LATTICE→English loses meaning, the decomposition is broken. Round-trip testing is the primary quality metric.
INV-04: Idioms are decomposed to MEANING, not WORDS. The decomposition layer must recognize figurative language.
INV-05: Untranslatable elements are FLAGGED, not fabricated. HOW ABOUT NO applies. We explain what was lost, not invent a false translation.
INV-06: Endangered language support is a MISSION, not just a feature. LATTICE's interlingua architecture makes it possible to support languages Google Translate never will. This is the most important application of the technology.
INV-07: Community-built synthesis modules follow the 70/30 revenue split. The contributor gets 70%. The platform gets 30%. The people who speak the language benefit from making it accessible.
INV-08: The 🗣 domain prefix is registered in SPEC_LATTICE_UNIVERSAL.md as Domain 16. Same grammar. Same registration. Same proof: LATTICE is universal.
INTEGRATION
| System | Relationship |
|---|---|
| SPEC_GLOSS_ARCHITECTURE.md | GLOSS Translate is an extension of GLOSS, not a separate system. Same architecture, expanded scope — adds 🗣 domain and synthesis modules. |
| SPEC_LATTICE_UNIVERSAL.md | 🗣 registers as Domain 16. Same registration process as the other 15 domains. |
| SPEC_LX_GRAMMAR.md | The 🗣 domain atoms and modifiers follow LX grammar rules. No grammar modifications — the existing spec is sufficient. |
| SPEC_ROUTX_VOCABULARY.md | New keywords: "translate [text] to [language]" routes to GLOSS Translate. Tier 1 for supported languages, Tier 3 for unsupported with contribution invitation. |
| SPEC_LATTICE_L3_CURRICULUM.md | L3 Architects can register new language synthesis modules. Domain 16 registration authority = L3 certification. |
| SPEC_LATTICE_VIRAL.md | GLOSS Translate is a viral feature. Rare language support and visible intermediate LATTICE expressions are shareable, remarkable capabilities. |
| SPEC_INVENTIONX.md | GLOSS Translate is a natural extension of Invention 1 (LATTICE Translator) — from compression tool to full interlingua translation system. |
| SPEC_LEARNX.md | Translation errors identified by users become training data for improving decomposition and synthesis modules. |
Jeremy Zlabis
Chronogeometer · Visionary · Disruptor · Chief
42 Sisters AI · East York, Toronto
🍁 Φ 0.042
Named for
GLOSS already translates English↔LATTICE. GLOSS Translate extends this to ANY human language↔LATTICE↔ANY human language.
y communicate through LATTICE. Neither needs to speak the other's language. The Ring displays
messages in the USER's preferred language regardless of which AI generated them.
LEARNX — translation
errors identified by users become training data for improving decomposition and synthesis modules.
Each correction makes the translation better. The system learns from use.
SPEC_LATTICE_UNIVERSAL.md —
and creative content. The corpus must include conversations from MULTIPLE AI providers (ChatGPT,