Large Language Models are exceptionally good at producing fluent text.
They are not inherently good at knowing when not to answer.
In regulated or compliance-sensitive environments, this distinction is critical.
A linguistically plausible answer that is not grounded in official documentation is often worse than no answer at all.
This article describes a practical architecture for handling language detection, translation intent, and multilingual retrieval in an enterprise AI system — with a strong emphasis on determinism, evidence-first behavior, and hallucination prevention.
The examples are intentionally domain-neutral, but the patterns apply to legal, regulatory, financial, and policy-driven systems.
The Core Problem
Consider these seemingly simple user questions:
"What is E104?"
"Slovenski prevod za E104?"
"Hrvatski prevod za E104?"
"Kaj je E104 v slovaščini?"
"Slovenski prevod za Curcumin?"
At first glance, these look like:
- definitions
- translations
- or simple multilingual queries
A naïve LLM-only approach will happily generate answers for all of them.
But in a regulated environment, each of these questions carries a different risk profile:
- Some require retrieval
- Some require translation
- Some require terminology resolution
- Some should result in a deterministic refusal
The challenge is not generating text —
it is deciding which answers are allowed to exist.
Key Design Principle: Evidence Before Language
The system described here follows one non-negotiable rule:
Language is applied after evidence is proven, never before.
This means:
- Language preference never expands the answer space
- Translation never invents facts
- Missing official wording is explicitly acknowledged
Step 1: Language Hint Resolution (Not Language Detection)
We do not attempt full language detection.
Instead, we resolve a language hint with explicit provenance.
Result model
public sealed record LanguageHintResult(
    string? LanguageCode, // e.g. "SL", "HR"; null when there is no language signal
    bool IsExplicit,      // true only when the user explicitly requested the language
    string Source         // provenance of the hint, for auditability
);
This allows the system to distinguish:
- explicit requests ("Answer in Slovenian", lang=HR)
- derived hints ("v slovaščini")
- implicit sentence language
- no language signal at all
Resolver behavior (simplified)
var hint = LanguageHintResolver.Resolve(query);

// Examples:
// "Answer in Slovenian"      → SL, explicit
// "Hrvatski prevod za E104"  → HR, explicit
// "v slovaščini"             → SK, derived
// "Prevod za E104?"          → null (no hint)
Crucially:
- Country phrases do not imply language
- Diacritics alone are insufficient
- Ambiguous phrases default to no hint
This prevents false positives such as:
"Hrvatska uredba o aditivih"
being misinterpreted as a request for Croatian output.
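To make the shape of this concrete, here is a minimal resolver sketch. The phrase tables are illustrative assumptions, not the production rules, and it assumes the LanguageHintResult record shown above is in scope:
using System;
using System.Collections.Generic;

public static class LanguageHintResolver
{
    // Explicit requests: the user names the target language directly.
    private static readonly Dictionary<string, string> ExplicitPhrases = new()
    {
        ["answer in slovenian"] = "SL",
        ["slovenski prevod"] = "SL",
        ["hrvatski prevod"] = "HR",
    };

    // Derived hints: inflected language names such as "v slovaščini" ("in Slovak").
    private static readonly Dictionary<string, string> DerivedPhrases = new()
    {
        ["v slovaščini"] = "SK",
        ["v slovenščini"] = "SL",
    };

    public static LanguageHintResult Resolve(string query)
    {
        foreach (var (phrase, code) in ExplicitPhrases)
            if (query.Contains(phrase, StringComparison.OrdinalIgnoreCase))
                return new LanguageHintResult(code, IsExplicit: true, Source: "explicit-phrase");

        foreach (var (phrase, code) in DerivedPhrases)
            if (query.Contains(phrase, StringComparison.OrdinalIgnoreCase))
                return new LanguageHintResult(code, IsExplicit: false, Source: "derived-inflection");

        // Country words ("Hrvatska uredba ...") and diacritics alone never map to a
        // language here; anything ambiguous deliberately yields no hint at all.
        return new LanguageHintResult(null, IsExplicit: false, Source: "none");
    }
}
The important property is that the resolver can only say "no hint"; it never falls back to guessing from surface features of the sentence.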
Step 2: Translation Mode Is a First-Class Concept
Translation is not inferred from language alone.
Instead, translation intent is detected explicitly:
- “translate”
- “prevod” (“translation”)
- “prevedi” (“translate”)
- “in <language>”
bool isTranslation =
    IsTranslationQuery(query) ||
    ContainsLanguageInflection(query); // e.g. "v slovaščini"
This distinction is essential because translation mode activates stricter rules.
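A minimal sketch of these two checks, assuming simple keyword and phrase lists (the lists here are illustrative, not the full production vocabulary):
using System;
using System.Linq;

public static class TranslationIntent
{
    // Keywords that signal an explicit translation request.
    private static readonly string[] TranslationKeywords =
        { "translate", "translation", "prevod", "prevedi" };

    // Inflected language names, e.g. "v slovaščini" ("in Slovak").
    private static readonly string[] LanguageInflections =
        { "v slovaščini", "v slovenščini", "v hrvaščini" };

    public static bool IsTranslationQuery(string query) =>
        TranslationKeywords.Any(k => query.Contains(k, StringComparison.OrdinalIgnoreCase));

    public static bool ContainsLanguageInflection(string query) =>
        LanguageInflections.Any(p => query.Contains(p, StringComparison.OrdinalIgnoreCase));
}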
Step 3: Translation vs. Terminology Resolution
One of the most important lessons learned:
Not every “translation” request is a translation problem.
Example:
"Slovenski prevod za Curcumin?"
This is not a linguistic task.
It is a terminology resolution problem:
- Does “Curcumin” have an official identifier?
- Is there an official name in the target language?
- If not — the system must not invent one
Naïve LLM output (wrong)
"kurkumin"
Linguistically plausible —
regulatorily invalid.
Deterministic system output (correct)
Official English name for E100: Curcumin
An official Slovenian translation was not found in the provided sources,
so it cannot be stated without guessing.
No hallucination.
No linguistic creativity.
Full auditability.
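In code terms, the first question is answered by a lookup against indexed official terminology, never by asking the model for a word. A sketch, where ITerminologyIndex and its method are illustrative assumptions:
public interface ITerminologyIndex
{
    // Maps an official name in any indexed language to its identifier,
    // e.g. "Curcumin" → "E100". Returns false for unknown terms.
    bool TryResolveIdentifier(string term, out string identifier);
}

public static class TerminologyResolution
{
    // Returns the official identifier for a term, or null when the sources do not
    // contain one. A null result ends the request deterministically; it is never
    // "repaired" by asking the model for a plausible word.
    public static string? TryResolve(ITerminologyIndex index, string term) =>
        index.TryResolveIdentifier(term, out var code) ? code : null;
}
Once a term resolves to an identifier, the request rejoins the deterministic E-code path described in Steps 5 and 6: the wording in the target language either exists in the sources, or the answer says so.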
Step 4: Retrieval With Language Preference (Not Language Enforcement)
Retrieval follows a strict ladder:
Normal queries
preferred language → EN → any
Translation queries
preferred language → EN → any (the ladder itself does not change)
But with one critical rule:
For identifier-based queries (e.g. E-codes), retrieval is restricted to authoritative sources only.
if (ContainsECode(query))
{
    request.Sources = new[] { LawSource.EurLex };
}
This prevents unrelated documents from contaminating the answer space.
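A sketch of the ladder itself, assuming a simple search abstraction (ISearchIndex, Hit, and the method names are illustrative stand-ins for the real retrieval API):
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public sealed record Hit(string DocumentId, string LanguageCode, string Text);

public interface ISearchIndex
{
    // languageCode == null means "any language".
    Task<IReadOnlyList<Hit>> SearchAsync(string query, string? languageCode);
}

public static class RetrievalLadder
{
    public static async Task<IReadOnlyList<Hit>> SearchWithLadderAsync(
        ISearchIndex index, string query, string? preferredLanguage)
    {
        // preferred language → EN → any; the ladder is a preference, not a filter.
        var ladder = new List<string?>();
        if (!string.IsNullOrEmpty(preferredLanguage) && preferredLanguage != "EN")
            ladder.Add(preferredLanguage);
        ladder.Add("EN");
        ladder.Add(null); // any language

        foreach (var language in ladder)
        {
            var hits = await index.SearchAsync(query, language);
            if (hits.Count > 0)
                return hits; // stop at the first rung that yields evidence
        }

        return Array.Empty<Hit>();
    }
}
The ladder only changes which evidence is preferred; the source restriction above still decides which documents are allowed at all.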
Step 5: Deterministic E-Code Guardrail
For any query containing a formal identifier:
- The system requires evidence that explicitly contains that identifier
- If no such evidence exists:
- the request is refused deterministically
- the LLM is never called
if (ContainsECode(query) && !EvidenceContainsECode(hits))
{
    return DeterministicRefusal();
}
This applies to:
- definitions
- translations
- “what is …” questions
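The two checks can stay deliberately simple and auditable. A sketch using a regular expression (the exact pattern and helper shapes are assumptions):
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class ECodeGuards
{
    // Matches formal additive identifiers such as "E104" or "E 160a".
    private static readonly Regex ECodePattern =
        new(@"\bE\s?\d{3,4}[a-e]?\b", RegexOptions.IgnoreCase | RegexOptions.Compiled);

    public static bool ContainsECode(string text) => ECodePattern.IsMatch(text);

    // Passes only when every identifier mentioned in the query appears
    // verbatim (modulo spacing and case) in at least one retrieved hit.
    public static bool EvidenceContainsECode(string query, IEnumerable<string> hitTexts)
    {
        var requested = ECodePattern.Matches(query)
            .Select(m => Normalize(m.Value))
            .Distinct()
            .ToList();

        var found = hitTexts
            .SelectMany(t => ECodePattern.Matches(t).Select(m => Normalize(m.Value)))
            .ToHashSet();

        return requested.Count > 0 && requested.All(found.Contains);
    }

    private static string Normalize(string code) =>
        code.Replace(" ", "").ToUpperInvariant();
}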
Step 6: Deterministic Translation Output
When translation mode + identifier is active:
var answer = BuildDeterministicECodeTranslationAnswer(
    requestedCode,
    preferredLanguage,
    evidence
);
Behavior:
- Use official wording in target language if it exists
- Otherwise:
- fall back to English
- add an explicit disclaimer
- Never synthesize terminology
- Never call the LLM
Example output:
Official English name for E104: Quinoline Yellow
An official translation in language SK was not found in the provided sources,
so it cannot be stated without guessing.
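A sketch of such a builder over retrieved evidence (the Evidence shape is a simplified assumption; the disclaimer wording mirrors the example output above):
using System;
using System.Collections.Generic;
using System.Linq;

public sealed record Evidence(string ECode, string LanguageCode, string OfficialName);

public static class TranslationAnswers
{
    public static string BuildDeterministicECodeTranslationAnswer(
        string requestedCode,
        string? preferredLanguage,
        IReadOnlyList<Evidence> evidence)
    {
        // Only evidence that explicitly carries the requested identifier counts.
        var forCode = evidence
            .Where(e => string.Equals(e.ECode, requestedCode, StringComparison.OrdinalIgnoreCase))
            .ToList();

        // 1. Official wording in the target language, if the sources contain it.
        var target = forCode.FirstOrDefault(e =>
            string.Equals(e.LanguageCode, preferredLanguage, StringComparison.OrdinalIgnoreCase));
        if (target is not null)
            return $"Official {target.LanguageCode} name for {requestedCode}: {target.OfficialName}";

        // 2. Otherwise fall back to English and attach an explicit disclaimer.
        var english = forCode.FirstOrDefault(e =>
            string.Equals(e.LanguageCode, "EN", StringComparison.OrdinalIgnoreCase));
        if (english is not null)
            return $"Official English name for {requestedCode}: {english.OfficialName}\n" +
                   $"An official translation in language {preferredLanguage} was not found in the provided sources, " +
                   "so it cannot be stated without guessing.";

        // 3. No usable evidence at all: deterministic refusal
        //    (the Step 5 guardrail normally catches this case earlier).
        return $"No official source containing {requestedCode} was found in the provided sources, " +
               "so nothing can be stated without guessing.";
    }
}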
Step 7: Language Coverage Grows With Data, Not Code
A critical property of this design:
Adding a new language requires indexing data — not changing logic.
Once official sources are indexed for:
BG, EN, ES, FI, GA, HR, HU, IT, LT, LV, MT, SK, SL, SV
the system automatically:
- prefers that language
- aligns sections by identifier
- drops the English-fallback disclaimers
- keeps all guardrails intact
This is data-driven multilinguality, not heuristic-driven.
Why This Works
This architecture succeeds because it:
- Separates language intent from language execution
- Treats translation as a regulated operation
- Prefers refusal over invention
- Keeps the LLM behind hard guardrails
- Makes every decision explainable and testable
Most importantly:
Fluency is never allowed to outrank correctness.
Closing Thoughts
Enterprise AI systems do not fail because models are weak.
They fail because systems allow models to answer questions they should never answer.
By enforcing:
- deterministic retrieval
- language-aware guardrails
- evidence-only translation paths
we can safely deploy multilingual AI systems in environments where correctness is non-negotiable.
Sometimes, the most intelligent answer an AI system can give is:
“I cannot say this without guessing.”
And that is exactly the point.
That’s all folks!
Cheers!
Gašper Rupnik
{End.}
