Engineering Deterministic Agent-Based AI Systems for Regulatory Domains

AI systems that operate on European legislation, regulatory texts, and customer policies cannot be built using the same assumptions as general-purpose conversational assistants.

In these domains, answers must be:

  • deterministic,
  • evidence-bound,
  • non-inferential,
  • and reproducible across executions.

This article describes a technical reference architecture and implementation patterns for building agent-based AI systems that intentionally restrict model behavior and prevent speculative output.


1. Problem Definition: Why Probabilistic Answers Are Unacceptable

Large Language Models are probabilistic by nature.
Regulatory systems are not.

Typical failure modes include:

  • inferring missing values (“the date is likely…”),
  • generalizing from similar regulations,
  • merging evidence across documents,
  • translating terminology without an official source.

Key requirement:

If information is not explicitly present in the evidence, the system must not produce it.

This shifts responsibility from the model to the surrounding system.


2. Reference Architecture: Agent Pipeline, Not Chat Completion

A regulatory-grade system should be structured as a deterministic agent pipeline, where each stage is system-controlled.

User Question
    ↓
Intent Classification (rule-based / constrained)
    ↓
Retrieval Scope Definition
    ↓
Keyword + Vector Retrieval
    ↓
Evidence Pruning & Validation
    ↓
Bounded LLM Synthesis
    ↓
Post-Processing Guardrails
    ↓
Final Answer

Design rule:
The LLM is never allowed to decide what is relevant — only how to phrase validated facts.
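
To make the control flow concrete, here is a minimal sketch of the orchestrator. All helper names and types (DefineScope, RetrievalScope, SynthesizeBounded, and so on) are illustrative assumptions, not a prescribed API:

// Minimal pipeline sketch; only the stage order and the fail-closed
// check are the point. Helper names are illustrative assumptions.
string AnswerQuestion(string question)
{
    QueryIntent intent = DetectIntent(question);          // rule-based / constrained
    RetrievalScope scope = DefineScope(intent);           // system decides sources
    var candidates = Retrieve(question, scope);           // keyword + vector
    var evidence = PruneAndValidate(candidates, scope);   // hard pruning

    // Fail closed before the model is ever invoked.
    if (evidence.Count == 0)
        return "The information is not specified in the provided sources.";

    string draft = SynthesizeBounded(question, evidence); // bounded phrasing only
    ValidateAnswer(draft);                                // post-processing guardrails
    return draft;
}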


3. Intent Classification as a Control Surface

Intent classification should not rely solely on the model.

Example: distinguishing table-bound questions from general text queries.

using System.Text.RegularExpressions;

enum QueryIntent
{
    TextLookup,
    TableLookup,
    DefinitionLookup,
    TranslationLookup
}

QueryIntent DetectIntent(string question)
{
    // Rule-based routing: cheap, auditable, and deterministic.
    if (Regex.IsMatch(question, @"\b(table|column|row|value)\b", RegexOptions.IgnoreCase))
        return QueryIntent.TableLookup;

    if (Regex.IsMatch(question, @"\b(define|meaning of)\b", RegexOptions.IgnoreCase))
        return QueryIntent.DefinitionLookup;

    return QueryIntent.TextLookup;
}

Intent directly controls:

  • allowed retrieval sources,
  • evidence format,
  • failure behavior.
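
To make this binding explicit, each intent can resolve to a fixed, system-owned policy. The RetrievalPolicy record below is an illustrative assumption, not a fixed schema:

// Illustrative: the policy is owned by the system, never by the model.
record RetrievalPolicy(
    string[] AllowedSources,
    string EvidenceFormat,
    string FailClosedMessage);

RetrievalPolicy PolicyFor(QueryIntent intent) => intent switch
{
    QueryIntent.TableLookup => new RetrievalPolicy(
        new[] { "tables" }, "structured-rows",
        "The provided table does not specify the requested value."),

    QueryIntent.DefinitionLookup => new RetrievalPolicy(
        new[] { "definitions" }, "verbatim-text",
        "No definition is provided in the source documents."),

    _ => new RetrievalPolicy(
        new[] { "primary-text" }, "verbatim-text",
        "The information is not specified in the provided sources.")
};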

4. Retrieval: Evidence First, Semantics Second

Pure vector search is insufficient for legal material.

Recommended pattern:

  • keyword search for precision,
  • vector search for recall,
  • explicit prioritization of primary documents.

// Keyword search supplies precision; vector search supplies recall.
var keywordHits = keywordIndex.Search(query);
var vectorHits = vectorIndex.Search(embedding);

var merged = MergeAndRank(
    keywordHits,
    vectorHits,
    preferPrimarySource: true);
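
MergeAndRank is left abstract above. A minimal sketch could look like the following; the deduplication key and ranking order are assumptions to be tuned per corpus:

// Illustrative: deduplicate by chunk, then rank primary sources first
// and keyword matches before vector-only hits.
IReadOnlyList<Evidence> MergeAndRank(
    IEnumerable<Evidence> keywordHits,
    IEnumerable<Evidence> vectorHits,
    bool preferPrimarySource)
{
    return keywordHits
        .Concat(vectorHits)
        .GroupBy(e => e.ChunkId)
        .Select(g => g.First())
        .OrderByDescending(e => preferPrimarySource && e.Source == PrimarySource)
        .ThenByDescending(e => e.KeywordScore)
        .ToList();
}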

Before passing context to the model, evidence is hard-pruned:

var allowedEvidence = merged
    .Where(e => e.Source == PrimarySource)
    .Take(MaxEvidenceChunks)
    .ToList();

If no valid evidence remains → fail closed.


5. Fail-Closed Guardrails (Hard Stops)

Fail-closed behavior must be implemented before the model is invoked.

Example: table-based queries requiring a date.

string AnswerTableDate(Table table)
{
    // Fail closed: if the column is absent, say so instead of guessing.
    if (!table.Columns.Contains("Date"))
        return "The provided table does not specify the date.";

    return table.GetValue("Date");
}

The model is never asked to “figure it out”.


6. Table QA: Treat Tables as Structured Data

Tables should not be passed as raw text blobs.

Instead:

  • parse tables into structured representations,
  • enforce single-document binding,
  • forbid cross-table joins.
// Minimal row shape (assumed): header name -> cell value.
public record Row(IReadOnlyDictionary<string, string> Cells);

class TableEvidence
{
    public string DocumentId { get; init; }
    public IReadOnlyList<Row> Rows { get; init; }
}

void ValidateTableScope(IEnumerable<TableEvidence> tables)
{
    // Enforce single-document binding: cross-table joins are forbidden.
    if (tables.Select(t => t.DocumentId).Distinct().Count() > 1)
        throw new InvalidOperationException("Cross-document table merging is not allowed.");
}

This prevents one of the most common RAG hallucination patterns.
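
Parsing itself can stay simple. A minimal sketch, assuming the extractor delivers headers plus string cells:

// Illustrative parser: the input shape (headers + cell rows) is an
// assumption about the upstream document extractor.
TableEvidence ParseTable(string documentId, string[] headers, IEnumerable<string[]> cellRows)
{
    var rows = cellRows
        .Select(cells => new Row(
            headers.Zip(cells, (h, c) => (Header: h, Cell: c))
                   .ToDictionary(p => p.Header, p => p.Cell)))
        .ToList();

    return new TableEvidence { DocumentId = documentId, Rows = rows };
}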


7. Bounded LLM Invocation

When the model is invoked, it must operate under strict constraints.

Example prompt strategy (simplified):

You may only use the provided evidence.
If the answer is not explicitly stated, respond exactly with:
"The information is not specified in the provided sources."
Do not infer, summarize, or generalize.

The system enforces post-conditions:

void ValidateAnswer(string answer)
{
    // Post-condition: reject any output that speculates beyond the evidence.
    if (ContainsSpeculation(answer))
        throw new GuardrailViolationException();
}
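
ContainsSpeculation can start as a plain lexical check. The phrase list below is an illustrative assumption and would be tuned per domain:

// Illustrative: flag hedging language that signals inference, not evidence.
static readonly string[] SpeculativePhrases =
    { "likely", "probably", "presumably", "it can be assumed", "typically" };

bool ContainsSpeculation(string answer) =>
    SpeculativePhrases.Any(p =>
        answer.Contains(p, StringComparison.OrdinalIgnoreCase));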

8. Translation Without Interpretation

Translation must be evidence-backed.

string TranslateTerm(string term, EvidenceSet evidence)
{
    // Only translations explicitly present in the sources are returned.
    var official = evidence.FindOfficialTranslation(term);
    if (official is null)
        return "No official translation is provided in the source documents.";

    return official;
}

No fallback to “best guess” translation.
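
FindOfficialTranslation can be backed by a glossary built only from ingested official documents. The shape below is an assumption:

// Illustrative: the glossary is populated verbatim during ingestion,
// never at query time.
class EvidenceSet
{
    private readonly IReadOnlyDictionary<string, string> _glossary;

    public EvidenceSet(IReadOnlyDictionary<string, string> glossary)
        => _glossary = glossary;

    public string? FindOfficialTranslation(string term)
        => _glossary.TryGetValue(term, out var official) ? official : null;
}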


9. Deterministic Ingestion and Indexing

Correct answers require correct data foundations.

Recommended ingestion pattern:

  • explicit _SUCCESS markers,
  • retry with backoff,
  • no partial index visibility.

// Queries may only run against a fully published index.
if (!File.Exists("_SUCCESS"))
    throw new IngestionFailedException();
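
The other two points can be sketched as follows; the staging/live directory layout and retry limits are assumptions:

// Illustrative: build in staging, retry transient failures with backoff,
// and publish with a single directory move so readers never observe
// a partially built index.
async Task IngestAsync(string stagingDir, string liveDir)
{
    for (int attempt = 1; attempt <= 5; attempt++)
    {
        try
        {
            await BuildIndexAsync(stagingDir);  // assumed to write _SUCCESS last
            if (!File.Exists(Path.Combine(stagingDir, "_SUCCESS")))
                throw new IngestionFailedException();

            Directory.Move(stagingDir, liveDir);  // single publish step
            return;
        }
        catch (IOException) when (attempt < 5)
        {
            // Exponential backoff before the next attempt.
            await Task.Delay(TimeSpan.FromSeconds(Math.Pow(2, attempt)));
        }
    }

    throw new IngestionFailedException();
}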

This ensures queries are never answered against incomplete data.


10. Testing for Impossibility, Not Possibility

Tests should assert what must never happen.

Example guardrail test:

[Fact]
public void Must_Not_Invent_Date_When_Missing()
{
    var answer = RunQuery("What is the date?", tableWithoutDate);

    Assert.Equal(
        "The provided table does not specify the date.",
        answer);
}

This locks behavior across model upgrades.


11. Key Takeaway

For regulatory and policy-driven AI systems:

The goal is not to maximize helpfulness,
but to minimize the possibility of being wrong.

This requires:

  • shifting authority from the model to the system,
  • treating absence of data as a valid outcome,
  • enforcing determinism at every stage.

This is not conversational AI.
This is engineering for trust.

That’s all folks!

Cheers!
Gašper Rupnik

{End.}
