Engineering Deterministic Agent-Based AI Systems for Regulatory Domains

AI systems that operate on European legislation, regulatory texts, and customer policies cannot be built using the same assumptions as general-purpose conversational assistants.

In these domains, answers must be:

  • deterministic,
  • evidence-bound,
  • non-inferential,
  • and reproducible across executions.

This article describes a technical reference architecture and implementation patterns for building agent-based AI systems that intentionally restrict model behavior and prevent speculative output.


1. Problem Definition: Why Probabilistic Answers Are Unacceptable

Large Language Models are probabilistic by nature.
Regulatory systems are not.

Typical failure modes include:

  • inferring missing values (“the date is likely…”),
  • generalizing from similar regulations,
  • merging evidence across documents,
  • translating terminology without an official source.

Key requirement:

If information is not explicitly present in the evidence, the system must not produce it.

This shifts responsibility from the model to the surrounding system.


2. Reference Architecture: Agent Pipeline, Not Chat Completion

A regulatory-grade system should be structured as a deterministic agent pipeline, where each stage is system-controlled.

User Question
    ↓
Intent Classification (rule-based / constrained)
    ↓
Retrieval Scope Definition
    ↓
Keyword + Vector Retrieval
    ↓
Evidence Pruning & Validation
    ↓
Bounded LLM Synthesis
    ↓
Post-Processing Guardrails
    ↓
Final Answer

Design rule:
The LLM is never allowed to decide what is relevant — only how to phrase validated facts.
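
To make the control flow concrete, here is a minimal sketch of the orchestrator. All helper names and types (DefineScope, RetrievalScope, SynthesizeBounded, and so on) are illustrative assumptions, not a prescribed API:

// Minimal pipeline sketch; only the stage order and the fail-closed
// check are the point. Helper names are illustrative assumptions.
string AnswerQuestion(string question)
{
    QueryIntent intent = DetectIntent(question);          // rule-based / constrained
    RetrievalScope scope = DefineScope(intent);           // system decides sources
    var candidates = Retrieve(question, scope);           // keyword + vector
    var evidence = PruneAndValidate(candidates, scope);   // hard pruning

    // Fail closed before the model is ever invoked.
    if (evidence.Count == 0)
        return "The information is not specified in the provided sources.";

    string draft = SynthesizeBounded(question, evidence); // bounded phrasing only
    ValidateAnswer(draft);                                // post-processing guardrails
    return draft;
}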


3. Intent Classification as a Control Surface

Intent classification should not rely solely on the model.

Example: distinguishing table-bound questions from general text queries.

using System.Text.RegularExpressions;

enum QueryIntent
{
    TextLookup,
    TableLookup,
    DefinitionLookup,
    TranslationLookup
}

QueryIntent DetectIntent(string question)
{
    // Rule-based routing: cheap, auditable, and deterministic.
    if (Regex.IsMatch(question, @"\b(table|column|row|value)\b", RegexOptions.IgnoreCase))
        return QueryIntent.TableLookup;

    if (Regex.IsMatch(question, @"\b(define|meaning of)\b", RegexOptions.IgnoreCase))
        return QueryIntent.DefinitionLookup;

    return QueryIntent.TextLookup;
}

Intent directly controls:

  • allowed retrieval sources,
  • evidence format,
  • failure behavior.
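
To make this binding explicit, each intent can resolve to a fixed, system-owned policy. The RetrievalPolicy record below is an illustrative assumption, not a fixed schema:

// Illustrative: the policy is owned by the system, never by the model.
record RetrievalPolicy(
    string[] AllowedSources,
    string EvidenceFormat,
    string FailClosedMessage);

RetrievalPolicy PolicyFor(QueryIntent intent) => intent switch
{
    QueryIntent.TableLookup => new RetrievalPolicy(
        new[] { "tables" }, "structured-rows",
        "The provided table does not specify the requested value."),

    QueryIntent.DefinitionLookup => new RetrievalPolicy(
        new[] { "definitions" }, "verbatim-text",
        "No definition is provided in the source documents."),

    _ => new RetrievalPolicy(
        new[] { "primary-text" }, "verbatim-text",
        "The information is not specified in the provided sources.")
};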

4. Retrieval: Evidence First, Semantics Second

Pure vector search is insufficient for legal material.

Recommended pattern:

  • keyword search for precision,
  • vector search for recall,
  • explicit prioritization of primary documents.

// Keyword search supplies precision; vector search supplies recall.
var keywordHits = keywordIndex.Search(query);
var vectorHits = vectorIndex.Search(embedding);

var merged = MergeAndRank(
    keywordHits,
    vectorHits,
    preferPrimarySource: true);
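
MergeAndRank is left abstract above. A minimal sketch could look like the following; the deduplication key and ranking order are assumptions to be tuned per corpus:

// Illustrative: deduplicate by chunk, then rank primary sources first
// and keyword matches before vector-only hits.
IReadOnlyList<Evidence> MergeAndRank(
    IEnumerable<Evidence> keywordHits,
    IEnumerable<Evidence> vectorHits,
    bool preferPrimarySource)
{
    return keywordHits
        .Concat(vectorHits)
        .GroupBy(e => e.ChunkId)
        .Select(g => g.First())
        .OrderByDescending(e => preferPrimarySource && e.Source == PrimarySource)
        .ThenByDescending(e => e.KeywordScore)
        .ToList();
}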

Before passing context to the model, evidence is hard-pruned:

var allowedEvidence = merged
    .Where(e => e.Source == PrimarySource)
    .Take(MaxEvidenceChunks)
    .ToList();

If no valid evidence remains → fail closed.


5. Fail-Closed Guardrails (Hard Stops)

Fail-closed behavior must be implemented before the model is invoked.

Example: table-based queries requiring a date.

string AnswerTableDate(Table table)
{
    // Fail closed: if the column is absent, say so instead of guessing.
    if (!table.Columns.Contains("Date"))
        return "The provided table does not specify the date.";

    return table.GetValue("Date");
}

The model is never asked to “figure it out”.


6. Table QA: Treat Tables as Structured Data

Tables should not be passed as raw text blobs.

Instead:

  • parse tables into structured representations,
  • enforce single-document binding,
  • forbid cross-table joins.
// Minimal row shape (assumed): header name -> cell value.
public record Row(IReadOnlyDictionary<string, string> Cells);

class TableEvidence
{
    public string DocumentId { get; init; }
    public IReadOnlyList<Row> Rows { get; init; }
}

void ValidateTableScope(IEnumerable<TableEvidence> tables)
{
    // Enforce single-document binding: cross-table joins are forbidden.
    if (tables.Select(t => t.DocumentId).Distinct().Count() > 1)
        throw new InvalidOperationException("Cross-document table merging is not allowed.");
}

This prevents one of the most common RAG hallucination patterns.
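
Parsing itself can stay simple. A minimal sketch, assuming the extractor delivers headers plus string cells:

// Illustrative parser: the input shape (headers + cell rows) is an
// assumption about the upstream document extractor.
TableEvidence ParseTable(string documentId, string[] headers, IEnumerable<string[]> cellRows)
{
    var rows = cellRows
        .Select(cells => new Row(
            headers.Zip(cells, (h, c) => (Header: h, Cell: c))
                   .ToDictionary(p => p.Header, p => p.Cell)))
        .ToList();

    return new TableEvidence { DocumentId = documentId, Rows = rows };
}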


7. Bounded LLM Invocation

When the model is invoked, it must operate under strict constraints.

Example prompt strategy (simplified):

You may only use the provided evidence.
If the answer is not explicitly stated, respond exactly with:
"The information is not specified in the provided sources."
Do not infer, summarize, or generalize.

The system enforces post-conditions:

void ValidateAnswer(string answer)
{
    // Post-condition: reject any output that speculates beyond the evidence.
    if (ContainsSpeculation(answer))
        throw new GuardrailViolationException();
}
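
ContainsSpeculation can start as a plain lexical check. The phrase list below is an illustrative assumption and would be tuned per domain:

// Illustrative: flag hedging language that signals inference, not evidence.
static readonly string[] SpeculativePhrases =
    { "likely", "probably", "presumably", "it can be assumed", "typically" };

bool ContainsSpeculation(string answer) =>
    SpeculativePhrases.Any(p =>
        answer.Contains(p, StringComparison.OrdinalIgnoreCase));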

8. Translation Without Interpretation

Translation must be evidence-backed.

string TranslateTerm(string term, EvidenceSet evidence)
{
    // Only translations explicitly present in the sources are returned.
    var official = evidence.FindOfficialTranslation(term);
    if (official is null)
        return "No official translation is provided in the source documents.";

    return official;
}

No fallback to “best guess” translation.
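
FindOfficialTranslation can be backed by a glossary built only from ingested official documents. The shape below is an assumption:

// Illustrative: the glossary is populated verbatim during ingestion,
// never at query time.
class EvidenceSet
{
    private readonly IReadOnlyDictionary<string, string> _glossary;

    public EvidenceSet(IReadOnlyDictionary<string, string> glossary)
        => _glossary = glossary;

    public string? FindOfficialTranslation(string term)
        => _glossary.TryGetValue(term, out var official) ? official : null;
}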


9. Deterministic Ingestion and Indexing

Correct answers require correct data foundations.

Recommended ingestion pattern:

  • explicit _SUCCESS markers,
  • retry with backoff,
  • no partial index visibility.

// Queries may only run against a fully published index.
if (!File.Exists("_SUCCESS"))
    throw new IngestionFailedException();
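
The other two points can be sketched as follows; the staging/live directory layout and retry limits are assumptions:

// Illustrative: build in staging, retry transient failures with backoff,
// and publish with a single directory move so readers never observe
// a partially built index.
async Task IngestAsync(string stagingDir, string liveDir)
{
    for (int attempt = 1; attempt <= 5; attempt++)
    {
        try
        {
            await BuildIndexAsync(stagingDir);  // assumed to write _SUCCESS last
            if (!File.Exists(Path.Combine(stagingDir, "_SUCCESS")))
                throw new IngestionFailedException();

            Directory.Move(stagingDir, liveDir);  // single publish step
            return;
        }
        catch (IOException) when (attempt < 5)
        {
            // Exponential backoff before the next attempt.
            await Task.Delay(TimeSpan.FromSeconds(Math.Pow(2, attempt)));
        }
    }

    throw new IngestionFailedException();
}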

This ensures queries are never answered against incomplete data.


10. Testing for Impossibility, Not Possibility

Tests should assert what must never happen.

Example guardrail test:

[Fact]
public void Must_Not_Invent_Date_When_Missing()
{
    var answer = RunQuery("What is the date?", tableWithoutDate);

    Assert.Equal(
        "The provided table does not specify the date.",
        answer);
}

This locks behavior across model upgrades.


11. Key Takeaway

For regulatory and policy-driven AI systems:

The goal is not to maximize helpfulness,
but to minimize the possibility of being wrong.

This requires:

  • shifting authority from the model to the system,
  • treating absence of data as a valid outcome,
  • enforcing determinism at every stage.

This is not conversational AI.
This is engineering for trust.

That’s all folks!

Cheers!
Gašper Rupnik

{End.}
