When teams first introduce a Microsoft 365 declarative agent into an existing system, the goal is usually straightforward: expose backend capabilities through a Copilot-friendly interface.
That’s useful—but not yet transformative.
The real shift happens when the agent stops being just a thin wrapper over an API and starts driving interaction based on grounded evidence. Instead of returning only an answer, the system begins suggesting the next valid step—and does so deterministically, based on real legal sources.
This article walks through that transition from an engineering perspective.
The problem with “one-shot” Copilot interactions
A typical declarative agent interaction looks like this:
- The user asks a question
- The backend returns an answer
- Sources are displayed
- The conversation stops
At that point, the burden shifts back to the user: What should I ask next?
In domains like legal or policy workflows, this is a serious limitation. The system already has the context, the sources, and the structure—but the interaction model doesn’t leverage it.
The challenge becomes:
How do we guide the user forward without letting the model improvise or hallucinate the next step?
Design goals
The solution was built around a few strict constraints:
- Only generate follow-up prompts when they can be grounded in real legal sources
- Keep the API contract stable (no Copilot-specific hacks)
- Make prompts directly actionable in the UI
- Avoid redundant or repeated suggestions
This leads to a key principle:
The system should not invent the next step. It should derive it from evidence.
Architecture overview
The implementation builds on top of an existing “rich answer” pipeline.
High-level flow:
- Backend returns an answer (text + sources)
- A parser extracts structured data from the response
- Source metadata is analyzed
- A prompt builder evaluates whether grounded suggestions can be generated
- If conditions are met → prompts are emitted
- UI renders them as clickable actions
This keeps the system data-first:
- Backend = logic + structure
- UI = rendering only
No hidden heuristics in the frontend.
Stable API contract
Instead of introducing a new response shape, the system extends the existing one:
```csharp
public sealed record CopilotSuggestedPrompt(string Text, string Kind);

public sealed record CopilotRichAnswerResponse(
    string Answer,
    CopilotSectionReference? SectionReference,
    IReadOnlyList<CopilotSource> Sources,
    IReadOnlyList<CopilotSuggestedPrompt> SuggestedPrompts);
```
This is important for a few reasons:
- Works for Copilot, plugins, and tests alike
- Keeps responsibilities clean
- Enables testability without UI
The endpoint remains standard and connector-friendly, while simply returning richer structured data.
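Because the extension is just another property on the record, existing consumers keep working and simply see one extra array in the JSON. A minimal sketch of what that looks like on the wire (the sample prompt text is invented for illustration):

```csharp
using System;
using System.Text.Json;

// Illustrative sketch: the extended record serializes as ordinary JSON,
// so connectors that ignore unknown properties are unaffected.
public sealed record CopilotSuggestedPrompt(string Text, string Kind);

public static class ContractDemo
{
    public static string Serialize()
    {
        var prompts = new[]
        {
            new CopilotSuggestedPrompt("What does food additive mean?", "defined_term")
        };
        // camelCase matches the typical ASP.NET Core default for JSON APIs
        return JsonSerializer.Serialize(prompts,
            new JsonSerializerOptions { PropertyNamingPolicy = JsonNamingPolicy.CamelCase });
    }
}
```

Plugins and tests can deserialize the same payload without any Copilot-specific handling.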
Parse once, reuse everywhere
A subtle but important design decision:
Parse the answer once—and use that structured result everywhere.
```csharp
var parsed = ParseSourcesWithMeta(sourcesBlock);
var sources = parsed.Select(p => p.Source).ToList();
var sectionReference = BuildSectionReferenceFromSources(parsed);
var suggestedPrompts = await CopilotSuggestedPromptBuilder.BuildAsync(
    answer,
    originalQuestion,
    sectionReference?.Celex,
    parsed.Select(p => p.Lang).ToList(),
    getExtractionResultAsync,
    ct);
```
This ensures:
- No duplicate parsing logic
- No UI-specific interpretation layer
- Full consistency between answer, sources, and suggestions
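The article does not show the internals of `ParseSourcesWithMeta`, but the idea can be sketched as a single pass that turns the raw sources block into structured records. The line format below (`celex|lang|title`) is purely an assumption for illustration:

```csharp
using System;
using System.Linq;

// Hypothetical sketch of a "parse once" step. The real sources-block format
// is not shown in the article; "celex|lang|title" per line is assumed here.
public sealed record ParsedSource(string Celex, string Lang, string Title);

public static class SourceParser
{
    public static ParsedSource[] ParseSourcesWithMeta(string sourcesBlock) =>
        sourcesBlock
            .Split('\n', StringSplitOptions.RemoveEmptyEntries | StringSplitOptions.TrimEntries)
            .Select(line => line.Split('|'))
            .Where(parts => parts.Length == 3)
            .Select(parts => new ParsedSource(parts[0].Trim(), parts[1].Trim(), parts[2].Trim()))
            .ToArray();
}
```

Everything downstream (section reference, languages, prompt building) then reads from this one structured result instead of re-parsing text.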
Grounding follow-up prompts in legal evidence
Here’s the core innovation.
Instead of asking an LLM:
“What’s a good next question?”
The system asks:
- Is this answer tied to a known regulation?
- Does it include a defined legal term?
- Can that term be traced back to a source?
- Is the user already asking about it?
Only if all checks pass → generate a prompt.
```csharp
if (!IsRegulation(celex))
    return Array.Empty<CopilotSuggestedPrompt>();

var extraction = await getExtractionResultAsync(lang, ct);
if (!extraction.Success || extraction.Terms.Count == 0)
    return Array.Empty<CopilotSuggestedPrompt>();

var matched = FindMatchedTerms(answerText, extraction.Terms);
if (matched.Count == 0)
    return Array.Empty<CopilotSuggestedPrompt>();
```
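The `FindMatchedTerms` step is the evidence link: a term only qualifies if it actually appears in the answer. A minimal sketch of that check (the `DefinedTerm` shape is assumed, not copied from the original code):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch of FindMatchedTerms: a defined term is "matched"
// only if its canonical form occurs verbatim (case-insensitively) in the
// answer text, so no prompt can reference a concept the answer never used.
public sealed record DefinedTerm(string Canonical);

public static class TermMatcher
{
    public static List<DefinedTerm> FindMatchedTerms(
        string answerText,
        IReadOnlyList<DefinedTerm> terms) =>
        terms
            .Where(t => answerText.Contains(t.Canonical, StringComparison.OrdinalIgnoreCase))
            .ToList();
}
```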
Generated prompt example:
```csharp
var text = lang == "SL"
    ? $"Kaj pomeni \"{canonical}\" po členu 3 Uredbe (ES) št. 1333/2008?"
    : $"What does \"{canonical}\" mean under Article 3 of Regulation (EC) No 1333/2008?";
```
This is not a suggestion.
It is a validated next step.
Extracting legal terms from the source
Instead of maintaining a static list of definitions, the system extracts them directly from EUR-Lex HTML.
```csharp
private const string ConsolidatedExternalId = "02008R1333-20250731";

var html = await loadHtmlAsync(lang, ct);
if (string.IsNullOrWhiteSpace(html))
{
    return new ExtractionResult(
        Success: false,
        FailureReason: "source_file_not_found",
        Terms: Array.Empty<DefinedTerm>());
}
```
Extraction is intentionally strict and regex-based:
```csharp
private static readonly Regex EnDefinitionRegex = new(
    "^\\s*[\"“](?<term>[^\"”]{3,180})[\"”]\\s+(?:means|shall\\s+mean)",
    RegexOptions.Compiled | RegexOptions.IgnoreCase);
```
Why this matters:
- No manual drift from legal sources
- Transparent failure modes
- Fully deterministic behavior
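To make the regex step concrete, here is a sketch of applying it line by line to the definitions article. The sample input is invented; in the real pipeline the text comes from the EUR-Lex HTML after markup is stripped:

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Sketch of the strict, deterministic extraction step: only lines that
// match the definition pattern yield a term; everything else is ignored.
public static class DefinitionExtractor
{
    private static readonly Regex EnDefinitionRegex = new(
        "^\\s*[\"“](?<term>[^\"”]{3,180})[\"”]\\s+(?:means|shall\\s+mean)",
        RegexOptions.Compiled | RegexOptions.IgnoreCase);

    public static string[] ExtractTerms(string articleText) =>
        articleText
            .Split('\n')
            .Select(line => EnDefinitionRegex.Match(line))
            .Where(m => m.Success)
            .Select(m => m.Groups["term"].Value)
            .ToArray();
}
```

If the pattern does not match, nothing is extracted and no prompt is generated, which is exactly the transparent failure mode the design calls for.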
UI: rendering instead of reasoning
The frontend (Adaptive Card) simply renders what it receives:
```json
{
  "type": "Action.Submit",
  "title": "${suggestedPrompts[0].text}",
  "data": {
    "msteams": {
      "type": "imBack",
      "value": "${suggestedPrompts[0].text}"
    }
  }
}
```
This is critical:
The UI does not generate logic. It only exposes it.
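When more than one prompt can be emitted, Adaptive Card templating can expand the whole array instead of indexing a single element. The full card layout is not shown in the article, so the fragment below is only a sketch of one way to bind the list with `$data`:

```json
{
  "type": "ActionSet",
  "$data": "${suggestedPrompts}",
  "actions": [
    {
      "type": "Action.Submit",
      "title": "${text}",
      "data": { "msteams": { "type": "imBack", "value": "${text}" } }
    }
  ]
}
```

Either way, the card stays a pure projection of the backend payload.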
Why this works better than generic suggestions
Most AI-driven “next prompt” systems fail in regulated domains because they:
- Suggest plausible but unsupported questions
- Repeat user intent
- Drift into unrelated legal areas
- Change language unexpectedly
This implementation avoids that by enforcing strict conditions:
A prompt exists only if:
- The source corpus is known
- The concept exists in the answer
- The concept is extracted from source text
- The language is deterministic
- The prompt is not redundant
That makes it production-safe.
Tests matter more than the feature
The behavior is locked with unit tests:
```csharp
rich.SuggestedPrompts.Should().ContainSingle(p =>
    p.Kind == "defined_term" &&
    p.Text == "Kaj pomeni \"živilo z zmanjšano energijsko vrednostjo\" po členu 3 Uredbe (ES) št. 1333/2008?");
```
And equally important:
- If the user already asks for a definition → no prompt
This avoids the most common UX failure: repetition.
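One way to implement that guard is a simple cue check on the user's question before any definition prompt is built. The cue list below is an assumption for illustration, not the original implementation:

```csharp
using System;

// Hypothetical sketch of the redundancy guard: if the question already reads
// like a definition request, the "what does X mean" prompt is suppressed.
public static class RedundancyGuard
{
    private static readonly string[] DefinitionCues =
    {
        "what does", "what is the definition", "define", "kaj pomeni"
    };

    public static bool IsAlreadyDefinitionQuestion(string question)
    {
        foreach (var cue in DefinitionCues)
        {
            if (question.Contains(cue, StringComparison.OrdinalIgnoreCase))
                return true;
        }
        return false;
    }
}
```

Because the check is deterministic, the "no repetition" behavior can be locked down with unit tests just like the positive case.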
What actually changed
From a user perspective:
- Answers still look the same
- Sources are still visible
- But now → next steps are clickable
From an engineering perspective:
- Backend owns interaction logic
- UI stays simple
- Legal sources drive the flow
- The system becomes guided, not just reactive
Key takeaways
- Declarative agents become powerful when they return interaction-ready data, not just text
- In regulated domains, prompts must be source-grounded, not generated freely
- Stable API contracts enable safe evolution
- UI should render decisions, not make them
Final thought
The biggest improvement didn’t come from making the system more generative.
It came from making it more constrained.
The strongest Copilot experiences are built by limiting model freedom and maximizing the evidence path.
That’s all folks!
Cheers!
Gašper Rupnik
