Bank Cut Compliance Lookups from Hours to Minutes
Agentic compliance platform gives BFSI teams faster regulatory answers with live citations, traceability, and audit-ready workflows.
How a Tier-1 Bank Cut Compliance Lookups from Hours to Minutes - Without Trading Off Auditability
Our Solution
https://cdn.sanity.io/images/qdztmwl3/production/0c465b2c4b4ea0e022e00b3dda27ec939908fbac-1920x1080.png
Executive Summary
A Tier-1 BFSI enterprise asked us a deceptively simple question: can AI handle routine compliance lookups without creating regulatory exposure? Their compliance teams were spending hours navigating regulations, amendments, and internal policy interpretations to answer everyday product-team questions defensibly. GenAI Protos designed and shipped an agentic AI Compliance Intelligence Platform that interprets natural-language questions, plans a retrieval strategy across authoritative regulatory sources in real time, and returns grounded, citation-bound answers. The result compressed hours of manual cross-referencing into minutes - with the audit trail that second-line-of-defence teams and regulators expect.
Challenges
Volume and complexity - thousands of regulations, directives, and amendments to track and interpret, most of which interlock and reference each other.
Scattered sources
Manual research - compliance teams were spending hours cross-referencing legal texts by hand, with five browser tabs open to answer one product-team question.
Risk of oversight - missing a regulatory update, or misreading a clause, could lead to non-compliance penalties and reputational damage. A keyword search was not the answer. The institution needed a system that could reason across documents, and prove it had reasoned correctly.
The Solution: An Agentic Compliance Intelligence Platform
We designed an AI-powered system that crawls, searches, and queries the institution’s authoritative regulatory sources in real time to deliver precise, cited answers in plain language. Six capabilities define what the platform does:
Natural Language Q&A - analysts ask compliance questions in plain language and get grounded, cited answers.
Live Source Access - real-time crawl and query of authoritative regulator endpoints; no stale data, always current.
Agentic Reasoning - multi-agent orchestration with real-time search, planning, and self-verification.
Citation Engine - every answer linked to the exact article, paragraph, or annex in the source regulation.
Compliance-Aligned - region-appropriate data residency, audit trails, and full alignment with applicable data-protection regulation.
Custom Tooling - purpose-built tools that query official sources in real time to fetch and reason over live legal content.
How It Works - End-to-End Flow
A query travels through six steps, from user input to cited answer:
1. User Query - a compliance analyst submits a question in natural language through the web interface.
2. Intent & Routing - a planning agent classifies intent and decomposes the query, deciding which authoritative sources to consult and in what order.
3. Live Crawl & Search - retrieval agents execute real-time calls against authoritative source endpoints; fetched evidence lands in shared workflow state.
4. Reasoning - the underlying LLM reasons over the live passages and produces a structured draft answer with each clause tagged to its evidence span.
5. Verification - a self-check verifier agent validates every citation against its source passage and flags any uncertainty. Failed clauses are rewritten or escalated.
6. Cited Response - the verified answer is returned with direct links to the official articles in the source regulation.
The pipeline runs as a deterministic, step-based workflow rather than a free-form agent loop. Compliance Q&A is a repeatable process, and step-based execution produces auditable checkpoints that open-ended ReAct-style traces cannot.
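To make the idea of a step-based workflow with auditable checkpoints concrete, here is a minimal sketch in plain Python. It is not the platform's code and does not use Agno; the names `WorkflowState`, `run_workflow`, and the four stub steps are hypothetical stand-ins chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class WorkflowState:
    """Shared state handed from step to step within a single run."""
    query: str
    data: dict = field(default_factory=dict)
    checkpoints: list = field(default_factory=list)


Step = Callable[[WorkflowState], None]


def run_workflow(state: WorkflowState, steps: list[tuple[str, Step]]) -> WorkflowState:
    """Run the steps in a fixed order and record a checkpoint after each one."""
    for name, step in steps:
        step(state)
        # Record which keys exist after the step, so a reviewer can see
        # what each step contributed to the shared state.
        state.checkpoints.append({"step": name, "keys": sorted(state.data)})
    return state


# Hypothetical stub steps; real steps would call agents and retrieval tools.
def plan(state: WorkflowState) -> None:
    state.data["sources"] = ["primary_regulator"]


def retrieve(state: WorkflowState) -> None:
    state.data["passages"] = [{"id": "p1", "text": "Example passage text."}]


def reason(state: WorkflowState) -> None:
    state.data["draft"] = [{"clause": "Example clause.", "evidence": "p1"}]


def verify(state: WorkflowState) -> None:
    known = {p["id"] for p in state.data["passages"]}
    # A draft passes only if every clause cites a passage that was retrieved.
    state.data["verified"] = all(c["evidence"] in known for c in state.data["draft"])


STEPS = [
    ("intent_routing", plan),
    ("live_retrieval", retrieve),
    ("reasoning", reason),
    ("verification", verify),
]

result = run_workflow(WorkflowState(query="Example question"), STEPS)
```

Because the step order is fixed in `STEPS`, every run yields the same sequence of checkpoints, which is the property an audit reviewer can rely on.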
Inside the Architecture: Decisions That Mattered
A handful of engineering decisions defined the system.
Framework choice
We built on Agno after evaluating LangGraph, CrewAI, and AutoGen. All of them can be used to build deterministic, step-based workflows; the differentiator for us was that Agno is lightweight and fast. Its small runtime footprint and low per-step overhead mattered in a workload where every query already pays the latency cost of live retrieval plus a multi-step reasoning loop, and where the platform had to scale across many concurrent compliance officers without ballooning infra spend. Agno also ships with a production runtime out of the box (AgentOS), which kept the team focused on the compliance problem rather than rebuilding scaffolding.
Tools, narrow and typed
Five Pydantic-typed tools wrap the retrieval surface: a primary retrieval tool against authoritative regulator endpoints; a secondary source tool for supplementary guidance; a section / article lookup tool for resolving a specific provision inside a regulation; a cross-reference tool that follows citations between documents; and a date / version resolution tool that ensures the agent reasons over the version of the rule in force on the relevant date. Narrow, well-typed tools produced cleaner audit trails than any omnibus search tool we prototyped.
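As an illustration of what a narrow, typed tool can look like, the sketch below models the date / version resolution idea. The platform's tools are Pydantic-typed; this example swaps in standard-library dataclasses so it runs without dependencies. The names `VersionLookup`, `RegulationVersion`, and `resolve_version`, and the sample data, are assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class VersionLookup:
    """Typed input: which regulation, and the date the question is about."""
    regulation_id: str
    as_of: date

    def __post_init__(self) -> None:
        if not self.regulation_id.strip():
            raise ValueError("regulation_id must not be empty")
        if not isinstance(self.as_of, date):
            raise TypeError("as_of must be a datetime.date")


@dataclass(frozen=True)
class RegulationVersion:
    regulation_id: str
    version: str
    effective_from: date


def resolve_version(
    request: VersionLookup, versions: list[RegulationVersion]
) -> Optional[RegulationVersion]:
    """Return the latest version already in force on request.as_of, if any."""
    candidates = [
        v for v in versions
        if v.regulation_id == request.regulation_id
        and v.effective_from <= request.as_of
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda v: v.effective_from)


# Hypothetical sample data.
VERSIONS = [
    RegulationVersion("REG-X", "v1", date(2020, 1, 1)),
    RegulationVersion("REG-X", "v2", date(2023, 6, 1)),
]

in_force = resolve_version(VersionLookup("REG-X", date(2022, 3, 15)), VERSIONS)
```

A tool this narrow leaves little room for ambiguity in a trace: the logged input is one regulation identifier and one date, and the logged output is one version or nothing.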
Memory at two tiers
Working state is shared across the agent team during a single workflow run. Cross-session memory persists what analysts have asked before and the regulations they tend to reason over. Sessions checkpoint to a Postgres store so an interrupted query can resume cleanly, and retrieved passages are cached in session state - important for both latency and rate-limit hygiene against live source endpoints.
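The passage-caching idea can be sketched as follows. This is a simplified, in-memory illustration only: `SessionPassageCache`, `fake_fetch`, and the passage reference format are hypothetical, and the Postgres checkpointing described above is not shown.

```python
from typing import Callable


class SessionPassageCache:
    """Caches retrieved passages for one session to avoid repeat live calls."""

    def __init__(self, fetch: Callable[[str], str]) -> None:
        self._fetch = fetch
        self._passages: dict[str, str] = {}
        self.live_calls = 0  # how many times the live source was actually hit

    def get(self, passage_ref: str) -> str:
        if passage_ref not in self._passages:
            # Cache miss: this is the only path that touches the live endpoint.
            self.live_calls += 1
            self._passages[passage_ref] = self._fetch(passage_ref)
        return self._passages[passage_ref]

    def snapshot(self) -> dict[str, str]:
        """Return a plain-dict copy that could be persisted as a checkpoint."""
        return dict(self._passages)


def fake_fetch(passage_ref: str) -> str:
    # Stand-in for a real retrieval tool call.
    return f"text of {passage_ref}"


cache = SessionPassageCache(fake_fetch)
first = cache.get("REG-X/article-5")
second = cache.get("REG-X/article-5")  # served from session state
```

Counting live calls separately from lookups makes the rate-limit benefit visible: two lookups of the same passage cost one call to the source.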
Guardrails and hallucination controls
This is where the project lived or died. A post-execution check on the composer rejects any clause without an attached evidence span - no span, no clause. A confidence-scoring guardrail labels low-confidence answers “needs human review” rather than answering with false confidence. When authoritative evidence is missing, a human-in-the-loop step routes the query to a named senior analyst rather than improvising. Input-side guardrails (PII detection, prompt-injection defence) run before the planner ever sees the query.
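The output-side guardrails can be illustrated with a short sketch. It assumes a draft answer arrives as a list of clauses plus a single confidence score; the `Clause` type, the status labels, and the `REVIEW_THRESHOLD` value are hypothetical and not taken from the platform.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Clause:
    text: str
    evidence_span: Optional[str]  # id of the supporting passage, or None


REVIEW_THRESHOLD = 0.7  # hypothetical cut-off for this example


def apply_guardrails(clauses: list[Clause], confidence: float) -> dict:
    """Apply 'no span, no clause', then decide how the answer is released."""
    kept = [c for c in clauses if c.evidence_span]
    rejected = [c for c in clauses if not c.evidence_span]

    if not kept:
        # No authoritative evidence at all: hand over to a human analyst.
        status = "escalate_to_analyst"
    elif confidence < REVIEW_THRESHOLD:
        status = "needs_human_review"
    else:
        status = "answer"
    return {"status": status, "clauses": kept, "rejected": rejected}


draft = [
    Clause("Supported statement.", "p1"),
    Clause("Unsupported statement.", None),
]
outcome = apply_guardrails(draft, confidence=0.9)
```

The ordering is deliberate in this sketch: the evidence check runs before the confidence check, so an answer with no supporting spans is escalated regardless of how confident the model claims to be.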
Observability and the AI gateway
Every LLM call routes through Portkey, our AI gateway and observability layer, which captures full request/response traces, model routing decisions, retries, and cost telemetry per query. Combined with Agno’s workflow-level traces over tool calls, retrieved passages, and verification outcomes, the result is a per-query record of every reasoning step the system took - usable not just by engineers debugging, but by audit and second-line teams reviewing how an answer was derived.
Why Agentic AI Belongs in BFSI Compliance
In a regulated industry, the question is not “can the model answer this?” - it is “can the institution defend the answer?” That reframes the entire technology choice.
Agentic systems win in compliance for four reasons. Multi-hop reasoning is native - the agent plans across documents, follows citations, and resolves amendments where a flat pipeline cannot. Auditability is built in - every agent trace is a contemporaneous record of how an answer was derived, exactly the artifact regulators and internal audit want. Tool-use transparency means the agent’s actions are not a black box: queries, passages, and evidence are all logged and inspectable. And degradation is graceful - the agent escalates when uncertain rather than hallucinating with confidence. In a consumer app, hallucination is a UX problem. In compliance, it is a regulatory exposure.
The bet underneath the architecture: in regulated industries, the most defensible AI is the one that can show its work - and an agent’s trace is that work.
What We Learned
Workflows beat free-form agent loops for regulated work. Determinism, replayability, and step-level traces matter more than agent autonomy. We started free-form and moved deliberately to a step-based workflow. The system got more reliable and far easier to audit.
Tool boundaries matter more than tool count. Narrow, typed tools produced cleaner traces and more predictable behaviour than any omnibus search tool we prototyped.
Observability is non-negotiable. Without per-query agent traces, debugging a wrong answer is archaeology. Tracing infrastructure was part of the core build, not a nice-to-have.
Evals belong in the build, not after it. Accuracy and reliability evals from week one - LLM-as-a-judge on accuracy, tool-call verification on reliability - caught regressions long before UAT.
Live retrieval beats pre-indexed corpora for live regulation. An index is stale the moment a new amendment publishes. For compliance, real-time grounded retrieval is the architecturally honest choice.
HITL first, autonomy second. Designing the escalation path early was more valuable than chasing full autonomy.
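One plausible reading of the tool-call verification mentioned under the evals lesson is a check that a query's trace contains every tool call the test case expects. The sketch below assumes a trace is a list of event dictionaries with `type` and `tool` keys; that shape, the tool names, and `verify_tool_calls` itself are hypothetical.

```python
def verify_tool_calls(trace: list[dict], expected_tools: list[str]) -> list[str]:
    """Return the expected tools that never appear as a tool call in the trace."""
    called = {
        event["tool"]
        for event in trace
        if event.get("type") == "tool_call"
    }
    # Preserve the order of expected_tools so failures read predictably.
    return [tool for tool in expected_tools if tool not in called]


# Hypothetical trace for a query that should have resolved the rule version.
TRACE = [
    {"type": "tool_call", "tool": "primary_retrieval"},
    {"type": "llm_call", "model": "example-model"},
    {"type": "tool_call", "tool": "cross_reference"},
]

missing = verify_tool_calls(
    TRACE, ["primary_retrieval", "version_resolution", "cross_reference"]
)
```

A non-empty `missing` list would fail the eval case, which is the kind of regression signal that can run in CI well before UAT.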
https://cdn.sanity.io/images/qdztmwl3/production/8bf594fd94cc77e7c1e486762f3832d4c795063c-1920x1080.png
Four Layer Hallucination Guardrails
https://cdn.sanity.io/images/qdztmwl3/production/e814962172dbbbbb04551811fedcf37734993006-1920x1080.png
What Actually Changed For the Compliance Team
Key Benefits
The platform replaced the slowest and most error-prone parts of the compliance workflow with a system that is faster, more consistent, and - critically - auditable.
Query resolution time - collapsed from hours of manual cross-referencing to minutes for routine queries.
Compliance team load - fell substantially for first-line interpretation queries, freeing senior analysts for genuinely novel work.
Accuracy with traceability - every answer is bound to live source passages, so accuracy is a property the system can demonstrate clause by clause.
Audit wins - full agent traces are captured per query, turning the AI’s reasoning into reviewable artifacts for the second line of defence.
The platform is built across seven layers:
Real-Time Access - authoritative regulator endpoints, custom crawl & query tooling, document parsers, metadata extraction.
Agentic Layer - multi-agent orchestration (on Agno): intent classification, query planning, and self-verification agents, coordinated through workflows and teams.
LLM Engine - enterprise LLM (Azure OpenAI / Anthropic Claude / open-source as configured) optimised for legal reasoning.
Citation Engine - source attribution, article-level linking, confidence scoring, and answer versioning.
Caching Layer - smart caching for frequently queried regulations, TTL-based invalidation, graceful fallback when source endpoints are degraded.
Interface - web-based Q&A dashboard with inline citation viewer, search history, saved queries, alerts, and a role-based admin panel.
Security & Compliance - region-appropriate data residency, SSO/RBAC, immutable audit logging, encryption at rest and in transit.
Ready to Deploy Agentic AI Where Stakes Are Real?
If you lead engineering, compliance, or AI strategy at a bank, NBFC, asset manager, or insurer, the highest-leverage place to deploy agentic AI is where audit trails are mandatory and answers must be defensible. GenAI Protos builds production-grade agentic AI for regulated industries - grounded by design, observable end-to-end, and engineered for BFSI-grade auditability. We have shipped it for one of the most established banking groups in our market.
Talk to us at genaiprotos.com about a scoped pilot for your compliance, risk, or regulatory operations function.
Deploy agentic regulatory Q&A that delivers grounded answers, live citations, and traceable workflows for BFSI teams.
Book a Demo
https://calendly.com/contact-genaiprotos/3xde
