Built by Toystack AI

Behavioral Health Claims Audit Platform

An end-to-end agentic system that ingests SOP documents, generates audit workflows, binds rules, and executes claims decisions — built for Wipro & UHC Optum.

Anthropic Claude Sonnet

OpenAI GPT-4o

LangGraph

Neo4j Knowledge Graph

Toystack AI built for Wipro & UHC Optum — Behavioral Health Division

The Problem

BH Claims Auditing Is Fundamentally Broken

01 · Locked in Documents

SOPs live in PDFs and Word docs

Every audit rule starts as unstructured prose. Systems cannot read, execute, or query them.

02 · Post-Adjudication Gap

Claims are already paid or denied

Audits run after adjudication. Recovery is slow, expensive, and rarely complete.

03 · Manual Cross-Check

Auditors work claim by claim

No two auditors apply the same rule the same way. Edge cases are missed. It doesn't scale.

04 · Zero Explainability

Decisions live in spreadsheets

No citation trail, no logic chain, no confidence score. Appeals and compliance audits have nothing to show.

Stage 1 — SOP Ingestion

We Read Every SOP Automatically

A 122-agent LangGraph pipeline crawls HTML, DOCX, XLSX, and PDF documents breadth-first, enriches them with LLMs, and writes the results to a knowledge graph backed by multiple stores.

PDF / HTML: Source Doc
› Parse: DOM + Layout
› Enrich: Claude + GPT-4o
› Context Graph: PDF Sections
› Validate: IR Checker
› Neo4j: Knowledge Graph
› Postgres: Structured Rules
› MongoDB: Raw Documents
› Redis: Hot Cache
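
To make the breadth-first crawl concrete, here is a minimal sketch of the traversal order only. It does not reproduce the 122-agent LangGraph pipeline; `bfs_crawl`, `get_links`, and the `LINKS` table are hypothetical names standing in for real SOP documents and their outbound links.

```python
from collections import deque


def bfs_crawl(start, get_links, max_docs=100):
    """Visit documents breadth-first and return them in visit order."""
    seen = {start}
    queue = deque([start])
    order = []
    while queue and len(order) < max_docs:
        doc = queue.popleft()
        order.append(doc)
        for link in get_links(doc):
            if link not in seen:  # never enqueue the same document twice
                seen.add(link)
                queue.append(link)
    return order


# Hypothetical link table: each document lists the documents it references.
LINKS = {
    "index.html": ["billing.pdf", "auth.docx"],
    "billing.pdf": ["codes.xlsx"],
    "auth.docx": ["codes.xlsx"],
    "codes.xlsx": [],
}

order = bfs_crawl("index.html", lambda doc: LINKS.get(doc, []))
```

Documents closest to the entry point are processed first, and a document referenced from several places is visited once.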

Stage 2 — AI Workflow Generation

SOP PDF → Deployable Workflow in Minutes

01

ir_maker Agent

Claude Sonnet reads the ingested SOP and produces a typed Intermediate Representation: rules, navigation, conditions, and action nodes.

RuleNode · Subrule · Navigation
SopIR · ActionNode · Condition

02

ir_checker Agent

A second LLM pass validates the IR against the Pydantic schema and checks routing completeness, condition coverage, and logical consistency.

Pydantic v2 validation
goto · fallback · branch checks
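
As an illustration of a typed IR and a routing check, the sketch below uses standard-library dataclasses as a stand-in for the Pydantic v2 models named above. The class names `RuleNode`, `Navigation`, `ActionNode`, and `SopIR` come from the deck; every field name and the `check_routing` helper are assumptions, and only the deterministic goto/fallback check is shown, not the LLM pass.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Navigation:
    goto: Optional[str] = None      # next node when the condition holds (assumed field)
    fallback: Optional[str] = None  # next node otherwise (assumed field)


@dataclass
class RuleNode:
    id: str
    condition: str
    navigation: Navigation = field(default_factory=Navigation)


@dataclass
class ActionNode:
    id: str
    action: str


@dataclass
class SopIR:
    rules: list = field(default_factory=list)
    actions: list = field(default_factory=list)


def check_routing(ir):
    """Return a list of routing problems; an empty list means the IR passed."""
    known = {r.id for r in ir.rules} | {a.id for a in ir.actions}
    problems = []
    for rule in ir.rules:
        for label in ("goto", "fallback"):
            target = getattr(rule.navigation, label)
            if target is None:
                problems.append(f"{rule.id}: missing {label}")
            elif target not in known:
                problems.append(f"{rule.id}: {label} -> unknown node {target}")
    return problems


ir = SopIR(
    rules=[
        RuleNode("r1", "claim.units > 8", Navigation(goto="deny", fallback="r2")),
        RuleNode("r2", "claim.auth_missing", Navigation(goto="escalate")),
    ],
    actions=[ActionNode("deny", "DENY"), ActionNode("escalate", "ESCALATE")],
)
problems = check_routing(ir)
```

Here `r2` has no fallback branch, so the check reports it before the IR reaches the canvas.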

03

Workflow Canvas

The validated IR is materialised into a drag-and-drop workflow on the XYFlow canvas. Shapes, lanes, and rule bindings are auto-populated.

React · XYFlow · Django REST
JSONB shape properties

Stage 3 — Workflow Builder

Drag. Drop. Configure.

Auditors assemble claim-audit workflows on a visual canvas — no code, no SQL, no engineering dependency.

Shape Palette

Server-driven catalog: Decision, Action, API Agent, Condition, and Swim-Lane shapes.

Property Inspector

JSONB property_schema drives the inspector form — adding a field requires no frontend deployment.

Atomic Save

A single PUT /workflow/:id/graph/ call persists the entire nested graph to Postgres in one transaction.
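
A schema-driven inspector can be sketched as follows. The layout of `PROPERTY_SCHEMA` is an assumption (the deck does not show the real property_schema format), and `form_fields` and `validate_props` are hypothetical helpers. The point is that both the form and the validation derive from data, so a new field is a data change only.

```python
# Hypothetical property_schema as it might be stored in a JSONB column.
PROPERTY_SCHEMA = {
    "threshold": {"type": "number", "label": "Unit threshold", "required": True},
    "note": {"type": "string", "label": "Auditor note", "required": False},
}

# Python types accepted for each schema type (assumed mapping).
TYPES = {"number": (int, float), "string": (str,)}


def form_fields(schema):
    """Derive inspector form fields from the schema alone."""
    return [
        {"name": name, "label": spec["label"], "widget": spec["type"]}
        for name, spec in schema.items()
    ]


def validate_props(schema, props):
    """Return validation errors for a shape's props against the schema."""
    errors = []
    for name, spec in schema.items():
        if name not in props:
            if spec["required"]:
                errors.append(f"{name}: required")
        elif not isinstance(props[name], TYPES[spec["type"]]):
            errors.append(f"{name}: expected {spec['type']}")
    return errors


fields = form_fields(PROPERTY_SCHEMA)
errors = validate_props(PROPERTY_SCHEMA, {"note": 42})
```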

Data Model

Workflow
└─ WorkArea (swim lane)
   └─ Workbench
      └─ Shape (XYFlow node)
         └─ props: { sop_rules, tool_calls }
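
The hierarchy above can be sketched as nested dataclasses serialised into one JSON body, which is roughly what a single-request graph save would carry. The field names (`work_areas`, `workbenches`, `shapes`) and the sample values are assumptions; the actual Django models and payload shape are not shown in this deck.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Shape:
    id: str
    type: str
    props: dict = field(default_factory=lambda: {"sop_rules": [], "tool_calls": []})


@dataclass
class Workbench:
    name: str
    shapes: list = field(default_factory=list)


@dataclass
class WorkArea:
    name: str
    workbenches: list = field(default_factory=list)


@dataclass
class Workflow:
    id: int
    work_areas: list = field(default_factory=list)


shape = Shape("s1", "Decision", {"sop_rules": ["rule-17"], "tool_calls": ["member_lookup"]})
wf = Workflow(1, [WorkArea("Intake", [Workbench("Eligibility", [shape])])])

# asdict() recurses through nested dataclasses and lists, yielding one document.
payload = json.dumps(asdict(wf))
```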

Stage 4 — Rule Binding

Every Shape. Every Rule. Precisely Bound.

Each workflow shape exposes an "attachable" picker — auditors select exactly which SOP rules and API agents apply at that decision point.

Attachable API

Per-node rule + tool picker

GET /workflow/:id/attachable/

Returns filtered SOP rules and registered tool calls scoped to that node's shape type.
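
The scoping logic behind such an endpoint might look like the sketch below. The `RULES` and `TOOLS` records, the `shape_types` field, and the `attachable` function are all hypothetical; the real endpoint reads from Neo4j and the Tool Registry rather than in-memory lists.

```python
# Hypothetical catalog entries; each lists the shape types it may attach to.
RULES = [
    {"id": "rule-17", "title": "Prior auth required", "shape_types": ["Decision", "Condition"]},
    {"id": "rule-22", "title": "Unit cap per day", "shape_types": ["Decision"]},
]
TOOLS = [
    {"name": "member_lookup", "shape_types": ["API Agent"]},
    {"name": "auth_status", "shape_types": ["API Agent", "Decision"]},
]


def attachable(shape_type):
    """Return the rule ids and tool names that may be bound to this shape type."""
    return {
        "sop_rules": [r["id"] for r in RULES if shape_type in r["shape_types"]],
        "tool_calls": [t["name"] for t in TOOLS if shape_type in t["shape_types"]],
    }


decision_options = attachable("Decision")
condition_options = attachable("Condition")
```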

SOP Rules

Direct Neo4j linkage

Rules ingested from BH SOPs are stored as graph nodes. Binding creates a direct edge between the workflow shape and the rule node in Neo4j.

API Agents

Runtime tool registry

HTTP endpoints registered in the Tool Registry are bound to shapes via tool_calls JSONB — invoked automatically during execution.

Stage 5 — Execution Engine

Claim Goes In. Decision Comes Out.

A 7-node LangGraph pipeline walks the bound workflow, evaluates each rule against the claim, calls external APIs, and produces a fully cited decision.

Input: Claim (adjudicated payload)
Step 1: Load (workflow + rules)
Step 2: Walk (node by node)
Step 3: Call (external APIs)
Step 4: Cite (SOP evidence)
Output: Decision (with confidence)

Deny × · Allow ✓ · Pending ⏸ · Escalate ↑
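
A stripped-down version of the walk-and-cite loop is sketched below in plain Python. It is not the 7-node LangGraph pipeline: external API calls and the confidence score are omitted because the deck does not say how they are computed, and the node layout, rule ids, and `run_workflow` are assumptions for illustration.

```python
def run_workflow(nodes, start, claim, max_steps=50):
    """Walk nodes from `start`, returning a decision plus the rules evaluated."""
    citations = []
    current = start
    for _ in range(max_steps):  # guard against accidental cycles
        node = nodes[current]
        if "decision" in node:  # terminal node reached
            return {"decision": node["decision"], "citations": citations}
        matched = node["test"](claim)
        citations.append({"rule": node["rule"], "matched": matched})
        current = node["goto"] if matched else node["fallback"]
    raise RuntimeError("workflow did not reach a decision")


# Hypothetical bound workflow: two rule checks and three terminal outcomes.
NODES = {
    "check_auth": {"rule": "rule-17", "test": lambda c: not c["has_prior_auth"],
                   "goto": "deny", "fallback": "check_units"},
    "check_units": {"rule": "rule-22", "test": lambda c: c["units"] > 8,
                    "goto": "escalate", "fallback": "allow"},
    "deny": {"decision": "Deny"},
    "escalate": {"decision": "Escalate"},
    "allow": {"decision": "Allow"},
}

result = run_workflow(NODES, "check_auth", {"has_prior_auth": True, "units": 12})
```

Every rule evaluated along the path is recorded, so the final decision carries its own evidence trail.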

Stage 6 — HITL Review

Humans Stay In Control

The AI proposes. The auditor decides. Every escalated or borderline claim lands in the HITL Review Dashboard for human override.

Review Queue

Prioritised claim list

Sorted by confidence delta, dollar exposure, and SOP rule category.
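
One plausible reading of that ordering is a multi-key sort, sketched below. The field names and the sort directions are assumptions: the most borderline claims (smallest confidence delta) come first, ties go to the larger dollar exposure, then to rule category alphabetically.

```python
# Hypothetical queue entries; field names are illustrative only.
claims = [
    {"id": "C1", "confidence_delta": 0.30, "dollar_exposure": 1200.0, "rule_category": "auth"},
    {"id": "C2", "confidence_delta": 0.05, "dollar_exposure": 300.0, "rule_category": "units"},
    {"id": "C3", "confidence_delta": 0.05, "dollar_exposure": 9000.0, "rule_category": "auth"},
]


def review_order(items):
    """Order claims for review: borderline first, then by exposure, then category."""
    return sorted(
        items,
        key=lambda c: (c["confidence_delta"], -c["dollar_exposure"], c["rule_category"]),
    )


queue_ids = [c["id"] for c in review_order(claims)]
```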

Evidence Panel

Full AI reasoning chain

Each AI decision shows the exact SOP rule, the matching claim field, and the logic path taken.

Override

One-click reversal

An auditor can override any AI decision with a free-text justification. All overrides feed back as training signals.

Audit Trail

Immutable log

Every decision — AI or human — is timestamped and stored for compliance and appeals.

Toystack IP — Model Context Protocol

We Built the MCP Layer

Toystack AI designed and built a unified Model Context Protocol for all tool calls in the execution engine — standardising auth, logging, schema validation, and retry.

What MCP Solves

Every tool call goes through a single, typed protocol layer
Standardised bearer · basic · api_key · custom auth modes
Built-in retry with cross-provider LLM fallback
Results cached in Redis (24h), persisted to Postgres + MongoDB
Each invocation fully logged for debugging and compliance
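
Two of the concerns listed above, auth-mode handling and retry, can be sketched with the standard library alone. This is an illustration under assumptions, not the Toystack protocol layer: `auth_headers`, `call_with_retry`, the credential keyword names, and the default `X-API-Key` header are hypothetical, and the cross-provider LLM fallback, caching, and logging are not shown.

```python
import base64
import time


def auth_headers(mode, **cred):
    """Build HTTP headers for one of the four auth modes."""
    if mode == "bearer":
        return {"Authorization": f"Bearer {cred['token']}"}
    if mode == "basic":
        raw = f"{cred['username']}:{cred['password']}".encode()
        return {"Authorization": "Basic " + base64.b64encode(raw).decode()}
    if mode == "api_key":
        # The header name is an assumption; real services differ.
        return {cred.get("header", "X-API-Key"): cred["key"]}
    if mode == "custom":
        return dict(cred["headers"])
    raise ValueError(f"unknown auth mode: {mode}")


def call_with_retry(fn, attempts=3, base_delay=0.0):
    """Call fn(), retrying on any exception with exponential backoff."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            time.sleep(base_delay * (2 ** attempt))


calls = {"n": 0}


def flaky():
    """Fail twice, then succeed, to exercise the retry loop."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient")
    return "ok"


result = call_with_retry(flaky)
```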

Tool Registry

HTTP endpoints registered and versioned in the Tool Registry
Lazy-loaded factory pattern — one broken tool cannot crash the registry
Idempotent upsert keeps DB in sync with code on every deploy
REST endpoint: POST /api/agent-tools/:name/invoke
CLI: api-agent call <url> --bearer sk-xyz
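
The lazy-loaded factory idea can be sketched as follows. `ToolRegistry` and its methods are hypothetical names, not the actual registry code. Tools are registered as factories and built only on first use, so a tool whose construction fails affects only calls to that tool.

```python
class ToolRegistry:
    """Registry that stores factories and builds each tool on first use."""

    def __init__(self):
        self._factories = {}
        self._instances = {}

    def register(self, name, factory):
        self._factories[name] = factory  # nothing is constructed yet

    def get(self, name):
        if name not in self._instances:
            self._instances[name] = self._factories[name]()  # built lazily
        return self._instances[name]

    def invoke(self, name, **kwargs):
        try:
            tool = self.get(name)
        except Exception as exc:
            # A load failure is reported for this tool only.
            return {"ok": False, "error": f"{name} failed to load: {exc}"}
        return {"ok": True, "result": tool(**kwargs)}


def broken_factory():
    raise ImportError("missing dependency")


registry = ToolRegistry()
registry.register("echo", lambda: (lambda **kw: kw))
registry.register("broken", broken_factory)  # registering does not raise

good = registry.invoke("echo", claim_id="C1")
bad = registry.invoke("broken")
```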

Toystack IP — Graph MCP (In Development)

Context at Scale: Graph MCP

Tool-call responses are often large JSON blobs. Stuffing them into LLM context windows is expensive and lossy. Graph MCP stores each response as a Neo4j subgraph, and the LLM queries the graph instead of receiving the full payload.

Before — Raw JSON in Context

Full API response (often 50–200 KB) injected directly into the LLM prompt. Context window fills up. Relevant fields are buried. Costs spike per call.

    { "claim": { "member": {...},
                 "provider": {...}, "codes": [...],
                 "history": [...1400 lines...] } }

VS

After — Neo4j Subgraph Query

Response is stored as a typed graph. The LLM receives only a schema summary and runs Cypher queries to fetch exactly the nodes it needs. Context stays small.

    MATCH (c:Claim)-[:HAS_CODE]->(code)
    WHERE code.value IN $audit_codes
    RETURN code, c.member_id LIMIT 20

Vision

AI Adjudication Is Next

Now — Delivered

Agentic Audit Platform

SOP ingestion, AI workflow generation, rule binding, execution engine, and HITL review — live for BH claims at UHC Optum.

Next — In Development

Graph MCP Context Engine

Replace raw JSON injection with Neo4j subgraph queries. Cut LLM context costs and improve decision accuracy on complex claims.

Future — Roadmap

Pre-Adjudication AI Gating

Move from post-pay audit to pre-pay gate. The execution engine runs before adjudication, blocking non-compliant claims before payment.

Let's Build Together

The Future of Claims Auditing Is Agentic

Built by Toystack AI · Delivered to Wipro × UHC Optum · Behavioral Health Division
Powered by Anthropic Claude · OpenAI GPT-4o · LangGraph · Neo4j
