<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Victor Okefie</title>
    <description>The latest articles on Forem by Victor Okefie (@eaglelucid).</description>
    <link>https://forem.com/eaglelucid</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1825387%2F50500a54-630a-4385-8819-7a9902fa9cd9.jpg</url>
      <title>Forem: Victor Okefie</title>
      <link>https://forem.com/eaglelucid</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/eaglelucid"/>
    <language>en</language>
    <item>
      <title>The Illusion of Data Custody in Legal AI — and the Architecture I Built to Replace It</title>
      <dc:creator>Victor Okefie</dc:creator>
      <pubDate>Wed, 01 Apr 2026 09:33:31 +0000</pubDate>
      <link>https://forem.com/eaglelucid/the-illusion-of-data-custody-in-legal-ai-and-the-architecture-i-built-to-replace-it-3jp8</link>
      <guid>https://forem.com/eaglelucid/the-illusion-of-data-custody-in-legal-ai-and-the-architecture-i-built-to-replace-it-3jp8</guid>
      <description>&lt;p&gt;**There is a moment every legal AI founder eventually has to confront.&lt;/p&gt;

&lt;p&gt;You have built a capable system. The retrieval is good. The citations hold up. The interface is clean. A lawyer uploads a sensitive client document and asks a question. The system answers correctly.&lt;/p&gt;

&lt;p&gt;Then they ask: what happens to this document when I delete it?&lt;br&gt;
And that is where most legal AI products fail quietly.&lt;/p&gt;

&lt;p&gt;Not because the founders were careless. Because they treated data custody as a policy question rather than an architecture question. They added a delete button, wrote a privacy policy, and moved on.&lt;/p&gt;

&lt;p&gt;This article is about what I built instead — and why the distinction between a deletion confirmation and a cryptographic Destruction Receipt matters enormously in legal contexts.**&lt;/p&gt;

&lt;h2&gt;Section 1: What actually happens when you click delete&lt;/h2&gt;

&lt;p&gt;Most AI SaaS platforms handle deletion at the application layer. The record is flagged as deleted. The UI stops showing it. The underlying data — the vector embeddings, the chunked source text, the inference logs — frequently persists on the server for operational or safety-monitoring reasons.&lt;/p&gt;

&lt;p&gt;OpenAI's standard API retains inference logs for 30 days by default. This is not a secret. It is documented. It is reasonable for consumer applications. It is architecturally incompatible with a system holding M&amp;amp;A filings, client privilege documents, or regulatory correspondence.&lt;br&gt;
The problem is not malicious intent. The problem is that "deletion" in these systems was never designed to mean what a lawyer means when they say deletion.&lt;/p&gt;

&lt;p&gt;A lawyer means: gone. Provably gone. Gone in a way I can demonstrate to a regulator if asked.&lt;br&gt;
A standard SaaS confirmation means: removed from your view.&lt;br&gt;
These are not the same thing.&lt;/p&gt;

&lt;h2&gt;Section 2: RLS isolation — enforcing security where it cannot be bypassed&lt;/h2&gt;

&lt;p&gt;Row Level Security is a PostgreSQL feature that enforces access control at the database layer — below the application entirely.&lt;/p&gt;

&lt;p&gt;Most applications enforce access control in the application layer. A user logs in, the application checks their permissions, and the query is run. The problem with this model is that if the application layer is compromised — a bug, a misconfiguration, a session handling error — the isolation fails. The underlying database is a single shared resource.&lt;br&gt;
With RLS, the isolation is enforced by the database itself. Every query is filtered automatically based on the authenticated user's identity. There is no application-layer bypass because the restriction is not in the application.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;-- Enable RLS on the documents table
ALTER TABLE documents ENABLE ROW LEVEL SECURITY;

-- Policy: users can only access their own documents
CREATE POLICY "Users can only access their own documents"
ON documents
FOR ALL
USING (auth.uid() = user_id);
&lt;/code&gt;&lt;/pre&gt;
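&lt;p&gt;A document's chunks and embeddings live in their own tables, so the same guarantee has to extend to them. A minimal sketch of a companion policy keyed through document ownership — the table and column names here are illustrative, not PRISM's actual schema:&lt;/p&gt;

```sql
-- Hypothetical child-table policy: chunks are reachable only when the
-- parent document belongs to the authenticated user.
ALTER TABLE document_chunks ENABLE ROW LEVEL SECURITY;

CREATE POLICY "Users can only access chunks of their own documents"
ON document_chunks
FOR ALL
USING (
  EXISTS (
    SELECT 1 FROM documents d
    WHERE d.id = document_chunks.document_id
      AND auth.uid() = d.user_id
  )
);
```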

&lt;h2&gt;Section 3: Zero Data Retention via Azure OpenAI enterprise infrastructure&lt;/h2&gt;

&lt;p&gt;Standard OpenAI infrastructure retains inference data for abuse monitoring and model improvement purposes. This is the infrastructure most legal AI tools are built on.&lt;/p&gt;

&lt;p&gt;Azure OpenAI, Microsoft's enterprise offering, operates under a fundamentally different contractual model. Zero data retention is the default. Content logging is disabled. Your queries are processed and discarded — not stored, not used for model training, not retained for monitoring.&lt;/p&gt;

&lt;p&gt;This is not a policy distinction. It is a contractual and architectural distinction. Microsoft's enterprise SLA makes commitments that a privacy policy does not.&lt;/p&gt;

&lt;p&gt;The migration in PRISM involved building an abstraction layer that routes inference through Azure while keeping the interface and API calls identical. The user experience is unchanged. What changed is what the infrastructure underneath actually guarantees.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import { AzureOpenAI } from 'openai';

const client = new AzureOpenAI({
  endpoint: process.env.AZURE_OPENAI_ENDPOINT!,
  apiKey: process.env.AZURE_OPENAI_API_KEY!,
  apiVersion: '2024-08-01-preview',
  deployment: process.env.AZURE_OPENAI_DEPLOYMENT!,
});
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The deployment name points to a model instance running under enterprise data handling terms. The rest of the codebase does not change.&lt;/p&gt;
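&lt;p&gt;The routing decision in the abstraction layer can be as small as a provider check. A hedged sketch — the helper name is hypothetical, and the env var names mirror the snippet above rather than PRISM's actual code:&lt;/p&gt;

```typescript
// Illustrative sketch: route inference to the enterprise endpoint whenever
// Azure credentials are configured, otherwise fall back to standard OpenAI.
// Env var names are those used above; resolveProvider is a hypothetical name.
type Env = { AZURE_OPENAI_ENDPOINT?: string; AZURE_OPENAI_API_KEY?: string };

function resolveProvider(env: Env): 'azure' | 'openai' {
  if (env.AZURE_OPENAI_ENDPOINT !== undefined) {
    if (env.AZURE_OPENAI_API_KEY !== undefined) {
      return 'azure';
    }
  }
  return 'openai';
}
```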

&lt;h2&gt;Section 4: The Atomic Purge — destroying all layers simultaneously&lt;/h2&gt;

&lt;p&gt;A document in PRISM exists across multiple data layers: the original PDF, the extracted text chunks, the vector embeddings used for retrieval, and the associated chat history. Standard deletion in most systems touches one or two of these layers. The others linger.&lt;/p&gt;

&lt;p&gt;The Atomic Purge executes a single database transaction that destroys all layers simultaneously. Either everything is deleted, or nothing is. There is no partial deletion state.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;// Assumes a configured Supabase client in scope
async function atomicPurge(documentId: string, userId: string) {
  const { error } = await supabase.rpc('atomic_document_purge', {
    p_document_id: documentId,
    p_user_id: userId
  });

  if (error) throw new Error(`Purge failed: ${error.message}`);
  return generateDestructionReceipt(documentId, userId);
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The stored procedure handles deletion across all tables in sequence within a single transaction. If any step fails, the entire operation rolls back. Nothing is half-deleted.&lt;/p&gt;
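&lt;p&gt;What that procedure might look like — a sketch only, with illustrative table names rather than PRISM's actual schema. A PL/pgSQL function body runs inside a single transaction, so a failure at any step rolls back every delete:&lt;/p&gt;

```sql
-- Hypothetical sketch of the purge procedure. Parameter names match the
-- rpc call above; table names are illustrative.
CREATE OR REPLACE FUNCTION atomic_document_purge(
  p_document_id uuid,
  p_user_id uuid
) RETURNS void AS $$
BEGIN
  DELETE FROM chat_messages   WHERE document_id = p_document_id AND user_id = p_user_id;
  DELETE FROM embeddings      WHERE document_id = p_document_id AND user_id = p_user_id;
  DELETE FROM document_chunks WHERE document_id = p_document_id AND user_id = p_user_id;
  DELETE FROM documents       WHERE id = p_document_id AND user_id = p_user_id;
END;
$$ LANGUAGE plpgsql;
```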

&lt;h2&gt;Section 5: The Destruction Receipt — generating a verifiable audit artifact&lt;/h2&gt;

&lt;p&gt;After the purge completes, PRISM generates a Destruction Receipt: a SHA-256 hash of the document content combined with the deletion timestamp, packaged as a verifiable PDF artifact.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import * as crypto from 'crypto';

// DestructionReceipt is the shape of the receipt object returned below
async function generateDestructionReceipt(
  documentId: string,
  userId: string
): Promise&lt;DestructionReceipt&gt; {
  const timestamp = new Date().toISOString();
  const hash = crypto
    .createHash('sha256')
    .update(`${documentId}:${userId}:${timestamp}`)
    .digest('hex');

  return {
    documentId,
    deletionTimestamp: timestamp,
    sha256Hash: hash,
    verified: true,
    receiptId: `DR-${hash.substring(0, 16).toUpperCase()}`
  };
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The receipt can be independently verified. Given the document ID, user ID, and timestamp, anyone can recompute the hash and confirm the receipt is authentic. This is not a confirmation email. It is an auditable artifact.&lt;/p&gt;
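&lt;p&gt;That verification step is mechanical. A sketch of what an independent verifier could look like — the field names mirror the receipt object above, but this is an illustration, not PRISM's actual verifier:&lt;/p&gt;

```typescript
import * as crypto from 'crypto';

// Hypothetical verification sketch: recompute the hash from the receipt's
// own fields and compare it to the recorded digest. Any tampering with the
// document ID, user ID, or timestamp changes the recomputed hash.
function verifyReceipt(
  receipt: { documentId: string; deletionTimestamp: string; sha256Hash: string },
  userId: string
): boolean {
  const recomputed = crypto
    .createHash('sha256')
    .update(receipt.documentId + ':' + userId + ':' + receipt.deletionTimestamp)
    .digest('hex');
  return recomputed === receipt.sha256Hash;
}
```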

&lt;p&gt;In a legal context, the difference matters. A confirmation email proves that a button was clicked. A cryptographic receipt proves that a specific document, processed by a specific user, was permanently destroyed at a specific moment — and that the receipt itself has not been altered.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Data custody is not a layer you add to a legal AI product.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;It is the foundation you build the product on.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The distinction between a deletion confirmation and a Destruction Receipt seems small in a demo. In a regulatory audit, in a client data incident, in a courtroom — it is not small at all.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Build the receipt before anyone asks for it.&lt;br&gt;
That is what it means to build Left of Bang.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;PRISM v1.1 is live at prism-mu-one.vercel.app&lt;/p&gt;

</description>
      <category>security</category>
      <category>legaltech</category>
      <category>azure</category>
      <category>typescript</category>
    </item>
    <item>
      <title>How I Migrated PRISM to Azure OpenAI and Built Cryptographic Data Deletion in 48 Hours</title>
      <dc:creator>Victor Okefie</dc:creator>
      <pubDate>Thu, 26 Mar 2026 05:00:30 +0000</pubDate>
      <link>https://forem.com/eaglelucid/how-i-migrated-prism-to-azure-openai-and-built-cryptographic-data-deletion-in-48-hours-a64</link>
      <guid>https://forem.com/eaglelucid/how-i-migrated-prism-to-azure-openai-and-built-cryptographic-data-deletion-in-48-hours-a64</guid>
      <description>&lt;p&gt;&lt;strong&gt;After a discovery call with a legal tech consultant&lt;br&gt;
who has spent ten years in the field, one thing&lt;br&gt;
became completely clear: before a lawyer evaluates&lt;br&gt;
capability, they evaluate data custody.&lt;br&gt;
Standard OpenAI infrastructure was not built to&lt;br&gt;
answer the questions a law firm's security team asks.&lt;br&gt;
Azure OpenAI is.&lt;br&gt;
This is how I migrated PRISM in 48 hours —&lt;br&gt;
and what I built on top of the migration.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;Why Azure OpenAI Over Standard OpenAI&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;The contractual difference: zero data retention
is a Microsoft commitment, not an application policy&lt;/li&gt;
&lt;li&gt;Content logging disabled by default&lt;/li&gt;
&lt;li&gt;Regional data residency and what it means for GDPR&lt;/li&gt;
&lt;li&gt;Enterprise SLA and why it matters for legal clients&lt;/li&gt;
&lt;li&gt;The migration process: what changed in the codebase and what stayed identical&lt;/li&gt;
&lt;li&gt;Code snippet: Azure OpenAI client initialisation
vs standard OpenAI client&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;RLS Isolation — Mathematical Impossibility&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;What Row Level Security actually means at the database level&lt;/li&gt;
&lt;li&gt;Why application-layer access controls are insufficient for legal document contexts&lt;/li&gt;
&lt;li&gt;How RLS is implemented in PRISM's PostgreSQL layer&lt;/li&gt;
&lt;li&gt;The key principle: isolation enforced where it cannot be bypassed&lt;/li&gt;
&lt;li&gt;Code snippet: RLS policy implementation for document isolation&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;Cryptographic Deletion and the Destruction Receipt&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;The problem with standard deletion confirmation&lt;/li&gt;
&lt;li&gt;SHA-256 hashing of document content before deletion&lt;/li&gt;
&lt;li&gt;Timestamp generation and receipt assembly&lt;/li&gt;
&lt;li&gt;How the receipt is stored and delivered to the user&lt;/li&gt;
&lt;li&gt;Why this is audit-admissible where a confirmation is not&lt;/li&gt;
&lt;li&gt;Code snippet: destruction receipt generation logic&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;The Glass Box — Real-Time Inference Transparency&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;The problem: legal professionals cannot trust what they cannot observe&lt;/li&gt;
&lt;li&gt;How Glass Box works: streaming inference stages to the UI in real time&lt;/li&gt;
&lt;li&gt;The four stages and how they are triggered&lt;/li&gt;
&lt;li&gt;Implementation: server-sent events for stage updates&lt;/li&gt;
&lt;li&gt;Why this is different from a loading spinner&lt;/li&gt;
&lt;li&gt;Code snippet: stage streaming implementation&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;The Security Command Center&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;What the Data Custody tab shows and why&lt;/li&gt;
&lt;li&gt;Audit trail architecture: what is logged, when, and how it is surfaced&lt;/li&gt;
&lt;li&gt;The principle behind it: data custody is a continuous record, not an on-demand report&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Security is not a layer you add to an AI product.&lt;br&gt;
It is the foundation you build everything on top of.&lt;br&gt;
Left of Bang on security means the breach is&lt;br&gt;
prevented before it is possible, not detected&lt;br&gt;
after it has happened.&lt;br&gt;
That is the only standard worth building to&lt;br&gt;
when the documents inside your system carry real stakes.&lt;br&gt;
PRISM v1.1 is live.&lt;/strong&gt;&lt;/p&gt;

</description>
      <category>security</category>
      <category>azure</category>
      <category>legaltech</category>
      <category>buildinpublic</category>
    </item>
    <item>
      <title>Building a Graph-Based Pattern Detection System: What I Learned and Where It Led</title>
      <dc:creator>Victor Okefie</dc:creator>
      <pubDate>Thu, 19 Mar 2026 12:12:00 +0000</pubDate>
      <link>https://forem.com/eaglelucid/building-a-graph-based-pattern-detection-system-what-i-learned-and-where-it-led-446n</link>
      <guid>https://forem.com/eaglelucid/building-a-graph-based-pattern-detection-system-what-i-learned-and-where-it-led-446n</guid>
      <description>&lt;h2&gt;
  
  
  I built Ascent Ledger as a career diagnostic OS —
&lt;/h2&gt;

&lt;p&gt;graph-based pattern detection on professional trajectories.&lt;br&gt;
The product taught me more about AI system architecture&lt;br&gt;
than almost anything else I built.&lt;br&gt;
This is the technical story — what the graph approach&lt;br&gt;
unlocked, what it cost, and how the thinking transferred&lt;br&gt;
directly into PRISM and NexOps.&lt;/p&gt;

&lt;h3&gt;&lt;strong&gt;Why Graph Over Vector for Pattern Detection&lt;/strong&gt;&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;The limitation of vector similarity for career data: Vectors find similarity, graphs find structure&lt;/li&gt;
&lt;li&gt;A career trajectory is not a set of similar documents. It is a sequence of connected decisions with causal relationships&lt;/li&gt;
&lt;li&gt;Why FalkorDB: native graph queries, relationship traversal, pattern matching across nodes&lt;/li&gt;
&lt;li&gt;Code snippet: basic graph schema for career nodes and edges&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;&lt;strong&gt;The Pattern Recognition Layer&lt;/strong&gt;&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;What a "stall pattern" looks like in graph form vs in a CV&lt;/li&gt;
&lt;li&gt;How the system detects structural loops — the same role type, different company, no progression&lt;/li&gt;
&lt;li&gt;The difference between movement and ascent — the insight that became Epopteia's philosophy&lt;/li&gt;
&lt;li&gt;Code snippet: pattern detection query in graph syntax&lt;/li&gt;
&lt;/ol&gt;
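&lt;p&gt;To make the idea concrete: a hedged sketch of what stall detection could look like once a trajectory has been materialised from the graph — the node shape, window size, and function name are illustrative, not Ascent Ledger's actual queries:&lt;/p&gt;

```typescript
// Illustrative stall-pattern sketch: a trajectory is a sequence of role
// nodes; a stall is a run of roles with the same role type and no level
// increase — movement without ascent. Node shape is hypothetical.
type Role = { roleType: string; level: number };

function hasStallPattern(trajectory: Role[], window: number): boolean {
  for (let i = 0; trajectory.length - i >= window; i++) {
    const span = trajectory.slice(i, i + window);
    const sameType = span.every(r => r.roleType === span[0].roleType);
    const ascended = span.some(r => r.level > span[0].level);
    if (sameType) {
      if (!ascended) {
        return true; // same role type, different companies, no progression
      }
    }
  }
  return false;
}
```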

&lt;h3&gt;&lt;strong&gt;What Graph Architecture Taught Me About PRISM&lt;/strong&gt;&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;The cross-reference validation problem in legal documents is the same problem as career pattern detection: finding structure across connected nodes, not just similar chunks&lt;/li&gt;
&lt;li&gt;How the graph thinking transferred: PRISM's internal reference mapping uses the same relational logic&lt;/li&gt;
&lt;li&gt;Why this matters: a legal document is a graph, not a sequence of paragraphs&lt;/li&gt;
&lt;li&gt;Code snippet: document reference mapping as a graph traversal&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;&lt;strong&gt;The Lesson About Building AI for High-Stakes Contexts&lt;/strong&gt;&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Pattern detection only has value when the user can trust the pattern&lt;/li&gt;
&lt;li&gt;The auditability problem: a graph pattern means nothing if the user cannot see how the system found it&lt;/li&gt;
&lt;li&gt;How this became the foundation of PRISM's forensic citation layer: show the path, not just the conclusion&lt;/li&gt;
&lt;li&gt;The architectural principle: never surface a result without surfacing the reasoning&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;The systems I build now are different because of what&lt;br&gt;
building Ascent Ledger taught me about the relationship&lt;br&gt;
between structure and trust.&lt;br&gt;
A pattern the user cannot verify is just a claim.&lt;br&gt;
A result without a path is just a guess.&lt;br&gt;
Left of Bang systems show their work.&lt;br&gt;
That is the only standard worth building to.&lt;/strong&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>python</category>
      <category>architecture</category>
      <category>graphdb</category>
    </item>
    <item>
      <title>Why I Chose Local-First Architecture for a Zero-Latency Operations Dashboard</title>
      <dc:creator>Victor Okefie</dc:creator>
      <pubDate>Wed, 11 Mar 2026 14:21:57 +0000</pubDate>
      <link>https://forem.com/eaglelucid/why-i-chose-local-first-architecture-for-a-zero-latency-operations-dashboard-3ff0</link>
      <guid>https://forem.com/eaglelucid/why-i-chose-local-first-architecture-for-a-zero-latency-operations-dashboard-3ff0</guid>
      <description>&lt;p&gt;&lt;strong&gt;Every major operations dashboard I evaluated before building NexOps&lt;br&gt;
had the same single point of failure: the cloud.&lt;br&gt;
When the network is slow, the dashboard is slow.&lt;br&gt;
When the API is rate-limited, the decision is delayed.&lt;br&gt;
When the server is in a different continent, latency is not&lt;br&gt;
a performance issue — it is a trust issue.&lt;br&gt;
In logistics operations, where a two-second pause before&lt;br&gt;
an answer introduces doubt into a decision that cannot&lt;br&gt;
afford doubt, I made a different architectural choice.&lt;br&gt;
Local-first.&lt;br&gt;
This is why — and what it cost me.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;Section 1: What Local-First Actually Means (and What It Doesn't)&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Common misconception: local-first = offline only&lt;/li&gt;
&lt;li&gt;The real definition: local state as the source of truth, sync as an enhancement, not a dependency&lt;/li&gt;
&lt;li&gt;Why this matters for ops dashboards specifically: The operator needs to trust the data before they trust the system&lt;/li&gt;
&lt;li&gt;The key principle: a system that waits for a server teaches its users to doubt it&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;Section 2: The Latency Problem in Enterprise Ops&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Sub-50ms as a design target, not a benchmark&lt;/li&gt;
&lt;li&gt;What happens psychologically when a system pauses: the doubt window&lt;/li&gt;
&lt;li&gt;The architecture decision: SQLite local store + selective cloud sync vs pure cloud dependency&lt;/li&gt;
&lt;li&gt;Code snippet: local-first data layer setup&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;Section 3: The Audit Trail Problem&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Ops decisions need to be documented before they can be disputed&lt;/li&gt;
&lt;li&gt;Standard cloud-first tools: audit trail lives on the server, accessible after the fact&lt;/li&gt;
&lt;li&gt;NexOps approach: every decision is logged locally first, synced to the cloud second — the record is immutable from the moment of decision&lt;/li&gt;
&lt;li&gt;Why this is Left of Bang: the documentation exists before anyone asks for it&lt;/li&gt;
&lt;li&gt;Code snippet: decision logging architecture&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;Section 4: The Anomaly Detection Layer&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Surfacing what matters before the operator has to look for it&lt;/li&gt;
&lt;li&gt;The Priority Action Queue: not a list, a ranked decision surface&lt;/li&gt;
&lt;li&gt;How anomaly detection is implemented without ML overhead: threshold-based pattern detection on local data&lt;/li&gt;
&lt;li&gt;Why local processing beats cloud processing for real-time ops: no round-trip, no rate limit, no dependency&lt;/li&gt;
&lt;li&gt;Code snippet: priority queue logic&lt;/li&gt;
&lt;/ul&gt;
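&lt;p&gt;As a concrete illustration of the threshold-based approach — the metric names, thresholds, and severity ranking here are assumptions for the sketch, not NexOps's actual configuration:&lt;/p&gt;

```typescript
// Illustrative Priority Action Queue: keep only metrics that breach their
// threshold, then rank by how far past the threshold they are, so the most
// severe anomaly surfaces first. All names and numbers are hypothetical.
type Metric = { name: string; value: number; threshold: number };

function priorityActionQueue(metrics: Metric[]): Metric[] {
  return metrics
    .filter(m => m.value > m.threshold)
    .sort((a, b) => b.value / b.threshold - a.value / a.threshold);
}
```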

&lt;h2&gt;Section 5: What I Gave Up and Why It Was Worth It&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Honest trade-offs: local-first increases complexity&lt;/li&gt;
&lt;li&gt;Sync conflicts — how they are resolved&lt;/li&gt;
&lt;li&gt;The offline edge case: what happens when the operator genuinely has no connectivity&lt;/li&gt;
&lt;li&gt;The conclusion: for logistics teams making decisions under pressure, trust is the product.&lt;/li&gt;
&lt;li&gt;Local-first architecture is how you engineer trust.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;The architecture choice is always a values choice.&lt;br&gt;
I built NexOps local-first because I believe that a&lt;br&gt;
system which makes an operator wait — even for two&lt;br&gt;
seconds — has already failed them in the moment that matters.&lt;br&gt;
Left of Bang is not a feature. It is an architectural commitment.&lt;/strong&gt;&lt;/p&gt;

</description>
      <category>javascript</category>
      <category>architecture</category>
      <category>devops</category>
      <category>buildinpublic</category>
    </item>
    <item>
      <title>How I Built an Anti-Hallucination Pipeline for Enterprise Legal Documents</title>
      <dc:creator>Victor Okefie</dc:creator>
      <pubDate>Tue, 10 Mar 2026 07:37:37 +0000</pubDate>
      <link>https://forem.com/eaglelucid/how-i-built-an-anti-hallucination-pipeline-for-enterprise-legal-documents-44ae</link>
      <guid>https://forem.com/eaglelucid/how-i-built-an-anti-hallucination-pipeline-for-enterprise-legal-documents-44ae</guid>
      <description>&lt;p&gt;The standard advice for building RAG pipelines is to improve your retrieval. Better embeddings. Smarter chunking. Larger context windows.&lt;br&gt;
That advice is incomplete.&lt;br&gt;
I spent three months building PRISM — a document intelligence system for legal and compliance teams. The retrieval was never the hardest part. The hardest part was keeping the model inside the boundaries of what it retrieved.&lt;br&gt;
This is how I solved it.&lt;/p&gt;

&lt;h2&gt;Section 1: The Problem With Standard RAG (Trust Without Enforcement)&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Explain what RAG does correctly: retrieves relevant chunks&lt;/li&gt;
&lt;li&gt;Explain the gap: the model is trusted but not constrained&lt;/li&gt;
&lt;li&gt;Real scenario: a legal document with internal cross-references where the model invents a clause it has seen in other documents but not this one&lt;/li&gt;
&lt;li&gt;Why this is catastrophic in legal/compliance contexts&lt;/li&gt;
&lt;li&gt;Code snippet: basic RAG pipeline showing the trust gap&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;Section 2: Layer 1 — Boundary Enforcement&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;The technique: strict context injection with explicit system prompt constraints&lt;/li&gt;
&lt;li&gt;"You may only use information present in the following retrieved sections. If the answer is not present, say so."&lt;/li&gt;
&lt;li&gt;Why this alone is not enough (the model still paraphrases away from accuracy)&lt;/li&gt;
&lt;li&gt;Additional enforcement: output validation against retrieved text&lt;/li&gt;
&lt;li&gt;Code snippet: boundary-enforced prompt structure&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;Section 3: Layer 2 — Forensic Citation&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Every generated claim is mapped back to a specific source paragraph&lt;/li&gt;
&lt;li&gt;How this is implemented: post-generation attribution pass&lt;/li&gt;
&lt;li&gt;Confidence scoring: cosine similarity between claim and source&lt;/li&gt;
&lt;li&gt;What happens when confidence falls below threshold — the system flags rather than guesses&lt;/li&gt;
&lt;li&gt;Why this matters to a legal professional: they do not trust outputs they cannot audit&lt;/li&gt;
&lt;li&gt;Code snippet: citation attribution logic&lt;/li&gt;
&lt;/ul&gt;
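&lt;p&gt;The confidence check reduces to a similarity score and a cutoff. A minimal sketch — the function names and threshold value are illustrative, not PRISM's production code:&lt;/p&gt;

```typescript
// Illustrative confidence check: cosine similarity between a claim
// embedding and its candidate source embedding. Below the threshold the
// system flags the claim rather than citing it. Names are hypothetical.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i !== a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / Math.sqrt(na * nb);
}

function citeOrFlag(similarity: number, threshold: number): 'cite' | 'flag' {
  return similarity >= threshold ? 'cite' : 'flag';
}
```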

&lt;h2&gt;Section 4: Layer 3 — Cross-Reference Validation&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Legal documents reference themselves: definitions, clauses, schedules&lt;/li&gt;
&lt;li&gt;Standard pipelines treat each chunk independently&lt;/li&gt;
&lt;li&gt;PRISM maps internal references before generation begins&lt;/li&gt;
&lt;li&gt;Consistency check: if Clause 4.2 is referenced in Clause 7.1, the output must be consistent with both&lt;/li&gt;
&lt;li&gt;Code snippet: cross-reference graph construction&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;Section 5: What This Costs You (Performance Trade-offs)&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Honest account of latency increase from multi-layer processing&lt;/li&gt;
&lt;li&gt;How the architecture compensates: async validation, cached citation maps&lt;/li&gt;
&lt;li&gt;Where it is worth it (legal, compliance, contracts) vs where it is overkill (general Q&amp;amp;A)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;Closing: What I Learned&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;The minimum standard for AI in high-stakes document contexts is not accuracy. It is auditability.&lt;/li&gt;
&lt;li&gt;An answer that is right 95% of the time and shows no evidence of the 5% is worse than an answer that admits uncertainty.&lt;/li&gt;
&lt;li&gt;PRISM is live at prism.vercel.app — built for teams that cannot afford black boxes.&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ai</category>
      <category>python</category>
      <category>machinelearning</category>
      <category>legaltech</category>
    </item>
  </channel>
</rss>
