
Your Agent Framework Doesn’t Matter - Your Data Boundary Does

I recently worked through a practical problem that many enterprise teams will run into when using LLMs: how do you use powerful frontier models without exposing proprietary data unnecessarily?

I cannot describe the exact use case, because it involves internal and proprietary information. So the example below is invented. But the architectural issue is the same.

Imagine a financial-services workflow where a system receives an abbreviated security description such as:

“UBS Grp 4.20% Call Sr Nts 30”

and needs to resolve it into something more explicit:

“UBS Group AG, 4.20% Callable Senior Notes, maturity 2030.”

The obvious way to solve this is to give the task to a frontier model. It will probably do a good job. But in a real enterprise setting, the question is not only whether the model can solve the task. The more important question is: what else does the model see while solving it?

That is where the architecture matters.

There are at least two common ways to build this kind of pipeline. One is a fixed graph workflow. The steps are predefined: ingest the input, resolve the term, validate the result. This gives you control and predictability. It is easier to reason about, easier to test, and often easier to audit.
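A minimal sketch of such a fixed workflow, assuming three hypothetical step functions (ingest_text, resolve_term, validate_result) that stand in for the real logic. None of these names come from an actual framework; the point is only that the path is decided before any input arrives.

```python
# Hypothetical step functions; names and logic are illustrative only.
def ingest_text(raw: str) -> str:
    """Normalize whitespace in the incoming description."""
    return " ".join(raw.split())


def resolve_term(text: str) -> dict:
    """Stand-in for the model call that would expand the abbreviation."""
    return {"input": text, "resolved": f"<expansion of {text!r}>"}


def validate_result(result: dict) -> dict:
    """Reject empty resolutions before they flow downstream."""
    if not result["resolved"]:
        raise ValueError("empty resolution")
    return result


# The path is fixed in advance: every input takes the same three steps.
PIPELINE = (ingest_text, resolve_term, validate_result)


def run_pipeline(raw: str) -> dict:
    value = raw
    for step in PIPELINE:
        value = step(value)
    return value
```

Because the sequence is a plain tuple, it can be read, tested, and audited without running anything.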

Another approach is dynamic orchestration. Instead of fixing the path in advance, an orchestrator decides which worker or tool should handle the next step. One worker may extract the relevant text, another may resolve it, and another may enrich or validate the result. This gives you more flexibility, especially when the workflow is less predictable.
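The same idea as a rough sketch, assuming a hypothetical choose_next function that picks the next worker from the current state. Here the routing rule is hard-coded for illustration; in a real system a local model might make that decision.

```python
# Hypothetical workers; each one reads and extends a shared state dict.
def extract_worker(state: dict) -> dict:
    state["text"] = " ".join(state["raw"].split())
    return state


def resolve_worker(state: dict) -> dict:
    # Stand-in for the model call that would expand the abbreviation.
    state["resolved"] = f"<expansion of {state['text']!r}>"
    return state


def validate_worker(state: dict) -> dict:
    state["valid"] = bool(state["resolved"])
    return state


WORKERS = {
    "extract": extract_worker,
    "resolve": resolve_worker,
    "validate": validate_worker,
}


def choose_next(state: dict):
    """Illustrative routing rule: pick whichever piece of state is missing."""
    if "text" not in state:
        return "extract"
    if "resolved" not in state:
        return "resolve"
    if "valid" not in state:
        return "validate"
    return None  # nothing left to do


def orchestrate(raw: str, max_steps: int = 10):
    state = {"raw": raw}
    trace = []  # which workers ran, in order
    for _ in range(max_steps):
        name = choose_next(state)
        if name is None:
            break
        trace.append(name)
        state = WORKERS[name](state)
    return state, trace
```

The path is now an outcome of the run rather than something written down beforehand, which is where the flexibility comes from.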

Both approaches are valid.

And both can make the same mistake.

If all proprietary context is sent into a single frontier model, then the framework choice does not really solve the underlying problem. Whether the workflow is a graph or an orchestrator-worker setup, the data exposure can be essentially the same.

That was the main lesson for me.

The agent framework is not the first-order issue.

The boundary is.

A graph gives you structure. An orchestrator gives you flexibility. A supervisor pattern may make sense in one case, a deterministic pipeline in another. These are useful engineering decisions, but they do not answer the central governance question:

Which model gets to see which data?

When I implemented and compared different workflow and orchestration approaches, I came to the same conclusion in both cases. For many steps in the process, a frontier model is not necessary. Most orchestration, routing, tool-calling, validation, and workflow-management tasks can be handled by on-premise or private models.

These models may not be as broadly capable as the best frontier models, but they are often good enough for the local task. And that distinction matters. In enterprise systems, you often do not need one model to see everything. You need different models to see only what they need.

In the invented financial example, the frontier model may only need the abbreviated text string. It does not need the client ID. It does not need to know who holds the security. It does not need the position size, the portfolio context, the counterparty, or the mapping between internal holdings and market instruments.

The model only needs to answer one narrow question:

What does this string most likely mean?

Everything else should stay inside the enterprise boundary.
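One way to make this concrete is to give the outbound request its own type that simply has no place for the sensitive fields. The sketch below assumes a hypothetical internal record layout; the field names are invented, like the example itself.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class InternalRecord:
    """Hypothetical internal view: everything the enterprise knows."""
    client_id: str
    position_size: int
    counterparty: str
    description: str


@dataclass(frozen=True)
class FrontierRequest:
    """The only shape allowed to leave: a single text field."""
    description: str


def to_frontier_payload(record: InternalRecord) -> dict:
    # Copy the one field the model needs; everything else is dropped
    # by construction, not by remembering to delete it.
    return asdict(FrontierRequest(description=record.description))
```

Building the payload from an explicit allowlist, rather than deleting fields from the full record, means a newly added internal field stays inside by default.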

That is the practical shift: isolate the data, not necessarily the model.

I do not think the right answer is to avoid frontier models altogether. That is neither realistic nor always desirable. Frontier models are extremely useful, and for a small subset of tasks I have found that they are still difficult to replace. But they should not become the default place where proprietary context accumulates.

The better pattern is to keep sensitive steps on-premise, reduce or redact the context, and then call the frontier model only when needed, with the smallest possible input.

The frontier-facing call should behave almost like a pure function.

A narrow input goes in.

A narrow output comes back.

No memory.

No broad business context.

No unnecessary metadata.

No “just in case” information.
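As a sketch, the frontier-facing call can be written as a function that takes one string and returns one string. The call_model parameter is an assumption: it stands for whatever client actually talks to the frontier model, injected as a plain callable so that nothing in this function can hold history or reach for extra context.

```python
from typing import Callable


def resolve_description(description: str, call_model: Callable[[str], str]) -> str:
    """Narrow input in, narrow output back; no state is kept between calls."""
    # The prompt is rebuilt from scratch on every call and contains
    # only the instruction and the abbreviated string.
    prompt = "Expand this abbreviated security description:\n" + description
    return call_model(prompt).strip()
```

Because the model client is passed in, the function can be tested with a stub, and the exact text that crosses the boundary is visible in one place.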

This is different from how many people use LLMs today. The default instinct is often to provide more context, because more context usually improves the answer. But in enterprise systems, more context is not always better. More context can also mean more exposure.

So the design question becomes: what is the minimum context required for this model to perform this step well?

This is also why multi-LLM architectures are useful. Not because they are fashionable, and not because every problem needs more moving parts. They are useful because they allow a clearer separation of concerns.

One model can handle local orchestration. Another can work with proprietary internal data. A frontier model can be used selectively for the narrow tasks where its additional capability is actually required.

The architecture becomes less about one powerful model seeing everything, and more about several models each seeing only what they need.
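A small sketch of that separation, assuming hypothetical step names and two tiers of model. The routing table is invented for illustration; what matters is that unknown steps default to the local tier.

```python
from enum import Enum


class Tier(Enum):
    LOCAL = "local"        # on-premise or private model
    FRONTIER = "frontier"  # external frontier model


# Hypothetical mapping from workflow step to model tier.
ROUTES = {
    "orchestrate": Tier.LOCAL,
    "enrich_with_holdings": Tier.LOCAL,
    "validate": Tier.LOCAL,
    "resolve_abbreviation": Tier.FRONTIER,
}


def tier_for(step: str) -> Tier:
    # Anything not explicitly listed stays inside the boundary.
    return ROUTES.get(step, Tier.LOCAL)
```

Defaulting to the local tier makes the frontier model an explicit opt-in per step rather than the place where new steps land by accident.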

After implementing and comparing these approaches, my conclusion is that the framework choice is secondary. Graph workflows, orchestrator-worker patterns, and supervisor setups all have their place. But none of them automatically protects proprietary data.

The invariant has to be the boundary.

What stays inside? What gets redacted? What crosses into the frontier model? What comes back? What is logged? What is never exposed?
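Some of these questions can be turned into a check that runs on every outbound call. The sketch below assumes a hypothetical allowlist of fields and an in-memory audit log; it records which field names were involved, never their values.

```python
# Hypothetical policy: the only field allowed to cross the boundary.
ALLOWED_OUTBOUND_FIELDS = frozenset({"description"})


class BoundaryViolation(Exception):
    """Raised when a payload contains fields that must stay inside."""


def check_outbound(payload: dict, audit_log: list) -> dict:
    extra = set(payload) - ALLOWED_OUTBOUND_FIELDS
    if extra:
        # Log field names only, so the log itself does not leak values.
        audit_log.append({"event": "blocked", "fields": sorted(extra)})
        raise BoundaryViolation(f"fields not allowed to leave: {sorted(extra)}")
    audit_log.append({"event": "sent", "fields": sorted(payload)})
    return payload
```

A gate like this sits in front of the frontier call regardless of whether the surrounding workflow is a graph or an orchestrator.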

These are the questions that decide whether the architecture is safe enough for serious enterprise use.

The example I am using here is invented, but the lesson is not. In real workflows, the hard part is not simply getting the model to produce the right answer. The hard part is making sure it produces the right answer without seeing data it never needed in the first place.

That is why, for me, the main architectural decision is not the agent framework. It is the LLM strategy.

Use on-premise or private models for the majority of routine and sensitive steps. Use frontier models selectively, only where their capabilities are genuinely needed. And when they are used, make the call narrow, stateless, and context-poor by design.

The future of enterprise AI will not be one model seeing everything.

It will be systems where each model sees only what it needs.

