Why Vessel

Who Owns Your AI Agent's Knowledge?

Every correction you make to your AI agent is an investment. “Too formal.” “Cite the holding, not the dicta.” “That's not how we talk to clients.” Over months, those corrections accumulate into something genuinely valuable: your professional judgment, encoded and compounding. The question most professionals haven't asked yet: who else gets to use it?

What your corrections are actually worth

A consultant spends four months training an AI agent on her firm's methodology. Every correction sharpens the output. “That's not how we structure a recommendation.” “Use the client's terminology, not ours.” “Executive summary first, findings second.”

These aren't throwaway edits. They're the encoding of professional judgment that took years to develop.

In July 2024, researchers at Oxford and Cambridge published a paper in Nature demonstrating that AI models trained on their own outputs undergo “model collapse.” The tails of the original data distribution disappear. The model's output converges to a flattened, degraded version of reality.

The antidote? Human corrections. Real expert judgment. The kind of feedback that can't be synthesized or scraped from the web.

The correction flywheel

Your correction → agent learns → better output → more trust → more corrections → your next correction.

Each cycle compounds your advantage. The agent becomes a reflection of your expertise.

That's what makes your agent's accumulated corrections different from generic training data. They're the proprietary expression of how you think, evaluate, and decide. And they create competitive advantages that compound over time.

Gartner predicts that by 2028, half of all organizations will implement zero-trust data governance, driven by the proliferation of unverified AI-generated content. In that world, human-verified, expert-corrected knowledge becomes scarcer and more valuable. Not less.

The training problem (and who it actually affects)

Most professionals interact with AI through consumer products: ChatGPT, Claude, Gemini. These products default to training on your conversations. That's the deal. You get a free or cheap tool; the platform gets your data to improve their models.

There is a different tier. LLM providers also offer API access for developers building applications. The API tier has meaningfully better privacy: no training on your data, shorter retention windows, and zero-data-retention agreements for customers who need them. Vessel uses this API tier, not the consumer product.

But the distinction only protects you if you know it exists, and most professionals don’t.

Consumer products (what most professionals use)

OpenAI (ChatGPT): trains by default

Free and Plus tier conversations may be used to train models. An opt-out is available, but training is on by default.

Source: OpenAI Data Usage Policy

Anthropic (Claude): policy reversed

Positioned as the safety-focused alternative for years. Then in September 2025, Anthropic updated its consumer terms: Claude now trains on user inputs by default, and opted-in data is retained for up to five years.

Source: TechCrunch, August 2025

Google Gemini: 3-year retention

Consumer tier: reviewed conversations are retained for up to three years and may be used to improve AI models.

Source: Gemini Apps Privacy Hub

API access (what Vessel uses for inference)

LLM API tier: no training

OpenAI, Anthropic, and Google all maintain a strict separation between consumer and API products. API customer data is not used for model training by any major provider. Requests are typically retained for 7–30 days for abuse monitoring, then deleted. Zero-data-retention agreements are available for enterprise customers who need even that eliminated.

Sources: OpenAI, Anthropic, Google Vertex AI

The distinction is real, but most professionals never encounter it. They use the consumer product, under consumer terms. The Heppner ruling below shows what happens when they do.

What the courts are saying

On February 10, 2026, Judge Jed Rakoff of the Southern District of New York issued the first federal ruling on whether AI-generated documents are protected by attorney-client privilege.

The answer was no.

February 2026, US v. Heppner (SDNY): privilege denied

Defendant used Claude's consumer product to prepare 31 documents outlining his defense strategy. Judge Rakoff rejected all privilege claims on three grounds: an AI tool is not an attorney; the platform's consumer privacy policy allows data collection and disclosure to third parties including government; materials were not prepared by or at the direction of counsel.

July 2024, ABA Formal Opinion 512: ethics obligation

The ABA's first ethics guidance on AI: lawyers must investigate whether an AI tool trains on their data before using it with client information. Consent must be genuinely informed, not boilerplate.

August 2025, Bartz v. Anthropic: $1.5B settlement

Largest copyright settlement in US history, and the clearest signal yet that AI training data has a price. Anthropic agreed to pay $1.5 billion and destroy the pirated datasets used to train Claude.

The Heppner ruling turns on a specific point: the consumer platform's terms allowed data collection and disclosure. That's what destroyed privilege. An AI agent running on infrastructure where no platform has access to the data presents a fundamentally different legal posture.

As of October 2025, over 70 AI copyright infringement lawsuits were active in US courts. The FTC launched an inquiry into AI chatbot data practices in September 2025. State bar associations across the country are issuing guidance at an accelerating pace. This regulatory landscape isn't settling down.

The gap between worry and behavior

The risk isn't theoretical. It's behavioral.

  • 72% of employees using GenAI at work use non-corporate accounts (Verizon 2025 DBIR)
  • $670K in extra breach costs for organizations with shadow AI (IBM 2025 Cost of a Data Breach Report)
  • 56% jump in AI-related incidents in a single year (Stanford 2025 AI Index)

A 2025 Cisco study of 2,600 privacy professionals found that 64% worry about inadvertently sharing sensitive information with competitors via generative AI. Yet nearly half admit to inputting personal employee data or non-public company information into AI tools anyway.

Professionals know the risk exists. They use the tools anyway, because the tools are useful. The answer isn’t to stop using AI. It’s to use AI on infrastructure built to minimize the risk.

How Vessel protects your data

Privacy isn't a single switch you flip. It's layers, each reducing risk. Here's how Vessel approaches it honestly.

Defense in depth

Layer 1: Compute isolation. Your agent runs on a dedicated VM with its own kernel and disk. Your files, corrections, and agent state never touch another customer’s environment. Vessel doesn’t look into your agent’s conversations, memory, or files.
Layer 2: API-tier LLM access. When your agent needs to reason, it calls an LLM API. API-tier data is not used for model training by any major provider. Requests are retained for 7–30 days for abuse monitoring, then deleted.
Layer 3: Network controls. No public IPs. No SSH. Inbound access only through encrypted tunnels. Outbound via Cloud NAT (no public IP exposed).
Layer 4: Additional hardening (coming). PII redaction before API calls (a rough sketch follows below). Zero-data-retention agreements with LLM providers. On-premise deployment options for regulated industries.
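Layer 4 is on the roadmap rather than shipped, so here is a rough sketch of what a redaction pass before an API call could look like. It is purely illustrative and assumes a simple regex-based approach; the patterns, placeholder format, and function names are hypothetical, not Vessel’s actual implementation.

```python
import re

# Illustrative only: a simple regex-based redaction pass that strips obvious
# identifiers from a prompt before it leaves the VM as part of an LLM API call.
# The patterns and placeholder format are assumptions for this sketch.

PII_PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "PHONE": re.compile(r"\b(?:\+?1[\s.-]?)?\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def redact(text: str) -> tuple[str, dict[str, str]]:
    """Replace matched identifiers with placeholders; keep the mapping locally."""
    mapping: dict[str, str] = {}
    for label, pattern in PII_PATTERNS.items():
        # dict.fromkeys() dedupes matches while preserving their order
        for i, value in enumerate(dict.fromkeys(pattern.findall(text))):
            placeholder = f"[{label}_{i}]"
            mapping[placeholder] = value
            text = text.replace(value, placeholder)
    return text, mapping


def restore(text: str, mapping: dict[str, str]) -> str:
    """Re-insert the original values into the model's response, on the local machine."""
    for placeholder, value in mapping.items():
        text = text.replace(placeholder, value)
    return text


if __name__ == "__main__":
    prompt = "Draft a reply to jane.doe@example.com confirming the call at 555-867-5309."
    safe_prompt, mapping = redact(prompt)
    print(safe_prompt)
    # -> "Draft a reply to [EMAIL_0] confirming the call at [PHONE_0]."
    # safe_prompt is what would travel to the LLM API; the mapping never leaves the VM.
```

Whatever form the real redaction layer takes, the design goal matches the other layers: the sensitive detail stays on your machine, and only the minimum necessary context reaches the provider.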

We're honest about the trade-off. When your agent reasons, inference requests travel to the LLM provider. That's how AI agents work today unless you run a self-hosted model, which costs significantly more and delivers weaker results. Complete privacy through self-hosted LLMs is possible but comes at the expense of agent capability and cost.

What Vessel guarantees: your persistent data (the corrections, the context, the files, the accumulated judgment that makes your agent yours) stays on your machine. It's never shared across tenants and never pooled in a database. Vessel doesn't look into your agent's conversations, memory, or files. The LLM sees individual inference requests under API terms that prohibit training. Your agent's knowledge stays yours.

Consumer AI tools

  • Your data trains their models (by default)
  • All users share the same infrastructure
  • Privacy policies can change any quarter
  • No isolation between tenants
  • Platform has full access to your data

Your vessel

  • No training on your data (Vessel or LLM API)
  • Dedicated VM per customer
  • Privacy is structural, not a policy toggle
  • Full tenant isolation at the hardware level
  • Vessel doesn't look into your conversations, memory, or files

Your expertise compounds with every correction. Vessel works hard to keep that knowledge where it belongs: on your machine, under your control, working for you.

Your expertise deserves its own machine.

Private VMs. No shared infrastructure. Your agents, your data, your rules.