Applying industry LLMs to the work you already do.

Door three. The unglamorous one. Where almost all of the business value of AI in 2026 actually lives.

What this actually involves

Applying industry LLMs means using the major labs' models - the ones you have heard of - to automate or augment work that already happens in your business. You do not train anything. You do not fine-tune anything. You build the harness around the model that turns it from a chat box into a useful piece of operational infrastructure.

The work is not in the model. The work is in everything around it: how the model is prompted, what context it gets to see, what tools it can call, what its outputs are checked against, and where a human stays in the loop. The model is the engine. The harness is the vehicle.

RAG, in plain English

"RAG" stands for retrieval-augmented generation. It is the most useful thing to understand about applied LLM work, and the most over-mystified.

Plain version: when a user asks the system a question, the system first looks in your data for relevant information, then pastes that information into the prompt the LLM sees, and asks the LLM to answer using that context. The LLM never "learned" your data; it just had access to the right bits of it at the right moment.

That is it. The technical sophistication is in how well the "look in your data for relevant information" step works - indexing strategy, query rewriting, ranking, freshness, permissions. But the pattern itself is simple, and it is the workhorse of the applied LLM world. Most organisations get most of their AI value from this single pattern, well executed.
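The pattern above can be sketched in a few lines. The retriever here is a toy keyword-overlap scorer and `build_prompt` is a hypothetical helper; real systems use vector search, query rewriting, ranking, and permission filtering, and `call_llm` stands in for whichever model API you use.

```python
# Minimal sketch of retrieval-augmented generation: look in your data,
# paste what you find into the prompt, ask the model to answer from it.

def retrieve(query: str, documents: list[str], top_k: int = 2) -> list[str]:
    """Toy retriever: rank documents by shared query words, keep the best."""
    q_words = set(query.lower().split())
    scored = sorted(documents,
                    key=lambda d: len(q_words & set(d.lower().split())),
                    reverse=True)
    return scored[:top_k]

def build_prompt(query: str, context: list[str]) -> str:
    """Paste the retrieved context into the prompt the LLM sees."""
    joined = "\n".join(f"- {c}" for c in context)
    return (f"Answer the question using ONLY the context below.\n"
            f"Context:\n{joined}\n\nQuestion: {query}")

docs = [
    "Refunds are processed within 14 days of the return being received.",
    "Our office is closed on public holidays.",
    "Returns must be initiated within 30 days of purchase.",
]
query = "How long do refunds take?"
prompt = build_prompt(query, retrieve(query, docs))
# The model never 'learned' these documents; it just sees them here.
# answer = call_llm(prompt)   # hypothetical model call
```

Everything that makes production RAG hard lives inside `retrieve`: swapping the keyword scorer for proper indexing and ranking is where the engineering effort goes.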

Agents, in plain English

An agent is an LLM that has been given tools (functions it can call: read this database, send this email, file this ticket) and the ability to plan a sequence of actions to achieve a goal. Where RAG retrieves and answers, an agent retrieves, decides, acts, and continues until the job is done.

Agentic patterns are powerful in narrow domains where the actions are well-bounded, the failure modes are recoverable, and the supervision pattern is clearly defined. They are risky when deployed without thinking through what happens when the agent does something wrong - because eventually it will.

We use agentic patterns in our own work, including in Rubicon Probity. We are also cautious about deploying them at clients without an honest conversation about supervision, audit, and blast radius.

What this typically looks like in business

The shape of door-three work is repetitive across industries:

  • Document-heavy review work. Underwriting, legal review, compliance checks, claims processing. People reading documents and making structured judgements about them. RAG plus structured output captures most of the value.
  • Internal knowledge access. Your team needs to find the answer to something that lives in your documents, your tickets, your wiki, your emails. A retrieval system over your own corpus, surfaced through a chat interface or a Slack integration, removes a category of friction.
  • Drafting and structured generation. First drafts of recurring artefacts: reports, briefings, customer communications, code stubs. The LLM does the boring 80%; a human does the judgement-heavy 20% and the sign-off.
  • Triage and classification. Inbound tickets, emails, requests. The LLM categorises, summarises, and routes. The human focuses on the cases that matter.
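The triage pattern hinges on structured output: ask the model for JSON matching a fixed schema, then validate it before anything downstream acts on it. A minimal sketch, assuming a hypothetical model response; the categories and the example email are illustrative.

```python
# Sketch of triage with structured output: the validation step is the
# part that stops a bad model output from silently mis-routing work.
import json

ALLOWED_CATEGORIES = {"billing", "technical", "account", "other"}

def parse_triage(raw: str) -> dict:
    """Validate the model's JSON before routing anything on it."""
    result = json.loads(raw)
    if result.get("category") not in ALLOWED_CATEGORIES:
        raise ValueError(f"unexpected category: {result.get('category')!r}")
    if not isinstance(result.get("summary"), str):
        raise ValueError("summary must be a string")
    return result

# Imagine this came back from the model for an inbound email:
raw_output = '{"category": "billing", "summary": "Customer disputes an invoice."}'
ticket = parse_triage(raw_output)
```

If validation fails, the item drops to the human queue rather than being routed on a guess, which is the human-in-the-loop pattern in its simplest form.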

What it costs, what it does not cost

Door three is the cheapest of the three doors at build time. You are not paying to train a model; you are paying for the harness around it and for ongoing model usage. A useful internal tool can be built and deployed in weeks, not years.

The costs that matter are not always the ones the vendor talks about. They are: the cost of bad outputs reaching users without a check, the cost of hallucinated citations in regulated workflows, the cost of agents acting on stale data, the cost of permissions leaking through retrieval. None of these are reasons not to walk through door three. All of these are reasons to govern the work properly while you do.

When it is the right answer

Door three is the right answer for almost every business looking to "do something with AI". The exceptions are the small set of door-one and door-two cases described on the other pages.

More usefully: door three is the right answer when you have real operational work that touches text, documents, or conversations; when the workload is large enough that automation pays back; and when the failure modes are recoverable enough that a sensible human-in-the-loop pattern can be designed.

What we do here

This is where most of our delivery work sits. We help organisations identify the highest-value door-three opportunities, design the supervision and governance pattern that makes the work safe to deploy, and build the harness - retrieval, prompting, agents where appropriate, evaluation, audit trail - that turns an industry model into a useful piece of your operation.

Crucially, we do this with Rubicon Probity sitting above the work, capturing the decisions the system is making, surfacing the biases at play, and producing the audit trail that lets you defend the deployment to a board, a regulator, or a customer who asks. The model is the engine. The harness is the vehicle. Probity is the dashboard.

One conversation, no commitment. We listen first, then suggest a path.

Ready to see where you stand?

Two minutes. No email required. Find out where AI fits your business - and where the gaps are.

Or skip ahead and start a conversation