Shipping agentic research to 10,000 analysts

A year of building agent workflows inside a real product, and the five lessons I wish someone had handed me on day one.

In January 2025, we shipped the first version of our agentic research workflow at AlphaSense. It was slow, sometimes wrong, and — to our surprise — beloved by a small cohort of analysts who would send us hand-written bug reports at 2 a.m.

A year later, it runs thousands of sessions a day. What changed is not the model. What changed is how we thought about the seams: where the agent defers to a human, where it persists state, how it surfaces uncertainty, and how it recovers from its own mistakes.

The five seams

Most discussions of “agentic AI” focus on the loop: plan, act, observe, reflect. That part is largely solved; any decent open-source framework gives you a working loop in an afternoon. The hard problems sit outside the loop, at the seams where the agent meets the rest of your system.

The loop is the easy part. Everything worth learning happens at the seams.

Seam one is the input boundary. Analysts don’t ask questions the way documentation suggests. They paste half-formed sentences, ticker symbols, screenshots. Every useful production agent I’ve shipped has had a small, sturdy normalization layer in front of it that is essentially a compiler for human intent.
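
To make the "compiler for human intent" idea concrete, here is a minimal sketch of a normalization layer. It is illustrative only, not our production code: the `NormalizedQuery` shape, the `normalize` function, and the assumption that tickers arrive as `$AAPL`-style cashtags are all hypothetical.

```python
from __future__ import annotations

import re
from dataclasses import dataclass, field


@dataclass
class NormalizedQuery:
    """Hypothetical structured form handed to the agent instead of raw text."""
    text: str
    tickers: list[str] = field(default_factory=list)
    has_attachment: bool = False


# Assumption: tickers are written as cashtags such as "$NVDA" (1-5 capital letters).
TICKER_RE = re.compile(r"\$([A-Z]{1,5})\b")


def normalize(raw: str, attachments: int = 0) -> NormalizedQuery:
    """Turn a messy, pasted request into a small, predictable structure."""
    # Collapse runs of whitespace and newlines left over from pasting.
    text = " ".join(raw.split())
    # Extract tickers, dropping duplicates while keeping first-seen order.
    tickers = list(dict.fromkeys(TICKER_RE.findall(text)))
    return NormalizedQuery(text=text, tickers=tickers, has_attachment=attachments > 0)


query = normalize("  what's up with $NVDA  margins vs $AMD?\n", attachments=1)
```

Here `query.tickers` is `["NVDA", "AMD"]` and `query.text` has its whitespace collapsed. A real layer would handle far more (bare tickers, company names, screenshots), but the point is that the agent only ever sees the structured form.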

Seam two is memory. Stateless agents are a research artifact. Real analysts work on a thesis for weeks. They expect to come back to a session and find it where they left it, with the reasoning available for audit. This is a database problem dressed in an AI costume.
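
As a rough illustration of why this is a database problem, the sketch below stores each session as an append-only log of steps in SQLite, so a session can be reloaded later with its reasoning intact. The table layout and the `open_store`, `append_step`, and `load_session` helpers are assumptions made for the example, not a description of our actual schema.

```python
from __future__ import annotations

import json
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a hypothetical step log; ':memory:' is for demos only."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS steps (
               session_id TEXT NOT NULL,
               seq        INTEGER NOT NULL,
               kind       TEXT NOT NULL,
               payload    TEXT NOT NULL,
               PRIMARY KEY (session_id, seq)
           )"""
    )
    return conn


def append_step(conn: sqlite3.Connection, session_id: str, kind: str, payload: dict) -> int:
    """Append one step (plan, tool call, observation...) and return its sequence number."""
    with conn:  # commits on success, rolls back on error
        (seq,) = conn.execute(
            "SELECT COALESCE(MAX(seq), -1) + 1 FROM steps WHERE session_id = ?",
            (session_id,),
        ).fetchone()
        conn.execute(
            "INSERT INTO steps (session_id, seq, kind, payload) VALUES (?, ?, ?, ?)",
            (session_id, seq, kind, json.dumps(payload)),
        )
    return seq


def load_session(conn: sqlite3.Connection, session_id: str) -> list[dict]:
    """Return every step of a session in order, ready to resume or audit."""
    rows = conn.execute(
        "SELECT seq, kind, payload FROM steps WHERE session_id = ? ORDER BY seq",
        (session_id,),
    ).fetchall()
    return [{"seq": s, "kind": k, "payload": json.loads(p)} for s, k, p in rows]
```

Nothing is ever updated in place, which is a deliberate choice in this sketch: an append-only log is what makes the reasoning auditable weeks later.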

Seam three is uncertainty. Models hallucinate; calibration is hard; analysts will trust a confident wrong answer over a hedged right one. The best interface move we made was showing the sources the agent actually read, not the ones it was told to read.
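
One way to keep "sources read" separate from "sources assigned" is to record reads as they happen and build citations only from that record. The `SourceLedger` class below is a hypothetical sketch of that idea, with made-up source identifiers; it is not how our interface is actually wired.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class SourceLedger:
    """Tracks which sources the agent was given versus which it actually opened."""
    assigned: list[str]
    read: list[str] = field(default_factory=list)

    def mark_read(self, source_id: str) -> None:
        # Call this from the retrieval tool itself, not from the model's own summary.
        if source_id not in self.read:
            self.read.append(source_id)

    def citations(self) -> list[str]:
        # Only sources that were actually opened are eligible to be shown as citations.
        return list(self.read)

    def unread(self) -> list[str]:
        # Assigned but never opened: worth surfacing as a gap rather than hiding.
        return [s for s in self.assigned if s not in self.read]


ledger = SourceLedger(assigned=["10-K", "Q3 call", "broker note"])
ledger.mark_read("Q3 call")
```

After this, `ledger.citations()` is `["Q3 call"]` and `ledger.unread()` is `["10-K", "broker note"]`, so the interface can show both what backed the answer and what was skipped.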

The loop itself is the part everyone draws first:

flowchart LR
  A[Plan] --> B[Act]
  B --> C[Observe]
  C --> D[Reflect]
  D --> A

But the seams — input, memory, uncertainty, recovery, audit — sit around this loop, not inside it. That is where almost all of the work has gone, and where almost all of the surprise has lived.

What I would do differently

I would start with the seams before the loop. I would ship a small, boring, deterministic version first — one that a human could have written — and earn the right to add the agent later. I would measure trust before I measured accuracy. And I would write more of this down, sooner.

#ai #engineering #production