What is agentic trading?

Agentic trading, in the disciplined sense: an AI agent like Claude composes a typed strategy graph at design time; a deterministic engine executes the compiled artifact at runtime. The agent helps you build the strategy. The agent does not trade your account.

By Keel Research Team · Updated May 20, 2026

"Agentic trading" started showing up as a search term in late 2025 and grew through early 2026 as TradeStation, Gemini, and several smaller venues shipped agent-integrated trading surfaces. The term is genuinely ambiguous. Half the people searching for it want an autonomous bot that trades their wallet while they sleep. The other half want Claude or Cursor to help them research, build, and validate a systematic strategy. These are very different products with very different risk profiles.

This page draws the line, then walks through what a real agentic-trading workflow looks like — where LLMs add value, where they introduce failure modes, and what to do about it. If you are looking for the protocol explainer, see what is MCP.

Two definitions of agentic

Loose definition (autonomous-bot frame). An LLM is in the runtime loop. It reads market data, decides when to enter and exit, sizes positions, and places orders. The user grants the agent ongoing access to their funds and trusts it to make trading decisions in real time. This is what most "AI trading bot" pitches mean. It is also the frame most prone to blowing up: LLMs hallucinate when asked to do deterministic math; they have no inherent grasp of funding accrual, fee terms, or Kelly position sizing; and they generate plausible-looking entries that have no statistical edge underneath. The autonomous wallet bots in this space (Senpi, Katoshi, OpenClaw, and various Telegram-distributed bots) have short track records.

Tight definition (strategy-builder frame). An LLM is at design time only. It collaborates with you to express a thesis as a typed component graph. The graph compiles to a deterministic artifact. The artifact runs in a backtest engine; the same artifact deploys live with bit-for-bit parity. The LLM does what LLMs are actually good at (composition, refactoring, summarization). The engine does what engines are good at (deterministic math, exact bookkeeping, reproducibility). The agent is not in the runtime.

Keel uses the tight definition. When you read "agentic trading on Keel", read it as: Claude helps you build a strategy; the compiled strategy trades.
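
The "graph compiles to a deterministic artifact" step in the tight definition can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the `Node` schema, the `compile_graph` helper, and the component names are hypothetical and are not Keel's DSL. The sketch only shows the general idea that a typed graph can be validated and serialized canonically, so the same graph always yields the same bytes and the same content hash.

```python
import hashlib
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class Node:
    """One component in a strategy graph (hypothetical schema)."""
    id: str
    kind: str            # e.g. "signal", "regime_gate", "sizer"
    params: tuple        # (name, value) pairs, kept immutable
    inputs: tuple = ()   # ids of upstream nodes


def compile_graph(nodes):
    """Validate the wiring, then return canonical bytes and a content hash."""
    ids = {n.id for n in nodes}
    for n in nodes:
        missing = [i for i in n.inputs if i not in ids]
        if missing:
            raise ValueError(f"node {n.id!r} references unknown inputs {missing}")
    # Sort nodes and keys so the same graph always serializes identically.
    payload = [asdict(n) for n in sorted(nodes, key=lambda n: n.id)]
    blob = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode()
    return blob, hashlib.sha256(blob).hexdigest()


# Illustrative components; names and parameters are made up.
carry = Node("carry", "signal", (("lookback_hours", 24),))
gate = Node("gate", "regime_gate", (("threshold", 0.0001),), inputs=("carry",))

blob_a, hash_a = compile_graph([carry, gate])
blob_b, hash_b = compile_graph([gate, carry])   # same graph, different order
assert hash_a == hash_b
```

The design point is that the artifact, not the conversation that produced it, is the unit you review, backtest, and deploy.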

The taxonomy of "agentic" approaches

Three frames in the wild right now. Each has different failure modes; understanding them is how you decide which to use.

1. Autonomous-bot agentic. The LLM makes runtime decisions on your funds. Products: autonomous wallet bots such as Senpi, Katoshi, OpenClaw, and various Telegram-distributed bots. Risk surface: hallucination at trade time (the model generates a position the math does not support), no deterministic backtest (the same prompt run twice produces different outputs), no funding model (perp carry is invisible to the LLM), and no audit trail you can replay deterministically. The category exists; the track records are short.

2. Code-generator agentic. The LLM writes Python that calls a broker REST API directly. Output: a bespoke trading script per strategy. Risk surface: drift between research script and live script (subtle differences in how prices are read, how funding is accrued, how slippage is modeled produce backtest-to-live divergence). No engine underneath to enforce consistency. Every strategy is its own snowflake. Works for one-off experiments; does not scale to a research practice.

3. Strategy-builder agentic. The LLM composes a typed strategy graph via MCP. The engine runs a deterministic backtest. The compiled strategy artifact (the same one that backtested) deploys to live execution with bit-for-bit parity. The agent is out of the runtime. Risk surface: the agent can still propose a bad strategy, but the engine math is exact; the live execution is the artifact you signed off on; and any discrepancy is a code bug, not a hallucination.
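
The "no funding model" gap in the first frame is easy to make concrete. Below is a minimal sketch of funding accrual on a perp position. It assumes a constant notional, one settlement per hour, and the convention that longs pay shorts when the rate is positive. The rates are made-up illustrative values, and `funding_paid` is a hypothetical helper, not part of any real engine.

```python
from decimal import Decimal


def funding_paid(notional_usd, hourly_rates, side):
    """Net funding paid by the position in USD (negative means received).

    Assumed convention: a positive rate means longs pay shorts.
    Notional is held constant, which a real engine would not assume.
    """
    if side not in ("long", "short"):
        raise ValueError("side must be 'long' or 'short'")
    sign = Decimal(1) if side == "long" else Decimal(-1)
    return sum((sign * notional_usd * rate for rate in hourly_rates), Decimal(0))


rates = [Decimal("0.0000125")] * 24   # illustrative values, not market data
long_cost = funding_paid(Decimal("10000"), rates, "long")
short_cost = funding_paid(Decimal("10000"), rates, "short")

assert long_cost == Decimal("3")     # the long pays 3 USD over the day
assert short_cost == Decimal("-3")   # the short receives the same amount
```

Using `Decimal` rather than floats keeps the bookkeeping exact, which is the property an LLM generating numbers at trade time cannot offer.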

Why the third frame matters

The split is empirical. LLMs are sharp at things that have lots of valid solutions: prose, code composition, refactoring, summarization, "what does this do?" explanations. LLMs hallucinate on things that have a single correct answer: arithmetic, deterministic state, exact accounting. Trading hits both categories. Composing a thesis (what signals to combine, what regime to gate on, what to vol-target) is the soft side — there are many reasonable answers, and a good LLM can iterate quickly. Computing P&L is the hard side — there is exactly one correct answer, and an LLM that hallucinates a fill price by 5 bps is breaking your live execution.
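
To see why a 5 bps error matters, here is the arithmetic as a small sketch. The prices and quantity are invented for illustration, and fees and funding are left out to keep the example short.

```python
from decimal import Decimal

BPS = Decimal("0.0001")


def pnl_usd(entry, exit_price, qty):
    """Gross P&L of a long position; fees and funding omitted for brevity."""
    return (exit_price - entry) * qty


qty = Decimal("500")
true_entry = Decimal("100")
wrong_entry = true_entry * (1 + 5 * BPS)   # entry misstated 5 bps too high
exit_price = Decimal("101")

true_pnl = pnl_usd(true_entry, exit_price, qty)
wrong_pnl = pnl_usd(wrong_entry, exit_price, qty)
error = true_pnl - wrong_pnl

assert true_pnl == Decimal("500")
assert wrong_pnl == Decimal("475")
assert error == Decimal("25")   # 5 bps of the 50,000 USD entry notional
```

On a 1% winning trade, a 5 bps misstatement of the entry removes 5% of the reported profit. There is one right answer, and only exact bookkeeping finds it.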

The strategy-builder frame is the only one that lines up cleanly with this split. Soft work to the LLM; hard work to the engine. The other two frames put the LLM in charge of work it is not equipped for.

A worked example

A user asks an MCP-capable agent (Claude Code, Cursor, Codex CLI — any of them) to compose a funding-carry strategy with a funding-level regime gate, and backtest it from August 2024 to April 2026. The agent searches the component library, finds the carry signal and the funding-level regime component, composes the typed DSL graph, submits it via keel_strategy_compose, and triggers keel_backtest_run.

The engine runs deterministically. Result: Sharpe 2.17, +79.6% total return, −9.7% max drawdown, over 2024-08-15 to 2026-04-30 on Hyperliquid perps with real funding, real fees, and modeled slippage. Full tearsheet: app.usekeel.io/share/gDXjURKqWPs8CZ4eXdqAI.

The crucial point: the agent did not compute the Sharpe. The agent composed the strategy. The engine computed the Sharpe from a serialized strategy artifact. If you re-run the same artifact, you get the same Sharpe — bit-for-bit. If you deploy the same artifact live, it places the same trades the backtest placed (modulo unavoidable real-world slippage). The agent’s output is fully auditable, fully reproducible, and fully separable from the math.
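
The reproducibility claim can be sketched with a toy engine. This is not Keel's engine: `run_backtest`, the artifact fields, and the return series are hypothetical, and the Sharpe formula assumes a zero risk-free rate. The point it illustrates is that when every input comes from the serialized artifact, re-running it yields identical output.

```python
import json
import math
import statistics


def sharpe(returns, periods_per_year):
    """Annualized Sharpe ratio, assuming a zero risk-free rate."""
    mean = statistics.fmean(returns)
    stdev = statistics.stdev(returns)   # sample standard deviation
    return mean / stdev * math.sqrt(periods_per_year)


def run_backtest(artifact_json):
    """Toy engine: every input comes from the serialized artifact."""
    artifact = json.loads(artifact_json)
    weight = artifact["weight"]
    strat_returns = [weight * r for r in artifact["bar_returns"]]
    total = math.prod(1.0 + r for r in strat_returns) - 1.0
    return {
        "sharpe": sharpe(strat_returns, artifact["periods_per_year"]),
        "total_return": total,
    }


# Illustrative artifact; the numbers are made up.
artifact_json = json.dumps(
    {
        "weight": 0.5,
        "periods_per_year": 365,
        "bar_returns": [0.01, -0.004, 0.007, 0.002, -0.001],
    },
    sort_keys=True,
)

first = run_backtest(artifact_json)
second = run_backtest(artifact_json)
assert first == second   # same artifact in, same numbers out
```

A production engine has more to pin down than this (data snapshots, library versions, floating-point evaluation order), but the contract is the same: no hidden inputs, no sampling, no model call.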

A backtest is not a robustness proof; see the robustness checklist for the full pre-deploy diligence. Read the share URL as "the engine produces real numbers on a real strategy", not "this strategy is ready for live capital".

What an agentic-trading workflow looks like end-to-end

The five-step flow:

  1. Compose. You describe a thesis in natural language to the agent. The agent searches the component library, picks the relevant signals + regime gates + sizing primitives, and authors a typed DSL graph. You review the graph (it is human-readable).
  2. Backtest. The agent submits the strategy and calls the backtest engine. The engine runs deterministically against real Hyperliquid funding + price history. Results come back as a structured tearsheet: Sharpe, drawdown, hit rate, decomposed P&L by signal, exposure by asset.
  3. Iterate. You ask the agent to adjust: different lookback, different gate threshold, different vol target. The agent forks the strategy, re-runs the backtest, and compares. This is the loop the LLM is genuinely good at.
  4. Deploy. Once a candidate clears your pre-deploy checklist, you stage it for live execution. The compiled artifact moves to the execution layer. The agent is no longer in the path; the strategy runs on its own schedule.
  5. Audit. Every live trade is recorded with a deterministic replay artifact. If something looks wrong, you (or the agent) can replay the exact bar through the engine and see the signal values, sizing decisions, and fill simulations side by side with the live fill.

Steps 1-3 are agent-driven. Steps 4-5 are deterministic. Both halves are doing what they are good at.
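
The audit step lends itself to a small sketch: compare each live fill with the price the replay produced and flag the ones outside a tolerance. The record fields, the `audit` helper, and the 10 bps tolerance are all assumptions for illustration, not a description of Keel's replay format.

```python
def slippage_bps(replay_price, live_price, side):
    """Signed cost of the live fill versus the replayed price, in bps.

    Positive means the live fill was worse than the replay expected.
    """
    direction = 1 if side == "buy" else -1
    return direction * (live_price - replay_price) / replay_price * 10_000


def audit(trades, tolerance_bps=10.0):
    """Return ids of trades whose live fill deviates beyond the tolerance."""
    return [
        t["id"]
        for t in trades
        if abs(slippage_bps(t["replay_price"], t["live_price"], t["side"])) > tolerance_bps
    ]


# Made-up trade records for illustration.
trades = [
    {"id": "t1", "side": "buy", "replay_price": 100.0, "live_price": 100.05},
    {"id": "t2", "side": "sell", "replay_price": 200.0, "live_price": 199.5},
]

flagged = audit(trades)
assert flagged == ["t2"]   # t1 is about 5 bps off, t2 is 25 bps off
```

Because the replay side is deterministic, any flagged trade points to execution conditions or a code bug rather than to an unrepeatable model output.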

What this surface does NOT promise

If a product page sells you any of the following, it is over-claiming what currently ships:

  • "AI portfolio manager." No LLM should be making ongoing portfolio decisions on your funds without a deterministic strategy underneath. The "manager" is the compiled strategy, not the LLM.
  • "Set-and-forget agent." Strategies need monitoring even when the LLM is not in the loop. Markets change. Regimes shift. A strategy that worked in backtest can degrade live. "Agent does it for you forever" is not a thing.
  • "Vibe trading", "AI-powered alpha", "agent-managed wallet." Vocabulary that signals the autonomous-bot frame.

What Keel does ship: a deterministic backtest engine on Hyperliquid history with real fees, funding, and slippage; 182 typed DSL components; an OAuth-secured MCP server for agent-driven composition; bit-for-bit backtest-to-live parity; and non-custodial Hyperliquid (HL) execution via native delegated signing. The verified backtest above is a real worked example.

The install

If you want to try the strategy-builder frame yourself, the install is two commands:

pipx install keel-trade
claude mcp add keel -- keel mcp serve

The first command installs the keel-trade CLI from PyPI; the second registers the local stdio MCP server with Claude Code (Cursor, Windsurf, and Codex have equivalent one-liners). The first call triggers an OAuth login at app.usekeel.io. After that, the agent has access to all MCP tools: search components, compose strategies, run backtests, summarize results, fork, diff.

For agent-specific guides (per-host setup, prompt patterns, the tutorial flow), see the Keel MCP product page and build a trading bot with Claude on Hyperliquid.

This article is educational. A passing backtest is not a guarantee of live performance — markets change, regimes shift, and no historical validation forecasts the next structural break. Before deploying a strategy, see the robustness checklist. References: Model Context Protocol spec at modelcontextprotocol.io.
FAQ

Agentic trading — questions

What does "agentic trading" actually mean?

The term is used two ways. Loose definition: an AI agent participates in any part of the trading workflow — idea generation, code drafting, runtime decisions. Tight definition: an agent composes a strategy that runs deterministically without the agent in the runtime loop. Keel uses the tight definition. The agent is a collaborator at design time; the compiled strategy executes on its own schedule, deterministically, against the same engine that backtested it.

Can Claude trade my account autonomously?

No. With Keel, Claude composes strategies via the MCP server and runs backtests on your behalf. Live trading is gated behind two independent locks: an explicit OAuth scope (sign in with `keel auth login --scope live`) and a local arming step on your machine (`keel arm live set`). Even after both locks open, the agent's job is to compose and stage; the compiled strategy (a versioned artifact) is what trades, on a schedule you authored, with no LLM in the runtime path. If a product promises you "let Claude run your account", it is using the loose definition of agentic; read the failure modes in the taxonomy section of this page carefully.
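
The two-lock idea can be sketched generically. This is an illustration of the pattern only: the `Session` type and `may_stage_live` are hypothetical names, and nothing here describes how Keel implements its checks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Session:
    """Hypothetical view of what has been granted so far."""
    scopes: frozenset      # scopes granted by the OAuth login
    armed_locally: bool    # set by a separate step on the user's machine


def may_stage_live(session):
    """Both locks must be open; either one alone is not enough."""
    return "live" in session.scopes and session.armed_locally


scope_only = Session(frozenset({"backtest", "live"}), armed_locally=False)
armed_only = Session(frozenset({"backtest"}), armed_locally=True)
both = Session(frozenset({"backtest", "live"}), armed_locally=True)

assert not may_stage_live(scope_only)
assert not may_stage_live(armed_only)
assert may_stage_live(both)
```

Keeping the locks independent means a leaked token or a stray local flag cannot enable live staging on its own.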

Which agents are "agentic" in the deterministic sense?

Any LLM host that speaks MCP — Claude Code, Cursor, Windsurf, Codex CLI, Gemini CLI, Claude Desktop, Cline — can drive Keel in the strategy-builder sense. The host’s job ends when the strategy is composed, backtested, and deployed; from that point, the compiled strategy runs without the host. So in the deterministic frame, the universe of agentic-compatible hosts is large; what matters is which servers (like Keel’s) expose a deterministic engine underneath.

Does the agent need to be online while my strategy trades?

No. Once a strategy is composed and deployed, the compiled artifact runs on Keel's execution layer on its own schedule. The agent (Claude, Cursor, whatever) does not need to be open, connected, or even installed. This is the central architectural decision behind "agent as builder, engine as executor" — putting an LLM in the runtime loop introduces latency, cost, and a hallucination surface that has no business being there.

What kinds of strategies work well in this paradigm?

Anything that can be expressed as a typed component graph: cross-sectional momentum, funding-carry, regime-gated variants, volatility-targeted baselines, mean-reversion overlays, multi-signal mixes. The 182-component DSL covers most published systematic crypto strategies. What does not fit: discretionary judgment calls (no, the agent should not be making them live), strategies that require text comprehension of news at trade time (the engine does not run an LLM), and strategies that need millisecond-latency execution decisions (the architecture is bar-clocked, not tick-clocked).
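
The bar-clocked versus tick-clocked distinction can be shown with a short sketch. It assumes integer-second timestamps and a fixed bar length; `bar_closes` is a hypothetical helper and the ticks are invented. A bar-clocked strategy sees one closing price per bar and decides once per bar, however many ticks arrived inside it.

```python
def bar_closes(ticks, bar_seconds):
    """Collapse (timestamp, price) ticks into one closing price per bar."""
    closes = {}
    for ts, price in sorted(ticks):
        closes[ts // bar_seconds] = price   # the last tick in each bar wins
    return [closes[bar] for bar in sorted(closes)]


# Five made-up ticks spanning two 60-second bars.
ticks = [(0, 100.0), (20, 101.0), (59, 100.5), (60, 102.0), (119, 101.5)]
closes = bar_closes(ticks, bar_seconds=60)

assert closes == [100.5, 101.5]   # two decisions, not five
```

A strategy whose edge depends on reacting to the tick at second 20 cannot be expressed on this clock, which is why millisecond-latency strategies fall outside the paradigm.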

How do I get started?

Two commands. `pipx install keel-trade` installs the CLI from PyPI; `claude mcp add keel -- keel mcp serve` registers the local stdio MCP with Claude Code. On first call, `keel auth login` runs an OAuth flow against app.usekeel.io and persists tokens locally. After that, Claude can call all MCP tools — search components, compose strategies, run backtests, summarize results. See usekeel.io/docs/getting-started for the full flow including Cursor / Windsurf / Codex configs.