Methodology

How to backtest a Hyperliquid strategy

A defensible Hyperliquid backtest needs four pieces: clean perp data with funding, a portfolio simulator that respects fees and slippage, an honest cost model, and validation beyond a single in-sample window. Most public 'HL backtests' today are 15-day parameter-grid demos — this page is the working recipe instead.

By Keel Research Team · Updated May 18, 2026

What “backtest a HL strategy” actually means

A Hyperliquid backtest is four problems stacked: a data problem (which perps, which timeframe, which funding cadence, how to handle delistings and newer listings), a simulator problem (how cash, positions, and funding accumulate bar by bar), a cost problem (maker/taker fees, slippage, funding P&L), and a validation problem (was the result a property of the strategy or a property of the window). Skip any one and the equity curve is a fiction.

Most public “HL backtests” you see in tweets and landing pages are a single 15-day parameter sweep on three symbols. They optimize fees-and-funding away and call the result a strategy. That is a demo, not a backtest. The rest of this page is the recipe for the other one.

Data: which markets, which timeframe

Hyperliquid lists roughly 220 perpetual contracts. Keel ingests 15-minute OHLCV bars for every active perp and 1-hour funding-rate bars for the same universe. That is the granularity the simulator works in. Higher-frequency data (1m, tick) exists upstream but is not used in the backtest graph — most cross-sectional and trend signals over a multi-week horizon are insensitive to sub-15m noise and over-fit easily on tick data.

Series | Cadence | Coverage
Perp OHLCV (close, high, low, volume) | 15-minute bars | ~220 active perps
Funding rate | 1-hour bars | Same universe, paid hourly
Open interest | 1-hour bars | Same universe
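Because prices arrive as 15-minute bars and funding as 1-hour bars, the two series have to share one grid before a bar-by-bar simulator can consume them. The snippet below is a minimal sketch of one way to do that, not Keel's ingestion code. It assumes bar timestamps are Unix seconds, that funding is charged on the bar whose timestamp lands exactly on the hour, and that a missing funding print counts as zero. The function name and the dict layout are hypothetical.

```python
from typing import Dict, List

BAR_SECONDS = 15 * 60
HOUR_SECONDS = 60 * 60


def align_funding_to_bars(
    bar_times: List[int], hourly_funding: Dict[int, float]
) -> List[float]:
    """Return one funding rate per 15-minute bar.

    The rate is nonzero only on bars whose timestamp lands on an hour that
    has a funding print. The other three bars in each hour get 0.0, so an
    hourly payment is charged once rather than four times.
    """
    aligned = []
    for ts in bar_times:
        if ts % HOUR_SECONDS == 0:
            # Assumption: a missing print is treated as zero funding.
            aligned.append(hourly_funding.get(ts, 0.0))
        else:
            aligned.append(0.0)
    return aligned


# Eight 15-minute bars covering two hours (illustrative timestamps and rates).
bar_times = [BAR_SECONDS * i for i in range(1, 9)]
hourly_funding = {3600: 0.0000125, 7200: -0.00002}
funding_per_bar = align_funding_to_bars(bar_times, hourly_funding)
print(funding_per_bar)
```

Forward-filling the hourly rate onto all four bars instead would quadruple the funding charge, which is the mistake this layout is meant to avoid.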

History caveat. Older HL perps go back to mid-2023. Newer listings have only 6-12 months. A universe filter that requires “12 months of bars” cuts the universe roughly in half and biases you toward survivors: assets that listed early, didn’t delist, and are usually larger. Decide up front whether you want that survivorship bias or want the broader universe with a shorter history per name.
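To see how much a history requirement shrinks the universe, it helps to count survivors at more than one threshold before committing to a filter. A small sketch follows; the symbols and bar counts are made up for illustration and are not Hyperliquid data, and the function name is hypothetical.

```python
from typing import Dict, List

BARS_PER_DAY = 24 * 4  # number of 15-minute bars in a day


def filter_by_history(bar_counts: Dict[str, int], min_days: int) -> List[str]:
    """Keep symbols that have at least `min_days` worth of 15-minute bars."""
    min_bars = min_days * BARS_PER_DAY
    return sorted(s for s, n in bar_counts.items() if n >= min_bars)


# Hypothetical bar counts per symbol.
bar_counts = {
    "AAA": 700 * BARS_PER_DAY,
    "BBB": 400 * BARS_PER_DAY,
    "CCC": 250 * BARS_PER_DAY,
    "DDD": 190 * BARS_PER_DAY,
}
strict = filter_by_history(bar_counts, min_days=365)
broad = filter_by_history(bar_counts, min_days=180)
print(len(strict), "of", len(bar_counts), "pass the 12-month filter")
print(len(broad), "of", len(bar_counts), "pass the 6-month filter")
```

Running the same strategy on both universes and comparing the results is a cheap way to gauge how much of the headline metric depends on the survivor set.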

The portfolio simulator

Keel’s production-grade portfolio simulator, the same engine that drives live execution, walks the (bars × assets) close and weight matrices bar by bar. At every bar it computes target notionals from weights, derives the orders required to move from current positions to target, applies fees and slippage on the order size, and updates cash, positions, and cumulative funding P&L.

# per-bar accounting (conceptual); per-asset quantities are vectors, summed where noted
target_value    = portfolio_value[t-1] * weights[t]
target_position = target_value / close[t]
delta           = target_position - position[t-1]
fill_price      = close[t] * (1 + sign(delta) * slippage)   # buys fill higher, sells lower
notional        = abs(delta) * fill_price
cash[t]         = cash[t-1] - sum(delta * fill_price) - sum(notional) * fees
# cumulative funding; longs pay when the rate is positive
funding_pnl[t]  = funding_pnl[t-1] + sum(position[t-1] * close[t] * -funding_rate[t])
position[t]     = position[t-1] + delta
portfolio_value[t] = cash[t] + sum(position[t] * close[t]) + funding_pnl[t]

Three properties matter. First, funding is integrated into equity bar by bar — it is not bolted on afterward as a flat haircut. Second, the simulator supports a buffered rebalance mode: when actual-vs-target drift is below a configured threshold no order fires, which collapses turnover on slow-moving signals. Third, the engine produces a decomposed equity curve — price-only, funding-only, and combined — on the same run, so attribution is an output of the simulator, not a post-hoc estimate.
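The three properties can be illustrated with a runnable single-asset version of the per-bar accounting. This is a simplified sketch under stated assumptions, not Keel's engine: it handles one asset instead of a matrix, carries funding as a running total, expresses the rebalance buffer as a weight-drift threshold, and reports price-only equity as cash plus marked position, which is one simple decomposition among several possible ones. It follows the conceptual accounting in treating the position as paid from cash and does not model margin. All names and inputs are illustrative.

```python
from typing import Dict, List


def simulate(
    close: List[float],
    weights: List[float],
    funding_rate: List[float],
    initial_cash: float = 10_000.0,
    fee: float = 0.00045,       # 4.5 bps taker fee on filled notional
    slippage: float = 0.00045,  # 4.5 bps applied to the fill price
    buffer: float = 0.0,        # minimum weight drift that triggers an order
) -> Dict[str, list]:
    cash, position, cum_funding = initial_cash, 0.0, 0.0
    equity = initial_cash
    out: Dict[str, list] = {
        "price_only": [], "funding_only": [], "combined": [], "traded": []
    }

    for t in range(len(close)):
        # Funding accrues on the position carried into the bar; longs pay
        # when the rate is positive.
        cum_funding += position * close[t] * -funding_rate[t]

        # Size the target from the previous bar's equity.
        target_position = equity * weights[t] / close[t]
        current_weight = position * close[t] / equity if equity else 0.0
        delta = target_position - position

        # Buffered rebalance: skip the order while drift is inside the buffer.
        if abs(weights[t] - current_weight) < buffer:
            delta = 0.0

        traded = delta != 0.0
        if traded:
            side = 1.0 if delta > 0 else -1.0
            fill_price = close[t] * (1.0 + side * slippage)
            notional = abs(delta) * fill_price
            cash -= delta * fill_price + notional * fee
            position += delta

        price_only = cash + position * close[t]
        equity = price_only + cum_funding
        out["price_only"].append(price_only)
        out["funding_only"].append(cum_funding)
        out["combined"].append(equity)
        out["traded"].append(traded)
    return out


# Hypothetical inputs: flat price, fully long, one funding print on bar 1.
result = simulate(
    close=[100.0, 100.0, 100.0],
    weights=[1.0, 1.0, 1.0],
    funding_rate=[0.0, 0.001, 0.0],
    fee=0.0,
    slippage=0.0,
)
print(result["funding_only"])
print(result["combined"])
```

With a flat price and zero costs, the only thing that moves combined equity in this example is the funding payment, which makes the attribution easy to check by hand. Passing a nonzero `buffer` suppresses the small rebalance that the funding payment would otherwise trigger on the final bar.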

Costs: fees, slippage, funding

Hyperliquid charges 2 bps maker / 4.5 bps taker on perp trades for unstaked accounts, with maker rebates and tier discounts from there. The simulator defaults to the full 4.5 bps taker fee on every order, plus another 4.5 bps in slippage: the slippage adjusts the fill price and the fee is charged on the filled notional, for roughly 9 bps of cost per side. That is intentionally conservative for top-of-book perps and roughly calibrated to live fills on the ~50 most-liquid HL pairs.

Cost component | Default | When to override
Taker fee | 4.5 bps per side | Lower for staked tiers; higher for non-USDC margin
Slippage | 4.5 bps per side | Raise for thin names or orders > $50k notional
Funding | Modeled from 1h funding series | No override; paid every hour to current position
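A quick way to judge whether these defaults matter for a given strategy is to turn them into an annual drag. The sketch below uses the default 4.5 bps fee and 4.5 bps slippage from the table; the turnover figure, position size, and hourly funding rate in the example are illustrative assumptions, and the function names are hypothetical.

```python
TAKER_FEE_BPS = 4.5
SLIPPAGE_BPS = 4.5


def cost_per_side_bps(
    fee_bps: float = TAKER_FEE_BPS, slippage_bps: float = SLIPPAGE_BPS
) -> float:
    """Total trading cost per side, in basis points of traded notional."""
    return fee_bps + slippage_bps


def annual_cost_drag(turnover_per_year: float, per_side_bps: float) -> float:
    """Fraction of equity lost to trading costs per year.

    turnover_per_year is one-way traded notional divided by equity,
    e.g. 50.0 means the book is traded 50 times over in a year.
    """
    return turnover_per_year * per_side_bps / 10_000.0


def funding_cost(notional: float, hourly_rate: float, hours: int) -> float:
    """Funding paid by a long at a constant hourly rate (positive = cost)."""
    return notional * hourly_rate * hours


per_side = cost_per_side_bps()
print("cost per side (bps):", per_side)
print("round trip (bps):", 2 * per_side)
# Assumed turnover of 50x per year.
print("annual drag:", annual_cost_drag(50.0, per_side))
# Assumed $10k long held for 24 hours at an hourly rate of 0.00125%.
print("funding paid ($):", funding_cost(10_000.0, 0.0000125, 24))
```

At the defaults, a strategy that turns its book over 50 times a year gives up about 4.5% of equity to trading costs before funding, which is why high-turnover signals need the buffered rebalance mode or a larger gross edge.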

For sizing slippage realistically, the rule of thumb: model 1 bp per ~$10k of taker notional on the top 30 HL perps, 1 bp per ~$2-3k on names ranked 50-100, and refuse to backtest names outside the top 150 at material size. Slippage is the single easiest place to manufacture a fake edge — start strict and loosen only with live evidence.
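The rule of thumb can be written as a small lookup. This is a sketch with several labeled assumptions: the text gives rates only for the top 30 and for ranks 50-100, so the rates used for ranks 31-49 and 101-150 are placeholders chosen here, the $2.5k figure is the midpoint of the stated $2-3k range, and the 4.5 bps floor simply reuses the simulator default. The function name is hypothetical.

```python
from typing import Optional

DEFAULT_SLIPPAGE_BPS = 4.5  # simulator default, used here as a floor


def slippage_bps(rank: int, notional_usd: float) -> Optional[float]:
    """Estimate per-side slippage in bps from liquidity rank and order size.

    Returns None for names outside the top 150, signalling that the order
    should not be backtested at material size.
    """
    if rank < 1:
        raise ValueError("rank is 1-based")
    if rank <= 30:
        usd_per_bp = 10_000.0  # from the text: 1 bp per ~$10k
    elif rank <= 100:
        # From the text for ranks 50-100 (midpoint of $2-3k).
        # Assumption: ranks 31-49 use the same, stricter rate.
        usd_per_bp = 2_500.0
    elif rank <= 150:
        # Assumption: the text gives no rate here; use the strict end.
        usd_per_bp = 2_000.0
    else:
        return None
    return max(DEFAULT_SLIPPAGE_BPS, notional_usd / usd_per_bp)


print(slippage_bps(10, 100_000.0))
print(slippage_bps(75, 25_000.0))
print(slippage_bps(200, 1_000.0))
```

Because the estimate scales with order size, it only makes a difference if the simulator is fed a per-order value; with a single flat setting, the conservative choice is the estimate for the largest order the strategy is expected to send.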

Validation: what’s enough, what’s not

A backtest on a single window is one observation. It tells you whether the strategy could have worked once, not whether it generalizes. The minimum bar for considering a result real:

  • Hold out the last 3-6 months. Fit parameters on the older slice, score on the newer one. Reject if out-of-sample Sharpe drops below half of in-sample.
  • Score across regimes. Hyperliquid has had at least three distinct funding/vol regimes since launch; if your strategy only earns in one of them, that’s a regime bet, not an edge.
  • Sanity-check parameter sensitivity. Vary each parameter ±25% and look at metric stability. If a 10% shift in a window length flips the Sharpe sign, the apparent edge is fit noise.
  • Compare to the dumb benchmark. Equal-weight long-only on the same universe with the same fees is the floor; the strategy needs to clear it on risk-adjusted return, not on absolute return alone.

Walk-forward optimization (rolling fit/test windows) and Monte Carlo bootstrap on returns are stronger tools — both are on the Keel roadmap and are covered separately. Until they ship in-platform, the four bullets above are the practical minimum.
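The hold-out rule and the sensitivity check from the list above reduce to a few lines once per-bar returns are available. The sketch below assumes 15-minute returns, a zero risk-free rate, and annualization over a 365-day year; the return series are synthetic and the function names are hypothetical.

```python
import math
import statistics
from typing import Sequence

BARS_PER_YEAR = 365 * 24 * 4  # 15-minute bars; perps trade around the clock


def sharpe(returns: Sequence[float], periods_per_year: int = BARS_PER_YEAR) -> float:
    """Annualized Sharpe ratio of per-bar returns, risk-free rate assumed zero."""
    if len(returns) < 2:
        return 0.0
    sd = statistics.stdev(returns)
    if sd == 0.0:
        return 0.0
    return statistics.mean(returns) / sd * math.sqrt(periods_per_year)


def passes_holdout(
    in_sample: Sequence[float],
    out_of_sample: Sequence[float],
    min_ratio: float = 0.5,
) -> bool:
    """Reject when out-of-sample Sharpe falls below half of in-sample."""
    is_sharpe = sharpe(in_sample)
    if is_sharpe <= 0.0:
        return False  # nothing worth validating
    return sharpe(out_of_sample) >= min_ratio * is_sharpe


def sensitivity_stable(base_sharpe: float, perturbed: Sequence[float]) -> bool:
    """True when no perturbed-parameter run flips the sign of the Sharpe."""
    return all(s * base_sharpe > 0.0 for s in perturbed)


# Synthetic per-bar returns, for illustration only.
fit_returns = [0.01, -0.005, 0.012, 0.003] * 10
weak_holdout = [0.001, -0.005, 0.002, 0.001] * 10
print(passes_holdout(fit_returns, fit_returns))
print(passes_holdout(fit_returns, weak_holdout))
print(sensitivity_stable(1.2, [0.9, -0.3]))
```

A Sharpe estimated on a 3-6 month window is itself noisy, so the 50% threshold works better as a rejection filter than as evidence that a strategy passing it is sound.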

An end-to-end example

The fastest path is the Keel web app — compose the pipeline in the visual builder, click Run Backtest, set dates, and read the report. The backtest detail page shows the decomposed equity curve (price-only, funding-only, combined), per-bar target weights, fills, and the full metric set. Share the tearsheet by URL or fork it back into another account in one click.

Or fork a verified backtest as a starting point — the funding-carry tearsheet lands in your account ready to edit in the visual builder, with all the same components, parameters, and sizing rules used in the methodology above.

Driving Keel from a terminal or an AI agent? pipx install keel-trade puts the keel CLI on your PATH; the CLI reference covers strategy create, backtest run, and backtest results. Same engine, same data, same decomposed series.

Common failure modes

  • Survivorship bias. Filtering for assets with 12+ months of history drops everything that delisted. Whatever Sharpe you compute on that universe is conditional on surviving — adjust expectations downward.
  • Look-ahead in signal construction. Computing a 30-day vol on close-of-bar t and trading on the same bar is a free 1-bar lookahead. Always lag signals by 1 bar before they feed the weight aggregator.
  • Funding ignored on long-bias strategies. During positive-funding regimes, longs pay funding every hour, and the cumulative cost of holding through 4-8 hourly funding payments can dwarf the price P&L. A price-only backtest looks great; the live equity curve is a cliff.
  • No live parity. If the live trading code is a separate hand-written implementation, the backtest is testing a strategy that does not exist. Keel runs the same pipeline graph in backtest, paper, and live — this is the structural fix.
  • Regime overfit. Strategies fit only on the 2023-2024 bull-funding regime usually break in the neutral-funding stretches of 2025. Always score across at least two distinct funding/vol regimes before sizing up.
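The look-ahead fix from the list above is mechanical: compute the signal through the close of bar t, then shift it so that it is first usable at bar t+1. A minimal sketch with plain lists follows; the helper names are hypothetical and the window is shortened from 30 days to 3 bars so the output is easy to read.

```python
import statistics
from typing import List, Optional


def rolling_std(returns: List[float], window: int) -> List[Optional[float]]:
    """Rolling volatility that uses data up to and including bar t."""
    out: List[Optional[float]] = []
    for t in range(len(returns)):
        if t + 1 < window:
            out.append(None)  # not enough history yet
        else:
            out.append(statistics.pstdev(returns[t + 1 - window : t + 1]))
    return out


def lag(signal: List[Optional[float]], bars: int = 1) -> List[Optional[float]]:
    """Shift a signal forward so the value from bar t is first seen at t+bars."""
    if bars < 0:
        raise ValueError("bars must be non-negative")
    if bars == 0:
        return list(signal)
    return [None] * min(bars, len(signal)) + list(signal[:-bars])


# Illustrative per-bar returns.
returns = [0.01, -0.02, 0.015, 0.0, -0.005]
vol_at_close = rolling_std(returns, window=3)
tradable_vol = lag(vol_at_close, bars=1)
print(vol_at_close)
print(tradable_vol)
```

Only `tradable_vol` should feed the weight aggregator; the unlagged series contains information that was not available when the order for that bar would have been placed.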

Where Keel fits

The point of running this methodology on Keel rather than wiring it up from scratch is structural: the same pipeline graph runs live. A backtest is not a separate codepath — the pipeline you author feeds either the portfolio simulator or the live HL adapter, with no reimplementation in between. That eliminates the most common class of backtest-vs-live drift, which is reimplemented signal logic.

On top of that you get a deep component library covering signal generation, regime gates, cross-sectional aggregation, vol targeting, and buffered rebalancing; HL-native data ingestion; and the Keel web app as the primary surface (visual builder, backtest UI, share links). Terminal and AI-agent users can hit the same backtest engine through the keel-trade CLI. None of that is novel research on its own; the compounding value is that all four pieces of the recipe above are baked into the same engine.

Try it

Open the Keel app, build the pipeline in the visual builder, and run a backtest against your Hyperliquid account. Terminal and AI-agent users can drive the same backtest from the keel-trade CLI — same pipeline, same simulator, same costs.

FAQ

Common questions

How much price history do I need?

A reasonable minimum is two full regime cycles — for Hyperliquid that means at least 12 months of 15-minute bars and 1-hour funding, ideally back to launch (June 2023) for the assets that have it. Newer listings (e.g. tokens launched in 2025) only have ~6-9 months of history; treat any single-window result on those as an indication, not proof.

What about walk-forward optimization?

Keel ships single-window parameter optimization today; rolling and anchored walk-forward are on the roadmap. Until then, validate by holding out the most recent 3-6 months as a manual out-of-sample window — run the backtest with an earlier end-date to fit, then re-run on the held-out window. Refuse to ship if Sharpe collapses by more than ~50%.

How realistic is the slippage model?

Default is a flat 4.5 bps per side, applied symmetrically to entries and exits. That matches Hyperliquid taker fees plus a thin top-of-book spread for the top ~50 perps. For thinner names or for orders larger than $50k notional, you should raise it — slippage scales with order size relative to displayed depth, which the simulator does not model directly.

Can I model cross-venue funding spreads (Binance vs HL, etc.)?

No — Keel currently backtests funding on Hyperliquid only. Cross-venue spread strategies are out of scope. If you want to research the spread itself you can pull other-venue funding manually, but the simulator only accepts a single funding-rate series tied to the HL price series it is trading.

Can I export results?

Yes. The backtest detail page in the Keel app exposes the equity curve, per-bar position weights, fills, and metrics; the same payload is available as JSON via the API and the keel-trade CLI for terminal or AI-agent workflows.

Can I deploy directly from a backtest?

Yes. The pipeline you backtest is the same pipeline that runs live — same components, same parameters, same vol-targeting and buffered-rebalancing logic. After a backtest you can promote a strategy to paper or live mode against your Hyperliquid account; nothing about the pipeline changes between modes.

How many assets can I backtest at once?

The Keel simulator scales to the full ~220-perp HL universe in a single backtest. Most workflows run a top-30 or top-60 dynamic universe selected by volume or open interest; running all ~220 is fine for breadth research but inflates turnover and noise.

What differs from QuantConnect or VectorBT?

Three things. (1) Native Hyperliquid data and funding cadence baked in, with no adapter layer. (2) The same pipeline graph runs live, so backtest-to-live parity is structural rather than the product of a reimplementation. (3) The component library (199 registered components) is built specifically for perps research, not general equities. Trade-off: no support for other venues and a smaller community than the generalist platforms.