Methodology

How to backtest a perp strategy

Perp backtests differ from spot in three places: funding is a separate cashflow, leverage geometry is nonlinear, and liquidation is a discrete cliff. This page walks the four pieces of the recipe with a concrete Hyperliquid example — momentum on top-30 perps with vol targeting and a carry overlay.

By Keel Research Team · Updated May 18, 2026

Why perp backtests differ from spot

A spot backtest has one accounting line: price moves, P&L accrues, end of story. A perp backtest has three more. First, funding is a separate cashflow paid on notional every funding interval — on Hyperliquid, hourly. Holding a 1x long through 720 funding intervals at +0.01% per hour costs ~7% of notional in a month, independent of price. Second, leverage geometry is nonlinear: a 10% adverse price move on a 5x book is a 50% equity drawdown, and the next 10% move is computed off a smaller base. Third, liquidation is a cliff, not a slope — once equity hits maintenance margin, the position is closed at the worst possible price by the exchange, and there is no recovery.
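The arithmetic behind the first two points can be checked in a few lines. This is a minimal sketch, not Keel code: the helper name `equity_path` is hypothetical, funding is summed without compounding, and leverage is assumed to be reset to the same multiple before each move.

```python
def equity_path(leverage, price_returns, start_equity=1.0):
    """Equity after each price return.

    Assumption for illustration: leverage is reset to the same multiple
    before every move, so each loss is taken off the reduced equity base.
    """
    equity = start_equity
    path = []
    for r in price_returns:
        # Equity return is leverage times price return; floor at zero.
        equity = max(equity * (1.0 + leverage * r), 0.0)
        path.append(equity)
    return path


# Funding drag on a 1x long: 720 hourly intervals at +0.01% per hour.
hourly_funding = 0.0001
intervals = 24 * 30
funding_cost = hourly_funding * intervals  # fraction of notional, simple sum

# Two successive 10% adverse moves on a 5x book.
path = equity_path(5, [-0.10, -0.10])

print(f"funding cost over a month: {funding_cost:.1%}")
print("equity after each move:", [round(e, 4) for e in path])
```

The first move halves equity and the second halves it again, which is the "smaller base" effect described above.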

The four pieces of the backtest recipe — data, simulator, costs, validation — all change when you move from spot to perps. The rest of this page walks each one.

Data: bar-level OHLCV plus funding cadence

Perp backtests need two aligned series per asset: OHLCV bars and funding-rate bars. On Hyperliquid that is 15-minute OHLCV for ~220 active perps plus 1-hour funding on the same universe. Keel ingests both into a unified store so the simulator can walk them in lock-step.

Series        | Cadence (HL)              | What it drives
Perp OHLCV    | 15-minute bars            | Signals, mark-to-market, fill price
Funding rate  | 1-hour bars (paid hourly) | Funding-cashflow leg of equity
Open interest | 1-hour bars               | Universe filters, capacity sanity check

Time alignment matters. A 15-minute OHLCV bar closed at HH:00 and the funding rate stamped at HH:00 should both apply to the position carried into HH:00, not the position taken at HH:15. Get the convention wrong and you fabricate a free 1-bar look-ahead on funding decisions.
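One way to make the convention explicit is to look up the position by the bar close that precedes the funding stamp. The sketch below is an illustration, not Keel's store API: the `positions` mapping and the function name are hypothetical, and "carried into HH:00" is read as "set at the previous 15-minute bar close".

```python
from datetime import datetime, timedelta

BAR = timedelta(minutes=15)  # OHLCV bar length on the 15-minute series


def position_for_funding(positions, funding_ts):
    """Return the position that pays or receives funding stamped at funding_ts.

    positions maps a bar-close timestamp to the position set at that close.
    The position carried into funding_ts is the one set one bar earlier,
    never one set at or after the funding stamp.
    """
    return positions[funding_ts - BAR]


positions = {
    datetime(2026, 1, 1, 0, 45): 2.0,  # carried into 01:00
    datetime(2026, 1, 1, 1, 0): 5.0,   # set at 01:00, too late for this funding
    datetime(2026, 1, 1, 1, 15): 7.0,  # using this would be look-ahead
}
paying = position_for_funding(positions, datetime(2026, 1, 1, 1, 0))
print(paying)
```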

Modeling funding: a separate cashflow, not a price haircut

The single most common mistake in perp backtests is folding funding into the price series — subtracting a flat haircut per day, or treating it as a fee. Both destroy the analysis. Funding is a signed cashflow on the prior bar position, paid every funding interval. The simulator computes it as:

# every bar, after positions update:
funding_pnl[t] = position[t-1] * close[t] * -funding_rate[t]
equity[t]      = equity[t-1] + price_pnl[t] + funding_pnl[t] - costs[t]

That structure gives you two free properties. First, a long in a positive-funding regime pays funding (it is a cost); a short in the same regime receives funding (it is income) — and the sign flips automatically with the position. Second, the engine emits both a price-only and funding-only equity curve on the same run, which makes it obvious whether the strategy is earning from price moves or from carry.
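The formula above can be turned into a small single-asset loop that also keeps the price-only and funding-only curves. This is a sketch under assumptions, not the Keel engine: positions are in units of the asset, `funding_rates[t]` is zero on bars with no funding stamp, and the function name `run_equity` is made up for the example.

```python
def run_equity(closes, positions, funding_rates, costs=None, start_equity=10_000.0):
    """Walk bars and return (equity, price_only, funding_only) series.

    positions[t] is the position held after bar t closes, in asset units.
    funding_rates[t] is assumed to be 0.0 on bars without a funding stamp.
    """
    n = len(closes)
    costs = costs or [0.0] * n
    equity = [start_equity]
    price_only = [0.0]     # cumulative price P&L
    funding_only = [0.0]   # cumulative funding P&L
    for t in range(1, n):
        prev_pos = positions[t - 1]
        price_pnl = prev_pos * (closes[t] - closes[t - 1])
        # Signed cashflow on the prior-bar position: longs pay when the rate is positive.
        funding_pnl = prev_pos * closes[t] * -funding_rates[t]
        price_only.append(price_only[-1] + price_pnl)
        funding_only.append(funding_only[-1] + funding_pnl)
        equity.append(equity[-1] + price_pnl + funding_pnl - costs[t])
    return equity, price_only, funding_only


closes = [100.0, 100.0, 100.0]
rates = [0.0, 0.0001, 0.0001]
long_eq, long_price, long_funding = run_equity(closes, [1.0, 1.0, 1.0], rates)
short_eq, short_price, short_funding = run_equity(closes, [-1.0, -1.0, -1.0], rates)
print(round(long_funding[-1], 4), round(short_funding[-1], 4))
```

With a flat price, the long's funding leg is negative and the short's is the mirror image, with no special-casing of the sign.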

Modeling leverage and liquidation

The Keel simulator tracks two leverage numbers at every bar: gross (sum of absolute notional / equity) and net (signed sum / equity). Target weights from the pipeline are scaled so that gross leverage stays under the configured cap (default 1.0; configurable up to whatever Hyperliquid’s maintenance margin allows for the instruments traded). When a strategy emits weights that would breach the cap, the simulator re-normalizes — it does not silently over-leverage.
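The two leverage numbers and the re-normalization step are simple to express. The following is a minimal sketch with hypothetical function names, assuming weights are expressed as notional divided by equity; it is not the simulator's actual code.

```python
def leverage(weights):
    """Return (gross, net) leverage for weights given as notional / equity."""
    gross = sum(abs(w) for w in weights.values())
    net = sum(weights.values())
    return gross, net


def apply_gross_cap(weights, cap=1.0):
    """Scale all weights down proportionally if gross leverage exceeds cap."""
    gross, _ = leverage(weights)
    if gross <= cap:
        return dict(weights)
    scale = cap / gross
    return {asset: w * scale for asset, w in weights.items()}


target = {"BTC": 1.5, "ETH": -0.5}
gross, net = leverage(target)
capped = apply_gross_cap(target, cap=1.0)
print(gross, net, capped)
```

Proportional scaling preserves the relative sizes and signs of the legs, so the signal's ranking survives the cap.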

The simulator does not fire a discrete liquidation event. Instead, it assumes the strategy keeps gross leverage well below the maintenance-margin boundary. That assumption generally holds for a vol-targeted book under a gross-leverage cap, because positions are sized off realized vol and shrink as volatility rises. For strategies that run fixed leverage close to the liquidation boundary, the backtested equity curve is an upper bound: in reality, a 30% adverse move at 5x is account zero, not a 150% drawdown. Always stress-test fixed-leverage strategies separately by replaying the worst observed drawdown with a discrete liquidation rule.
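A discrete liquidation replay can be sketched as follows. Everything here is an assumption for illustration: a single long opened once at the stated leverage, a hypothetical 5% maintenance-margin fraction, bar-level checks only, and zero equity left after liquidation. Real exchange rules differ by instrument and tier.

```python
def replay_with_liquidation(price_returns, leverage, maintenance_margin=0.05):
    """Replay a fixed-notional long and stop at the first liquidation.

    Starting equity is 1.0 and entry notional is `leverage`. The position is
    liquidated when equity falls to maintenance_margin times current notional.
    Returns (equity_path, liquidated_at_index_or_None).
    """
    price = 1.0  # price relative to entry
    equity_path = []
    liquidated_at = None
    for i, r in enumerate(price_returns):
        if liquidated_at is not None:
            equity_path.append(0.0)  # no recovery after liquidation
            continue
        price *= 1.0 + r
        equity = 1.0 + leverage * (price - 1.0)
        notional = leverage * price
        if equity <= maintenance_margin * notional:
            liquidated_at = i
            equity = 0.0  # assumption: nothing is returned to the account
        equity_path.append(equity)
    return equity_path, liquidated_at


path, liquidated_at = replay_with_liquidation([-0.10, -0.10, 0.30], leverage=5)
print([round(e, 4) for e in path], liquidated_at)
```

Without the rule, the same price path would end above starting equity because of the final rebound, which is the optimistic curve a simulator without liquidation reports.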

Sizing for perps: vol targeting and leverage caps

Two sizing rules, applied in order:

  • Per-leg vol target. Each position is scaled so its annualized contribution to portfolio vol matches a target — default 15-20% per leg. The Keel VolTarget component takes realized vol over a configurable window (default 30 days of 15m bars) and rescales target weights inversely.
  • Portfolio gross-leverage cap. After per-leg vol scaling, the weights are renormalized so gross leverage stays under the configured cap. Without this second step, a 100-asset basket with each leg vol-targeted to 15% can run at 10x+ gross — fine in a low-vol regime, account-zero in a spike.

The pair of rules together produces a book whose realized vol is stable across regimes and whose tail risk is bounded by the leverage cap, not by the strategy’s instantaneous gross exposure.
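The two rules applied in order might look like the sketch below. It is not the Keel VolTarget component: the function names are hypothetical, the per-leg rule is the simple `target_vol / realized_vol` scaling (which ignores correlations between legs), and 15-minute bars are annualized with 4 * 24 * 365 bars per year.

```python
import math
import statistics

BARS_PER_YEAR = 4 * 24 * 365  # 15-minute bars, markets open around the clock


def annualized_vol(returns_15m):
    """Annualized realized vol from a window of 15-minute returns."""
    return statistics.pstdev(returns_15m) * math.sqrt(BARS_PER_YEAR)


def vol_target_weights(signals, vols, target_vol=0.15, gross_cap=2.0):
    """Step 1: scale each leg inversely to its vol. Step 2: cap gross leverage."""
    raw = {a: signals[a] * target_vol / vols[a] for a in signals}
    gross = sum(abs(w) for w in raw.values())
    scale = min(1.0, gross_cap / gross) if gross > 0 else 1.0
    return {a: w * scale for a, w in raw.items()}


signals = {"A": 1.0, "B": -1.0}
normal = vol_target_weights(signals, {"A": 0.30, "B": 0.60})
low_vol = vol_target_weights(signals, {"A": 0.05, "B": 0.05})
print(normal, low_vol)
```

In the low-vol case the raw per-leg weights sum to 6x gross, and the second step is what pulls the book back to the 2x cap.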

A third sizing question that matters for perps but not for spot is turnover under buffered rebalancing. With vol-targeted weights changing every bar, naively executing the full target delta every 15 minutes drives turnover through the roof, and on perps turnover is a direct linear cost in fees and slippage. The Keel buffered-rebalance mode suppresses the order unless actual-vs-target drift exceeds a threshold (default ~10% relative). For a vol-targeted book, that cuts turnover by 5-10x with negligible tracking error and shifts the strategy from cost-bound to signal-bound.
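The buffering rule can be sketched as a filter on per-asset orders. This is an illustration with a hypothetical function name, assuming drift is measured relative to the target weight and that a zero target always closes the position; Keel's exact drift definition may differ.

```python
def buffered_orders(current, target, threshold=0.10):
    """Return weight deltas only for assets whose relative drift exceeds threshold."""
    orders = {}
    for asset in sorted(set(current) | set(target)):
        cur = current.get(asset, 0.0)
        tgt = target.get(asset, 0.0)
        if tgt == 0.0:
            trade = cur != 0.0  # assumption: always close a dropped name
        else:
            trade = abs(cur - tgt) / abs(tgt) > threshold
        if trade:
            orders[asset] = tgt - cur
    return orders


current = {"BTC": 0.50, "ETH": 0.20, "SOL": 0.10}
target = {"BTC": 0.52, "ETH": 0.30}
orders = buffered_orders(current, target)
print(orders)
```

BTC drifts by under 4% of its target and is left alone, while ETH and the dropped SOL position generate orders.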

OOS validation: what Keel ships today

Single-window parameter optimization is shipped — the Keel app runs a grid search over declared parameters and ranks by Sharpe (or any metric). Walk-forward optimization (rolling fit/test windows) is on the roadmap. Until WFO ships in-platform, the practical minimum is a manual out-of-sample window:

  • Hold out the most recent 3-6 months. Run a backtest with an end-date set 3-6 months before today to fit parameters on the older slice, then re-run on the held-out window to score. Reject the strategy if OOS Sharpe is below half of IS.
  • Score across at least two distinct funding/vol regimes — Hyperliquid has had three since launch.
  • Sanity-check parameter sensitivity: vary each parameter ±25% and look at metric stability. A 10% shift that flips the Sharpe sign means the apparent edge is fit noise.
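The manual checks in the list above can be scripted around whatever backtest runner is in use. The helpers below are hypothetical and deliberately generic: they split a bar series into fit and held-out slices, apply the half-of-IS rejection rule, and build the ±25% grid for one parameter.

```python
import math
import statistics


def split_holdout(bars, holdout_bars):
    """Return (in_sample, out_of_sample) with the most recent bars held out."""
    return bars[:-holdout_bars], bars[-holdout_bars:]


def sharpe(returns, periods_per_year):
    """Annualized Sharpe ratio with a zero risk-free rate (an assumption)."""
    sd = statistics.stdev(returns)
    if sd == 0:
        return 0.0
    return statistics.mean(returns) / sd * math.sqrt(periods_per_year)


def passes_oos(is_sharpe, oos_sharpe, min_ratio=0.5):
    """Reject when OOS Sharpe is below min_ratio of in-sample Sharpe."""
    return is_sharpe > 0 and oos_sharpe >= min_ratio * is_sharpe


def sensitivity_grid(value, pct=0.25):
    """Parameter values to re-run: minus pct, base, plus pct."""
    return [value * (1 - pct), value, value * (1 + pct)]


fit, held_out = split_holdout(list(range(10)), holdout_bars=3)
print(fit, held_out, passes_oos(2.0, 1.1), sensitivity_grid(20))
```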

For the broader case (rolling windows, degradation plots, IS/OOS anchoring choices), see the walk-forward methodology.

A concrete HL example

Putting the recipe together: momentum on the top-30 Hyperliquid perps by 30-day dollar volume, with vol targeting and a small funding-carry overlay. The universe filter rejects names with fewer than 90 days of bars, so the backtest does not include freshly listed tokens. Each leg is vol-targeted to 15% annualized, and gross leverage is capped at 2x. The carry overlay tilts toward the side that receives funding: short the names with the highest positive funding rates and long the names with the lowest (most negative) rates. The same pipeline runs in backtest, paper, and live.
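The universe step of this example can be sketched in plain Python. The record fields (`symbol`, `history_days`, `dollar_volume_30d`) and the function name are hypothetical stand-ins for whatever the data store provides, and the volumes below are made-up numbers for the demonstration.

```python
def select_universe(assets, top_n=30, min_history_days=90):
    """Keep names with enough history, then take the top_n by 30-day dollar volume."""
    eligible = [a for a in assets if a["history_days"] >= min_history_days]
    eligible.sort(key=lambda a: a["dollar_volume_30d"], reverse=True)
    return [a["symbol"] for a in eligible[:top_n]]


assets = [
    {"symbol": "BTC", "history_days": 900, "dollar_volume_30d": 5.0e10},
    {"symbol": "ETH", "history_days": 900, "dollar_volume_30d": 3.0e10},
    {"symbol": "NEWCOIN", "history_days": 20, "dollar_volume_30d": 9.0e10},
    {"symbol": "SOL", "history_days": 700, "dollar_volume_30d": 1.0e10},
]
universe = select_universe(assets, top_n=2)
print(universe)
```

The freshly listed name is excluded despite having the highest volume, because the history filter runs before the ranking.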

The fastest path is the Keel web app — compose the pipeline in the visual builder, click Run Backtest, and read the decomposed equity curve right there. Fork an existing strategy from a share link to skip the from-scratch build entirely.

The decomposed equity curve is the diagnostic that matters most for perps. The Keel app renders price-only, funding-only, and combined on the backtest detail page. If combined Sharpe is 1.8 but price-only Sharpe is 0.4, you are running a carry strategy with a momentum hat — fine, as long as you know that’s what you shipped.
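A complementary check to comparing Sharpe ratios is the share of total P&L that came from funding. The helper below is a hypothetical sketch that takes per-bar price and funding P&L series, such as those produced by any decomposed backtest; the share is only meaningful when total P&L is non-zero and is hard to interpret when the two legs have opposite signs.

```python
def funding_share(price_pnl, funding_pnl):
    """Fraction of total P&L attributable to funding, or None if total is zero."""
    total_price = sum(price_pnl)
    total_funding = sum(funding_pnl)
    total = total_price + total_funding
    if total == 0:
        return None
    return total_funding / total


share = funding_share([1.0, 1.0], [3.0, 3.0])
print(share)
```

A share well above one half is another sign that the book is earning mostly from carry rather than from the momentum signal.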

Driving Keel from a terminal or AI agent? pipx install keel-trade installs the keel CLI; the CLI reference covers strategy create, backtest run, and backtest results — the same three decomposed series come back in the JSON response.

Common perp-specific failure modes

  • Funding folded into the price series. The most common bug. Subtracting a flat funding haircut per day from returns destroys the sign relationship between position and funding cashflow. Always model funding as a separate cashflow on the prior bar position.
  • Leverage drift from fixed-notional sizing. A 10x book at the start of a calm regime can be a 25x book two months later as vol compresses. Vol-target every leg, and cap gross leverage at the portfolio level.
  • Look-ahead on funding decisions. The HL funding rate stamped at HH:00 applies to the position carried into HH:00, not the position taken at HH:15. Misaligning the cadences is a free 1-bar look-ahead on the most lucrative decision the strategy makes.
  • Ignoring discrete liquidation. A fixed-5x book that takes a 20% adverse move in the backtest shows a 100% equity drawdown followed by a slow recovery. In live trading the position is closed at the maintenance-margin boundary at the worst tick, and the recovery curve does not exist. Stress-test fixed-leverage strategies with a discrete liquidation replay.

Try it

Open the Keel app, compose the pipeline in the visual builder, and backtest against your Hyperliquid account. Terminal and AI-agent users can drive the same backtest from the keel-trade CLI — same pipeline, same simulator, same funding model.

FAQ

Common questions

How is a perp backtest different from a spot backtest?

Three structural differences. (1) Funding is a separate cashflow paid on notional every funding interval; it is not a price effect. (2) Equity compounds against leveraged notional, not cash, so P&L geometry is nonlinear in leverage. (3) Liquidation is a discrete cliff at the maintenance-margin boundary: a 30% adverse price move at 5x leverage is not a 30% loss, it is account zero. None of those exist in spot.

How is funding modeled in the backtest?

As an explicit cashflow on the prior bar position, paid at the funding cadence (hourly on Hyperliquid). The simulator computes funding_pnl[t] = position[t-1] * close[t] * -funding_rate[t] every bar and folds it into equity. It is not subtracted from the price series — keeping it separate lets you decompose price-only vs funding-only attribution on the same equity curve.

Does the simulator model leverage and liquidation?

It tracks portfolio gross and net leverage against the configured caps and re-normalizes target weights that would exceed them. The simulator does not currently fire a discrete liquidation event. Instead, it assumes the strategy stays within its leverage caps and well below the maintenance-margin boundary, which generally holds for a vol-targeted book sized off realized vol. For strategies running fixed leverage close to the maintenance margin, treat the equity curve as an upper bound and stress-test separately.

How should I size positions for perps?

Vol-target every position to a fixed annualized vol budget (default 15-20% per leg, scaled by gross leverage cap). For cross-sectional baskets, normalize signal strength to target weights, then scale to the gross-leverage cap. Avoid fixed-notional sizing on perps — vol drift turns a stable book into a high-leverage one in two months.

What out-of-sample testing does Keel support today?

Single-window parameter optimization is shipped in the Keel app today. Rolling and anchored walk-forward are on the roadmap. Until then, hold out the most recent 3-6 months as a manual out-of-sample window: run a backtest with an earlier end-date to fit, then re-run on the held-out window. Refuse to ship if out-of-sample Sharpe falls below half of the in-sample value.

Can I backtest a cross-venue perp strategy (e.g. HL vs Binance)?

Not in Keel. The simulator currently runs single-venue Hyperliquid only — one price series, one funding series per asset. Cross-venue spread research has to happen outside the platform; cross-venue execution is not on the near roadmap.