Why perp backtests differ from spot
A spot backtest has one accounting line: price moves, P&L accrues, end of story. A perp backtest has three more. First, funding is a separate cashflow paid on notional every funding interval — on Hyperliquid, hourly. Holding a 1x long through 720 funding intervals at +0.01% per hour costs ~7% of notional in a month, independent of price. Second, leverage geometry is nonlinear: a 10% adverse price move on a 5x book is a 50% equity drawdown, and the next 10% move is computed off a smaller base. Third, liquidation is a cliff, not a slope — once equity hits maintenance margin, the position is closed at the worst possible price by the exchange, and there is no recovery.
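The funding and leverage arithmetic above can be checked directly. This is a minimal sketch, assuming a constant funding rate charged on constant notional (no compounding) and a long position whose size in units is held fixed through two 10% drops; the variable names are illustrative and it ignores liquidation entirely.

```python
# Funding drag: simple sum of hourly funding on a constant notional.
HOURS_PER_MONTH = 24 * 30            # 720 hourly funding intervals
funding_rate = 0.0001                # +0.01% per hour, paid by longs
funding_cost = HOURS_PER_MONTH * funding_rate  # fraction of notional

# Leverage geometry: hold a fixed number of units through two 10% drops.
equity, price = 1.0, 100.0
units = 5.0 * equity / price         # 5x initial leverage
path = []                            # (equity, leverage) after each move
for _ in range(2):
    new_price = price * 0.90         # 10% adverse move for a long
    equity += units * (new_price - price)
    price = new_price
    path.append((round(equity, 4), round(units * price / equity, 2)))

print(f"funding cost: {funding_cost:.1%} of notional")
print(path)
```

Under these assumptions the first move halves equity and pushes effective leverage from 5x to 9x, and the second leaves roughly 5% of the starting equity. On a real exchange the position would be liquidated before that point.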
The four pieces of the backtest recipe — data, simulator, costs, validation — all change when you move from spot to perps. The rest of this page walks each one.
Data: bar-level OHLCV plus funding cadence
Perp backtests need two aligned series per asset: OHLCV bars and funding-rate bars. On Hyperliquid that is 15-minute OHLCV for ~220 active perps plus 1-hour funding on the same universe. Keel ingests both into a unified store so the simulator can walk them in lock-step.
| Series | Cadence (HL) | What it drives |
|---|---|---|
| Perp OHLCV | 15-minute bars | Signals, mark-to-market, fill price |
| Funding rate | 1-hour bars (paid hourly) | Funding-cashflow leg of equity |
| Open interest | 1-hour bars | Universe filters, capacity sanity check |
Time alignment matters. A 15-minute OHLCV bar closed at HH:00 and the funding rate stamped at HH:00 should both apply to the position carried into HH:00, not the position taken at HH:15. Get the convention wrong and you fabricate a free 1-bar look-ahead on funding decisions.
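A small sketch of that convention, assuming positions are keyed by the close time of the bar over which they were held. The dictionary layout, the timestamps, and the `funding_cashflow` helper are hypothetical and not part of Keel.

```python
from datetime import datetime, timedelta

BAR = timedelta(minutes=15)

# Hypothetical positions (in units), keyed by the close time of the bar
# over which each position was held.
held_over_bar_closing_at = {
    datetime(2024, 1, 1, 10, 0): 1.0,    # long during 09:45-10:00
    datetime(2024, 1, 1, 10, 15): -1.0,  # flipped short after the 10:00 close
}

funding_ts = datetime(2024, 1, 1, 10, 0)
funding_rate = 0.0001   # positive rate: longs pay, shorts receive
mark = 100.0            # close of the 10:00 bar


def funding_cashflow(position):
    """Signed funding cashflow for a position at the funding stamp."""
    return position * mark * -funding_rate


# Correct: charge the position that was carried into the funding stamp.
correct = funding_cashflow(held_over_bar_closing_at[funding_ts])

# Look-ahead bug: charge the position chosen after the 10:00 print was visible.
leaked = funding_cashflow(held_over_bar_closing_at[funding_ts + BAR])

print(correct, leaked)
```

With the correct alignment the long pays funding. With the misaligned lookup the backtest credits a funding receipt to a short that did not exist when funding was struck.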
Modeling funding: a separate cashflow, not a price haircut
The single most common mistake in perp backtests is folding funding into the price series — subtracting a flat haircut per day, or treating it as a fee. Both destroy the analysis. Funding is a signed cashflow on the prior bar position, paid every funding interval. The simulator computes it as:
    # every bar, after positions update:
    funding_pnl[t] = position[t-1] * close[t] * -funding_rate[t]
    equity[t] = equity[t-1] + price_pnl[t] + funding_pnl[t] - costs[t]
That structure gives you two free properties. First, a long in a positive-funding regime pays funding (it is a cost); a short in the same regime receives funding (it is income) — and the sign flips automatically with the position. Second, the engine emits both a price-only and funding-only equity curve on the same run, which makes it obvious whether the strategy is earning from price moves or from carry.
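To make the decomposition concrete, here is a minimal runnable sketch of that loop. The funding line follows the formula above. The `price_pnl` definition (prior position times the bar's price change), the `simulate` function name, and the convention that `funding_rate[t]` is zero on bars without a funding print are assumptions for illustration, not Keel's actual engine.

```python
def simulate(close, funding_rate, position, costs, equity0=1.0):
    """Walk bars in lock-step; return combined, price-only, funding-only curves.

    position[t] is the number of units held after the update at bar t.
    funding_rate[t] is zero on bars with no funding print (assumption).
    """
    combined, price_only, funding_only = [equity0], [equity0], [equity0]
    for t in range(1, len(close)):
        # Both legs are driven by the position carried into bar t.
        price_pnl = position[t - 1] * (close[t] - close[t - 1])
        funding_pnl = position[t - 1] * close[t] * -funding_rate[t]
        combined.append(combined[-1] + price_pnl + funding_pnl - costs[t])
        price_only.append(price_only[-1] + price_pnl)
        funding_only.append(funding_only[-1] + funding_pnl)
    return combined, price_only, funding_only


# Flat price, positive funding on two bars: a long pays, a short receives.
close = [100.0] * 5
funding = [0.0, 0.0001, 0.0, 0.0, 0.0001]
zero_costs = [0.0] * 5
long_curves = simulate(close, funding, [1.0] * 5, zero_costs)
short_curves = simulate(close, funding, [-1.0] * 5, zero_costs)
print(long_curves[2][-1], short_curves[2][-1])
```

Because the sign comes from `position[t-1]`, the same code path handles longs and shorts, and the funding-only curve isolates carry from price moves.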
Modeling leverage and liquidation
The Keel simulator tracks two leverage numbers at every bar: gross (sum of absolute notional / equity) and net (signed sum / equity). Target weights from the pipeline are scaled so that gross leverage stays under the configured cap (default 1.0; configurable up to whatever Hyperliquid’s maintenance margin allows for the instruments traded). When a strategy emits weights that would breach the cap, the simulator re-normalizes — it does not silently over-leverage.
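The two leverage numbers and the cap can be sketched as follows. Weights are expressed as notional divided by equity, and the function names `leverage` and `cap_gross` are hypothetical; this illustrates the renormalization idea rather than Keel's own code.

```python
def leverage(weights):
    """Return (gross, net) leverage for weights given as notional / equity."""
    gross = sum(abs(w) for w in weights.values())
    net = sum(weights.values())
    return gross, net


def cap_gross(weights, cap=1.0):
    """Scale all weights down proportionally if gross leverage exceeds the cap."""
    gross, _ = leverage(weights)
    if gross <= cap:
        return dict(weights)
    scale = cap / gross
    return {name: w * scale for name, w in weights.items()}


target = {"BTC": 0.9, "ETH": -0.6, "SOL": 0.3}
gross, net = leverage(target)          # 1.8 gross, 0.6 net
capped = cap_gross(target, cap=1.0)
capped_gross, capped_net = leverage(capped)
print(gross, net, capped_gross, capped_net)
```

Proportional scaling preserves the relative weights and the long/short mix; only the overall size of the book shrinks.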
Liquidation is the one failure mode the simulator does not model. Instead, the simulator assumes the strategy keeps gross leverage well below the maintenance-margin boundary, which is true for any vol-targeted book (positions are sized off realized vol, so they shrink as volatility rises, which is typically when drawdowns happen). For strategies that run fixed leverage close to the liquidation boundary, the backtested equity curve is an upper bound: in reality, a 30% adverse move at 5x is account zero (equity is already gone at 20%), not a 150% drawdown. Always stress-test fixed-leverage strategies separately by replaying the worst observed drawdown with a discrete liquidation rule.
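A discrete liquidation replay can be as simple as the sketch below. It assumes a single long with fixed units, a hypothetical flat `maintenance_margin` rate of 5% of notional, and a pessimistic rule that all remaining equity is lost at liquidation. Real exchange margin tiers and liquidation mechanics differ, so treat the parameters as placeholders.

```python
def replay_with_liquidation(prices, leverage, maintenance_margin=0.05):
    """Equity path for a long opened at prices[0]; stops at liquidation.

    Returns (path, liquidated). Equity starts at 1.0.
    """
    equity0 = 1.0
    units = leverage * equity0 / prices[0]
    path = []
    for price in prices:
        equity = equity0 + units * (price - prices[0])
        notional = units * price
        if equity <= maintenance_margin * notional:
            # Assumption: remaining margin is fully lost; no recovery afterwards.
            path.append(0.0)
            return path, True
        path.append(equity)
    return path, False


# A drawdown that fully recovers in price terms.
prices = [100.0, 95.0, 90.0, 85.0, 70.0, 100.0]
path_5x, liquidated_5x = replay_with_liquidation(prices, leverage=5.0)
path_1x, liquidated_1x = replay_with_liquidation(prices, leverage=1.0)
print(path_5x, liquidated_5x)
print(path_1x, liquidated_1x)
```

The 1x book rides the round trip back to its starting equity. The 5x book is closed on the way down and never sees the recovery, which is the gap between a continuous backtest curve and live trading.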
Sizing for perps: vol targeting and leverage caps
Two sizing rules, applied in order:
- Per-leg vol target. Each position is scaled so its annualized contribution to portfolio vol matches a target (default 15-20% per leg). The Keel VolTarget component takes realized vol over a configurable window (default 30 days of 15m bars) and rescales target weights inversely.
- Portfolio gross-leverage cap. After per-leg vol scaling, the weights are renormalized so gross leverage stays under the configured cap. Without this second step, a 100-asset basket with each leg vol-targeted to 15% can run at 10x+ gross: fine in a low-vol regime, account-zero in a spike.
The pair of rules together produces a book whose realized vol is stable across regimes and whose tail risk is bounded by the leverage cap, not by the strategy’s instantaneous gross exposure.
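The two rules can be sketched in a few lines. This is a simplified illustration that treats each leg on a standalone basis (correlations between legs are ignored) and takes signals in {-1, 0, +1}. The names `realized_vol` and `size_book` are hypothetical and do not represent the VolTarget component's real interface.

```python
import math
import statistics

BARS_PER_YEAR_15M = 4 * 24 * 365  # perps trade around the clock


def realized_vol(bar_returns, bars_per_year=BARS_PER_YEAR_15M):
    """Annualized sample standard deviation of per-bar returns."""
    return statistics.stdev(bar_returns) * math.sqrt(bars_per_year)


def size_book(signals, vols, target_vol=0.15, gross_cap=2.0):
    """Step 1: inverse-vol scale each leg. Step 2: cap gross leverage."""
    raw = {s: signals[s] * target_vol / vols[s] for s in signals}
    raw_gross = sum(abs(w) for w in raw.values())
    scale = min(1.0, gross_cap / raw_gross) if raw_gross > 0 else 0.0
    return {s: w * scale for s, w in raw.items()}, raw_gross


signals = {"BTC": 1, "ETH": -1, "SOL": 1}
calm_vols = {"BTC": 0.10, "ETH": 0.12, "SOL": 0.15}   # compressed-vol regime
weights, raw_gross = size_book(signals, calm_vols)
final_gross = sum(abs(w) for w in weights.values())
print(raw_gross, final_gross, weights)
```

In the calm-regime example the uncapped vol-targeted weights sum to 3.75x gross from only three legs, and the cap brings the book back to 2x while keeping the relative sizing.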
A third sizing question that matters for perps but not for spot is turnover under buffered rebalancing. With vol-targeted weights changing every bar, naively executing the full target delta every 15 minutes sends turnover through the roof, and on perps turnover is a direct linear cost in fees and slippage. The Keel buffered-rebalance mode suppresses the order unless actual-vs-target drift exceeds a threshold (default ~10% relative); for a vol-targeted book that cuts turnover by 5-10x with negligible tracking error, and shifts the strategy from cost-bound to signal-bound.
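A minimal sketch of a buffered rebalance, assuming drift is measured per leg as |actual - target| / |target| and that a leg with a zero target is always closed. The exact drift definition Keel uses is not stated here, so `buffered_orders` is a hypothetical stand-in.

```python
def buffered_orders(actual, target, threshold=0.10):
    """Return order deltas only for legs whose relative drift exceeds threshold."""
    orders = {}
    for name in sorted(set(actual) | set(target)):
        cur = actual.get(name, 0.0)
        tgt = target.get(name, 0.0)
        if tgt == 0.0:
            # No target: any residual position counts as unbounded drift.
            drift = float("inf") if cur != 0.0 else 0.0
        else:
            drift = abs(cur - tgt) / abs(tgt)
        if drift > threshold:
            orders[name] = tgt - cur
    return orders


actual = {"BTC": 0.30, "ETH": -0.25}
target = {"BTC": 0.31, "ETH": -0.20}
orders = buffered_orders(actual, target)
print(orders)
```

Here BTC has drifted about 3% from target and is left alone, while ETH has drifted 25% and is traded back to target. Most bar-to-bar changes in vol-scaled weights are small, which is why a buffer of this kind suppresses so many orders.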
OOS validation: what Keel ships today
Single-window parameter optimization is shipped — the Keel app runs a grid search over declared parameters and ranks by Sharpe (or any metric). Walk-forward optimization (rolling fit/test windows) is on the roadmap. Until WFO ships in-platform, the practical minimum is a manual out-of-sample window:
- Hold out the most recent 3-6 months. Run a backtest with an end-date set 3-6 months before today to fit parameters on the older slice, then re-run on the held-out window to score. Reject the strategy if OOS Sharpe is below half of IS.
- Score across at least two distinct funding/vol regimes — Hyperliquid has had three since launch.
- Sanity-check parameter sensitivity: vary each parameter ±25% and look at metric stability. If a shift as small as 10% flips the Sharpe sign, the apparent edge is fit noise.
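The checklist above can be expressed as a few helper functions. This is a sketch with hypothetical names (`sharpe`, `passes_oos`, `sensitivity_grid`); it assumes a zero risk-free rate and that the in-sample and out-of-sample Sharpe ratios come from two separate backtest runs.

```python
import math
import statistics


def sharpe(returns, periods_per_year):
    """Annualized Sharpe ratio of per-period returns (zero risk-free rate)."""
    sd = statistics.stdev(returns)
    if sd == 0:
        return 0.0
    return statistics.mean(returns) / sd * math.sqrt(periods_per_year)


def passes_oos(is_sharpe, oos_sharpe, min_ratio=0.5):
    """Reject when OOS Sharpe is below half of a positive IS Sharpe."""
    return is_sharpe > 0 and oos_sharpe >= min_ratio * is_sharpe


def sensitivity_grid(base_params, rel_steps=(-0.25, -0.10, 0.10, 0.25)):
    """Perturb each parameter by relative steps, one parameter at a time."""
    return {
        name: [value * (1 + step) for step in rel_steps]
        for name, value in base_params.items()
    }


grid = sensitivity_grid({"lookback": 40})
print(passes_oos(1.8, 1.0), passes_oos(1.8, 0.8))
print(grid)
```

Each perturbed value would be fed back through a full backtest. The grid only enumerates the settings to try, and the stability judgment still comes from comparing the resulting metrics.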
For the broader case (rolling windows, degradation plots, IS/OOS anchoring choices), see the walk-forward methodology.
A concrete HL example
Putting the recipe together: momentum on the top-30 Hyperliquid perps by 30-day dollar volume, with vol targeting and a small funding-carry overlay. Universe filter rejects names with fewer than 90 days of bars (so the backtest does not include freshly listed tokens). Each leg is vol-targeted to 15% annualized; gross leverage capped at 2x. The carry overlay tilts long the names where funding is most negative (longs are paid) and short the names where funding is most positive (shorts are paid). The same pipeline runs in backtest, paper, and live.
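The universe step of that recipe could look like the sketch below. The record fields (`symbol`, `dollar_volume_30d`, `history_days`) and the `select_universe` function are hypothetical; they illustrate the filter-then-rank order rather than Keel's data schema.

```python
def select_universe(stats, top_n=30, min_history_days=90):
    """Drop names with too little history, then keep the top_n by dollar volume."""
    eligible = [s for s in stats if s["history_days"] >= min_history_days]
    eligible.sort(key=lambda s: s["dollar_volume_30d"], reverse=True)
    return [s["symbol"] for s in eligible[:top_n]]


# Made-up numbers purely for illustration.
stats = [
    {"symbol": "BTC", "dollar_volume_30d": 9.0e9, "history_days": 400},
    {"symbol": "NEW", "dollar_volume_30d": 5.0e9, "history_days": 20},
    {"symbol": "ETH", "dollar_volume_30d": 4.0e9, "history_days": 400},
    {"symbol": "SOL", "dollar_volume_30d": 1.0e9, "history_days": 300},
]
universe = select_universe(stats, top_n=2)
print(universe)
```

Filtering on history before ranking matters: a freshly listed token with high volume is excluded rather than displacing an established name from the top of the list.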
The fastest path is the Keel web app — compose the pipeline in the visual builder, click Run Backtest, and read the decomposed equity curve right there. Fork an existing strategy from a share link to skip the from-scratch build entirely.
The decomposed equity curve is the diagnostic that matters most for perps. The Keel app renders price-only, funding-only, and combined on the backtest detail page. If combined Sharpe is 1.8 but price-only Sharpe is 0.4, you are running a carry strategy with a momentum hat — fine, as long as you know that’s what you shipped.
Driving Keel from a terminal or AI agent? pipx install keel-trade installs the keel CLI; the CLI reference covers strategy create, backtest run, and backtest results — the same three decomposed series come back in the JSON response.
Common perp-specific failure modes
- Funding folded into the price series. The most common bug. Subtracting a flat funding haircut per day from returns destroys the sign relationship between position and funding cashflow. Always model funding as a separate cashflow on the prior bar position.
- Leverage drift from fixed-notional sizing. A 10x book at the start of a calm regime can be a 25x book two months later as vol compresses. Vol-target every leg, and cap gross leverage at the portfolio level.
- Look-ahead on funding decisions. The HL funding rate stamped at HH:00 applies to the position carried into HH:00, not the position taken at HH:15. Misaligning the cadences is a free 1-bar look-ahead on the most lucrative decision the strategy makes.
- Ignoring discrete liquidation. A fixed-5x book that experiences a 20% adverse move in the backtest shows a 100% equity drawdown and a slow recovery. In live trading the position is closed at the maintenance-margin boundary at the worst tick, and the recovery curve does not exist. Stress-test fixed-leverage strategies with a discrete liquidation replay.
Try it
Open the Keel app, compose the pipeline in the visual builder, and backtest against your Hyperliquid account. Terminal and AI-agent users can drive the same backtest from the keel-trade CLI — same pipeline, same simulator, same funding model.