The concept of walk-forward optimization is generic — read the concept explainer if you want a from-scratch primer. This page covers what changes when you apply it to Hyperliquid: short market histories, fast regime shifts, hourly funding cycles, and a venue where the microstructure of any given perp has measurably changed inside the last year.
Why Hyperliquid needs walk-forward specifically
Three properties of HL data make single-split backtests especially dangerous:
- Newer-listed assets dominate the universe. Roughly half of HL's ~220 perps have less than a year of continuous funding history. A single 70/30 split on a top-of-vol universe will end up training on a few large markets and testing on a regime where the universe composition itself has changed (new listings included, old illiquid markets pruned). Walk-forward at least exposes that drift per window.
- Regime shifts are frequent and large. The HL funding-rate distribution from October 2024 looks nothing like the one from February 2026: mean absolute funding compressed by more than half across that window. Anything tuned to one regime (carry bands, vol scalers, signal lookbacks) will degrade in the other. Walk-forward catches that; one OOS slice usually does not.
- Hourly funding cycles introduce a fast clock. On most centralized venues funding is 8-hourly, so an OOS window of a few months contains a few hundred funding events. On HL it contains a few thousand. That cuts the variance of funding-derived metrics enough that a per-window OOS Sharpe computed on funding-carry P&L is actually distinguishable from noise — even on 3-month windows. Use this.
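The event-count arithmetic behind the third point can be sketched as follows. This is a back-of-the-envelope illustration, not a statistical guarantee: the helper names are hypothetical, and the 1/sqrt(n) scaling assumes independent funding observations, which real funding rates are not (they are autocorrelated, so the practical gain is smaller).

```python
import math

HOURS_PER_DAY = 24


def funding_events(window_days: int, funding_interval_hours: int) -> int:
    """Number of funding settlements inside a window of `window_days` days."""
    return window_days * HOURS_PER_DAY // funding_interval_hours


def relative_standard_error(n_events: int) -> float:
    """Standard error of a mean scales as 1/sqrt(n), ASSUMING independent events."""
    return 1.0 / math.sqrt(n_events)


window_days = 90  # a 3-month OOS window
eight_hourly = funding_events(window_days, 8)  # 270 events
hourly = funding_events(window_days, 1)        # 2160 events

# Under the independence assumption, the hourly clock shrinks the standard
# error of a mean funding estimate by sqrt(2160 / 270) = sqrt(8), about 2.83x.
shrink = relative_standard_error(eight_hourly) / relative_standard_error(hourly)
print(eight_hourly, hourly, round(shrink, 2))
```

Treat the ratio as an upper bound on the improvement; persistence in funding reduces the effective sample size.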
IS/OOS window sizing for HL data
Translate strategy frequency to trade count, then size windows for at least 100 trades in-sample. Concrete recommendations for common Keel pipelines on HL:
| Strategy shape | IS window | OOS window | Step |
|---|---|---|---|
| Daily rebalance, top-30 vol universe | 180 days | 60 days | 60 days |
| Weekly carry book, regime-gated | 270 days | 90 days | 90 days |
| Intraday momentum on majors only | 90 days | 30 days | 30 days |
| Hierarchical multi-signal portfolio, monthly reallocation | 365 days | 90 days | 90 days |
These are starting points, not laws. The check is simple: compute the realized trade count in the first IS window. If it's under 100, lengthen IS. If it's over 1,000, shorten it — extra trades past that point don't buy more parameter stability, and shorter windows let you run more of them.
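A minimal sketch of that check, assuming you can export one entry per realized trade as a day offset from the start of the data. The `trade_days` input and both function names are hypothetical; the 100 and 1,000 thresholds come from the paragraph above.

```python
MIN_IS_TRADES = 100    # below this, lengthen the IS window
MAX_IS_TRADES = 1_000  # above this, shorten the IS window


def count_trades_in_window(trade_days, is_start_day, is_len_days):
    """Count trades whose day offset falls in [is_start_day, is_start_day + is_len_days)."""
    is_end_day = is_start_day + is_len_days
    return sum(1 for day in trade_days if is_start_day <= day < is_end_day)


def is_window_verdict(trade_count):
    """Map a realized in-sample trade count to a sizing decision."""
    if trade_count < MIN_IS_TRADES:
        return "lengthen"
    if trade_count > MAX_IS_TRADES:
        return "shorten"
    return "keep"


# Illustrative input only: a strategy that trades once every 3 days for 2 years.
trade_days = list(range(0, 720, 3))

for is_len_days in (180, 365):
    n = count_trades_in_window(trade_days, 0, is_len_days)
    print(is_len_days, n, is_window_verdict(n))
```

With this toy trade schedule the 180-day window holds 60 trades and should be lengthened, while the 365-day window holds 122 and passes.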
Rolling vs anchored: which mode on HL
Rolling keeps the IS window at a fixed length and slides it forward, dropping the oldest step's worth of bars each time; anchored keeps everything from inception and just appends. On HL, choose by market maturity:
- Anchored — default for majors. BTC, ETH, SOL perps on HL have enough history that anchoring lets the IS window grow into a statistically meaningful estimate. The microstructure of these markets has been stable enough that old data is still informative.
- Rolling — default for newer listings. A perp that listed in mid-2025 and went through a liquidity step change in early 2026 has two completely different markets inside it. Anchoring averages them. Rolling lets the IS window catch up to current microstructure.
- Hybrid is fine. Run anchored on the top-of-vol-30 subset, rolling on everything outside it. Or run both and compare aggregate OOS Sharpe — if anchored substantially beats rolling, your strategy is leaning on history that may not repeat.
The scheduler in pseudocode
Both modes collapse to a few lines. The scheduler emits IS/OOS slice pairs; an outer loop fits parameters on each IS slice and applies the frozen parameter set to the corresponding OOS.
# Rolling walk-forward over HL 15-min bars
# Yields (is_slice, oos_slice) pairs; refit on every step.
def rolling_walk_forward(bars, is_len, oos_len, step):
    start = 0
    while start + is_len + oos_len <= len(bars):
        is_slice = bars[start : start + is_len]
        oos_slice = bars[start + is_len : start + is_len + oos_len]
        yield is_slice, oos_slice
        start += step  # step = oos_len for non-overlapping OOS


# Anchored variant: in-sample expands, OOS still walks forward
def anchored_walk_forward(bars, initial_is_len, oos_len, step):
    is_end = initial_is_len
    while is_end + oos_len <= len(bars):
        is_slice = bars[: is_end]
        oos_slice = bars[is_end : is_end + oos_len]
        yield is_slice, oos_slice
        is_end += step
In practice, the work happens inside the optimizer step, not the scheduler. The scheduler is trivial; sizing the windows and defining the optimization objective are where the engineering effort lives. Common objectives on HL: maximize OOS Sharpe subject to an OOS max-drawdown floor, or maximize median per-window OOS Sharpe (more robust than the mean when a single big window dominates).
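The outer loop and the median-versus-mean aggregation might look like the sketch below. It rests on simplifying assumptions: each candidate parameter set is represented by a precomputed per-bar return series (the hypothetical `returns_by_param` mapping), the fit step simply picks the candidate with the best in-sample Sharpe as a stand-in objective, and Sharpe is per-bar rather than annualized. A real optimizer step would rerun the strategy for each candidate.

```python
import statistics


def rolling_slices(n_bars, is_len, oos_len, step):
    """Yield (is_slice, oos_slice) index slices for a rolling walk-forward."""
    start = 0
    while start + is_len + oos_len <= n_bars:
        yield (slice(start, start + is_len),
               slice(start + is_len, start + is_len + oos_len))
        start += step


def sharpe(returns):
    """Per-bar Sharpe ratio (not annualized); 0.0 when it is undefined."""
    if len(returns) < 2:
        return 0.0
    sd = statistics.stdev(returns)
    return statistics.fmean(returns) / sd if sd > 0 else 0.0


def walk_forward(returns_by_param, is_len, oos_len, step):
    """Fit on each IS slice, then score the frozen choice on the matching OOS slice."""
    n_bars = min(len(series) for series in returns_by_param.values())
    results = []
    for is_slice, oos_slice in rolling_slices(n_bars, is_len, oos_len, step):
        # Fit: pick the candidate with the best in-sample Sharpe.
        best = max(returns_by_param,
                   key=lambda p: sharpe(returns_by_param[p][is_slice]))
        # Freeze: evaluate that same candidate on data the fit never saw.
        oos_returns = returns_by_param[best][oos_slice]
        results.append({"param": best, "oos_sharpe": sharpe(oos_returns)})
    return results


def aggregate_oos(results):
    """Mean and median of per-window OOS Sharpe."""
    values = [r["oos_sharpe"] for r in results]
    return {"mean": statistics.fmean(values), "median": statistics.median(values)}
```

To see why the median is the more conservative aggregate, take per-window OOS Sharpes of 0.1, 0.0, -0.1, and 2.0: the mean is 0.5, driven almost entirely by the last window, while the median is 0.05.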
What Keel ships today vs WFO on the roadmap
Keel today runs single-window backtests from the web app — date-picker on the backtest screen, click Run, read the tearsheet. Terminal and AI-agent users can drive the same backtest from the keel-trade CLI with explicit --start-date and --end-date flags on keel backtest run. Real, shipped, usable.
What Keel does not yet ship is a native walk-forward run that automates the IS/OOS rolling, refit-per-window, and aggregate OOS reporting in one command. That is planned for a later release this year. Until it ships, the workflow is manual: pick an IS window, backtest a few candidate parameter sets, freeze the configuration, run a second backtest on the corresponding OOS window, and aggregate the per-window results yourself. The visualizer below is built for exactly that flow.
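The "aggregate the per-window results yourself" step can be as small as the sketch below. It assumes you have recorded one IS Sharpe and one OOS Sharpe per fold by hand; the `Fold` record is hypothetical, the numbers are placeholders rather than real results, and degradation is defined here as IS Sharpe minus OOS Sharpe, which is one common convention and not necessarily the one any Keel tool uses.

```python
import statistics
from dataclasses import dataclass


@dataclass
class Fold:
    label: str
    is_sharpe: float
    oos_sharpe: float

    @property
    def degradation(self) -> float:
        # ASSUMPTION: degradation = IS Sharpe minus OOS Sharpe.
        return self.is_sharpe - self.oos_sharpe


def summarize(folds):
    """Aggregate hand-recorded per-fold metrics into a few summary numbers."""
    return {
        "mean_degradation": statistics.fmean([f.degradation for f in folds]),
        "median_oos_sharpe": statistics.median([f.oos_sharpe for f in folds]),
        "share_positive_oos": sum(f.oos_sharpe > 0 for f in folds) / len(folds),
    }


# Placeholder values for illustration only; replace with your recorded metrics.
folds = [
    Fold("fold-1", is_sharpe=1.5, oos_sharpe=0.9),
    Fold("fold-2", is_sharpe=1.2, oos_sharpe=0.6),
    Fold("fold-3", is_sharpe=1.8, oos_sharpe=-0.3),
]

for f in folds:
    print(f"{f.label}: IS {f.is_sharpe:+.2f}  OOS {f.oos_sharpe:+.2f}  "
          f"degradation {f.degradation:+.2f}")
print(summarize(folds))
```

A large mean degradation paired with a low share of positive OOS folds is the usual sign that the frozen configuration was fitted to in-sample noise.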
Until Keel ships WFO natively
Three practical paths:
- Use the WFO visualizer linked below. Paste or upload a returns series, set window length and step, and see per-window IS vs OOS Sharpe and the mean degradation. Useful for any returns series — not just Keel-generated ones.
- Run the windows by hand in the app. For each fold, change the start/end dates in the backtest screen, run, and record the metrics. Compare folds side by side via shared tearsheet URLs. Tedious but defensible, and worth doing for any strategy you intend to deploy with real capital.
- Script the manual loop from a terminal. If you drive Keel from a shell or an AI agent, a short wrapper around keel backtest run over a list of IS and OOS date ranges aggregates per-window results into a degradation table without the click-through.
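A sketch of the date-range half of such a wrapper follows. Only the `keel backtest run` subcommand and the `--start-date` / `--end-date` flags come from this page; the ISO `YYYY-MM-DD` date format, inclusive end dates, and the absence of any strategy or config arguments are assumptions to check against the CLI's own help output. The script only prints the commands and does not execute them.

```python
import shlex
from datetime import date, timedelta


def fold_dates(first_day, last_day, is_days, oos_days, step_days):
    """Yield (is_start, is_end, oos_start, oos_end); end dates are inclusive."""
    start = first_day
    while start + timedelta(days=is_days + oos_days - 1) <= last_day:
        is_end = start + timedelta(days=is_days - 1)
        oos_start = is_end + timedelta(days=1)
        oos_end = oos_start + timedelta(days=oos_days - 1)
        yield start, is_end, oos_start, oos_end
        start += timedelta(days=step_days)


def backtest_command(start, end):
    """Build the argument list for one backtest run.

    ASSUMPTION: dates are passed as ISO strings; any strategy or config
    arguments your setup requires are not shown here.
    """
    return ["keel", "backtest", "run",
            "--start-date", start.isoformat(),
            "--end-date", end.isoformat()]


# Daily-rebalance sizing from the table: 180-day IS, 60-day OOS, 60-day step.
folds = list(fold_dates(date(2025, 1, 1), date(2025, 12, 31), 180, 60, 60))

for i, (is_start, is_end, oos_start, oos_end) in enumerate(folds, start=1):
    print(f"fold {i} IS :", shlex.join(backtest_command(is_start, is_end)))
    print(f"fold {i} OOS:", shlex.join(backtest_command(oos_start, oos_end)))
```

Run the IS command for each candidate parameter set, freeze the winner, then run the OOS command once with that frozen configuration and record the metrics.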
Try the visualizer
Pure-browser WFO visualizer: paste or upload a returns series, pick window length and mode, and see per-window IS vs OOS Sharpe plus mean degradation. No upload to server, no signup.