
Export Hyperliquid OHLCV Candles to Parquet

Generate Parquet files for a chosen Hyperliquid market and candle interval by iterating Dwellir's OHLCV endpoint.

Language: Python
Format: Parquet
Protocol: REST

If you want a compressed, columnar export for analytics workflows, Parquet is a better fit than CSV. This guide shows how to iterate Dwellir's Hyperliquid OHLCV endpoint and write sparse candles to a .parquet file for one market and one interval.

Copy-Paste Prompt for Your Coding Tool

Paste this into Claude Code, Codex, Cursor, Windsurf, or another coding agent:

Text
Build me a small script that exports Hyperliquid OHLCV candles from Dwellir into a Parquet file.

Requirements:
- First check whether the `dwellir` CLI is installed.
- If it is not installed, recommend installing it with one of these options:
  - `brew tap dwellir-public/homebrew-tap && brew install dwellir`
  - `curl -fsSL https://raw.githubusercontent.com/dwellir-public/cli/main/scripts/install.sh | sh`
- After installation, ask me to authenticate it by running `dwellir auth login`.
- Check authentication with `dwellir auth status`.
- Get an enabled API key with `dwellir keys list --toon`. If there are multiple good candidates, ask me which key to use.
- Use the Dwellir Hyperliquid OHLCV REST endpoint.
- Export one market and one interval at a time.
- Treat the data as sparse: if a bucket returns 404, skip it rather than fabricating a candle.
- The OHLCV archive starts at `2025-07-27T08:00:00Z`.
- Iterate bucket-open timestamps between a start and end time.
- Support `1s`, `1m`, and `5m`.
- Write a Parquet file with columns: market, interval, bucket_start, open, high, low, close, volume, trades_count, vwap.
- Show me how to run the script for BTC 1m candles over a chosen date range.

What This Guide Builds

By the end of this guide you will have:

  • a Python script that fetches candles from Dwellir's Hyperliquid OHLCV REST API
  • a .parquet file for one market and one interval
  • an export workflow that is better suited for DuckDB, pandas, Spark, and columnar analytics

Before You Start

Install Python 3.10+ and the required packages:

Bash
pip install requests pyarrow

Make sure you have a Dwellir API key. The easiest terminal workflow is:

Bash
dwellir auth login
dwellir keys list --toon

If you do not have the CLI installed yet:

Bash
brew tap dwellir-public/homebrew-tap
brew install dwellir

Why Parquet Here

Parquet is useful when you want:

  • smaller on-disk exports than CSV
  • faster reads in columnar analytics engines
  • typed columns for downstream processing

The request pattern is the same as for the CSV export: one candle per response, identified by market, interval, and bucket-open time.
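That single-candle request shape can be sketched without sending anything over the wire. This snippet builds (but does not send) one request so you can inspect the final URL; the endpoint and parameters mirror the export script below, and `YOUR_API_KEY` is a placeholder:

```python
import requests

API_KEY = "YOUR_API_KEY"  # placeholder: substitute your Dwellir key

# Prepare one candle request to inspect its final URL without sending it.
request = requests.Request(
    "GET",
    f"https://api-hyperliquid-ohlcv.n.dwellir.com/{API_KEY}/v1/candles",
    params={
        "market": "BTC",
        "interval": "1m",
        "time": "2026-03-01T00:00:00Z",
    },
).prepare()

print(request.url)
```

Sending the prepared request with a real key returns one JSON candle, or a 404 when that bucket has no trades.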

Python Export Script

Save this as export_ohlcv_parquet.py:

Python
import sys
from datetime import datetime, timedelta, timezone

import pyarrow as pa
import pyarrow.parquet as pq
import requests


STEP_BY_INTERVAL = {
    "1s": timedelta(seconds=1),
    "1m": timedelta(minutes=1),
    "5m": timedelta(minutes=5),
}


def parse_utc(value: str) -> datetime:
    return datetime.fromisoformat(value.replace("Z", "+00:00")).astimezone(timezone.utc)


def format_utc(value: datetime) -> str:
    return value.astimezone(timezone.utc).isoformat().replace("+00:00", "Z")


def iter_bucket_starts(start: datetime, end: datetime, step: timedelta):
    current = start
    while current <= end:
        yield current
        current += step


def fetch_candle(api_key: str, market: str, interval: str, bucket_start: datetime):
    response = requests.get(
        f"https://api-hyperliquid-ohlcv.n.dwellir.com/{api_key}/v1/candles",
        params={
            "market": market,
            "interval": interval,
            "time": format_utc(bucket_start),
        },
        timeout=30,
    )

    # Sparse dataset: a bucket with no trades returns 404; treat it as absent.
    if response.status_code == 404:
        return None

    response.raise_for_status()
    return response.json()


def export_parquet(api_key: str, market: str, interval: str, start: str, end: str, output_path: str):
    if interval not in STEP_BY_INTERVAL:
        raise ValueError(f"unsupported interval: {interval}")

    start_dt = parse_utc(start)
    end_dt = parse_utc(end)
    step = STEP_BY_INTERVAL[interval]

    rows = []
    for bucket_start in iter_bucket_starts(start_dt, end_dt, step):
        candle = fetch_candle(api_key, market, interval, bucket_start)
        if candle is None:
            continue
        rows.append(candle)

    # Price and volume fields are kept as strings; cast downstream if you need floats.
    table = pa.Table.from_pylist(rows, schema=pa.schema([
        ("market", pa.string()),
        ("interval", pa.string()),
        ("bucket_start", pa.string()),
        ("open", pa.string()),
        ("high", pa.string()),
        ("low", pa.string()),
        ("close", pa.string()),
        ("volume", pa.string()),
        ("trades_count", pa.int64()),
        ("vwap", pa.string()),
    ]))

    pq.write_table(table, output_path)


if __name__ == "__main__":
    if len(sys.argv) != 7:
        raise SystemExit(
            "usage: python export_ohlcv_parquet.py <API_KEY> <MARKET> <INTERVAL> <START> <END> <OUT_PARQUET>"
        )

    _, api_key, market, interval, start, end, output_path = sys.argv
    export_parquet(api_key, market, interval, start, end, output_path)

Run It

Example: export BTC 1m candles for 30 days:

Bash
python export_ohlcv_parquet.py \
  YOUR_API_KEY \
  BTC \
  1m \
  2026-03-01T00:00:00Z \
  2026-03-30T23:59:00Z \
  btc-1m-march-2026.parquet

Example: export SOL 1s candles for one hour:

Bash
python export_ohlcv_parquet.py \
  YOUR_API_KEY \
  SOL \
  1s \
  2026-03-30T12:00:00Z \
  2026-03-30T12:59:59Z \
  sol-1s-2026-03-30T12.parquet

Reading the Result

Quick validation with Python:

Python
import pyarrow.parquet as pq

table = pq.read_table("btc-1m-march-2026.parquet")
print(table.schema)
print(table.slice(0, 5).to_pydict())

Quick validation with DuckDB:

SQL
SELECT *
FROM 'btc-1m-march-2026.parquet'
ORDER BY bucket_start
LIMIT 5;

Historical Coverage Notes

  • The intended historical floor is 2025-07-27T08:00:00Z.
  • Use interval-aligned bucket-open timestamps when you build your range.
  • Buckets that return 404 are omitted from the Parquet file; they do not exist in the source dataset, so no candles are fabricated.
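Interval alignment can be done with a small helper that floors an arbitrary UTC timestamp to its bucket-open boundary. This is a sketch; the `align_to_interval` name is mine, not part of the API:

```python
from datetime import datetime, timedelta, timezone


def align_to_interval(value: datetime, step: timedelta) -> datetime:
    """Floor a UTC datetime to the most recent bucket-open boundary."""
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    return value - ((value - epoch) % step)


# 12:03:41 floors to the 12:00:00 bucket open for the 5m interval.
ts = datetime(2026, 3, 30, 12, 3, 41, tzinfo=timezone.utc)
print(align_to_interval(ts, timedelta(minutes=5)))  # 2026-03-30 12:00:00+00:00
```

Aligning the start of your range this way avoids requesting timestamps that fall between bucket opens.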

When to Use Parquet

Choose Parquet when you want:

  • efficient long-range exports
  • better compression than CSV
  • direct use in DuckDB, Polars, Spark, or pandas analytics pipelines

If you want a simpler flat-file export, use the companion guide: