This repository is deprecated and has been taken private as of 2025-08-18.
This repository manages high-frequency Level 2 (DOM) data from a major cryptocurrency spot exchange. The goal is to structure this data for use in Reinforcement Learning (RL) and Transformer-based models to extract alpha signals from market microstructure dynamics.
- Automate the download and organization of spot OrderBook tick data.
- Visualize snapshots and deltas via GUI.
- Generate datasets at fine timestamp resolutions (e.g., 10s or true tick).
- Build infrastructure for training RL and Transformer models:
  - Focus exclusively on order book data (exclude chart-based history).
  - Normalize time series to eliminate symbol dependency.
  - Use recent time windows to predict the likelihood of future price movement.
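One possible way to remove symbol dependency is sketched below. It is an illustrative assumption, not the method used in this repository: prices are expressed as offsets from the mid price in basis points, and sizes as shares of the total visible depth. The function name `normalize_snapshot` and the `(price, size)` level layout are hypothetical.

```python
from typing import Dict, List, Tuple

Level = Tuple[float, float]  # (price, size); hypothetical layout for this sketch


def normalize_snapshot(bids: List[Level], asks: List[Level]) -> Dict[str, object]:
    """Express one order book snapshot in symbol-independent units."""
    best_bid = max(price for price, _ in bids)
    best_ask = min(price for price, _ in asks)
    mid = (best_bid + best_ask) / 2.0
    total_size = sum(size for _, size in bids) + sum(size for _, size in asks)

    def relative(levels: List[Level]) -> List[Level]:
        # price -> offset from mid in basis points; size -> share of visible depth
        return [((price - mid) / mid * 1e4, size / total_size) for price, size in levels]

    return {
        "spread_bps": (best_ask - best_bid) / mid * 1e4,
        "bids": relative(bids),
        "asks": relative(asks),
    }
```

Under this scheme, a book quoted around 100 and a book quoted around 100,000 with the same relative shape map to the same features, which is the property the normalization goal asks for.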
- Date: 2025.07.27
- Range (78 days): 2025.05.10 ~ 2025.07.26
- Symbols (30):
ETHUSDT,BTCUSDT,SOLUSDT,ETHUSDC,BTCUSDC,XRPUSDT,PEPEUSDT,DOGEUSDT,SUIUSDT,AAVEUSDT,ONDOUSDT,SOLUSDC,ADAUSDT,XRPUSDC,LTCUSDT,DOGEUSDC,HBARUSDT,UNIUSDT,SUIUSDC,DOTUSDT,ADAUSDC,WLDUSDT,NEARUSDT,AVAXUSDT,TONUSDT,BCHUSDT,PEPEUSDC,LINKUSDT,BNBUSDT,SHIBUSDT
The next data acquisition date is 13 days after the latest one, e.g.:
Latest Acq. Date +00: 2025-07-27
Next Acq. Date   +13: 2025-08-09
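The schedule arithmetic can be checked with a few lines of Python. This is a minimal sketch; the helper name `next_acquisition_date` and the constant `ACQ_INTERVAL_DAYS` are made up for illustration.

```python
from datetime import date, timedelta

ACQ_INTERVAL_DAYS = 13  # interval stated in this README


def next_acquisition_date(latest: date, interval_days: int = ACQ_INTERVAL_DAYS) -> date:
    """Return the date of the next acquisition run."""
    return latest + timedelta(days=interval_days)


def inclusive_days(start: date, end: date) -> int:
    """Number of calendar days in a range, counting both endpoints."""
    return (end - start).days + 1


print(next_acquisition_date(date(2025, 7, 27)))           # 2025-08-09
print(inclusive_days(date(2025, 5, 10), date(2025, 7, 26)))  # 78
```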
- 📘 OrderBook Format (Tick-level DOM snapshots and deltas: .data files)
- 📙 Execution Format (Trade history CSV with RPI flags: .csv files)
See also 🔗 ByBit Data Explanation.
- Summarize the features, advantages, and limitations of Spot Chart and Order Book data, including their intended purpose.
- Fully understand the field structures and data schemas involved.
- Add relevant papers and video links that help explain the underlying concepts.
This script automates the download of historical spot chart data (executions) and DOM snapshots (orderbook) from ByBit public archives.
It supports multi-day, multi-symbol batch retrieval via parallel curl executions
and includes post-download validation to ensure data format integrity.
Configuration parameters (e.g., date range, trading pairs, max parallelism)
are specified in the external file get_bybit_chart_dom.conf.
Please note: This script involves rate-sensitive infrastructure. Public sharing is deliberately restricted.
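Since the script itself is not published, the following is only an independent sketch of the batching pattern described above: one job per symbol per day, executed with bounded parallelism. The names `build_jobs`, `run_jobs`, and `fetch` are hypothetical, and no archive URL or config format is assumed. In a real run, `fetch` might wrap a `curl` call via `subprocess.run`.

```python
from concurrent.futures import ThreadPoolExecutor
from datetime import date, timedelta
from itertools import product
from typing import Callable, Iterable, List, Tuple

Job = Tuple[str, date]  # (symbol, trading day)


def build_jobs(symbols: Iterable[str], start: date, end: date) -> List[Job]:
    """One job per symbol per day, with both endpoints inclusive."""
    days = [start + timedelta(days=i) for i in range((end - start).days + 1)]
    return list(product(symbols, days))


def run_jobs(
    jobs: List[Job],
    fetch: Callable[[str, date], bool],  # hypothetical: returns True on success
    max_parallel: int = 4,
) -> List[Job]:
    """Run fetch over all jobs with bounded parallelism; return the failed jobs."""
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        results = list(pool.map(lambda job: fetch(*job), jobs))
    return [job for job, ok in zip(jobs, results) if not ok]
```

Returning the failed jobs rather than raising makes it straightforward to retry only the missing symbol/day pairs on a later run.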
A companion script to validate the integrity of previously downloaded .csv.gz and .data.zip files.
It performs full-format checks using parallel validation,
ensuring the data is safe to use for RL training or analysis.
This script is intended to be run after get_bybit_chart_dom.py
and does not require internet access.
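A minimal sketch of offline, parallel archive validation is shown below. It only checks container-level integrity (the gzip stream decompresses to the end, the zip members pass their CRC check) and does not reproduce whatever field-level checks the actual script performs. The function names are hypothetical.

```python
import gzip
import zipfile
import zlib
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Dict, Iterable


def is_valid_archive(path: Path) -> bool:
    """Return True if the archive decompresses cleanly and is non-empty."""
    try:
        if path.name.endswith(".csv.gz"):
            size = 0
            with gzip.open(path, "rb") as fh:
                # Reading to EOF forces the gzip CRC and length check.
                while chunk := fh.read(1 << 20):
                    size += len(chunk)
            return size > 0
        if path.name.endswith(".data.zip"):
            with zipfile.ZipFile(path) as zf:
                # testzip() returns the first corrupt member name, or None.
                return bool(zf.namelist()) and zf.testzip() is None
    except (OSError, EOFError, zlib.error, zipfile.BadZipFile):
        return False
    return False  # unknown extension


def validate_all(paths: Iterable[Path], max_parallel: int = 4) -> Dict[Path, bool]:
    """Validate many archives concurrently; map each path to its result."""
    paths = list(paths)
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        return dict(zip(paths, pool.map(is_valid_archive, paths)))
```

A call such as `validate_all(Path("data").rglob("*.gz"))` would then report which files need to be downloaded again; the `data` directory name is only an example.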
For detailed instructions on setting up the development environment, including Docker Desktop and VS Code integration for cross-platform workflows, refer to:
This guide provides step-by-step instructions for configuring Windows-based development environments with Ubuntu deployment targets, ensuring consistency and compatibility across platforms.
Deliver a normalized, high-resolution dataset for training RL and Transformer models capable of predicting future market behavior from raw order book streams.
This project is licensed under the ✍️ Creative Commons Attribution-NonCommercial 4.0 International License – Legal Code.
🚫💰 Commercial use is prohibited.
✨🛠️ Adaptation is permitted with attribution.
