🧑‍💻 About me
I am a graduate student at the School of Software, Tsinghua University, with a background in information and communication engineering from Beijing University of Posts and Telecommunications.
I'm interested in database systems, with a focus on storage, query optimization, and scheduling, and I am actively involved in the evolution of open-source databases and file formats. My current work focuses on optimizing file formats and database systems for emerging hardware and AI workloads.
🚀 Current Projects and Explorations
- Apache TsFile -- Exploring optimizations of the time-series file format for embedded and analytical workloads, with a focus on improving performance under widespread SIMD adoption and AI training scenarios.
- MiniGU -- Contributing to the development of a Rust-based embedded graph database with GQL support, currently aimed at educational and research use in universities, and exploring native graph–vector integration for future RAG systems.
- [SafeBound-for-Update] -- Investigating a pessimistic cardinality estimation method for graph path matching that better accommodates update operations while providing a tighter upper bound than the original SafeBound.
- [LoadaWise-ORCA-for-NewHardwareDB] -- Designing load-aware query optimization and scheduling methods in the ORCA optimizer framework, targeting multi-model databases on emerging hardware platforms.
📜 Past Projects:
- Apache IoTDB -- Developed the authorization and user management system, designing and implementing permission mechanisms for both the tree model and the table model.
- TugraphDB -- Designed a metadata encoding scheme that resolves update anomalies caused by the original encoding method, enabling metadata modifications with O(1) complexity.
- Apache HAWQ (commercial version a.k.a. OushuDB) -- Implemented resource and cluster virtualization to support multi-tenancy and affinity-based query scheduling.
- Benchmark for Multi-Language TsFile -- A configurable performance benchmarking tool for TsFile across multiple programming languages. It supports scheduled execution and continuous monitoring, with results periodically reported to the TsFile community via GitHub issues.
- SIMD_TS2DIFF -- A vectorized decoding approach for the core encoding/decoding method in TsFile, incorporating block-level filtering with bit-packing. This design improves time-range query performance by up to two orders of magnitude.
📰 News:
- Gave a talk on data model conversion in IoTDB at CCF BigData 2025.
- Released TsFile (C, C++, Python) V2.1.0 as Release Manager.
- Honored to have become a committer of Apache TsFile.
- Recognized as Tugraph Core Contributor of the Year by the community.
- Honored to have become a committer of Apache IoTDB.



