About Me
AI Research Engineer with a deep passion for neural networks, large language models, and cutting-edge machine learning techniques. Skilled in designing and optimizing scalable systems, with a strong foundation in PyTorch and CUDA. Committed to open-source development, multilingual AI accessibility, and building intuitive tools that translate research into practical, real-world applications.
For more, check out my projects and publications:
- ExpertRAG: Efficient RAG with Mixture of Experts -- Optimizing Context Retrieval for Adaptive LLM Responses
- Galvatron: Automatic Distributed Training for Large Transformer Models
- Theoretical Foundations and Mitigation of Hallucination in Large Language Models
- Mixture of Transformers: Macro-Level Gating for Sparse Activation in Large Language Model Ensembles
- Bachelor Thesis: AI Engine: Deep Learning and Neural Network Engine
- Universal Approximation Theorem for a Single-Layer Transformer
- Mixture of Attention Schemes (MoAS): Learning to Route Between MHA, GQA, and MQA
