Commit cf54e25

Add latest machine learning and openmp projects for Clad (#1691)

1 parent 42f827e commit cf54e25

3 files changed: +126 -0 lines changed

@@ -0,0 +1,44 @@
---
title: Enhancing LLM Training with Clad for efficient differentiation
layout: gsoc_proposal
project: Clad
year: 2025
difficulty: medium
duration: 350
mentor_avail: June-October
organization:
 - CompRes
---

## Description

This project aims to leverage Clad, an automatic differentiation (AD) plugin for Clang, to optimize large language model (LLM) training, primarily in C++. Automatic differentiation is a crucial component of deep learning training, enabling efficient computation of gradients for optimization algorithms such as stochastic gradient descent (SGD). While most modern LLM frameworks rely on Python-based ecosystems, their dependence on interpreted code and dynamic computation graphs can introduce performance bottlenecks. By integrating Clad into C++-based deep learning pipelines, we can enable high-performance differentiation at the compiler level, reducing computational overhead and improving memory efficiency. This will allow developers to build more optimized training workflows without sacrificing flexibility or precision.
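
As a concrete illustration of compiler-level AD in this setting, here is a minimal sketch that uses Clad’s `clad::gradient` API to generate the reverse-mode gradient of a toy squared-error loss and drives a plain SGD loop with it. The model and all names are invented for illustration, and exact API details may vary across Clad versions.

```cpp
// loss_sgd.cpp -- minimal Clad reverse-mode AD driving an SGD loop.
#include "clad/Differentiator/Differentiator.h"
#include <cstdio>

// Toy model: prediction w*x + b, squared error against target y.
double loss(double w, double b, double x, double y) {
  double err = w * x + b - y;
  return err * err;
}

int main() {
  // Clad generates the gradient function at compile time; by default
  // it differentiates with respect to every parameter.
  auto dloss = clad::gradient(loss);

  double w = 0.0, b = 0.0;
  const double x = 2.0, y = 3.0, lr = 0.05;
  for (int step = 0; step < 200; ++step) {
    double dw = 0, db = 0, dx = 0, dy = 0; // adjoints, zeroed each step
    dloss.execute(w, b, x, y, &dw, &db, &dx, &dy);
    w -= lr * dw; // plain SGD update on the trainable parameters
    b -= lr * db;
  }
  std::printf("w = %f, b = %f\n", w, b); // w*2 + b approaches 3
}
```

Building such a file requires loading the Clad plugin into Clang, e.g. `clang++ -std=c++14 -fplugin=<path-to-clad.so> loss_sgd.cpp`, with the plugin path depending on the local installation.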

Beyond performance improvements, integrating Clad with LLM training in C++ opens new possibilities for deploying AI models in resource-constrained environments, such as embedded systems and HPC clusters, where minimizing memory footprint and maximizing computational efficiency are critical. Additionally, this work will bridge the gap between modern deep learning research and traditional scientific computing by providing a more robust and scalable AD solution for physics-informed machine learning models. By optimizing the differentiation process at the compiler level, this project has the potential to enhance both research and production-level AI applications, aligning with compiler-research.org’s broader goal of advancing computational techniques for scientific discovery.

## Expected Results

* Develop a simplified LLM setup in C++
* Apply Clad to compute gradients for selected layers and loss functions (a sketch follows this list)
* Enhance Clad where necessary to support them, and prepare performance benchmarks
* Scale the model up toward larger projects such as llama
* Iterate on bug fixes and benchmarking
* Develop tests to ensure correctness, numerical stability, and efficiency
* Document the approach, implementation details, and performance gains
* Present progress and findings at relevant meetings and conferences
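
For the layer-and-loss item above, Clad can also restrict differentiation to named parameters, which keeps the generated gradient code small. A minimal sketch, assuming a single invented sigmoid neuron (`std::exp` is among the math functions Clad knows how to differentiate):

```cpp
// neuron.cpp -- differentiating only the trainable parameters.
#include "clad/Differentiator/Differentiator.h"
#include <cmath>
#include <cstdio>

// One sigmoid neuron followed by a squared-error loss.
double neuron_loss(double w, double b, double x, double target) {
  double out = 1.0 / (1.0 + std::exp(-(w * x + b)));
  double err = out - target;
  return err * err;
}

int main() {
  // Differentiate with respect to "w, b" only; x and target are
  // treated as constants, so no adjoints are generated for them.
  auto grad = clad::gradient(neuron_loss, "w, b");

  double dw = 0, db = 0;
  grad.execute(/*w=*/0.1, /*b=*/0.0, /*x=*/1.5, /*target=*/0.8,
               &dw, &db);
  std::printf("dL/dw = %f, dL/db = %f\n", dw, db);
}
```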

## Requirements

* Automatic differentiation
* Parallel programming
* Reasonable expertise in C++ programming
* Background in LLMs is preferred but not required

## Mentors

* **[Vassil Vassilev](mailto:[email protected])**
* [David Lange](mailto:[email protected])

## Links

* [Repo](https://github.com/vgvassilev/clad)

@@ -0,0 +1,43 @@
---
title: Enable automatic differentiation of ONNX-based machine learning models with Clad
layout: gsoc_proposal
project: Clad
year: 2025
difficulty: medium
duration: 350
mentor_avail: June-October
organization:
 - CompRes
---

## Description

Clad is an automatic differentiation (AD) Clang plugin for C++. Given the C++ source code of a mathematical function, it can automatically generate C++ code for computing derivatives of the function. Clad is useful in powering statistical analysis and uncertainty assessment applications.

ONNX (Open Neural Network Exchange) provides a standardized format for machine learning models and is widely used for interoperability between frameworks such as PyTorch and TensorFlow.

This project aims to integrate Clad with ONNX-based machine learning models. Clad can generate derivative computations for C++ functions, making it useful for sensitivity analysis, optimization, and uncertainty quantification. By extending Clad’s capabilities to ONNX models, this project will enable efficient differentiation of neural network operations within an ONNX execution environment.
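
As a flavour of what such differentiation looks like, the sketch below applies Clad’s forward mode to a hand-written scalar kernel mirroring the ONNX Sigmoid operator and checks the result against the closed form. The kernel is illustrative only and is not taken from any ONNX runtime:

```cpp
// onnx_sigmoid.cpp -- forward-mode derivative of a scalar kernel
// mirroring the ONNX Sigmoid operator (illustrative, hand-written).
#include "clad/Differentiator/Differentiator.h"
#include <cmath>
#include <cstdio>

// ONNX Sigmoid, elementwise: y = 1 / (1 + exp(-x)).
double sigmoid(double x) {
  return 1.0 / (1.0 + std::exp(-x));
}

int main() {
  // Forward mode: generate d(sigmoid)/dx as a new C++ function.
  auto dsigmoid = clad::differentiate(sigmoid, "x");

  double x = 0.5;
  double dydx = dsigmoid.execute(x);
  // Closed form: sigmoid'(x) = sigmoid(x) * (1 - sigmoid(x)).
  double expected = sigmoid(x) * (1.0 - sigmoid(x));
  std::printf("clad: %f, closed form: %f\n", dydx, expected);
}
```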

## Expected Results

* Enumerate ONNX modules of increasing complexity and analyze their differentiation requirements.
* Develop a structured plan for differentiating the identified ONNX operations.
* Implement forward mode differentiation for selected ONNX operations.
* Extend support to reverse mode differentiation for more complex cases (see the sketch after this list).
* Create comprehensive tests to validate correctness and efficiency.
* Write clear documentation to ensure ease of use and future maintenance.
* Present results at relevant meetings and conferences.
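
For the reverse-mode item above, one step up in complexity is an operation with control flow, such as a linear op followed by ONNX Relu. A minimal sketch with an invented scalar kernel, assuming Clad’s reverse mode handles the branch (Clad documents support for control flow, though behaviour may vary by version):

```cpp
// linear_relu.cpp -- reverse mode through control flow (sketch).
#include "clad/Differentiator/Differentiator.h"
#include <cstdio>

// A linear op followed by ONNX Relu, reduced to scalars.
double linear_relu(double w, double x, double b) {
  double z = w * x + b;
  if (z < 0.0)
    return 0.0; // Relu clamps negative pre-activations
  return z;
}

int main() {
  auto grad = clad::gradient(linear_relu);

  double dw = 0, dx = 0, db = 0;
  grad.execute(/*w=*/2.0, /*x=*/1.5, /*b=*/-1.0, &dw, &dx, &db);
  // On the active branch (z = 2.0 > 0): d/dw = x, d/dx = w, d/db = 1.
  std::printf("dw = %f, dx = %f, db = %f\n", dw, dx, db);
}
```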

## Requirements

* Automatic differentiation
* Parallel programming
* Reasonable expertise in C++ programming
* Basic knowledge of Clang is preferred but not mandatory

## Mentors

* **[Vassil Vassilev](mailto:[email protected])**
* [David Lange](mailto:[email protected])

## Links

* [Repo](https://github.com/vgvassilev/clad)

@@ -0,0 +1,39 @@
---
title: Enable automatic differentiation of OpenMP programs with Clad
layout: gsoc_proposal
project: Clad
year: 2025
difficulty: medium
duration: 350
mentor_avail: June-October
organization:
 - CompRes
---

## Description

Clad is an automatic differentiation (AD) Clang plugin for C++. Given the C++ source code of a mathematical function, it can automatically generate C++ code for computing derivatives of the function. Clad is useful in powering statistical analysis and uncertainty assessment applications.

OpenMP (Open Multi-Processing) is an application programming interface (API) that supports multi-platform shared-memory multiprocessing in C, C++, and Fortran.

This project aims to develop infrastructure in Clad to support the differentiation of programs that contain OpenMP primitives. The central difficulty is that reverse-mode AD inverts the primal program’s data flow: values that many threads read concurrently in the original code produce adjoints that many threads must write concurrently in the generated code, so gradient accumulation has to be made thread-safe.
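
To make the target concrete, here is a representative OpenMP kernel of the kind this infrastructure would need to handle: a scaled dot product parallelised with a reduction clause. The function is invented for illustration, and the `clad::gradient` call in the trailing comment is aspirational rather than working code today, since handling the pragma is exactly what this project would add:

```cpp
// scaled_dot.cpp -- a representative OpenMP kernel to differentiate;
// compile with -fopenmp. f(w, x, s) = sum_i s * w[i] * x[i].
double scaled_dot(const double* w, const double* x, int n, double s) {
  double sum = 0.0;
  #pragma omp parallel for reduction(+ : sum)
  for (int i = 0; i < n; ++i)
    sum += s * w[i] * x[i];
  return sum;
}

// Intended usage once Clad understands the pragma (aspirational):
//   auto grad = clad::gradient(scaled_dot, "w, s");
```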

## Expected Results

* Extend Clad’s pragma handling support.
* List the most commonly used OpenMP concurrency primitives and prepare a plan for how they should be handled in both forward and reverse accumulation in Clad.
* Add support for concurrency primitives in Clad’s forward and reverse mode automatic differentiation (a hand-written sketch of the reverse-mode challenge follows this list).
* Add proper tests and documentation.
* Present the work at relevant meetings and conferences.
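
As referenced in the list above, the reverse-mode challenge shows up in a hand-written pullback for the `scaled_dot` kernel from the Description. This is a sketch of the shape of code Clad would have to generate, not actual Clad output: private reads in the primal become race-free adjoint writes, while the shared read of `s` becomes a shared adjoint write that must be atomic:

```cpp
// scaled_dot_pullback.cpp -- hand-written reverse pass (sketch only);
// compile with -fopenmp. Primal: sum = sum_i s * w[i] * x[i].
#include <cstdio>

void scaled_dot_pullback(const double* w, const double* x, int n,
                         double s, double d_sum,
                         double* d_w, double* d_s) {
  #pragma omp parallel for
  for (int i = 0; i < n; ++i) {
    // w[i] is read by exactly one iteration in the primal, so its
    // adjoint is written by exactly one thread here: no race.
    d_w[i] += d_sum * s * x[i];
    // s is read by every iteration, so its adjoint is written by
    // every thread; the accumulation must be atomic to stay correct.
    #pragma omp atomic
    *d_s += d_sum * w[i] * x[i];
  }
}

int main() {
  const double w[] = {1.0, 2.0}, x[] = {3.0, 4.0};
  double d_w[2] = {0.0, 0.0}, d_s = 0.0;
  scaled_dot_pullback(w, x, 2, /*s=*/0.5, /*d_sum=*/1.0, d_w, &d_s);
  // Expect d_w = {1.5, 2.0} and d_s = 1*3 + 2*4 = 11.
  std::printf("d_w = {%f, %f}, d_s = %f\n", d_w[0], d_w[1], d_s);
}
```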

## Requirements

* Automatic differentiation
* C++ programming
* Parallel programming

## Mentors

* **[Vassil Vassilev](mailto:[email protected])**
* [David Lange](mailto:[email protected])

## Links

* [Repo](https://github.com/vgvassilev/clad)
