GITBOOK-25: update experimental doc
LexLuthr authored and gitbook-bot committed Oct 16, 2024
1 parent d71f3c6 commit 77566fa
Showing 3 changed files with 87 additions and 0 deletions.
2 changes: 2 additions & 0 deletions documentation/en/SUMMARY.md
@@ -26,3 +26,5 @@
* [Curio](curio-cli/curio.md)
* [Sptool](curio-cli/sptool.md)
* [API](api.md)
* [Experimental Features](experimental-features/README.md)
* [GPU Over Provisioning](experimental-features/gpu-over-provisioning.md)
13 changes: 13 additions & 0 deletions documentation/en/experimental-features/README.md
@@ -0,0 +1,13 @@
---
description: This section covers the current experimental features available in Curio
---

# Experimental Features

Curio regularly adds new features as part of its ongoing development. This section covers the experimental features currently available in Curio, along with details on how to use them.

It is **not** recommended to run experimental features in production environments. Test each feature against your own requirements, and report any issues or requests to the team via GitHub or Slack.

Once the new features have been tested and vetted, they may be released as part of a stable Curio release and all documentation concerning those features will be moved to an appropriate section of this site.

Current experimental features are listed below.
72 changes: 72 additions & 0 deletions documentation/en/experimental-features/gpu-over-provisioning.md
@@ -0,0 +1,72 @@
---
description: >-
  This page explains how to allow Curio to run multiple GPU tasks on a single
  GPU at the same time
---

# GPU Over Provisioning

## Overview

The `HARMONY_GPU_OVERPROVISION_FACTOR` environment variable enables GPU over-provisioning by allowing each physical GPU to present itself as multiple logical GPUs. When set to a value greater than 1, this feature allows a single GPU to handle multiple independent processes concurrently.

## Usage

### Enabling Over-Provisioning

Set the `HARMONY_GPU_OVERPROVISION_FACTOR` environment variable to the desired over-provisioning factor.

#### **Example**

```bash
export HARMONY_GPU_OVERPROVISION_FACTOR=2
```

* **Effect**: Each physical GPU is treated as two logical GPUs.
* **Application**: In a snap encode worker, this setting allows each GPU to handle two independent encode processes simultaneously.
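
The relationship between physical and logical GPUs described above can be sketched as simple arithmetic. The `logical_gpus` helper below is hypothetical and only illustrates the multiplication; it is not how Curio computes the value internally, and treating an unset factor as 1 is an assumption.

```bash
#!/usr/bin/env bash
# Hypothetical helper: illustrates the arithmetic only, not Curio's internal logic.
logical_gpus() {
  local physical="$1"
  local factor="${2:-1}"   # assumption: a missing factor means no over-provisioning
  echo $(( physical * factor ))
}

# Two physical GPUs with a factor of 2 would be presented as four logical GPUs.
logical_gpus 2 2
```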

#### Example with Service File

**/etc/curio.env File**

```sh
CURIO_LAYERS=gui,post
CURIO_ALL_REMAINING_FIELDS_ARE_OPTIONAL=true
CURIO_DB_HOST=yugabyte1,yugabyte2,yugabyte3
CURIO_DB_USER=yugabyte
CURIO_DB_PASSWORD=yugabyte
CURIO_DB_PORT=5433
CURIO_DB_NAME=yugabyte
CURIO_REPO_PATH=~/.curio
CURIO_NODE_NAME=ChangeMe
FIL_PROOFS_USE_MULTICORE_SDR=1
HARMONY_GPU_OVERPROVISION_FACTOR=2
```
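
If the environment file already exists, the variable can be added or updated without creating duplicate entries. The following is a minimal sketch: `set_env_var` is a hypothetical helper, and the example runs against a scratch file rather than the live `/etc/curio.env`. Curio would presumably need to be restarted to pick up the change.

```bash
#!/usr/bin/env bash
# Hypothetical helper: set KEY=VALUE in an env file, replacing any existing line.
set_env_var() {
  local file="$1" key="$2" value="$3"
  if grep -q "^${key}=" "$file" 2>/dev/null; then
    # Rewrite the existing assignment via a temp file for portability.
    sed "s|^${key}=.*|${key}=${value}|" "$file" > "${file}.tmp" && mv "${file}.tmp" "$file"
  else
    printf '%s\n' "${key}=${value}" >> "$file"
  fi
}

# Demonstration on a scratch file (assumption: you would point this at your own env file).
env_file="$(mktemp)"
printf 'CURIO_LAYERS=gui,post\n' > "$env_file"
set_env_var "$env_file" HARMONY_GPU_OVERPROVISION_FACTOR 2
set_env_var "$env_file" HARMONY_GPU_OVERPROVISION_FACTOR 3   # updates, does not duplicate
cat "$env_file"
```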

## Considerations

* **Workload Compatibility**: Ideal for workloads that are not heavily memory-bound.
* **Snap Encode Workloads**: Generally suitable for over-provisioning.
* **SNARK Workloads**: May encounter memory limitations, especially on GPUs with lower memory capacity.
* **GPU Specifications**: Enterprise GPUs with higher memory are better suited for over-provisioning.
* **Performance Testing**: It's important to test and validate the optimal over-provisioning factor for your specific hardware and workloads.

### Benefits

* **Increased Throughput**: Potentially improves processing capacity per GPU.
* **Enhanced Utilization**: Makes better use of GPU resources that might otherwise be underutilized.

### Limitations

* **Memory Constraints**: Over-provisioning can lead to memory bottlenecks on GPUs with limited memory.
* **Potential Instability**: Running multiple processes on a single GPU may reduce system stability and performance.

### Recommendations

* **Start with Lower Values**: Begin with an over-provisioning factor of 2 and monitor system performance.
* **Monitor Resource Usage**: Keep an eye on GPU memory usage, temperatures, and overall system load.
* **Increment Gradually**: Adjust the over-provisioning factor incrementally to find the optimal balance.
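
One way to watch GPU memory while trying different factors is to poll `nvidia-smi` and flag GPUs that are close to full. This is a sketch under stated assumptions: it assumes NVIDIA GPUs with `nvidia-smi` installed, `check_gpu_memory` is a hypothetical helper, and the 90% threshold is an arbitrary illustration rather than a recommended value.

```bash
#!/usr/bin/env bash
# Hypothetical helper: reads "index, used_MiB, total_MiB" lines on stdin and
# prints a warning for any GPU whose memory use meets or exceeds the threshold (%).
check_gpu_memory() {
  local threshold="${1:-90}" idx used total pct
  while IFS=', ' read -r idx used total; do
    # Skip blank lines and rows without a usable numeric total.
    [ -n "$idx" ] || continue
    [ "${total:-0}" -gt 0 ] 2>/dev/null || continue
    pct=$(( used * 100 / total ))
    if [ "$pct" -ge "$threshold" ]; then
      echo "GPU $idx: memory at ${pct}% (${used}/${total} MiB)"
    fi
  done
}

# Only query the driver if nvidia-smi is actually available on this machine.
if command -v nvidia-smi >/dev/null 2>&1; then
  nvidia-smi --query-gpu=index,memory.used,memory.total \
             --format=csv,noheader,nounits | check_gpu_memory 90
fi
```

Running this periodically (for example, from `watch` or a cron job) after each change to the factor gives a rough signal of whether memory is becoming the bottleneck.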

### Feedback and Support

As this is an experimental feature, we encourage users to provide feedback on their experience. Your insights are valuable for improving GPU over-provisioning support in future releases.
