HDDS-13579. [Docs] Explain how Ratis write pipelines are calculated #9580
Open: jojochuang wants to merge 4 commits into apache:master from jojochuang:HDDS-13579
+57
−0
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -69,6 +69,63 @@ Ratis handles concurrent logs per node. | |
| This property is effective only when the previous property is set to 0. | ||
| The value of this property must be greater than 0. | ||
|
|
||
| ### Calculating Ratis Pipeline Limits | ||
|
|
||
| Three configuration properties control the number of ReplicationFactor.THREE Ratis pipelines: | ||
| one limits pipelines at the cluster-wide level, and two limit them at the datanode level. | ||
| SCM creates pipelines only up to these limits. | ||
|
|
||
| 1. **Cluster-wide Limit (`ozone.scm.ratis.pipeline.limit`)** | ||
| * **Description**: An absolute, global limit for the total number of open Ratis pipelines | ||
| across the entire cluster. This acts as a final cap on the total number of pipelines. | ||
| * **Default Value**: `0` (which means no global limit by default). | ||
|
|
||
| 2. **Datanode-level Fixed Limit (`ozone.scm.datanode.pipeline.limit`)** | ||
| * **Description**: When set to a positive number, this property defines a fixed maximum number of pipelines for | ||
| every datanode. | ||
| * **Default Value**: `2` | ||
| * **Cluster-wide Limit Calculation**: If this property is set to a positive number, | ||
| the number of pipelines in the cluster is additionally limited to | ||
| `(<this value> * <number of healthy datanodes>) / 3`. | ||
|
|
||
| 3. **Datanode-level Dynamic Limit (`ozone.scm.pipeline.per.metadata.disk`)** | ||
| * **Description**: This property takes effect only when `ozone.scm.datanode.pipeline.limit` is not set to a positive number. | ||
| SCM then calculates a dynamic limit for each datanode: the number of its metadata disks multiplied by this value. | ||
| * **Default Value**: `2` | ||
|
|
||
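As a rough illustration of the fixed-limit formula above, the following Python sketch computes the cluster-wide cap implied by `ozone.scm.datanode.pipeline.limit`. The function name and the use of integer (floor) division are assumptions made for this example; they are not taken from the SCM source.

```python
def fixed_limit_target(per_datanode_limit, healthy_datanodes, replication_factor=3):
    """Cluster-wide cap implied by a fixed per-datanode limit.

    Hypothetical helper: mirrors (<limit> * <healthy datanodes>) / 3 from the text.
    Floor division is an assumption about how the result is rounded.
    """
    return (per_datanode_limit * healthy_datanodes) // replication_factor


# With the default limit of 2 and 10 healthy datanodes:
print(fixed_limit_target(2, 10))  # (2 * 10) // 3
```

Under these assumptions, 10 healthy datanodes with the default limit of 2 would yield a cap of 6 pipelines.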
| #### How Limits are Applied | ||
|
|
||
| SCM first calculates a target number of pipelines based on either the **Datanode-level Fixed Limit** or the | ||
| **Datanode-level Dynamic Limit**. If the **Cluster-wide Limit** is set (greater than 0), SCM then compares the | ||
| calculated target to it, and the **lower value** is used as the final target for the number of open pipelines. | ||
|
|
||
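The comparison step described above can be sketched as follows. This is a minimal illustration, assuming that a cluster-wide limit of `0` means "no cap", as stated for the default value; the function name is hypothetical.

```python
def apply_cluster_limit(datanode_based_target, cluster_limit):
    """Return the final pipeline target after applying the cluster-wide cap.

    Hypothetical helper. A cluster_limit of 0 (the documented default)
    is treated as "no global limit".
    """
    if cluster_limit > 0:
        return min(datanode_based_target, cluster_limit)
    return datanode_based_target


print(apply_cluster_limit(24, 30))  # cap is above the target, target wins
print(apply_cluster_limit(24, 20))  # cap is below the target, cap wins
print(apply_cluster_limit(24, 0))   # no cap configured
```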
| **Example (Dynamic Limit):** | ||
|
|
||
| Consider a cluster with **10 healthy datanodes**. | ||
| * **8 datanodes** have 4 metadata disks each. | ||
| * **2 datanodes** have 2 metadata disks each. | ||
|
|
||
| And the configuration is: | ||
| * `ozone.scm.ratis.pipeline.limit` = **30** (A global cap is set) | ||
| * `ozone.scm.datanode.pipeline.limit` = **0** (Use dynamic calculation) | ||
| * `ozone.scm.pipeline.per.metadata.disk` = **2** (Default) | ||
|
|
||
| **Calculation Steps:** | ||
| 1. Calculate the limit for the first group of datanodes: `8 datanodes * (2 pipelines/disk * 4 disks/datanode) = 64 pipelines` | ||
| 2. Calculate the limit for the second group of datanodes: `2 datanodes * (2 pipelines/disk * 2 disks/datanode) = 8 pipelines` | ||
| 3. Calculate the total raw target from the dynamic limit: `(64 + 8) / 3 = 24` | ||
| 4. Compare with the global limit: `min(24, 30) = 24` | ||
|
|
||
| SCM will attempt to create and maintain approximately **24** open, FACTOR_THREE Ratis pipelines. | ||
|
|
||
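The calculation steps above can be reproduced with a short Python sketch. The helper name and the list-of-disk-counts input are assumptions made for illustration, and floor division is assumed for the final step; the actual SCM behavior may differ.

```python
def dynamic_target(metadata_disks_per_datanode, pipelines_per_disk=2, replication_factor=3):
    """Raw pipeline target from the per-disk dynamic limit.

    Hypothetical helper: sums (pipelines_per_disk * disks) over all healthy
    datanodes, then divides by the replication factor.
    """
    per_datanode_limits = [pipelines_per_disk * disks for disks in metadata_disks_per_datanode]
    return sum(per_datanode_limits) // replication_factor


# 8 datanodes with 4 metadata disks each, 2 datanodes with 2 metadata disks each.
disks = [4] * 8 + [2] * 2
raw_target = dynamic_target(disks)       # (64 + 8) // 3
final_target = min(raw_target, 30)       # ozone.scm.ratis.pipeline.limit = 30
print(raw_target, final_target)
```

With the example cluster, both the raw target and the final target come out to 24, matching the steps listed above.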
| **Production Recommendation:** | ||
|
|
||
| For most production deployments, using the dynamic per-disk limit (`ozone.scm.datanode.pipeline.limit=0`) is | ||
| recommended, as it allows the cluster to scale pipeline capacity naturally with its resources. You can use the | ||
| global limit (`ozone.scm.ratis.pipeline.limit`) as a safety cap if needed. A good starting value for | ||
| `ozone.scm.pipeline.per.metadata.disk` is **2**. Monitor the **Pipeline Statistics** section in the SCM web UI, or run | ||
| the command `ozone admin pipeline list`, to check whether the actual number of pipelines aligns with your configured targets. | ||
|
Comment on lines +123 to +127 (Contributor):
I think there should be a tradeoff to having a lot of concurrent pipelines. This might be worth documenting. Here are a few I can think of
|
||
|
|
||
| ## How to Use | ||
| 1. Configure Datanode metadata directories: | ||
| ```xml | ||
|
|
||
This is tricky since we have a bug; see HDDS-14369.