HLD for port FEC FLR support #1948
Merged
8f9f9e4
[FEC FLR] Add initial version of HLD for FEC FLR support in SONiC
vkjammala-arista 520168c
Update HLD
vkjammala-arista 1af964a
Merge branch 'sonic-net:master' into fec_flr_support
vkjammala-arista dc34117
Update HLD to include predicted FEC FLR design
vkjammala-arista d0bdee5
Add configurable interval factor and interleaving mapping.
vkjammala-arista 8442188
Minor nit corrections
vkjammala-arista 3d080bc
Address review comments
vkjammala-arista 9fd1879
Address review comments
vkjammala-arista
Binary file added
BIN
+43.9 KB
doc/port_fec_flr/img/Logarithm_curve_of_CER_and_Linear_regression_fit.png
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,293 @@ | ||
| # FEC FLR support in SONiC # | ||
|
|
||
| ## Table of Contents | ||
| - [Revision](#revision) | ||
| - [Scope](#scope) | ||
| - [Definitions/Abbreviations](#definitionsabbreviations) | ||
| - [1 Overview](#1-overview) | ||
| - [2 Requirements](#2-requirements) | ||
| - [2.1 Functional Requirements](#21-functional-requirements) | ||
| - [2.2 CLI Requirements](#22-cli-requirements) | ||
| - [3 Architecture Design](#3-architecture-design) | ||
| - [4 High level design](#4-high-level-design) | ||
| - [4.1 Assumptions](#41-assumptions) | ||
| - [4.2 SAI counters used](#42-sai-counters-used) | ||
| - [4.3 SAI API](#43-sai-api) | ||
| - [4.4 FEC interleaving](#44-fec-interleaving) | ||
| - [4.5 Observed FEC FLR](#45-observed-fec-flr) | ||
| - [4.6 Predicted FEC FLR](#46-predicted-fec-flr) | ||
| - [5 Sample CLI Output](#5-sample-cli-output) | ||
| - [6 Acknowledgements](#6-acknowledgements) | ||
|
|
||
| ### Revision | ||
|
|
||
| | Rev | Date | Author | Change Description | | ||
| |:---:|:-----------:|:----------------------:|-----------------------------------| | ||
| | 0.1 | 19-Mar-2025 | Pandurangan R S, Vinod Kumar Jammala (Arista Networks)| Initial version | | ||
| | 0.2 | 07-Jul-2025 | Apoorv Sachan, Pandurangan R S, Vinod Kumar Jammala (Arista Networks)| Add predicted FEC FLR | | ||
|
|
||
| ### Scope | ||
|
|
||
| This document describes the implementation of Port Forward Error Correction (FEC) Frame Loss Ratio (FLR) support in SONiC. | ||
|
|
||
| ### Definitions/Abbreviations | ||
|
|
||
| | Term | Definition / Abbreviation | | ||
| |---------|-----------------------------------------------------------------------| | ||
| | CER | Codeword Error Ratio | | ||
| | FEC | Forward Error Correction | | ||
| | FLR | Frame Loss Ratio | | ||
|
|
||
| ## 1 Overview | ||
| Frame Loss Ratio (FLR) is a key performance metric that measures the fraction of frames lost relative to the total number of frames transmitted over a network link. | ||
|
|
||
| FLR is expressed as, | ||
| FLR = (Total Transmitted Frames - Total Received Frames) / Total Transmitted Frames | ||
|
|
||
| Using the Forward Error Correction (FEC) counters, the receiving device can estimate the Codeword Error Ratio (CER), and FEC FLR is then calculated from the CER. | ||
|
|
||
| ## 2 Requirements | ||
| ### 2.1 Functional Requirements | ||
| This HLD introduces the following enhancements: | ||
| - Calculation of FEC FLR at a configurable interval. | ||
| - Storing per-interface FEC FLR in the Redis DB for telemetry streaming. | ||
| - Enhancement of the `show interfaces counters fec-stats` CLI to include FEC FLR statistics. | ||
|
|
||
| ### 2.2 CLI Requirements | ||
|
|
||
| * The existing `show interfaces counters fec-stats` command will be enhanced to include the following FEC FLR columns: | ||
| - FEC_FLR | ||
| - FEC_FLR_PREDICTED | ||
| * A new `counterpoll port` sub-command will be introduced to configure FEC FLR interval factor: | ||
| - `counterpoll port fec-flr-interval-factor FEC_FLR_INTERVAL_FACTOR` | ||
| - The default value of FEC_FLR_INTERVAL_FACTOR will be 120. | ||
|
|
||
| ## 3 Architecture Design | ||
|
|
||
| There are no changes to the current SONiC Architecture. | ||
|
|
||
| ## 4 High-Level Design | ||
|
|
||
| * SWSS changes: | ||
|
|
||
| + port_flr.lua | ||
|
|
||
| This new Lua script will: | ||
| - Access the COUNTER_DB for already available counters for SAI_PORT_STAT_IF_IN_FEC_NOT_CORRECTABLE_FRAMES, SAI_PORT_STAT_IF_IN_FEC_CORRECTABLE_FRAMES, | ||
| and SAI_PORT_STAT_IF_IN_FEC_CODEWORD_ERRORS_Si representing codewords with i symbol errors where i ranges from 0 to 15 in case of RS-544 FEC. | ||
| - Compute both observed and predicted FEC FLR per port. | ||
| - Store the computed FEC FLR values and the previous Redis counter values back into the Redis DB. | ||
| - Perform the FEC FLR computation on each port once every `port_stat POLL_INTERVAL * FEC_FLR_INTERVAL_FACTOR` seconds, where FEC_FLR_INTERVAL_FACTOR is retrieved from the FLEX_COUNTER_DB. | ||
|
|
||
| + portsorch.cpp | ||
| - Link the new "port_flr.lua" script as a plugin to the existing PORT_STAT_COUNTER_FLEX_COUNTER_GROUP, alongside "port_rates.lua". | ||
|
|
||
| + flexcounterorch.cpp | ||
| - Enhance "FlexCounterOrch" to propagate FEC_FLR_INTERVAL_FACTOR from CONFIG_DB to FLEX_COUNTER_DB. | ||
|
|
||
| * Utilities Common changes: | ||
|
|
||
| + portstat.py: | ||
| - Enhance the `portstat` command with the `-f` option (used by the CLI command `show interfaces counters fec-stats`) to include the FEC_FLR and FEC_FLR_PREDICTED columns. | ||
|
|
||
| + counterpoll/main.py: | ||
| - Add a new argument `fec-flr-interval-factor` to the existing `counterpoll port` command. | ||
|
|
||
| ``` | ||
| root@sonic:~$ counterpoll port --help | ||
| Usage: counterpoll port [OPTIONS] COMMAND [ARGS]... | ||
|
|
||
| Port counter commands | ||
|
|
||
| Options: | ||
| --help Show this message and exit. | ||
|
|
||
| Commands: | ||
| disable Disable port counter query | ||
| enable Enable port counter query | ||
| interval Set port counter query interval | ||
| fec-flr-interval-factor Set port fec flr interval factor | ||
|
|
||
|
|
||
| root@sonic:~$ counterpoll port fec-flr-interval-factor --help | ||
| Usage: counterpoll port fec-flr-interval-factor [OPTIONS] FEC_FLR_INTERVAL_FACTOR | ||
|
|
||
| Set port fec flr interval factor | ||
|
|
||
| Options: | ||
| --help Show this message and exit. | ||
| ``` | ||
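The interval behaviour described for `port_flr.lua` above can be sketched as follows. This is a minimal illustration in Python rather than Lua, and it assumes the plugin keeps a count of completed `port_stat` polls; the function names and the poll counter are hypothetical and are not taken from the HLD.

```python
DEFAULT_FEC_FLR_INTERVAL_FACTOR = 120  # default value stated in section 2.2


def flr_period(poll_interval: float,
               interval_factor: int = DEFAULT_FEC_FLR_INTERVAL_FACTOR) -> float:
    """Time between two FEC FLR computations: POLL_INTERVAL * FEC_FLR_INTERVAL_FACTOR."""
    if interval_factor <= 0:
        raise ValueError("interval_factor must be positive")
    return poll_interval * interval_factor


def should_compute_flr(poll_count: int,
                       interval_factor: int = DEFAULT_FEC_FLR_INTERVAL_FACTOR) -> bool:
    """Return True on every interval_factor-th poll.

    poll_count is a hypothetical counter of completed port_stat polls,
    starting at 1 for the first poll.
    """
    if interval_factor <= 0:
        raise ValueError("interval_factor must be positive")
    return poll_count % interval_factor == 0
```

With the default factor, the computation would run on polls 120, 240, and so on, and be skipped on every other poll.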
|
|
||
| ### 4.1 Assumptions | ||
|
|
||
| SAI provides access to the following counters for each interface: | ||
| - SAI_PORT_STAT_IF_IN_FEC_NOT_CORRECTABLE_FRAMES, which represents the number of uncorrectable FEC codewords. | ||
| - Returns "not supported" if the counter is unavailable for an interface. | ||
| - SAI_PORT_STAT_IF_IN_FEC_CORRECTABLE_FRAMES, which represents the number of correctable FEC codewords. | ||
| - Returns "not supported" if the counter is unavailable for an interface. | ||
| - SAI_PORT_STAT_IF_IN_FEC_CODEWORD_ERRORS_Si, which represents the number of codewords with i symbol errors. | ||
| - Returns "not supported" if the counter is unavailable for an interface. | ||
|
|
||
|
|
||
| ### 4.2 SAI counters used | ||
|
|
||
| The following Redis DB entries will be accessed for the FEC FLR calculations: | ||
|
|
||
| |Redis DB |Table|Entries|New, RW| Format | Description| | ||
| |--------------|-------------|------------------|--------|----------------|----------------| | ||
| |COUNTER_DB |COUNTERS_PORT_NAME_MAP | oid |R |string |Name to oid mapping | | ||
| |COUNTER_DB |COUNTERS |SAI_PORT_STAT_IF_IN_FEC_NOT_CORRECTABLE_FRAMES |R |number |Total number of uncorrectable codewords | | ||
| |COUNTER_DB |COUNTERS |SAI_PORT_STAT_IF_IN_FEC_CORRECTABLE_FRAMES |R |number |Total number of correctable codewords | | ||
| |COUNTER_DB |COUNTERS |SAI_PORT_STAT_IF_IN_FEC_CODEWORD_ERRORS_Si |R |number |Total number of codewords with i symbol errors | | ||
| |COUNTER_DB |RATES |FEC_FLR |New, RW| floating |Calculated observed FEC FLR | | ||
| |COUNTER_DB |RATES |FEC_FLR_PREDICTED |New, RW| floating |Calculated predicted FEC FLR | | ||
| |COUNTER_DB |RATES |SAI_PORT_STAT_IF_IN_FEC_NOT_CORRECTABLE_FRAMES_last |New, RW |number |Uncorrectable codeword count at the last computation | | ||
| |COUNTER_DB |RATES |SAI_PORT_STAT_IF_IN_FEC_CORRECTABLE_FRAMES_last |New, RW |number |Correctable codeword count at the last computation | | ||
| |COUNTER_DB |RATES |SAI_PORT_STAT_IF_IN_FEC_CODEWORD_ERRORS_Si_last |New, RW |number |Count of codewords with i symbol errors at the last computation | | ||
|
|
||
|
|
||
| ### 4.3 SAI API | ||
|
|
||
| No change in the SAI API. No new SAI object accessed. | ||
|
|
||
| ### 4.4 FEC interleaving | ||
| With the FEC interleaving factor (X) incorporated, FEC FLR is expressed as follows, as per the [IEEE 802.3df Logic Ad Hoc](https://www.ieee802.org/3/df/public/adhoc/logic/22_0630/opsasnick_3df_logic_220630a.pdf): | ||
|
|
||
| FEC_FLR = CER * (1 + X * MFC)/MFC, where MFC (MAC frames per codeword) is 8 in the case of RS-544 FEC. Thus, | ||
|
|
||
| For X=1 (no interleaving), FEC_FLR = 1.125 * CER <br> | ||
|
||
| For X=2, FEC_FLR = 2.125 * CER <br> | ||
| For X=4, FEC_FLR = 4.125 * CER | ||
|
|
||
| To include the interleaving factor in the FEC FLR computation, a new SAI port attribute will be required to retrieve the underlying port interleaving factor. | ||
| Until such an attribute is available, the interleaving factor can be derived based on the following port speed to interleaving factor mapping: | ||
|
|
||
| | Port Speed | No. of lanes | FEC interleaving factor(X) | | ||
| |------------|--------------|----------------------------| | ||
| | 1600G | 8 | 4 | | ||
| | 800G | 8 | 4 | | ||
| | 400G | 8 | 2 | | ||
| | 400G | 4 | 2 | | ||
| | 200G | 4 | 2 | | ||
| | 200G | 2 | 2 | | ||
| | 100G | 2 | 2 | | ||
| | 100G | 1 | 1 or 2 (autonegotiated) | | ||
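The formula and the speed-to-interleaving mapping above can be sketched in Python as shown below. The dictionary and function names are illustrative assumptions, not part of the HLD. The single-lane 100G entry is left out of the mapping because its factor is autonegotiated and cannot be derived from speed and lane count alone.

```python
from typing import Optional

MAC_FRAMES_PER_CODEWORD = 8  # MFC for RS-544 FEC, as stated above

# (port speed in Gb/s, number of lanes) -> FEC interleaving factor X,
# copied from the table above. (100, 1) is omitted: 1 or 2, autonegotiated.
INTERLEAVING_FACTOR = {
    (1600, 8): 4,
    (800, 8): 4,
    (400, 8): 2,
    (400, 4): 2,
    (200, 4): 2,
    (200, 2): 2,
    (100, 2): 2,
}


def flr_multiplier(x: int, mfc: int = MAC_FRAMES_PER_CODEWORD) -> float:
    """Multiplier applied to CER: (1 + X * MFC) / MFC."""
    return (1 + x * mfc) / mfc


def fec_flr_from_cer(cer: float, speed_gbps: int, lanes: int) -> Optional[float]:
    """Return FEC FLR for a port, or None when X cannot be derived from the table."""
    x = INTERLEAVING_FACTOR.get((speed_gbps, lanes))
    if x is None:
        return None
    return cer * flr_multiplier(x)
```

For X = 1, 2 and 4 the multiplier evaluates to 1.125, 2.125 and 4.125, matching the values listed above.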
|
|
||
| ### 4.5 Observed FEC FLR | ||
|
|
||
| ``` | ||
| Step 1: Calculate the observed CER per interval | ||
| Observed CER is expressed as, CER = Uncorrectable FEC codewords / Total FEC codewords Received, which can be expanded to | ||
|
|
||
| CER = Uncorrectable FEC codewords / (Uncorrectable FEC codewords + Codewords with no symbol errors + Correctable FEC codewords) | ||
|
|
||
| where, Uncorrectable FEC codewords = SAI_PORT_STAT_IF_IN_FEC_NOT_CORRECTABLE_FRAMES - SAI_PORT_STAT_IF_IN_FEC_NOT_CORRECTABLE_FRAMES_last | ||
| Codewords with no symbol errors = SAI_PORT_STAT_IF_IN_FEC_CODEWORD_ERRORS_S0 - SAI_PORT_STAT_IF_IN_FEC_CODEWORD_ERRORS_S0_last | ||
| Correctable FEC codewords = SAI_PORT_STAT_IF_IN_FEC_CORRECTABLE_FRAMES - SAI_PORT_STAT_IF_IN_FEC_CORRECTABLE_FRAMES_last | ||
|
|
||
|
|
||
| Step 2: Calculate FEC FLR from CER, taking the interleaving factor (X) into account | ||
|
||
| If X=1, FEC_FLR = 1.125 * CER | ||
| If X=2, FEC_FLR = 2.125 * CER | ||
| If X=4, FEC_FLR = 4.125 * CER | ||
|
|
||
|
|
||
| Step 3: After each computation, the latest values of the following fields are stored in the COUNTER_DB:RATES table | ||
|
|
||
| FEC_FLR, SAI_PORT_STAT_IF_IN_FEC_NOT_CORRECTABLE_FRAMES_last, SAI_PORT_STAT_IF_IN_FEC_CORRECTABLE_FRAMES_last and SAI_PORT_STAT_IF_IN_FEC_CODEWORD_ERRORS_S0_last | ||
|
|
||
| ``` | ||
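The observed FEC FLR steps above can be sketched in Python as follows. This is an illustration only: the function name is hypothetical, counters are passed in as plain dictionaries rather than read from Redis, and returning `0.0` when no codewords were received in the interval is an assumption.

```python
UNCORR = "SAI_PORT_STAT_IF_IN_FEC_NOT_CORRECTABLE_FRAMES"
CORR = "SAI_PORT_STAT_IF_IN_FEC_CORRECTABLE_FRAMES"
CLEAN = "SAI_PORT_STAT_IF_IN_FEC_CODEWORD_ERRORS_S0"


def observed_fec_flr(curr: dict, last: dict, x: int = 1, mfc: int = 8) -> float:
    """Compute observed FEC FLR for one interval.

    curr / last map counter names to their current and previously stored
    values (Redis returns strings, hence the int() conversion).
    x is the interleaving factor, mfc the MAC frames per codeword.
    """
    def delta(key: str) -> int:
        return int(curr.get(key, 0)) - int(last.get(key, 0))

    uncorrectable = delta(UNCORR)
    correctable = delta(CORR)
    no_errors = delta(CLEAN)

    total = uncorrectable + correctable + no_errors
    if total <= 0:
        # Assumption: no codewords in the interval means nothing to report.
        return 0.0

    cer = uncorrectable / total          # Step 1
    return cer * (1 + x * mfc) / mfc     # Step 2
```

After the call, the caller would store the result as `FEC_FLR` and copy the three current counter values into their `_last` fields, as described in Step 3.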
|
|
||
| ### 4.6 Predicted FEC FLR | ||
|
|
||
| The goal is to estimate FEC FLR by extrapolating from the observed codeword error distribution. | ||
| ``` | ||
| Step 1: Prepare codeword error index vector (x) | ||
|
|
||
| x = { 1, 2, ..., max_correctable_cw_symbol_errors } | ||
|
|
||
| where, max_correctable_cw_symbol_errors = 15 in case of RS-544 | ||
|
|
||
| For each index i in vector x, codeword_errors[i] represents the number of codewords with i symbol errors in the | ||
| current interval, i.e. SAI_PORT_STAT_IF_IN_FEC_CODEWORD_ERRORS_Si - SAI_PORT_STAT_IF_IN_FEC_CODEWORD_ERRORS_Si_last. | ||
| ``` | ||
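Step 1 can be sketched as a per-bin delta over the `Si` counters. The function name is hypothetical, and the counters are assumed to be available as dictionaries keyed by counter name.

```python
def codeword_error_deltas(curr: dict, last: dict,
                          max_correctable_cw_symbol_errors: int = 15) -> list:
    """Return a list where element i is the number of codewords with
    i symbol errors seen in the current interval (i = 0..15 for RS-544).

    Index 0 (error-free codewords) is included because it is needed later
    for total_codewords, even though vector x starts at 1.
    """
    deltas = []
    for i in range(max_correctable_cw_symbol_errors + 1):
        key = f"SAI_PORT_STAT_IF_IN_FEC_CODEWORD_ERRORS_S{i}"
        deltas.append(int(curr.get(key, 0)) - int(last.get(key, 0)))
    return deltas
```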
|
|
||
|
|
||
| The codeword error ratio typically follows an exponential decay curve, as shown in the image below. | ||
|  | ||
|
|
||
| ``` | ||
| Step 2: Compute the logarithmic codeword error ratio vector (y) | ||
|
|
||
| By applying a logarithm to the codeword error ratio, the exponential decay curve is transformed into a | ||
| linear pattern, making it suitable for linear regression modeling. | ||
|
|
||
| For each index i in vector x, compute logarithm of codeword error ratio y[i] as follows | ||
|
|
||
| y[i] = log10( codeword_errors[i] / total_codewords ) | ||
| where total_codewords is the total number of codewords in the current interval, | ||
| i.e. Σ from i=0 to 15 of (SAI_PORT_STAT_IF_IN_FEC_CODEWORD_ERRORS_Si - SAI_PORT_STAT_IF_IN_FEC_CODEWORD_ERRORS_Si_last) | ||
| ``` | ||
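Step 2 can be sketched as below. The HLD does not say how bins with zero codewords are handled; since `log10(0)` is undefined, this sketch simply skips them, which is an assumption. The function name is also hypothetical.

```python
import math


def log_cer_points(codeword_errors: list) -> list:
    """Return (i, y[i]) pairs with y[i] = log10(codeword_errors[i] / total_codewords).

    codeword_errors[i] is the per-interval count of codewords with i symbol
    errors, for i = 0..15. Index 0 contributes to the total only.
    Bins with a zero count are skipped (assumption, log10(0) is undefined).
    """
    total_codewords = sum(codeword_errors)
    points = []
    if total_codewords <= 0:
        return points
    for i in range(1, len(codeword_errors)):
        count = codeword_errors[i]
        if count > 0:
            points.append((i, math.log10(count / total_codewords)))
    return points
```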
|
|
||
|
|
||
| The image below shows the linear pattern of the codeword error ratio (CER) after applying a logarithm. | ||
|  | ||
|
|
||
| ``` | ||
| Step 3: Perform linear regression to arrive at slope and intercept | ||
|
|
||
| slope = (n * Σ(x*y) - Σx * Σy) / (n * Σ(x²) - (Σx)²) | ||
| intercept = (Σy - slope * Σx) / n | ||
| where, n: number of data points (length of x or y vector) | ||
|
|
||
| This gives the best-fit line, y = slope * x + intercept. | ||
| ``` | ||
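The least-squares formulas in Step 3 translate directly into Python. The function name and the choice to return `None` when a fit is not possible (fewer than two points, or all x values identical) are assumptions made for this sketch.

```python
def linear_fit(points: list):
    """Ordinary least-squares fit of y = slope * x + intercept.

    points is a list of (x, y) pairs. Returns (slope, intercept), or None
    when the fit is undefined.
    """
    n = len(points)
    if n < 2:
        return None

    sum_x = sum(x for x, _ in points)
    sum_y = sum(y for _, y in points)
    sum_xy = sum(x * y for x, y in points)
    sum_xx = sum(x * x for x, _ in points)

    denominator = n * sum_xx - sum_x * sum_x
    if denominator == 0:
        return None

    slope = (n * sum_xy - sum_x * sum_y) / denominator
    intercept = (sum_y - slope * sum_x) / n
    return slope, intercept
```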
|
|
||
|
|
||
| The image below shows the linear regression line along with the logarithmic curve of the codeword error ratio (CER). | ||
|  | ||
|
|
||
| ``` | ||
| Step 4: Compute extrapolated CER | ||
|
|
||
| Using the linear regression line, the predicted CER for an index representing j symbol errors is | ||
| predicted_cer_j = 10 ^ ( j * slope + intercept ) | ||
|
|
||
| The predicted CER for a window of codewords with uncorrectable symbol errors is calculated as: | ||
| predicted_cer = Σ from j=16 to 20 of predicted_cer_j | ||
|
|
||
| Note: We use the uncorrectable symbol error window from 16 to 20 because for values above 20, the predicted CER becomes insignificant. | ||
|
|
||
|
|
||
| Step 5: Compute FLR from the extrapolated CER, taking the interleaving factor (X) into account | ||
| If X=1, FEC_FLR_PREDICTED = 1.125 * predicted_cer | ||
| If X=2, FEC_FLR_PREDICTED = 2.125 * predicted_cer | ||
| If X=4, FEC_FLR_PREDICTED = 4.125 * predicted_cer | ||
|
|
||
|
|
||
| Step 6: Store FEC_FLR_PREDICTED, SAI_PORT_STAT_IF_IN_FEC_CODEWORD_ERRORS_Si_last in the COUNTER_DB:RATES table | ||
| ``` | ||
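Steps 4 and 5 can be sketched as a single function that takes the fitted slope and intercept. The function and parameter names are illustrative assumptions; the 16 to 20 window and the `(1 + X * MFC) / MFC` multiplier come from the text above.

```python
def predicted_fec_flr(slope: float, intercept: float, x: int = 1, mfc: int = 8,
                      first_uncorrectable: int = 16,
                      last_uncorrectable: int = 20) -> float:
    """Extrapolate CER into the uncorrectable region and convert it to FLR.

    slope and intercept describe the fitted line y = slope * j + intercept,
    where y is log10 of the codeword error ratio at j symbol errors.
    """
    # Step 4: sum the predicted CER over the uncorrectable window.
    predicted_cer = sum(
        10 ** (j * slope + intercept)
        for j in range(first_uncorrectable, last_uncorrectable + 1)
    )
    # Step 5: apply the interleaving-dependent multiplier.
    return predicted_cer * (1 + x * mfc) / mfc
```

For example, with a slope of -1 and an intercept of 0, the predicted CER is 1e-16 + 1e-17 + 1e-18 + 1e-19 + 1e-20, and the result for X = 1 is 1.125 times that sum.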
|
|
||
| ## 5 Sample CLI Output | ||
| ``` | ||
| root@sonic:~$ portstat -f | ||
| IFACE STATE FEC_CORR FEC_UNCORR FEC_SYMBOL_ERR FEC_PRE_BER FEC_POST_BER FEC_FLR FEC_FLR_PREDICTED | ||
|
||
| ----------- ------- ---------- ------------ ---------------- ------------- -------------- --------- --------------------- | ||
| Ethernet0 U 0 0 0 0.00e+00 0.00e+00 0 0 | ||
| Ethernet8 U 0 0 0 0.00e+00 0.00e+00 0 0 | ||
| Ethernet16 X 0 0 0 0.00e+00 0.00e+00 0 0 | ||
| Ethernet24 X 0 0 0 0.00e+00 0.00e+00 0 0 | ||
| Ethernet32 U 0 0 0 0.00e+00 0.00e+00 0 0 | ||
| Ethernet40 D 21 0 0 0.00e+00 0.00e+00 0 0 | ||
| Ethernet48 X 0 0 0 0.00e+00 0.00e+00 0 0 | ||
| Ethernet56 X 0 0 0 0.00e+00 0.00e+00 0 0 | ||
| Ethernet64 U 1,334 0 4 0.00e+00 0.00e+00 0 0 | ||
| Ethernet72 U 28,531 0 31 0.00e+00 0.00e+00 0 2.68e-09 (R^2 = 0.79) | ||
| Ethernet80 U 25,890 0 25 0.00e+00 0.00e+00 0 6.03e-09 (R^2 = 0.79) | ||
| Ethernet88 U 21,909 0 49 0.00e+00 0.00e+00 0 0 | ||
| Ethernet96 U 5,635 0 8 0.00e+00 0.00e+00 0 0 | ||
| Ethernet104 U 21,141 0 7 0.00e+00 0.00e+00 0 7.08e-09 (R^2 = 0.79) | ||
| ``` | ||
|
|
||
| If FEC is not supported for an interface, the FEC_FLR and FEC_FLR_PREDICTED fields will display `N/A` for the corresponding entry. If there is insufficient data to compute the FEC FLR (e.g., when the link is performing well), these fields will display `0` (note that `0` is shown instead of `0.00e+00` for better readability). | ||
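The display rules above can be sketched as a small formatting helper. The function name is hypothetical, the use of `None` to represent "FEC not supported" is an assumption, and the two-decimal scientific format is inferred from the sample output. The `(R^2 = ...)` suffix shown in the sample is not covered here.

```python
def format_fec_flr(value) -> str:
    """Format an FEC FLR value for the fec-stats table.

    None  -> "N/A"  (FEC not supported on the interface, by assumption)
    0     -> "0"    (shown instead of 0.00e+00 for readability)
    other -> scientific notation with two decimals, e.g. 2.68e-09
    """
    if value is None:
        return "N/A"
    if value == 0:
        return "0"
    return f"{value:.2e}"
```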
|
||
|
|
||
| ## 6 Acknowledgements | ||
| Thanks to Prince and Cameron from Microsoft for sharing the details of the predicted FEC FLR algorithm and the mapping of port speed to interleaving factor. | ||