[prediction] pipeline duplication #956
murphe67
left a comment
Hey Elena!
Thanks for the really nice work on this PR, it is in super great shape 😁
I wrote a lot of comments, but they are kind of all in the same vein of:
I think you know what this does, and I think it works the way you think it works, but I think you could spend more time reading and writing it so that the file exists as an "act of communication": the code is not only about doing what it is supposed to do, but also about taking a reader on a journey of what decisions you made and why.
Secondly, I think this PR currently assumes no branches in the duplicated region? I think it would be nice to support that before merging 😁 The same goes for duplicated regions that have additional inputs beyond the one we will eventually do data prediction on: those inputs aren't replaced with constants, they are just branched to their consumers in the different bbs, depending on which bb the comparisons select?
Let me know about any questions or concerns you have. I will maybe write more later, but I think this is plenty for you to get started with in terms of building a narrative in the code files 😁
| a[i] = (y - c) * 10.0f; | ||
| } | ||
| } | ||
| ``` |
I think we could expand this to include a c++ snippet that matches what your pass does, and then maybe draw a control-flow graph with lines of code in it so that the example shows the duplication?
| ### Overview | ||
| The core pass driver (`runDynamaticPass`) executes the transformation in four sequential phases for each prediction marker: | ||
| 1. **Block Splitting**: The basic block containing the start operation is split immediately before. This isolates the original operations and the following logic into a separate block (`exitBlock`). |
If the prediction goes through an if statement, the logic below doesn't all belong to the same bb, right?
| The core pass driver (`runDynamaticPass`) executes the transformation in four sequential phases for each prediction marker: | ||
| 1. **Block Splitting**: The basic block containing the start operation is split immediately before. This isolates the original operations and the following logic into a separate block (`exitBlock`). | ||
| 2. **DFS**: A Depth-first Search goes through the data-flow chain beginning at `startOp` and terminates at either a user-defined `endOp` or store operations. This tracks all operations that must be duplicated. If the DFS does not find an `endOp` or store operation at all the "leaves" of the graph, it returns an error. | ||
| 3. **Cloning**: The pass iterates over the list of constants (`values`). For each constant, it inserts a comparison check and a conditional branch to ensure that we still have correct control flow. The `true` branch then generates a new block with cloned versions of the operations identified by the DFS, substituting `predInput` with the hardcoded constant. |
I think "correct execution" fits better here than "correct control flow", since the control flow changes but is still functionally equivalent. I think this sentence could make that a little bit clearer?
| 1. **Block Splitting**: The basic block containing the start operation is split immediately before. This isolates the original operations and the following logic into a separate block (`exitBlock`). | ||
| 2. **DFS**: A Depth-first Search goes through the data-flow chain beginning at `startOp` and terminates at either a user-defined `endOp` or store operations. This tracks all operations that must be duplicated. If the DFS does not find an `endOp` or store operation at all the "leaves" of the graph, it returns an error. | ||
| 3. **Cloning**: The pass iterates over the list of constants (`values`). For each constant, it inserts a comparison check and a conditional branch to ensure that we still have correct control flow. The `true` branch then generates a new block with cloned versions of the operations identified by the DFS, substituting `predInput` with the hardcoded constant. | ||
| 4. **False Path**: When all of the paths with constants have been created, the last `false` branch has all of the original operations moved into the last alternative block. All of these paths then merge back into the `exitBlock`. |
you could either cover the case of the 'leaf' operations being in different bbs here or say we first discuss the "converged" case and then discuss the "diverged" case after?
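To make the "leaves" condition from phase 2 concrete, here is a minimal sketch of that traversal on a toy graph. It is not the pass's code and does not use the MLIR API: `ToyOp`, `collectOpsToDuplicate`, and the index-based `users` list are hypothetical stand-ins for operations and their def-use edges. It only models the converged case, with no notion of basic blocks.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for an operation (not the MLIR Operation class).
struct ToyOp {
  std::string name;
  bool isStore = false;
  std::vector<std::size_t> users; // indices of ops that consume this op's result
};

// Depth-first walk over def-use edges starting at startOp.
// A path stops at endOp (if given) or at a store.
// Returns std::nullopt when some path ends at an op that is neither,
// mirroring the error described for phase 2.
std::optional<std::vector<std::size_t>>
collectOpsToDuplicate(const std::vector<ToyOp> &ops, std::size_t startOp,
                      std::optional<std::size_t> endOp) {
  std::vector<std::size_t> toDuplicate;
  std::unordered_set<std::size_t> visited;
  std::vector<std::size_t> stack{startOp};

  while (!stack.empty()) {
    std::size_t cur = stack.back();
    stack.pop_back();
    if (!visited.insert(cur).second)
      continue; // already handled via another path
    toDuplicate.push_back(cur);

    bool stopsHere = ops[cur].isStore || (endOp && *endOp == cur);
    if (stopsHere)
      continue;
    if (ops[cur].users.empty())
      return std::nullopt; // leaf that is neither endOp nor a store
    for (std::size_t user : ops[cur].users)
      stack.push_back(user);
  }
  return toDuplicate;
}
```

In this model, a diverged case would show up as visited ops that live in different basic blocks, which the toy graph deliberately does not track.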
The Pipeline Duplication Pass duplicates specific program paths with hardcoded constants. By using pragmas, you can specify a variable that frequently holds a specific value. The pass then generates an additional parallel path in the pipeline where that variable is treated as a constant.
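As a rough source-level picture of that description, the sketch below writes out by hand what a single duplicated path amounts to. Everything in it is an assumption made for illustration: the kernel shape, the choice of `c` as the predicted variable, the value `2.0f`, and the function names. The pass itself works on the IR, not on C++ source, and the pragma syntax is not shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Original kernel: c is an ordinary runtime value.
void kernelOriginal(std::vector<float> &a, const std::vector<float> &y, float c) {
  for (std::size_t i = 0; i < a.size(); ++i)
    a[i] = (y[i] - c) * 10.0f;
}

// Hand-written equivalent after duplication for one predicted constant.
// kPredicted is a hypothetical value a user might name in the pragma.
void kernelDuplicated(std::vector<float> &a, const std::vector<float> &y, float c) {
  constexpr float kPredicted = 2.0f;
  for (std::size_t i = 0; i < a.size(); ++i) {
    if (c == kPredicted) {
      // Cloned path: the predicted input is replaced by the constant.
      a[i] = (y[i] - kPredicted) * 10.0f;
    } else {
      // False path: the original operations, unchanged.
      a[i] = (y[i] - c) * 10.0f;
    }
    // Both paths merge here, which plays the role of the exit block.
  }
}
```

Both functions compute the same result for every `c`; only the control flow differs, and the cloned path gives later optimizations a constant to work with.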