Skip to content

Conversation

tianrui-wei
Copy link
Contributor

This commit fixes a corner-case, where a memory module itself is not effectful, but it is connected to an effectful ext memory module. If we simply mark all memory modules as pure, this would optimise out the ext memory module, causing inconsistencies in the sifive -repl-seq-mem flow. The previous firrtl transforms and hierarchies would be generated correctly, yet the final _ext sram would be omitted completely.

This fixes #8994, where the offending chisel looks like

  val rob_debug_inst_mem   = SyncReadMem(numRobRows, Vec(coreWidth, UInt(32.W)))
  val rob_debug_inst_wmask = WireInit(VecInit(0.U(coreWidth.W).asBools))
  val rob_debug_inst_wdata = Wire(Vec(coreWidth, UInt(32.W)))
  rob_debug_inst_mem.write(rob_tail, rob_debug_inst_wdata, rob_debug_inst_wmask)
  val rob_debug_inst_rdata = rob_debug_inst_mem.read(rob_head, will_commit.reduce(_||_))

Before this patch, the following would have all the wires optimised out by this pass under rob_debug_inst_mem

 @rob_debug_inst_mem(in %R0_addr: !firrtl.uint<5>, in %R0_en: !firrtl.uint<1>, in %R0_clk: !firrtl.clock, in %W0_addr: !firrtl.uint<5>, in %W0_clk: !firrtl.clock, in %W0_data: !firrtl.uint<64>, in %W0_mask: !
 firrtl.uint<2>) {
     %c1_ui1 = firrtl.constant 1 : !firrtl.uint<1>
     %rob_debug_inst_mem_ext_R0_addr, %rob_debug_inst_mem_ext_R0_en, rob_debug_inst_mem_ext_R0_clk, %rob_debug_inst_mem_ext_R0_data, rob_debug_inst_mem_ext_W0_addr, %rob_debug_inst_mem_ext_W0_en,
 %rob_debug_inst_mem_ext_W0_clk, %rob_debug_inst_mem_ext_W0_data, %rob_debug_inst_mem_ext_W0_mask = firrtl.instance rob_debug_inst_mem_ext @rob_debug_inst_mem_ext(in R0_addr: !firrtl.uint<5>, in R0_en: !
 firrtl.uint<1>, in R0_clk: !firrtl.clock, out R0_data: !firrtl.uint<64>, in W0_addr: !firrtl.uint<5>, in W0_en: !firrtl.uint<1>, in W0_clk: !firrtl.clock, in W0_data: !firrtl.uint<64>, in W0_mask: !
 firrtl.uint<2>)
     firrtl.matchingconnect %rob_debug_inst_mem_ext_R0_addr, %R0_addr : !firrtl.uint<5>
     firrtl.matchingconnect %rob_debug_inst_mem_ext_R0_en, %R0_en : !firrtl.uint<1>
     firrtl.matchingconnect %rob_debug_inst_mem_ext_R0_clk, %R0_clk : !firrtl.clock
     firrtl.matchingconnect %rob_debug_inst_mem_ext_W0_addr, %W0_addr : !firrtl.uint<5>
     firrtl.matchingconnect %rob_debug_inst_mem_ext_W0_en, %c1_ui1 : !firrtl.uint<1>
     firrtl.matchingconnect %rob_debug_inst_mem_ext_W0_clk, %W0_clk : !firrtl.clock
     firrtl.matchingconnect %rob_debug_inst_mem_ext_W0_data, %W0_data : !firrtl.uint<64>
     firrtl.matchingconnect %rob_debug_inst_mem_ext_W0_mask, %W0_mask : !firrtl.uint<2>
   }

Open to suggestions on how to mitigate this issue in a potentially cleaner manner :)

Copy link
Member

@seldridge seldridge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for jumping in with a contribution!

This doesn't seem quire right to me. That said, the change to change how LayerSink works helps highlight what the problem is: memories that are sunk after going through the -repl-seq-mem flow is causing incorrect information in the files produce by -repl-seq-mem.

What test should be introduced to check the correctness of this? Something that is firtool end-to-end is fine. I expect it's something like a memory whose only users are in a layer block. This should then be able to show that one of the metadata files is wrong. Having this test/minimal example will help guide what the solution should look like.

I think there are three alternatives:

  1. Metadata needs to be encoded in such a way that it works across relocations that happen after LayerSink.
  2. The presence of metadata (likely evidenced by inner symbols on things) should prevent LayerSink from moving things. There's something weird going on here where things are moving when they shouldn't.
  3. The metadata formats should be dropped entirely and an alternative like the use of the Python APIs should be used instead.

@tianrui-wei tianrui-wei force-pushed the dev/tianrui/fix-extmem-usefulness branch from b5648a5 to a53f9a0 Compare September 25, 2025 04:07
@tianrui-wei
Copy link
Contributor Author

@seldridge Hi Schuyler, thanks for your feedback for my patch, both on discord and github. I've rewritten the patch to use non-local annotations to mark the lowered memory modules, and it is currently working for me locally (as well as passing the tests). Please let me know if you have any feedback on this :)

@tianrui-wei tianrui-wei changed the title fix: mark memory module as effectful fix: use NLA to preserve externalized memory from being dropped Sep 25, 2025
This fixes llvm#8994. We use NLA here to
pass information from LowerMemory to LayerSink to avoid marking memories
with an extmodule as not effectful.

Signed-off-by: Tianrui Wei <[email protected]>
@tianrui-wei tianrui-wei force-pushed the dev/tianrui/fix-extmem-usefulness branch from a53f9a0 to c112bb0 Compare September 25, 2025 05:58
@tianrui-wei
Copy link
Contributor Author

force-pushed to comply with clang-format rules

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[circt] LowerMemory might not update module hierarchy properly

2 participants