34 changes: 18 additions & 16 deletions lib/PTO/Transforms/InsertSync/SyncEventIdAllocation.cpp
@@ -417,22 +417,24 @@ void SyncEventIdAllocation::UpdateBackwardMatchSync(
   syncFront->eventIds.push_back(eventId);
   syncEnd->eventIds.push_back(eventId);

-  if (reallocatedPipePair.count(ScopePair(setFlag))) {
-    auto *ptr = dyn_cast<LoopInstanceElement>(
-        syncIR_[setFlag->GetForEndIndex().value()].get());
-    assert(ptr != nullptr);
-    syncFront->SetSyncIRIndex(ptr->beginId);
-    syncEnd->SetSyncIRIndex(ptr->endId);
-    syncFront->reallocatedLoopHeadTailSync = true;
-    syncEnd->reallocatedLoopHeadTailSync = true;
-    syncIR_[ptr->beginId]->pipeBefore.push_back(syncFront.get());
-    syncIR_[ptr->endId]->pipeAfter.push_back(syncEnd.get());
-  } else {
-    syncFront->SetSyncIRIndex(0);
-    syncEnd->SetSyncIRIndex(syncIR_.size() - 1);
-    syncIR_[0]->pipeBefore.push_back(syncFront.get());
-    syncIR_[syncIR_.size() - 1]->pipeAfter.push_back(syncEnd.get());
-  }
+  auto *ptr = dyn_cast<LoopInstanceElement>(
+      syncIR_[setFlag->GetForEndIndex().value()].get());
+  assert(ptr != nullptr);
+
+  // Keep loop-carried head/tail syncs scoped to their owning loop.
+  // Hoisting them to function entry/exit can make them execute even when an
+  // enclosing branch is not taken, which is both noisy and semantically risky.
+  syncFront->SetSyncIRIndex(ptr->beginId);
+  syncEnd->SetSyncIRIndex(ptr->endId);
Comment on lines +427 to +428
high

Anchoring loop-carried syncs to the loop boundaries (ptr->beginId and ptr->endId) instead of the function boundaries (syncIR_[0] and syncIR_[syncIR_.size() - 1]) is correct for scoping, but it breaks two other functions that assume these syncs always live at the function boundaries:

  1. ClearReallocatedBackwardMatchSync (lines 572-588): This function only iterates over syncIR_[0] and syncIR_[last]. Since loop-carried syncs are now anchored to loop boundaries from the first allocation attempt, this function will fail to remove them during reallocation, leading to duplicate sync instructions in the final output.
  2. IgnoreBackHeadAndTailSync (lines 644-661): This optimization only checks syncIR_[0]. It will now miss loop-carried syncs anchored at loop boundaries, effectively disabling the MTE1->M optimization for these cases.

These functions should be refactored to either iterate over all syncIR_ elements or leverage the insertedBackwardSync set to identify and process all synthetic backward match syncs regardless of their anchor point.
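
To make the suggestion concrete, here is a minimal, self-contained sketch of the difference between a boundary-only cleanup and an anchor-independent one. The types `SyncOp`, `IRElement`, `SyncIR`, and `SyncSet`, and the helpers `EraseInserted`, `ClearAtFunctionBoundaries`, and `ClearEverywhere`, are hypothetical simplifications introduced for illustration. They are not the actual PTO classes or the real body of ClearReallocatedBackwardMatchSync. The only names borrowed from the diff are `pipeBefore` and `pipeAfter`, and the `inserted` set plays the role of insertedBackwardSync.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <unordered_set>
#include <vector>

// Hypothetical, simplified stand-ins for the real sync IR types.
struct SyncOp {
  int eventId = 0;
};

struct IRElement {
  std::vector<SyncOp *> pipeBefore;
  std::vector<SyncOp *> pipeAfter;
};

using SyncIR = std::vector<std::unique_ptr<IRElement>>;
using SyncSet = std::unordered_set<SyncOp *>;

// Remove every op in `ops` that is a member of `inserted`.
// Returns the number of ops removed.
inline std::size_t EraseInserted(std::vector<SyncOp *> &ops,
                                 const SyncSet &inserted) {
  const std::size_t before = ops.size();
  ops.erase(std::remove_if(ops.begin(), ops.end(),
                           [&](SyncOp *op) { return inserted.count(op) != 0; }),
            ops.end());
  return before - ops.size();
}

// Boundary-only cleanup: looks at the first and last element only,
// which is the behavior the review comment describes.
inline std::size_t ClearAtFunctionBoundaries(SyncIR &ir,
                                             const SyncSet &inserted) {
  if (ir.empty())
    return 0;
  return EraseInserted(ir.front()->pipeBefore, inserted) +
         EraseInserted(ir.back()->pipeAfter, inserted);
}

// Anchor-independent cleanup: visits every element, so synthetic syncs
// are found wherever they were anchored.
inline std::size_t ClearEverywhere(SyncIR &ir, const SyncSet &inserted) {
  std::size_t removed = 0;
  for (auto &elem : ir) {
    removed += EraseInserted(elem->pipeBefore, inserted);
    removed += EraseInserted(elem->pipeAfter, inserted);
  }
  return removed;
}
```

With a head/tail pair anchored at a loop's begin and end elements rather than at index 0 and the last index, `ClearAtFunctionBoundaries` removes nothing and the stale pair survives, while `ClearEverywhere` removes both. A full scan is linear in the size of the IR; if that cost matters, recording each sync's anchor index alongside its pointer would allow a direct lookup instead.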


+  // Preserve historical marker semantics: this flag is only used by the
+  // reallocation fallback path.
+  bool markAsReallocatedHeadTail = reallocatedPipePair.count(ScopePair(setFlag));
+  syncFront->reallocatedLoopHeadTailSync = markAsReallocatedHeadTail;
+  syncEnd->reallocatedLoopHeadTailSync = markAsReallocatedHeadTail;
+
+  syncIR_[ptr->beginId]->pipeBefore.push_back(syncFront.get());
+  syncIR_[ptr->endId]->pipeAfter.push_back(syncEnd.get());

   insertedBackwardSync.insert(syncFront.get());
   insertedBackwardSync.insert(syncEnd.get());