Skip to content

Conversation

heshpdx
Copy link
Contributor

@heshpdx heshpdx commented Oct 14, 2025

This is the correct fix for gcc-16's build error. It works and doesn't crash on two different machines tested by SPEC CPU folks.

error: no match for 'operator[]' (operand types are 'edge_sort_iterator' and 'long int')

This is the correct fix for gcc-16.

error: no match for 'operator[]' (operand types are 'edge_sort_iterator' and 'long int')
@github-actions github-actions bot added the lang-cpp C/C++ code label Oct 14, 2025
@amin1377 amin1377 self-requested a review October 14, 2025 18:07
@vaughnbetz
Copy link
Contributor

Seems OK to me. @AmirhosseinPoolad : sending to you to take a look at since you are a C++ geek :). @AmirhosseinPoolad : for safety, probably we should run a QoR check on some reasonable benchmark suite (VTR or Titan) to check there is no CPU impact. I don't expect any, but don't know enough about how often this function is called or what the old / default implementation was to be 100% sure.

@amin1377
Copy link
Contributor

@heshpdx: Since you’re now returning by value, does it still need to be const? In the previous PR, I mentioned making it a const reference because it was returned by reference. For the [] operator, it’s common to return a const reference for better memory efficiency.

@heshpdx
Copy link
Contributor Author

heshpdx commented Oct 15, 2025

Please do check this with a gcc-16 build if you can. @AmirhosseinPoolad

@heshpdx
Copy link
Contributor Author

heshpdx commented Oct 15, 2025

@heshpdx: Since you’re now returning by value, does it still need to be const? In the previous PR, I mentioned making it a const reference because it was returned by reference. For the [] operator, it’s common to return a const reference for better memory efficiency.

You may be right. One of our folks said that the initial const is not required. But he only looked at it, he did not test it. We do know that the current code (with the const) has been tested.

return lhs.swapper_.idx_ <= rhs.swapper_.idx_;
}

const t_rr_edge_info operator[] (ssize_t n) const {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
const t_rr_edge_info operator[] (ssize_t n) const {
t_rr_edge_info operator[] (ssize_t n) const {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some early testing reports from our next integration hint at needing to remove the const. We will get back to you.

@AmirhosseinPoolad
Copy link
Contributor

I just compiled VtR master using gcc's master branch and it compiled just fine. @heshpdx Not sure why this code is necessary.

image

@heshpdx
Copy link
Contributor Author

heshpdx commented Oct 16, 2025

I just compiled VtR master using gcc's master branch and it compiled just fine. @heshpdx Not sure why this code is necessary.

The build errors are seen with gcc-16 ToT. Which compiler version did you use?

@AmirhosseinPoolad
Copy link
Contributor

AmirhosseinPoolad commented Oct 16, 2025

The build errors are seen with gcc-16 ToT. Which compiler version did you use?

I used the master branch of gcc for x86_64 on linux, which I assume is the latest in development version (Running gcc --version reports gcc 16). I'm not familiar with ToT. If you could send me the exact branch of gcc you used I could test the PR using that branch.

@AmirhosseinPoolad
Copy link
Contributor

AmirhosseinPoolad commented Oct 16, 2025

Some more information on the specific GCC version I used:

  • commit 08b2c542e4d0c223f50e46692bccf0c2ebbe3454
image
  • Configure options:
    ../gcc-dev/configure --prefix=$PWD/gcc-16 --enable-languages=c,c++ --disable-multilib --disable-bootstrap

  • Compile options: (Compiled with 'gcc (GCC) 15.2.1 20250808 (Red Hat 15.2.1-1)')
    make -j8

  • gcc --version output:

image

@AmirhosseinPoolad
Copy link
Contributor

AmirhosseinPoolad commented Oct 17, 2025

I think I figured out the issue. edge_sort_iterator advertises itself as a std::random_access_iterator_tag but it's a lie. random access iterators require operator[] but in their infinite wisdom C++ compilers don't actually check that and will only error out if some algorithm using the iterator e.g. std::sort tries to use operator[]. I can't be sure, but I think what happened here is that my STL implementation didn't use operator[] while yours did.

I do agree with this PR, this iterator should definitely have an operator[] and it's honestly a total coincidence that it compiles on any machine right now. I'm not sure if this implementation is correct, however. each time you call operator[n] the internal index is added to by n. This is similar to what operator+= does. I don't think operator[] should change the internal index. I'll take a look at the relevant code to see what a correct implementation would look like, and maybe port this iterator to use concepts instead.

@vaughnbetz
Copy link
Contributor

Thanks for the careful review @AmirhosseinPoolad !

@AmirhosseinPoolad
Copy link
Contributor

Some more updates on this:

  • The current iterator is very broken and is not a valid random access iterator
  • operator[] should work similar to operator* to work correctly, as in it should return a reference like operator* but idx is added to by n before returning.
  • The current operator* is also invalid in the formal sense of the word, but works for the current stable_sort implementation. See [RR_GRAPH] Fixed Bug in LLVM17 Build #2522.

The main issue here is that rr graph is stored as an struct of arrays, which means it needs it's own custom iterator. Iterators (from what I understand) need a value type and reference type. Here the reference type (edge_swapper) holds an index into rr graph storage and the value type is constructed from the reference type. This is weird because as we saw in #2522 the iterator is not behaving like a pointer at all and adding operator[] (which would work the same way) would only add to the problem. I tried a lot of different options to fix this and make a nice iterator, as far as I know* C++20 straight up does not have a good answer for this. With C++23 we could use ranges::zip_view which would completely fix our problem and we could remove the iterator and everything would be alright. There is the ranges-v3 library which does the same thing (and was the basis for the C++23 stuff) but of course adding any new dependency would need a good amount of testing.

* Of course, I don't know all or most of C++. If anyone knows a way to make a well behaved random access iterator for rr graph edges, please let me know.

I think what we should do is this:

  • Add operator[] that works exactly like operator* does, but with an offset
  • Hope that nothing breaks until we move to C++23 sometime in the future
  • Remove this iterator and related structures, use ranges::zip_view and std::ranges::stable_sort and fix this once and for all.

Copy link
Contributor

@AmirhosseinPoolad AmirhosseinPoolad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for all the rants I did in this PR @heshpdx! Thank you for flagging this issue. I'm not entirely sure if the current implementation of operator[] is correct. I had some thoughts on it. You should definitely test this PR locally on the machine you're having build issues on (since my machine and the CI machine doesn't use operator[] in std::stable_sort) to make sure it's completely correct. I'm sure the strong tests would suffice:

./run_reg_test.py vtr_reg_strong -j[num_cores]

Comment on lines +238 to +243
const t_rr_edge_info operator[] (ssize_t n) const {
edge_sort_iterator ret = *this;
ret.swapper_.idx_ += n;
return ret.swapper_;
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would this compile successfully on your system if it returned an edge_swapper& like operator* does? operator[n] is (or should be) equivalent to *(it + n). We basically need a combination of operator+ and operator* here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually think the previous PR was fine. The function was const and that was correct since (*this + n) constructs a new iterator and doesn't modify 'this' at all.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

!3298 did not work as it still led to segfaults, so Amin reverted it.

ToT stands for Top of Tree, sorry about that. The newer compilers appear to be more strict.

The current PR as is, has now been tested successfully across a wide variety of compilers and systems (gcc, llvm, nvidia/intel/amd/cray/ibm compilers, and their respective hardware). SPEC CPU 2026 is built with -std=c++17; builds with newer standards might work but we are focusing on C++17 as the baseline. The successful tests are NOT using upstream VPR's unit test or regressions, but rather just the few titan cmdlines of place+route that we run for the benchmarks.

I'm not a tester, I'm just the project leader and messenger 😄. Maybe the community waits to merge this until the problem is exposed by a more mature gcc-16, and maybe the real solution is to use ranges::zip_view. I don't have an opinion on whether or not this change should be accepted here, I am just sharing that it works for SPEC. Thanks!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Our compiler team did a bisect and said that the "first bad commit" on gcc-16 was the following, so maybe that gives a clue to the behavioral change for VPR which necessitates this PR.
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=aaeca77a79a9a897c97f00f83f2471e7a8bd6685

@vaughnbetz
Copy link
Contributor

I'm fine with merging this as everyone seems to agree the old behaviour was not really guaranteed by the standard. @AmirhosseinPoolad : I think you're also saying this is fine to merge, but we may someday be able to do it in a cleaner way (but not yet ... would have to move to a later C++ standard). @AmirhosseinPoolad : I'd still like a quick CPU time check, which I expect won't show any change, but I like to be paranoid ...

@vaughnbetz
Copy link
Contributor

Had a discussion with @AmirhosseinPoolad and @AlexandreSinger . They believe the code we're fixing (not the new code, the existing code) that stores edges in multiple parallel arrays and sorts them is fragile in its use of iterators / subscript operator. @AmirhosseinPoolad is going to rewrite the edge sorting code to be simpler so we don't need to make a complex iterator / subscript operator here. Basic idea:

  • Make a struct with (sort_val, edge_index) for each edge. sort_val will be computed from the various edge data arrays (probably 2 of them) in the same way as now.
  • Call stable_sort with this new data structure
  • Make a new copy of each of the edge_data_arrays, with the indices re-ordered as specified by the sort output
  • Delete the old copies / move the new edge_data_arrays into the main storage.
  • Delete the temporary sort_val,edge_index vector

It might be slightly less efficient, but this is not very hot code anyway (we do one sort after making an rr-graph and before routing). It also shouldn't be the peak memory footprint as the router data structures shouldn't be loaded yet. We should measure that though, and we could do the edge_data_array copy one at a time if it was the peak memory footprint.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

lang-cpp C/C++ code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants