-
Notifications
You must be signed in to change notification settings - Fork 13.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[llvm:ir] Add support for constant data exceeding 4GiB #126481
Conversation
Thank you for submitting a Pull Request (PR) to the LLVM Project! This PR will be automatically labeled and the relevant teams will be notified. If you wish to, you can add reviewers by using the "Reviewers" section on this page. If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers. If you have further questions, they may be answered by the LLVM GitHub User Guide. You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums. |
@llvm/pr-subscribers-llvm-analysis @llvm/pr-subscribers-clang Author: None (pzzp) ChangesThe test file is over 4GiB, which is too big, so I didn’t submit it. 😄 Full diff: https://github.com/llvm/llvm-project/pull/126481.diff 5 Files Affected:
diff --git a/clang/lib/CodeGen/CGExprConstant.cpp b/clang/lib/CodeGen/CGExprConstant.cpp
index ef11798869d3b13..7d57c71e7e61b98 100644
--- a/clang/lib/CodeGen/CGExprConstant.cpp
+++ b/clang/lib/CodeGen/CGExprConstant.cpp
@@ -364,13 +364,13 @@ bool ConstantAggregateBuilder::split(size_t Index, CharUnits Hint) {
// FIXME: If possible, split into two ConstantDataSequentials at Hint.
CharUnits ElemSize = getSize(CDS->getElementType());
replace(Elems, Index, Index + 1,
- llvm::map_range(llvm::seq(0u, CDS->getNumElements()),
+ llvm::map_range(llvm::seq(0ul, CDS->getNumElements()),
[&](unsigned Elem) {
return CDS->getElementAsConstant(Elem);
}));
replace(Offsets, Index, Index + 1,
llvm::map_range(
- llvm::seq(0u, CDS->getNumElements()),
+ llvm::seq(0ul, CDS->getNumElements()),
[&](unsigned Elem) { return Offset + Elem * ElemSize; }));
return true;
}
diff --git a/llvm/include/llvm/IR/Constants.h b/llvm/include/llvm/IR/Constants.h
index 15b90589b7e2b55..d2ca214e781ee04 100644
--- a/llvm/include/llvm/IR/Constants.h
+++ b/llvm/include/llvm/IR/Constants.h
@@ -644,7 +644,7 @@ class ConstantDataSequential : public ConstantData {
Type *getElementType() const;
/// Return the number of elements in the array or vector.
- unsigned getNumElements() const;
+ uint64_t getNumElements() const;
/// Return the size (in bytes) of each element in the array/vector.
/// The size of the elements is known to be a multiple of one byte.
diff --git a/llvm/lib/Bitcode/Writer/BitcodeWriter.cpp b/llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
index 7ca63c2c7251ded..f8968116dcc5319 100644
--- a/llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
+++ b/llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
@@ -2773,7 +2773,7 @@ void ModuleBitcodeWriter::writeConstants(unsigned FirstVal, unsigned LastVal,
cast<ConstantDataSequential>(C)->isString()) {
const ConstantDataSequential *Str = cast<ConstantDataSequential>(C);
// Emit constant strings specially.
- unsigned NumElts = Str->getNumElements();
+ uint64_t NumElts = Str->getNumElements();
// If this is a null-terminated string, use the denser CSTRING encoding.
if (Str->isCString()) {
Code = bitc::CST_CODE_CSTRING;
@@ -2801,10 +2801,10 @@ void ModuleBitcodeWriter::writeConstants(unsigned FirstVal, unsigned LastVal,
Code = bitc::CST_CODE_DATA;
Type *EltTy = CDS->getElementType();
if (isa<IntegerType>(EltTy)) {
- for (unsigned i = 0, e = CDS->getNumElements(); i != e; ++i)
+ for (uint64_t i = 0, e = CDS->getNumElements(); i != e; ++i)
Record.push_back(CDS->getElementAsInteger(i));
} else {
- for (unsigned i = 0, e = CDS->getNumElements(); i != e; ++i)
+ for (uint64_t i = 0, e = CDS->getNumElements(); i != e; ++i)
Record.push_back(
CDS->getElementAsAPFloat(i).bitcastToAPInt().getLimitedValue());
}
diff --git a/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp b/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
index 44b10c3ef997267..34ea45ad8d14541 100644
--- a/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
@@ -3628,9 +3628,9 @@ static void emitGlobalConstantDataSequential(
return AP.OutStreamer->emitBytes(CDS->getAsString());
// Otherwise, emit the values in successive locations.
- unsigned ElementByteSize = CDS->getElementByteSize();
+ uint64_t ElementByteSize = CDS->getElementByteSize();
if (isa<IntegerType>(CDS->getElementType())) {
- for (unsigned I = 0, E = CDS->getNumElements(); I != E; ++I) {
+ for (uint64_t I = 0, E = CDS->getNumElements(); I != E; ++I) {
emitGlobalAliasInline(AP, ElementByteSize * I, AliasList);
if (AP.isVerbose())
AP.OutStreamer->getCommentOS()
@@ -3640,7 +3640,7 @@ static void emitGlobalConstantDataSequential(
}
} else {
Type *ET = CDS->getElementType();
- for (unsigned I = 0, E = CDS->getNumElements(); I != E; ++I) {
+ for (uint64_t I = 0, E = CDS->getNumElements(); I != E; ++I) {
emitGlobalAliasInline(AP, ElementByteSize * I, AliasList);
emitGlobalConstantFP(CDS->getElementAsAPFloat(I), ET, AP);
}
diff --git a/llvm/lib/IR/Constants.cpp b/llvm/lib/IR/Constants.cpp
index 33f4dc78c6d3f9b..9c3f28a3faa6807 100644
--- a/llvm/lib/IR/Constants.cpp
+++ b/llvm/lib/IR/Constants.cpp
@@ -2855,7 +2855,7 @@ bool ConstantDataSequential::isElementTypeCompatible(Type *Ty) {
return false;
}
-unsigned ConstantDataSequential::getNumElements() const {
+uint64_t ConstantDataSequential::getNumElements() const {
if (ArrayType *AT = dyn_cast<ArrayType>(getType()))
return AT->getNumElements();
return cast<FixedVectorType>(getType())->getNumElements();
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you test that the result actually has the correct contents, in addition to the correct size? It looks like it might still wrap...
@@ -1583,7 +1583,7 @@ static void printConstant(const Constant *COp, unsigned BitWidth, | |||
bool IsInteger = EltTy->isIntegerTy(); | |||
bool IsFP = EltTy->isHalfTy() || EltTy->isFloatTy() || EltTy->isDoubleTy(); | |||
unsigned EltBits = EltTy->getPrimitiveSizeInBits(); | |||
unsigned E = std::min(BitWidth / EltBits, CDS->getNumElements()); | |||
unsigned E = std::min(BitWidth / EltBits, (unsigned)CDS->getNumElements()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unsigned E = std::min(BitWidth / EltBits, (unsigned)CDS->getNumElements()); | |
uint64_t E = std::min((uint64_t)(BitWidth / EltBits), CDS->getNumElements()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We tested, actually it work well on our llvm-15 based TVM for over half a year...
The function printConstant
is only invoked for constants retrieved from the constant pool, which is unlikely to be a large constant.
The purpose of our patch is to enable emitting constant data exceeding 4GB, hence we avoid to do extensive modifications like replace unsigned with
uint64_t` across the codebase.
This script gen_large_string.py
can generate a test.
print(
'''; ModuleID = 'large_string.c'
source_filename = "large_string.c"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"
'''
)
print('@.str = private unnamed_addr constant [4294968320 x i8] c"', end='')
for i in range(4294968320//1024):
print('abcdefgh' * 128, end='')
print('", align 1')
print(
'''@s = dso_local local_unnamed_addr global ptr @.str, align 8
!llvm.module.flags = !{!0, !1, !2, !3}
!llvm.ident = !{!4}
!0 = !{i32 1, !"wchar_size", i32 4}
!1 = !{i32 8, !"PIC Level", i32 2}
!2 = !{i32 7, !"PIE Level", i32 2}
!3 = !{i32 7, !"uwtable", i32 2}
!4 = !{!"clang version 21.0.0git (https://github.com/llvm/llvm-project.git 0395fd9680a348c6afc52a165453222c02a3cd48)"}
''')
python3 gen_large_string.py > large_string.ll
llc large_string.ll
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll give you that printConstant isn't really important.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"abcdefgh" is not a good string to use because its length is a power of 2; that means you won't notice if integer indexes wrap.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you very much for your meticulous review.
I update the test to this gen_large_string.sh
#!/bin/bash
cat << EOF
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"
EOF
# generate @S = constant [4294967303 x i8] c"......", align 1
for (( i = 0; i<128; i++)); do
S_1K+='abcdefgh'
done
for (( i = 0; i<1024; i++)); do
S_1M+=$S_1K
done
echo -n "@S = constant [$((1024 * 1024 * 1024 * 4 + 7)) x i8] c\""
for (( i = 0; i<$((1024 * 4)); i++)); do
echo -n $S_1M
done
echo -n 'last7ch'
echo '", align 1'
cat << EOF
!llvm.module.flags = !{!0, !1, !2, !3}
!0 = !{i32 1, !"wchar_size", i32 4}
!1 = !{i32 8, !"PIC Level", i32 2}
!2 = !{i32 7, !"PIE Level", i32 2}
!3 = !{i32 7, !"uwtable", i32 2}
EOF
bash gen_large_string.sh > a.ll
llc a.ll
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not just switch it to uint64_t even if it "works"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't want to make too many changes. Changing unsigned to uint64_t here will lead to a series of modifications to printConstant, but printConstant doesn't actually handle constants exceeding 4GiB
@@ -644,7 +644,7 @@ class ConstantDataSequential : public ConstantData { | |||
Type *getElementType() const; | |||
|
|||
/// Return the number of elements in the array or vector. | |||
unsigned getNumElements() const; | |||
uint64_t getNumElements() const; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you also need to update other APIs? ConstantDataSequential::getElementAsInteger() etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
other APIs do not affect the emission of large constant data, so we do not modify them for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like the AsmPrinter, at least, is using getElementAsInteger. Please try some tests that don't hit the CDS->isString()
optimized path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, I update getElementAsInteger
and getElementAsFloat
, it seems that these two are used in codegen.
I test it with this script's generation
#!/bin/bash
cat << EOF
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"
EOF
# generate @ARR = global [1073741825 x i32] [...], align 16
for (( i = 0; i<1024; i++)); do
ARR_1K+="i32 $((RANDOM % 97)), "
done
for (( i = 0; i<1024; i++)); do
ARR_1M+="$ARR_1K"
done
echo -n "@ARR = global [$((1024 * 1024 * 1024 + 1)) x i32] ["
for (( i = 0; i<$((1024)); i++)); do
echo -n "$ARR_1M"
done
echo -n 'i32 233], align 16'
cat << EOF
!llvm.module.flags = !{!0, !1, !2, !3}
!0 = !{i32 1, !"wchar_size", i32 4}
!1 = !{i32 8, !"PIC Level", i32 2}
!2 = !{i32 7, !"PIE Level", i32 2}
!3 = !{i32 7, !"uwtable", i32 2}
EOF
I checked that the result is ok. The last number in result asm is 233
and the size is 4294967300
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should I add these test to regression tests? but it is really large
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding a python script as a regression test isn't necessarily an issue, but I assume it would take an excessively large amount of time/memory to process a 4GB file. We don't really have any place in-tree for a test like that. Leaving it out is probably fine... it's here for reference if we need it in the future.
clang/lib/CodeGen/CGExprConstant.cpp
Outdated
@@ -364,13 +364,13 @@ bool ConstantAggregateBuilder::split(size_t Index, CharUnits Hint) { | |||
// FIXME: If possible, split into two ConstantDataSequentials at Hint. | |||
CharUnits ElemSize = getSize(CDS->getElementType()); | |||
replace(Elems, Index, Index + 1, | |||
llvm::map_range(llvm::seq(0u, CDS->getNumElements()), | |||
llvm::map_range(llvm::seq(uint64_t(0u), CDS->getNumElements()), | |||
[&](unsigned Elem) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[&](unsigned Elem) { | |
[&](uint64_t Elem) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👌
✅ With the latest revision this PR passed the C/C++ code formatter. |
5b1adbc
to
a1d2d35
Compare
@@ -644,7 +644,7 @@ class ConstantDataSequential : public ConstantData { | |||
Type *getElementType() const; | |||
|
|||
/// Return the number of elements in the array or vector. | |||
unsigned getNumElements() const; | |||
uint64_t getNumElements() const; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding a python script as a regression test isn't necessarily an issue, but I assume it would take an excessively large amount of time/memory to process a 4GB file. We don't really have any place in-tree for a test like that. Leaving it out is probably fine... it's here for reference if we need it in the future.
llvm/include/llvm/IR/Constants.h
Outdated
@@ -617,15 +617,15 @@ class ConstantDataSequential : public ConstantData { | |||
|
|||
/// If this is a sequential container of integers (of any size), return the | |||
/// specified element in the low bits of a uint64_t. | |||
uint64_t getElementAsInteger(unsigned i) const; | |||
uint64_t getElementAsInteger(uint64_t i) const; | |||
|
|||
/// If this is a sequential container of integers (of any size), return the | |||
/// specified element as an APInt. | |||
APInt getElementAsAPInt(unsigned i) const; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For consistency, I'm going to ask you change the rest of the getElementAs*. (I agree this change shouldn't grow indefinitely, so I'm not going to request anything beyond that. But having half the accessors use "unsigned" and the other half use "uint64_t" is confusing.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done! ✨
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
(I'll merge once the premerge checks come back.)
Code formatter returned issues; please fix (#126481 (comment)) |
@pzzp Congratulations on having your first Pull Request (PR) merged into the LLVM Project! Your changes will be combined with recent changes from other authors, then tested by our build bots. If there is a problem with a build, you may receive a report in an email or a comment on this PR. Please check whether problems have been caused by your change specifically, as the builds can include changes from many authors. It is not uncommon for your change to be included in a build that fails due to someone else's changes, or infrastructure issues. How to do this, and the rest of the post-merge process, is covered in detail here. If your change does cause a problem, it may be reverted, or you can revert it yourself. This is a normal part of LLVM development. You can fix your changes and open a new PR to merge them again. If you don't get any reports, no action is required from you. Your changes are working as expected, well done! |
The test file is over 4GiB, which is too big, so I didn’t submit it. 😄