@RiverDave
Collaborator

Host-side shadow variables of device variables (__device__, __constant__) now get internal linkage in non-RDC mode, matching OG behavior.
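
As a rough illustration of the rule described above, the linkage choice can be sketched as a small pure function. The names here (`Linkage`, `ShadowVarInfo`, `shadowLinkage`) are hypothetical stand-ins and not the actual Clang or ClangIR API; the only part taken from the description is that shadows of `__device__` and `__constant__` variables become internal when RDC is off.

```cpp
// Hypothetical, simplified model of the linkage choice for host-side shadows.
// None of these names exist in Clang or ClangIR; they only illustrate the rule.
enum class Linkage { External, Internal };

struct ShadowVarInfo {
  bool isDevice = false;                // declared with __device__
  bool isConstant = false;              // declared with __constant__
  Linkage declared = Linkage::External; // linkage before any adjustment
};

// In non-RDC mode, shadows of device variables are internalized.
// In every other case the declared linkage is returned unchanged.
inline Linkage shadowLinkage(const ShadowVarInfo &var,
                             bool relocatableDeviceCode) {
  const bool isDeviceVar = var.isDevice || var.isConstant;
  if (!relocatableDeviceCode && isDeviceVar)
    return Linkage::Internal;
  return var.declared;
}
```

With this sketch, a `__device__` variable maps to `Linkage::Internal` when `relocatableDeviceCode` is false and keeps its declared linkage when it is true.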

@github-actions

github-actions bot commented Nov 29, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@koparasy
Contributor

koparasy commented Dec 1, 2025

I have some concerns with this. Some of the functionality you perform here should be done during lowering to LLVM IR (

void LoweringPreparePass::buildCUDARegisterVars(cir::CIRBaseBuilderTy &builder,
). I believe the code that you have regarding internalize is proper here.

Comment on lines +65 to +68
/// Keeps track of variable containing handle of GPU binary. Populated by
/// ModuleCtorFunction() and used to create corresponding cleanup calls in
/// ModuleDtorFunction()
llvm::GlobalVariable *gpuBinaryHandle = nullptr;
Contributor

I don't see this being used at CodeGen. We handle the "gpuBinaryHandle" during "lowering". Why do you think we need this here?

Collaborator Author

Good catch, this is merely an artifact of bringing the skeleton over from OG. Will remove.

DeviceVarFlags flags;
};

llvm::SmallVector<VarInfo, 16> deviceVars;
Contributor

Why do you need this? Does this exist in OG?

Contributor

I see you are mixing CUDA and HIP tests here. This is OK, but we have historically split them between CUDA/HIP directories.

Collaborator Author

Good point, I'll make sure to split them from now on.

@@ -1,4 +1,4 @@
-#include "cuda.h"
+#include "../Inputs/cuda.h"
Contributor

We prefer not to have a relative path here; use #include <cuda.h> instead :D. We provide the path through the -I%S/../Inputs/ flag we pass to CC1.

@RiverDave changed the title from "[CIR][CUDA][HIP] Handle global variable registration" to "[CIR][CUDA][HIP] Set internal linkage for device variable shadows" on Dec 1, 2025
@RiverDave
Collaborator Author

RiverDave commented Dec 2, 2025

I have some concerns with this. Some of the functionality you perform here should be done during lowering to LLVM IR (

void LoweringPreparePass::buildCUDARegisterVars(cir::CIRBaseBuilderTy &builder,

). I believe the code that you have regarding internalize is proper here.

Okay, it took me some time, but I realize the initial title of this PR was misleading. Registration was recently handled in your PR (thanks for that!). The deviceVars vector stored in the runtime is metadata that we need to consume when making these registration calls. The problem with my PR is that we're not consuming that information in LoweringPrepare, where we should be making use of it.

See how OG handles the deviceVars:

for (auto &&Info : DeviceVars) {

The equivalent we have to bookkeep globals in CIR is:

for (auto &[deviceSideName, global] : cudaVarMap) {

If you look at the way we're currently handling the variables to be shadowed on the host in CIR:

llvm::StringMap<GlobalOp> cudaVarMap;

I believe we somehow need to preserve the information coming from VarInfo, specifically in DeviceVarFlags. Doing that allows us to give special handling to the different types of globals as seen in OG:

switch (Info.Flags.getKind()) {
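
To make the point about kind-based handling concrete, here is a minimal, self-contained sketch of dispatching on a flags value. `VarKind`, `VarFlags`, and `registrationPath` are hypothetical names, the three kinds and the managed bit are assumptions modeled loosely on what OG appears to distinguish, and the returned labels are illustrative only, not real runtime entry points.

```cpp
#include <string>

// Hypothetical stand-in for the flags carried by each device variable.
enum class VarKind { Variable, Surface, Texture };

struct VarFlags {
  VarKind kind = VarKind::Variable;
  bool isManaged = false; // assumed extra bit, for illustration only
};

// Returns a label naming the registration path a variable would take.
// Each kind gets its own branch, which is the information a plain
// name-to-global map cannot provide on its own.
inline std::string registrationPath(const VarFlags &flags) {
  switch (flags.kind) {
  case VarKind::Variable:
    return flags.isManaged ? "register-managed-var" : "register-var";
  case VarKind::Surface:
    return "register-surface";
  case VarKind::Texture:
    return "register-texture";
  }
  return "unknown"; // not reached for valid enum values
}
```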

The way I see it, we have two paths:

  • Utilizing a richer data structure to preserve those flags and consume that information in loweringPrepare
  • Preserving those flags through attributes attached to Global Ops, although the implementation would take longer.

Let me know what you think
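
For reference, the first option could look roughly like the sketch below: the map keeps the device-side name as its key but stores the flags next to the global. Everything here is an assumption for illustration; `GlobalHandle` stands in for the real GlobalOp, `std::map` replaces `llvm::StringMap` so the snippet compiles on its own, and `ShadowKind`, `ShadowFlags`, `DeviceVarEntry`, and `collectByKind` are hypothetical names.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical flags preserved from CodeGen for each shadowed variable.
enum class ShadowKind { Variable, Surface, Texture };

struct ShadowFlags {
  ShadowKind kind;
  bool isConstant;
};

// Stand-in for the global operation; only carries the host-side name.
struct GlobalHandle {
  std::string hostName;
};

// Richer map value: the global together with the flags it was created with.
struct DeviceVarEntry {
  GlobalHandle global;
  ShadowFlags flags;
};

// Keyed by device-side name, like the existing cudaVarMap.
using CudaVarMap = std::map<std::string, DeviceVarEntry>;

// Collects the device-side names of all entries of one kind, in key order.
// A lowering step could use a query like this to pick the registration path.
inline std::vector<std::string> collectByKind(const CudaVarMap &vars,
                                              ShadowKind kind) {
  std::vector<std::string> names;
  for (const auto &entry : vars) {
    if (entry.second.flags.kind == kind)
      names.push_back(entry.first);
  }
  return names;
}
```

The second option would carry the same information as attributes on each global instead, so no side table would need to be kept in sync with the module.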
