-
Notifications
You must be signed in to change notification settings - Fork 15.2k
[flang][openacc] map data operand results to symbols inside compute region #162306
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…egion Variable references inside OpenACC compute and loop region were currently always lowered to usages of the same SSA values than in the host thread, even for variables that appear in data clauses and for which acc data operations are created. This makes it a non-trivial task to identify implicit data usages vs usage of data appearing in clauses because the SSA addresses used in the region may have a non-trivial SSA relationship with the SSA addresses used as inputs of the data operations, especially after CSE runs that may merge component or array element addressing operations with similar addressing on the host thread (fir.coordinate/hlfir.designate). This patch updates OpenACC lowering to remap the Symbol that appear in data clauses to the related acc data operation result for the scope of the compute or loop region. To allow FIR passes to reason about these addresses, a new hlfir.declare operation is created with the acc data operation result. This gives access to the shape, contiguity, attributes, and dummy argument relationships inside the region without having FIR extended to understand the data operations.
@llvm/pr-subscribers-flang-fir-hlfir @llvm/pr-subscribers-openacc Author: None (jeanPerier) ChangesVariable references inside OpenACC compute and loop region were currently always lowered to usages of the same SSA values than in the host thread, even for variables that appear in data clauses and for which acc data operations are created. This makes it a non-trivial task to identify implicit data usages vs usage of data appearing in clauses because the SSA addresses used in the region may have a non-trivial SSA relationship with the SSA addresses used as inputs of the data operations, especially after CSE runs that may merge component or array element addressing operations with similar addressing on the host thread (fir.coordinate/hlfir.designate). This patch updates OpenACC lowering to remap the Symbol that appear in data clauses to the related acc data operation result for the scope of the compute or loop region. To allow FIR passes to reason about these addresses, a new hlfir.declare operation is created with the acc data operation result. This gives access to the shape, contiguity, attributes, and dummy argument relationships inside the region without having FIR extended to understand the data operations. Patch is 109.83 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/162306.diff 11 Files Affected:
diff --git a/flang/include/flang/Lower/AbstractConverter.h b/flang/include/flang/Lower/AbstractConverter.h
index 0ffe27ea038e8..f8322a50effc4 100644
--- a/flang/include/flang/Lower/AbstractConverter.h
+++ b/flang/include/flang/Lower/AbstractConverter.h
@@ -123,6 +123,10 @@ class AbstractConverter {
virtual Fortran::lower::SymMap::StorageDesc
getSymbolStorage(SymbolRef sym) = 0;
+ /// Return the Symbol Map used to map semantics::Symbol to their SSA
+ /// values in the generated MLIR.
+ virtual Fortran::lower::SymMap &getSymbolMap() = 0;
+
/// Override lowering of expression with pre-lowered values.
/// Associate mlir::Value to evaluate::Expr. All subsequent call to
/// genExprXXX() will replace any occurrence of an overridden
diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp
index 780d56f085f69..3a022e188bbdd 100644
--- a/flang/lib/Lower/Bridge.cpp
+++ b/flang/lib/Lower/Bridge.cpp
@@ -643,6 +643,8 @@ class FirConverter : public Fortran::lower::AbstractConverter {
return localSymbols.lookupStorage(sym);
}
+ Fortran::lower::SymMap &getSymbolMap() override final { return localSymbols; }
+
void
overrideExprValues(const Fortran::lower::ExprToValueMap *map) override final {
exprValueOverrides = map;
diff --git a/flang/lib/Lower/OpenACC.cpp b/flang/lib/Lower/OpenACC.cpp
index 4a9e49435a907..f164b3b629f6d 100644
--- a/flang/lib/Lower/OpenACC.cpp
+++ b/flang/lib/Lower/OpenACC.cpp
@@ -20,6 +20,7 @@
#include "flang/Lower/PFTBuilder.h"
#include "flang/Lower/StatementContext.h"
#include "flang/Lower/Support/Utils.h"
+#include "flang/Lower/SymbolMap.h"
#include "flang/Optimizer/Builder/BoxValue.h"
#include "flang/Optimizer/Builder/Complex.h"
#include "flang/Optimizer/Builder/FIRBuilder.h"
@@ -33,6 +34,7 @@
#include "flang/Semantics/scope.h"
#include "flang/Semantics/tools.h"
#include "mlir/Dialect/ControlFlow/IR/ControlFlowOps.h"
+#include "mlir/IR/IRMapping.h"
#include "mlir/IR/MLIRContext.h"
#include "mlir/Support/LLVM.h"
#include "llvm/ADT/STLExtras.h"
@@ -60,6 +62,16 @@ static llvm::cl::opt<bool> lowerDoLoopToAccLoop(
llvm::cl::desc("Whether to lower do loops as `acc.loop` operations."),
llvm::cl::init(true));
+static llvm::cl::opt<bool> enableSymbolRemapping(
+ "openacc-remap-symbols",
+ llvm::cl::desc("Whether to remap symbols that appears in data clauses."),
+ llvm::cl::init(true));
+
+static llvm::cl::opt<bool> enableDevicePtrRemap(
+ "openacc-remap-device-ptr-symbols",
+ llvm::cl::desc("sub-option of openacc-remap-symbols for deviceptr clause"),
+ llvm::cl::init(false));
+
// Special value for * passed in device_type or gang clauses.
static constexpr std::int64_t starCst = -1;
@@ -624,17 +636,19 @@ void genAtomicCapture(Fortran::lower::AbstractConverter &converter,
}
template <typename Op>
-static void
-genDataOperandOperations(const Fortran::parser::AccObjectList &objectList,
- Fortran::lower::AbstractConverter &converter,
- Fortran::semantics::SemanticsContext &semanticsContext,
- Fortran::lower::StatementContext &stmtCtx,
- llvm::SmallVectorImpl<mlir::Value> &dataOperands,
- mlir::acc::DataClause dataClause, bool structured,
- bool implicit, llvm::ArrayRef<mlir::Value> async,
- llvm::ArrayRef<mlir::Attribute> asyncDeviceTypes,
- llvm::ArrayRef<mlir::Attribute> asyncOnlyDeviceTypes,
- bool setDeclareAttr = false) {
+static void genDataOperandOperations(
+ const Fortran::parser::AccObjectList &objectList,
+ Fortran::lower::AbstractConverter &converter,
+ Fortran::semantics::SemanticsContext &semanticsContext,
+ Fortran::lower::StatementContext &stmtCtx,
+ llvm::SmallVectorImpl<mlir::Value> &dataOperands,
+ mlir::acc::DataClause dataClause, bool structured, bool implicit,
+ llvm::ArrayRef<mlir::Value> async,
+ llvm::ArrayRef<mlir::Attribute> asyncDeviceTypes,
+ llvm::ArrayRef<mlir::Attribute> asyncOnlyDeviceTypes,
+ bool setDeclareAttr = false,
+ llvm::SmallVectorImpl<std::pair<mlir::Value, Fortran::semantics::SymbolRef>>
+ *symbolPairs = nullptr) {
fir::FirOpBuilder &builder = converter.getFirOpBuilder();
Fortran::evaluate::ExpressionAnalyzer ea{semanticsContext};
const bool unwrapBoxAddr = true;
@@ -655,6 +669,9 @@ genDataOperandOperations(const Fortran::parser::AccObjectList &objectList,
/*strideIncludeLowerExtent=*/strideIncludeLowerExtent);
LLVM_DEBUG(llvm::dbgs() << __func__ << "\n"; info.dump(llvm::dbgs()));
+ bool isWholeSymbol =
+ !designator || Fortran::evaluate::UnwrapWholeSymbolDataRef(*designator);
+
// If the input value is optional and is not a descriptor, we use the
// rawInput directly.
mlir::Value baseAddr = ((fir::unwrapRefType(info.addr.getType()) !=
@@ -668,6 +685,11 @@ genDataOperandOperations(const Fortran::parser::AccObjectList &objectList,
asyncOnlyDeviceTypes, unwrapBoxAddr, info.isPresent);
dataOperands.push_back(op.getAccVar());
+ // Track the symbol and its corresponding mlir::Value if requested
+ if (symbolPairs && isWholeSymbol)
+ symbolPairs->emplace_back(op.getAccVar(),
+ Fortran::semantics::SymbolRef(symbol));
+
// For UseDeviceOp, if operand is one of a pair resulting from a
// declare operation, create a UseDeviceOp for the other operand as well.
if constexpr (std::is_same_v<Op, mlir::acc::UseDeviceOp>) {
@@ -681,6 +703,8 @@ genDataOperandOperations(const Fortran::parser::AccObjectList &objectList,
asyncDeviceTypes, asyncOnlyDeviceTypes,
unwrapBoxAddr, info.isPresent);
dataOperands.push_back(op.getAccVar());
+ // Not adding this to symbolPairs because it only make sense to
+ // map the symbol to a single value.
}
}
}
@@ -1264,7 +1288,9 @@ static void genPrivatizationRecipes(
llvm::SmallVector<mlir::Attribute> &privatizationRecipes,
llvm::ArrayRef<mlir::Value> async,
llvm::ArrayRef<mlir::Attribute> asyncDeviceTypes,
- llvm::ArrayRef<mlir::Attribute> asyncOnlyDeviceTypes) {
+ llvm::ArrayRef<mlir::Attribute> asyncOnlyDeviceTypes,
+ llvm::SmallVectorImpl<std::pair<mlir::Value, Fortran::semantics::SymbolRef>>
+ *symbolPairs = nullptr) {
fir::FirOpBuilder &builder = converter.getFirOpBuilder();
Fortran::evaluate::ExpressionAnalyzer ea{semanticsContext};
for (const auto &accObject : objectList.v) {
@@ -1284,6 +1310,9 @@ static void genPrivatizationRecipes(
/*strideIncludeLowerExtent=*/strideIncludeLowerExtent);
LLVM_DEBUG(llvm::dbgs() << __func__ << "\n"; info.dump(llvm::dbgs()));
+ bool isWholeSymbol =
+ !designator || Fortran::evaluate::UnwrapWholeSymbolDataRef(*designator);
+
RecipeOp recipe;
mlir::Type retTy = getTypeFromBounds(bounds, info.addr.getType());
if constexpr (std::is_same_v<RecipeOp, mlir::acc::PrivateRecipeOp>) {
@@ -1297,6 +1326,11 @@ static void genPrivatizationRecipes(
/*implicit=*/false, mlir::acc::DataClause::acc_private, retTy, async,
asyncDeviceTypes, asyncOnlyDeviceTypes, /*unwrapBoxAddr=*/true);
dataOperands.push_back(op.getAccVar());
+
+ // Track the symbol and its corresponding mlir::Value if requested
+ if (symbolPairs && isWholeSymbol)
+ symbolPairs->emplace_back(op.getAccVar(),
+ Fortran::semantics::SymbolRef(symbol));
} else {
std::string suffix =
areAllBoundConstant(bounds) ? getBoundsString(bounds) : "";
@@ -1310,6 +1344,11 @@ static void genPrivatizationRecipes(
async, asyncDeviceTypes, asyncOnlyDeviceTypes,
/*unwrapBoxAddr=*/true);
dataOperands.push_back(op.getAccVar());
+
+ // Track the symbol and its corresponding mlir::Value if requested
+ if (symbolPairs && isWholeSymbol)
+ symbolPairs->emplace_back(op.getAccVar(),
+ Fortran::semantics::SymbolRef(symbol));
}
privatizationRecipes.push_back(mlir::SymbolRefAttr::get(
builder.getContext(), recipe.getSymName().str()));
@@ -1949,15 +1988,16 @@ mlir::Type getTypeFromIvTypeSize(fir::FirOpBuilder &builder,
return builder.getIntegerType(ivTypeSize * 8);
}
-static void
-privatizeIv(Fortran::lower::AbstractConverter &converter,
- const Fortran::semantics::Symbol &sym, mlir::Location loc,
- llvm::SmallVector<mlir::Type> &ivTypes,
- llvm::SmallVector<mlir::Location> &ivLocs,
- llvm::SmallVector<mlir::Value> &privateOperands,
- llvm::SmallVector<mlir::Value> &ivPrivate,
- llvm::SmallVector<mlir::Attribute> &privatizationRecipes,
- bool isDoConcurrent = false) {
+static void privatizeIv(
+ Fortran::lower::AbstractConverter &converter,
+ const Fortran::semantics::Symbol &sym, mlir::Location loc,
+ llvm::SmallVector<mlir::Type> &ivTypes,
+ llvm::SmallVector<mlir::Location> &ivLocs,
+ llvm::SmallVector<mlir::Value> &privateOperands,
+ llvm::SmallVector<std::pair<mlir::Value, Fortran::semantics::SymbolRef>>
+ &ivPrivate,
+ llvm::SmallVector<mlir::Attribute> &privatizationRecipes,
+ bool isDoConcurrent = false) {
fir::FirOpBuilder &builder = converter.getFirOpBuilder();
mlir::Type ivTy = getTypeFromIvTypeSize(builder, sym);
@@ -2001,15 +2041,8 @@ privatizeIv(Fortran::lower::AbstractConverter &converter,
builder.getContext(), recipe.getSymName().str()));
}
- // Map the new private iv to its symbol for the scope of the loop. bindSymbol
- // might create a hlfir.declare op, if so, we map its result in order to
- // use the sym value in the scope.
- converter.bindSymbol(sym, mlir::acc::getAccVar(privateOp));
- auto privateValue = converter.getSymbolAddress(sym);
- if (auto declareOp =
- mlir::dyn_cast<hlfir::DeclareOp>(privateValue.getDefiningOp()))
- privateValue = declareOp.getResults()[0];
- ivPrivate.push_back(privateValue);
+ ivPrivate.emplace_back(mlir::acc::getAccVar(privateOp),
+ Fortran::semantics::SymbolRef(sym));
}
static void determineDefaultLoopParMode(
@@ -2088,7 +2121,8 @@ static void processDoLoopBounds(
llvm::SmallVector<mlir::Value> &upperbounds,
llvm::SmallVector<mlir::Value> &steps,
llvm::SmallVector<mlir::Value> &privateOperands,
- llvm::SmallVector<mlir::Value> &ivPrivate,
+ llvm::SmallVector<std::pair<mlir::Value, Fortran::semantics::SymbolRef>>
+ &ivPrivate,
llvm::SmallVector<mlir::Attribute> &privatizationRecipes,
llvm::SmallVector<mlir::Type> &ivTypes,
llvm::SmallVector<mlir::Location> &ivLocs,
@@ -2178,26 +2212,122 @@ static void processDoLoopBounds(
}
}
-static mlir::acc::LoopOp
-buildACCLoopOp(Fortran::lower::AbstractConverter &converter,
- mlir::Location currentLocation,
- Fortran::semantics::SemanticsContext &semanticsContext,
- Fortran::lower::StatementContext &stmtCtx,
- const Fortran::parser::DoConstruct &outerDoConstruct,
- Fortran::lower::pft::Evaluation &eval,
- llvm::SmallVector<mlir::Value> &privateOperands,
- llvm::SmallVector<mlir::Attribute> &privatizationRecipes,
- llvm::SmallVector<mlir::Value> &gangOperands,
- llvm::SmallVector<mlir::Value> &workerNumOperands,
- llvm::SmallVector<mlir::Value> &vectorOperands,
- llvm::SmallVector<mlir::Value> &tileOperands,
- llvm::SmallVector<mlir::Value> &cacheOperands,
- llvm::SmallVector<mlir::Value> &reductionOperands,
- llvm::SmallVector<mlir::Type> &retTy, mlir::Value yieldValue,
- uint64_t loopsToProcess) {
+/// Remap symbols that appeared in OpenACC data clauses to use the results of
+/// the corresponding data operations. This allows isolating symbol accesses
+/// inside the OpenACC region from accesses in the host and other regions while
+/// preserving Fortran information about the symbols for optimizations.
+template <typename RegionOp>
+static void remapDataOperandSymbols(
+ Fortran::lower::AbstractConverter &converter, fir::FirOpBuilder &builder,
+ RegionOp ®ionOp,
+ const llvm::SmallVector<
+ std::pair<mlir::Value, Fortran::semantics::SymbolRef>>
+ &dataOperandSymbolPairs) {
+ if (!enableSymbolRemapping || dataOperandSymbolPairs.empty())
+ return;
+
+ // Map Symbols that appeared inside data clauses to a new hlfir.declare whose
+ // input is the acc data operation result.
+ // This allows isolating all the symbol accesses inside the compute region
+ // from accesses in the host and other regions while preserving the Fortran
+ // information about the symbols for Fortran specific optimizations inside the
+ // region.
+ Fortran::lower::SymMap &symbolMap = converter.getSymbolMap();
+ mlir::OpBuilder::InsertionGuard insertGuard(builder);
+ builder.setInsertionPointToStart(®ionOp.getRegion().front());
+ llvm::SmallPtrSet<const Fortran::semantics::Symbol *, 8> seenSymbols;
+ mlir::IRMapping mapper;
+ for (auto [value, symbol] : dataOperandSymbolPairs) {
+
+ // If A symbol appears on several data clause, just map it to the first
+ // result (all data operations results for a symbol are pointing same
+ // memory, so it does not matter which one is used).
+ if (seenSymbols.contains(&symbol.get()))
+ continue;
+ seenSymbols.insert(&symbol.get());
+ std::optional<fir::FortranVariableOpInterface> hostDef =
+ symbolMap.lookupVariableDefinition(symbol);
+ assert(hostDef.has_value() && llvm::isa<hlfir::DeclareOp>(*hostDef) &&
+ "expected symbol to be mapped to hlfir.declare");
+ auto hostDeclare = llvm::cast<hlfir::DeclareOp>(*hostDef);
+ // Replace base input and DummyScope inputs.
+ mlir::Value hostInput = hostDeclare.getMemref();
+ mlir::Type hostType = hostInput.getType();
+ mlir::Type computeType = value.getType();
+ if (hostType == computeType) {
+ mapper.map(hostInput, value);
+ } else if (llvm::isa<fir::BaseBoxType>(computeType)) {
+ assert(!llvm::isa<fir::BaseBoxType>(hostType) &&
+ "box type mismatch between compute region variable and "
+ "hlfir.declare input unexpected");
+ if (Fortran::semantics::IsOptional(symbol))
+ TODO(regionOp.getLoc(),
+ "remapping OPTIONAL symbol in OpenACC compute region");
+ auto rawValue =
+ fir::BoxAddrOp::create(builder, regionOp.getLoc(), hostType, value);
+ mapper.map(hostInput, rawValue);
+ } else {
+ assert(!llvm::isa<fir::BaseBoxType>(hostType) &&
+ "compute region variable should not be raw address when host "
+ "hlfir.declare input was a box");
+ assert(fir::isBoxAddress(hostType) == fir::isBoxAddress(computeType) &&
+ "compute region variable should be a pointer/allocatable if and "
+ "only if host is");
+ assert(fir::isa_ref_type(hostType) && fir::isa_ref_type(computeType) &&
+ "compute region variable and host variable should both be raw "
+ "addresses");
+ mlir::Value cast =
+ builder.createConvert(regionOp.getLoc(), hostType, value);
+ mapper.map(hostInput, cast);
+ }
+ if (mlir::Value dummyScope = hostDeclare.getDummyScope()) {
+ // Copy the dummy scope into the region so that aliasing rules about
+ // Fortran dummies are understood inside the region and the abstract dummy
+ // scope type does not have to cross the OpenACC compute region boundary.
+ if (!mapper.contains(dummyScope)) {
+ mlir::Operation *hostDummyScopeOp = dummyScope.getDefiningOp();
+ assert(hostDummyScopeOp &&
+ "dummyScope defining operation must be visible in lowering");
+ (void)builder.clone(*hostDummyScopeOp, mapper);
+ }
+ }
+
+ mlir::Operation *computeDef =
+ builder.clone(*hostDeclare.getOperation(), mapper);
+
+ // The input box already went through an hlfir.declare. It has the correct
+ // local lower bounds and attribute. Do not generate a new fir.rebox.
+ if (llvm::isa<fir::BaseBoxType>(hostDeclare.getMemref().getType()))
+ llvm::cast<hlfir::DeclareOp>(*computeDef).setSkipRebox(true);
+
+ symbolMap.addVariableDefinition(
+ symbol, llvm::cast<fir::FortranVariableOpInterface>(computeDef));
+ }
+}
+
+static mlir::acc::LoopOp buildACCLoopOp(
+ Fortran::lower::AbstractConverter &converter,
+ mlir::Location currentLocation,
+ Fortran::semantics::SemanticsContext &semanticsContext,
+ Fortran::lower::StatementContext &stmtCtx,
+ const Fortran::parser::DoConstruct &outerDoConstruct,
+ Fortran::lower::pft::Evaluation &eval,
+ llvm::SmallVector<mlir::Value> &privateOperands,
+ llvm::SmallVector<mlir::Attribute> &privatizationRecipes,
+ llvm::SmallVector<std::pair<mlir::Value, Fortran::semantics::SymbolRef>>
+ &dataOperandSymbolPairs,
+ llvm::SmallVector<mlir::Value> &gangOperands,
+ llvm::SmallVector<mlir::Value> &workerNumOperands,
+ llvm::SmallVector<mlir::Value> &vectorOperands,
+ llvm::SmallVector<mlir::Value> &tileOperands,
+ llvm::SmallVector<mlir::Value> &cacheOperands,
+ llvm::SmallVector<mlir::Value> &reductionOperands,
+ llvm::SmallVector<mlir::Type> &retTy, mlir::Value yieldValue,
+ uint64_t loopsToProcess) {
fir::FirOpBuilder &builder = converter.getFirOpBuilder();
- llvm::SmallVector<mlir::Value> ivPrivate;
+ llvm::SmallVector<std::pair<mlir::Value, Fortran::semantics::SymbolRef>>
+ ivPrivate;
llvm::SmallVector<mlir::Type> ivTypes;
llvm::SmallVector<mlir::Location> ivLocs;
llvm::SmallVector<bool> inclusiveBounds;
@@ -2231,10 +2361,25 @@ buildACCLoopOp(Fortran::lower::AbstractConverter &converter,
builder, builder.getFusedLoc(locs), currentLocation, eval, operands,
operandSegments, /*outerCombined=*/false, retTy, yieldValue, ivTypes,
ivLocs);
-
- for (auto [arg, value] : llvm::zip(
- loopOp.getLoopRegions().front()->front().getArguments(), ivPrivate))
- fir::StoreOp::create(builder, currentLocation, arg, value);
+ // Ensure the iv symbol is mapped to private iv SSA value for the scope of
+ // the loop even if it did not appear explicitly in a PRIVATE clause (if it
+ // appeared explicitly in such clause, that is also fine because duplicates
+ // in the list are ignored).
+ dataOperandSymbolPairs.append(ivPrivate.begin(), ivPrivate.end());
+ // Remap symbols from data clauses to use data operation results
+ remapDataOperandSymbols(converter, builder, loopOp, dataOperandSymbolPairs);
+
+ for (auto [arg, iv] :
+ llvm::zip(loopOp.getLoopRegions().front()->front().getArguments(),
+ ivPrivate)) {
+ // Store block argument to the related iv private variable.
+ auto privateValue =
+ converter.getSymbolAddress(std::get<Fortran::semantics::SymbolRef>(iv));
+ // if (auto declareOp =
+ // mlir::dyn_cast<hlfir::DeclareOp>(privateValue.getDefiningOp()))
+ // privateValue = declareOp.getResults()[0];
+ fir::StoreOp::create(builder, currentLocation, arg, privateValue);
+ }
loopOp.setInclusiveUpperbound(inclusiveBounds);
@@ -2260,6 +2405,10 @@ static mlir::acc::LoopOp createLoopOp(
llvm::SmallVector<int32_t> tileOperandsSegments, gangOperandsSegments;
llvm::SmallVector<int64_t> collapseValues;
+ // Vector to track mlir::Value results and their corresponding Fortran symbols
+ llvm::SmallVector<std::pair<mlir::Value, Fortran::semantics::SymbolRef>>
+ dataOperandSymbolPairs;
+
llvm::SmallVector<mlir::Attribute> gangArgTypes;
llvm::SmallVector<mlir::Attribute> seqDeviceTypes, independentDeviceTypes,
autoDeviceTypes, vectorOperandsDeviceTypes, workerNumOperandsDeviceTypes,
@@ -2380,7 +2529,8 @@ static mlir::acc::LoopOp createLoopOp(
genPrivatizationRecipes<mlir::acc::PrivateRecipeOp>(
privateClause->v, converter, semanticsContext, stmtCtx,
privateOperands, privatizationRecipes, /*async=*/{},
- /*asyncDeviceTypes=*/{}, /*asyncOnlyDeviceTypes...
[truncated]
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you!
Variable references inside OpenACC compute and loop region were currently always lowered to usages of the same SSA values than in the host thread, even for variables that appear in data clauses and for which acc data operations are created.
This makes it a non-trivial task to identify implicit data usages vs usage of data appearing in clauses because the SSA addresses used in the region may have a non-trivial SSA relationship with the SSA addresses used as inputs of the data operations, especially after CSE runs that may merge component or array element addressing operations with similar addressing on the host thread (fir.coordinate/hlfir.designate).
This patch updates OpenACC lowering to remap the Symbol that appear in data clauses to the related acc data operation result for the scope of the compute or loop region.
To allow FIR passes to reason about these addresses, a new hlfir.declare operation is created with the acc data operation result. This gives access to the shape, contiguity, attributes, and dummy argument relationships inside the region without having FIR extended to understand the data operations.