-
Notifications
You must be signed in to change notification settings - Fork 2k
Limit stack walking in the java agent to frames before the call to AccessController.doPrivileged #17894
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: Craig Perkins <[email protected]>
Signed-off-by: Craig Perkins <[email protected]>
❌ Gradle check result for 92d83d2: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
@reta @andrross @kumargu this completely explains the entries that have needed to be added to policy files since the merge of the agent. If we proceed with walking the entire stack then there's going to need to be a lot of new entries for the agent. We need to visit this logic to determine which frames to examine. |
Thanks @cwperks , I also see the issue but |
Should we consider making a replacement in OpenSearch and driving a campaign across plugins to replace AccessController.doPrivileged calls with the replacement? |
That could be an option, yes, but it really depends on how far we would like to go, the |
that would be a huge campaign right now considering Beta-1 timelines. Our plan was to remove usage of Also we should reconsider making replacements: we should probably not end up writing our own security manager with all sorts of replacements (like PolicyFile, PolicyParser, AccessController, SocketPermissions, FilePermissions etc..) |
Another marker we could use is SpecialPermission.check(), but this has not been consistently added before AccessController.doPrivileged() in all cases |
I don't think we need to immediately switch over the imports from java.security.AccessController to its replacement (in OpenSearch) if we go this route. Replacing the imports could be done in 3.1. I can see AccessController in JDK24 (marked for removal, but still there). I think writing a replacement would be fairly trivial because it just wraps a callable and executes the callable. Its exact purpose is to limit what frames are examined on the stack when iterating through the ProtectionDomains to ensure that the code executing the privileged action is explicitly authorized with grants. |
I will take some time to study how the AccessController relates to JSM. I thought this comment from @rmuir on the initial contribution of SpecialPermission was insightful: 6d8c035
SpecialPermission.check() was always supposed to be called before AccessController.doPrivileged() to take into account the callstack before AccessController.doPrivileged to ensure the original code came from OpenSearch or one of its deps (i.e. if a painless script were to run privileged code then SpecialPermission.check() would fail) Once inside AccessController.doPrivileged it wouldn't consider previous frames |
Yeah that's right. It's a hack, from the times before painless where there was untrusted groovy code. Stuff was setup for this untrusted groovy code to run with zero permissions, and the intent of this check was to try to prevent access to doPrivileged completely, as it could allow escalation. |
I think I found where JSM is assembling the list of ProtectionDomains: https://github.com/openjdk/jdk17u/blob/305512ccc6bc87afe972781e6953c4517a365f86/src/hotspot/share/prims/jvm.cpp#L1267-L1269 This PR seems to be in line with the AccessControllerContext code here:
|
Thank you for confirming @rmuir ! |
Yeah, I feel like java made it really hard to "drop permissions" explicitly. At the time the default permissions were quite generous, which made it worse. So that's why you see strange stuff here. I feel like at some point code was changed to use a new accesscontroller method, that kinda sorta allowed you to "drop privs". Classloader filtering and painless came later IIRC. So you might have another way to accomplish the goals, without the hack? This SpecialPermission is not a good one to keep around because it always confused everyone. I guess my best analogy is, being in |
❕ Gradle check result for 92d83d2: UNSTABLE Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure. |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #17894 +/- ##
============================================
- Coverage 72.43% 72.35% -0.08%
+ Complexity 66789 66759 -30
============================================
Files 5449 5451 +2
Lines 309085 309148 +63
Branches 44979 44988 +9
============================================
- Hits 223899 223698 -201
- Misses 66906 67100 +194
- Partials 18280 18350 +70 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
@cwperks @reta Seems like the two options are either to truncate the stack, or give all the core protection domains a superset of all plugin permissions, right? If we go with the option to truncate the stack, I think we can rely on the JDK access controller for the time being and later introduce a replacement, and then support both until the JDK version is removed. |
@andrross No, not necessarily, please check https://github.com/opensearch-project/k-NN/pull/2657/files#diff-9650701a1206ba7746d46a213c115232a9c39b597cfb85d009fe6d9b7c695b1fR180 for the option(s) that should work without any changes (in the nutshell - we should be going on case by case basis). Personally, I think we should be prepared that everything security related will be gone (including the permissions, see please https://bugs.openjdk.org/browse/JDK-8353642), so we should not be building on these APIs (or tempting to replicate them, it will get very complex), thanks |
Description
This PR seeks to resolve the issue described here. After the merge of the Java agent (replacement for JSM), its been necessary to add new grants to different policy files (for example this) and one possible reason is that the java agent is walking more of the stack then the Java Security Manager did previously.
The purpose of AccessController.doPrivileged() is to limit the stack for the java security manager when traversing the ProtectionDomains to see if all callers in the stack are granted permission to do the privileged action.
I verified the changes in this PR by using the job-scheduler and tested with FileChannel.open interception.
Testing steps:
./gradlew publishToMavenLocal -Dbuild.snapshot=true -Dbuild.version_qualifier=beta1
createComponents
with the following:./gradlew integTest --tests JobSchedulerPluginIT.testPluginsAreInstalled -i
The test will fail without the policy grant. Test this with the call to
readAFile
both 1) wrapped by AccessController.doPrivileged()` and 2) without wrapping.In that case for 2) without this change in core you will see that it fails with the following:
Its clearly walking up the stack to check for a grant in a different ProtectionDomain, whereas when wrapping in AccessController.doPrivileged it will not walk up to that point in the stack.
Related Issues
Resolves #17884 (comment)
Separate Issue
Separately, there is an issue with the FileInterceptor.
If I change the implementation of
readAFile
to this:Then the interceptor is not properly evaluating permissions because behind the scenes its calling
newByteChannel
and notopen
.Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.