Conversation
| /// Create a new account. | ||
| /// | ||
| /// This function is a wrapper around the system program's `create_account` | ||
| /// instruction. | ||
| /// | ||
| /// # Safety | ||
| /// | ||
| /// This function assumes that accounts are not mutably borrowed. |
There was a problem hiding this comment.
btw idk if you or anyone on the team know this already but enabling potential UB like this can be extremely dangerous for onchain programs. By right, UB means the following might happen:
- user has code that forgot to drop a lamports
RefMut - a similar unsafe CPI helper is used, which modifies the lamports field, resulting in a simultaneous mutable borrow UB
- there is code later that checks whether the lamports amount has changed after the CPI using the same
RefMutfrom above. Because of the above UB, the rust compiler assumes that theRefMuthas exclusive borrow of lamports even across the CPI call, so it optimizes the check to always pass instead of generating the code to actually perform the check
Granted, pinocchio's RefMut contains raw pointers so this is probably not going to happen because rustc is generally very conservative about optimizing anything to do with raw pointers, but still something worth keeping in mind imo.
There was a problem hiding this comment.
That is why the function is marked unsafe. 😉 As the comment says, you need to guarantee that the account is not already borrowed - I probably need to make that comment more clear but most likely I will replace it with a "proper" client once the system crate gets refactored.
In any case, this is not a "generic" helper for the CPI. It is only used for the purpose of this benchmark. But overall the idea is to offer both versions: a safe one with all the necessary checks and an unsafe where you are responsible for the checks. If a developer decides to use the unsafe, then it needs to make sure that it satisfies the constraints. And yes, it enables creating UB, but that is not different than unsafe rust allowing you to create UB, right?
There was a problem hiding this comment.
The difference is that exposing a public unsafe api, as opposed to encapsulating all the unsafe behaviour behind a safe public api, puts more burden on devs and auditors since now every call site of such a fn needs to be checked thoroughly rather than just the src.
For this case, this is made harder by the fact that Ref and RefMut dont have their scope automatically determined like references. Forgot to call drop() or surround your RefMut in curly braces before unsafe CPI? Instant UB
There was a problem hiding this comment.
Hmmm, I am not sure if I follow the argument. There is a "safe" variant invoke_signed that performs all the required checks. If a dev chooses to use the unsafe invoke_signed_unchecked, then it is an explicit choice that comes with constraints. This is no different than using assume_init for example – it is up to the caller to guarantee that the value is really initialized; calling on an uninitialized value leads to instant UB.
In the particular context of this create_account_unchecked, this is not meant to be a "public" CPI helper for the system program. It is just here for the benchmark and it is being used in a context where there are no borrows to the accounts.
There was a problem hiding this comment.
Yes, this is a completely acceptable design decision and standpoint for pinocchio to have. I'm not requesting any changes or anything. The original comment was just a friendly reminder that taking this standpoint can have far-reaching effects on users of this library, and therefore the entire ecosystem, in case these effects weren't considered. By exposing a public unsafe API that can be hard to get right (as demonstrated by the discussion above) and presenting it as part of an '"average" program implementation', pinocchio program devs who don't 100% know what they're doing can get seriously bitten. As library devs, I think that part of our job should be to take away as many of these footguns as possible; ensuring security of onchain programs' business logic is hard enough already. But this is just my own standpoint, pinocchio's standpoint might be to expose low-level unsafe functionality in order to enable program devs to achieve maximum performance, and that's fine too
There was a problem hiding this comment.
This is no different than using assume_init for example – it is up to the caller to guarantee that the value is really initialized; calling on an uninitialized value leads to instant UB.
Yep, but there is a difference between making a public library function that uses assume_init internally and verifies that its usage of it is safe in all cases, and making a public unsafe library function that puts the burden of making sure that its safe on the library user
There was a problem hiding this comment.
This is no different than using assume_init for example – it is up to the caller to guarantee that the value is really initialized; calling on an uninitialized value leads to instant UB.
Yep, but there is a difference between making a public library function that uses
assume_initinternally and verifies that its usage of it is safe in all cases, and making a public unsafe library function that puts the burden of making sure that its safe on the library user
I was referring to the fact that assume_init is a public library unsafe function in the same way that invoke_signed_uncheckedis a public library unsafe function. From this standpoint, there is no difference. 😉
There was a problem hiding this comment.
Yes, this is a completely acceptable design decision and standpoint for pinocchio to have. I'm not requesting any changes or anything. The original comment was just a friendly reminder that taking this standpoint can have far-reaching effects on users of this library, and therefore the entire ecosystem, in case these effects weren't considered. By exposing a public unsafe API that can be hard to get right (as demonstrated by the discussion above) and presenting it as part of an '"average" program implementation', pinocchio program devs who don't 100% know what they're doing can get seriously bitten. As library devs, I think that part of our job should be to take away as many of these footguns as possible; ensuring security of onchain programs' business logic is hard enough already. But this is just my own standpoint, pinocchio's standpoint might be to expose low-level unsafe functionality in order to enable program devs to achieve maximum performance, and that's fine too
That is precisely the design decision: offer both variants and it is up to the developer to decide which one to use. I would not consider it "hard to get it right" once you understand the implications – for example, for you it is very clear the implication of using it and how to avoid issues. By no means it is being presented as an "average" program implementation – if that is what comes across, it is not intentional and we should definitely improve the documentation to make that clear. Not sure if your comment is generic or specific to the example on this PR – if it is the latter, we should also consider the context of this PR, since this discussion is happening on an unfinished draft PR of a benchmark program.
Developers "who don't 100% know what they're doing" should not be using the unsafe variants, the same way they should not be using unsafe Rust – at the end of the day, no library can prevent you doing the wrong thing if you don't know what you are doing. The fact that you have to wrap unsafe code in an unsafe block is already a good sign that if you don't know what that code is doing, you probably should not be using it. But I think this should not prevent a library to offer low-level unsafe functionality the same way that Rust exposes unsafe functionality – safer "footgun-free" variants are available. It is possible to write a whole program without any unsafe Rust.
I appreciate your comment and agree that going through examples that use unsafe code can lead to people copying the code without understanding its implications. But I think this is a separate issue – it is more about developer education/discipline than whether a library should offer unsafe API or not.
There was a problem hiding this comment.
I appreciate your detailed replies too, thank you for clarifying pinocchio's design decisions.
By no means it is being presented as an "average" program implementation – if that is what comes across, it is not intentional and we should definitely improve the documentation to make that clear
The quote is from this part of the README
There was a problem hiding this comment.
Good catch! I think adding the CPI benches did not fit well the original intent of the benchmark, might need to revise the whole thing. Later on the same section there is this text:
The implementation across all different entrypoint programs is as similar as possible. In most cases, the only differences are on the types import, since each entrypoint defines their own AccountInfo and/or Pubkey types.
This is not possible when it comes to CPI clients, since each implementation does things differently.
|
Closed in favour of #4 |
Problem
PR #2 shows that the current CPI helpers used in the pinocchio benchmark are far from ideal:
Solution
This PR refactor the CPI code for both pinocchio and solana-nostd programs. The refactoring led to decrease in both compute units and binary size of the programs.