Refactor CPI helpers by febo · Pull Request #3 · febo/eisodos

febo · 2025-04-29T10:30:46Z

Problem

PR #2 shows that the current CPI helpers used in the pinocchio benchmark are far from ideal:

consume more CUs
significantly increase the binary size

Solution

This PR refactor the CPI code for both pinocchio and solana-nostd programs. The refactoring led to decrease in both compute units and binary size of the programs.

program	before	after
pinocchio	10528	3680
solana-nostd-entrypoint	17064	9576

billythedummy · 2025-04-30T16:35:19Z

programs/pinocchio/src/cpi/create_account.rs

+/// Create a new account.
+///
+/// This function is a wrapper around the system program's `create_account`
+/// instruction.
+///
+/// # Safety
+///
+/// This function assumes that accounts are not mutably borrowed.


btw idk if you or anyone on the team know this already but enabling potential UB like this can be extremely dangerous for onchain programs. By right, UB means the following might happen:

user has code that forgot to drop a lamports RefMut

a similar unsafe CPI helper is used, which modifies the lamports field, resulting in a simultaneous mutable borrow UB

there is code later that checks whether the lamports amount has changed after the CPI using the same RefMut from above. Because of the above UB, the rust compiler assumes that the RefMut has exclusive borrow of lamports even across the CPI call, so it optimizes the check to always pass instead of generating the code to actually perform the check

Granted, pinocchio's RefMut contains raw pointers so this is probably not going to happen because rustc is generally very conservative about optimizing anything to do with raw pointers, but still something worth keeping in mind imo.

That is why the function is marked unsafe. 😉 As the comment says, you need to guarantee that the account is not already borrowed - I probably need to make that comment more clear but most likely I will replace it with a "proper" client once the system crate gets refactored.

In any case, this is not a "generic" helper for the CPI. It is only used for the purpose of this benchmark. But overall the idea is to offer both versions: a safe one with all the necessary checks and an unsafe where you are responsible for the checks. If a developer decides to use the unsafe, then it needs to make sure that it satisfies the constraints. And yes, it enables creating UB, but that is not different than unsafe rust allowing you to create UB, right?

The difference is that exposing a public unsafe api, as opposed to encapsulating all the unsafe behaviour behind a safe public api, puts more burden on devs and auditors since now every call site of such a fn needs to be checked thoroughly rather than just the src.

For this case, this is made harder by the fact that Ref and RefMut dont have their scope automatically determined like references. Forgot to call drop() or surround your RefMut in curly braces before unsafe CPI? Instant UB

Hmmm, I am not sure if I follow the argument. There is a "safe" variant invoke_signed that performs all the required checks. If a dev chooses to use the unsafe invoke_signed_unchecked, then it is an explicit choice that comes with constraints. This is no different than using assume_init for example – it is up to the caller to guarantee that the value is really initialized; calling on an uninitialized value leads to instant UB.

In the particular context of this create_account_unchecked, this is not meant to be a "public" CPI helper for the system program. It is just here for the benchmark and it is being used in a context where there are no borrows to the accounts.

Yes, this is a completely acceptable design decision and standpoint for pinocchio to have. I'm not requesting any changes or anything. The original comment was just a friendly reminder that taking this standpoint can have far-reaching effects on users of this library, and therefore the entire ecosystem, in case these effects weren't considered. By exposing a public unsafe API that can be hard to get right (as demonstrated by the discussion above) and presenting it as part of an '"average" program implementation', pinocchio program devs who don't 100% know what they're doing can get seriously bitten. As library devs, I think that part of our job should be to take away as many of these footguns as possible; ensuring security of onchain programs' business logic is hard enough already. But this is just my own standpoint, pinocchio's standpoint might be to expose low-level unsafe functionality in order to enable program devs to achieve maximum performance, and that's fine too

This is no different than using assume_init for example – it is up to the caller to guarantee that the value is really initialized; calling on an uninitialized value leads to instant UB.

Yep, but there is a difference between making a public library function that uses assume_init internally and verifies that its usage of it is safe in all cases, and making a public unsafe library function that puts the burden of making sure that its safe on the library user

This is no different than using assume_init for example – it is up to the caller to guarantee that the value is really initialized; calling on an uninitialized value leads to instant UB.

Yep, but there is a difference between making a public library function that uses assume_init internally and verifies that its usage of it is safe in all cases, and making a public unsafe library function that puts the burden of making sure that its safe on the library user

I was referring to the fact that assume_init is a public library unsafe function in the same way that invoke_signed_uncheckedis a public library unsafe function. From this standpoint, there is no difference. 😉

Yes, this is a completely acceptable design decision and standpoint for pinocchio to have. I'm not requesting any changes or anything. The original comment was just a friendly reminder that taking this standpoint can have far-reaching effects on users of this library, and therefore the entire ecosystem, in case these effects weren't considered. By exposing a public unsafe API that can be hard to get right (as demonstrated by the discussion above) and presenting it as part of an '"average" program implementation', pinocchio program devs who don't 100% know what they're doing can get seriously bitten. As library devs, I think that part of our job should be to take away as many of these footguns as possible; ensuring security of onchain programs' business logic is hard enough already. But this is just my own standpoint, pinocchio's standpoint might be to expose low-level unsafe functionality in order to enable program devs to achieve maximum performance, and that's fine too

That is precisely the design decision: offer both variants and it is up to the developer to decide which one to use. I would not consider it "hard to get it right" once you understand the implications – for example, for you it is very clear the implication of using it and how to avoid issues. By no means it is being presented as an "average" program implementation – if that is what comes across, it is not intentional and we should definitely improve the documentation to make that clear. Not sure if your comment is generic or specific to the example on this PR – if it is the latter, we should also consider the context of this PR, since this discussion is happening on an unfinished draft PR of a benchmark program.

Developers "who don't 100% know what they're doing" should not be using the unsafe variants, the same way they should not be using unsafe Rust – at the end of the day, no library can prevent you doing the wrong thing if you don't know what you are doing. The fact that you have to wrap unsafe code in an unsafe block is already a good sign that if you don't know what that code is doing, you probably should not be using it. But I think this should not prevent a library to offer low-level unsafe functionality the same way that Rust exposes unsafe functionality – safer "footgun-free" variants are available. It is possible to write a whole program without any unsafe Rust.

I appreciate your comment and agree that going through examples that use unsafe code can lead to people copying the code without understanding its implications. But I think this is a separate issue – it is more about developer education/discipline than whether a library should offer unsafe API or not.

I appreciate your detailed replies too, thank you for clarifying pinocchio's design decisions.

By no means it is being presented as an "average" program implementation – if that is what comes across, it is not intentional and we should definitely improve the documentation to make that clear

The quote is from this part of the README

Good catch! I think adding the CPI benches did not fit well the original intent of the benchmark, might need to revise the whole thing. Later on the same section there is this text:

The implementation across all different entrypoint programs is as similar as possible. In most cases, the only differences are on the types import, since each entrypoint defines their own AccountInfo and/or Pubkey types.

This is not possible when it comes to CPI clients, since each implementation does things differently.

febo · 2025-05-31T02:39:59Z

Closed in favour of #4

febo added 2 commits April 29, 2025 11:20

Refactor cpi

7021ece

Update compute units

a140c3c

febo mentioned this pull request Apr 29, 2025

Add jiminy results using solana v2.2.6 and also binary sizes comparison #2

Merged

febo marked this pull request as draft April 29, 2025 16:27

febo force-pushed the febo/refactor-cpi branch from cfe6369 to a140c3c Compare April 29, 2025 16:27

Update pinocchio dependency

a830867

billythedummy reviewed Apr 30, 2025

View reviewed changes

febo closed this May 31, 2025

febo deleted the febo/refactor-cpi branch June 4, 2025 13:54

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Refactor CPI helpers#3

Refactor CPI helpers#3
febo wants to merge 3 commits intomainfrom
febo/refactor-cpi

febo commented Apr 29, 2025

Uh oh!

billythedummy Apr 30, 2025

Uh oh!

febo Apr 30, 2025

Uh oh!

billythedummy May 1, 2025

Uh oh!

febo May 1, 2025

Uh oh!

billythedummy May 1, 2025

Uh oh!

billythedummy May 1, 2025

Uh oh!

febo May 1, 2025

Uh oh!

febo May 1, 2025

Uh oh!

billythedummy May 1, 2025

Uh oh!

febo May 2, 2025

Uh oh!

febo commented May 31, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

febo commented Apr 29, 2025

Problem

Solution

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

febo commented May 31, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants