
Conversation

@banditopazzo
Contributor

@banditopazzo banditopazzo commented Aug 20, 2025

I finished my first iteration of in-memory linking and the API changes.

To recap my goals:

  1. in-memory linking
  2. since I am touching the external APIs anyway, the ability to reuse the linker instance and avoid a full initialization each time (in my use case I basically have to link in a for loop)

Major changes:

  • better/safer wrappers for llvm objects (custom Drop in every wrapper, lifetimes, mutability)
  • api change to have 2 main methods: link_to_file, link_to_buffer
  • linker accepts both files and in-memory data at the same time
  • moved some linker methods to standalone functions
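
As a rough illustration of the "custom Drop, lifetimes" idea from the list above, here is a minimal sketch. It does not use the real LLVM C API: `RawContext`, `RawModule`, and the `raw_*` functions are hypothetical stand-ins, and `Context`/`Module` are illustrative names, not the PR's actual types. The point is only that each wrapper disposes its handle exactly once and that a module cannot outlive the context it was created from.

```rust
use std::marker::PhantomData;
use std::sync::atomic::{AtomicUsize, Ordering};

// Stand-ins for the raw C API. A real wrapper would call the LLVM C API
// instead of these hypothetical functions.
static LIVE_HANDLES: AtomicUsize = AtomicUsize::new(0);

pub struct RawContext(u8);
pub struct RawModule(u8);

fn raw_context_create() -> *mut RawContext {
    LIVE_HANDLES.fetch_add(1, Ordering::SeqCst);
    Box::into_raw(Box::new(RawContext(0)))
}

unsafe fn raw_context_dispose(raw: *mut RawContext) {
    LIVE_HANDLES.fetch_sub(1, Ordering::SeqCst);
    drop(unsafe { Box::from_raw(raw) });
}

fn raw_module_create(_context: *mut RawContext) -> *mut RawModule {
    LIVE_HANDLES.fetch_add(1, Ordering::SeqCst);
    Box::into_raw(Box::new(RawModule(0)))
}

unsafe fn raw_module_dispose(raw: *mut RawModule) {
    LIVE_HANDLES.fetch_sub(1, Ordering::SeqCst);
    drop(unsafe { Box::from_raw(raw) });
}

/// Owns a context handle and disposes it on drop.
pub struct Context {
    raw: *mut RawContext,
}

impl Context {
    pub fn new() -> Self {
        Context { raw: raw_context_create() }
    }

    /// The returned module borrows `self`, so it cannot outlive the context.
    pub fn create_module(&self) -> Module<'_> {
        Module { raw: raw_module_create(self.raw), _context: PhantomData }
    }
}

impl Drop for Context {
    fn drop(&mut self) {
        // SAFETY: `raw` came from `raw_context_create` and is disposed once.
        unsafe { raw_context_dispose(self.raw) }
    }
}

/// Owns a module handle tied to the lifetime of its context.
pub struct Module<'ctx> {
    raw: *mut RawModule,
    _context: PhantomData<&'ctx Context>,
}

impl Drop for Module<'_> {
    fn drop(&mut self) {
        // SAFETY: `raw` came from `raw_module_create` and is disposed once.
        unsafe { raw_module_dispose(self.raw) }
    }
}

/// Number of handles created and not yet disposed (for illustration only).
pub fn live_handles() -> usize {
    LIVE_HANDLES.load(Ordering::SeqCst)
}
```

With this shape, dropping a `Context` while a `Module` created from it is still alive is rejected by the borrow checker rather than discovered at runtime.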

Open points or missing:

  • from what I understand the target_machine could be created early and stored in the linker instance, in theory it's based on the cli options
  • names of the new structures
  • dump_module should be moved to link methods

I wouldn't be surprised if I'm forgetting something.

In any case, this should be enough to give an idea of the changes I am proposing.



@banditopazzo
Contributor Author

Another question is what to do with the libs path list: remove it or handle it?

@banditopazzo
Contributor Author

Hi @alessandrod , have you had a chance to take a look at this draft? I'd appreciate your thoughts on whether the changes work for you and if you like the direction it’s heading in.

Member

@vadorovsky vadorovsky left a comment

Do you think you could split out the wrapper parts as a separate change? If so, we could merge it fast, and then review just the in-memory part of the code (which I still need to play with and review properly).

@alessandrod
Collaborator

Hi @alessandrod , have you had a chance to take a look at this draft? I'd appreciate your thoughts on whether the changes work for you and if you like the direction it’s heading in.

gonna take a look today, sorry for the delay!

@banditopazzo banditopazzo force-pushed the in-memory-linking branch 2 times, most recently from f49caa9 to 22bab7e Compare September 4, 2025 13:36
@banditopazzo
Contributor Author

I rebased on main and squashed to remove useless commits.

Then I addressed most of the review comments.

@vadorovsky: I tried to create a new PR without the in-memory linking, but it ended up looking just like this one; the in-memory linking part is marginal. I don't think splitting would make the review faster, but if you really want a separate PR, I will do it.

@alessandrod: taking a slice instead of the Vec<LinkerInput> as input wouldn't work in my opinion. The LinkerInput objects have a mutable state inside because of the cursor they use for the Read, so it needs to be a &mut [LinkerInput]. I don't see any advantage doing this, considering LinkerInput is just a type that we use to accept a mixed vector of both files and in memory data
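
To make the mutability argument concrete, here is a small sketch. The `LinkerInput` enum below is a hypothetical reconstruction (the field names are assumptions), but it shows why a shared slice is not enough: `Read::read` takes `&mut self`, and a `Cursor` advances its position as it is read.

```rust
use std::fs::File;
use std::io::{self, Cursor, Read};
use std::path::PathBuf;

// Hypothetical stand-in for the PR's mixed input type.
pub enum LinkerInput {
    File { path: PathBuf, file: File },
    Buffer { name: String, data: Cursor<Vec<u8>> },
}

impl Read for LinkerInput {
    // Reading mutates the underlying file offset or cursor position,
    // which is why callers need `&mut` access to every input.
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        match self {
            LinkerInput::File { file, .. } => file.read(buf),
            LinkerInput::Buffer { data, .. } => data.read(buf),
        }
    }
}

/// Reads every input to the end. This needs `&mut [LinkerInput]`;
/// a `&[LinkerInput]` would not allow calling `read_to_end`.
pub fn read_all(inputs: &mut [LinkerInput]) -> io::Result<Vec<Vec<u8>>> {
    inputs
        .iter_mut()
        .map(|input| {
            let mut bytes = Vec::new();
            input.read_to_end(&mut bytes)?;
            Ok(bytes)
        })
        .collect()
}
```

Calling `read_all` twice on the same inputs returns empty buffers the second time, because the cursors are already at the end; that is the internal state the comment refers to.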

@banditopazzo banditopazzo marked this pull request as ready for review September 4, 2025 15:03
@banditopazzo
Contributor Author

banditopazzo commented Sep 5, 2025

Thank you for the review comments.

To accept the iterator and have a cleaner api I had to introduce another layer for the input. All the other solutions ended with a strange API.

@vadorovsky I used as_mut_ptr instead of as_ptr since the LLVM__Ref are all aliases for *mut LLVM__.

@banditopazzo
Contributor Author

To check whether a file can be opened, I create the reader myself in the link_ methods. In theory I could move this to the point where the file is actually used, but I think it's better to ensure everything is accessible before starting to link modules.
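
The "open everything first" approach described here could look roughly like the sketch below. `open_all` is a hypothetical helper, not a function from the PR; it only illustrates failing fast on the first inaccessible path before any linking work begins.

```rust
use std::fs::File;
use std::io;
use std::path::PathBuf;

/// Opens every path up front. Returns the first failure together with the
/// offending path, so nothing is linked if any input is inaccessible.
pub fn open_all(paths: &[PathBuf]) -> Result<Vec<(PathBuf, File)>, (PathBuf, io::Error)> {
    paths
        .iter()
        .map(|path| match File::open(path) {
            Ok(file) => Ok((path.clone(), file)),
            Err(err) => Err((path.clone(), err)),
        })
        .collect()
}
```

Collecting into a `Result<Vec<_>, _>` stops at the first error, which matches the goal of validating inputs before the expensive part starts.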

Collaborator

@alessandrod alessandrod left a comment

Sorry for the delay! I just finished my crazy month of traveling, so I'll be more responsive from now on.

It's looking pretty good, just a couple of API comments

@vadorovsky
Member

No comments besides what @alessandrod said. The wrappers look good now, thanks for addressing my previous comments. 🙂

There is a huge chance that the CI failures will disappear after you rebase (beta switched to newer LLVM, 1.86.0 looks like a flake).

@banditopazzo
Contributor Author

applied the requested changes and rebased. let's hope the CI failures will disappear

@vadorovsky
Member

The LLVM 19 job seems actually broken. 😢 I can take a look closer to the evening, but my wild guess is that there is some breaking change between LLVM 19 and 20 in one of the wrapped types, which we'll need to handle with feature flags.

@vadorovsky
Member

@banditopazzo I pushed a commit with a fix. The problem was that the C strings were not null terminated and apparently LLVM 19 expects the null character. It's a good practice to add it for all FFI calls anyways. Sorry for suggesting a faulty path conversion in my previous comments!
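
For reference, a minimal sketch of a path conversion that always yields a null-terminated string. `path_to_cstring` is a hypothetical helper name, not the function from the fix commit. `CString::new` appends the terminating NUL itself and returns an error if the input already contains an interior NUL byte.

```rust
use std::ffi::{CString, NulError};
use std::path::Path;

/// Converts a path into a NUL-terminated C string suitable for FFI calls.
/// Fails if the path contains an interior NUL byte.
pub fn path_to_cstring(path: &Path) -> Result<CString, NulError> {
    CString::new(path.as_os_str().as_encoded_bytes())
}
```

Passing `c_string.as_ptr()` to the C side then hands over a pointer to bytes that end in `\0`, as long as the `CString` is kept alive for the duration of the call.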

Collaborator

@alessandrod alessandrod left a comment

See #295 (comment)

That's the only thing left then it's ready to go

@tamird tamird requested a review from Copilot September 28, 2025 12:57
@tamird
Member

tamird commented Sep 28, 2025

@codex review

Copilot AI left a comment

Pull Request Overview

This PR introduces in-memory linking capabilities to the BPF linker, enabling linking of both file-based and in-memory object/bitcode data. The implementation includes significant API changes to support reusing linker instances and improved LLVM object management through safer wrappers.

Key changes:

  • Added in-memory linking support alongside existing file-based linking
  • Redesigned API with link_to_file and link_to_buffer methods for better instance reuse
  • Improved LLVM object safety with custom Drop implementations and proper lifetime management

Reviewed Changes

Copilot reviewed 8 out of 10 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| src/llvm/types/target_machine.rs | New wrapper for LLVM target machine with proper resource management and emit methods |
| src/llvm/types/module.rs | New LLVM module wrapper with lifetime safety and bitcode/IR output capabilities |
| src/llvm/types/mod.rs | Module declarations for new LLVM wrapper types |
| src/llvm/types/memory_buffer.rs | New memory buffer wrapper with automatic disposal and slice access |
| src/llvm/types/context.rs | New LLVM context wrapper with module creation and diagnostic handler support |
| src/llvm/mod.rs | Refactored LLVM utilities to work with new wrapper types and removed standalone functions |
| src/llvm/di.rs | Updated debug info sanitizer to use new LLVM wrapper types |
| src/linker.rs | Major API restructure with new input types, linking methods, and improved resource management |


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.


@banditopazzo
Contributor Author

banditopazzo commented Sep 30, 2025

@tamird thanks, I will try to solve also some of the codex issues

@alessandrod a few more points:

@tamird tamird requested a review from vadorovsky October 20, 2025 15:41
Member

@tamird tamird left a comment

@tamird reviewed 9 of 9 files at r59, 8 of 8 files at r60, 2 of 2 files at r61, 2 of 2 files at r62, 2 of 2 files at r63, 7 of 7 files at r64, all commit messages.
Reviewable status: all files reviewed, 5 unresolved discussions (waiting on @alessandrod, @banditopazzo, and @vadorovsky)


src/linker.rs line 357 at r42 (raw file):

Previously, vadorovsky (Michal R) wrote…

These two snippets are not the same if the result types (of the result you catch and the result you want to return) are not the same. And that's exactly the case here. If you change it like you proposed:

diff --git a/src/linker.rs b/src/linker.rs
index 070355b..857a692 100644
--- a/src/linker.rs
+++ b/src/linker.rs
@@ -558,7 +558,7 @@ where
                     Err(LinkerError::MissingBitcodeSection(_)) => {
                         warn!("ignoring file {:?}: no embedded bitcode", path);
                     }
-                    Err(err) => return Err(err),
+                    err => return err,
                 }
             }
         }

Then the build fails with:

   Compiling bpf-linker v0.9.15 (/home/vad/src/bpf-linker)
error[E0308]: mismatched types
   --> src/linker.rs:561:35
    |
497 | ) -> Result<LLVMModule<'ctx>, LinkerError>
    |      ------------------------------------- expected `Result<module::LLVMModule<'_>, LinkerError>` because of return type
...
561 |                     err => return err,
    |                                   ^^^ expected `Result<LLVMModule<'_>, LinkerError>`, found `Result<(), LinkerError>`
    |
    = note: expected enum `Result<module::LLVMModule<'_>, _>`
               found enum `Result<(), _>`

For more information about this error, try `rustc --explain E0308`.
error: could not compile `bpf-linker` (lib) due to 1 previous error

So I think it's good as it is.

Ah, it's the same E but a different T. Thanks for explaining.
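
A self-contained sketch of the same situation, with hypothetical names (`LinkError`, `link_one`, `link_all`) rather than the PR's types: the inner call returns `Result<(), E>` while the enclosing function returns `Result<usize, E>`, so the error has to be re-wrapped in `Err(..)` to pick up the outer `T`.

```rust
#[derive(Debug, PartialEq)]
pub enum LinkError {
    MissingBitcode,
    Other(String),
}

// Inner operation: same error type `E`, but `T` is `()`.
fn link_one(name: &str) -> Result<(), LinkError> {
    match name {
        "no-bitcode" => Err(LinkError::MissingBitcode),
        "broken" => Err(LinkError::Other(name.to_string())),
        _ => Ok(()),
    }
}

// Outer operation: same `E`, but `T` is `usize`.
pub fn link_all(names: &[&str]) -> Result<usize, LinkError> {
    let mut linked = 0;
    for name in names {
        match link_one(name) {
            Ok(()) => linked += 1,
            // Recoverable case: skip the input and keep going.
            Err(LinkError::MissingBitcode) => {}
            // `err => return err` would not type-check here, because `err`
            // would be a `Result<(), LinkError>`, not `Result<usize, LinkError>`.
            Err(err) => return Err(err),
        }
    }
    Ok(linked)
}
```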


src/llvm/di.rs line 32 at r44 (raw file):

Previously, vadorovsky (Michal R) wrote…

A little TODO would suffice. Especially given that, given the latest Discord discussions, the future of DISanitizer is uncertain and we might move away from it.

As mentioned in the other commit - keep the TODO next to the code, not in a random file.


src/linker.rs line 350 at r64 (raw file):

        output: &Path,
        output_type: OutputType,
        export_symbols: &HashSet<Cow<'static, str>>,

now that we are immediately turning this into an iterator, maybe just make this impl IntoIterator<Item=&str> or even impl IntoIterator<Item: impl AsRef<str>> (i think you'll need another generic for this).

here and in link_to_buffer. This is important because this is public API and the current shape is pretty awkward.
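
One way such a signature can be written is sketched below. `collect_exports` is a hypothetical function used only to show the shape, with the extra generic parameter mentioned above spelled out as `S`. It is not the signature that ended up in the PR.

```rust
use std::collections::HashSet;

/// Accepts any iterable of string-like items: `&[&str]`, `Vec<String>`,
/// a `HashSet<String>`, `str::lines()`, and so on.
pub fn collect_exports<I, S>(export_symbols: I) -> HashSet<String>
where
    I: IntoIterator<Item = S>,
    S: AsRef<str>,
{
    export_symbols
        .into_iter()
        .map(|symbol| symbol.as_ref().to_owned())
        .collect()
}
```

Callers no longer have to build a `HashSet<Cow<'static, str>>` just to satisfy the parameter type; they pass whatever collection they already have.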


TODO.md line 3 at r60 (raw file):

# TODO

- [ ] use safe wrappers in `src/llvm/di.rs`

eh, why? just put it where the code is (right on the phantom)

Member

@tamird tamird left a comment

Reviewable status: all files reviewed, 6 unresolved discussions (waiting on @alessandrod, @banditopazzo, and @vadorovsky)


src/llvm/types/module.rs line 42 at r64 (raw file):
Ah, I see @vadorovsky's point here that this string is basically useless. Pasting his earlier comment:

The difference between this and other errors is that LLVMWriteBitcodeToFile doesn't populate any string message, like the other functions that have variants here. It only has a return code. I would be fine with just including the integer for now. Bonus points for resolving the error code into some meaningful message with https://crates.io/crates/errno

Contributor Author

@banditopazzo banditopazzo left a comment

Reviewable status: all files reviewed, 6 unresolved discussions (waiting on @alessandrod, @tamird, and @vadorovsky)


TODO.md line 3 at r60 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

eh, why? just put it where the code is (right on the phantom)

Done. Sorry, I misunderstood.


src/linker.rs line 350 at r64 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

now that we are immediately turning this into an iterator, maybe just make this impl IntoIterator<Item=&str> or even impl IntoIterator<Item: impl AsRef<str>> (i think you'll need another generic for this).

here and in link_to_buffer. This is important because this is public API and the current shape is pretty awkward.

Done.


src/llvm/di.rs line 32 at r44 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

As mentioned in the other commit - keep the TODO next to the code, not in a random file.

Done.


src/llvm/types/module.rs line 42 at r64 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

Ah, I see @vadorovsky's point here that this string is basically useless. Pasting his earlier comment:

The difference between this and other errors is that LLVMWriteBitcodeToFile doesn't populate any string message, like the other functions that have variants here. It only has a return code. I would be fine with just including the integer for now. Bonus points for resolving the error code into some meaningful message with https://crates.io/crates/errno

Done.

Member

@tamird tamird left a comment

@tamird reviewed 10 of 10 files at r65, 10 of 10 files at r66, 3 of 3 files at r67, 3 of 3 files at r68, 3 of 3 files at r69, 8 of 8 files at r70, all commit messages.
Reviewable status: all files reviewed, 6 unresolved discussions (waiting on @alessandrod, @banditopazzo, and @vadorovsky)


src/llvm/types/module.rs line 42 at r64 (raw file):

Previously, banditopazzo wrote…

Done.

https://github.com/lambda-fairy/rust-errno?tab=readme-ov-file#comparison-with-stdioerror

can we just use https://doc.rust-lang.org/std/io/struct.Error.html#method.last_os_error and return std::io::Error instead of String?
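
A minimal sketch of that suggestion, assuming the wrapped C function reports failure through a non-zero return code and leaves the cause in `errno`. `check_return_code` is a hypothetical helper, not code from the PR.

```rust
use std::io;

/// Maps a C-style return code to `io::Result`. On failure, the OS error
/// is captured with `io::Error::last_os_error()`, so this should be called
/// immediately after the failing call, before anything else can change errno.
pub fn check_return_code(rc: i32) -> io::Result<()> {
    if rc == 0 {
        Ok(())
    } else {
        Err(io::Error::last_os_error())
    }
}
```

The resulting `std::io::Error` already knows how to render the OS error code as a message, which is what the extra `errno` dependency would otherwise have provided.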


Cargo.toml line 37 at r66 (raw file):

thiserror = { version = "2.0.12" }
tracing = "0.1"
errno = "0.3.14"

let's not bring in a dependency for this


src/bin/bpf-linker.rs line 274 at r70 (raw file):

    let export_symbols = export_symbols.map(fs::read_to_string).transpose()?;

    // TODO: the data is owned by this call frame; we could make this zero-alloc.

I think you just resolved this TODO.


src/linker.rs line 690 at r70 (raw file):

    let mut export_symbols: HashSet<Cow<'_, [u8]>> = export_symbols
        .into_iter()
        .map(|s| Cow::Owned(s.as_ref().as_bytes().to_vec()))

this...does not need to be owned.

Member

@tamird tamird left a comment

Reviewable status: all files reviewed, 6 unresolved discussions (waiting on @alessandrod, @banditopazzo, and @vadorovsky)

Contributor Author

@banditopazzo banditopazzo left a comment

Reviewable status: all files reviewed, 6 unresolved discussions (waiting on @alessandrod, @tamird, and @vadorovsky)


Cargo.toml line 37 at r66 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

let's not bring in a dependency for this

Done.


src/linker.rs line 690 at r70 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

this...does not need to be owned.

I think it's not possible with the previous function signature. The only way I found to avoid the allocation is to change the signature to take IntoIterator<Item = &'a str>.
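
A sketch of what the borrowed variant can look like, using a hypothetical `export_set` function. With `Item = &'a str` the lifetime of the input is known, so each symbol can be stored as `Cow::Borrowed` without copying its bytes.

```rust
use std::borrow::Cow;
use std::collections::HashSet;

/// Builds the export set without allocating per symbol: each entry borrows
/// the bytes of the caller's string for the lifetime `'a`.
pub fn export_set<'a>(
    export_symbols: impl IntoIterator<Item = &'a str>,
) -> HashSet<Cow<'a, [u8]>> {
    export_symbols
        .into_iter()
        .map(|symbol| Cow::Borrowed(symbol.as_bytes()))
        .collect()
}
```

For example, the contents of an export-symbols file read into a `String` can be passed as `export_set(text.lines())`, and the set then points into `text` instead of owning copies.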


src/bin/bpf-linker.rs line 274 at r70 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

I think you just resolved this TODO.

Done.


src/llvm/types/module.rs line 42 at r64 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

https://github.com/lambda-fairy/rust-errno?tab=readme-ov-file#comparison-with-stdioerror

can we just use https://doc.rust-lang.org/std/io/struct.Error.html#method.last_os_error and return std::io::Error instead of String?

Done.

Member

@tamird tamird left a comment

Just a pair of comments outstanding.

@tamird reviewed 11 of 11 files at r71, 9 of 9 files at r72, 3 of 3 files at r73, 3 of 3 files at r74, 3 of 3 files at r75, 8 of 8 files at r76, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @alessandrod, @banditopazzo, and @vadorovsky)

Now that the function is only used inside context, there is no reason to have it in the top-level llvm mod.

This option is not used inside bpf-linker; it only exists on the CLI to keep compatibility with rustc.

Removed from the linker options since this is not used by the majority of users; it's needed only if you want to debug the LLVM optimization phase.

Contributor Author

@banditopazzo banditopazzo left a comment

Reviewable status: 5 of 15 files reviewed, 2 unresolved discussions (waiting on @alessandrod, @tamird, and @vadorovsky)

Member

@tamird tamird left a comment

@tamird reviewed 10 of 10 files at r77, 8 of 8 files at r78, 2 of 2 files at r79, 2 of 2 files at r80, 2 of 2 files at r81, 7 of 7 files at r82, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @alessandrod and @vadorovsky)

Member

@vadorovsky vadorovsky left a comment

:lgtm:

@vadorovsky reviewed 4 of 9 files at r50.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @alessandrod)

@tamird
Member

tamird commented Oct 21, 2025

@codex review

@tamird tamird requested a review from Copilot October 21, 2025 13:32
Copilot AI left a comment

Pull Request Overview

Copilot reviewed 8 out of 10 changed files in this pull request and generated 4 comments.



for mut input in inputs {
    let path = match input {
        InputReader::File { path, .. } => path.into(),
        InputReader::Buffer { name, .. } => PathBuf::from(format!("in_memory::{}", name)),

Copilot AI Oct 21, 2025

[nitpick] The synthetic path in_memory::{name} is used for error reporting but could conflict with actual file paths. Consider using a more distinctive prefix or format (e.g., <memory:{name}>) to clearly distinguish in-memory buffers from file paths in error messages.

Suggested change
- InputReader::Buffer { name, .. } => PathBuf::from(format!("in_memory::{}", name)),
+ InputReader::Buffer { name, .. } => PathBuf::from(format!("<memory:{}>", name)),

Member

This suggestion is plausible.

let path = path.join("post-opt.ll");
let path = CString::new(path.as_os_str().as_bytes()).unwrap();
self.write_ir(&path)?;
let path = CString::new(path.as_os_str().as_encoded_bytes()).unwrap();

Copilot AI Oct 21, 2025

Using unwrap() here could panic if the path contains null bytes. Consider using a more descriptive error message with expect() or propagating the error properly.

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.


Comment on lines +512 to +516
for mut input in inputs {
    let path = match input {
        InputReader::File { path, .. } => path.into(),
        InputReader::Buffer { name, .. } => PathBuf::from(format!("in_memory::{}", name)),
    };


P0 Badge Borrow moved input reader in link loop

The loop computes path with match input { … }, which moves the InputReader because the enum holds File/Cursor values that are not Copy. The code then calls input.read_exact, input.rewind, and later passes input into Archive::new, but input has already been moved by the match, so this code will not compile. You likely meant to pattern-match on a reference (e.g. match &input or match &mut input) to keep the reader available for the subsequent operations.
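
The pattern suggested in that comment (match on a reference, then keep using the reader) can be sketched as follows. The `InputReader` definition here is a hypothetical reconstruction with owned fields; the real field types are not shown in this thread, and whether the original code actually moves `input` depends on them.

```rust
use std::fs::File;
use std::io::{self, Cursor, Read, Seek};
use std::path::PathBuf;

// Hypothetical stand-in for the PR's reader enum.
pub enum InputReader {
    File { path: PathBuf, file: File },
    Buffer { name: String, cursor: Cursor<Vec<u8>> },
}

impl InputReader {
    /// Computes a path for diagnostics by borrowing `self`, so the reader
    /// remains usable afterwards.
    pub fn display_path(&self) -> PathBuf {
        match self {
            InputReader::File { path, .. } => path.clone(),
            InputReader::Buffer { name, .. } => PathBuf::from(format!("in_memory::{name}")),
        }
    }
}

/// Reads the first four bytes, then rewinds so later consumers see the
/// input from the start again.
pub fn read_magic(mut input: InputReader) -> io::Result<(PathBuf, [u8; 4])> {
    let path = input.display_path(); // shared borrow only, nothing is moved
    let mut magic = [0u8; 4];
    match &mut input {
        InputReader::File { file, .. } => {
            file.read_exact(&mut magic)?;
            file.rewind()?;
        }
        InputReader::Buffer { cursor, .. } => {
            cursor.read_exact(&mut magic)?;
            cursor.rewind()?;
        }
    }
    Ok((path, magic))
}
```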


@vadorovsky
Member

vadorovsky commented Oct 21, 2025

I've restarted the failing job, but it fails every time with:

cargo:warning=error: failed to write /tmp/rustczv1z8A/lib.rmeta: No space left on device (os error 28)
cargo:warning=
cargo:warning=
error: could not compile `rustix` (lib) due to 1 previous error
warning: build failed, waiting for other jobs to finish...
cargo:warning=error: failed to build archive at `/home/runner/work/bpf-linker/bpf-linker/aya/target/release/deps/libaya_obj-dee67e733d80af3e.rlib`: No space left on device (os error 28)
cargo:warning=
cargo:warning=
error: could not compile `aya-obj` (lib) due to 1 previous error
rustc-LLVM ERROR: IO failure on output stream: No space left on device

And the latest build on main also complains about the lack of space:

https://github.com/aya-rs/bpf-linker/actions/runs/18672411715/job/53235931949

I'll try to fix in a separate PR, but I think it's safe to merge this one, given that all the other jobs also run the integration tests.

@vadorovsky
Member

vadorovsky commented Oct 22, 2025

@banditopazzo mind doing one last rebase? There should be no conflicts. CI should become green after it.

@alessandrod we have almost an approval from you here #295 (review) I'd aim for merging it today, if you have no further comments. 🙂

Collaborator

@alessandrod alessandrod left a comment

there's a stray emoji but otherwise I think we're good to go! thanks!

update: i'm the dumbest

@tamird
Member

tamird commented Oct 22, 2025

I think we don't need a rebase. Merging.

@tamird tamird merged commit 0d3fc7a into aya-rs:main Oct 22, 2025
15 of 18 checks passed