-
Notifications
You must be signed in to change notification settings - Fork 112
Preliminary virtual memory work #339
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
With virtual memory, seemingly consecutive I/O virtual memory regions may actually be fragmented across multiple pages in our userspace mapping. Existing `descriptor_utils::Reader::new()` (and `Writer`) implementations (e.g. in virtiofsd or vm-virtio/virtio-queue) use `GuestMemory::get_slice()` to turn guest memory address ranges into valid slices in our address space; but with this fragmentation, it is easily possible that a range no longer corresponds to a single slice. To fix this, add a `get_slices()` method that iterates over potentially multiple slices instead of a single one. We should probably also deprecate `get_slice()`, but I’m hesitant to do it in the same commit/PR. (We could also try to use `try_access()` as an existing internal iterator instead of this new external iterator, which would require adding lifetimes to `try_access()` so the region and thus slices derived from it could be moved outside of the closure. However, that will not work for virtual memory that we are going to introduce later: It will have a dirty bitmap that is independent of the one in guest memory regions, so its `try_access()` function will need to dirty it after the access. Therefore, the access must happen in that closure and the reference to the region must not be moved outside.) Signed-off-by: Hanna Czenczek <[email protected]>
read() and write() must not ignore the `count` parameter: The mappings passed into the `try_access()` closure are only valid for up to `count` bytes, not more. (Note: We cannot really have a test case for this, as right now, memory fragmentation will only happen exactly at memory region boundaries. In this case, `region.write()`/`region.read()` will only access the region up until its end, even if the passed slice is longer, and so silently ignore the length mismatch. This change is necessary for when page boundaries result in different mappings within a single region, i.e. the region does not end at the fragmentation point, and calling `region.write()`/`region.read()` would write/read across the boundary. Because we don’t have IOMMU support yet, this can’t be tested.) Signed-off-by: Hanna Czenczek <[email protected]>
When we switch to a (potentially) virtual memory model, we want to compact the interface, especially removing references to memory regions because virtual memory is not just split into regions, but pages first. The one memory-region-referencing part we are going to keep is `try_access()` because that method is nicely structured around the fragmentation we will have to accept when it comes to paged memory. `to_region_addr()` in contrast does not even take a length argument, so for virtual memory, using the returned region and address is unsafe if doing so crosses page boundaries. Therefore, switch `Bytes::load()` and `store()` from using `to_region_addr()` to `try_access()`. (Note: We cannot really have a test case for this, as right now, memory fragmentation will only happen exactly at memory region boundaries. In this case, `region.load()` and `region.store()` would have already returned errors. This change is necessary for when page boundaries result in different mappings within a single region, but because we don’t have IOMMU support yet, this can’t be tested.) Signed-off-by: Hanna Czenczek <[email protected]>
6d42950
to
96c70ea
Compare
Sorry, messed up the safety formatting: Replaced |
@XanClic Oops, some missing safety comments |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think maybe we can keep .get_slice() as a sort of utility function for when someone wants to get a contiguous slice, and where receiving something cross-region boundary would be an error condition.
match unsafe { self.do_next() } { | ||
Some(Ok(slice)) => Some(Ok(slice)), | ||
other => { | ||
// On error (or end), reset to 0 so iteration remains stopped |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could implement FusedIterator
, since after a single None
we never return not-None ever again
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@XanClic oops, missed this observation. please update :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I just wanted to know how to proceed with try_access()
first – you think we should leave it as-is for now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I do
and `GuestRegionMmap::from_range` to be separate from the error type returned by `GuestRegionCollection` functions. | ||
Change return type of `GuestRegionMmap::new` from `Result` to `Option`. | ||
- \[#324](https:////github.com/rust-vmm/vm-memory/pull/324)\] `GuestMemoryRegion::bitmap()` now returns a `BitmapSlice`. Accessing the full bitmap is now possible only if the type of the memory region is know, for example with `MmapRegion::bitmap()`. | ||
- \[[#339](https://github.com/rust-vmm/vm-memory/pull/339)\] Fix `Bytes::read()` and `Bytes::write()` not to ignore `try_access()`'s `count` parameter |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should probably go into a Fixed
section. Also, let's specify that this only applies to the blanket impl provided for T: GuestMemory
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, right!
let expected = size_of::<O>(); | ||
|
||
let completed = self.try_access( | ||
expected, | ||
addr, | ||
|offset, len, region_addr, region| -> Result<usize> { | ||
assert_eq!(offset, 0); | ||
if len < expected { | ||
return Err(Error::PartialBuffer { | ||
expected, | ||
completed: 0, | ||
}); | ||
} | ||
region.store(val, region_addr, order).map(|()| expected) | ||
}, | ||
)?; | ||
|
||
if completed < expected { | ||
Err(Error::PartialBuffer { | ||
expected, | ||
completed, | ||
}) | ||
} else { | ||
Ok(()) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this one (and the one below) would be a bit simpler in terms of get_slices
maybe? Shouldn't that be something like
let iter = self.get_slices(addr, size_of::<O>());
let vslice = iter.next()?;
if iter.next().is_some() {
return Err(PartialBuffer {0})
}
vslice.store(val)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
heh, or just self.get_slice(addr, size_of::<O>())?.store(val)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, OK. If VolatileSlice::store()
does the same as GuestMemoryRegion::store()
, that will be much better indeed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does! In fact, GuestMemoryRegion::store() just defers to VolatileSlice::store()
.
Btw, this function is exactly what made me go "mh, maybe there is a use for get_slice() after all". Because atomic loads/stores that cross region boundaries cannot work, and not something that is an implementation TODO that can be solved somehow (e.g. if the guest gives us a virt queue where an atomic goes across region boundaries, then best we can do is fail the device activation).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That’s indeed true and a very good point; for atomic accesses, the reason why they shouldn’t go across boundaries is that they should be naturally aligned, which automatically prevents them from crossing e.g. page boundaries.
Then again, specifically for atomic accesses, I think users outside of vm-memory shouldn’t need to implement them and instead use what Bytes
provides.
I still think it could be beneficial to force users to deal with an iterator even when they only expect a single slice because it might encourage them to write a comment why this is OK. Making get_slice()
unsafe
would achieve the same effect, but the thing is, it isn’t really unsafe, so it would technically be wrong. But maybe I’m overreaching and shouldn’t dictate how vm-memory users have to document their code…
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If atomic accesses are the only use-case of get_slice()
, then maybe we can just inline it into this function as a compromise? That should still be shorter than the alternatives (or, deprecate it today, and use it with a #[allow(deprecated)]
on it, and the later inline when we remove the deprecated function)
Change return type of `GuestRegionMmap::new` from `Result` to `Option`. | ||
- \[#324](https:////github.com/rust-vmm/vm-memory/pull/324)\] `GuestMemoryRegion::bitmap()` now returns a `BitmapSlice`. Accessing the full bitmap is now possible only if the type of the memory region is know, for example with `MmapRegion::bitmap()`. | ||
- \[[#339](https://github.com/rust-vmm/vm-memory/pull/339)\] Fix `Bytes::read()` and `Bytes::write()` not to ignore `try_access()`'s `count` parameter | ||
- \[[#339](https://github.com/rust-vmm/vm-memory/pull/339)\] Implement `Bytes::load()` and `Bytes::store()` with `try_access()` instead of `to_region_addr()` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should also specify that it's only relevant for the blanket impl
Preface: For this PR, that is what is done. But, yes, I would prefer to deprecate User should just not expect Next, I can’t think of a valid reason for someone wanting to get a contiguous slice other than it being easier to handle, but that’s a problem with tooling. QEMU e.g. has good tooling around I/O vectors (a vector of buffers), so maybe we need a Finally, what will users do when Summarizing, I don’t find it good to facilitate users’ incomplete (and I would argue incorrect) implementations that work most of the time but aren’t guaranteed to work. I find there’s no better way to let users know instinctively that their implementation is incomplete than by forcing them to deal with an iterator and manually get exactly the first element. Rust makes it easy to ignore error handling, so I expect users to call In the end, innocent [1] It just depends on the implementation and circumstances, and so is hard to say in general. In the vanilla vhost-user case with |
Summary of the PR
This PR contains fixes for fragmented guest memory, i.e. situations where a consecutive guest memory slice does not translate into a consecutive slice in our userspace address space. Currently, that is not really an issue, but with virtual memory (where such discontinuities can occur on any page boundary), it will be.
(See also PR #327).
Specifically:
GuestMemory::get_slices()
, which returns an iterator over slices instead of just a single oneBytes::read()
andBytes::write()
to correctly work for fragmented memory (i.e. multipletry_access()
closure calls)Bytes::load()
andBytes::store()
usetry_access()
instead ofto_region_addr()
, so they can at least detect if there is fragmentation, and return an error. (Their address argument being properly (naturally) aligned should prevent any problems with fragmentation.)Requirements
Before submitting your PR, please make sure you addressed the following
requirements:
git commit -s
), and the commit message has max 60 characters for thesummary and max 75 characters for each description line.
test.
Release" section of CHANGELOG.md (if no such section exists, please create one).
unsafe
code is properly documented.