Skip to content

update lmdb-rkv to latest version 0.11 #105

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Jan 16, 2019

Conversation

mykmelez
Copy link
Contributor

@ncloudioj Per mozilla/lmdb-rs#14, the latest version of the lmdb-rkv crate returns an error instead of panicking when mdb_cursor_get() returns an error. That changes the return value of lmdb::Iter.next(), which is now a Result; so we need to update rkv::Iter.next() accordingly.

We could maintain backward compatibility by continuing to return (&'env [u8], Result<Option<Value<'env>>, StoreError>) from rkv::Iter.next(), either ignoring an Err result from lmdb::Iter.next() or hiding it in the key-value tuple's value field (which can currently be a StoreError if read_transform() fails to transform the value). But that's either a silent failure or a misleading result.

Alternately, we could distinguish between a StoreError result of lmdb::Iter.next() and a StoreError result of read_transform() by returning Result<(&'env [u8], Result<Option<Value<'env>>, StoreError>), StoreError>, i.e. either an error or a key-value tuple whose value field may contain an error. But that feels overcomplex, requiring two layers of error handling by the consumer.

This branch adopts the third approach of reporting all errors while consolidating error reporting by returning Result<(&'env [u8], Option<Value<'env>>), StoreError>, i.e. either a key-value tuple (that does not contain an error) or a StoreError (which can be either from lmdb::Iter.next() or from read_transform()).

As this is a breaking change, the branch also increments the minor version of rkv.

@mykmelez mykmelez requested a review from ncloudioj January 12, 2019 00:30
Copy link
Member

@ncloudioj ncloudioj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

r+!

This is much better than panicking for sure, though the type of Self::Item is still quite involving to deal with.

Wonder if we can further simplify this by re-visiting LMDB's mdb_cursor_get function. According to the doc, it returns two types of errors, (i.e. MDB_NOTFOUND and EINVAL). Perhaps we can introduce various helper functions to generate the iterators, and handle both errors before putting them in action. Once the iterators are successfully created, we only need to deal with the error from read_transform and checking the end condition, i.e. MDB_NOTFOUND. I'd assume it's unlikely to encounter EINVAL in the course of iteration.

Again, this could be a followup optimization.

@mykmelez
Copy link
Contributor Author

I'd assume it's unlikely to encounter EINVAL in the course of iteration.

Per this code comment, it actually can encounter EINVAL in the course of iteration, if the cursor was previously seeked to a non-existent value, as when you call Cursor.iter_from() with a key greater than all keys in the database.

Perhaps there's a cheap way to detect that before returning the iterator, unsure. However, note that the documentation for mdb_cursor_get() describes MDB_NOTFOUND and EINVAL as "some possible errors," so there may be others that aren't listed but may be returned. Thus I'm not sure how feasible it is to avoid error results from Iter.next(). But it's certainly worth investigating!

@mykmelez mykmelez merged commit 9ff784c into mozilla:master Jan 16, 2019
@mykmelez mykmelez deleted the update-lmdb-rkv branch January 16, 2019 22:43
@ncloudioj
Copy link
Member

ncloudioj commented Jan 16, 2019

Per this code comment, it actually can encounter EINVAL in the course of iteration, if the cursor was previously seeked to a non-existent value, as when you call Cursor.iter_from() with a key greater than all keys in the database.

Right, I was wondering if we could capture those EINVALs in the iterator creators, before returning them to the consumers. e.g. if calling iter_from() with a key greater than all keys, then returns an Err(...), otherwise, returns an OK(iter). So iterators in rkv always start from the valid state.

After taking a quick glimpse of mdb_cursor_get, it indeed only exposes MDB_NOTFOUND and EINVAL, which acts as a generic error for any cursor ops they believe being invalid.

Will take a deeper look into this, stay stuned! :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants