-
Notifications
You must be signed in to change notification settings - Fork 14
Fix segmentation fault on invalid SMARTS #19
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
aleebberg
wants to merge
9
commits into
rdkit-rs:main
Choose a base branch
from
aleebberg:main
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Changes from 4 commits
Commits
Show all changes
9 commits
Select commit
Hold shift + click to select a range
d60d2ae
Add error handling to RWMol::from_smarts
aleebberg d8d0b7d
Add test for invalid input
aleebberg d12879f
add empty SMARTS handling and facilitate errors
aleebberg f6da32a
Adapt error handling and add tests für empty SMARTS
aleebberg 8eea60f
Add enum error handling
aleebberg a9c9e84
Adapt tests to enum errors
aleebberg 27ded30
Make romol mut
aleebberg 79d9b4e
Run linter
aleebberg 5461f0c
Remove empty smarts error
aleebberg File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I do not study chemistry myself, but informatics, I am not sure, if this is the wanted behavior or if you'd prefer a wildcard here or if maybe an empty string is fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we probably don't want to allow an empty smarts, so this works for me!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you so much for contributing this PR! It's close but I think a little guidance would help. First off, it's good to reiterate the structure of this git repository. It contains two rust crates: rdkit and rdkit-sys. The sys crate is where pointers are built and where rust meets the C++ code of rdkit. You want all exception handling to happen in the sys crate. The
rdkitcrate is for high-level convenience. Every pointer type fromrdkit-sysis wrapped in a struct -- with hidden pointers, as you've seen, we wan't to create a "nanny state" where our users can't do unsafe things. I believe you want to take a look at the sys crate as part of your bug fix.Now for my feedback on this code snippet:
Using the
Resulttype is absolutely the right way to go -- we need to discriminate between good values and error values. My only objection is the type on the error path,&stris not very Rust-y. We want to avoid "stringly-typed" data and instead use more enums.See
rdkit/src/graphmol/ro_mol.rs
Lines 21 to 34 in f6da32a
RWMolErrorenum and instantiate it in your error path.My next concern is performing condition checking in
rw_mol.rsitself -- in general I want rdkit (c++) to drive what happens. It's very possible in the future our friends at the RDKit project could change how empty strings behave (although unlikely) and I don't want the Rust logic to be at odds with whatever they decide. Instead, check if the returned pointer is null and if so, build the appropriate enum.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you so much for the detailed feedback! You are totally right about the enums and I am going to adapt it to the handling of errors in ro_mol.rs.
Concerning the empty SMARTS, I checked what happens in Python and they return a valid Mol object that doesn't have any contents (so not a nullpointer, but empty). So maybe we should go for that, too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer to model what the C++ library does -- it does mean that python folks are going to have to rethink how they use rdkit. The rdkit-rs implementation puts errors upfront and in your face. Clear error propagation is important for our users of the the cheminee search engine, for example. We don't want too much "magic".
So think of this project as C++ flavored RDKit 😄. Hopefully you are familiar with the autogenerated rdkit docs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had a deeper look at the rdkit-Documentation, though I must say I have never dealt with C++. I still found an interesting comment in the actual C++ code stating:
// empty strings produce empty molecules. So maybe having an error in Rust at this part would not really fit?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, please address my last comment and I'm happy to merge this and push it up to crates.io. Thanks for your perseverance and patience!