-
Notifications
You must be signed in to change notification settings - Fork 51
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
AMBER: are we too permissive in reading the files? (and also we are doing a double-reading? #238
Comments
Thanks for posting this issue.
Will probably have more questions when I have a more closer read. |
Regarding the first two questions:
You suppose right, the only difference for the user is that (in the case of 2b) if he/she is trying to read an invalid file he gets an exception risen instead of a
Ideally yes, but I don't think this will be a drastic change in execution time. The bigger advantage for me is to promptly notify to the user when the file he/she is trying to read is invalid. |
This looks weird to me as well. I suspect that this is from some legacy code and the
Agree
2b is good enough for me but if it is done in a better sense, 2c would be better.
Agree |
Great, I admit all this section of the AMBER parser really bothers me somehow, so I'm really happy to change it! I'll address this issue when PR #240 is done, and maybe also when the PR with the new test is done in alchemtest, because here lot's of tests will have to be changed to reflect the modified behaviour (all the tests of invalid_files at least)! |
Note that PR #253 addresses some of the issues. |
General comment: I am in favor of raising an exception at the first problem — there's no point continuing if essential data are missing. Thank you for taking on the task of cleaning up the AMBER parser. |
* fix #238 Now raise exceptions for *any* problems in AMBER input files instead of issuing a warning and continuing and then perhaps failing at the end. * fix #227 All problems in AMBER parser are now properly logged. * removed double run of `file_validation` * added docstring to `file_validation` * update CHANGELOG * added tests for the logging and the new exceptions Co-authored-by: Oliver Beckstein <[email protected]>
Here I want to tackle an issue that may change the behavior of some existing scripts, but I think for the better.
Currently, in the AMBER parser, we have a function called
file_validation
(the name is pretty misleading because indeed it validates the file, but more importantly it returns eitherFalse
or an object representation of the input AMBER file, on which then the other functions operate).This function, which should return a "file parsed object
FEData
", whenever it encounters some problem reading the file, just prints a descriptive WARNING and switches an internal variableinvalid
toTrue
, then continues reading the file as it nothing happened, only to returnFalse
ifinvalid
wasTrue
instead of returning the parsed (and invalid) object.This function is called when extracting dHdl and u_nk, and here is funny, because in both cases we do:
which in itself is redundant, as it was sufficient to check
if not file_datum: return None
without parsing the same file twice.The flow right now then is that we read and parse the AMBER output file to create the
FEData
object, if the file it's ok we continue parsing the AMBER out to get the data we need, otherwise, bothextract_*
functions returnNone
.I'm proposing some changes here (which have not to be implemented altogether)
file_Validation
call (this I think must be done 😆 )file_validation
, as soon as problems that now make the function returnFalse
, instead of switchinginvalid=True
, raise a proper Exception (maintaining the current output warning). In this way, we won't continue parsing a file we already decided is invalid... after that we have three scenarios:2a. Maintain the current behavior, we catch the exception when we use
file_validation
, liketry: file_datum = file_validation(outfile) ; except ExceptionWeWillRise: return None
2b. We don't catch the exception, so instead of returning
None
when an AMBER output file was not read we let thefile_validation
interrupt the flow of the program and let the user be aware that a file was not parsed.2c. An hybrid version of 3a and 3b, we catch the exception from
file_validation
inside theextract_*
functions, and raise here a more informative exception, maybe printing both a general warning here and also the warning from the exception risen insidefile_validation
file_validation
to something more coherent to its behavior, I'm proposingfile_initialiation
, but I'm open to any suggestion.I would certainly want to do 1., probably want to do 3. and I propend towards 2b. as it will let the users be aware that a file is not read, letting them be able to just catch the exception if they want to continue parsing more files.
Depending on the route we'll choose, also the tests for invalid files in test_amber.py will have to be updated, but this is fine because it addresses also issue #234 in a single PR (I could first address #234 to separate the common single-test in one test for each case of invalid file in the current implementation, and then change the tests accordingly to the changes made here, but it would be a waste of time I think, without a proper gain in traceability). I think the only advantage would be that if we revert this specific PR, we'll still have the separate tests if this was done in two separate PRs).
Please, let me know what you think!
The text was updated successfully, but these errors were encountered: