-
Notifications
You must be signed in to change notification settings - Fork 5
Add wd1 processing #41
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
27d632d to
fee38cc
Compare
bpalmeiro
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First round of comments, have fun!
| # THIS SHOULD BE MOVED ELSEWHERE | ||
| class MalformedHeaderError(Exception): | ||
| ''' | ||
| Header created for when two headers don't match up consecutively. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Guess you mean exception?
| import pandas as pd | ||
| import numpy as np |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
alignment
| header = np.fromfile(file_object, dtype = 'i', count = 6) | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess these are fixed for WD2 right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Its fixed for WD1, WD2 uses an adaptively sized header, but since each file in Wavedump1 is a channel, this issue doesn't occur.
| sanity_header = header.copy() | ||
|
|
||
| # continue only if data exists |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why copy it before knowing if it has anything?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
because we rewrite the header variable in the next steps to compare to this 'initial sanity check' header. If it is malformed, an error is returned.
This code could be restructured to check if it has none before copying, but copying these headers once isn't particularly expensive.
| header = np.fromfile(file_object, dtype = 'i', count = 6) | ||
|
|
||
| # check if header has correct number of elements and correct information ONCE. | ||
| if sanity_header is not None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comparison should be made at the beginning and not compared all the time; this object is unchanged, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, given you already did the while with this in the 1st iteration technically you've checked this already
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is only made at the beginning in the first iteration, if it passes the checks sanity_header is set to None after it is checked once and as such this if statement is never checked again.
| save_path (str) : Path to saved file | ||
| sample_size (int) : Size of each sample in an event (2 ns in the case of V1730B digitiser) | ||
| overwrite (bool) : Boolean for overwriting pre-existing files | ||
| counts (int) : The number of events per chunks. -1 implies no chunking of data. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
counts not used, print mod used and not reported
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be altered to print_mod, will change
| [optional] | ||
|
|
||
| overwrite = True | ||
| counts = -1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
counts not used
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any reason for it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
whoopsies, it should have been replaced with print_mod as the lazy processing no longer requires chunking. I'll fix this
| file_path = '/home/casper/Documents/MULE/packs/tests/data/one_channel_WD1.dat' | ||
| save_path = '/home/casper/Documents/MULE/packs/tests/data/one_channel_WD1_tmp.h5' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shall these paths be more generic? :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could try and tie them into the provided environment variables for the MULE directory, but these are just sample configs. They're not meant to work out of the box, but provide a template to work upon.
| assert [x for x in reader(save_path, 'RAW', 'rwf')] == [x for x in reader(comparison_path, 'RAW', 'rwf')] | ||
|
|
||
|
|
||
| def test_lazy_loading_malformed_data(MULE_dir): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You may can also add the sanity_header being None if you keep that part of the code
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But the None case and the reverse are tested in the process of WD1 processing, but I can create explicit tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this comment comes from the aligment confusion, disregard
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But i can transform it to check not only the specific values but also the "len(header) == 6", right?
| if conf_dict['wavedump_edition'] == 2: | ||
| process_bin_WD2(**arg_dict) | ||
| elif conf_dict['wavedump_edition'] == 1: | ||
| process_bin_WD1(**arg_dict) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| if conf_dict['wavedump_edition'] == 2: | |
| process_bin_WD2(**arg_dict) | |
| elif conf_dict['wavedump_edition'] == 1: | |
| process_bin_WD1(**arg_dict) | |
| if conf_dict['wavedump_edition'] == 2: | |
| process_bin_WD2(**arg_dict) | |
| elif conf_dict['wavedump_edition'] == 1: | |
| process_bin_WD1(**arg_dict) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also test the new case? :)
This PR introduces the functionality of processing into MULE, with single channel decoding from waveDump 1 .dat files now being possible. Addresses #11 which will be resolved when this is complete.
The data is stored in h5 files with a storage path and name provided by the user. The h5 format is similar but does not match the wavedump 2 formatting, hence the need for malleable reader and writers (as introduced in #40). This will be resolved in future PRs to be equivalent across both.
This PR rests on top of #40, so should be merged after it.