@jwaiton (Member) commented Apr 22, 2025

This PR introduces processing functionality into MULE; single-channel decoding from WaveDump 1 .dat files is now possible. It addresses #11, which will be resolved once this work is complete.

The data is stored in h5 files, with the storage path and file name provided by the user. The h5 format is similar to, but does not match, the WaveDump 2 format, hence the need for malleable readers and writers (as introduced in #40). Future PRs will make the format equivalent across both.

This PR builds on #40, so it should be merged after it.

@jwaiton mentioned this pull request Apr 22, 2025
@bpalmeiro self-assigned this Jan 16, 2026
@jwaiton force-pushed the add-WD1-processing branch from 27d632d to fee38cc on January 22, 2026 17:07

@bpalmeiro (Collaborator) left a comment

First round of comments, have fun!

# THIS SHOULD BE MOVED ELSEWHERE
class MalformedHeaderError(Exception):
    '''
    Header created for when two headers don't match up consecutively.

Collaborator

I guess you mean "Exception" rather than "Header" here?

Comment on lines 3 to 4
import pandas as pd
import numpy as np

Collaborator

alignment

Comment on lines +390 to +391
header = np.fromfile(file_object, dtype = 'i', count = 6)

Collaborator

I guess these are fixed for WD2, right?

Member Author

It's fixed for WD1. WD2 uses an adaptively sized header, but since each WaveDump 1 file holds a single channel, this issue doesn't occur.
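
For illustration, here is a minimal sketch of reading a fixed-size, six-word header with np.fromfile, as in the line under review. The read_header helper, the header values, and the sample count are made up for this example and are not taken from the PR.

```python
import os
import tempfile

import numpy as np

HEADER_WORDS = 6  # fixed header length, matching count = 6 in the reviewed line


def read_header(file_object):
    """Read one fixed-size header; return None once the file is exhausted."""
    header = np.fromfile(file_object, dtype='i', count=HEADER_WORDS)
    if len(header) < HEADER_WORDS:
        return None
    return header


# Build a tiny fake file: one six-word header followed by four 16-bit samples.
# All values here are invented purely for demonstration.
fake_header = np.array([32, 0, 0, 0, 1, 1000], dtype='i')
fake_samples = np.array([10, 11, 12, 13], dtype=np.uint16)

with tempfile.NamedTemporaryFile(suffix='.dat', delete=False) as tmp:
    tmp.write(fake_header.tobytes())
    tmp.write(fake_samples.tobytes())
    tmp_path = tmp.name

with open(tmp_path, 'rb') as f:
    header = read_header(f)                           # first 24 bytes
    samples = np.fromfile(f, dtype=np.uint16, count=4)
    end = read_header(f)                              # nothing left to read

os.remove(tmp_path)
```

Because the header length never changes, each read can request exactly six words and treat a shorter result as the end of the file.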

Comment on lines +393 to +395
sanity_header = header.copy()

# continue only if data exists

Collaborator

Why copy it before knowing if it has anything?

Member Author

@jwaiton commented Jan 23, 2026

Because we overwrite the header variable in the next steps and compare it to this 'initial sanity check' header. If it is malformed, an error is raised.

This code could be restructured to check whether the header is empty before copying, but copying these headers once isn't particularly expensive.

header = np.fromfile(file_object, dtype = 'i', count = 6)

# check if header has correct number of elements and correct information ONCE.
if sanity_header is not None:

Collaborator

This comparison should be made once at the beginning rather than on every iteration; this object is unchanged, right?

Collaborator

Also, given that you already ran the while loop with this in the first iteration, technically you've already checked it.

Member Author

The comparison is only made once, in the first iteration. If the header passes the checks, sanity_header is set to None, so the body of this if statement never runs again.
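
The check-once pattern described here can be sketched roughly as follows. This is not the PR's code: read_words, iter_events, and make_event are hypothetical helpers, the only check performed is on the header length, and an in-memory stream stands in for the .dat file.

```python
import io

import numpy as np

HEADER_WORDS = 6  # fixed header length for WD1, per the discussion above


class MalformedHeaderError(Exception):
    """Raised when the first header does not look as expected."""


def read_words(stream, count):
    """Read up to `count` 32-bit integers from a binary stream."""
    raw = stream.read(4 * count)
    raw = raw[:len(raw) - len(raw) % 4]  # drop a trailing partial word
    if not raw:
        return np.empty(0, dtype='i')
    return np.frombuffer(raw, dtype='i')


def iter_events(stream, n_samples):
    """Lazily yield (header, samples) pairs, validating the header once."""
    header = read_words(stream, HEADER_WORDS)
    sanity_header = header.copy()
    while len(header) > 0:
        if sanity_header is not None:
            if len(sanity_header) != HEADER_WORDS:
                raise MalformedHeaderError('first header is incomplete')
            sanity_header = None  # checked once; this branch is skipped from now on
        samples = np.frombuffer(stream.read(2 * n_samples), dtype=np.uint16)
        yield header, samples
        header = read_words(stream, HEADER_WORDS)


def make_event(counter, samples):
    """Build one fake event (invented header values) as raw bytes."""
    header = np.array([24 + 2 * len(samples), 0, 0, 0, counter, 0], dtype='i')
    return header.tobytes() + np.asarray(samples, dtype=np.uint16).tobytes()


good = io.BytesIO(make_event(0, [1, 2, 3, 4]) + make_event(1, [5, 6, 7, 8]))
events = list(iter_events(good, n_samples=4))
```

After the first pass sanity_header is None, so later iterations only pay for the cheap `is not None` test.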

save_path (str) : Path to saved file
sample_size (int) : Size of each sample in an event (2 ns in the case of V1730B digitiser)
overwrite (bool) : Boolean for overwriting pre-existing files
counts (int) : The number of events per chunks. -1 implies no chunking of data.

Collaborator

counts is not used; print_mod is used but not documented.

Member Author

This should be changed to print_mod; I will fix it.

[optional]

overwrite = True
counts = -1

Collaborator

counts is not used.

Collaborator

Any reason for it?

Member Author

Whoops, it should have been replaced with print_mod, as the lazy processing no longer requires chunking. I'll fix this.
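
As a rough sketch of how a print_mod-style option could report progress during lazy, event-by-event processing: the function name and the exact semantics (report every print_mod-th event, -1 to stay silent) are assumptions for illustration, and the PR may define them differently.

```python
def process_lazily(events, print_mod=-1, report=print):
    """Consume an iterable of events one at a time, reporting progress."""
    n_processed = 0
    for n_processed, event in enumerate(events, start=1):
        # ... per-event work (e.g. writing the event to disk) would go here ...
        if print_mod > 0 and n_processed % print_mod == 0:
            report(f'Processed {n_processed} events')
    return n_processed


# Collect the progress messages in a list instead of printing them.
messages = []
total = process_lazily(iter(range(10)), print_mod=4, report=messages.append)
```

Since events are consumed one by one, no chunk size is needed; the option only controls how often progress is reported.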

Comment on lines +4 to +5
file_path = '/home/casper/Documents/MULE/packs/tests/data/one_channel_WD1.dat'
save_path = '/home/casper/Documents/MULE/packs/tests/data/one_channel_WD1_tmp.h5'

Collaborator

Should these paths be more generic? :)

Member Author

I could try to tie them into the provided environment variables for the MULE directory, but these are just sample configs. They're not meant to work out of the box, but to provide a template to build on.
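
One possible way to tie config paths to an environment variable is sketched below. The variable name MULE_DIR, the example location, and the resolve_config_path helper are hypothetical; the actual variable provided by MULE may be named differently.

```python
import os


def resolve_config_path(raw_path):
    """Expand environment variables and '~' in a path read from a config file."""
    return os.path.expanduser(os.path.expandvars(raw_path))


# MULE_DIR is a hypothetical variable name, set here only for the example.
os.environ['MULE_DIR'] = '/opt/MULE'
file_path = resolve_config_path('$MULE_DIR/packs/tests/data/one_channel_WD1.dat')
```

With this approach the sample config would store '$MULE_DIR/...' rather than a user-specific home directory.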

assert [x for x in reader(save_path, 'RAW', 'rwf')] == [x for x in reader(comparison_path, 'RAW', 'rwf')]


def test_lazy_loading_malformed_data(MULE_dir):

Collaborator

You could also add a test for sanity_header being None, if you keep that part of the code.

Member Author

The None case and the reverse are already exercised as part of the WD1 processing tests, but I can create explicit tests.

Collaborator

This comment came from the alignment confusion; please disregard it.

Collaborator

But it could be changed to check not only the specific values but also len(header) == 6, right?
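
A test along these lines might look like the sketch below, which checks both the header length and its values. The header_is_valid helper and the header values are invented for the example; the PR's actual check lives inside the reader and may differ.

```python
import numpy as np

HEADER_WORDS = 6  # expected number of words in a WD1 header


def header_is_valid(header, expected=None):
    """Return True if the header has six words and, optionally, the expected values."""
    if len(header) != HEADER_WORDS:
        return False
    if expected is not None and not np.array_equal(header, expected):
        return False
    return True


def test_header_length_and_values():
    good = np.array([32, 0, 0, 0, 1, 1000], dtype='i')  # made-up values
    assert header_is_valid(good)
    assert header_is_valid(good, expected=good.copy())
    assert not header_is_valid(good[:3])                 # truncated header
    assert not header_is_valid(good, expected=good + 1)  # right length, wrong values


test_header_length_and_values()
```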

Comment on lines 24 to +27
if conf_dict['wavedump_edition'] == 2:
    process_bin_WD2(**arg_dict)
elif conf_dict['wavedump_edition'] == 1:
    process_bin_WD1(**arg_dict)

Collaborator

Suggested change
if conf_dict['wavedump_edition'] == 2:
    process_bin_WD2(**arg_dict)
elif conf_dict['wavedump_edition'] == 1:
    process_bin_WD1(**arg_dict)
if conf_dict['wavedump_edition'] == 2:
    process_bin_WD2(**arg_dict)
elif conf_dict['wavedump_edition'] == 1:
    process_bin_WD1(**arg_dict)

Collaborator

Could you also test the new case? :)
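
A test covering both branches could be sketched as follows. The dispatch_processing wrapper and the stub processors are hypothetical; in the real code the stubs would be process_bin_WD1 and process_bin_WD2, and the error handling for unknown editions is an added assumption.

```python
def dispatch_processing(conf_dict, arg_dict, processors):
    """Call the processor registered for the configured WaveDump edition."""
    edition = conf_dict['wavedump_edition']
    try:
        processor = processors[edition]
    except KeyError:
        raise ValueError(f'Unsupported wavedump_edition: {edition!r}') from None
    return processor(**arg_dict)


def test_dispatch_covers_both_editions():
    calls = []
    # Stubs record which branch was taken instead of processing real data.
    processors = {
        1: lambda **kwargs: calls.append(('WD1', kwargs)),
        2: lambda **kwargs: calls.append(('WD2', kwargs)),
    }
    args = {'file_path': 'in.dat', 'save_path': 'out.h5'}
    for edition in (1, 2):
        dispatch_processing({'wavedump_edition': edition}, args, processors)
    assert [name for name, _ in calls] == ['WD1', 'WD2']
    assert calls[0][1] == args


test_dispatch_covers_both_editions()
```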
