Skip to content

Conversation

@jwaiton
Copy link
Member

@jwaiton jwaiton commented Nov 11, 2024

This PR introduces the functionality of processing into MULE, with multichannel decoding from WaveDump 2 .bin files now being possible.

The data is stored in h5 files with a storage path and name provided by the user. I'm using this PR as a draft to develop the general structure of a pack such that the following ones will be easier to implement.

@jwaiton
Copy link
Member Author

jwaiton commented Nov 11, 2024

Major additions that are still required:

  • Add basic integration with executable, proc.py and the config files
  • Refactor code significantly if possible
  • Add tests
  • complete documentation for the new functions
  • add type hints to functions
  • Further functionalise the code
  • add chunking to ensure memory safety

@jwaiton
Copy link
Member Author

jwaiton commented Nov 14, 2024

I will also add some tools for reading the files quickly based on the chunking. Should these perhaps go in a differing section (analysis, core, etc)? @bpalmeiro

@jwaiton jwaiton added the enhancement New feature or request label Nov 16, 2024
@jwaiton
Copy link
Member Author

jwaiton commented Nov 19, 2024

I've added executable functionality, but now need to modify the tests to allow for this.

The executable functionality also now includes the config file setup for MULE, which uses ConfigParser. It functions much like IC, in which you enter inputs in a pythonic format and get the expected results.

Marking it as ready for review, but needs more tests that I'll be writing over time. We can now also address issue #9 as config files are now part of MULE! 🥳

@jwaiton jwaiton force-pushed the include-WD2-processing branch from 76c5b3b to 6275472 Compare November 19, 2024 17:54
@jwaiton jwaiton marked this pull request as ready for review November 19, 2024 18:09
@jwaiton jwaiton requested a review from bpalmeiro November 19, 2024 18:10
@jwaiton
Copy link
Member Author

jwaiton commented Nov 21, 2024

pytest coverage at the moment:

Name                             Stmts   Miss  Cover   Missing
--------------------------------------------------------------
packs/acq/__init__.py                0      0   100%
packs/acq/acq.py                     6      0   100%
packs/core/__init__.py               0      0   100%
packs/core/core_utils.py             7      1    86%   14
packs/core/io.py                    31      0   100%
packs/proc/__init__.py               0      0   100%
packs/proc/proc.py                  18     10    44%   15-28
packs/proc/processing_utils.py     144     84    42%   66-114, 187-188, 205-206, 240-266, 321-324, 357-377, 439-484
packs/tests/processing_test.py      87      0   100%
packs/tests/setup_test.py           28      0   100%
packs/tests/temporal_test.py         2      0   100%
packs/tests/tests.py                 6      0   100%
packs/types/types.py                 9      0   100%
--------------------------------------------------------------
TOTAL                              338     95    72%

I'm not aiming for 100%, but I believe a single test will help in increasing coverage; a test that the processing pack creates a file that matches what we expect, from start to finish.

@jwaiton
Copy link
Member Author

jwaiton commented Nov 22, 2024

The new test explicitly attempts use the decode process with:

  • (no)/chunking
  • single/multiple channels

and tests them explicitly. The new coverage is:
image

Of processing_tests.py:
240-266 is deprecated code I will remove
66-114 is WD1 code that will need to be refactored (raised here #11)

The other areas can and will be tested.

@jwaiton
Copy link
Member Author

jwaiton commented Nov 22, 2024

Only sections no longer tests are WD1 related, or a specific line (#324 of processing_utils.py) that doesn't need testing (and may not be required anyway).

@jwaiton
Copy link
Member Author

jwaiton commented Nov 25, 2024

There is a bug when running the decoder for singular channels that seems to select the channel number as the number of samples, causing the output for samples to be zero, and then refusing to write an empty table to the h5 file. Why this occurs I haven't fully checked, but the tests should catch this when creating the singular channel file via decoding. Will figure this out and patch.

EDIT: This is an issue actually with Wavedump 2. If you select 'multiple channels per file' but only include one channel, the number of samples appears to be zero. Ensure when only reading out from one channel that you do not say 'many channels per file'!

@jwaiton jwaiton mentioned this pull request Dec 11, 2024
3 tasks
@jwaiton
Copy link
Member Author

jwaiton commented Jan 31, 2025

Quite a significant bug has been found! Mathias and I were working and encountered a MemoryError in process_header(), this is caused by a single channel data file being read as a multi-channel data file (as intended). The issue arises with what values end up being assigned to the channels and samples values.

Channels was set to some number 10^10, which meant when trying to execute file.read(4*samples*channels) a memory error occurred.

The solution to this is being implemented now.

@jwaiton jwaiton force-pushed the include-WD2-processing branch from 8f0f29a to 6cb3177 Compare January 31, 2025 17:32
Copy link
Collaborator

@bpalmeiro bpalmeiro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did a """"light"""" review, so probably missing things, but here you have my first round of comments

assert evt_info.equals(check_evt_info)


def test_ensure_new_path_created():
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missing the overwriting and the successful iteration tests

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what do you mean by successful iteration?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The case you don't want to overwrite, and it loops and finds an available name

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TD

Copy link
Collaborator

@bpalmeiro bpalmeiro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm more or less satisfied with the addressing, although tonnes of things left to the future poor TD payer.

Good job!

@bpalmeiro bpalmeiro merged commit cdc7758 into nu-ZOO:main Feb 18, 2025
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

2 participants