Create processing_utils_lecroy and add first parser #45
@@ -0,0 +1,56 @@
```python
import numpy as np
import pandas as pd
import os
from tqdm import tqdm
import csv
import re

"""
Processing utilities for the Lecroy oscilloscope

This file holds all the relevant functions for the processing of data from csv files to h5.
"""

def parse_lecroy_segmented(lines):
    # Line 1 has to have: Segments,1000,SegmentSize,5002
```
Member
Add documentation explaining what the function does, the input parameters and the expected output. An example can be seen here
```python
    segments = int(lines[1][1])
    seg_size = int(lines[1][3])
```
Comment on lines +16 to +17
Member
Perhaps:
Suggested change
but this is picky, and perhaps a bit more unreadable. The choice is yours 🐱
```python
    # Line 2, header is: Segment,TrigTime,TimeSinceSegment1
    # Lines 3 to 3 + segments - 1 are header lines
    header_start = 3
    header_end = header_start + segments
    header_lines = lines[header_start:header_end]

    header_df = pd.DataFrame(header_lines, columns=["Segment", "TrigTime", "TimeSinceSegment1"])

    # Find the "Time,Ampl" line
    for i, line in enumerate(lines):
        if line[0].strip() == "Time":
            data_start = i + 1
            break
```
Comment on lines +27 to +31
Member
Is the position of the "Time,Ampl" line inconsistent within the data you use? For example, does it sometimes occur 5 lines in and other times 10 lines in?
```python
    else:
        raise ValueError("Time,Ampl line not found")

    # Read the data block (segments × segment size)
    raw_data = lines[data_start:]
    if len(raw_data) < segments * seg_size:
        print(f"Warning: expected {segments * seg_size} rows, got {len(raw_data)}")

    value_list = []
    for j in range(segments):
        segment_data = []
        for k in range(seg_size):
            x = j * seg_size + k
            if x >= len(raw_data):  # x = line in the file
                segment_data.append(None)
```
Member
This means that a segment will be lost if it is a bit shorter than expected, right? Can you quantify how many events you lose this way, either through a test or otherwise?
```python
            else:
                try:
                    value = float(raw_data[x][1])  # column 1 = Amplitude of signal
                    segment_data.append(value)
                except (ValueError, IndexError):
                    segment_data.append(None)
        value_list.append(segment_data)

    value_df = pd.DataFrame(value_list)
    return value_df, header_df
```
Member
A sensible test for this function would be to take a small input file (you can save it within the repository) and ensure that the output for that file is as expected. On the second revision I'll think of a nicer way to format the main 'work loop' here.
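A minimal sketch of the small-input test suggested above. It also counts how many segments come back incomplete when the file is cut short, which is one way to put a number on the question about lost events. The sample contents, the placeholder first line, and the names `SAMPLE`, `load_rows` and the two test functions are made up for illustration. The parser is a condensed copy of the function in this diff (without the warning print), included only so the example runs on its own.

```python
import csv
import io

import pandas as pd


def parse_lecroy_segmented(lines):
    """Condensed copy of the parser in this diff, kept here so the sketch is self-contained."""
    segments = int(lines[1][1])
    seg_size = int(lines[1][3])
    header_df = pd.DataFrame(
        lines[3:3 + segments],
        columns=["Segment", "TrigTime", "TimeSinceSegment1"],
    )
    for i, line in enumerate(lines):
        if line[0].strip() == "Time":
            data_start = i + 1
            break
    else:
        raise ValueError("Time,Ampl line not found")
    raw_data = lines[data_start:]
    value_list = []
    for j in range(segments):
        segment_data = []
        for k in range(seg_size):
            try:
                segment_data.append(float(raw_data[j * seg_size + k][1]))
            except (ValueError, IndexError):
                # Missing or unparsable sample, same fallback as the original
                segment_data.append(None)
        value_list.append(segment_data)
    return pd.DataFrame(value_list), header_df


# Hypothetical two-segment file with three samples per segment.
SAMPLE = "\n".join([
    "FirstLine,placeholder",  # assumption: line 0 is not read by the parser
    "Segments,2,SegmentSize,3",
    "Segment,TrigTime,TimeSinceSegment1",
    "#1,01-Jan-2024 00:00:00,0",
    "#2,01-Jan-2024 00:00:01,1",
    "Time,Ampl",
    "0.0,0.1",
    "1.0,0.2",
    "2.0,0.3",
    "0.0,0.4",
    "1.0,0.5",
    "2.0,0.6",
])


def load_rows(text):
    """Split CSV text into rows of string fields, as csv.reader would for a file."""
    return list(csv.reader(io.StringIO(text)))


def test_small_file_round_trip():
    value_df, header_df = parse_lecroy_segmented(load_rows(SAMPLE))
    assert value_df.shape == (2, 3)
    assert value_df.values.tolist() == [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
    assert header_df["Segment"].tolist() == ["#1", "#2"]


def test_truncated_file_marks_one_segment_incomplete():
    rows = load_rows(SAMPLE)[:-1]  # drop the last sample of segment 2
    value_df, _ = parse_lecroy_segmented(rows)
    incomplete = int(value_df.isna().any(axis=1).sum())
    assert value_df.shape == (2, 3)
    assert incomplete == 1
```

Running these with `pytest` would exercise both the happy path and the short-file path. The same `isna().any(axis=1).sum()` count could be applied to a real capture to see how many segments are affected.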
Best practice (which I want to implement somewhat retroactively) is to include type-checking annotations for all functions.
In your case that would look like:
If `lines` is a string; otherwise use the correct type. You'll have to import `Tuple` from `typing` for tuples:
from typing import Tuple
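A hedged sketch of what an annotated signature with a docstring could look like. It assumes `lines` is a list of rows as produced by `csv.reader`, which is a guess based on how the function indexes `lines[1][1]`, and not necessarily the type meant above. The NumPy-style docstring layout is also an assumption, and the body is a stub rather than the real parser.

```python
from typing import Sequence, Tuple

import pandas as pd


def parse_lecroy_segmented(
    lines: Sequence[Sequence[str]],
) -> Tuple[pd.DataFrame, pd.DataFrame]:
    """Parse a segmented Lecroy CSV export that has already been split into rows.

    Parameters
    ----------
    lines
        Rows of the CSV file, each a sequence of string fields
        (assumed here; change the annotation if `lines` is really a str).

    Returns
    -------
    value_df
        One row per segment and one column per sample; samples that are
        missing or cannot be parsed are left empty (None/NaN).
    header_df
        The per-segment header with columns "Segment", "TrigTime"
        and "TimeSinceSegment1".
    """
    # Stub only: the actual body is the function in this diff.
    raise NotImplementedError("signature sketch only")
```

A static checker such as mypy can then flag callers that pass the wrong type, and the docstring covers the documentation requested in the first comment.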