Digitize full WF #705

JoranAngevaare · 2022-10-26T07:29:51Z

What is the problem / what does the code in this PR do
See #704

The sum waveform (WF) is always conservative: downsampling truncates the peak. While some loss is probably acceptable, having O(3%) differences between p['data'].sum() and p['area'] might be a bit more confusing than what we'd
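To illustrate the mismatch, here is a minimal NumPy sketch. It assumes that downsampling sums groups of `factor` consecutive samples and drops any incomplete trailing group; the helper `downsample_truncating` and the sample values are hypothetical and are not the strax implementation.

```python
import numpy as np


def downsample_truncating(samples, factor):
    """Sum groups of `factor` samples, dropping any incomplete trailing group.

    Hypothetical helper for illustration only, not the strax function.
    """
    n_full = len(samples) // factor
    return samples[: n_full * factor].reshape(n_full, factor).sum(axis=1)


# Made-up 10-sample waveform downsampled by a factor of 4:
# the last 2 samples do not fill a group and are discarded.
waveform = np.array([1, 2, 5, 9, 7, 4, 2, 1, 1, 1], dtype=np.float64)
area = waveform.sum()  # analogous to p['area']
data = downsample_truncating(waveform, factor=4)  # analogous to p['data']
lost_fraction = 1 - data.sum() / area  # fraction of the area missing from data
```

In this toy case the loss is larger than the O(3%) quoted above because the waveform is very short; the point is only that `data.sum()` can never exceed `area` under truncation.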

Can you briefly describe how it works?
Rather than being conservative when clipping the data field, use the time after the peak if any is available. This is determined by checking whether the next peak is far enough away that we can claim the bit of extra time needed for the extra sample.
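The check described above could look roughly like the following sketch. It assumes the relation endtime = time + length * dt and that a peak may only be extended if the extended end time does not pass the start of the next peak; the function name and arguments are hypothetical, not taken from the PR.

```python
def n_downsampled_samples(length, dt, factor, time, next_time=None):
    """Number of downsampled samples that can be stored for one peak.

    Hypothetical helper: length is the number of original samples, dt the
    original sample width, factor the downsampling factor, time the peak
    start, and next_time the start of the following peak (None if absent).
    """
    n_full, remainder = divmod(length, factor)
    if remainder == 0:
        # Everything fits in complete groups, nothing is truncated.
        return n_full
    # End time if we claim one extra (partially filled) downsampled sample.
    extended_endtime = time + (n_full + 1) * dt * factor
    if next_time is None or next_time >= extended_endtime:
        return n_full + 1
    # The next peak is too close: fall back to the truncated length.
    return n_full
```

For example, with `length=10`, `dt=10`, `factor=4` and `time=0`, the extended end time is 120, so a next peak starting at 200 allows three samples while one starting at 110 allows only two.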

There is still a chance that another peak follows directly after a peak, which would prevent us from filling the entire buffer. There are a couple of things we could do about that:

1. just ignore it - the chance that this happens is extremely small, especially for smaller peaks where this last sample might make any difference
2. add the data of that last sample that doesn't fit within the time interval to the second-to-last sample
3. rewrite tons of code to support overlapping peaks
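As a rough sketch of the second option, the leftover samples that do not fill a complete group can be folded into the last stored sample so that the stored data sums to the full area. The helper `downsample_folding` is hypothetical and only illustrates the idea; it is not the code added in this PR.

```python
import numpy as np


def downsample_folding(samples, factor):
    """Sum groups of `factor` samples and fold the remainder into the last group.

    Hypothetical helper for illustration only.
    """
    n_full = len(samples) // factor
    if n_full == 0:
        # Fewer samples than one group: a single sample holds everything.
        return np.array([samples.sum()])
    out = samples[: n_full * factor].reshape(n_full, factor).sum(axis=1)
    # Add the samples that do not fit in a complete group to the last one.
    out[-1] += samples[n_full * factor:].sum()
    return out


waveform = np.array([1, 2, 5, 9, 7, 4, 2, 1, 1, 1], dtype=np.float64)
folded = downsample_folding(waveform, factor=4)
```

The total area is preserved at the cost of slightly distorting the shape of the final sample, which is why this behaviour would need to be documented clearly.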

coveralls · 2022-10-26T07:47:01Z

Coverage: 92.763% (+0.6%) from 92.184% when pulling 457b4a3 on digitize_full_wf into 209735b on master.

terliuk · 2022-10-26T08:33:57Z

I would vote for option 2: adding the remaining part of the waveform to the last sample is the most reasonable option. As the typical difference is below a few %, this should preserve all the timing quantities that we typically use (50% width, 90% width, 10-to-50% area time, etc.). But we have to be sure that we document it clearly somewhere (at least as a very clear comment in the code).

In the long run, we could decouple the relation endtime=length*dt + time and store a dedicated end time or the total duration of all samples, but this might require far more testing effort, as it might "backfire" in some places.

JoranAngevaare · 2022-10-26T09:02:21Z

> I would vote for option 2: adding the remaining part of the waveform to the last sample is the most reasonable option. As the typical difference is below a few %, this should preserve all the timing quantities that we typically use (50% width, 90% width, 10-to-50% area time, etc.). But we have to be sure that we document it clearly somewhere (at least as a very clear comment in the code).

Yes, agreed. It's all quite hypothetical, but we have to be mindful of future PhD students banging their heads against the wall wondering why things don't add up. I just added the functionality. I'll try to add some more tests too to illustrate the point.

> In the long run, we could decouple the relation endtime=length*dt + time and store a dedicated end time or the total duration of all samples, but this might require far more testing effort, as it might "backfire" in some places.

Not sure if we would really gain much. We inherently lose information from downsampling, but that does not mean that the information lost is actually useful. That of course depends on the application; for strax it's now mostly used for computing area deciles, whose relevant parameters aren't actually affected by any of the changes made here.

JoranAngevaare · 2023-06-01T18:12:04Z

FYI, if someone wants to pick this up: I did not figure out where, but something is still a bit fishy about this PR, as it creates very different sum WFs (whereas it should be a very minor change). I'll close this for now; people can find it again via #704.

Joran Angevaare added 4 commits October 26, 2022 09:17

add test and solution to #704

1d82e0d

fix kwargs

c4b892b

some cleanup

2b76968

off by one

46adbbd

add truncation handling

d3a7a6e

Joran Angevaare added 2 commits October 26, 2022 12:01

add comments and tests

07c76a5

typo

7ec8936

JoranAngevaare marked this pull request as draft October 26, 2022 11:33

Merge branch 'master' into digitize_full_wf

457b4a3

JoranAngevaare closed this Jun 1, 2023

JoranAngevaare deleted the digitize_full_wf branch June 1, 2023 18:12
