Digitize full WF #705
Conversation
I would vote for option 2: adding the remaining part of the waveform to the last sample seems the most reasonable option. As the typical difference is below a few percent, this should preserve all the timing quantities we typically use (50% width, 90% width, 10-to-50% area time, etc.). But we have to be sure that we document it clearly somewhere (at least as a very clear comment in the code). In the long run, we can decouple the relation
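For illustration, a minimal numpy sketch of what option 2 could look like; the function and its signature are hypothetical, not strax's actual code:

```python
import numpy as np

def downsample_with_remainder(wf, factor):
    """Hypothetical sketch of option 2: downsample by summing groups
    of `factor` samples, folding any leftover tail into the last
    output sample instead of truncating it."""
    n_full = len(wf) // factor
    out = wf[:n_full * factor].reshape(n_full, factor).sum(axis=1)
    tail = wf[n_full * factor:]
    if len(tail) and n_full:
        # Add the remaining part of the waveform to the last sample;
        # this keeps out.sum() == wf.sum() at the cost of a slightly
        # distorted last bin.
        out[-1] += tail.sum()
    return out
```

The total area is then preserved exactly, which is what keeps the cumulative-area timing quantities mentioned above stable.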
Yes, agreed; it's all quite hypothetical, but we have to be mindful of future PhD students banging their heads against the wall over why things don't sum up. I just added the functionality. I'll try to add some more tests too, to illustrate the point.
Not sure we would really gain much. We inherently lose information by downsampling, but that does not mean the lost information is actually useful (it depends on the application, of course; for strax it's now mostly computing area deciles, whose relevant parameters aren't actually affected by any of the changes made here).
FYI, if someone wants to pick this up: I did not figure out where, but something is still a bit fishy about this PR, as it creates very different sum WFs (whereas it should be a very minor change). I'll close this for now; people can find it back via #704.
What is the problem / what does the code in this PR do
See #704
The sum WF is always conservative: the peak is truncated when downsampling. While some loss is probably acceptable, having O(3%) differences between `p['data'].sum()` and `p['area']`
might be a bit more confusing than we'd like.
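As a hedged illustration of the check this discrepancy would fail, assuming strax-style peaks with 'data', 'length', and 'area' fields:

```python
def check_sum_wf(peaks, tol=0.03):
    """Hypothetical sanity check: the stored (downsampled) waveform
    should sum to the stored area within `tol` (the O(3%) level
    quoted above)."""
    for p in peaks:
        if p['area'] <= 0:
            continue  # avoid dividing by zero for empty/negative peaks
        rel_diff = abs(p['data'][:p['length']].sum() - p['area']) / p['area']
        assert rel_diff < tol, f"sum WF differs from area by {rel_diff:.1%}"
```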
Can you briefly describe how it works?

Rather than conservatively clipping the data field, use any time available after the peak. This is done by checking whether the next peak is far enough away that we can claim the bit of extra time needed for the extra sample.
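Schematically, the gap check could look like the sketch below (field names follow strax's peak dtype; the helper itself is hypothetical):

```python
def can_claim_extra_sample(peaks, i, new_dt):
    """Hypothetical sketch: may peak i be extended by one extra sample
    of duration `new_dt` (ns) without running into peak i + 1?"""
    end = peaks[i]['time'] + peaks[i]['length'] * peaks[i]['dt']
    if i + 1 == len(peaks):
        return True  # last peak: no neighbor to collide with
    return peaks[i + 1]['time'] - end >= new_dt
```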
There may still be a chance that one peak follows directly after another, preventing us from filling the entire buffer. There are a couple of things we could do about that: