-
Notifications
You must be signed in to change notification settings - Fork 77
Add main functions for the computation of Kr maps #866
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Changes from 11 commits
5b331f8
aedc73e
7593cd9
e943e9c
bd79755
a27dd34
f480e1e
58b2824
7928ec4
a8de173
24419d8
cd4e170
6cb42eb
2ee3482
11cb1e5
ddfc3ea
887eac6
c4ca129
46955a9
b30b7b8
e6d0256
bae44b8
7775792
0a48e25
9cb9117
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| @@ -1,9 +1,18 @@ | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| import numpy as np | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| import pandas as pd | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| import copy | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| import warnings | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| import itertools | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| import numpy as np | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| import pandas as pd | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| import scipy.stats as stats | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| from typing import Tuple, Optional | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| from .. types.symbols import KrFitFunction | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| from .. evm.ic_containers import FitFunction | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| from .. core.fit_functions import polynom, expo | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| from ..core.core_functions import in_range, shift_to_bin_centers | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| from ..core.fit_functions import fit, get_chi2_and_pvalue | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| def lin_seed(x : np.array, y : np.array): | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
|
@@ -149,3 +158,325 @@ def transform_parameters(fit_output : FitFunction): | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| return par, err, cov | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| def create_df_kr_map(bins : Tuple[np.array, np.array], | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| counts : np.array, | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| n_min : int, | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| r_max : float)->pd.DataFrame: | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| ''' | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| This function creates the dataframe in which the map parameters are stored. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Parameters | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| ---------- | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| fittype : KrFitFunction | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Chosen fit function for map computation | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| bins : Tuple[np.array, np.array] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Tuple containing bins in both axis | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Tuple containing bins in both axis | |
| Tuple containing bins in both axes |
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
blank line
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| geom_comb = list(itertools.product(b_center[0], b_center[1])) | |
| r_values = np.array([np.sqrt(x**2+y**2)for x, y in geom_comb]) | |
| geom_comb = np.array(list(itertools.product(*b_center)) | |
| r_values = np.sum(geom_comb**2, axis=1)**0.5 |
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Following the suggestion above...
| X = list(zip(*geom_comb))[0], | |
| Y = list(zip(*geom_comb))[1], | |
| X = geom_comb.T[0], | |
| Y = geom_comb.T[1], |
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can add these to the dictionary above since you defined r_values and counts
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not accept two arrays instead of a tuple of arrays? here and in create_df_kr_map.
A couple of tests for this function could be:
- Check the size of the output arrays
- Check that the sum of counts corresponds to the number of events in the dst
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| Parameters | |
| -------------- | |
| dst : pd.DataFrame | |
| Krypton dataframe. | |
| bins : Tuple[np.array, np.array] | |
| Bins used to compute the map. | |
| Returns | |
| ------------- | |
| counts : np.array | |
| Total number of events falling into each bin | |
| bin_labels : np.array | |
| 1D bin label for each event | |
| Further info: | |
| ----------------- | |
| Parameters | |
| ---------- | |
| dst : pd.DataFrame | |
| Krypton dataframe. | |
| bins : Tuple[np.array, np.array] | |
| Bins used to compute the map. | |
| Returns | |
| ------- | |
| counts : np.array | |
| Total number of events falling into each bin | |
| bin_labels : np.array | |
| 1D bin label for each event | |
| Further info: | |
| ------------- |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| counts = counts.flatten() | |
| bin_indexes -= 1 | |
| bin_labels = np.ravel_multi_index(bin_indexes, dims=(n_xbins, n_ybins), | |
| mode='clip', order = 'F') | |
| return counts, bin_labels | |
| # indexes 0 and len+1 represent underflow and overflow, thus we subtract 1 | |
| bin_labels = np.ravel_multi_index(bin_indexes - 1, dims=(n_xbins, n_ybins), | |
| mode='clip', order = 'F') | |
| return counts.flatten(), bin_labels |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if I understand correctly mode="clip" means that bins out of range will be assigned the closest bin. Is this what we used to do @bpalmeiro ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In our case, we had three values for "fiducial":
-
The R max for the selection (just a quality one to ensure everything was inside)
-
The R max for the map, that defines the maximum extension of it. In this case, if the event didn't happen to fall inside the bins, it was disregarded.
-
Also (not sure here it's the most appropriate place to mention, tho), we had a posterior step where the peripheral bins (those further than a fiducial r -but within said Rmax for map production-) were set to nan.
The latter was used to ensure that, in the posterior analysis, all the hits reconstructed outside the volume were automatically flagged as nan.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That said, the best approach would be to disregard events outside boundaries instead of fakely merging them in the border bins, even if we rely on a previous selection. It's not a very strong opinion, but I think it adds redundancy (and, therefore, a bit of robustness) to the process.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, my bad... I assumed that there would not be any events outside the boundaries here since I was expecting a "clean" dst in this part of the code, so when I defined bin_labels like that I didn't think of the consequences in case some event was actually out of the range.
I am thinking in a solution like this one:
| counts = counts.flatten() | |
| bin_indexes -= 1 | |
| bin_labels = np.ravel_multi_index(bin_indexes, dims=(n_xbins, n_ybins), | |
| mode='clip', order = 'F') | |
| return counts, bin_labels | |
| counts = counts.flatten() | |
| bin_indexes -= 1 | |
| valid_mask = ( in_range(bin_indexes[0], 0, n_xbins, right_closed=True) & | |
| in_range(bin_indexes[1], 0, n_ybins, right_closed=True) ) | |
| bin_labels = np.full(bin_indexes.shape[1], fill_value=np.nan) | |
| bin_labels[valid_mask] = np.ravel_multi_index((bin_indexes[0, valid_mask], | |
| bin_indexes[1, valid_mask]), | |
| dims=(n_xbins, n_ybins), | |
| order='F') |
In case some events are outside the desired range, their label would be a NaN instead of a numerical index, so when grouping the events bin by bin, none would be assigned to those events...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| res = y - fit_output.fn(x); std = res.std() | |
| res = y - fit_output.fn(x) | |
| std = res.std() |
I think I told you to make it a single line, but I didn't realize you needed the two variables. In any case, this function is so trivial, that if it is only used in fit_and_fill_map, you can remove it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can also remove this function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this per one or percent? The docsstring suggests per one, the usage suggests percent. I suggest renaming it to make it more clear (validity_ratio or validity_percent)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, need tests for this function
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand correctly, this is a pd.Series, not a pd.DataFrame
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is not tested
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| dst_bin = dst.query(f'bin_index == {map_bin.bin}') | |
| dst_bin = dst.loc[dst_bin.bin_index == map_bin.bin] |
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no longer a full_output parameter and the output is only fit_output
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tests. At least:
- check it doesn't modify the input
- create a silly map with ones and a few "invalid entries", then check the invalid entries are restored to 1.
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| new_map = copy.deepcopy(maps) | |
| new_map = maps.copy() |
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
blank line
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is no longer a parameter. Remove blank line before.