Skip to content

Conversation

@henriquelettieri
Copy link
Collaborator

Implementation of 3D inegral for impure cluster counts

"""Makes mass distribution use additional integral with completeness"""
if self.purity is None:
"""Makes mass distribution use additional integral with purity"""
if self.purity is None:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is good. Because if there is purity, we have to change both the mass distribution and the purity function. I just don't think it is a good name to say then that the mass distribtuion impure is mass_dist * purity. But just a naming problem

self._mass_distribution_distribution = self._impure_mass_distribution

def _impure_mass_distribution(self, log_mass, z, log_mass_proxy_limits):
def _impure_mass_distribution(self, log_mass, z, log_mass_proxy):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This here is the big problem for all the functions right? Because the impure does not take only a tuple but an NDArray of log_mass_proxy


return theory_prediction

def _get_theory_prediction_impure_counts(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is right, it works, but I do not like this. Because we are repeating so much code with all these functions. And if you go with these option, you will also have to duplicate everything for shear. My solution:

  • You make a get theory_prediction_counts that returns a callable like the impure one:
Callable[
        [npt.NDArray[np.float64], npt.NDArray[np.float64], npt.NDArray[np.float64], float],
        npt.NDArray[np.float64],

However:

            prediction = (
                self.cluster_theory.comoving_volume(z, sky_area)
                * self.cluster_theory.mass_function(mass, z)
                * self._completeness_distribution(mass, z)
                * self.redshift_distribution.distribution()
            )
   If purity is none:
         assert(mass_proxy.dim == 2) # massproxy is an NDArray with 2 values
         prediction *=   self._mass_distribution_distribution(mass, z, (mass_proxy[0], mass_proxy[1]))
     else:
     call with the NDarray

) -> npt.NDArray[np.float64]:
mass = int_args[:, 0]
z = int_args[:, 1]
mass_proxy = int_args[:, 2]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then here you check again if there is purity or not. If there is,mass_proxy is like this, an int arg. If there isnt, mass proxy is an extra arg, but continue to be an NDArray

@eduardojsbarroso
Copy link
Collaborator

@henriquelettieri If you apply the changes I did, you will only need one function for counts and one for shear.

@codecov
Copy link

codecov bot commented Dec 12, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 98.36%. Comparing base (a7b4813) to head (efbd70d).

Additional details and impacted files
@@            Coverage Diff             @@
##             main      #66      +/-   ##
==========================================
+ Coverage   98.27%   98.36%   +0.09%     
==========================================
  Files          28       28              
  Lines        1677     1770      +93     
  Branches       51       59       +8     
==========================================
+ Hits         1648     1741      +93     
  Misses         29       29              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

]
self.integrator.extra_args = np.array([*log_proxy_edges, sky_area])
assert len(log_proxy_edges) == 2
assert len(z_edges) == 2
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test should be in get_theorey_prediction and not here. Even if here things are 2d, if the integrator calls the function get_theory_prediction with these intervals, it will give a NDArray with several values, and if purity was not properly set, this can be a problem.

@eduardojsbarroso
Copy link
Collaborator

@henriquelettieri It seems good with this approach. Nice job. Please change the test where it makes more sense, then do the same to shear profile and add tests

@eduardojsbarroso
Copy link
Collaborator

@henriquelettieri is this ready? If so, could you make sure it passes all the tests so I can review?

@henriquelettieri
Copy link
Collaborator Author

@eduardojsbarroso In principle everything is ready, I may add new tests. However the exact purity calculation for the shear is way too slow when using the same method for counts. I will fix the tests today

…nd create test_evaluates_theory_prediction_assertions
…nd create test_evaluates_theory_prediction_assertions
…nd create test_evaluates_theory_prediction_assertions
…nd create test_evaluates_theory_prediction_assertions
…nd create test_evaluates_theory_prediction_assertions
@github-actions github-actions bot changed the title Impure mass distribution exact Impure mass distribution exact [version:1.0.1] Jan 5, 2026
Copy link
Collaborator

@eduardojsbarroso eduardojsbarroso left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Besides the change in how to call the compute with shear, it seems good. However the 10% reldif tells me something is wrong. Also we have to decide if the purity will indeed be included in the shear as I said on slack. @henriquelettieri

deltasigma_list = []
for radius_center in radius_centers:
self.integrator.extra_args = np.array(
[*log_proxy_edges, sky_area, radius_center]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can change here so we avoid repeating the code:

if self.purity == None:
self.integrator.integral_bounds = [
                self.mass_interval,
                z_edges,
            ]
    extra_args = [*log_proxy_edges]
 else:
 if self.purity == None:
self.integrator.integral_bounds = [
                self.mass_interval,
                z_edges,
                log_proxy_edges,
            ]
    extra_args = []
            for radius_center in radius_centers:
                self.integrator.extra_args = np.array(
                    extra_args.extend([sky_area, radius_center]))

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then we dont need to rewrite everything for each case

"version": "3.14.1"
}
},
"nbformat": 4,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you did not add anything to this notebook, just ran it, please restore it before commiting

z_array = np.array([z_scalar])
log_proxy_array = np.array([log_proxy_scalar])
return binned_grid_w_pur._purity_distribution(log_proxy_array, z_array)
return binned_grid_w_pur._purity_distribution(log_proxy_array, z_array).item()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you change this here? We did not change the _purity_distribution function in this PR right? I get it that quad will call this with scalars and we want to return a scalar, but was this not working before?

denom = np.where(pred_exact == 0, 1.0, pred_exact)
assert np.all(np.abs((pred_grid - pred_exact) / denom) < rel_tol)

rel_tol = 1.0e-1
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is to low as relative tolerance. Because in the test with counts, we are able to test this with rel_tol = e-4. There must be either some parameter wrong here or something else.

I tested calling the completeness distrubtion before and after you change the property and it seems to be working fine, but we should investigate this. 10% while the counts works fine, and we can also have e-4 with purity, there must be something wrong here

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants