Upcoming Insights Package #357
Comments
I really like this approach. There could be multiple reasons why decoupling the plotting from the calculation of the SHAP values might be preferred:
Thanks, @Scienfitz, for the nice summary! I think it is very difficult to provide a definitive answer to all your questions at the current point since there are still so many unclear parts, like
Anyway, only time will show us what is needed, so I think let's just start with one or two methods in mind and adapt from there. So regarding your questions:
Dear @Scienfitz and @AdrianSosic, I have taken my shot at implementing feature importance analysis in PR #391 for the new diagnostics package. The workflow and separation of computation and visualization are highly similar to the original SHAP package. Hence, the plotting is implemented by passing the [...]. I am looking forward to discussing this further, I hope this helps :)!
Hi @Alex6022, let me just quickly also say "Hi" and thank you for your work :) Since you, @Scienfitz and @AdrianSosic already discussed here and in #391, I will leave the majority of the discussion to you guys and step in if I have the feeling that I can contribute anything (or if an additional opinion is required in case there is disagreement).
@Alex6022 thanks so much for providing the PR. @AdrianSosic and I could not yet agree on all aspects, in particular the design of the actual explainer and how to handle its variance regarding the different available shap explainers etc. Overall, we don't want to couple it to [...]. But below please find other aspects that we saw that you could already work on:
PS: I might update this post here with more boxes, you can simply tick them off to remember for yourself and signal to us
Hello @Scienfitz, thank you very much for this list. I’ve addressed most of the points, except for the SHAP explanation error with categorical parameters, which I haven’t been able to replicate in a Python 3.12.4 environment with only [...]. Please also let me know if you have anything else to add to the list at the moment, I'd be happy to work on it.
@Alex6022 great, thanks for working through the list. Please forget my point about the exp rep: it wasn't the result of a test or so, but a connection I made based on another post somewhere else (that might have turned out to be not relevant). I pushed some commits to your branch fixing minor things. I've also changed a few things in the test file, perhaps you can draw some inspiration from that regarding parameterization. I also left some more comments in a review. I have two main questions to discuss with you
Dear @Scienfitz, thank you for the feedback, I have described the changes based on your comments in the PR #391. Regarding the questions:
closed via #391
Update and ToDo's 22.11.2024
- [...] `insights` instead of `diagnostics`. At the heart of the new `Insight` class would be the surrogate model as attribute, i.e. one would initialize it via the constructor like `Insight(my_surrogate_model)` or `Insight(campaign.get_surrogate())`.
- [...] `Insight.from_campaign()` and `Insight.from_recommender()` methods, which are just wrappers to extract the surrogate model. Check out this design pattern used somewhere else, e.g. here.
- [...] `SHAPInsight` derived from the base class `Insight`. It would implement all logic to initialize and extract shap correctly based on the surrogate model. We already have another class `FitInsight` in mind, providing traditional goodness of fit metrics, or `OutlierInsight`. For the shap implementations we don't have to deal with those, just saying to provide context for this design choice.
- [...] `plot` method. It would accept any of the valid plotters from the shap library, with a default choice set. From the current implementation it seems that most plots would work out of the box, so that the shap plotter object can simply be called. Just for one plotter type, some additional work had to be done. This could be solved via `singledispatchmethod`. The main point is that `SHAPInsight` has just one `.plot(plot_type)` method which executes the correct thing based on the requested plotter type.
- [...] `explainer` and `explanation` structures become obsolete and somewhat absorbed into the `SHAPInsight` class, make sure to check that [...]

Original 30.06.2024: Opening this issue to collect and plan around the upcoming diagnostics package.
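The class layout proposed in the 22.11.2024 update could be sketched roughly as follows. This is only an illustration under assumptions: `ToyCampaign`, `BarPlot` and `ScatterPlot` are hypothetical stand-ins (not objects of this package or of shap), the plot methods return a description string instead of drawing anything, and the private `_plot` helper is just one possible way to combine a default plotter with `functools.singledispatchmethod`.

```python
from functools import singledispatchmethod


class ToyCampaign:
    """Hypothetical stand-in for a campaign that exposes `get_surrogate()`."""

    def __init__(self, surrogate):
        self._surrogate = surrogate

    def get_surrogate(self):
        return self._surrogate


class BarPlot:
    """Hypothetical plotter type that needs no extra handling."""


class ScatterPlot:
    """Hypothetical plotter type that needs additional input (a feature name)."""

    def __init__(self, feature):
        self.feature = feature


class Insight:
    """Base class holding the surrogate model as its central attribute."""

    def __init__(self, surrogate):
        self.surrogate = surrogate

    @classmethod
    def from_campaign(cls, campaign):
        # Thin wrapper: only extracts the surrogate and calls the constructor.
        return cls(campaign.get_surrogate())


class SHAPInsight(Insight):
    """Insight with a single public `plot` entry point."""

    def plot(self, plotter=None):
        # Fall back to a default plotter when none is requested.
        return self._plot(BarPlot() if plotter is None else plotter)

    @singledispatchmethod
    def _plot(self, plotter):
        raise NotImplementedError(f"Unsupported plotter: {type(plotter).__name__}")

    @_plot.register(BarPlot)
    def _(self, plotter):
        # A real implementation would call the shap plotter here.
        return "bar plot over all features"

    @_plot.register(ScatterPlot)
    def _(self, plotter):
        # The one plotter type that needs special treatment.
        return f"scatter plot for feature '{plotter.feature}'"


insight = SHAPInsight.from_campaign(ToyCampaign(surrogate="fitted-model"))
print(insight.plot())
print(insight.plot(ScatterPlot("temperature")))
```

A `from_recommender` classmethod would follow the same pattern as `from_campaign`; it is left out here because the way a recommender exposes its surrogate is not specified in this thread.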
After #355 is merged, the last fitted surrogate model will be available. Since we deal with Bayesian models, our model exposes a `posterior` method. Applying `.mean` turns this into a model essentially comparable to standard predictive models. Our last refactoring ensured that the input to that model can be in experimental representation, i.e. the same as accepted by `add_measurements`, i.e. it can include the unencoded labels etc.
Preliminary Example
@AdrianSosic already shared how to utilize this in the PR, for easier access I will copy the crucial part of his example here:
@brandon-holt @Alex6022 tagging you here so you have this simple example to go already after #355 is merged. Questions or feedback on this application should ideally be collected here and not in the PR.
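For readers without access to the PR, the idea of collapsing the posterior to its mean, so that the surrogate behaves like an ordinary predictive model, can be sketched as follows. This is not the snippet from the PR: `ToySurrogate`, `ToyPosterior` and `as_point_predictor` are hypothetical stand-ins, and only the `posterior(...)` followed by `.mean` pattern is taken from the description in this issue.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

Row = Dict[str, object]  # one experiment in experimental representation


@dataclass
class ToyPosterior:
    """Stand-in for a posterior object; only `.mean` is modeled here."""

    mean: List[float]


class ToySurrogate:
    """Hypothetical stand-in for a fitted Bayesian surrogate model."""

    _solvent_effect = {"water": 0.0, "ethanol": 1.5}

    def posterior(self, rows: Sequence[Row]) -> ToyPosterior:
        # Accepts unencoded labels, mimicking the experimental representation.
        means = [
            0.1 * float(row["temperature"]) + self._solvent_effect[str(row["solvent"])]
            for row in rows
        ]
        return ToyPosterior(mean=means)


def as_point_predictor(surrogate) -> Callable[[Sequence[Row]], List[float]]:
    """Wrap a surrogate so that it returns plain point predictions."""
    return lambda rows: surrogate.posterior(rows).mean


predict = as_point_predictor(ToySurrogate())
rows = [
    {"temperature": 20, "solvent": "water"},
    {"temperature": 50, "solvent": "ethanol"},
]
print(predict(rows))
```

A function of this shape is the kind of object that could be handed to a model-agnostic explainer as its `model` argument, with the data in experimental representation as the second argument.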
Turning this into a diagnostics package
Since the very start of this package we had requests for diagnostics, SHAP is one of them, but more traditional users might also want to look at traditional metrics such as goodness of fit etc etc.
Essentially we are now proposing to turn the above code into a subpackage that can be used like this:
Note that SHAP explainers seem to have a common interface `(model, data)`, which means we can allow any shap explainer importable via `shap.explainers` and `shap.explainers.other`. The latter even offers non-shap methods such as LIME and MAPLE via the same interface.
Conversely, we could have things like
Open Questions
For @AdrianSosic as he is going on longer absence, it would be good to have your definitive decisions on these questions before that.
- [...] `shap` but probably others for the non-shap methods (that might still come as part of the shap package, see above). I would group all of those into a new optional dependency group `diagnostics`, or do we need that more fine-grained, or even let it fail at runtime?
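One way to combine a single optional dependency group with a clear failure at runtime is to check for the optional module at the point of use. A minimal sketch, assuming a hypothetical helper `require_optional` (not an existing function of this package) and reusing the group name `diagnostics` from the question above:

```python
import importlib
import importlib.util


def require_optional(module_name: str, extra: str):
    """Import a top-level module or raise an error naming the optional group."""
    # `find_spec` returns None when a top-level module is not installed.
    if importlib.util.find_spec(module_name) is None:
        raise ImportError(
            f"'{module_name}' is required for this feature. "
            f"Install the optional dependency group '{extra}' to enable it."
        )
    return importlib.import_module(module_name)


# Usage with a standard-library module, so the sketch runs anywhere.
json_module = require_optional("json", extra="diagnostics")
print(json_module.dumps({"ok": True}))
```

With this pattern, a more fine-grained split into several groups would only change the `extra` string passed at each call site.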