Improve Handling of Noise Points in Clustering Algorithms (Fixes #152) #200
Pull Request Description
Title: Improve Handling of Points Clustered as Noise
Related Issue: Fixes #152
Issue URL: Improve the handling of points clustered as noise
Summary
This pull request addresses issue #152, which focuses on enhancing the handling of noise points produced by clustering algorithms, specifically DBSCAN. Previously, noise points were labeled as -1 and removed from further analysis, which could lead to loss of potentially valuable information. This update introduces an improved mechanism for managing these noise points, allowing for various user-defined approaches.
Changes Made
Created NoiseHandlingClustering Class:
[...] -1 and are not considered in analysis.
Updated Functionality:
The mapper_connected_components function was modified to incorporate the new clustering strategies, while ensuring that the default behavior remains intact for backward compatibility.
Testing and Validation:
[...] -1 labels for noise points are preserved when selected.
Next Steps
Further testing will be conducted with larger datasets to ensure consistent performance and functionality across varied scenarios. Feedback from team members regarding additional test cases or potential edge cases is welcome.
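To make the discussion of edge cases concrete, here is a minimal, self-contained sketch of how noise labels might be post-processed. It is not the code in this PR: the function name `handle_noise` and the strategy names `"drop"`, `"group"`, and `"singleton"` are hypothetical and chosen only for illustration. The only thing taken from the description above is that noise points carry the label -1 and that this label can be preserved.

```python
from typing import List, Sequence

NOISE = -1  # label that DBSCAN-style algorithms assign to noise points


def handle_noise(labels: Sequence[int], strategy: str = "drop") -> List[int]:
    """Relabel noise points according to a hypothetical strategy name.

    "drop":      leave noise as -1 so downstream code can ignore it
    "group":     collect all noise points into one additional cluster
    "singleton": give every noise point its own new cluster
    """
    labels = list(labels)
    # First unused cluster id; 0 when every point is noise or input is empty.
    next_label = max((l for l in labels if l != NOISE), default=-1) + 1

    if strategy == "drop":
        return labels
    if strategy == "group":
        return [next_label if l == NOISE else l for l in labels]
    if strategy == "singleton":
        relabeled = []
        for l in labels:
            if l == NOISE:
                relabeled.append(next_label)
                next_label += 1
            else:
                relabeled.append(l)
        return relabeled
    raise ValueError(f"unknown noise strategy: {strategy!r}")


labels = [0, 0, -1, 1, -1]
print(handle_noise(labels, "drop"))       # [0, 0, -1, 1, -1]
print(handle_noise(labels, "group"))      # [0, 0, 2, 1, 2]
print(handle_noise(labels, "singleton"))  # [0, 0, 2, 1, 3]
```

Edge cases that may be worth covering in additional tests include input where every point is noise, input with no noise at all, and empty input.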
Thank you for considering this enhancement to improve the handling of noise points in our clustering implementations.
Please let me know if there are any questions or if further adjustments are needed for this pull request!