Replace PyRanges with NCLS to avoid pandas dependency #111

Open
tomwhite opened this issue Dec 3, 2024 · 1 comment

Comments

tomwhite (Contributor) commented Dec 3, 2024

PyRanges uses NCLS (Nested Containment List) internally to implement overlap queries efficiently. Since NCLS just uses NumPy arrays, we could use it directly if we wanted to remove our dependency on pandas.

The two functions that need re-implementing are overlap (https://github.com/pyranges/pyranges/blob/4f0a153336e7153cdfea15b141ce4ea35a24e233/pyranges/pyranges_main.py#L3490-L3628) and subtract (https://github.com/pyranges/pyranges/blob/4f0a153336e7153cdfea15b141ce4ea35a24e233/pyranges/pyranges_main.py#L4802-L4879).
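As a rough illustration of what the two operations compute, here is a minimal NumPy-only sketch. It is not the PyRanges implementation and does not call NCLS: it assumes half-open `[start, end)` intervals on a single contig, ignores strand, and the function names `overlap_mask` and `subtract_intervals` are hypothetical. An NCLS-based version would presumably replace the sorted scans here with index lookups.

```python
import numpy as np


def overlap_mask(starts, ends, other_starts, other_ends):
    """Return a boolean mask that is True where [start, end) overlaps
    at least one interval in the other set (assumed non-empty intervals)."""
    sorted_starts = np.sort(other_starts)
    sorted_ends = np.sort(other_ends)
    # Number of other intervals that start strictly before this interval ends.
    n_started = np.searchsorted(sorted_starts, ends, side="left")
    # Number of other intervals that end at or before this interval starts.
    n_finished = np.searchsorted(sorted_ends, starts, side="right")
    # The difference is the number of overlapping intervals.
    return n_started > n_finished


def subtract_intervals(starts, ends, other_starts, other_ends):
    """Return the pieces of each [start, end) not covered by the other set.

    Simple nested scan for clarity, not speed.
    """
    order = np.argsort(other_starts, kind="stable")
    o_starts, o_ends = other_starts[order], other_ends[order]
    out_starts, out_ends = [], []
    for start, end in zip(starts, ends):
        cursor = start  # left edge of the part not yet accounted for
        for o_start, o_end in zip(o_starts, o_ends):
            if o_end <= cursor:
                continue  # lies entirely to the left of the remaining part
            if o_start >= end:
                break  # sorted by start, so nothing later can overlap
            if o_start > cursor:
                out_starts.append(cursor)
                out_ends.append(o_start)
            cursor = max(cursor, o_end)
            if cursor >= end:
                break
        if cursor < end:
            out_starts.append(cursor)
            out_ends.append(end)
    return (
        np.array(out_starts, dtype=np.int64),
        np.array(out_ends, dtype=np.int64),
    )


starts = np.array([0, 10, 20])
ends = np.array([5, 15, 30])
other_starts = np.array([3, 22, 26])
other_ends = np.array([4, 24, 40])

mask = overlap_mask(starts, ends, other_starts, other_ends)
sub_starts, sub_ends = subtract_intervals(starts, ends, other_starts, other_ends)
```

With these inputs, the first and third intervals overlap something in the other set, and subtraction splits them into the uncovered pieces while leaving the second interval intact. Handling multiple contigs would mean grouping by contig before calling either function, which is part of what makes the real port fiddly.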

While this is certainly doable, it's a bit fiddly, and probably shouldn't block a release.

jeromekelleher (Contributor) commented

OK great, thanks for looking into it. Let's shelve this for a later release when we're looking to slim down our dependencies.
