-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
External iterators: for language bindings #33
Comments
+1 |
e.g. in SpOdgi we do the following def handles(self):
nodeId = self.odgi.min_node_id()
maxNodeId = self.odgi.max_node_id()
while (nodeId <= maxNodeId):
if(self.odgi.has_node(nodeId)):
nodeId=nodeId+1
yield self.odgi.get_handle(nodeId-1) Which is the only way to iterate over all nodes in lazy way. But less than ideal performance wise. |
There is for_each_handle that does the same thing. Is that not working?
…On Thu, Nov 21, 2019, 18:59 JervenBolleman ***@***.***> wrote:
e.g. in SpOdgi we do the following
def handles(self):
nodeId = self.odgi.min_node_id()
maxNodeId = self.odgi.max_node_id()
while (nodeId <= maxNodeId):
if(self.odgi.has_node(nodeId)):
nodeId=nodeId+1
yield self.odgi.get_handle(nodeId-1)
Which is the only way to iterate over all nodes in lazy way. But less than
ideal performance wise.
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
<#33?email_source=notifications&email_token=AABDQEPMWJ5WWIVXTP4EG4TQU3D7JA5CNFSM4JGEY6WKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEE3DV5I#issuecomment-557202165>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AABDQEIFRUK5CBW72HTVK6TQU3D7JANCNFSM4JGEY6WA>
.
|
@ekg no because you can't jump So you end up having to either collect all the nodes so you can iterate over them python style at humongous memory costs. Or one has a silly iterator implementation like above. |
So I came by today to open this same issue again :) it would be really nice to have std::iterator for edges,handles and paths. |
Would it be sufficient if we implemented something like the |
Yes, I _really_ think so 💯
…On Thu, Jun 23, 2022, 18:04 Jordan Eizenga ***@***.***> wrote:
Would it be sufficient if we implemented something like the scan_path
method for edges?
https://github.com/vgteam/libhandlegraph/blob/master/src/include/handlegraph/path_handle_graph.hpp#L161-L164
—
Reply to this email directly, view it on GitHub
<#33 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAHQYFO4YQYKIY4ZTGQBEKDVQSDKRANCNFSM4JGEY6WA>
.
You are receiving this because you authored the thread.Message ID:
***@***.***>
|
It's a tricky thing, because we haven't exposed an object that has knowledge about where it is in the container the way an iterator does (which the exception of Thoughts? |
I know. Copying and then iterating is not an option beyond test cases. I really need something new here, sorry. I have one more horrid alternative and that is to spin up a thread and use a small shared queue to load items in a buffer and iterate over that. Still we can't use raii there and will need to free the copied objects. |
In the case of handles (and other stuff, probably), xg and PackedGraph store them in arrays. So a get_ith_handle() type method would be possible and efficienit (as would, by extension, supporting iterators that count i internally). I don't think you could get random access from HashGraph without some re-engineering, but I guess it could in theory expose its STL iterator (or some kind of wrapper for it). |
I just need a STL style iterator, no need for fast random access. |
I made a quick prototype of what a hack-y vector-based implementation could look like. Again, I have some concerns about efficiency for iterating over the entire graph, but this would probably work fine for traversing edges: |
The current handlegraph api, uses the internal iterator approach (e.g.
for_each_edge
etc.) This is very inconvenient when trying to use this code from python. The only way to move data from inside the iteratee to the outside is by running the iteration in a different thread than the major python code and pass out the required information using a shared blocking queue. As you can imagine this is a pain to program.The python approach, like modern java is to use generators/streams instead. This works well in these languages because memory ownership is well defined. For C++ I imagine this approach was taken to make it easier to keep track of who owns what memory. Unfortunately, the push approach inherent in this solution is a pain to use from languages which like to use a pull model.
Could we have standard C++ iterators for each iteratee method?
The text was updated successfully, but these errors were encountered: