-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
New: Interactive Segmentor Engine #2
base: develop
Are you sure you want to change the base?
Conversation
NuClick architecture, infer_batch
Renamed some parameters and added docstring
Create interactive_segmentation dataset. |
Fixed bug
Override __len__() function
Changed labels
Tidy up code
Improve doc string
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi Jiaqi,
Sorry for the late review, I'm experiencing three super busy weeks. I hope my late response has not stopped you from further development.
Anyway, Thanks for this code, it seems to do the job. However, there are some major comments from me that need to be answered or addressed before going forward.
raise ValueError(f"`{mode}` is not supported.") | ||
|
||
self.img_path = img_path | ||
self.label = label |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is the label?
if it's the label of the patch. include it in the docstring.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's the label of the input image. I'll include it in the docstring.
However, we might not need it at all?
return data | ||
|
||
|
||
def get_boundingBox(self, idx): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
get_bounding_box
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see the reason for writing this bit as a separate method unless it is written in a @staticmethod
form to be used later in other modules as well. My suggestion is to merge this bit into the init method with good documentations (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have this as a function so we only compute a bounding box for a click when we are loading the actual patch in getitem.
Sure. But does that mean we compute all the bounding boxes in init and store all of them? Will it be a memory overhead? Or is it ok?
|
||
return bounds | ||
|
||
def get_exclusionMap(self, idx, boundingBox): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function has a good property that it does not need saving a big array in RAM but its computationally extensive. Imagine we have 100000 points in the point list, each time that __get_item__
method is called, this function also calls a for loop over all 99999 other cells, which is not optimized I believe. There are some ways to avoid this:
1- Use the original NuClick approach: creating a big array of click_point
and crop the bounding box region from it with each get_item . This way you get rid of the internal for loop and it's fast. Disadvantage is that it might require at max 500 MB of RAM for big whole slide image. (which I'm fine with)
2- This approach is to try vectorizing the inner for loop. Instead of going over all otherPoints
in a for loop, try using array filtering, for example:
xy_locations = np.concat([x_locations, y_locations], axis=1)
sel = np.any(xy_locations > boundingBox[:2], axis=1)
sel |= np.any(xy_locations < boundingBox[2:], axis=1)
filtered_xy_locs = xy_locations[sel]
exclusionMap[0, filtered_xy_locs[:, 1]-boundingBox[1], filtered_xy_locs[:, 0]-boundingBox[0]] = 1
To be honest, I just write this snippet in the comments and haven't checked if it works, but there should be a similar trick out there to avoid writing the inner for loop.
3- The smartest way maybe is to go with KDtree algorithm. I'm not an expert with this concept, but this might be the solution to both speed and memory efficiency. For example, in the init method you form the KDtree to find the K nearest neighbors of each point and for each point when you are forming the exclusion map, you somehow extract the nearest neighbors that fall inside the same bounding box as well.
All and all, my suggestion is this: check for a high number of point annotations (like random 100000 points) and see how long does it take with your current exclusionMap creation function. If it was considerable, neglect it and go with the classic way that we are doing in orifinal NuClick i.e., creating a big click_map.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have changed the code to use the second approach. I think this is good enough?
Removed parameter: 'mode'
Added 'generate_inst_dict' function. Edited 'post_proc' function.
No description provided.