How to get the separate quad(quad_like,rect_like) in PDF using search_for method when targeted letters are contiguous and adjacent to each other #2806
Hi, I would like to know how to get separate quads (quad_like, rect_like) in a PDF using the search_for method when the targeted letters are contiguous and adjacent to each other. By default, search_for(target) returns just one quad (quad_like, rect_like) because the target letters are contiguous and adjacent. I would like to get a separate quad for each letter.
Replies: 4 comments 1 reply
I wonder whether the quad area returned by the search_for method can be made narrower horizontally. If so, could you let me know how to set that? Thank you.
Like the text extraction variants, the search method also supports the [...]. In addition, there is the new parameter [...]. Maybe this helps.
Thank you for the advice. I tried the following on the attached example PDF. The target character is "9".
rect = page.search_for("9", clip=rect_like)
I would appreciate it if you could advise me on how to resolve this issue.
Thank you for the explanation. I need just one character, so I will try another approach. Thank you.
1-char searches are always problematic of course, because characters in a row may have tiny overlaps (created by the PDF maker). What you always do in similar cases is this:
The second hiccup: if the search algorithm finds multiple adjacent copies of the needle, it returns one common rectangle, as visible above.
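For single-character targets, one possible way around the merged rectangle is to skip the search and read per-character boxes from the "rawdict" text extraction instead. The following is only a minimal sketch and not the approach posted in this thread: char_boxes is a hypothetical helper name, and the sample dictionary is a hand-built stand-in that mimics the blocks / lines / spans / chars nesting of page.get_text("rawdict"), so that the snippet runs without a PDF.

```python
def char_boxes(rawdict, needle):
    """Return one (x0, y0, x1, y1) tuple per occurrence of the single character `needle`.

    `rawdict` is assumed to have the layout produced by page.get_text("rawdict"):
    blocks -> lines -> spans -> chars, where each char carries "c" and "bbox".
    """
    boxes = []
    for block in rawdict.get("blocks", []):
        # Image blocks have no "lines" key, so .get() skips them safely.
        for line in block.get("lines", []):
            for span in line.get("spans", []):
                for ch in span.get("chars", []):
                    if ch["c"] == needle:
                        boxes.append(tuple(ch["bbox"]))
    return boxes


# Hypothetical stand-in for page.get_text("rawdict"): two adjacent "9"s and a "1".
sample = {
    "blocks": [
        {"type": 0, "lines": [{"spans": [{"chars": [
            {"c": "9", "bbox": (10.0, 10.0, 16.0, 22.0)},
            {"c": "9", "bbox": (16.0, 10.0, 22.0, 22.0)},
            {"c": "1", "bbox": (22.0, 10.0, 28.0, 22.0)},
        ]}]}]},
        {"type": 1},  # an image block, ignored by the helper
    ]
}

boxes = char_boxes(sample, "9")
print(boxes)  # one box per "9", even though the two are adjacent
```

With a real page, the call would presumably look like char_boxes(page.get_text("rawdict", clip=rect_like), "9"), and each tuple can be wrapped in a Rect if rectangle objects are needed. Because every character keeps its own bbox, adjacent copies of the same character are not merged.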