The extracted table box coordinates do not correspond to the images converted from the PDF #486
Comments
Curious to know how you get this exact value of
Answer to question 1: camelot obtains the box coordinates through the pdfminer package, whose default resolution is 72 (I forgot where I saw it), but the image that camelot produces in the read_pdf function uses a default resolution of 300. Line 93 in cd8ac79
Answer to question 2: You can try other values.
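To make the 72 vs. 300 relationship concrete, here is a minimal sketch of the conversion from PDF coordinates to image pixel coordinates. It is not code from camelot. The function name `pdf_bbox_to_image_bbox` and the `page_height_pt` parameter are hypothetical, and it assumes the box is given as (x1, y1, x2, y2) in PDF points with the origin at the bottom-left of the page, while the image uses pixels with the origin at the top-left.

```python
PDF_DPI = 72  # PDF coordinates are expressed in points, 72 per inch


def pdf_bbox_to_image_bbox(bbox, page_height_pt, image_dpi=300):
    """Convert a PDF-space box to image pixel coordinates.

    bbox: (x1, y1, x2, y2) in PDF points, origin at bottom-left (assumption).
    page_height_pt: page height in PDF points (assumption: caller supplies it).
    image_dpi: resolution the page image was rendered at (300 in this thread).
    Returns (left, top, right, bottom) in pixels, origin at top-left.
    """
    x1, y1, x2, y2 = bbox
    # Horizontal axis: only the scale changes.
    left = x1 * image_dpi / PDF_DPI
    right = x2 * image_dpi / PDF_DPI
    # Vertical axis: flip first (PDF y grows upward, image y grows downward),
    # then scale.
    top = (page_height_pt - y2) * image_dpi / PDF_DPI
    bottom = (page_height_pt - y1) * image_dpi / PDF_DPI
    return (left, top, right, bottom)
```

If the image was rendered at a resolution other than 300, pass that value as `image_dpi`; a mismatch between the assumed and actual resolution is one way the boxes end up misaligned.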
@SWHL This really helped me understand the conversion. However, I have a similar problem: I have the coordinates of an object taken from a page image (the PDF page was converted into an image), and I want to convert these coordinates into camelot's PDF-level coordinates. I tried to follow the logic above in reverse order, without success.
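For the reverse direction described here, a minimal sketch follows. It is not the code posted in this thread. The function name `image_bbox_to_pdf_bbox` and the `image_height_px` parameter are hypothetical, and it assumes the image box is (left, top, right, bottom) in pixels with the origin at the top-left, that the image covers the full page, and that the target is (x1, y1, x2, y2) in PDF points with the origin at the bottom-left.

```python
PDF_DPI = 72  # PDF coordinates are expressed in points, 72 per inch


def image_bbox_to_pdf_bbox(bbox_px, image_height_px, image_dpi=300):
    """Convert an image-space box back to PDF-space coordinates.

    bbox_px: (left, top, right, bottom) in pixels, origin at top-left (assumption).
    image_height_px: height of the full page image in pixels.
    image_dpi: resolution the page image was rendered at (assumed 300).
    Returns (x1, y1, x2, y2) in PDF points, origin at bottom-left.
    """
    left, top, right, bottom = bbox_px
    # Horizontal axis: scale pixels down to points.
    x1 = left * PDF_DPI / image_dpi
    x2 = right * PDF_DPI / image_dpi
    # Vertical axis: flip within the image height, then scale to points.
    # The image bottom edge becomes the lower PDF y, the top edge the upper.
    y1 = (image_height_px - bottom) * PDF_DPI / image_dpi
    y2 = (image_height_px - top) * PDF_DPI / image_dpi
    return (x1, y1, x2, y2)
```

The result only lines up with camelot's coordinates if `image_dpi` matches the resolution actually used to render the image and the vertical flip is applied; skipping either step yields boxes that look unrelated to the PDF-level ones.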
@baleris You can try it like this:
@SWHL, this did not work. When I checked the table coordinates that camelot detected, they were totally different. For example, for the coordinates mentioned above, camelot's corresponding coordinates are (72.0, 295.2, 563.04, 648.72)
@SWHL I see that in your solution above you are getting a page image from Any suggestions for getting the image with the "stream" parameter (borderless tables)?
You can refer to this: Lines 35 to 40 in cd8ac79
That question is beyond the scope of this issue. I suggest opening a new issue to discuss it.
Checklist
Describe the bug
Environment
OS: CentOS 7
Python: 3.7.11
camelot-py: 0.10.1
Reproduction
Bug fix