Tail characters getting stripped off #15
Comments
Hi! Thank you, great to hear that the class helps you! Can you send me (here or through my email) a PDF file which doesn't work?
Independent Auditors Report
This is what the extracted text looks like. If you look closely, a few characters are missing from the words that I have highlighted, and I have also highlighted where the tail characters are getting stripped. I have attached the file as well; the extract above is from page 51. Thanks
Thanks, I am going to investigate this over the weekend.
Thanks a lot! I was wondering if you could explain why certain characters were getting stripped.
You were right, it has to do with this.setCurrentPageWidth(pageRectangle.getWidth());
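For anyone wondering how a page-width setting could cause this, here is a minimal sketch of one plausible mechanism. It is not the actual PDFLayoutTextStripper code: the class name LineBufferSketch, the render method, and the charWidth parameter are all hypothetical and only illustrate the idea. It assumes each glyph is placed into a fixed-size line buffer whose column count is derived from the page width, so any glyph whose x position maps past the last column is silently dropped.

```java
import java.util.Arrays;

public class LineBufferSketch {

    /**
     * Places each glyph into a character grid whose size is derived from the
     * page width. Hypothetical illustration only, not the library's code.
     */
    static String render(char[] glyphs, double[] xPositions,
                         double pageWidth, double charWidth) {
        int columns = (int) (pageWidth / charWidth);
        char[] line = new char[columns];
        Arrays.fill(line, ' ');
        for (int i = 0; i < glyphs.length; i++) {
            int col = (int) (xPositions[i] / charWidth);
            // Glyphs that map past the last column are dropped without warning.
            if (col >= 0 && col < columns) {
                line[col] = glyphs[i];
            }
        }
        return new String(line);
    }

    public static void main(String[] args) {
        char[] glyphs = "Report".toCharArray();
        double[] xs = {0, 6, 12, 18, 24, 30};
        // Page width covers all six glyphs.
        System.out.println("[" + render(glyphs, xs, 36.0, 6.0) + "]"); // prints [Report]
        // Page width is too small, so the tail is lost.
        System.out.println("[" + render(glyphs, xs, 24.0, 6.0) + "]"); // prints [Repo]
    }
}
```

If the real cause is along these lines, a width that is smaller than the area the text actually occupies would explain why only the ends of lines are affected while PDFTextStripper, which does not lay text out on a grid, is fine.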
@JonathanLink: About your last commit (88bfd8c): I see it's still in the 'dev' branch and hasn't been merged into 'master'. Is there any reason for that? Also, is there any way to set the page width externally (e.g. by calling pdflayouttextstripper.setPageWidth() or something like that)? Thanks,
I am working with a host of PDF reports. While I am able to maintain the layout using your class, the tail characters sometimes get stripped off, whereas the parent class, PDFTextStripper, works fine.
Does this have anything to do with the following line?
this.setCurrentPageWidth(pageRectangle.getWidth());
By the way, great work with the class. It made the process of extracting tables so easy.