Releases: aws-samples/amazon-textract-textractor
Version 1.7.11
What's Changed
- Add figure layout prefix and suffix by @Belval in #362
- Add confidence scores at the DocumentEntity level by @Belval in #363
Full Changelog: v1.7.10...v1.7.11
Version 1.7.10
What's Changed
- Use AWS_REGION and AWS_DEFAULT_REGION environment variables in Textractor when available
- Fix missing figure layouts
Full Changelog: v1.7.9...v1.7.10
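Note on the region change in 1.7.10: the sketch below shows one way a region could be picked up from the environment when none is passed explicitly. It is not Textractor's actual code. The helper name resolve_region, its parameters, and the precedence (AWS_REGION before AWS_DEFAULT_REGION) are assumptions made for illustration.

```python
import os
from typing import Mapping, Optional


def resolve_region(
    explicit_region: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Hypothetical helper: choose a region from an argument or the environment."""
    # Fall back to the real process environment unless a mapping is supplied.
    env = os.environ if env is None else env
    # An explicitly passed region always wins.
    if explicit_region:
        return explicit_region
    # Assumed precedence: AWS_REGION first, then AWS_DEFAULT_REGION.
    for name in ("AWS_REGION", "AWS_DEFAULT_REGION"):
        value = env.get(name)
        if value:
            return value
    # Nothing configured; the caller decides what to do next.
    return None
```

Passing env as a plain dict keeps the helper easy to exercise without touching the real environment, for example resolve_region(env={"AWS_DEFAULT_REGION": "eu-west-1"}).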
Version 1.7.9
Version 1.7.8
Version 1.7.7
What's Changed
Full Changelog: v1.7.6...v1.7.7
Version 1.7.6
Version 1.7.5
What's Changed
- Make KeyValue.key an EntityList by @Belval in #320
- Remove numpy from explicit dependencies by @Belval in #324
- Hide key value layouts by @Belval in #325
- Return query and query answer with get_text() by @Belval in #329
- Convert image to RGB in EntityList for Jupyter compatibility by @Belval in #330
- Support for Python 3.12 by @tb102122 in #311
Full Changelog: v1.7.4...v1.7.5
Version 1.7.4
What's Changed
- Fix table title .get_text() by @Belval in #314
- Fix .to_pandas() raising an exception by @Belval in #315
Full Changelog: v1.7.3...v1.7.4
Version 1.7.3
What's Changed
- Table linearization improvements by @Belval in #313
  - Add .get_text(), .to_html() and .to_markdown() functions to Linearizable, which is now implemented by Document, Page, DocumentEntity and EntityList
  - Add HTMLLinearizationConfig and MarkdownLinearizationConfig as pre-configured TextLinearizationConfig
  - Add the following parameters to TextLinearizationConfig
    - duplicate_text_in_merged_cells duplicates the text in merged cells to preserve row-level alignment
    - table_flatten_headers combines multi-row headers into a single row, duplicating the merged cells horizontally as needed
    - table_tabulate_remove_extra_hyphens removes extra hyphens '-' in markdown tables to reduce context length
    - max_number_of_consecutive_spaces defines the maximum number of contiguous whitespace characters, similar to max_number_of_consecutive_new_lines
- Add 
- Fixes:
New Contributors
Full Changelog: v1.7.2...v1.7.3
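Note on the 1.7.3 linearization parameters: the sketch below illustrates, in plain Python, the behavior described for max_number_of_consecutive_spaces and duplicate_text_in_merged_cells. It does not call Textractor and is not the library's implementation. The function names, the row representation (None marking a cell covered by a horizontal merge), and the restriction to space characters are assumptions made for illustration.

```python
import re
from typing import List, Optional


def limit_consecutive_spaces(text: str, max_spaces: int) -> str:
    """Collapse any run of spaces longer than max_spaces down to max_spaces."""
    # Match runs of at least max_spaces + 1 spaces (spaces only, for simplicity).
    pattern = r" {" + str(max_spaces + 1) + r",}"
    return re.sub(pattern, " " * max_spaces, text)


def duplicate_merged_cells(row: List[Optional[str]]) -> List[str]:
    """Repeat the text of a merged cell into the columns it spans.

    Assumed representation: None marks a cell covered by a merge that
    started in an earlier column of the same row.
    """
    filled: List[str] = []
    last = ""
    for cell in row:
        if cell is not None:
            last = cell
        # Covered cells receive the most recent real cell text.
        filled.append(last)
    return filled


compact = limit_consecutive_spaces("Total     due  now", 2)
header = duplicate_merged_cells(["Q1", None, "Q2", None])
```

After duplication every column in the header row carries its own label, which is the row-level alignment the changelog entry refers to.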
Version 1.7.2
What's Changed
- Fix for page objects not always having an image attached, causing an exception on .visualize()
Full Changelog: v1.7.1...v1.7.2