generated from amazon-archives/__template_MIT-0
new assesment #137
Open: kazmer97 wants to merge 30 commits into develop from feat/new-assesment
+5,770
−5,319
Commits (30):
491d8f0 invoke extraction with long retries (kazmer97)
3d38d1d add review agent model config (kazmer97)
be4d160 fixes (kazmer97)
cbedb14 add review agent model config (kazmer97)
049c4fe new assesment (kazmer97)
1371f29 typed metadata model (kazmer97)
fe56e98 fixes (kazmer97)
7601354 fix tool (kazmer97)
eedd6a8 assesment update (kazmer97)
6a6d610 further streamlining (kazmer97)
076092f fix template (kazmer97)
839e520 missing dep (kazmer97)
79fe3e7 update config model usage (kazmer97)
2e93ad3 strands argument passing update (kazmer97)
338c87f update the tests for the config (kazmer97)
16480da bug fixes (kazmer97)
984ac36 make sure model dump uses json mode (kazmer97)
af89f47 bbox update (kazmer97)
cedf899 memory update (kazmer97)
a92f70e cleanup (kazmer97)
5a066fc fix failing test (kazmer97)
f3ab598 import fix (kazmer97)
29b4773 cleanup: remove artifacts and redundant code from PR review (kazmer97)
958d775 update tests to pass (kazmer97)
049fb74 fixes (kazmer97)
0777e5a fix the ruler offset (kazmer97)
12f3b10 encapsulate ruler (kazmer97)
dd50777 add structured loggin (kazmer97)
e21efb8 improve retry mechanism (kazmer97)
acd8967 small fixes (kazmer97)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | | @@ -405,11 +405,7 @@ assessment: |
| 405 | 405 | image: |
| 406 | 406 | target_height: "" |
| 407 | 407 | target_width: "" |
| 408 | | - granular: |
| 409 | | - enabled: true |
| 410 | | - max_workers: "20" |
| 411 | | - simple_batch_size: "3" |
| 412 | | - list_batch_size: "1" |
| | 408 | + max_workers: "20" |
| 413 | 409 | default_confidence_threshold: "0.8" |
| 414 | 410 | top_p: "0.0" |
| 415 | 411 | max_tokens: "10000" |
| | | @@ -456,107 +452,6 @@ assessment: |
| 456 | 452 | - Provide tight, accurate bounding boxes around the actual text |
| 457 | 453 | </assessment-guidelines> |
| 458 | 454 | |
| 459 | | - <spatial-localization-guidelines> |
Contributor
Again, won't removing this break the current assessment implementation?
| 460 | | - For each field, provide bounding box coordinates: |
| 461 | | - - bbox: [x1, y1, x2, y2] coordinates in normalized 0-1000 scale |
| 462 | | - - page: Page number where the field appears (starting from 1) |
| 463 | | - |
| 464 | | - Coordinate system: |
| 465 | | - - Use normalized scale 0-1000 for both x and y axes |
| 466 | | - - x1, y1 = top-left corner of bounding box |
| 467 | | - - x2, y2 = bottom-right corner of bounding box |
| 468 | | - - Ensure x2 > x1 and y2 > y1 |
| 469 | | - - Make bounding boxes tight around the actual text content |
| 470 | | - - If a field spans multiple lines, create a bounding box that encompasses all relevant text |
| 471 | | - </spatial-localization-guidelines> |
| 472 | | - |
| 473 | | - <final-instructions> |
| 474 | | - Analyze the extraction results against the source document and provide confidence assessments with spatial localization. Return a JSON object with the following structure based on the attribute type: |
| 475 | | - |
| 476 | | - For SIMPLE attributes: |
| 477 | | - { |
| 478 | | - "simple_attribute_name": { |
| 479 | | - "confidence": 0.85, |
| 480 | | - "bbox": [100, 200, 300, 250], |
| 481 | | - "page": 1 |
| 482 | | - } |
| 483 | | - } |
| 484 | | - |
| 485 | | - For GROUP attributes (nested object structure): |
| 486 | | - { |
| 487 | | - "group_attribute_name": { |
| 488 | | - "sub_attribute_1": { |
| 489 | | - "confidence": 0.90, |
| 490 | | - "bbox": [150, 300, 250, 320], |
| 491 | | - "page": 1 |
| 492 | | - }, |
| 493 | | - "sub_attribute_2": { |
| 494 | | - "confidence": 0.75, |
| 495 | | - "bbox": [150, 325, 280, 345], |
| 496 | | - "page": 1 |
| 497 | | - } |
| 498 | | - } |
| 499 | | - } |
| 500 | | - |
| 501 | | - For LIST attributes (array of assessed items): |
| 502 | | - { |
| 503 | | - "list_attribute_name": [ |
| 504 | | - { |
| 505 | | - "item_attribute_1": { |
| 506 | | - "confidence": 0.95, |
| 507 | | - "bbox": [100, 400, 200, 420], |
| 508 | | - "page": 1 |
| 509 | | - }, |
| 510 | | - "item_attribute_2": { |
| 511 | | - "confidence": 0.88, |
| 512 | | - "bbox": [250, 400, 350, 420], |
| 513 | | - "page": 1 |
| 514 | | - } |
| 515 | | - }, |
| 516 | | - { |
| 517 | | - "item_attribute_1": { |
| 518 | | - "confidence": 0.92, |
| 519 | | - "bbox": [100, 425, 200, 445], |
| 520 | | - "page": 1 |
| 521 | | - }, |
| 522 | | - "item_attribute_2": { |
| 523 | | - "confidence": 0.70, |
| 524 | | - "bbox": [250, 425, 350, 445], |
| 525 | | - "page": 1 |
| 526 | | - } |
| 527 | | - } |
| 528 | | - ] |
| 529 | | - } |
| 530 | | - |
| 531 | | - IMPORTANT: |
| 532 | | - - For LIST attributes like "Transactions", assess EACH individual item in the list separately with individual bounding boxes |
| 533 | | - - Each transaction should be assessed as a separate object in the array with its own spatial coordinates |
| 534 | | - - Do NOT provide aggregate assessments for list items - assess each one individually with precise locations |
| 535 | | - - Include assessments AND bounding boxes for ALL attributes present in the extraction results |
| 536 | | - - Match the exact structure of the extracted data |
| 537 | | - - Provide page numbers for all bounding boxes (starting from 1) |
| 538 | | - </final-instructions> |
| 539 | | - |
| 540 | | - <<CACHEPOINT>> |
| 541 | | - |
| 542 | | - <document-image> |
| 543 | | - {DOCUMENT_IMAGE} |
| 544 | | - </document-image> |
| 545 | | - |
| 546 | | - <ocr-text-confidence-results> |
| 547 | | - {OCR_TEXT_CONFIDENCE} |
| 548 | | - </ocr-text-confidence-results> |
| 549 | | - |
| 550 | | - <<CACHEPOINT>> |
| 551 | | - |
| 552 | | - <attributes-definitions> |
| 553 | | - {ATTRIBUTE_NAMES_AND_DESCRIPTIONS} |
| 554 | | - </attributes-definitions> |
| 555 | | - |
| 556 | | - <extraction-results> |
| 557 | | - {EXTRACTION_RESULTS} |
| 558 | | - </extraction-results> |
| 559 | | - |
| 560 | 455 | evaluation: |
| 561 | 456 | enabled: true |
| 562 | 457 | llm_method: |
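
For reviewers weighing what the removed prompt used to produce: it asked for a nested JSON of per-field `confidence`, `bbox` (normalized 0-1000, top-left and bottom-right corners) and `page`, shaped like the extracted data. The following is only a rough sketch of how a consumer could walk that structure and map a box to pixel coordinates. The names `iter_assessments` and `bbox_to_pixels` are hypothetical and do not come from this PR, and the sketch assumes that any object carrying a `confidence` key is a leaf assessment.

```python
from typing import Any, Iterator, Sequence, Tuple

SCALE = 1000  # normalized coordinate range named in the removed prompt


def bbox_to_pixels(
    bbox: Sequence[float], width: int, height: int
) -> Tuple[int, int, int, int]:
    """Convert [x1, y1, x2, y2] on the 0-1000 scale to pixel coordinates."""
    if len(bbox) != 4:
        raise ValueError("bbox must have exactly four values")
    x1, y1, x2, y2 = bbox
    if not all(0 <= v <= SCALE for v in bbox):
        raise ValueError("bbox values must lie in [0, 1000]")
    if not (x2 > x1 and y2 > y1):
        raise ValueError("bbox must satisfy x2 > x1 and y2 > y1")
    return (
        round(x1 * width / SCALE),
        round(y1 * height / SCALE),
        round(x2 * width / SCALE),
        round(y2 * height / SCALE),
    )


def iter_assessments(node: Any, path: str = "") -> Iterator[Tuple[str, dict]]:
    """Yield (path, leaf) pairs for SIMPLE, GROUP and LIST shaped results.

    Assumption: a dict with a "confidence" key is a leaf assessment.
    """
    if isinstance(node, dict):
        if "confidence" in node:
            yield path, node
            return
        for key, child in node.items():
            yield from iter_assessments(child, f"{path}.{key}" if path else key)
    elif isinstance(node, list):
        for index, child in enumerate(node):
            yield from iter_assessments(child, f"{path}[{index}]")
```

With `default_confidence_threshold: "0.8"` from the config, a caller could flag fields for review with something like `[p for p, leaf in iter_assessments(result) if leaf["confidence"] < 0.8]`.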
This doesn't look backward compatible. Is it? We do not want to break the existing granular assessment. Or do you propose that we replace 'granular assessment' (our current default) with 'agentic assessment'?
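
One possible way to keep existing configs working, sketched under assumptions rather than taken from the PR: read the new flat `max_workers` key first and fall back to the legacy `granular.max_workers` block shown as removed in the diff. The helper name `resolve_max_workers` is hypothetical, and the default of 20 is an assumption that simply mirrors the value in the diff.

```python
from typing import Any, Mapping

DEFAULT_MAX_WORKERS = 20  # assumption: mirrors the "20" shown in the diff


def resolve_max_workers(assessment: Mapping[str, Any]) -> int:
    """Return max_workers from the flat key, else from the legacy granular block."""
    # New layout: assessment.max_workers (YAML stores the value as a string).
    if "max_workers" in assessment:
        return int(assessment["max_workers"])
    # Legacy layout: assessment.granular.max_workers.
    legacy = assessment.get("granular") or {}
    if "max_workers" in legacy:
        return int(legacy["max_workers"])
    return DEFAULT_MAX_WORKERS
```

A shim like this only covers the one setting that survives the change. `simple_batch_size` and `list_batch_size` have no counterpart in the new layout shown here, so whether they are ignored or rejected would still need a decision.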