extracted excel data in text_as_html have negative values #3934

Closed
@pradyrk

Description

Describe the bug
I have an .xlsx file that I read with partition_xlsx. When I parse text_as_html, I find negative values, although the Excel file does not show any negative values.

To Reproduce
from unstructured.partition.xlsx import partition_xlsx

elements = partition_xlsx(filename="excelfilepath")
print(elements[0].metadata.text_as_html)

Expected behavior
Values should be positive, and the exact values should be extracted.

Screenshots

Image

Image

Environment Info
Databricks cluster, 16.1 ML Runtime. Complete details: https://docs.databricks.com/aws/en/release-notes/runtime/16.1ml

Additional context
The file is sensitive, so I can't share it. Any direction on how to resolve this would help.
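Since the file can't be shared, one way to narrow this down locally is to check what the workbook actually stores, independent of both Excel's display and unstructured. The sketch below is only a diagnostic under assumptions: it relies on the fact that an .xlsx file is a ZIP archive of XML parts, and it assumes the worksheets sit under the conventional `xl/worksheets/` path. The helper name `find_negative_cells` is hypothetical and not part of unstructured. If it reports negative cells, the stored values really are negative and Excel may be hiding the sign through a number format (for example one that shows negatives in parentheses or in red); that is a guess at the cause, not a confirmed diagnosis.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by SpreadsheetML worksheet parts.
NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}


def find_negative_cells(xlsx_path):
    """Return (sheet_part, cell_ref, raw_value) for each numeric cell stored as negative.

    Hypothetical helper for diagnosis only. Assumes worksheets live under
    xl/worksheets/, which is the conventional layout but not guaranteed.
    """
    hits = []
    with zipfile.ZipFile(xlsx_path) as zf:
        for name in sorted(zf.namelist()):
            if not (name.startswith("xl/worksheets/") and name.endswith(".xml")):
                continue
            root = ET.fromstring(zf.read(name))
            for cell in root.iterfind(".//m:c", NS):
                # Cells without a t attribute (or with t="n") hold numbers;
                # shared strings, booleans, errors, etc. are skipped.
                if cell.get("t", "n") != "n":
                    continue
                v = cell.find("m:v", NS)
                if v is None or v.text is None:
                    continue
                try:
                    value = float(v.text)
                except ValueError:
                    continue
                if value < 0:
                    hits.append((name, cell.get("r"), v.text))
    return hits
```

Calling `find_negative_cells("excelfilepath")` returns an empty list when no numeric cell is stored as negative. In that case the sign would have to be introduced somewhere in the read or conversion step rather than in the file itself.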


Labels: bug
