I’m encountering an issue when using UnstructuredFileLoader to process a Markdown (.md) file. The loader throws an AttributeError: 'lxml.etree._ProcessingInstruction' object has no attribute 'is_phrasing' when calling partition_md internally.
Steps to Reproduce:
1. Install the required packages:
pip install unstructured langchain-community
2. Run the following code:
The file used in this code is attached:
sparql-language-ref.md
3. Observed Error:
AttributeError: 'lxml.etree._ProcessingInstruction' object has no attribute 'is_phrasing'
Expected Behavior:
• The Markdown file should be successfully loaded and parsed into elements.
• If the file has processing instructions, they should be ignored or handled gracefully without causing a crash.
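As a possible stopgap for the "ignored" case, here is a minimal sketch that strips `<?...?>` sequences from the Markdown text before it is partitioned. It assumes those sequences are what end up as `_ProcessingInstruction` nodes in the parsed tree, which I have not confirmed. The helper name `strip_processing_instructions` and the sample string are hypothetical, not part of unstructured.

```python
import re

# Matches XML-style processing instructions such as <?xml version="1.0"?>.
# DOTALL lets a single instruction span several lines.
_PI_PATTERN = re.compile(r"<\?.*?\?>", re.DOTALL)


def strip_processing_instructions(markdown_text: str) -> str:
    """Return markdown_text with every <?...?> sequence removed.

    Hypothetical helper: it also removes matches inside fenced code
    blocks, so the document content may change slightly.
    """
    return _PI_PATTERN.sub("", markdown_text)


# Illustrative input only; not taken from the attached file.
sample = 'Intro <?xml version="1.0"?> text\n<?php echo 1; ?>done'
cleaned = strip_processing_instructions(sample)
```

The cleaned string could then presumably be passed to partition_md through its text argument instead of loading the file by path, though I have not verified that this avoids the crash.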
Actual Behavior:
• The process crashes with an AttributeError in partition_md, specifically at:
while q and q[0].is_phrasing:
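For the "handled gracefully" case, the loop condition could in principle tolerate nodes that lack the attribute. The sketch below only illustrates that guard pattern with `getattr` and a default. The classes `PhrasingNode`, `BlockNode`, `OpaqueNode` and the function `take_leading_phrasing` are hypothetical stand-ins, not the library's actual code, and treating an unknown node as non-phrasing is an assumption about the desired behavior.

```python
from collections import deque


class PhrasingNode:
    """Hypothetical stand-in for an inline (phrasing) element."""
    is_phrasing = True


class BlockNode:
    """Hypothetical stand-in for a block-level element."""
    is_phrasing = False


class OpaqueNode:
    """Hypothetical stand-in for a node type with no is_phrasing
    attribute, like the processing-instruction object in the error."""


def take_leading_phrasing(q):
    """Pop and return the leading run of phrasing nodes from q.

    getattr with a default of False means a node lacking the
    attribute ends the run instead of raising AttributeError.
    """
    taken = []
    while q and getattr(q[0], "is_phrasing", False):
        taken.append(q.popleft())
    return taken


q = deque([PhrasingNode(), OpaqueNode(), PhrasingNode()])
taken = take_leading_phrasing(q)
```

With this guard the loop stops at the attribute-less node and leaves it in the queue, so whatever processes the queue next would still need to skip or handle that node.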
Environment Details:
• unstructured Version: (0.16.23)
• langchain_community Version: (0.3.18)
• Python Version: 3.10
• OS: Ubuntu 22.04 (WSL/Cloud-based environment)
Would appreciate any insights or a workaround for this issue! Thanks! 🙌