Manual UI Integration + Optimization in Company Metadata extraction #233

@Deodat-Lawson

Issue: Manual UI Integration + Optimization in Company Metadata Extraction

Summary

  1. Manual Editing UI: Allow employers to manually view, correct, and enrich extracted metadata directly from the dashboard.
  2. Extraction Pipeline Optimization: Improve the accuracy, speed, and cost-effectiveness of the automated extraction process.

UI

  • Inline Editing: Enable editing on CompanyInfoCard fields (name, industry, HQ, founded year, size, website, description).

  • Visual Feedback: Add a badge on manual_override facts so users can identify pinned values that are protected from future extractions.

  • Guardrails: Implement an Edit Mode toggle (View vs. Edit) to prevent accidental changes.

  • UX: Apply edits optimistically in the UI and roll back to the previous value if the save fails.
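
A minimal sketch of the optimistic update with rollback described above, written framework-agnostically. The `CompanyInfo` field names, the `Store` interface, and the `save` callback are assumptions for illustration only; they are not taken from the existing codebase, and the real CompanyInfoCard state handling may look different.

```typescript
// Hypothetical shape of the editable CompanyInfoCard fields.
interface CompanyInfo {
  name: string;
  industry: string;
  headquarters: string;
  foundedYear: number;
  size: string;
  website: string;
  description: string;
}

// Hypothetical minimal state container (could wrap React state or any store).
interface Store {
  get(): CompanyInfo;
  set(next: CompanyInfo): void;
}

// Hypothetical persistence call, e.g. a request to the metadata API.
type SaveFn = (patch: Partial<CompanyInfo>) => Promise<void>;

type UpdateResult = { ok: true } | { ok: false; error: unknown };

// Apply the patch locally first, then persist it.
// If persisting fails, restore the snapshot taken before the edit.
async function optimisticUpdate(
  store: Store,
  patch: Partial<CompanyInfo>,
  save: SaveFn,
): Promise<UpdateResult> {
  const snapshot = store.get();
  store.set({ ...snapshot, ...patch });
  try {
    await save(patch);
    return { ok: true };
  } catch (error) {
    store.set(snapshot); // roll back to the pre-edit state
    return { ok: false, error };
  }
}
```

Note that this sketch restores the whole snapshot, so it assumes only one save is in flight at a time. If concurrent edits are allowed, the rollback would need to revert only the fields in the failed patch.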


Extraction Optimization Suggestions

Pipeline Efficiency (Suggested, feel free to change)

  • Incremental Extraction: Instead of re-processing all documents from scratch on "Re-extract," track lastExtractionDocumentId and only process new or updated documents.

  • Smarter Chunking: Filter out boilerplate, tables of contents, and legal disclaimers before chunks reach the LLM, reducing token usage and cost.

  • Caching: Store per-document extracted facts separately; use content hash comparisons to skip documents that haven't changed.

  • Conflict Resolution UI: Surface conflicts to the user when new extractions contradict existing data, instead of merging them silently. (Optional)

  • Extraction Preview: Show a diff (using the existing MetadataDiff structure) and let users approve/reject individual changes before committing. (Optional)
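
The caching suggestion above could be sketched roughly as follows: facts are stored per document together with a content hash, and a document is only re-sent to the extractor when its hash changes. `SourceDocument`, `ExtractedFact`, `CacheEntry`, and the `extract` callback are hypothetical names for illustration; the in-memory `Map` stands in for whatever persistent store the pipeline actually uses.

```typescript
import { createHash } from "node:crypto";

// Hypothetical types; the real pipeline's document and fact shapes may differ.
interface SourceDocument {
  id: string;
  content: string;
}

interface ExtractedFact {
  field: string;
  value: string;
}

interface CacheEntry {
  contentHash: string;
  facts: ExtractedFact[];
}

// SHA-256 of the document text, used only for change detection.
function hashContent(content: string): string {
  return createHash("sha256").update(content, "utf8").digest("hex");
}

// Re-run extraction only for documents whose content hash changed;
// reuse cached facts for everything else.
async function extractWithCache(
  docs: SourceDocument[],
  cache: Map<string, CacheEntry>,
  extract: (doc: SourceDocument) => Promise<ExtractedFact[]>,
): Promise<{ facts: ExtractedFact[]; processedIds: string[] }> {
  const facts: ExtractedFact[] = [];
  const processedIds: string[] = [];

  for (const doc of docs) {
    const contentHash = hashContent(doc.content);
    const cached = cache.get(doc.id);

    if (cached && cached.contentHash === contentHash) {
      facts.push(...cached.facts); // unchanged document: skip the LLM call
      continue;
    }

    const fresh = await extract(doc);
    cache.set(doc.id, { contentHash, facts: fresh });
    processedIds.push(doc.id);
    facts.push(...fresh);
  }

  return { facts, processedIds };
}
```

The returned `processedIds` list shows which documents were actually re-extracted, which could also feed the extraction preview or a cost report. Merging the per-document facts into company-level metadata, including skipping manual_override values, is left out of this sketch.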
