Description
Issue: Manual UI Integration + Optimization in Company Metadata Extraction
Summary
- Manual Editing UI: Allow employers to manually view, correct, and enrich extracted metadata directly from the dashboard.
- Extraction Pipeline Optimization: Improve the accuracy, speed, and cost-effectiveness of the automated extraction process.
UI
- Inline Editing: Enable editing on `CompanyInfoCard` fields (name, industry, HQ, founded year, size, website, description).
- Visual Feedback: Add a badge on `manual_override` facts so users can identify pinned values that are protected from future extractions.
- Guardrails: Implement an Edit Mode toggle (View vs. Edit) to prevent accidental changes.
- UX: Implement optimistic UI updates with rollback on error.
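A minimal sketch of the optimistic update with rollback, independent of any UI framework. The `CompanyInfo` fields, the `Store` shape, and the `save` callback are hypothetical placeholders; the real `CompanyInfoCard` props and API client may look different.

```typescript
// Hypothetical subset of the editable fields; the real card has more.
interface CompanyInfo {
  name: string;
  industry: string;
  website: string;
}

// Hypothetical persistence callback, e.g. a wrapper around a PATCH request.
type SaveFn = (patch: Partial<CompanyInfo>) => Promise<void>;

// Minimal stand-in for whatever state container the dashboard uses.
interface Store {
  state: CompanyInfo;
  error: string | null;
}

// Apply the edit locally first, then persist it.
// If persisting fails, restore the pre-edit snapshot and record the error.
async function applyOptimisticEdit(
  store: Store,
  patch: Partial<CompanyInfo>,
  save: SaveFn,
): Promise<boolean> {
  const snapshot = store.state; // kept for rollback
  store.state = { ...snapshot, ...patch }; // shown to the user immediately
  store.error = null;
  try {
    await save(patch);
    return true;
  } catch (err) {
    store.state = snapshot; // rollback
    store.error = err instanceof Error ? err.message : String(err);
    return false;
  }
}
```

Restoring a whole snapshot is the simplest rollback, but it can discard a second edit made while the first save is still in flight. Disabling the field during the save, or rolling back only the keys in `patch`, would avoid that.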
Extraction Optimization Suggestions
Pipeline Efficiency (Suggested, feel free to change)
- Incremental Extraction: Instead of re-processing all documents from scratch on "Re-extract," track `lastExtractionDocumentId` and only process new or updated documents.
- Smarter Chunking: Filter out chunks containing boilerplate, TOCs, and legal disclaimers to reduce LLM token usage and cost.
- Caching: Store per-document extracted facts separately; use content hash comparisons to skip documents that haven't changed.
- Conflict Resolution UI: Surface conflicts to the user when new extractions contradict existing data instead of silently merging them. (Optional)
- Extraction Preview: Show a diff (using the existing `MetadataDiff` structure) and let users approve/reject individual changes before committing. (Optional)
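One way the caching and incremental-extraction ideas above could fit together is sketched below. All names here (`SourceDocument`, `Fact`, `CacheEntry`, `extractIncrementally`) are hypothetical, and the `extract` callback stands in for the actual LLM extraction call. The hash is a dependency-free FNV-1a-style function over UTF-16 code units; a real pipeline would more likely use a cryptographic hash such as SHA-256.

```typescript
// Hypothetical shapes; the real document and fact models may differ.
interface SourceDocument {
  id: string;
  content: string;
}

interface Fact {
  key: string;
  value: string;
}

interface CacheEntry {
  contentHash: string;
  facts: Fact[];
}

// FNV-1a-style 32-bit hash, used here only to keep the sketch self-contained.
function contentHash(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16);
}

// Re-extract only documents whose content hash differs from the cached one;
// unchanged documents reuse their stored per-document facts.
function extractIncrementally(
  docs: SourceDocument[],
  cache: Map<string, CacheEntry>,
  extract: (doc: SourceDocument) => Fact[],
): { facts: Fact[]; reprocessed: string[] } {
  const facts: Fact[] = [];
  const reprocessed: string[] = [];
  for (const doc of docs) {
    const hash = contentHash(doc.content);
    const cached = cache.get(doc.id);
    if (cached && cached.contentHash === hash) {
      facts.push(...cached.facts); // unchanged: no extraction call needed
      continue;
    }
    const fresh = extract(doc); // new or updated document
    cache.set(doc.id, { contentHash: hash, facts: fresh });
    reprocessed.push(doc.id);
    facts.push(...fresh);
  }
  return { facts, reprocessed };
}
```

Keeping facts per document, rather than only the merged result, is what makes the skip possible: the merged metadata can be rebuilt from cached entries plus the few documents that were actually re-extracted.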