
Raw job posting data: https://huggingface.co/datasets/xanderios/linkedin-job-postings

- Run LLM extraction on the raw data to pull out skills.
- Validate the extracted skills against the SkillSpan dataset.

This gives you precision/recall numbers for your extraction prompt before you've trained anything. This is your LLM baseline.
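The precision/recall check could be scored along these lines. This is a minimal sketch, not the project's actual evaluation code. It assumes the LLM output and the gold annotations have each been reduced to a list of skill strings per posting, and that a match means exact equality after lowercasing and whitespace cleanup. The names `normalize` and `precision_recall` are hypothetical. SkillSpan annotates spans in text, so a span-overlap or fuzzy match may suit it better than this exact-match scheme.

```python
from typing import List, Sequence, Tuple


def normalize(skill: str) -> str:
    """Lowercase and collapse whitespace so trivially different strings match."""
    return " ".join(skill.lower().split())


def precision_recall(
    predicted: Sequence[List[str]],
    gold: Sequence[List[str]],
) -> Tuple[float, float]:
    """Micro-averaged precision and recall over all postings.

    predicted[i] and gold[i] are the skill lists for the same posting.
    Matching is exact string equality after normalize() (an assumption).
    """
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must cover the same postings")

    tp = fp = fn = 0
    for pred_skills, gold_skills in zip(predicted, gold):
        pred_set = {normalize(s) for s in pred_skills}
        gold_set = {normalize(s) for s in gold_skills}
        tp += len(pred_set & gold_set)   # extracted and annotated
        fp += len(pred_set - gold_set)   # extracted but not annotated
        fn += len(gold_set - pred_set)   # annotated but missed

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy data for illustration only, not real extraction results.
predicted = [["Python", "SQL", "teamwork"], ["Docker"]]
gold = [["python", "sql"], ["docker", "Kubernetes"]]
print(precision_recall(predicted, gold))
```

Micro-averaging pools counts across postings, so postings with many skills weigh more. If each posting should count equally, compute the two ratios per posting and average them instead.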