Small v0 workflow for reviewing receipt photos.
The system is intentionally two-step:
1. extract_receipt_details(image_path) reads one receipt image and returns structured receipt data.
2. evaluate_receipt_for_audit(receipt_details) decides whether the receipt should be audited.
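The two-step shape can be sketched as below. This is a minimal illustration, not the project's code: the dataclasses stand in for the Pydantic schemas, the field names (merchant, total) are hypothetical, the extraction step is stubbed rather than calling a model, and the audit rule is a placeholder threshold chosen only to make the example runnable.

```python
from dataclasses import dataclass


@dataclass
class ReceiptDetails:
    # Hypothetical fields; the real schema lives in schemas.py.
    merchant: str
    total: float


@dataclass
class AuditDecision:
    needs_audit: bool
    reason: str


def extract_receipt_details(image_path: str) -> ReceiptDetails:
    # Stub: the real step sends the image to a model and parses structured output.
    return ReceiptDetails(merchant="Example Gas", total=62.40)


def evaluate_receipt_for_audit(receipt_details: ReceiptDetails) -> AuditDecision:
    # Placeholder rule (assumption): flag receipts above an arbitrary limit.
    if receipt_details.total > 50:
        return AuditDecision(needs_audit=True, reason="total exceeds 50")
    return AuditDecision(needs_audit=False, reason="total within limit")


def review_receipt(image_path: str) -> tuple:
    # Step 1 produces data; step 2 only sees that data, never the image.
    details = extract_receipt_details(image_path)
    decision = evaluate_receipt_for_audit(details)
    return details, decision
```

Keeping the audit step dependent only on the extracted data means each step can be inspected and evaluated separately.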
uv sync
The app reads OPENAI_API_KEY from .env, .env.local, or /env.
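For readers who want to see what an env-file lookup involves, here is a rough sketch using only the standard library. The helper names (parse_env_text, find_api_key) are hypothetical, and the precedence shown (process environment first, then files in the order given) is an assumption; the app's actual loader and lookup order may differ.

```python
import os
from pathlib import Path
from typing import Optional


def parse_env_text(text: str) -> dict:
    # Parse simple KEY=VALUE lines, skipping blanks and # comments.
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def find_api_key(candidates: list, name: str = "OPENAI_API_KEY") -> Optional[str]:
    # Assumed precedence: an exported variable wins, then the first file that defines it.
    if os.environ.get(name):
        return os.environ[name]
    for path in candidates:
        if Path(path).is_file():
            value = parse_env_text(Path(path).read_text()).get(name)
            if value:
                return value
    return None
```

A call such as find_api_key([Path(".env"), Path(".env.local")]) returns None when the key is not set anywhere, which lets the caller fail with a clear message before any API request is made.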
src/receipt_review/
  llm/                 # OpenAI client and structured-output helpers
  steps/               # The two workflow steps: extraction and audit
  image_preflight.py   # Orientation correction before extraction
  schemas.py           # Pydantic schemas and business data contracts
  workflow.py          # Composes the two steps and saves outputs
The optional image preflight module can use tesseract to detect upside-down receipts and write corrected copies to outputs/preprocessed/. It is currently paused so the v0 eval baseline measures extraction without preprocessing.
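As a rough illustration of the orientation check, the sketch below interprets the text that Tesseract's orientation and script detection (OSD) mode prints, which includes a line such as "Rotate: 180". The function names are hypothetical and this is not the contents of image_preflight.py; obtaining the OSD text (for example by running tesseract in OSD mode) and rotating the image are left out.

```python
import re
from typing import Optional


def parse_osd_rotation(osd_text: str) -> Optional[int]:
    # Look for the "Rotate: <degrees>" line in Tesseract OSD output.
    match = re.search(r"^Rotate:\s*(\d+)", osd_text, flags=re.MULTILINE)
    return int(match.group(1)) if match else None


def is_upside_down(osd_text: str) -> bool:
    # A 180 degree correction means the receipt was scanned upside down.
    return parse_osd_rotation(osd_text) == 180
```

When the rotation cannot be parsed, is_upside_down returns False, so an unreadable image is passed through unchanged rather than rotated on a guess.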
The receipt images come from the CC BY 4.0 licensed Receipt Handwriting Detection Computer Vision Project dataset published by Roboflow: https://universe.roboflow.com/newreceipts/receipt-handwriting-detection
uv run python scripts/run_receipt.py data/test/Gas_20240605_164059_Raven_Scan_3_jpeg.rf.e3408aa2b936afd1f1aed84fa40d454e.jpg
The command writes separate extraction and audit JSON files. Repeated runs are preserved with numbered filenames so model variability can be assessed:
outputs/reviews/extraction/<receipt-stem>.json
outputs/reviews/extraction/<receipt-stem> (1).json
outputs/reviews/audit_results/<receipt-stem>.json
outputs/reviews/audit_results/<receipt-stem> (1).json
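One way to produce this numbering is sketched below. The helper name next_output_path is hypothetical and the project's own implementation may differ. The suffix is appended with string formatting rather than Path.with_suffix, because receipt stems in this dataset contain dots (for example "...jpeg.rf.e3408...") that with_suffix would truncate.

```python
import tempfile
from pathlib import Path


def next_output_path(directory: Path, stem: str, suffix: str = ".json") -> Path:
    # First run gets "<stem>.json"; later runs get "<stem> (1).json", "<stem> (2).json", ...
    candidate = directory / f"{stem}{suffix}"
    counter = 1
    while candidate.exists():
        candidate = directory / f"{stem} ({counter}){suffix}"
        counter += 1
    return candidate


# Usage: three runs against the same receipt stem in a scratch directory.
with tempfile.TemporaryDirectory() as tmp:
    out_dir = Path(tmp)
    names = []
    for _ in range(3):
        path = next_output_path(out_dir, "receipt.rf.abc123")
        path.write_text("{}")
        names.append(path.name)
```

Because earlier files are never overwritten, outputs from repeated runs on the same image can be compared side by side.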
If the image has ground truth in data/ground_truth, compare the saved result with:
uv run python scripts/assess_receipt.py outputs/reviews/extraction/Gas_20240605_164059_Raven_Scan_3_jpeg.rf.e3408aa2b936afd1f1aed84fa40d454e.json
This is not the full eval framework yet. It is a lightweight inspection helper so we can understand what should be measured before formalizing metrics.
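A lightweight inspection of this kind could be as simple as a field-by-field comparison. The sketch below is an assumption about the general approach, not the logic of assess_receipt.py: the function names are hypothetical, the field names in the usage note are invented, and exact equality is used where a real metric might tolerate formatting differences.

```python
def compare_fields(extracted: dict, expected: dict) -> dict:
    # Report every ground-truth field with both values and an exact-match flag.
    report = {}
    for field, want in expected.items():
        got = extracted.get(field)
        report[field] = {"expected": want, "extracted": got, "match": got == want}
    return report


def mismatched_fields(report: dict) -> list:
    # Names of the fields where extraction disagreed with ground truth.
    return [field for field, row in report.items() if not row["match"]]
```

For example, comparing {"merchant": "Shell", "total": 41.2} against ground truth {"merchant": "Shell", "total": 42.1} flags only the total field, which is the kind of observation that helps decide which metrics are worth formalizing.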