-
Notifications
You must be signed in to change notification settings - Fork 24
Add preliminary answer generation with zero shot and correct chunk #35
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
i can fix CQ after merge |
No, please fix before - it's bad practice to merge PRs where checks do not pass |
fixed |
_generate_answers_for_questions( | ||
config=config, | ||
source_subset="single_shot_questions", | ||
scenario=scenario, | ||
output_subset="single_shot_questions_with_answers", | ||
) | ||
|
||
# Multi-hop | ||
_generate_answers_for_questions( | ||
config=config, | ||
source_subset="multi_hop_questions", | ||
scenario=scenario, | ||
output_subset="multi_hop_questions_with_answers", | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you select both here? It should depend on the config and whether we generated multihop questions & single shot ones, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated the code so that single-shot answers are only generated if single_shot_question_generation.run
is true, and similarly for multi-hop if multi_hop_question_generation.run
is true. This way, we only generate answers for the question types actually produced.
of course, need to add further validation later, for special cases (perhaps multiple config files, later runs, etc), but this should do as an officially supported functionality
row_map.append(i) | ||
|
||
else: | ||
logger.warning(f"Unrecognized scenario '{scenario}' (row={i}), skipping.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should fail way earlier as it will affect all rows
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(btw, no longer an issue if you invert the if and for clauses ^^)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
now raise a ValueError
if the scenario is unrecognized (not in ("zero_shot","with_correct_chunk")) before processing any rows, instead of issuing repeated warnings per-row
return calls, row_map | ||
|
||
doc_meta_map = {} | ||
if scenario == "with_correct_chunk": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You need to split here the if, and iterate on the rows inside the clauses - else you're adding an extra if check at each new row, it's an extra operation which is useless.
if scenario == "zero shot":
for row;
do the thing
elif scenario == "with correct chunks":
load dataset
for row:
do the thing
else:
etc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll re do a review once this is fixed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The _build_inference_calls_scenario()
function now has a top-level branch on scenario
(if scenario == "zero_shot": ... elif scenario == "with_correct_chunk": ...)
, so we avoid repeated scenario checks within the row loop.
|
||
# Single-shot uses 'chunk_id', multi-hop uses 'source_chunk_ids' | ||
if source_subset.startswith("single_shot"): | ||
chunk_id = row.get("chunk_id", "") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ŵhy don't you use an extraction function here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added _extract_chunk_ids_for_row(...)
which pulls out document_id and chunk IDs for single-hop vs. multi-hop
|
||
for model_name, model_responses in responses_dict.items(): | ||
logger.info(f"Processing {len(model_responses)} responses from model={model_name} for scenario={scenario}.") | ||
n_common = min(len(model_responses), len(row_map)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is the role of n_common? (the name is not clear to me, please add one line of comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added comment
|
||
A column 'answer_fashion' indicates which scenario was used to generate the answer. | ||
""" | ||
stage_cfg = config.get("pipeline", {}).get("answer_generation", {}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A possible situation here, if the config structure is wrong, it could silently fail or behave incorrectly. I would add a "TODO" here to integrate a config validation for the whole project, something like
TODO: Add global config schema validation across the project
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added
…tion add global guard for question generation
Make "fast_chunking" the default chunking mode and preserve semantic mode under config
Add TruffleHog Secret Scanning GitHub Action
Bump up huggingface-hub to 0.30.1
fix uninitialized subset loading
remove fallback row add
update readme to include HF_ORGANIZATION env var
Add yourbench paper
Update README.md
remove torch dependency
Throw relevant error when inference fails
…nch into add-answer-generation
Add answer generation with two settings: