Add support for preprocessing user prompts and metadata filter for retrieval and answering #531

@pmeier

Currently, Ragna only supports a minimal RAG pattern: during the retrieval stage, we pass the prompt to the source storage without any processing and retrieve sources based on it. This has several downsides:

  • An embedding of a question might not generate a close match to an embedding of a statement that contains the answer to the question. Although this is technically something that should be solved on the embedding model side, it is usually solved by rephrasing the question before using it to retrieve sources.
  • A question might contain hidden context assumptions that cannot be handled by the embedding model. For example, a question like "What happened last year?" likely won't get any close matches. Rephrasing the prompt to "What happened in 2024?" will help here.

As touched on above, a common strategy to improve RAG results is to preprocess the prompt before passing it on to the source storage and assistant.
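To make the "last year" example concrete, here is a minimal rule-based sketch of such a rewrite. The function name resolve_last_year is hypothetical and not part of Ragna, and in practice this kind of rephrasing would more likely be delegated to an LLM than to a regular expression:

```python
import re
from datetime import date


def resolve_last_year(prompt: str, today: date) -> str:
    """Replace the relative phrase 'last year' with an explicit year.

    Hypothetical helper for illustration only; it handles just this one phrase.
    """
    return re.sub(
        r"\blast year\b",
        f"in {today.year - 1}",
        prompt,
        flags=re.IGNORECASE,
    )


# The reference date is passed in explicitly so the result is reproducible.
rephrased = resolve_last_year("What happened last year?", date(2025, 3, 1))
print(rephrased)
```

Passing the reference date as an argument rather than calling date.today() inside the function keeps the rewrite deterministic, which matters once the preprocessing itself is being evaluated.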

I'd like to use this issue as a discussion to enable this functionality in Ragna. I'm just going to dump my thoughts here that we can use to sort out the proper design for this:

  • Input preprocessing should be optional. As this is an advanced pattern, we don't want to force users into it.
  • Input preprocessing should be rather generic. I don't want to enforce an agentic workflow or the like. IMO, the interface should be something simple like preprocess(prompt: str, metadata_filter: MetadataFilter) -> Prompt. The returned Prompt object should just be a dataclass that carries a retrieval_prompt, an answer_prompt, and the metadata filter. If no preprocessing is defined, we can just do Prompt(retrieval_prompt=prompt, answer_prompt=prompt, metadata_filter=metadata_filter).
  • While we should allow the preprocessing to alter the metadata filter (imagine the prompt "What happened in this project in 2024?" narrowing down the metadata filter to only include documents from the year 2024), we might need to enforce that the preprocessing does not widen the filter. However, since the preprocessing is user-defined, we might also not need to enforce this at all and push the responsibility to the user.
  • To be able to evaluate the preprocessing or the full RAG workflow with the preprocessing, we need to be able to track the individual steps. So maybe the Prompt object needs to contain a list of intermediate results that can be inserted into our DB, while the actual RAG procedure only moves on with the last entry.
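The points above could be sketched roughly as follows. This is only one possible reading of the proposal, not a settled design: the MetadataFilter alias is a plain-dict stand-in so the snippet runs without Ragna installed, and the steps field, identity_preprocess, run_preprocessing, and narrow_to_2024 are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional

# Stand-in for Ragna's metadata filter type (assumption: a plain dict here).
MetadataFilter = Optional[Dict[str, Any]]


@dataclass
class Prompt:
    retrieval_prompt: str
    answer_prompt: str
    metadata_filter: MetadataFilter = None
    # Hypothetical: earlier intermediate results, kept so they can be stored
    # and evaluated. The RAG procedure itself only uses the final Prompt.
    steps: List["Prompt"] = field(default_factory=list)


Preprocessor = Callable[[str, MetadataFilter], Prompt]


def identity_preprocess(prompt: str, metadata_filter: MetadataFilter) -> Prompt:
    """Default when no preprocessing is defined: pass everything through."""
    return Prompt(
        retrieval_prompt=prompt,
        answer_prompt=prompt,
        metadata_filter=metadata_filter,
    )


def run_preprocessing(
    prompt: str,
    metadata_filter: MetadataFilter,
    preprocess: Optional[Preprocessor] = None,
) -> Prompt:
    """Preprocessing is optional: fall back to the identity if none is given."""
    return (preprocess or identity_preprocess)(prompt, metadata_filter)


def narrow_to_2024(prompt: str, metadata_filter: MetadataFilter) -> Prompt:
    """Toy user-defined preprocessor that only narrows the filter."""
    original = identity_preprocess(prompt, metadata_filter)
    narrowed = {**(metadata_filter or {}), "year": 2024}
    return Prompt(
        retrieval_prompt=prompt,
        answer_prompt=prompt,
        metadata_filter=narrowed,
        steps=[original],
    )


default = run_preprocessing("What is Ragna?", None)
custom = run_preprocessing(
    "What happened in this project in 2024?", {"project": "ragna"}, narrow_to_2024
)
```

Here nothing checks that a preprocessor only narrows the filter, which corresponds to the "push the responsibility to the user" option. Enforcing it would require a way to compare two filters, which this sketch does not attempt.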
