Feat/enrich page number to partitions #1044

Open: wants to merge 4 commits into base: main


lunmatu101

Motivation and Context (Why the change? What's the scenario?)

  • Include page number information in each text partition.

High level description (Approach, Design)

  • Implemented in the simplest way:
    • Reuse ExtractedContent from GeneratedFileDetails.
    • Create chunks from ExtractedContent instead of ExtractedText, so that each chunk can reference its page number.
  • This approach preserves page number information with little added complexity, but chunk sizes may be suboptimal because chunks are built from the extracted content sections rather than from the full extracted text.
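
The trade-off described above can be illustrated with a minimal, language-neutral sketch in Python. It is not the PR's code and does not use the Kernel Memory API: `Section`, `Chunk`, `chunk_by_section`, and the `max_chars` limit are hypothetical names introduced only for illustration, and the sketch assumes each extracted section corresponds to one page.

```python
from dataclasses import dataclass


@dataclass
class Section:
    """Hypothetical stand-in for one extracted section (assumed: one page)."""
    page_number: int
    text: str


@dataclass
class Chunk:
    """Hypothetical text partition that remembers its source page."""
    page_number: int
    text: str


def chunk_by_section(sections, max_chars):
    """Split each section into chunks of at most max_chars characters.

    Chunks never span two sections, so every chunk has exactly one page
    number. A single word longer than max_chars becomes its own chunk.
    """
    chunks = []
    for section in sections:
        current = ""
        for word in section.text.split():
            candidate = word if not current else current + " " + word
            if current and len(candidate) > max_chars:
                # Current chunk is full: emit it and start a new one.
                chunks.append(Chunk(section.page_number, current))
                current = word
            else:
                current = candidate
        if current:
            # Flush the remainder at the section boundary, even if short.
            chunks.append(Chunk(section.page_number, current))
    return chunks
```

With `max_chars=12`, the sections `Section(1, "alpha beta gamma")` and `Section(2, "delta")` produce three chunks: `"alpha beta"` (page 1), `"gamma"` (page 1), and `"delta"` (page 2). The last two would fit in a single chunk if page boundaries were ignored, which is the kind of suboptimal chunk size the description mentions.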

@nurkmez2

Hi,
Just checking in on this — any updates on the review process?

This feature would be a great addition to Kernel Memory. It's something we’re really looking forward to, as it could unlock some essential capabilities in real-world scenarios.
Thanks again for your work on this!

@vbottiWerfen

vbottiWerfen commented Jun 18, 2025

Hi @dluc @lunmatu101, do you have any plans to merge this PR? I'd really appreciate having this functionality available.

@lunmatu101
Author

@lunmatu101 please read the following Contributor License Agreement(CLA). If you agree with the CLA, please reply with the following information.

@microsoft-github-policy-service agree [company="{your company}"]

Options:

  • (default - no company specified) I have sole ownership of intellectual property rights to my Submissions and I am not making Submissions in the course of work for my employer.
@microsoft-github-policy-service agree
  • (when company given) I am making Submissions in the course of work for my employer (or my employer has intellectual property rights in my Submissions by contract or applicable law). I have permission from my employer to make Submissions and enter into this Agreement on behalf of my employer. By signing below, the defined term “You” includes me and my employer.
@microsoft-github-policy-service agree company="Microsoft"

Contributor License Agreement

@microsoft-github-policy-service agree

@lunmatu101 lunmatu101 marked this pull request as ready for review June 18, 2025 08:30
@lunmatu101 lunmatu101 requested a review from dluc as a code owner June 18, 2025 08:30
@lunmatu101
Author

Since this update seems to be of interest to the community, I have opened this PR for review by the owner.

@vbottiWerfen

Hi @dluc, can you take a look at this PR?

3 participants