Understanding Your RAG + MCP Setup – Possible Code Share? #30

alikalik9 · 2025-06-22T17:13:27Z

First of all, I just want to say: fantastic project — it's incredibly helpful and well executed!

This might be a bit of a naive question, but I was wondering if you'd consider open-sourcing parts of the codebase — specifically the components related to how you implemented the RAG (Retrieval-Augmented Generation) pipeline.

I'm not necessarily interested in internal Microsoft Learn content or proprietary data, but more in how you structured the RAG index over your knowledge base and connected it to an MCP server. I suspect you’re using Azure AI services, which makes it even more interesting for those of us exploring similar use cases.

Seeing how you approached this could be valuable for the community, especially for those looking to build knowledge-based assistants or internal copilots.

Thanks again for the great work, and looking forward to your thoughts!

pdebruin · 2025-06-23T11:26:25Z

Collaborator

Hi @alikalik9, glad you like it 🙂

Have you seen this post? https://devblogs.microsoft.com/engineering-at-microsoft/how-we-built-ask-learn-the-rag-based-knowledge-service/ It was published in April 2024 and may not show all the details, but it should give you an impression of the knowledge service. The service is used in multiple places, including Copilot for Azure and Learn Q&A, and now through MCP.

cc @TianqiZhang for awareness

onestardao · 2025-07-26T02:21:24Z

This thread hit home — feels like the real frontier of RAG isn’t just plugging Azure services together, but decoding the semantic choreography underneath.

We recently published an open framework tackling exactly this: how to not just retrieve relevant chunks, but shape the semantic context so the LLM doesn’t collapse under ambiguity or hallucination.
Especially useful when building long-running copilots or internal knowledge agents.

📄 If you’re curious, here’s the WFGY semantic reasoning PDF:
https://github.com/onestardao/WFGY

It dives into strategies like:

- semantic title shift detection
- cross-pass memory rebalancing
- prompt shape stabilization across retrieval rounds

Basically — if RAG is the “muscle,” this part handles the “spine alignment.”
Would love to hear your thoughts if you try it out!

kinthaiofficial · 2026-04-29T00:00:47Z

RAG + MCP is a strong combination. A few practical considerations from running this in production with multiple agents:

Chunk size matters more than embedding model quality. We tested 5 embedding models and the difference in retrieval quality was ~5%. But changing chunk size from 512 to 256 tokens (with 50-token overlap) improved answer accuracy by ~15%. Smaller chunks mean more precise retrieval, which means less noise in the agent's context window.
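To make the chunking point concrete, here is a minimal sketch of a sliding-window chunker using the 256-token size and 50-token overlap mentioned above. The function name `chunk_tokens` is hypothetical, and it works on any pre-tokenized list; which tokenizer produces that list is left open, since the comment does not say.

```python
def chunk_tokens(tokens, chunk_size=256, overlap=50):
    """Split a token list into fixed-size chunks that share `overlap` tokens.

    `tokens` can be any list (token ids, words, ...). The tokenizer that
    produces it is an assumption outside this sketch.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        # Stop once the current window has reached the end of the input,
        # so no trailing chunk made only of overlap is emitted.
        if start + chunk_size >= len(tokens):
            break
    return chunks


if __name__ == "__main__":
    demo = list(range(600))  # stand-in for 600 token ids
    for chunk in chunk_tokens(demo):
        print(chunk[0], chunk[-1], len(chunk))
```

Each chunk repeats the last 50 tokens of the previous one, so a sentence cut at a boundary still appears whole in at least one chunk.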

Per-agent RAG views are important in multi-tenant setups. If multiple agents share the same knowledge base, each agent should only see documents it's authorized to access. Implementing this at the MCP tool level (filter results before returning to the agent) is simpler than trying to enforce access control in the vector DB itself.
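A rough sketch of filtering at the tool level could look like the following. It does not use any MCP SDK; `Hit`, `search_tool`, and the `search_fn` callback are hypothetical names for illustration, and the `allowed_agents` field assumes each document carries its own access list as metadata.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    doc_id: str
    text: str
    score: float
    allowed_agents: frozenset  # assumed per-document access metadata


def filter_hits_for_agent(hits, agent_id):
    """Keep only the hits this agent is authorized to see."""
    return [h for h in hits if agent_id in h.allowed_agents]


def search_tool(query, agent_id, search_fn, top_k=3):
    """Tool handler sketch: search, filter by agent, then truncate.

    `search_fn(query, k)` stands in for the vector search call and is
    expected to return hits ordered by descending score.
    """
    # Over-fetch so that filtering still leaves up to top_k results.
    raw_hits = search_fn(query, top_k * 4)
    visible = filter_hits_for_agent(raw_hits, agent_id)
    return visible[:top_k]


if __name__ == "__main__":
    corpus = [
        Hit("d1", "public FAQ", 0.9, frozenset({"support", "sales"})),
        Hit("d2", "pricing notes", 0.8, frozenset({"sales"})),
        Hit("d3", "runbook", 0.7, frozenset({"support"})),
    ]
    fake_search = lambda query, k: corpus[:k]
    print([h.doc_id for h in search_tool("anything", "support", fake_search)])
```

One trade-off of post-filtering is that a heavily restricted agent may get fewer than `top_k` results even with over-fetching; pushing the filter into the vector store avoids that but is the harder path the comment describes.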

Cost tracking for RAG calls. Each retrieval query has a cost (embedding the query + vector search). In a multi-agent workflow where Agent A retrieves context, passes it to Agent B who retrieves more, the RAG costs add up. Track them per-agent and include them in the overall cost attribution.
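One way to track this, sketched under assumptions: the class name `RagCostTracker` and both price parameters are hypothetical, and the cost model (embedding price per 1,000 query tokens plus a flat per-search price) is a simplification of whatever a real provider bills.

```python
from collections import defaultdict


class RagCostTracker:
    """Accumulate retrieval costs per agent.

    Prices are caller-supplied placeholders, not real provider rates.
    """

    def __init__(self, embed_price_per_1k_tokens, search_price_per_query):
        self.embed_price = embed_price_per_1k_tokens
        self.search_price = search_price_per_query
        self._totals = defaultdict(float)
        self._calls = defaultdict(int)

    def record(self, agent_id, query_tokens):
        """Record one retrieval call and return its estimated cost."""
        cost = query_tokens / 1000 * self.embed_price + self.search_price
        self._totals[agent_id] += cost
        self._calls[agent_id] += 1
        return cost

    def total(self, agent_id=None):
        """Total cost for one agent, or across all agents if None."""
        if agent_id is None:
            return sum(self._totals.values())
        return self._totals.get(agent_id, 0.0)

    def calls(self, agent_id):
        """Number of retrieval calls recorded for one agent."""
        return self._calls.get(agent_id, 0)


if __name__ == "__main__":
    tracker = RagCostTracker(embed_price_per_1k_tokens=0.02,
                             search_price_per_query=0.001)
    tracker.record("agent_a", query_tokens=500)   # Agent A retrieves context
    tracker.record("agent_b", query_tokens=1000)  # Agent B retrieves more
    print(tracker.total("agent_a"), tracker.total("agent_b"), tracker.total())
```

Keeping the totals keyed by agent makes it straightforward to fold retrieval costs into whatever per-agent cost attribution already exists for model calls.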

Context window budget allocation. If the agent has a 100K token window, how much should go to RAG context vs conversation history vs system instructions? We found 40% RAG / 40% conversation / 20% system works well for most tasks, but this should be tunable per agent.
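The 40/40/20 split can be expressed as a small, tunable helper. This is only a sketch: `allocate_budget` and its parameter names are hypothetical, and it uses integer percentages so the per-agent split always sums exactly to the window size.

```python
def allocate_budget(window_tokens, rag_pct=40, conversation_pct=40, system_pct=20):
    """Split a context window into token budgets by integer percentage.

    Defaults follow the 40% RAG / 40% conversation / 20% system split;
    pass different percentages to tune the split per agent.
    """
    shares = {
        "rag": rag_pct,
        "conversation": conversation_pct,
        "system": system_pct,
    }
    if any(pct < 0 for pct in shares.values()):
        raise ValueError("percentages must be non-negative")
    if sum(shares.values()) != 100:
        raise ValueError("percentages must sum to 100")

    budget = {name: window_tokens * pct // 100 for name, pct in shares.items()}
    # Integer division can leave a few tokens unassigned; this sketch
    # gives the remainder to the RAG share.
    budget["rag"] += window_tokens - sum(budget.values())
    return budget


if __name__ == "__main__":
    print(allocate_budget(100_000))
    print(allocate_budget(100_000, rag_pct=60, conversation_pct=20, system_pct=20))
```

With a budget like this, the retrieval step can stop adding chunks once the `rag` allowance is used up, instead of truncating the conversation history after the fact.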

Related: https://blog.kinthai.ai/why-character-ai-forgets-you-persistent-memory-architecture
