-
Notifications
You must be signed in to change notification settings - Fork 632
feat(chat): file upload artifacts #2077
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
dimetron
wants to merge
4
commits into
kagent-dev:main
Choose a base branch
from
dimetron:feature/file-upload-artifacts
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Changes from 2 commits
Commits
Show all changes
4 commits
Select commit
Hold shift + click to select a range
43941f8
feat: ADK artifact file upload
dimetron 5acac27
docs(design): add EP-89405 file upload & artifact support
dimetron 928105a
fix(adk,ui): address PR review on file upload artifacts
dimetron 791da60
Merge remote-tracking branch 'upstream/main' into feature/file-upload…
dimetron File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,174 @@ | ||
| # EP-89405: File Upload & Artifact Support in Chat | ||
|
|
||
| * Issue: [MSFN-89405] (internal) — feat: file upload feature | ||
| * Status: `implemented` (branch `feature/file-upload-artifacts`, commit `ccfada50`) | ||
| * Related: EP-2046 (Chat UI for MCP UI widgets) — shares the chat files but | ||
| explicitly excludes the file-upload backend, which this EP owns. | ||
|
|
||
| ## Background | ||
|
|
||
| kagent's chat could only exchange text. Users frequently need to hand an agent a | ||
| file (a log, a PDF, a CSV, a screenshot) and to receive files an agent/tool | ||
| produces (a generated report, an extracted table). The Go ADK runtime ships an | ||
| artifact subsystem (`artifact.Service`, `loadartifactstool`, the | ||
| `SaveInputBlobsAsArtifacts` run option, and the per-event `ArtifactDelta` | ||
| signal), but kagent serves agents purely over A2A (`adka2a`) and never wired the | ||
| `ArtifactService` into the runner — so none of it was reachable. | ||
|
|
||
| This EP enables an **end-to-end file upload / download round trip** in chat: | ||
|
|
||
| 1. Users attach files in the chat UI; they travel inline (base64) over the | ||
| existing A2A message/SSE channel — **no new HTTP API, no CRD field**. | ||
| 2. The Go ADK executor persists inbound uploads as artifacts and surfaces | ||
| agent-produced artifacts back to the UI as downloadable A2A file parts. | ||
| 3. Agents get a `save_artifact` tool (produce files) plus the built-in | ||
| `load_artifacts` tool (reference uploaded/produced files across turns). | ||
| 4. Both runtimes extract text from uploaded files (notably PDF) so models that | ||
| cannot natively read a document still receive its content. | ||
|
|
||
| ## Motivation | ||
|
|
||
| - Let users give agents real working material instead of pasting text. | ||
| - Let agents return generated files (reports, transformed data) as first-class, | ||
| downloadable chat attachments rather than dumping content into the message. | ||
| - Reuse the ADK's existing, battle-tested artifact contract rather than inventing | ||
| a parallel storage/transport. | ||
|
|
||
| ### Goals | ||
|
|
||
| - Wire the ADK in-memory `ArtifactService` into the Go runner (process-lifetime | ||
| persistence, versioned, user/session scoped). | ||
| - Accept inbound A2A file parts, persist them via `SaveInputBlobsAsArtifacts`, | ||
| and pass them inline to the model. | ||
| - Emit agent-saved artifacts back to the UI as A2A `FilePart` events, driven by | ||
| the `ArtifactDelta` event signal (event-driven, no store diffing). | ||
| - Register `loadartifactstool` (load) and a new `save_artifact` tool (produce) | ||
| for agents. | ||
| - Extract text from uploaded documents (PDF first) in both the Go ADK | ||
| (`fileextract`) and Python (`_file_extract`) model paths so non-multimodal | ||
| models still receive document content. | ||
| - Chat UI: attach multiple files with type/size validation; render image | ||
| thumbnails and downloadable file chips for both user and agent bubbles. | ||
| - Raise the nginx/proxy request-body limit so uploads aren't rejected at the edge. | ||
|
|
||
| ### Non-Goals | ||
|
|
||
| - Durable / shared artifact storage (GCS, DB) and cross-replica access. Artifacts | ||
| live in process memory and are lost on pod restart. | ||
| - A standalone artifact browser UI (list / delete / version history) beyond the | ||
| per-message chips. | ||
| - A dedicated artifact HTTP/storage API for the UI (everything rides A2A). | ||
| - The MCP-app / minimap chat features that share the same chat files (EP-2046). | ||
|
|
||
| ## Implementation Details | ||
|
|
||
| ### Transport & data model | ||
|
|
||
| - **Upload:** `@a2a-js/sdk` `FilePart` `{ kind: "file", file: { name, mimeType, | ||
| bytes /*base64*/ } }` appended alongside the text part on `message/stream`. | ||
| - **Download:** A2A artifact-update events carrying a `FilePart`. | ||
| - **ADK artifact value:** `*genai.Part` with `InlineData {Data, MIMEType, | ||
| DisplayName}`; key `(AppName, UserID, SessionID, FileName, Version)`; versions | ||
| auto-increment; `ArtifactDelta map[filename]version` set automatically on save. | ||
|
|
||
| ### Go ADK runtime | ||
|
|
||
| - **`go/adk/pkg/runner/adapter.go`** — set | ||
| `ArtifactService: adkartifact.InMemoryService()` on the `runner.Config` in | ||
| `CreateRunnerConfig` (single instance, reused for process-lifetime persistence). | ||
| - **`go/adk/pkg/agent/agent.go`** — register `loadartifactstool.New()` and the | ||
| new `save_artifact` tool in the agent's local tool set. | ||
| - **`go/adk/pkg/tools/save_artifact_tool.go`** — new tool letting the LLM produce | ||
| a downloadable file from chat; the executor surfaces it as an A2A file part. | ||
| - **`go/adk/pkg/a2a/executor.go`** — enable | ||
| `runConfig.SaveInputBlobsAsArtifacts = true`; on each event with | ||
| `Actions.ArtifactDelta`, `Load` each `(name, version)` from the store, set | ||
| `InlineData.DisplayName`, convert via `ToA2APart`, and emit an A2A | ||
| artifact-update `FilePart` event (`LastChunk=true`). Load/convert errors are | ||
| logged and skipped so the turn continues. | ||
| - **`go/adk/pkg/a2a/artifacts.go`** — helpers for building/emitting artifact | ||
| events from saved ADK parts. | ||
| - **`go/adk/pkg/fileextract/`** (`fileextract.go`, `pdf.go`) — extract text from | ||
| uploaded documents (PDF and other supported types) so the content is injected | ||
| for models that can't read the raw file. | ||
| - **`go/adk/pkg/models/openai_adk.go`** — inject extracted file text into the | ||
| OpenAI request path. | ||
| - New Go deps in `go/go.mod` / `go/go.sum` for PDF extraction. | ||
|
|
||
| ### Python runtime | ||
|
|
||
| - **`python/packages/kagent-adk/src/kagent/adk/models/_file_extract.py`** — text | ||
| extraction (PDF, etc.) mirroring the Go path. | ||
| - **`python/packages/kagent-adk/src/kagent/adk/models/_openai.py`** — inject | ||
| extracted file content into the OpenAI request. | ||
| - **`python/packages/kagent-adk/pyproject.toml`** — add the extraction dependency. | ||
|
|
||
| > Note: the original design scoped Python as a follow-up; the shipped | ||
| > implementation includes the Python extraction path as well. | ||
|
|
||
| ### UI (Next.js, `ui/src`) | ||
|
|
||
| - **`lib/fileUpload.ts`** — `FILE_ACCEPT`, `MAX_FILE_BYTES` (10 MB), `isAllowedFile`, | ||
| `fileToFilePart` (read file → base64 `FilePart`). Allowlist: images, PDF, | ||
| text/markdown, CSV, JSON. | ||
| - **`components/chat/ChatInterface.tsx`** — attach button + hidden multi-file | ||
| `<input>`; `pendingFiles` state with removable chips; type/size validation with | ||
| toasts; build `FilePart`s and append to the outgoing message; session naming | ||
| falls back to the first file name for file-only messages. | ||
| - **`components/chat/FileAttachment.tsx`** (new) — image thumbnail (object URL) | ||
| or a download chip (icon, filename, human size, download button). | ||
| - **`components/chat/ChatMessage.tsx`** — render file parts in user and agent | ||
| bubbles. | ||
| - **`lib/messageHandlers.ts`** — preserve `file` parts from messages and from | ||
| `artifact-update` events (`extractMessagesFromTasks`). | ||
|
|
||
| ### Edge / deployment | ||
|
|
||
| - **`helm/kagent/files/nginx.conf`** — add | ||
| `client_max_body_size {{ .Values.ui.nginx.clientMaxBodySize }};`. | ||
| - **`helm/kagent/values.yaml`** — `ui.nginx.clientMaxBodySize: 50m` (default). | ||
| - **`helm/kagent/tests/ui-nginx-configmap_test.yaml`** — assert the default and | ||
| an override render into the config. | ||
|
|
||
| ### Acceptance criteria (AC1–AC8) | ||
|
|
||
| AC1 artifact service wired · AC2 upload passed inline · AC3 upload persisted · | ||
| AC4 agent-saved artifact emitted as A2A `FilePart` · AC5 `loadartifactstool` | ||
| registered · AC6 UI multi-file attach with validation · AC7 thumbnail/chip | ||
| rendering in both bubbles · AC8 E2E upload round trip on kind. | ||
|
|
||
| ### Test Plan | ||
|
|
||
| - **Go unit:** `adapter_test.go` (non-nil `ArtifactService`), `agent_test.go` | ||
| (tool list includes load/save tools), `executor_test.go` (ArtifactDelta → | ||
| emitted `FilePart`; oversized inbound → failed status), `artifacts_test.go`, | ||
| `save_artifact_tool_test.go`, `fileextract` tests (`fileextract_test.go`, | ||
| `fixture_test.go`), `models/openai_adk_test.go`. | ||
| - **Go e2e:** `go/core/test/e2e/file_upload_test.go` — upload an inline A2A file | ||
| part to a Go ADK agent and assert it is processed (uses the current | ||
| `a2aproject/a2a-go/v2` API). | ||
| - **Python unit:** `tests/unittests/models/test_file_extract.py`, | ||
| `test_openai.py`. | ||
| - **UI unit:** `lib/__tests__/fileUpload.test.ts`, | ||
| `lib/__tests__/messageHandlers.test.ts`, | ||
| `chat/__tests__/FileAttachment.test.tsx`, `chat/__tests__/ChatMessage.test.tsx`. | ||
| - **Helm:** `helm/kagent/tests/ui-nginx-configmap_test.yaml`. | ||
|
|
||
| ## Alternatives | ||
|
|
||
| - **Core-server artifact endpoints / mounting `adkrest`:** more moving parts, a | ||
| second transport, and `adkrest` has no upload route — rejected in favor of | ||
| A2A-only. | ||
| - **Diff the artifact store per turn:** racy and fragile versus the idiomatic | ||
| `ArtifactDelta` event signal. | ||
| - **Inline-only download (no store surfacing):** wouldn't connect the artifact | ||
| store to the UI. | ||
| - **Send raw files to every model:** token bloat and many models can't read | ||
| PDFs — hence server-side text extraction with the 10 MB cap. | ||
|
|
||
| ## Open Questions | ||
|
|
||
| - Durable/shared artifact storage for multi-replica deployments and restarts. | ||
| - Per-file size limit configurability beyond the current 10 MB UI cap + | ||
| `client_max_body_size`. | ||
| - Whether to expose an artifact browser (list/version/delete) in the UI. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,136 @@ | ||
| package a2a | ||
|
|
||
| import ( | ||
| "context" | ||
| "encoding/base64" | ||
| "fmt" | ||
| "maps" | ||
| "os" | ||
| "strconv" | ||
|
|
||
| a2atype "github.com/a2aproject/a2a-go/a2a" | ||
| "github.com/a2aproject/a2a-go/a2asrv" | ||
| "github.com/a2aproject/a2a-go/a2asrv/eventqueue" | ||
| adkartifact "google.golang.org/adk/artifact" | ||
| "google.golang.org/adk/server/adka2a" //nolint:staticcheck // kagent still uses a2a-go v1; this ADK package is the compatibility adapter. | ||
| ) | ||
|
|
||
| const ( | ||
| // defaultMaxArtifactBytes is the default per-file size limit for inbound | ||
| // uploads (10 MB). | ||
| defaultMaxArtifactBytes = 10 * 1024 * 1024 | ||
| // envMaxArtifactBytes overrides the inbound file size limit (in bytes). | ||
| envMaxArtifactBytes = "KAGENT_MAX_ARTIFACT_BYTES" | ||
| ) | ||
|
|
||
| // MaxArtifactBytes returns the artifact size limit, honoring the | ||
| // KAGENT_MAX_ARTIFACT_BYTES env var and falling back to the default. It bounds | ||
| // both inbound uploads and agent-saved artifacts. | ||
| func MaxArtifactBytes() int { | ||
| if v := os.Getenv(envMaxArtifactBytes); v != "" { | ||
| if n, err := strconv.Atoi(v); err == nil && n > 0 { | ||
| return n | ||
| } | ||
| } | ||
| return defaultMaxArtifactBytes | ||
| } | ||
|
|
||
| // checkInboundFileSizes returns an error if any inbound FilePart's decoded | ||
| // content exceeds the limit. Only inline base64 (FileBytes) parts are checked; | ||
| // URI-referenced files are out of scope. | ||
| func checkInboundFileSizes(msg *a2atype.Message, limit int) error { | ||
| if msg == nil { | ||
| return nil | ||
| } | ||
| for _, part := range msg.Parts { | ||
| fp := asFilePart(part) | ||
| if fp == nil { | ||
| continue | ||
| } | ||
| fb, ok := fp.File.(a2atype.FileBytes) | ||
| if !ok { | ||
| continue | ||
| } | ||
| decoded, err := base64.StdEncoding.DecodeString(fb.Bytes) | ||
| if err != nil { | ||
| return fmt.Errorf("invalid base64 in file part %q: %w", fb.Name, err) | ||
| } | ||
| if len(decoded) > limit { | ||
| return fmt.Errorf("file %q exceeds maximum allowed size: %d bytes > %d bytes", fb.Name, len(decoded), limit) | ||
| } | ||
| } | ||
| return nil | ||
| } | ||
|
|
||
| // asFilePart extracts a *FilePart from an A2A Part, handling both value and | ||
| // pointer types. | ||
| func asFilePart(part a2atype.Part) *a2atype.FilePart { | ||
| switch p := part.(type) { | ||
| case *a2atype.FilePart: | ||
| return p | ||
| case a2atype.FilePart: | ||
| return &p | ||
| } | ||
| return nil | ||
| } | ||
|
|
||
| // emitArtifacts loads each artifact named in delta from the artifact service | ||
| // and emits it as an A2A artifact event carrying a FilePart. Load/convert | ||
| // failures are logged and skipped so the turn continues (AC4). | ||
| func (e *KAgentExecutor) emitArtifacts( | ||
| ctx context.Context, | ||
| reqCtx *a2asrv.RequestContext, | ||
| queue eventqueue.Queue, | ||
| userID string, | ||
| sessionID string, | ||
| delta map[string]int64, | ||
| eventMeta map[string]any, | ||
| ) { | ||
| svc := e.runnerConfig.ArtifactService | ||
| if svc == nil { | ||
| return | ||
| } | ||
|
|
||
| for name, version := range delta { | ||
| resp, err := svc.Load(ctx, &adkartifact.LoadRequest{ | ||
| AppName: e.appName, | ||
| UserID: userID, | ||
| SessionID: sessionID, | ||
| FileName: name, | ||
| Version: version, | ||
| }) | ||
| if err != nil { | ||
| e.logger.Error(err, "failed to load saved artifact", "name", name, "version", version) | ||
| continue | ||
| } | ||
| if resp == nil || resp.Part == nil { | ||
| e.logger.V(1).Info("artifact load returned no part", "name", name, "version", version) | ||
| continue | ||
| } | ||
|
|
||
| part := resp.Part | ||
| // Carry the filename so the converted FilePart has a Name. | ||
| if part.InlineData != nil && part.InlineData.DisplayName == "" { | ||
| part.InlineData.DisplayName = name | ||
| } | ||
|
|
||
| a2aPart, err := adka2a.ToA2APart(part, nil) | ||
| if err != nil { | ||
| e.logger.Error(err, "failed to convert artifact to A2A part", "name", name, "version", version) | ||
| continue | ||
| } | ||
|
|
||
| artifactEvent := a2atype.NewArtifactEvent(reqCtx, a2aPart) | ||
| artifactEvent.LastChunk = true | ||
| artifactEvent.Metadata = maps.Clone(eventMeta) | ||
| artifactEvent.Metadata[adka2a.ToA2AMetaKey("artifact_name")] = name | ||
| artifactEvent.Metadata[adka2a.ToA2AMetaKey("artifact_version")] = version | ||
| if part.InlineData != nil { | ||
| artifactEvent.Metadata[adka2a.ToA2AMetaKey("mime_type")] = part.InlineData.MIMEType | ||
| } | ||
|
|
||
| if err := queue.Write(ctx, artifactEvent); err != nil { | ||
| e.logger.Error(err, "failed to write artifact event", "name", name, "version", version) | ||
| } | ||
| } | ||
| } |
Oops, something went wrong.
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this still correct that python is out of scope?