Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -39,6 +39,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

### Changed (Internal)

- **MCP server catch-up — `drt_doctor` tool + `drt_run_sync(compute_diff=True)` CLI parity**: the MCP server picked up two pieces that had been shipped on the CLI for months but weren't reachable from Claude / Cursor. **`drt_doctor`** mirrors the `drt doctor` CLI (v0.7.0+) and returns a structured `{"passed": bool, "checks": [{"category", "name", "ok", "message"}, ...]}` report — runtime / project / extras / env-var rows, so "why won't this drt project run?" is one tool call instead of a console screenshot. **`drt_run_sync`** now accepts `compute_diff: bool` and `diff_limit: int` arguments matching the `drt run --dry-run --diff` CLI (#413, v0.7.1+) — when set, the response carries a structured `diff` field via the existing `cli/output.diff_to_dict` serialiser, so an LLM can preview added / updated / deleted / unchanged records for queryable destinations before committing. `compute_diff=True` without `dry_run=True` returns a structured error rather than silently running against a live destination (matches the CLI contract). Module docstring updated to list the now-complete tool surface (9 tools), and `drt_get_history` — which had been registered in code but missing from the docstring since #445 — is now listed there too. README.md / README.ja.md MCP-tool tables updated. 5 new tests in `tests/unit/test_mcp.py` cover the new tool registration, the `compute_diff` requires-`dry_run` contract, the doctor happy path against a well-formed fixture project, and the doctor `passed=False` path when the project file is missing.

- **Shared object-storage serialiser extracted from S3 destination** (prep for [#169](https://github.com/drt-hub/drt/issues/169) GCS + [#170](https://github.com/drt-hub/drt/issues/170) Azure Blob): the format-handling logic (csv / json / jsonl / parquet + gzip wrapping + `<prefix><timestamp>.<ext>` key naming) is now `drt/destinations/_blob_serializer.py` — pure, config-agnostic, no client dependency. The S3 destination keeps its existing behaviour (locked by the 25-test suite, all green; one `unittest.mock.patch.object(S3Destination, "_serialise")` test re-pointed at the new `serialise_records` import path) and shrinks from 244 LOC to 167 LOC. GCS and Azure Blob destinations will land as thin `_client` + `put_object`-equivalent shims on top of this module instead of duplicating the format/key logic three times. 19 new tests in `tests/unit/test_blob_serializer.py` lock the module's own contract: CSV header + data rows, JSON-as-array, JSONL-per-line, `default=str` fallback for datetime/Decimal/UUID, gzip wraps text formats while parquet ignores the outer gzip flag, parquet missing-extras raises with the `drt-core[parquet]` install hint preserved, key-naming default + template overrides (with and without an explicit extension), `.gz` suffix appended for gzip-text but not gzip-parquet. No behaviour change, no public API change.

- **OpenTelemetry observability scaffolding — Phase 2** ([#531](https://github.com/drt-hub/drt/issues/531), PR [#615](https://github.com/drt-hub/drt/pull/615) by [@Muawiya-contact](https://github.com/Muawiya-contact)): new `drt/observability/` package owning the lazy initialisation of the global OTLP-backed tracer and meter providers, plus a guaranteed no-op fallback so drt has **zero runtime overhead** when OTel is not configured or `[otel]` extras are not installed. Public functions `get_tracer()` and `get_meter()` return a real OTLP-backed instance when `observability.otel.endpoint` is set (or the standard `OTEL_EXPORTER_OTLP_ENDPOINT` env var is present as a fallback) and the `[otel]` extras are installed; in every other case (extras missing, no `observability` block, missing endpoint, exporter init failure) they return a `_FallbackNoOpTracer` / `_FallbackNoOpMeter` that supports the full API surface (`start_as_current_span` / `set_attribute` / `record_exception` / `set_status` / `create_counter` / `create_up_down_counter` / `create_histogram` / `create_gauge`) so Phase 3 instrumentation can sprinkle `with tracer.start_as_current_span(...)` everywhere without risk. Provider init is **lazy and idempotent** (double-checked locking via `_STATE.initialized` + `_LOCK`, zero side-effects at import time). `service.name` resource attribute defaults to `"drt"` and is overridable via `observability.otel.service_name`. OTLP headers from `observability.otel.headers` are passed through to the exporter with `${VAR}` env expansion happening at provider init (not at config load time — Phase 1 deliberately keeps headers raw on disk for safety). Endpoint normalisation maps `http://` scheme to `insecure=True` per the gRPC OTLP convention. The `_warn_once` helper guarantees init failures log a single warning, never spam. New `tests/unit/test_otel.py` ships 12 tests covering the no-op-when-otel-is-missing path, env-var fallback, full config with header expansion, exporter init failure → no-op + warning, idempotent reuse, and the defensive paths (`_parse_otlp_headers_env` `ValueError` raises × 2, `_expand_headers` non-dict / non-string raises, `_FallbackNoOpInstrument.add` / `.record`, `_FallbackNoOpMeter.create_*`). Continuation of [@Muawiya-contact](https://github.com/Muawiya-contact)'s OTel chain since [#429](https://github.com/drt-hub/drt/issues/429) / [#527](https://github.com/drt-hub/drt/pull/527) (Phase 1 config schema); Phase 3 (engine instrumentation: `drt.sync.run` / `drt.sync.extract` / `drt.sync.load` spans wrapping `engine/sync.py` execution) is the natural next step and will be filed as a separate sub-issue.
Expand Down
5 changes: 4 additions & 1 deletion README.ja.md
Original file line number Diff line number Diff line change
Expand Up @@ -185,11 +185,14 @@ drt mcp run
| ツール | 機能 |
|------|-------------|
| `drt_list_syncs` | 同期定義の一覧を表示 |
| `drt_run_sync` | 同期を実行(`dry_run`対応) |
| `drt_run_sync` | 同期を実行(`dry_run` + `compute_diff` で `--diff` 同等) |
| `drt_run_test` | 同期後の検証テストを実行(`drt test` 相当) |
| `drt_get_status` | 前回の実行結果を取得 |
| `drt_get_history` | 直近の同期実行履歴を取得 |
| `drt_validate` | 同期YAML設定を検証 |
| `drt_get_schema` | 設定ファイルのJSONスキーマを返す |
| `drt_list_connectors` | 利用可能なソースとデスティネーションを一覧 |
| `drt_doctor` | 環境診断(`drt doctor` 相当) |

---

Expand Down
19 changes: 11 additions & 8 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -208,14 +208,17 @@ drt mcp run

**Available MCP tools:**

| Tool | What it does |
| --------------------- | --------------------------------------- |
| `drt_list_syncs` | List all sync definitions |
| `drt_run_sync` | Run a sync (supports `dry_run`) |
| `drt_get_status` | Get last run result(s) |
| `drt_validate` | Validate sync YAML configs |
| `drt_get_schema` | Return JSON Schema for config files |
| `drt_list_connectors` | List available sources and destinations |
| Tool | What it does |
| --------------------- | ------------------------------------------------------------------------------------- |
| `drt_list_syncs` | List all sync definitions |
| `drt_run_sync` | Run a sync (supports `dry_run` + `compute_diff` for `--diff` parity) |
| `drt_run_test` | Run post-sync validation tests (mirrors `drt test`) |
| `drt_get_status` | Get last run result(s) |
| `drt_get_history` | Get recent sync run history |
| `drt_validate` | Validate sync YAML configs |
| `drt_get_schema` | Return JSON Schema for config files |
| `drt_list_connectors` | List available sources and destinations |
| `drt_doctor` | Environment diagnostics (mirrors `drt doctor`) |

---

Expand Down
112 changes: 108 additions & 4 deletions drt/mcp/server.py
Original file line number Diff line number Diff line change
Expand Up @@ -6,12 +6,14 @@

Tools:
drt_list_syncs — list all sync definitions
drt_run_sync — run a specific sync (dry_run supported)
drt_run_sync — run a specific sync (dry_run + compute_diff supported)
drt_run_test — run post-sync validation tests for a sync
drt_get_status — get last sync result for a sync
drt_get_history — get recent sync run history (v0.7+)
drt_validate — validate all sync YAML configs (per-file errors)
drt_get_schema — return JSON Schema for drt_project.yml / sync.yml
drt_list_connectors — list available source and destination connectors
drt_doctor — environment diagnostics (mirrors `drt doctor` CLI)
"""

from __future__ import annotations
Expand Down Expand Up @@ -68,22 +70,41 @@ def drt_list_syncs() -> list[dict[str, str]]:
# -----------------------------------------------------------------------

@mcp.tool()
def drt_run_sync(sync_name: str, dry_run: bool = False) -> dict[str, Any]:
def drt_run_sync(
sync_name: str,
dry_run: bool = False,
compute_diff: bool = False,
diff_limit: int = 20,
) -> dict[str, Any]:
"""Run a specific drt sync.

Args:
sync_name: Name of the sync to run (from drt_list_syncs).
dry_run: If True, extracts data but does not write to destination.
compute_diff: When True (requires ``dry_run=True``), compute a
record-level diff (added / updated / deleted / unchanged)
against the destination. Queryable destinations get a true
diff; non-queryable destinations get a sample preview.
Mirrors ``drt run --dry-run --diff`` (v0.7.1+).
diff_limit: Cap on records per diff category (default 20).

Returns:
Result summary with success count, failed count, and any errors.
Result summary with success count, failed count, errors, and
(when ``compute_diff=True``) a ``diff`` field with the
structured preview.
"""
from drt.cli.main import _get_destination, _get_source
from drt.config.credentials import load_profile
from drt.config.parser import load_project, load_syncs
from drt.engine.sync import run_sync
from drt.state.manager import StateManager

if compute_diff and not dry_run:
return {
"error": "compute_diff requires dry_run=True (matches the "
"`drt run --dry-run --diff` CLI contract)."
}

project = load_project(_project_dir)
profile = load_profile(project.profile)
syncs = load_syncs(_project_dir)
Expand All @@ -105,15 +126,23 @@ def drt_run_sync(sync_name: str, dry_run: bool = False) -> dict[str, Any]:
_project_dir,
dry_run,
state_mgr,
compute_diff=compute_diff,
diff_limit=diff_limit,
)

return {
response: dict[str, Any] = {
"sync_name": sync_name,
"dry_run": dry_run,
"success": result.success,
"failed": result.failed,
"errors": result.errors[:10], # cap at 10 to avoid huge payloads
}
diff_value = getattr(result, "diff", None)
if compute_diff and diff_value is not None:
from drt.cli.output import diff_to_dict

response["diff"] = diff_to_dict(diff_value)
return response

# -----------------------------------------------------------------------
# drt_run_test
Expand Down Expand Up @@ -361,6 +390,81 @@ def drt_list_connectors() -> dict[str, list[dict[str, str]]]:
],
}

# -----------------------------------------------------------------------
# drt_doctor
# -----------------------------------------------------------------------

@mcp.tool()
def drt_doctor() -> dict[str, Any]:
"""Run environment diagnostics — the MCP equivalent of ``drt doctor``.

Mirrors the CLI ``drt doctor`` (v0.7.0+) but returns a structured
report instead of a console table. Useful for "why won't this drt
project run?" before reading any code — catches missing env vars,
malformed profile, uninstalled extras, etc.

Returns:
``{"passed": bool, "checks": [{"category", "name", "ok",
"message"}, ...]}`` where ``passed`` is False if any required
check failed (project file / profile / Python version).
"""
from drt import __version__ as drt_version
from drt.cli.doctor import (
_check_env_vars,
_check_extras,
_check_profile,
_check_project_file,
_check_python,
_check_syncs,
)

checks: list[dict[str, Any]] = []
required_ok = True

py_ok, py_msg = _check_python()
checks.append(
{"category": "runtime", "name": "Python version", "ok": py_ok, "message": py_msg}
)
required_ok = required_ok and py_ok

checks.append(
{
"category": "runtime",
"name": "drt version",
"ok": True,
"message": drt_version,
}
)

proj_ok, proj_msg, project_data = _check_project_file()
checks.append(
{"category": "project", "name": "Project file", "ok": proj_ok, "message": proj_msg}
)
required_ok = required_ok and proj_ok

if project_data:
prof_ok, prof_msg = _check_profile(project_data)
checks.append(
{"category": "project", "name": "Profile", "ok": prof_ok, "message": prof_msg}
)
required_ok = required_ok and prof_ok

_, syncs_ok, syncs_msg = _check_syncs(project_data)
checks.append(
{"category": "project", "name": "Syncs", "ok": syncs_ok, "message": syncs_msg}
)

for label, ok, msg in _check_extras():
# Extras are optional — they affect ``ok`` of the row but not
# overall ``passed``. A user can run a duckdb-only project with
# no other extras installed and that's fine.
checks.append({"category": "extras", "name": label, "ok": ok, "message": msg})

for var, ok, msg in _check_env_vars(project_data):
checks.append({"category": "env", "name": var, "ok": ok, "message": msg})

return {"passed": required_ok, "checks": checks}

return mcp


Expand Down
Loading
Loading