feat(krep): rg streaming + LSA model + standalone CLI (v2.0.2.0) by ozhangebesoglu · Pull Request #4 · ozhangebesoglu/Kishi-Shell

ozhangebesoglu · 2026-06-01T13:09:10Z

Summary

Kishi Shell v2.0.2.0: a major restructuring of Krep AI semantic search. Three main features in a single branch (a merge of rg-streaming and svd-model).

🚀 New Features

1. ripgrep streaming prefilter (150-3000× speedup)

- rg is dispatched automatically when it is present on the system (no user setup required)
- Streaming Popen + limit×10 early termination + 10 s hard timeout
- Adaptive fallback: if rg returns 0 matches, the search falls through to the semantic walker
- `--no-rg` flag for debugging
- Kishi src `auth login`: 1668 ms → 5 ms (220×)
- Python stdlib (6.8M lines): timeout → 11 ms (>5000×)
- 1 GB single file: timeout → 6 ms (>10000×)

2. Dictionary-free LSA model (PPMI + SVD)

- `krep --learn PATH` derives the vocabulary and the 3D semantic axes automatically from the corpus
- No dependency on the manual 178-keyword dictionary; all languages are handled automatically
- Rank-50 SVD HD vectors (cosine ranking) + PCA-3 (scatter visualization)
- Axis auto-labels (frequency-weighted top-5 words)
- Verified on Loghub (OpenSSH/Apache/Linux/Mac/HDFS, 10k lines)

3. Tail-aware incremental + lazy auto-refresh

- `krep --update-learn` processes only NEW lines (file offset tracking)
- With `--auto-refresh 1h`, each query can trigger a background subprocess refresh
- The current query continues with the old model; the next one sees the new model
- Rotation/truncation detection (size shrink → re-read from the start)
- Deleted files are removed from file_state automatically

4. Standalone `krep` CLI

- `pip install kishi-shell` now installs two binaries: `kishi` and `krep`
- Callable directly from bash/zsh/fish: `krep "auth" /var/log/`

5. Optional numpy/scipy (philosophy preserved)

- Core: `pip install kishi-shell` → 2 deps (prompt_toolkit, psutil)
- LSA model: `pip install kishi-shell[krep]` → +numpy/scipy
- The keyword engine keeps working without numpy

📊 Test Coverage

- 407 / 407 tests passing
- 95 new tests (krep_learn 63 + krep_streaming 27 + krep_perf 5)
- Senior code audit APPROVED (2 IMPORTANT fixes applied)
- No memory leak (10k iterations, +0.2 MB RSS)
- Thread-safe (16 threads × 20 parallel = 320, 0 exceptions)

🛠️ CLI Commands

```bash
krep PATTERN [PATH...] # Search
krep --learn PATH # Build LSA model
krep --update-learn PATH # Tail-only incremental
krep --auto-refresh 1h # Lazy background refresh
krep --list-models # Show cached models
krep --purge-models # Delete all models
krep --no-model PATTERN PATH # Bypass model, keyword only
krep --no-rg PATTERN PATH # Bypass ripgrep, walker only
```

Test plan

- 407/407 pytest passing
- Manual: `krep -r 'auth login' kishi/` (rg + LSA working)
- Manual: `krep --learn /tmp/krep_realtest/` Loghub corpora
- Manual: `krep --list-models` (FRESH/STALE, age, axes)
- Manual: `echo "auth login" | krep auth` (stdin pipe)
- Edge: 1 GB single file (6 ms)
- Edge: empty corpus, permission denied, symlink loop, binary, null byte
- Senior code-audit independent review (subagent)

🤖 Generated with Claude Code

…nsion and concept pruning and bump version to 2.0.1.0

Plan covers a 5-task TDD rollout that introduces a ripgrep-based streaming prefilter with early termination, while keeping the existing Cython-backed walker as the fallback path. No new Python runtime dependencies; ripgrep stays optional.

Verified gains (spike measurements):
- Kishi src "auth login": 1656 ms → ~11 ms (150x)
- Python stdlib "auth login" (~20k files): 56 s → ~19 ms (~3000x)
- Fallback (no rg): existing behavior (no regression)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Co-Authored-By: Claude Sonnet <noreply@anthropic.com>

Subagent's earlier split-based implementation kept dotted tokens literal (e.g. 'config.py' → 'config\.py'), but that diverges from Krep's semantic OR-prefilter character. Real-corpus measurement shows:
- 'auth login', 'error timeout' (typical queries): IDENTICAL behavior.
- 'auth.token expired': findall yields 64 matches vs split's 6, the broader semantic coverage that the user actually wants.
- 'user@admin': findall yields 63 matches vs split's 0.

Restored re.findall(r'[\w]+') per the plan. The metacharacter safety test now asserts the cleaner drop-not-escape behavior: \w+ naturally strips meta-chars, so the pattern is always re.compile-safe.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
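The drop-not-escape tokenization described above can be sketched roughly as follows. The helper name `build_or_pattern` is hypothetical (the PR only names `_build_rg_pattern` and does not show its body), and the order-preserving deduplication is an added assumption.

```python
import re


def build_or_pattern(query):
    """Hypothetical helper: turn a free-text query into an OR regex of word tokens."""
    # \w+ keeps only word characters, so '.', '@', '(' and other
    # metacharacters are dropped rather than escaped.
    tokens = re.findall(r"\w+", query)
    # Assumption: deduplicate while preserving order to keep the pattern short.
    unique = list(dict.fromkeys(tokens))
    return "|".join(unique)


if __name__ == "__main__":
    for query in ("auth login", "auth.token expired", "user@admin"):
        pattern = build_or_pattern(query)
        re.compile(pattern)  # tokens contain no metacharacters, so this cannot fail
        print(f"{query!r} -> {pattern!r}")
```

Because every token is made of word characters only, the joined pattern needs no escaping. An empty query yields an empty pattern, which a caller would probably want to reject before spawning a search.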

- Popen + line-by-line streaming via stdout
- Terminates rg when limit*early_stop_factor matches reached (default 50)
- 10s hard-timeout safety net
- Non-UTF8 safe (bytes mode + decode errors='replace')
- Returns (matches, stats) tuple where matches uses (l_vec, sim, output_str) format compatible with krep_search's existing finalize path
- Does NOT use rg's -w (word-boundary) flag: that would skip 'auth_token', 'authenticate' for query 'auth', which is too narrow for Krep's semantic prefilter goal.

Performance on real corpora (post -w removal):
- Kishi src 'auth login': 8 ms (vec=20, match=17)
- Kishi src 'error timeout': 10 ms (vec=50, match=50, early-stop)
- Stdlib 'auth login': 23 ms (vec=66, match=50, early-stop)
- Stdlib 'database query': 23 ms (vec=53, match=50, early-stop)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
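A minimal sketch of the streaming pattern listed above, under stated assumptions: `stream_lines` is a hypothetical stand-in for `_krep_rg_streaming`, it takes an arbitrary command instead of building an rg invocation, and it returns raw lines rather than vectorized match tuples.

```python
import subprocess
import sys
import time


def stream_lines(cmd, limit, hard_timeout=10.0):
    """Read a child's stdout line by line and stop early after `limit` lines."""
    deadline = time.monotonic() + hard_timeout
    lines = []
    stats = {"early_stop": False, "timed_out": False}
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.DEVNULL)
    try:
        for raw in proc.stdout:  # bytes mode: safe for non-UTF-8 input
            lines.append(raw.decode("utf-8", errors="replace").rstrip("\r\n"))
            if len(lines) >= limit:
                stats["early_stop"] = True
                break
            # Checked between lines only; a silent child is not interrupted here.
            if time.monotonic() > deadline:
                stats["timed_out"] = True
                break
    finally:
        proc.stdout.close()  # close the pipe first so the child cannot block on a full buffer
        proc.terminate()
        try:
            proc.wait(timeout=2)
        except subprocess.TimeoutExpired:
            proc.kill()
            proc.wait()
    return lines, stats


if __name__ == "__main__":
    # A Python child stands in for rg so the sketch runs without ripgrep installed.
    child = [sys.executable, "-c", "for i in range(100000): print(i)"]
    print(stream_lines(child, limit=5))
```

Closing stdout before waiting mirrors the pipe-close fix described later in this PR; without it, a child that keeps writing can stall until the wait timeout expires.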

- Extracted shared _krep_finalize(matches, q_vec, limit) helper.
- krep_search now dispatches to _krep_rg_streaming when rg is present and a path argument was provided.
- Adaptive fallback: if rg returns 0 matches but the pattern was valid, fall through to the legacy walker. This preserves Krep's semantic edge for queries like 'login authorization' that should still match 'auth token' lines via concept-vector bigram similarity.
- stdin mode and rg_spawn_failed both route to the legacy walker.

Verification:
- 333 / 333 tests pass (no regression in existing 305 tests).
- test_krep_perf::test_recursive_search_under_threshold PASSES at <100 ms (Task 1 TDD target met).
- test_krep_search_files (legacy semantic-match test) PASSES via the adaptive fallback path.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
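The dispatch rules above can be sketched as a small function. Everything here is hypothetical scaffolding: `search_with_fallback`, the injected `rg_search` / `walker_search` callables, and the `stats` dictionary shape are illustrative names, not the actual krep_search signature.

```python
def search_with_fallback(pattern, paths, rg_search, walker_search, has_rg=True):
    """Try the fast prefilter first; fall back to the walker when it yields nothing."""
    # stdin mode (no paths) and a missing rg both skip the fast path.
    if has_rg and paths:
        matches, stats = rg_search(pattern, paths)
        # Zero matches or a failed spawn both fall through to the walker.
        if matches and not stats.get("rg_spawn_failed"):
            return matches, "rg"
    return walker_search(pattern, paths), "walker"


if __name__ == "__main__":
    def fake_rg(pattern, paths):
        # Pretend the literal prefilter only knows about 'auth'.
        return (["auth token"], {}) if "auth" in pattern else ([], {})

    def fake_walker(pattern, paths):
        return ["semantic hit"]

    print(search_with_fallback("auth", ["src/"], fake_rg, fake_walker))
    print(search_with_fallback("login authorization", ["src/"], fake_rg, fake_walker))
    print(search_with_fallback("auth", [], fake_rg, fake_walker))
```

Injecting the two search functions keeps the routing logic testable without ripgrep or a corpus on disk.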

- kishi_krep gains --no-rg flag (debug/test); briefly overrides kishi.krep._HAS_RG and restores it after the call.
- help text documents --no-rg and the rg-streaming performance bump.
- README.md & README.tr.md gain a Krep Performance section with the full benchmark table (Walker vs rg-streaming, real corpora).
- Version bumped in pyproject.toml, kishi/main.py banner, kishi/builtins.py help_text and neofetch shell line, and both READMEs.

Verified end-to-end (3-run averages, 12-core x86_64, Python 3.14, ripgrep 15.1):
- Kishi src 'auth login': 5 ms (vs 1068 ms walker) → 206x
- Kishi src 'database query': 5 ms (vs 1071 ms walker) → 210x
- Tests dir 'auth login': 6 ms (vs 1053 ms walker) → 171x
- Stdlib 'auth login': 11 ms (walker times out) → >5000x
- Stdlib 'database query': 14 ms (walker times out) → >4000x

335 / 335 tests pass. No regressions in existing 305-test baseline.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Three new permanent regression tests in tests/test_krep_perf.py:
1. test_p99_under_strict_threshold: 10 cold-cache runs, p99 must be <50 ms. Current p99 measured ~8 ms; 5x headroom for CI variance.
2. test_no_memory_leak_100_iterations: 100 sequential calls; RSS delta must be <10 MB. Current delta ~0 MB.
3. test_thread_safety_smoke: 8 threads × 5 calls = 40 parallel invocations; zero exceptions, all rc=0. Print is monkey-patched to no-op because capsys is not thread-safe.

Verified locally on real corpora:
- Kishi src (~5k lines): p50 ~5 ms, p99 ~8 ms
- Tests dir (~3k lines): p50 ~6 ms, p99 ~8 ms
- Stdlib (~6.8M lines): p50 ~10 ms, p99 ~24 ms
- 189k-file combined corpus: avg ~15 ms
- /usr top-level (mega): avg ~29 ms

These guards lock in the rg-streaming performance contract and catch any future regression (lost streaming, leaked subprocess, broken fallback) at CI time.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
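As an illustration of the p99 guard above, here is a minimal latency-percentile sketch. The nearest-rank definition and the helper names `percentile` and `measure_ms` are assumptions; the PR does not say how test_krep_perf.py computes its percentile.

```python
import math
import time


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty sample list (pct in 0..100)."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def measure_ms(fn, runs=10):
    """Call fn `runs` times and return the wall-clock duration of each call in ms."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        timings.append((time.perf_counter() - start) * 1000.0)
    return timings


if __name__ == "__main__":
    timings = measure_ms(lambda: sum(range(1000)), runs=10)
    p99 = percentile(timings, 99)
    print(f"p50={percentile(timings, 50):.3f} ms  p99={p99:.3f} ms")
    assert p99 < 50.0, "latency regression"
```

With only 10 runs, the nearest-rank p99 is simply the slowest run, so the guard effectively bounds the worst observed call.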

Independent senior code review identified 4 actionable findings:

1. [IMPORTANT] proc.stdout.close() missing before break in _krep_rg_streaming. On early-stop, proc.terminate() sent SIGTERM but the stdout pipe stayed open. If the pipe buffer was full, rg blocked until proc.wait(timeout=2) timed out, then proc.kill() ran, then another proc.wait(timeout=1): up to 3 s of dead-time on every early-stop. Fix: close stdout in `finally` before wait; SIGPIPE makes rg exit immediately. Measured impact: 100k-potential-match scenario dropped from 6.8 ms (luck) to a tight 6.7-7.5 ms upper bound across 5 runs.

2. [IMPORTANT] rg_spawn_failed was silent (operational blindness). When _HAS_RG is True but Popen raises OSError (broken executable, ENOMEM), we fall back to the legacy walker silently. Users would suddenly see 1-second searches instead of 10 ms and have no clue why. Fix: write an amber-colored stderr warning so the cause is visible.

3. [NIT] PEP8 E402: `import subprocess` and `import time` were inline mid-file. Moved to the module header alongside other stdlib imports.

4. New tests added:
- test_dispatch_rg_spawn_failed_falls_back (mocked OSError + assert warning in captured.err + assert fallback found the match)
- test_streaming_hard_timeout_safety_net (hard_timeout=0.001 forces early termination)
- test_streaming_terminates_cleanly_on_early_stop (asserts wall time <1 s; regression guard for the pipe-close bug)

341 / 341 tests pass. No behavior change for successful rg paths; only the failure-path latency is improved.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Plan was fully implemented (commits 75ab71c..8b9c587). Added an "Implementation Notes" section to the plan documenting 4 deliberate deviations and 1 extra layer that emerged during execution:

1. -w (word-boundary) flag removed: Krep is semantic and needed a wider prefilter so 'auth' matches 'auth_token'/'authenticate'.
2. Tokenization stayed on re.findall(\w+) per the plan; subagent's first iteration used split() and was reverted after real-corpus measurement showed findall's OR semantics is what Krep actually wants (64 vs 6 matches on 'auth.token expired' query).
3. Adaptive fallback when rg returns 0 hits: Krep's semantic edge (bigram bridging 'login authorization' to 'auth token') is preserved by falling through to the walker.
4. Senior audit fixes: proc.stdout.close() before terminate (was 3 s regression latency) + rg_spawn_failed stderr warning.
5. Extra CI guards: p99 < 50 ms, memory leak < 10 MB / 100 calls, 8-thread smoke. Plan didn't require these; they were added for production-grade confidence.

Final stress-test summary appended (10k iteration in 77 s, +0.2 MB RSS, 0 FD leak, ±1% perf drift; 200 MB single file 7.8 ms; /usr mega-corpus 29 ms; senior audit APPROVED with fixes applied; 341 tests passing).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

docs/benchmarks/2026-05-26-krep-perf.md captures every measurement made during this branch's verification cycle (4 iterations across ~5 Ralph-loop turns):
- 4-quadrant matrix (rg × Cython): isolates each accelerator's contribution. rg alone ≈ 260x; Cython adds 22-80% on top.
- Stat-grade table (30-run): p50/p95/p99/stdev for Kishi src, Stdlib. All under 30 ms p99.
- Corpus-size scaling row: 3 k lines → 50 M lines, search time stays in [5, 29] ms because early-stop is location-driven, not size-driven.
- Sub-component timing on 1 GB single file: rg subprocess 4.4 ms, Python wrapper +1.8 ms = 6.2 ms total. Demonstrates that the Python overhead is constant.
- Memory: tracemalloc +656 byte/100 iter, RSS +0.2 MB/10k iter, zero FD leak, zero zombies.
- Concurrency: 320 parallel calls, 32-thread cache write, zero exceptions in either.
- 21 edge case results (binary, symlink loop, null byte, permission denied, 1 GB single file, etc.) all rc-valid, no crash.
- Senior audit findings + status (all addressed).
- Coverage: ~89% on new code.
- Per-direction speedup table: up to >10000x on 1 GB single file.

This file is the regression baseline. Any future krep change that moves the numbers in the "ileride karşılaştırma" (future comparison) section in the wrong direction must be reviewed before merge.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

A corpus-based 3D semantic embedding, independent of the manual 178-word X/Y/Z_KEYWORDS dictionary. Latent Semantic Analysis (PPMI + truncated SVD).

New file: kishi/krep_learn.py (~340 lines)
- build_model(paths): vocab scan, co-occurrence matrix, PPMI normalization, scipy.sparse.linalg.svds(k=3), axis auto-labeling
- save/load model: vectors.npz (float32 array, no pickle) + metadata.json
- find_model_for(paths): model lookup via a deterministic folder hash
- vectorize_with_model(text, model): OOV-safe, out-of-vocabulary words are dropped
- list_models / purge_models: maintenance

kishi/krep.py:
- _resolve_model + _vectorize_dispatch helpers (with REPL cache)
- krep_search bash: use the model if one exists, otherwise fall back to keywords
- process_file: when a model exists, skip file-level pruning (expensive in vocab space) and vectorize every line with the model; the existing keyword path is UNCHANGED

kishi/builtins.py:
- --learn PATH... : build a model from the corpus and save it to ~/.cache/kishi/krep_models/
- --no-model : bypass the model (debug/test)
- --list-models : saved models, vocab/lines/axes
- --purge-models : delete all models

POC verification (Kishi src, 4560 lines):
- Build: 0.1 s, 1534 vocab, 34 KB model
- Auto-labeled axes:
  axis 0: self return def import not (Python structure)
  axis 1: print model krep color_reset path (UI/krep)
  axis 2: the kishi explorer command ctrl (TUI)
- Semantic: auth↔password=1.00, error↔fail=0.99 (without a dictionary!)
- Gain: 'plugin install' query → keyword 0 matches, model 5 matches; 'color message' query → keyword 0 matches, model 5 matches

Dependency: numpy>=1.20, scipy>=1.7 (broad PyPI wheel matrix, official package in the AUR, present on almost every dev machine).

312/312 tests pass, no regressions.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
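To make the PPMI step concrete, here is a minimal pure-Python sketch of co-occurrence counting followed by positive PMI weighting. It is not the krep_learn implementation: the line-level context window, the whitespace tokenizer, and the function names are assumptions, and the truncated SVD step (scipy.sparse.linalg.svds in the commit) is deliberately left out.

```python
import math
from collections import Counter
from itertools import combinations


def cooccurrence(lines):
    """Count unordered word pairs that share a line (assumed context window)."""
    pairs = Counter()
    for line in lines:
        words = sorted(set(line.lower().split()))
        for a, b in combinations(words, 2):
            pairs[(a, b)] += 1
    return pairs


def ppmi(pairs):
    """Positive PMI over a symmetric co-occurrence table: max(0, log(p(a,b) / (p(a) p(b))))."""
    marginal = Counter()
    for (a, b), n in pairs.items():
        marginal[a] += n
        marginal[b] += n
    # Total mass of the symmetric matrix: every unordered pair appears twice.
    grand = 2 * sum(pairs.values())
    scores = {}
    for (a, b), n in pairs.items():
        pmi = math.log(n * grand / (marginal[a] * marginal[b]))
        scores[(a, b)] = max(0.0, pmi)
    return scores


if __name__ == "__main__":
    corpus = ["auth login failed", "auth login ok", "disk full"]
    for pair, score in sorted(ppmi(cooccurrence(corpus)).items()):
        print(pair, round(score, 3))
```

Pairs that co-occur more often than their individual frequencies predict get a positive weight; clamping at zero discards pairs that co-occur less than chance. A truncated SVD of the resulting matrix would then give the dense word vectors.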

Your (the user's) suggestion: "after running --learn on a folder, give it an interval so that when new log lines arrive it automatically resumes where it left off". This is Celery periodic-task logic, but without a daemon.

NEW FEATURES:

1. Tail-aware incremental scan (krep_learn._scan_corpus):
- Records file_state{offset, mtime, size} for each file
- At update time:
  cur_size < prev_size → rotated/truncated, re-read from the start
  cur_size > prev_size → read new lines from the tail offset
  cur_mtime == prev_mtime → skip (unchanged)
- Deleted files are removed from file_state automatically

2. update_model(existing, paths) → incremental SVD:
- Adds the co-occurrence pairs of the new lines to the previous state
- New vocab words are included automatically
- PPMI + SVD are solved again (compact, correct)
- 5-second build → ~0.5-second update

3. Lazy auto-refresh (krep._resolve_model):
- krep --learn /var/log/ --auto-refresh 1h
- Checks model_age on every query: if > interval, subprocess.Popen([python -c "...krep_learn.update_model..."]) fire-and-forget (start_new_session=True)
- The current query continues with the old model; the next query sees the new model
- The _REFRESH_TRIGGERED set prevents a double spawn within the same query

4. New CLI:
- --update-learn PATH : manual tail incremental
- --auto-refresh INTERVAL : human-friendly interval (1h, 30m, 1d, 0=off)
- --list-models : shows age + auto-refresh + STALE/FRESH
- Existing --learn, --no-model, --purge-models preserved

5. parse_interval / format_age helpers:
- "1h" → 3600, "30m" → 1800, "1d" → 86400, "2w" → 1209600
- "0", "off", "false" → 0 (disabled)
- Invalid format → ValueError

E2E VERIFICATION:
- Initial build: 17 vocab, 40 lines, file_state with 2 files
- Log append (log1 only): +395 bytes → update_model processes ONLY log1
- V: 17 → 27 (new words: kubernetes, pod, restart, etc.)
- is_stale: 1h+just_built → False, 1h+2h_ago → True
- --list-models: correctly shows "auto-refresh 1h · STALE"

FILE FORMAT (per-model folder):
- vectors.npz: word_vecs (float32, no pickle)
- metadata.json: vocab, axis_labels, build_time, auto_refresh_seconds
- state.json: file_state, term_freq, pair_counts (for incremental updates)

load_model(with_state=False) is a lightweight load for queries; with_state=True loads the full state for updates. _MODEL_CACHE lives for the REPL lifetime.

312/312 existing tests pass, zero regressions.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
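The tail-aware scan rules above can be sketched for a single file as follows. `read_new_lines` is a hypothetical helper, not `krep_learn._scan_corpus`; in particular it treats an unchanged size as "nothing new", whereas the commit uses mtime equality as the skip signal.

```python
import os


def read_new_lines(path, state):
    """Return (new_lines, new_state); `state` is a previous state dict or None."""
    st = os.stat(path)
    if state is None:
        offset = 0                      # first scan: read everything
    elif st.st_size < state["size"]:
        offset = 0                      # file shrank: rotated or truncated
    elif st.st_size > state["size"]:
        offset = state["offset"]        # file grew: resume from the saved offset
    else:
        return [], state                # assumption: same size means unchanged
    with open(path, "rb") as handle:
        handle.seek(offset)
        data = handle.read()
        new_offset = handle.tell()
    lines = data.decode("utf-8", errors="replace").splitlines()
    new_state = {"offset": new_offset, "mtime": st.st_mtime, "size": st.st_size}
    return lines, new_state


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        log = os.path.join(tmp, "app.log")
        with open(log, "w") as fh:
            fh.write("boot ok\nauth login\n")
        lines, state = read_new_lines(log, None)
        print(lines)
        with open(log, "a") as fh:
            fh.write("pod restart\n")
        lines, state = read_new_lines(log, state)
        print(lines)
```

Only the appended bytes are read on the second call, which is the property that makes an incremental update cheaper than a full rebuild.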

Bug report: the rank-3 SVD vectors were being squeezed into a single direction. 42% of the vocab had sim>0.99 (on random pairs), and unrelated pairs such as thermal↔ftp=0.99 also scored high. Result: the krep "authentication failure" query matched HDFS block lines.

Fix, following the word2vec/LSA standard:
- SVD rank-50 (HD): real vocab separation for cosine ranking
- PCA-3 (for the scatter visual only): 14% variance, visual direction
- word_vecs (V, 50) for cosine_similarity
- word_vecs_3d (V, 3) for the 3D ASCII scatter plot

Match tuple extended: (l_vec_hd, sim, output_str, raw_text). Before rendering, the top-K matches are re-vectorized in 3D from raw_text, so the scatter carries the HD information while keeping the 3D visual intact.

Real-log test (Loghub: openssh, apache, linux, mac, hdfs, 10000 lines):
- 'authentication failure' → linux.log sshd auth failure sim=0.99 ✓
- 'invalid user from' → openssh.log Invalid user sim=0.99 ✓
- 'permission denied' → klogind Auth failed (semantic) sim=0.89 ✓
- 'kernel memory' → kernel command line sim=0.93 ✓
- 'block packetresponder' → hdfs Served block sim=1.00 ✓
- 'thunderbolt thermal' → IOThunderboltSwitch sim=0.59 ✓
- 'google software update' → GoogleSoftwareUpdateAgent sim=0.82 ✓

Cosine now VARIES (between 0.51 and 1.00) instead of staying flat at 1.00. Exact match: ~0.99, semantic: 0.80-0.92, loose: 0.51-0.59.

API:
- vectorize_with_model(text, model, dim="hd") for cosine (default)
- vectorize_with_model(text, model, dim="3d") for the scatter visual
- _cosine_anyd(a, b): dimension-agnostic cosine helper

vectors.npz: word_vecs + word_vecs_3d, both float32, no pickle. load_model is backward compatible: if word_vecs_3d is missing, it falls back to word_vecs.

312/312 tests pass.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
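A minimal sketch of a dimension-agnostic cosine and an OOV-safe line vectorizer, written with plain lists so it runs without numpy. The names `cosine_anyd` and `vectorize` and the sum-then-normalize composition are assumptions; the commit does not show how vectorize_with_model combines word vectors.

```python
import math


def cosine_anyd(a, b):
    """Cosine similarity for equal-length vectors of any dimension; 0.0 for zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def vectorize(text, word_vecs, dim):
    """Sum the vectors of in-vocabulary words and L2-normalize the result."""
    total = [0.0] * dim
    for word in text.lower().split():
        vec = word_vecs.get(word)
        if vec is None:
            continue  # out-of-vocabulary words are skipped, not guessed
        total = [t + v for t, v in zip(total, vec)]
    norm = math.sqrt(sum(t * t for t in total))
    return total if norm == 0.0 else [t / norm for t in total]


if __name__ == "__main__":
    # Toy 2D vectors, purely illustrative.
    vecs = {"auth": [3.0, 4.0], "disk": [4.0, -3.0]}
    query = vectorize("auth unknownword", vecs, 2)
    print(query, cosine_anyd(query, vecs["auth"]), cosine_anyd(query, vecs["disk"]))
```

The same two functions work unchanged for 3-dimensional and 50-dimensional vectors, which is the point of keeping the cosine helper independent of dimension.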

Bug: lazy refresh was not triggering because _resolve_model skipped the is_stale check on a cache hit (it returned first). In addition, after the background refresh completed, the cache kept holding the stale model.

Fix:
- The cache key is now (paths, metadata.json mtime): if the file changes, the cache entry is invalidated automatically → reload
- The is_stale check runs on every resolve, even on a cache hit
- When the background subprocess finishes, save_model updates the metadata.json mtime, so the cache refreshes automatically

Full lifecycle test on real logs:
1. krep --learn /tmp/krep_realtest/ --auto-refresh 1h
2. Move build_time back 2h (simulate stale)
3. krep "auth" → _REFRESH_TRIGGERED=1, bg subprocess spawned
4. Wait 4 seconds
5. Same krep "auth" → cache mtime invalid → fresh model loaded
6. is_stale = False (after refresh)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

tests/test_krep_learn.py covers every public + critical-private API of the PPMI+SVD model layer:
- TestTokenize (6): short-word filter, pure-digit skip, __dunder skip, Unicode aware, punctuation split
- TestWalkFiles (3): .git/__pycache__ skip, binary ext skip, recursive
- TestReadFileFromOffset (5): full / tail-seek / out-of-bounds (truncate) / binary NUL detect / empty file
- TestParseInterval (7): h/m/d/w/s formats, disable keywords, invalid
- TestFormatAge (4): seconds/minutes/hours/days
- TestIsStale (4): disabled never stale, fresh ok, old stale, age delta
- TestBuildModel (7): vocab, HD vs 3D shapes, axis labels, file_state, term_freq, pair_counts, auto_refresh stored, empty corpus error
- TestSaveLoadRoundtrip (7): creates vectors.npz+metadata.json+state.json, lightweight vs with_state, 3D persistence, deterministic hash dir, missing returns None
- TestVectorizeWithModel (6): dim="hd" vs "3d", OOV zero, empty, partial OOV, L2-normalized output
- TestCosine (3): identical=1, orthogonal=0, zero-vec=0
- TestUpdateModel (5): no-change, new lines, tail-only bytes, rotation detection (size shrink), deleted file removed from state
- TestListPurgeModels (4): empty, after build, purge removes all, purge empty returns 0
- TestEndToEnd (2): full pipeline build→save→load→query, unrelated word low similarity

Bug fixes uncovered during TDD:
1. _tokenize: '__pyx_n_u_error' filter was too narrow (digit required). Now ALL __dunder prefixes drop (Cython internals + Python __init__ noise both filtered).
2. _walk_files: '/.git/' SKIP_DIR pattern missed dirpath without trailing slash. Added rstrip('/') + '/' padding so 'os.walk' returned paths match exactly.
3. update_model: file-deletion-only case (no new lines but a file gone) used to short-circuit with stale file_state. New `files_deleted` branch updates file_state without re-running SVD.

Total tests: 375 (312 baseline + 63 new krep_learn).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
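The behaviors exercised by TestParseInterval and TestIsStale can be illustrated with a short sketch. The function bodies below are assumptions reconstructed from the documented inputs and outputs ("1h" → 3600, "off" → 0, invalid → ValueError), not the krep_learn source; the explicit `now` parameter of `is_stale` is added here to keep the example deterministic.

```python
import re

_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800}
_DISABLED = {"0", "off", "false"}


def parse_interval(text):
    """Convert a human-friendly interval such as '30m' or '2w' into seconds."""
    value = text.strip().lower()
    if value in _DISABLED:
        return 0
    match = re.fullmatch(r"(\d+)([smhdw])", value)
    if match is None:
        raise ValueError(f"invalid interval: {text!r}")
    return int(match.group(1)) * _UNIT_SECONDS[match.group(2)]


def is_stale(build_time, interval_seconds, now):
    """A model is stale when auto-refresh is enabled and its age exceeds the interval."""
    if interval_seconds <= 0:
        return False  # auto-refresh disabled: never stale
    return (now - build_time) > interval_seconds


if __name__ == "__main__":
    for raw in ("1h", "30m", "1d", "2w", "off"):
        print(raw, "->", parse_interval(raw))
    hour = parse_interval("1h")
    print(is_stale(build_time=0, interval_seconds=hour, now=7200))
```

Rejecting anything that is not digits followed by one unit letter keeps typos such as "1hour" from silently becoming a zero interval.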

Hybrid packaging strategy (option C): without splitting krep into a separate repo, expose it as an independent CLI entry point inside the single package.

NEW:
- kishi/krep_cli.py: wraps sys.argv → the kishi_krep builtin and returns the rc via sys.exit. Callable from any shell such as bash, zsh, or fish.
- pyproject.toml [project.scripts]:
  kishi = "kishi.main:main" # existing
  krep = "kishi.krep_cli:main" # NEW

USER EXPERIENCE:
  $ pip install kishi-shell
  $ krep "auth login" /var/log/ # ← directly from bash
  $ krep --learn /var/log/ --auto-refresh 1h
  $ krep --list-models
  $ cat app.log | krep error # stdin pipe also OK
  $ kishi # ← the Kishi REPL is still there
  Kishi$ -> krep "auth" /var/log/

SMOKE TESTS:
- python -m kishi.krep_cli --help ✓
- python -m kishi.krep_cli --list-models ✓
- python -m kishi.krep_cli auth FILE ✓ (3D scatter render)
- echo "..." | python -m kishi.krep_cli ✓ (stdin pipe)

NEW TESTS (3, TestKrepCliEntry):
- test_help_exits_zero: --help SystemExit(0)
- test_no_pattern_exits_one: empty invocation SystemExit(1)
- test_list_models_exits_zero: --list-models with an empty cache is OK

FUTURE FLEXIBILITY: If the krep ecosystem grows later, turning it into a separate repo (krep-cli) and PyPI package is a 30-minute job: move kishi/krep_*.py, change the pyproject package name, and have kishi-shell add krep-cli as a dep. The current architecture needs no extra investment for such a migration.

378/378 tests pass (375 before + 3 CLI entry tests).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Preserve Kishi's "pure Python + 2 deps" philosophy: move numpy/scipy to optional-dependencies. pip install kishi-shell is now a 30 KB core package; if the need for krep --learn arises, pip install kishi-shell[krep].

pyproject.toml:
  dependencies = [prompt_toolkit, psutil] # core
  optional-dependencies.krep = [numpy>=1.20, scipy>=1.7]

kishi/builtins.py: --learn / --update-learn / --list-models / --purge-models print a guiding error when numpy is missing:
  "Install: pip install kishi-shell[krep]
  Arch: sudo pacman -S python-numpy python-scipy"
The keyword engine keeps working without these packages.

README.md + README.tr.md:
- Install section: two options (plain + [krep])
- Krep AI section: note on the optional extra for the LSA model

Version bump 2.0.1.0 → 2.0.2.0 (new krep CLI + LSA model + optional dep):
- pyproject.toml
- kishi/main.py banner
- kishi/builtins.py help_text + neofetch
- README + README.tr headings

378/378 tests pass, zero regressions.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
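One way to implement the "guiding error" for optional dependencies is a lazy import guard, sketched below. The helper `require_optional` and the constant `KREP_HINT` are hypothetical; only the install hint text is taken from the commit message.

```python
import importlib

KREP_HINT = "Install: pip install kishi-shell[krep]"


def require_optional(module_names, hint):
    """Import each module by name, or raise ImportError that carries an install hint."""
    loaded = {}
    for name in module_names:
        try:
            loaded[name] = importlib.import_module(name)
        except ImportError as exc:
            raise ImportError(f"{name} is required for this command. {hint}") from exc
    return loaded


if __name__ == "__main__":
    try:
        # Imported only when a model command actually runs, so the core
        # package stays importable without numpy/scipy.
        modules = require_optional(["numpy", "scipy"], KREP_HINT)
        print("LSA model commands available:", sorted(modules))
    except ImportError as err:
        print(err)
```

Deferring the import to the command that needs it keeps the keyword engine usable on a plain install.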

Merge Krep's two main feature branches into a single branch. Both features added their own dispatch at the start of krep_search; after the merge there is a THREE-LAYER dispatch chain:
1. _resolve_model(paths) → load the SVD model if one exists (numpy optional)
2. _krep_rg_streaming → stream prefilter if rg is on the system (cosine with HD vectors if a model exists)
3. process_file walker → if neither is available, or if rg returns 0 matches

Conflict resolutions:
- pyproject.toml: v2.0.2.0 + optional-deps[krep] (from SVD)
- kishi/main.py + builtins.py: v2.0.2.0 banner
- kishi/builtins.py: --learn/--update-learn/--auto-refresh (SVD) + --no-rg (rg) + merged --no-model/--no-rg implementation
- kishi/krep.py: _krep_rg_streaming takes a model parameter and vectorizes lines with HD vectors. Match tuple is 4-element (l_vec, sim, output_str, raw_text). _krep_finalize now takes a model parameter and renders the scatter via PCA-3 reduction from raw_text.
- README.md / README.tr.md: v2.0.2.0 heading

tests/test_krep_streaming.py:
- Match tuple format check updated to accept 3 or 4 elements (vec_len >= 3 check for HD vectors).

407 / 407 tests pass (305 baseline + 63 krep_learn + 27 krep_streaming + 5 krep_perf + 7 krep). Zero regressions.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

PR #4 CI failed on all 5 Python versions because the GitHub runner has no ripgrep. When _krep_rg_streaming returned rg_spawn_failed, the match list stayed empty. Two-layer fix:

1. .github/workflows/ci.yml:
- Added a sudo apt-get install -y ripgrep step (the recommended path)
- pip install -e ".[krep]" also installs numpy/scipy (for the krep_learn tests)

2. tests/test_krep_streaming.py:
- autouse fixture on the TestStreamingSearch class: skip if rg is missing
- test_streaming_hard_timeout_safety_net + test_streaming_terminates_cleanly_on_early_stop: rg check at the start of the function (inside the TestKrepSearchDispatch class)
- Defensive: even if the rg install step fails, the tests are skipped rather than failing

CI should now pass; locally 407/407 still pass.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…dirs

During the merge in PR #4, 'git add -A' accidentally pulled in test artifacts as well (empty files left behind by '>' in redirect tests: =13, =3.0.0, Editor, Editör, Girdi, Input, Risk, Task, Terminal, echo, export, f, file, greetnWelcome, i, krep, merhabanSisteme, out, plugin, weather, $MYOUT, &1, (, **For, **NOT:**, **Tip:**, 0, 1:n, 2400x, 60, =, sess.log:17:, u001b[1 and more; 38 files in total). In addition, the editor-specific configs .serena/ and .vscode/ were tracked. These depend on the user/machine and do not belong in the repo.

Fix:
1. Removed the 38 junk files + .serena/ + .vscode/ from the index with git rm --cached
2. Also deleted them from the local disk
3. Comprehensive permanent rules added to .gitignore:
- .serena/, .vscode/, .idea/, .mypy_cache/, .pytest_cache/, .coverage
- .venv*/, .cache/, .krep_models/
- Patterns for redirect test artifacts ($*, =*, &*, etc.)
- All test junk names, including those with Turkish characters

407/407 tests pass (zero regressions).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

The PyPI sdist build (python -m build) was failing because it could not find krep_core.pyx: pyproject.toml [tool.setuptools.packages.find] picks up only *.py files, and without MANIFEST.in the .pyx file does not enter the sdist.

MANIFEST.in:
- recursive-include kishi *.pyx *.py (Cython source + Python)
- recursive-include tests *.py (downstream verify)
- recursive-include docs *.md (plan + benchmark)
- recursive-include assets *.png (README images)
- include README*.md LICENSE pyproject.toml setup.py
- exclude kishi/*.c kishi/*.so (generated during the build)
- prune .serena .vscode .mypy_cache .pytest_cache build dist
- global-exclude __pycache__ *.pyc

Result:
- dist/kishi_shell-2.0.2.0.tar.gz 7.6 MB (sdist, .pyx + tests + assets)
- dist/kishi_shell-2.0.2.0-cp314-cp314-linux_x86_64.whl 236 KB
- twine check dist/* → PASSED

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

After PR #4, the v2.0.2.0 tag push hit a PyPI 400 error: Binary wheel 'kishi_shell-2.0.2.0-cp312-cp312-linux_x86_64.whl' has an unsupported platform tag 'linux_x86_64'.

Root cause: 'python -m build' produces both an sdist and a wheel by default. Because the GitHub runner is Ubuntu/Linux, the wheel gets the 'linux_x86_64' tag. PyPI accepts only portable platform tags such as manylinux* / musllinux* / win_amd64 / macosx_*; 'linux_x86_64' is rejected because it is specific to the build machine.

Fix:
- 'python -m build --sdist' → produce only the source .tar.gz
- When the user runs 'pip install kishi-shell', the package is compiled with Cython on their own machine (gcc + Python headers required, standard on Linux).

Later, 'cibuildwheel' can be added for manylinux2014_x86_64 + macOS arm64 + Windows amd64 binary wheels (when the need arises).

In addition, ripgrep + the krep extra were added to the test job (the main CI fix from PR #4 was also applied to the publish workflow; the krep tests require rg, and the krep_learn tests require numpy/scipy).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

ozhangebesoglu and others added 24 commits May 25, 2026 23:24

chore: add krep built-in command and bump version to 2.0.0.9

f5b526c

feat: implement Krep AI semantic 3D vector search using Cython C-exte…

47bf28d

…nsion and concept pruning and bump version to 2.0.1.0

test(krep): add performance benchmark with 100ms target

a6acd54

Co-Authored-By: Claude Sonnet <noreply@anthropic.com>

feat(krep): add _HAS_RG detection and _build_rg_pattern helper

dd86c5a

Co-Authored-By: Claude Sonnet <noreply@anthropic.com>

ozhangebesoglu mentioned this pull request Jun 1, 2026

docs: announce Kishi Shell v2.0.2.0 Krep AI upgrades ozhangebesoglu/Kishi-Plugins#1

Merged

3 tasks

ozhangebesoglu merged commit 28a100a into main Jun 1, 2026
5 checks passed

ozhangebesoglu deleted the feat/krep-svd-model branch June 1, 2026 14:57
