A fully local implementation of Karpathy's LLM Knowledge Base pattern. No cloud APIs, no data exfiltration. The ingest + query core runs on the Python standard library alone; the optional web UI and optional web-search augmentation add two narrowly-scoped runtime dependencies (fastapi, ddgs) that are not reached by any offline-only code path.
Drop source documents into a folder. A local LLM reads them, extracts entities and concepts, writes interlinked wiki pages and maintains a persistent, compounding knowledge graph, all on-device, all offline.
This project is a proof-of-concept that integrates four recent developments into one working system:

- Karpathy's LLM Wiki pattern (April 2026): the idea that an LLM should build and maintain a wiki from raw sources rather than doing one-shot RAG retrieval. Raw data is "compiled" into interlinked Markdown, then operated on by CLI tools for Q&A, linting and incremental enrichment.
- Gemma 4 26B-A4B (April 2026): Google DeepMind's open-weights Mixture-of-Experts model. 25.2B total parameters, but only 3.8B active per token via learned routing across 128 experts. This gives 26B-class output quality at roughly 4B-class inference cost (model card: MMLU Pro 82.6%, GPQA Diamond 82.3%, within 2-3% of the 31B Dense variant on all benchmarks). Apache 2.0 licensed.
- Unsloth Dynamic 2.0 (UD): per-layer importance-weighted quantization for GGUF files (docs). Unlike standard Q4_K_M, which applies a uniform bit-width across all layers, UD adjusts quantization precision per layer based on importance analysis: attention layers that matter more for output quality get higher precision; less impactful MLP layers get lower precision. Same ~16 GB file size, measurably better output quality.
- TurboQuant KV cache compression (Zandieh et al., ICLR 2026): runtime KV cache compression using PolarQuant with Walsh-Hadamard rotation. We use the asymmetric `q8_0` K + `turbo4` V configuration via the llama-cpp-turboquant fork (not yet in mainline llama.cpp). Near-lossless q8_0 keys preserve attention routing accuracy; compressed values (3.8x) reduce the KV cache from ~5 GB to ~3 GB, freeing ~2 GB of headroom for longer context windows or additional parallel slots.
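As a back-of-envelope check on those cache numbers, the arithmetic fits in a few lines. The bit-widths follow the text (q8_0 is ~8.5 bits per value including scale factors; turbo4 shrinks V by ~3.8x), but the layer count, KV-head count, head dimension and context length below are illustrative placeholders, not figures from the model card:

```python
def kv_bytes(tokens, layers, kv_heads, head_dim, k_bits, v_bits):
    # One K vector and one V vector per token, per layer, per KV head.
    per_tensor = tokens * layers * kv_heads * head_dim
    return per_tensor * (k_bits + v_bits) / 8

# Assumed geometry (placeholders, not the model card's numbers):
cfg = dict(tokens=49_152, layers=48, kv_heads=8, head_dim=128)

symmetric_q8 = kv_bytes(**cfg, k_bits=8.5, v_bits=8.5)        # q8_0 K and V
asymmetric   = kv_bytes(**cfg, k_bits=8.5, v_bits=8.5 / 3.8)  # q8_0 K + turbo4 V

print(f"symmetric q8: {symmetric_q8 / 2**30:.1f} GiB")  # ~4.8 GiB
print(f"asymmetric:   {asymmetric / 2**30:.1f} GiB")    # ~3.0 GiB
```

Under these assumed dimensions the asymmetric configuration lands close to the ~5 GB to ~3 GB figures quoted above; with the real model geometry the absolute numbers shift, but the savings ratio is set by the bit-widths alone.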
Everything runs on a single MacBook. The 16 GB model loads into Metal GPU unified memory, processes documents through a structured extraction pipeline and produces an Obsidian-compatible knowledge base with hundreds of interlinked pages.
The wiki after ingesting ~25 sources on local LLM inference: 500+ interlinked pages organised by topic clusters. Color groups show how the LLM cross-references entities across sources, red = TurboQuant, orange = Gemma 4, green = Karpathy, lime = agents.
```mermaid
graph LR
    subgraph PATTERN ["Karpathy's LLM Wiki"]
        p["Raw sources compiled<br/>into interlinked wiki"]
    end
    subgraph MODEL ["Gemma 4 26B-A4B MoE"]
        m["25.2B params, 3.8B active<br/>128 experts, Apache 2.0"]
    end
    subgraph WEIGHTS ["Unsloth Dynamic 2.0"]
        w["Per-layer importance<br/>weighted quantization"]
    end
    subgraph KV ["TurboQuant KV Cache"]
        k["q8_0 K + turbo4 V<br/>3.8x V compression"]
    end
    p --> SYSTEM
    m --> SYSTEM
    w --> SYSTEM
    k --> SYSTEM
    SYSTEM(("Fully Local<br/>Knowledge Base<br/>on a MacBook"))
    style PATTERN fill:#dae8fc,stroke:#6c8ebf,color:#000
    style MODEL fill:#d5e8d4,stroke:#82b366,color:#000
    style WEIGHTS fill:#fff2cc,stroke:#d6b656,color:#000
    style KV fill:#e1d5e7,stroke:#9673a6,color:#000
    style SYSTEM fill:#f8cecc,stroke:#b85450,color:#000
```
This README is a landing page. The depth is in docs/, structured according to the arc42 template (Starke & Hruschka) and the C4 model (Simon Brown). Architecture decisions use Michael Nygard's ADR format; quality scenarios use the ISO/IEC 25010 attribute set.
docs/arc42/README.md, start here. Reading orders, cross-reference conventions, section map.

- § 1, Introduction and Goals, five quality goals, prioritised
- § 2, Architecture Constraints, TC-1 through TC-8
- § 3, System Scope and Context, C4 Level 1 and the mapping to Karpathy's gist
- § 4, Solution Strategy, the big choices
- § 5, Building Block View, C4 Level 2 and Level 3 inline
- § 6, Runtime View, sequence diagrams for ingest, query, the six-stage resolver, context-overflow recovery
- § 7, Deployment View, infrastructure, memory budget, one-time setup
- § 8, Cross-cutting Concepts, domain model, error handling, concurrency, prompt discipline
- § 9, Architecture Decisions, ADR-001 through ADR-007 (zero deps, fork llama.cpp, FTS5+graph, asymmetric KV, six-stage resolver, F1 gates, reverse-index idempotency)
- § 10, Quality Requirements, 23 ISO/IEC 25010 scenarios in arc42 S/E/R/M format
- § 11, Risks and Technical Debt, known limitations, technical debt, risk register, scaling limits
- § 12, Glossary, ~75 alphabetical terms
- Appendix A, Academic Retrospective, design trade-offs: what worked and fit purpose, what succeeded but didn't fit (D-1..D-5), labelled failure modes (F-1..F-6), meta-lessons (M-1..M-5)
docs/c4/L1-system-context.md, Level 1, System Context
docs/c4/L2-container.md, Level 2, Containers
docs/c4/L3-component.md, Level 3, Components
docs/UI.md, panel-by-panel tour of the optional web UI, with screenshots of every view and the near-term UI roadmap
| Question | Go to |
|---|---|
| "What does the system do and why?" | arc42 § 1 + § 3 |
| "How is it decomposed?" | arc42 § 5 or C4 L2 / L3 |
| "How does ingestion work, step by step?" | arc42 § 6.2 |
| "How does retrieval work?" | arc42 § 6.3 + C4 L3.B |
| "How does entity resolution work?" | arc42 § 6.4 + C4 L3.C |
| "Why did you pick FTS5 over a vector DB?" | arc42 § 9, ADR-003 |
| "Why zero dependencies?" | arc42 § 9, ADR-001 |
| "Why fork llama.cpp?" | arc42 § 9, ADR-002 and ADR-004 |
| "What are the quality goals?" | arc42 § 1.2 + § 10 |
| "What went wrong?" | Appendix A § A.4 |
| "What are the known limitations and open debt?" | arc42 § 11 |
| "How do I use the web UI?" | docs/UI.md |
| "What is a term I don't recognise?" | arc42 § 12, Glossary |
The full setup and deployment procedure is in arc42 § 7 (Deployment View). The short version lives here.
- Platform: macOS on Apple Silicon is the reference target (M-series, Metal GPU). Linux with an NVIDIA GPU (CUDA) or a modern CPU-only build also works; Windows is supported via WSL 2 (Ubuntu) with the Linux instructions below. Native Windows (PowerShell) is not tested end-to-end; the shell scripts under `scripts/` assume bash and `fswatch`/`inotifywait`, so on Windows you should run everything inside WSL.
- Hardware (any platform): ≥ 32 GB RAM recommended; 16 GB is the floor, with reduced throughput. Apple Silicon uses unified memory; on discrete-GPU systems the Gemma 4 Q4_K_M weights (~16 GB) must fit in VRAM for full Metal/CUDA offload, otherwise llama.cpp falls back to shared CPU+GPU layers.
- Python 3.12+, check with `python3 --version`
- Poppler for PDF text extraction:
  - macOS: `brew install poppler`
  - Linux (Debian/Ubuntu): `sudo apt install poppler-utils`
  - Windows (WSL): same as Linux; or native: install poppler for Windows and put `bin/` on your `PATH`
- Obsidian, optional but recommended for the graph view (cross-platform)
```bash
git clone <your-fork-or-upstream-url> llm-wiki
cd llm-wiki
mkdir -p models
huggingface-cli download unsloth/gemma-4-26B-A4B-it-GGUF \
    gemma-4-26B-A4B-it-UD-Q4_K_M.gguf \
    --local-dir models/
```

```bash
git clone https://github.com/TheTom/llama-cpp-turboquant.git llama.cpp
cd llama.cpp
git checkout feature/turboquant-kv-cache

# macOS (Apple Silicon, default): Metal GPU offload
cmake -B build -DGGML_METAL=ON -DCMAKE_BUILD_TYPE=Release
# Linux or WSL with NVIDIA GPU: swap the flag
# cmake -B build -DGGML_CUDA=ON -DCMAKE_BUILD_TYPE=Release
# CPU-only fallback (any platform, slow but works):
# cmake -B build -DCMAKE_BUILD_TYPE=Release

cmake --build build --config Release -j
cd ..
```

Why a fork? TurboQuant KV cache compression (paper) is not yet merged into mainline llama.cpp. The TheTom/llama-cpp-turboquant fork adds `turbo4` cache types with validated Metal / Apple Silicon support (M1-M5). CUDA builds share the same `turbo4` kernels; CPU-only builds fall back to `q8_0` on both K and V automatically. Full rationale: ADR-002 and ADR-004.

Thread count: `scripts/start_server.sh` reads `sysctl -n hw.performancecores` on macOS and falls back to `8` elsewhere. On Linux or WSL, if you have significantly more or fewer performance cores than 8, edit `THREADS=` near the top of `start_server.sh` (or export `THREADS` before running the script; the variable is honoured if already set).
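That fallback logic amounts to a one-liner like the following. This is an illustrative sketch, not the actual `start_server.sh` body:

```shell
# Honour THREADS if the caller exported it; else query macOS for its
# performance-core count; else default to 8 (Linux/WSL, CPU-only boxes).
THREADS="${THREADS:-$(sysctl -n hw.performancecores 2>/dev/null || echo 8)}"
echo "llama-server threads: ${THREADS}"
```

On machines without the macOS sysctl key (or without `sysctl` at all) the command fails silently and the `|| echo 8` branch supplies the default.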
```bash
bash scripts/start_server.sh
```

Wait for `llama server listening` (~30 s for the 16 GB model to load into Metal). Leave this terminal open.
In a second terminal:
```bash
# Create the vault directories (first time only):
mkdir -p obsidian_vault/raw obsidian_vault/wiki/{sources,entities,concepts,synthesis}

# Drop files into obsidian_vault/raw/, then:
python3 scripts/ingest.py --list       # see what's available
python3 scripts/ingest.py article.md   # ingest one file
python3 scripts/ingest.py --all        # ingest everything pending
```

```bash
python3 scripts/query.py "what themes connect these sources?"
python3 scripts/query.py -i                     # interactive mode
python3 scripts/query.py -s "compare X and Y"   # save answer as wiki page
```

A FastAPI + Vite/Lit single-page app wraps every CLI operation behind an HTTP JSON API. Install the two runtime dependencies and run:

```bash
pip install 'fastapi[standard]' ddgs   # ddgs is optional web-search augmentation
python3 web/api/app.py        # serves http://127.0.0.1:3000 by default
python3 web/api/app.py --dev  # auto-reload on code changes
```

The app binds to 127.0.0.1 only. The pre-built frontend bundle under `web/frontend/dist/` is served as static files; to work on the UI, run `npm install && npm run dev` inside `web/frontend/` for the Vite dev server on :5173 (proxied to the API).
For a panel-by-panel walkthrough with screenshots (Query, Search, Browse, Graph, Page viewer, Ingest, Health, Dedup, Server) and the near-term UI roadmap, see docs/UI.md.
For the full ops reference including fallback configurations for 16 GB machines, see arc42 § 7 (Deployment View).
| Command | What it does |
|---|---|
| `bash scripts/start_server.sh` | Start the generation server (Gemma 4, 127.0.0.1:8080) |
| `bash scripts/start_server.sh stop` | Stop the server |
| `bash scripts/start_embed_server.sh` | Start the optional embedding server (bge-m3, 127.0.0.1:8081) for resolver stage 5 |
| `python3 scripts/ingest.py --all` | Ingest all pending sources |
| `python3 scripts/ingest.py --list` | List sources and their ingest status |
| `python3 scripts/ingest.py --reprocess file` | Re-ingest, overwriting existing pages |
| `python3 scripts/query.py "question"` | Ask a question |
| `python3 scripts/query.py -i` | Interactive query mode |
| `python3 scripts/query.py -s "question"` | Query and save answer as wiki page |
| `python3 scripts/search.py "terms"` | Test retrieval (no LLM needed, ~5 ms) |
| `python3 scripts/search.py --rebuild` | Rebuild the FTS5 search index |
| `python3 scripts/lint.py` | Health-check the wiki graph |
| `python3 scripts/cleanup_dedup.py` | Find and merge duplicate pages (dry run; use `--apply` to write) |
| `bash scripts/watch.sh` | Auto-ingest new files dropped into raw/ (needs fswatch on macOS or inotify-tools on Linux / WSL) |
| `bash scripts/watch.sh --lint` | Auto-ingest + lint after each ingest |
| `python3 web/api/app.py` | Start the FastAPI web UI on 127.0.0.1:3000 |
| `python3 web/api/app.py --dev` | Same, with auto-reload for frontend/backend development |
Platform notes. The shell scripts are bash and assume a Unix-like environment. They run natively on macOS and Linux, and work unchanged inside WSL 2 on Windows. On native Windows PowerShell you would need to invoke `llama-server` directly (the `start_server.sh` body is a single command line with flags; see the file). The Python entrypoints (`scripts/*.py`, `web/api/app.py`) are pure cross-platform. Thread count is auto-detected on macOS (sysctl) and defaults to 8 elsewhere; override by exporting `THREADS=<N>` before launching either server script.
| Type | Extension | Processing | Requires |
|---|---|---|---|
| PDF | `.pdf` | Text extracted via `pdftotext`, chunked by paragraphs | macOS: `brew install poppler` · Linux/WSL: `sudo apt install poppler-utils` · Windows (native): poppler-windows releases |
| Markdown | `.md` | Read as-is, chunked if large | - |
| SMS backup XML | `.xml` | Parsed via `xml.etree.ElementTree` (XXE-safe) | - |
| Plain text | `.txt` | Same as markdown | - |
Any other file type is read as plain text with UTF-8 decoding.
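The per-extension dispatch can be sketched as follows. This is illustrative: the real `ingest.py` adds chunking and error handling, `read_source` is a hypothetical name, and the XML branch assumes the common SMS-backup format where each `<sms>` element carries a `body` attribute:

```python
import subprocess
import xml.etree.ElementTree as ET
from pathlib import Path

def read_source(path: Path) -> str:
    """Return the text content of a source file, dispatched by extension."""
    suffix = path.suffix.lower()
    if suffix == ".pdf":
        # List-form invocation: filenames never reach a shell.
        out = subprocess.run(["pdftotext", str(path), "-"],
                             capture_output=True, text=True, check=True)
        return out.stdout
    if suffix == ".xml":
        # ElementTree does not expand external entities by default (XXE-safe).
        root = ET.parse(path).getroot()
        return "\n".join(sms.get("body", "") for sms in root.iter("sms"))
    # .md, .txt and anything else: plain UTF-8 text.
    return path.read_text(encoding="utf-8")
```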
```
SecondBrain_POC/
├── README.md                        # this landing page
├── CLAUDE.md                        # wiki schema, LLM instructions
├── LICENSE                          # MIT
├── pyproject.toml                   # project metadata (stdlib + 1 optional runtime dep: ddgs)
├── awake_mac.py                     # prevent Mac sleep via caffeinate
│
├── docs/                            # architecture documentation
│   ├── arc42/                       # full arc42 template (12 sections + appendix)
│   │   ├── README.md                # arc42 index, start here
│   │   ├── 01-introduction-and-goals.md
│   │   ├── 02-architecture-constraints.md
│   │   ├── 03-system-scope-and-context.md
│   │   ├── 04-solution-strategy.md
│   │   ├── 05-building-block-view.md
│   │   ├── 06-runtime-view.md
│   │   ├── 07-deployment-view.md
│   │   ├── 08-crosscutting-concepts.md
│   │   ├── 09-architecture-decisions.md
│   │   ├── 10-quality-requirements.md
│   │   ├── 11-risks-and-technical-debt.md
│   │   ├── 12-glossary.md
│   │   └── appendix-a-academic-retrospective.md
│   └── c4/                          # C4 model (standalone)
│       ├── L1-system-context.md
│       ├── L2-container.md
│       └── L3-component.md
│
├── scripts/                         # CLI pipeline (stdlib-only)
│   ├── llm_client.py                # shared LLM client, paths, constants, safe_filename
│   ├── search.py                    # FTS5 + wikilink graph + RRF retrieval
│   ├── ingest.py                    # ingestion pipeline (write path)
│   ├── query.py                     # wiki query interface (read path)
│   ├── resolver.py                  # six-stage entity resolver (stages 0-5)
│   ├── aliases.py                   # gazetteer sidecar (seed + runtime tiers)
│   ├── data/
│   │   └── seed_aliases.json        # 149 curated canonical entries
│   ├── cleanup_dedup.py             # offline duplicate merger
│   ├── lint.py                      # wiki health checker
│   ├── start_server.sh              # generation server launcher
│   ├── start_embed_server.sh        # embedding server launcher (optional)
│   └── watch.sh                     # filesystem watcher for auto-ingestion
│
├── web/                             # optional web UI (FastAPI + Vite/Lit)
│   ├── api/
│   │   ├── app.py                   # FastAPI entrypoint, CSP + security headers
│   │   ├── models.py                # Pydantic request/response schemas
│   │   ├── services.py              # CLI-to-JSON adapter layer
│   │   └── routers/                 # ingest, query, search, wiki, lint, dedup, server, admin
│   └── frontend/
│       ├── src/                     # Lit components, DOMPurify + Marked rendering
│       ├── dist/                    # built bundle (gitignored; re-built with npm run build)
│       └── package.json             # dompurify, lit, marked, vite
│
├── obsidian_vault/                  # the knowledge base
│   ├── raw/                         # source documents (immutable, gitignored)
│   │   └── assets/                  # downloaded images and attachments
│   └── wiki/                        # LLM-generated pages (gitignored)
│       ├── index.md                 # content catalogue
│       ├── log.md                   # chronological operations log
│       ├── sources/                 # one summary per ingested source
│       ├── entities/                # people, orgs, tools, datasets, models
│       ├── concepts/                # methods, theories, frameworks, patterns
│       └── synthesis/               # filed query answers
│
├── db/                              # derived state (auto-generated, gitignored)
│   ├── wiki_search.db               # SQLite FTS5 + source_files reverse index
│   ├── alias_registry.json          # runtime-promoted gazetteer
│   ├── judge_cache.json             # resolver stage-4 verdicts
│   ├── embed_cache.json             # bge-m3 vectors (stage 5 cache)
│   └── resolver_calibration.json    # F1 threshold calibration data
│
├── logs/                            # runtime server logs (gitignored)
├── models/                          # GGUF weights (gitignored)
└── llama.cpp/                       # TurboQuant fork build (gitignored)
```
The full static decomposition with responsibilities, interfaces and allowed-dependency matrix is in arc42 § 5 and C4 L2.
For the full troubleshooting table including memory-pressure fallbacks, the turbo3 quality warning and the context-overflow recovery procedure, see arc42 § 11.3 (Known limitations).
| Problem | Pointer |
|---|---|
| "Cannot reach llama.cpp server" | bash scripts/start_server.sh, wait for llama server listening |
| Server runs out of memory | Reduce CONTEXT to 32768 or 16384 in scripts/start_server.sh. See arc42 § 7.4. |
| Ingest produces 0 entities / 0 concepts | Reasoning mode must be off for ingestion. Either restart scripts/start_server.sh with REASONING="off" (line 36), or flip the toggle in the web UI header before ingesting. Full story: Appendix A F-3. |
| HTTP 400 during ingest | Handled automatically, the pipeline auto-splits and retries up to 2 levels. See arc42 § 6.5. |
| "unknown cache type turbo4" | You are on mainline llama.cpp instead of the TurboQuant fork. Re-clone step 2 of Quick Start. |
| Quality degradation at inference | Do not use turbo3 on Gemma 4 Q4_K_M (PPL > 100K). Use turbo4 only. See Appendix A F-5. |
| Obsidian doesn't show new pages | Filesystem watch delay, click a different folder and back, or reopen the vault. |
| llama.cpp build fails (macOS) | xcode-select --install, then rebuild. |
| llama.cpp build fails (Linux) | Install a C++ toolchain and CMake: sudo apt install build-essential cmake. For CUDA builds also install the matching CUDA toolkit. |
| `watch.sh` exits with "no filesystem watcher found" | Install fswatch (macOS) or inotify-tools (Linux/WSL). On native Windows, run `watch.sh` from inside WSL. |
This project is designed to run single-user, offline and local. The design properties below hold in the current tree:

- No outbound network. `scripts/` contains no `urlopen`, `requests`, or `httpx` calls; the only HTTPS strings are documentation comments.
- Loopback-only server binding. Both llama.cpp servers bind to `127.0.0.1` only.
- One-way `raw/`. The pipeline reads from `raw/` and never writes to it. Enforced by convention and by CLAUDE.md Rule 1.
- Path-containment on all writes. Every write under `wiki/` goes through `safe_filename()` in `scripts/llm_client.py`, and web-API delete / save / page-fetch endpoints explicitly resolve + `relative_to` their wiki subdir so untrusted inputs cannot escape.
- XXE-safe XML parsing. Python's `xml.etree.ElementTree` does not expand external entities by default.
- Parameterised SQL. Every `cursor.execute` call in `scripts/search.py` uses `?` placeholders; column weights inside the query string are literal constants, not user input.
- List-form subprocess calls. `pdftotext` and `pdfinfo` are invoked via `subprocess.run([...], shell=False)`, so shell metacharacters in filenames cannot reach a shell.
- Web UI hardening. The FastAPI app ships a strict Content-Security-Policy (`script-src 'self'`, no inline scripts), `X-Frame-Options: DENY`, `Referrer-Policy: strict-origin-when-cross-origin`, and `X-Content-Type-Options: nosniff`. All rendered markdown passes through DOMPurify with an explicit URI allowlist (`https?:`, `mailto:`, `tel:`, relative) before Lit inserts it via `unsafeHTML`.
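The path-containment rule fits in a few lines. A sketch only; `contained_path` is a hypothetical helper, not the project's actual `safe_filename()`:

```python
from pathlib import Path

WIKI_ROOT = Path("obsidian_vault/wiki").resolve()

def contained_path(user_name: str, subdir: str) -> Path:
    """Resolve a user-supplied page name and refuse anything that
    escapes the wiki root (e.g. via ../ traversal)."""
    candidate = (WIKI_ROOT / subdir / user_name).resolve()
    candidate.relative_to(WIKI_ROOT)  # raises ValueError on escape attempts
    return candidate
```

`Path.relative_to` raises on any resolved path outside the root, so traversal sequences are rejected after normalisation rather than by pattern-matching the raw string.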
Not implemented, intentionally. There is no authentication and no rate limiting on the FastAPI endpoints; this is a single-user, localhost-only tool. Binding to `0.0.0.0`, exposing the port over a LAN, or tunnelling it to the public internet turns every `/api/query/*`, `/api/ingest/*`, `/api/lint/delete`, and `/api/admin/reset` endpoint into an unauthenticated handle on your wiki and your local LLM. If you need to run across a LAN, put the app behind an authenticating reverse proxy (Caddy + basic auth, Tailscale Serve, etc.) and add per-IP rate limits on the LLM endpoints before doing so.

Anything that would require a new outbound network edge (telemetry, crash reporters, update checks, cloud LLM fallback) is a breaking change to Quality Goal Q1 (privacy) and requires an ADR.
An architecture-level retrospective covering what worked, what did not fit, and the labelled failure modes encountered during development lives in Appendix A, Academic Retrospective. Brief selection:
Worked and fit the purpose:
- Four-pillar integration (Karpathy + Gemma 4 + UD + TurboQuant)
- FTS5 + graph + RRF retrieval, replacing an LLM-driven page selector that hit a scaling ceiling at ≈ 500 pages
- Six-stage entity resolver with a canonical alias gazetteer, addressing cross-document proper-noun forks
- Asymmetric `q8_0` K + `turbo4` V KV cache, ~3 GB reclaimed vs. symmetric Q8
- Stdlib-only core (ingest + query), ~2 000 LOC against Python 3.12; optional web UI and web-search augmentation are isolated extras
- Reverse-index idempotency via the `source_files` table, replacing an O(N) directory scan
- Hard gates on the F1 threshold tuner: `MIN_SAMPLES=20`, `MIN_NEG=5`, `MIN_POS=5`
- Consolidated `safe_filename()` and `find_existing_page()` helpers in `llm_client.py`
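Reciprocal Rank Fusion itself is tiny. A generic sketch (Cormack et al., SIGIR 2009) with made-up page names; the project's `search.py` applies the same idea to its FTS5 and wikilink-graph rankings:

```python
def rrf(rankings, k=60):
    """Reciprocal Rank Fusion: each input list contributes
    1 / (k + rank) for every document it ranks."""
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Fuse a keyword (FTS5-style) ranking with a link-graph ranking:
fts   = ["turboquant.md", "gemma4.md", "kv-cache.md"]
graph = ["kv-cache.md", "turboquant.md", "agents.md"]
fused = rrf([fts, graph])  # turboquant.md first: ranked well by both lists
```

With k = 60 (the constant from the original paper), documents ranked by both lists tend to beat documents ranked by only one, which is the behaviour you want when fusing heterogeneous scoring schemes.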
Succeeded but did not fit the default pipeline (kept opt-in, or dropped):
- bge-m3 stage-5 embedding cosine, kept opt-in behind a flag
- Age-gap tiebreaker, gated behind `--use-embeddings`
- Greek Snowball stemmer, dropped; Porter stemmer + stopwords covered typical workloads
- GraphRAG community summaries, dropped; found to underperform plain RAG on single-hop QA (Han et al. 2025)
- LLM page compression, dropped; quality drift not worth the cost
Known failure modes, resolved:
- F-1 LLM-based page selection scaling ceiling → replaced with FTS5 + graph + RRF
- F-2 F1 threshold degenerated on class-imbalanced calibration data → hard gates on sample counts
- F-3 Gemma 4 thinking tokens consumed output budget → `--reasoning off` at the server
- F-4 ChatGPT multi-way fork under similarity-only resolution → canonical alias gazetteer at stage 0
- F-5 `turbo3` caused PPL > 100K on Gemma 4 Q4_K_M → only `turbo4`, asymmetric
- F-6 Aedes aegypti fork from LLM type noise → narrowed the type-constraint stage
Each failure mode has a full symptom / root cause / mitigation / status / lesson block in Appendix A § A.4.
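The F-2 mitigation is worth seeing concretely. The gate constants below come from the text; the tuning loop itself is an illustrative sketch, not the project's calibration code:

```python
MIN_SAMPLES, MIN_NEG, MIN_POS = 20, 5, 5  # hard gates from the text

def tune_threshold(scores, labels, default=0.5):
    """Pick the score threshold maximising F1, but only when the
    calibration set is large and balanced enough to trust (F-2)."""
    pos = sum(labels)
    neg = len(labels) - pos
    if len(labels) < MIN_SAMPLES or pos < MIN_POS or neg < MIN_NEG:
        return default  # refuse to tune on degenerate data
    best_t, best_f1 = default, -1.0
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and not y)
        fn = sum(1 for s, y in zip(scores, labels) if s < t and y)
        f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t
```

On a degenerate calibration set (too few samples, or nearly all one class) the tuner refuses to move and keeps the default threshold; only a sufficiently large and balanced set can shift it.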
The complete reference list, with inline citations at the point of use, lives inside the arc42 sections. The core references are:
Pattern. Andrej Karpathy, "LLM Knowledge Bases" and architecture gist, April 2026.
Model and quantization. Gemma 4 model card (Google DeepMind, 2026); Unsloth Dynamic 2.0; A. Zandieh, M. Daliri, M. Hadian, V. Mirrokni, "TurboQuant: Online KV Cache Quantization via Rotated Random Projections", ICLR 2026; I. Han et al. "PolarQuant", AISTATS 2026; A. Zandieh, M. Daliri, I. Han, "QJL", 2024.
Retrieval. G. V. Cormack, C. L. A. Clarke, S. Büttcher, "Reciprocal Rank Fusion", SIGIR 2009; N. Thakur et al. "BEIR", NeurIPS 2021; G. M. B. Rosa et al. "BM25 Is a Strong Baseline for Legal Case Retrieval", 2021; S. Bruch et al. "Analysis of Fusion Functions for Hybrid Retrieval", 2022; S. Mandikal, R. J. Mooney, "Sparse Meets Dense", 2024.
Entity resolution. P. Ferragina, U. Scaiella, "TAGME", CIKM 2010; L. Wu et al. "BLINK", EMNLP 2020; T. Ayoola et al. "ReFinED", NAACL 2022; N. De Cao et al. "mGENRE", TACL 2022; W. L. Hamilton, J. Leskovec, D. Jurafsky, "Diachronic Word Embeddings Reveal Statistical Laws of Semantic Change", ACL 2016; T. Fawcett, "An Introduction to ROC Analysis", 2006.
Graph-augmented retrieval. D. Edge et al. "From Local to Global: A GraphRAG Approach", 2024; Z. Han et al. "RAG vs. GraphRAG", 2025; Z. Li et al. "Simple is Effective", 2024.
Infrastructure. llama.cpp by Georgi Gerganov et al.; TheTom/llama-cpp-turboquant fork; TheTom/turboquant_plus research workspace; Obsidian; SQLite FTS5; Simon Willison, "Exploring Search Relevance Algorithms with SQLite", 2019.
Methodology. arc42 template (Gernot Starke, Peter Hruschka); C4 model (Simon Brown); Michael Nygard's ADR format; ISO/IEC 25010 software product quality model.

