Commit ff68c5d

gregmos, stevenobiajulu, and claude committed
Release v1.0.0 — first public release
Server:
- Rewrite bootstrap: lifespan + anyio.to_thread (fixes deadlock on Windows)
- pip call + import verification (fixes false pip failures)
- Retry wrapper for transient OSError/DLL locks (3 retries, 10s delay)
- Dynamic retry_after_sec: 25s first 3 min, 10s after
- Chunked processing works with HITL entity overrides
- Docx table cells joined with " | " for NER context
- sys.executable instead of hardcoded "python" (macOS/Linux fix)
- macOS/Linux platform_overrides in manifest (python3)

README:
- Professional layout for awesome-mcp submission
- Dynamic badges (CI, stars, forks, release, platforms, tech stack)
- Privacy architecture table, cross-platform instructions

SKILL.md:
- Realistic boot timings (2-5 min first launch)
- Correct retry behavior: sleep for "loading", STOP for "No such tool"

Tests (new):
- 73 tests across 8 files (detection, anonymization, dedup, chunking, overrides, mapping, EU patterns, false positives)
- GitHub Actions CI: Python 3.10/3.12 × Windows/macOS/Linux

Version sync:
- All files aligned to v1.0.0, .dxt format, GLiNER model updated to knowledgator/gliner-pii-base-v1.0

Co-Authored-By: stevenobiajulu <[email protected]>
Co-Authored-By: Claude Opus 4.6 <[email protected]>
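The "Retry wrapper for transient OSError/DLL locks (3 retries, 10s delay)" item could look roughly like the sketch below. The server code is not part of this page, so the function name `retry_on_oserror`, its parameters, and the reading of "3 retries" as three extra attempts after the first call are all assumptions made for illustration.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_on_oserror(
    func: Callable[[], T],
    retries: int = 3,          # assumed: 3 retries after the initial attempt
    delay_sec: float = 10.0,   # 10s delay, as stated in the commit message
    sleep: Callable[[float], None] = time.sleep,  # injectable for testing
) -> T:
    """Call func(); on OSError, wait and retry up to `retries` more times."""
    attempt = 0
    while True:
        try:
            return func()
        except OSError:
            # Out of retries: re-raise the last error to the caller.
            if attempt >= retries:
                raise
            attempt += 1
            sleep(delay_sec)
```

Passing `sleep` as a parameter keeps the wrapper testable without real 10-second waits; a caller would normally leave it at the default.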
1 parent 4727ee1 commit ff68c5d
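The "Dynamic retry_after_sec: 25s first 3 min, 10s after" rule from the commit message can be expressed as a small function of elapsed boot time. This is only a sketch: the name `retry_after_sec` comes from the message, but the signature, the constants' names, and the choice to switch at exactly 180 seconds are assumptions.

```python
BOOT_WINDOW_SEC = 180   # "first 3 min" from the commit message
SLOW_RETRY_SEC = 25     # suggested wait while still inside the boot window
FAST_RETRY_SEC = 10     # suggested wait afterwards


def retry_after_sec(elapsed_sec: float) -> int:
    """Return how long a client should wait before polling again.

    Assumption: the boundary is exclusive, so exactly 180s already
    yields the shorter delay.
    """
    if elapsed_sec < BOOT_WINDOW_SEC:
        return SLOW_RETRY_SEC
    return FAST_RETRY_SEC
```

A longer delay early on matches the "2-5 min first launch" boot timing noted for SKILL.md: polling every 10 seconds during a multi-minute install would mostly produce repeated "loading" responses.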

24 files changed: +1744 -555 lines changed

.dxtignore

Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
+.git
+.claude
+.test_venv
+__pycache__
+*.pyc
+dist
+*.mcpb
+*.dxt
+*.skill
+setup_pii_shield.*
+README.md
+pyproject.toml
+server/review_demo.html
+server/__pycache__
+tests
+.pytest_cache
+pytest.ini
+.github
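To illustrate what these patterns exclude from the packaged extension, here is a rough matcher. The packaging tool's actual matching rules are not shown in this commit, so the gitignore-like behaviour below (a slash-free pattern matches any path component; a pattern with a slash matches the path or a directory prefix) is an assumption, as are the function name `is_ignored` and the sample path `server/main.py`.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# A subset of the patterns added to .dxtignore in this commit.
PATTERNS = ["__pycache__", "*.pyc", "dist", "*.dxt", "setup_pii_shield.*",
            "server/review_demo.html", "tests", ".github"]


def is_ignored(path: str, patterns: list[str]) -> bool:
    """Return True if a POSIX-style relative path matches any pattern."""
    parts = PurePosixPath(path).parts
    for pattern in patterns:
        if "/" in pattern:
            # Pattern with a slash: compare against the whole path,
            # or treat it as a directory prefix.
            prefix = pattern.rstrip("/") + "/"
            if fnmatchcase(path, pattern) or path.startswith(prefix):
                return True
        elif any(fnmatchcase(part, pattern) for part in parts):
            # Slash-free pattern: match against every path component.
            return True
    return False


for candidate in ["tests/test_detection.py", "server/main.py", "manifest.json"]:
    print(candidate, is_ignored(candidate, PATTERNS))
```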

.github/workflows/test.yml

Lines changed: 39 additions & 0 deletions
@@ -0,0 +1,39 @@
+name: Tests
+
+on:
+  push:
+    branches: [main]
+  pull_request:
+    branches: [main]
+
+jobs:
+  test:
+    runs-on: ${{ matrix.os }}
+    strategy:
+      fail-fast: false
+      matrix:
+        os: [ubuntu-latest, windows-latest, macos-latest]
+        python-version: ["3.10", "3.12"]
+
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Set up Python ${{ matrix.python-version }}
+        uses: actions/setup-python@v5
+        with:
+          python-version: ${{ matrix.python-version }}
+
+      - name: Install dependencies
+        run: |
+          python -m pip install --upgrade pip
+          pip install pytest
+          pip install "mcp[cli]>=1.0.0" "presidio-analyzer>=2.2.355" "spacy>=3.7.0" "python-docx>=1.1.0" "cryptography>=42.0.0" "numpy>=1.24.0" "torch>=2.0.0" "gliner>=0.2.7"
+          python -m spacy download en_core_web_sm
+
+      - name: Download GLiNER model
+        run: |
+          python -c "from gliner import GLiNER; GLiNER.from_pretrained('knowledgator/gliner-pii-base-v1.0')"
+
+      - name: Run tests
+        run: |
+          pytest tests/ -v --tb=short
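The `matrix` block in this workflow expands to the Cartesian product of its two axes, which is where the "Python 3.10/3.12 × Windows/macOS/Linux" coverage in the commit message comes from. The snippet below only mirrors that expansion for illustration; it is not part of the repository.

```python
from itertools import product

# The two matrix axes, copied from .github/workflows/test.yml.
os_list = ["ubuntu-latest", "windows-latest", "macos-latest"]
python_versions = ["3.10", "3.12"]

# Each combination becomes one independent CI job.
jobs = [
    {"os": os_name, "python-version": version}
    for os_name, version in product(os_list, python_versions)
]

for job in jobs:
    print(f"{job['os']} / Python {job['python-version']}")
print(f"{len(jobs)} jobs in total")  # 3 operating systems x 2 versions = 6
```

Because `fail-fast` is set to `false`, a failure in one of these six jobs does not cancel the others, so a Windows-only regression still leaves the macOS and Linux results visible.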

README.md

Lines changed: 170 additions & 237 deletions
Large diffs are not rendered by default.

dist/pii-contract-analyze.skill

-19.5 KB
Binary file not shown.

dist/pii-shield-v1.0.0.dxt

60.6 KB
Binary file not shown.

dist/pii-shield-v6.0.0.mcpb

-44.5 KB
Binary file not shown.

manifest.json

Lines changed: 16 additions & 5 deletions
@@ -1,10 +1,10 @@
 {
-  "manifest_version": "0.3",
+  "dxt_version": "0.3",
   "name": "pii-shield",
   "display_name": "PII Shield",
-  "version": "6.0.0",
+  "version": "1.0.0",
   "description": "Anonymize PII in documents. GLiNER zero-shot NER for high-quality entity recognition in legal documents.",
-  "long_description": "PII Shield provides automated PII detection and anonymization for legal document analysis. v5.5 uses GLiNER (DeBERTa-v3 zero-shot NER) via Presidio's GLiNERRecognizer for high-quality named entity recognition — handles ALL-CAPS names, domain-specific company names, and legal document patterns. SpaCy handles tokenization, GLiNER handles NER. Falls back to SpaCy-only if GLiNER is unavailable. Also includes: fuzzy entity deduplication, prefix support for multi-file workflows, two-pass boundary cleanup with false-positive filtering, 17 EU pattern recognizers. Deanonymized output is written to local files only — PII never flows back through Claude.",
+  "long_description": "MCP server that anonymizes documents before Claude sees them and restores real data after analysis. PII never enters the API — only file paths and session IDs are exchanged. Uses GLiNER zero-shot NER for high-quality entity recognition in legal documents, with Human-in-the-Loop review via local web UI. Supports PDF, DOCX (with tracked changes), and plain text. Includes 17 EU pattern recognizers, entity deduplication, chunked processing for large documents, and full local audit logging.",
   "author": {
     "name": "Grigorii Moskalev",
     "url": "https://www.linkedin.com/in/grigorii-moskalev/"
@@ -21,6 +21,14 @@
         "PII_MIN_SCORE": "${user_config.pii_min_score}",
         "PII_GLINER_MODEL": "${user_config.gliner_model}",
         "PII_WORK_DIR": "${user_config.work_dir}"
+      },
+      "platform_overrides": {
+        "darwin": {
+          "command": "python3"
+        },
+        "linux": {
+          "command": "python3"
+        }
       }
     }
   },
@@ -35,7 +43,10 @@
     {"name": "scan_text", "description": "Detect PII without anonymizing (preview mode)"},
     {"name": "list_entities", "description": "Show status, supported types, and recent sessions"},
     {"name": "start_review", "description": "Start local review server and return URL for HITL verification. Does NOT open browser. PII stays on your machine."},
-    {"name": "get_review_status", "description": "Check if user approved the HITL review. Returns status and has_changes only (no PII or override details)."}
+    {"name": "get_review_status", "description": "Check if user approved the HITL review. Returns status and has_changes only (no PII or override details)."},
+    {"name": "anonymize_next_chunk", "description": "Process next chunk of a chunked anonymization session. Returns progress and partial result."},
+    {"name": "get_full_anonymized_text", "description": "Assemble all processed chunks and finalize the anonymization. Returns output_path and session_id."},
+    {"name": "resolve_path", "description": "Zero-config file path resolution. Finds a marker file on host via BFS to map VM paths to host paths."}
   ],
   "keywords": ["pii", "anonymize", "gdpr", "privacy", "legal", "presidio", "ner", "gliner", "zero-shot"],
   "license": "MIT",
@@ -51,7 +62,7 @@
       "type": "string",
       "title": "GLiNER NER model",
       "description": "HuggingFace GLiNER model name for zero-shot NER",
-      "default": "urchade/gliner_small-v2.1",
+      "default": "knowledgator/gliner-pii-base-v1.0",
       "required": false
     },
     "work_dir": {
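The `platform_overrides` block added in this diff swaps the launch command to `python3` on macOS and Linux. A host application might apply it along the lines of the sketch below. This is a guess at the mechanism, not the host's real implementation: the function name `resolve_mcp_config`, the shallow-merge behaviour, and the base `"command": "python"` and `args` values in the example are assumptions (the base command and args are not visible in the diff).

```python
import sys
from typing import Any


def resolve_mcp_config(mcp_config: dict[str, Any],
                       platform: str = sys.platform) -> dict[str, Any]:
    """Return mcp_config with the override for `platform` merged on top.

    Assumption: overrides are merged shallowly, key by key, and the
    platform keys follow sys.platform naming ("darwin", "linux", "win32").
    """
    overrides = mcp_config.get("platform_overrides", {}).get(platform, {})
    resolved = {k: v for k, v in mcp_config.items() if k != "platform_overrides"}
    resolved.update(overrides)
    return resolved


# Hypothetical base config; only platform_overrides is taken from the diff.
example = {
    "command": "python",
    "args": ["server.py"],
    "platform_overrides": {
        "darwin": {"command": "python3"},
        "linux": {"command": "python3"},
    },
}

print(resolve_mcp_config(example, "darwin")["command"])
print(resolve_mcp_config(example, "win32")["command"])
```

With no `win32` entry, Windows falls through to the base command, which matches the commit message describing the override as a macOS/Linux fix.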
