Augment GitHub Copilot with custom domain knowledge. Improve generic code model alignment and context grounding. Reduce token use by providing more context to coding agents.
- Collect code examples/files with prompt, incorrect/correct responses, and explanations
- Configurable metadata fields (languages, frameworks, topics, or add custom via settings)
- View, sort, and search by folder, language, or metadata fields
- Export to JSONL format optimized for model training
- Go to the Releases page
- Download the latest .vsix file from the release assets
- In VS Code, open the Command Palette (Ctrl+Shift+P / Cmd+Shift+P)
- Run the command Extensions: Install from VSIX...
- Select the downloaded .vsix file
- Reload VS Code when prompted
- Configure the storage location: store your data in a cloud-synced folder (OneDrive, Google Drive, etc.). The default location is the extension storage folder.
- Configure GitHub Copilot chat integrations (see this readme for the complete Copilot integration guide)
- Right-click a file, a file tab, or a code selection inside a file
- Choose Copilot Library > Add new (in)valid response snippet to create a new record, or Add to existing record to update an existing record
- Fill out the form, then click Create/Update:
  - Name & Description: What this training example demonstrates
  - Prompt: The instruction that should generate the code
  - Incorrect Response: Multi-file example of what NOT to generate
  - Explanation: Why the incorrect approach is wrong (optional)
  - Correct Response: Multi-file example of what SHOULD be generated
  - Metadata: Languages, frameworks, topics (all optional and customizable)
- View and manage records in the Training Records tree view
- Export complete training dataset to JSONL format by clicking "Export Training Dataset"
{
"prompt": "Create a Blazor Server feature to update a user profile.",
"incorrect_response": {
"User.razor": "// Bad example without proper patterns",
"UserService.cs": "// Bad example without DI/async"
},
"explanation": "The incorrect version lacks dependency injection, async methods, and proper error handling.",
"correct_response": {
"User.razor": "// Good example with proper Blazor patterns",
"UserService.cs": "// Good example with DI and async"
},
"metadata": {
"languages": ["csharp", "html"],
"frameworks": ["blazor", "aspnet-core"],
"topics": ["user-interface", "dependency-injection"]
}
}

You can customize the metadata fields by clicking the settings button (⚙️) in the View Options panel, which opens the extension-config.json file. There you can add custom fields, modify existing ones, or create specialized field types for your training data.
- text: Single-line text input (e.g., project name, author)
- select: Single choice from predefined options (e.g., difficulty level)
- multiselect: Multiple choices from predefined options (e.g., programming languages)
- number: Numeric input with optional min/max validation (e.g., complexity score)
- boolean: True/false checkbox (e.g., includes tests, production ready)
- tags: Free-form tags that users can type (e.g., frameworks, topics)
{
"id": "unique_field_identifier",
"label": "Display Name",
"type": "text|select|multiselect|number|boolean|tags",
"required": false,
"options": ["option1", "option2"], /* For select/multiselect only */
"defaultValue": "default", /* Optional default value */
"description": "Help text for users", /* Optional tooltip */
"validation": { /* Optional validation rules */
"min": 0, /* For number fields */
"max": 10, /* For number fields */
"pattern": "regex_pattern" /* For text fields */
}
}

Add a difficulty rating:
{
"id": "difficulty",
"label": "Difficulty Level",
"type": "select",
"required": false,
"defaultValue": "Beginner",
"options": ["Beginner", "Intermediate", "Advanced", "Expert"],
"description": "Code complexity level"
}

Add a production-ready flag:
{
"id": "production_ready",
"label": "Production Ready",
"type": "boolean",
"required": false,
"defaultValue": false,
"description": "Is this code suitable for production use?"
}

After modifying extension-config.json, restart VS Code to apply the changes.
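Because a malformed field definition only shows up after a restart, it can help to sanity-check an entry before saving it. The following is a minimal sketch, not part of the extension: `check_field` is a hypothetical helper, and its rules (`id`, `label`, and `type` required; `options` required for select/multiselect) are assumptions inferred from the field schema above rather than the extension's actual validation.

```python
# Hypothetical checker for one field definition from extension-config.json.
# The rules are inferred from the schema comments, not from the extension's code.
FIELD_TYPES = {"text", "select", "multiselect", "number", "boolean", "tags"}


def check_field(field: dict) -> list:
    """Return a list of problems found in a field definition (empty if none)."""
    problems = []

    # Assumption: id, label, and type must be non-empty strings.
    for key in ("id", "label", "type"):
        value = field.get(key)
        if not isinstance(value, str) or not value:
            problems.append(f"missing or empty '{key}'")

    ftype = field.get("type")
    if ftype not in FIELD_TYPES:
        problems.append(f"unknown type: {ftype!r}")

    # The schema marks 'options' as applying to select/multiselect only.
    if ftype in ("select", "multiselect"):
        options = field.get("options")
        if not isinstance(options, list) or not options:
            problems.append("select/multiselect fields need a non-empty 'options' list")
        elif ftype == "select" and "defaultValue" in field \
                and field["defaultValue"] not in options:
            problems.append("defaultValue is not one of the options")

    # Assumption: a boolean field's default should itself be a boolean.
    if ftype == "boolean" and "defaultValue" in field \
            and not isinstance(field["defaultValue"], bool):
        problems.append("boolean fields need a true/false defaultValue")

    return problems


difficulty = {
    "id": "difficulty",
    "label": "Difficulty Level",
    "type": "select",
    "required": False,
    "defaultValue": "Beginner",
    "options": ["Beginner", "Intermediate", "Advanced", "Expert"],
    "description": "Code complexity level",
}
print(check_field(difficulty))  # prints []
```

An empty list means the definition passed these assumed checks; it does not guarantee the extension will accept it.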
copilotLibrary.datasetPath: Training dataset export location
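Before feeding an export to a training pipeline, it can be useful to confirm that it parses cleanly. The sketch below assumes the export holds one JSON object per line with the keys shown in the example record earlier; `parse_jsonl`, `count_by_language`, and the file name in the comment are hypothetical, not part of the extension.

```python
import json

# Keys taken from the example record; 'explanation' and 'metadata' are treated
# as optional here because the form marks them as optional.
REQUIRED_KEYS = ("prompt", "incorrect_response", "correct_response")


def parse_jsonl(lines):
    """Parse JSONL text: one JSON object per non-empty line."""
    records = []
    for line_no, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        record = json.loads(line)
        missing = [key for key in REQUIRED_KEYS if key not in record]
        if missing:
            raise ValueError(f"line {line_no}: missing keys {missing}")
        records.append(record)
    return records


def count_by_language(records):
    """Count how many records list each language in their metadata."""
    counts = {}
    for record in records:
        for lang in record.get("metadata", {}).get("languages", []):
            counts[lang] = counts.get(lang, 0) + 1
    return counts


# In-memory sample standing in for an exported file. To read a real export:
#   with open("training-dataset.jsonl", encoding="utf-8") as fh:  # hypothetical name
#       records = parse_jsonl(fh)
sample = [
    json.dumps({
        "prompt": "Create a Blazor Server feature to update a user profile.",
        "incorrect_response": {"UserService.cs": "// Bad example without DI/async"},
        "correct_response": {"UserService.cs": "// Good example with DI and async"},
        "metadata": {"languages": ["csharp", "html"]},
    })
]
records = parse_jsonl(sample)
print(len(records), count_by_language(records))  # prints 1 {'csharp': 1, 'html': 1}
```

Taking an iterable of lines rather than a path keeps the parser usable with an open file handle or with strings in a test.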