This repository was archived by the owner on Sep 30, 2024. It is now read-only.

Commit 98e5049

cody docs: document why we skip files (#54165)
This adds a section to the FAQ and also documents some previously undocumented site config settings.
1 parent 52d8ce0 commit 98e5049

2 files changed: 36 additions, 0 deletions

doc/cody/explanations/code_graph_context.md

Lines changed: 29 additions & 0 deletions
@@ -53,6 +53,20 @@ To use `excludedFilePathPatterns`, add it to your embeddings site config with a
 }
 ```
 
+By default, the following patterns are excluded from embeddings:
+
+- *ignore (files like .gitignore, .eslintignore)
+- .gitattributes
+- .mailmap
+- *.csv
+- *.svg
+- *.xml
+- \_\_fixtures\_\_/
+- node_modules/
+- testdata/
+- mocks/
+- vendor/
+
 > NOTE: The `excludedFilePathPatterns` setting is only available in Sourcegraph version `5.0.1` and later.
 
 ### Storing embedding indexes
@@ -182,3 +196,18 @@ A negative value disables the limit and all repositories are selected.
   }
 }
 ```
+
+### Limitting the number of embeddings that can be generated
+
+The number of embeddings that can be generated per repo is limited by `embeddings.maxCodeEmbeddingsPerRepo` for code embeddings (default 3,072,000) and by `embeddings.maxTextEmbeddingsPerRepo` for text embeddings (default 512,000).
+
+Use the following site configuration to update the limits:
+
+```jsonc
+{
+  "embeddings": {
+    "maxCodeEmbeddingsPerRepo": 3072000,
+    "maxTextEmbeddingsPerRepo": 512000
+  }
+}
+```
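
To make the default exclusion list from the first hunk of this file concrete, here is a minimal Python sketch of how such patterns could be applied to a file path. The commit does not say how Sourcegraph matches patterns, so the semantics here are an assumption: patterns ending in `/` are treated as directory names anywhere in the path, and all other patterns are glob-matched against the file name. `is_excluded` and `DEFAULT_EXCLUDED_PATTERNS` are hypothetical names, not part of Sourcegraph.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Default patterns as listed in the documentation change above.
DEFAULT_EXCLUDED_PATTERNS = [
    "*ignore",
    ".gitattributes",
    ".mailmap",
    "*.csv",
    "*.svg",
    "*.xml",
    "__fixtures__/",
    "node_modules/",
    "testdata/",
    "mocks/",
    "vendor/",
]


def is_excluded(path: str, patterns=DEFAULT_EXCLUDED_PATTERNS) -> bool:
    """Return True if `path` matches any pattern.

    Assumed semantics (not confirmed by the source):
    - a pattern ending in "/" matches a directory segment of the path;
    - any other pattern is a glob matched against the file name only.
    """
    parts = PurePosixPath(path).parts
    if not parts:
        return False
    *dirs, name = parts
    for pattern in patterns:
        if pattern.endswith("/"):
            if pattern.rstrip("/") in dirs:
                return True
        elif fnmatchcase(name, pattern):
            return True
    return False
```

Under these assumptions, `is_excluded("web/node_modules/react/index.js")` and `is_excluded(".gitignore")` are both true, while `is_excluded("src/main.go")` is false.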

doc/cody/faq.md

Lines changed: 7 additions & 0 deletions
@@ -58,6 +58,13 @@ There can be several reasons why a job is not showing up in the list of jobs:
 ### How do I stop a running embeddings job?
 
 Jobs in state _QUEUED_ or _PROCESSING_ can be canceled by admins from the **Cody > Embeddings Jobs** page. To cancel a job, click on the _Cancel_ button of the job you want to cancel. The job will be marked for cancellation. Note that, depending on the state of the job, it might take a few seconds or minutes for the job to actually be canceled.
+### What are the reasons files are skipped?
+
+Files are skipped for the following reasons:
+
+- The file is too large (over 1 MB)
+- The file path matches an [exclusion pattern](./explanations/code_graph_context.md#excluding-files-from-embeddings)
+- More than [`embeddings.maxCodeEmbeddingsPerRepo`](./explanations/code_graph_context.md#limitting-the-number-of-embeddings-that-can-be-generated) code embeddings or [`embeddings.maxTextEmbeddingsPerRepo`](./explanations/code_graph_context.md#limitting-the-number-of-embeddings-that-can-be-generated) text embeddings have already been generated for the repo
 
 ### Third party dependencies
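
The three skip reasons added to the FAQ can be read as a simple decision procedure. The Python sketch below is illustrative only: `skip_reason` is a hypothetical helper, the order of the checks is an assumption, "1 MB" is read as 1,000,000 bytes, and a limit is treated as reached once the count equals it. None of these details are stated in the commit. The default limits are the values given in the documentation.

```python
from typing import Optional

# Assumption: "1 MB" is interpreted as 10**6 bytes.
MAX_FILE_SIZE_BYTES = 1_000_000
# Defaults taken from the documentation change in this commit.
DEFAULT_MAX_CODE_EMBEDDINGS_PER_REPO = 3_072_000
DEFAULT_MAX_TEXT_EMBEDDINGS_PER_REPO = 512_000


def skip_reason(
    size_bytes: int,
    path_excluded: bool,
    is_code: bool,
    code_generated: int,
    text_generated: int,
    max_code: int = DEFAULT_MAX_CODE_EMBEDDINGS_PER_REPO,
    max_text: int = DEFAULT_MAX_TEXT_EMBEDDINGS_PER_REPO,
) -> Optional[str]:
    """Return why a file would be skipped, or None if it would be embedded."""
    if size_bytes > MAX_FILE_SIZE_BYTES:
        return "file too large"
    if path_excluded:
        return "path excluded"
    # Code and text files count against separate per-repo limits.
    generated, limit = (
        (code_generated, max_code) if is_code else (text_generated, max_text)
    )
    if generated >= limit:
        return "embeddings limit reached"
    return None
```

For example, a small code file in a repo that has already produced 3,072,000 code embeddings would be skipped, while a text file in the same repo would still be processed as long as the text limit has not been reached.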
