Architecture
In issue #22, folks asked why the generated HTML website is not fully static and cannot be hosted with, say, Python's built-in HTTP server.
The short answer is that, for large codebases, the search index (the list of all symbols you can search for) can contain millions of entries, and it is not feasible to download a list that large from the server and search it with JavaScript on the client.
The only thing that's on the server in the current architecture is a list of declarations and a few auxiliary data structures (list of all assemblies and all projects).
But primarily it's a list of all declared symbols.
Each declared symbol is basically a record with five fields:
- Assembly number
- Glyph (icon)
- Name (what you search for)
- Symbol ID (the hex number used in hyperlinks to it)
- Description (usually the full namespace and type and member name)
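To make the shape of that record concrete, here is a minimal sketch in Python. It is only an illustration, not the actual SourceBrowser type: the field types, the `DeclaredSymbol` and `parse_line` names, and the tab-separated line format are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeclaredSymbol:
    """Hypothetical in-memory form of one declared symbol."""
    assembly_number: int  # assembly number (integer type is an assumption)
    glyph: int            # icon identifier (integer type is an assumption)
    name: str             # what you search for
    symbol_id: str        # hex number used in hyperlinks to the symbol
    description: str      # usually the full namespace, type and member name


def parse_line(line: str) -> DeclaredSymbol:
    """Parse one line of an assumed format: five tab-separated fields."""
    assembly, glyph, name, symbol_id, description = line.rstrip("\n").split("\t")
    return DeclaredSymbol(int(assembly), int(glyph), name, symbol_id, description)
```

For example, `parse_line("3\t12\tToString\ta1b2c3d4\tSystem.Object.ToString()")` would yield a record whose `name` is `ToString` and whose `symbol_id` is `a1b2c3d4`.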
You could store this static list on the server as a .txt file, download it in JavaScript, and implement search on the client without going to the server at all.
The problem with that approach is that SourceBrowser was designed to be highly scalable. It works with 60 million lines of code (all of Microsoft Developer Division source) and can easily scale to 100 million. That currently means around 6 million symbols (4 GB of memory, compressed). A list that size is not something you can download and search on the client, whereas holding it on the server is easy.
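As a rough illustration of why keeping the list in server memory makes lookups cheap, here is a minimal Python sketch of a prefix search over a sorted list of names. It is an assumption about one reasonable approach, not a description of how SourceIndexServer actually works; `build_index` and `prefix_search` are hypothetical names.

```python
import bisect


def build_index(names_with_ids):
    """Sort (name, symbol_id) pairs by case-folded name so we can binary search."""
    return sorted((name.casefold(), name, symbol_id) for name, symbol_id in names_with_ids)


def prefix_search(index, query, limit=100):
    """Return up to `limit` (name, symbol_id) pairs whose name starts with `query`."""
    key = query.casefold()
    # The 1-tuple (key,) sorts before any (key, name, id) entry, so this finds
    # the first entry whose folded name is >= key.
    start = bisect.bisect_left(index, (key,))
    results = []
    # All names sharing the prefix are contiguous in sorted order.
    for i in range(start, len(index)):
        folded, name, symbol_id = index[i]
        if len(results) >= limit or not folded.startswith(key):
            break
        results.append((name, symbol_id))
    return results
```

With the index built once at startup, each query costs a binary search plus a short scan, and only the handful of matching entries has to travel to the browser instead of the whole list.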
Implementing the feature we're talking about would basically mean removing SourceIndexServer and replacing it with a client-side solution. This is a non-trivial amount of work and not something I have time to do. I also don't believe it would scale past 100-200 thousand symbols, which would cover most midsize codebases but not the bigger ones.
If someone is willing to do the work, feel free to do it in your own fork, but I won't be taking this as a PR into my original repo (it would mean a lot of support work I'm not prepared to do).