  tree-sitter-vimdoc
  ==================

- This grammar intentionally support a subset of the vimdoc "spec"; predictable
- results are the primary goal, so that _output_ formats (e.g. HTML) are
- well-formed; the _input_ (vimdoc) is secondary. The first step should always be
- to try to fix the input (within reason) rather than insist on a grammar that
- handles vimdoc's endless quirks.
+ This grammar intentionally support a subset of the vimdoc "spec"
+ ([ref1](https://neovim.io/doc/user/helphelp.html#help-writing),
+ [ref2](https://github.com/nanotee/vimdoc-notes));
+ predictable results are the primary goal, so that _output_ formats (e.g. HTML)
+ are well-formed; the _input_ (vimdoc) is secondary. The first step should
+ always be to try to fix the input rather than insist on a grammar that handles
+ vimdoc's endless quirks.

  Overview
  --------

- - vimdoc format "spec":
-   - [:help help-writing](https://neovim.io/doc/user/helphelp.html#help-writing)
-   - https://github.com/nanotee/vimdoc-notes
- - whitespace is intentionally captured in all atoms, because it is often used
-   for "layout" and ascii art in legacy help files.
- - `block` is the main top-level node which contains `line` nodes.
-   - ends at blank line(s) or a line starting with `<`.
+ - `block` is the main top-level node which contains `line` and `line_li` nodes.
+   - delimited by blank line(s) or any line starting with `<` (codeblock terminator).
  - `line`:
    - contains atoms (words, tags, taglinks, …)
-   - contains `codeblock` because `>` can start a codeblock at the end of a line.
    - contains headings (`h1`, `h2`, `h3`) because `codeblock` terminated by
      "implicit stop" (no terminating `<`) consumes blank lines, so `block` has
      no way to end.
    - contains `column_heading` because `<` (the `codeblock` terminating char)
      can appear at the start of `column_heading`.
+ - `line_li` ("list item")
+   - consumes lines until blank line, codeblock, or next listitem.
+   - nesting is ignored: indented listitems are parsed as siblings.
  - `codeblock`:
-   - contains `line` nodes which do not contain `word` nodes, it's just the full
+   - contained by `line` or `line_li`. Because ">" can start
+     a codeblock at the end of any line.
+   - contains `line` nodes without `word` nodes, it's just the full
      raw text line including whitespace. This is somewhat dictated by its
      "preformatted" nature; parsing the contents would require loading a "child"
      language (injection). See [#2](https://github.com/neovim/tree-sitter-vimdoc/issues/2).
@@ -38,16 +39,20 @@ Overview
  Known issues
  ------------

- - `line_li` ("list item") is experimental. It doesn't support nesting yet.
+ - Input must end with newline/EOL (`\n`). Grammar does not support files without EOL.
+ - Input must end with a blank line. Though this doesn't seem to matter in practice.
  - Spec requires that `codeblock` delimiter ">" must be preceded by a space
    (" >"), not a tab. But currently the grammar doesn't enforce this. Example:
    `:help lcs-tab`.
  - `url` doesn't handle _surrounding_ parens. E.g. `(https://example.com/#yay)` yields `word`
  - `url` doesn't handle _nested_ parens. E.g. `(https://example.com/(foo)#yay)`
+ - `column_heading` currently only recognizes tilde "~" preceded by space (i.e.
+   "foo ~" not "foo~"). This covers 99% of :help files, but the grammar should
+   probably support "foo~" also.

  TODO
  ----

- - `line_noeol` is a special-case to support documents that don't end in EOL.
-   Grammar could be simpler if we require EOL at end of document.
  - `line_modeline`?
+ - `tag_heading`: line(s) containing only tags, typically implies a "heading"
+   before a block.
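
Since the README prefers fixing the input over extending the grammar, two of the known issues above (missing final EOL, and `column_heading` written as "foo~" with no space before the tilde) could be checked before parsing. The following Python sketch is only an illustration under that assumption: the helper names `ensure_trailing_eol` and `find_unspaced_column_headings` are hypothetical and are not part of this repository, and the tilde check is a plain regex heuristic that knows nothing about codeblocks, so it may report false positives.

```python
import re
from typing import List

# Heuristic (assumption): a column heading is a line ending in "~".
# This pattern matches only when the "~" is NOT preceded by whitespace.
_UNSPACED_TILDE = re.compile(r"\S~$")


def ensure_trailing_eol(text: str) -> str:
    """Return `text` unchanged if it ends with a newline, else append one."""
    return text if text.endswith("\n") else text + "\n"


def find_unspaced_column_headings(text: str) -> List[int]:
    """Return 1-based numbers of lines that end in "~" with no space before it."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if _UNSPACED_TILDE.search(line)
    ]


if __name__ == "__main__":
    sample = "Name ~\nName~\nplain text"
    fixed = ensure_trailing_eol(sample)
    print(repr(fixed[-1]))                       # final character after the fix
    print(find_unspaced_column_headings(fixed))  # candidate lines to review
```

The lines reported by the second helper are candidates for manual review (for example, rewriting "foo~" as "foo ~"), not confirmed errors.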