A Tree-sitter grammar for the hledger journal format.
This grammar provides comprehensive parsing and syntax highlighting for hledger journal files with support for:
- Transactions with dates, status, codes, and descriptions
- Postings with accounts and amounts
- Virtual postings (parentheses and brackets)
- Balance assertions (= and ==)
- Cost/price specifications (@ and @@)
- Comments (; and #)
- Directives (account, commodity, include, alias, payee, tag, P, decimal-mark)
- Periodic transactions (with ~ and intervals)
- Multi-format number parsing (1,234.56, 1.234,56, scientific notation)
- Unicode support for accounts and commodities
- Syntax highlighting via Tree-sitter queries
- Multiple date formats (YYYY-MM-DD, YYYY/MM/DD, YYYY.MM.DD)
- Currency symbols and quoted commodities
- Complex amount formatting with thousands separators
- Account names with Unicode characters
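As an illustration of what multi-format number parsing involves, here is a minimal Python sketch that normalizes the number styles listed above. It is not the grammar's implementation. The function name `parse_number` and its `decimal_mark` parameter are hypothetical. The heuristic is an assumption modelled on hledger's documented behaviour: when both `.` and `,` appear, the rightmost is the decimal mark; a single mark is treated as a decimal mark; a repeated mark is a digit group separator.

```python
import re
from decimal import Decimal
from typing import Optional


def parse_number(text: str, decimal_mark: Optional[str] = None) -> Decimal:
    """Parse a journal-style number into a Decimal (simplified sketch)."""
    # Optional sign, digits mixed with '.' and ',', optional exponent.
    m = re.fullmatch(r"([+-]?)([\d.,]+)(?:[eE]([+-]?\d+))?", text.strip())
    if m is None:
        raise ValueError(f"not a number: {text!r}")
    sign, digits, exponent = m.groups()

    if decimal_mark is None:
        marks = [c for c in digits if c in ".,"]
        if not marks:
            decimal_mark = "."            # plain integer, choice is irrelevant
        elif len(set(marks)) == 2:
            decimal_mark = marks[-1]      # both kinds present: rightmost wins
        elif len(marks) == 1:
            decimal_mark = marks[0]       # single mark: assume decimal mark
        else:
            # The same mark repeated can only be a digit group separator.
            decimal_mark = "," if marks[0] == "." else "."

    group_mark = "," if decimal_mark == "." else "."
    normalized = digits.replace(group_mark, "").replace(decimal_mark, ".")
    if exponent:
        normalized += "e" + exponent
    return Decimal(sign + normalized)
```

Passing `decimal_mark` explicitly plays the role of the decimal-mark directive: `parse_number("1,000", decimal_mark=".")` reads the comma as a group separator, whereas the default heuristic would read it as a decimal mark.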
While working with hledger files, I've found the editor integration to be subpar. ledger-mode is great, but it is even more restrictive for non-technical folks than ledger already is. Integrations for other editors also fall short. For instance, mariosangiorgio/vscode-ledger (no longer maintained) and mhansen/hledger-vscode still use incomplete tmLanguage syntaxes.
Eventually, I would like to use this to create a hledger autoformatter.
Future enhancements to consider:
- Secondary dates
- Posting dates
- Balance assignments
- Auto-postings
- More directive types
- Error recovery improvements
This parser can be used with any editor that supports Tree-sitter grammars (Neovim, Emacs, VS Code with extensions, etc.).
# Install dependencies
npm install
# Generate parser from grammar
npm run build
# Run tests
npm run test
The grammar recognizes these file extensions:
- .journal
- .j
- .hledger
- .ledger
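For tooling that needs to decide whether a file should be handed to this parser, the extensions listed above can be checked with a small helper. This is only a sketch: `JOURNAL_EXTENSIONS` and `is_journal_file` are hypothetical names, and the case-insensitive comparison is an assumption rather than documented behaviour.

```python
from pathlib import Path

# Extensions taken from the list above.
JOURNAL_EXTENSIONS = {".journal", ".j", ".hledger", ".ledger"}


def is_journal_file(path: str) -> bool:
    """Return True if the path ends in one of the recognized extensions."""
    # Lower-casing the suffix is an assumption made for this sketch.
    return Path(path).suffix.lower() in JOURNAL_EXTENSIONS
```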
The test suite consists of two types of tests:
Hand-written tests in test/corpus/ covering:
- Basic transactions and postings
- Virtual postings and balance assertions
- Unicode accounts and commodities
- Number formatting variations
- Directives and comments
- Periodic transactions
- Error cases
The project includes a test extraction system that pulls real-world test cases from the official hledger repository:
# Update hledger submodule to latest version
git submodule update --remote hledger
# Extract test cases from hledger's test suite and update assertions
npm run codegen
The extraction process:
- Scans 600+ test files from hledger's test suite
- Extracts journal content from shelltest format
- Generates corpus/extracted_from_hledger.txt with smart merging
- Preserves any manual parse tree assertions you've written
- Never overwrites your custom work
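The extraction and merge steps can be pictured roughly as follows. This Python sketch is not the project's codegen script. It assumes shelltestrunner's format 3, where a lone `<` line opens the input block and the first `$` command line closes it. The names `extract_journals`, `merge_corpus` and `Corpus` are hypothetical, and keying test cases by name is an assumption made for illustration.

```python
from typing import Dict, List, Tuple


def extract_journals(shelltest_text: str) -> List[str]:
    """Return the input blocks of a shelltest file (format 3 assumed)."""
    journals: List[str] = []
    current = None  # lines of the input block being collected, if any
    for line in shelltest_text.splitlines():
        if line.rstrip() == "<":
            current = []                      # a lone "<" opens an input block
        elif line.startswith("$") and current is not None:
            # The first command line closes the input block.
            journals.append("\n".join(current).strip("\n"))
            current = None
        elif current is not None:
            current.append(line)
    return journals


# Test name -> (journal source, expected parse tree, or "" if none written yet)
Corpus = Dict[str, Tuple[str, str]]


def merge_corpus(existing: Corpus, extracted: Dict[str, str]) -> Corpus:
    """Refresh journal sources while keeping hand-written parse trees."""
    merged: Corpus = {}
    for name, journal in extracted.items():
        # Reuse the parse tree already written for this test, if there is one.
        _, old_tree = existing.get(name, ("", ""))
        merged[name] = (journal, old_tree)
    # Entries that exist only in the current corpus are never dropped.
    for name, entry in existing.items():
        merged.setdefault(name, entry)
    return merged
```

Merging by key rather than regenerating the file wholesale is what lets manual assertions survive repeated runs.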
# Run all tests (manual + any generated)
npm test
# Run tests for specific corpus file
npx tree-sitter test --corpus test/corpus/basic.txt