Procedure is a JavaScript function that will be called against the node that matches our selector. In temme, the procedure part is placed after the normal CSS selector within a pair of curly brace { }
. Actually, in the previous docs, we have met procedures: text-capture and conditional-assignment are two special forms of procedures.
div{ proc(arg1, arg2) }
When a node matches the matching rules,proc
will be executed. Every argument is either a capture (e.g.$foo
) or a simple JavaScript literal.div{ $arg }
This is a special form and it is equivalent todiv{ text($arg) }
div{ $foo = bar }
This is a special form and it is equivalent todiv{ assign($foo, bar) }
Since text-capture is the most used capture form, temme uses text
as the default procedure. When we only provide a single value-capture, the default capture text
will be executed. In another word, div{ $foo }
and div{ text($foo) }
are equivalent.
Conditional-assignment is the shortcut for assign
procedure. In other words, div{ $foo = bar }
is equivalent to div{ assign($foo, bar) }
.
Temme provides the following built-in procedures to accommodate common capture requirements. Besides text
and assign
, built-in procedures contains:
div{ html($foo) }
Capture the inner-HTML of a nodediv{ node($bar) }
Capture the node itself. When there are some capture requirements that cannot be satisfied with the declarative temme selector, we can get the cheerio node and call its JavaScript API.
Examples:
<!-- html used below -->
<div class="outer">
<p>TEXT-1</p>
<div class="inner">TEXT-2</div>
</div>
const selector = `
.outer{ $a };
.outer{ text($b) };
.outer{ html($c) };
.outer{ node($d|attr('class')) };
.outer{ $e|toLowerCase };
div{ $hasDiv = true };
table{ $hasTable = true };
`
temme(html, selector)
//=>
// {
// "a": "TEXT-1 TEXT-2",
// "b": "TEXT-1 TEXT-2",
// "c": "<p>TEXT-1</p> <div class=\"inner\">TEXT-2</div>",
// "d": "outer",
// "e": "text-1 text-2",
// "hasDiv": true
// }
Procedure find
tries to capture a substring of the node text. find
accepts two or three arguments with each a string or a value-capture. The detailed usage is as follows:
find($x, 'world')
will try to capture the text before'world'
. If the text of node is'hello world'
, then the result will be{ x: 'hello' }
find('hello', $x)
will try to capture the text after'hello'
.find('hello', $x, 'world')
will try to capture the text between'hello'
and'world'
.
find
simply uses String#indexOf
to get the index of a substring. So when there are more than one matches, find
will always use the first match. If find
cannot find a substring matching the argument, then the result of find
is null.
Example:
<a href="https://github.com/shinima/temme">Star Me on GitHub</a>
temme(html, `a { find('Star Me on ', $website) }`)
//=> { "website": "GitHub" }
temme(html, `a { find('Fork Me on ', $website) }`)
//=> null
// cannot find a substring matching 'Fork Me on ', so the result is null
Like in filters, temme supports several different ways to define customized procedures. When customized procedures get called, the arguments are as follows: the capture result, the node matching the CSS selector, the arguments passed in the selector.
Procedures are supported to be powerful and complex. In most situations, we do not need to use it. Temme supports pseudo-class selector (powered by css-select) which improve the expression ability greatly (especially :contains
, :not
and :has
). Before defining customized procedures, try to have a luck with these pseudo-class selectors.
When implementing customized procedures, you can use procedures.ts as a reference.
import { defineProcedure } from 'temme'
defineProcedure('myProcedure', function myProcedure(result, node, ...args) {
/* ... */
})
Provide a customized procedure map as the fifth argument of temme
.
const extraProcedures = {
mark(result, node, arg) {
result.add(arg, { mark: true, text: node.text() })
},
// ...
}
temme(
html,
'div{ mark($foo) }',
/* extraFilters */ null,
/* extraModifiers */ null,
extraProcedures,
)
Define procedures in selector string. Inline procedure definition has the same syntax as JavaScript-style function definition. The difference is that temme uses procedure
as the keyword instead of function
.
procedure mark(result, node, arg) {
/* Procedure logic here */
/* The code here will be executed as in a JavaScript function */
/* Note that the curly braces must be balanced here due to the parser limitations */
}
// We can use `mark` as follows
div{ mark($foo) };