Syntax Blocks and Folding
Text parsing proceeds in two stages:
Example 1:
[ keyword:while keyword:for ] .+? keyword:do
This matches the start of a Lua “while” or “for” construct. As you can see, the expression looks like an ordinary regular expression, with one difference: instead of single characters, we use token names, optionally followed by the token content. You also cannot use character-class constructs such as “\s, \S, \W, \w, \d, \D, \0xFF, \U{Unicode_cat}”, or character-related modifiers such as (?ims). There are no characters here, only integer codes for tokens, so case insensitivity has no meaning, and the whole token sequence is always treated as a single line.
For the token names keyword, identifier, and symbol, we can use the shortcuts kw, id, and sym, respectively.
Example 2:
[ kw:while kw:for ] .+? kw:do
Example 3:
Any five keywords, followed by the start of a Lua while/for construct.
kw{5} [ kw:while kw:for ] .+? kw:do
Example 4:
JavaScript function: the “function” keyword, any identifier (detected by the <KeywordRegex> rule), a “(” symbol, anything except “;”, “{”, and “}” symbols, a “)” symbol, and a “{” symbol.
kw:function id sym:( [^ sym:; sym:} sym:{ ]* sym:) sym:{
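To make the idea of "token codes instead of characters" more concrete, here is a minimal Python sketch. It is not the editor's actual implementation: the class name TokenRegex, the (kind, text) token tuples, and the use of private-use Unicode characters as token codes are all assumptions made for illustration. Each token is encoded as exactly one character, so the token pattern can be translated into an ordinary character regex and match positions are token indexes.

```python
import re

SHORTCUTS = {"kw": "keyword", "id": "identifier", "sym": "symbol"}
# One whitespace-separated item: an atom (".", "kind" or "kind:text")
# followed by an optional quantifier such as *, +? or {5}.
ITEM = re.compile(r"^(\.|[A-Za-z]+(?::\S+?)?)([*+?]\??|\{\d+(?:,\d*)?\}\??)?$")
UNKNOWN = "\uF8FF"  # code for token kinds the pattern never mentions


def split_atom(atom):
    """Split 'kw:while' into ('keyword', 'while') and 'kw' into ('keyword', None)."""
    kind, _, text = atom.partition(":")
    return SHORTCUTS.get(kind, kind), (text.lstrip("\\") or None)


class TokenRegex:
    """Hypothetical helper: compiles a token pattern into a character regex."""

    def __init__(self, pattern):
        items = pattern.split()
        self.codes = {}  # (kind, text or None) -> one private-use character
        # Pass 1: give every token the pattern mentions its own code,
        # plus one catch-all code per kind for "any other content".
        for item in items:
            m = ITEM.match(item)
            if m and m.group(1) != ".":
                kind, text = split_atom(m.group(1))
                for key in ((kind, None), (kind, text)):
                    self.codes.setdefault(key, chr(0xE000 + len(self.codes)))
        # Pass 2: translate the token pattern into an ordinary character regex.
        parts, in_set = [], False
        for item in items:
            if item in ("[", "[^") or item.startswith("]"):
                parts.append(item)  # set delimiters pass through unchanged
                in_set = not item.startswith("]")
                continue
            atom, quant = ITEM.match(item).groups()
            if atom == ".":
                chars = "."
            else:
                kind, text = split_atom(atom)
                chars = "".join(c for (k, t), c in self.codes.items()
                                if k == kind and (text is None or t == text))
                if not in_set:
                    chars = "[" + chars + "]"
            parts.append(chars + (quant or ""))
        self.regex = re.compile("".join(parts), re.DOTALL)

    def search(self, tokens):
        """Return (start, end) token indexes of the first match, or None."""
        text = "".join(
            self.codes.get((kind, value))
            or self.codes.get((kind, None))
            or UNKNOWN
            for kind, value in tokens
        )
        m = self.regex.search(text)
        return m.span() if m else None


js_function = TokenRegex("kw:function id sym:( [^ sym:; sym:} sym:{ ]* sym:) sym:{")
js_tokens = [
    ("keyword", "var"), ("identifier", "x"), ("symbol", ";"),
    ("keyword", "function"), ("identifier", "f"), ("symbol", "("),
    ("identifier", "a"), ("symbol", ","), ("identifier", "b"),
    ("symbol", ")"), ("symbol", "{"), ("keyword", "return"),
]
print(js_function.search(js_tokens))  # (3, 11): tokens 3..10 form the function header
```

Because every token becomes one character, constructs like \s or (?i) have nothing to act on, which is the restriction described above. The sketch ignores keywordsIgnoreCase and error handling for malformed patterns.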
Example 1 (syntax blocks):
<Scheme name='Comment' defaultToken='comment' />

<!-- Sample JavaScript scheme -->
<Scheme name='JavaScriptMain' defaultToken='default' keywordsIgnoreCase='false'>
  <!-- Regexp for keywords and identifiers -->
  <KeywordRegex>\b[a-zA-Z_][\w_]*\b</KeywordRegex>
  <!-- Keyword list (short list, for this example) -->
  <Keywords>
    for in if else return while function new this var with arguments throw try catch finally with
  </Keywords>
  <Regex innerScheme='Comment' regex='//.*$' />
  <Regex token0='symbol' regex='[ \} \{ \] \[ \( \) > < ]' />
  <Regex token0='symbol' regex='[-:?\~=+!^;,]' />
  <SkipSyntaxToken token='comment' />
  <SyntaxBlock capture="true">
    <Start> kw:function id sym:( [^ sym:; sym:} sym:{ ]* sym:) sym:{ </Start>
    <End> sym:\} </End>
  </SyntaxBlock>
  <!-- We can use common syntax for many language constructs -->
  <SyntaxBlock capture="true" priority='10'>
    <Start>
      [ kw:while kw:do kw:if kw:else kw:try kw:catch kw:finally kw:switch ]
      [^ sym:; sym:} ]*? sym:\{
    </Start>
    <End> sym:} </End>
  </SyntaxBlock>
  <!-- We don't want folds for code in plain { .. } blocks. We just skip them to keep the braces balanced, because other constructs end with } too. -->
  <SyntaxBlock capture="false" priority='0' >
    <Start> sym:{ </Start>
    <End> sym:} </End>
  </SyntaxBlock>
</Scheme>
Example 2: VB syntax (using a reference to the start of the block)
<SyntaxBlock capture="true">
  <Start> [ kw:sub kw:class kw:if kw:function kw:property kw:select kw:with ] </Start>
  <End> kw:end $0 </End>
</SyntaxBlock>
Here $0 refers back to the token that matched <Start>, so we fold every construct of the form Sub FuncName ... End Sub, Class ClassName ... End Class, and so on.
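The effect of referring back to the start token can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name fold_ranges is hypothetical, the input is a plain list of lower-cased keyword tokens, and "kw:end $0" is taken to mean "the end keyword followed by the same keyword that opened the block". Single-line VB forms such as a one-line If are not handled.

```python
STARTERS = {"sub", "class", "if", "function", "property", "select", "with"}


def fold_ranges(keywords):
    """Return (start_index, end_index) pairs for blocks closed by 'end <starter>'.

    keywords: lower-cased keyword tokens in source order (hypothetical input format).
    """
    stack, folds = [], []
    i = 0
    while i < len(keywords):
        word = keywords[i]
        if word == "end" and i + 1 < len(keywords):
            # 'end X' closes a block only if X is the keyword that opened
            # the innermost open block (the "$0" back-reference).
            if stack and keywords[i + 1] == stack[-1][0]:
                _, start = stack.pop()
                folds.append((start, i + 1))
            i += 2  # consume both 'end' and the keyword after it
            continue
        if word in STARTERS:
            stack.append((word, i))
        i += 1
    return folds


words = ["class", "sub", "if", "end", "if", "end", "sub", "end", "class"]
print(fold_ranges(words))  # [(2, 4), (1, 6), (0, 8)]
```

Inner blocks are reported first because a block is recorded when its end is found. One rule covers all seven constructs, which is the point of using a reference instead of writing a separate <SyntaxBlock> per keyword.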
The <SkipSyntaxToken> element is a sub-element of <Scheme>; it works as a helper for the <SyntaxBlock> element.
Example:
<SkipSyntaxToken token='comment' />
All comments are skipped at the syntax parsing stage, so you can write
kw:function id sym:( [^ sym:; sym:} sym:{ ]* sym:) sym:{
instead of
kw:function comment* id comment* sym:( comment* [^ sym:; sym:} sym:{ ]* sym:) comment* sym:{
for a JavaScript function.
You can set multiple <SkipSyntaxToken> elements in a scheme.
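Conceptually, skipping tokens amounts to filtering the token stream before the block patterns are matched. The following Python sketch shows that idea only; the function name skip_syntax_tokens and the (kind, text) token tuples are assumptions for illustration, not the editor's API. It also keeps the original index of each remaining token, so that a match found in the filtered stream could be mapped back to positions in the full stream.

```python
def skip_syntax_tokens(tokens, skipped_kinds):
    """Drop tokens whose kind is skipped; also return each kept token's original index."""
    kept, positions = [], []
    for index, (kind, text) in enumerate(tokens):
        if kind not in skipped_kinds:
            kept.append((kind, text))
            positions.append(index)
    return kept, positions


tokens = [
    ("keyword", "function"), ("comment", "/* name */"), ("identifier", "f"),
    ("symbol", "("), ("comment", "/* no args */"), ("symbol", ")"), ("symbol", "{"),
]
kept, positions = skip_syntax_tokens(tokens, {"comment"})
print([text for _, text in kept])  # ['function', 'f', '(', ')', '{']
print(positions)                   # [0, 2, 3, 5, 6]
```

Passing several kinds in skipped_kinds corresponds to setting several <SkipSyntaxToken> elements in one scheme.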