regex-ext-tan global variables, keys, and functions summarized

tan:get-ucd-decomp()

Used by functions tan:string-base(), tan:string-composite(), tan:expand-search().

Does not rely upon global variables, keys, functions, or templates.

Option 1 (regex-ext-tan-functions)

tan:matches($input as xs:string?, $pattern as xs:string) as xs:boolean

Two-parameter version of the three-parameter function below (omits $flags).

Used by functions tan:obeyed-by-m(), tan:get-toks(), tan:matches().

Relies upon tan:matches.

Option 2 (regex-ext-tan-functions)

tan:matches($input as xs:string?, $pattern as xs:string, $flags as xs:string) as xs:boolean

Parallel to fn:matches(), but first converts TAN-reserved escape sequences (\k{...}) in the pattern into character classes. See tan:regex() for details.

Used by functions tan:obeyed-by-m(), tan:get-toks(), tan:matches().

Relies upon tan:regex.

tan:regex($regex as xs:string?) as xs:string?

Input: string of a regex search

Output: the same string, with TAN-reserved escape sequences replaced by character-class sequences

E.g., '\k{.greek.capital.perispomeni}' --> '[ἎἏἮἯἾἿὟὮὯᾎᾏᾞᾟᾮᾯ]'

'\k{.latin.cedilla}' --> '[ÇçĢģĶķĻļŅņŖŗŞşŢţȨȩᷗḈḉḐḑḜḝḨḩ]'

'angle \k{4d-4f, 51}' --> 'angle [MNOQ]'

This function grabs entire classes of Unicode characters, either by their codepoints or by parts of their names. It acts on the special form \k{***VALUE***}, where ***VALUE*** is either (1) one or more hexadecimal numbers joined by commas and hyphens or (2) one or more words, each prefixed by a non-word character. In the first case, it returns every Unicode character so specified, filling in ranges where indicated by a hyphen. In the second case, it returns every Unicode character that has all of those words in its official Unicode name or alias.
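The first (hexadecimal) form described above can be sketched in Python. This is only an illustration of the described behavior, not the TAN implementation: the names expand_hex_value and expand_k_hex are hypothetical, and malformed values (such as an open-ended range) are not handled.

```python
import re


def expand_hex_value(value: str) -> str:
    """Turn a value such as '4d-4f, 51' into a character class such as '[MNOQ]'.

    Hypothetical helper: commas separate items, a hyphen marks an inclusive range.
    """
    chars = []
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        # A single codepoint yields one bound; a range yields two.
        bounds = [int(b, 16) for b in part.split("-")]
        start, end = bounds[0], bounds[-1]
        chars.extend(chr(cp) for cp in range(start, end + 1))
    # Escape each character so that ']' or '\' cannot break the class.
    return "[" + "".join(re.escape(c) for c in chars) + "]"


def expand_k_hex(pattern: str) -> str:
    """Replace every hexadecimal \\k{...} escape in a pattern; other text is left alone."""
    return re.sub(
        r"\\k\{([0-9A-Fa-f,\-\s]+)\}",
        lambda m: expand_hex_value(m.group(1)),
        pattern,
    )


print(expand_k_hex(r"angle \k{4d-4f, 51}"))
```

Escapes in the word form, such as \k{.latin.cedilla}, contain characters outside the hexadecimal alphabet, so this sketch leaves them untouched.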

Other examples:

Any word with an omega, even if not in any of the Greek blocks: '\k{.omega}' (useful if you wish to find nonstandard uses of the omega, especially in the symbol block)

Any word with two successive omegas, no matter their accentuation or capitalization, or whether they have an iota subscript: '\k{.greek.omega}{2}' (useful for looking up a Greek word whose accentuation changes depending upon context or inflection)

Every Greek word that attracts an accent from an enclitic:

'[\k{.greek.oxia}\k{.greek.tonos}\k{.greek.perispomeni}]\w*[\k{.greek.tonos}\k{.greek.oxia}]'
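The word-based form used in these examples can be approximated in Python with the standard unicodedata module. This is a rough sketch under stated assumptions, not how TAN does it: the function name chars_with_name_words is hypothetical, words are matched as whole words of the character name, name aliases are not consulted, and the exact result depends on the Unicode version bundled with the Python interpreter.

```python
import re
import sys
import unicodedata


def chars_with_name_words(value: str) -> str:
    """Build a character class of every character whose Unicode name
    contains all of the words in a value such as '.greek.omega'.

    Hypothetical helper; aliases are ignored, unlike the behavior described above.
    """
    wanted = {w.upper() for w in re.findall(r"\w+", value)}
    hits = []
    for cp in range(sys.maxunicode + 1):
        name = unicodedata.name(chr(cp), "")  # "" for unnamed codepoints
        if not name:
            continue
        # Unicode names use spaces and hyphens as word separators.
        if wanted <= set(name.replace("-", " ").split()):
            hits.append(chr(cp))
    return "[" + "".join(re.escape(c) for c in hits) + "]"


# Two successive omegas, whatever their accentuation or case.
omega_class = chars_with_name_words(".greek.omega")
print(re.fullmatch(omega_class + "{2}", "\u03c9\u03ce") is not None)
```

Scanning every codepoint on each call is slow; a real implementation would presumably index the names once.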

Used by functions tan:matches(), tan:replace(), tan:tokenize().

Relies upon tan:process-regex-escape-k ŧ add-square-brackets .

Option 1 (regex-ext-tan-functions)

tan:replace($input as xs:string?, $pattern as xs:string, $replacement as xs:string) as xs:string

Three-parameter version of the four-parameter function below (omits $flags).

Used by functions tan:batch-replace(), tan:replace().

Relies upon tan:replace.

Option 2 (regex-ext-tan-functions)

tan:replace($input as xs:string?, $pattern as xs:string, $replacement as xs:string, $flags as xs:string) as xs:string

Parallel to fn:replace(), but first converts TAN-reserved escape sequences (\k{...}) in the pattern into character classes. See tan:regex() for details.

Used by functions tan:batch-replace(), tan:replace().

Relies upon tan:regex.

Option 1 (regex-ext-tan-functions)

tan:tokenize($input as xs:string?, $pattern as xs:string) as xs:string*

Two-parameter version of the three-parameter function below (omits $flags).

Used by function tan:tokenize()

Relies upon tan:tokenize.

Option 2 (regex-ext-tan-functions)

tan:tokenize($input as xs:string?, $pattern as xs:string, $flags as xs:string) as xs:string*

Parallel to fn:tokenize(), but first converts TAN-reserved escape sequences (\k{...}) in the pattern into character classes. See tan:regex() for details.

Used by function tan:tokenize()

Relies upon tan:regex.
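The matches, replace, and tokenize functions summarized above share one shape: the pattern is passed through tan:regex(), then handed to the corresponding standard function. A minimal Python sketch of that shape follows. Everything in it is an assumption made for illustration: the Python names are hypothetical, tan_regex here is a stand-in that expands only the hexadecimal form of \k{...}, the default empty flags string models the shorter signatures, and Python's re module differs from the XPath functions in details such as replacement syntax and the handling of empty input.

```python
import re

# Assumed mapping from XPath-style flag letters to Python re flags.
_FLAGS = {"i": re.IGNORECASE, "m": re.MULTILINE, "s": re.DOTALL, "x": re.VERBOSE}


def tan_regex(pattern: str) -> str:
    """Stand-in for tan:regex(): expands only hexadecimal \\k{...} escapes."""
    def expand(m):
        chars = []
        for part in m.group(1).split(","):
            bounds = [int(b, 16) for b in part.strip().split("-")]
            chars.extend(chr(cp) for cp in range(bounds[0], bounds[-1] + 1))
        return "[" + "".join(re.escape(c) for c in chars) + "]"
    return re.sub(r"\\k\{([0-9A-Fa-f,\-\s]+)\}", expand, pattern)


def _compile(pattern: str, flags: str = ""):
    """Expand the pattern first, then compile it with the requested flags."""
    bits = 0
    for letter in flags:
        bits |= _FLAGS[letter]
    return re.compile(tan_regex(pattern), bits)


def tan_matches(input_str, pattern, flags=""):
    return _compile(pattern, flags).search(input_str or "") is not None


def tan_replace(input_str, pattern, replacement, flags=""):
    return _compile(pattern, flags).sub(replacement, input_str or "")


def tan_tokenize(input_str, pattern, flags=""):
    return _compile(pattern, flags).split(input_str or "")


print(tan_matches("angle MNO", r"angle \k{4d-4f, 51}"))
print(tan_replace("MOP", r"\k{4d-4f}", "_"))
print(tan_tokenize("aMbQc", r"\k{4d, 51}"))
```

Keeping the expansion in one function means each wrapper stays a thin delegate, which mirrors the "Relies upon tan:regex" notes in the entries above.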