regex-ext-tan global variables, keys, and functions summarized


variables

`$hex-key`

Definition: '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'A', 'B', 'C', 'D', 'E', 'F'

Used by function tan:dec-to-hex() tan:hex-to-dec()

Does not rely upon global variables, keys, functions, or templates.

functions

Does not rely upon global variables, keys, functions, or templates.

`tan:hex-to-dec()`

tan:hex-to-dec($hex as xs:string?) as item()*

Change any hexadecimal string into an integer

E.g., '1F' -> 31

Used by function tan:process-regex-escape-k()

Relies upon $hex-key .
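The conversion can be pictured as a positional lookup against the hex key. The following is a minimal Python sketch of that idea, not the library's XSLT source; the names `HEX_KEY` and `hex_to_dec` are hypothetical stand-ins for `$hex-key` and `tan:hex-to-dec()`.

```python
# Illustrative Python analogue; the TAN library itself is written in XSLT.
HEX_KEY = "0123456789ABCDEF"  # mirrors the $hex-key definition above


def hex_to_dec(hex_string):
    """Convert a hexadecimal string to an integer by positional lookup."""
    total = 0
    for digit in hex_string.upper():
        # str.index raises ValueError if the digit is not in the key
        total = total * 16 + HEX_KEY.index(digit)
    return total


print(hex_to_dec("1F"))  # 31
```

How the real function treats empty or malformed input is not stated here, so the sketch makes no claim about it.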

`tan:matches()`

Option 1 (regex-ext-tan-functions)

tan:matches($input as xs:string?, $pattern as xs:string) as xs:boolean

Two-parameter version of the three-parameter function below.

Used by function tan:obeyed-by-m() tan:get-toks() tan:matches()

Relies upon tan:matches .

Option 2 (regex-ext-tan-functions)

tan:matches($input as xs:string?, $pattern as xs:string, $flags as xs:string) as xs:boolean

Parallel to fn:matches(), but converts TAN-exceptions into classes. See tan:regex() for details.

Used by function tan:obeyed-by-m() tan:get-toks() tan:matches()

Relies upon tan:regex .
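As a rough illustration of the wrapper pattern (expand the TAN escapes, then hand the result to the ordinary matcher), here is a Python sketch. It is an assumption-laden analogue rather than the XSLT implementation: `expand_tan_regex` is a hypothetical placeholder that handles only a single hex codepoint inside \k{...}, and only the `i`, `m`, and `s` flags are mapped.

```python
import re

# Approximate mapping from XPath flag letters to Python re flags (assumption).
FLAG_MAP = {"i": re.IGNORECASE, "m": re.MULTILINE, "s": re.DOTALL}


def expand_tan_regex(pattern):
    """Hypothetical stand-in for tan:regex(): expands \\k{HEX} only."""
    return re.sub(
        r"\\k\{([0-9A-Fa-f]+)\}",
        lambda m: "[" + re.escape(chr(int(m.group(1), 16))) + "]",
        pattern,
    )


def matches(input_text, pattern, flags=""):
    """Two- or three-argument form; omitting flags behaves like Option 1."""
    compiled_flags = 0
    for letter in flags:
        compiled_flags |= FLAG_MAP[letter]
    # An absent input is treated as a zero-length string, as in fn:matches().
    expanded = expand_tan_regex(pattern)
    return re.search(expanded, input_text or "", compiled_flags) is not None


print(matches("angle M", r"\k{4d}"))       # True
print(matches("angle m", r"\k{4d}", "i"))  # True
```

The default argument plays the role of the two-parameter option, which simply delegates to the three-parameter one.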

`tan:process-regex-escape-k()`

tan:process-regex-escape-k($val-inside-braces as xs:string, $unicode-db as document-node()) as xs:string?

Used by function tan:regex()

Relies upon tan:hex-to-dec .

`tan:regex()`

tan:regex($regex as xs:string?) as xs:string?

Input: string of a regex search

Output: the same string, with TAN-reserved escape sequences replaced by character-class sequences

E.g., '\k{.greek.capital.perispomeni}' --> '[ἎἏἮἯἾἿὟὮὯᾎᾏᾞᾟᾮᾯ]'

'\k{.latin.cedilla}' --> '[ÇçĢģĶķĻļŅņŖŗŞşŢţȨȩᷗḈḉḐḑḜḝḨḩ]'

'angle \k{4d-4f, 51}' --> 'angle [MNOQ]'

This function grabs entire classes of Unicode characters, either by their codepoints or by parts of their names. It operates on the form \k{***VALUE***}, where ***VALUE*** is either (1) one or more hexadecimal numbers joined by commas and hyphens or (2) one or more words, each one preceded by a non-word character. In the first case, the function returns every Unicode character that has been picked, filling in ranges where indicated by a hyphen. In the second case, it returns every Unicode character that has all of those words in its official Unicode name or alias.
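The first (codepoint) option can be sketched in Python as follows. This is only an illustration of the expansion described above, not the library's XSLT code; the function names are hypothetical, and the sketch does not escape characters that are special inside a character class.

```python
import re


def expand_codepoints(value):
    """Turn 'hex[-hex], hex, ...' into a bracketed character class."""
    chars = []
    for part in value.split(","):
        bounds = [int(h, 16) for h in part.strip().split("-")]
        start, end = bounds[0], bounds[-1]  # a lone number is a range of one
        chars.extend(chr(cp) for cp in range(start, end + 1))
    return "[" + "".join(chars) + "]"


def expand_k_escapes(regex):
    """Replace each \\k{...} that holds hex values with its character class."""
    return re.sub(
        r"\\k\{([0-9A-Fa-f,\s-]+)\}",
        lambda m: expand_codepoints(m.group(1)),
        regex,
    )


print(expand_k_escapes(r"angle \k{4d-4f, 51}"))  # angle [MNOQ]
```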

Other examples:

Any word with an omega, even if not in any of the Greek blocks: '\k{.omega}' (useful if you wish to find nonstandard uses of the omega, especially in the symbol block)

Any word with two successive omegas, no matter their accentuation or capitalization, or whether they have an iota subscript: '\k{.greek.omega}{2}' (useful for looking up a Greek word whose accentuation changes depending upon context or inflection)

Every Greek word that attracts an accent from an enclitic:

'[\k{.greek.oxia}\k{.greek.tonos}\k{.greek.perispomeni}]\w*[\k{.greek.tonos}\k{.greek.oxia}]'
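The name-based option used in these examples can be approximated in Python with the standard `unicodedata` module. This is a sketch under stated assumptions: it matches whole words of the primary character name only (name aliases are not consulted), the results depend on the Unicode version bundled with the Python interpreter, and all function names are hypothetical.

```python
import re
import sys
import unicodedata


def chars_with_name_words(words):
    """All characters whose Unicode name contains every given word."""
    wanted = {w.upper() for w in words}
    found = []
    for cp in range(sys.maxunicode + 1):
        name = unicodedata.name(chr(cp), "")
        # Names are split on spaces and hyphens into whole words (assumption).
        if name and wanted <= set(re.split(r"[ -]", name)):
            found.append(chr(cp))
    return found


def expand_name_escapes(regex):
    """Replace each \\k{.word.word} with a class of the matching characters."""
    def to_class(match):
        words = re.findall(r"\w+", match.group(1))
        chars = chars_with_name_words(words)
        return "[" + "".join(re.escape(c) for c in chars) + "]"

    return re.sub(r"\\k\{((?:\W\w+)+)\}", to_class, regex)


pattern = expand_name_escapes(r"\k{.greek.omega}{2}")
# Two successive omegas: plain omega followed by omega with psili.
print(re.search(pattern, "\u03c9\u1f60") is not None)  # True
```

Each call scans the whole codepoint range, so a real implementation would presumably consult a prebuilt index, as the `$unicode-db` parameter of tan:process-regex-escape-k() suggests.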

Used by function tan:matches() tan:replace() tan:tokenize()

Relies upon tan:process-regex-escape-k ŧ add-square-brackets .

`tan:replace()`

Option 1 (regex-ext-tan-functions)

tan:replace($input as xs:string?, $pattern as xs:string, $replacement as xs:string) as xs:string

Three-parameter version of the four-parameter function below.

Used by function tan:batch-replace() tan:replace()

Relies upon tan:replace .

Option 2 (regex-ext-tan-functions)

tan:replace($input as xs:string?, $pattern as xs:string, $replacement as xs:string, $flags as xs:string) as xs:string

Parallel to fn:replace(), but converts TAN-exceptions into classes. See tan:regex() for details.

Used by function tan:batch-replace() tan:replace()

Relies upon tan:regex .

`tan:string-base()`

tan:string-base($arg as xs:string?) as xs:string?

This function takes any string and replaces every character with its base Unicode character.

E.g., ἀνθρὠπους -> ανθρωπουσ

This is useful for preparing text to be searched without respect to accents.

No variables, keys, functions, or named templates depend upon this xsl:function.

Relies upon tan:get-ucd-decomp .
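A comparable effect can be had in Python through canonical decomposition. The sketch below is a swapped-in technique (NFD normalization followed by removal of combining marks), not the method of tan:get-ucd-decomp, and the name `string_base` is hypothetical. Unlike the example above, it leaves a final sigma (ς) unchanged, because that character has no Unicode decomposition.

```python
import unicodedata


def string_base(text):
    """Strip accents: decompose, then drop every combining mark."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Accent-insensitive containment test built on the helper above.
def contains_ignoring_accents(haystack, needle):
    return string_base(needle) in string_base(haystack)


print(string_base("Çà"))  # Ca
```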

`tan:string-composite()`

tan:string-composite($arg as xs:string?) as xs:string?

This function is the inverse of tan:string-base, in that it replaces every character with those Unicode characters that use it as a base. If none exist, then the character itself is returned.

E.g., 'Max' -> 'MᴹḾṀṂℳⅯⓂ㎆㎒㎫㎹㎿㏁Ｍ𝐌𝑀𝑴𝓜𝔐𝕄𝕸𝖬𝗠𝘔𝙈𝙼🄼🅋🅪🅫aªàáâãäåāăąǎǟǡǻȁȃȧᵃḁẚạảấầẩẫậắằẳẵặₐ℀℁ⓐ㏂ａ𝐚𝑎𝒂𝒶𝓪𝔞𝕒𝖆𝖺𝗮𝘢𝙖𝚊xˣẋẍₓⅹⅺⅻⓧｘ𝐱𝑥𝒙𝓍𝔁𝔵𝕩𝖝𝗑𝘅𝘹𝙭𝚡'

This is useful for preparing regex character classes to broaden a search.

Used by function tan:expand-search()

Relies upon tan:get-ucd-decomp .
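One way to picture the inverse lookup is to index every character by the first character of its (recursively resolved) Unicode decomposition, compatibility mappings included. The Python sketch below makes that assumption explicitly; it is not the XSLT implementation, its names are hypothetical, and its exact output varies with the Unicode version of the interpreter.

```python
import sys
import unicodedata
from collections import defaultdict


def base_of(ch):
    """Follow decompositions down to the first undecomposable character."""
    decomposition = unicodedata.decomposition(ch)
    if not decomposition:
        return ch
    # Drop a compatibility tag such as <super> or <font>, keep the codepoints.
    parts = [p for p in decomposition.split() if not p.startswith("<")]
    return base_of(chr(int(parts[0], 16)))


def build_composite_index():
    """Map each base character to the characters that are built upon it."""
    index = defaultdict(list)
    for cp in range(sys.maxunicode + 1):
        ch = chr(cp)
        base = base_of(ch)
        if base != ch:
            index[base].append(ch)
    return index


COMPOSITES = build_composite_index()  # built once; takes a moment


def string_composite(text):
    """Each character, followed by every character that uses it as a base."""
    return "".join(ch + "".join(COMPOSITES.get(ch, [])) for ch in text)


# Wrap the result in brackets to get a broadened character class.
broadened = "[" + string_composite("M") + "]"
```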

`tan:tokenize()`

Option 1 (regex-ext-tan-functions)

tan:tokenize($input as xs:string?, $pattern as xs:string) as xs:string*

Two-parameter version of the three-parameter function below.

Used by function tan:tokenize()

Relies upon tan:tokenize .

Option 2 (regex-ext-tan-functions)

tan:tokenize($input as xs:string?, $pattern as xs:string, $flags as xs:string) as xs:string*

Parallel to fn:tokenize(), but converts TAN-exceptions into classes. See tan:regex() for details.

Used by function tan:tokenize()

Relies upon tan:regex .

templates

`Ŧ prep-regex-char-class`

No variables, keys, functions, or named templates depend upon this xsl:template.

Does not rely upon global variables, keys, functions, or templates.
