TAN-core-string global variables, keys, and functions summarized


Variables

`$char-reg-exp`

TAN-core-string-functions

Definition: '\P{M}\p{M}*'

Used by function tan:chop-string().

Does not rely upon global variables, keys, functions, or templates.

`$nested-phrase-close-marker-regex`

TAN-core-string-functions

Definition: concat('[', tan:escape(string-join($nested-phrase-markers/tan:pair/tan:close/text(), '')), ']')

No variables, keys, functions, or named templates depend upon this xsl:variable.

Relies upon $nested-phrase-markers.

`$nested-phrase-marker-regex`

TAN-core-string-functions

Definition: concat('[', tan:escape(string-join($nested-phrase-markers/tan:pair/*/text(), '')), ']')

Used by function tan:nested-phrase-loop().

Relies upon $nested-phrase-markers.
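
The two regex variables above are built the same way: the marker characters are joined, escaped, and wrapped in square brackets to form a character class. A minimal Python sketch of that construction follows. The marker pairs are hypothetical (the real list lives in $nested-phrase-markers), and `re.escape` merely stands in for tan:escape(), which is assumed here to escape regex-special characters.

```python
import re

# Hypothetical marker pairs; the real ones are defined in $nested-phrase-markers.
MARKER_PAIRS = [("(", ")"), ("[", "]"), ("\u201c", "\u201d")]


def char_class(chars):
    """Join the characters, escape them, and wrap them in a character class."""
    return "[" + re.escape("".join(chars)) + "]"


# Counterpart of $nested-phrase-marker-regex: opening and closing markers.
marker_regex = char_class(o + c for o, c in MARKER_PAIRS)

# Counterpart of $nested-phrase-close-marker-regex: closing markers only.
close_marker_regex = char_class(c for _, c in MARKER_PAIRS)
```

Escaping matters because markers such as `]` would otherwise end the character class early.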

`$nested-phrase-markers`

TAN-core-string-functions

This variable has a complex definition. See stylesheet for definition.

Used by variable $nested-phrase-marker-regex, $nested-phrase-close-marker-regex.

Used by function tan:nested-phrase-loop().

Does not rely upon global variables, keys, functions, or templates.

Functions

`tan:atomize-string()`

TAN-core-string-functions

tan:atomize-string($input as xs:string?) as xs:string*

Alias for tan:chop-string()

No variables, keys, functions, or named templates depend upon this xsl:function.

Relies upon tan:chop-string().

`tan:batch-replace()`

TAN-core-string-functions

tan:batch-replace($string-to-replace as xs:string?, $replace-elements as element()*) as xs:string?

Input: a string, a sequence of <[ANY NAME] pattern="" replacement="" [flags=""]>

Output: the string, after those replacements are applied in order

Used by function tan:batch-replace().

Relies upon tan:batch-replace().
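
The behaviour described above, a chain of replacements in which each one operates on the result of the previous one, can be sketched in Python as follows. This is an illustration only, not the XSLT implementation: the tuples stand in for the replace elements, only the `i` flag is handled, and Python's `re` syntax differs from XPath regular expressions (for example, `\1` rather than `$1` in replacements).

```python
import re


def batch_replace(text, replacements):
    """Apply (pattern, replacement, flags) triples to text, one after another."""
    for pattern, replacement, flags in replacements:
        # Only the case-insensitive flag is supported in this sketch.
        re_flags = re.IGNORECASE if "i" in flags else 0
        text = re.sub(pattern, replacement, text, flags=re_flags)
    return text
```

Order matters: `batch_replace("ab", [("a", "b", ""), ("b", "c", "")])` returns `"cc"`, because the second replacement also sees the `b` produced by the first.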

`tan:chop-string()`

Option 1 (TAN-core-string-functions)

tan:chop-string($input as xs:string?) as xs:string*

Input: any string

Output: that string chopped into a sequence of individual characters, following 
TAN rules (modifying characters always join their preceding base character)

Used by template ŧ core-expansion-terse-attributes-to-elements, ŧ mark-dependencies-pass-2.

Used by function tan:infuse-divs(), tan:int-to-grc(), tan:atomize-string(), tan:string-length(), tan:chop-string().

Relies upon $char-reg-exp.
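
To illustrate the rule that modifying characters stay with their preceding base character, here is a minimal Python sketch of the one-parameter behaviour. Python's `re` module has no `\p{M}`, so the sketch checks the Unicode general category directly. Dropping a mark that has no preceding base is an assumption, made because `\P{M}\p{M}*` would not match such a mark.

```python
import unicodedata


def chop_string(text):
    """Split text into base characters, each with its trailing combining marks."""
    chunks = []
    for ch in text or "":
        # General categories Mn, Mc, and Me together correspond to \p{M}.
        is_mark = unicodedata.category(ch).startswith("M")
        if not is_mark:
            chunks.append(ch)
        elif chunks:
            # A modifying character joins the preceding base character.
            chunks[-1] += ch
        # Assumption: a mark with no preceding base character is dropped.
    return chunks
```

For example, `chop_string("e\u0301a")` returns two items, the first being `e` plus U+0301 COMBINING ACUTE ACCENT.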

Option 2 (TAN-core-string-functions)

tan:chop-string($input as xs:string?, $chop-after-regex as xs:string) as xs:string*

2-param version of the fuller one below

Used by template ŧ core-expansion-terse-attributes-to-elements, ŧ mark-dependencies-pass-2.

Used by function tan:infuse-divs(), tan:int-to-grc(), tan:atomize-string(), tan:string-length(), tan:chop-string().

Relies upon tan:chop-string().

Option 3 (TAN-core-string-functions)

tan:chop-string($input as xs:string?, $chop-after-regex as xs:string, $preserve-nested-clauses as xs:boolean) as xs:string*

Input: any string, a regular expression, a boolean

Output: the input string cut into a sequence of strings using the regular expression 
as the cut marker

If the last boolean is true, then nested clauses (parentheses, direct quotations, 
etc.) will be preserved.

This function differs from the 1-parameter version in that it is used to chop the 
string not into individual characters but into words, clauses, sentences, etc.

Used by template ŧ core-expansion-terse-attributes-to-elements, ŧ mark-dependencies-pass-2.

Used by function tan:infuse-divs(), tan:int-to-grc(), tan:atomize-string(), tan:string-length(), tan:chop-string().

Relies upon tan:nested-phrase-loop().
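
A rough Python sketch of the three-parameter behaviour follows. It assumes that each cut falls immediately after a regex match (as the parameter name suggests) and that "preserved" means no cut is made while a nested clause is still open. The marker map is hypothetical, and the real function delegates nesting to tan:nested-phrase-loop(), which is not reproduced here.

```python
import re

# Hypothetical open-to-close marker map, standing in for $nested-phrase-markers.
NESTING = {"(": ")", "[": "]", "\u201c": "\u201d"}


def chop_after(text, chop_after_regex, preserve_nested=False):
    """Cut text after each regex match, optionally keeping nested clauses whole."""
    pieces, start, scanned, stack = [], 0, 0, []
    for match in re.finditer(chop_after_regex, text):
        if match.end() == match.start():
            continue  # ignore empty matches
        # Track which nested clauses are open up to the end of this match.
        for ch in text[scanned:match.end()]:
            if stack and ch == stack[-1]:
                stack.pop()
            elif ch in NESTING:
                stack.append(NESTING[ch])
        scanned = match.end()
        if preserve_nested and stack:
            continue  # still inside a nested clause, so do not cut here
        pieces.append(text[start:match.end()])
        start = match.end()
    if start < len(text):
        pieces.append(text[start:])
    return pieces
```

With the regex `\.\s*`, the string `One. Two (a. b). Three` is cut into four pieces by default, but into three when nested clauses are preserved, because the full stop inside the parentheses no longer triggers a cut.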

`tan:collate-loop-inner()`

TAN-core-string-functions

tan:collate-loop-inner($collation-so-far as element(), $string-to-process as xs:string?, $string-label as xs:string?) as element()*

Input: a collation element (see template mode diff-to-collation), one string to 
process, and the corresponding string label

Output: a series of collation elements, marking where there is commonality and 
differences

This inner loop returns only the children of the collation element; the outer loop 
handles the parent element

This function supports the XSLT 2.0 version of tan:collate()

Used by function tan:collate-loop-outer(), tan:collate-loop-inner().

Relies upon tan:collate-loop-inner(), tan:diff().

`tan:collate-loop-outer()`

TAN-core-string-functions

tan:collate-loop-outer($collation-so-far as element(), $strings-to-process as xs:string*, $string-labels as xs:string*) as element()

Input: a collation element (see template mode diff-to-collation), some strings to 
process, and corresponding string labels

Output: a series of collation elements, marking where there is commonality and 
differences

This function supports the XSLT 2.0 version of tan:collate()

Used by function tan:collate(), tan:collate-loop-outer().

Relies upon tan:collate-loop-inner(), tan:collate-loop-outer(), tan:trim-long-text().

`tan:diff()`

Option 1 (TAN-core-string-functions)

tan:diff($string-a as xs:string?, $string-b as xs:string?) as element()

2-param version of fuller one below

Used by template ŧ class-1-expansion-verbose-pass-1.

Used by function tan:diff-cache(), tan:giant-diff(), tan:diff(), tan:collate(), tan:collate-loop-inner().

Relies upon tan:diff().

Option 2 (TAN-core-string-functions)

tan:diff($string-a as xs:string?, $string-b as xs:string?, $snap-to-word as xs:boolean) as element()

3-param version of fuller one below

Used by template ŧ class-1-expansion-verbose-pass-1.

Used by function tan:diff-cache(), tan:giant-diff(), tan:diff(), tan:collate(), tan:collate-loop-inner().

Relies upon tan:diff().

Option 3 (TAN-core-string-functions)

tan:diff($string-a as xs:string?, $string-b as xs:string?, $snap-to-word as xs:boolean, $preprocess-long-strings as xs:boolean) as element()

This function prepares strings for 5-ary tan:diff(), primarily by tending to input 
strings that are large or really large (giant). Large pairs of strings are parsed to find 
common characters that might be used to find pairwise congruence of large segments. Giant 
pairs of strings are passed to tan:giant-diff().

Used by template ŧ class-1-expansion-verbose-pass-1.

Used by function tan:diff-cache(), tan:giant-diff(), tan:diff(), tan:collate(), tan:collate-loop-inner().

Relies upon tan:diff(), tan:ellipses(), tan:giant-diff().

Option 4 (TAN-core-string-functions)

tan:diff($string-a as xs:string?, $string-b as xs:string?, $snap-to-word as xs:boolean, $characters-to-tokenize-on as xs:string*, $loop-counter as xs:integer) as element()

Input: any two strings; boolean indicating whether results should snap to nearest 
word; boolean indicating whether long strings should be pre-processed

Output: an element with <a>, <b>, and <common> children showing where strings a and b 
match and depart

This function was written to assist the validation of <redivision>s by quickly finding 
differences between any two strings. The function has been tested on pairs of strings up to 
combined lengths of 9M characters. At that scale, the only way to efficiently process the 
diffs is by chaining smaller diffs, which are still large, optimally about 350K in length.

Used by template ŧ class-1-expansion-verbose-pass-1.

Used by function tan:diff-cache(), tan:giant-diff(), tan:diff(), tan:collate(), tan:collate-loop-inner().

Relies upon tan:collate-pair-of-sequences(), tan:diff(), tan:diff-outer-loop(), tan:ellipses(), tan:trim-long-text(), ŧ snap-to-word-pass-1.
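
To show the shape of the output described above (a sequence of a, b, and common segments), here is a Python sketch that uses the standard library's `difflib.SequenceMatcher` in place of TAN's own algorithm. It makes no attempt at word snapping or at the chaining strategy used for very large strings; tuples stand in for the <a>, <b>, and <common> elements.

```python
from difflib import SequenceMatcher


def diff(string_a, string_b):
    """Return (tag, text) pairs, where tag is 'a', 'b', or 'common'."""
    result = []
    matcher = SequenceMatcher(None, string_a, string_b, autojunk=False)
    for op, a1, a2, b1, b2 in matcher.get_opcodes():
        if op == "equal":
            result.append(("common", string_a[a1:a2]))
            continue
        # A replace, delete, or insert contributes text unique to a and/or b.
        if a2 > a1:
            result.append(("a", string_a[a1:a2]))
        if b2 > b1:
            result.append(("b", string_b[b1:b2]))
    return result
```

Joining every segment except the `b` segments reconstructs the first string, and joining every segment except the `a` segments reconstructs the second, which is a convenient sanity check for any diff of this shape.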

`tan:ellipses()`

TAN-core-string-functions

Input: any sequence of strings; an integer

Output: the sequence of strings, but with any substring beyond the requested length 
replaced by ellipses

Used by function tan:giant-diff(), tan:diff().

Does not rely upon global variables, keys, functions, or templates.
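
A minimal Python sketch of the truncation described above. Whether the original appends a single ellipsis character, three dots, or something else is not stated here, so the `\u2026` marker is an assumption.

```python
def ellipses(strings, max_length):
    """Cut each string to max_length characters, marking any cut with an ellipsis."""
    return [
        s if len(s) <= max_length else s[:max_length] + "\u2026"
        for s in strings
    ]
```

Strings at or below the requested length pass through unchanged.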

`tan:fill()`

TAN-core-string-functions

tan:fill($string-to-fill as xs:string?, $times-to-repeat as xs:integer) as xs:string?

Input: a string, an integer

Output: a string with the first parameter repeated the number of times specified by 
the integer

This function was written to facilitate indentation

Used by template ŧ indent-items.

Does not rely upon global variables, keys, functions, or templates.
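
The repetition and its use for indentation can be sketched in Python as follows. The `indent` helper and its two-space unit are hypothetical, added only to show the intended use; treating a negative count as zero is likewise an assumption.

```python
def fill(string_to_fill, times_to_repeat):
    """Repeat the string the requested number of times."""
    return (string_to_fill or "") * max(times_to_repeat, 0)


def indent(line, depth, unit="  "):
    """Hypothetical helper: prefix a line with one indentation unit per level."""
    return fill(unit, depth) + line
```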

`tan:nested-phrase-loop()`

TAN-core-string-functions

tan:nested-phrase-loop($elements-to-process as element()*, $current-nesting-data as element()?) as element()*

Input: a series of elements with text content; an element indicating what nesting 
exists so far

Output: each input element with the text value put into <val> and a

Used by function tan:chop-string(), tan:nested-phrase-loop().

Relies upon $nested-phrase-marker-regex, $nested-phrase-markers, tan:nested-phrase-loop().

`tan:normalize-name()`

TAN-core-string-functions

tan:normalize-name($text as xs:string*) as xs:string*

one-parameter, name-normalizing version of tan:normalize-text()

Used by template ŧ first-stamp-shallow-copy, ŧ core-expansion-terse-attributes, ŧ core-expansion-terse, ŧ first-stamp-shallow-skip, ŧ core-expansion-normal.

Used by function tan:vocabulary(), tan:resolve-doc-loop(), tan:attribute-vocabulary().

Relies upon tan:normalize-text().

`tan:normalize-text()`

Option 1 (TAN-core-string-functions)

tan:normalize-text($text as xs:string*) as xs:string*

one-parameter version of full function below

Used by template ŧ check-referred-doc.

Used by function tan:normalize-text(), tan:normalize-name().

Relies upon tan:normalize-text().

Option 2 (TAN-core-string-functions)

tan:normalize-text($text as xs:string*, $treat-as-name-values as xs:boolean) as xs:string*

Input: any sequence of strings; a boolean indicating whether the results should be 
name-normalized

Output: that sequence, with each item's space normalized and any help requests 
removed

In name-normalization, the string is converted to lower-case, and spaces replace 
hyphens, underscores, and illegal characters.

Special end div characters are not removed in this operation, nor is tail-end space 
adjusted according to TAN rules; for that, see tan:normalize-div-text().

Used by template ŧ check-referred-doc.

Used by function tan:normalize-text(), tan:normalize-name().

Relies upon $help-trigger-regex, $regex-characters-not-permitted, $regex-name-space-characters.
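
The normalization steps described above can be approximated in Python as follows. The two patterns are hypothetical stand-ins for $help-trigger-regex and $regex-name-space-characters, whose actual values are not given on this page, and the sketch leaves out the handling of illegal characters.

```python
import re

# Hypothetical stand-ins; the real values are defined elsewhere in TAN.
HELP_TRIGGER = r"\?\?\?"
NAME_SPACE_CHARS = r"[-_]"


def normalize_text(strings, treat_as_name_values=False):
    """Space-normalize each string; optionally apply name normalization."""
    results = []
    for s in strings:
        s = re.sub(HELP_TRIGGER, "", s)  # remove any help request
        if treat_as_name_values:
            # Lower-case, then turn hyphens and underscores into spaces.
            s = re.sub(NAME_SPACE_CHARS, " ", s.lower())
        # Collapse runs of whitespace and trim both ends.
        results.append(" ".join(s.split()))
    return results
```

Under these assumptions, `normalize_text(["Some_Name-here"], True)` returns `["some name here"]`.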

`tan:string-length()`

TAN-core-string-functions

tan:string-length($input as xs:string?) as xs:integer

Input: any string

Output: the number of characters in the string, as defined by TAN (i.e., modifiers 
are counted with the preceding base character)

Used by template ŧ analyze-string-length-pass-1.

Used by function tan:analyze-leaf-div-text-length-loop().

Relies upon tan:chop-string().
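
The difference between this count and a plain code point count can be shown with a short Python sketch. The original relies on tan:chop-string(); this version simply counts the characters that are not combining marks, which gives the same number whenever every mark has a preceding base character.

```python
import unicodedata


def tan_string_length(text):
    """Count base characters; combining marks do not add to the length."""
    return sum(
        1 for ch in text or ""
        if not unicodedata.category(ch).startswith("M")
    )
```

For `"e\u0301a"` (e, combining acute accent, a), Python's `len()` returns 3, whereas `tan_string_length()` returns 2.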

`tan:tokenize-text()`

Option 1 (TAN-core-string-functions)

tan:tokenize-text($text as xs:string*) as element()*

one-parameter version of the function below

Used by template ŧ mark-dependencies-for-validation, ŧ dependency-adjustments-pass-2, ŧ tokenize-div, ŧ mark-dependencies-pass-1, ŧ dependency-expansion-normal, ŧ dependency-expansion-verbose.

Used by function tan:tokenize-text().

Relies upon $token-definition-default, tan:tokenize-text().

Option 2 (TAN-core-string-functions)

tan:tokenize-text($text as xs:string*, $token-definition as element(tan:token-definition)?, $count-toks as xs:boolean?) as element()*

three-parameter version of the function below

Used by function tan:tokenize-text().

Relies upon tan:tokenize-text().

Option 3 (TAN-core-string-functions)

tan:tokenize-text($text as xs:string*, $token-definition as element(tan:token-definition)?, $count-toks as xs:boolean?, $add-attr-q as xs:boolean?, $add-attr-pos as xs:boolean?) as element()*

Input: any number of strings; a <token-definition>; a boolean indicating whether 
tokens should be counted and labeled.

Output: a <result> for each string, tokenized into <tok> and <non-tok> elements. 
If the counting option is turned on, the <result> contains @tok-count and 
@non-tok-count, and each <tok> and <non-tok> has an @n indicating which <tok> group it belongs to.

Used by function tan:tokenize-text().

Relies upon $token-definition-default, tan:tokenize-text(), ŧ add-tok-pos, ŧ first-stamp-shallow-copy.
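
The tokenized output described above can be pictured with a Python sketch that returns a dictionary in place of a <result> element. The `\w+` pattern is a hypothetical stand-in for $token-definition-default. Numbering a non-token with the token that precedes it (and with 0 when it precedes the first token) is one reading of "which <tok> group it belongs to" and should be treated as an assumption.

```python
import re

# Hypothetical token pattern, standing in for $token-definition-default.
TOKEN_PATTERN = r"\w+"


def tokenize_text(text, pattern=TOKEN_PATTERN):
    """Split text into ('tok' | 'non-tok', value, n) triples, with counts."""
    items, pos, n = [], 0, 0
    for match in re.finditer(pattern, text):
        if match.start() > pos:
            # Text between tokens is kept as a non-token.
            items.append(("non-tok", text[pos:match.start()], n))
        n += 1
        items.append(("tok", match.group(), n))
        pos = match.end()
    if pos < len(text):
        items.append(("non-tok", text[pos:], n))
    return {
        "tok-count": n,
        "non-tok-count": sum(1 for kind, _, _ in items if kind == "non-tok"),
        "items": items,
    }
```

Because non-tokens are retained, concatenating the values of all items reproduces the input string.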

`tan:unique-char()`

TAN-core-string-functions

tan:unique-char($context-strings as xs:string*) as xs:string?

Input: any sequence of strings

Output: a single character that is not to be found in those strings

This function, written to support tan:collate-sequences(), provides a unique way 
to join any sequence of strings so that the result can later be tokenized.

No variables, keys, functions, or named templates depend upon this xsl:function.

Does not rely upon global variables, keys, functions, or templates.
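
One way to find such a character is sketched below in Python. Searching the Unicode private use area is an assumption made for the illustration; the range that the XSLT function actually searches is not stated on this page.

```python
def unique_char(context_strings):
    """Return a character that appears in none of the strings, or None."""
    used = set("".join(context_strings))
    # Assumption: search the private use area, U+E000 through U+F8FF.
    for code in range(0xE000, 0xF900):
        candidate = chr(code)
        if candidate not in used:
            return candidate
    return None
```

Because the returned character is absent from every input, the strings can be joined with it and later split back into the original sequence without ambiguity.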

`tan:vertical-stops()`

TAN-core-string-functions

tan:vertical-stops($short-string as xs:string?) as xs:double*

Input: a string

Output: percentages of the string that should be followed in 
tan:diff-outer-loop()

Used by function tan:diff-outer-loop().

Does not rely upon global variables, keys, functions, or templates.
