TAN-core-string global variables, keys, and functions summarized

TAN-core-string-functions

Definition: '\P{M}\p{M}*'

Used by function tan:chop-string().

Does not rely upon global variables, keys, functions, or templates.

TAN-core-string-functions

Definition: concat('[', tan:escape(string-join($nested-phrase-markers/tan:pair/tan:close/text(), '')), ']')

No variables, keys, functions, or named templates depend upon this xsl:variable.

Relies upon $nested-phrase-markers.
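
The definition above wraps the escaped closing markers in square brackets to form a regular-expression character class. A minimal Python sketch of the same idea follows. It assumes that tan:escape() neutralizes regex metacharacters (re.escape plays that role here), and the marker pairs are hypothetical stand-ins, since the actual contents of $nested-phrase-markers are not listed in this summary.

```python
import re

# Hypothetical open/close pairs standing in for $nested-phrase-markers.
NESTED_PHRASE_MARKERS = [("(", ")"), ("[", "]"), ("\u201c", "\u201d")]


def close_marker_class(pairs):
    """Build a regex character class that matches any single closing marker."""
    # Concatenate the closing half of every pair, like string-join(..., '').
    closers = "".join(close for _open, close in pairs)
    # Escape regex metacharacters, then wrap the result in [ ].
    return "[" + re.escape(closers) + "]"


pattern = close_marker_class(NESTED_PHRASE_MARKERS)
closers_found = re.findall(pattern, "a (b [c] d) \u201ce\u201d")
```

Escaping matters because markers such as ] would otherwise end the character class early.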

Option 1 (TAN-core-string-functions)

tan:chop-string($input as xs:string?) as xs:string*

Input: any string 
Output: that string chopped into a sequence of individual characters, following 
TAN rules (modifying characters always join their preceding base character) 

Used by template ŧ core-expansion-terse-attributes-to-elements, ŧ mark-dependencies-pass-2.

Used by function tan:infuse-divs(), tan:int-to-grc(), tan:atomize-string(), tan:string-length(), tan:chop-string().

Relies upon $char-reg-exp.
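
The rule that modifying characters always join their preceding base character can be sketched in Python as below. This is an illustrative analogue, not the XSLT implementation: Python's standard re module does not support \p{M}, so the sketch checks the Unicode general category instead. The name chop_string is a hypothetical counterpart of the 1-parameter tan:chop-string().

```python
import unicodedata


def chop_string(text):
    """Split text into characters, keeping combining marks with their base."""
    chunks = []
    for ch in text or "":
        # General categories Mn, Mc, and Me are the "M" (mark) classes.
        is_mark = unicodedata.category(ch).startswith("M")
        if is_mark and chunks:
            chunks[-1] += ch  # attach the mark to the preceding base character
        else:
            # A base character, or a stray mark with nothing before it
            # (this sketch keeps such a mark as its own item).
            chunks.append(ch)
    return chunks


pieces = chop_string("e\u0301a")  # e + combining acute accent, then a
```

Here pieces holds two items, so the accented letter counts as one character rather than two code points.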

Option 2 (TAN-core-string-functions)

tan:chop-string($input as xs:string?, $chop-after-regex as xs:string) as xs:string*

2-param version of the fuller one below 

Used by template ŧ core-expansion-terse-attributes-to-elements, ŧ mark-dependencies-pass-2.

Used by function tan:infuse-divs(), tan:int-to-grc(), tan:atomize-string(), tan:string-length(), tan:chop-string().

Relies upon tan:chop-string().

Option 3 (TAN-core-string-functions)

tan:chop-string($input as xs:string?, $chop-after-regex as xs:string, $preserve-nested-clauses as xs:boolean) as xs:string*

Input: any string, a regular expression, a boolean 
Output: the input string cut into a sequence of strings using the regular expression 
as the cut marker 
If the last boolean is true, then nested clauses (parentheses, direct quotations, 
etc.) will be preserved. 
This function differs from the 1-parameter version in that it is used to chop the 
string not into individual characters but into words, clauses, sentences, etc. 

Used by template ŧ core-expansion-terse-attributes-to-elements, ŧ mark-dependencies-pass-2.

Used by function tan:infuse-divs(), tan:int-to-grc(), tan:atomize-string(), tan:string-length(), tan:chop-string().

Relies upon tan:nested-phrase-loop().
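
Cutting after a regular-expression match while keeping nested clauses whole can be sketched in Python as follows. This is only an approximation under stated assumptions: the marker pairs are hypothetical, nesting is tracked with a simple depth counter, and markers whose opening and closing forms are identical (such as straight quotation marks) are not handled. It does not reproduce tan:nested-phrase-loop().

```python
import re

# Hypothetical marker pairs; the real list lives in $nested-phrase-markers.
PAIRS = {"(": ")", "[": "]"}
CLOSERS = set(PAIRS.values())


def chop_after(text, chop_after_regex, preserve_nested=False):
    """Cut text after each regex match, optionally skipping cuts inside a clause."""
    if not text:
        return []
    pieces = []
    start = 0    # where the current piece begins
    scanned = 0  # how far nesting depth has been computed
    depth = 0    # number of currently open nested clauses
    for m in re.finditer(chop_after_regex, text):
        if m.end() == m.start():
            continue  # ignore empty matches
        # Bring the nesting depth up to date through the end of this match.
        for ch in text[scanned:m.end()]:
            if ch in PAIRS:
                depth += 1
            elif ch in CLOSERS and depth > 0:
                depth -= 1
        scanned = m.end()
        if preserve_nested and depth > 0:
            continue  # the cut point lies inside a nested clause
        pieces.append(text[start:m.end()])
        start = m.end()
    if start < len(text):
        pieces.append(text[start:])  # trailing remainder with no cut marker
    return pieces


sample = "One (a. b) two. Three."
plain = chop_after(sample, r"\.\s*")
nested = chop_after(sample, r"\.\s*", preserve_nested=True)
```

With preserve_nested set, the full stop inside the parentheses no longer produces a cut, so the parenthetical stays in one piece.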

Option 1 (TAN-core-string-functions)

tan:diff($string-a as xs:string?, $string-b as xs:string?) as element()

2-param version of fuller one below 

Used by template ŧ class-1-expansion-verbose-pass-1.

Used by function tan:diff-cache(), tan:giant-diff(), tan:diff(), tan:collate(), tan:collate-loop-inner().

Relies upon tan:diff().

Option 2 (TAN-core-string-functions)

tan:diff($string-a as xs:string?, $string-b as xs:string?, $snap-to-word as xs:boolean) as element()

3-param version of fuller one below 

Used by template ŧ class-1-expansion-verbose-pass-1.

Used by function tan:diff-cache(), tan:giant-diff(), tan:diff(), tan:collate(), tan:collate-loop-inner().

Relies upon tan:diff().

Option 3 (TAN-core-string-functions)

tan:diff($string-a as xs:string?, $string-b as xs:string?, $snap-to-word as xs:boolean, $preprocess-long-strings as xs:boolean) as element()

This function prepares strings for 5-ary tan:diff(), primarily by tending to input 
strings that are large or really large (giant). Large pairs of strings are parsed to find 
common characters that might be used to find pairwise congruence of large segments. Giant 
pairs of strings are passed to tan:giant-diff(). 

Used by template ŧ class-1-expansion-verbose-pass-1.

Used by function tan:diff-cache(), tan:giant-diff(), tan:diff(), tan:collate(), tan:collate-loop-inner().

Relies upon tan:diff(), tan:ellipses(), tan:giant-diff().

Option 4 (TAN-core-string-functions)

tan:diff($string-a as xs:string?, $string-b as xs:string?, $snap-to-word as xs:boolean, $characters-to-tokenize-on as xs:string*, $loop-counter as xs:integer) as element()

Input: any two strings; boolean indicating whether results should snap to nearest 
word; boolean indicating whether long strings should be pre-processed 
Output: an element with <a>, <b>, and <common> children showing where strings a and b 
match and depart 
This function was written to assist the validation of <redivision>s by quickly finding 
differences between any two strings. The function has been tested on pairs of strings up to 
combined lengths of 9M characters. At that scale, the only way to efficiently process the 
diffs is by chaining smaller diffs, which are still large, optimally about 350K characters in length. 

Used by template ŧ class-1-expansion-verbose-pass-1.

Used by function tan:diff-cache(), tan:giant-diff(), tan:diff(), tan:collate(), tan:collate-loop-inner().

Relies upon tan:collate-pair-of-sequences(), tan:diff(), tan:diff-outer-loop(), tan:ellipses(), tan:trim-long-text(), ŧ snap-to-word-pass-1.
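
The shape of the output, an element whose <a>, <b>, and <common> children show where the two strings match and depart, can be illustrated with a short Python sketch. It swaps in the standard library's difflib.SequenceMatcher for the matching step, which is not the algorithm TAN uses, and it omits word snapping and the handling of long strings. The wrapper element name diff is an assumption.

```python
import difflib
import xml.etree.ElementTree as ET


def diff(string_a, string_b):
    """Return an element whose children mark shared and differing text."""
    a, b = string_a or "", string_b or ""
    root = ET.Element("diff")  # assumed name for the wrapper element
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    for tag, a0, a1, b0, b1 in matcher.get_opcodes():
        if tag == "equal":
            ET.SubElement(root, "common").text = a[a0:a1]
            continue
        if a1 > a0:  # text found only in string a
            ET.SubElement(root, "a").text = a[a0:a1]
        if b1 > b0:  # text found only in string b
            ET.SubElement(root, "b").text = b[b0:b1]
    return root


result = diff("abcd", "abxd")
serialized = ET.tostring(result, encoding="unicode")
```

Reading the <common> and <a> children in order reconstructs string a, and reading <common> and <b> reconstructs string b.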

Option 1 (TAN-core-string-functions)

tan:normalize-text($text as xs:string*) as xs:string*

one-parameter version of full function below 

Used by template ŧ check-referred-doc.

Used by function tan:normalize-text(), tan:normalize-name().

Relies upon tan:normalize-text().

Option 2 (TAN-core-string-functions)

tan:normalize-text($text as xs:string*, $treat-as-name-values as xs:boolean) as xs:string*

Input: any sequence of strings; a boolean indicating whether the results should be 
name-normalized 
Output: that sequence, with each item's space normalized and any help request 
removed 
In name-normalization, the string is converted to lower-case, and spaces replace 
hyphens, underscores, and illegal characters. 
Special end div characters are not removed in this operation, nor is tail-end space 
adjusted according to TAN rules; for that, see tan:normalize-div-text(). 

Used by template ŧ check-referred-doc.

Used by function tan:normalize-text(), tan:normalize-name().

Relies upon $help-trigger-regex, $regex-characters-not-permitted, $regex-name-space-characters.
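
A minimal Python sketch of the two behaviors described above, space normalization and name normalization, is given here. It assumes space normalization works in the manner of XPath normalize-space(). It leaves out the removal of help requests and illegal characters, because the values of $help-trigger-regex and $regex-characters-not-permitted are not given in this summary.

```python
import re


def normalize_text(texts, treat_as_name_values=False):
    """Space-normalize each string; optionally name-normalize it first."""
    results = []
    for text in texts:
        if treat_as_name_values:
            # Name normalization: lower-case, hyphens and underscores to spaces.
            text = re.sub(r"[-_]", " ", text.lower())
        # Collapse runs of space, tab, CR, and LF, then trim both ends.
        text = re.sub(r"[ \t\r\n]+", " ", text).strip(" ")
        results.append(text)
    return results


plain = normalize_text(["  Two   words \n"])
names = normalize_text(["Iliad_Book-One"], treat_as_name_values=True)
```

Replacing hyphens and underscores before collapsing spaces means a name such as "a_-_b" ends up with single spaces only.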

Option 1 (TAN-core-string-functions)

tan:tokenize-text($text as xs:string*) as element()*

one-parameter version of the function below 

Used by template ŧ mark-dependencies-for-validation, ŧ dependency-adjustments-pass-2, ŧ tokenize-div, ŧ mark-dependencies-pass-1, ŧ dependency-expansion-normal, ŧ dependency-expansion-verbose.

Used by function tan:tokenize-text().

Relies upon $token-definition-default, tan:tokenize-text().

Option 2 (TAN-core-string-functions)

tan:tokenize-text($text as xs:string*, $token-definition as element(tan:token-definition)?, $count-toks as xs:boolean?) as element()*

three-parameter version of the function below 

Used by template ŧ mark-dependencies-for-validation, ŧ dependency-adjustments-pass-2, ŧ tokenize-div, ŧ mark-dependencies-pass-1, ŧ dependency-expansion-normal, ŧ dependency-expansion-verbose.

Used by function tan:tokenize-text().

Relies upon tan:tokenize-text().

Option 3 (TAN-core-string-functions)

tan:tokenize-text($text as xs:string*, $token-definition as element(tan:token-definition)?, $count-toks as xs:boolean?, $add-attr-q as xs:boolean?, $add-attr-pos as xs:boolean?) as element()*

Input: any number of strings; a <token-definition>; a boolean indicating whether 
tokens should be counted and labeled. 
Output: a <result> for each string, tokenized into <tok> and <non-tok> children. 
If the counting option is turned on, the <result> contains @tok-count and 
@non-tok-count, and each <tok> and <non-tok> has an @n indicating which <tok> group it belongs to. 

Used by template ŧ mark-dependencies-for-validation, ŧ dependency-adjustments-pass-2, ŧ tokenize-div, ŧ mark-dependencies-pass-1, ŧ dependency-expansion-normal, ŧ dependency-expansion-verbose.

Used by function tan:tokenize-text().

Relies upon $token-definition-default, tan:tokenize-text(), ŧ add-tok-pos, ŧ first-stamp-shallow-copy.
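
The output structure can be sketched in Python as follows. The token pattern is a hypothetical stand-in for $token-definition-default, whose actual value is not given here. The sketch numbers only the <tok> elements; it does not set @n on <non-tok>, because the grouping rule for those is not spelled out in this summary.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical default: a token is a run of word characters.
DEFAULT_TOKEN_PATTERN = r"\w+"


def tokenize_text(text, token_pattern=DEFAULT_TOKEN_PATTERN, count_toks=False):
    """Wrap tokens in <tok> and the text between them in <non-tok>."""
    result = ET.Element("result")
    pos = tok_count = non_tok_count = 0
    for m in re.finditer(token_pattern, text):
        if m.start() > pos:  # text between the previous token and this one
            ET.SubElement(result, "non-tok").text = text[pos:m.start()]
            non_tok_count += 1
        tok = ET.SubElement(result, "tok")
        tok.text = m.group()
        tok_count += 1
        if count_toks:
            tok.set("n", str(tok_count))
        pos = m.end()
    if pos < len(text):  # trailing non-token text
        ET.SubElement(result, "non-tok").text = text[pos:]
        non_tok_count += 1
    if count_toks:
        result.set("tok-count", str(tok_count))
        result.set("non-tok-count", str(non_tok_count))
    return result


res = tokenize_text("Hello, world!", count_toks=True)
```

Concatenating the text of all children in order gives back the input string, so nothing is lost in tokenization.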

TAN-core-string-functions

tan:unique-char($context-strings as xs:string*) as xs:string?

Input: any sequence of strings 
Output: a single character that is not to be found in those strings 
This function, written to support tan:collate-sequences(), provides a way 
to join any sequence of strings such that the joined string can later be tokenized. 

No variables, keys, functions, or named templates depend upon this xsl:function.

Does not rely upon global variables, keys, functions, or templates.
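
The idea of picking a character that appears in none of the strings, so that it can serve as a safe separator, can be sketched in Python as below. The starting code point is an arbitrary assumption; this summary does not say where tan:unique-char() begins its search.

```python
def unique_char(context_strings, start=0x0100):
    """Return one character found in none of the given strings, or None."""
    used = set("".join(context_strings))
    for code_point in range(start, 0x110000):
        if 0xD800 <= code_point <= 0xDFFF:
            continue  # surrogate code points are not usable characters
        candidate = chr(code_point)
        if candidate not in used:
            return candidate
    return None  # every candidate from start onward is already in use


parts = ["alpha", "beta gamma", "\u0100"]
separator = unique_char(parts)
joined = separator.join(parts)
restored = joined.split(separator)
```

Because the separator occurs in none of the parts, splitting the joined string on it returns the original sequence unchanged.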