TAN-core-string-functions
$char-reg-exp
Definition: '\P{M}\p{M}*'
Used by function tan:chop-string().
Does not rely upon global variables, keys, functions, or templates.
TAN-core-string-functions
$nested-phrase-close-marker-regex
Definition: concat('[', tan:escape(string-join($nested-phrase-markers/tan:pair/tan:close/text(), '')), ']')
No variables, keys, functions, or named templates depend upon this xsl:variable.
Relies upon $nested-phrase-markers.
TAN-core-string-functions
$nested-phrase-marker-regex
Definition: concat('[', tan:escape(string-join($nested-phrase-markers/tan:pair/*/text(), '')), ']')
Used by function tan:nested-phrase-loop().
Relies upon $nested-phrase-markers.
TAN-core-string-functions
$nested-phrase-markers
This variable has a complex definition. See stylesheet for definition.
Used by variable $nested-phrase-marker-regex, $nested-phrase-close-marker-regex.
Used by function tan:nested-phrase-loop().
Does not rely upon global variables, keys, functions, or templates.
TAN-core-string-functions
tan:atomize-string($input as xs:string?) as xs:string*
alias for tan:chop-string()
No variables, keys, functions, or named templates depend upon this xsl:function.
Relies upon tan:chop-string().
TAN-core-string-functions
tan:batch-replace($string-to-replace as xs:string?, $replace-elements as element()*) as xs:string?
Input: a string, a sequence of <[ANY NAME] pattern="" replacement="" [flags=""]>
Output: the string, after those replacements are applied in order
Used by function tan:batch-replace().
Relies upon tan:batch-replace().
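The ordered replacement described for tan:batch-replace() can be illustrated with a small Python sketch. This is not the XSLT implementation: the dictionaries stand in for the pattern/replacement/flags attributes, the flag mapping is an approximation, and Python's regex and replacement syntax differs from XPath's (for example, XPath writes group references as $1, Python as \1).

```python
import re

# Approximate mapping of XPath-style flag letters to Python's re flags (an assumption).
_FLAG_MAP = {"i": re.IGNORECASE, "m": re.MULTILINE, "s": re.DOTALL, "x": re.VERBOSE}


def batch_replace(text, replace_items):
    """Apply each pattern/replacement pair to text, in the order given."""
    for item in replace_items:
        flags = 0
        for letter in item.get("flags", ""):
            flags |= _FLAG_MAP.get(letter, 0)
        text = re.sub(item["pattern"], item["replacement"], text, flags=flags)
    return text


# Hypothetical stand-ins for <x pattern="" replacement="" flags=""/> elements.
steps = [
    {"pattern": "o", "replacement": "0"},
    {"pattern": "w0rld", "replacement": "there", "flags": "i"},
]
print(batch_replace("Hello World", steps))  # Hell0 there
```

The second step only matches because the first has already run, which is why the order of the replace elements matters.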
Option 1 (TAN-core-string-functions)
tan:chop-string($input as xs:string?) as xs:string*
Input: any string
Output: that string chopped into a sequence of individual characters, following TAN rules (modifying characters always join their preceding base character)
Used by template ŧ core-expansion-terse-attributes-to-elements, ŧ mark-dependencies-pass-2.
Used by function tan:infuse-divs(), tan:int-to-grc(), tan:atomize-string(), tan:string-length(), tan:chop-string().
Relies upon $char-reg-exp.
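As a rough illustration of the one-parameter behavior (each modifying character stays with its preceding base character), here is a Python sketch. It is not the XSLT code: Python's standard re module has no \p{M}, so the sketch tests the Unicode general category instead, and its handling of a combining mark with no preceding base character is an assumption.

```python
import unicodedata


def chop_string(text):
    """Split text into base characters, each carrying its trailing combining marks."""
    pieces = []
    for ch in text:
        # General categories Mn, Mc and Me are what \p{M} matches in the regex.
        is_mark = unicodedata.category(ch).startswith("M")
        if is_mark and pieces:
            pieces[-1] += ch  # attach the modifier to the preceding base character
        else:
            pieces.append(ch)  # a base character (or a leading mark) starts a new piece
    return pieces


# "e" + COMBINING ACUTE ACCENT is kept together as one item.
print(chop_string("e\u0301a"))
```

Counting the items of the result gives a length in which modifiers do not count separately, which is the measure tan:string-length() is described as returning.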
Option 2 (TAN-core-string-functions)
tan:chop-string($input as xs:string?, $chop-after-regex as xs:string) as xs:string*
2-param version of the fuller one below
Used by template ŧ core-expansion-terse-attributes-to-elements, ŧ mark-dependencies-pass-2.
Used by function tan:infuse-divs(), tan:int-to-grc(), tan:atomize-string(), tan:string-length(), tan:chop-string().
Relies upon tan:chop-string().
Option 3 (TAN-core-string-functions)
tan:chop-string($input as xs:string?, $chop-after-regex as xs:string, $preserve-nested-clauses as xs:boolean) as xs:string*
Input: any string, a regular expression, a boolean
Output: the input string cut into a sequence of strings using the regular expression as the cut marker
If the last boolean is true, then nested clauses (parentheses, direct quotations, etc.) will be preserved.
This function differs from the 1-parameter version in that it is used to chop the string not into individual characters but into words, clauses, sentences, etc.
Used by template ŧ core-expansion-terse-attributes-to-elements, ŧ mark-dependencies-pass-2.
Used by function tan:infuse-divs(), tan:int-to-grc(), tan:atomize-string(), tan:string-length(), tan:chop-string().
Relies upon tan:nested-phrase-loop().
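The three-parameter behavior can be sketched in Python as follows. This is an illustration under assumptions, not the stylesheet's logic: the marker pairs below are hypothetical (the real ones come from $nested-phrase-markers), quotation marks are left out, and the sketch assumes that each cut falls immediately after a regex match so the matched text stays with the preceding piece.

```python
import re

# Hypothetical open/close pairs; the stylesheet's $nested-phrase-markers may differ.
NESTED_PAIRS = {"(": ")", "[": "]"}


def nesting_depth(prefix):
    """Return how many nested phrases are still open at the end of prefix."""
    stack = []
    for ch in prefix:
        if ch in NESTED_PAIRS:
            stack.append(NESTED_PAIRS[ch])
        elif stack and ch == stack[-1]:
            stack.pop()
    return len(stack)


def chop_after(text, chop_after_regex, preserve_nested=False):
    """Cut text after each regex match, optionally never cutting inside a nested phrase."""
    pieces = []
    start = 0
    for match in re.finditer(chop_after_regex, text):
        end = match.end()
        if end <= start:
            continue  # ignore zero-length matches that would yield empty pieces
        if preserve_nested and nesting_depth(text[:end]) > 0:
            continue  # still inside an open phrase, so do not cut here
        pieces.append(text[start:end])
        start = end
    if start < len(text):
        pieces.append(text[start:])
    return pieces


sample = "One, two (three, four), five."
print(chop_after(sample, r",\s*"))
print(chop_after(sample, r",\s*", preserve_nested=True))
```

With preserve_nested set, the comma inside the parentheses no longer triggers a cut, so the parenthetical clause stays in one piece.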
TAN-core-string-functions
tan:collate-loop-inner($collation-so-far as element(), $string-to-process as xs:string?, $string-label as xs:string?) as element()*
Input: a collation element (see template mode diff-to-collation), one string to process, and the corresponding string label
Output: a series of collation elements, marking where there is commonality and differences
This inner loop returns only the children of the collation element; the outer loop handles the parent element
This function supports the XSLT 2.0 version of tan:collate()
Used by function tan:collate-loop-outer(), tan:collate-loop-inner().
Relies upon tan:collate-loop-inner(), tan:diff().
TAN-core-string-functions
tan:collate-loop-outer($collation-so-far as element(), $strings-to-process as xs:string*, $string-labels as xs:string*) as element()
Input: a collation element (see template mode diff-to-collation), some strings to process, and corresponding string labels
Output: a series of collation elements, marking where there is commonality and differences
This function supports the XSLT 2.0 version of tan:collate()
Used by function tan:collate(), tan:collate-loop-outer().
Relies upon tan:collate-loop-inner(), tan:collate-loop-outer(), tan:trim-long-text().
Option 1 (TAN-core-string-functions)
tan:diff($string-a as xs:string?, $string-b as xs:string?) as element()
2-param version of fuller one below
Used by template ŧ class-1-expansion-verbose-pass-1.
Used by function tan:diff-cache(), tan:giant-diff(), tan:diff(), tan:collate(), tan:collate-loop-inner().
Relies upon tan:diff().
Option 2 (TAN-core-string-functions)
tan:diff($string-a as xs:string?, $string-b as xs:string?, $snap-to-word as xs:boolean) as element()
3-param version of fuller one below
Used by template ŧ class-1-expansion-verbose-pass-1.
Used by function tan:diff-cache(), tan:giant-diff(), tan:diff(), tan:collate(), tan:collate-loop-inner().
Relies upon tan:diff().
Option 3 (TAN-core-string-functions)
tan:diff($string-a as xs:string?, $string-b as xs:string?, $snap-to-word as xs:boolean, $preprocess-long-strings as xs:boolean) as element()
This function prepares strings for 5-ary tan:diff(), primarily by tending to input strings that are large or really large (giant). Large pairs of strings are parsed to find common characters that might be used to find pairwise congruence of large segments. Giant pairs of strings are passed to tan:giant-diff().
Used by template ŧ class-1-expansion-verbose-pass-1.
Used by function tan:diff-cache(), tan:giant-diff(), tan:diff(), tan:collate(), tan:collate-loop-inner().
Relies upon tan:diff(), tan:ellipses(), tan:giant-diff().
Option 4 (TAN-core-string-functions)
tan:diff($string-a as xs:string?, $string-b as xs:string?, $snap-to-word as xs:boolean, $characters-to-tokenize-on as xs:string*, $loop-counter as xs:integer) as element()
Input: any two strings; boolean indicating whether results should snap to nearest word; boolean indicating whether long strings should be pre-processed
Output: an element with <a>, <b>, and <common> children showing where strings a and b match and depart
This function was written to assist the validation of <redivision>s by quickly finding differences between any two strings. The function has been tested on pairs of strings up to combined lengths of 9M characters. At that scale, the only way to efficiently process the diffs is by chaining smaller diffs, which are still large, optimally about 350K in length.
Used by template ŧ class-1-expansion-verbose-pass-1.
Used by function tan:diff-cache(), tan:giant-diff(), tan:diff(), tan:collate(), tan:collate-loop-inner().
Relies upon tan:collate-pair-of-sequences(), tan:diff(), tan:diff-outer-loop(), tan:ellipses(), tan:trim-long-text(), ŧ snap-to-word-pass-1.
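To show the shape of the result (runs labeled a, b, and common), here is a Python sketch. It deliberately swaps in the standard library's difflib.SequenceMatcher rather than the algorithm used by tan:diff(), so the exact segmentation may differ from TAN's; the tuple representation is also an assumption standing in for the <a>, <b>, and <common> children.

```python
from difflib import SequenceMatcher


def diff(string_a, string_b):
    """Return (label, text) pairs, with label one of 'a', 'b', 'common'."""
    result = []
    matcher = SequenceMatcher(None, string_a, string_b, autojunk=False)
    for tag, a1, a2, b1, b2 in matcher.get_opcodes():
        if tag == "equal":
            result.append(("common", string_a[a1:a2]))
        else:
            # 'replace', 'delete' and 'insert' contribute text unique to a and/or b.
            if a2 > a1:
                result.append(("a", string_a[a1:a2]))
            if b2 > b1:
                result.append(("b", string_b[b1:b2]))
    return result


def rebuild(parts, side):
    """Reassemble one input string from the labeled parts."""
    return "".join(text for label, text in parts if label in (side, "common"))


parts = diff("the quick brown fox", "the quack brown cat")
for label, text in parts:
    print(label, repr(text))
```

A useful property of this representation is that either input can be recovered: joining the a and common parts gives string a, and joining the b and common parts gives string b.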
TAN-core-string-functions
tan:diff-inner-loop($short-string as element()?, $long-string as element()?, $starting-locs-to-check as xs:integer*, $length-of-short-substring as xs:integer, $search-prefix as xs:string?, $search-suffix as xs:string?, $loop-counter as xs:integer)
Used by function tan:diff-outer-loop(), tan:diff-inner-loop().
Relies upon tan:diff-inner-loop(), tan:trim-long-text().
TAN-core-string-functions
tan:diff-outer-loop($short-string as element()?, $long-string as element()?, $start-at-beginning as xs:boolean, $check-vertically-before-horizontally as xs:boolean, $vertical-stops-to-process as xs:double*, $loop-counter as xs:integer)
Used by function tan:diff(), tan:diff-outer-loop().
Relies upon tan:common-end-string(), tan:common-start-string(), tan:diff-inner-loop(), tan:diff-outer-loop(), tan:trim-long-text(), tan:vertical-stops().
TAN-core-string-functions
tan:ellipses($strings-to-truncate as xs:string*, $string-length-to-retain as xs:integer) as xs:string*
Input: any sequence of strings; an integer
Output: the sequence of strings, but with any substring beyond the requested length replaced by ellipses
Used by function tan:giant-diff(), tan:diff().
Does not rely upon global variables, keys, functions, or templates.
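A minimal Python sketch of the truncation described for tan:ellipses(). The details are assumptions: the source does not say which ellipsis characters are used or whether only the start of the string is retained, so this version keeps the first characters and appends three periods.

```python
def ellipses(strings, length_to_retain):
    """Truncate each string longer than length_to_retain and mark the cut with '...'."""
    return [
        s if len(s) <= length_to_retain else s[:length_to_retain] + "..."
        for s in strings
    ]


print(ellipses(["short", "a much longer string"], 6))  # ['short', 'a much...']
```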
TAN-core-string-functions
tan:fill($string-to-fill as xs:string?, $times-to-repeat as xs:integer) as xs:string?
Input: a string, an integer
Output: a string with the first parameter repeated the number of times specified by the integer
This function was written to facilitate indentation
Used by template ŧ indent-items.
Does not rely upon global variables, keys, functions, or templates.
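The repetition behind tan:fill() is simple enough to show in a few lines of Python. The indent() helper is hypothetical and only illustrates the indentation use case mentioned above; treating a negative count as zero is an assumption.

```python
def fill(string_to_fill, times_to_repeat):
    """Repeat string_to_fill the requested number of times (negative counts give '')."""
    return string_to_fill * max(times_to_repeat, 0)


def indent(line, level, unit="  "):
    """Hypothetical helper: prefix a line with one indentation unit per nesting level."""
    return fill(unit, level) + line


print(indent("<item/>", 2))  # four spaces, then <item/>
```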
TAN-core-string-functions
tan:nested-phrase-loop($elements-to-process as element()*, $current-nesting-data as element()?) as element()*
Input: a series of elements with text content; an element indicating what nesting exists so far
Output: each input element with the text value put into <val> and a
Used by function tan:chop-string(), tan:nested-phrase-loop().
Relies upon $nested-phrase-marker-regex, $nested-phrase-markers, tan:nested-phrase-loop().
TAN-core-string-functions
tan:normalize-name($text as xs:string*) as xs:string*
one-parameter, name-normalizing version of tan:normalize-text()
Used by template ŧ first-stamp-shallow-copy, ŧ core-expansion-terse-attributes, ŧ core-expansion-terse, ŧ first-stamp-shallow-skip, ŧ core-expansion-normal.
Used by function tan:vocabulary(), tan:resolve-doc-loop(), tan:attribute-vocabulary().
Relies upon tan:normalize-text().
Option 1 (TAN-core-string-functions)
tan:normalize-text($text as xs:string*) as xs:string*
one-parameter version of full function below
Used by template ŧ check-referred-doc.
Used by function tan:normalize-text(), tan:normalize-name().
Relies upon tan:normalize-text().
Option 2 (TAN-core-string-functions)
tan:normalize-text($text as xs:string*, $treat-as-name-values as xs:boolean) as xs:string*
Input: any sequence of strings; a boolean indicating whether the results should be name-normalized
Output: that sequence, with each item's space normalized and any help request removed
In name-normalization, the string is converted to lower-case, and spaces replace hyphens, underscores, and illegal characters.
Special end div characters are not removed in this operation, nor is tail-end space adjusted according to TAN rules; for that, see tan:normalize-div-text().
Used by template ŧ check-referred-doc.
Used by function tan:normalize-text(), tan:normalize-name().
Relies upon $help-trigger-regex, $regex-characters-not-permitted, $regex-name-space-characters.
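The two normalization modes can be approximated in Python as below. This is a partial sketch: removal of help requests and replacement of illegal characters depend on $help-trigger-regex and $regex-characters-not-permitted, whose values are not given here, so both steps are omitted. Only hyphens and underscores are converted to spaces in name mode.

```python
import re

# The whitespace characters that XPath's normalize-space() collapses.
_XML_SPACE = " \t\r\n"


def normalize_space(value):
    """Trim the ends and collapse internal runs of whitespace to a single space."""
    return re.sub(r"[ \t\r\n]+", " ", value.strip(_XML_SPACE))


def normalize_text(strings, treat_as_name_values=False):
    """Space-normalize each string; in name mode also lower-case and map - and _ to spaces."""
    results = []
    for value in strings:
        if treat_as_name_values:
            value = re.sub(r"[-_]", " ", value.lower())
        results.append(normalize_space(value))
    return results


print(normalize_text(["  Some \n  text "]))                 # ['Some text']
print(normalize_text(["Greek_New-Testament "], True))       # ['greek new testament']
```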
TAN-core-string-functions
tan:string-length($input as xs:string?) as xs:integer
Input: any string
Output: the number of characters in the string, as defined by TAN (i.e., modifiers are counted with the preceding base character)
Used by template ŧ analyze-string-length-pass-1.
Used by function tan:analyze-leaf-div-text-length-loop().
Relies upon tan:chop-string().
Option 1 (TAN-core-string-functions)
tan:tokenize-text($text as xs:string*) as element()*
one-parameter version of the function below
Used by template ŧ mark-dependencies-for-validation, ŧ dependency-adjustments-pass-2, ŧ tokenize-div, ŧ mark-dependencies-pass-1, ŧ dependency-expansion-normal, ŧ dependency-expansion-verbose.
Used by function tan:tokenize-text().
Relies upon $token-definition-default, tan:tokenize-text().
Option 2 (TAN-core-string-functions)
tan:tokenize-text($text as xs:string*, $token-definition as element(tan:token-definition)?, $count-toks as xs:boolean?) as element()*
three-parameter version of the function below
Used by template ŧ mark-dependencies-for-validation, ŧ dependency-adjustments-pass-2, ŧ tokenize-div, ŧ mark-dependencies-pass-1, ŧ dependency-expansion-normal, ŧ dependency-expansion-verbose.
Used by function tan:tokenize-text().
Relies upon tan:tokenize-text().
Option 3 (TAN-core-string-functions)
tan:tokenize-text($text as xs:string*, $token-definition as element(tan:token-definition)?, $count-toks as xs:boolean?, $add-attr-q as xs:boolean?, $add-attr-pos as xs:boolean?) as element()*
Input: any number of strings; a <token-definition>; a boolean indicating whether tokens should be counted and labeled.
Output: a <result> for each string, tokenized into <tok> and <non-tok>, respectively. If the counting option is turned on, the <result> contains @tok-count and @non-tok-count, and each <tok> and <non-tok> has an @n indicating which <tok> group it belongs to.
Used by template ŧ mark-dependencies-for-validation, ŧ dependency-adjustments-pass-2, ŧ tokenize-div, ŧ mark-dependencies-pass-1, ŧ dependency-expansion-normal, ŧ dependency-expansion-verbose.
Used by function tan:tokenize-text().
Relies upon $token-definition-default, tan:tokenize-text(), ŧ add-tok-pos, ŧ first-stamp-shallow-copy.
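The tok/non-tok split can be sketched in Python as follows. Several things here are assumptions: the token pattern \w+ is a placeholder for $token-definition-default (whose value is not shown), the dictionary stands in for the <result> element, and the @n numbering is left out because the source does not spell out how it is assigned.

```python
import re


def tokenize_text(text, token_pattern=r"\w+"):
    """Split text into alternating tok / non-tok items and count each kind."""
    items = []
    pos = 0
    for match in re.finditer(token_pattern, text):
        if match.start() > pos:
            # Text between two tokens (punctuation, spaces) becomes a non-tok.
            items.append(("non-tok", text[pos:match.start()]))
        items.append(("tok", match.group()))
        pos = match.end()
    if pos < len(text):
        items.append(("non-tok", text[pos:]))
    return {
        "tok-count": sum(1 for kind, _ in items if kind == "tok"),
        "non-tok-count": sum(1 for kind, _ in items if kind == "non-tok"),
        "items": items,
    }


result = tokenize_text("Hello, world!")
print(result["tok-count"], result["non-tok-count"])  # 2 2
print(result["items"])
```

Because every character lands in exactly one item, joining the item texts in order reproduces the input string.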
TAN-core-string-functions
tan:unique-char($context-strings as xs:string*) as xs:string?
Input: any sequence of strings
Output: a single character that is not to be found in those strings
This function, written to support tan:collate-sequences(), provides a unique way to join any sequence of strings in such a way that it can later be tokenized.
No variables, keys, functions, or named templates depend upon this xsl:function.
Does not rely upon global variables, keys, functions, or templates.
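The idea of finding a separator that cannot collide with the data can be sketched in Python. The search strategy is an assumption (the source does not say how tan:unique-char() picks its character): this version scans code points upward from U+0021 and returns the first one absent from all of the strings.

```python
def unique_char(context_strings):
    """Return a single character found in none of the strings, or None if none is free."""
    used = set("".join(context_strings))
    # Scan from '!' up to just below the surrogate range (an arbitrary choice).
    for code_point in range(0x21, 0xD800):
        candidate = chr(code_point)
        if candidate not in used:
            return candidate
    return None


strings = ["!\"#", "abc"]
separator = unique_char(strings)
joined = separator.join(strings)
print(repr(separator), joined.split(separator) == strings)
```

Joining on the returned character and later splitting on it recovers the original sequence, which is the round trip the description above has in mind.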
TAN-core-string-functions
tan:vertical-stops($short-string as xs:string?) as xs:double*
Input: a string
Output: percentages of the string that should be followed in tan:diff-outer-loop()
Used by function tan:diff-outer-loop().
Does not rely upon global variables, keys, functions, or templates.