Definition: '\P{M}\p{M}*'
Used by function tan:chop-string()
Does not rely upon global variables, keys, functions, or templates.
tan:atomize-string($input as xs:string?) as xs:string*
Alias for tan:chop-string()
Used by template ŧ class-2-expansion-terse
Relies upon tan:chop-string.
tan:batch-replace($string as xs:string?, $replace-elements as element()*) as xs:string?
Input: a string, a sequence of <[ANY NAME] pattern="" replacement="" [flags=""]>
Output: the string, after those replacements are applied in order
Used by function tan:batch-replace()
Relies upon tan:replace, tan:batch-replace.
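The ordered replacement described for tan:batch-replace can be sketched in Python as follows. This is only an illustration, not the TAN XSLT implementation. The replace elements are modeled as plain dictionaries, the patterns use Python regular-expression syntax rather than XPath syntax, and only the flags i, s, and m are mapped; all of these are assumptions made for the example.

```python
import re

# Assumed mapping from XPath-style flag letters to Python flags (i, s, m only).
_FLAGS = {"i": re.IGNORECASE, "s": re.DOTALL, "m": re.MULTILINE}

def batch_replace(string, replace_elements):
    """Apply each pattern/replacement pair to the string, in the order given."""
    if string is None:
        return None
    for rule in replace_elements:
        flags = 0
        for letter in rule.get("flags", ""):
            flags |= _FLAGS[letter]
        # Each replacement sees the output of the previous one.
        string = re.sub(rule["pattern"], rule["replacement"], string, flags=flags)
    return string

rules = [
    {"pattern": "a", "replacement": "b"},
    {"pattern": "b", "replacement": "c"},
]
print(batch_replace("a b", rules))
```

Because the rules run in sequence, the output of the first rule is visible to the second, so the order of the elements matters.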
tan:chop-string($input as xs:string?) as xs:string*
Input: any string
Output: that string chopped into a sequence of individual characters, following TAN rules (modifying characters always join their preceding base character)
Used by template ŧ class-1-expansion-verbose
Used by function tan:string-length(), tan:atomize-string()
Relies upon $char-reg-exp.
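The definition '\P{M}\p{M}*' listed above appears to be the pattern that drives this chopping: one non-mark character followed by any number of combining marks. A rough Python analogue, not the TAN code itself, is sketched below. Python's re module has no \p{M}, so the sketch tests Unicode general categories instead. It also keeps a leading combining mark as its own chunk, which is a choice made for the example and may differ from what the regular expression does.

```python
import unicodedata

def chop_string(text):
    """Split text into base characters, each joined with its trailing combining marks."""
    chunks = []
    for ch in text or "":
        # General categories Mn, Mc, and Me make up \p{M}.
        is_mark = unicodedata.category(ch).startswith("M")
        if is_mark and chunks:
            chunks[-1] += ch  # attach the modifier to the preceding base character
        else:
            chunks.append(ch)
    return chunks

def string_length(text):
    """Count characters the way the chopping rule defines them."""
    return len(chop_string(text))

# "e" followed by U+0301 COMBINING ACUTE ACCENT counts as one character.
print(string_length("e\u0301a"))
```

The same helper gives a character count in the sense described for tan:string-length, where modifiers are counted with their base character.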
Option 1 (TAN-core-string-functions)
tan:collate($strings as xs:string*) as element()
One-parameter version of the full function below
Used by function tan:collate()
Relies upon tan:collate.
Option 2 (TAN-core-string-functions)
tan:collate($strings as xs:string*, $labels as xs:string*) as element()
Input: any number of strings; a sequence of labels for those strings
Output: an element with <c> and <u w="[WITNESS NUMBERS]"> children, showing where the strings agree and where they depart. At the beginning are <witness> elements identifying the numbers and providing basic statistics about how much each pair of witnesses agrees.
This function was written to deal with multiple OCR results of the same page of text, to find agreement wherever possible.
Used by function tan:collate()
Relies upon tan:diff, tan:collate-loop-outer, ŧ diff-to-collation.
tan:collate-loop-inner($collation-so-far as element(), $string-to-process as xs:string?, $string-label as xs:string?) as element()*
Input: a collation element (see template mode diff-to-collation), one string to process, and the corresponding string label
Output: a series of collation elements, marking where there is commonality and differences
This inner loop returns only the children of the collation element; the outer loop handles the parent element
Used by function tan:collate-loop-outer(), tan:collate-loop-inner()
Relies upon tan:diff, tan:collate-loop-inner.
tan:collate-loop-outer($collation-so-far as element(), $strings-to-process as xs:string*, $string-labels as xs:string*) as element()
Input: a collation element (see template mode diff-to-collation), some strings to process, and corresponding string labels
Output: the collation element, updated to mark where there is commonality and where there are differences
Used by function tan:collate(), tan:collate-loop-outer()
Relies upon tan:collate-loop-inner, tan:collate-loop-outer.
Option 1 (TAN-core-string-functions)
tan:diff($string-a as xs:string?, $string-b as xs:string?) as element()
Two-parameter version of the fuller function below
Used by template ŧ class-1-expansion-verbose
Used by function tan:diff(), tan:collate(), tan:collate-loop-inner()
Relies upon tan:diff.
Option 2 (TAN-core-string-functions)
tan:diff($string-a as xs:string?, $string-b as xs:string?, $snap-to-word as xs:boolean) as element()
Input: any two strings; boolean indicating whether results should snap to nearest word
Output: an element with <a>, <b>, and <common> children showing where strings a and b match and depart
This function was written after tan:diff as a cruder, faster way to check two strings against each other. It is suitable for validation, where it avoids hanging on deeply nested recursion.
Used by template ŧ class-1-expansion-verbose
Used by function tan:diff(), tan:collate(), tan:collate-loop-inner()
Relies upon tan:diff-loop, tan:group-adjacent-elements, ŧ snap-to-word-pass-1.
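To illustrate the shape of the output described for tan:diff, the sketch below produces a flat list of a, b, and common segments in Python. It swaps in the standard library's difflib.SequenceMatcher, which is not the algorithm tan:diff-loop uses, and it does not implement the snap-to-word option. The tuple representation of the segments is an assumption made for the example.

```python
from difflib import SequenceMatcher

def diff(string_a, string_b):
    """Return (name, text) segments, where name is 'a', 'b', or 'common'."""
    a, b = string_a or "", string_b or ""
    segments = []
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    for op, a1, a2, b1, b2 in matcher.get_opcodes():
        if op == "equal":
            segments.append(("common", a[a1:a2]))
        else:
            # A replace contributes to both sides; insert and delete to one side only.
            if a2 > a1:
                segments.append(("a", a[a1:a2]))
            if b2 > b1:
                segments.append(("b", b[b1:b2]))
    return segments

def rebuild(segments, side):
    """Reassemble one input string from the common segments plus its own segments."""
    return "".join(text for name, text in segments if name in ("common", side))

result = diff("cat", "cart")
print(result)
```

A useful sanity check on any such result is that the common segments plus the a segments rebuild string a, and the common segments plus the b segments rebuild string b.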
tan:diff-loop($short-string as element()?, $long-string as element()?, $start-at-beginning as xs:boolean, $check-vertically-before-horizontally as xs:boolean, $loop-counter as xs:integer)
Used by function tan:diff(), tan:diff-loop()
Relies upon tan:escape, tan:diff-loop.
tan:escape($strings as xs:string*) as xs:string*
Input: any sequence of strings
Output: each string prepared for regular expression searches, i.e., with reserved characters escaped out.
Used by variable $help-trigger-regex
Used by template ŧ evaluate-conditions, ŧ class-2-expansion-terse-pass-2
Used by function tan:diff-loop()
Relies upon $regex-escaping-characters.
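The idea behind tan:escape, making a literal string safe to use inside a regular expression, has a direct Python counterpart in re.escape. The sketch below is an analogue only: the contents of $regex-escaping-characters are not shown here, so the exact set of characters TAN escapes may differ from Python's.

```python
import re

def escape(strings):
    """Escape each string so it matches only itself when used as a pattern."""
    return [re.escape(s) for s in strings]

pattern = escape(["a.b*c"])[0]
# The escaped pattern matches the literal text, not "any character" or "repeat".
print(re.fullmatch(pattern, "a.b*c") is not None)
print(re.fullmatch(pattern, "axbbc") is not None)
```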
tan:group-adjacent-elements($elements as element()*) as element()*
Input: any sequence of elements
Output: the same elements, but adjacent elements of the same name grouped together
Used by function tan:diff()
Does not rely upon global variables, keys, functions, or templates.
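Grouping adjacent items of the same name can be sketched in Python with itertools.groupby. Elements are modeled here as (name, text) tuples, and each group is returned as a name with a list of texts. Both representations are assumptions for the example, since the shape of the grouped output is not described above.

```python
from itertools import groupby

def group_adjacent_elements(elements):
    """Group consecutive (name, text) pairs that share a name, preserving order."""
    return [
        (name, [text for _, text in run])
        for name, run in groupby(elements, key=lambda element: element[0])
    ]

items = [("a", "x"), ("a", "y"), ("b", "z"), ("a", "w")]
print(group_adjacent_elements(items))
```

Only neighbors are merged: the final "a" item stays in its own group because a "b" item separates it from the earlier ones.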
Option 1 (TAN-core-string-functions)
tan:normalize-text($text as xs:string*) as xs:string*
One-parameter version of the full function below
Used by template ŧ check-referred-doc, ŧ expand-tan-key-dependencies, ŧ core-expansion-terse, ŧ core-expansion-normal, ŧ resolve-attr-include
Used by function tan:feature-test-to-groups(), tan:resolve-doc(), tan:normalize-text()
Relies upon tan:normalize-text.
Option 2 (TAN-core-string-functions)
tan:normalize-text($text as xs:string*, $render-common as xs:boolean) as xs:string*
Input: any sequence of strings; a boolean indicating whether the results should be normalized further to a common form
Output: that sequence, with each item's whitespace normalized and any help request removed
A common form is one where the string is converted to lower case and hyphens are replaced by spaces.
Used by template ŧ check-referred-doc, ŧ expand-tan-key-dependencies, ŧ core-expansion-terse, ŧ core-expansion-normal, ŧ resolve-attr-include
Used by function tan:feature-test-to-groups(), tan:resolve-doc(), tan:normalize-text()
Relies upon $help-trigger-regex.
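The normalization steps can be sketched in Python as below. This is an illustration under stated assumptions, not the TAN implementation: the value of $help-trigger-regex is not given here, so the sketch takes the pattern as an optional parameter, and the "???" used in the example is purely hypothetical. The order of the steps is also a choice made for the example.

```python
import re

def normalize_text(texts, render_common=False, help_trigger_regex=None):
    """Strip help requests, optionally render a common form, and normalize spaces."""
    results = []
    for text in texts:
        if help_trigger_regex:
            # Hypothetical stand-in for $help-trigger-regex.
            text = re.sub(help_trigger_regex, "", text)
        if render_common:
            # Common form: lower case, hyphens replaced by spaces.
            text = text.lower().replace("-", " ")
        # str.split() treats more characters as whitespace than XPath's normalize-space().
        results.append(" ".join(text.split()))
    return results

print(normalize_text(["  Well-Known   Text "], render_common=True))
```

Whitespace is collapsed last so that spaces introduced by the hyphen replacement do not leave double spaces behind.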
tan:string-length($input as xs:string?) as xs:integer
Input: any string
Output: the number of characters in the string, as defined by TAN (i.e., modifiers are counted with the preceding base character)
Used by template ŧ class-1-expansion-verbose, ŧ analyze-string-length-pass-1
Used by function tan:analyze-leaf-div-text-length-loop()
Relies upon tan:chop-string.
Option 1 (TAN-core-string-functions)
tan:tokenize-text($text as xs:string*) as element()*
one-parameter version of the function below
Used by template ŧ tokenize-div, ŧ dependencies-tokenized-selectively, ŧ dependency-expansion-normal, ŧ dependency-expansion-verbose
Used by function tan:tokenize-text()
Relies upon tan:tokenize-text, $token-definition-default.
Option 2 (TAN-core-string-functions)
tan:tokenize-text($text as xs:string*, $token-definition as element(tan:token-definition)?, $count-toks as xs:boolean?) as element()*
Input: any number of strings; a <token-definition>; a boolean indicating whether tokens should be counted and labeled.
Output: a <result> for each string, tokenized into <tok> and <non-tok> elements. If the counting option is turned on, the <result> contains @tok-count and @non-tok-count, and each <tok> and <non-tok> has an @n indicating which <tok> group it belongs to.
Used by template ŧ tokenize-div, ŧ dependencies-tokenized-selectively, ŧ dependency-expansion-normal, ŧ dependency-expansion-verbose
Used by function tan:tokenize-text()
Relies upon $token-definition-default.
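The tok and non-tok split, with optional counting, can be sketched in Python as follows. Several details are assumptions made for the example: the value of $token-definition-default is not shown here, so the sketch defaults to \w+; the result is a dictionary rather than a <result> element; and a non-tok is numbered with the tok that precedes it (0 if none does), which is only one possible reading of the @n rule.

```python
import re

def tokenize_text(text, token_pattern=r"\w+", count_toks=False):
    """Split text into alternating tok and non-tok pieces, optionally numbered."""
    pieces = []
    pos = 0
    for match in re.finditer(token_pattern, text):
        if match.start() > pos:
            pieces.append({"name": "non-tok", "text": text[pos:match.start()]})
        pieces.append({"name": "tok", "text": match.group()})
        pos = match.end()
    if pos < len(text):
        pieces.append({"name": "non-tok", "text": text[pos:]})

    result = {"pieces": pieces}
    if count_toks:
        n = 0
        for piece in pieces:
            if piece["name"] == "tok":
                n += 1
            piece["n"] = n  # assumed rule: a non-tok shares the number of the preceding tok
        result["tok-count"] = n
        result["non-tok-count"] = sum(1 for p in pieces if p["name"] == "non-tok")
    return result

result = tokenize_text("Hello, world!", count_toks=True)
print(result["tok-count"], result["non-tok-count"])
```

Concatenating the text of all pieces in order reproduces the input string, so nothing is lost in tokenization.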