Used by variable $special-end-div-chars-regex
Used by function tan:normalize-div-text()
Definition: concat('[', string-join($special-end-div-chars, ''), ']$')
Used by function tan:normalize-div-text()
Relies upon $special-end-div-chars.
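The definition above builds a character class anchored to the end of the string. The following is a minimal Python sketch of the same construction, offered only as an illustration. The contents of the list are an assumption taken from the tan:normalize-div-text description (ZWJ U+200D and SOFT HYPHEN U+00AD), and the names SPECIAL_END_DIV_CHARS and ends_with_special_char are hypothetical.

```python
import re

# Assumption: the two special div-end characters named in the
# tan:normalize-div-text entry (ZWJ and SOFT HYPHEN).
SPECIAL_END_DIV_CHARS = ["\u200d", "\u00ad"]

# Python counterpart of
# concat('[', string-join($special-end-div-chars, ''), ']$')
special_end_div_chars_regex = "[" + "".join(SPECIAL_END_DIV_CHARS) + "]$"


def ends_with_special_char(text: str) -> bool:
    """True if text ends in one of the special div-end characters.

    Note: unlike XPath, Python's `$` also matches just before a
    trailing newline, so this is only an approximation.
    """
    return re.search(special_end_div_chars_regex, text) is not None
```

For example, `ends_with_special_char("abc\u00ad")` is true, while `ends_with_special_char("abc")` is false.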
Definition: $token-definitions-reserved[following-sibling::tan:name = 'nonspace']
No variables, keys, functions, or named templates depend upon this xsl:variable.
Relies upon $token-definitions-reserved.
Looks for elements matching tan:div
Used by template ŧ class-2-expansion-terse
Does not rely upon global variables, keys, functions, or templates.
Looks for elements matching tan:div
Used by template ŧ class-1-expansion-verbose, ŧ class-2-expansion-terse
Does not rely upon global variables, keys, functions, or templates.
tan:analyze-leaf-div-string-length($document-fragment as item()*) as item()*
Input: any class 1 document fragment
Output: Every leaf div stamped with @string-length and @string-pos, indicating how long the text node is, and where it is relative to all other leaf text nodes, after TAN text normalization rules have been applied.
This function is useful for statistical processing, and for comparing a TAN-T(EI) file against an alternatively divided edition.
It has also been designed to stamp the <a> and <common> results of tan:diff(), to facilitate Schematron Quick Fixes (SQFs) that replace a text with that of the other version.
This function does the same thing as tan:analyze-string-length(), but approaches the problem with a recursive loop.
Used by template ŧ class-1-expansion-verbose
Relies upon tan:analyze-leaf-div-text-length-loop.
tan:analyze-leaf-div-text-length-loop($items-to-process as item()*, $char-count-so-far as xs:integer, $return-final-count as xs:boolean) as item()*
Recursive loop function that does the work for tan:analyze-leaf-div-string-length(), above.
Used by function tan:analyze-leaf-div-string-length(), tan:analyze-leaf-div-text-length-loop()
Relies upon tan:analyze-leaf-div-text-length-loop, tan:string-length, tan:normalize-div-text.
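The loop approach can be pictured as a recursive walk that threads a running character count through the divs in document order. The sketch below is a hedged Python model, not the XSLT implementation: divs are represented as dictionaries with either a "children" list or a "text" string (a hypothetical structure), the normalization helper is an assumption based on the tan:normalize-div-text description, and Python's len() counts code points, which may differ from tan:string-length.

```python
import re

# Assumed special div-end characters: ZWJ and SOFT HYPHEN.
SPECIAL_END = re.compile("[\u200d\u00ad]$")


def normalized(text):
    """Space-normalize; drop a special end character, else append a space."""
    t = re.sub(r"[ \t\r\n]+", " ", text).strip(" ")
    return SPECIAL_END.sub("", t) if SPECIAL_END.search(t) else t + " "


def stamp_loop(divs, char_count_so_far=0):
    """Return (stamped_divs, final_count), visiting divs in document order."""
    stamped = []
    for div in divs:
        if "children" in div:
            # Non-leaf div: recurse, carrying the running count along.
            kids, char_count_so_far = stamp_loop(div["children"], char_count_so_far)
            stamped.append({**div, "children": kids})
        else:
            # Leaf div: stamp its length and 1-based position.
            length = len(normalized(div["text"]))
            stamped.append({**div,
                            "string-length": length,
                            "string-pos": char_count_so_far + 1})
            char_count_so_far += length
    return stamped, char_count_so_far


tree = [
    {"n": "1", "children": [{"n": "1", "text": "abc\u00ad"},
                            {"n": "2", "text": "def"}]},
    {"n": "2", "text": "xyz"},
]
stamped, total = stamp_loop(tree)
```

Returning the final count alongside the stamped items plays a role comparable to the $return-final-count parameter, although how the XSLT function packages that value is not stated here.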
Option 1 (TAN-class-1-functions)
tan:analyze-string-length($resolved-class-1-doc-or-fragment as item()*) as item()*
One-parameter version of the two-parameter function below
Used by function tan:analyze-string-length()
Relies upon tan:analyze-string-length.
Option 2 (TAN-class-1-functions)
tan:analyze-string-length($resolved-class-1-doc-or-fragment as item()*, $mark-only-leaf-divs as xs:boolean) as item()*
Input: any class-1 document or fragment; an indication whether string lengths should be added only to leaf divs, or to every div.
Output: the same document, with @string-length and @string-pos added to every element
Calculates the string length of each leaf element and its relative position, so that a raw text can be segmented proportionally and given the structure of a model exemplar. NB: any $special-end-div-chars that terminate a <div> are not counted, and neither is the assumed space that follows. Conversely, when a div does not end in a special character, the nominal space that follows it is included in both the length and the position. Thus input...
<div type="m" n="1">abc­</div>
<div type="m" n="2">def‍</div>
<div type="m" n="3">ghi</div>
<div type="m" n="4">xyz</div>
...presumes a raw joined text of "abcdefghi xyz ", and so becomes output:
<div type="m" n="1" string-length="3" string-pos="1">abc­</div>
<div type="m" n="2" string-length="3" string-pos="4">def‍</div>
<div type="m" n="3" string-length="4" string-pos="7">ghi</div>
<div type="m" n="4" string-length="4" string-pos="11">xyz</div>
This function does the same thing as tan:analyze-leaf-div-string-length(), but approaches the problem in a two-template cycle, instead of a loop.
Used by function tan:analyze-string-length()
Relies upon ŧ analyze-string-length-pass-1, ŧ analyze-string-length-pass-2.
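The worked example above can be reproduced with a short two-pass sketch in Python. This is only a model of the arithmetic, under stated assumptions: the leaf divs are reduced to a flat list of strings, the helper names normalized, pass_1, and pass_2 are hypothetical, the special end characters are assumed to be ZWJ and SOFT HYPHEN, and len() counts code points rather than applying whatever rule tan:string-length uses.

```python
import re

# Assumed special div-end characters: ZWJ and SOFT HYPHEN.
SPECIAL_END = re.compile("[\u200d\u00ad]$")


def normalized(text):
    """Space-normalize; drop a special end character, else append a space."""
    t = re.sub(r"[ \t\r\n]+", " ", text).strip(" ")
    return SPECIAL_END.sub("", t) if SPECIAL_END.search(t) else t + " "


def pass_1(leaf_texts):
    """First pass: stamp each leaf with the length of its normalized text."""
    return [{"text": t, "string-length": len(normalized(t))} for t in leaf_texts]


def pass_2(stamped):
    """Second pass: string-pos is 1 plus the sum of all preceding lengths."""
    out, pos = [], 1
    for item in stamped:
        out.append({**item, "string-pos": pos})
        pos += item["string-length"]
    return out


# The four divs of the example: soft hyphen, ZWJ, and two plain endings.
result = pass_2(pass_1(["abc\u00ad", "def\u200d", "ghi", "xyz"]))
pairs = [(d["string-length"], d["string-pos"]) for d in result]
# pairs == [(3, 1), (3, 4), (4, 7), (4, 11)]
```

The pairs match the @string-length and @string-pos values in the example output, and the lengths sum to 14, the length of the presumed joined text "abcdefghi xyz ".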
tan:div-to-div-transfer($items-with-div-content-to-be-transferred as item()*, $items-whose-divs-should-be-infused-with-new-content as item()*) as item()*
Input: (1) any set of divs with content to be transferred into the structure of (2) another set of divs.
Output: The div structure of (2), infused with the content of (1). The content is allocated proportionately; where the text is split, preference is given first to punctuation, within a certain range, and then to word breaks.
This function is useful for converting class-1 documents from one reference system to another. Normally the conversion is flawed, because two versions of the same work rarely synchronize, but this function provides a good estimate, or a starting point for manual correction.
No variables, keys, functions, or named templates depend upon this xsl:function.
Relies upon tan:text-join, tan:infuse-divs.
tan:infuse-divs($new-content-to-be-transferred as xs:string?, $items-whose-divs-should-be-infused-with-new-content as item()*) as item()*
Input: a string; an XML fragment that has <div>s
Output: the latter, infused with the former, with the new text allocated in proportion to the relative quantities of text being replaced
Used by function tan:div-to-div-transfer()
Relies upon ŧ analyze-string-length-pass-1, ŧ analyze-string-length-pass-2, ŧ infuse-tokenized-text, ŧ strip-all-attributes-except.
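Proportional allocation with a preference for punctuation and then word breaks can be sketched as follows. This Python model is a guess at the general idea, not the library's algorithm: the function names infuse and snap, the window size of 5 characters, the punctuation set, and the tie-breaking are all assumptions, and the existing divs are reduced to a list of their text lengths.

```python
import re


def snap(text, ideal, window):
    """Pick a cut point near `ideal`.

    Prefer a point just after punctuation within `window` characters;
    otherwise take the nearest word break; otherwise keep `ideal`.
    """
    lo, hi = max(0, ideal - window), min(len(text), ideal + window)
    punct = [m.end() for m in re.finditer(r"[.,;:!?]\s*", text)
             if lo <= m.end() <= hi]
    if punct:
        return min(punct, key=lambda e: abs(e - ideal))
    breaks = [m.end() for m in re.finditer(r"\s+", text)]
    if breaks:
        return min(breaks, key=lambda e: abs(e - ideal))
    return ideal


def infuse(new_text, old_lengths, window=5):
    """Split new_text into len(old_lengths) pieces, proportionate to old_lengths."""
    total = sum(old_lengths)
    pieces, start, running = [], 0, 0
    for length in old_lengths[:-1]:
        running += length
        ideal = round(len(new_text) * running / total)
        cut = max(snap(new_text, ideal, window), start)  # never move backwards
        pieces.append(new_text[start:cut])
        start = cut
    pieces.append(new_text[start:])  # the last div takes the remainder
    return pieces


by_punctuation = infuse("One two, three four five six.", [10, 20])
by_word_break = infuse("alpha beta gamma delta", [1, 1])
```

In the first call the ideal break at one third of the text falls next to a comma, so the split lands just after it; in the second there is no punctuation, so the split falls on the nearest word break. In both cases the pieces concatenate back to the original string.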
Option 1 (TAN-class-1-functions)
tan:merge-divs($expanded-class-1-fragment as item()*) as item()*
See fuller version below
Used by template ŧ reset-hierarchy
Used by function tan:merge-divs()
Relies upon tan:merge-divs.
Option 2 (TAN-class-1-functions)
tan:merge-divs($expanded-class-1-fragment as item()*, $itemize-leaf-divs as xs:boolean) as item()*
See fuller version below
Used by template ŧ reset-hierarchy
Used by function tan:merge-divs()
Relies upon tan:merge-divs.
Option 3 (TAN-class-1-functions)
tan:merge-divs($expanded-class-1-fragment as item()*, $itemize-leaf-divs as xs:boolean, $exclude-elements-with-duplicate-values-of-what-attribute as xs:string?, $keep-last-duplicate as xs:boolean?) as item()*
Input: expanded class 1 document fragment whose individual <div>s are assumed to be in the proper hierarchy (result of tan:normalize-text-hierarchy()); a boolean indicating whether leaf divs should be itemized; an optional string representing the name of an attribute to be checked for duplicates; an optional boolean indicating whether the last of any such duplicates should be kept
Output: the fragment with the <div>s grouped according to their <ref> values
If the 2nd parameter is true, for each leaf <div> in a group there will be a separate <div type="#version">; otherwise leaf divs will be merely copied.
When merging multiple files, the value should normally be true; when the divs are misfits from a single source, false.
Used by template ŧ reset-hierarchy
Used by function tan:merge-divs()
Relies upon ŧ merge-divs.
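Grouping by reference, with or without itemizing, can be modeled roughly as below. This Python sketch is a simplification under stated assumptions: each leaf div is a dictionary with a "ref" key standing in for its <ref> value, the "src" key is hypothetical, the function name merge_divs only echoes the XSLT name, and the two optional duplicate-handling parameters are left out.

```python
def merge_divs(leaf_divs, itemize_leaf_divs=True):
    """Group leaf divs by their ref value, preserving first-seen order.

    When itemize_leaf_divs is true, each leaf becomes its own entry
    marked type="#version"; otherwise the leaf is copied unchanged.
    """
    groups = {}
    for div in leaf_divs:
        item = {**div, "type": "#version"} if itemize_leaf_divs else dict(div)
        groups.setdefault(div["ref"], []).append(item)
    return groups


divs = [
    {"ref": "1 1", "type": "v", "src": "a", "text": "In the beginning"},
    {"ref": "1 2", "type": "v", "src": "a", "text": "And the earth"},
    {"ref": "1 1", "type": "v", "src": "b", "text": "En arche"},
]
itemized = merge_divs(divs)
copied = merge_divs(divs, itemize_leaf_divs=False)
```

Here the two divs that share the reference "1 1" end up in one group, each as a separate version when itemized, which corresponds to the case of merging multiple files.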
tan:normalize-div-text($div-strings as xs:string*) as xs:string*
Input: any sequence of strings
Output: the same sequence, normalized according to TAN rules. Each item in the sequence is space-normalized; then, if it ends with one of the special div-end characters (ZWJ U+200D or SOFT HYPHEN U+00AD), that character is removed; otherwise a space is added at the end. Zero-length strings are skipped.
This function is designed specifically for TAN's commitment to nonmixed content. That is, every TAN element contains either elements or non-whitespace text but not both, which also means that whitespace text nodes are effectively ignored. It is assumed that every TAN element is followed by a notional whitespace.
Used by template ŧ text-join, ŧ class-1-expansion-verbose
Used by function tan:analyze-leaf-div-text-length-loop()
Relies upon $special-end-div-chars-regex.
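The normalization rule described above is small enough to model directly. The following Python sketch is an approximation and not the XSLT source: the function name normalize_div_text mirrors the XSLT name only for readability, space normalization is assumed to behave like XPath normalize-space() (collapsing runs of space, tab, CR, and LF, and trimming the ends), and zero-length strings are assumed to be skipped before normalization.

```python
import re

# Assumed value of $special-end-div-chars-regex: ZWJ or SOFT HYPHEN at the end.
SPECIAL_END = re.compile("[\u200d\u00ad]$")


def normalize_div_text(div_strings):
    """Return the strings normalized according to the rule described above."""
    result = []
    for s in div_strings:
        if s == "":
            continue  # zero-length strings are skipped
        t = re.sub(r"[ \t\r\n]+", " ", s).strip(" ")
        if SPECIAL_END.search(t):
            result.append(SPECIAL_END.sub("", t))  # remove the special character
        else:
            result.append(t + " ")  # add the notional trailing space
    return result


normalized = normalize_div_text(["abc\u00ad", "def\u200d", "  ghi \n jkl ", ""])
joined = "".join(normalize_div_text(["abc\u00ad", "def\u200d", "ghi", "xyz"]))
```

Joining the normalized strings, as in the second call, yields "abcdefghi xyz ", the raw joined text presumed in the tan:analyze-string-length example.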
tan:text-join($items as item()*) as xs:string?
Input: any document fragment of a TAN class 1 body, whether raw, resolved, or expanded
Output: a single string that joins and normalizes the leaf div text according to TAN rules (see tan:normalize-div-text(), above).
Used by template ŧ class-1-expansion-verbose, ŧ merge-divs, ŧ analyze-string-length-pass-1
Used by function tan:div-to-div-transfer()
Relies upon ŧ text-join.
tan:tokenize-div($divs as element()*, $token-definitions as element(tan:token-definition)) as element()*
Input: any <div>s, a <token-definition>
Output: the <div>s in tokenized form
No variables, keys, functions, or named templates depend upon this xsl:function.
Relies upon ŧ tokenize-div.