Using TAN outside the Network

The function library behind TAN is powerful, and it can be used in non-TAN applications. Below is a list of functions that have proven especially helpful. Some of them are not central to validation, so they must be retrieved through ../functions/TAN-extra-functions.xsl. For a complete list of all functions, see Chapter 11, TAN variables, keys, functions, and templates.

tan:batch-replace(): runs a sequence of regular expression replacements on any string. The sequence is prepared by constructing a series of <replace pattern="" replacement="" [flags=""]> elements, whose attributes follow the rules of tan:replace() or fn:replace().
tan:chop-string(): changes a string into a sequence of characters, as defined in TAN (i.e., combining characters are always kept with the base character). It is roughly equivalent to the XPath expression for $i in fn:string-to-codepoints(.) return fn:codepoints-to-string($i).
tan:collate(): like tan:diff(), but applied to any number of strings. The results are treated much like a collation of manuscript readings, with the output XML fragment tethered to sigla corresponding to the input strings. The function can be used to optimize the order of the input strings, and to compute the pairwise similarity of the strings.
tan:copy-indentation(): applies the white-space indentation of an element to any other XML fragment. Useful when you want to insert items in an XML file and preserve or imitate its indentation.
tan:diff(): compares any two strings for differences. Includes an option to mark the changes letter-for-letter, or merely word-for-word (easier to read in some contexts). This function, which was written under the assumption that the input strings would have some resemblance, has been used successfully on pairs of strings as long as 5M characters.
tan:duplicate-items(): like tan:duplicate-values(), but applied to any item. If a node, duplication is determined based on whether it is deeply equal to any other node.
tan:duplicate-values(): finds distinct items in a sequence whose values are repeated in the sequence. This function complements fn:distinct-values().
tan:fill(): repeats a string a given number of times. Helpful for formatting plain-text output.
tan:get-chars-by-name(): retrieves Unicode characters based upon words in their names.
tan:glob-to-regex(): changes a glob-like expression (normally used for filenames) into a regular expression (e.g., *.* becomes .*\..*).
tan:lang-code(): retrieves an ISO 639-3 code for a language of a given name.
tan:lang-name(): finds the name of a language, given its ISO 639-3 code.
tan:median(): retrieves the median from a sequence of numbers.
tan:most-common-item(): from a sequence of items, returns the one that occurs most frequently.
tan:most-common-item-count(): returns the number of times the most common item appears in a sequence.
tan:no-outliers(): removes outliers from a sequence of numbers.
tan:outliers(): returns only outliers from a sequence of numbers.
tan:search-morpheus(): retrieves lexico-morphological data for Greek and Latin from the Morpheus service.
tan:search-wikipedia(): retrieves a set number of records from Wikipedia.
tan:shallow-copy(): returns a copy of a node to a set depth. Useful for messages, to provide feedback on a particular element and its attributes, without any descendants (which would make the message hard to read).
tan:uri-relative-to(): converts an absolute URI to a relative one, based on some context URI.
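
To make a few of the descriptions above concrete, here is a minimal Python sketch of the behavior described for tan:chop-string(), tan:glob-to-regex(), and tan:duplicate-values(). It is not TAN's XSLT implementation, and the Python function names are hypothetical analogues. The handling of ? in the glob translation is an assumption; the description above only shows the * case.

```python
import re
import unicodedata
from collections import Counter


def chop_string(text):
    """Split text into characters, keeping combining marks with their base."""
    chunks = []
    for ch in text:
        if chunks and unicodedata.combining(ch):
            # A combining character is attached to the preceding base character.
            chunks[-1] += ch
        else:
            chunks.append(ch)
    return chunks


def glob_to_regex(glob):
    """Turn a glob-like expression into a regular expression string."""
    out = []
    for ch in glob:
        if ch == "*":
            out.append(".*")
        elif ch == "?":
            # Assumption: ? matches a single character, as in common globs.
            out.append(".")
        else:
            out.append(re.escape(ch))
    return "".join(out)


def duplicate_values(items):
    """Return each distinct value that occurs more than once, in first-seen order."""
    counts = Counter(items)
    return [value for value in dict.fromkeys(items) if counts[value] > 1]
```

For instance, glob_to_regex("*.*") yields the same pattern as the example in the list, and duplicate_values() returns exactly the values that fn:distinct-values() would report only once despite their repetition.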

Some numeral functions might prove useful:

Letter numerals ↔ integers: tan:aaa-to-int(), tan:int-to-aaa()
Roman numerals → integers: tan:rom-to-int() (reverse not available)
Greek numerals ↔ integers: tan:grc-to-int(), tan:int-to-grc()
Syriac numerals → integers: tan:syr-to-int() (reverse not available)
Hexadecimal ↔ decimal: tan:hex-to-dec(), tan:dec-to-hex()
String range ↔ integers: tan:expand-numerical-sequence(), tan:integers-to-sequence()
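
The conversions above can be illustrated with a short Python sketch. It shows the general idea behind tan:rom-to-int(), tan:expand-numerical-sequence(), and tan:integers-to-sequence(), not TAN's own code. The Python names are hypothetical, and the range syntax (comma-separated items, hyphenated ranges such as "1-3, 5") is an assumption about the input format.

```python
ROMAN_VALUES = {"i": 1, "v": 5, "x": 10, "l": 50, "c": 100, "d": 500, "m": 1000}


def rom_to_int(numeral):
    """Convert a Roman numeral (either case) to an integer."""
    values = [ROMAN_VALUES[ch] for ch in numeral.lower()]
    total = 0
    for i, value in enumerate(values):
        # A smaller value before a larger one is subtracted (e.g. iv = 4).
        if i + 1 < len(values) and value < values[i + 1]:
            total -= value
        else:
            total += value
    return total


def expand_numerical_sequence(text):
    """Expand a string such as '1-3, 5' into the integers it stands for."""
    result = []
    for part in text.split(","):
        part = part.strip()
        if "-" in part:
            start, end = (int(p) for p in part.split("-"))
            result.extend(range(start, end + 1))
        else:
            result.append(int(part))
    return result


def integers_to_sequence(numbers):
    """Collapse integers into a compact range string, e.g. '1-3, 5'."""
    nums = sorted(set(numbers))
    parts = []
    i = 0
    while i < len(nums):
        j = i
        # Extend the run while the next integer is consecutive.
        while j + 1 < len(nums) and nums[j + 1] == nums[j] + 1:
            j += 1
        parts.append(str(nums[i]) if i == j else f"{nums[i]}-{nums[j]}")
        i = j + 1
    return ", ".join(parts)
```

In this sketch the last two functions are inverses of each other for sorted input without duplicates, which mirrors the two-way arrow in the list above.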
