TAN-class-2 elements and attributes summarized

The element rename indicates the name of a <div> @n that should be changed in a given @type, and the name to which it should be changed.

There is no need to use this feature to convert Roman, alphabetic, or other numerals, which are detected and converted automatically.

Formal Definition
@old, @new

Used by: ~decl-rename-div-n

[Caution]Caution

@old and @new may not share the same value

[Caution]Caution

No value of @new or @old may appear more than once for a given div type in a given source.

[Caution]Caution

@old must be found in every div type of every source
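The three cautions above can be sketched as a small check. This is a minimal illustration in Python, not part of any TAN tool: the function name check_renames, the (old, new) tuples standing in for <rename> elements, and the set of existing @n values are all hypothetical, and the sketch assumes that @old and @new values are pooled when looking for repeats. It checks one div type in one source; a caller would repeat it for each combination.

```python
def check_renames(renames, existing_ns):
    """Return a list of problems for one div type in one source.

    renames: list of (old, new) tuples, hypothetical stand-ins for
             <rename old="..." new="..."/> elements.
    existing_ns: set of @n values present for that div type in that source.
    """
    problems = []
    seen = set()
    for old, new in renames:
        # @old and @new may not share the same value.
        if old == new:
            problems.append(f"@old and @new share the value {old!r}")
        # No value of @old or @new may appear more than once
        # (assumption: both attributes are checked against one pool).
        for value in dict.fromkeys((old, new)):
            if value in seen:
                problems.append(f"{value!r} appears more than once")
            seen.add(value)
        # @old must actually be found among the @n values.
        if old not in existing_ns:
            problems.append(f"@old {old!r} not found among the existing @n values")
    return problems


print(check_renames([("mt", "Matt"), ("mk", "Mark")], {"mt", "mk", "lk"}))
```

An empty list means that the set of renames passes all three checks.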


[Note]Note

Taken from ringoroses.div.1.xml


The element rename-div-ns provisionally reassigns @n values for one or more sources and one or more div types. Renaming applies only to the current file.

This element is especially useful for converting Roman numerals or letter numerals into Arabic numerals. See <rename> for syntax.

This feature is, strictly speaking, a convenience, not a necessity. All TAN-compliant preprocessors are required to automatically detect Roman and alphabetic numbering systems and treat them as Arabic numerals.

It is also useful for div types that use descriptive names for @n (such as books of the Bible), particularly for reconciling those names with a system that prevails or is preferred (e.g., "mt" to "Matt").

Note for TAN-A-div users: Although this element can reconcile simple differences, it should not be used for more complex inconsistencies that affect alignment, best handled in the <body> of a TAN-A-div file.

For more information, see the section called “Class 2 Metadata (<head>)”.

Formal Definition
~ed-stamp?, 
   (~inclusion | (
{[TAN-class-2 (~source-refs):]   @src} OR 

{[TAN-core (~source-refs):]   {empty}} OR 

{[TAN-LM-core (~source-refs):]   {empty}}, @div-type-ref, <rename>+))

Used by: ~declaration-items

[Caution]Caution

Every div type reference must be valid in every source


[Note]Note

Taken from ringoroses.div.1.xml


The element suppress-div-types marks div types in a source that should be suppressed in references. Suppression is shallow: it does not suppress any descendants of that div type. But if the suppression applies to a leaf div, that div and its text are effectively suppressed.

Any suppression of a div type must preserve the Leaf Div Uniqueness Rule (LDUR). See the section called “Flattened References, and the Leaf Div Uniqueness Rule”.

This element will seldom be used; it is meant for cases where a source has a div type that is dispensable in text references.

Formal Definition
~ed-stamp?, 
   (~inclusion | (
{[TAN-class-2 (~source-refs):]   @src} OR 

{[TAN-core (~source-refs):]   {empty}} OR 

{[TAN-LM-core (~source-refs):]   {empty}}, @div-type-ref))

Used by: ~declaration-items

[Caution]Caution

Every div type reference must be valid in every source


[Note]Note

Taken from ar.cat.tan-a-div.xml


[Note]Note

Taken from ringoroses.div.1.xml

The element tok identifies one or more words or word fragments. Used by class 2 files to make assertions about specific words.

In TAN-A-div and TAN-A-tok files, <tok> has no linguistic connotations; in TAN-LM, it normally does.

<tok>s are of two types: simple and complex.

SIMPLE: <tok>s that are restricted to a single token, or a portion of a single token. This is the normal behavior of <tok>. Multiple values in @src, @ref, and @pos will result in expansion across all values. But multiple values of @chars are taken to refer to the constituent parts of a single <tok> and so no expansion occurs on @chars.

For example, a <tok> with 2 values for @src, 3 for @ref, 4 for @pos, and 5 for @chars will result in a <tok> that points to 24 tokens (2 × 3 × 4), each of which is filtered to the same five characters (by position, not content). This syntax, then, allows multiple <tok>s to be collapsed into a single one, to save space and perhaps enhance legibility. Put another way, the pair <tok src="X" ref="a" pos="1"/> and <tok src="X" ref="a" pos="2"/> is always equivalent to <tok src="X" ref="a" pos="1-2"/>.
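The expansion of a simple <tok> can be sketched as a Cartesian product. The following Python is only an illustration: the function expand_simple_tok and the dictionary layout are hypothetical, and the values of @src, @ref, @pos, and @chars are assumed to have been parsed into lists already.

```python
from itertools import product


def expand_simple_tok(srcs, refs, positions, chars=None):
    """Expand a simple <tok> into one pointer per (src, ref, pos) combination.

    chars is carried along unchanged: it filters each token by character
    position and does not multiply the number of tokens.
    """
    return [
        {"src": s, "ref": r, "pos": p, "chars": chars}
        for s, r, p in product(srcs, refs, positions)
    ]


pointers = expand_simple_tok(
    srcs=["X", "Y"],
    refs=["a", "b", "c"],
    positions=[1, 2, 3, 4],
    chars=[1, 2, 3, 4, 5],
)
print(len(pointers))  # 2 * 3 * 4 = 24
```

Each of the 24 pointers carries the same five character positions, which matches the description of @chars as a filter rather than a multiplier.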

COMPLEX: There are cases where one wishes to treat more than one token, in whole or part, as a single entity. In this case, @cont should be used, and it must join <tok>s that have only single values for @src, @ref, and @pos. @chars may take multiple values.

The behavior of <tok> differs from <div-ref>. The former is never treated as a group, whereas the latter is. For more, see <div-ref>.

Formal Definition
~tok-attr-core, 
{[TAN-A-div (~tok-source-ref-opt):]   {empty}} OR 

{[TAN-class-2 (~tok-source-ref-opt):]   
  {{[TAN-class-2 (~source-refs):]   @src}} OR 

  {{[TAN-core (~source-refs):]   {empty}}} OR 

  {{[TAN-LM-core (~source-refs):]   {empty}}}}, 
{[TAN-LM-lang (~pointer-to-div-range):]   {empty}} OR 

{[TAN-class-2 (~pointer-to-div-range):]   @ref}, 
   (@val | 
{[TAN-LM-lang (~seq-pos-ref):]   {empty}} OR 

{[TAN-class-2 (~seq-pos-ref):]   @pos} | (@val, 
{[TAN-LM-lang (~seq-pos-ref):]   {empty}} OR 

{[TAN-class-2 (~seq-pos-ref):]   @pos})), 
{[TAN-A-div (~tok-cert-opt):]   {empty}} OR 

{[TAN-class-2 (~tok-cert-opt):]   
   
      (@cert | (@cert, @cert2))?}~tok-sequence-attr-core, @src, 
{[TAN-A-div (~continuation-opt):]   {empty}} OR 

{[TAN-class-2 (~continuation-opt):]   @cont} OR 

{[TAN-LM-core (~continuation-opt):]   @cont}, 
   
      (@cert | (@cert, @cert2))?~tok-sequence-attr-core, 
{[TAN-A-div (~continuation-opt):]   {empty}} OR 

{[TAN-class-2 (~continuation-opt):]   @cont} OR 

{[TAN-LM-core (~continuation-opt):]   @cont}~tok-sequence-attr-core

Used by: ~split, ~complex-text-ref, ~alignment-content-non-class-2, ~tok-sequence, ~TAN-LM-item

[Caution]Caution

Every token must be locatable in every cited ref in every source.

[Caution]Caution

<tok> must reference a leaf <div>.

[Caution]Caution

No source may be split more than once in the same place.

[Caution]Caution

Splits may not be made at the first token in a div.

[Caution]Caution

A <tok> may not duplicate any sibling <tok>.

[Caution]Caution

Any ana with an @xml:id must point to no more than one token.


The attribute chars lists one or more characters, specified through Arabic numerals and the keyword 'last' or 'last-X' (where X is a valid number), joined with commas or hyphens.

Examples: '1', 'last', 'last-3 - last-1', '1, 3, 5, 7 - 11, last-8, last'

Formal Definition
string (pattern ((last|max|all|\*)|((last|max)-\d+)|(\d+))(\s*-\s*(((last|max))|((last|max)-\d+)|(\d+)))?(\s*[, ]\s*(((last|max))|((last|max)-\d+)|(\d+))(\s+-\s+(((last|max))|((last|max)-\d+)|(\d+)))?)*|.*\?\?\?.*)

Used by: ~tok-attr-core

[Caution]Caution

Sequences may not include values less than 1.

[Caution]Caution

Sequences may not include values greater than the maximum allowed.

[Caution]Caution

Sequences may not include ranges that go from a larger value to a smaller, e.g., 4 - 2.
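To make the sequence syntax and the three cautions concrete, here is a rough Python sketch that resolves a value such as 'last-3 - last-1' into explicit positions. The names resolve_item and resolve_sequence are hypothetical, the maximum (for example, the number of characters in the token) is supplied by the caller, and 'max' is assumed to be a synonym of 'last' because both appear in the formal pattern. The sketch does not validate the overall syntax and ignores the keywords 'all' and '*'.

```python
import re

# One item: 'last', 'last-X', 'max', 'max-X', or an Arabic numeral.
ITEM = r"(?:last|max)(?:-\d+)?|\d+"
# One item, optionally followed by a hyphen and a second item (a range).
RANGE = re.compile(rf"({ITEM})(?:\s*-\s*({ITEM}))?")


def resolve_item(item, maximum):
    """Turn a single item into an integer position."""
    if item[0].isdigit():
        return int(item)
    _, _, offset = item.partition("-")
    return maximum - int(offset) if offset else maximum


def resolve_sequence(text, maximum):
    """Resolve a sequence into a list of positions, in the order stated."""
    result = []
    for match in RANGE.finditer(text):
        start = resolve_item(match.group(1), maximum)
        end = resolve_item(match.group(2), maximum) if match.group(2) else start
        if start < 1 or end < 1:
            raise ValueError(f"value less than 1 in {match.group(0)!r}")
        if start > maximum or end > maximum:
            raise ValueError(f"value greater than {maximum} in {match.group(0)!r}")
        if start > end:
            raise ValueError(f"range runs from larger to smaller: {match.group(0)!r}")
        result.extend(range(start, end + 1))
    return result


print(resolve_sequence("1, 3, 5, 7 - 11, last-8, last", 12))
```

With a maximum of 12, 'last-8' resolves to 4 and 'last' to 12, so the positions are returned in the order stated rather than sorted.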


The attribute cont indicates whether the current element is continued by the next one, with the two to be treated as a single one. The value must be 1 or true, which is implied by the very presence of the attribute. If you wish to declare it to be false, delete the attribute altogether.

This feature is useful in <tok> for rejoining the portion of a word split across two <div>s, or for uniting into a single linguistic token multiple tokens separated by the tokenization process, e.g., "pom pom".

This feature is useful in <div-ref> for creating groups of references that cannot be expressed in a single <div-ref>.

Formal Definition
boolean (pattern true|1)

Used by: ~continuation-opt

[Caution]Caution

Any element taking @cont must be followed by at least one sibling of the same type.
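The joining behavior of @cont can be sketched as a grouping of consecutive siblings. The Python below is a minimal illustration under stated assumptions: siblings are represented as dictionaries, a truthy "cont" key stands in for the presence of the attribute, and the function name group_by_cont is hypothetical.

```python
def group_by_cont(items):
    """Group consecutive siblings; an item with cont set joins the next sibling."""
    groups, current = [], []
    for index, item in enumerate(items):
        current.append(item)
        if item.get("cont"):
            # An element taking @cont must be followed by a sibling.
            if index == len(items) - 1:
                raise ValueError("an element with @cont must be followed by a sibling")
        else:
            groups.append(current)
            current = []
    return groups


toks = [{"val": "pom", "cont": True}, {"val": "pom"}, {"val": "girl"}]
for group in group_by_cont(toks):
    print([tok["val"] for tok in group])
```

Here the two "pom" tokens end up in one group, and the third token stands alone.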

The attribute div-type-ref is used by class-2 files to point to one or more <div-type>s in class-1 files. Permits multiple values separated by spaces.

Formal Definition

Used by: ~div-type-ref-cluster, ~decl-supp-div-type, ~decl-rename-div-n

[Caution]Caution

Every div type reference must be valid in every source


[Note]Note

Taken from ar.cat.tan-a-div.xml


[Note]Note

Taken from ringoroses.div.1.xml

The attribute new provides the new name for an @n that is to be renamed.

Formal Definition
string (pattern (\w|\d+-\d+)+(\s+(\w|\d+-\d+)+)*)

Used by: <rename>

[Caution]Caution

@old and @new may not share the same value

[Caution]Caution

No value of @new or @old may appear more than once for a given div type in a given source.


[Note]Note

Taken from ringoroses.div.1.xml


The attribute old provides the name of an @n to be renamed.

Formal Definition
string (pattern (\w|\d+-\d+)+(\s+(\w|\d+-\d+)+)*)

Used by: <rename>

[Caution]Caution

@old and @new may not share the same value

[Caution]Caution

No value of @new or @old may appear more than once for a given div type in a given source.

[Caution]Caution

@old must be found in every div type of every source


[Note]Note

Taken from ringoroses.div.1.xml


The attribute pos lists one or more items, specified through Arabic numerals and the keyword 'last' or 'last-X' (where X is a valid number), joined with commas or hyphens.

Examples: '1', 'last', 'last-3 - last-1', '1, 3, 5, 7 - 11, last-8, last'

For more, see the section called “@pos and @val”.

Formal Definition
string (pattern ((last|max|all|\*)|((last|max)-\d+)|(\d+))(\s*-\s*(((last|max))|((last|max)-\d+)|(\d+)))?(\s*[, ]\s*(((last|max))|((last|max)-\d+)|(\d+))(\s+-\s+(((last|max))|((last|max)-\d+)|(\d+)))?)*|.*\?\?\?.*)

string (pattern ((last|max)|((last|max)-\d+)|(\d+))|.*\?\?\?.*)

Used by: ~tok-regular, ~tok-sequence-attr-core

[Caution]Caution

Sequences may not include values less than 1.

[Caution]Caution

Sequences may not include values greater than the maximum allowed.

[Caution]Caution

Sequences may not include ranges that go from a larger value to a smaller, e.g., 4 - 2.
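The multi-value pattern in the formal definition of @pos can be tried out directly. The sketch below copies that pattern into Python's re module, which is an approximation, since the dialect of XML Schema regular expressions differs from Python's in some details. XML Schema patterns must match the whole value, so fullmatch is used. The name is_well_formed_pos is hypothetical.

```python
import re

# The multi-value @pos pattern, split across lines only for readability.
POS_PATTERN = re.compile(
    r"((last|max|all|\*)|((last|max)-\d+)|(\d+))"
    r"(\s*-\s*(((last|max))|((last|max)-\d+)|(\d+)))?"
    r"(\s*[, ]\s*(((last|max))|((last|max)-\d+)|(\d+))"
    r"(\s+-\s+(((last|max))|((last|max)-\d+)|(\d+)))?)*"
    r"|.*\?\?\?.*"
)


def is_well_formed_pos(value):
    """True if the whole value matches the pattern (syntax only)."""
    return POS_PATTERN.fullmatch(value) is not None


for example in ("1", "last", "last-3 - last-1", "1, 3, 5, 7 - 11, last-8, last", "4 - 2", "first"):
    print(repr(example), is_well_formed_pos(example))
```

Note that '4 - 2' is well formed as far as the pattern is concerned; the rule against descending ranges is a separate check on the resolved values, as the cautions above state.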


The attribute ref lists references to one or more <div>s. It consists of one or more simple references joined by commas or hyphens. A simple reference is a string value that points to a single <div>.

It is assumed that any simple reference that has fewer @n values than preceding simple references has been truncated. The abbreviated form will be checked before the form actually stated. For example, 1 1 - 3 will be interpreted first as 1 1 through 1 3; if that is invalid, it will be interpreted as 1 1 through 3. Examples: '2.4 - 7, 9', 'iv 7 - 9'
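The order in which a truncated range end is tried can be sketched as follows. This Python is illustrative only: the function names are hypothetical, simple references are assumed to be whitespace-separated @n values (as in the example 1 1 - 3), and the set of valid references is assumed to have been collected from the source beforehand.

```python
def candidate_range_ends(start, end):
    """List the readings of a range end, abbreviated form first."""
    start_ns, end_ns = start.split(), end.split()
    candidates = []
    if len(end_ns) < len(start_ns):
        # Borrow the leading @n values that the shorter reference omits.
        inherited = start_ns[: len(start_ns) - len(end_ns)]
        candidates.append(" ".join(inherited + end_ns))
    # The form actually stated is tried last.
    candidates.append(" ".join(end_ns))
    return candidates


def resolve_range_end(start, end, valid_refs):
    """Return the first reading that corresponds to an existing <div>, or None."""
    for candidate in candidate_range_ends(start, end):
        if candidate in valid_refs:
            return candidate
    return None


print(candidate_range_ends("1 1", "3"))
print(resolve_range_end("1 1", "3", {"1 1", "1 2", "1 3", "3"}))
print(resolve_range_end("1 1", "3", {"1 1", "2", "3"}))
```

In the second call the abbreviated reading 1 3 exists and wins; in the third it does not, so the stated form 3 is used.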

In a range with members of uneven depth, those <div>s that are closest to the shallowest member are retrieved. For example, 2 - 3 2 2 might fetch 2, 3 1, 3 2 1, 3 2 2 (and not 3 or 3 1 1).

For more, see the section called “Class 2 Data Patterns (<body>)”

Formal Definition
string (pattern (\w+([^\w\-]\w+)*)(((\s*-\s*)|(\s*,\s+))(\w+([^\w\-]\w+)*))*|.*\?\?\?.*)

string (pattern (\w+([^\w\-]\w+)*)|.*\?\?\?.*)

Used by: ~anchor-div-ref-item, ~reanchor-div-ref-item, ~simple-textual-reference, ~claim-div-ref-item, ~tok-regular, ~tok-sequence-attr-core

[Caution]Caution

No single set of references may mix Roman numerals, alphabetic numerals, and numerals that are ambiguously either.

[Caution]Caution

Every atomic reference in a @ref must correspond to a <div> in every source mentioned by @src.

[Caution]Caution

Every range in a @ref must correspond to one or more <div>s in every source mentioned by @src.

[Caution]Caution

If @ref points to a leaf div, it must be unique.

[Important]Important

A defective reference is a value of @ref that corresponds to a <div> in some but not all sources in a work. If a defective reference is used, a warning will be reported, identifying the sources that lack the appropriate <div>.


Depending on the context in which it is used, the attribute src refers either to the ID of one or more <source>s or to the ID of only one <source>.

Formal Definition
NCName 

Used by: ~div-type-ref-cluster, ~split, ~anchor-div-ref-item, ~reanchor-div-ref-item, ~simple-textual-reference, ~complex-textual-reference-set, ~decl-supp-div-type, ~decl-rename-div-n, ~tok-source-ref-opt, ~tok-with-src-and-cont, ~decl-tok-def

[Caution]Caution

Every idref in an attribute must point to the @xml:id value of the appropriate corresponding element.

[Caution]Caution

All idrefs in an attribute must be unique.

[Caution]Caution

Every atomic reference in a @ref must correspond to a <div> in every source mentioned by @src.

[Caution]Caution

Every range in a @ref must correspond to one or more <div>s in every source mentioned by @src.


[Note]Note

Taken from ar.cat.tan-a-div.xml

The attribute val specifies a particular word token by means of its string value. Permits regular expressions.

For more, see the section called “@pos and @val”.

Formal Definition
string (pattern .+)

Used by: ~tok-regular, ~tok-sequence-attr-core

[Caution]Caution

Attributes that take a regular expression must use escape sequences recognized by XML schema or TAN escape extensions (\k{}). See http://www.w3.org/TR/xmlschema-2/#regexs for details.

[Caution]Caution

@val must wholly match a token in the target.

[Important]Important

A @val set to '.+', a regular expression that matches any string, is equivalent to the omission of @val.
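The requirement that @val wholly match a token can be illustrated with a short Python sketch. It uses Python's re module as a stand-in, which does not support the TAN \k{} extension and differs in places from XML Schema regular expressions. The token list and the name matching_positions are hypothetical.

```python
import re


def matching_positions(val, tokens):
    """Return the 1-based positions of tokens that the pattern matches in full."""
    pattern = re.compile(val)
    return [
        position
        for position, token in enumerate(tokens, start=1)
        if pattern.fullmatch(token)
    ]


tokens = ["ring", "around", "the", "rosie"]
print(matching_positions("r.+", tokens))  # [1, 4]
print(matching_positions("ro", tokens))   # []
print(matching_positions(".+", tokens))   # [1, 2, 3, 4]
```

The pattern 'ro' finds nothing even though two tokens contain those letters, because a partial match does not count; '.+' selects every token, just as omitting @val would.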


[Note]Note

Taken from ar.cat.tan-a-div.xml