Chapter 8. TAN patterns, elements, and attributes defined

Table of Contents

TAN-core elements and attributes summarized
<agent>
<agentrole>
<alias>
<body>
<change>
<checksum>
<comment>
<declarations>
<desc>
<for-lang>
<group>
<group-type>
<head>
<inclusion>
<IRI>
<key>
<location>
<master-location>
<name>
<relationship>
<rights-excluding-sources>
<rights-source-only>
<role>
<see-also>
<source>
<tail>
<token-definition>
<value>
<version>
<when>
<work>
@affects-element
@cert
@cert2
@ed-when
@ed-who
@flags
@from
@group
@help
@href
@id
@idrefs
@in-progress
@include
@n
@regex
@rights-holder
@roles
@TAN-version
@to
@type
@when
@when-accessed
@which
@who
@xml:id
@xml:lang
TAN-class-1 elements and attributes summarized
<div-type>
<filter>
<normalization>
<replace>
<transliteration>
@replacement
TAN-T elements and attributes summarized
<div>
<TAN-T>
TAN-class-2 elements and attributes summarized
<rename>
<rename-div-ns>
<suppress-div-types>
<tok>
@chars
@cont
@div-type-ref
@new
@old
@pos
@ref
@src
@val
TAN-A-div elements and attributes summarized
<anchor-div-ref>
<div-ref>
<div-type-ref>
<equate-div-types>
<equate-works>
<realign>
<split-leaf-div-at>
<TAN-A-div>
@seg
@work
TAN-A-tok elements and attributes summarized
<align>
<bitext-relation>
<reuse-type>
<TAN-A-tok>
@bitext-relation
@reuse-type
TAN-LM-core elements and attributes summarized
<ana>
<l>
<lexicon>
<lm>
<m>
<morphology>
<TAN-LM>
@def-ref
@lexicon
@morphology
TAN-LM elements and attributes summarized
TAN-LM-lang elements and attributes summarized
TAN-class-3 elements and attributes summarized
TAN-key elements and attributes summarized
<item>
<TAN-key>
TAN-mor elements and attributes summarized
<assert>
<category>
<feature>
<report>
<TAN-mor>
@code
@context
@feature-qty-test
@feature-test
@matches-m
@matches-tok
TAN-c elements and attributes summarized
<TAN-c>
TAN-c-core elements and attributes summarized
<claim>
<claim-basis>
<locus>
<modal>
<object>
<person>
<place>
<scriptum>
<subject>
<topic>
<unit>
<verb>
@adverb
@claim-basis
@claimant
@object
@object-datatype
@object-lexical-constraint
@subject
@units
@verb
@where
TAN patterns
~agent-list
~agent-ref
~agent-role-list
~alignment
~alignment-attributes-non-class-2
~alignment-content-non-class-2
~alignment-inclusion-opt
~anchor-div-ref-item
~any-attribute
~any-content
~any-element
~assert
~attr-cert
~attr-cert2
~bitext-relation-attr
~body-group
~body-group-opt
~category
~category-feature
~category-list
~cert-claim
~cert-content
~cert-opt
~certainty-stamp
~change-list
~char-ref
~checksum
~claim
~claim-div-ref-item
~claimant
~code
~comment
~complex-object
~complex-rationale
~complex-subject
~complex-text-ref
~complex-textual-reference-set
~continuation
~continuation-opt
~decl-alias
~decl-brel
~decl-class-1
~decl-div
~decl-filt
~decl-filt-norm
~decl-filt-repl
~decl-filt-tlit
~decl-filter-content
~decl-group-type
~decl-id-ref-opt
~decl-lexi
~decl-mode
~decl-morph
~decl-non-class-1
~decl-opt
~decl-pattern-default
~decl-pattern-language
~decl-pattern-no-id
~decl-pers
~decl-place
~decl-rename-div-n
~decl-reus
~decl-scri
~decl-supp-div-type
~decl-tok-def
~decl-topic
~decl-unit
~decl-verb
~decl-vers
~decl-work
~declaration-core
~declaration-items
~div-item-ref
~div-range-ref
~div-type-equiv
~div-type-ref
~div-type-ref-cluster
~ed-agent
~ed-stamp
~ed-time
~element-scope
~entity-digital-generic-ref
~entity-digital-tan-other-ref
~entity-digital-tan-self-ref
~entity-nondigital-ref
~entity-tok-def
~error-flag
~feature
~feature-list
~feature-pattern
~feature-pattern-no-code
~feature-qty-test
~feature-test
~filter
~func-param-flags
~func-param-pattern
~func-replace
~grammar-attr
~group-attributes
~group-ref
~help-opt
~href-opt
~id-option
~inclusion
~inclusion-att
~inclusion-item
~inclusion-list
~internal-id
~internal-idrefs
~IRI-gen
~IRI-gen-ref
~item
~item-picker
~item-pos-ref
~key-item
~key-list
~keyword-ref
~lang-of-content
~lang-outside
~lexeme
~lexicon-attr
~loc-self
~loc-src
~locus
~matches-m
~matches-tok
~metadata-desc
~metadata-human
~modal-claim
~morph
~n
~n-val
~name-change
~non-class-2-opt
~nonsource-rights
~nontextual-reference
~object
~object-constraint
~object-datatype
~object-element
~object-lexical-constraint
~other-body-attributes
~period-filter
~place-filter
~pointer-to-div-item
~pointer-to-div-range
~progress
~rationale
~realignment
~reanchor-div-ref-item
~relationship
~report
~reuse-type-attr
~rights-holder
~role-list
~role-ref
~see-also-item
~see-also-list
~seg-ref
~seq-picker
~seq-pos-ref
~set-of-claims
~simple-object
~simple-rationale
~simple-subject
~simple-textual-reference
~source-id-opt
~source-item
~source-list
~source-ref
~source-refs
~source-rights
~split
~subject
~TAN-body
~TAN-body-core
~TAN-c-decl
~TAN-c-decl-core
~TAN-c-item
~TAN-head
~TAN-key-decl
~TAN-key-item
~TAN-LM-item
~TAN-R-mor-body
~TAN-root
~TAN-tail
~TAN-ver
~test-pattern
~text-div
~textual-reference
~tok-attr-core
~tok-cert-opt
~tok-regular
~tok-sequence
~tok-sequence-attr-core
~tok-source-ref-opt
~tok-with-cont-but-no-src
~tok-with-src-and-cont
~tok-without-cont-or-src
~token-value-ref
~type
~units
~URI-tag
~verb
~when-claim
~work-equiv
~work-ref
~work-refs

The 81 elements and 60 attributes defined in TAN, excluding TEI, are the following:

@adverb @affects-element <agent> <agentrole> <alias> <align> <ana> <anchor-div-ref> <assert> @bitext-relation <bitext-relation> <body> <category> @cert @cert2 <change> @chars <checksum> <claim> @claim-basis <claim-basis> @claimant @code <comment> @cont @context <declarations> @def-ref <desc> <div> <div-ref> <div-type> @div-type-ref <div-type-ref> @ed-when @ed-who <equate-div-types> <equate-works> <feature> @feature-qty-test @feature-test <filter> @flags <for-lang> @from @group <group> <group-type> <head> @help @href @id @idrefs @in-progress @include <inclusion> <IRI> <item> <key> <l> @lexicon <lexicon> <lm> <location> <locus> <m> <master-location> @matches-m @matches-tok <modal> @morphology <morphology> @n <name> @new <normalization> @object <object> @object-datatype @object-lexical-constraint @old <person> <place> @pos <realign> @ref @regex <relationship> <rename> <rename-div-ns> <replace> @replacement <report> @reuse-type <reuse-type> <rights-excluding-sources> @rights-holder <rights-source-only> <role> @roles <scriptum> <see-also> @seg <source> <split-leaf-div-at> @src @subject <subject> <suppress-div-types> <tail> <TAN-A-div> <TAN-A-tok> <TAN-c> <TAN-key> <TAN-LM> <TAN-mor> <TAN-T> @TAN-version @to <tok> <token-definition> <topic> <transliteration> @type <unit> @units @val <value> @verb <verb> <version> @when <when> @when-accessed @where @which @who @work <work> @xml:id @xml:lang

The contents of this chapter have been generated automatically. Although much effort has been spent to ensure accurate representation of the schemas and function library, you may find errors or inconsistencies. In such cases, the functions and schemas (particularly the RELAX-NG, compact syntax) are to be given priority.

The element agent specifies a person or organization that played a direct or indirect role in preparing, creating, or editing the data.

At least one <agent> must have an <IRI> that is a tag URN whose namespace matches that of the IRI name. By default, the first such <agent>, called the key agent, is taken to be the person or organization ultimately responsible for the assertions in the current file. See the section called “@id and a TAN file's IRI Name”.

This element may also name a computer or algorithm that performed a task. This feature is useful for crediting software, e.g., an OCR program used to convert an image, or an algorithm that estimates word-to-word alignments.

Formal Definition
~ed-stamp?, 
   (~inclusion | 
      (@roles?, @xml:id, (<comment>* & 
         
            ((<IRI>+, ~metadata-human) | @which))))

Used by: ~TAN-head

[Caution]Caution

Every TAN file must have a primary agent, the organization or person that takes the greatest responsibility for the content of the TAN file. The primary agent is defined as the first <agent> with an <IRI> that is a tag URI whose namespace matches the namespace of @id in the root element.
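The rule above can be illustrated with a rough Python sketch. It is not part of the TAN function library. It assumes that the namespace of a tag URI is the leading `tag:authority,date` portion, and it represents the <agent> elements as a hypothetical ordered list of (id, IRIs) pairs. The agent IRIs in the sample data are invented for illustration.

```python
import re

# Assumed shape of a tag URI: tag:<authority>,<date>:<specific part>
TAG_NAMESPACE = re.compile(r"^(tag:[^,:]+,\d{4}(?:-\d{2}){0,2}):")

def tag_namespace(iri):
    """Return the 'tag:authority,date' prefix of a tag URI, or None."""
    match = TAG_NAMESPACE.match(iri)
    return match.group(1) if match else None

def primary_agent(file_id, agents):
    """Return the id of the first agent with an IRI in the file's tag namespace.

    `agents` is an ordered list of (agent_id, [iri, ...]) pairs, a
    hypothetical stand-in for the <agent> elements of <head>.
    """
    namespace = tag_namespace(file_id)
    if namespace is None:
        return None
    for agent_id, iris in agents:
        if any(tag_namespace(iri) == namespace for iri in iris):
            return agent_id
    return None

file_id = "tag:kalvesmaki.com,2014:tan-t:ar.cat.eng.1926.edghill:model-object-refs"
agents = [
    ("ocr", ["tag:example.org,2015:agent:ocr-engine"]),  # invented IRI
    ("kalvesmaki", ["tag:kalvesmaki.com,2014:self"]),    # invented IRI
]
print(primary_agent(file_id, agents))
```

Document order matters here: if several agents share the file's namespace, only the first is treated as primary.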





The element body contains the data.

Formal Definition
@in-progress?, ~ed-stamp?, (<comment>* & 
{[TAN-A-div (~TAN-body-core):]   ~TAN-body-core} OR 

{[TAN-mor (~TAN-body-core):]   ~TAN-R-mor-body} OR 

{[TAN-core (~TAN-body-core):]   ~TAN-body-core})

Used by: ~TAN-root





The element change declares a change made to the current file. It must credit an <agent>, specified by @who, and give the time the change was made, specified by @when.

Collectively, <change> elements are called the changelog, the revision history of the document.

The editor has discretion as to how long or detailed a <change> should be, or how many should be retained in a changelog. Ideally, <change>s documenting every published version should be retained.

<change> elements may appear in any order, but it is good practice to put the most recent at the top.

Formal Definition
~ed-stamp?, @when, @flags?, @who, text
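As a rough illustration of the recommended ordering, the following Python sketch sorts <change> elements so that the most recent comes first. It is a sketch under stated assumptions: the fragment omits the TAN namespace for brevity, every @when value is assumed to carry a time zone offset, and the change texts are invented.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Hypothetical fragment; a real TAN file would use the TAN namespace.
FRAGMENT = """
<head>
   <change when="2015-02-10T09:00:00-05:00" who="kalvesmaki">Started file</change>
   <change when="2016-07-07T16:36:28-04:00" who="kalvesmaki">Revised references</change>
</head>
"""

def newest_first(head):
    """Return the <change> children of `head`, most recent @when first."""
    return sorted(
        head.findall("change"),
        key=lambda change: datetime.fromisoformat(change.get("when")),
        reverse=True,
    )

head = ET.fromstring(FRAGMENT.strip())
ordered = newest_first(head)
for change in ordered:
    print(change.get("when"), change.text)
```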




The element comment discusses issues relevant to nearby data. It must credit an <agent>, specified by @who, and give the time the comment was made, specified by @when.

Formal Definition
@when, @who, text

Used by: ~split, ~realignment, ~alignment, ~feature-pattern, ~feature-pattern-no-code, ~category, ~decl-div, ~decl-filt, ~decl-filt-norm, ~func-replace, ~decl-pattern-default, ~decl-pattern-no-id, ~decl-pattern-language, ~decl-group-type, ~TAN-head, ~TAN-body, ~nonsource-rights, ~inclusion-item, ~key-item, ~source-item, ~source-rights, ~see-also-item, ~decl-opt, ~agent-list, ~role-list, ~TAN-LM-item





The element declarations contains assumptions or decisions made that materially affect the interpretation of the data in <body>. Every TAN format's <declarations> is unique.

Formal Definition
~ed-stamp?, (<comment>* & <alias>* & 
{[TAN-A-div (~declaration-items):]   
   (<token-definition>* & <suppress-div-types>* & <rename-div-ns>* & )} OR 

{[TAN-A-tok (~declaration-items):]   
   (<token-definition>* & <suppress-div-types>* & <rename-div-ns>* & <bitext-relation>+ & <reuse-type>++ & <group-type>*)} OR 

{[TAN-c (~declaration-items):]   } OR 

{[TAN-key (~declaration-items):]   <group-type>*} OR 

{[TAN-class-1 (~declaration-items):]   
   (
      (<work> & <version>? & <div-type>+ & <token-definition>* & <filter>?) & {empty})} OR 

{[TAN-core (~declaration-items):]   {empty}} OR 

{[TAN-LM-core (~declaration-items):]   
   (<token-definition>* & <suppress-div-types>* & <rename-div-ns>* & <lexicon>+ & <morphology>+ & <group-type>*)})

Used by: ~TAN-head





The element desc provides a description of a concept, person, or thing referred to by the parent element (or the current document, if the parent element is <head>). <desc> is, in effect, a <comment> about that concept, person, or thing. It has two possible structures, one human-readable and the other computer-readable.

Under the first, human-readable approach, <desc> takes merely a descriptive text about the entity, optionally with @xml:lang. If you provide descriptions in other languages, it is best to make sure that each version says roughly the same thing.

Under the second, computer-readable approach, <desc> takes an IRI + name pattern plus <location> and @href pointing to a <TAN-c> file, which provides contextual information about the concept, person, or thing.

Formal Definition

   (~metadata-desc | (@which | 
      
         (@href | (<IRI>, ~metadata-human, <checksum>*, <location>+))))
[Caution]Caution

All text must be normalized (Unicode NFC).
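One way to check the normalization requirement is sketched below in Python, using the standard `unicodedata` module. The sample strings are invented, and this is only an approximation of whatever test a TAN validator actually applies.

```python
import unicodedata

def is_nfc(text):
    """True if `text` is already in Unicode Normalization Form C."""
    return unicodedata.normalize("NFC", text) == text

decomposed = "e\u0301tude"  # 'e' followed by COMBINING ACUTE ACCENT
composed = unicodedata.normalize("NFC", decomposed)

print(is_nfc(decomposed))
print(is_nfc(composed))
```

Text that fails the check can be repaired by replacing it with the output of `unicodedata.normalize("NFC", text)`.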




The element head contains the metadata (data about the data contained by <body>)

This element indicates, at a bare minimum: the name of the file; the sources; the most significant parts of the editorial history; the linguistic or scholarly conventions that have been adopted in creating the data; the license, i.e., who holds what rights to the data, and what kind of reuse is allowed; and the persons, organizations, or entities that helped create the data, along with the roles played by each.

The structure of <head> is shared across TAN files, with differences between them isolated to the child <declarations>.

Formal Definition
~ed-stamp?, 
   (<comment>* & 
      (~entity-digital-tan-self-ref, <rights-excluding-sources>, (<inclusion>* & <key>* & 
{[TAN-A-div (~source-list):]   <source>+} OR 

{[TAN-A-tok (~source-list):]   ~source-list} OR 

{[TAN-c (~source-list):]   {empty}} OR 

{[TAN-key (~source-list):]   {empty}} OR 

{[TAN-LM-lang (~source-list):]   <for-lang>} OR 

{[TAN-LM (~source-list):]   <source>} OR 

{[TAN-class-3 (~source-list):]   <source>*} OR 

{[TAN-core (~source-list):]   <source>} & <see-also>*), <declarations>, <agent>+, <role>+, <agentrole>*, <change>+))

Used by: ~TAN-root





The element inclusion specifies a TAN file that is available for inclusion. An inclusion occurs whenever an element X points to this inclusion by means of @include. TAN-compliant validators and processors will find every X that is found in the included file (checked recursively, against any inclusions of X adopted by the inclusion) and insert them at that place in the main document.

Only select elements will be included, not the entire inclusion file. Exactly which elements are included is dictated by @include.

Invoking an <inclusion> does not require its use.

For more on this, see the section called “Inclusions and Keys”.

Formal Definition
~ed-stamp?, @xml:id, (<comment>* & 
   
      (@href | (<IRI>, ~metadata-human, <checksum>*, <location>+)))

Used by: ~work-equiv, ~div-type-equiv, ~split, ~realignment, ~alignment-inclusion-opt, ~TAN-key-item, ~feature-pattern, ~feature-pattern-no-code, ~category, ~test-pattern, ~text-div, ~claim, ~decl-div, ~decl-filt-norm, ~func-replace, ~decl-supp-div-type, ~decl-rename-div-n, ~decl-pattern-default, ~decl-pattern-no-id, ~decl-pattern-language, ~decl-group-type, ~decl-tok-def, ~body-group, ~nonsource-rights, ~inclusion-list, ~key-item, ~source-item, ~source-rights, ~see-also-item, ~relationship, ~agent-list, ~role-list, ~agent-role-list, ~decl-alias, ~decl-morph, ~decl-lexi, ~TAN-LM-item

[Caution]Caution

Inclusions may not introduce duplicate values of @xml:id.

[Caution]Caution

For any element with @include, at least one element of the same name must be found in the target inclusion document.

[Caution]Caution

Inclusions may not be circular.

[Caution]Caution

Inclusions are integral parts of any TAN file. Access to at least one copy is absolutely mandatory.

[Caution]Caution

Every inclusion should have at least one document available.

[Caution]Caution

Every element with a <location> should have at least one document available.

[Caution]Caution

Every TAN file referred to by way of an element containing <location> should have an @id that matches the <IRI> of the parent of the <location>.

[Caution]Caution

No element may point to a TAN file that has an identical @id value; the only exception is a <see-also> pointing to an older or newer version.

[Important]Important

If @when-accessed predates one or more dates in a target file, a warning will be returned.

[Important]Important

If a target file does not explicitly give the <body>'s @in-progress the value of true(), a warning will be returned, stating that the target file is marked as being in progress.

[Important]Important

If a target file has a <see-also> marked as a new version (update) a warning will be returned.
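The rule that inclusions may not be circular can be checked with an ordinary depth-first search. The Python sketch below is illustrative only. It assumes the inclusion relationships have already been collected into a hypothetical dictionary that maps each file's identifier to the identifiers of the files it includes.

```python
def find_inclusion_cycle(inclusions, start):
    """Return one cycle reachable from `start` as a list of file ids, or None."""
    path = []         # files on the current chain of inclusions, in order
    on_path = set()   # same files, for fast membership tests
    finished = set()  # files already fully explored without finding a cycle

    def visit(file_id):
        if file_id in on_path:
            # The chain has returned to a file it already contains.
            return path[path.index(file_id):] + [file_id]
        if file_id in finished:
            return None
        path.append(file_id)
        on_path.add(file_id)
        for target in inclusions.get(file_id, []):
            cycle = visit(target)
            if cycle:
                return cycle
        path.pop()
        on_path.discard(file_id)
        finished.add(file_id)
        return None

    return visit(start)

# Hypothetical inclusion graphs
circular = {"a": ["b"], "b": ["c"], "c": ["a"]}
acyclic = {"a": ["b", "c"], "b": ["c"], "c": []}
print(find_inclusion_cycle(circular, "a"))
print(find_inclusion_cycle(acyclic, "a"))
```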


The element IRI contains an Internationalized Resource Identifier that serves as a name for a concept, person, or thing referred to by the parent element. IRIs are explained at the section called “Identifiers and Their Use”.

Any kind of IRI is allowed: URLs, tag URNs, UUIDs, etc. For names of well-known resources, a URL identifier might be preferred (http://...), to facilitate linked data. If an entity/resource lacks a suitable URL-type name, you may use or coin any other valid IRI, such as a UUID, a tag URN, or an OID. Some concepts may be difficult to find IRIs for.

Sibling <IRI>s are to be treated as names for the same thing, not as names of different things. Nevertheless, they are not synonymous, only poecilonymic. In the terms of Web Ontology Language (http://www.w3.org/TR/owl-ref/), sibling <IRI>s cannot be assumed to share the relationship owl:sameAs, because they will draw from independent vocabularies that may define similar concepts differently.

An element defined with multiple <IRI>s is technically within the intersection, not the union, of those definitions. Nevertheless, most interpretations of TAN files will draw inferences based upon the union. That is, if item A is defined by IRI X, item B by IRIs X and Y, and item C by IRI Y, it is likely that users of the data will infer identity between items A and C. It is advisable to be cautious in assigning multiple IRIs to entities.
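The union-based inference described above can be made concrete with a small Python sketch. It groups items that share at least one IRI, directly or through a chain of other items. The function name and data layout are hypothetical, and the sketch models how users are likely to read the data, not how TAN itself defines identity.

```python
def infer_identity_groups(items):
    """Group items that share at least one IRI, directly or transitively.

    `items` maps an item name to the set of IRIs that define it.
    """
    groups = []  # list of (item names, IRIs); the IRI sets stay disjoint
    for name, iris in items.items():
        names, merged_iris = {name}, set(iris)
        remaining = []
        for group_names, group_iris in groups:
            if group_iris & merged_iris:
                # Shared IRI: fold the existing group into the new one.
                names |= group_names
                merged_iris |= group_iris
            else:
                remaining.append((group_names, group_iris))
        groups = remaining + [(names, merged_iris)]
    return [sorted(names) for names, _ in groups]

# Items A, B, and C from the paragraph above
print(infer_identity_groups({"A": {"X"}, "B": {"X", "Y"}, "C": {"Y"}}))
# Without item B, nothing links A and C
print(infer_identity_groups({"A": {"X"}, "C": {"Y"}}))
```

Item B, with its two IRIs, is what bridges A and C; this is why multiple IRIs on one entity deserve caution.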

The element is named IRI instead of URI to encourage internationalization. Alphabets other than the Latin are welcome.

Formal Definition
~ed-stamp?, anyURI (pattern [a-zA-Z][\-.+a-zA-Z0-9]+:\S+)

Used by: ~entity-digital-tan-other-ref, ~entity-digital-generic-ref, ~entity-nondigital-ref
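The pattern in the formal definition can be tried out with the following Python sketch. Schema patterns must match the entire value, so `fullmatch` is used. Python's `\S` is slightly broader in what it treats as whitespace than the schema's, so this is an approximation. It tests only the lexical shape of a value, not whether it is a valid or resolvable IRI.

```python
import re

# Pattern copied from the formal definition of <IRI>
IRI_PATTERN = re.compile(r"[a-zA-Z][\-.+a-zA-Z0-9]+:\S+")

def looks_like_tan_iri(value):
    """True if the whole of `value` matches the pattern for <IRI>."""
    return IRI_PATTERN.fullmatch(value) is not None

samples = [
    "tag:kalvesmaki.com,2014:tan-t:ar.cat.eng.1926.edghill:semantic-refs",
    "http://creativecommons.org/licenses/by/4.0/deed.en_US",
    "no scheme here",
    "x:abc",  # the scheme part needs at least two characters
]
for sample in samples:
    print(looks_like_tan_iri(sample), sample)
```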

[Caution]Caution

An IRI may appear no more than once in a TAN document.

[Caution]Caution

An IRI that names a TAN file must match that file's @id exactly.

[Caution]Caution

No file may import keys that have duplicate IRIs.

[Caution]Caution

All text must be normalized (Unicode NFC).

[Caution]Caution

Every item in a reserved TAN-key must have at least one IRI with a tag URN in the TAN namespace.


The element key specifies a tan:item from a TAN-key (predefined, or declared in a <key>) that defines the contents of an element that has @which.

Any number of <key>s may be supplied, but all <item>s must have unique names for the element indicated by @affects-element.

For more discussion, see the section called “Keyword Vocabulary (TAN-key)”.

Formal Definition
~ed-stamp?, 
   (~inclusion | (<comment>* & 
      
         (@href | (<IRI>, ~metadata-human, <checksum>*, <location>+))))

Used by: ~key-list

[Caution]Caution

No file may import keys that have duplicate IRIs.

[Caution]Caution

Every element with a <location> should have at least one document available.

[Caution]Caution

Every TAN file referred to by way of an element containing <location> should have an @id that matches the <IRI> of the parent of the <location>.

[Caution]Caution

No element may point to a TAN file that has an identical @id value; the only exception is a <see-also> pointing to an older or newer version.

[Important]Important

If @when-accessed predates one or more dates in a target file, a warning will be returned.

[Important]Important

If a target file does not explicitly give the <body>'s @in-progress the value of true(), a warning will be returned, stating that the target file is marked as being in progress.

[Important]Important

If a target file has a <see-also> marked as a new version (update) a warning will be returned.

[Caution]Caution

An element's @which must have a value that corresponds to a <name>, either in the core TAN keyword or an associated TAN-key file, that is marked as applying to that element.

[Caution]Caution

Keywords (values of @which) must be unique for a given element name.

[Caution]Caution

Any element that takes @which must have keywords defined for that element.

[Caution]Caution

Keys are integral parts of a document. Access to at least one version is absolutely mandatory.


The element location declares where an electronic file was found and when.

The URL may be absolute or relative to the current document.

Formal Definition
~ed-stamp?, @when-accessed, @href

Used by: ~entity-digital-tan-other-ref, ~entity-digital-generic-ref

[Caution]Caution

Every element with a <location> should have at least one document available.

[Caution]Caution

Every TAN file referred to by way of an element containing <location> should have an @id that matches the <IRI> of the parent of the <location>.

[Caution]Caution

No element may point to a TAN file that has an identical @id value; the only exception is a <see-also> pointing to an older or newer version.

[Important]Important

If @when-accessed predates one or more dates in a target file, a warning will be returned.

[Important]Important

If a target file does not explicitly give the <body>'s @in-progress the value of true(), a warning will be returned, stating that the target file is marked as being in progress.

[Important]Important

If a target file has a <see-also> marked as a new version (update) a warning will be returned.
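The @when-accessed warning described above amounts to a date comparison, sketched here in Python. The sketch assumes that every value is an ISO 8601 date-time with an explicit time zone offset, and that the dates found in the target file have already been collected into a list. The target dates shown are invented.

```python
from datetime import datetime

def dates_after_access(when_accessed, target_dates):
    """Return the target dates that are later than @when-accessed."""
    accessed = datetime.fromisoformat(when_accessed)
    return [d for d in target_dates if datetime.fromisoformat(d) > accessed]

when_accessed = "2016-07-07T16:36:28.867-04:00"
target_dates = [
    "2016-03-01T09:00:00-05:00",  # hypothetical earlier change
    "2016-08-15T12:00:00-04:00",  # hypothetical later change
]
newer = dates_after_access(when_accessed, target_dates)
if newer:
    print("warning: target changed after it was accessed:", newer)
```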



The element master-location points to a location where a master copy of the file is to be found. Use of this element entails a commitment to updating the TAN file in those locations. Also, if @in-progress is false, a <master-location> must be provided.

The URL may be absolute or relative to the current document.

<master-location> does not disallow the file from being kept, published, or distributed elsewhere. It merely points to the main locations where an authoritative version of the file is to be found.

Formal Definition
~ed-stamp?, @href

Used by: ~entity-digital-tan-self-ref

[Caution]Caution

Any TAN file marked as being no longer in progress must have at least one master-location.

[Caution]Caution

No <master-location> may have an @href that points to a compressed archive.





The element name provides a human-readable name of a concept, person, or thing referred to by the parent element (or the current document, if the parent element is <head>).

Formal Definition
~metadata-desc
[Caution]Caution

All text must be normalized (Unicode NFC).

[Caution]Caution

Names may not duplicate reserved TAN keywords for the affected element.

[Caution]Caution

Names may not be duplicated for the same element.

Example 8.43. <name>

<TAN-T id="tag:kalvesmaki.com,2014:tan-t:ar.cat.eng.1926.edghill:model-object-refs" TAN-version="1 dev">
   <head>
      <name>Categories, Aristotle, English translation by E. M. Edghill</name>
      <rights-excluding-sources rights-holder="kalvesmaki">
         <IRI>http://creativecommons.org/licenses/by/4.0/deed.en_US</IRI>
         <name>Creative Commons Attribution 4.0 International License</name>
         <desc>Exclusive of rights held and licenses offered by rightsholders of the source or
            sources listed below, this data file, insofar as it constitutes an independent work, is
            licensed under a Creative Commons Attribution 4.0 International License.</desc>
      </rights-excluding-sources>
      <source>
         <IRI>http://id.lib.harvard.edu/aleph/007901738/catalog</IRI>
         <name>Aristotle: Categoriae &amp; De interpretatione by E.M. Edghill. Analytica priora / by
            A.J. Jenkinson. Analytica posteriora / by G.R.G. Mure. Oxford : Clarendon Press, 1926.
         </name>
      </source>
      <see-also>
         .........
         <IRI>tag:kalvesmaki.com,2014:tan-t:ar.cat.grc.1949.minio-paluello:object-refs</IRI>
         <name>Categories, Aristotle, Greek text by Minio-Paluello</name>
         <location href="ar.cat.grc.1949.minio-paluello-obj.xml" when-accessed="2016-07-07T16:36:28.867-04:00"/>
      </see-also>
      <see-also>
         .........
         <IRI>tag:kalvesmaki.com,2014:tan-t:ar.cat.eng.1926.edghill:semantic-refs</IRI>
         <name>Categories, Aristotle, English translation by E. M. Edghill</name>
         <location href="ar.cat.eng.1926.edghill.sem.xml" when-accessed="2016-07-07T16:36:28.867-04:00"/>
      </see-also>
      .........
   </head>
   .........
</TAN-T>


The element relationship specifies the role that the item named by the parent <see-also> played. This may be either a reserved keyword or an IRI + name pattern that identifies a specific kind of relationship.

See main.xml#keywords-relationship for standardized vocabulary.

Formal Definition
~ed-stamp?, (~inclusion | 
   
      ((<IRI>+, ~metadata-human) | @which))

Used by: ~see-also-item

[Caution]Caution

Any <see-also> whose <relationship> is defined as requiring a target TAN file must point to a file whose root element is a TAN file.

[Caution]Caution

Any <see-also> whose <relationship> is defined as requiring a target TAN-c file must point to a TAN file whose root element is <TAN-c>.

[Caution]Caution

Any <see-also> whose <relationship> is defined as requiring a target copy must point to a TAN file whose root element is identical.

[Caution]Caution

<see-also> may have the <relationship> of a different work version only if both are class 1 files and both share the same work.

[Caution]Caution

In class 1 files, alternative editions must share the same source.

[Caution]Caution

In class 1 files, alternative editions must share the same work.

[Caution]Caution

In class 1 files, alternative editions must share the same work-version, if supplied.

[Caution]Caution

In class 1 files, resegmented copies must have identical transcriptions, after TAN normalization.

[Caution]Caution

A class 1 file and its model must have the same work.

[Caution]Caution

A class 1 file may have no more than one model.

[Important]Important

If a class 1 file diverges from the structure of its model a warning will be generated specifying where differences exist.



The element rights-excluding-sources states the license under which the data is distributed and the rights associated with it, EXCLUSIVE of any rights attached to the source.

Diligently check to ensure that the license you have claimed respects the rights of your sources' rightsholders. It is recommended that you license your data under a license that is similar to or more liberal than the one under which your sources have been released.

For more discussion, see the section called “Rights and Licenses”, and for a list of standard vocabulary, main.xml#keywords-rights-excluding-sources.

Formal Definition
~ed-stamp?, 
   (~inclusion | 
      (@rights-holder, (<comment>* & 
         
            ((<IRI>+, ~metadata-human) | @which))))

Used by: ~TAN-head





The element role specifies a role (responsibility, task, or activity) that one or more <agent>s played in creating or editing the data.

A role may be any activity, e.g., editor, funder, supervisor, data-processor, peer reviewer, patron, defined through the enclosed IRI + name pattern.

Formal Definition
~ed-stamp?, 
   (~inclusion | 
      (@xml:id, (<comment>* & 
         
            ((<IRI>+, ~metadata-human) | @which))))

Used by: ~TAN-head





The element see-also identifies auxiliary entities that were materially helpful in creating or editing the data, or are helpful in understanding the data.

This element is especially useful for crediting third parties who provided a set of raw data that served as a starting point, or was consulted.

Formal Definition
~ed-stamp?, 
   (~inclusion | 
      (<comment>* & 
         (<relationship>, (
            
               ((<IRI>+, ~metadata-human) | @which) | 
            
               ((<IRI>+, ~metadata-human, <checksum>*, <location>+) | @which) | 
            
               (@href | (<IRI>, ~metadata-human, <checksum>*, <location>+))))))

Used by: ~see-also-list

[Caution]Caution

Any <see-also> whose <relationship> is defined as requiring a target TAN file must point to a file whose root element is a TAN file.

[Caution]Caution

Any <see-also> whose <relationship> is defined as requiring a target TAN-c file must point to a TAN file whose root element is <TAN-c>.

[Caution]Caution

Any <see-also> whose <relationship> is defined as requiring a target copy must point to a TAN file whose root element is identical.

[Caution]Caution

<see-also> may have the <relationship> of a different work version only if both are class 1 files and both share the same work.

[Caution]Caution

In class 1 files, alternative editions must share the same source.

[Caution]Caution

In class 1 files, alternative editions must share the same work.

[Caution]Caution

In class 1 files, alternative editions must share the same work-version, if supplied.

[Caution]Caution

In class 1 files, resegmented copies must have identical transcriptions, after TAN normalization.

[Caution]Caution

A class 1 file and its model must have the same work.

[Caution]Caution

A class 1 file may have no more than one model.

[Important]Important

If a class 1 file diverges from the structure of its model a warning will be generated specifying where differences exist.

[Caution]Caution

Every element with a <location> should have at least one document available.

[Caution]Caution

Every TAN file referred to by way of an element containing <location> should have an @id that matches the <IRI> of the parent of the <location>.

[Caution]Caution

No element may point to a TAN file that has an identical @id value; the only exception is a <see-also> pointing to an older or newer version.

[Important]Important

If @when-accessed predates one or more dates in a target file, a warning will be returned.

[Important]Important

If a target file does not explicitly give the <body>'s @in-progress the value of true(), a warning will be returned, stating that the target file is marked as being in progress.

[Important]Important

If a target file has a <see-also> marked as a new version (update) a warning will be returned.



The element source identifies the source upon which the data in the <body> of the current file depends.

TAN-T and TAN-LM allow only one <source>. TAN-A-tok allows exactly two. All other TAN formats require one or more.

Formal Definition
~ed-stamp?, 
   (~inclusion | 
      (
{[TAN-A-div (~source-id-opt):]   @xml:id} OR 

{[TAN-A-tok (~source-id-opt):]   @xml:id} OR 

{[TAN-class-3 (~source-id-opt):]   @xml:id?} OR 

{[TAN-core (~source-id-opt):]   {empty}}, 
         (<comment>* & 
            ((
               
                  ((<IRI>+, ~metadata-human) | @which) | 
               
                  ((<IRI>+, ~metadata-human, <checksum>*, <location>+) | @which) | 
               
                  (@href | (<IRI>, ~metadata-human, <checksum>*, <location>+))), 
{[TAN-class-2 (~source-rights):]   {empty}} OR 

{[TAN-core (~source-rights):]   <rights-source-only>}?))))

Used by: ~source-list

[Caution]Caution

Every element with a <location> should have at least one document available.

[Caution]Caution

Every TAN file referred to by way of an element containing <location> should have an @id that matches the <IRI> of the parent of the <location>.

[Caution]Caution

No element may point to a TAN file that has an identical @id value; the only exception is a <see-also> pointing to an older or newer version.

[Important]Important

If @when-accessed predates one or more dates in a target file, a warning will be returned.

[Important]Important

If a target file does not explicitly give the <body>'s @in-progress the value of true(), a warning will be returned, stating that the target file is marked as being in progress.

[Important]Important

If a target file has a <see-also> marked as a new version (update) a warning will be returned.

[Caution]Caution

Sources are integral parts of a class 2 TAN file. Access to at least one copy is absolutely mandatory.





The element token-definition takes a regular expression to define a word token. This element will be used to segment a string into token and non-token components.

This element takes attributes that function as the parameters for the XSLT instruction xsl:analyze-string (see https://www.w3.org/TR/xslt-30/#element-analyze-string).

For more, see the section called “Defining Words and Tokens”.

Formal Definition
~ed-stamp?, 
   (~inclusion | 
      (
{[TAN-class-2 (~source-refs):]   @src} OR 

{[TAN-core (~source-refs):]   {empty}} OR 

{[TAN-LM-core (~source-refs):]   {empty}}, 
         (@which | (@regex, @flags?))))

Used by: ~declaration-items, ~decl-class-1, ~entity-tok-def

[Caution]Caution

No source may be given more than one token definition.
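The segmentation that a token definition performs can be approximated in Python. The sketch below mimics the general behavior of xsl:analyze-string with `re.finditer`, labeling each piece of the string as a token or a non-token. The pattern `\w+` and the labels are assumptions made for illustration, not TAN's actual defaults, and the `flags` argument takes Python flag constants instead of an XSLT flags string.

```python
import re

def analyze_string(text, regex, flags=0):
    """Split `text` into ('tok', s) and ('non-tok', s) pieces, in order."""
    pieces = []
    position = 0
    for match in re.finditer(regex, text, flags):
        if match.start() == match.end():
            continue  # ignore zero-length matches
        if match.start() > position:
            pieces.append(("non-tok", text[position:match.start()]))
        pieces.append(("tok", match.group()))
        position = match.end()
    if position < len(text):
        pieces.append(("non-tok", text[position:]))
    return pieces

for kind, piece in analyze_string("Ring-a-ring o' roses", r"\w+"):
    print(kind, repr(piece))
```

Because every character lands in exactly one piece, concatenating the pieces reproduces the original string.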


[Note]Note

Taken from ringoroses.div.1.xml




The element version identifies the version of a work. Applicable to sources that contain multiple versions, e.g., original text and facing translations. Like <work>, <version> points to a conceptual entity, not a physical one.

In the context of a class 1 file, the entity identified by <version> is assumed to be a version of the entity defined in <work>. In TAN-c files, however, no relationship is assumed between <version> and any putative work, unless explicitly stated in that file.

Very few work-versions have their own URN names. It is advisable to assign a tag URN or a UUID. If you have used an IRI for <work> that you are entitled to modify, you may wish to add a suffix that will name the version. If you need to specify exactly where on a text-bearing object a version appears, <desc> or <comment> should be used.

For more, see the section called “One work”

Formal Definition
~decl-pattern-default

Used by: ~TAN-c-decl-core, ~decl-class-1


The element when constrains an event to a period of time.

Multiple values of <when> are interpreted to mean "or" with union. No distribution takes place (e.g., x <when/> with y <when/> means "at time x or y", not "at time x" and "at time y").
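
The "or" reading can be sketched in Python as a membership test against a union of periods. The function name within_any, the sample dates, and the choice to treat both ends of a period as inclusive are assumptions made for illustration only.

```python
from datetime import date

def within_any(day, periods):
    """True if day falls inside at least one (start, end) period.

    Both bounds are treated as inclusive, which is an assumption of this sketch.
    """
    return any(start <= day <= end for start, end in periods)

# Two hypothetical <when> values, x and y, read as "at time x or y".
periods = [
    (date(1200, 1, 1), date(1250, 12, 31)),
    (date(1400, 1, 1), date(1450, 12, 31)),
]

print(within_any(date(1225, 6, 1), periods))
print(within_any(date(1300, 1, 1), periods))
```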

Formal Definition
@from, @to

Used by: ~claim, ~agent-role-list

The element work indicates a creative work. The element identifies a conceptual entity, not a physical one.

The term "work" is only loosely defined in TAN. Any text that has enough unity to be referred to in ordinary conversation as a single entity may be identified as a work. A work may be composed of other works, be a part of other works, or even overlap with other works. E.g., the Lord's Prayer, the Gospel of Luke, the Tetraevangelion, the New Testament, and the Bible are all valid works, despite the complex relationship between each of them.

This element takes the IRI + name pattern. For more, see the section called “One work”

Formal Definition
~decl-pattern-default

Used by: ~TAN-c-decl-core, ~decl-class-1

[Caution]Caution

A work element may invoke no more than one inclusion.





The attribute affects-element names one or more TAN elements that the keywords apply to

Formal Definition

Used by: ~other-body-attributes, ~group-attributes, ~TAN-key-item

[Caution]Caution

@affects-element must include only names of TAN elements that accept @which


[Note]Note

Taken from ar.cat.TAN-key.xml



[Note]Note

Taken from div-types.TAN-key.xml


[Note]Note

Taken from features.TAN-key.xml

The attribute cert2 provides a second measure of certainty. The value is taken along with @cert as the range in which an editor's certainty resides.

Formal Definition
double (pattern 1|0|(0\.\d*[1-9]))

Used by: ~cert-claim
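
The pattern above can be tried out with a short Python sketch. XML Schema patterns are implicitly anchored, so the sketch uses fullmatch. The helper names is_valid_cert and cert_range are hypothetical, and ordering the two values as (low, high) is an assumption about how the range might be read, not something stated in the definition.

```python
import re

# Same pattern as in the formal definition; re.ASCII keeps \d to 0-9.
CERT = re.compile(r"1|0|(0\.\d*[1-9])", re.ASCII)

def is_valid_cert(value):
    """True if the whole string matches the certainty pattern."""
    return CERT.fullmatch(value) is not None

def cert_range(cert, cert2):
    """Return the two certainty values as a (low, high) pair."""
    if not (is_valid_cert(cert) and is_valid_cert(cert2)):
        raise ValueError("certainty values must match the pattern")
    a, b = float(cert), float(cert2)
    return (min(a, b), max(a, b))

print(is_valid_cert("0.5"))   # matches
print(is_valid_cert("0.50"))  # trailing zero is rejected by the pattern
print(cert_range("0.8", "0.5"))
```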

The attribute ed-when marks the date or time when an element or its content was edited (added or modified)

The value of @ed-when must always conform to an ISO date or dateTime pattern. See the section called “Dates and times”.

Along with @ed-who, this forms the Edit Stamp pattern. See the section called “Edit Stamp”

This attribute is inheritable. See the section called “Interpretation of inheritable attributes”

Formal Definition
(
   dateTime 
   date )

Used by: ~ed-stamp

[Caution]Caution

Date attributes must be castable either as xs:dateTime or xs:date

[Caution]Caution

Future dates are not permitted.




The attribute ed-who refers to one or more <agent>s who have edited (added or modified) an element or its content.

Along with @ed-when, this forms the Edit Stamp pattern. See the section called “Edit Stamp”

This attribute is inheritable. See the section called “Interpretation of inheritable attributes”

Formal Definition

Used by: ~ed-stamp

[Caution]Caution

Every idref in an attribute must point to the @xml:id value of the appropriate corresponding element.

[Caution]Caution

All idrefs in an attribute must be unique.
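
The two idref rules above can be sketched as a small Python check. The function name check_idrefs, the message wording, and the sample ids are hypothetical; known_ids stands for the @xml:id values of whichever elements are appropriate for the attribute being checked (for @ed-who, the <agent>s).

```python
def check_idrefs(attr_value, known_ids):
    """Return a list of problems found in a space-separated idrefs value."""
    problems = []
    seen = set()
    for ref in attr_value.split():
        if ref in seen:
            # Rule: all idrefs in an attribute must be unique.
            problems.append(f"duplicate idref: {ref}")
        seen.add(ref)
        if ref not in known_ids:
            # Rule: every idref must point to an existing @xml:id.
            problems.append(f"no element with xml:id = {ref}")
    return problems

agent_ids = {"ed1", "ed2"}  # hypothetical @xml:id values of <agent> elements
print(check_idrefs("ed1 ed2", agent_ids))
print(check_idrefs("ed1 ed1 ed3", agent_ids))
```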




The attribute from specifies the beginning of a period of time

Formal Definition
(
   dateTime 
   date )

Used by: <when>

[Caution]Caution

Date attributes must be castable either as xs:dateTime or xs:date

[Caution]Caution

Future dates are not permitted.

[Caution]Caution

@from must predate @to
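
The three date rules above can be sketched in Python as follows. This is an approximation only: datetime.fromisoformat is not identical to casting as xs:date or xs:dateTime, and treating values without a time zone as UTC is an assumption of the sketch. The names parse_iso and check_period are hypothetical.

```python
from datetime import datetime, timezone

def parse_iso(value):
    """Parse an ISO date or dateTime string; raises ValueError if it cannot."""
    dt = datetime.fromisoformat(value)
    if dt.tzinfo is None:
        # Assumption: values without a time zone are read as UTC.
        dt = dt.replace(tzinfo=timezone.utc)
    return dt

def check_period(from_value, to_value, now=None):
    """Return a list of rule violations for a @from / @to pair."""
    now = now or datetime.now(timezone.utc)
    start, end = parse_iso(from_value), parse_iso(to_value)
    problems = []
    for label, dt in (("@from", start), ("@to", end)):
        if dt > now:
            problems.append(f"{label} is in the future")
    if not start < end:
        problems.append("@from must predate @to")
    return problems

print(check_period("2014-01-01", "2015-06-30T12:00:00"))
print(check_period("2015-01-01", "2014-01-01"))
```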

The attribute href points to the location of a file. In some contexts, this attribute is allowed only as a temporary measure, to invoke editing assistance by means of Schematron Quick Fixes.

Formal Definition

Used by: ~entity-digital-tan-other-ref, ~loc-self, ~loc-src

[Caution]Caution

@href must have <location> or <master-location> as a parent; any other parent will trigger a quick fix to populate the element with the IRI + name pattern of the target file.

[Important]Important

If fn:doc-available() for an @href returns false, the following message will be returned. @href points to file that is either (1) not available, (2) not valid XML, or (3) at a server not trusted by the validation engine.

[Caution]Caution

No <master-location> may have an @href that points to a compressed archive.



The attribute id contains a tag URN that permanently and uniquely names the current file, the so-called IRI Name of the current file. See the section called “@id and a TAN file's IRI Name” for discussion.

For more on the syntax of tag URNs see the section called “Tag URNs”

Formal Definition
anyURI (pattern tag:([\-a-zA-Z0-9._%+]+@)?[\-a-zA-Z0-9.]+\.[A-Za-z]{2,4},\d{4}(-(0\d|1[0-2]))?(-([0-2]\d|3[01]))?:\S+)

Used by: <TAN-A-div>, <TAN-A-tok>, <TAN-c>, <TAN-key>, <TAN-mor>, <TAN-T>, <TAN-LM>
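
To get a feel for what the pattern above accepts, it can be tried out in Python. The sample values use the reserved example.com and example.org domains and are invented for illustration; the function name is_tag_urn is hypothetical. Python's regular expressions are close to, but not the same as, XML Schema's, so this is a sketch and not a substitute for schema validation.

```python
import re

# The pattern from the formal definition, split over two lines for readability.
TAG_URN = re.compile(
    r"tag:([\-a-zA-Z0-9._%+]+@)?[\-a-zA-Z0-9.]+\.[A-Za-z]{2,4},"
    r"\d{4}(-(0\d|1[0-2]))?(-([0-2]\d|3[01]))?:\S+",
    re.ASCII,
)

def is_tag_urn(value):
    """True if the whole value matches the tag URN pattern."""
    return TAG_URN.fullmatch(value) is not None

for candidate in (
    "tag:example.com,2015:my-text",                    # domain, year, name
    "tag:editor@example.org,2014-03-15:texts/sample",  # email, full date, name
    "example.com,2015:my-text",                        # missing the tag: prefix
    "tag:example.com,2015:",                           # missing the specific part
):
    print(candidate, is_tag_urn(candidate))
```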

[Caution]Caution

Every TAN file must have a primary agent, the organization or person that takes the greatest responsibility for the content of the TAN file. The primary agent is defined as the first <agent> with an <IRI> that is a tag URI whose namespace matches the namespace of @id in the root element.





The attribute in-progress specifies whether the editors of the current file have yet to finish supplying the data, intend to make important changes, or otherwise wish to reserve the right to make major changes.

This attribute does not claim that the data is perfect or that it will not be changed. Rather, it signals to users, especially those who would use the file as the object of a <source>, <see-also>, or <inclusion>, the possibility of major work that may render dependent data wrong or invalid.

Formal Definition
boolean 

Used by: ~TAN-body

[Caution]Caution

Any TAN file marked as being no longer in progress must have at least one master-location.





The attribute include signals that the parent element is to be replaced by all elements of the same name found in the referred <inclusion>.

Formal Definition
IDREFS 

Used by: ~inclusion

[Caution]Caution

Every idref in an attribute must point to the @xml:id value of the appropriate corresponding element.

[Caution]Caution

All idrefs in an attribute must be unique.

[Caution]Caution

Inclusions may not introduce duplicate values of @xml:id.

[Caution]Caution

For any element with @include, at least one element of the same name must be found in the target inclusion document.

[Caution]Caution

Inclusions may not be circular.

[Caution]Caution

Inclusions are integral parts of any TAN file. Access to at least one copy is absolutely mandatory.

[Caution]Caution

Every inclusion should have at least one document available.

[Caution]Caution

A work element may invoke no more than one inclusion.

[Caution]Caution

Every <feature> inclusion must support every language that has been declared.


The attribute n names a <div> or <group>.

In a <div> of a class 1 file, the space-delimited concatenation of values of @n from the rootmost ancestor becomes the reference for a <div>.

Special use may be made of the tilde (~), as a kind of surrogate hyphen (which is disallowed), to indicate an @n that corresponds to a range of values, e.g., n="7~8" for a <div> that has text that mixes text from 7 and 8.
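
The way a <div> reference is built from its ancestors' @n values can be sketched in Python. The sample fragment is invented and simplified: it carries no namespace and omits every attribute other than @n, which a real class 1 file would not. The function name leaf_refs is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified fragment: two levels of <div>, @n only.
SAMPLE = """
<body>
  <div n="1">
    <div n="1">first leaf</div>
    <div n="2">second leaf</div>
  </div>
  <div n="2">
    <div n="1">third leaf</div>
  </div>
</body>
"""

def leaf_refs(parent, prefix=()):
    """Yield the space-delimited reference of every leaf <div> under parent."""
    for div in parent.findall("div"):
        path = prefix + (div.get("n"),)
        if div.find("div") is None:
            # A <div> with no child <div> is a leaf.
            yield " ".join(path)
        else:
            yield from leaf_refs(div, path)

refs = list(leaf_refs(ET.fromstring(SAMPLE)))
print(refs)
print("unique:", len(refs) == len(set(refs)))
```

Comparing the length of the list with the length of its set is one simple way to spot duplicate leaf references.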

Formal Definition
string (pattern (\w|\d+-\d+)+(\s+(\w|\d+-\d+)+)*)

Used by: ~text-div, ~group-attributes

[Caution]Caution

No single set of references may mix Roman numerals, alphabetic numerals, and numerals that are ambiguously either.

[Caution]Caution

Leaf div references must be unique.

[Caution]Caution

To avoid ambiguous numerals, no div type should mix Roman and alphabetic numerals.


The attribute regex specifies a regular expression pattern to be searched for or matched. TAN regular expressions include an extended syntax, most notably the special escape character \k{}.

For more see the section called “Regular Expressions” and https://www.w3.org/TR/xpath-functions-30/#regex-syntax

Formal Definition

Used by: ~func-replace, ~decl-tok-def

[Caution]Caution

Attributes that take a regular expression must use escape sequences recognized by XML schema or TAN escape extensions (\k{}). See http://www.w3.org/TR/xmlschema-2/#regexs for details.


[Note]Note

Taken from ringoroses.div.1.xml




The attribute rights-holder specifies one or more <agent>s who hold the rights over the material specified by the parent element (either the data of the current file, or of the source that forms the basis for the data).

Nothing should be inferred from a missing @rights-holder from <rights-source-only>. Its absence does not mean that the rightsholder is unknown or nonexistent. For more, see the section called “Rights and Licenses”

Formal Definition

Used by: ~nonsource-rights, ~source-rights





The attribute roles refers to the ID of one or more <role>s

Formal Definition

Used by: ~agent-list, ~agent-role-list

[Caution]Caution

Every idref in an attribute must point to the @xml:id value of the appropriate corresponding element.

[Caution]Caution

All idrefs in an attribute must be unique.





The attribute to specifies the end of a period of time

Formal Definition
(
   dateTime 
   date )

Used by: <when>

[Caution]Caution

Date attributes must be castable either as xs:dateTime or xs:date

[Caution]Caution

Future dates are not permitted.

[Caution]Caution

@from must predate @to

The attribute when-accessed specifies when an electronic file was last examined

Formal Definition
(
   date 
   dateTime )

Used by: <location>

[Caution]Caution

Date attributes must be castable either as xs:dateTime or xs:date

[Caution]Caution

Future dates are not permitted.



The attribute which points to a reserved keyword, either a reserved tokenization pattern or a relationship.

Formal Definition

Used by: ~decl-tok-def, ~entity-digital-generic-ref, ~entity-nondigital-ref, ~metadata-human, ~decl-morph

[Caution]Caution

An element's @which must have a value that corresponds to a <name>, either among the core TAN keywords or in an associated TAN-key file, that is marked as applying to that element.

[Caution]Caution

Keywords (values of @which) must be unique for a given element name.

[Caution]Caution

Any element that takes @which must have keywords defined for that element.

[Caution]Caution

Keys are integral parts of a document. Access to at least one version is absolutely mandatory.