<tok>

The <tok> element identifies one or more words or word fragments. It is used by class 2 files to make assertions about specific words.

In TAN-A-div and TAN-A-tok files, <tok> has no linguistic connotations; in TAN-A-lm, it normally does.

Normally, each <tok> is restricted to a single token, or a portion of a single token. Multiple values in @src, @ref, and @pos result in expansion across all values. Multiple values of @chars, however, are taken to refer to the constituent parts of a single <tok>, so no expansion occurs on @chars.

This syntax allows multiple <tok>s to be collapsed into a single one, to save space and perhaps enhance legibility. For example, a <tok> with 2 values for @src, 3 for @ref, 4 for @pos, and 5 for @chars points to 24 tokens (2 × 3 × 4), each of which is filtered to the same five characters (by position, not content). Put another way, the pair <tok src="X" ref="a" pos="1"/> and <tok src="X" ref="a" pos="2"/> is always equivalent to the single <tok src="X" ref="a" pos="1-2"/>.
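The expansion rule described above can be illustrated with a minimal Python sketch. It is not part of TAN or its validation code. The function name expand_tok and the dictionary layout are hypothetical, and the sketch assumes the attribute values have already been parsed into lists. It shows only that @src, @ref, and @pos multiply out, while @chars is carried unchanged onto every resulting token.

```python
from itertools import product


def expand_tok(src, ref, pos, chars=None):
    """Expand one collapsed <tok> into one entry per (src, ref, pos) combination.

    Hypothetical helper for illustration only. The chars values are not
    expanded: the same set of character positions is attached to every entry.
    """
    chars = tuple(chars) if chars else None
    return [
        {"src": s, "ref": r, "pos": p, "chars": chars}
        for s, r, p in product(src, ref, pos)
    ]


# 2 values for src, 3 for ref, 4 for pos, 5 for chars
expanded = expand_tok(
    src=["X", "Y"],
    ref=["a", "b", "c"],
    pos=[1, 2, 3, 4],
    chars=[1, 2, 3, 4, 5],
)
print(len(expanded))  # 24
```

Under the same assumptions, expand_tok(["X"], ["a"], [1, 2]) yields the same entries as expanding pos 1 and pos 2 separately, which mirrors the equivalence between the two single-position <tok>s and the one with pos="1-2".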

If you wish to treat multiple word fragments as a single token, use <group>.

Formal Definition

~certainty-stamp?, @val~ed-stamp?, 
{[TAN-A-div (~tok-sources-ref-opt):]   {empty}} OR 

{[TAN-class-2 (~tok-sources-ref-opt):]   
  {{[TAN-A-lm (~sources-ref):]   {empty}}} OR 

  {{[TAN-class-2 (~sources-ref):]   @src}} OR 

  {{[TAN-core (~sources-ref):]   {empty}}}}, @ref, 
   
      (@val | @pos | (@val, @pos)), 
{[TAN-A-div (~tok-cert-opt):]   {empty}} OR 

{[TAN-class-2 (~tok-cert-opt):]   
   
      (@cert | (@cert, @cert2))?}, @chars?~ed-stamp?, @ref, (
   
      (@val | @pos | (@val, @pos)) | ~tok-range-selector)

Defined at: TAN-A-lm.rng, TAN-class-2.rng

Used by: ~lm-tok-ref, ~alt-reassign, ~tok-ref, ~tok-ref-group

Caution

Every token must be locatable in every cited ref in every source.

Caution

<tok> must reference a leaf <div>.

Caution

Every character must be locatable in every token in every ref in every source.

Important

No <tok> should duplicate any sibling <tok>.

Caution

In a ranged <tok> in a <reassign>, the token referred to by <from> must precede the one referred to by <to>.

Example 8.214. <tok>

<TAN-A-div TAN-version="2018" id="tag:parkj@textalign.net,2015:ar.cat.tan-a-div:claims">
   .........
   <body claimant="lmp">
      .........
      <claim subject="dexippus porphyry">
         <claim subject="andronicus boethus" adverb="perhaps" verb="omits">
            <object src="grc">
               <tok ref="1 a 2" pos="3-4"/>
            </object>
         </claim>
      </claim>
      <claim subject="herminus comm-omnes" verb="agrees">
         <locus src="grc">
            <tok ref="1 a 2" pos="3-4"/>
         </locus>
      </claim>
      .........
      <claim subject="B" verb="replaces">
         <locus src="grc">
            <tok ref="1 a 5" pos="1-2"/>
         </locus>
         .........
      </claim>
      <claim subject="Λ" adverb="perhaps" verb="replaces">
         <locus src="grc">
            <tok ref="1 a 5" pos="1-2"/>
         </locus>
         .........
      </claim>
      <claim subject="π α φ ο" verb="agrees">
         <locus src="grc">
            <tok ref="1 a 5" pos="1-2"/>
         </locus>
      </claim>
      .........
   </body>
</TAN-A-div>


Note

Taken from ar.cat.tan-a-div.claims