<token-definition>

The <token-definition> element uses a regular expression to define what counts as a word token. It is used to segment a string into token and non-token components.

The attributes @pattern and @flags serve as the regex and flags parameters of the XSLT instruction xsl:analyze-string (see https://www.w3.org/TR/xslt-30/#element-analyze-string).

For more, see the section called “Defining Words and Tokens”.
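The segmentation described above can be sketched outside XSLT. The following is a minimal illustration in Python, not the TAN implementation: the function name segment and the labels "tok" and "non-tok" are hypothetical, and Python's re module stands in for the XPath regular-expression dialect, which differs in some details (for example, in exactly which characters \w matches). Only the flags i, m, and s are mapped here.

```python
import re

# Assumed mapping from XPath-style flag letters to Python's re flags.
FLAG_MAP = {"i": re.IGNORECASE, "m": re.MULTILINE, "s": re.DOTALL}


def segment(text, pattern, flags=""):
    """Split text into (kind, substring) pairs, where kind is "tok" or "non-tok".

    Matches of the pattern become tokens; everything between matches
    becomes a non-token, mirroring the matching and non-matching
    substrings of xsl:analyze-string.
    """
    re_flags = 0
    for letter in flags:
        if letter not in FLAG_MAP:
            raise ValueError(f"flag not handled in this sketch: {letter!r}")
        re_flags |= FLAG_MAP[letter]

    parts = []
    pos = 0
    for match in re.finditer(pattern, text, re_flags):
        if match.start() > pos:
            # Text between the previous match and this one is a non-token.
            parts.append(("non-tok", text[pos:match.start()]))
        if match.group():
            # Zero-length matches are skipped.
            parts.append(("tok", match.group()))
        pos = match.end()
    if pos < len(text):
        # Trailing text after the last match is a non-token.
        parts.append(("non-tok", text[pos:]))
    return parts


if __name__ == "__main__":
    for kind, value in segment("Ring-a-ring o' roses,", r"[-\w]+"):
        print(kind, repr(value))
```

With the pattern [-\w]+, the hyphenated "Ring-a-ring" is kept as one token, while the apostrophe, spaces, and comma fall into non-token components.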

Formal Definition

~ed-stamp?,
   (~inclusion |
      (
         {[TAN-A-lm (~sources-ref):]   {empty}} OR
         {[TAN-class-2 (~sources-ref):]   @src} OR
         {[TAN-core (~sources-ref):]   {empty}},
         (@which | (@pattern, @flags?))))

Defined at: TAN-core.rng

Used by: ~defn-class-1, ~definition-class-2, ~entity-tok-def
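The attribute alternatives in the formal definition, (@which | (@pattern, @flags?)), can be read as a simple rule: an element carries either @which alone, or @pattern with an optional @flags. The Python sketch below illustrates that reading only. The function name check_attributes is hypothetical, the ~ed-stamp and ~inclusion parts of the definition are deliberately ignored, and the RELAX NG schema remains the authority on validity.

```python
import xml.etree.ElementTree as ET


def check_attributes(xml_fragment):
    """Return a list of problems found in one <token-definition> element.

    Only the (@which | (@pattern, @flags?)) part of the formal
    definition is checked; ~ed-stamp and ~inclusion are not considered.
    """
    element = ET.fromstring(xml_fragment)
    has_which = "which" in element.attrib
    has_pattern = "pattern" in element.attrib
    has_flags = "flags" in element.attrib

    problems = []
    if has_which and (has_pattern or has_flags):
        problems.append("@which may not be combined with @pattern or @flags")
    if not has_which and not has_pattern:
        problems.append("either @which or @pattern is required")
    return problems


if __name__ == "__main__":
    print(check_attributes('<token-definition src="eng-us" pattern="[-\\w]+"/>'))
    print(check_attributes('<token-definition which="letters" flags="i"/>'))
```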

Caution

No source may be given more than one token definition.
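One way to check the rule in this caution is to count how often each source identifier appears across the @src attributes of all <token-definition> elements. The Python sketch below assumes that @src holds a space-separated list of identifiers, as in src="ring1881 ring1987". The function name is hypothetical, and definitions without @src are not considered.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def sources_with_multiple_definitions(definitions_xml):
    """Return the sorted source identifiers named by more than one <token-definition>."""
    root = ET.fromstring(definitions_xml)
    counts = Counter()
    for element in root.iter():
        # Compare local names so that namespaced documents also work.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name != "token-definition":
            continue
        # Assumption: @src is a space-separated list of source identifiers.
        for source_id in element.get("src", "").split():
            counts[source_id] += 1
    return sorted(source for source, n in counts.items() if n > 1)


if __name__ == "__main__":
    sample = """
    <definitions>
       <token-definition src="ring1881 ring1987" which="letters"/>
       <token-definition src="ring1987" pattern="[-\\w]+"/>
    </definitions>
    """
    print(sources_with_multiple_definitions(sample))
```

In the sample, ring1987 is named by two token definitions and is therefore reported, while ring1881 is not.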

Example 8.215. <token-definition>

      <definitions>
         <comment when="2016-02-22-05:00" who="park">The following token definition treats the
                following as words: sequences of letters, any individual character that is neither a
                letter nor a space (i.e., punctuation).</comment>
         <token-definition src="eng-us" pattern="[-\w]+"/>
         <person xml:id="park">
            .........
         </person>
         .........
      </definitions>


Note

Taken from ringoroses.div.1

Example 8.216. <token-definition>

      <definitions>
         <token-definition pattern="[\w­​‍]+"/>
         <lexicon xml:id="LSJ">
            .........
         </lexicon>
         .........
      </definitions>


Note

Taken from ar.cat.grc.1949.minio-paluello-sem-TAN-LM-sample

Example 8.217. <token-definition>

      <definitions>
         .........
         <lexicon xml:id="english">
            .........
         </lexicon>
         <token-definition which="letters and punctuation"/>
         <person xml:id="park">
            .........
         </person>
         .........
      </definitions>


Note

Taken from ring-o-roses.eng.1881.lm

Example 8.218. <token-definition>

      <definitions>
         .........
         <reuse-type xml:id="adaptation">
            .........
         </reuse-type>
         <token-definition src="ring1881 ring1987" which="letters"/>
         <person xml:id="park">
            .........
         </person>
         .........
      </definitions>


Note

Taken from ringoroses.01+02.token.1