The element <tok> identifies one or more words or word fragments. It is used by class 2 files to make assertions about specific words.
In TAN-A-div and TAN-A-tok files, <tok> has no linguistic connotations; in TAN-A-lm, it normally does.
In its normal behavior, a <tok> is restricted to a single token, or a portion of a single token. Multiple values in @src, @ref, and @pos result in expansion across all values. Multiple values of @chars, however, are taken to refer to the constituent parts of a single <tok>, so no expansion occurs on @chars.
This syntax allows multiple <tok>s to be collapsed into a single one, to save space and perhaps enhance legibility. For example, a <tok> with 2 values for @src, 3 for @ref, 4 for @pos, and 5 for @chars will result in a <tok> that points to 24 tokens (2 × 3 × 4), each of which is filtered to the same five characters (by position, not content). Put another way, <tok src="X" ref="a" pos="1"/> and <tok src="X" ref="a" pos="2"/> taken together are always equivalent to <tok src="X" ref="a" pos="1-2"/>.
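
The expansion rule can be illustrated with a short Python sketch. It is not part of TAN or its validation code: the function names `parse_values` and `expand_tok` are hypothetical, and the value syntax is assumed to be positive integers separated by commas, with hyphens marking inclusive ranges (only the hyphenated form `1-2` appears in the text above). The values of @src and @ref are passed as ready-made lists so that no assumption is made about how they are delimited.

```python
from itertools import product


def parse_values(spec):
    """Split a spec such as "1-2, 4" into individual integers.

    Assumption: comma-separated positive integers, hyphen = inclusive range.
    Keywords and any other syntax are not handled in this sketch.
    """
    values = []
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            start, end = (int(x) for x in part.split("-"))
            values.extend(range(start, end + 1))
        else:
            values.append(int(part))
    return values


def expand_tok(srcs, refs, pos, chars=None):
    """Return one (src, ref, pos, chars) tuple per token pointed to.

    src, ref and pos multiply out across all their values; chars does not
    expand and is carried unchanged on every resulting token.
    """
    char_filter = tuple(parse_values(chars)) if chars else None
    return [
        (s, r, p, char_filter)
        for s, r, p in product(srcs, refs, parse_values(pos))
    ]


# 2 sources x 3 refs x 4 positions = 24 tokens, each with the same 5 characters
tokens = expand_tok(["X", "Y"], ["a", "b", "c"], "1-4", chars="1-5")
print(len(tokens))
```

Under these assumptions, calling `expand_tok(["X"], ["a"], "1")` and `expand_tok(["X"], ["a"], "2")` and concatenating the results gives the same list as `expand_tok(["X"], ["a"], "1-2")`, which mirrors the equivalence stated above.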
If you wish to treat multiple word fragments as a single token, use <group>.
Formal Definition
~certainty-stamp?, @val ~ed-stamp?, {[TAN-A-div (~tok-sources-ref-opt):] {empty}} OR {[TAN-class-2 (~tok-sources-ref-opt):] {{[TAN-A-lm (~sources-ref):] {empty}}} OR {{[TAN-class-2 (~sources-ref):] @src}} OR {{[TAN-core (~sources-ref):] {empty}}}}, @ref, (@val | @pos | (@val, @pos)), {[TAN-A-div (~tok-cert-opt):] {empty}} OR {[TAN-class-2 (~tok-cert-opt):] (@cert | (@cert, @cert2))?}, @chars? ~ed-stamp?, @ref, ((@val | @pos | (@val, @pos)) | ~tok-range-selector)
Defined at: TAN-A-lm.rng, TAN-class-2.rng
Used by: ~lm-tok-ref, ~alt-reassign, ~tok-ref, ~tok-ref-group
Caution | |
---|---|
Every token must be locatable in every cited ref in every source. |
Caution | |
---|---|
Every character must be locatable in every token in every ref in every source. |
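
The two cautions above amount to a locatability check over every combination of source, ref, token position, and character position. The following is a rough Python sketch of that idea, not the actual TAN validation routine. The function name `unlocatable` and the nested dictionary layout (`{src: {ref: [token, ...]}}`) are hypothetical stand-ins for tokenized sources, and positions are assumed to be 1-based integers.

```python
def unlocatable(tokens_by_src, srcs, refs, positions, chars=()):
    """List every (src, ref, pos, char) combination that cannot be found.

    tokens_by_src is a hypothetical stand-in for tokenized sources:
    {src: {ref: [token, ...]}}. Token and character positions are 1-based.
    A None in the result marks the level at which the lookup failed.
    """
    problems = []
    for src in srcs:
        for ref in refs:
            tokens = tokens_by_src.get(src, {}).get(ref)
            if tokens is None:
                # the ref itself is missing from this source
                problems.append((src, ref, None, None))
                continue
            for pos in positions:
                if not 1 <= pos <= len(tokens):
                    # no token at this position in this ref
                    problems.append((src, ref, pos, None))
                    continue
                token = tokens[pos - 1]
                for ch in chars:
                    if not 1 <= ch <= len(token):
                        # the token is too short for this character position
                        problems.append((src, ref, pos, ch))
    return problems
```

An empty result means every requested token and character was found. Any entry in the list corresponds to a combination that would violate one of the cautions.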
Caution | |
---|---|
In a ranged |
Example 8.214. <tok>

    <TAN-A-div TAN-version="2018" id="tag:parkj@textalign.net,2015:ar.cat.tan-a-div:claims">
       .........
       <body claimant="lmp">
          .........
          <claim subject="dexippus porphyry">
             <claim subject="andronicus boethus" adverb="perhaps" verb="omits">
                <object src="grc">
                   <tok ref="1 a 2" pos="3-4"/>
                </object>
             </claim>
          </claim>
          <claim subject="herminus comm-omnes" verb="agrees">
             <locus src="grc">
                <tok ref="1 a 2" pos="3-4"/>
             </locus>
          </claim>
          .........
          <claim subject="B" verb="replaces">
             <locus src="grc">
                <tok ref="1 a 5" pos="1-2"/>
             </locus>
             .........
          </claim>
          <claim subject="Λ" adverb="perhaps" verb="replaces">
             <locus src="grc">
                <tok ref="1 a 5" pos="1-2"/>
             </locus>
             .........
          </claim>
          <claim subject="π α φ ο" verb="agrees">
             <locus src="grc">
                <tok ref="1 a 5" pos="1-2"/>
             </locus>
          </claim>
          .........
       </body>
    </TAN-A-div>

Note | |
---|---|
Taken from ar.cat.tan-a-div.claims |