<normalization>
Definitive list of key terms used for normalizations to texts.
Master location: http://textalign.net/release/TAN-2021/vocabularies/normalizations.TAN-voc.xml
Table 11.11. TAN keywords for types of normalizations
names | IRIs | Comments |
---|---|---|
no hyphens | | Discretionary word-break line-end hyphens have been deleted. |
norm space | | General Punctuation spaces (U+2000..U+200B) have been replaced with regular space. Equivalent to |
no note callouts | | Footnote or endnote signals (frequently superscript numbers or letters) have been deleted. |
no notes | | Footnotes or endnotes have been deleted. |
no comments | | Editorial comments have been deleted. |
no pointers | | Reference pointers to other texts, both internal (cross-references) and external (citations of primary or secondary sources), have been deleted. |
no milestones | | Reference milestones such as page numbers and section numbers have been deleted. |
no ligatures | | All ligatures have been converted into constituent letters. |
no combining chars | | All combining letters (U+0363..U+036F) have been converted to their corresponding ASCII counterparts. |
corrected spelling | | All orthography (spelling) has been tacitly corrected to standard forms. |
corrected punctuation | | All punctuation has been tacitly corrected to standard forms. |
no punctuation | | All punctuation has been removed. |
no quotation marks | | Quotation marks have been removed. |
corrected capitalization | | All letters have been tacitly capitalized according to standard forms. |
changed to lowercase | | All uppercase letters have been converted to lowercase. |
changed to uppercase | | All lowercase letters have been converted to uppercase. |
no music | | Printed music has been removed. |
no prepunctuation space | | All prepunctuation space has been corrected according to standard forms. |
normalized unicode, unicode nfc, unicode normalized | | All non-NFC-compliant Unicode has been converted to normalized Unicode. Same effect as if applying |
converted html to tan | | HTML has been converted to TAN-T format. |
no reference markers | | All numbers, letters, or other labels inserted by the author or editor to indicate references (the value ordinarily placed in |
accents normalized | | Accents have been normalized. If missing, they have been supplied. If incorrect, they have been corrected. |
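
A few of the keywords above describe purely mechanical changes, and a short sketch may make their scope easier to see. The Python below is an illustration only, not part of TAN and not how any TAN tool implements these normalizations. The function names are hypothetical, chosen to echo the keywords `norm space`, `unicode nfc`, and `no ligatures`, and the ligature table is an assumption that covers only the Latin ligatures at U+FB00..U+FB04.

```python
import re
import unicodedata

# Hypothetical helpers for illustration; none of these names come from TAN.

# "norm space": every character in U+2000..U+200B becomes a regular space.
GENERAL_PUNCTUATION_SPACES = re.compile("[\u2000-\u200B]")

# "no ligatures": assumed mapping, limited to the Latin ligatures U+FB00..U+FB04.
LATIN_LIGATURES = {
    "\uFB00": "ff",
    "\uFB01": "fi",
    "\uFB02": "fl",
    "\uFB03": "ffi",
    "\uFB04": "ffl",
}


def norm_space(text: str) -> str:
    """Replace each General Punctuation space with U+0020."""
    return GENERAL_PUNCTUATION_SPACES.sub(" ", text)


def unicode_nfc(text: str) -> str:
    """Convert text to Unicode Normalization Form C."""
    return unicodedata.normalize("NFC", text)


def no_ligatures(text: str) -> str:
    """Expand the ligatures listed in LATIN_LIGATURES into constituent letters."""
    return "".join(LATIN_LIGATURES.get(char, char) for char in text)


def normalize(text: str) -> str:
    """Apply the three sketched normalizations in sequence."""
    return no_ligatures(unicode_nfc(norm_space(text)))
```

Under these assumptions, `normalize("of\uFB01ce\u2003cafe\u0301")` returns `"office café"`. A text normalized this way would be a candidate for the keywords `norm space`, `unicode nfc`, and `no ligatures`, provided the ligature mapping covers every ligature that actually occurs in the text.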