<normalization>
Definitive list of key terms used for normalizations to texts.
Master location: http://textalign.net/release/TAN-2018/TAN-key/normalizations.TAN-key.xml
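
A few of the normalizations described in the table below are mechanical enough to illustrate in code. The following is only a minimal sketch, in Python rather than in any TAN tooling, of three of them: replacing General Punctuation spaces with a regular space, converting text to Unicode NFC, and lowercasing. The function names are hypothetical and chosen for illustration; they are not part of TAN and do not correspond to any TAN keyword.

```python
import re
import unicodedata

# Hypothetical helpers for illustration only; none of these names come from TAN.

# Matches any character in the range U+2000..U+200B (General Punctuation spaces).
GENERAL_PUNCTUATION_SPACES = re.compile("[\u2000-\u200b]")


def replace_general_punctuation_spaces(text: str) -> str:
    """Replace each character in U+2000..U+200B with a regular space (U+0020)."""
    return GENERAL_PUNCTUATION_SPACES.sub(" ", text)


def to_nfc(text: str) -> str:
    """Convert text to Unicode Normalization Form C (NFC)."""
    return unicodedata.normalize("NFC", text)


def to_lowercase(text: str) -> str:
    """Convert all uppercase letters to lowercase."""
    return text.lower()
```

Normalizations such as tacit spelling correction or the removal of editorial comments depend on editorial judgment and cannot be reduced to a one-line function in the same way.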
Table 9.7. TAN keywords for types of normalizations
Keywords (optional values of @which) | IRIs | Comments |
---|---|---|
|
| Discretionary word-break line-end hyphens have been deleted. |
|
| General Punctuation spaces (U+2000..U+200B) have been replaced
with regular space. Equivalent to |
|
| Footnote or endnote signals (frequently superscript numbers or letters) have been deleted. |
|
| Footnotes or endnotes have been deleted. |
|
| Editorial comments have been deleted. |
|
| Reference pointers to other texts, both internal (cross-references) and external (citations of primary or secondary sources), have been deleted. |
|
| Reference milestones such as page numbers and section numbers have been deleted. |
|
| All ligatures have been converted into constituent letters. |
|
| All combining letters (U+0363..U+036F) have been converted to their corresponding ASCII counterparts. |
|
| All orthography (spelling) has been tacitly corrected to standard forms. |
|
| All punctuation has been tacitly corrected to standard forms. |
|
| All punctuation has been removed. |
|
| Quotation marks have been removed. |
|
| All letters have been tacitly capitalized according to standard forms. |
|
| All uppercase letters have been converted to lowercase. |
|
| All lowercase letters have been converted to uppercase. |
|
| Printed music has been removed. |
|
| All prepunctuation space has been corrected according to standard forms. |
|
| All non-NFC-compliant Unicode has been converted to normalized Unicode. Same effect as if
applying |
|
| HTML has been converted to TAN-T format. |
|
| All numbers, letters, or other labels inserted by the author or editor to indicate
references (the value ordinarily placed in |