Text Alignment Network

The Text Alignment Network (TAN) is a framework that allows users, working independently and collaboratively, to share, find, create, edit, and explore digital texts and annotations.
A customized extension of Text Encoding Initiative (TEI) XML, TAN is particularly suited for organizing and aligning texts with multiple versions (copies, translations, paraphrases), and for creating and editing text annotations such as quotations, translation clusters (word-to-word), and linguistic features.
The foundation of TAN is a suite of XML formats, each designed for a specific task. The extensive validation routines maximize the syntactic and semantic interoperability of texts, annotations, and language resources. TAN comes with applications and utilities that open new frontiers in scholarly publishing, research, and teaching.

Why use TAN?

Extensive error checking. Built-in TAN validation rules go well beyond the customary error-checking performed by other formats. Files linked in the network "talk" to each other, to let users know about changes and updates. More than one hundred types of content-based errors are checked. Through Schematron Quick Fixes, many of the problems can be corrected in a matter of seconds.
Time-saving utilities. Enjoy enhanced editing functions in Oxygen XML Editor's Author mode. Highly customizable TAN utilities help you create, edit, and maintain TEI and TAN files.
Pathbreaking applications. Core TAN applications, written in XSLT, provide cutting-edge tools for textual research and analysis.
Intuitive text referencing. Unlike TEI, HTML, or other markup systems that rely heavily upon arbitrary identifiers that can be difficult to navigate and maintain, TAN points to text portions using familiar reference systems, or user-customized tokenization rules.
Application development. TAN is built upon an extensive and robust XSLT function library, one of the few of its kind. Do you already use Natural Language Toolkit, Classical Language Toolkit, or comparable packages in other programming languages to develop tools for textual and linguistic research? Do you have to process, analyze, and transform texts that are in tree structures? With more than 250 public functions covering a range of tasks, from numerics to maps and from checksums to tree manipulation, the TAN function library may cover what you need while letting you stay within an XML environment. Many TAN functions are useful even outside TEI or TAN.
Semantic Web. TAN was designed at the outset to ensure that texts and their annotations would be rooted in the practices of the Semantic Web. Unlike many other formats, whose attribute values are almost always only human-readable, most TAN file components are tied to URIs, making them suitable for use in Semantic Web applications.
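To make the idea of intuitive text referencing more concrete, here is a minimal sketch in Python. It is not TAN syntax and does not use the TAN function library. The toy text, the dotted reference form ("1.2"), and the names `TEXT`, `tokenize`, and `resolve` are all assumptions made for illustration. It shows only the general principle: a word is located through a familiar reference (chapter and verse) plus a token count under a customizable tokenization rule, rather than through an arbitrary identifier.

```python
# Hypothetical illustration only: this is NOT TAN's actual syntax or API.
# A text is modeled as nested divisions keyed by familiar labels
# (here, chapter and verse numbers), and words are found by counting tokens.
import re

# Toy text keyed by chapter, then verse (invented sample content).
TEXT = {
    "1": {
        "1": "in the beginning was the word",
        "2": "and the word was with us",
    },
}


def tokenize(passage, pattern=r"\w+"):
    """Split a passage into tokens using a customizable regular expression."""
    return re.findall(pattern, passage)


def resolve(text, ref, token_number, pattern=r"\w+"):
    """Return the nth token (1-based) of the division named by ref, e.g. '1.2'."""
    node = text
    for label in ref.split("."):
        node = node[label]  # walk down chapter, then verse
    return tokenize(node, pattern)[token_number - 1]


print(resolve(TEXT, "1.2", 3))  # prints "word"
```

Because the pointer is built from the reference system and the tokenization rule, it stays readable to a human and does not depend on identifiers assigned inside one particular file.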

Select TAN libraries

Lexico-morphology (rules for Latin, Greek, Syriac, Coptic, English, and TAN-A-lm files in Latin and Greek)
Pseudo-Methodios

Presentations

Doha, Qatar, March 2018
Berlin, Germany, December 2018
Kalamazoo, Michigan, May 2019
Bergen, Norway, April 2022
Paderborn, Germany, September 2023

Publications

Kalvesmaki, Joel. “Multiple String Comparison in XSLT with tan:collate().” Proceedings of Balisage: The Markup Conference 2022. Balisage Series on Markup Technologies, vol. 27 (2022). https://doi.org/10.4242/BalisageVol27.Kalvesmaki01.
———. “String Comparison in XSLT with tan:diff().” Proceedings of Balisage: The Markup Conference 2021. Balisage Series on Markup Technologies, vol. 26 (2021). https://doi.org/10.4242/BalisageVol26.Kalvesmaki01.
———. “A New \u: Extending XSLT Regular Expressions for Unicode.” Proceedings of Balisage: The Markup Conference 2020. Balisage Series on Markup Technologies, vol. 25 (2020). https://doi.org/10.4242/BalisageVol25.Kalvesmaki01.
———. “Intertextual Pointers in the Text Alignment Network.” Journal of Data Mining & Digital Humanities (Oct. 2017).
———. “Three Ways to Enhance the Interoperability of Cross-References in TEI XML.” Proceedings of the Symposium on Cultural Heritage Markup. Balisage Series on Markup Technologies, vol. 16 (2015). https://doi.org/10.4242/BalisageVol16.Kalvesmaki01.

Select output

A number of end-use applications for TAN have been designed. Listed here are assorted output files from select TAN applications, past and present. The output is intended as proof-of-concept, not as polished product, illustrative of how TAN files can be manipulated for publishing, studying, creating, and editing.
Sample output may contain known errors in content, structure, styling, and JavaScript. Files may be added, renamed, or deleted at any time. If the appearance looks irregular, try using a different browser.
All output content is believed to be available under a license that permits this type of use.

Participation

Changes are made regularly to TAN, mainly in its development branch. If you have a TAN library, sharing it with other participants, particularly via Git, will help developers test any changes that have been made to the function library, and encourage others to contribute to your project.
The TAN project is by no means finished. This version of TAN merely scratches the surface of what is possible. New participants to test, use, and develop TAN's schemas, functions, guidelines, and applications are welcome. Inquiries about participation should be sent to the project director, Joel Kalvesmaki, by email: director at textalign.net.
Official announcements are made by email (Google Group) and by Twitter.

Previous versions

Version 2020: textalign.net | GitHub | zip
Version 2018: textalign.net | GitHub | zip
Version 1 dev (2015-2017): textalign.net | GitHub | zip

Creative Commons License
Unless otherwise specified, all material on this site is licensed under a Creative Commons Attribution 4.0 International License.