Chapter 1. Introduction

The Text Alignment Network (TAN) is a suite of highly regulated XML formats that lets scholars align and share texts and textual analysis with maximal syntactic and semantic interoperability. TAN is particularly suited to textual works that exist in multiple versions (translations, paraphrases), and to related datasets on quotations, word-for-word alignments, and lexico-morphological features.

TAN files are simple, modular, and networked, allowing users, whether working independently or collaboratively, to edit, study, and annotate shared files. The extensive validation rules depend upon a library of processing functions that definitively interpret the format. The library thereby informs and helps editors in research and publication, and provides a basis for developing tools and applications.

Although it can express scholarly nuance and complexity, the TAN format has been designed to benefit everyone, scholars and non-scholars alike, and can be used broadly for multilingual publishing, language learning, and machine translation.