About the format


TAN differs from other text formats such as HTML, Microsoft Word, PDF, or DocBook. Each of those formats is interoperable only in the sense that any file can be reliably opened and displayed by the same software. Despite such software compatibility, the content, structured by each user, looks very different from one file to the next. If you receive from different people two versions of a particular literary work in the same file format (e.g., Word or PDF), there is little likelihood that you could align them in a new document without a lot of extra work. These are presentation formats, designed to let the creator use his or her imagination to shape, structure, and present the material in highly stylized, creative ways. The formats are laissez-faire, concerned mainly with ensuring that each component is rendered properly, without regard for the meaning of those components.

Creating a text in TAN is like opening a word processor and telling it, "I don't care how the text looks. I want to ensure that it is in a meaningful structure that corresponds to any other version of that text. The appearance, which could take thousands of directions, can be worried about later."

The closest analogue to the TAN formats is the XML format developed by the Text Encoding Initiative (TEI), whose design catalyzed and continues to inspire the development of TAN. TAN is, in fact, a customized extension of TEI. TAN takes a handful of TEI concepts and extends them via stand-off annotation, to allow for overlapping annotations, to engage with the Semantic Web, and to support cross-project interoperability. TAN also reduces some of the repetition that tends to be necessary in TEI files. For a fuller comparison of TAN and TEI, see the section called “The Text Encoding Initiative”.
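The general idea behind stand-off annotation, and why it permits overlap, can be illustrated with a minimal sketch. This is not TAN syntax and does not use the TAN function library; the `Annotation` class, the helper functions, and the sample text are all hypothetical, chosen only to show that annotations stored apart from the text can cover spans that cross one another, something a single nested hierarchy of inline markup cannot express directly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """A hypothetical stand-off annotation: it points at tokens by index
    instead of wrapping them in inline markup."""
    start: int  # index of the first token covered (inclusive)
    end: int    # index one past the last token covered (exclusive)
    label: str


def tokens_for(tokens, ann):
    """Return the tokens that an annotation covers."""
    return tokens[ann.start:ann.end]


def overlaps(a, b):
    """True if two spans share tokens but neither contains the other."""
    shares_tokens = a.start < b.end and b.start < a.end
    a_inside_b = b.start <= a.start and a.end <= b.end
    b_inside_a = a.start <= b.start and b.end <= a.end
    return shares_tokens and not (a_inside_b or b_inside_a)


# The text is stored once, untouched by any annotation.
text = "In the beginning was the Word and the Word was with God"
tokens = text.split()

# Two annotations kept apart from the text. Their spans cross each other.
clause = Annotation(0, 6, "clause")          # In the beginning was the Word
repetition = Annotation(4, 9, "repetition")  # the Word and the Word
phrase = Annotation(0, 3, "phrase")          # In the beginning (nested in clause)

for ann in (clause, repetition, phrase):
    print(ann.label, "->", " ".join(tokens_for(tokens, ann)))

print("clause/repetition overlap:", overlaps(clause, repetition))
print("clause/phrase overlap:", overlaps(clause, phrase))
```

Because the text itself is never modified, any number of such annotation sets, possibly from different projects, could point at the same tokens without interfering with one another.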

Some other caveats:

Although TAN comes with an extensive library of functions and templates, it is not what most people think of as a tool or application. It is not consumer, off-the-shelf software. It does not come with a graphical interface. Rather, it is a package of XML resources, particularly in XSLT, that allows programmers and developers to create customized applications and tools. If you work with an XML editor like Oxygen, your editing experience will be greatly enhanced by the TAN function library, which was designed in Oxygen and optimized for it.
The TAN formats are specialized. They are not meant to replace other common text formats such as TEI, DocBook, and so forth, or other alignment formats such as XLIFF or TMX. Converting a TAN file into one of these formats is usually straightforward but generally entails loss. Conversely, most conversions from one of these formats into TAN will not entail loss, but will be imperfect or incomplete, because many of these formats lack the data required by TAN. Conversion must be given careful thought, and can be only semiautomated.
Each TAN format has a restricted field of inquiry, defined and explained in these guidelines. TAN is not for everyone. For example, if you are developing a transcription that imitates a particular print edition, you are better off using only TEI, or a version of TEI that you have customized. But once you want to bring that transcription into close comparison with other versions and study it intertextually, TAN might be ideal.
