To support the research questions mentioned above, the TAN encoding formats and this manual have been designed around a few core principles.
Scholarly freedom: Scholars should be able to create data within their sphere of inquiry simply, expressively, independently, and with fidelity to their guiding lights.
Given two ways of expressing the same idea, simplicity is better than complexity, expressiveness than silence. Simplicity and expressiveness should be treated as complementary ideals. In cases where one must be chosen over the other, simplicity is to be preferred.
Editors should be able to register doubt about claims. If in doubt about an assertion, an editor should be able to state alternatives.
Editors should be able to work on the same material independently but interoperably.
Editors should work freely within their theories, opinions, and assumptions about language. They should declare those positions, not suppress or alter them.
Scholarly responsibility: Scholars must make their data uniquely citable, and responsibly describe how that data was created.
Each TAN file should have an expressive, unique, persistent name that can be cited and used independent of the file's location or availability.
Editors must supply, at the very minimum, the core statements of responsibility that are normally expected in any scholarly work:
What was done, by whom, and when.
What sources have been used.
Who holds rights over the data, and what reuse is permitted.
What editorial assumptions and decisions were made in creating the data.
Utility to both computers and humans: Data should be easy for both humans and computers to read and write; the latter should be able to import, process, and create the data reliably, consistently, and interoperably.
The format should depend upon stable technologies or standards.
All classes and types of formats in the TAN suite should be structured consistently and predictably.
As many computable inconsistencies or errors as possible should be flagged by validation rules.
Every datum should be expressed in both a form that is as human-readable as possible and a form that is computer-readable, to make the material suitable for linked data (semantic web) or for processing via an algorithm.
In a given file, data should not be redundant, irrelevant to the immediate points of inquiry, or more reliably and authoritatively found elsewhere.
References to textual units or linguistic concepts should be expressed .
Each TAN file, or collection of files, should be integrally complete and fully useful, independent of any other software such as text processors or version control software.
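Two of the principles above, that every datum be expressed in both a human-readable and a computer-readable form and that computable inconsistencies be flagged by rule, can be illustrated with a minimal sketch. This is not a TAN format or a TAN validation rule: the class `Datum`, its fields `name` and `iri`, the function `find_redundant`, and the `example.org` identifiers are all hypothetical names chosen only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class Datum:
    """One datum in two parallel forms (hypothetical structure, not a TAN format)."""

    name: str  # human-readable form, e.g. a label an editor would recognize
    iri: str   # computer-readable form, e.g. an identifier usable as linked data

    def __post_init__(self) -> None:
        # Reject a datum that lacks either of its two forms.
        if not self.name.strip():
            raise ValueError("a datum needs a human-readable name")
        if not urlparse(self.iri).scheme:
            raise ValueError("a datum needs an identifier with a scheme")


def find_redundant(data: list[Datum]) -> set[str]:
    """Return every identifier that is declared more than once."""
    seen: set[str] = set()
    repeated: set[str] = set()
    for datum in data:
        if datum.iri in seen:
            repeated.add(datum.iri)
        else:
            seen.add(datum.iri)
    return repeated
```

Under these assumptions, `Datum("noun", "http://example.org/concept/noun")` is accepted, `Datum("noun", "no-scheme")` raises `ValueError`, and `find_redundant` reports an identifier that two entries in the same list both declare.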