Configuring and running an XSLT application


Configuring global parameters

Once you have determined the master XSLT stylesheet for the application, you may want to configure it by adjusting the values given to the global parameters. You have several possible strategies:

Work with a configuration file. If you are comfortable writing some simple XSLT code, you might create a small XSLT file that has nothing but an <xsl:import> whose @href value points to the original stylesheet. Copy from the master XSLT stylesheet only those <xsl:param>s that you want to change. This method is quick to set up and easy to use, but it also means that you do not have immediate access to documentation.
Overwrite the values in the master XSLT stylesheet directly. This method is quick, but it also means that you might not easily restore the original settings, unless you make a backup copy. Also, if you are using configuration files, their default values will change. That could be good or bad, depending upon your setup.
Work from a copy of the master XSLT file. This method allows you to customize the entire application and to consult, as needed, the original settings in the master file. As with configuration files (see above), you can make new copies as new situations emerge. You should make certain that any working copies are in the same subdirectory as the original, to keep links intact.
Manage transformations from Oxygen. Oxygen XML Editor has a powerful feature, Configure Transformation Scenarios, which allows you to create custom configurations for an XSLT application. Oxygen has good documentation on how to use this flexible feature, which can be combined with any of the preceding three options. Oxygen allows you not only to configure the parameters but to manage input and output. One drawback is that you are presented with all the global parameters that can be found, whether or not they are really relevant. Documentation associated with a particular parameter may be missing or truncated. You should use this feature in conjunction with any documentation that comes with the XSLT application.
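
The configuration-file strategy might be sketched as follows. This is only an illustration under stated assumptions: the file name my-config.xsl and the master stylesheet name app.xsl are hypothetical, and the two parameters are borrowed from examples later in this section, so a real application may not declare them. Because the importing stylesheet has higher import precedence, its <xsl:param> declarations override those of the same name in the imported master stylesheet.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- my-config.xsl: a hypothetical configuration file for a master stylesheet
     assumed to be named app.xsl and to sit in the same directory. -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
   xmlns:xs="http://www.w3.org/2001/XMLSchema"
   exclude-result-prefixes="#all" version="3.0">

   <!-- xsl:import must precede every other top-level declaration. -->
   <xsl:import href="app.xsl"/>

   <!-- Only the parameters you want to change are copied here.
        The names and values are illustrative assumptions. -->
   <xsl:param name="ignore-comments" as="xs:boolean" select="true()"/>
   <xsl:param name="diff-threshold-of-interest" as="xs:decimal" select="0.35"/>

</xsl:stylesheet>
```

To use such a file, you would point the XSLT processor at my-config.xsl rather than at the master stylesheet. Several configuration files can import the same master stylesheet, one per situation.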

Whatever method you adopt for configuration, first find the relevant global parameters. Once you have them, you should always ensure you understand what type of data is expected, and in what quantity.

Data types. XSLT is a strongly typed programming language. The data bound to variables and parameters is always at least implicitly typed. Many variables or parameters specify exactly what kind of data is expected. Those that do not are assigned some default type by the XSLT processor. Most data types you encounter will be of two sorts: atomic types and nodes. Examples of atomic types are integers, booleans, strings, and dates. Examples of nodes are elements, attributes, comments, and processing instructions. There are other types, but we will focus here on the most common.

Quantities. In XSLT, there are four quantity categories: (1) zero or one; (2) exactly one; (3) zero or more; (4) one or more. Each of these is specified by adding a quantifier to a data-type declaration: ?, nothing, *, and +, respectively.

Table 9.1. Quantifiers and data types

Quantity	Symbol	Atomic type example	Node type example
zero or one	`?`	`xs:string?`	`element()?`
exactly one	none	`xs:boolean`	`document-node()`
zero or more	`*`	`xs:dateTime*`	`attribute()*`
one or more	`+`	`xs:integer+`	`comment()+`
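
As a rough sketch of how the four quantifiers constrain the values a parameter accepts, consider the following stylesheet fragment. All four parameter names are hypothetical and chosen only for illustration.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
   xmlns:xs="http://www.w3.org/2001/XMLSchema" version="3.0">

   <!-- zero or one: the empty sequence () is an acceptable value -->
   <xsl:param name="optional-label" as="xs:string?" select="()"/>

   <!-- exactly one: a single value is required -->
   <xsl:param name="required-flag" as="xs:boolean" select="true()"/>

   <!-- zero or more: any number of values, including none -->
   <xsl:param name="timestamps" as="xs:dateTime*"
      select="(xs:dateTime('2020-01-01T00:00:00'), xs:dateTime('2021-06-15T12:30:00'))"/>

   <!-- one or more: supplying () here would raise a type error -->
   <xsl:param name="depths" as="xs:integer+" select="(1, 2, 3)"/>

</xsl:stylesheet>
```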

Below are some of the more common data types you will find in global parameters, along with several examples going from simple values up to more complex assignments based upon XPath expressions or XSLT constructions. For more background, see the section called “XPath language”. Focus is placed upon data types and quantities expected in select TAN applications and utilities.

Strings. A string is a sequence of characters. Even when the value consists only of Arabic numerals, a string will be read and interpreted as text, not as an integer.

In the following example, the string value is specified by the single quotation marks within the double quotation marks. The double quotation marks delimit the value of the attribute, and the single quotation marks specify that the value is a string. If you omitted the single quotation marks, the value would be interpreted as an XPath expression; a single word such as day, for example, would be read as the name of a child element of the context node.

<xsl:param name="text-a-to-compare" as="xs:string?" select="'Every day'"/>
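
To illustrate the role of the inner quotation marks, the following sketch contrasts a string literal with an unquoted name. Both parameter names are hypothetical. The second declaration assumes nothing about the input: if the context node has no child element named day, the parameter is simply bound to the empty sequence, which xs:string? permits.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
   xmlns:xs="http://www.w3.org/2001/XMLSchema" version="3.0">

   <!-- Inner single quotation marks: the value is the three-character string "day". -->
   <xsl:param name="literal-value" as="xs:string?" select="'day'"/>

   <!-- No inner quotation marks: an XPath step that selects a child element
        named day of the context node and takes its string value, if any. -->
   <xsl:param name="path-value" as="xs:string?" select="day"/>

</xsl:stylesheet>
```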

When more than one string is expected, the strings should be separated by a comma. It is also common to surround the series with parentheses, for visual clarity. This example assigns to the parameter a sequence of two strings.

<xsl:param name="text-a-to-compare" as="xs:string+" select="('day', 'night')"/>

In the next example, @select is replaced by the text node within the parameter. This technique can be useful if the value expected will be space-normalized, and you want to wrap text, and you do not need to create multiple strings.

<xsl:param name="text-a-to-compare" as="xs:string?">Every day</xsl:param>

The next example takes the primary input XML and converts it to a string. Such conversion is called casting. Keep in mind that the context node of any global parameter is the primary input XML document.

<xsl:param name="text-a-to-compare" as="xs:string" select="string(/)"/>

Perhaps you need to supply a path to some input. The following example traverses the tree to a particular @href within the primary input. The string value in that attribute will be treated like a URL, and it will be resolved relative to the base URI of the primary input.

<xsl:param name="path-to-source" as="xs:string" 
     select="resolve-uri(/*/tan:head/tan:predecessor/tan:location/@href, base-uri(/))"/>

If a parameter allows multiple values, and you need to change those values frequently, you might want to bind options to global parameters or global variables of your own creation...

<xsl:variable name="dir-1-path" as="xs:string" select="'../../novels/book-a'"/>
<xsl:variable name="dir-2-path" as="xs:string" select="'test/comparanda'"/>
<xsl:variable name="dir-3-path" as="xs:string" select="'test/logs'"/>
<xsl:variable name="dir-4-path" as="xs:string" select="'../brown/texts'"/>

...then update the master global parameter on a case-by-case basis.

<xsl:param name="secondary-input-relative-uri-directories" as="xs:string+"
   select="$dir-1-path, $dir-4-path"/>

The preceding example allows you to quickly change from one set of data to another.

Booleans. A boolean is a true/false value. If a parameter expects a boolean, you should use some XPath expression that evaluates to a boolean, even if it is a simple one, such as true() or false(). If you need to express the value as a string, it should be either "true", "false", "0", or "1", and within @select the string should be cast explicitly with xs:boolean(), because a bare string literal does not satisfy as="xs:boolean".

<xsl:param name="ignore-comments" as="xs:boolean" select="false()"/>
<xsl:param name="preoptimize-string-order" as="xs:boolean" select="xs:boolean('true')"/>

Integers. To supply an integer, you need only use numerals, perhaps preceded by a hyphen if it is negative. You should not use quotation marks, or the parameter's child text node. There will be no confusion of the integer with an XPath step, because no element's name may begin with a digit.

<xsl:param name="start-at-depth" as="xs:integer" select="1"/>
<xsl:param name="ngram-auras" as="xs:integer+" select="(2, 1)"/>

Decimals. Decimals are much like integers, but require decimal points. If the decimal is between 1.0 and -1.0, it is clearest to precede the decimal point with a zero, e.g., -0.99, even though XPath also accepts a bare leading decimal point.

<xsl:param name="diff-threshold-of-interest" as="xs:decimal" select="0.2"/>

Elements. If a global parameter expects elements as input, you must construct them inline, or provide an XPath expression that directs the processor to the elements in question. The following example shows how to construct a parameter that might be fed into tan:batch-replace().

<xsl:param name="additional-batch-replacements" as="element()">
   <replace pattern="(\d\d)/(\d\d)/(\d\d\d\d)" replacement="$3-$1-$2"
      message="Converted U.S.-style date to ISO-style"/>
</xsl:param>

The parameter used in the previous example might need to be given numerous elements. In those cases it might be convenient to put them in a separate XML file and point to it with an XPath expression. Because the expression may return more than one element, the declared type takes the * quantifier:

<xsl:param name="additional-batch-replacements" as="element()*"
   select="doc('batch-replacements.xml')/*/tan:replace"/>
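
A file suited to that XPath expression might look like the following sketch. It rests on several assumptions: the root element name replacements is hypothetical, the second <replace> is invented purely for illustration, and the tan prefix in the stylesheet is assumed to be bound to the namespace URI shown here. Check the namespace declared in your own stylesheet before relying on it.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- batch-replacements.xml: a hypothetical file, placed in the same
     directory as the stylesheet that calls doc('batch-replacements.xml'). -->
<replacements xmlns="tag:textalign.net,2015:ns">
   <replace pattern="(\d\d)/(\d\d)/(\d\d\d\d)" replacement="$3-$1-$2"
      message="Converted U.S.-style date to ISO-style"/>
   <!-- Illustrative second entry: collapse runs of whitespace. -->
   <replace pattern="\s+" replacement=" "
      message="Normalized runs of space"/>
</replacements>
```

Because the path is /*/tan:replace, any root element name will do, provided the <replace> elements are its immediate children and are in the namespace bound to tan.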

Starting the XSLT process

Running an XSLT application can be done in several ways. As noted above, at the heart of the process is the XSLT processor. The goal is to find the means to feed the primary input and the master stylesheet into the processor, and to tell the processor where to place the output.

From the command line. Processors such as Saxon allow you to initiate the process from the command line.

Windows or Macintosh:
1. Open a command-line shell (for example, Command Prompt on Windows or Terminal on Macintosh);
2. Using the command cd, navigate to the directory where your files are, e.g., cd E:/myfiles.
From there, follow the instructions provided by the vendor of the XSLT processor. Saxon provides instructions for its product at https://www.saxonica.com/documentation10/index.html#!using-xsl/commandline. A simple command-line instruction might look like the following:
```
java -jar "E:/xslt processors/saxon-he-10.0.jar" -s:init.xml -xsl:app.xsl -o:primary-output.xml
```

From Oxygen XML Editor. Oxygen provides numerous ways to initiate the XSLT process, including the following:

XSLT Debugger Perspective. This editing mode changes the appearance of Oxygen, putting eligible primary input files on the left, XSLT files in the middle, and an output pane on the right. You can choose the processor you prefer, and pick your primary input and master stylesheet. Running the application provides interactive output, with many diagnostic tools, letting you learn how the output came about.
Transformation Scenarios. You can choose Configure Transformation Scenarios and create a highly customized set of conditions for running an XSLT application.

These methods, and other more sophisticated approaches, are described by the vendor in their documentation, https://www.oxygenxml.com/.