Features

From LaTeX to HTML

The translation of a LaTeX source file into HTML consists of loading the style package tex4ht.sty into the source file, choosing the desired options for the translation, compiling the source into DVI code with the native LaTeX engine, and postprocessing the outcome with the tex4ht and t4ht programs (see overview).

The htlatex command runs a script that invokes the different steps of the process without user intervention. The command takes the form

htlatex filename "options1" "option2" "options3"

where the first set of options is for tex4ht.sty and the *.4ht style files, the second option is for the tex4ht postprocessor, and the third set is for the t4ht postprocessor. If not empty, the second option should be a path to a subdirectory, given relative to the root directory ht-fonts of the hypertext fonts. For instance,

htlatex foo
This command requests a translation according to the default conditions, which are set to produce HTML transitional 4.0 code.
htlatex foo "html,2,info"
This command is equivalent to the previous one, specifying explicitly the option html for tex4ht.sty instead of doing so implicitly.

In addition, the command requests a breakup of the output into separate web pages, in accordance with the two top sectioning levels of the document. Moreover, it asks for a listing in the log file of the information available for the style files in use.

htlatex foo "" "dbcs/!"
This command requests the loading of the dbcs branch of Chinese hypertext fonts (on top of those already requested by the default setting).
htlatex foo "foo,frames" "" "-p"
This command requests (La)TeX to load a private configuration file, named foo.cfg, and to place the content and table of contents in separate frames. In addition, it asks t4ht not to produce bitmaps for pictures.
wlatex foo
This command requests HTML transitional 4.0 output, tuned for loading by Microsoft Word.

Documents that combine Latin, Greek, and Hebrew are probably best served by the commonly available browsers when the Latin and Greek content is compiled to iso-8859-7 and the Hebrew content is translated to Unicode or to pictures. For instance,

htlatex foo "html,iso-8859-7,RL2LR,rl2lr" "unicode/hebrew/"

htlatex foo "html,iso-8859-7,pic-RL"

An Alternative Script

The base style file tex4ht.sty of TeX4ht can be explicitly loaded into the source file, through the instruction ‘\usepackage{tex4ht}’. In such a case, the translation may be activated with a command of the form ‘ht latex filename’.

 \documentclass{article}
    \usepackage{tex4ht}
 \begin{document}
    ..................
 \end{document}
Variations to the default outcome may be requested through parameters of the \usepackage command. The package parameters invoke built-in options of TeX4ht.

 \documentclass{article}
    \usepackage[html,mouseover]{tex4ht}
 \begin{document}
    ..................
 \end{document}
The ‘html’ parameter requests HTML output. The ‘mouseover’ parameter asks for JavaScript-driven pop-up messages that appear when the mouse moves over pointers to footnotes and bibliography entries.

The first package parameter is distinguished in that it may refer to ‘html’, to ‘xhtml’, or to a user-provided configuration file; other values are ignored. An extension ‘cfg’ is assumed, if the file name is provided without an extension.


 \documentclass{article}
    \usepackage[myconfig,html,3.2]{tex4ht}
 \begin{document}
    ..................
 \end{document}
The default environment loaded by the command ‘\usepackage{tex4ht}’ implicitly assumes an empty configuration file and an ‘html’ parameter.

More on ‘htlatex’

Given a LaTeX file


 \documentclass{article}
 \begin{document}
    ..................
 \end{document}
the ‘htlatex filename’ command produces a call to ‘ht latex filename’ on an implicit file of the following form.

 \documentclass{article}
    \usepackage{tex4ht}
 \begin{document}
    ..................
 \end{document}
On the other hand, the command ‘htlatex filename "options"’ produces a call to ‘ht latex filename’ on an implicit file of the following form.

 \documentclass{article}
    \usepackage[options]{tex4ht}
 \begin{document}
    ..................
 \end{document}
TeX sources may use commands of the form ‘httex filename’ and ‘ht tex filename’. Texinfo sources may use commands of the form ‘httexi filename’ and ‘ht texi filename’.

Quite a few variants of the htlatex script are included in the distribution, and many others can be easily tailored.

Validation

The output of TeX4ht can be easily broken. Hence, it is very important to validate the outcome.

TeX4ht doesn’t offer a built-in parser to verify the correctness of the outcome. However, external validator(s) can quite easily be integrated into the compilation process.

Recommendations

Most applications might require the knowledge of just a few additional simple features of TeX4ht, if any. Hence, it is strongly advised to check the output obtained from the default configuration, before trying to work with other settings.

The remainder of this document provides much more than that, with an eye directed toward users that want to customize their outcome. Therefore, the reader is encouraged to skim the information provided below for acquiring a general understanding of the system, leaving the tedious learning of the details to when the need arises.

In keeping with the spirit of LaTeX and hypertext, in which style is assumed to be separated from content, users are encouraged to avoid inserting TeX4ht code into their source files. Instead, they should place their modifications to the default settings within private configuration files to be loaded by htlatex-like commands.

Low-Level Features

The following are some of the more useful underlying commands of TeX4ht.

1 \HCode{...}
2 \HPage{anchor}content\EndHPage{}
3 \Link[target-file arguments]{target-loc}{cur-loc}anchor\EndLink
4 \ifHtml... \else... \fi
5 \ifOption{...}{true-part}{false-part}
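
As a rough illustration of how the first few of these commands might appear together, the following sketch uses \HCode, \Link, and \ifHtml in a small source file. It follows the argument order listed above; the anchor name ‘notes’ is made up for the example, and the behavior noted in the comments is an assumption rather than a full description of each command.

```latex
\documentclass{article}
   \usepackage{tex4ht}
\begin{document}

% \ifHtml selects between hypertext-specific and ordinary code.
\ifHtml
   \HCode{<hr />}% raw markup, passed through to the output
\else
   \par\noindent\hrulefill\par
\fi

% target-loc = notes, cur-loc left empty: a link pointing at "notes"
% ("notes" is a hypothetical name chosen for this sketch).
See the \Link{notes}{}closing notes\EndLink{} for details.

% target-loc left empty, cur-loc = notes: marks the location "notes".
\Link{}{notes}\EndLink Closing notes start here.

\end{document}
```

In line with the recommendations above, code of this kind is usually better kept in a private configuration file than in the source itself.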

Sectioning and Tables of Contents

A non-leading package parameter ‘1’, ‘2’, ‘3’, or ‘4’, in \usepackage, asks for a tree-structured set of files that reflects the sectioning of the document to the specified depth. Sequential prev-next links within the hierarchy, instead of the default hierarchical ones, can be requested with the ‘next’ parameter. The parameter ‘sections+’ creates titles for the sectioning commands that link to the tables of contents.

Finer control is possible with the following commands.

1 \CutAt{at-unit,until-unit-1,until-unit-2,...}
2 \tableofcontents[unit-1,unit-2,...]
3 \TocAt{at-unit,unit-1,unit-2,...,/until-unit-1,/until-unit-2,...}
4 \ConfigureToc{unit} {before-mark} {before-title} {before-page-number} {at-end}
5 \Configure{tableofcontents} {before-toc} {end-of-toc} {after-toc} {before-nonindented-par} {before-indented-par}
6 \Configure{TocAt} {before-toc} {after-toc}
7 \Configure{TocAt*} {before-toc} {after-toc}
8 \Configure{unit} {top} {bottom} {before-title} {after-title}
9 \Configure{CutAt} {unit} {before-button} {after-button}
10 \Configure{+CutAt} {unit} {before-button} {after-button}
11 \NewSection\unit {mark-for-toc}
12 \Configure{crosslinks} {left-delimiter} {right-delimiter} {next} {prev} {prev-tail} {front} {tail} {up}
13 \Configure{crosslinks+} {before-top-links} {after-top-links} {before-bottom-links} {after-bottom-links}
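
A minimal sketch of how the first and third commands might be used, written as a private configuration file (the file format is described under Private Configuration Files). The choice of units is only an example, and placing the commands between \Preamble and \begin{document} is an assumption of this sketch.

```latex
% Hypothetical private configuration file, e.g. mysections.cfg
\Preamble{html}
   % Put each subsection in its own hypertext page;
   % a subsection page ends where the next section starts.
   \CutAt{subsection,section}
   % At each section, list the subsections it contains.
   \TocAt{section,subsection}
\begin{document}
\EndPreamble
```

The package parameters ‘1’ to ‘4’ cover the common cases, so explicit \CutAt and \TocAt commands are needed only when the default breakup does not fit.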

Tables

Tables with \multicolumn entries need a few LaTeX compilations to stabilize.

1 \Configure{table} {before-tbl} {after-tbl} {before-row} {after-row} {before-entry} {after-entry}

Lists and Environments

The appearances of lists and \begin-\end environments are configured with the following commands.

1 \ConfigureList{list-name} {before-list} {after-list} {before-label} {after-label}
2 \ConfigureEnv{environment-name} {before-environment} {after-environment} {before-list} {after-list}
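
The fragment below sketches one possible use of each command in a private configuration file. The markup is illustrative and not the system's default configuration; it relies on HTML 4 allowing the end tag of a dd element to be omitted.

```latex
\Preamble{html}
   % description lists as <dl>:
   % before-list, after-list, before-label, after-label
   \ConfigureList{description}
      {\HCode{<dl>}}
      {\HCode{</dl>}}
      {\HCode{<dt>}}
      {\HCode{</dt><dd>}}
   % quote environments as <blockquote>; the two list-related
   % parameters are left empty in this sketch.
   \ConfigureEnv{quote}
      {\HCode{<blockquote>}}
      {\HCode{</blockquote>}}
      {} {}
\begin{document}
\EndPreamble
```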

Pictures

The next command imports external pictures, and the two commands that follow request pictorial representations for local content. The attributes, and the replacement parameters with their enclosing square brackets, are optional.

1 \Picture[replacement-for-textual-browser]{file-name attributes}
2 \Picture+[replacement-for-text-browsers]{file-name attributes}content\EndPicture
3 \Picture*[replacement-for-text-browsers]{file-name attributes}content\EndPicture
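
For illustration, the two basic forms might be used as follows. The file name ‘logo.png’, the width attribute, and the replacement texts are hypothetical; the empty file name in the second command assumes that the system then chooses a name on its own.

```latex
% An external picture, with replacement text and one attribute.
\Picture[Project logo]{logo.png width="120"}

% A pictorial representation of local content.
\Picture+[Pythagorean identity]{}\fbox{$a^2+b^2=c^2$}\EndPicture
```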

Mathematical Formulas

In the default setting, the math environments ‘\(...\)’, and the display math environments ‘\[...\]’ and ‘$$...$$’, request pictorial representations for their content. On the other hand, the math environments ‘$...$’ ask for no special treatment. Simple features like mathematical symbols, subscripts, and superscripts are translated into HTML, and more complex entities like roots and fractions are translated into pictures (example).

1 \Configure{[]} {before$$at-start} {at-end$$after}, \Configure{()}{before$at-start}{at-end$after}
\Configure{$$}{before}{after}{at-start}
\Configure{$}{before}{after}{at-start}
2 \Configure{SUB}{before}{after}
\Configure{SUP}{before}{after}
\Configure{SUBSUP}{before}{between}{after}
3 no_, no^
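
As a sketch of the signatures listed above, a private configuration file could wrap inline formulas and their subscripts and superscripts in HTML elements. The class name ‘math’ is an arbitrary choice for this example.

```latex
\Preamble{html}
   % Inline $...$ formulas: before, after, at-start
   \Configure{$}{\HCode{<span class="math">}}{\HCode{</span>}}{}
   % Subscripts and superscripts: before, after
   \Configure{SUB}{\HCode{<sub>}}{\HCode{</sub>}}
   \Configure{SUP}{\HCode{<sup>}}{\HCode{</sup>}}
\begin{document}
\EndPreamble
```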

Paragraphs

The insertions of code at paragraph breaks are controlled by the following commands.

1 \Configure{HtmlPar} {noindent-P} {indent-P} {from-noindent-P} {from-indent-P}
\EndP
2 \IgnorePar
3 \ShowPar
4 \IgnoreIndent
5 \ShowIndent
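
A common pattern, sketched here under stated assumptions, is to close the open paragraph and suppress the paragraph code around block-level markup. The environment name ‘note’ is hypothetical and is assumed to be defined in the source, for instance with \newenvironment{note}{}{}. The comments describe the assumed behavior of \EndP, \IgnorePar, and \ShowPar.

```latex
\Preamble{html}
   \ConfigureEnv{note}
      % \IgnorePar: no paragraph code for the upcoming paragraph break
      % \EndP:      close the paragraph that is currently open, if any
      % \ShowPar:   resume the insertion of paragraph code
      {\IgnorePar\EndP\HCode{<div class="note">}\ShowPar}
      {\IgnorePar\EndP\HCode{</div>}\ShowPar}
      {} {}
\begin{document}
\EndPreamble
```

Without these commands, the div tags could end up nested inside a paragraph element, which validators reject.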

XHTML, MathML, DocBook, TEI, and Unicode

Scripts similar to htlatex are available for the different modes of output under support. The outcome of the translations should be checked by validators for proper syntax. Typically, with the presence of validators, errors are easy to detect and correct, but they require human intervention.

In particular, it might be worthwhile to notice some of the more common sources of problems for MathML.

Cascading Style Sheets (CSS)

Cascading style sheets attach presentations to the content of hypertext pages, in a manner similar to the way that ‘.sty’ files define the presentation of the content of source LaTeX files. TeX4ht produces a CSS file for each document that is translated to HTML transitional 4.0 code. The following are related commands.

1 \Css{content}
2 \Css content\EndCss
3 \CssFile[list-of-css-files]content\EndCssFile
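
A small sketch of the first form, placed in a private configuration file. The selectors are examples only; the class name ‘sectionHead’ is an assumption about the generated markup and should be checked against the actual output.

```latex
\Preamble{html}
   % Each \Css command contributes one rule to the generated CSS file.
   \Css{body { max-width: 40em; margin: auto; }}
   \Css{h3.sectionHead { color: navy; }}
\begin{document}
\EndPreamble
```

Characters that are special to TeX, such as the percent and hash signs, need care inside the argument of \Css.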

Fonts

TeX4ht has an elaborate machinery for handling fonts, through special virtual hypertext fonts stored in ‘.htf’ files. Instead of providing a design for each symbol, as is the case in standard fonts, the virtual fonts provide a content for each symbol. The following commands offer some control, from within the source LaTeX documents, over the content provided to the symbols.

1 \NoFonts
2 \EndNoFonts
3 \Configure{htf} {class} {delimiter} {template-1} {template-2} {template-3} {template-4} {template-5} {template-6} {template-7}
4 \Configure{htf-sty} {class/font} {CSS-instructions}

The htf fonts might request pictorial representations for symbols. In such cases, the sizes of the pictures depend on the sizes of the TeX fonts in use. Size changes through the \magnification command should be made before loading the tex4ht.sty package.

The design of a virtual hypertext font might take some labor, but it does not require too much sophistication.

Literate Programs (with ProTeX) and Scripts

Literate programming is a discipline that promotes the writing of programs the way one explains them to human beings. ProTeX is a literate programming system fully implemented in terms of TeX, and it is compatible with LaTeX and other TeX-based systems. TeX4ht, and ProTeX itself, are examples of applications written in ProTeX.

1 \input ProTex.sty
\AlProTex{extension,<<<>>>,list,title,escape-character}
2 \<title\><<<
code fragment
>>>
3 `<title`>
4 \OutputCode\<...\>

Scripts produce the content in verbatim format with no decorations.

1 \ScriptEnv{environment} {prefix} {postfix}
2 \ScriptCommand{\command} {prefix} {postfix}
3 \JavaScript...\EndJavaScript

TeX

Source TeX files are treated in a manner similar to the way LaTeX source files are treated, with the obvious restriction that only TeX commands are allowed. In particular, the \usepackage command is not valid in TeX. The counterpart of the htlatex system command is called httex, and it takes a similar form.

httex filename "options1" "option2" "options3"

The htlatex command implies the loading of an implicit or an explicit configuration file when the command \begin{document} is encountered. The httex command, on the other hand, requires the insertion of the code ‘\csname tex4ht\endcsname’ into the source TeX file, at the location where the implicit or explicit configuration file is to be loaded (example).
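
A minimal plain TeX source prepared for httex might look as follows. The file name ‘foo.tex’ and the content are made up for the example. Since \csname...\endcsname yields a harmless \relax for an undefined name, the same file should also compile under ordinary TeX.

```latex
% foo.tex: a plain TeX source, to be translated with: httex foo
\csname tex4ht\endcsname % the configuration file is loaded here

\centerline{\bf A Short Note}
\medskip
Plain \TeX{} text with an inline formula, $a^2+b^2=c^2$.
\bye
```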

The configuration files for TeX are similar to those for LaTeX, with the only exception of not including the ‘\begin{document}’ instruction.

\Preamble{...}...\EndPreamble

The compilation of sources that explicitly include the configuration files can be invoked with a command of the form ‘ht tex filename’ (example).

The following package options are available for TeX only.

1 plain-
2 pic-eqalign

A \TableOfContents command, similar to the generalized command of \tableofcontents offered to LaTeX, is also provided for TeX.

Configurable Hooks

Much of the look and feel of TeX4ht is achieved through configurable hooks which are defined with the following commands.

1 \NewConfigure{name}[i]{body}
2 \Configure{name}{parameter-1}...{parameter-i}
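
The following sketch shows how a new hook might be introduced and then configured. All names in it (the hook ‘mynote’ and the macros \BeforeMyNote, \AfterMyNote, and \mynote) are hypothetical and exist only for the example.

```latex
\Preamble{html}
   % Introduce a hook named "mynote" with two parameters;
   % the body stores them in two helper macros.
   \NewConfigure{mynote}[2]{\def\BeforeMyNote{#1}\def\AfterMyNote{#2}}
   % Give the hook a meaning.
   \Configure{mynote}{\HCode{<span class="note">}}{\HCode{</span>}}
   % A macro that uses the hook; in practice it would more likely
   % be defined in a style file than in a configuration file.
   \newcommand\mynote[1]{\BeforeMyNote #1\AfterMyNote}
\begin{document}
\EndPreamble
```

Separating the hook from its meaning in this way lets different output formats reconfigure \mynote without touching its definition.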

For help configuring hooks already seeded in the system, compile the source files in use with the ‘info’ option active and review the information in log files. Much of the information in the log files may also be obtained by running ‘xhlatex mktex4ht’ and reviewing the entries in the outcome page ‘mktex4ht.html => index => mktex4ht’.

The following features can come in handy for tailoring markups in LaTeX documents.

1 Package parameter ‘0.0’
2 Parameter ‘hooks’
3 Option ‘hooks+’
4 Package parameter ‘edit’
\Tg<...>
\Tg</...>
\Tg<.../>
5 \Configure{edit} {before} {after}
\Configure{hooks} {before} {after} {}{}
6 \Configure<...>{before}{after}
\Configure</...>{before}{after}
\Configure<.../>{before}{after}
7 \Configure<...>-{replacement}
\Configure</...>-{replacement}
\Configure<.../>-{replacement}
8 Package parameter ‘edit+’

This parameter is a generalization of the ‘edit’ parameter, which introduces configuration information into the log file.

9 Package parameter ‘verify’
\Verify...\EndVerify
10 Package parameter ‘verify+’

Private Configuration Files

The \usepackage{tex4ht} implicitly assumes a private configuration file of the following form.

\Preamble{html}\begin{document}\EndPreamble

Similarly, a command of the form ‘\usepackage[html,option,option,...]{tex4ht}’ implicitly assumes a file of the following form.

\Preamble{html,option,option,...}\begin{document}\EndPreamble

On the other hand, a command of the form ‘\usepackage[file,options]{tex4ht}’ assumes a configuration file obeying the following format (example). The extension ‘cfg’ is assumed for names of configuration files that are listed without their extension.

...early definitions...
\Preamble{options}
...definitions...
\begin{document}
...insertions into the header of the html file...
\EndPreamble
It is up to the user to decide the distribution of parameters between the \Preamble and the \usepackage commands.
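
For instance, a private configuration file following this format might look like the sketch below. The file name ‘myconfig.cfg’ echoes the earlier \usepackage example; the meta element and the author name are placeholders chosen for illustration.

```latex
% myconfig.cfg, requested with \usepackage[myconfig]{tex4ht}
% or, equivalently, with: htlatex foo "myconfig"
\Preamble{html,2,next}
   % ...definitions (for example \Configure commands) go here...
\begin{document}
   % Code placed here is inserted into the header of the html file.
   \HCode{<meta name="author" content="A. N. Author" />}
\EndPreamble
```

Here all parameters are given to \Preamble. An alternative split would keep only ‘html’ in \Preamble and pass ‘2’ and ‘next’ through \usepackage or the htlatex command line.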

One can avoid using configuration files, by including their implicit and explicit content within the source files. In such a case, the ‘\begin{document}’ of the source file should be replaced with a code segment of the following format (example).

\input tex4ht.sty
...early definitions...
\Preamble{options}
...definitions...
\begin{document}
...insertions into the header of the html file...
\EndPreamble
Listed below are a few additional parameters available for LaTeX.

1 no-halign
2 pictex
3 jpg, png

General Configuration Files

A compilation starts by opening ‘tex4ht.sty’ and loading a fraction of its code. The main purpose of this phase is to request the loading of the system at a later time (for instance, upon reaching \begin{document}). The motivation for the late loading is to allow TeX4ht to collect as much information as possible about the environment requested by the source file, and help the system reshape that environment with minimal interference from elsewhere.

The system uses two kinds of (4ht) configuration files. The files of the first kind mainly seed hooks into the macros loaded by the source file (for instance, latex.4ht, fontmath.4ht, and article.4ht). The files of the second kind mainly attach meaning to the hooks (for instance, html4.4ht, unicode.4ht, and mathml.4ht).

Different source files may request the loading of different style files and in different orders. The hook seeding files are loaded in response to the loading of the style files, and in a compatible order. Since the different style files may redefine the syntax and semantics of macros, TeX4ht follows a similar route of defining and redefining the hooks and their meanings.

The meaning attaching files are normally requested through option names introduced in the tex4ht.4ht system file. The user may add option names, and redefine old ones, within a new file named tex4ht.usr.

A new ‘tex4ht.usr’ file should group references to *.4ht configuration files under arbitrarily chosen option names. For that purpose, \Configure commands similar to those provided in tex4ht.4ht should be employed.

Variants of the htlatex-like scripts may be produced in the following manner.

  1. Adjust the ‘latex’ (‘tex’, ‘texi’) command of a given script to use a desired option name, and rename the new script.
  2. Make sure the ‘tex4ht’ and ‘t4ht’ commands receive appropriate switches in the new script.
The definition of new meaning-assigning configuration files can be considerably simplified by relying on literate programming and the file mktex4ht.4ht. For additional information, compile this file into a hypertext document, visit the ‘index’ page, and from there reach into the ‘mktex4ht’ page.