Ltoh is a customizable LaTeX to HTML converter. It handles text,
tables, and hypertext links. ltoh is a large Perl script, and hence is
(almost completely) platform independent. ltoh is customizable in that
you can specify how to translate a given LaTeX2
See the ltoh web page for documentation, the
latest release, and how to contact the author (see the bottom of the web
page). Naturally, the HTML version of this document was generated using
ltoh, and in my opinion looks better than the LaTeX2
Ltoh has two main restrictions. First, ltoh does not handle math equations, which in general are difficult to display in HTML. [Some have resorted to converting the latex equations into PostScript (PS), converting the PS to a bitmapped figure, and then displaying the figure in HTML. This is all too difficult for me.] Second, ltoh requires (La)TeX macro parameters to be delimited by braces; in practice, ltoh might be unsuitable for most existing TeX code.
Surprisingly, I often preview my LaTeX2
Ltoh is distributed as either a zip file or a gzipped tar file (about 75K).
Both distributions contain the following files.
ltoh.pl | The perl script that does everything |
ltoh.specs | The default specifications. |
readme.html | Generated by ltoh |
readme.dvi | LaTeX2 |
readme.ps | Uses Times Roman |
readme.txt | Text version (generated from netscape) |
README | |
rq-ltoh.specs | An example of my specifications |
rq209.sty | Allow use of new LaTeX2 |
Ltoh version 97e requires the following system software.
perl -v
to see the version of Perl you have.
Additionally, the default ltoh specifications are based on standard new latex macros. Finally, to make full use of HTML tables, future versions of ltoh are likely to support multiple rows in the table packages only found in the new latex.
ltoh relies on unique matching braces to delimit arguments to the latex
macros. In particular, the font family and size commands in old latex
do not use braces to delimit arguments. Thus, ltoh does not (and
probably never will) handle old latex 2.09 font specifications.
Instead, you must use the LaTeX2
(Old latex) Normal but switch \bf to {bold \it then italics, back to} bold \normalfont then normal.
(New latex) Normal but switch \textbf{to bold \textit{then italics, back to} bold} then normal.
Produces:
Normal but switch to bold then italics, back to bold then normal.
Using the old latex syntax, ltoh cannot determine when the bold and italic fonts stop being active.
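To illustrate why brace-delimited font commands are easy to translate, here is a minimal Python sketch. It is not ltoh's actual Perl code: the names FONT_TAGS, find_matching_brace and convert_fonts are hypothetical, and the mapping of \textit to <em> is an assumption (only \textbf to <strong> appears in this document).

```python
# Hypothetical tag table for illustration; ltoh's real mapping lives in
# its specification files.
FONT_TAGS = {"\\textbf": "strong", "\\textit": "em"}

def find_matching_brace(text, open_pos):
    """Return the index of the '}' that matches the '{' at open_pos."""
    depth = 0
    for i in range(open_pos, len(text)):
        if text[i] == "{":
            depth += 1
        elif text[i] == "}":
            depth -= 1
            if depth == 0:
                return i
    raise ValueError("unbalanced braces")

def convert_fonts(text):
    """Recursively replace \\textXYZ{...} with <tag>...</tag>."""
    out = []
    i = 0
    while i < len(text):
        for macro, tag in FONT_TAGS.items():
            if text.startswith(macro + "{", i):
                open_pos = i + len(macro)
                close_pos = find_matching_brace(text, open_pos)
                # The argument may itself contain font macros.
                inner = convert_fonts(text[open_pos + 1:close_pos])
                out.append("<%s>%s</%s>" % (tag, inner, tag))
                i = close_pos + 1
                break
        else:
            # No macro starts here; copy the character unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)
```

Because each font change ends at its matching closing brace, the end tag can always be placed correctly. With the old \bf syntax there is no such closing brace to look for.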
If you have the new latex on your system, use it. If you must use an old latex file, convert it to look like new latex as much as possible.
{\XYZ ... } and \XYZ ... \normalfont to \textXYZ... .
To use this file, put
in your latex files. The file rq209.sty additionally defines
the font size macros \fsizeTiny ... \fsizeHuge, which take a single
brace-delimited argument. For example, use \fsizesmall{some text}
instead of { \small some text }. (This author wrote rq209.sty back in
1994 because the office computer ran the old latex but the home Linux
machine ran the new latex.)
Alternatively, write and use your own definitions of the \textXYZ font change macros.
(One final note.) The old latex convention is simply a poor technical choice. The current philosophy for document specifications (and even programming languages) is that parameters/arguments/blocks are clearly delimited syntactically. The use of matching braces by latex2e conforms to the SGML syntax, as does HTML, which ubiquitously uses matching begin and end tags.
To generate the HTML file xyz.html from the latex file xyz.tex, assuming ltoh is in your path, run:
prompt> ltoh xyz.tex
or
prompt> perl fullpath-of-ltoh.pl xyz.tex
(I have not tested ltoh on a Win32 machine, yet...) On a Win32 machine, which cannot automatically start Perl to execute ltoh, you would probably run
prompt> perl ltoh.pl xyz.tex
There are five types of ltoh specifications. Please note the names.
b/e | Specifies how to translate a latex \begin{XYZ} and matching \end{XYZ} command. |
comm | Specifies how to translate a latex command that does not take any parameters, such as \par, \item or \hrule. |
{} | Specifies a translation for a latex macro that takes a single brace-delimited argument arg-1, where the corresponding HTML consists simply of surrounding the argument with a preamble and postamble. The translation is simple as the argument stays put; ltoh merely puts stuff before and after it. That is, we expect \simplemacro{...} For example, use a simple-macro specification to translate the latex macro \textbf{ ... } (switch to bold face) into the HTML <strong> ... </strong>. |
{N} | Specifies a translation for a latex macro that takes N brace-delimited arguments; the corresponding HTML can make arbitrary use of the arguments. For example, my latex macro \swallow{arg-1} discards its single (possibly long) argument. In the corresponding HTML, we also "use" the argument, by discarding it. |
:= | An assignment sets a ltoh variable which is then used later. As of version 97e, only a small number of built-in variables are supported. I hope to support setting and getting user-defined variables in the future. |
The first four specifications are known as translation specifications.
The four types of translation specifications have the same form. Do not use leading whitespace. Here is the general form and an example of each type.
:type :latex-macro-name:HTML-start-code:HTML-end-code:reserved/not-used
:b/e :\begin{itemize}:<UL>:</>:
:comm :\hrule:<hr>::
+comm +\homepage+http://www.best.com/~quong++
:{} :\textbf:<STRONG>:</>:
:{2} :\rqhttp#1#2:<a href="#2"> #1 </a>::
Each specification contains six parts.
\homepage
macro
expands to HTML containing a colon, so a colon cannot be the delimiter
and I have used a plus. I do not recommend using a space/tab as the
delimiter, as multiple spaces/tabs are easy to overlook.
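As a rough illustration of the delimiter convention, the following Python sketch splits a translation specification into its fields. It assumes that the first character of the line is the delimiter, which is how the examples above read. The function name parse_spec and the dictionary keys are hypothetical and are not taken from ltoh.pl.

```python
def parse_spec(line):
    """Split one translation spec into named fields.

    Assumption: the first character of the line is the delimiter used
    for the rest of the line (a colon or a plus in the examples).
    """
    delim = line[0]
    fields = line[1:].split(delim)
    # Pad so that specs with omitted trailing fields still unpack.
    fields += [""] * (5 - len(fields))
    kind, macro, start, end, reserved = fields[:5]
    return {
        "type": kind.strip(),   # the type field is written with a trailing space
        "macro": macro,
        "start": start,
        "end": end,
        "reserved": reserved,
    }
```

Splitting on the plus sign keeps the colon inside the \homepage URL intact, which is the reason the delimiter is configurable in the first place.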
As an example of an optional regular expression, the LaTeX2e \hspace
macro takes an optional * argument, and then a required horizontal
length argument. In the generated HTML, we want to ignore the entire
\hspace macro, and so I use the following ltoh spec.
:comm :\hspace[*]?\{[^\}]+\}:::
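To see what the pattern in this spec matches, here is a small Python sketch that applies the same regular expression. It assumes the macro name's leading backslash is taken literally, so it is escaped for Python. The names HSPACE_PATTERN and drop_hspace are hypothetical, and ltoh itself does this in Perl.

```python
import re

# The regular expression from the spec above; the backslash that starts
# the macro name is escaped because Python treats it as special.
HSPACE_PATTERN = re.compile(r"\\hspace[*]?\{[^\}]+\}")

def drop_hspace(latex_text):
    """Remove every \\hspace{...} or \\hspace*{...} from the text."""
    return HSPACE_PATTERN.sub("", latex_text)
```

The empty HTML start and end codes in the spec correspond to the empty replacement string here.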
\beginXXX
expands to
HTML start code.
In an arg-macro specification, using the LaTeX2\#1
, use
braces as in #{1}.) Thus, a macro that swaps the order of its
parameters would be written as
:{2} :\swap_two:#2#1::
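The parameter substitution can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are hypothetical, arguments are assumed to follow the macro name immediately as consecutive brace groups, and only single-digit parameters #1 to #9 are handled.

```python
import re

def read_brace_args(text, count):
    """Read `count` consecutive {...} groups from the start of text.

    Returns (args, rest). Nested braces inside an argument are kept.
    """
    args = []
    pos = 0
    for _ in range(count):
        if pos >= len(text) or text[pos] != "{":
            raise ValueError("expected '{' at position %d" % pos)
        start = pos
        depth = 0
        while True:
            if pos >= len(text):
                raise ValueError("unbalanced braces")
            if text[pos] == "{":
                depth += 1
            elif text[pos] == "}":
                depth -= 1
            pos += 1
            if depth == 0:
                break
        args.append(text[start + 1:pos - 1])
    return args, text[pos:]

def substitute_args(template, args):
    """Replace #1, #2, ... in the HTML start code with the arguments."""
    return re.sub(r"#(\d)", lambda m: args[int(m.group(1)) - 1], template)

def expand_arg_macro(template, count, text_after_name):
    """Expand one macro call, given the text that follows its name."""
    args, rest = read_brace_args(text_after_name, count)
    return substitute_args(template, args) + rest
```

For instance, expand_arg_macro("#2#1", 2, "{a}{b}") swaps the two arguments, as the \swap_two spec does.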
As another example, the LaTeX2e \makebox command takes an optional
alignment parameter (one of [l], [c] or [r]) followed by text to be put
into the box. I use the following ltoh spec to ignore the alignment
parameter and to print the text out unadorned.
:{1} :\makebox[^{]*#1:#1::
As a convenience, using </> in the HTML end code expands to the end
tag(s) in reverse order of the corresponding HTML begin code. For
example, I want a LaTeX2e \section to show up as a green <H2> header
in HTML, so I specify
:{} :\section:<hr><H2><FONT color=green>:</>:
which is equivalent to
:{} :\section:<hr><H2><FONT color=green>:</FONT></H2></HR>:
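One way the </> shorthand could be expanded is sketched below in Python. The function names are hypothetical and this is not ltoh's own code. The sketch closes every tag found in the start code, as the \section example does, and keeps each tag name's case as written (so it yields </hr> where the example shows </HR>; HTML tag names are case-insensitive).

```python
import re

def expand_end_code(start_code):
    """Build end tags, in reverse order, for every tag in start_code."""
    # Capture only the tag name; attributes such as color=green are dropped.
    names = re.findall(r"<\s*([A-Za-z][A-Za-z0-9]*)", start_code)
    return "".join("</%s>" % name for name in reversed(names))

def resolve_end_code(start_code, end_code):
    """Return the effective HTML end code for a specification."""
    return expand_end_code(start_code) if end_code == "</>" else end_code
```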
The following table summarizes the effects of the various specifications, and the parts of the specifications used.
Type | macro name | HTML start | HTML end | input | output |
comm | \abc | XYZ | not-used | \abc | XYZ |
b/e | \begin{abc} | XYZ | ijk | \begin{abc}...\end{abc} | XYZ ... ijk |
{} | \abc | XYZ | ijk | \abc{...} | XYZ... ijk |
{2} | \abc | X#2Y#1Z | not-used | \abc{===}{+++} | X+++Y===Z |
As a final example, here's how to generate links in HTML. I define a
latex macro \rqhttp and a corresponding ltoh specification. Because the
tilde is accessible only in math mode, I have had to define a latex
macro (\rqtilde) for it, too.
(latex macro)
\def\rqtilde{\ensuremath{\tilde{\;}}\xspace}
\def\rqhttp#1#2{#1 (\texttt{#2})}
(ltoh spec)
:comm :\rqtilde:~::
:{2} :\rqhttp#1#2:<a href="#2"> #1 </a>::
In LaTeX2e, use the \rqhttp macro as follows.
See the \rqhttp{\ltoh webpage}{http://www.best.com/\rqtilde{}quong/ltoh}.
The resulting dvi output from latex and the HTML from ltoh look like
(Latex) See the ltoh web page (http://www.best.com/~quong/ltoh).
(HTML) See the <A HREF="http://www.best.com/~quong/ltoh"> ltoh web page </A>
Finally, a good example of using ltoh specifiers is the default ltoh spec file ltoh.specs that comes with this release.
[Aside: Technically, the simple-macro specifier is not needed, as its
functionality can be duplicated with an arg-macro. Namely,
:{} :\macro:HTML-begin:HTML-end::
can be duplicated via
:{1} :\macro:HTML-begin#1HTML-end:::.
Nonetheless, use of a simple-macro {} specification is preferred, because its processing is much simpler. With a simple-macro, ltoh does not have to extract and pass the parameter, and hence it is less likely to break than an arg-macro.]
An assignment specification has two nearly identical forms. The double quotes are optional and can be used to embed leading spaces into the string-value. The whitespace surrounding string-value is removed.
variable-name := string-value
variable-name := "string-value"
title := The readme for ltoh
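The two assignment forms can be read with a few lines of Python. This is a sketch under assumptions, not ltoh's Perl code: parse_assignment is a hypothetical name, variable names are assumed to consist of letters, digits and underscores, and a line that is not an assignment yields None.

```python
import re

ASSIGNMENT = re.compile(r"^\s*(\w+)\s*:=\s*(.*?)\s*$")

def parse_assignment(line):
    """Return (variable_name, value), or None if line is not an assignment."""
    match = ASSIGNMENT.match(line)
    if match is None:
        return None
    name, value = match.group(1), match.group(2)
    # Surrounding whitespace is already gone; optional double quotes
    # protect spaces that should be kept in the value.
    if len(value) >= 2 and value.startswith('"') and value.endswith('"'):
        value = value[1:-1]
    return name, value
```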
Here are the currently used built-in variables.
variable | Default | Description |
title | none | Title of the resulting HTML file, via the <TITLE> tag. You must define this variable. Set title in the latex file itself. (It drives me nuts when web pages don't have titles.) |
url | none | URL of the home page of the author. |
author | none | Author of the document. |
email | none | Email address to which comments should be sent |
htmlfile_spec | $BASE.html | Name of HTML file generated. The ltoh variable $BASE is the latex file name stripped of the directory and suffix components. |
The url, author, and email variables are used to generate an address
block at the bottom of the HTML page. (See the bottom of this document
if you are reading it on the web.)
ltoh handles the LaTeX2e tabular and tabularx environments. Column
alignments are read and passed on to the corresponding HTML. The known
column alignments are "l c r p X". If you define your own column
alignment, it will not be understood.
ltoh handles the LaTeX2e \multicolumn macro reasonably well. The column
alignment is read and passed on to the corresponding HTML. I plan to
support the \multirow macro soon.
ltoh ignores extraneous LaTeX2\@
, but there is a small
chance a complicated multiple column alignment spec will break this
code.
The generated HTML table has a border if one or more dividing lines in
the LaTeX2
As of version 97e, ltoh reads specifications from (i)
various specification-files and (ii) from the LaTeX2
(In version 97e, you should do one of the following when running ltoh.
prompt> perl install-dir/ltoh.pl file.tex.
(csh) alias ltoh perl install-dir/ltoh
or
(bash) alias ltoh='perl install-dir/ltoh'
prompt> ltoh.pl file.tex
prompt> ltoh file.tex
This mess is caused by symbolic links. Given an arbitrary invocation of ltoh involving symbolic links, I cannot currently determine where the ltoh.pl script actually resides (the install-dir). Once I implement this code, the setup won't be complicated.
However, if none of the preceding spec files were found, ltoh tries to read
/usr/local/bin/ltoh.specs and, if that fails, tries /usr/bin/ltoh.specs. If both of these fail, ltoh quits.
%-ltoh- ltoh-specification
ltoh strips the leading %-ltoh- and processes the remainder of the line.
If nothing else, set the title variable this way. For example,
here's how this LaTeX2
\documentclass[]{article}
... various latex commands like \usepackage
%-ltoh- title := Ltoh, a customizable LaTeX to HTML converter
%-ltoh- :comm:\ltoh:<font color=green><tt>ltoh</tt></font>::
...
\begin{document}
... the body of the document
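The extraction of embedded specifications can be sketched in Python as follows. The names LTOH_PREFIX and extract_embedded_specs are hypothetical, and the sketch assumes the %-ltoh- marker must start in the first column; whether ltoh also accepts indented markers is not stated here.

```python
LTOH_PREFIX = "%-ltoh-"

def extract_embedded_specs(latex_source):
    """Collect the ltoh specifications hidden in latex comment lines."""
    specs = []
    for line in latex_source.splitlines():
        if line.startswith(LTOH_PREFIX):
            # Strip the marker; the remainder is an ordinary spec line.
            specs.append(line[len(LTOH_PREFIX):].strip())
    return specs

EXAMPLE = r"""\documentclass[]{article}
%-ltoh- title := Ltoh, a customizable LaTeX to HTML converter
%-ltoh- :comm:\ltoh:<font color=green><tt>ltoh</tt></font>::
\begin{document}
"""
```

Because the marker begins with a percent sign, latex treats these lines as comments, so the same file serves both tools.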
It is not difficult to break ltoh, though there are often easy fixes by
restructuring your LaTeX2
-o ofile | generate HTML into file ofile |
-I specfile | read specifications from specfile |
-w N | set the warning level to N (for debugging) |
\begin{tabular} or the owning \multicolumn. This assumption is very
reasonable and circumventing this restriction is difficult.
Ltoh first reads the entire LaTeX2
June 1996 | Version 96a. Preliminary fully hard-coded (not customizable) version. Purely regular expression based. Unable to handle nested braces. Ugly, but it worked. Sort of. |
July 1-15 1996 | Version 96b. First working version. Able to handle commands with multiple arguments and nested arguments. Took me a lot longer than I had expected to get this working. |
Jan 27-29 1997 | Version 97a. Stops processing at \end{document}. Converts double backslashes \\ to <br>, which I should have done a long time ago. Fixed bug involving macros with only one parameter. |
Feb 1997 | Version 97b. Added HTML <p> tags whenever two or more consecutive blank lines are seen. |
Mar 11-15 1997 | Version 97c. Much improved handling of special characters such as {, }, <, > and @. In particular, bare braces which mean nothing in latex are stripped from the HTML. Improved paragraph detection handling. (OK, OK, "Improved ..." really means "fixed bugs in ..."). No longer generates HTML comments for latex comments, by default. Version 97c was meant to be the first public release, but the tables in this readme.tex document broke ltoh badly. |
Mar 19-20 1997 | Version 97d. Complete rewrite of the table handling code. Latex column alignment specifications are understood and passed on to the HTML. Multiple columns specified via either \multicolumn or \mc (which is my personal abbreviation macro) are handled properly. We try to ignore extraneous LaTeX2@ , but there is a small chance multiple columns In particular, ltoh now handles this file (readme.tex) properly. |
Mar 25-31 1997 | Version 97e. Official release. Cleaned up the source a bit for release. Minor improvements to tables (allow end of a row to be on a separate line), paragraphs, specification files, and handling of special characters (allow for multiple chars on one line). |
You may use ltoh freely, under the following conditions, which are covered under a BSD-style license.
Here's the official license as of 31 Mar 97.
# Copyright (c) 1996, 1997 Russell W Quong.
#
# In the following, the "author" refers to "Russell Quong."
#
# Permission to use, copy, modify, distribute, and sell this software and
# its documentation for any purpose is hereby granted without fee, provided
# that the following conditions are met:
# 1. Redistributions of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer.
# 2. All advertising materials mentioning features or use of this software
# must display the following acknowledgement:
# This product includes software developed by Russell Quong.
# 3. All HTML generated by ltoh must retain a visible notice that it
# was generated by ltoh and contain a link to the ltoh web page
#
# Any or all of these provisions can be waived if you have specific,
# prior permission from the author.
#
# THE SOFTWARE IS PROVIDED "AS-IS" AND WITHOUT WARRANTY OF ANY KIND,
# EXPRESS, IMPLIED OR OTHERWISE, INCLUDING WITHOUT LIMITATION, ANY
# WARRANTY OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
#
# IN NO EVENT SHALL RUSSELL QUONG BE LIABLE FOR ANY SPECIAL,
# INCIDENTAL, INDIRECT OR CONSEQUENTIAL DAMAGES OF ANY KIND, OR ANY
# DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS,
# WHETHER OR NOT ADVISED OF THE POSSIBILITY OF DAMAGE, AND ON ANY
# THEORY OF LIABILITY, ARISING OUT OF OR IN CONNECTION WITH THE USE OR
# PERFORMANCE OF THIS SOFTWARE.
(The motivation section belongs right after the introduction, but most people probably just want to get on with using ltoh. So this section has been relegated here. Ah well...)
Although other LaTeX2
Fundamentally, ltoh is a specialized macro processor that reads macro
specifications and generates HTML accordingly. A specification
indicates how to convert a specific LaTeX2
My original goals in writing ltoh were
Thanks to VA Research for letting the author work on ltoh.