1 October 1997
Web browsers which support the standard symbols of the International Phonetic
Association (IPA), for example with Unicode characters,
are not in general use, but the need for rendering IPA symbols is rather urgent
for web publications and teaching materials.
A simple procedure,
which is better than the scanning process that is sometimes used,
was developed in order to render
IPA symbols in the hypertext version
of the recently published
Handbook of Standards and Resources for Spoken Language Systems,
edited by Dafydd Gibbon, Roger Moore and Richard Winski,
published by Mouton de Gruyter, Berlin (1997).
The procedure was required for inserting a large number of IPA macros into the
HTML version of a
formatted book, since the widely used
latex2html hyperdocument generator software, in the version and
configuration available, was not able to handle the problem.
A sample is contained in Table 1.
An obvious solution for rendering IPA symbols in HTML documents is to use
in-line transparent GIF images for each symbol or combination of symbols;
conversion of single symbols was targeted in the present application.
What is not so obvious is how to generate the GIFs automatically and insert
them into complex texts formatted in
LaTeX. The GIFs generated by the
procedure described here can be imported into proprietary web assistants and
HTML editors if required, and of course converted into other graphics formats.
The
IPA macros used in the EAGLES application are those contained in
the wsuipa macro set, but the procedure operates with any special
character macros (e.g. the \LaTeX macro, which is rendered as
the LaTeX logo).
The method uses a property of the latex2html script which passes certain objects which it cannot handle, such as figure and math environments, to other programmes, effectively using the following cascade:
# This number will determine the size of the equations, special characters,
# and anything which will be converted into an inlined image
# *except* "image generating environments" such as "figure", "table"
# or "minipage".
# Effective values are those greater than 0.
# Sensible values are between 0.1 - 4.
$MATH_SCALE_FACTOR = 1.4;
# This number will determine the size of
# image generating environments such as "figure", "table" or "minipage".
# Effective values are those greater than 0.
# Sensible values are between 0.1 - 4.
$FIGURE_SCALE_FACTOR = 1;
The IPA rendering procedure uses these features of latex2html
by inserting the IPA macros into a
math environment with a simple macro
which I call
eggbox.
Packed into an
eggbox, the symbols are then safely
transported through the latex2html conversion cascade described above,
and emerge as inline GIFs of the correct size:
Macro definition: \newcommand{\eggbox}[1]{$\mbox{#1}$}
Macro example: \eggbox{\schwa}
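To try the macro in isolation, a minimal LaTeX test file can be written from the shell. The following is only a sketch: the function name write_eggbox_test, the output file name, and loading the symbols with \usepackage{wsuipa} are assumptions and are not taken from the original setup.

```shell
#!/bin/sh
# Sketch only: write a minimal LaTeX file for trying out the \eggbox macro.
# Assumptions: the function name is hypothetical, and the wsuipa symbols are
# assumed to be available through \usepackage{wsuipa}.
write_eggbox_test() {
    # The quoted 'EOF' keeps backslashes and dollar signs literal.
    cat > "$1" <<'EOF'
\documentclass{article}
\usepackage{wsuipa}
\newcommand{\eggbox}[1]{$\mbox{#1}$}
\begin{document}
The vowel \eggbox{\schwa} is passed through a math environment.
\end{document}
EOF
}
```

Calling write_eggbox_test eggtest.tex produces a file that can then be given to latex2html in the usual way.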
The latex2html call for converting this document into a zero page depth hyperdocument (with minor manual post-editing) was:
latex2html -html_version 3.0 -t "IPA on the Web" -split 0 ipa2eggbox.tex
The \eggbox macros were automatically inserted into the original
file using the standard UNIX stream editor sed:
#!/bin/sh
# ipa2eggbox, Dafydd Gibbon, 5 May 1997
# IPA macro set is wsuipa
# Convention: macro may be followed by one or more spaces.
# Caution: check that macro name is not a prefix of another macro name.
# Macro \eggbox definition: \newcommand{\eggbox}[1]{$\mbox{#1}$}
cat "$1" |
sed "s/\\\schwa */\\\eggbox{\\\schwa}/g
s/\\\scu */\\\eggbox{\\\scu}/g
s/\\\sci */\\\eggbox{\\\sci}/g
s/\\\inva */\\\eggbox{\\\inva}/g
s/\\\invv */\\\eggbox{\\\invv}/g
s/\\\niepsilon */\\\eggbox{\\\niepsilon}/g
s/\\\scy */\\\eggbox{\\\scy}/g
s/\\\stress */\\\eggbox{\\\stress}/g
s/\\\eng */\\\eggbox{\\\eng}/g
s/\\\esh */\\\eggbox{\\\esh}/g
s/\\\eth */\\\eggbox{\\\eth}/g
s/\\\openo */\\\eggbox{\\\openo}/g
s/\\\nitheta */\\\eggbox{\\\nitheta}/g
s/\\\dz */\\\eggbox{\\\dz}/g
s/{\\\ipa{\\\symbol{[^}]*}}}/\\\eggbox{&}/g
s/\\\diaunder\[\\\syllabic|m\]/\\\eggbox{\\\diaunder\[\\\syllabic|m\]}/g
s/\\\revepsilon */\\\eggbox{\\\revepsilon}/g
s/\\\downp */\\\eggbox{\\\downp}/g
s/\\\curlyc */\\\eggbox{\\\curlyc}/g
s/\\\invh */\\\eggbox{\\\invh}/g
s/\\\invscr\([^i ]\)/\\\eggbox{\\\invscr}\1/g
s/\\\invscr  */\\\eggbox{\\\invscr}/g
s/\\\invscr$/\\\eggbox{\\\invscr}/
s/\\\nj */\\\eggbox{\\\nj}/g
s/\\\invscripta */\\\eggbox{\\\invscripta}/g
s/\\\baro */\\\eggbox{\\\baro}/g
s/\\\invy */\\\eggbox{\\\invy}/g
s/\\\scr\([^i ]\)/\\\eggbox{\\\scr}\1/g
s/\\\scr  */\\\eggbox{\\\scr}/g
s/\\\scr$/\\\eggbox{\\\scr}/
s/\\\curlyz */\\\eggbox{\\\curlyz}/g
s/\\\baru */\\\eggbox{\\\baru}/g
s/\\\scripta */\\\eggbox{\\\scripta}/g
s/\\\yogh */\\\eggbox{\\\yogh}/g
s/\\\nigamma */\\\eggbox{\\\nigamma}/g
s/\\\glotstop */\\\eggbox{\\\glotstop}/g
s/\\\secstress */\\\eggbox{\\\secstress}/g"
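The rules in the script all follow one pattern, so they can also be generated from a list of macro names, and the prefix caution in the script header can be checked mechanically. The following is a minimal sketch under stated assumptions: the MACROS list is an illustrative subset, and the function names make_rules, wrap_ipa and check_prefixes are hypothetical and not part of the original ipa2eggbox script.

```shell
#!/bin/sh
# Sketch only: generate the \eggbox substitutions from a list of macro names.
# MACROS is an illustrative subset, not the full list used in ipa2eggbox.
MACROS="schwa eng invscr invscripta"

# Print one sed rule per macro name, in the same form as the hand-written
# rules: the macro, followed by optional spaces, is wrapped in \eggbox{...}.
make_rules() {
    for m in $MACROS; do
        printf 's/\\\\%s */\\\\eggbox{\\\\%s}/g\n' "$m" "$m"
    done
}

# Filter standard input through the generated rules.
wrap_ipa() {
    sed "$(make_rules)"
}

# Report every pair in which one macro name is a proper prefix of another;
# the shorter rule would otherwise rewrite part of the longer macro.
check_prefixes() {
    for a in $MACROS; do
        for b in $MACROS; do
            case "$b" in
                "$a"?*) printf 'warning: \\%s is a prefix of \\%s\n' "$a" "$b" ;;
            esac
        done
    done
}
```

With this list, check_prefixes reports that \invscr is a prefix of \invscripta, which is the kind of collision that needs a hand-written guard. Macros without such a collision can be wrapped with wrap_ipa < input.tex > output.tex.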
Until browsers appear which provide renderings for IPA Unicode, the automatic inline GIF procedure described here can fill the gap reasonably well. Whether and when such browsers appear will depend, among other things, on the efforts and influence of the spoken language community.