Mathematics on the World Wide Web

Draft! Next update will follow after the CICM.

The devil offers to tell you the solutions to all mathematical problems, on the condition that you don't talk about them with anyone. Do you accept?

Introduction

This page is a working draft, so you will find incomplete sections and clutter here. The last section lists the things I intend to add later.

The intention of this page is to help you get started with publishing mathematics on the web.

First, we look at the components involved one at a time. Afterwards, we put things together and look for concrete solutions.

Components

Unicode

Mathematicians heavily use crazy symbols.

Computers internally represent characters as numbers, so there has to be a convention for which number stands for which character. Such a convention is called a character set. One such set is ASCII, which uses 7 bits per character (or 8 bits where the first is always 0). ASCII only contains the characters needed for English text. As you can imagine, there are many character sets around. Often they use one byte per character, but there are also numerous character sets that use more. This causes a handful of problems. For example, to read a text file or a web page, you have to know which character set it is encoded in. Also, it is not possible to use characters from different character sets in the same document.
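
To make the problem concrete, here is a minimal Python sketch. The byte value 0xE9 and the two character sets ISO-8859-1 and ISO-8859-7 are my own choice for illustration; any pair of one-byte character sets would show the same effect.

```python
# One and the same byte means different characters in different character sets.
raw = bytes([0xE9])

as_latin1 = raw.decode("iso-8859-1")  # Western European character set
as_greek = raw.decode("iso-8859-7")   # Greek character set

print(as_latin1, as_greek)

# ASCII only defines the numbers 0 to 127, so this byte is not valid ASCII.
try:
    raw.decode("ascii")
    valid_ascii = True
except UnicodeDecodeError:
    valid_ascii = False

print(valid_ascii)
```

Without knowing the character set, a program has no way to tell which of the two characters was meant.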

Unicode tries to solve many of these problems. It assigns a unique number to every character, so you can use all characters in one document. However, there are different ways to represent Unicode on the computer. These encodings are called UTF-8, UTF-16 and UTF-32.
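
The difference between the Unicode number of a character and its encoded bytes can be sketched in Python as follows. The helper `encoded_sizes` and the sample characters are only illustrative; the big-endian variants are used so that no byte order mark is counted.

```python
def encoded_sizes(character):
    """Return the number of bytes the character needs in each encoding."""
    encodings = ("utf-8", "utf-16-be", "utf-32-be")
    return {name: len(character.encode(name)) for name in encodings}

omega = "\u03a9"          # GREEK CAPITAL LETTER OMEGA
double_a = "\U0001d538"   # MATHEMATICAL DOUBLE-STRUCK CAPITAL A

# The Unicode number is the same, whatever the encoding.
print(hex(ord(omega)))

for character in ("A", omega, double_a):
    print(character, encoded_sizes(character))
```

UTF-32 always needs four bytes, while UTF-8 and UTF-16 use a variable number of bytes depending on the character.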

Fonts

In order to see Unicode characters on the screen, you need a font that provides so-called glyphs for them. Often, an application that encounters a character for which the current font has no glyph tries to find another font that has this glyph.

Great fonts for mathematics and also other sciences are the STIX fonts. These fonts are quite new and contain a huge number of glyphs. They are perfectly suited for browsers. Firefox 3 now uses the STIX fonts by default for MathML if they are installed.

Input of Unicode characters

Now, how do you enter your favourite Unicode characters into your favourite text editor, you ask? That is indeed a problem, and there are different solutions. First, you could open a table of Unicode characters and copy and paste the character you need into your application. Depending on which editor you use, you can also configure shortcuts or key combinations that insert your favourite character. Uhm, you are not really happy with that?

Depending on the markup language you use, you can refer to characters by name. LaTeX, for example, has a name for every mathematical non-ASCII character. You use such a character by writing a backslash followed by its name. XML, on the other hand, provides so-called entities. They begin with an ampersand and end with a semicolon. In between, you write the name of a character. Since XHTML and MathML both use XML, you can use such entities instead of the mathematical characters themselves. However, those entities have to be defined. This is done in the MathML DTD to some extent. Note that the entity names defined in the MathML DTD are not always the same as the names in LaTeX. Furthermore, if a user agent does not read the DTD and does not have an internal table of these entities, it will not be able to determine the character for an entity. You can also refer to a character by its Unicode number (a numeric character reference such as &#x3A9;), which does not have to be declared.
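
The following Python sketch shows both sides of the entity problem. Note the assumptions: Python's `html.unescape` uses the HTML5 entity table, which here only stands in for "an internal table of entities", and the plain XML parser plays the role of a user agent that reads no DTD.

```python
import html
import xml.etree.ElementTree as ET

# A program with a built-in entity table can resolve a named entity.
named = html.unescape("&Omega;")

# A numeric character reference needs no declaration at all.
numeric = ET.fromstring("<mi>&#x3A9;</mi>").text

# A plain XML parser without a DTD does not know what &Omega; means.
try:
    ET.fromstring("<mi>&Omega;</mi>")
    parsed_named_entity = True
except ET.ParseError:
    parsed_named_entity = False

print(named, numeric, parsed_named_entity)
```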

Besides using a table, configuring your editor and referring to characters by name in a language-dependent way, there is one more way, and a very interesting one indeed. It is (nearly) independent of the application you use, but it depends heavily on your operating system. I don't know about Windows or Mac, but I am sure there are similar solutions for them. For Linux, of course, there are multiple solutions of this type. The idea is to use an "input method" that translates your keyboard input into Unicode characters and then feeds the resulting characters into the application. For this, I use SCIM together with its m17n engine. I wrote my own input methods for mathematics, because I couldn't find any others on the web. With these you can, for example, type Omega on your keyboard and the application receives Ω instead.
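
The idea of such a translation step can be sketched in a few lines of Python. This is only a toy model and not how SCIM or m17n work internally; the table `NAMES`, the function `translate` and the backslash prefix are all hypothetical choices made for this example.

```python
import re
import unicodedata

# Hypothetical table: typed name -> official Unicode character name.
NAMES = {
    "Omega": "GREEK CAPITAL LETTER OMEGA",
    "alpha": "GREEK SMALL LETTER ALPHA",
    "forall": "FOR ALL",
    "in": "ELEMENT OF",
}

def translate(typed):
    """Replace every known backslash-prefixed name by its Unicode character."""
    def replace(match):
        name = match.group(1)
        if name in NAMES:
            return unicodedata.lookup(NAMES[name])
        return match.group(0)  # unknown names are passed through unchanged
    return re.sub(r"\\(\w+)", replace, typed)

print(translate(r"\forall \alpha \in \Omega"))
```

A real input method does this while you type, so the application only ever sees the resulting characters.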

XML

A text file is a sequence of characters. By looking at the newline characters, you can split it into a sequence of lines. But you cannot get more structure out of a (general) text file. XML builds upon text files and gives them more structure, a tree structure to be exact. It consists of labelled elements which can contain text or other elements or even a mix of text and elements. Additionally, elements can have attributes, which are key/value pairs.
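
Here is a small Python sketch of that tree structure. The element names `theorem`, `em` and `formula` are made up for this example and belong to no particular XML application.

```python
import xml.etree.ElementTree as ET

document = (
    '<theorem id="pythagoras">In a <em>right</em> triangle, '
    '<formula>a^2 + b^2 = c^2</formula>.</theorem>'
)
root = ET.fromstring(document)

def outline(element, depth=0):
    """List the element names of the tree, indented by nesting depth."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines

print("\n".join(outline(root)))
print(root.attrib)    # attributes are key/value pairs
print(repr(root.text))     # text before the first child element
print(repr(root[0].tail))  # text between the first and the second child
```

The last two lines show the mixed content: text and elements alternate inside the same parent element.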

Mathematical formulas have a tree structure too, so it seems natural to use XML to describe them. But XML itself does not give the meaning of an element or an attribute; that is done by an application of XML. As you will see, there are several for mathematical formulas.

XHTML

There are many applications of XML. One of them is XHTML. Roughly speaking, it is a stricter variant of the widely known HTML. It represents the structure of a document: titles, paragraphs, links, lists and more. For example, an element with the name h1 represents a title, and one with the name p a paragraph.
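
As a minimal sketch, the following Python snippet reads a tiny XHTML page with an ordinary XML parser and picks out the title and the paragraphs. The page content is invented for the example; the namespace is the standard XHTML one.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

page = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Primes</title></head>
  <body>
    <h1>Prime numbers</h1>
    <p>There are infinitely many primes.</p>
    <p>The smallest prime is 2.</p>
  </body>
</html>"""

root = ET.fromstring(page)

# Element names are qualified with the XHTML namespace.
heading = root.find(f".//{{{XHTML_NS}}}h1").text
paragraphs = [p.text for p in root.iter(f"{{{XHTML_NS}}}p")]

print(heading)
print(paragraphs)
```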

Presentation MathML

MathML is an XML application. It consists of two parts. One of them is called Presentation MathML. It describes what a formula looks like.


    
[Example: Presentation MathML markup for a formula with the parts b and n-k]
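
To give an impression of Presentation MathML, here is a sketch for a formula built from b and n-k. I assume purely for illustration that n-k is a superscript of b (element `msup`); with `msub` it would be a subscript instead. The Python around the markup only parses it and lists the element names.

```python
import xml.etree.ElementTree as ET

markup = """<math xmlns="http://www.w3.org/1998/Math/MathML">
  <msup>
    <mi>b</mi>
    <mrow><mi>n</mi><mo>-</mo><mi>k</mi></mrow>
  </msup>
</math>"""

root = ET.fromstring(markup)

# Strip the namespace and list the elements in document order.
tags = [node.tag.split("}")[1] for node in root.iter()]
print(tags)
```

The elements say how things are laid out (identifier, operator, row, superscript), not what they mean.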

Content MathML

The other part of MathML is Content MathML. It describes the meaning of a formula.
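
Because the meaning is explicit, a program can compute with a Content MathML formula. The following Python sketch evaluates "b to the power n minus k"; the evaluator and its tiny operator table are my own toy code and cover only the two operators used here.

```python
import xml.etree.ElementTree as ET

content = """<math xmlns="http://www.w3.org/1998/Math/MathML">
  <apply><power/>
    <ci>b</ci>
    <apply><minus/><ci>n</ci><ci>k</ci></apply>
  </apply>
</math>"""

# Toy operator table: only what this example needs.
OPERATORS = {
    "power": lambda x, y: x ** y,
    "minus": lambda x, y: x - y,
}

def local(node):
    """Return the element name without its namespace."""
    return node.tag.split("}")[-1]

def evaluate(node, variables):
    kind = local(node)
    if kind == "math":
        return evaluate(node[0], variables)
    if kind == "ci":  # identifier: look up its value
        return variables[node.text.strip()]
    if kind == "cn":  # number (integers only in this sketch)
        return int(node.text.strip())
    if kind == "apply":  # first child is the operator, the rest are arguments
        operator, *arguments = list(node)
        values = [evaluate(argument, variables) for argument in arguments]
        return OPERATORS[local(operator)](*values)
    raise ValueError("unsupported element: " + kind)

result = evaluate(ET.fromstring(content), {"b": 2, "n": 5, "k": 2})
print(result)
```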

OpenMath

OpenMath describes the meaning of a formula. Content MathML and OpenMath are indeed quite similar.
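
How similar the two are can be seen by translating one into the other. The Python sketch below converts a small Content MathML expression into the OpenMath XML encoding. It is a toy converter under my own assumptions: it handles only identifiers, integers and the two operators listed in `SYMBOLS`, which I map to the OpenMath content dictionary arith1.

```python
import xml.etree.ElementTree as ET

# Toy mapping: Content MathML operator -> (content dictionary, symbol name).
SYMBOLS = {"power": ("arith1", "power"), "minus": ("arith1", "minus")}

def local(node):
    """Return the element name without its namespace."""
    return node.tag.split("}")[-1]

def to_openmath(node):
    kind = local(node)
    if kind == "ci":  # variable
        return ET.Element("OMV", name=node.text.strip())
    if kind == "cn":  # integer
        number = ET.Element("OMI")
        number.text = node.text.strip()
        return number
    if kind == "apply":  # application: symbol first, then the arguments
        operator, *arguments = list(node)
        cd, name = SYMBOLS[local(operator)]
        application = ET.Element("OMA")
        application.append(ET.Element("OMS", cd=cd, name=name))
        application.extend([to_openmath(argument) for argument in arguments])
        return application
    raise ValueError("unsupported element: " + kind)

content = "<apply><power/><ci>b</ci><apply><minus/><ci>n</ci><ci>k</ci></apply></apply>"
openmath = ET.Element("OMOBJ", xmlns="http://www.openmath.org/OpenMath")
openmath.append(to_openmath(ET.fromstring(content)))

print(ET.tostring(openmath, encoding="unicode"))
```

The tree has the same shape in both languages; mainly the element names differ.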

OMDoc

A shortcoming of XHTML, MathML and OpenMath is that they do not provide a way to represent important structures of mathematical literature: definitions, theorems and proofs. OMDoc does exactly that. Of course, it makes use of OpenMath and MathML.

LaTeX

TeX is a powerful typesetting language. LaTeX is a macro package for TeX and perfectly suited to creating high-quality typeset scientific publications.

SVG

Mathematicians like to draw pictures. Raster image formats like PNG and JPG store the colour of every pixel of the picture. This makes it impossible to resize them or to apply other transformations to them without losing quality. SVG stores an image as a sequence of painting instructions like "Paint a red circle with radius 1 around the origin" or "Paint a line segment from (-1,1) to (1,1)". This has the advantage that SVG images can be scaled to any size using the full quality available on the device used to view the image. SVG is an XML application and is already supported by many browsers. Being an XML application, SVG can be directly integrated into XHTML documents, and one can even use MathML inside SVG, for example to make labels (which is not supported by all browsers, though).
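
The two painting instructions from above can be written down as a small SVG document. In this Python sketch the viewBox, the stroke width and the helper `scaled` are my own choices; the helper only changes the display size and leaves the painting instructions untouched.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

svg = """<svg xmlns="http://www.w3.org/2000/svg"
     viewBox="-2 -2 4 4" width="200" height="200">
  <circle cx="0" cy="0" r="1" fill="red"/>
  <line x1="-1" y1="1" x2="1" y2="1" stroke="black" stroke-width="0.05"/>
</svg>"""

def scaled(svg_text, pixels):
    """Return the parsed image with a new display size in pixels."""
    root = ET.fromstring(svg_text)
    root.set("width", str(pixels))
    root.set("height", str(pixels))
    return root

big = scaled(svg, 2000)
circle = big.find(f"{{{SVG_NS}}}circle")

# Ten times larger on screen, but the circle still has radius 1.
print(big.get("width"), circle.get("r"))
```

Saved to a file with the extension .svg, the string `svg` can be opened directly in a browser that supports SVG.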

Shortcomings of LaTeX

Let us first discuss why it is not necessarily a good idea to just use LaTeX for the web.

Putting the puzzle together

Converting

Stuff to be considered

I am considering adding information on the following topics to this page: