Doing the Sums
Many programmers and software engineers have an ambivalent relationship with mathematics. Richard Stallman read physics at Harvard, but Linus Torvalds was always known for his coding prowess and his leadership qualities, not his interest in mathematics.
Computer science itself deals with dynamic structures and tries to measure and improve their performance and efficiency. Mathematics tries to prove the validity of abstract structures without taking their temporal qualities into account. For decades, though, the popular concept of computer science was that of people dealing with huge data sets, usually numbers that needed to be calculated and recalculated and processed to yield new results. The earliest applications of computer hardware did reflect this preoccupation to some extent, calculating the shape and impact of nuclear explosions before they ever happened. It is curious that today simulation of nuclear explosions has made it possible to stop the testing of nuclear weapons altogether. Computationally intensive tasks are supposed to be the main job of supercomputers and clusters. But it seems strange that computers do not play quite as much of a role in the practice of the people who actually made computers possible, namely mathematicians.
Apart from conducting large-scale calculations, which wasn't the province of mathematicians in any case, there were a number of mathematical fields that were likely to profit from an application of computational tools.
Only a dead number is a good number
The field of symbolic mathematics, as opposed to numerical mathematics, leaves out numbers completely and just regards mathematics as a collection of algebraic systems, that is, systems that consist of symbols and the rules of their mathematical composition. It is perfectly possible to build mathematical software around computer algebra of some kind, and such software can still deal with numbers when it has to.
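To make the distinction concrete, here is a minimal sketch in Python of what "symbols and the rules of their composition" can look like. It is not taken from any of the systems discussed in this article: the tuple representation and the function names diff, simplify and evaluate are assumptions made purely for illustration, and only addition and multiplication are handled.

```python
# Illustrative only: expressions are nested tuples such as ("+", a, b) or
# ("*", a, b); a leaf is either a variable name (str) or a number.

def simplify(expr):
    """Apply a few algebraic identities (x+0, x*1, x*0) and fold constants."""
    if not isinstance(expr, tuple):
        return expr
    op, a, b = expr
    if isinstance(a, (int, float)) and isinstance(b, (int, float)):
        return a + b if op == "+" else a * b
    if op == "+":
        if a == 0:
            return b
        if b == 0:
            return a
    if op == "*":
        if a == 0 or b == 0:
            return 0
        if a == 1:
            return b
        if b == 1:
            return a
    return expr


def diff(expr, var):
    """Differentiate symbolically using only the sum and product rules."""
    if isinstance(expr, (int, float)):
        return 0
    if isinstance(expr, str):
        return 1 if expr == var else 0
    op, a, b = expr
    if op == "+":
        return simplify(("+", diff(a, var), diff(b, var)))
    if op == "*":
        left = simplify(("*", diff(a, var), b))
        right = simplify(("*", a, diff(b, var)))
        return simplify(("+", left, right))
    raise ValueError("unsupported operator: %r" % (op,))


def evaluate(expr, env):
    """The numerical side: replace symbols by numbers and compute a value."""
    if isinstance(expr, (int, float)):
        return expr
    if isinstance(expr, str):
        return env[expr]
    op, a, b = expr
    x, y = evaluate(a, env), evaluate(b, env)
    return x + y if op == "+" else x * y


# x*x + 3*x, written as a tree of symbols
expr = ("+", ("*", "x", "x"), ("*", 3, "x"))
derivative = diff(expr, "x")          # still a tree of symbols: x + x + 3
print(derivative)
print(evaluate(derivative, {"x": 2})) # numbers enter only at this point
```

The derivative is computed without a single number being substituted for x; numbers appear only when evaluate is called. A real CAS differs mainly in the size of its rule set and the sophistication of its simplifier.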
But we should not forget that mathematical research is one of the main springs of global culture, and has never depended for its existence on software. The situation is not dissimilar to fine art - some of the fine arts can be practiced using computer software, but most are entirely independent of software.
CAS explained
It is safe to say that almost everyone who has had anything to do with programming has heard of Mathematica. It runs on Linux and it is one of the most expensive Linux applications available. Mathematica is an excellent example of what a computer algebra system (CAS) can achieve: it can teach mathematics to children and enable graduate students to study finite fields; it simulates electrical networks, and makes signal processing a breeze.
But, like most CAS systems in the commercial arena, it was preceded by free developments. And, unsurprisingly, they had nothing to do with the data crunching language numero uno, FORTRAN. The reasons are fairly straightforward: FORTRAN is not meant to manipulate symbols in the way algebraic systems do; it is meant to implement equations and process their results.
Macsyma
The earliest CAS system that achieved some kind of market penetration was, surprise, surprise, an academic effort, the code of which was essentially free: the Macsyma project was the first large-scale CAS system that became a kind of standard. Macsyma code written in LISP began to run in 1969. Project Macsyma (MAC's SYmbolic MAnipulator) was meant to demonstrate the power of symbolic approaches to Artificial Intelligence: the Grand Old Man of AI, Marvin Minsky, suggested that mimicking the mental operations of mathematicians as formalized in modern algebra would enable us to understand the workings of the human mind itself. This is not a program that would be likely to gain any approval from research institutions today, but the results of this effort were quite astonishing.
RMS's darkest days
Macsyma started off on a PDP-6, and it took 11 years before it was ported to a Unix system. This wasn't the auspicious move that it would have been under different circumstances, since the product and the basic ideas were sold off in 1982 to a company called Symbolics Inc. If this sounds vaguely familiar to historians of Free Software, it should ring a few bells simply because it is one of two reasons why Richard Stallman decided to form the Free Software Foundation: MIT Lisp and Macsyma both became largely proprietary products and many programmers went from being MIT employees to chasing big bucks as company shareholders. Richard Stallman realised that the days of sharing code and ideas in a reasonably carefree environment were past and he founded the FSF to recreate an environment where code would not be declared proprietary from one day to the next.
Macsyma, however, continued to exist. For reasons that can only be summarized as political, the US Department of Energy secured the release of another Macsyma version from MIT in 1983. This version was distributed under slightly dubious legal circumstances to hundreds of sites around the world. The maintainer, Professor William Schelter from the University of Texas at Austin, faithfully kept the free version of Macsyma going, renaming it Maxima in the process. The company built on the potential success of Macsyma was liquidated in 1999. Maxima and the FSF are going from strength to strength. And Maxima, in one form or another, has been running for 36 years.
Commercial CAS
But this is where the story becomes fairly tricky: Maple and Mathematica appeared in the mid-to-late 1980s. CAS systems seemed to be dominated by tools companies, but today, with the appearance of Linux, there is what could be called a harvest of CAS systems which are in some way based on Maxima; they started to appear in the 1990s. Many were free and some were GPLed.
All of them run on Linux and some, like Maxima, run on OS X. Some of them have GUIs built in; others, true to their mathematical heritage, produce LaTeX or even terminal text output. Gnuplot can be chained to Maxima to produce graphs, but TeXmacs produces fairly adequate results, too.
Godzilla CAS
The field of open source CAS is populated by several 800-pound gorillas. Those gorillas have an astonishing shelf life, outliving most existing operating systems and even recent programming languages by up to 20 years. One such giant is the recently opensourced Axiom. It has been in existence for almost as long as Maxima, but it has the slightly less encouraging distinction of being supported by the most extensive piece of documentation in the opensource world: 1105 pages.
Environments like Maxima are interactive, commandline-driven and rely on what amounts to an interpreted programming language to implement new modules and the commandline itself. Axiom is a very different animal: it was developed under the name of Scratchpad by IBM, an organization rather prominent in mathematical research. IBM is one of only a handful of commercial organizations in the world focusing on basic research. Axiom/Scratchpad has been around for a number of years and today is made available under a BSD-type license, but curiously, despite its almost mythical reputation for extensibility and the huge number of classes ("domains") available for it, it was very well known that during its heyday it rarely had more than 400 fulltime users. Why then do we mention it here?
Axioms are good for you
First of all, it represents the apex of all mathematical software: although it was only used by researchers of considerable academic achievement, it avoided the rather unhealthy focus on calculus-related research of most early (and present!) commercial offerings. It was perfectly possible to do symbolic integration without having to know whether the formula used was actually working. If it didn't work, Axiom provided clear indications why the integration formula was invalid. Although we might take this feature for granted today, it was a sensation in its time. In a way, it proved that the original ideas of AI could be implemented to some extent and modelling a mathematician's mind seemed at least partly feasible.
Secondly, the fact that Axiom was made available to the general public in 2003 led to a small renaissance among users who were not necessarily familiar with the product. The large number of "libraries", a not entirely accurate concept within Axiom architecture, its object-orientation and the scripting language known as Aldor make it rather more difficult to study than most mathematical software available on Linux and Windows, but the rewards are considerable.
Maxima and Axiom are some of the oldest and most mature environments within which a modern mathematician might express himself. Axiom is perhaps not suited that well to researchers used to the comforts of Unix and Linux, but this is probably a price many pay quite happily for working with LISP environments.
C is for wimps
There are alternatives, of course. LISP has often been considered the be-all and end-all for CAS users, but this does not mean that we should have to forgo Unix-style environments. YACAS (Yet Another Computer Algebra System) is of rather more recent vintage; it has been almost entirely coded in C and C++. It pipes input and output to other Linux utilities and it has been GPLed. It can be programmed using the YACAS programming language, which for anyone with an inkling of C is fairly trivial to learn.
Again we move within the field of symbolic computation, and YACAS is a more lightweight and extremely well documented example of its kind. The very core of the system has been documented in an online book on mathematical algorithms, which makes it the starting point for systems that have been written on top of it, or even for entirely new implementations of similar environments for computational mathematics.
Euler and Octave
A good example is Euler, which is not to be confused with the programming language of the same name. Euler combines the power of YACAS with a numerical environment within which it is possible to run matrix calculations. It is fairly common within the CAS world to code somewhat more specialised environments that do, say, automatic theorem proving or statistical calculations. The latter might not be part of a CAS at all, but run on top of a standard library of FORTRAN routines hidden behind a layer of C++ APIs. Octave, the standard numerical computation environment running mostly within Linux and Unix environments, follows this model closely. R, the statistical programming language, is another example, although it could be regarded as a programming language specially suited to mathematical data structures.
What systems like Euler and Octave don't do at all is work with theorems of any kind. They should be regarded as some kind of general equation solvers largely working with numbers and, occasionally, placeholders. But their problem domains tend to be far more limited even though they are far more tightly integrated in their environment. They are more like engineering tools (a heritage clearly acknowledged by Octave, which is largely Matlab compatible) and they usually need to do an excellent job with their graphics.
There have been several attempts over the years to combine the dark arts of GUI toolkit programming and graphics libraries with the underlying mathematical problem-solving environment. So far the results have not been particularly encouraging. It often turned out to be far more useful to write a generalized problem-solving environment, bolt a more specialized library on top and again output the results to an interface that combined excellent mathematical typography with decent line graphics. The reasons for this are fairly interesting and have a lot to do with the fairly different problem domains that lend themselves to clean partitioning. After all, who would like to combine a library of computational geometry routines with graphics output, merging both into one big monolith?
Luckily, mathematical formalisms are similar, if not identical, the world over. Input conventions for most CAS systems are comparable; given the fact that there are more than 100 CAS systems listed in the so-called Rosetta document, it is interesting to see that most of them accept formulas and control characters in a similar way.
The Formulae of Formatting
The final leg of our journey through the CAS world leads us to the representation of mathematical formulae and data. Contrary to rumour, mathematicians are an extremely gregarious lot and they prefer to communicate their advances quickly and in an unambiguous manner. Unlike physicists, who were happy to use hypertext-based representations, also known as the World Wide Web, mathematicians need to be more fastidious. Just like physicists they come from a LaTeX-dominated world, but unlike physicists, they cannot make do with anything less capable. Neither do they have the data-processing needs of some physical fields.
Of course, this means that we need to pay attention to another XML dialect. Although invented for entirely different reasons, XML can be regarded as a kind of machine for the generation of structured markup.
MathML describes mathematical objects and gives rendering engines a necessary common set of conventions without deciding on the look of the formatting itself. Rather more interesting, OpenMath is concerned with the meaning of mathematical objects and as such makes it possible to exchange information between CAS systems, databases, publishing systems and, of course, email. Like XML, which OpenMath can use as an encoding, it organizes mathematical datatypes as terminals of a tree structure. The precise method of encoding OpenMath objects is subject to a rather voluminous standard which need not concern us here; in theory it is not connected to a particular encoding (like XML, SGML or even LISP), but in practice, so-called content dictionaries encoded in an XML-like syntax will fix the meaning of mathematical symbol sets.
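The tree structure is easier to picture with a small example. The following Python sketch turns a nested expression into an XML tree that uses the OpenMath element names OMOBJ (object), OMA (application), OMS (symbol), OMV (variable) and OMI (integer), with the symbols plus and times looked up in the arith1 content dictionary. It is a simplified illustration, not a conforming encoder: namespace declarations and other details required by the standard are left out, and the helper names to_openmath and encode are made up for this example.

```python
import xml.etree.ElementTree as ET


def to_openmath(expr):
    """Turn a nested tuple such as ("plus", "x", 1) into an element tree.

    Hypothetical helper: integers become OMI, strings become OMV, and a
    tuple becomes an application (OMA) whose first child is a symbol (OMS)
    assumed to live in the arith1 content dictionary.
    """
    if isinstance(expr, int):
        node = ET.Element("OMI")
        node.text = str(expr)
        return node
    if isinstance(expr, str):
        return ET.Element("OMV", name=expr)
    head, *args = expr
    application = ET.Element("OMA")
    application.append(ET.Element("OMS", cd="arith1", name=head))
    for arg in args:
        application.append(to_openmath(arg))
    return application


def encode(expr):
    """Wrap the expression in an OMOBJ element and serialize it to a string."""
    root = ET.Element("OMOBJ")
    root.append(to_openmath(expr))
    return ET.tostring(root, encoding="unicode")


# x + 2*y: the terminals (x, 2, y) are the leaves of the tree
print(encode(("plus", "x", ("times", 2, "y"))))
```

The point to notice is that the symbol is not the string "plus" on its own but the pair of content dictionary and name; that pairing is what lets two different systems agree on what the symbol means.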
The meaning of symbol sets, of course, is not what a practicing mathematician cares about. Notations are as much of a convenience as they can be an impediment to understanding mathematical problems. But producing readable (and printable) articles, or getting communications between different CAS systems to work flawlessly is as important as the results of mathematical research.
The LISP mechanic
Marvin Minsky and John McCarthy are the godfathers of modern artificial intelligence. John McCarthy also invented LISP and thereby made the idea of mathematical software possible.
In the late 1950s, the field of computer science did not exist. Most practicing computer scientists were mathematicians in some kind of intellectual exile or electrical engineers who had been sidetracked by supersecret government organizations trying to build number crunching machines to defeat the Red Menace.
John McCarthy, a Caltech and Princeton-educated mathematician, was struggling with a problem that threatened to scupper his attempts to extend FORTRAN to something resembling a functional programming language. It seems almost antediluvian today, but adding recursion and functions as arguments to a programming language was not an intuitive step for the few specialists working on programming language design. John McCarthy realised he had to develop a new programming language to cope with his need to code problems in symbolic integration. He found a job at the newly created MIT Artificial Intelligence Lab, becoming the colleague of an unknown mathematician-engineer going by the name of Marvin Minsky.
Much of the early LISP development culminated in a paper called "An Algebraic Language for the Manipulation of Symbolic Expressions". In many ways, it prefigured what became LISP as much as it became the origin of CAS, and it established lists as the fundamental recursive data structure for the new and as yet unnamed programming language.
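What "lists as the fundamental recursive data structure" means can be sketched in a few lines of Python. The names cons, car and cdr are borrowed from LISP, but the code below is only an imitation built from Python tuples, and the helpers from_python, length and count_atoms are invented for this illustration.

```python
NIL = None  # stands in for the empty list


def cons(head, tail):
    """Build one list cell: a first element plus the rest of the list."""
    return (head, tail)


def car(cell):
    """Return the first element of a list cell."""
    return cell[0]


def cdr(cell):
    """Return the rest of the list after the first element."""
    return cell[1]


def from_python(items):
    """Convert a (possibly nested) Python list into a chain of cons cells."""
    result = NIL
    for item in reversed(items):
        if isinstance(item, list):
            item = from_python(item)
        result = cons(item, result)
    return result


def length(lst):
    """Number of top-level elements, defined recursively on the list."""
    return 0 if lst is NIL else 1 + length(cdr(lst))


def count_atoms(lst):
    """Number of symbols and numbers anywhere in the nested structure."""
    if lst is NIL:
        return 0
    if not isinstance(lst, tuple):
        return 1
    return count_atoms(car(lst)) + count_atoms(cdr(lst))


# The expression x + 2*y in prefix form: (+ x (* 2 y))
expr = from_python(["+", "x", ["*", 2, "y"]])
print(length(expr))       # top-level elements: the operator and two operands
print(count_atoms(expr))  # every symbol and number in the whole expression
```

Because a list is either empty or an element followed by another list, every function over it has the same two-case shape, and an algebraic expression is just a list whose elements may themselves be lists.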
Frank Pohlmann
References
Yacas www.xs4all.nl/~apinkus/yacas.html
Axiom http://page.axiom-developer.org/zope/mathaction/FrontPage
Maxima http://maxima.sourceforge.net
John McCarthy www-formal.stanford.edu/jmc

