Meek not geek - Interview with Michael Meeks of OpenOffice.org
On the Novell website, there is a page dedicated to the company's Distinguished Engineers. One of these is Michael Meeks, a Cambridge graduate who began his Linux career at GNOME desktop start-up Ximian, and now works as part of Novell's OpenOffice.org team. Daniel James met him just prior to the announcement of the Novell/Microsoft agreement, and opened the interview with his favourite opening question to any free software hacker...
DJ: What was your first computer?
MM: I was encouraged to program by my mother, because she was a Head of Maths, so I had a BBC Micro and was doing simple BASIC programming. Then came the era of type-in games, so you could learn programming syntax. As my typing was not very quick, I had time to consider the constructs. Eventually I was writing assembler, and this led to x86; but all this was with proprietary software, really. Just on top of Windows, or DOS - a little bit of the Novell technologies crept in there, but only briefly. I left school and went to work doing PASCAL programming, on VT420s, VMS on Alpha machines, cross-compiling to 68000 embedded systems, real-time, the lot. Lots of good stuff there, and quite interesting in many ways.
During my gap year I became a Christian, and at that point I realised my computer was riddled with non-free software, most of it stolen. So we had a big fight about this, me and God, and the voice of conscience was quite clear that this was not what should go on. So eventually, I ditched it to run Linux instead, which at the time was not good news. The hardware support, and anything graphical or visual, was just hopeless.
DJ: So it was a sort of 'forty days in the wilderness'?
MM: Yeah, that sort of thing! But I'm incredibly blessed now, if you look at my career from that point. I got in very early and met lots of people who've done good work, and I've managed to hold on to their coat-tails at various times. I got involved with working with Miguel de Icaza, having a high-speed internet connection at the University. In fact, I started with GCC doing some fixes to warning support, but they were not very receptive to my patches, or at least they didn't respond to my emails for several years. They only just went into GCC, in almost the same form, recently.
So I switched to working with various other projects, and GNOME was where I ended up. I worked with Miguel for a long time, on the spreadsheet application Gnumeric, doing XLS import and export, and then with Nat Friedman and Miguel on the component model, Bonobo. Then the CORBA::ORB interface which underlies GNOME, ORBit, maintaining that and then migrating to ORBit2. Some of the GNOME 2 work, getting that finished and out of the door, and then OpenOffice.org I guess.
DJ: Was that work on GNOME during the Ximian years?
MM: Before the Ximian years; I arrived to work on GNOME before GNOME 1.0. GTK would break, and GNOME libraries would break, so when you updated the spreadsheet tool, you had to update the whole rest of the system to get anything to build at all! It was absolutely nightmarish. But we got somewhere in the end, and the benefit of ABI stability in GNOME's approach is absolutely incredible, when you consider how bad things used to be. By the time Ximian started, I'd already worked with Nat and Miguel while I was at university, so I joined them full-time when I left.
DJ: So it was through Novell's acquisition of Ximian that you ended up working for Novell?
MM: That's right, and it was a really positive thing, I must say. I think it's been amazingly positive for Novell, but for me too, and for free software - it's just brilliant. It's a marriage of people who have very similar goals and direction, and I think we can change the world as Novell, whereas we couldn't so much as Ximian. My team, which does OpenOffice.org work, used to be myself plus Federico Mena, who was a co-founder of GNOME, and some artists, but now it's ten times as big, with proper support, and we can focus on more interesting, strategic software.
DJ: How many people, would you say, are contributing to OpenOffice.org these days?
MM: That's a good question. Not enough, is clearly the answer. There are many reasons for that. If you want an absolute number, I'd say about a hundred full-time equivalents. Of those, most of them are inside Sun, then Novell is a big second to Sun. Google is next, maybe; Red Hat have one guy full-time on it. After that, part-timers everywhere. So if you look at, say, Ubuntu, claiming to ship and support OpenOffice.org, it's a total joke - they have a part-time packager. At Mandriva, for example, the OpenOffice.org packager is a self-described 'not a C++ programmer'. So how you can then go and say 'we'll support you'... Novell, at least, has people across the board working on the codebase, with a good understanding of lots of issues.
DJ: Surely, it's quite unusual for companies the size of Sun and Novell to be collaborating on a project?
MM: If you look at things like the GPL'd Linux kernel, there's all sorts of people competing who are working on it. It is a relatively new forum for that, and I think particularly from Sun's perspective, it's a revolutionary idea for Star Division, in terms of quality assurance. We've moved Sun, by encouraging them, from a very, very long release cycle of eighteen months. We've managed to get them to a three month release cycle, whereby we can add new features and functionality and build up the project much quicker, which has been brilliant.
In terms of quality metrics, it has improved very substantially, and it was obvious that this would be the case. Simply because, in the past, we would make it part of our business to backport fixes that Sun engineers couldn't be bothered to backport to the branch. We've done hundreds of thousands of lines of backporting to include features that were perfectly good, but wouldn't be released for a long time.
So we did good work there, but luckily we've now convinced them to work on a single tree, to try and get fixes out more frequently. Even so, there are serious systemic problems inside Sun, primarily in the QA department. For example, they would say that they can't understand how the Quickstarter could work in UNIX without a full specification, including screenshots of it, a description of what the function should do, approval by the user experience team inside Sun, and of course the help documentation team. They should all be involved, write a specification, and then you can actually write the code. It's a very, very different way.
This wouldn't be so bad, if it were not for the fact that the people are not extremely responsive, so it's normal to ask for a comment from a user experience person and not get anything back, for many months, sometimes years. Seriously, there are bugs in OpenOffice.org that have ten duplicates, are trivial to fix - some even have patches - and they are still waiting for comments from the user experience people. To do the work, as a community person, to be able to actually build OpenOffice.org - find this piece of code, and fix it - and then to get your patch ignored, is tragic in the extreme.
We try and help that by having a thing called ooo-build, that maintains a separate set of patches. The copyrights assigned to Sun are going upstream, and are probably filed. Sun can come and get them any time they choose to be responsive, but we actually ship them. There's a solver component inside our Calc, for example, which was produced by a developer and has been shipping for a while, but upstream are not so responsive.
DJ: Do any of the Linux distributions use ooo-build as the basis of their packages?
MM: Yes, virtually all of them. Debian, Ubuntu, Ark, PLD - all the little Linuxes, even Xandros, the more commercial/proprietary/closed people, Linspire I imagine. I'm permanently amused by how many people are using it. Some of them contribute stuff back, like the Debian people do. We work together with the Ubuntu people. Red Hat do a similar thing, but in a slightly different way. They used to do it with us, and for some reason they changed tack slightly. Lots of people ship it, which is good, because people's work actually gets deployed, and they get some feeling of love. There is a community; we're a team, and we work together.
I would stress that there are people inside Sun that do 'get it'. People that are open, and helpful, and really good. But there are also a large number who are very traditional, very staid people, particularly in quality assurance. You can't argue with them, because they're in their own self-reinforcing world view. They say specifications are necessary for product quality, and you say "That's fine, but look at the quality. It's still not very good." They say more specifications are necessary! The answer is always more of the same, and you can't argue with that. It leads to obsolescence - quality through obsolescence, is what I like to call it. By not ever changing the product, you can specify every bug as a feature, and hey! It's bug free. Instead, you really need to be getting changes in, fixing things and improving them.
DJ: What does OpenOffice.org need now? Things like import filters have improved over the years...
MM: One of the interesting things is that IBM has been quite prominent, saying how bad it is to have closed, binary file formats, and how we should be moving to ODF and various other formats. You can't help but agree with that. However, they've also promised to open their Lotus WordPro file format, which they are investigating currently. The process is ongoing, and we're very hopeful that they will release that.
Improving file filters is an endless task; adding core functionality to do that, expanding the feature set in lots of directions. For example, we've added multimedia support recently - one of our engineers has done native Gstreamer integration, so that's coming soon. SVG import support, improving the draw tools so we can do much better vector graphics editing. There's a huge gamut of things that we can be doing; content management system integration, lots and lots of interesting functionality. Tie-ins with the server; you can imagine wiki-like things but with much richer editing functionality; collaborative document editing. There are so many things that can be done, it's a shame not to have a go at it, and commit changes more quickly.
There are some encouraging things that are happening inside the core that have been done at Sun, threading for example. Currently, OpenOffice.org is essentially one thread; broadly speaking, the whole codebase is protected by one mutex. We're trying to improve this in various ways. The problem with this is that if you're loading a document, each square of the progress bar is 'nailed' individually on the screen. It's not a pretty process. Similarly, if you're loading one document and trying to edit another one, it can't be done - it's all in the same thread, protected by the same lock. There's a very clever guy called Kay Ramme who's come up with a model that will allow these non-threadsafe processes to interact and run concurrently, in separate domains. That's very promising.
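To make the locking problem concrete, here is a minimal sketch in Python. It is a toy model, not OpenOffice.org code: the `Office` class, its method and the document names are all hypothetical. It also does not represent Kay Ramme's model, which the interview does not describe in detail; it uses plain per-document locks only to contrast with the one-big-lock case.

```python
import threading


class Office:
    """Toy model: documents guarded by one shared lock, or by one lock each."""

    def __init__(self, doc_names, one_big_lock):
        shared = threading.Lock()
        # Hypothetical layout: every document maps either to the same lock
        # (the "one mutex" situation) or to a lock of its own.
        self.locks = {
            name: shared if one_big_lock else threading.Lock()
            for name in doc_names
        }

    def can_edit_while_loading(self, loading, editing):
        """Return True if `editing` can be locked while `loading` is held."""
        started = threading.Event()
        finish = threading.Event()

        def load():
            with self.locks[loading]:
                started.set()   # the load now holds its lock
                finish.wait()   # stand-in for a long-running document load

        worker = threading.Thread(target=load)
        worker.start()
        started.wait()

        # Non-blocking attempt: fails at once if the lock is already held.
        lock = self.locks[editing]
        acquired = lock.acquire(blocking=False)
        if acquired:
            lock.release()

        finish.set()
        worker.join()
        return acquired


single = Office(["a.odt", "b.odt"], one_big_lock=True)
split = Office(["a.odt", "b.odt"], one_big_lock=False)

print(single.can_edit_while_loading("a.odt", "b.odt"))  # False
print(split.can_edit_while_loading("a.odt", "b.odt"))   # True
```

With a single shared lock, the edit has to wait for the load to finish even though it touches a different document. With separate locks the two can proceed together, which is the kind of concurrency the work described above is aiming for.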
We're starting now to split up the codebase; that's also very useful for servers, document translation - headless OpenOffice.org is another very interesting area. Anti-alias rendering: Novell did a big chunk of work to add a Cairo canvas, so now we can use Cairo to render slide-shows. Sun is doing another big chunk of work to move that out across OpenOffice.org, to use that rendering for everything.
There's plenty of room for growth and improvement, and performance is one of my pet hobby-horses. If you look at emacs versus vi, I use both, because vi starts in an instant, and emacs doesn't, frankly. I use emacs for virtually everything editing-wise while I'm in the 'flow', but I would never consider using OpenOffice.org for editing a simple text file. I just wouldn't do it, it's too big and too slow to start up, way beyond emacs. I'd like to move OpenOffice.org to the point where it's between vi and emacs, something you can really use fast and is the preferred tool for the job.
DJ: What do you think Microsoft Office has got that OpenOffice.org hasn't? Things that users might actually need?
MM: I think the things that I look at are more niche, maybe, but the problem is that you can use something very powerful, and not know you're using it. Change tracking in OpenOffice.org is really rather good, but maybe it doesn't handle tracking huge complex table changes. Clearly, if someone mails you a document with that in it, you need to be able to do it. So what may look to you like just a chart may actually be a pivot data report of a background database cache, with some huge number of rows crunched in some way. We need to be able to handle more of these cases, and get more things right. There are a lot of distributed bugs in minor areas, where a small percentage fall over - it would be good to fix them.
There's a lot that can be done to improve things across the board, in Microsoft Office too. Excel, for example, is fully threaded in terms of computation, so it can scale to the modern world where we have two CPUs in a laptop, maybe four in a couple of years. There's a whole load of really interesting computer science in threading big applications and getting these things to perform better, and there's a lot of low-hanging fruit. Just as a random example, I did something like a hundred line patch recently and saved about 1.6MB of OpenOffice.org, just code we don't need that we'd been linking for ages - a trivial amount of work for a measurable win. It's gratifying to be able to do that.
DJ: What do you think of Microsoft's OfficeOpen XML, and why do you think they chose that name?
MM: Hahahahaha! I was in the inaugural ECMA meeting with Microsoft, when they were announcing it. One of the amusing things was that Jean Paoli, a big XML strategist for Microsoft, inadvertently referred to it as OpenOffice XML! There's a serious cognitive dissonance between saying OfficeOpen and OpenOffice in the same phrase. But actually, the Microsoft guys are really interesting, and do a lot of good work on that spec - I think it's a brilliant initiative to open their file formats. It'll be very helpful for us, for interoperability, and so we've been helping them improve their spec, as Novell. I think that's entirely the right thing to do; other companies have chosen not to become involved, and that's their decision, but it's a shame not to get the very best specification possible.
The spec is not pleasant, but how many people are used to looking in their document files, anyway? The average user that Novell makes software for doesn't care, they just want their document to come back. One of the problems with OfficeOpen XML and OpenOffice.org's ODF is that possibly it doesn't all come back; maybe you can't represent it all, or you represent it in a different way that then interacts not so pleasantly. So if, for example, you have a problem with your table layout in the past, instead of fixing the table layout, they fixed the Microsoft import filter to construct fly frames and arrange them so it looked like a table, but wasn't actually. It appears to import, export and save again, but when you want to edit it like a table, it isn't one, and you get the most horrendous problems. There's nothing that can be done, because you've lost the fact that that was a table. So there are these kinds of compromises, which lead me to think that the customer wants to keep their data, even if it means some ugly choices in the file format.
I don't think there's a 'right' answer; we'll see which one wins in the marketplace. I think the people at Microsoft would love to be able to have an ODF format and throw away some of these legacy tail things. In OfficeOpen XML there are things like 'lay this document out like Word 95', 'lay it out like Word 97', 'lay it out with extended Asian hints'. Maintaining these and ensuring that they continue to work is a really big pain.
DJ: I suppose you've had the advantage of working from scratch, with ODF...
MM: Right. Some people say that ODF is just a dump of what Sun did, but I think that's unfair. Actually, I think it's a really nice way of representing data, with a unified table model. It is a beautiful thing; it's way shorter, smaller and more succinct than Microsoft's XML, but is that what the customer wants? At the end of the day, I think standardisation is good for everyone.
Michael Meeks' blog www.gnome.org/~michael/
CORBA and The GNOME http://developer.gnome.org/doc/guides/corba/
OpenOffice.org wiki http://wiki.services.openoffice.org