Improvements to Mathematics on Wikipedia
By fredw on Monday, January 13 2014, 07:56
Introduction
As mentioned during the Mozilla Summit and recent MathML meetings, progress has recently been made on the way mathematical equations are handled on Wikipedia. This work has mainly been done by the volunteer contributor Moritz Schubotz (alias Physikerwelt), the Wikimedia Foundation developer Gabriel Wicke, and members of MathJax. Moritz has been particularly involved in the project: he even travelled from Germany to San Francisco to meet MediaWiki developers and spent a month there doing volunteer work on it. Although the solution has been essentially ready for a couple of months, the review of the patches is progressing slowly. If you wish to speed up the integration of what are probably the most important improvements to MediaWiki Math to date, please read how you can help below.
Current Status
The approach that has been used on Wikipedia so far is the following:
- Equations are written in LaTeX or, more precisely, using a specific set of LaTeX commands accepted by texvc. One issue for the MediaWiki developers is that this program is written in OCaml and no longer maintained, so they would like to switch to a more modern setup.
- texvc calls the LaTeX program to convert the LaTeX source into PNG images; this is the default mode. Unfortunately, using images to represent mathematical equations on the Web leads to classical problems (alignment and rendering quality, to mention just two) that cannot be addressed without changing the approach.
- For a long time, registered users have been able to switch to the MathJax mode, thanks to the help of nageh, a member of the MathJax community. This mode solves many of the issues with PNG images but unfortunately adds problems of its own, some of which are simply unacceptable for MediaWiki developers. Again, these issues are intrinsic to the use of a JavaScript polyfill, so yet another approach is necessary in the long term.
- Finally, registered users can also switch to the LaTeX source mode, that is, display only the text source of equations.
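To illustrate what "a specific set of LaTeX commands accepted by texvc" means in practice, here is a minimal sketch of a command whitelist check. It is not how texvc is implemented (texvc is a full parser written in OCaml); the `ALLOWED` set is a tiny illustrative subset and the function names are hypothetical.

```python
import re

# Tiny illustrative subset of commands; the real texvc list is much longer.
ALLOWED = {"frac", "sqrt", "alpha", "beta", "sum", "int", "left", "right"}

# A TeX control word: a backslash followed by one or more letters.
COMMAND = re.compile(r"\\([A-Za-z]+)")


def rejected_commands(tex):
    """Return the sorted list of control words in `tex` that are not whitelisted."""
    return sorted({name for name in COMMAND.findall(tex) if name not in ALLOWED})


def is_accepted(tex):
    """True if every control word in `tex` belongs to the whitelist."""
    return not rejected_commands(tex)


if __name__ == "__main__":
    print(is_accepted(r"\frac{1}{\sqrt{\alpha}}"))
    print(rejected_commands(r"\input{secret} + \frac{a}{b}"))
```

Rejecting anything outside a known set, rather than trying to blacklist dangerous commands, is what makes such a filter useful from a security point of view.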
Short Term Plan
Native MathML is the appropriate way to fix all the issues regarding the display of mathematical formulas in browsers. However, the language is still not perfectly implemented in Web rendering engines, so some fallback is necessary. The new approach will thus be:
- The TeX equation will still be edited by hand but it will be possible to use a visual editor.
- texvc will be used as a filter to validate the TeX source. This will ensure that only the texvc LaTeX syntax is accepted and will avoid other potential security issues. The LaTeX-to-PNG conversion as well as the OCaml language will be kept in the short term, but the plan is to drop the former and to replace the latter with a PHP equivalent.
- A LaTeX-to-MathML conversion followed by a MathML-to-SVG conversion will be performed server-side using MathJax.
- By default, all users will receive the same output (MathML+SVG+PNG) but only one will be made visible, according to the browser's capabilities. As a first step, native MathML will only be used in Gecko and other rendering engines will see the SVG/PNG fallback; the goal, however, is to progressively drop the old PNG output and to move to native MathML.
- Registered users will still be able to switch to the LaTeX source mode.
- Registered users will still be able to use MathJax client-side, especially if they want to use the HTML-CSS output. However, this will no longer be a separate mode but an option to enable. That is, the MathML/SVG/PNG/Source output is displayed normally and progressively replaced with MathJax's output.
Most of the features above have already been approved and integrated in the development branch or are undergoing the review process.
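The selection rules in the plan above can be summarized with a small sketch. This is only an illustration of the decision logic as described in this post, not the actual extension code: the `Client` type, its fields and the function names are all hypothetical, and the choice between SVG and PNG based on SVG support is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Client:
    """Hypothetical description of a reader's browser and preferences."""
    engine: str                    # e.g. "gecko", "webkit", "trident"
    supports_svg: bool = True      # assumption: PNG is used only without SVG support
    prefers_source: bool = False   # registered-user preference: LaTeX source mode
    mathjax_option: bool = False   # registered-user option: client-side MathJax


def visible_output(client):
    """Pick which of the delivered outputs is made visible."""
    if client.prefers_source:
        return "source"
    if client.engine == "gecko":
        # First step of the plan: native MathML only in Gecko.
        return "mathml"
    return "svg" if client.supports_svg else "png"


def render_plan(client):
    """The visible output, optionally followed by MathJax replacing it."""
    plan = [visible_output(client)]
    if client.mathjax_option:
        # MathJax is an option layered on top, not a separate mode.
        plan.append("mathjax")
    return plan


if __name__ == "__main__":
    print(render_plan(Client("gecko")))
    print(render_plan(Client("webkit", mathjax_option=True)))
```

Modelling MathJax as an extra step rather than a fourth output reflects the point made in the last item of the list: the page is readable immediately, and MathJax only replaces the output afterwards.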
How can you help?
The main point is that everybody can review the patches on Gerrit. If you know JavaScript and/or PHP, are interested in math typesetting and wish to get involved in an important Open Source project such as Wikipedia, then now is definitely the right time to help the MediaWiki Math project. The article How to become a MediaWiki hacker is a very good introduction.
When getting involved in a new open source project, one of the most important steps is to set up the development environment. There are various ways to set up a local installation of MediaWiki, but using MediaWiki-Vagrant might be the simplest one: just follow the Quick Start Guide and use
vagrant enable-role math
to enable the Math Extension.
The second step is to create a WikiTech account and to set up the appropriate SSH keys on your MediaWiki-Vagrant virtual machine. Then you can check the Open Changes, and test and review them. The Gerrit code review guide may be helpful here.
If you need more information, you can ask Moritz or try to reach people on the #mediawiki (freenode) or #mathml (mozilla) channels. Thanks in advance for your help!
Comments
"One issue for the MediaWiki developers is that this program is written in OCaml and no longer maintained"
Is OCaml so horrible? How about reworking it in Haskell. ;D
Jokes aside, I believe I saw a patch for it not that long ago, but it might have been just some small fix, and the sooner everyone leaves the raster graphic formula world the better. Thanks for all your effort!