As mentioned during the Mozilla Summit and recent MathML meetings, progress has recently been made on the way mathematical equations are handled on Wikipedia. This work has mainly been done by
the volunteer contributor Moritz Schubotz (alias Physikerwelt),
the Wikimedia Foundation developer Gabriel Wicke, as well as members of
MathJax.
Moritz has been particularly involved in the project and even
travelled from Germany to San Francisco to meet MediaWiki developers and spend one month doing volunteer work on it.
Although the solution has been essentially ready for a couple of months, the
review of the patches is progressing slowly.
If you wish to speed up the integration of what is probably the most
important improvement to MediaWiki Math so far, please read
"How can you help?" below.
Current Status
The approach that has been used on Wikipedia so far is the
following:
Equations are written in LaTeX
or more precisely, using a specific
set of LaTeX commands accepted by
texvc. One issue
for the MediaWiki developers is that this program is written in
OCaml and no longer maintained, so they would like to switch to a more
modern setup.
texvc calls the LaTeX program to convert the LaTeX source into PNG images and this is the default mode.
Unfortunately, using images for representing mathematical equations
on the Web
leads to classical problems (for example alignment or rendering quality
just to mention a few of them)
that can not be addressed without changing the
approach.
For a long time, registered users have been able to switch to the MathJax mode thanks to the help of nageh, a member of the MathJax community.
This mode solves many of the issues with PNG images but
unfortunately it adds its own problems,
some of them being just unacceptable for MediaWiki developers. Again, these issues are intrinsic to the use
of a Javascript polyfill and thus yet another approach is necessary for
a long-term perspective.
Finally, registered users can also switch to the LaTeX source mode, that is, display only the text source of equations.
Short Term Plan
Native MathML is the appropriate way to fix all the issues regarding the display of mathematical formulas in browsers. However, the language is still not perfectly implemented in Web rendering engines, so some fallback is necessary. The new approach will thus be:
The TeX equation will still be edited by hand but it will be
possible to use a visual editor.
texvc will be used as a filter to validate the TeX source.
This
will ensure that only the texvc LaTeX syntax is accepted and will avoid
other potential security issues.
The LaTeX-to-PNG conversion as well as the OCaml language will be kept in
the short term, but the plan is to drop the former and to replace the
latter with a PHP equivalent.
A LaTeX-to-MathML conversion followed
by a MathML-to-SVG conversion will be
performed server-side using MathJax.
By default all the users will receive the same output (MathML+SVG+PNG) but only one will be made visible, according to your browser capabilities. As a
first step, native MathML will only be used in Gecko
and other rendering engines will see the SVG/PNG fallback, but the goal is to progressively drop the old PNG output and to move
to native MathML.
Registered users will still be able to switch to the LaTeX source mode.
Registered users will still be able to use MathJax client-side, especially if they want to use the HTML-CSS output. However, this
will no longer be a separate mode but an option to enable. That is, the MathML/SVG/PNG/Source is displayed normally and progressively replaced with MathJax's output.
Most of the features above have already been approved and integrated in the development branch or are undergoing review.
How can you help?
The main point is that
everybody can review the patches on Gerrit.
If you know about Javascript and/or PHP, if you are interested in math
typesetting and wish to get involved in an important Open Source project
such as Wikipedia then it is definitely the right time to help
the MediaWiki Math project. The article
How to become a MediaWiki hacker is a very good introduction.
When getting involved in a new open source project, one of the most
important steps is to set up the development environment. There are
various ways to set up a local installation of MediaWiki but
using
MediaWiki-Vagrant might be the simplest one: just follow the
Quick Start Guide and use
vagrant enable-role math to
enable the Math Extension.
If you need more information, you can ask
Moritz
or try to reach people on the
#mediawiki (freenode) or #mathml (mozilla) channels. Thanks in advance for your help!
As I mentioned three months ago, I wanted to start a crowdfunding campaign so that I can have more time to devote to MathML developments in browsers and (at least for Mozilla) continue to mentor volunteer contributors. Rather than running several crowdfunding campaigns for small features, I finally decided to run a single campaign with Ulule so that I have to worry about the funding only once. It also seemed more convenient to rely on a French/EU website regarding legal issues, taxes etc. Also, just like Kickstarter, Ulule makes it possible to offer "rewards" to backers according to their level of contribution, which gives a better way to motivate them.
As everybody following MathML activities has noticed, big companies/organizations do not want to significantly invest in funding MathML developments at the moment. So the rationale for a crowdfunding campaign is to rely on the support of the current community and on the help of smaller companies/organizations that have a business interest in it. Each one can give a small contribution and these contributions add up to enough money to fund the project. Of course this model is probably not viable in the long term, but at least it allows us to start something instead of complaining without acting, and to show bigger actors that there is a demand for these developments. As indicated on the Ulule Website, this is a way to start a relationship and to build a community around a common project. My hope is that it could lead to long-term funding of MathML developments and better partnerships between the various actors.
Because one of the main demands for MathML (besides accessibility) is in EPUB, I've included in the project goals a collection of documents that demonstrate advanced Web features with native MathML. That way I can offer more concrete rewards to people and federate them around the project. Indeed, much of the work needed to improve the MathML rendering requires some preliminary "code refactoring" which is not really exciting or immediately visible to users...
Hence I launched the crowdfunding campaign on the 19th of November and we reached 1/3 of the minimal funding goal in only three days! This was mainly thanks to the support of individuals from the MathML community. In mid-December we reached the minimal funding goal after a significant contribution from the KWARC Group (Jacobs University Bremen, Germany), with which I have been in communication since the launch of the campaign. Currently, we are at 125% and this means that, minus the Ulule commission and my social/fiscal obligations, I will be able to work on the project for about 3 months.
I'd like to thank again all the companies, organizations and people who have supported the project so far! The crowdfunding campaign continues until the end of January so I hope more people will get involved. If you want better MathML in Web rendering engines and ebooks then please support this project, even with a symbolic contribution. If you want to make a more significant contribution as a company/organization then note that Ulule only provides a service to organize the crowdfunding campaign; otherwise the funding is legally treated the same as required by my self-employed status. Feel free to contact me with any questions on the project or funding and to discuss the long-term perspective.
Finally, note that I've used my savings and I plan to continue like that until the official project launch in February. Below is a summary of what has been done during the five weeks before the holiday season. This is based on my weekly updates for supporters where you can also find references to the Bugzilla entries. Thanks to the Apple & Mozilla developers who spent time reviewing my patches!
Collection of documents
The goal is to show how to use existing tools (LaTeXML, itex2MML, tex4ht etc) to build EPUB books for science and education using Web standards. The idea is to cover various domains (maths, physics, chemistry, education, engineering...) as well as Web features. Given that
many scientific circles are too biased towards "math on paper / PDF" and closed research practices, it may look innovative to use the Open Web, but to be honest the MathML language and its integration with other Web formats have been well established for a long time. Hence in theory it should "just work" once you have native MathML support, without any convolutions or hacks. Here are a couple of features that are tested in the sample EPUB books that I wrote:
Rendering of MathML equations (of course!). Since the screen size and resolution vary for e-readers, automatic line breaking / reflowing of the page is "naturally" tested and is an important distinction with respect to paper / PDF documents.
CSS styling of the page and equations. This includes using (Web) fonts,
which are very important for mathematical publishing.
Using SVG schemas and how they can be mixed with MathML equations.
Using non-ASCII (Arabic) characters and RTL/LTR rendering of both the text and equations.
Interactive document using Javascript and <maction>, <input>, <button> etc. For those who are curious, I've created some videos for an algebra course and a lab practical.
Using the <video> element to include short sequences of an experiment in a physics course.
Using the <canvas> element to draw graphs of functions or of physical measurements.
Using WebGL to draw interactive 3D schemas. At the moment, I've only adapted a chemistry course and used ChemDoodle to load Crystallographic Information Files (CIF) and provide 3D representations of crystal structures. But of course, there is no problem with putting MathML equations in WebGL to create other kinds of scientific 3D schemas.
WebKit
I've finished some work started as a MathJax developer, including the maction support requested by the KWARC Group. I then tried to focus on the main goals: rendering of token elements and more specifically operators (spacing and stretching).
I improved LTR/RTL handling of equations (full RTL support is not implemented yet and not part of the project goal).
I improved the maction elements and implemented the toggle actiontype.
I refactored the code of some "mrow-like" elements to make them all behave like an <mrow> element. For example, while WebKit stretched (some) operators in <mrow> elements, it could not stretch them in <mstyle>, <merror> etc. Similarly, this will be needed to implement correct spacing around operators in <mrow> and other "mrow-like" elements.
I analyzed more carefully the vertical stretching of operators. I see at least two serious bugs to fix: baseline alignment and stretch size. I've uploaded an experimental patch to improve that.
Preliminary work on the MathML Operator Dictionary. This dictionary contains various properties of operators like spacing and stretchiness and is fundamental for later work on operators.
I have started to refactor the code for mi, mo and mfenced elements. This is also necessary for many serious bugs like the operator dictionary and the style of mi elements.
I have written a patch to restore support for foreign objects in annotation-xml elements and to implement the same selection algorithm as Gecko.
Gecko
I've continued to clean up the MathML code and to mentor volunteer contributors. The main goal is the support for the Open Type MATH table, at least for operator stretching.
Xuan Hu's work on the <mpadded> element landed in trunk. This element is used to modify the spacing of equations, for example by some TeX-to-MathML generators.
On Linux, I fixed a bug with preferred widths of MathML token elements. Concretely, when equations are used inside table cells or similar containers there is a bug that makes equations overflow the containers. Unfortunately, this bug is still present on Mac and Windows...
James Kitchener implemented the mathvariant attribute (e.g. used by some tools to write symbols like double-struck, fraktur etc). This also fixed remaining issues with preferred widths of MathML token elements. Khaled Hosny started
to update his Amiri and XITS fonts to add the glyphs for Arabic mathvariants.
I finished Quentin Headen's code refactoring of mtable. This made it possible to fix some bugs like bad alignment with columnalign. This is also a preparation for future support for rowspacing and columnspacing.
After the two previous points, it was finally possible to remove the private "_moz-" attributes. These were visible in the DOM or when manipulating MathML via Javascript (e.g. in editors, tree inspector, the html5lib etc.).
Khaled Hosny fixed a regression with script alignments. He started to work on improvements regarding italic correction when positioning scripts. Also, James Kitchener made some progress on script size correction via the Open Type "ssty" feature.
I've refactored the stretchy operator code and prepared some patches to read the OpenType MATH table. You can try experimental support for new math fonts with e.g. Bill Gianopoulos' builds and the MathML Torture Tests.
Blink/Trident
MathML development in Chrome or Internet Explorer is not part of the
project goal,
even if MathML improvements to WebKit could
hopefully be imported to Blink in the future. Users keep asking for MathML in IE and I hope that a solution will be found to save MathPlayer's work. In the meantime, I've sent a proposal to Google and Microsoft to implement fallback content (alttext and semantics annotation) so that authors can use it. This is just a couple of CSS rules that could be integrated in the user agent style sheet. Let's see which of the two companies is the most reactive...
Note: some parts of this blog post (especially the Javascript program) may be lost when exported to Planet or other feed aggregators. Please view it on the original page.
I recently took a look at the description of the CSS 2D / SVG transform
matrix(a, b, c, d, e, f) on MDN and I added a
concrete example showing the effect of such a
transform on an SVG line, in order to make this clearer for people who
are not familiar with affine transformations or matrices.
This also reminded me of a small algorithm to decompose an arbitrary
SVG transform into a composition of basic transforms (Scale, Rotate,
Translate and Skew) that I wrote 5 years ago for the Amaya SVG
editor.
I translated it into Javascript and I make it available here. Feel free
to copy it on MDN or anywhere else. The convention used to represent
transforms as 3-by-3 matrices
is the one of the SVG specification.
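To make the convention concrete, here is a minimal Python sketch (not the Javascript program from the original page). It assumes the SVG convention in which matrix(a, b, c, d, e, f) maps a point (x, y) to (a x + c y + e, b x + d y + f); the helper names `apply_matrix` and `multiply` are hypothetical and only serve the illustration.

```python
def apply_matrix(m, point):
    """Apply the transform matrix(a, b, c, d, e, f) to a point (x, y)."""
    a, b, c, d, e, f = m
    x, y = point
    return (a * x + c * y + e, b * x + d * y + f)


def multiply(m1, m2):
    """Return the 6-tuple of the transform list "m1 m2" (m2 is applied first).

    This is the product of the two 3-by-3 matrices, keeping only the
    first two rows since the last row is always (0, 0, 1).
    """
    a1, b1, c1, d1, e1, f1 = m1
    a2, b2, c2, d2, e2, f2 = m2
    return (
        a1 * a2 + c1 * b2,
        b1 * a2 + d1 * b2,
        a1 * c2 + c1 * d2,
        b1 * c2 + d1 * d2,
        a1 * e2 + c1 * f2 + e1,
        b1 * e2 + d1 * f2 + f1,
    )


# translate(10, 20) scale(2, 3): the point is scaled first, then translated.
translate = (1, 0, 0, 1, 10, 20)
scale = (2, 0, 0, 3, 0, 0)
combined = multiply(translate, scale)
image = apply_matrix(combined, (1, 1))
```

With these values `combined` is the 6-tuple of matrix(2, 0, 0, 3, 10, 20), which shows that in a transform list the rightmost transform is applied to the point first.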
Live demo
[Interactive demo, available only on the original page: enter the CSS 2D transform you want to reduce and decompose, or pick an example from the list, and choose between the LU-like and QR-like decompositions. The page then shows the reduced CSS/SVG matrix as computed by your rendering engine with its matrix representation, the SVG and CSS decompositions into simple transformations after simplification (modulo rounding errors) together with their renderings, and a matrix decomposition of the original transform.]
Mathematical Description
The decomposition algorithm is based on the classical
LU and
QR
decompositions. First remember the SVG specification: the transform
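matrix(a,b,c,d,e,f) is represented by the matrix
$$\left(\begin{array}{ccc}a& c& e\\ b& d& f\\ 0& 0& 1\end{array}\right)=\left(\begin{array}{ccc}1& 0& e\\ 0& 1& f\\ 0& 0& 1\end{array}\right)\left(\begin{array}{ccc}a& c& 0\\ b& d& 0\\ 0& 0& 1\end{array}\right)$$
which shows the classical factorization into a composition of a linear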
transformation $\left(\begin{array}{cc}a& c\\ b& d\end{array}\right)$
and a translation $\left(\begin{array}{c}e\\ f\end{array}\right)$. Now let's focus on the matrix
$\left(\begin{array}{cc}a& c\\ b& d\end{array}\right)$ and denote $\Delta =ad-bc$ its determinant. We first
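consider the LDU decomposition. If $a\ne 0$, we can use it as a pivot and
apply one step of Gaussian elimination:
$$\left(\begin{array}{cc}a& c\\ b& d\end{array}\right)=\left(\begin{array}{cc}1& 0\\ b/a& 1\end{array}\right)\left(\begin{array}{cc}a& c\\ 0& \Delta /a\end{array}\right)=\left(\begin{array}{cc}1& 0\\ b/a& 1\end{array}\right)\left(\begin{array}{cc}a& 0\\ 0& \Delta /a\end{array}\right)\left(\begin{array}{cc}1& c/a\\ 0& 1\end{array}\right)$$
Hence if $a\ne 0$, the transform matrix(a,b,c,d,e,f) can be
written
translate(e,f) skewY(atan(b/a)) scale(a, Δ/a) skewX(atan(c/a)).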
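If $a=0$ and $b\ne 0$ then we have $\Delta =-cb$ and we
can write (this is approximately "LU with full pivoting"):
$$\left(\begin{array}{cc}0& c\\ b& d\end{array}\right)=\left(\begin{array}{cc}0& -1\\ 1& 0\end{array}\right)\left(\begin{array}{cc}b& d\\ 0& -c\end{array}\right)=\left(\begin{array}{cc}0& -1\\ 1& 0\end{array}\right)\left(\begin{array}{cc}b& 0\\ 0& \Delta /b\end{array}\right)\left(\begin{array}{cc}1& d/b\\ 0& 1\end{array}\right)$$
and so the transform becomes
translate(e,f) rotate(90°) scale(b, Δ/b) skewX(atan(d/b)). Finally,
if $a=b=0$, then we already have an LU decomposition and we can just write
$$\left(\begin{array}{cc}0& c\\ 0& d\end{array}\right)=\left(\begin{array}{cc}c& 0\\ 0& d\end{array}\right)\left(\begin{array}{cc}1& 1\\ 0& 1\end{array}\right)\left(\begin{array}{cc}0& 0\\ 0& 1\end{array}\right)$$
and so the transform is
translate(e,f) scale(c, d) skewX(45°) scale(0, 1).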
As a consequence, we have proved that any transform
matrix(a,b,c,d,e,f)
can be decomposed into a product of simple
transforms. However, the decomposition is not always what we want, for example
scale(2) rotate(30°) will be decomposed into a product that
involves skewX and skewY instead of preserving the
nice factors.
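The three LU-like cases can be sketched in Python as follows. This is only an illustration under stated assumptions, not the Javascript program from the original page: the names `decompose_lu` and `recompose` are hypothetical, angles are expressed in degrees, every skew argument is an angle (hence the atan), and exact comparisons with zero are used where a real implementation would probably need a tolerance.

```python
import math


def decompose_lu(a, b, c, d, e, f):
    """LU-like decomposition of matrix(a, b, c, d, e, f).

    Returns (name, args) pairs, leftmost transform first; angles in degrees.
    """
    delta = a * d - b * c  # determinant of the linear part
    result = [("translate", (e, f))]
    if a != 0:
        # a is used as the pivot.
        result.append(("skewY", (math.degrees(math.atan(b / a)),)))
        result.append(("scale", (a, delta / a)))
        result.append(("skewX", (math.degrees(math.atan(c / a)),)))
    elif b != 0:
        # a = 0: swap the rows with a 90 degree rotation, then use b as the pivot.
        result.append(("rotate", (90.0,)))
        result.append(("scale", (b, delta / b)))
        result.append(("skewX", (math.degrees(math.atan(d / b)),)))
    else:
        # a = b = 0: the first column is zero.
        result.append(("scale", (c, d)))
        result.append(("skewX", (45.0,)))
        result.append(("scale", (0.0, 1.0)))
    return result


def recompose(transforms):
    """Multiply the simple transforms back into a 6-tuple (a, b, c, d, e, f)."""
    m = (1.0, 0.0, 0.0, 1.0, 0.0, 0.0)
    for name, args in transforms:
        if name == "translate":
            n = (1.0, 0.0, 0.0, 1.0, args[0], args[1])
        elif name == "scale":
            n = (args[0], 0.0, 0.0, args[1], 0.0, 0.0)
        else:
            t = math.radians(args[0])
            n = {
                "rotate": (math.cos(t), math.sin(t), -math.sin(t), math.cos(t), 0.0, 0.0),
                "skewX": (1.0, 0.0, math.tan(t), 1.0, 0.0, 0.0),
                "skewY": (1.0, math.tan(t), 0.0, 1.0, 0.0, 0.0),
            }[name]
        a1, b1, c1, d1, e1, f1 = m
        a2, b2, c2, d2, e2, f2 = n
        m = (a1 * a2 + c1 * b2, b1 * a2 + d1 * b2,
             a1 * c2 + c1 * d2, b1 * c2 + d1 * d2,
             a1 * e2 + c1 * f2 + e1, b1 * e2 + d1 * f2 + f1)
    return m


# scale(2) rotate(30°) is factored with skews instead of the "nice" factors.
t = math.radians(30)
names = [name for name, _ in decompose_lu(2 * math.cos(t), 2 * math.sin(t),
                                          -2 * math.sin(t), 2 * math.cos(t), 0, 0)]
```

Multiplying the factors back with `recompose` should return the original six coefficients up to rounding errors, which is a convenient way to check each case.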
We thus consider instead the QR decomposition.
If $\Delta \ne 0$, then by applying the Gram–Schmidt process to the columns
$\left(\begin{array}{c}a\\ b\end{array}\right),\left(\begin{array}{c}c\\ d\end{array}\right)$
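we obtain
$$\left(\begin{array}{cc}a& c\\ b& d\end{array}\right)=\left(\begin{array}{cc}a/r& -b/r\\ b/r& a/r\end{array}\right)\left(\begin{array}{cc}r& (ac+bd)/r\\ 0& \Delta /r\end{array}\right)=\left(\begin{array}{cc}a/r& -b/r\\ b/r& a/r\end{array}\right)\left(\begin{array}{cc}r& 0\\ 0& \Delta /r\end{array}\right)\left(\begin{array}{cc}1& (ac+bd)/{r}^{2}\\ 0& 1\end{array}\right)$$
where $r=\sqrt{{a}^{2}+{b}^{2}}\ne 0$. In that case, the transform becomes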
translate(e,f) rotate(sign(b) * acos(a/r)) scale(r, Δ/r)
skewX(atan((a c + b d)/r^2)). In particular, a similarity transform
preserves orthogonality and length ratio and so
$ac+bd=\left(\begin{array}{c}a\\ b\end{array}\right)\cdot \left(\begin{array}{c}c\\ d\end{array}\right)=0$
and $\Delta =\parallel \left(\begin{array}{c}a\\ b\end{array}\right)\parallel \parallel \left(\begin{array}{c}c\\ d\end{array}\right)\parallel \mathrm{sin}(\pi /2)={r}^{2}$. Hence for a
similarity transform we get
translate(e,f) rotate(sign(b) * acos(a/r)) scale(r) as wanted.
We also
note that it is enough to assume the weaker hypothesis
$r\ne 0$ (that is $a\ne 0$ or $b\ne 0$)
in the expression above and so the decomposition applies in that case too.
Similarly, if we let $s=\sqrt{{c}^{2}+{d}^{2}}$ and instead assume
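$c\ne 0$ or $d\ne 0$ we get
$$\left(\begin{array}{cc}a& c\\ b& d\end{array}\right)=\left(\begin{array}{cc}d/s& c/s\\ -c/s& d/s\end{array}\right)\left(\begin{array}{cc}\Delta /s& 0\\ 0& s\end{array}\right)\left(\begin{array}{cc}1& 0\\ (ac+bd)/{s}^{2}& 1\end{array}\right)$$
Hence in that case the transform is
translate(e,f) rotate(90° - sign(d) * acos(-c/s)) scale(Δ/s, s) skewY(atan((a c + b d)/s^2)). Finally if $a=b=c=d=0$, then the transform is
just
translate(e,f) scale(0,0).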
The decomposition algorithms are now easy to write. We note that none of
them gives the best result in all the cases (compare for example how they factor
Rotate2 and Skew1). Also, for completeness we have included the noninvertible
transforms in our study
(that is $\Delta =0$) but in practice they are not really useful (try
NonInvertible).
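For comparison, here is a hedged Python sketch of the QR-like decomposition described above; again it is not the Javascript program from the original page and the names `decompose_qr` and `recompose` are hypothetical. One deliberate substitution: the rotation angles are computed with atan2 instead of sign(b) * acos(a/r) and sign(d) * acos(-c/s), which gives the same angle in general and also covers the case b = 0, a < 0 (a rotation of 180°). Angles are in degrees and zero tests are exact, where a real implementation would probably use a tolerance.

```python
import math


def decompose_qr(a, b, c, d, e, f):
    """QR-like decomposition of matrix(a, b, c, d, e, f).

    Returns (name, args) pairs, leftmost transform first; angles in degrees.
    """
    delta = a * d - b * c
    r = math.hypot(a, b)  # length of the first column (a, b)
    s = math.hypot(c, d)  # length of the second column (c, d)
    result = [("translate", (e, f))]
    if r != 0:
        # atan2(b, a) plays the role of sign(b) * acos(a/r).
        result.append(("rotate", (math.degrees(math.atan2(b, a)),)))
        result.append(("scale", (r, delta / r)))
        result.append(("skewX", (math.degrees(math.atan((a * c + b * d) / (r * r))),)))
    elif s != 0:
        # atan2(d, -c) plays the role of sign(d) * acos(-c/s).
        result.append(("rotate", (90.0 - math.degrees(math.atan2(d, -c)),)))
        result.append(("scale", (delta / s, s)))
        result.append(("skewY", (math.degrees(math.atan((a * c + b * d) / (s * s))),)))
    else:
        result.append(("scale", (0.0, 0.0)))
    return result


def recompose(transforms):
    """Multiply the simple transforms back into a 6-tuple (a, b, c, d, e, f)."""
    m = (1.0, 0.0, 0.0, 1.0, 0.0, 0.0)
    for name, args in transforms:
        if name == "translate":
            n = (1.0, 0.0, 0.0, 1.0, args[0], args[1])
        elif name == "scale":
            n = (args[0], 0.0, 0.0, args[1], 0.0, 0.0)
        else:
            t = math.radians(args[0])
            n = {
                "rotate": (math.cos(t), math.sin(t), -math.sin(t), math.cos(t), 0.0, 0.0),
                "skewX": (1.0, 0.0, math.tan(t), 1.0, 0.0, 0.0),
                "skewY": (1.0, math.tan(t), 0.0, 1.0, 0.0, 0.0),
            }[name]
        a1, b1, c1, d1, e1, f1 = m
        a2, b2, c2, d2, e2, f2 = n
        m = (a1 * a2 + c1 * b2, b1 * a2 + d1 * b2,
             a1 * c2 + c1 * d2, b1 * c2 + d1 * d2,
             a1 * e2 + c1 * f2 + e1, b1 * e2 + d1 * f2 + f1)
    return m


# The similarity scale(2) rotate(30°) keeps its nice factors with QR.
t = math.radians(30)
steps = decompose_qr(2 * math.cos(t), 2 * math.sin(t),
                     -2 * math.sin(t), 2 * math.cos(t), 0, 0)
```

For this similarity the sketch returns a rotation of about 30°, a uniform scale of about 2 and a zero skew, whereas an LU-like factorization of the same matrix involves skewX and skewY.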
This morning, Deyan Ginev announced on the LaTeXML mailing list that the first alpha version of LaTeXML with LaTeX to EPUB support is now available. This is very good news for people willing to encourage researchers to move from offline formats to more modern Web formats. Although some people
had already succeeded in combining LaTeX-to-XHTML converters
and XHTML-to-EPUB converters, this is the first tool that I'm aware of that can do the direct LaTeX to EPUB3 (XHTML+MathML) conversion. I already mentioned a couple of Gecko-based EPUB tools in my previous blog post, so let's have a look at three of them. Feel free to mention more Gecko-based EPUB tools in the comments; I'm particularly interested in hearing about FirefoxOS applications that would be similar
to Apple's iBooks.
I have updated the LaTeXML samples based on Boris Zbarsky's thesis that we demonstrated at the Innovation Fairs in Santa Clara & Brussels. This shows how to generate the traditional PDF version, the Web version, the Web version with MathJax fallback and now the EPUB version! Here are some screenshots using the Firefox extension Lucifox:
Boris' Thesis in Lucifox ; page 2
Boris' Thesis in Lucifox ; page 4
I have intentionally not shown the diagrams that are incorrectly converted by LaTeXML due to missing Xy-pic support (this is still in development). However,
Gecko supports mixing SVG and MathML via the foreignObject element so this would not be a problem for Gecko-based EPUB readers. Here are some screenshots of an ebook about
regular polygons that can be constructed with compass and straightedge that I have created with the help of itex2MML. They are viewed in EPUBReader which is another Firefox extension:
EPUBReader, Constructible Numbers
EPUBReader, Cyclic Galois Extension
Lucifox and EPUBReader have a big drawback: they do not support EPUB pages with the "scripted" property. This means that you can not use Javascript to create dynamic ebooks with live samples or interactive exercises... but this is one of the reasons to use Web formats! Fortunately, there is a XUL application called AZARDI that supports this feature. I have created another ebook that shows an interactive
course on matrices. Click on the image to see the video on YouTube:
update 2013-10-15: since I got feedback, I have to say that my funding plan is independent of my work at MathJax ; I'm not a MathJax employee but I have an independent contractor status. Actually, I already used my business to fund an intern for Gecko MathML developments during Summer 2011 :-)
Retrospect
Since last April, I have been allowed by the MathJax Consortium to dedicate a small amount of my time to do MathML development in browsers, until possibly more serious involvement later. At the same time, we mentioned this plan to Google developers but unfortunately they just decided to drop the WebKit MathML code from Blink, making external contributions hard and unwelcome...
Hence I have focused mainly on Gecko and WebKit: You can find the MathML bugs that have been closed during that period on bugzilla.mozilla.org and bugs.webkit.org. For Gecko, this has allowed me to finish some of the work I started as a volunteer before I was involved full-time in MathJax as well as to continue to mentor MathML contributors. Regarding WebKit, I added a few new basic features like MathML lengths, <mspace> or <mmultiscripts> while I was getting familiar with the MathML code and WebKit organization/community. I also started to work
on <semantics> and <maction>.
More importantly, I worked with Martin Robinson to address the design concerns of Google developers and a patch to fix these issues finally landed early this week.
However, my progress has been slow so as I mentioned in my previous blog post, I am planning to find
a way to fund MathML developments...
Why funding MathML?
Note: I am assuming that the readers of this blog know why MathML is important and are aware of the benefits it can bring
to the Web community. If not, please check
Peter Krautzberger's Interview by Fidus Writer or the MozSummit MathML slides for a quick introduction.
Here my point is to explain why we need more than volunteer-driven
development for MathML.
First the obvious thing: Volunteer time is limited so if
we really want to see serious progress in MathML support we need to give a
boost to MathML developments. E-book publishers/readers, researchers/educators who are stuck outside the Web in a LaTeX-to-PDF world, developers/users of accessibility tools or the MathML community in general want good math support in browsers now, not after waiting 15 more years until all layout engines catch up with
Gecko or the old Gecko bugs are fixed.
There are classical misunderstandings from people thinking that non-native
MathML solutions and other polyfills are the future or that math on the Web could be implemented
via PNG/SVG images or Web Components.
Just open a math book and you will see
that e.g. inline equations must be correctly aligned with the text or
participate
in line wrapping. Moreover we are considering math on the Web not math on paper,
so we want it to be compatible with HTML, SVG, CSS, Javascript,
Unicode, Bidi etc and also something that is fast and responsive. Technically,
this means that a clean solution must be in the core rendering engine,
spread over several parts of the code and must have strong interaction with the
various components like the HTML5 parser, the layout tree,
the graphic and font libraries, the DOM module, the style tree and so forth.
I do not see any volunteer-driven
Blink/Gecko/WebKit feature off the top of my head that has this
characteristic and actually even SVG or any other kind of language for
graphics have less interaction with HTML than MathML has.
The consequence of this is that it is extremely difficult for volunteers
to get involved in native MathML
and to make quick progress because they have to understand
how the various components of the Blink/Gecko/WebKit code work and be sure to do
things correctly. Good mathematical rendering is already something hard by
itself, so that is even more complicated when you are not writing an isolated
rendering engine for math on which you can have full control.
Also, working at the Blink/Gecko/WebKit level requires technical skills above the average
so finding volunteers who can work with the high-minded engineers of
the big browser companies is not something easy.
For instance, among the enthusiastic people coming to me and
willing to help MathML in Gecko, many got stuck when e.g. they tried to build
the Firefox source or do something more advanced and I never heard back from
them.
In the other direction, Blink/Gecko/WebKit paid developers are generally
not familiar with
MathML and do not have time to learn more about it
and thus can not always provide a relevant review of the code, or they may
break something while trying to modify code they do not entirely understand.
Moreover,
both the volunteers and paid staff have only a small amount of time to do
MathML stuff while the other parts of the engine evolve very quickly,
so it's sometimes hard to keep everything in sync.
Finally,
the core layout engines have strong security requirements that are difficult to satisfy in a volunteer-driven situation...
Beyond volunteer-driven MathML developments
At that point, there are several options. First the lazy one: Give up
on native math rendering, only focus on features that have impact on the
widest Web audience (i.e. those that would allow browser vendors to get more market share and thus increase their profit), thank the math people for creating the Web and kindly ask them to use
whatever hacks they can imagine to display equations on the Web. Of course as a
Mozillian, I think people
must decide the Web they want and thus exclude this option.
Next there is the ingenuous option: Expect that browser companies
will understand the importance of math-on-the-Web and start investing
seriously in MathML support. However, Netscape and Microsoft
rejected the <MATH> tag from the 1995 HTML 3.0 draft and the browser
companies have kept repeating they would only rely on volunteer contributions
to move MathML forward, despite the repeated requests from MathML folks and other scientific communities. So that option is excluded too, at least in the short
to medium term.
So there remains the ambitious option: Math people and other interested parties
must get together and try to fund native MathML developments. Despite the effort
of my manager at MathJax to convince partners and raise funds, my situation has
not changed much since April and it is not clear when/if the MathJax Consortium
can take the lead in native MathML developments. Given my expertise
in Gecko, WebKit and MathML, I feel the duty to do something.
Hence I wish to reorganize
my work time: Decrease my involvement in MathJax core in order to increase
my involvement in Gecko/WebKit developments. But I need the help of the
community for that purpose. If you run a business with interest for math-on-the-Web
and are willing to fund my work, then feel free to contact me directly by
mail for further discussion. In the short term, I want
to experiment with
Crowd Funding as
discussed in the next section. If this is successful we can think
of a better organization for MathML developments in the long term.
Crowd Funding
Wikipedia defines
Crowd funding as
"the collective effort of individuals who network and pool their money, usually
via the Internet, to support efforts initiated by other people or organizations". There are several Crowd Funding platforms with similar rules/interfaces.
I am considering Catincan which is specialized in Open Source Crowd Funding, can be used by any backer/developer around the world, can rely on Bugzilla to track the bug status and
seems to have a good process to collect the
funds from backers and to pay developers.
You can easily login to the Catincan Website if
you have a GitHub, Facebook or Google account (apparently
Persona is not supported yet...). Finally, it seems to have a communication interface between backers and
developers, so that everybody can follow the development on the funded
features.
One distinctive feature of Catincan is that only well-established Open
Source projects can be funded and only developers from these projects can
propose and work on the new features ; so that backers can trust that the
features will be implemented. Of course, I have been working on Gecko, WebKit and
MathML projects
so I hope people believe I sincerely want to improve
MathML support in browsers and that I have the skills to do so ;-)
As said in my previous blog post, it is not clear at all (at least to me)
whether Crowd Funding can be a reliable method, but it is worth trying. There are
many individuals and small businesses showing interest in MathML, without
the technical knowledge or appropriate staff to improve MathML in browsers. So if each
one contributes a small amount of money, perhaps we can get something.
One constraint is that each feature has 60 days to reach the
funding goal. I do not have any idea about how many people are willing
to contribute to MathML and how much money they can give.
The statistical sample of projects currently funded is too small to extract relevant
information. However, I essentially see two options:
Either propose small features
and split the big ones into small steps, so that each Catincan submission
will need less work/money and improvements will be progressive with
regular feedback to backers;
or propose larger features so they look more attractive and exciting to people
and will require less frequent submissions to Catincan.
At the beginning, I plan to start with the former and if the crowd funding is
successful perhaps try the latter.
Status in Open Source Layout Engines
Note: Obviously, Open Source Crowd Funding does not apply
to Internet Explorer, which is the one main rendering engine not mentioned below. Although
Microsoft has done a great job on MathML for Microsoft Word, they did not
give any public statement about MathML in Internet Explorer and all the bug
reports for MathML have been resolved "by design" so far. If you are interested
in MathML rendering and accessibility in Internet Explorer, please check
Design Science blog for the latest updates
and tools.
Blink
Note: I am actually focusing on the history of Chromium here but of course there are other Blink-based browsers. Note that programs like QtWebEngine (formerly WebKit-based) or Opera (formerly Presto-based) lost the opportunity to get MathML support when they switched to Blink.
Alex Milowski and François Sausset's first MathML implementation did not
pass Google's security review. Dave Barton fixed many issues in that implementation and as far as I know, there were not any known security vulnerabilities when Dave submitted his last version. MathML was enabled in Chrome 24 but Chrome developers had some concerns about the design of the MathML implementation in WebKit, which indeed violated some assumptions of WebKit layout code. So MathML was disabled in Chrome 25 and as said in the introduction, the source code was entirely removed when they forked.
Currently, the Chromium Dashboard indicates that MathML is shipped in Firefox/Safari, has positive feedback from developers and is an established standard, but the Chromium status remains "No active development".
If I understand correctly,
Google's official position is that
they do not plan to invest in MathML development but will accept external
contributions and may re-enable MathML when it is ready
(for some sense of "ready" to be defined).
Given the MathML story in
Chrome, it seems really unlikely that any volunteer will magically show up and be willing to
submit a MathML patch. Incidentally, note the
interesting work
of the ChromeVox team regarding MathML accessibility:
Their recent video
provides a good overview of what they achieve (where Volker Sorge politely regrets
that "MathML is not implemented in all browsers").
Although Google's design concerns have now been addressed in WebKit, the
most serious remark from one Google engineer is that the WebKit MathML implementation is
of too low quality to be shipped, so they just prefer to have no MathML
at all. As a consequence, the best short-term strategy seems to be improving
WebKit MathML support and, once it is good enough, to submit a patch to
Google. The immediate corollary is that if you wish to see MathML in Chrome
or other Blink-based browsers you should
help WebKit MathML development. See the next section for
more details.
chromatic
Actually, I tried to import MathML into Blink one day this summer. However,
there were divergences between the WebKit and Blink code bases that made that
a bit difficult. I do not plan to try again anytime soon, but if someone is
interested, I have published my script and patch on GitHub. Note there may be even more divergences now and the patch is
certainly bit-rotten. I also thought about creating/maintaining a "Chromatic"
browser (Chrome + mathematics) that would be a temporary fork to let Blink
users benefit from native MathML until it is integrated back in Blink. But
at the moment, that would probably be too much effort for one person and
I would prefer to focus on WebKit/Gecko developments for now.
WebKit
The situation in WebKit is much better. As said before, Google's concerns
are now addressed and MathML will be enabled again in all WebKit releases
soon.
Martin Robinson is interested in helping the MathML developments in
WebKit and his knowledge of fonts will be important to improve operator
stretching, which is one of the biggest issues right now.
One new volunteer contributor, Gurpreet Kaur, also started to
do some work on WebKit like support for the *scriptshifts
attributes or for the <menclose> element. Last but
not least, a couple of Apple/WebKit developers reviewed and accepted
patches and even helped to fix a few bugs, which made it possible to move
development forward.
When he was still working on WebKit, Dave Barton opened bug 99623 to track the top priorities. When the bugs below and their related dependencies are fixed, I think the rendering in WebKit will be good enough to be usable for advanced math notations and WebKit will pass the MathML Acid 1 test.
Bug 44208:
For example, in an expression like
$\mathrm{sin}\left(x\right)$,
the "x" should be in italic but not the "sin". This is actually slightly
more complicated: It says when the default mathvariant
value must be normal/italic.
mathvariant is more like
the
text-transform CSS property in the sense that it remaps
some characters to the corresponding mathematical characters (italic, bold, fraktur,
double-struck...); for example,
$\mathfrak{A}$ (mathvariant=fraktur A)
should render exactly the same as 𝔄 (U+1D504).
By the way, there is the related bug 24230 on Windows, which prevents the use of plane 1 characters.
The best solution will probably be to
implement mathvariant correctly. See also Gecko's ongoing work by James Kitchener below.
Bug 99618: Implement <mmultiscripts>, that allows expressions like
${}_{6}{}^{14}\mathrm{C}$ or $R_{i}{}_{j}{}_{;}{}^{j}=\frac{1}{2}S_{;}{}_{i}$. As said in the introduction, this is fixed in WebKit Nightly.
Bug 99614: Support for stretchy operators like in
${\left(\frac{\overline{{z}_{1}+{z}_{2}}}{3}\right)}^{4}$. Currently,
WebKit can only stretch operators vertically using a few Unicode constructions
like ⎛ (U+239B) + ⎜ (U+239C) + ⎝ (U+239D) for the left parenthesis.
Essentially only similar delimiters like brackets, braces, etc. are supported.
For small
sizes like $($ or for large operators like
$\sum {n}^{2}$ it is necessary to use non-Unicode glyphs in various math fonts, but this
is not possible in WebKit MathML yet. All of this will require a fair amount of
work: implementing horizontal stretching, font-specific stuff,
largeop/symmetric properties, etc.
Bug 99620:
Implement the operator dictionary. Currently, WebKit treats all the operators the same way, so for
example it will use the same 0.2em spacing before and after parentheses, equal signs or invisible
operators in e.g.
$f\left(x\right)={x}^{2}$. Instead it should use the information provided by the MathML operator dictionary. This dictionary also specifies whether operators are stretchy, symmetric or
largeop and thus is related to the previous point.
Bug 119038: Use the code for vertical stretchy operators to draw the radical symbols
in roots like $\sqrt{\frac{2}{3}}$. Currently,
WebKit uses graphic primitives which do not give a really good rendering.
Bug 115610: Implement <mspace> which is used by many MathML generators
to do some spacing in mathematical formulas. As said in the introduction, this is fixed in WebKit Nightly.
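To illustrate the mathvariant remapping described for bug 44208 above, here is a minimal Python sketch of the fraktur case. It is not WebKit or Gecko code: the names to_fraktur and FRAKTUR_EXCEPTIONS are hypothetical and only chosen for this example. The code points come from Unicode's Mathematical Alphanumeric Symbols block, and the five capitals that were already encoded in the Letterlike Symbols block are handled as exceptions.

```python
# Sketch only: remap ASCII letters to their Fraktur counterparts, the way a
# mathvariant="fraktur" implementation is expected to remap characters.

# These five capitals were encoded earlier in the Letterlike Symbols block,
# so the corresponding Plane 1 positions are left unassigned.
FRAKTUR_EXCEPTIONS = {
    "C": "\u212D",
    "H": "\u210C",
    "I": "\u2111",
    "R": "\u211C",
    "Z": "\u2128",
}

FRAKTUR_CAPITAL_A = 0x1D504  # MATHEMATICAL FRAKTUR CAPITAL A
FRAKTUR_SMALL_A = 0x1D51E    # MATHEMATICAL FRAKTUR SMALL A


def to_fraktur(text):
    """Return text with ASCII letters replaced by Fraktur characters."""
    out = []
    for ch in text:
        if ch in FRAKTUR_EXCEPTIONS:
            out.append(FRAKTUR_EXCEPTIONS[ch])
        elif "A" <= ch <= "Z":
            out.append(chr(FRAKTUR_CAPITAL_A + ord(ch) - ord("A")))
        elif "a" <= ch <= "z":
            out.append(chr(FRAKTUR_SMALL_A + ord(ch) - ord("a")))
        else:
            # Digits, operators, etc. have no Fraktur form: keep them as-is.
            out.append(ch)
    return "".join(out)
```

With this sketch, to_fraktur("A") gives the single plane 1 character U+1D504, which is what a correct implementation should render for a fraktur "A". Other mathvariant values (bold, italic, double-struck...) would follow the same pattern with their own offsets and exceptions.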
In order to pass the Mozilla MathML torture tests, at least displaystyle and scriptlevel must be implemented too, probably as internal CSS properties. This should also make it possible to pass
Joe Java's MathML test, although that one relies on the infamous <mfenced>
that duplicates the stretchy operator features and is implemented inconsistently
in rendering engines. I think passing the MathML Acid 2 test will require slightly more effort,
but I expect this goal to be achievable if I have more time to work on WebKit:
Bug 120164: Implement negative spacing for <mspace> (I have an experimental patch).
Bug 85730: Implement <mpadded>, which is also used by MathML generators to do some tweaking of formulas. I have only done some experiments; this would be a generalization of <mspace>.
Bug 85733: Implement the href attribute; well, I guess the name is explicit enough to understand what it is used for! I only have an experimental patch here too. It would mimic what is done in SVG or HTML.
Bug 120059 and
bug 100626: Implement <maction> (at least partially) and <semantics>,
which have been requested by long-time MathML users Jacques Distler and Michael Kohlhase. I have patches ready for that and this could be fixed relatively soon; I just need to find time to finish the work.
In general, passing the MathML Acid 2 test is not too hard: you only need to implement those few MathML elements whose exact rendering is clearly defined by the MathML specification. Passing the MathML Acid 3 test is not expected in the medium term. However, the score will
naturally increase as we improve the WebKit MathML implementation. The priority
is to implement what is currently known to be important to users.
To give examples of bugs not previously mentioned: Implementing menclose or fixing various DOM issues like bugs 57695, 57696 or 107392.
More advanced features like those mentioned in the next section for Gecko
are probably worth considering later (OpenType MATH, linebreaking,
mlabeledtr...). It is worth noting that Apple has already
done some work on accessibility (with MathML being readable by VoiceOver
in iOS7), authoring and EPUB (MathML is enabled in WebKit-based ebook
readers
and iBooks Author has
an integrated LaTeX-to-MathML converter).
Gecko
In general I think I have a good relationship with the Mozilla community and most people have respect for the work that has been done by volunteers for almost 15 years now. The situation has greatly improved since I joined the project; at that time some people claimed the
Mozilla MathML project was dead after Roger Sidje's departure.
One important point is that Karl Tomlinson has worked
on repairing the MathML support when Roger Sidje left the project. Hence
there is at least one Mozilla employee with good knowledge of MathML who
can review the volunteer patches. Another key ingredient is the work that has recently been done by Mozilla to increase engagement of the volunteer
community like good documentation on MDN, the #Introduction channel, Josh Matthews' mentored bugs and of course programs like GSOC. However, as said
above, it is one thing to attract enthusiastic contributors and another thing
to get long-term contributors who can work independently on more advanced features. So
let's go back to my latest Roadmap for the Mozilla MathML Project and see what has been accomplished over the past year:
Font support: Dmitry Shachnev created a Debian package for the
MathJax fonts and Mike Hommey added MathJax and Asana fonts to the list
of suggested packages for Iceweasel. The STIX fonts have also been
updated in Fedora and are installed by default on
Mac OS X Lion (10.7). For Linux distributions, it would be helpful
to implement Auto Installation Support. The bug to
add mathematical fonts to Android was assigned in June but no further progress has happened so far.
Henri Sivonen opened a bug for FirefoxOS but there has not been any progress there either.
I had some patches to restore the "missing MathML fonts" warning (using an information bar) but they were refused by Firefox reviewers. However, the code to detect missing MathML fonts could still be used for the similar bug 648548, which also seems to have been inactive since January. There are still some issues on the MathJax side that prevent integrating Web fonts for the native MathML output mode. So at the moment the solution is
still to inform visitors about MathML fonts or to add MathML Web fonts to your Web site. Khaled Hosny (font and LaTeX expert) recently updated my patches to prepare the support for OpenType fonts and he offered to help on that feature.
After James Kitchener's work on mathvariant, we realized that we will
probably need to provide Arabic mathematical fonts too.
Spacing: Xuan Hu continued to work on <mpadded> improvements and I think his patch is close to being accepted. Quentin Headen made some progress on <mtable> before focusing on his InstantBird GSOC project. He is still far from being able to work on
mtable@rowspacing/columnspacing but a workaround for that has been added
to MathJax. I fixed the negative space regression,
the last thing missing to pass the MathML Acid 2 test; negative spacing is also used in MathJax. Again, Khaled Hosny is willing to help to use the spacing of the OpenType MATH, but that will still be a lot of work.
<mlabeledtr>: A workaround for native MathML has been added in MathJax.
Linebreaking: No progress except that I have worked on fixing a bug with intrinsic width computation. The unrelated printing issues mentioned in the blog post have been fixed, though.
Operator Stretching: No progress. I tried to analyze the regression more carefully, but nothing is ready yet.
Tabular elements: As said above, Quentin Headen has worked a bit on cleaning up <mtable> but there has not been much improvement on that feature so far.
Token elements: My patch for <ms> landed and I have made significant progress on the bad measurement of intrinsic width for token elements (however, the fix only seems to work on Linux right now). James Kitchener has taken over my work on improving our mathvariant support and doing related refactoring of the code. I am confident that he will be able to have something ready soon. The primes in exponents should render correctly with MathJax fonts but for other math fonts we will have to do some glyph substitutions.
Dynamic MathML: No progress here but there are not so many bugs regarding JavaScript+MathML, so that should not be too serious.
Documentation: It is now possible to use MathML in code samples or
directly in the source code. The MathML project pages have been entirely migrated to MDN. Also, Florian Scholz has recently been hired by Mozilla as
a documentation writer (congrats!) and will in particular continue the work he started as a volunteer to document MathML on MDN.
I apologize to volunteers who worked on bugs that are not mentioned above or who are doing documentation or testing that does not appear here. For a complete list of activity since September 2012, Bugzilla is your friend. There are two ways to consider the progress above.
If you see the glass half full, then you see that several people have continued
the work on various MathML issues, they have made some progress and we now pass
the MathML Acid 2 test. If you see
the glass half empty, then you see that most issues have not been addressed
yet and in particular those that are blocking native MathML from being enabled
in MathJax: bug 687807, bug 415413, the math font issues discussed in the first point, and perhaps linebreaking too. That is why I believe we should go beyond volunteer-driven MathML
developments.
Most of the bugs mentioned above are tested by the MathML Acid 3 tests and we will win a few points when they are fixed. Again, passing the MathML Acid 3 test is not a goal by itself, so let's consider the big remaining areas it covers:
Improving Tabular Elements and Operator Stretching, which are obviously important and used a lot in e.g. MathJax.
Linebreaking, which as I said is likely to become fundamental with small screens and ebooks.
Elementary Mathematics (you know addition, subtraction, multiplication, and division that kids learn), which I suspect will be important for educational tools and ebooks.
Alignment: This is the one part of MathML that I am not entirely sure is relevant to work on in the short term. I understand it is useful for advanced layout but most MathML tools currently just rely on tables to do that job and as far as I know the only important engine that implements that is MathPlayer.
Finally there are other features outside the MathML rendering engines that
I also find important but for which I have less expertise:
Transferring MathML, that is, implementing copy/cut/drag and paste. Currently, we can do that by treating MathML as normal HTML5 code or by using the "show MathML source" feature and copying the source code. However, it would be best to implement a standard way to communicate with other MathML applications like Microsoft Word, Mathematica, Maple, Windows' Handwriting panel, etc. I wrote
some work-in-progress patches last year.
Authoring MathML: Essentially implementing things like deletion, insertion, etc. (maybe simple MathML token creation) in Gecko's core editor, which is used by BlueGriffon, KompoZer, SeaMonkey, Thunderbird or even MDN. Other things like integrating JavaScript parsers (e.g. ASCIIMath) or equation panels with buttons are probably better done at the higher JS/HTML/XUL level. Daniel Glazman already wrote math input panels for
BlueGriffon and
Thunderbird.
MathML Accessibility: This is one important application of MathML for which there is strong demand and where Mozilla is behind the competitors. James Teh started some experimental work on his NVDA tool before the summit.
EPUB reader for FirefoxOS (and other mobile platforms): During the
"Co-creating Action Plans" session, the Mozilla Taipei people were thinking
about missing features for FirefoxOS and this idea about EPUB reader was my modest contribution.
There are a few EPUB readers relying on Gecko and it would be good to check if they work in
FirefoxOS and if they could be integrated by default, just like
Apple has iBooks. BTW, there is a version of BlueGriffon that can edit EPUB books.
Conclusion
I hope I have convinced some of the readers about the need to fund MathML in browsers. There is a lot of MathML work to do on Gecko and WebKit but both projects have volunteers and core engineers who are willing to help. There are also several individuals / companies who rely on MathML support in rendering engines for their projects and could support the MathML developments in some way. I am willing to put more of my time on Gecko and WebKit developments, but I need financial help for that purpose. I'm proposing catincan Crowd Funding in
the short term so that anyone can contribute at the appropriate level, but other alternatives to fund the MathML development can be found like asking Peter Krautzberger about native MathML funding in MathJax,
discussing with Igalia about funding Martin Robinson to work more on WebKit
MathML or contacting me directly to establish some kind of part-time
consulting agreement.
Please leave a comment on this blog or send me a private mail, if you
agree that funding MathML in browsers is important, if you like the crowd funding idea and plan to contribute, or if you have any opinions about alternative funding options. Also, please tell me what seems to be the priority for you and
your projects among what I have mentioned above
(layout engines, features, etc.) or among others that I may have forgotten. Of course,
any other constructive comment to help MathML support in browsers is welcome. I plan to submit features on catincan soon, once I have more feedback on
what people are interested in. Thank you!