
Blog de Frédéric


Tuesday, February 25 2014

TeXZilla 0.9.4 Released

update 2014/03/11: TeXZilla is now available as an npm module.

Introduction

For the past two months, the Mozilla MathML team has been working on TeXZilla, yet another LaTeX-to-MathML converter. The idea was to rely on itex2MML (which dates back to the beginning of the Mozilla MathML project) to create a LaTeX parser such that:

  • It is compatible with the itex2MML syntax and is similarly generated from a LALR(1) grammar (the goal is only to support a restricted set of core LaTeX commands for mathematics; for a more complete converter of LaTeX documents, see LaTeXML).
  • It is available as a standalone JavaScript module usable in all the Mozilla Web applications and add-ons (of course, it will work in non-Mozilla products too).
  • It accepts any Unicode characters and supports right-to-left mathematical notation (these are important for the world-wide aspect of the Mozilla community).

The parser is generated with the help of Jison and relies on a grammar based on that of itex2MML and on the unicode.xml file of the XML Entity Definitions for Characters specification. As suggested by the version number, this is still in development. However, we have made enough progress to present interesting features here and get more users and developers involved.

Quick Examples

  \frac{x^2}{a^2} + \frac{y^2}{b^2} = 1

  ∑_{n=1}^{+∞} \frac{1}{n^2} = \frac{π^2}{6}

  س = \frac{-ب\pm\sqrt{ب^٢-٤اج}}{٢ا}

Live Demo / FirefoxOS Web app

A live demo is available to let you test the LaTeX-to-MathML converter with various options and examples. For people who want to use the converter on their mobile devices, a FirefoxOS Web app is also available.

Using TeXZilla in a CommonJS program or Web page

TeXZilla is made of a single TeXZilla.js file with a public API to convert LaTeX to MathML or extract the TeX source from a MathML element. The converter accepts some options like inline/display mode or RTL/LTR direction of mathematics.

You can load it the standard way in any JavaScript program and obtain a TeXZilla object that exposes the public API. For example, in a CommonJS program, to convert a TeX source into a MathML source:

  var TeXZilla = require("./TeXZilla");
  console.log(TeXZilla.toMathMLString("\\sqrt{\\frac{x}{2}+y}"));

or in a Web Page, to convert a TeX source into a MathML DOM element:

  <script type="text/javascript" src="TeXZilla.js"></script>
  ...
  var MathMLElement = TeXZilla.toMathML("\\sqrt{\\frac{x}{2}+y}");
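
The display and direction options mentioned above can be passed along with the TeX source. The following is only a minimal sketch: it assumes that toMathMLString accepts the display mode and the RTL direction as optional second and third boolean arguments (an assumption to check against the API of your TeXZilla version), and convertAll is a hypothetical helper that is not part of TeXZilla.

```javascript
// Hypothetical helper (not part of TeXZilla): converts a list of TeX sources
// into MathML strings. The converter is passed as a parameter, so the helper
// works with any object exposing a toMathMLString method.
// Assumption: toMathMLString(tex, display, rtl) takes two optional booleans.
function convertAll(converter, sources, options) {
  var display = Boolean(options && options.display);
  var rtl = Boolean(options && options.rtl);
  return sources.map(function (tex) {
    return converter.toMathMLString(tex, display, rtl);
  });
}
```

In a CommonJS program, it could be used like this:

  var TeXZilla = require("./TeXZilla");
  console.log(convertAll(TeXZilla, ["x^2", "\\frac{a}{b}"], {display: true}));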

Using TeXZilla in Mozilla Add-ons

One of the goals of TeXZilla is to be integrated in Mozilla add-ons, allowing people to write cool math applications (in particular, we would like to have an add-on for Thunderbird). A simple Firefox add-on has been written and passed the AMO review, which means that you can safely include the TeXZilla.js script in your own add-ons.

TeXZilla can be used as an addon-sdk module. However, if you intend to use features requiring a DOMParser instance (for example toMathML), you need to initialize the DOM parser explicitly:

  var {Cc, Ci} = require("chrome");
  TeXZilla.setDOMParser(Cc["@mozilla.org/xmlextras/domparser;1"].
                        createInstance(Ci.nsIDOMParser));

More generally, for traditional Mozilla add-ons, you can do

  TeXZilla.setDOMParser(Components.
                        classes["@mozilla.org/xmlextras/domparser;1"].
                        createInstance(Components.interfaces.nsIDOMParser));

Using TeXZilla from the command line

TeXZilla has a basic command line interface. However, since CommonJS is still being standardized, this may work inconsistently between CommonJS interpreters. We have tested it on slimerjs (which uses Gecko), phantomjs and nodejs. For example, you can do

  $ slimerjs TeXZilla.js parser "a^2+b^2=c^2" true
  <math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><semantics><...

or launch a Web service (see next section). We plan to implement a stream filter too so that it can behave the same as itex2MML: looking for the LaTeX fragments in a text document and converting them into MathML.

Using TeXZilla as a Web Server

TeXZilla can be used as a Web Server that receives POST and GET HTTP requests with the LaTeX input and sends JSON replies with the MathML output. The typical use case is for people willing to perform some server-side LaTeX-to-MathML conversion.

For instance, to start the TeXZilla Webserver on port 7777:

  $ nodejs TeXZilla.js webserver 7777
  Web server started on http://localhost:7777

Then you can send a POST request:

  $ curl -H "Content-Type: application/json" -X POST -d '{"tex":"x+y","display":"true"}' http://localhost:7777
  {"tex":"x+y","mathml":"<math xmlns=\"http://www.w3.org/1998/Math/MathML\"...

or a GET request:

  $ curl "http://localhost:7777/?tex=x+y&rtl=true"
  {"tex":"x+y","mathml":"<math xmlns=\"http://www.w3.org/1998/Math/MathML\"...

Note that client-side conversion is trivial using the public API, but see the next section.
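
The same requests can be prepared from a JavaScript client. The sketch below only builds the GET URL and the POST body, using the "tex", "display" and "rtl" parameters shown in the curl examples; the helper names are hypothetical, and it assumes that the server applies standard URL decoding to the query string. Escaping with encodeURIComponent matters because TeX sources often contain characters such as "+", "&" or "\" that have a special meaning in URLs.

```javascript
// Hypothetical helpers for talking to a TeXZilla web server such as the one
// started above on http://localhost:7777. Only the "tex", "display" and
// "rtl" parameters shown in the curl examples are used.

// Builds the URL of a GET request, escaping the TeX source.
function buildGetUrl(base, tex, options) {
  var query = "tex=" + encodeURIComponent(tex);
  ["display", "rtl"].forEach(function (name) {
    if (options && options[name]) {
      query += "&" + name + "=true";
    }
  });
  return base + "/?" + query;
}

// Builds the JSON body of a POST request, with option values sent as
// strings like in the curl example.
function buildPostBody(tex, options) {
  var body = { tex: tex };
  ["display", "rtl"].forEach(function (name) {
    if (options && options[name]) {
      body[name] = "true";
    }
  });
  return JSON.stringify(body);
}
```

The resulting strings can then be passed to whatever HTTP client is available in your environment.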

Web Components Custom Element <x-tex>

We used the X-Tag library to implement a simple Web Components Custom Element <x-tex>. The idea is to have a container for LaTeX expressions like

  <x-tex dir="rtl">س = \frac{-ب\pm\sqrt{ب^٢-٤اج}}{٢ا}</x-tex>

that will be converted into MathML by TeXZilla and displayed in your browser. You can set the display/dir attributes on that <x-tex> element and they will be applied to the <math> element. Instances of <x-tex> elements also have a source property that you can use to retrieve or set the LaTeX source. Of course, the MathML output will automatically be updated when dynamic changes occur. You can try this online demo.

CKEditor Plugins / Integration in MDN

Finally, we created a first version of a TeXZilla CKEditor plugin. An online demo is available here. We already sent a pull request to Kuma and we hope it will soon enable users to put mathematical formulas in MDN articles without having to paste the MathML into the source code view. It could be enhanced later with a more advanced UI.

Wednesday, January 29 2014

New MathML Firefox add-ons on AMO

While the patches for MathML integration in MediaWiki are progressively being reviewed and merged and while we are working on the support for Open Type fonts with a MATH table in Gecko, I finally found time to check the progress in Mozilla's add-on SDK. In particular, since the last time I tried (some years ago) they have introduced a cleaner interface for content scripts as well as the possibility to use XPCOM for missing features. Hence I have been able to update some of my experimental MathML add-ons. I have submitted two new add-ons to Mozilla's AMO that I hope could be useful to some people:

  • MathJax Native MathML, an add-on to force MathJax to switch to Gecko's MathML support without having to use the MathJax menu to change the output mode. It works even on Websites where that menu is disabled. This also removes MathJax's automatic rescaling and inline-block span that are currently causing random rendering bugs with Gecko's native MathML (and will confuse possible future line-breaking support anyway).
  • MathML Copy (at the moment only partially reviewed by the AMO team), an add-on to copy MathML and TeX into the clipboard. For MathML, two flavors are copied: the source as plain text (to paste in your favorite text editor) and the MathML as HTML (to paste in Thunderbird, MDN, any Gecko-based HTML editor etc.). Copying TeX is only possible when it is provided via the standard MathML annotation method, which is the case in e.g. LaTeXML and Instiki documents, and will be the case in Wikipedia in the future.
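
To illustrate the annotation method mentioned above, here is a rough sketch of how a TeX source can be retrieved from a MathML string. It is not the code of the MathML Copy add-on: extractTeX is a hypothetical helper, and it assumes that the TeX source is stored in an <annotation> element whose encoding attribute contains "TeX" (for example "TeX" or "application/x-tex").

```javascript
// Hypothetical helper: returns the TeX source attached to a MathML string
// through an <annotation> element whose encoding mentions TeX, or null if
// there is none. A regular expression keeps the sketch free of dependencies;
// in a browser, a DOM query on the <semantics> element is more robust.
function extractTeX(mathml) {
  var pattern = /<annotation\s[^>]*\bencoding\s*=\s*["'][^"']*tex[^"']*["'][^>]*>([\s\S]*?)<\/annotation>/i;
  var match = pattern.exec(mathml);
  if (!match) {
    return null;
  }
  // Decode the predefined XML entities; "&amp;" must be handled last.
  return match[1]
    .replace(/&lt;/g, "<")
    .replace(/&gt;/g, ">")
    .replace(/&quot;/g, '"')
    .replace(/&apos;/g, "'")
    .replace(/&amp;/g, "&");
}
```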

As usual, there is room for improvements and bug fixes, but that's a start. In particular I would be happy to get translations for the two strings of the MathML Copy add-on: "Copy MathML Formula" and "Copy TeX Source". Also, because I used the add-on SDK these add-ons are unfortunately only available for Firefox at the moment...

Monday, January 13 2014

Improvements to Mathematics on Wikipedia

Introduction


As mentioned during the Mozilla Summit and recent MathML meetings, progress has recently been made on the way mathematical equations are handled on Wikipedia. This work has mainly been done by the volunteer contributor Moritz Schubotz (alias Physikerwelt), Wikimedia Foundation's developer Gabriel Wicke as well as members of MathJax. Moritz has been particularly involved in that project and he even travelled from Germany to San Francisco in order to meet MediaWiki developers and spend one month doing volunteer work on this project. Although the solution has been essentially ready for a couple of months, the review of the patches is progressing slowly. If you wish to speed up the integration of what are probably the most important improvements to MediaWiki Math to happen, please read how you can help below.

Current Status

The approach that has been used on Wikipedia so far is the following:

  • Equations are written in LaTeX or more precisely, using a specific set of LaTeX commands accepted by texvc. One issue for the MediaWiki developers is that this program is written in OCaml and no longer maintained, so they would like to switch to a more modern setup.
  • texvc calls the LaTeX program to convert the LaTeX source into PNG images and this is the default mode. Unfortunately, using images for representing mathematical equations on the Web leads to classical problems (for example alignment or rendering quality, just to mention a few of them) that cannot be addressed without changing the approach.
  • For a long time, registered users have been able to switch to the MathJax mode thanks to the help of nageh, a member of the MathJax community. This mode solves many of the issues with PNG images but unfortunately it adds its own problems, some of them being just unacceptable for MediaWiki developers. Again, these issues are intrinsic to the use of a Javascript polyfill and thus yet another approach is necessary for a long-term perspective.
  • Finally, registered users can also switch to the LaTeX source mode, which only displays the text source of equations.

Short Term Plan

Native MathML is the appropriate way to fix all the issues regarding the display of mathematical formulas in browsers. However, the language is still not perfectly implemented in Web rendering engines, so some fallback is necessary. The new approach will thus be:

  • The TeX equation will still be edited by hand but it will be possible to use a visual editor.
  • texvc will be used as a filter to validate the TeX source. This will ensure that only the texvc LaTeX syntax is accepted and will avoid other potential security issues. The LaTeX-to-PNG conversion as well as the OCaml language will be kept in the short term, but the plan is to drop the former and to replace the latter with a PHP equivalent.
  • A LaTeX-to-MathML conversion followed by a MathML-to-SVG conversion will be performed server-side using MathJax.
  • By default all the users will receive the same output (MathML+SVG+PNG) but only one will be made visible, according to your browser capabilities. As a first step, native MathML will only be used in Gecko and other rendering engines will see the SVG/PNG fallback; but the goal is to progressively drop the old PNG output and to move to native MathML.
  • Registered users will still be able to switch to the LaTeX source mode.
  • Registered users will still be able to use MathJax client-side, especially if they want to use the HTML-CSS output. However, this will no longer be a separate mode but an option to enable. That is, the MathML/SVG/PNG/Source is displayed normally and progressively replaced with MathJax's output.

Most of the features above have already been approved and integrated in the development branch or are undergoing the review process.

How can you help?


The main point is that everybody can review the patches on Gerrit. If you know about Javascript and/or PHP, if you are interested in math typesetting and wish to get involved in an important Open Source project such as Wikipedia then it is definitely the right time to help the MediaWiki Math project. The article How to become a MediaWiki hacker is a very good introduction.

When getting involved in a new open source project, one of the most important steps is to set up the development environment. There are various ways to set up a local installation of MediaWiki but using MediaWiki-Vagrant might be the simplest one: just follow the Quick Start Guide and use vagrant enable-role math to enable the Math Extension.

The second step is to create a WikiTech account and to set up the appropriate SSH keys on your MediaWiki-Vagrant virtual machine. Then you can check the Open Changes, test & review them. The Gerrit code review guide may be helpful here.

If you need more information, you can ask Moritz or try to reach people on the #mediawiki (freenode) or #mathml (mozilla) channels. Thanks in advance for your help!

Sunday, January 5 2014

Funding MathML Developments in Gecko and WebKit (part 2)

As I mentioned three months ago, I wanted to start a crowdfunding campaign so that I can have more time to devote to MathML developments in browsers and (at least for Mozilla) continue to mentor volunteer contributors. Rather than doing several crowdfunding campaigns for small features, I finally decided to do a single crowdfunding campaign with Ulule so that I only have to worry once about the funding. It also seemed more convenient for me to rely on a French/EU website regarding legal issues, taxes etc. Also, just like Kickstarter, Ulule makes it possible to offer some "rewards" to backers according to the level of their contributions, which gives a better way to motivate them.

As everybody following MathML activities noticed, big companies/organizations do not want to significantly invest in funding MathML developments at the moment. So the rationale for a crowdfunding campaign is to rely on the support of the current community and on the help of smaller companies/organizations that have a business interest in it. Each one can give a small contribution and these contributions add up to enough money to fund the project. Of course this model is probably not viable in the long term, but at least it makes it possible to start something instead of complaining without acting, and to show bigger actors that there is a demand for these developments. As indicated on the Ulule Website, this is a way to start some relationship and to build a community around a common project. My hope is that it could lead to long-term funding of MathML developments and better partnership between the various actors.

Because one of the main demands for MathML (besides accessibility) is in EPUB, I've included in the project goals a collection of documents that demonstrate advanced Web features with native MathML. That way I can offer more concrete rewards to people and federate them around the project. Indeed, much of the work needed to improve the MathML rendering requires some preliminary "code refactoring" which is not really exciting or immediately visible to users...

Hence I launched the crowdfunding campaign on the 19th of November and we reached 1/3 of the minimal funding goal in only three days! This was mainly thanks to the support of individuals from the MathML community. In mid-December we reached the minimal funding goal after a significant contribution from the KWARC Group (Jacobs University Bremen, Germany) with which I have been in communication since the launch of the campaign. Currently, we are at 125% and this means that, minus the Ulule commission and my social/fiscal obligations, I will be able to work on the project for about 3 months.

I'd like to thank again all the companies, organizations and people who have supported the project so far! The crowdfunding campaign continues until the end of January so I hope more people will get involved. If you want better MathML in Web rendering engines and ebooks then please support this project, even with a symbolic contribution. If you want to make a more significant contribution as a company/organization then note that Ulule is only providing a service to organize the crowdfunding campaign but otherwise the funding is legally treated the same as required by my self-employed status; feel free to contact me for any questions on the project or funding and to discuss the long-term perspective.

Finally, note that I've used my savings and I plan to continue like that until the official project launch in February. Below is a summary of what has been done during the five weeks before the holiday season. This is based on my weekly updates for supporters where you can also find references to the Bugzilla entries. Thanks to the Apple & Mozilla developers who spent time reviewing my patches!

Collection of documents

The goal is to show how to use existing tools (LaTeXML, itex2MML, tex4ht etc) to build EPUB books for science and education using Web standards. The idea is to cover various domains (maths, physics, chemistry, education, engineering...) as well as Web features. Given that many scientific circles are too biased towards "math on paper / PDF" and closed research practices, it may look innovative to use the Open Web but to be honest the MathML language and its integration with other Web formats have been well established for a long time. Hence in theory it should "just work" once you have native MathML support, without any convolutions or hacks. Here are a couple of features that are tested in the sample EPUB books that I wrote:

  • Rendering of MathML equations (of course!). Since the screen size and resolution vary for e-readers, automatic line breaking / reflowing of the page is "naturally" tested and is an important distinction with respect to paper / PDF documents.
  • CSS styling of the page and equations. This includes using (Web) fonts, which are very important for mathematical publishing.
  • Using SVG diagrams and how they can be mixed with MathML equations.
  • Using non-ASCII (Arabic) characters and RTL/LTR rendering of both the text and equations.
  • Interactive document using Javascript and <maction>, <input>, <button> etc. For those who are curious, I've created some videos for an algebra course and a lab practical.
  • Using the <video> element to include short sequences of an experiment in a physics course.
  • Using the <canvas> element to draw graphs of functions or of physical measurements.
  • Using WebGL to draw interactive 3D diagrams. At the moment, I've only adapted a chemistry course and used ChemDoodle to load Crystallographic Information Files (CIF) and provide 3D representations of crystal structures. But of course, nothing prevents putting MathML equations in WebGL to create other kinds of scientific 3D diagrams.

WebKit

I've finished some work started as a MathJax developer, including the maction support requested by the KWARC Group. I then tried to focus on the main goals: rendering of token elements and more specifically operators (spacing and stretching).

  • I improved LTR/RTL handling of equations (full RTL support is not implemented yet and not part of the project goal).
  • I improved the maction elements and implemented the toggle actiontype.
  • I refactored the code of some "mrow-like" elements to make them all behave like an <mrow> element. For example, while WebKit stretched (some) operators in <mrow> elements it could not stretch them in <mstyle>, <merror> etc. Similarly, this will be needed to implement correct spacing around operators in <mrow> and other "mrow-like" elements.
  • I analyzed more carefully the vertical stretching of operators. I see at least two serious bugs to fix: baseline alignment and stretch size. I've uploaded an experimental patch to improve that.
  • Preliminary work on the MathML Operator Dictionary. This dictionary contains various properties of operators like spacing and stretchiness and is fundamental for later work on operators.
  • I have started to refactor the code for mi, mo and mfenced elements. This is also necessary to fix many serious bugs, such as those concerning the operator dictionary and the style of mi elements.
  • I have written a patch to restore support for foreign objects in annotation-xml elements and to implement the same selection algorithm as Gecko.

Gecko

I've continued to clean up the MathML code and to mentor volunteer contributors. The main goal is the support for the Open Type MATH table, at least for operator stretching.

  • Xuan Hu's work on the <mpadded> element landed in trunk. This element is used to modify the spacing of equations, for example by some TeX-to-MathML generators.
  • On Linux, I fixed a bug with preferred widths of MathML token elements. Concretely, when equations are used inside table cells or similar containers there is a bug that makes equations overflow the containers. Unfortunately, this bug is still present on Mac and Windows...
  • James Kitchener implemented the mathvariant attribute (e.g. used by some tools to write symbols like double-struck, fraktur etc). This also fixed remaining issues with preferred widths of MathML token elements. Khaled Hosny started to update his Amiri and XITS fonts to add the glyphs for Arabic mathvariants.
  • I finished Quentin Headen's code refactoring of mtable. This made it possible to fix some bugs like bad alignment with columnalign. This is also a preparation for future support for rowspacing and columnspacing.
  • After the two previous points, it was finally possible to remove the private "_moz-" attributes. These were visible in the DOM or when manipulating MathML via JavaScript (e.g. in editors, tree inspector, the html5lib etc).
  • Khaled Hosny fixed a regression with script alignments. He started to work on improvements regarding italic correction when positioning scripts. Also, James Kitchener made some progress on script size correction via the Open Type "ssty" feature.
  • I've refactored the stretchy operator code and prepared some patches to read the OpenType MATH table. You can try experimental support for new math fonts with e.g. Bill Gianopoulos' builds and the MathML Torture Tests.

Blink/Trident

MathML developments in Chrome or Internet Explorer are not part of the project goal, even if MathML improvements to WebKit could hopefully be imported to Blink in the future. Users keep asking for MathML in IE and I hope that a solution will be found to save MathPlayer's work. In the meantime, I've sent a proposal to Google and Microsoft to implement fallback content (alttext and semantics annotation) so that authors can use it. This is just a couple of CSS rules that could be integrated in the user agent style sheet. Let's see which of the two companies is the most reactive...

Sunday, December 1 2013

Decomposition of 2D-transform matrices

Note: some parts of this blog post (especially the Javascript program) may be lost when exported to Planet or other feed aggregators. Please view it on the original page.

I recently took a look at the description of the CSS 2D / SVG transform matrix(a, b, c, d, e, f) on MDN and I added a concrete example showing the effect of such a transform on an SVG line, in order to make this clearer for people who are not familiar with affine transformations or matrices.

This also reminded me of a small algorithm to decompose an arbitrary SVG transform into a composition of basic transforms (Scale, Rotate, Translate and Skew) that I wrote 5 years ago for the Amaya SVG editor. I translated it into JavaScript and made it available here. Feel free to copy it on MDN or anywhere else. The convention used to represent transforms as 3-by-3 matrices is the one of the SVG specification.

Live demo

[Interactive demo on the original page: enter a CSS 2D transform to reduce and decompose, or pick an example from a list, and choose between an LU-like or a QR-like decomposition. The page then shows the reduced CSS/SVG matrix as computed by the rendering engine together with its matrix representation, the simplified SVG and CSS decompositions into simple transformations with their renderings, and a matrix decomposition of the original transform.]

Mathematical Description

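The decomposition algorithm is based on the classical LU and QR decompositions. First recall the convention of the SVG specification: the transform matrix(a,b,c,d,e,f) is represented by the matrix

\[ \begin{pmatrix} a & c & e \\ b & d & f \\ 0 & 0 & 1 \end{pmatrix} \]

and corresponds to the affine transformation

\[ \begin{pmatrix} x \\ y \end{pmatrix} \mapsto \begin{pmatrix} a & c \\ b & d \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} e \\ f \end{pmatrix} \]

which shows the classical factorization into a composition of a linear transformation $\begin{pmatrix} a & c \\ b & d \end{pmatrix}$ and a translation $\begin{pmatrix} e \\ f \end{pmatrix}$. Now let's focus on the matrix $\begin{pmatrix} a & c \\ b & d \end{pmatrix}$ and denote $\Delta = ad - bc$ its determinant. We first consider the LDU decomposition. If $a \neq 0$, we can use it as a pivot and apply one step of Gaussian elimination:

\[ \begin{pmatrix} 1 & 0 \\ -b/a & 1 \end{pmatrix} \begin{pmatrix} a & c \\ b & d \end{pmatrix} = \begin{pmatrix} a & c \\ 0 & \Delta/a \end{pmatrix} \]

and thus the LDU decomposition is

\[ \begin{pmatrix} a & c \\ b & d \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ b/a & 1 \end{pmatrix} \begin{pmatrix} a & 0 \\ 0 & \Delta/a \end{pmatrix} \begin{pmatrix} 1 & c/a \\ 0 & 1 \end{pmatrix} \]

Hence if $a \neq 0$, the transform matrix(a,b,c,d,e,f) can be written translate(e,f) skewY(atan(b/a)) scale(a, Δ/a) skewX(atan(c/a)). If $a = 0$ and $b \neq 0$ then we have $\Delta = -cb$ and we can write (this is approximately "LU with full pivoting"):

\[ \begin{pmatrix} 0 & c \\ b & d \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} b & d \\ 0 & c \end{pmatrix} = \begin{pmatrix} \cos(\pi/2) & -\sin(\pi/2) \\ \sin(\pi/2) & \cos(\pi/2) \end{pmatrix} \begin{pmatrix} b & 0 \\ 0 & \Delta/b \end{pmatrix} \begin{pmatrix} 1 & d/b \\ 0 & 1 \end{pmatrix} \]

and so the transform becomes translate(e,f) rotate(90°) scale(b, Δ/b) skewX(atan(d/b)). Finally, if $a = b = 0$, then we already have an LU decomposition and we can just write

\[ \begin{pmatrix} 0 & c \\ 0 & d \end{pmatrix} = \begin{pmatrix} c & 0 \\ 0 & d \end{pmatrix} \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} \]

and so the transform is translate(e,f) scale(c, d) skewX(45°) scale(0, 1).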
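
The three cases of the LU-like decomposition can be sketched in JavaScript as follows. This is not the program embedded in the original page, only an illustration of the case analysis: the function names and the returned format (an array of [name, arguments] pairs, with angles in degrees) are choices made for this example.

```javascript
// Sketch of the LU-like decomposition of matrix(a,b,c,d,e,f).
// Returns an array of [name, arguments] pairs, angles in degrees.
function decomposeLU(a, b, c, d, e, f) {
  var deg = 180 / Math.PI;
  var delta = a * d - b * c; // determinant of the linear part
  var result = [["translate", [e, f]]];
  if (a !== 0) {
    // a is used as a pivot: skewY, scale, skewX.
    result.push(["skewY", [Math.atan(b / a) * deg]]);
    result.push(["scale", [a, delta / a]]);
    result.push(["skewX", [Math.atan(c / a) * deg]]);
  } else if (b !== 0) {
    // a = 0: rows are swapped via a rotation of 90 degrees.
    result.push(["rotate", [90]]);
    result.push(["scale", [b, delta / b]]);
    result.push(["skewX", [Math.atan(d / b) * deg]]);
  } else {
    // a = b = 0: the matrix is already upper triangular.
    result.push(["scale", [c, d]]);
    result.push(["skewX", [45]]);
    result.push(["scale", [0, 1]]);
  }
  return result;
}

// Formats the result as an SVG transform list (unitless angles).
function formatTransforms(list) {
  return list.map(function (t) {
    return t[0] + "(" + t[1].join(",") + ")";
  }).join(" ");
}
```

For a CSS transform list, the angles would additionally need a "deg" unit.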

As a consequence, we have proved that any transform matrix(a,b,c,d,e,f) can be decomposed into a product of simple transforms. However, the decomposition is not always what we want, for example scale(2) rotate(30°) will be decomposed into a product that involves skewX and skewY instead of preserving the nice factors.

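We thus consider instead the QR decomposition. If $\Delta \neq 0$, then by applying the Gram–Schmidt process to the columns $\begin{pmatrix} a \\ b \end{pmatrix}, \begin{pmatrix} c \\ d \end{pmatrix}$ we obtain

\[ \begin{pmatrix} a & c \\ b & d \end{pmatrix} = \begin{pmatrix} a/r & -b/r \\ b/r & a/r \end{pmatrix} \begin{pmatrix} r & (ac+bd)/r \\ 0 & \Delta/r \end{pmatrix} = \begin{pmatrix} a/r & -b/r \\ b/r & a/r \end{pmatrix} \begin{pmatrix} r & 0 \\ 0 & \Delta/r \end{pmatrix} \begin{pmatrix} 1 & (ac+bd)/r^2 \\ 0 & 1 \end{pmatrix} \]

where $r = \sqrt{a^2+b^2} \neq 0$. In that case, the transform becomes translate(e,f) rotate(sign(b) * acos(a/r)) scale(r, Δ/r) skewX(atan((a c + b d)/r^2)). In particular, a similarity transform preserves orthogonality and length ratio and so $ac+bd = \begin{pmatrix} a \\ b \end{pmatrix} \cdot \begin{pmatrix} c \\ d \end{pmatrix} = 0$ and $\Delta = \left\| \begin{pmatrix} a \\ b \end{pmatrix} \right\| \left\| \begin{pmatrix} c \\ d \end{pmatrix} \right\| \sin(\pi/2) = r^2$. Hence for a similarity transform we get translate(e,f) rotate(sign(b) * acos(a/r)) scale(r) as wanted. We also note that it is enough to assume the weaker hypothesis $r \neq 0$ (that is $a \neq 0$ or $b \neq 0$) in the expression above and so the decomposition applies in that case too. Similarly, if we let $s = \sqrt{c^2+d^2}$ and instead assume $c \neq 0$ or $d \neq 0$ we get

\[ \begin{pmatrix} a & c \\ b & d \end{pmatrix} = \begin{pmatrix} \cos(\pi/2) & -\sin(\pi/2) \\ \sin(\pi/2) & \cos(\pi/2) \end{pmatrix} \begin{pmatrix} -c/s & d/s \\ -d/s & -c/s \end{pmatrix} \begin{pmatrix} \Delta/s & 0 \\ 0 & s \end{pmatrix} \begin{pmatrix} 1 & 0 \\ (ac+bd)/s^2 & 1 \end{pmatrix} \]

Hence in that case the transform is translate(e,f) rotate(90° - sign(d) * acos(-c/s)) scale(Δ/s, s) skewY(atan((a c + b d)/s^2)). Finally if $a = b = c = d = 0$, then the transform is just translate(e,f) scale(0,0).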
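
The QR-like case analysis can be sketched in the same way. Again, this is only an illustration and not the program of the original page: the function name and the returned format (an array of [name, arguments] pairs, with angles in degrees) are assumptions of this example. Two choices differ slightly from the formulas above: a zero b (or d) is treated as having a negative sign, so that matrix(-1,0,0,-1,0,0) yields a rotation of -180° rather than 0°, and the second form is only used as a fallback when a = b = 0, although it is valid whenever c or d is nonzero.

```javascript
// Sketch of the QR-like decomposition of matrix(a,b,c,d,e,f).
// Returns an array of [name, arguments] pairs, angles in degrees.
function decomposeQR(a, b, c, d, e, f) {
  var deg = 180 / Math.PI;
  var delta = a * d - b * c; // determinant of the linear part
  var result = [["translate", [e, f]]];
  if (a !== 0 || b !== 0) {
    // First form: rotate, scale, skewX.
    var r = Math.sqrt(a * a + b * b);
    var signB = b > 0 ? 1 : -1; // zero counts as negative, see above
    result.push(["rotate", [signB * Math.acos(a / r) * deg]]);
    result.push(["scale", [r, delta / r]]);
    result.push(["skewX", [Math.atan((a * c + b * d) / (r * r)) * deg]]);
  } else if (c !== 0 || d !== 0) {
    // Fallback when a = b = 0: rotate, scale, skewY.
    var s = Math.sqrt(c * c + d * d);
    var signD = d > 0 ? 1 : -1;
    result.push(["rotate", [90 - signD * Math.acos(-c / s) * deg]]);
    result.push(["scale", [delta / s, s]]);
    result.push(["skewY", [Math.atan((a * c + b * d) / (s * s)) * deg]]);
  } else {
    // a = b = c = d = 0.
    result.push(["scale", [0, 0]]);
  }
  return result;
}
```

With this sketch, the matrix of scale(2) rotate(30°) is decomposed into a rotation of 30 degrees, a uniform scale by 2 and a zero skew (up to rounding errors), instead of a product of skews.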

The decomposition algorithms are now easy to write. We note that none of them gives the best result in all the cases (compare for example how they factor Rotate2 and Skew1). Also, for completeness we have included the noninvertible transforms in our study (that is Δ=0) but in practice they are not really useful (try NonInvertible).
