Blog de Frédéric

To content | To menu | To search

Tag - google

Entries feed - Comments feed

Sunday, December 20 2015

MathML at the Web Engines Hackfest 2015

Hackfest

Two weeks ago, I travelled to Spain to participate to the second Web Engines Hackfest which was sponsored by Igalia and Collabora. Such an event has been organized by Igalia since 2009 and used to be focused on WebkitGTK+. It is great to see that it has now been extended to any Web engines & platforms and that a large percentage of non-igalian developers has been involved this year. If you did not get the opportunity to attend this event or if you are curious about what happened there, take a look at the wiki page or flickr album.

Last day of the hackfest
Photo from @webengineshackfest licensed under Creative Commons Attribution-ShareAlike

I really like this kind of hacking-oriented and participant-driven event where developers can meet face to face, organize themselves in small collaboration groups to efficiently make progress on a task or give passionate talk about what they have recently been working on. The only small bemol I have is that it is still mainly focused on WebKit/Blink developments. Probably, the lack of Mozilla/Microsoft participants is probably due to Mozilla Coincidental Work Weeks happening at the same period and to the proprietary nature of EdgeHTML (although this is changing?). However, I am confident that Igalia will try and address this issue and I am looking forward to coming back next year!

MathML developments

This year, Igalia developer Alejandro G. Castro wanted to work with me on WebKit's MathML layout code and more specifically on his MathML refactoring branch. Indeed, as many people (including Google developers) who have tried to work on WebKit's code in the past, he arrived to the conclusion that the WebKit's MathML layout code has many design issues that make it a burden for the rest of the layout team and too complex to allow future improvements. I was quite excited about the work he has done with Javier Fernández to try and move to a simple box model closer to what exists in Gecko and thus I actually extended my stay to work one whole week with them. We already submitted our proposal to the webkit-dev mailing list and received positive feedback, so we will now start merging what is ready. At the end of the week, we were quite satisfied about the new approach and confident it will facilitate future maintenance and developements :-)

Main room
Photo from @webengineshackfest licensed under Creative Commons Attribution-ShareAlike

While reading a recent thread on the Math WG mailing list, I realized that many MathML people have only vague understanding of why Google (or to be more accurate, the 3 or 4 engineers who really spent some time reviewing and testing the WebKit code) considered the implementation to be unsafe and not ready for release. Even worse, Michael Kholhase pointed out that for someone totally ignorant of the technical implementation details, the communication made some years ago around the "flexbox-based approach" gave the impression that it was "the right way" (indeed, it somewhat did improve the initial implementation) and the rationale to change that approach was not obvious. So let's just try and give a quick overview of the main problems, even if I doubt someone can have good understanding of the situation without diving into the C++ code:

  1. WebKit's code to stretch operator was not efficient at all and was limited to some basic fences buildable via Unicode characters.
  2. WebKit's MathML code violated many layout invariants, making the code unreliable.
  3. WebKit's MathML code relied heavily on the C++ renderer classes for flexboxes and has to manage too many anonymous renderers.

The main security concerns were addressed a long time ago by Martin Robinson and me. Glyph assembly for stretchy operators are now drawn using low-level font painting primitive instead of creating one renderer object for each piece and the preferred width for them no longer depends on vertical metrics (although we still need some work to obtain Gecko's good operator spacing). Also, during my crowdfunding project, I implemented partial support for the OpenType MATH table in WebKit and more specifically the MathVariant subtable, which allows to directly use construction of stretchy operators specified by the font designer and not only the few Unicode constructions.

However, the MathML layout code still modifies the renderer tree to force the presence of anonymous renderers and still applies specific CSS rules to them. It is also spending too much time trying to adapt the parent flexbox renderer class which has at the same time too much features for what is needed for MathML (essentially automatic box alignment) and not enough to get exact placement and measuring needed for high-quality rendering (the TeXBook rules are more complex, taking into account various parameters for box shifts, drops, gaps etc).

During the hackfest, we started to rewrite a clean implementation of some MathML renderer classes similar to Gecko's one and based on the MathML in HTML5 implementation note. The code now becomes very simple and understandable. It can be summarized into four main functions. For instance, to draw a fraction we need:

  • computePreferredLogicalWidths which sets the preferred width of the fraction during the first layout pass, by considering the widest between numerator and denominator.
  • layoutBlock and firstLineBaseline which calculate the final width/height/ascent of the fraction element and position the numerator and denominator.
  • paint which draws the fraction bar.

Perhaps, the best example to illustrate how the complexity has been reduced is the case of the renderer of mmultiscripts/msub/msup/msubsup elements (attaching an arbitrary number of subscripts and superscripts before or after a base). In the current WebKit implementation, we have to create three anonymous wrappers (a first one for the base, a second one for prescripts and a third one for postscripts) and an anonymous wrapper for each subscript/superscript pair, add alignment styles for these wrappers and spend a lot of time maintaining the integrity of the renderer tree when dynamic changes happen. With the new code, we just need to do arithmetic calculations to position the base and script boxes. This is somewhat more complex than the fraction example above but still, it remains arithmetic calculations and we can not reduce any further if we wish quality comparable to TeXBook / MATH rules. We actually take into account many parameters from the OpenType MATH table to get much better placement of scripts. We were able to fix bug 130325 in about twenty minutes instead of fighting with a CSS "negative margin" hack on anonymous renderers.

MathML dicussions

The first day of the hackfest we also had an interesting "breakout session" to define the tasks to work on during the hackfest. Alejandro briefly presented the status of his refactoring branch and his plan for the hackfest. As said in the previous section, we have been quite successful in following this plan: Although it is not fully complete yet, we expect to merge the current changes soon. Dominik Röttsches who works on Blink's font and layout modules was present at the MathML session and it was a good opportunity to discuss the future of MathML in Chrome. I gave him some references regarding the OpenType MATH table, math fonts and the MathML in HTML5 implementation note. Dominik said he will forward the references to his colleagues working on layout so that we can have deeper technical dicussion about MathML in Blink in the long term. Also, I mentioned noto-fonts issue 330, which will be important for math rendering in Android and actually does not depend on the MathML issue, so that's certainly something we could focus on in the short term.

Álex and Fred during the MathML breakout session
Photo from @webengineshackfest licensed under Creative Commons Attribution-ShareAlike

Alejandro also proposed to me to prepare a talk about MathML in Web Engines and exciting stuff happening with the MathML Association. I thus gave a brief overview of MathML and presented some demos of native support in Gecko. I also explained how we are trying to start a constructive approach to solve miscommunication between users, spec authors and implementers ; and gather technical and financial resources to obtain a proper solution. In a slightly more technical part, I presented Microsoft's OpenType MATH table and how it can be used for math rendering (and MathML in particular). Finally, I proposed my personal roadmap for MathML in Web engines. Although I do not believe I am a really great speaker, I received positive feedback from attendees. One of the thing I realized is that I do not know anything about the status and plan in EdgeHTML and so overlooked to mention it in my presentation. Its proprietary nature makes hard for external people to contribute to a MathML implementation and the fact that Microsoft is moving away from ActiveX de facto excludes third-party plugin for good and fast math rendering in the future. After I went back to Paris, I thus wrote to Microsoft employee Christian Heilmann (previously working for Mozilla), mentioning at least the MathML in HTML5 Implementation Note and its test suite as a starting point. MathML is currently on the first page of the most voted feature requested for Microsoft Edge and given the new direction taken with Microsoft Edge, I hope we could start a discussion on MathML in EdgeHTML...

Conclusion

This was a great hackfest and I'd like to thank again all the participants and sponsors for making it possible! As Alejandro wrote to me, "I think we have started a very interesting work regarding MathML and web engines in the future.". The short term plan is now to land the WebKit MathML refactoring started during the hackfest and to finish the work. I hope people now understand the importance of fonts with an OpenType MATH table for good mathematical rendering and we will continue to encourage browser vendors and font designers to make such fonts wide spread.

The new approach for WebKit MathML support gives good reason to be optmimistic in the long term and we hope we will be able to get high-quality rendering. The fact that the new approach addresses all the issues formulated by Google and that we started a dialogue on math rendering, gives several options for MathML in Blink. It remains to get Microsoft involved in implementing its own OpenType table in EdgeHTML. Overall, I believe that we can now have a very constructive discussion with the individuals/companies who really want native math support, with the Math WG members willing to have their specification implemented in browsers and with the browser vendors who want a math implementation which is good, clean and compatible with the rest of their code base. Hopefully, the MathML Association will be instrumental in achieving that. If everybody get involved, 2016 will definitely be an exciting year for native MathML in Web engines!

Friday, October 16 2015

Open Font Format 3 released: Are browser vendors good at math?

Version 3 of the Open Font Format was officially published as ISO standard early this month. One of the interesting new feature is that Microsoft's MATH table has been integrated into this official specification. Hopefully, this will encourage type designers to create more math fonts and OS vendors to integrate them into their systems. But are browser vendors ready to use Open Font Format for native MathML rendering? Here is a table of important Open Font Format features for math rendering and (to my knowledge) the current status in Apple, Google, Microsoft and Mozilla products.

FeatureRationaleAppleGoogleMicrosoftMozilla
Pre-installed math fontsMake mathematical rendering possible with the default system fonts.OSX: Obsolete STIX
iOS: no
Android: no
Chrome OS: no
Windows: Cambria Math
Windows phone: no?
Firefox OS: no
MATH table allowed in Web fontsWorkaround the lack of pre-installed math fonts or let authors provide custom math style.WebKit: yes (no font sanitizer?)Blink: yes (OTS)Trident: yes (no font sanitizer?)Gecko: yes (OTS)
USE_TYPO_METRICS OS/2 fsSelection flag taken into accountMath fonts contain tall glyphs (e.g. integrals in display style) and so using the "typo" metrics avoids excessive line spacing for the math text.WebKit: noBlink: yesTrident: yesGecko: yes (gfx/)
Open Font Format FeaturesGood mathematical rendering requires some glyph substitutions (e.g. ssty, flac and dtls).WebKit: yesBlink: yesTrident: yesGecko: yes
Ability to parse the MATH tableGood mathematical rendering requires many font data.WebKit: yes (WebCore/platform/graphics/)Blink: noTrident: yes (LineServices)Gecko (gfx/)
Using the MATH table for native MathML renderingThe MathML specification does not provide detailed rules for mathematical rendering.WebKit: for operator stretching (WebCore/rendering/mathml/)Blink: noTrident: noGecko: yes (layout/mathml/)
Total Score:4/63/64.5/65/6

update: Daniel Cater provided a list of pre-installed fonts on Chrome OS stable, confirming that no fonts with a MATH table are available.

Sunday, January 5 2014

Funding MathML Developments in Gecko and WebKit (part 2)

As I mentioned three months ago, I wanted to start a crowdfunding campaign so that I can have more time to devote to MathML developments in browsers and (at least for Mozilla) continue to mentor volunteer contributors. Rather than doing several crowdfunding campaigns for small features, I finally decided to do a single crowdfunding campaign with Ulule so that I only have to worry only once about the funding. This also sounded more convenient for me to rely on some French/EU website regarding legal issues, taxes etc. Also, just like Kickstarter it's possible with Ulule to offer some "rewards" to backers according to the level of contributions, so that gives a better way to motivate them.

As everybody following MathML activities noticed, big companies/organizations do not want to significantly invest in funding MathML developments at the moment. So the rationale for a crowdfunding campaign is to rely on the support of the current community and on the help of smaller companies/organizations that have business interest in it. Each one can give a small contribution and these contributions sum up in enough money to fund the project. Of course this model is probably not viable for a long term perspective, but at least this allows to start something instead of complaining without acting ; and to show bigger actors that there is a demand for these developments. As indicated on the Ulule Website, this is a way to start some relationship and to build a community around a common project. My hope is that it could lead to a long term funding of MathML developments and better partnership between the various actors.

Because one of the main demand for MathML (besides accessibility) is in EPUB, I've included in the project goals a collection of documents that demonstrate advanced Web features with native MathML. That way I can offer more concrete rewards to people and federate them around the project. Indeed, many of the work needed to improve the MathML rendering requires some preliminary "code refactoring" which is not really exciting or immediately visible to users...

Hence I launched the crowdfunding campaign the 19th of November and we reached 1/3 of the minimal funding goal in only three days! This was mainly thanks to the support of individuals from the MathML community. In mid december we reached the minimal funding goal after a significant contribution from the KWARC Group (Jacobs University Bremen, Germany) with which I have been in communication since the launch of the campaign. Currently, we are at 125% and this means that, minus the Ulule commision and my social/fiscal obligations, I will be able to work on the project during about 3 months.

I'd like to thank again all the companies, organizations and people who have supported the project so far! The crowdfunding campaign continues until the end of January so I hope more people will get involved. If you want better MathML in Web rendering engines and ebooks then please support this project, even a symbolic contribution. If you want to do a more significant contribution as a company/organization then note that Ulule is only providing a service to organize the crowdfunding campaign but otherwise the funding is legally treated the same as required by my self-employed status; feel free to contact me for any questions on the project or funding and discuss the long term perspective.

Finally, note that I've used my savings and I plan to continue like that until the official project launch in February. Below is a summary of what have been done during the five weeks before the holiday season. This is based on my weekly updates for supporters where you can also find references to the Bugzilla entries. Thanks to the Apple & Mozilla developers who spent time to review my patches!

Collection of documents

The goal is to show how to use existing tools (LaTeXML, itex2MML, tex4ht etc) to build EPUB books for science and education using Web standards. The idea is to cover various domains (maths, physics, chemistry, education, engineering...) as well as Web features. Given that many scientific circles are too much biased by "math on paper / PDF" and closed research practices, it may look innovative to use the Open Web but to be honest the MathML language and its integration with other Web formats is well established for a long time. Hence in theory it should "just work" once you have native MathML support, without any circonvolutions or hacks. Here are a couple of features that are tested in the sample EPUB books that I wrote:

  • Rendering of MathML equations (of course!). Since the screen size and resolution vary for e-readers, automatic line breaking / reflowing of the page is "naturally" tested and is an important distinction with respect to paper / PDF documents.
  • CSS styling of the page and equations. This includes using (Web) fonts, which are very important for mathematical publishing.
  • Using SVG schemas and how they can be mixed with MathML equations.
  • Using non-ASCII (Arabic) characters and RTL/LTR rendering of both the text and equations.
  • Interactive document using Javascript and <maction>, <input>, <button> etc. For those who are curious, I've created some videos for an algebra course and a lab practical.
  • Using the <video> element to include short sequences of an experiment in a physics course.
  • Using the <canvas> element to draw graphs of functions or of physical measurements.
  • Using WebGL to draw interactive 3D schemas. At the moment, I've only adapted a chemistry course and used ChemDoodle to load Crystallographic Information Files (CIF) and provide 3D-representation of crystal structures. But of course, there is not any problem to put MathML equations in WebGL to create other kinds of scientific 3D schemas.

WebKit

I've finished some work started as a MathJax developer, including the maction support requested by the KWARC Group. I then tried to focus on the main goals: rendering of token elements and more specifically operators (spacing and stretching).

  • I improved LTR/RTL handling of equations (full RTL support is not implemented yet and not part of the project goal).
  • I improved the maction elements and implemented the toggle actiontype.
  • I refactored the code of some "mrow-like" elements to make them all behave like an <mrow> element. For example while WebKit stretched (some) operators in <mrow> elements it could not stretch them in <mstyle>, <merror> etc Similarly, this will be needed to implement correct spacing around operators in <mrow> and other "mrow-like" elements.
  • I analyzed more carefully the vertical stretching of operators. I see at least two serious bugs to fix: baseline alignment and stretch size. I've uploaded an experimental patch to improve that.
  • Preliminary work on the MathML Operator Dictionary. This dictionary contains various properties of operators like spacing and stretchiness and is fundamental for later work on operators.
  • I have started to refactor the code for mi, mo and mfenced elements. This is also necessary for many serious bugs like the operator dictionary and the style of mi elements.
  • I have written a patch to restore support for foreign objects in annotation-xml elements and to implement the same selection algorithm as Gecko.

Gecko

I've continued to clean up the MathML code and to mentor volunteer contributors. The main goal is the support for the Open Type MATH table, at least for operator stretching.

  • Xuan Hu's work on the <mpadded> element landed in trunk. This element is used to modify the spacing of equations, for example by some TeX-to-MathML generators.
  • On Linux, I fixed a bug with preferred widths of MathML token elements. Concretely, when equations are used inside table cells or similar containers there is a bug that makes equations overflow the containers. Unfortunately, this bug is still present on Mac and Windows...
  • James Kitchener implemented the mathvariant attribute (e.g used by some tools to write symbols like double-struck, fraktur etc). This also fixed remaining issues with preferred widths of MathML token elements. Khaled Hosny started to update his Amiri and XITS fonts to add the glyphs for Arabic mathvariants.
  • I finished Quentin Headen's code refactoring of mtable. This allowed to fix some bugs like bad alignment with columnalign. This is also a preparation for future support for rowspacing and columnspacing.
  • After the two previous points, it was finally possible to remove the private "_moz-" attributes. These were visible in the DOM or when manipulating MathML via Javascript (e.g. in editors, tree inspector, the html5lib etc)
  • Khaled Hosny fixed a regression with script alignments. He started to work on improvements regarding italic correction when positioning scripts. Also, James Kitchener made some progress on script size correction via the Open Type "ssty" feature.
  • I've refactored the stretchy operator code and prepared some patches to read the OpenType MATH table. You can try experimental support for new math fonts with e.g. Bill Gianopoulos' builds and the MathML Torture Tests.

Blink/Trident

MathML developments in Chrome or Internet Explorer is not part of the project goal, even if obviously MathML improvements to WebKit could hopefully be imported to Blink in the future. Users keep asking for MathML in IE and I hope that a solution will be found to save MathPlayer's work. In the meantime, I've sent a proposal to Google and Microsoft to implement fallback content (alttext and semantics annotation) so that authors can use it. This is just a couple of CSS rules that could be integrated in the user agent style sheet. Let's see which of the two companies is the most reactive...