An XPCOM component to parse mathematical expressions into MathML (part 1)
By fredw on Thursday, September 16 2010, 23:30 - Permalink
I have started to write a math parser usable by Mozilla-based applications. In particular, this could help to add math editing features to Mozilla's editors (BlueGriffon™, KompoZer, Thunderbird, etc.). Of course, it will also be usable by Mozilla extensions such as Firemath. I will probably give more details in subsequent blog posts, but the main ideas are given in this one.
First, the parser is usable through classical XPCOM calls. With the current interface, we get something like this (in JavaScript):
mathparser = Components.classes['@mozilla.org/editor/mathparser;1'].
  createInstance(Components.interfaces.nsIMathParser);
node = mathparser.parse(document, input,
  Components.interfaces.nsIMathParser.MATHPARSER_MODE_SIMPLE);
where input is a string representing a mathematical formula and node is the output MathML tree. Note the third parameter of the parse function, which selects the parser mode. For the moment, I have only written one mode, for very basic formulas. However, I plan to add at least a LaTeX-like mode.
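To make the call above a bit more concrete, here is a minimal sketch of a helper that parses a formula and appends the resulting tree to an element. The name insertFormula is hypothetical, and the sketch assumes that parse returns a DOM node that can be appended, and that a null result means nothing was produced; the interface may behave differently on errors.

```javascript
// Hypothetical helper, not part of the component.
// `parser` is expected to expose parse(document, input, mode) as in the call above.
function insertFormula(parser, doc, target, input, mode) {
  const node = parser.parse(doc, input, mode);
  // Assumption: a null/undefined result means the parser produced no tree.
  if (!node) {
    return null;
  }
  target.appendChild(node);
  return node;
}

// Possible use from privileged (chrome) code, where Components is available:
//   const parser = Components.classes['@mozilla.org/editor/mathparser;1'].
//     createInstance(Components.interfaces.nsIMathParser);
//   insertFormula(parser, document, document.body, input,
//     Components.interfaces.nsIMathParser.MATHPARSER_MODE_SIMPLE);
```

Passing the parser in as an argument keeps the helper independent of how the component is instantiated.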
As an example, if we provide the following input strings "{∑_{i=1}^{+∞} 1/n^2} = π^2/6" and "{∫_0^{+∞} {ⅆx}/{4(x+1)√x}} = π/4", the parser outputs:
Note that the core engine is produced using the well-known parser generator Bison, and hence it will be easy to adapt the work of itex2MML. The lexical analyzer is written by hand, which gives us a very nice feature: Unicode support! If people are interested, I have some patches that apply to mozilla-central...
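To illustrate what a hand-written lexical analyzer with Unicode support can look like, here is a small sketch in JavaScript. It is not the lexer of the component (which is written in C++ and feeds a Bison grammar); the token types and the character classes chosen here are assumptions made for the example.

```javascript
// Illustrative sketch only: token names and character classes are assumptions.
function tokenize(input) {
  const tokens = [];
  // Array.from splits the string by code point, so characters outside
  // the Basic Multilingual Plane are kept whole.
  const chars = Array.from(input);
  let i = 0;
  while (i < chars.length) {
    const c = chars[i];
    if (/\s/u.test(c)) {
      i++; // skip whitespace
    } else if (/\p{Nd}/u.test(c)) {
      // group consecutive decimal digits into one number token
      let text = "";
      while (i < chars.length && /\p{Nd}/u.test(chars[i])) {
        text += chars[i++];
      }
      tokens.push({ type: "number", text: text });
    } else if (/\p{L}/u.test(c)) {
      // any Unicode letter (x, π, ⅆ, ...) is a one-character identifier
      tokens.push({ type: "identifier", text: c });
      i++;
    } else if (c === "{" || c === "}") {
      tokens.push({ type: "group", text: c });
      i++;
    } else if (c === "_" || c === "^") {
      tokens.push({ type: "script", text: c });
      i++;
    } else {
      // everything else (+, =, /, ∑, ∫, √, ∞, ...) is treated as an operator
      tokens.push({ type: "operator", text: c });
      i++;
    }
  }
  return tokens;
}
```

For instance, tokenize("π^2/6") yields five tokens: an identifier, a script marker, a number, an operator and a number. A real grammar would need finer classes (for example, distinguishing ∑ from +), which is where the parser generator comes in.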
Comments
Have you seen AsciiMathML (http://www1.chapman.edu/~jipsen/mat...)?
Hrm, you had to do that in C++? That should be eminently doable in JavaScript...
Wow! Superb! Yes, BlueGriffon is obviously interested but I'm with bsmedberg here: a JS component would be much better. If you can't, please consider an add-on adding the component to any xulrunner-based product.
It's not only doable in js, it has been done already, see comment 1 and the live demo here:
http://www1.chapman.edu/~jipsen/mat...
Writing a parser by hand becomes difficult when the grammar is complex. I haven't found a parser generator for JavaScript, except jison, but I'm not sure it is as well tested and stable as bison. Moreover, I think using C++ & bison produces a much more efficient parser than JavaScript + DOM. And of course, to write a LaTeX mode, I will be able to rely on the work of itex2MML. I've made the patches available here:
http://www.maths-informatique-jeux....
For the moment, I've only tested it with mozilla-central. An add-on adding the mathparser component remains an option...
I like the spirit of your efforts, but ASCIIMathML.js is pretty sublime. Personally, I think we (er, by which I mean browser enthusiasts, I guess) should be pushing hard for mathematically inclined people to use it --- or something very much like it --- as a standard.
> Writing a parser directly becomes difficult when the grammar is complex. I haven't found a parser generator for javascript, except jison
What about https://code.google.com/p/antlr-jav... ?