Developers Club geek daily blog

2 years, 8 months ago
Translator's note: I am a server-side Java programmer, but for historical reasons I work only on Windows. Everyone else on the team is on Mac or Linux, but someone has to test the live web interfaces of our projects in real IE, and who else would that be but me? So I have used it for many years, both out of professional necessity and, owing to laziness, as my main browser. In my opinion, with each new version starting with the ninth it has become more and more respectable, and Project Spartan promises to be excellent. At least on the technology roadmap it stands as an equal among the others. Here is my translation of an article from the developers' blog that gives some grounds for this hope.

Providing DOM L3 XPath compatibility


Having set ourselves the goal of delivering a reasonably compatible, modern web platform in Windows 10, we are constantly working on improving standards support, in particular for DOM L3 XPath. Today we would like to explain how we achieved this in Project Spartan.

A bit of history


Before implementing support for the DOM L3 Core standard and native XML documents in IE9, we exposed the MSXML library to web developers via the ActiveX mechanism. Besides the XMLHttpRequest object, MSXML also provided partial support for the XPath query language through a set of proprietary APIs, selectSingleNode and selectNodes. From the point of view of applications using MSXML, this approach simply worked. However, it did not conform to the W3C standards at all, neither for interacting with XML nor for working with XPath.

Library authors and site developers had to wrap their XPath calls in order to switch between implementations on the fly. If you search the web for XPath tutorials or examples, you will immediately notice wrappers for IE and MSXML, for example:
// code for IE
if (window.ActiveXObject || xhttp.responseType == "msxml-document") {
    xml.setProperty("SelectionLanguage", "XPath");
    var nodes = xml.selectNodes(path);
    for (var i = 0; i < nodes.length; i++) {
        document.write(nodes[i].childNodes[0].nodeValue);
        document.write("<br>");
    }
}

// code for Chrome, Firefox, Opera, etc.
else if (document.implementation && document.implementation.createDocument) {
    var nodes = xml.evaluate(path, xml, null, XPathResult.ANY_TYPE, null);
    var result = nodes.iterateNext();

    while (result) {
        document.write(result.childNodes[0].nodeValue);
        document.write("<br>");
        result = nodes.iterateNext();
    }
}

For our new engine, oriented toward a plug-in-free web, we needed to provide native XPath support.

Evaluating the options


We immediately started evaluating the available implementation options. We could write one from scratch, fully integrate MSXML into the browser, or port System.XML from .NET, but all of these would take too much time. So we decided to start by implementing support for a core subset of XPath, while thinking through a full implementation in parallel.

To determine which initial subset of the standard was worth tackling, we used an internal tool that collects statistics on queries across hundreds of thousands of the most popular sites. It turned out that queries of the following types occur most often:
  • //element1/element2/element3
  • //element[@attribute="value"]
  • .//*[contains(concat(" ", @class, " "), " classname ")]

Each of these maps nicely to a certain CSS selector, which can be redirected to the very fast implementation of the CSS selectors API that we already have. Compare:
  • element1 > element2 > element3
  • element[attribute="value"]
  • *.classname

Thus, the first step in implementing XPath support was to write a converter from XPath queries to CSS selectors and to redirect the call to the right place. Having done that, we again used our telemetry to measure the percentage of successful queries, and to find out which of the unsuccessful ones occur most often.
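To illustrate the idea, a minimal converter for the three common query shapes listed above might look like the following sketch (hypothetical code, not Spartan's actual implementation):

```javascript
// Sketch: translate the handful of most common XPath shapes into
// equivalent CSS selectors; return null when a query is not recognized.
function xpathToCss(xpath) {
  // //a/b/c  ->  a > b > c  (each XPath step selects a direct child)
  var simplePath = /^\/\/(\w+(?:\/\w+)*)$/.exec(xpath);
  if (simplePath) {
    return simplePath[1].split("/").join(" > ");
  }
  // //element[@attr="value"]  ->  element[attr="value"]
  var attrTest = /^\/\/(\w+)\[@([\w-]+)\s*=\s*"([^"]*)"\]$/.exec(xpath);
  if (attrTest) {
    return attrTest[1] + '[' + attrTest[2] + '="' + attrTest[3] + '"]';
  }
  // .//*[contains(concat(" ", @class, " "), " cls ")]  ->  *.cls
  var classTest =
    /^\.\/\/\*\[contains\(concat\(" ", @class, " "\), " ([\w-]+) "\)\]$/.exec(xpath);
  if (classTest) {
    return "*." + classTest[1];
  }
  return null; // not convertible — needs a real XPath engine
}
```

A query like `//element[@attribute="value"]` would come out as `element[attribute="value"]`, while anything with functions or positional predicates falls through to the `null` branch.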

It turned out that this implementation already covers 94% of queries and immediately makes a multitude of sites work. Among the unsuccessful ones, the majority were of the types:
  • //element[contains(@class, "className")]
  • //element[contains(concat(" ", normalize-space(@class), " "), " className ")]
and both map nicely to the element.className selector. By adding that rule, we improved support to 97% of sites, which in practice means the new engine is ready for the modern web.
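That extra rule is just one more pattern-to-selector mapping. A hedged sketch of it (hypothetical helper, not the engine's code) is below; note that, strictly speaking, contains(@class, "x") matches substrings such as class="prefix-x", while the CSS class selector matches whole tokens, so the first mapping is a pragmatic approximation rather than an exact equivalence:

```javascript
// Sketch of the additional rule: both "contains @class" XPath idioms
// collapse to the same element.className CSS selector.
function classXPathToCss(xpath) {
  var m =
    /^\/\/(\w+)\[contains\(@class,\s*"([\w-]+)"\)\]$/.exec(xpath) ||
    /^\/\/(\w+)\[contains\(concat\(" ", normalize-space\(@class\), " "\),\s*" ([\w-]+) "\)\]$/.exec(xpath);
  return m ? m[1] + "." + m[2] : null; // e.g. div.className
}
```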

[Chart: DOM L3 XPath support in Project Spartan — results of running telemetry on XPath queries]

Supporting the remaining 3% of sites


Supporting the vast majority of XPath queries by simple conversion to CSS selectors is great, but it is still not enough, because the remainder cannot be implemented the same way. The XPath grammar includes advanced features such as functions, queries against non-DOM elements and document nodes, and complex predicates. Some authoritative sites (including MDN) suggest that platforms lacking adequate native XPath support use polyfill libraries in such cases.

For example, wicked-good-xpath (WGX), which is written in pure JS. We ran it against our internal test suite for the XPath specification, and compared to a native implementation it showed 91% compatibility, along with very decent performance. So the idea of using WGX for the remaining 3% of sites looked very attractive to us. Moreover, it is an open-source project under the MIT license, which fits nicely with our intention to contribute more and more to open source. However, we had never before used a JavaScript polyfill in IE to provide support for a web standard.

To let WGX do its work without polluting the document context, we run it in a separate, isolated instance of the JS engine, passing it the query and the necessary data from the page as input and taking the finished result as output. Having modified the WGX code to work in this mode, detached from the document, we immediately improved the rendering of many sites in our new browser.

[Screenshots: sites before WGX was enabled]

[Screenshots: the same sites after. Note the prices and winning ticket numbers that now appear]

However, WGX also had bugs that made it deviate both from the W3C specification and from other browsers. We intend to fix them all first, and then share the patches with the community.

Thus, through some data mining across the web and with the help of an open-source library, our new engine quickly gained productive XPath support, and users will soon get better support for web standards. You can download the next Windows 10 Technical Preview and see for yourself. You can also tell us how well it turned out via UserVoice, tweet at us, or leave a comment on the original article.

P.S. from the translator: the trend of JavaScript becoming a language that platforms are written in is, as they say, real. Take Firefox's Shumway, or PDF.js. And now Microsoft is moving its browser, at least partially, to JS.

This article is a translation of the original post at habrahabr.ru/post/253595/

