
1 year, 1 month ago

This post offers advice on how to avoid writing code that performs far worse than expected. It mainly concerns situations where the V8 engine (used in Node.js, Opera, Chromium, etc.) refuses to optimize certain functions.

Features of V8


This engine has no interpreter, but it does have two different compilers: a generic one and an optimizing one. This means your JS code is always compiled and run directly as native code. You think that makes it fast? You are mistaken. Compiling to native code does not improve performance much by itself. It only removes the interpreter; unoptimized code will still run slowly.

For example, in the generic compiler the expression a + b looks like this:

mov eax, a
mov ebx, b
call RuntimeAdd

This is nothing more than a call to the corresponding runtime function. If a and b are always integers, the code can look like this instead:

mov eax, a
mov ebx, b
add eax, ebx

This version runs much faster than a call that has to handle the complicated additional JS semantics at runtime. In other words, the generic compiler produces unoptimized, "raw" code, and the optimizing compiler refines it into its final form. Optimized code can be 100 times faster than generic code. The catch is that you cannot write just any JS code and expect it to be optimized. There are many programming patterns (some of them even idiomatic) that the optimizing compiler refuses to handle.
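To give a feel for the "additional JS semantics" a generic addition has to cover, here is a small sketch. The function name add is purely illustrative; the point is only that one and the same + operator has to cope with numbers, strings, and objects.

```javascript
// Hypothetical helper, used only for illustration.
function add(a, b) {
    return a + b;
}

// Integer operands: the simple case described above.
var sumOfInts = add(1, 2);

// The same operator also has to handle string concatenation...
var concatenated = add("1", 2);

// ...and objects that are converted through valueOf().
var viaValueOf = add({ valueOf: function() { return 40; } }, 2);

console.log(sumOfInts, concatenated, viaValueOf);
```

A generic runtime call has to be ready for all three cases on every invocation, which is why the integer-only version is so much cheaper.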

Note that if a pattern is not supported, it affects the entire function that contains it. Code is optimized one function at a time, and the compiler knows nothing about what the rest of the code does (unless it is a function being inlined into the function currently being optimized).

Below we cover most of the patterns that send their containing functions to "deoptimization hell". It usually makes sense to rewrite them, though the proposed workarounds may become unnecessary as the compiler learns to recognize more and more patterns.

1. Using the built-in tools


To determine how patterns affect optimization, you need to be able to run Node.js with certain V8 flags. You create a function containing a given pattern, call it with various data types, and then call internal V8 functions to optimize it and check the result:

test.js:
// Function that contains the pattern to be inspected (using with statement)
function containsWith() {
    return 3;
    with({}) {}
}

function printStatus(fn) {
    switch(%GetOptimizationStatus(fn)) {
        case 1: console.log("Function is optimized"); break;
        case 2: console.log("Function is not optimized"); break;
        case 3: console.log("Function is always optimized"); break;
        case 4: console.log("Function is never optimized"); break;
        case 6: console.log("Function is maybe deoptimized"); break;
        case 7: console.log("Function is optimized by TurboFan"); break;
        default: console.log("Unknown optimization status"); break;
    }
}

// Fill type-info
containsWith();
// 2 calls are needed to go from uninitialized -> pre-monomorphic -> monomorphic
containsWith();

%OptimizeFunctionOnNextCall(containsWith);
// The next call
containsWith();

// Check
printStatus(containsWith);

Run it:

$ node --trace_opt --trace_deopt --allow-natives-syntax test.js
Function is not optimized

To check that the setup works, comment out the with statement and run it again:

$ node --trace_opt --trace_deopt --allow-natives-syntax test.js
[optimizing 000003FFCBF74231 <JS Function containsWith (SharedFunctionInfo 00000000FE1389E1)> - took 0.345, 0.042, 0.010 ms]
Function is optimized

It is important to use these built-in tools to verify that the chosen workarounds actually work.

2. Unsupported syntax


Some constructs are explicitly not supported by the optimizing compiler, and using such syntax makes the containing function unoptimizable.

Important: even if the construct is unreachable or never executed, it still prevents the containing function from being optimized.

For example, doing this is useless:

if (DEVELOPMENT) {
    debugger;
}

This code affects the entire function even if the debugger statement is never executed.
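One possible way around this is to move the statement into a tiny helper, so that only the helper contains it. This is a sketch, not a guaranteed fix: the names DEVELOPMENT, breakpoint and hotFunction are hypothetical, and the result should be verified with the tools from section 1.

```javascript
// Hypothetical flag; in a real project it would be defined elsewhere.
var DEVELOPMENT = false;

// Only this tiny function contains the debugger statement.
function breakpoint() {
    debugger;
}

// The hot function merely calls the helper and contains no debugger statement itself.
function hotFunction(x) {
    if (DEVELOPMENT) {
        breakpoint();
    }
    return x * 2;
}

console.log(hotFunction(21));
```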

Currently not optimized:

  • generator functions;
  • functions containing a for-of statement;
  • functions containing a try-catch statement;
  • functions containing a try-finally statement;
  • functions containing a compound let assignment;
  • functions containing a compound const assignment;
  • functions containing object literals that, in turn, contain __proto__, get or set declarations.

Most likely never optimized:

  • functions containing a debugger statement;
  • functions calling eval();
  • functions containing a with statement.

To avoid any misunderstanding: if a function contains any of the following, the entire function will not be optimized:

function containsObjectLiteralWithProto() {
    return {__proto__: 3};
}

function containsObjectLiteralWithGetter() {
    return {
        get prop() {
            return 3;
        }
    };
}

function containsObjectLiteralWithSetter() {
    return {
        set prop(val) {
            this.val = val;
        }
    };
}

Direct calls to eval and with deserve a special mention because everything they touch ends up in a dynamic scope, so these statements can negatively affect many other functions once it becomes impossible to analyze what is going on.

Workaround: some of these statements cannot be avoided in production code, for example try-finally or try-catch. To minimize their harmful effect, isolate them in small functions:

var errorObject = {value: null};
function tryCatch(fn, ctx, args) {
    try {
        return fn.apply(ctx, args);
    }
    catch(e) {
        errorObject.value = e;
        return errorObject;
    }
}

var result = tryCatch(mightThrow, void 0, [1,2,3]);
// Unambiguously tells whether the call threw
if(result === errorObject) {
    var error = errorObject.value;
}
else {
    // Result is the returned value
}
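The same isolation idea can be sketched for try-finally. The helper name tryFinally and the functions work and release are hypothetical and only serve to make the example runnable.

```javascript
// Hypothetical helper: only this small function contains try-finally.
function tryFinally(fn, cleanup, ctx, args) {
    try {
        return fn.apply(ctx, args);
    }
    finally {
        cleanup();
    }
}

var log = [];

function work(a, b) {
    log.push("work");
    return a + b;
}

function release() {
    log.push("release");
}

// The caller contains no try-finally itself, yet cleanup still runs.
var finallyResult = tryFinally(work, release, void 0, [1, 2]);
console.log(finallyResult, log);
```

If fn throws, cleanup still runs and the exception propagates to the caller as usual.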

3. Using arguments


There are many ways to use arguments that make a function impossible to optimize, so be especially careful when working with arguments.

3.1. Reassigning a defined parameter while also using arguments in the function body (sloppy mode only)


Typical example:

function defaultArgsReassign(a, b) {
    if (arguments.length < 2) b = 5;
}

In this case you can save the parameter to a new variable:

function reAssignParam(a, b_) {
    var b = b_;
    // Unlike b_, b can safely be reassigned
    if (arguments.length < 2) b = 5;
}

If that was the only use of arguments in the function, it can be replaced with a check against undefined:

function reAssignParam(a, b) {
    if (b === void 0) b = 5;
}

However, if arguments is likely to be introduced into the function later, it is easy to forget that the reassignment then becomes a problem again.

Another workaround: enable strict mode ('use strict') for the file or the function.
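Strict mode changes the picture because the arguments object is no longer linked to the named parameters. A minimal sketch of that behavior (the function name strictReassign is illustrative):

```javascript
function strictReassign(a) {
    "use strict";
    a = 5;
    // In strict mode arguments[0] keeps the value that was originally passed,
    // while in sloppy mode it would follow the reassigned parameter.
    return [a, arguments[0]];
}

var pair = strictReassign(1);
console.log(pair);
```

Here the first element is the reassigned parameter and the second is the original argument, which shows that the two are independent in strict mode.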

3.2. Leaking arguments


function leaksArguments1() {
    return arguments;
}

function leaksArguments2() {
    var args = [].slice.call(arguments);
}

function leaksArguments3() {
    var a = arguments;
    return function() {
        return a;
    };
}

The arguments object must not be passed or leaked anywhere.

Proxying can be done by building the array inline:

function doesntLeakArguments() {
    // .length is just an integer, this doesn't leak
    // the arguments object itself
    var args = new Array(arguments.length);
    for(var i = 0; i < args.length; ++i) {
        // i is always valid index in the arguments object
        args[i] = arguments[i];
    }
    return args;
}

This takes a lot of code, so first decide whether it is worth the effort. Then again, optimizing generally means more code with more explicitly spelled-out semantics.

However, if your project has a build step, the same result can be achieved with a macro that does not require source maps and lets the source code remain valid JavaScript.

function doesntLeakArguments() {
    INLINE_SLICE(args, arguments);
    return args;
}

This technique is used in bluebird, and the build step turns the code into this:

function doesntLeakArguments() {
    var $_len = arguments.length;var args = new Array($_len); for(var $_i = 0; $_i < $_len; ++$_i) {args[$_i] = arguments[$_i];}
    return args;
}

3.3. Assigning to arguments


This is only possible in sloppy mode:

function assignToArguments() {
    arguments = 3;
    return arguments;
}

Workaround: simply do not write such silly code. In strict mode it results in an exception anyway.

How can arguments be used safely?


  • Use arguments.length.
  • Use arguments[i] where i is always a valid integer index into arguments and can never be out of bounds.
  • Never use arguments directly without .length or [i].
  • Strictly fn.apply(y, arguments) is fine. Nothing else is, .slice in particular. Function#apply is a special case.
  • Keep in mind that adding properties to functions (for example, fn.$inject = ...) and using bound functions (that is, the result of Function#bind) creates hidden classes, so they are not safe to use with #apply.

If you follow all of the above, using arguments will not cause the arguments object to be allocated.
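The rules above can be sketched as follows. The function names sum and forwardToSum are illustrative; the first reads arguments only through .length and an in-bounds [i], the second forwards arguments with Function#apply and nothing else.

```javascript
// Reads arguments only through .length and [i] with an in-bounds index.
function sum() {
    var total = 0;
    for (var i = 0; i < arguments.length; ++i) {
        total += arguments[i];
    }
    return total;
}

// Forwards all arguments without touching the arguments object in any other way.
function forwardToSum() {
    return sum.apply(void 0, arguments);
}

console.log(sum(1, 2, 3), forwardToSum(4, 5, 6));
```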

4. Switch-case


A switch-case statement can currently have up to 128 case clauses. If that number is exceeded, the function containing the statement cannot be optimized.

function over128Cases(c) {
    switch(c) {
        case 1: break;
        case 2: break;
        case 3: break;
        ...
        case 128: break;
        case 129: break;
    }
}

Keep the number of case clauses at 128 or fewer by using an array of functions or if-else.
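The "array of functions" variant might look like the sketch below. The handler names and return values are made up for illustration; a real table could hold far more than 128 entries.

```javascript
// Hypothetical handlers, indexed by the value that used to be the case label.
var handlers = [
    function handleZero() { return "zero"; },
    function handleOne() { return "one"; },
    function handleTwo() { return "two"; }
];

function dispatch(c) {
    var handler = handlers[c];
    // Unknown codes get a default result, like a switch without a matching case.
    if (typeof handler !== "function") {
        return "unknown";
    }
    return handler();
}

console.log(dispatch(1), dispatch(99));
```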

5. For-in


A for-in statement can prevent a function from being optimized in several ways.

5.1. The key is not a local variable


function nonLocalKey1() {
    var obj = {}
    for(var key in obj);
    return function() {
        return key;
    };
}

var key;
function nonLocalKey2() {
    var obj = {}
    for(key in obj);
}

The key cannot come from an outer scope, nor can it be referenced from an inner scope. It must be a purely local variable.

5.2. The iterated object is not a "simple enumerable"


5.2.1. Objects in "hash table mode" (also called "normalized objects" or "dictionary mode": objects whose backing data structure is a hash table) are not simple enumerables
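If a closure does need the key, one option is to copy it into a separate variable so that the loop variable itself stays purely local. This is only a sketch with illustrative names, and the effect should be checked with the tools from section 1.

```javascript
function localKeyWithClosure() {
    var obj = { a: 1, b: 2 };
    var lastKey;
    // key is declared in this function and is not referenced by any closure.
    for (var key in obj) {
        lastKey = key;
    }
    // The closure captures the copy, not the loop variable.
    return function() {
        return lastKey;
    };
}

var getLastKey = localKeyWithClosure();
console.log(getLastKey());
```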

function hashTableIteration() {
    var hashTable = {"-": 3};
    for(var key in hashTable);
}

An object can go into hash table mode when, for example, you add too many properties dynamically (outside the constructor), delete properties, use properties that are not valid identifiers, and so on. In other words, if you use an object as if it were a hash table, it turns into a hash table. Such objects must never be passed to for-in. To find out whether an object is in hash table mode, call console.log(%HasFastProperties(obj)) with the --allow-natives-syntax flag enabled in Node.js.

5.2.2. The object's prototype chain contains enumerable properties

Object.prototype.fn = function() {};

This line adds an enumerable property to the prototype chain of all objects (except those created with Object.create(null)). As a result, any function containing a for-in statement becomes unoptimizable (unless it only iterates over Object.create(null) objects).

You can create non-enumerable properties with Object.defineProperty. Calling it at runtime is not recommended, but it is just right for defining effectively static things such as prototype properties.

5.2.3. The object contains enumerable array indices

Note that what counts as an array index is defined in the ECMAScript specification:

A property name P (in the form of a string) is an array index if and only if ToString(ToUint32(P)) is equal to P and ToUint32(P) is not equal to 2^32 − 1. A property whose name is an array index is also called an element.
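A minimal sketch of defining a non-enumerable prototype property once, at startup. The Point constructor and the norm method are hypothetical names chosen for the example.

```javascript
// Hypothetical constructor used only for this example.
function Point(x, y) {
    this.x = x;
    this.y = y;
}

// Define the method once, as a non-enumerable prototype property.
Object.defineProperty(Point.prototype, "norm", {
    value: function() {
        return Math.sqrt(this.x * this.x + this.y * this.y);
    },
    enumerable: false,
    writable: true,
    configurable: true
});

var point = new Point(3, 4);
var seenKeys = [];
for (var pointKey in point) {
    seenKeys.push(pointKey);
}

console.log(point.norm(), seenKeys);
```

The method is callable as usual, but for-in only sees the own properties x and y.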
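The quoted definition can be translated into a small check. The function name isArrayIndex is illustrative; it assumes the property name is passed as a string and relies on the fact that >>> 0 performs the ToUint32 conversion.

```javascript
// Checks whether a property name (given as a string) is an array index
// according to the definition quoted above.
function isArrayIndex(name) {
    // >>> 0 applies ToUint32 to the numeric value of the string.
    var asUint32 = name >>> 0;
    // 4294967295 is 2^32 - 1, which is excluded by the definition.
    return String(asUint32) === name && asUint32 !== 4294967295;
}

console.log(isArrayIndex("0"), isArrayIndex("01"), isArrayIndex("4294967295"));
```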

Array indices usually belong to arrays, but ordinary objects can have them too:

normalObj[0] = value;

function iteratesOverArray() {
    var arr = [1, 2, 3];
    for (var index in arr) {

    }
}

Iterating over an array with for-in is slower than with a for loop, and on top of that the function containing the for-in is not optimized.

If you pass an object that is not a simple enumerable to for-in, it will negatively affect the containing function.

Workaround: always use Object.keys and iterate over the resulting array with a for loop. If you really need all properties from the prototype chain, create an isolated helper function:

function inheritedKeys(obj) {
    var ret = [];
    for(var key in obj) {
        ret.push(key);
    }
    return ret;
}
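The first part of the workaround, Object.keys plus a plain for loop, might look like this. The function name sumOwnValues is illustrative.

```javascript
// Iterates over own enumerable properties without using for-in.
function sumOwnValues(obj) {
    var keys = Object.keys(obj);
    var total = 0;
    for (var i = 0; i < keys.length; ++i) {
        total += obj[keys[i]];
    }
    return total;
}

console.log(sumOwnValues({ a: 1, b: 2, c: 3 }));
```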

6. Infinite loops with complex or unclear exit conditions


Sometimes when writing code you know you need a loop but have no idea yet what to put in it. So you type while (true) { or for (;;) {, and later insert a break into the loop, which you soon forget about. Then refactoring time comes, when the function turns out to be slow or is deoptimized outright. The forgotten exit condition may be the cause.

Refactoring the loop to move the exit condition into the conditional part of the loop statement can be nontrivial. If the condition is part of an if statement at the end of the loop and the code must run at least once, refactor the loop into do { } while ();. If the exit condition is at the beginning, move it into the conditional part of the loop. If the exit condition is in the middle, you can try "rolling" the code: each time you move a piece of code from the top of the loop body to the bottom, leave a copy of it above the loop. Once the exit condition can be checked in the loop's conditional part, or at least with a simple logical test, the loop should no longer be deoptimized.
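As an illustration of the first case (exit condition at the end of the body), here is a sketch of the same logic before and after refactoring. The digit-counting task and function names are made up for the example, and both versions assume a non-negative integer argument.

```javascript
// Before: the exit condition is hidden in a break at the end of the body.
function countDigitsBreak(n) {
    var digits = 0;
    while (true) {
        digits++;
        n = Math.floor(n / 10);
        if (n === 0) break;
    }
    return digits;
}

// After: the same logic with the exit condition in the loop statement itself.
function countDigitsDoWhile(n) {
    var digits = 0;
    do {
        digits++;
        n = Math.floor(n / 10);
    } while (n !== 0);
    return digits;
}

console.log(countDigitsBreak(12345), countDigitsDoWhile(12345));
```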

This article is a translation of the original post at habrahabr.ru/post/273839/