[BUG] Bad memory leak

Report · Aug 12, 2018

Hi Guys

I wanted to demonstrate a bad memory bug that seems to apply to all the CC apps that I tried.

The leak is relevant to all the persistent engines. It probably has been around from the beginning of ExtendScript.

If one has a function and that function has a nested function in the form of var foo = function then all the nested functions within the wrapper function will be stuck in memory and every time one call the outer function the inner functions will be created (which is not necessarily great) and stored in memory. Garbage collection will not help.

Here's a simple example

No leak

function outer() {
    function inner1(){}
    function inner2(){}
    function inner3(){}
}
// At this point we should have 4 functions in memory
outer();
outer();
outer();
outer();
// At this point we should still only have 4 functions in memory
// At this point we have 4 functions in memory

Leak

function outer() {
    var inner1 = function(){};
    function inner2(){}
    function inner3(){}
}
// At this point we should have 4 functions in memory
outer();
outer();
outer();
outer();
// At this point we should still only have 4 functions in memory
// AT THIS POINT WE HAVE 16 FUNCTIONS IN MEMORY

Note it's not just the "var" function that leaks, the "var" function causes all the functions to leak.

The are 2 possible work-arounds that I can think of.

1) Don't use "var" functions, this is not generally recommended and linters will show errors if you do that freely but it might still be an option here.

2) Use Prototypes the instead of inner functions, this has the pro of not having the inner functions created on each calling or the outer function and the con of a lack of privacy.

I was wondering if others had any comments on this.

Also Marc Autret one posted a fancy dancy memory listing script using an undocumented $ method and a flashy UI I thought that might be useful for investigating but I can't find it. So if anyone can share a link for that it would be appreciated.

The below snippet demonstrates the leak, The $.leak function is probably similar to the summary difference function by Dirk Becker his site was offline and I decided to write my own version.

Don't forget to escape the #targetengine line if not executing in InDesign.

 // DEMO
 #targetengine transient // Comment this line out is not on InDesign
$.leak = $.leak || function(){
    var oldSummary, newSummary, summaryObject, diffObject, key, diffArrray, report, count, value;
    oldSummary = this.oldSummary || {};
    newSummary = $.summary();
    summaryObject = {};
    diffObject = {};
    newSummary.replace(/(\d+) (\S+)/g, function(whole, count, prop){
        var diff;
        count = +count;
        summaryObject[prop] = count;
        diff = count - (oldSummary[prop] || 0);
        if (diff) {
            diffObject[prop] = diff;
        }
    });
    for (key in oldSummary){
        if(!summaryObject[key]){
            diffObject[key] = -oldSummary[key];
        }
    }
    this.oldSummary = summaryObject;
    diffArrray = [];
    for (key in diffObject){
        value = diffObject[key];
        if (value > 0) { value = '+' + value;}
        diffArrray.push([key, '\t' + value , '\t' + summaryObject[key]]);
    }
    diffArrray.sort(function(a,b){
        if (a[1] === b[1]) { return 0;}
        if (+a[1] > +b[1]) { return -1;}
        return 1;
    });
    count = diffArrray.length;
    if(!count) {return 'No change from last $.leak()';}
    report = diffArrray.join('\n').replace(/,/g, ': ');
    return count + ' memory change' + (count > 1 ? 's' : '') + '\n' + diffArrray.join('\n').replace(/,/g, ': ');
};
$.leak();
$.leak();
function withVarFunction(){
    // As soon as we have a var foo = function all inner functions leak!
    var innerFunction1 = function(){};
    function innerFunction2(){};
    function innerFunction3(){};
}
withVarFunction();
withVarFunction = null;
$.gc();
$.gc();
alert('withVarFunction\n' +$.leak());
function withoutVarFunction(){
    function innerFunction1 (){};
    function innerFunction2(){};
    function innerFunction3(){};
}
withoutVarFunction();
$.gc();
$.gc();
alert('withoutVarFunction\n' +$.leak());
withoutVarFunction = null;
$.gc();
$.gc();
alert('After null\n' +$.leak());

Regards

Trevor

Report · Aug 12, 2018

According to this , they recommend that an anonymous function should not be assigned to a variable. I am wondering, what would happen if their suggestion is followed to var inner1 = function inner1(){}; ?

Sorry, too scared to try the lead code myself at the moment

Report · Aug 12, 2018

Doesn't help (you had my hopes up for a moment but it didn't last 🙂

Report · Aug 12, 2018

Hey I think it's solved!

function withVarFunction(){ 
    // As soon as we have a var foo = function all inner functions leak! 
    var innerFunction1 = function(){}; 
    function innerFunction2(){}; 
    function innerFunction3(){};
    innerFunction1 = null; // as long as one sets the var to null inside the function then there's no leek!
}

I really think people should take note!

Report · Aug 12, 2018

Hi Trevor,

Thanks for those interesting points. Indeed, assigning a function expression to a variable from within a function scope is not a great practice. And, more generally, creating more function references than actually needed is always dangerous regarding garbage collection.

The problem you describe is not intrinsically related to using var, it globally regards every circumstance where some extra function reference is not released, and then all functions references that belong to the same scope are locked.

PART 1. Consider the following example:

function test()
{
    function f1(){};
    function f2(){};
    function f3(){};
    return f1;
}
test();

Note. — Lines 01 to 07 encode a function declaration (FC.) It is important to keep in mind that FCs are parsed and registered before any instruction of the scope they belong to. Assuming my test func is declared in the global scope, at the script level, you may append alert(test) above line 01 and this would successfully show the code of that function. So, there is no point in testing some condition before and after the FC, for the declaration event has already happened and its cost in memory is strictly adding one function reference, namely the function test. As long as this function is not executed—which only occurs at line 09—the inner functions f1, f2, f3 simply don't exist. So it is wrong to believe that three additional function references have been registered in memory before line 09 being executed.

At line 09, test is executed. It then builds its own scope, with three FCs for f1, f2, f3. Each FC strictly coincides with a single function reference.

Note. — Technically, object references are handled through counters. As long as the reference counter of some object is not zero, the garbage collection mechanism cannot relax that object, considering that some reference, somewhere, still points out to it.

Now, an important fact about functions within a same lexical scope (here, test's scope) is, no function can reset its counter to zero as long as there exists some reference, in whatever scope, to any of those functions. The reason is, I think, that the interpreter ignores what may happen in f1, f2, or f3 (those functions haven't been executed, and they won't) but it must guarantee that the own scope of any function can access to the parent scope (that's the JS closure paradigm.)

So, if by any means a reference to either f1, f2, or f3 still exists when test() returns, then these specific instances of f1, f2, and f3 must be preserved (this is done in ExtendScript throughout the special [[workspace]] objects.) By chance, in the above example, the outer code does not retrieve the returned value of the test function, so $.gc() will work fine. But if you change line 09 into, say,

$.result = test();

then f1,f2,f3 references remain locked—until you unset the reference $.result itself, using e.g $.result=0.

Note, also, that the function f1 which test returns when called the first time is not the function f1 which it would return when called again! That is, each time you call test(), you actually invoke three FC and you actually create three new function references within a new [[workspace]].

PART 2. Now, what is the problem with var? Let's consider the following variant:

function test()
{
    var f1 = function(){};
    function f2(){};
    function f3(){};
}
test();

In this case, test's scope shows two FC (for f2 and f3) and one function expression (FE) assigned to f1. Technically, f1 is then an extra reference to some anonymous function which, anyway, had its own counter initialized to 1. So, this particular function has now its reference counter set to 2, while f2 and f3 counters are still set to 1. Here, it is your responsability to release the extra reference, using e.g f1=0; at the end of test's execution scope. Why? Because the garbage collector is not smart enough to see that f1 and the function it presently refers to will remain linked forever. The GC needs your help to kill the reference attached to the variable f1. I'm not sure I could explain this phenomenon further, but I don't regard this as a bug.

By the way, your tests get exactly the same results in ExtendScript CS4.

Best,

Marc

Report · Aug 12, 2018

Thanks Marc, I shall read again in the morning when I wake up.

Report · Aug 12, 2018

Hello Marc, excellent piece of information shared by you as always.

The first part of the explanation makes sense but i have just a slight query on the second scenario. In this case f1 is a local scope variable, then won't it be automatically gc'ed as we go out of scope once the enclosing function execution finishes.

Would like to know your views on this. I may be missing some obvious point here, excuse my ignorance in that case.

Report · Aug 12, 2018

@Marc, Good points.

The danger is there even without var.

In summary (combined).

1. Take care using nested functions whether they are declarations or expressions.

2. Returning a function is asking for trouble.

3. Make sure to help out the garbage collector and set all function references to null when they are done with.

4. Point 3 applies even (if not especially) to local functions which we would have thought would be garbage collected automatically but fact will not unless set to null.

Report · Aug 13, 2018

Does any of this danger apply to the constructor function pattern?

function MyObj() {

this.myMethod = function(){

alert("Hello world!");

}

var obj = new MyObj();

Report · Aug 13, 2018

@ Silly-V

Does any of this danger apply to the constructor function pattern?
function MyObj() {
   this.myMethod = function(){
       alert("Hello world!");
   }
}
var obj = new MyObj();

No, it doesn't seem so.

Used in a constructor—I mean, in a function called as a constructor—the syntax

this.myMethod = <funcExpression>;

has the advantage of attaching the function reference to the outer, newly created object.

So, assuming you reset obj to zero downstream and call $.gc(), the workspace should vanish.

Note, if MyObj is not invoked as a constructor, you will need to relax the outer ref as well:

function MyObj(){ this.myMethod=function(){} }
MyObj();
$.global.myMethod = 0; // relax the ref
$.gc();  
alert( $.summary() );  // no workspace 🙂

@+

Marc

Report · Aug 13, 2018

Thanks so much for providing the constructor advice!

This is going to be a post oft-linked in the surrounding few years, no doubt!

As for me, at least I can have the peace of mind to not go back to re-write my core objects as prototypes.

Thanks again.

Report · Aug 22, 2018

Hi All,

I've done a little more research on the topic.

Here's a few links of interest.

Re: Memory Leak - targetengine?!

Re: Re: How to keep track of all the classes/methods/properties created in a long script This has Marc Autret's "fancy dancy memory listing script using an undocumented $ method and a flashy UI" I was looking for. I can see that being helpful.

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Memory_Management

A big discovery for me was the effect of try { } catch (e) { }

It's great to distribute scripts with nice error reporting but wrapping a whole script in a giant try scope enclosure can result in a giant leak.

So my conclusion is:

1) Don't do have giant try catches one can use small enclosures for dodgey DOM operations (In Illustrator most DOM operations seem to be dodgey 🙂 but they should be limited to that.

2) Have 2 versions of each version the script, (a) Distribution (b) Development / Debug. Bite one's pride and distribute the "Distribute" script which will give those horrible "meaningless" standard ExtendScript errors. When you get a complaint screenshot from your client(s) send them the debug version which can provide a more meaningful message.

Note: All this discussion is only relevant for persistent engines. For InDesign's "main" engine you can be prettymuch as sloppy as you like.

HTH

Trevor

P.s. to get all the hidden undocumented $ methods / properties we can use reflect. We could do that to make a list of all the undocumented methods and properties comparing with the OMV or dictionary data. What fun.

$.reflect.methods

// Result: about, toString, write, writeln, bp, getenv, setenv, sleep, colorPicker, evalFile, list, listLO, summary, gc

// list, listLO, summary are the undocumented ones

$.reflect.properties

// Result: error, version, build, buildDate, global, stack, level, flags, strict, locale, localize,

decimalPoint, memCache, appEncoding, screens, os, fileName, line, hiresTimer, dictionary,

engineName, includePath, __proto__

Report · Aug 22, 2018

I use many small try catches to build an error log which the customer can screenshoot and send me, or I use functions which return false or some error as an object without the try/catch - since many Illustrator operations rely on required artwork being present, etc.

Although the level of meaning to the errors is up to myself to write in, but I guess I'm wanting to understand what you mean by your own debug/distribute workflow as for me it's one and the same at this point.

Report · Aug 13, 2018

@Manan

Thanks for your comment.

(…) i have just a slight query on the second scenario. In this case f1 is a local scope variable, then won't it be automatically gc'ed as we go out of scope once the enclosing function execution finishes. Would like to know your views on this.

That's definitely a good question and I confess I do not master this side of the topic, which deeply involves JavaScript scope chain, execution context and all these pretty things. My intuition is, by contrast with f2 and f3 which are lexically scoped at creation time, f1 is assigned at runtime and generates an extra reference to the anonymous function it points out to at that time. By extra reference I simply mean a ‘+1’ in the ref counter associated to that function. In the code block of test, many other statements would have a similar effect: we could write as well var f1 = f2, or just callee.F = f2, etc. In any of these cases, an extra reference is created increasing the ref count of some function (e.g f2) in this execution context. And, as you know, any pointer to any inner function in the execution context has the effect of locking lexical entities (due to the particular way JavaScript handles function scopes, closure, etc.)

Now, what we experiment in ExtendScript is that the GC is smart enough to release the natural function references resulting from function declarations. Maybe just because the subsystem can easily check that the counter is 1 for every function in the scope, and then can conclude than no sibling reference exist. But, should any function have its counter > 1, the GC does not seem to investigate whether the extra references are, in fact, no longer employed. So it must keep locked the whole lexical space, even after the test function has returned, and then the infamous memory leak occurs.

I'm not sure whether my view is correct, that's just the picture I've in mind when dealing with such issue. I know I have to manually remove extra references—which is done by reassigning any scalar value.

Best,

Marc

Report · Aug 13, 2018

Great observations, you have great observation skills in addition to the exceptional information sharing skills. Keep up the good work Marc!

Report · Aug 13, 2018

As always, garbage collection doesn't mean "no more leaks"... In general, is there a memory leak detection tool for such cases like Deleaker for C++, Valgrind for C etc., or each time one should just be careful and that's all?

Report · Aug 13, 2018

This is against common & basic expectations, I think.

Report · Aug 13, 2018

If $.summary() still contains (workspace) objects after double $.gc() at the end of the script, you can bet you have a functional memory leak.

Adobe Community

[BUG] Bad memory leak

1 Correct answer