Improving SpiderMonkey’s load() for Command-line JavaScript

December 20, 2008

In a previous post on command-line JavaScript, I gave a trick for writing JavaScript files that can be run from the command line as well as loaded from other JavaScript files.

Let’s examine it more closely.

Let’s say you have two JavaScript files, a.js and b.js.

a.js

//usr/bin/env js -e 'var __main__ = true;' $0 $*; exit

load("b.js")

b.js

//usr/bin/env js -e 'var __main__ = true;' $0 $*; exit

if(typeof __main__ !== 'undefined') {
    print("Only run this if I am run from the command-line.");
}

Now you run ./a.js. Unfortunately, b.js will also print its message: __main__ was set by a.js’s launcher line and is still defined when b.js is loaded.

One solution is to give the __main__ variable more information than before. For example, we could do:

a.js

//usr/bin/env js -e 'var __main__ = "a.js";' $0 $*; exit

load("b.js")

b.js

//usr/bin/env js -e 'var __main__ = "b.js";' $0 $*; exit

if(typeof __main__ !== 'undefined' && __main__ === "b.js") {
    print("Only run this if I am run from the command-line.");
}


This happily gives us the results we want.
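The check in b.js could be wrapped in a small helper so that each file only has to state its own name once in the test. This is a minimal sketch, not part of the original trick: is_main is a hypothetical name, and __main__ is set by hand here to stand in for the launcher line.

```javascript
// Stand-in for the launcher line: js -e 'var __main__ = "b.js";'
var __main__ = "b.js";

// Hypothetical helper: true only when the named file is the one
// that was started from the command line.
function is_main(name) {
	return typeof __main__ !== 'undefined' && __main__ === name;
}

var run_b = is_main("b.js"); // true: b.js was run directly
var run_a = is_main("a.js"); // false: a.js was not
```

The helper itself would still have to live somewhere every file can reach, which is part of why a small library ends up being the tidier option.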

This is more complexity than I want, though. Every file now has to repeat its own name in both the launcher line and the check, which is becoming error-prone.

Here is a small library which solves the problem and also provides once-only loading.

load_once.js

//usr/bin/env js -e 'var __main__ = "load_once.js";' $0 $*; exit

//Protect this file from multiple loading.
if(typeof load_once === 'undefined') {
	// When we load a file, if it follows the __main__ convention
	// we want it not to execute its __main__ code.
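	var load_protecting_main = function(path) {
		var __main__old;
		// Hide __main__ (if set) so the loaded file skips its main block.
		if(typeof __main__ !== 'undefined') {
			__main__old = __main__;
			__main__ = undefined;
		}
		try {
			load(path);
		} finally {
			// Restore __main__ even if load() throws.
			if(typeof __main__old !== 'undefined') {
				__main__ = __main__old;
			}
		}
	};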

	// load() throws an exception on failure and returns nothing on success.
	// safe_load() returns true on success and false on failure.
	var safe_load = function(path) {
		var result = true;
		try {
			load_protecting_main(path);
		} catch(e) {
			result = false;
		}
		return result;
	};

	//reload is an alias for safe_load: use it to load a file again even if it was already loaded.
	var reload = safe_load;

	load_protecting_main("memoize.js");

	//Doesn't work for pathological loading, e.g. load("a.js") vs. load("../pwd/a.js")
	var load_once = memoize_1(safe_load, {"memoize.js": true, "load_once.js": true});
}
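The comment about pathological loading refers to the cache being keyed on the exact path string. One partial mitigation is to normalize the path before using it as a key. The sketch below is an assumption of mine, not part of load_once.js: normalize_path is a hypothetical helper that only collapses "." and ".." segments. It does not know the current directory, so a path like "../pwd/a.js" still gets a different key from "a.js".

```javascript
// Hypothetical helper, not part of load_once.js: collapse "." and ".."
// segments so that equivalent relative paths share one cache key.
// It does not resolve symlinks or consult the current directory.
function normalize_path(path) {
	var is_absolute = path.charAt(0) === "/";
	var parts = path.split("/");
	var out = [];
	for (var i = 0; i < parts.length; i++) {
		var part = parts[i];
		if (part === "" || part === ".") {
			continue; // skip empty segments and "."
		}
		if (part === ".." && out.length > 0 && out[out.length - 1] !== "..") {
			out.pop(); // ".." cancels the previous segment
		} else if (part === ".." && is_absolute) {
			continue; // "/.." is still "/"
		} else {
			out.push(part);
		}
	}
	return (is_absolute ? "/" : "") + out.join("/");
}

var key_a = normalize_path("./libs/../a.js"); // "a.js"
var key_b = normalize_path("a.js");           // "a.js"
var key_c = normalize_path("../pwd/a.js");    // "../pwd/a.js" (unchanged)
```

To use it, load_once would be built from a wrapper that normalizes its argument before it reaches the memoized function.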

memoize.js

//usr/bin/env js -e 'var __main__ = true;' $0 $*; exit

//Takes a function f of one argument and returns a new function
//which can be used in place of f, but caches results.
//If f is recursive, then make sure to use the form
// f = memoize_1(f);
// Or the recursive calls to f won't use the memoized version.
memoize_1 = function(f, initial_cache) {
	var cache = initial_cache || {};
	return function(i) {
		return cache[i] || (cache[i] = f(i));
	};
}
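To see how memoize_1 turns a loader into a once-only loader, here is a small sketch that runs without the shell. fake_load and load_once_demo are hypothetical names: fake_load stands in for safe_load and simply records which paths it was asked for. Note that a false result is not cached, so a failed load is attempted again on the next call.

```javascript
// Same definition as memoize.js, repeated so this sketch runs on its own.
var memoize_1 = function(f, initial_cache) {
	var cache = initial_cache || {};
	return function(i) {
		return cache[i] || (cache[i] = f(i));
	};
};

// Hypothetical stand-in for safe_load: records each path it is asked
// to load and reports success for everything except "missing.js".
var loaded = [];
var fake_load = function(path) {
	loaded.push(path);
	return path !== "missing.js";
};

var load_once_demo = memoize_1(fake_load, {"memoize.js": true});

load_once_demo("a.js");       // calls fake_load, caches true
load_once_demo("a.js");       // served from the cache
load_once_demo("memoize.js"); // pre-seeded, never calls fake_load
load_once_demo("missing.js"); // returns false, which is not cached
load_once_demo("missing.js"); // so fake_load is called again
// loaded is now ["a.js", "missing.js", "missing.js"]
```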

This library isn’t perfect. load_once.js and memoize.js need to be in the same directory as the script that uses them. For example, if load_once.js is loaded from a subdirectory, it will fail to find memoize.js, because load() resolves the relative path "memoize.js" from the directory the program is run from, not from the directory that contains load_once.js. I haven’t found an elegant solution for this problem yet.

To illustrate:

Imagine you have a folder structure like so:

…/prog/main.js

…/prog/load_once.js

…/prog/memoize.js

If main.js uses load("load_once.js"), everything works fine.

If we have the following folder structure:

…/prog/main.js

…/prog/libs/memoize.js

…/prog/libs/load_once.js

If main.js attempts to load "libs/load_once.js", it won’t work: load_once.js then asks for "memoize.js", which is looked up next to main.js rather than in libs.
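One inelegant workaround is to have the caller say where the libraries live and prefix every path with that directory. The sketch below only illustrates the idea under that assumption: make_loader, load_lib, and record_load are hypothetical names, and record_load stands in for load() so the sketch runs outside the shell. load_once.js would also have to use such a loader for its own call to load memoize.js, which is why this is not a real fix.

```javascript
// Hypothetical workaround: build a loader that prefixes every relative
// path with a base directory chosen by the caller.
// raw_load is whatever actually loads a file (load in the SpiderMonkey shell).
function make_loader(base_dir, raw_load) {
	// Strip trailing slashes from base_dir, then add exactly one.
	var prefix = base_dir === "" ? "" : base_dir.replace(/\/+$/, "") + "/";
	return function(path) {
		// Leave absolute paths alone.
		var full = path.charAt(0) === "/" ? path : prefix + path;
		return raw_load(full);
	};
}

// Stand-in for load() so the sketch runs outside the shell.
var requested = [];
var record_load = function(path) { requested.push(path); };

var load_lib = make_loader("libs/", record_load);
load_lib("load_once.js"); // asks for "libs/load_once.js"
load_lib("memoize.js");   // asks for "libs/memoize.js"
```

In the shell, the second argument would be load itself, as in make_loader("libs", load).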
