Archive for December, 2008

Improving SpiderMonkey’s load() for Command-line JavaScript

December 20, 2008

In a previous post on command-line JavaScript, I gave a trick for making JavaScript files that you can run from the command line as well as load from other JavaScript files.

Let’s examine it more closely.

Let’s say you have two JavaScript files, a.js and b.js.

a.js

//usr/bin/env js -e 'var __main__ = true;' $0 $*; exit

load("b.js")

b.js

//usr/bin/env js -e 'var __main__ = true;' $0 $*; exit

if(typeof __main__ !== 'undefined') {
    print("Only run this if I am run from the command-line.");
}

Now you run ./a.js. Unfortunately, b.js prints its message anyway: load() executes b.js in the same global scope, so the __main__ that was set when a.js was launched is still defined when b.js is loaded.
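The leak is easy to reproduce without the js shell. In this sketch, plain functions stand in for the two files and a function call stands in for load(); the names a_js, b_js and output are made up for the illustration.

```javascript
// Stands in for what -e 'var __main__ = true;' sets up when ./a.js is run.
var __main__ = true;

var output = [];

// Stands in for the contents of b.js.
var b_js = function() {
	if(typeof __main__ !== 'undefined') {
		output.push("Only run this if I am run from the command-line.");
	}
};

// Stands in for the contents of a.js; the call replaces load("b.js").
var a_js = function() {
	b_js();
};

a_js();
// output now holds b.js's message, even though only a.js was "run".
```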

One solution is to give the __main__ variable more information than before. For example, we could do:

a.js

//usr/bin/env js -e 'var __main__ = "a.js";' $0 $*; exit

load("b.js")

b.js

//usr/bin/env js -e 'var __main__ = "b.js";' $0 $*; exit

if(typeof __main__ !== 'undefined' && __main__ === "b.js") {
    print("Only run this if I am run from the command-line.");
}

This gives us the results we want, but it is more complexity than I want. Each file has to repeat its own name in two places, and that is starting to become too error-prone.

Here is a small library which solves the problem and also provides once-only loading.

load_once.js

//usr/bin/env js -e 'var __main__ = "load_once.js";' $0 $*; exit

//Protect this file from multiple loading.
if(typeof load_once === 'undefined') {
	// When we load a file, if it follows the __main__ convention
	// we want it not to execute its __main__ code.
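	var load_protecting_main = function(path) {
		// Remember whether __main__ was defined at all. Note the check is on
		// the variable itself, not on the string '__main__'.
		var had_main = (typeof __main__ !== 'undefined');
		var __main__old;
		if(had_main) {
			__main__old = __main__;
			__main__ = undefined;
		}
		try {
			load(path);
		} finally {
			// Restore __main__ even if load() throws.
			if(had_main) {
				__main__ = __main__old;
			}
		}
	};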

	// load() throws an exception on failure and returns nothing on success.
	// safe_load() returns true on success and false on failure.
	var safe_load = function(path) {
		var result = true;
		try {
			load_protecting_main(path);
		} catch(e) {
			result = false;
		}
		return result;
	};

	//alias for safe_load. We want to 'reload' already loaded files.
	var reload = safe_load;

	load_protecting_main("memoize.js");

	//Doesn't work for pathological loading, e.g. load("a.js") vs. load("../pwd/a.js")
	var load_once = memoize_1(safe_load, {"memoize.js": true, "load_once.js": true});
}
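The core of load_protecting_main is a save, hide, restore dance around load(). As a sketch that runs without the js shell, the version below takes the file body as a function instead of a path; run_protecting_main, fake_file, failing_file and seen_during_load are hypothetical names. It uses try/finally so that __main__ comes back even if the body throws.

```javascript
// Stands in for the state that -e 'var __main__ = "a.js";' would set up.
var __main__ = "a.js";

// Hypothetical stand-in for a loaded file: records what __main__ looked
// like while the "file" was executing.
var seen_during_load = [];
var fake_file = function() {
	seen_during_load.push(typeof __main__);
};

// Hypothetical stand-in for a file whose load fails.
var failing_file = function() {
	throw new Error("load failed");
};

// Hide __main__ while body runs, then put it back.
var run_protecting_main = function(body) {
	var had_main = (typeof __main__ !== 'undefined');
	var saved = had_main ? __main__ : undefined;
	if(had_main) {
		__main__ = undefined;
	}
	try {
		body();
	} finally {
		if(had_main) {
			__main__ = saved;
		}
	}
};

run_protecting_main(fake_file);
// The "file" saw no __main__, and the caller's value is back afterwards.

try {
	run_protecting_main(failing_file);
} catch(e) {
	// The failure still propagates, but __main__ has been restored.
}
```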

memoize.js

//usr/bin/env js -e 'var __main__ = true;' $0 $*; exit

//Takes a function f of one argument and returns a new function
//which can be used in place of f, but caches results.
//If f is recursive, then make sure to use the form
// f = memoize_1(f);
// Or the recursive calls to f won't use the memoized version.
memoize_1 = function(f, initial_cache) {
	var cache = initial_cache || {};
	return function(i) {
		// Falsy results (false, 0, "") are not treated as cached,
		// so f is called again for them.
		return cache[i] || (cache[i] = f(i));
	};
};
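To see the caching behaviour in isolation, here is a small sketch that repeats the memoize_1 definition and wraps a stand-in for safe_load(). The names fake_load, load_once_demo and calls, and the file names, are hypothetical.

```javascript
// Same definition as in memoize.js, repeated so the sketch is self-contained.
var memoize_1 = function(f, initial_cache) {
	var cache = initial_cache || {};
	return function(i) {
		return cache[i] || (cache[i] = f(i));
	};
};

// Hypothetical stand-in for safe_load(): records each call and reports
// success for every path except "missing.js".
var calls = [];
var fake_load = function(path) {
	calls.push(path);
	return path !== "missing.js";
};

var load_once_demo = memoize_1(fake_load, {"preloaded.js": true});

load_once_demo("a.js");         // calls fake_load
load_once_demo("a.js");         // cached, fake_load is not called again
load_once_demo("preloaded.js"); // seeded in the initial cache, never called
load_once_demo("missing.js");   // returns false, which does not count as cached...
load_once_demo("missing.js");   // ...so fake_load is called a second time
// calls is now ["a.js", "missing.js", "missing.js"]
```

One consequence for load_once: a file that failed to load is retried on the next call, while a file that loaded successfully is never loaded again.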

This library isn’t perfect. load_once.js and memoize.js need to be in the same directory as their users. For example, if the library lives in a subdirectory, load_once.js will fail to find memoize.js, because its call to load() resolves the path relative to the directory js was run from, not relative to the directory containing load_once.js. I haven’t found an elegant solution for this problem yet.

To illustrate:

Imagine you have a folder structure like so:

…/prog/main.js

…/prog/load_once.js

…/prog/memoize.js

If main.js uses load("load_once.js"), everything works fine.

Now suppose we have the following folder structure instead:

…/prog/main.js

…/prog/libs/memoize.js

…/prog/libs/load_once.js

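If main.js attempts load("libs/load_once.js"), it won’t work. Assuming js is run from …/prog, load_once.js itself is found, but its load("memoize.js") then looks for …/prog/memoize.js, which doesn’t exist.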

Higher-order Python Decorators

December 19, 2008

May you find them useful.

def simple_pre_decorator_maker(pre, *args, **kwargs):
	"""
	Returns a function that can be used as a decorator.
	Decorated function will call pre(*args, **kwargs) before its main contents.

	>>> def a():
	...     print "a"
	...
	>>> pre_a = simple_pre_decorator_maker(a)
	>>> @pre_a
	... def b():
	...     print "b"
	...
	>>> b()
	a
	b
	"""
	def wrapper(function):
		def new_function(*argz, **kwargz):
			pre(*args, **kwargs)
			return function(*argz, **kwargz)
		return new_function
	return wrapper

def simple_post_decorator_maker(post, *args, **kwargs):
	"""
	Returns a function that can be used as a decorator.
	Decorated function will call post(*args, **kwargs) after its main contents.

	>>> def a():
	...     print "a"
	...
	>>> post_a = simple_post_decorator_maker(a)
	>>> @post_a
	... def b():
	...     print "b"
	...
	>>> b()
	b
	a
	"""
	def wrapper(function):
		def new_function(*argz, **kwargz):
			result = function(*argz, **kwargz)
			post(*args, **kwargs)
			return result
		return new_function
	return wrapper

def overriding_pre_decorator_maker(pre):
	"""
	Returns a function that can be used as a decorator.
	Decorated function will call pre(*args, **kwargs) before its main contents,
	where args and kwargs are the arguments to the decorated function.
	If pre() returns a true value, it will be used as the decorated return value
	and the decorated function will not be called.
	If pre() returns None (or any other false value), the decorated function
	is called and its return value is used.

	>>> def none_returner():
	...     return None
	...
	>>> def true_returner():
	...     return True
	...
	>>> do_nothing_decorator = overriding_pre_decorator_maker(none_returner)
	>>> @do_nothing_decorator
	... def a_returner():
	...     return "A"
	...
	>>> a_returner()
	'A'
	>>> force_true_decorator = overriding_pre_decorator_maker(true_returner)
	>>> @force_true_decorator
	... def a_returner():
	...     return "A"
	...
	>>> a_returner()
	True
	"""
	def wrapper(function):
		def new_function(*args, **kwargs):
			return pre(*args, **kwargs) or function(*args, **kwargs)
		return new_function
	return wrapper

def overriding_post_decorator_maker(post):
	"""
	Returns a function that can be used as a decorator.
	Decorated function will return post(result) after its main contents,
	where result is the return value of the original function.

	>>> def increment(x):
	...     return x + 1
	...
	>>> successor = overriding_post_decorator_maker(increment)
	>>> @successor
	... def two():
	...     return 1
	...
	>>> two()
	2

	"""
	def wrapper(function):
		def new_function(*args, **kwargs):
			return post(function(*args, **kwargs))
		return new_function
	return wrapper

Command-line scripting with Javascript

December 19, 2008

The short version:

Bash users, start your standalone JavaScript files with:
//usr/bin/env js -e 'var __main__ = true;' $0 $*; exit

The long version:

To make standalone JavaScript files that are compatible with both bash and SpiderMonkey’s load() function, you need to be clever.

I’ve been using the SpiderMonkey JavaScript engine for command-line JavaScript as I play around with JavaScript: The Good Parts. Instead of fooling around with a web browser, I put code into text files and run them on the command line.

Aside: Bash

I use bash as my shell. 

When bash loads a file, if that file starts with something like

#!/path/to/interpreter

<rest of file>

then the interpreter is started and handed the file to run.

So, I make a file like

#!/usr/bin/env js

print("Hello world.");

and the command js sees print("Hello world."); and does its thing.

Otherwise, bash just treats the file as a bash script and interprets it with its own rules.

Aside: load()

In SpiderMonkey, the load() function takes a string parameter. The string represents a path to a file, and SpiderMonkey reads and executes the contents of the file in the context of the loader.

There’s a problem: If a file starts with #!/usr/bin/env js, load sputters out a SyntaxError, because that string isn’t valid JavaScript. We can’t use the #! syntax.

The Solution

We want a way of structuring our javascript code such that:

  1. If interpreted by bash, bash loads js and runs the file.
  2. If loaded by SpiderMonkey’s load, the JavaScript code is loaded into the loader.
  3. The JavaScript file has some way of knowing whether it has been loaded or run from the command line.

In several other scripting languages, such as Ruby, Perl and Python, '#' is the comment character, so there’s no problem with the #! line.

In JavaScript, we can ignore things using //.

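/ is the path separator on Unix systems, and on Linux and Mac OS X a doubled leading slash is treated like a single one, so //usr/bin/env names the same file as /usr/bin/env.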

So, bash will interpret the string //usr/bin/env js as a command to run, and js will ignore it.

Next, we have a little bit more magic. js expects a file name, and it can also take arguments to pass in to the JavaScript (which will be stored in the global variable arguments).

In bash syntax, $0 represents the name of the file being invoked, and $* represents the arguments.

So, //usr/bin/env js $0 $* runs js on the current file, passing it the arguments that were passed to the script.

We don’t want bash to try to run the JavaScript code, so we attach ; exit to the command to make bash stop executing the file.

Thus, the magic incantation //usr/bin/env js $0 $*; exit allows a JavaScript file to be executed on the command line and loaded from other JavaScript files.

So, we can make a file like:

//usr/bin/env js $0 $*; exit

for(var i in arguments) {
    print(arguments[i]);
}

and save it as args.js and run it like so:

js-play aran$ ./args.js Hello World.
Hello
World.
js-play aran$ 

We can also make another file like:

//usr/bin/env js $0 $*; exit

load("args.js");

and save it as argsloader.js and run it like so:

js-play aran$ chmod +x argsloader.js
js-play aran$ ./argsloader.js Hello World.
Hello
World.

Now, for a finishing touch, we want our file to know whether it has been run from bash or loaded by another JavaScript file. For this, we take advantage of SpiderMonkey’s -e option, which runs some JavaScript code before running a file.

We can use -e to define a variable. Then our script can check for the presence or absence of that variable definition and thus decide.

//usr/bin/env js -e 'var __main__ = true;' $0 $*; exit

if(typeof __main__ !== 'undefined') {
    print("I'm a standalone program.")
} else {
    print("I was loaded.");
}

Saved as argsPlay.js and made executable, it behaves like so:

js-play aran$ ./argsPlay.js 
I'm a standalone program.
js-play aran$ js -e 'load("argsPlay.js");'
I was loaded.
js-play aran$ 

And so, our final magic incantation is //usr/bin/env js -e 'var __main__ = true;' $0 $*; exit and it is good.
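As a sketch of how this might be used in practice, a file can define its functions unconditionally and run a main routine only when __main__ is present. The names greet and main are hypothetical; print() and the global arguments array are the ones the js shell provides.

```javascript
//usr/bin/env js -e 'var __main__ = true;' $0 $*; exit

// Hypothetical library function: available to anyone who load()s this file.
var greet = function(name) {
	return "Hello, " + name + ".";
};

// Hypothetical entry point: builds one greeting per command-line argument.
var main = function(args) {
	var lines = [];
	for(var i = 0; i < args.length; i++) {
		lines.push(greet(args[i]));
	}
	return lines;
};

// Only reached when the file was started via the first line above.
if(typeof __main__ !== 'undefined') {
	var greetings = main(arguments);
	for(var j = 0; j < greetings.length; j++) {
		print(greetings[j]);
	}
}
```

Keeping the printing inside the __main__ branch means a loader gets greet and main without any output as a side effect.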

Stop Reading This Blog

December 4, 2008

You’re still here. Well, stick around, I’ll keep it interesting.

I know some people read this blog. You might have a blog of your own. You might have been hurt when you found out that, unless you’re my thesis supervisor, I don’t read your blog or any other blog for that matter: I quit Google Reader many months ago and I get the Third Bit via email.

Courtesy of Hacker News, I just found a great write-up of my reasons for this: “Real Advice Hurts.” Go read it.

It’s great when someone smarter than you says something you’ve been wanting to say.

This really doesn’t make sense unless you read the link.

I’ve been groggily waking up to my ambitions’ need for more do less learn. It’s why I sought out a mentor who gets things done. It’s part of why I joined a ferociously pragmatic organization founded by a ferociously pragmatic old friend. Real Advice Hurts hit hard. I love to learn—really, truly, deeply love—and it’s a trap and it’s killing me.

But! Blogging about this won’t help me. Your blog reader won’t help you. So! Go away. If you come back, guard your time. Stop reading this blog.