Revision 93414 of Pythonic Modules

Revision slug: CommonJS/Modules/Pythonic_Modules
Revision title: Pythonic Modules
Revision id: 93414
Created: Feb 2, 2009, 10:32:44 PM
Creator: Chris
Is current revision? No
Comment 1 words added, 1 words removed

Revision Content

This proposal describes a module system similar to the one implemented in Helma NG. I call it Pythonic Modules because it is heavily inspired by the way modules are implemented in Python. It provides protection against name collisions by isolating module scopes, while being reasonably easy to implement in a server or standalone JavaScript runtime.

I have written about this in other places, but here I try to restate its essential properties in a more general way. For the purpose of this proposal, I will refer to a generic JavaScript runtime that implements the Pythonic module system. Note that while our JavaScript runtime uses the file system to store and access its modules, this is not a requirement of the proposal.

Scripts are modules

For our JavaScript runtime, every script represents a module. This is true for all scripts, regardless of whether they are part of a core library or a user-written application. There is nothing special a script must contain in order to make it a module.

Module names

Our JavaScript runtime finds modules by looking for files within one or more directories, which we call module directories. A module name translates to a file name by adding the .js extension to it. Thus, when our JavaScript runtime tries to load a module named A, it will look for a file called A.js in its module directories. Modules that live in subdirectories of a module directory are accessed using a dotted module name where each element in the module name corresponds to an element in the file path. For example, a module named A.B.C will cause our JavaScript runtime to search its module directories for a file called A/B/C.js.
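The name-to-file mapping described above can be sketched as follows. The helper names moduleNameToPath and candidatePaths are hypothetical and not part of the proposal, and the order in which module directories are probed is an assumption.

```javascript
// Hypothetical helper: maps a dotted module name to a relative file path.
function moduleNameToPath(moduleName) {
  // Each dot-separated element becomes one path element; ".js" is appended.
  return moduleName.split('.').join('/') + '.js';
}

// Hypothetical helper: lists the files a runtime might probe, one per
// module directory, in the order the directories are given (an assumption).
function candidatePaths(moduleName, moduleDirectories) {
  const relative = moduleNameToPath(moduleName);
  return moduleDirectories.map(function (dir) {
    // Drop any trailing slashes so the joined path has exactly one separator.
    return dir.replace(/\/+$/, '') + '/' + relative;
  });
}

const single = moduleNameToPath('A');
const nested = moduleNameToPath('A.B.C');
const probes = candidatePaths('A.B.C', ['lib', 'app/modules/']);
```

Here single is the file name for module A, nested is the path for module A.B.C, and probes holds one candidate path per module directory.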

Every module has its own scope

This is maybe the most radical step away from JavaScript as we know it, since the shared global scope is one of JavaScript's more prominent features. But it is also one of the most criticized ones, and one that will seriously hamper the development of truly large-scale applications. As it turns out, giving each script its own top-level scope is both easy and backwards compatible.

When our JavaScript runtime starts up, it creates the familiar global JavaScript object containing the Object, Array, Date, Math, etc. objects. However, whenever the JavaScript runtime loads a module, it doesn't use the global object but instead creates a new, empty JavaScript object to evaluate the module on. This object, which we call the module scope, has two important features:

It represents a top-level scope, i.e. its parent scope is set to null.
It has the shared global object in its prototype chain.

This ensures that module code never pollutes the shared global object (or any other module scope, for that matter), because the module scope is the top-most object in its own scope chain. Module code can still see the standard global objects through the module scope's prototype chain. Users of our JavaScript runtime can just write global functions and variables, even accidentally omitting the var keyword, without any risk of interfering with other modules or global code.
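The prototype relationship between a module scope and the shared global object can be modelled with plain objects. This is only a sketch: sharedGlobal is a stand-in for the runtime's real global object, createModuleScope is a hypothetical helper, and the null parent scope mentioned above is an engine-level detail that plain JavaScript objects cannot express.

```javascript
// Stand-in for the runtime's shared global object (illustrative only).
const sharedGlobal = { Math: Math, Array: Array };

// Hypothetical helper: a fresh, empty module scope whose prototype is the
// shared global object.
function createModuleScope(globalObject) {
  return Object.create(globalObject);
}

const scopeA = createModuleScope(sharedGlobal);
const scopeB = createModuleScope(sharedGlobal);

// A definition made by module A lands on module A's own scope object.
scopeA.counter = 1;

// Reads of standard objects fall through the prototype chain.
const seesMath = scopeA.Math === sharedGlobal.Math;

// The definition is visible neither on the shared global nor in module B.
const leakedToGlobal = Object.prototype.hasOwnProperty.call(sharedGlobal, 'counter');
const leakedToB = 'counter' in scopeB;
```

Writes always create own properties on the module scope, while lookups that miss continue to the shared global object. That asymmetry is what gives isolation without hiding Object, Array, and the other standard objects.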

Importing modules

Since modules are shielded from each other, there must be a well-defined way for one module to load and make use of another module. Our JavaScript runtime provides one global require() function to allow one module to load and use another:

require('modulename')

This causes our JavaScript runtime to load the module with the given name, evaluate it on a module scope, and return the module scope to the caller.
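A minimal sketch of this load, evaluate, and return sequence is shown below. It makes several assumptions: modules are kept in memory as functions that receive their scope (standing in for evaluating a script file on that scope), the function is named requireModule only so it does not clash with a host's own require(), and the module name util.text is invented for illustration.

```javascript
// Stand-in for the runtime's shared global object (illustrative only).
const sharedGlobal = { Math: Math };

// Hypothetical in-memory "files": each body receives its module scope,
// standing in for a script being evaluated on that scope.
const moduleBodies = {
  'util.text': function (scope) {
    scope.shout = function (s) { return s.toUpperCase() + '!'; };
  }
};

// Sketch of the proposal's require(): load, evaluate on a scope, return it.
function requireModule(moduleName) {
  const body = moduleBodies[moduleName];
  if (!body) {
    throw new Error('Module not found: ' + moduleName);
  }
  const scope = Object.create(sharedGlobal); // fresh module scope
  body(scope);                               // "evaluate" the module on it
  return scope;                              // the caller receives the scope
}

// The importing module binds the returned scope to a name of its choosing.
const text = requireModule('util.text');
const result = text.shout('hello');
```

Because the caller receives the module scope itself, everything the module defined at its top level is reachable as a property of the returned object.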

Visibility of loaded modules

One great feature of the separated module scopes is protection against name collisions as described above. Another, maybe equally important feature is the fact that imported modules and module properties are only visible to the very modules that imported them.
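The visibility rule can be illustrated with three module scopes modelled as plain objects. The module names helper, app, and other are invented for this sketch, and assigning the imported scope to a property stands in for a module storing the result of require() in a variable of its own.

```javascript
// Stand-in for the runtime's shared global object (illustrative only).
const sharedGlobal = {};

// Three module scopes, each with the shared global as its prototype.
const scopes = {
  helper: Object.create(sharedGlobal),
  app: Object.create(sharedGlobal),
  other: Object.create(sharedGlobal)
};

scopes.helper.double = function (n) { return n * 2; };

// "app" imports "helper" by binding the returned scope to a name of its own.
scopes.app.helper = scopes.helper;

// Only the importing module can see the imported module.
const appSeesHelper = 'helper' in scopes.app;
const otherSeesHelper = 'helper' in scopes.other;
const globalSeesHelper = 'helper' in sharedGlobal;
const doubled = scopes.app.helper.double(21);
```

The import is just a property on the importing module's scope, so a third module has to perform its own import before it can use helper.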

Module loading and caching

Our JavaScript runtime keeps a map of loaded modules. Before loading a module, the runtime first checks whether it has already been loaded. If so, the existing module scope is reused. Our runtime also makes sure module scopes are registered in the loaded module map before evaluating them, so that it can deal with cyclic imports.
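The following sketch shows why registering the scope before evaluation matters for cyclic imports. As an assumption made for illustration, modules are in-memory functions that receive their scope and the loader; the names requireModule, loadedModules, and moduleBodies are hypothetical.

```javascript
// Stand-in for the runtime's shared global object (illustrative only).
const sharedGlobal = {};

// Map of module name to module scope for everything loaded so far.
const loadedModules = Object.create(null);

// Counts how many times a module body actually runs.
let evaluations = 0;

// Two hypothetical modules that import each other.
const moduleBodies = {
  a: function (scope, load) {
    evaluations++;
    scope.name = 'a';
    scope.b = load('b'); // a imports b
  },
  b: function (scope, load) {
    evaluations++;
    scope.name = 'b';
    scope.a = load('a'); // b imports a, closing the cycle
  }
};

function requireModule(moduleName) {
  // Reuse the existing module scope if the module was loaded before.
  if (moduleName in loadedModules) {
    return loadedModules[moduleName];
  }
  const body = moduleBodies[moduleName];
  if (!body) {
    throw new Error('Module not found: ' + moduleName);
  }
  const scope = Object.create(sharedGlobal);
  // Register BEFORE evaluating, so a cyclic import finds this scope
  // instead of triggering a second, endlessly recursive load.
  loadedModules[moduleName] = scope;
  body(scope, requireModule);
  return scope;
}

const a = requireModule('a');
const again = requireModule('a');
```

When b imports a during the cycle, it receives a's scope while that scope is still only partially populated. This is the usual trade-off of the approach: the cycle terminates, but a module should not rely on properties of a cyclic import at load time.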

¹) Helma NG currently only consults the exported properties for include(), since this is the feature that is most prone to unintentional scope pollution.

Revision Source

<p>This proposal describes a module system similar to the one implemented in <a class="external" href="https://github.com/hns/helma-ng/tree/master" title="https://github.com/hns/helma-ng/tree/master">Helma NG</a>. I call it Pythonic Modules because it is heavily inspired by the way modules are <a class="external" href="https://pytut.infogami.com/node8.html" title="https://pytut.infogami.com/node8.html">implemented in Python</a>. It provides protection against name collisions by isolating module scopes, while being reasonably easy to implement in a server or standalone JavaScript runtime.</p>
<p>I have written about this in <a class="external" href="https://dev.helma.org/wiki/Modules+and+Scopes+in+Helma+NG/" title="https://dev.helma.org/wiki/Modules+and+Scopes+in+Helma+NG/">other places</a>, but here I try to restate its essential properties in a more general way. For the purpose of this proposal, I will refer to a generic JavaScript runtime that implements the Pythonic module system. Note that while our JavaScript runtime uses the file system to store and access its modules, this is not a requirement of the proposal.</p>
<h3>Scripts are modules</h3>
<p>For our JavaScript runtime, every script represents a module. This is true for all scripts, regardless of whether they are part of a core library or a user-written application. There is nothing special a script must contain in order to make it a module.</p>
<h3>Module names</h3>
<p>Our JavaScript runtime finds modules by looking for files within one or more directories, which we call <em>module directories</em>. A module name translates to a file name by adding the .js extension to it. Thus, when our JavaScript runtime tries to load a module named A, it will look for a file called A.js in its module directories. Modules that live in subdirectories of a module directory are accessed using a dotted module name where each element in the module name corresponds to an element in the file path. For example, a module named A.B.C will cause our JavaScript runtime to search its module directories for a file called A/B/C.js.</p>
<h3>Every module has its own scope</h3>
<p>This is maybe the most radical step away from JavaScript as we know it, since the shared global scope is one of JavaScript's more prominent features. But it is also one of the most criticized ones, and one that will seriously hamper the development of truly large-scale applications. As it turns out, giving each script its own top-level scope is both easy and backwards compatible.</p>
<p>When our JavaScript runtime starts up, it creates the familiar global JavaScript object containing the Object, Array, Date, Math, etc. objects. However, whenever the JavaScript runtime loads a module, it doesn't use the global object but instead creates a new, empty JavaScript object to evaluate the module on. This object, which we call the <em>module scope</em>, has two important features: </p>
<ol> <li>It represents a top-level scope, i.e. its parent scope is set to null.</li> <li>It has the shared global object in its prototype chain.</li>
</ol>
<p>This ensures that module code never pollutes the shared global object (or any other module scope, for that matter), because the module scope is the top-most object in its own scope chain. Module code can still see the standard global objects through the module scope's prototype chain. Users of our JavaScript runtime can just write global functions and variables, even accidentally omitting the var keyword, without any risk of interfering with other modules or global code.</p>
<h3>Importing modules</h3>
<p>Since modules are shielded from each other, there must be a well-defined way for one module to load and make use of another module. Our JavaScript runtime provides one global require() function to allow one module to load and use another:</p>
<p><strong>require('modulename')</strong></p>
<p>This causes our JavaScript runtime to load the module with the given name, evaluate it on a module scope, and return the module scope to the caller.</p>
<h3>Visibility of loaded modules</h3>
<p> One great feature of the separated module scopes is protection against name collisions as described above. Another, maybe equally important feature is the fact that imported modules and module properties are only visible to the very modules that imported them. </p>
<h3>Module loading and caching</h3>
<p>Our JavaScript runtime keeps a map of loaded modules. Before loading a module, the runtime first checks whether it has already been loaded. If so, the existing module scope is reused. Our runtime also makes sure module scopes are registered in the loaded module map <em>before</em> evaluating them, so that it can deal with cyclic imports.</p>
<p>¹) Helma NG currently only consults the exported properties for include(), since this is the feature that is most prone to unintentional scope pollution.</p>
