Technical overview

This page is largely a summary of Bill McCloskey's blog post on multiprocess Firefox: https://billmccloskey.wordpress.com/2013/12/05/multiprocess-firefox/

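At a very high level, multiprocess Firefox works as follows. The process that starts up when Firefox launches is called the parent process. Initially, this process works the same way as single-process Firefox: it opens a window displaying browser.xul, which contains all the principal UI elements for Firefox. Firefox has a flexible GUI toolkit called XUL that allows GUI elements to be declared and laid out much like web content. Just like web content, the Firefox UI has a window object, which has a document property, and this document contains all the XML elements from browser.xul. All the Firefox menus, toolbars, sidebars, and tabs are XML elements in this document. Each tab element contains a <browser> element to display web content.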

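The main difference between multiprocess Firefox and single-process Firefox is that each <browser> element has a remote="true" attribute. When such a browser element is added to the document, a new content process is started. This process is called a child process. A channel for communication between the parent and child processes is created. Initially, the child displays about:blank, but the parent can send the child a command to navigate elsewhere.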

Drawing

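Somehow, displayed web content needs to get from the child process to the parent and then to the screen. Multiprocess Firefox depends on a new Firefox feature called off main thread compositing (OMTC). In brief, each Firefox window is broken into a series of layers, somewhat similar to layers in Photoshop. Each time Firefox draws, these layers are submitted to a compositor thread that clips and translates the layers and combines them into a single image.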

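Layers are structured as a tree. The root layer of the tree is responsible for the entire Firefox window. This layer contains other layers, some of which are responsible for drawing the menus and tabs. One subtree displays all the web content. Web content itself may be broken into multiple layers, but they are all rooted at a single "content" layer.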

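In multiprocess Firefox, the content layer subtree is actually a shim. Most of the time, it contains a placeholder node that simply keeps a reference to the IPC link with the child process. The content process holds the actual layer tree for web content. It builds and draws to this layer tree. When it is done, it sends the structure of its layer tree to the parent process via IPC. When the parent receives the layer tree, it removes its placeholder content node and replaces it with the actual tree from content. Then it composites and draws as normal. When it is done, it puts the placeholder back.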
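The placeholder swap described above can be sketched in a few lines of JavaScript. This is only a toy model, not Gecko code: makeLayer, composite, findParent, and compositeFrame are hypothetical names, "compositing" is reduced to listing layer names in draw order, and the IPC transfer of the content tree is assumed to have already happened.

```javascript
// Toy model of the parent-side layer tree. All names are hypothetical.
function makeLayer(name, children = []) {
  return { name, children };
}

// Stand-in for compositing: list layer names in depth-first draw order.
function composite(layer) {
  return [layer.name, ...layer.children.flatMap(composite)];
}

// Find the layer whose children include the target layer.
function findParent(layer, target) {
  if (layer.children.includes(target)) return layer;
  for (const child of layer.children) {
    const found = findParent(child, target);
    if (found) return found;
  }
  return null;
}

// Composite one frame: swap the placeholder for the tree received from
// the content process, composite, then put the placeholder back.
function compositeFrame(root, placeholder, contentTree) {
  const owner = findParent(root, placeholder);
  const index = owner.children.indexOf(placeholder);
  owner.children[index] = contentTree;
  try {
    return composite(root);
  } finally {
    owner.children[index] = placeholder;
  }
}

const placeholder = makeLayer("content-placeholder");
const root = makeLayer("window", [
  makeLayer("menus"),
  makeLayer("tabs"),
  placeholder,
]);
// Assumed to have arrived from the child process over IPC.
const contentTree = makeLayer("content-root", [
  makeLayer("page-background"),
  makeLayer("page-text"),
]);

const frame = compositeFrame(root, placeholder, contentTree);
console.log(frame.join(" > "));
```

After compositeFrame returns, the parent's tree again contains only the placeholder, which matches the "puts the placeholder back" step in the text.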

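The basic architecture of how OMTC works with multiple processes has existed for some time, since it is needed for Firefox OS. However, Matt Woodrow and David Anderson have done a great deal of work to get everything working properly on Windows, Mac, and Linux. One of the big challenges for multiprocess Firefox is getting OMTC enabled on all platforms. Right now, only Macs use it by default.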

User input

Events in Firefox work the same way as they do on the web. Namely, there is a DOM tree for the entire window, and events are threaded through this tree in capture and bubbling phases. Imagine that the user clicks on a button on a web page. In single-process Firefox, the root DOM node of the Firefox window gets the first chance to process the event. Then, nodes lower down in the DOM tree get a chance. The event handling proceeds down through to the XUL <browser> element. At this point, nodes in the web page’s DOM tree are given a chance to handle the event, all the way down to the button. The bubble phase follows, running in the opposite order, all the way back up to the root node of the Firefox window.

With multiple processes, event handling works the same way until the <browser> element is hit. At that point, if the event hasn’t been handled yet, it gets sent to the child process by IPC, where handling starts at the root of the content DOM tree. The parent process then waits to run its bubbling phase until the content process has finished handling the event.
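To make the ordering concrete, here is a minimal sketch of the phases for a click on a button. It is a simplification under several assumptions: the function names and node names are invented, each DOM tree is flattened to the single path from its root to the target, and the IPC round trip to the child is modelled as a direct function call rather than an asynchronous message.

```javascript
// Capture and bubble through the content DOM in the child process.
function dispatchInChild(contentPath) {
  const log = [];
  for (const node of contentPath) log.push(`child capture: ${node}`);
  for (const node of [...contentPath].reverse()) log.push(`child bubble: ${node}`);
  return log;
}

// Capture through the chrome DOM in the parent down to <browser>,
// hand the event to the child, and only then run the parent's bubble phase.
function dispatchClick(chromePath, contentPath) {
  const log = [];
  for (const node of chromePath) log.push(`parent capture: ${node}`);
  // Stand-in for sending the event over IPC and waiting for the child.
  log.push(...dispatchInChild(contentPath));
  for (const node of [...chromePath].reverse()) log.push(`parent bubble: ${node}`);
  return log;
}

const eventLog = dispatchClick(
  ["window", "tabbrowser", "browser"],
  ["document", "body", "button"]
);
console.log(eventLog.join("\n"));
```

The log shows the parent's capture phase first, then the child's complete capture and bubble phases, and finally the parent's bubble phase back up to the window.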

Inter-process communication

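All IPC happens using the Chromium IPC libraries. Each child process has its own separate IPC link with the parent. Children cannot communicate directly with each other. To prevent deadlocks and to ensure responsiveness, the parent process is not allowed to sit around waiting for messages from the child. However, the child is allowed to block on messages from the parent.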

Rather than sending raw packets of data over IPC as one might expect, we use code generation to make the process much nicer. The IPC protocol is defined in IPDL, which sort of stands for “inter-* protocol definition language”. A typical IPDL file is PNecko.ipdl. It defines a set of messages and their parameters. Parameters are serialized and included in the message. To send a message M, C++ code just needs to call the method SendM. To receive the message, it implements the method RecvM.

IPDL is used in all the low-level C++ parts of Gecko where IPC is required. In many cases, IPC is just used to forward actions from the child to the parent. This is a common pattern in Gecko:

void AddHistoryEntry(param) {
  if (XRE_GetProcessType() == GeckoProcessType_Content) {
    // If we're in the child, ask the parent to do this for us.
    SendAddHistoryEntry(param);
    return;
  }

  // Actually add the history entry...
}

bool RecvAddHistoryEntry(param) {
  // Got a message from the child. Do the work for it.
  AddHistoryEntry(param);
  return true;
}

When AddHistoryEntry is called in the child, we detect that we’re inside the child process and send an IPC message to the parent. When the parent receives that message, it calls AddHistoryEntry on its side.

For a more realistic illustration, consider the Places database, which stores visited URLs for populating the awesome bar. Whenever the user visits a URL in the content process, we call this code. Notice the content process check followed by the SendVisitURI call and an immediate return. The message is received here; this code just calls VisitURI in the parent.

The code for IndexedDB, the places database, and HTTP connections all runs in the parent process, and they all use roughly the same proxying mechanism in the child.

Frame scripts

IPDL takes care of passing messages in C++, but much of Firefox is actually written in JavaScript. Instead of using IPDL directly, JavaScript code relies on the message manager to communicate between processes. To use the message manager in JS, you need to get hold of a message manager object. There is a global message manager, message managers for each Firefox window, and message managers for each <browser> element. A message manager can be used to load JS code into the child process and to exchange messages with it.

As a simple example, imagine that we want to be informed every time a load event triggers in web content. We’re not interested in any particular browser or window, so we use the global message manager. The basic process is as follows:

// Get the global message manager.
let mm = Cc["@mozilla.org/globalmessagemanager;1"].
         getService(Ci.nsIMessageListenerManager);

// Wait for load event.
mm.addMessageListener("GotLoadEvent", function (msg) {
  dump("Received load event: " + msg.data.url + "\n");
});

// Load code into the child process to listen for the event.
mm.loadFrameScript("chrome://content/content-script.js", true);

For this to work, we also need to have a file content-script.js:

// Listen for the load event.
addEventListener("load", function (e) {
  // Inform the parent process.
  let docURL = content.document.documentURI;
  sendAsyncMessage("GotLoadEvent", {url: docURL});
}, false);

This file is called a frame script. When the loadFrameScript function call runs, the code for the script is run once for each <browser> element. This includes both remote browsers and regular ones. If we had used a per-window message manager, the code would only be run for the browser elements in that window. Any time a new browser element is added, the script is run automatically (this is the purpose of the true parameter to loadFrameScript). Since the script is run once per browser, it can access the browser’s window object and docshell via the content and docShell globals.

The great thing about frame scripts is that they work in both single-process and multiprocess Firefox. To learn more about the message manager, see the message manager guide.
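The "once per browser, plus any browser added later" behaviour can be illustrated with a toy stand-in for the global message manager. This is not the real API: ToyGlobalMessageManager and addBrowser are hypothetical, the frame script is passed as a function rather than a chrome:// URL, the script reports the URL immediately instead of waiting for a load event, and messages are delivered synchronously to keep the sketch short.

```javascript
// Hypothetical toy model of a global message manager.
class ToyGlobalMessageManager {
  constructor() {
    this.browsers = [];
    this.delayedScripts = [];
    this.listeners = new Map();
  }

  addMessageListener(name, listener) {
    if (!this.listeners.has(name)) this.listeners.set(name, []);
    this.listeners.get(name).push(listener);
  }

  // Run the script in every existing browser; if allowDelayedLoad is
  // true, remember it so that browsers added later run it too.
  loadFrameScript(script, allowDelayedLoad) {
    for (const browser of this.browsers) this._run(script, browser);
    if (allowDelayedLoad) this.delayedScripts.push(script);
  }

  // Stand-in for a new <browser> element being added to a window.
  addBrowser(browser) {
    this.browsers.push(browser);
    for (const script of this.delayedScripts) this._run(script, browser);
  }

  // Give the script its own `content` and `sendAsyncMessage` globals.
  _run(script, browser) {
    const sendAsyncMessage = (name, data) => {
      for (const listener of this.listeners.get(name) || []) {
        listener({ name, data, target: browser });
      }
    };
    script({ content: browser.content, sendAsyncMessage });
  }
}

const mm = new ToyGlobalMessageManager();
const received = [];
mm.addMessageListener("GotLoadEvent", (msg) => received.push(msg.data.url));

mm.addBrowser({ content: { document: { documentURI: "https://example.org/a" } } });
mm.loadFrameScript(({ content, sendAsyncMessage }) => {
  sendAsyncMessage("GotLoadEvent", { url: content.document.documentURI });
}, true);
// Added after loadFrameScript: the script still runs because of `true`.
mm.addBrowser({ content: { document: { documentURI: "https://example.org/b" } } });

console.log(received);
```

Passing false instead of true in this model would leave the second browser without the script, so only the first URL would be reported.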

Cross-process APIs

There are a lot of APIs in Firefox that cross between the parent and child processes. An example is the webNavigation property of XUL <browser> elements. The webNavigation property is an object that provides methods like loadURI, goBack, and goForward. These methods are called in the parent process, but the actions need to happen in the child. First I’ll cover how these methods work in single-process Firefox, and then I’ll describe how we adapted them for multiple processes.

The webNavigation property is defined using the XML Binding Language (XBL). XBL is a declarative language for customizing how XML elements work. Its syntax is a combination of XML and JavaScript. Firefox uses XBL extensively to customize XUL elements like <browser> and <tabbrowser>. The <browser> customizations reside in browser.xml. Here is how browser.webNavigation is defined:

<field name="_webNavigation">null</field>

<property name="webNavigation" readonly="true">
   <getter>
   <![CDATA[
     if (!this._webNavigation)
       this._webNavigation = this.docShell.QueryInterface(Components.interfaces.nsIWebNavigation);
     return this._webNavigation;
   ]]>
   </getter>
</property>

This code is invoked whenever JavaScript code in Firefox accesses browser.webNavigation, where browser is some <browser> element. It checks if the result has already been cached in the browser._webNavigation field. If it hasn’t been cached, then it fetches the navigation object based off the browser’s docshell. The docshell is a Firefox-specific object that encapsulates a lot of functionality for loading new pages, navigating back and forth, and saving page history. In multiprocess Firefox, the docshell lives in the child process. Since the webNavigation accessor runs in the parent process, this.docShell above will just return null. As a consequence, this code will fail completely.

One way to fix this problem would be to create a fake docshell in C++ that could be returned. It would operate by sending IPDL messages to the real docshell in the child to get work done. We may eventually take this route in the future. We decided to do the message passing in JavaScript instead, since it’s easier and faster to prototype things there. Rather than change every docshell-using accessor to test if we’re using multiprocess browsing, we decided to create a new XBL binding that applies only to remote <browser> elements. It is called remote-browser.xml, and it extends the existing browser.xml binding.

The remote-browser.xml binding returns a JavaScript shim object whenever anyone uses browser.webNavigation or other similar objects. The shim object is implemented in its own JavaScript module. It uses the message manager to send messages like "WebNavigation:LoadURI" to a content script loaded by remote-browser.xml. The content script performs the actual action.
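A rough sketch of what such a shim might look like follows. Only the message name "WebNavigation:LoadURI" comes from the text above; the class name RemoteWebNavigationShim, the "WebNavigation:GoBack" message, the toy channel, and the fake docshell are assumptions made for illustration and do not reproduce the actual Firefox module.

```javascript
// Hypothetical parent-side shim: each method just posts a message.
class RemoteWebNavigationShim {
  constructor(messageManager) {
    this._mm = messageManager;
  }
  loadURI(uri) {
    this._mm.sendAsyncMessage("WebNavigation:LoadURI", { uri });
  }
  goBack() {
    // Message name assumed by analogy with "WebNavigation:LoadURI".
    this._mm.sendAsyncMessage("WebNavigation:GoBack", {});
  }
}

// Toy stand-in for a per-browser message manager: messages are queued
// and only delivered to the child when flush() is called.
function makeToyChannel() {
  const queue = [];
  const handlers = new Map();
  return {
    sendAsyncMessage(name, data) {
      queue.push({ name, data });
    },
    addMessageListener(name, handler) {
      handlers.set(name, handler);
    },
    flush() {
      while (queue.length > 0) {
        const msg = queue.shift();
        const handler = handlers.get(msg.name);
        if (handler) handler(msg);
      }
    },
  };
}

const channel = makeToyChannel();

// Child side: a content script that performs the actual navigation
// against a fake docshell holding a session history.
const fakeDocShell = { history: ["about:blank"], index: 0 };
channel.addMessageListener("WebNavigation:LoadURI", (msg) => {
  fakeDocShell.history.splice(fakeDocShell.index + 1); // drop forward entries
  fakeDocShell.history.push(msg.data.uri);
  fakeDocShell.index++;
});
channel.addMessageListener("WebNavigation:GoBack", () => {
  if (fakeDocShell.index > 0) fakeDocShell.index--;
});

// Parent side: callers use the shim as if it were browser.webNavigation.
const webNavigation = new RemoteWebNavigationShim(channel);
webNavigation.loadURI("https://example.org/a");
webNavigation.loadURI("https://example.org/b");
webNavigation.goBack();

const beforeFlush = fakeDocShell.history[fakeDocShell.index];
channel.flush();
const afterFlush = fakeDocShell.history[fakeDocShell.index];
console.log(beforeFlush, "->", afterFlush);
```

Note that the shim's methods return before anything has happened in the child. Callers that relied on synchronous side effects of the real object are one reason a shim like this can only emulate it imperfectly.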

The shims we provide emulate their real counterparts imperfectly. They offer enough functionality to make Firefox work, but add-ons that use them may find them insufficient. I’ll discuss strategies for making add-ons work in more detail later.

Cross-process object wrappers

The message manager API does not allow the parent process to call sendSyncMessage; that is, the parent is not allowed to wait for a response from the child. It’s detrimental for the parent to wait on the child, since we don’t want the browser UI to be unresponsive because of slow content. However, converting Firefox code to be asynchronous (i.e., to use sendAsyncMessage instead) can sometimes be onerous. As an expedient, we’ve introduced a new primitive that allows code in the parent process to access objects in the child process synchronously.

These objects are called cross-process object wrappers, frequently abbreviated to CPOWs. They’re created using the message manager. Consider this example content script:

addEventListener("load", function (e) {
  let doc = content.document;
  sendAsyncMessage("GotLoadEvent", {}, {document: doc});
}, false);

In this code, we want to be able to send a reference to the document to the parent process. We can’t use the second parameter to sendAsyncMessage to do this: that argument is converted to JSON before it is sent up. The optional third parameter allows us to send object references. Each property of this argument becomes accessible in the parent process as a CPOW. Here’s what the parent code might look like:

let mm = Cc["@mozilla.org/globalmessagemanager;1"].
         getService(Ci.nsIMessageListenerManager);

mm.addMessageListener("GotLoadEvent", function (msg) {
  let uri = msg.objects.document.documentURI;
  dump("Received load event: " + uri + "\n");
});
mm.loadFrameScript("chrome://content/content-script.js", true);

It’s important to realize that we’re sending object references. The msg.objects.document object is only a wrapper. The access to its documentURI property sends a synchronous message down to the child asking for the value. The dump statement only happens after a reply has come back from the child.

Because every property access sends a message, CPOWs can be slow to use. There is no caching, so 1,000 accesses to the same property will send 1,000 messages.
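The cost can be illustrated with a toy wrapper built on a JavaScript Proxy. This is not how CPOWs are implemented in Gecko; makeToyCpow and the stats counter are hypothetical, and the synchronous IPC round trip is replaced by a direct property read plus a counter increment.

```javascript
// Toy CPOW: every property read counts as one synchronous message.
function makeToyCpow(remoteObject, stats) {
  return new Proxy({}, {
    get(_target, property) {
      stats.syncMessages++; // stand-in for a blocking IPC round trip
      return remoteObject[property];
    },
  });
}

const stats = { syncMessages: 0 };
const childDocument = { documentURI: "https://example.org/" };
const cpow = makeToyCpow(childDocument, stats);

// 1,000 reads of the same property cost 1,000 messages.
let uri;
for (let i = 0; i < 1000; i++) {
  uri = cpow.documentURI;
}
const messagesForLoop = stats.syncMessages;

// Reading once into a local variable and reusing it costs one message.
stats.syncMessages = 0;
const cachedURI = cpow.documentURI;
let totalLength = 0;
for (let i = 0; i < 1000; i++) {
  totalLength += cachedURI.length;
}
const messagesWithLocal = stats.syncMessages;

console.log(messagesForLoop, messagesWithLocal);
```

Copying a value into a local variable is safe only if the caller can tolerate the value going stale, since the local copy no longer tracks the object in the child.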

Another problem with CPOWs is that they violate some assumptions people might have about message ordering. Consider this code:

mm.addMessageListener("GotLoadEvent", function (msg) {
  mm.sendAsyncMessage("ChangeDocumentURI", {newURI: "hello.com"});
  let uri = msg.objects.document.documentURI;
  dump("Received load event: " + uri + "\n");
});

This code sends a message asking the child to change the current document URI. Then it accesses the current document URI via a CPOW. You might expect the value of uri to come back as "hello.com". But it might not. In order to avoid deadlocks, CPOW messages can bypass normal messages and be processed first. It’s possible that the request for the documentURI property will be processed before the "ChangeDocumentURI" message, in which case uri will have some other value.
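The reordering can be modelled with a toy child that queues normal messages but answers CPOW reads immediately. The sketch shows only the case where the CPOW request overtakes the queued message, which the text says is possible but not guaranteed; makeToyChild, cpowGet, and processQueue are hypothetical names.

```javascript
// Toy child process: normal messages wait in a queue, while CPOW reads
// are answered straight away, ahead of anything still queued.
function makeToyChild(initialURI) {
  const state = { documentURI: initialURI };
  const normalQueue = [];
  return {
    sendAsyncMessage(name, data) {
      normalQueue.push({ name, data });
    },
    cpowGet(property) {
      return state[property];
    },
    // Drain the normal queue, applying each message in order.
    processQueue() {
      for (const msg of normalQueue.splice(0)) {
        if (msg.name === "ChangeDocumentURI") {
          state.documentURI = msg.data.newURI;
        }
      }
    },
  };
}

const child = makeToyChild("https://example.org/");

// Parent: ask for a change, then immediately read through the "CPOW".
child.sendAsyncMessage("ChangeDocumentURI", { newURI: "hello.com" });
const seenByParent = child.cpowGet("documentURI");

// Later, the child gets around to its normal message queue.
child.processQueue();
const seenAfterQueue = child.cpowGet("documentURI");

console.log(seenByParent, "then", seenAfterQueue);
```

In this model the parent reads the old URI even though it sent the change request first, and the new value becomes visible only once the child has drained its normal queue.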

For this reason, it’s best not to mix CPOWs with normal message manager messages. It’s also a bad idea to use CPOWs for anything security-related, since you may not get results that are consistent with surrounding code that might use the message manager.

Despite these problems, we’ve found CPOWs to be useful for converting certain parts of Firefox to be multiprocess-compatible. It’s best to use them in cases where users are less likely to notice poor responsiveness. As an example, we use CPOWs to implement the context menu that pops up when users right-click on content elements. Whether this code is asynchronous or synchronous, the menu cannot be displayed until content has responded with data about the element that has been clicked. The user is unlikely to notice if, for example, tab animations don’t run while waiting for the menu to pop up. Their only concern is for the menu to come up as quickly as possible, which is entirely gated on the response time of the content process. For this reason, we chose to use CPOWs, since they’re easier than converting the code to be asynchronous.

It’s possible that CPOWs will be phased out in the future. Asynchronous messaging using the message manager gives a user experience that is at least as good as, and often strictly better than, CPOWs. We strongly recommend that people use the message manager over CPOWs when possible. Nevertheless, CPOWs are sometimes useful.
