Control basic data flow through your apps

Draft
This page is not complete.

The core of every web app is its internal and external data flow. As the web has evolved, the complexity of these data flows has increased. Understanding these data flows will help you build better and more efficient web apps.

The basic HTTP data flow

The web is based on a client/server architecture in which clients and servers communicate over HTTP.

A basic representation of a web site architecture

The diagram above shows the basic flow involved in displaying a website. Resources for the website are hosted on the server, and the client (the web browser) requests them using HTTP. In this configuration, the web browser is only in charge of displaying the web content. All the data, and all the actions performed on that data, are handled by the server.

This client/server model underlies the REST architectural style, which is used by many websites today. REST describes how resources are accessed and how they are handled over HTTP. A good understanding of REST will help you build robust resource access and services over HTTP.
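
To make the idea of accessing resources over HTTP more concrete, here is a minimal sketch of how operations on a resource are commonly mapped onto HTTP methods. The describeRequest function, the operation names, and the "/articles" URL layout are assumptions made for this illustration; they are not part of any standard API.

```typescript
// Operations a client may want to perform on a resource.
type Operation = "read" | "create" | "replace" | "remove";
type HttpMethod = "GET" | "POST" | "PUT" | "DELETE";

interface RequestDescription {
  method: HttpMethod;
  url: string;
}

// A common (but not mandatory) mapping between operations and HTTP methods.
const METHOD_FOR: Record<Operation, HttpMethod> = {
  read: "GET",
  create: "POST",
  replace: "PUT",
  remove: "DELETE",
};

// Hypothetical helper: describes the HTTP request for an operation.
// New resources are sent to the collection URL; without an id, the request
// targets the whole collection; otherwise it targets a single resource.
function describeRequest(
  operation: Operation,
  collectionUrl: string,
  id?: string
): RequestDescription {
  const method = METHOD_FOR[operation];
  if (operation === "create" || id === undefined) {
    return { method, url: collectionUrl };
  }
  return { method, url: collectionUrl + "/" + encodeURIComponent(id) };
}
```

With this sketch, describeRequest("replace", "/articles", "42") describes a PUT request to /articles/42, while describeRequest("create", "/articles") describes a POST request to /articles.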

One drawback of this approach is that all assets for a webpage are requested every time the webpage is requested, even assets that are shared across pages. This can lead to undesirable visual effects such as display latency and a flash of unstyled content (FOUC). To mitigate issues like these, web browsers maintain a cache of resources. However, webpages are still fully rendered every time they are requested.
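
As a rough illustration of how a cache can avoid repeated requests, the sketch below decides whether a stored response may be reused, based on the max-age directive of its Cache-Control header. This is a deliberately simplified model: real browsers apply many more rules, and the CachedResponse shape and the canReuse and parseMaxAge functions are hypothetical names introduced for this example.

```typescript
// A stored response, reduced to what this sketch needs.
interface CachedResponse {
  storedAt: number; // time the response was stored, in milliseconds
  cacheControl: string; // raw value of the Cache-Control response header
}

// Extracts the max-age lifetime (in seconds), or null if none is given.
function parseMaxAge(cacheControl: string): number | null {
  const match = /(?:^|,)\s*max-age=(\d+)\s*(?:,|$)/i.exec(cacheControl);
  return match ? Number(match[1]) : null;
}

// Returns true if the stored copy can be used without contacting the server.
function canReuse(entry: CachedResponse, now: number): boolean {
  // no-cache and no-store both rule out silent reuse in this simplified model.
  if (/(?:^|,)\s*no-(?:cache|store)\s*(?:,|$)/i.test(entry.cacheControl)) {
    return false;
  }
  const maxAge = parseMaxAge(entry.cacheControl);
  if (maxAge === null) {
    return false; // assumption of this sketch: no explicit lifetime, ask again
  }
  const ageInSeconds = (now - entry.storedAt) / 1000;
  return ageInSeconds < maxAge;
}
```

A stylesheet stored with "public, max-age=60" would be reused for requests made during the following minute and requested again after that.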

Advanced web site data flow

At the end of the 1990s, Microsoft devised a way to work around the need to repeatedly request resources. That technique is now known as AJAX and is based on a then-new JavaScript object named XMLHttpRequest.

A simple modern architecture for web sites

The main change introduced by this technology is where and when HTTP requests are handled. With AJAX, HTTP requests can be issued directly by a web page's own scripts rather than only by built-in browser mechanisms such as hyperlinks. Thanks to that change, the web page can request resources asynchronously, whenever it needs them and only if it needs them. Originally, the technique was intended for requesting HTML or XML fragments to update a portion of the web page. Today, it is used to request the data (usually, but not necessarily, in JSON format) needed to update the page content. Although this differs from a technical point of view, it changes nothing from an architectural point of view.
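
The following sketch shows what such an asynchronous request might look like with XMLHttpRequest: the page asks for JSON data and updates only one part of its content. The "/api/comments" URL, the "comments" element id, the shape of the returned data, and all function names are assumptions made for this example.

```typescript
// Shape of the data returned by the hypothetical /api/comments endpoint.
interface UserComment {
  author: string;
  text: string;
}

// Requests a JSON document asynchronously and passes the parsed result on.
function requestJson(
  url: string,
  onSuccess: (data: unknown) => void,
  onError: (reason: string) => void
): void {
  const xhr = new XMLHttpRequest();
  xhr.open("GET", url); // asynchronous by default
  xhr.setRequestHeader("Accept", "application/json");
  xhr.onload = () => {
    if (xhr.status < 200 || xhr.status >= 300) {
      onError("HTTP status " + xhr.status);
      return;
    }
    let data: unknown;
    try {
      data = JSON.parse(xhr.responseText);
    } catch (e) {
      onError("response is not valid JSON");
      return;
    }
    onSuccess(data);
  };
  xhr.onerror = () => onError("network error");
  xhr.send();
}

// Keeps only the entries that look like comments; malformed data is ignored.
function toComments(data: unknown): UserComment[] {
  if (!Array.isArray(data)) return [];
  return data.filter(
    (item): item is UserComment =>
      typeof item === "object" &&
      item !== null &&
      typeof item.author === "string" &&
      typeof item.text === "string"
  );
}

// Turns comments into the lines of text shown on the page.
function formatComments(comments: UserComment[]): string[] {
  return comments.map((c) => c.author + ": " + c.text);
}

// Updates one part of the page without reloading the whole document.
function refreshComments(): void {
  requestJson(
    "/api/comments",
    (data) => {
      const list = document.getElementById("comments");
      if (list) list.textContent = formatComments(toComments(data)).join("\n");
    },
    (reason) => console.error("Could not load comments: " + reason)
  );
}
```

Calling refreshComments() from a button handler or a timer would refresh the comment list while the rest of the page stays as it is.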

Today's biggest websites use this technique extensively. However, it is important to know when it is a good idea to use it and when it is not necessary.

Web app data flow

Recently, a set of new technologies has started to land in web browsers: IndexedDB, AppCache, and offline events. The first is nothing less than a database embedded directly in the browser. It allows us to store and manipulate data locally without sending any request to a server. The second is a cache mechanism that allows resources other than data to be cached indefinitely. Once a resource is cached through that mechanism, the browser serves it from the local copy and does not request it from the server again unless the application's cache manifest changes. The last is an API for finding out whether the browser is connected to the network.
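
The sketch below illustrates two of these technologies: storing data locally with IndexedDB, and reacting to the online and offline events. The "notes-app" database, the "notes" object store, the StoredNote shape, and every function name are assumptions made for this example.

```typescript
// Shape of the records kept in the hypothetical "notes" object store.
interface StoredNote {
  id: string;
  text: string;
}

// Wraps an IndexedDB request in a promise that settles when the request does.
function requestToPromise<T>(request: IDBRequest<T>): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    request.onsuccess = () => resolve(request.result);
    request.onerror = () => reject(request.error);
  });
}

// Opens the local database, creating its schema on first use.
function openNotesDatabase(): Promise<IDBDatabase> {
  const request = indexedDB.open("notes-app", 1);
  request.onupgradeneeded = () => {
    // Runs when the database is created or its version number increases.
    request.result.createObjectStore("notes", { keyPath: "id" });
  };
  return requestToPromise(request);
}

// Stores a note locally; no request is sent to any server.
function saveNote(db: IDBDatabase, note: StoredNote): Promise<void> {
  return new Promise<void>((resolve, reject) => {
    const tx = db.transaction("notes", "readwrite");
    tx.objectStore("notes").put(note);
    tx.oncomplete = () => resolve();
    tx.onerror = () => reject(tx.error);
    tx.onabort = () => reject(tx.error);
  });
}

// Reads a note back; resolves to undefined when no note has that id.
function loadNote(db: IDBDatabase, id: string): Promise<StoredNote | undefined> {
  const request = db.transaction("notes", "readonly").objectStore("notes").get(id);
  return requestToPromise<StoredNote | undefined>(request);
}

// Reports connectivity changes. In a page, the target would be window and
// initiallyOnline would be navigator.onLine.
function watchConnectivity(
  target: EventTarget,
  initiallyOnline: boolean,
  onChange: (online: boolean) => void
): () => void {
  const handleOnline = () => onChange(true);
  const handleOffline = () => onChange(false);
  target.addEventListener("online", handleOnline);
  target.addEventListener("offline", handleOffline);
  onChange(initiallyOnline);
  // The returned function stops the notifications.
  return () => {
    target.removeEventListener("online", handleOnline);
    target.removeEventListener("offline", handleOffline);
  };
}
```

In a page, openNotesDatabase() followed by saveNote(db, { id: "1", text: "Buy milk" }) keeps the note available across visits, and watchConnectivity(window, navigator.onLine, callback) tells the application when the network comes and goes.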

These three technologies change many things. In particular, a connection to the server is no longer a prerequisite for using a website. For that reason, such websites are called web applications. The whole life cycle of the application can now be handled inside the browser. The server is still useful, but only for two purposes:

A first connection is required to "install" or "update" the application (caching resources, setting up the data schema, etc.).
The server can be used to back up the application's data, which protects against problems on the local machine and serves users who access the application from several browsers or computers.
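
The second point, backing up local data to the server, could be sketched as a queue of local changes that is flushed whenever a connection is available. This is only one possible design, and the ChangeQueue class, the LocalChange shape, and the upload callback are hypothetical names introduced for this example.

```typescript
// A change made locally that has not yet been backed up on the server.
interface LocalChange {
  id: string;
  payload: unknown;
}

// Uploads one change; resolves to true if the server accepted it.
type Uploader = (change: LocalChange) => Promise<boolean>;

class ChangeQueue {
  private pending: LocalChange[] = [];
  private flushing = false;

  // Records a change; the application keeps working without the server.
  add(change: LocalChange): void {
    this.pending.push(change);
  }

  get size(): number {
    return this.pending.length;
  }

  // Uploads pending changes in order and stops at the first failure, so the
  // remaining changes stay queued for the next attempt.
  // Resolves to the number of changes uploaded by this call.
  async flush(upload: Uploader): Promise<number> {
    if (this.flushing) return 0; // another flush is already running
    this.flushing = true;
    let uploaded = 0;
    try {
      while (true) {
        const next = this.pending[0];
        if (next === undefined) break;
        let accepted = false;
        try {
          accepted = await upload(next);
        } catch (e) {
          accepted = false; // treat a thrown error like a rejected upload
        }
        if (!accepted) break;
        this.pending.shift();
        uploaded += 1;
      }
    } finally {
      this.flushing = false;
    }
    return uploaded;
  }
}
```

In a page, flush could be called when the browser fires the online event, with an upload callback that sends each change to the server over HTTP.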

A basic web app data flow architecture

Beyond HTTP

How other technologies fit into our data flow architecture, including WebSockets and WebRTC (this is advanced material, so defer detailed discussion to the advanced network communication section).

Go deeper

We need to write separate follow-up articles covering the following:

HTTP basics, and how this can help you be a better master of your data. For this part it may be a good idea to take a look at what HTTP information is currently available on MDN and refresh/reshape/update/reorganize that content
Ajax basics (architecture, technique)
Data storage systems and Ajax
Intro to handling offline Ajax and timeouts
Handling data within the browser (persistent or temporary; technique with cookies vs. storage vs. IndexedDB vs. data attributes). For offline stuff, defer to the Offline section
Advanced Ajax topics, such as UI interaction and perceived performance
Links to other useful demos and resources