This article explains all of the basic 3D theory that's useful to know when you are first getting started working with 3D.
Coordinate system
3D essentially is all about representations of shapes in a 3D space, with a coordinate system used to calculate their position.
WebGL uses a right-handed coordinate system: the x axis points to the right, the y axis points up, and the z axis points out of the screen toward the viewer, as seen in the above diagram. "Right-handed" means that if you curl the fingers of your right hand from the positive x axis toward the positive y axis, your thumb points along the positive z axis.
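To make the handedness concrete, here is a minimal plain-JavaScript sketch (no WebGL API involved) checking that the cross product of the x and y axes yields the positive z axis, as the right-hand rule predicts:

```javascript
// Cross product of two 3D vectors represented as [x, y, z] arrays.
function cross(a, b) {
  return [
    a[1] * b[2] - a[2] * b[1],
    a[2] * b[0] - a[0] * b[2],
    a[0] * b[1] - a[1] * b[0],
  ];
}

const xAxis = [1, 0, 0]; // points right
const yAxis = [0, 1, 0]; // points up

// In a right-handed system, x crossed with y gives +z (out of the screen).
console.log(cross(xAxis, yAxis)); // [0, 0, 1]
```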
Rendering pipeline
The rendering pipeline is the process by which images are prepared and output onto the screen. The graphics rendering pipeline takes the 3D objects built from primitives described using vertices, applies processing, calculates the fragments and renders them on the 2D screen as pixels.
All the shapes are built from vertices. Every vertex is described by these attributes:
- Position: Identifies it in a 3D space (x, y, z).
- Color: Holds an RGBA value (R, G, and B for the red, green, and blue channels, alpha for transparency; all values range from 0.0 to 1.0).
- Normal: A way to describe the direction the vertex is facing.
- Texture: A 2D image that the vertex can use to decorate the surface it belongs to instead of a simple color.
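As an illustration, a single vertex carrying the attributes above could be modeled as a plain object like the following; the property names (`position`, `color`, `normal`, `uv`) are hypothetical, since real WebGL code packs attributes into typed arrays and uploads them as buffers:

```javascript
// A hypothetical plain-object representation of one vertex.
const vertex = {
  position: [0.5, -0.5, 0.0],   // x, y, z in 3D space
  color: [1.0, 0.0, 0.0, 1.0],  // RGBA, each channel in 0.0–1.0 (opaque red)
  normal: [0.0, 0.0, 1.0],      // facing along +z, toward the viewer
  uv: [1.0, 0.0],               // 2D texture coordinate into an image
};

console.log(vertex.color.length); // 4
```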
Other terminology worth knowing is as follows:
- A Primitive: An input to the pipeline — it's built from vertices and can be a triangle, point or line.
- A Pixel: A point on the screen arranged in a 2D grid, which holds an RGB color.
- A Fragment: A 3D projection of a pixel, which has all the same attributes as a pixel.
Objects
A face of the given shape is a plane between vertices. For example, a cube has 8 different vertices (points in space) and 6 different faces, each constructed out of 4 vertices. Also, by connecting the points we're creating the edges of the cube. The geometry is built from vertices and faces, while the material is a texture, which uses an image. If we connect the geometry with the material we will get a mesh.
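A cube's geometry can be sketched as data: 8 vertices plus 6 faces, each face listing the indices of its 4 corner vertices (a minimal illustration, not tied to any particular engine):

```javascript
// 8 corner vertices of a cube spanning -1..1 on each axis.
const vertices = [
  [-1, -1, -1], [1, -1, -1], [1, 1, -1], [-1, 1, -1], // back side (z = -1)
  [-1, -1,  1], [1, -1,  1], [1, 1,  1], [-1, 1,  1], // front side (z = +1)
];

// 6 faces; each entry holds 4 indices into the vertices array.
const faces = [
  [0, 1, 2, 3], // back
  [4, 5, 6, 7], // front
  [0, 1, 5, 4], // bottom
  [3, 2, 6, 7], // top
  [0, 3, 7, 4], // left
  [1, 2, 6, 5], // right
];

console.log(vertices.length, faces.length); // 8 6
```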
The rendering pipeline consists of vertex and fragment processing. Both are programmable: you can write your own shaders that manipulate the output.
Transformation matrix
Vertex processing is all about transforming the coordinates and projecting them onto the screen. A transform converts a vertex from one space to another, and is done by multiplying the vector with the transformation matrix.
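A minimal sketch of such a multiplication, assuming row-major 4x4 matrices for readability (WebGL itself consumes column-major data):

```javascript
// Multiply a 4x4 matrix (array of 4 rows) by a homogeneous vertex [x, y, z, 1].
function transform(m, v) {
  const out = [0, 0, 0, 0];
  for (let row = 0; row < 4; row++) {
    for (let col = 0; col < 4; col++) {
      out[row] += m[row][col] * v[col];
    }
  }
  return out;
}

// A translation by (2, 3, 4) expressed as a 4x4 matrix.
const translation = [
  [1, 0, 0, 2],
  [0, 1, 0, 3],
  [0, 0, 1, 4],
  [0, 0, 0, 1],
];

console.log(transform(translation, [1, 1, 1, 1])); // [3, 4, 5, 1]
```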
There are four stages to this processing: arranging the objects in the world (called world or model transformation), positioning and setting the orientation of the camera (view transformation), defining the camera settings (projection transformation) and outputting the image (viewport transformation).
Model (world) transformation
Objects are drawn in local space, so they need to be transformed to be drawn in the global, world space. It is done with the affine transforms — for example the rotation and scaling belong to the linear transformation, while translation is not linear.
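The distinction can be sketched in plain JavaScript: scaling and a rotation about the z axis are linear, acting directly on (x, y, z), while translation cannot be written this way, which is why 4x4 homogeneous matrices are used in practice:

```javascript
// Rotate a point about the z axis by `angle` radians (a linear transform).
function rotateZ(angle, [x, y, z]) {
  const c = Math.cos(angle);
  const s = Math.sin(angle);
  return [c * x - s * y, s * x + c * y, z];
}

// Scale a point per axis (also a linear transform).
function scale([sx, sy, sz], [x, y, z]) {
  return [sx * x, sy * y, sz * z];
}

console.log(scale([2, 2, 2], [1, 0, 0]));     // [2, 0, 0]
console.log(rotateZ(Math.PI / 2, [1, 0, 0])); // ≈ [0, 1, 0]
```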
View transformation
View transformation is about placing the camera in the 3D space. The camera has three parameters: location, direction, and orientation. The view matrix transforms coordinates from world space into camera (view) space.
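As a simplified sketch: for a camera that is only translated (no rotation), the view transform just subtracts the camera position, moving the whole world opposite to the camera; full view matrices (for example gl-matrix's mat4.lookAt) also handle direction and orientation:

```javascript
// Translation-only view transform: express a world point relative to the camera.
function viewTransform(cameraPos, worldPoint) {
  return [
    worldPoint[0] - cameraPos[0],
    worldPoint[1] - cameraPos[1],
    worldPoint[2] - cameraPos[2],
  ];
}

// A camera at z = 5 sees the world origin 5 units in front of it (-z is "ahead"
// in a right-handed system where the camera looks down the negative z axis).
console.log(viewTransform([0, 0, 5], [0, 0, 0])); // [0, 0, -5]
```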
Projection (perspective) transformation
Projection sets up what can be seen by the camera — the configuration includes field of view, aspect ratio and optional near and far planes. Objects outside of the view are not visible, and are ignored in the rendering process to boost performance. If an object is partially visible it is clipped to the camera's visible area. Projection transforms individual vertices.
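The standard perspective matrix built from those parameters can be sketched as follows, using the column-major 16-element layout that helpers such as gl-matrix's mat4.perspective produce:

```javascript
// Build a column-major perspective projection matrix from vertical field of
// view (radians), aspect ratio, and near/far clipping planes.
function perspective(fovY, aspect, near, far) {
  const f = 1 / Math.tan(fovY / 2); // focal-length-like scale factor
  return [
    f / aspect, 0, 0, 0,
    0, f, 0, 0,
    0, 0, (far + near) / (near - far), -1,
    0, 0, (2 * far * near) / (near - far), 0,
  ];
}

// 90° field of view, square aspect, near plane at 1, far plane at 100.
const projection = perspective(Math.PI / 2, 1, 1, 100);
console.log(projection.length); // 16
```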
Rasterization (viewport) transformation
Rasterization converts primitives to a set of fragments and maps them to the 2D viewport.
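The viewport part of this step can be sketched as a mapping from normalized device coordinates in the [-1, 1] range to pixel positions on a width-by-height screen region (the same region that gl.viewport configures, with its origin at the lower-left):

```javascript
// Map normalized device coordinates (-1..1 on both axes) to viewport pixels.
function toViewport(ndcX, ndcY, width, height) {
  return [
    ((ndcX + 1) / 2) * width,
    ((ndcY + 1) / 2) * height,
  ];
}

// The center of NDC space lands in the center of an 800x600 viewport.
console.log(toViewport(0, 0, 800, 600)); // [400, 300]
```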
Fragment processing
Fragment processing focuses on textures and lighting. It calculates final colors based on the given parameters.
Output manipulation
During output manipulation we use the z-buffer (also called the depth buffer) to discard fragments that are hidden behind other objects. Removing everything that is not visible can greatly increase the performance of the application.
If one object is in front of the other and it's not entirely opaque (the material has transparency), the object behind it has to be rendered with that in mind — alpha blending can be used to calculate the proper colors of the objects in this situation.
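The arithmetic behind standard alpha blending can be sketched per channel; this mirrors what WebGL computes when blending is configured with gl.blendFunc(gl.SRC_ALPHA, gl.ONE_MINUS_SRC_ALPHA):

```javascript
// Blend an incoming RGBA source color over an existing RGB destination color.
function blend(src, dst) {
  const a = src[3]; // source alpha: 0.0 fully transparent, 1.0 fully opaque
  return [
    src[0] * a + dst[0] * (1 - a),
    src[1] * a + dst[1] * (1 - a),
    src[2] * a + dst[2] * (1 - a),
  ];
}

// A 50%-transparent red drawn over opaque blue yields purple.
console.log(blend([1, 0, 0, 0.5], [0, 0, 1])); // [0.5, 0, 0.5]
```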
Lighting
The color we see on the screen is a result of the light source interacting with the surface color of the object's material. Light might be absorbed or reflected. The standard Phong lighting model, commonly implemented in WebGL shaders, has four basic types of lighting:
- Diffuse: A distant directional light, like the sun.
- Specular: A point of light, just like a light bulb in a room or a flashlight.
- Ambient: The constant light applied to everything on the scene.
- Emissive: The light emitted directly by the object.
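These contributions add up. A simplified scalar-intensity sketch of the Phong model (single channel only, hypothetical constants, emissive term omitted for brevity) might look like this:

```javascript
function dot(a, b) {
  return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

// Combine ambient, diffuse, and specular terms into one scalar intensity.
// All vectors are assumed to be normalized; 0.1 is a hypothetical ambient level.
function phongIntensity(normal, lightDir, viewDir, reflectDir, shininess) {
  const ambient = 0.1;                                // constant base light
  const diffuse = Math.max(dot(normal, lightDir), 0); // N·L, angle to the light
  const specular = Math.pow(
    Math.max(dot(viewDir, reflectDir), 0),            // alignment with reflection
    shininess                                         // higher = tighter highlight
  );
  return ambient + diffuse + specular;
}

// Light shining straight at a surface that faces the viewer directly:
console.log(phongIntensity([0, 0, 1], [0, 0, 1], [0, 0, 1], [0, 0, 1], 32)); // ≈ 2.1
```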
Conclusion
Now you know the basic theory behind 3D manipulation. If you want to move on to practice and see some demos in action, follow up with the tutorials below:
- Building up a basic demo with Three.js
- Building up a basic demo with Babylon.js
- Building up a basic demo with PlayCanvas
- Building up a basic demo with A-Frame
Go ahead and create some cool cutting-edge 3D experiments yourself!